Converting a StableLM fine tuned model fails with Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy() instead. #4171

Closed
@TheBloke

Prerequisites

Tested on the latest commit, 8e672ef, and also on commits from yesterday.

Current Behavior

Trying to convert the model https://huggingface.co/pansophic/rocket-3B

results in:

 [pytorch2] tomj@MC:/workspace/git/gguf-llama (master ✘)✭ ᐅ python3 ./convert-hf-to-gguf.py /workspace/process/pansophic_rocket-3b/source --outtype f16 --outfile /workspace/process/pansophic_rocket-3b/gguf/rocket-3b.fp16.gguf
Loading model: source
gguf: This GGUF file is for Little Endian only
Set model parameters
Set model tokenizer
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
gguf: Adding 50009 merge(s).
gguf: Setting special token type bos to 0
gguf: Setting special token type eos to 0
gguf: Setting special token type unk to 0
Exporting model to '/workspace/process/pansophic_rocket-3b/gguf/rocket-3b.fp16.gguf'
gguf: loading model part 'pytorch_model.bin'
Traceback (most recent call last):
  File "/workspace/git/gguf-llama/./convert-hf-to-gguf.py", line 897, in <module>
    model_instance.write()
  File "/workspace/git/gguf-llama/./convert-hf-to-gguf.py", line 126, in write
    self.write_tensors()
  File "/workspace/git/gguf-llama/./convert-hf-to-gguf.py", line 98, in write_tensors
    data = data_torch.squeeze().numpy()
RuntimeError: Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy() instead.

I noticed that the latest commits mentioned StableLM, so I tried rolling back to before them, but still got the same error.

I have confirmed that the model loads OK via Transformers, so it appears to be valid.
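
For reference, the RuntimeError in the traceback can be reproduced outside the converter. The snippet below is a minimal sketch, not the converter's actual code and not necessarily the fix adopted upstream: it assumes the checkpoint tensors were saved with requires_grad=True, and it uses a hypothetical stand-in tensor named data_torch in place of a real tensor loaded from pytorch_model.bin. It shows the workaround the error message itself suggests, calling detach() before numpy().

```python
import torch

# Hypothetical stand-in for a tensor loaded from the checkpoint.
# Assumption: the fine-tuned weights were saved with requires_grad=True.
data_torch = torch.nn.Parameter(torch.ones(1, 4, dtype=torch.float16))

# Same call pattern as the failing line in the traceback.
try:
    data_torch.squeeze().numpy()
    raised = False
except RuntimeError as err:
    raised = True
    print(f"RuntimeError: {err}")

# detach() returns a tensor that shares the same storage but is cut off
# from the autograd graph, so numpy() is allowed on it.
data = data_torch.detach().squeeze().numpy()
print(data.shape, data.dtype)
```

If the assumption holds, detaching the tensor before the squeeze().numpy() call in write_tensors should let the conversion proceed; the values themselves are unchanged, since detach() only drops the gradient tracking.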

Any thoughts @Galunid ?

Thanks in advance

Environment and Context

Ubuntu 22.04, Python 3.10
