Code:
from transformers import pipeline
model_path = "/mnt/d/Projects/models/gpt-oss-20b"
pipe = pipeline("text-generation", model=model_path, torch_dtype="auto", device_map="auto")
pipe("Hello", max_new_tokens=20)
KeyError: 'model.layers.5.mlp.experts.gate_up_proj'
Here are some more details from the traceback:
Code:
Using MXFP4 quantized models requires a GPU, we will default to dequantizing the model to bf16
Loading checkpoint shards: 100%
Some parameters are on the meta device because they were offloaded to the cpu and disk.
Device set to use cpu
Traceback (most recent call last):
File "/home/dev/projects/wolf-in-ai-clothing/convo_test.py", line 19, in invoke
response = model(user_message, max_new_tokens=20, num_return_sequences=1)
File ".../transformers/pipelines/text_generation.py", line 419, in _forward
output = self.model.generate(input_ids=input_ids, attention_mask=attention_mask, **generate_kwargs)
File ".../transformers/models/gpt_oss/modeling_gpt_oss.py", line 375, in forward
hidden_states, _ = self.mlp(hidden_states) # diff with llama: router scores
File ".../transformers/models/gpt_oss/modeling_gpt_oss.py", line 159, in forward
routed_out = self.experts(hidden_states, router_indices=router_indices, routing_weights=router_scores)
File ".../accelerate/utils/offload.py", line 118, in __getitem__
return self.dataset[f"{self.prefix}{key}"]
File ".../accelerate/utils/offload.py", line 165, in __getitem__
weight_info = self.index[key]
KeyError: 'model.layers.5.mlp.experts.gate_up_proj'
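The last two frames of the traceback suggest the lookup works in two steps: a prefixed wrapper builds the full parameter name, then an offload index is queried with it. Below is a minimal sketch of that pattern using plain dictionaries. The class names `IndexedWeights` and `PrefixedView` and the index contents are hypothetical stand-ins for illustration, not the accelerate classes or the real index.

```python
# Hypothetical stand-ins that mimic the two lookups seen in the traceback.
# These are NOT the accelerate classes themselves.


class IndexedWeights:
    """Looks a weight up by its full name in an offload index."""

    def __init__(self, index):
        self.index = index  # maps full parameter name -> metadata

    def __getitem__(self, key):
        # Raises KeyError if the name was never written to the index.
        return self.index[key]


class PrefixedView:
    """Prepends a module prefix before delegating to the dataset."""

    def __init__(self, dataset, prefix):
        self.dataset = dataset
        self.prefix = prefix

    def __getitem__(self, key):
        return self.dataset[f"{self.prefix}{key}"]


# Assumed index contents: the expert weights for layer 5 are absent.
index = {"model.layers.5.mlp.router.weight": {"dtype": "bfloat16"}}
experts = PrefixedView(IndexedWeights(index), "model.layers.5.mlp.experts.")

try:
    experts["gate_up_proj"]
    missing = None
except KeyError as err:
    missing = err.args[0]

print(missing)  # model.layers.5.mlp.experts.gate_up_proj
```

If this reading is right, the error means the expert weight was expected in the offload index under that full name but was never recorded there.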
Code:
pip install git+https://github.com/triton-lang/triton.git@main#subdirectory=python/triton_kernels
Code:
pip install git+https://github.com/huggingface/transformers.git
Code:
pip install kernels
Environment:
- Python 3.12.3
- transformers 4.56.0.dev0 (also tried 4.55.1)
- torch 2.8.0
- accelerate 1.10.0
- Ubuntu 22.04 on WSL2, no GPU, 32 GB RAM
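A rough estimate may explain why the log reports parameters being offloaded to CPU and disk once the model is dequantized to bf16. The sketch below assumes about 20 billion parameters, a figure taken only from the model name and not read from the checkpoint, and ignores activations and other overhead.

```python
# Back-of-the-envelope estimate. The parameter count is an assumption
# taken from the model name, not measured from the checkpoint.
ASSUMED_PARAMS = 20e9   # "20b" in gpt-oss-20b
BYTES_PER_BF16 = 2      # bf16 stores each value in 16 bits
RAM_GB = 32             # from the environment list above

weights_gb = ASSUMED_PARAMS * BYTES_PER_BF16 / 1e9
fits_in_ram = weights_gb <= RAM_GB

print(f"bf16 weights: ~{weights_gb:.0f} GB, fits in {RAM_GB} GB RAM: {fits_in_ram}")
```

Under these assumptions the dequantized weights alone would exceed the available RAM, which is consistent with `device_map="auto"` spilling some of them to disk.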
More details here: https://stackoverflow.com/questions/797 ... ers-on-cpu