KeyError when loading GPT-OSS-20B locally with Transformers on CPU

Posted by Anonymous » 02 Dec 2025, 17:40

I'm trying to load gpt-oss-20b locally using Hugging Face Transformers on a CPU-only machine. Minimal code:

Code:

from transformers import pipeline
model_path = "/mnt/d/Projects/models/gpt-oss-20b"
pipe = pipeline("text-generation", model=model_path, torch_dtype="auto", device_map="auto")
pipe("Hello", max_new_tokens=20)

I get:

KeyError: 'model.layers.5.mlp.experts.gate_up_proj'

Here are some more details from the traceback:

Code:

Using MXFP4 quantized models requires a GPU, we will default to dequantizing the model to bf16
Loading checkpoint shards: 100%
Some parameters are on the meta device because they were offloaded to the cpu and disk.
Device set to use cpu
Traceback (most recent call last):
  File "/home/dev/projects/wolf-in-ai-clothing/convo_test.py", line 19, in invoke
    response = model(user_message, max_new_tokens=20, num_return_sequences=1)
  File ".../transformers/pipelines/text_generation.py", line 419, in _forward
    output = self.model.generate(input_ids=input_ids, attention_mask=attention_mask, **generate_kwargs)
  File ".../transformers/models/gpt_oss/modeling_gpt_oss.py", line 375, in forward
    hidden_states, _ = self.mlp(hidden_states)  # diff with llama: router scores
  File ".../transformers/models/gpt_oss/modeling_gpt_oss.py", line 159, in forward
    routed_out = self.experts(hidden_states, router_indices=router_indices, routing_weights=router_scores)
  File ".../accelerate/utils/offload.py", line 118, in __getitem__
    return self.dataset[f"{self.prefix}{key}"]
  File ".../accelerate/utils/offload.py", line 165, in __getitem__
    weight_info = self.index[key]
KeyError: 'model.layers.5.mlp.experts.gate_up_proj'
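
The last two frames of the traceback suggest a two-step lookup: a view prepends a module prefix to the parameter name, and an index of offloaded weights is then queried with the full key. The sketch below is a toy model of that pattern, not accelerate's actual code. The class names `ToyOffloadIndex` and `ToyPrefixedView`, the helper `lookup`, and the index contents are all hypothetical; it only shows how a name that is absent from the index surfaces as this kind of KeyError.

```python
class ToyOffloadIndex:
    """Toy stand-in for an index of weights offloaded to disk."""

    def __init__(self, index):
        self.index = index  # maps full parameter name -> metadata

    def __getitem__(self, key):
        # Raises KeyError when the full name was never recorded.
        return self.index[key]


class ToyPrefixedView:
    """Toy stand-in for a view that prepends a module prefix to each key."""

    def __init__(self, dataset, prefix):
        self.dataset = dataset
        self.prefix = prefix

    def __getitem__(self, key):
        return self.dataset[f"{self.prefix}{key}"]


def lookup(view, name):
    """Return the metadata for `name`, or a string naming the missing full key."""
    try:
        return view[name]
    except KeyError as err:
        return f"missing: {err.args[0]}"


# Hypothetical index: only the router weight of layer 5 was recorded.
index = ToyOffloadIndex({"model.layers.5.mlp.router.weight": {"dtype": "bfloat16"}})
experts = ToyPrefixedView(index, "model.layers.5.mlp.experts.")
router = ToyPrefixedView(index, "model.layers.5.mlp.router.")

print(lookup(router, "weight"))
print(lookup(experts, "gate_up_proj"))
```

If this reading is right, the parameter name the model asks for at run time is not among the names written to the offload index at load time.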

I verified that the directory exists and contains the model files. A similar issue appears in the Hugging Face discussions, where I followed @noobaymax's instructions:

Code:

pip install git+https://github.com/triton-lang/triton.git@main#subdirectory=python/triton_kernels
pip install git+https://github.com/huggingface/transformers.git
pip install kernels

but the result is the same.

Environment:

Python 3.12.3
transformers 4.56.0.dev0 (also tried 4.55.1)
torch 2.8.0
accelerate 1.10.0
Ubuntu 22.04 on WSL2, no GPU, 32 GB RAM
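
The log says the MXFP4 weights are dequantized to bf16 and that some parameters were offloaded to disk. A rough estimate may explain the offload. The sketch below assumes about 20 billion parameters (taken from the model name, not measured) at 2 bytes each for bf16, and ignores activations and any other overhead; the function name `weight_footprint_gib` is just for illustration.

```python
def weight_footprint_gib(num_params, bytes_per_param):
    """Approximate size of the raw weights in GiB."""
    return num_params * bytes_per_param / 2**30


# Assumptions: ~20e9 parameters (from the model name), bf16 = 2 bytes each.
ASSUMED_PARAMS = 20e9
BF16_BYTES = 2
RAM_GIB = 32  # RAM figure from the environment list

needed = weight_footprint_gib(ASSUMED_PARAMS, BF16_BYTES)
print(f"bf16 weights: ~{needed:.1f} GiB, RAM: {RAM_GIB} GiB")
print("fits in RAM" if needed < RAM_GIB else "does not fit in RAM")
```

Under these assumptions the dequantized weights alone exceed the available RAM, which is consistent with the "offloaded to the cpu and disk" message in the log.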

What is the correct way to load this model on CPU?

More details here: https://stackoverflow.com/questions/79735857/keyerror-when-loading-gpt-oss-20b-locally-with-transformers-on-cpu