BFloat16 is not supported on MPS (macOS)


Posted by Anonymous » 30 Jun 2024, 01:50

I got access to a Llama-based model on Hugging Face called "LeoLM/leo-hessianai-7b-chat".
I loaded the model on my Mac with the device set to "MPS". Loading succeeded, but when I try to test the model I get the following error:

Code:

TypeError: BFloat16 is not supported on MPS

Above it I see this hint:

Code:

FP4 quantization state not initialized. Please call .cuda() or .to(device) on the LinearFP4 layer first.

Here is my code:

Code:

import torch
from torch import cuda, bfloat16
import transformers

# target Apple's Metal Performance Shaders (MPS) backend
device = torch.device("mps")

model_id = 'LeoLM/leo-hessianai-7b-chat'

#device = f'cuda:{cuda.current_device()}' if cuda.is_available() else 'cpu'

# set quantization configuration to load large model with less GPU memory
# this requires the `bitsandbytes` library
bnb_config = transformers.BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type='nf4',
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=bfloat16
)

# begin initializing HF items, need auth token for these
hf_auth = 'HF_KEY'
model_config = transformers.AutoConfig.from_pretrained(
    model_id,
    use_auth_token=hf_auth
)

model = transformers.AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=False,  # True for flash attention
    config=model_config,
    quantization_config=bnb_config,
    device_map='auto',
    use_auth_token=hf_auth
)
model.eval()
# note: placement is decided by device_map='auto'; `device` is only printed here
print(f"Model loaded on {device}")

tokenizer = transformers.AutoTokenizer.from_pretrained(
    model_id,
    use_auth_token=hf_auth
)

generate_text = transformers.pipeline(
    model=model, tokenizer=tokenizer,
    return_full_text=True,  # langchain expects the full text
    task='text-generation',
    # we pass model parameters here too
    temperature=0.0,  # 'randomness' of outputs, 0.0 is the min and 1.0 the max
    max_new_tokens=512,  # max number of tokens to generate in the output
    repetition_penalty=1.1  # without this output begins repeating
)

res = generate_text("Explain the difference between a country and a continent.")
print(res[0]["generated_text"])

What do I need to change to make it work?
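One possible direction, offered as a minimal sketch rather than a confirmed fix: probe which dtype the chosen device actually accepts, and load the model without the `bitsandbytes` quantization config when CUDA is not present. The helper names (`pick_device`, `supports_dtype`, `pick_dtype`, `build_load_kwargs`) are hypothetical and not part of any library. Whether bfloat16 works on MPS depends on the installed PyTorch and macOS versions, and the sketch assumes the 4-bit `bitsandbytes` path needs CUDA in this setup, which the hint about `.cuda()` suggests but does not prove.

```python
import torch


def pick_device() -> torch.device:
    """Prefer CUDA, then Apple's MPS backend, then fall back to CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")


def supports_dtype(device: torch.device, dtype: torch.dtype) -> bool:
    """Return True if a tiny tensor of `dtype` can be allocated on `device`."""
    try:
        torch.zeros(1, dtype=dtype, device=device)
        return True
    except (TypeError, RuntimeError):
        # e.g. "TypeError: BFloat16 is not supported on MPS"
        return False


def pick_dtype(device: torch.device) -> torch.dtype:
    """Pick the first half-precision dtype the device accepts, else float32."""
    for dtype in (torch.bfloat16, torch.float16):
        if supports_dtype(device, dtype):
            return dtype
    return torch.float32


def build_load_kwargs(device: torch.device) -> dict:
    """Hypothetical helper: keyword arguments for `from_pretrained`.

    Assumption: `quantization_config` is left out on purpose, because the
    bitsandbytes 4-bit path is assumed to require CUDA in this setup.
    """
    return {"torch_dtype": pick_dtype(device)}


device = pick_device()
load_kwargs = build_load_kwargs(device)
print(f"device={device}, load_kwargs={load_kwargs}")

# Usage with the model from the question (this downloads the full weights):
# model = transformers.AutoModelForCausalLM.from_pretrained(model_id, **load_kwargs)
# model = model.to(device)
```

Without 4-bit quantization, a 7B-parameter model in a 16-bit dtype needs roughly 14 GB for the weights alone, so this only fits on Macs with enough unified memory.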

More details here: https://stackoverflow.com/questions/77359161/bfloat16-is-not-supported-on-mps-macos

