How to convert plain text into a vector array with BridgeTower

Posted by Anonymous » 07 Nov 2024, 13:28

Hi all,
I would like to use the BridgeTower model to convert text and images into vectors, store them in the LanceDB vector database, and then retrieve the matching images with text queries. To reproduce the problem, follow these steps:

First, convert the text and image into a vector:

[code]
import torch
from PIL import Image
from transformers import AutoProcessor, BridgeTowerModel


def bt_embedding_from_local_pretrained(prompt, image_path):
    # Local copy of the pretrained BridgeTower checkpoint
    model_name = "D:\\download\\bridgetower-large-itm-mlm-itc"
    processor = AutoProcessor.from_pretrained(model_name)
    model = BridgeTowerModel.from_pretrained(model_name)

    # The processor call below needs an image, so return nothing without one
    if image_path is None or image_path == '':
        return None

    image = Image.open(image_path)
    inputs = processor(images=image, text=prompt, return_tensors="pt")

    # Inference only, so gradients are not needed
    with torch.no_grad():
        outputs = model(**inputs)

    # Drop the batch dimension and convert the tensor to a plain list
    embeddings = outputs.pooler_output
    return embeddings.squeeze().tolist()


text = "The image features a young boy walking on a playground floor, which is designed to look like a carpet. The boy is wearing a blue shirt and appears to be enjoying his time at the playground. \n\nIn the background, there are two chairs, one located near the left side of the playground and the other closer to the right side. The playground also has a bench situated in the middle of the scene."
image_path = "./frame_0.jpg"
vector = bt_embedding_from_local_pretrained(text, image_path)
print(vector)
print(len(vector))

# Output:
# [-0.4280051290988922, 0.8150287866592407, -0.4738779664039612, -0.8128997683525085, 0.0006316843791864812, 0.4806518256664276, 0.22251057624816895, 0.6701756715774536, ....
# 2048  (the length of the vector)
[/code]

I then inserted several records of this kind into LanceDB, and now I want to convert a text query into a BridgeTower vector for retrieval. My problem is that I cannot convert plain text alone into a vector of the same length (2048) with BridgeTower, because the following call requires an image instance. Is there a way to do this?

[code]
inputs = processor(images=image, text=prompt, return_tensors="pt")
[/code]
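To illustrate why the query vector has to match the length of the stored vectors, here is a minimal sketch of the retrieval step. It is a plain-Python stand-in for a vector search, not the LanceDB API: the names `cosine_similarity`, `top_k`, and `rows` are hypothetical, `frame_1.jpg` is an invented second record, and the toy 4-dimensional vectors stand in for the 2048-dimensional BridgeTower output.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (hypothetical helper)."""
    # Vectors of different lengths cannot be compared element by element
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Treat a zero vector as dissimilar to everything
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, rows, k=1):
    """Return the keys of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [key for _, key in scored[:k]]


# Toy records: (image name, stored embedding). Real rows would hold
# the 2048-dimensional lists produced by the embedding function.
rows = [
    ("frame_0.jpg", [1.0, 0.0, 0.0, 0.0]),
    ("frame_1.jpg", [0.0, 1.0, 0.0, 0.0]),  # assumed extra record
]

best = top_k([0.9, 0.1, 0.0, 0.0], rows, k=1)
```

A query vector of any other length makes `cosine_similarity` raise `ValueError`, which mirrors the constraint described above: the text-only query has to live in the same 2048-dimensional space as the stored text-plus-image vectors.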

Thanks!

Read more here: https://stackoverflow.com/questions/79166023/how-to-process-pure-text-into-a-vector-array-using-bridgetower
