Почему LLaMA 3.1 не выполняет инструкции в запросе на минимизацию данных? - Цифровое Кемерово

Почему LLaMA 3.1 не выполняет инструкции в запросе на минимизацию данных? ⇐ Python

Ответить

1 сообщение • Страница 1 из 1

Anonymous

Почему LLaMA 3.1 не выполняет инструкции в запросе на минимизацию данных?

Цитата

Сообщение Anonymous » 14 ноя 2024, 15:08

Я использую LLaMA 3.1 для создания оптимизированных вариантов кода JavaScript, следуя подсказке, обеспечивающей соблюдение принципов минимизации данных. Моя цель — либо оптимизировать модель, либо подтвердить, что в предоставленный код JavaScript не требуется никаких изменений. Несмотря на тщательную настройку, ответы модели часто неточно отражают данные инструкции. В частности, он всегда не может оптимизироваться по запросу.
Вот упрощенная версия моего кода:

Код: Выделить всё

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct", torch_dtype=torch.float16).to(device)
torch.backends.cudnn.benchmark = True

def generate_variants(filter_code, n_variants=3):
variants = []

text = f"""
system
The following definition refers to data minimization:
'Data minimization is a principle restricting data collection to what is necessary in relation to the purposes for which they are processed.'
user
Please optimize the provided JavaScript code with the following instructions:
- If excessive anonymization is applied without clear utility, reduce the level of data anonymization to retain only the necessary level.
- Eliminate unnecessary API function calls, keeping only those essential for minimal and efficient data processing.
- Remove any unnecessary data attributes, ensuring that only essential data attributes are collected, processed, and stored.
If the code already complies with data minimization, please add a comment '// No changes needed' to indicate that no modifications are required.
Return only the JavaScript code, without any additional explanations, comments, or introductory text.
Mock the code to make it run.
Here is the code to optimize:

{filter_code}

assistant
"""

for i in range(n_variants):
seed_value = random.randint(0, 10000)
torch.manual_seed(seed_value)
random.seed(seed_value)

input_ids = tokenizer(text, return_tensors="pt").input_ids.to(device)

generated_ids = model.generate(
input_ids,
max_length=1000,
temperature=0.1,
top_k=5,
top_p=0.8
)

response = tokenizer.decode(generated_ids[0], skip_special_tokens=True)
cleaned_response = response[len(text):].strip()
if "// No changes needed" in cleaned_response:
cleaned_response = f"{filter_code} // No changes needed"

variants.append(cleaned_response)

return variants

Я установил определенные параметры и использую фиксированное приглашение, и мне не разрешено настраивать текст приглашения или эти параметры генерации.
Это, например, одного из кодов для оптимизации:

Код: Выделить всё

const GoogleCalendar = {
newEventAdded: {
Where: "[some street address]",
Starts: "9:00 AM",
Ends: "10:00 AM"
},
addDetailedEvent: {
skip: () => console.log("Event skipped."),
setDescription: (description) => console.log(`Description set: ${description}`),
setAllDay: (isAllDay) => console.log(`All-day set: ${isAllDay}`),
setStartTime: (startTime) => console.log(`Start time set: ${startTime}`),
setEndTime: (endTime) => console.log(`End time set: ${endTime}`)
}
};

if (GoogleCalendar.newEventAdded.Where.indexOf("[some street address]") < 0) {
GoogleCalendar.addDetailedEvent.skip();
} else {
GoogleCalendar.addDetailedEvent.setDescription("In the office from "
+ GoogleCalendar.newEventAdded.Starts
+ " to " + GoogleCalendar.newEventAdded.Ends);
GoogleCalendar.addDetailedEvent.setAllDay("true");
GoogleCalendar.addDetailedEvent.setStartTime(GoogleCalendar.newEventAdded.Starts);
GoogleCalendar.addDetailedEvent.setEndTime(GoogleCalendar.newEventAdded.Ends);
}

Вывод всегда либо сгенерирован неправильно, либо всегда говорит: «Изменений не требуется», даже если это явно необходимо.
Что-то не так с моим шаблон подсказки?

Подробнее здесь: https://stackoverflow.com/questions/791 ... ion-prompt

1731586093

Anonymous

Я использую LLaMA 3.1 для создания оптимизированных вариантов кода JavaScript, следуя подсказке, обеспечивающей соблюдение принципов минимизации данных. Моя цель — либо оптимизировать модель, либо подтвердить, что в предоставленный код JavaScript не требуется никаких изменений. Несмотря на тщательную настройку, ответы модели часто неточно отражают данные инструкции. В частности, он всегда не может оптимизироваться по запросу.
Вот упрощенная версия моего кода:
[code]device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct", torch_dtype=torch.float16).to(device)
torch.backends.cudnn.benchmark = True

def generate_variants(filter_code, n_variants=3):
variants = []

text = f"""
system
The following definition refers to data minimization:
'Data minimization is a principle restricting data collection to what is necessary in relation to the purposes for which they are processed.'
user
Please optimize the provided JavaScript code with the following instructions:
- If excessive anonymization is applied without clear utility, reduce the level of data anonymization to retain only the necessary level.
- Eliminate unnecessary API function calls, keeping only those essential for minimal and efficient data processing.
- Remove any unnecessary data attributes, ensuring that only essential data attributes are collected, processed, and stored.
If the code already complies with data minimization, please add a comment '// No changes needed' to indicate that no modifications are required.
Return only the JavaScript code, without any additional explanations, comments, or introductory text.
Mock the code to make it run.
Here is the code to optimize:

{filter_code}

assistant
"""

for i in range(n_variants):
seed_value = random.randint(0, 10000)
torch.manual_seed(seed_value)
random.seed(seed_value)

input_ids = tokenizer(text, return_tensors="pt").input_ids.to(device)

generated_ids = model.generate(
input_ids,
max_length=1000,
temperature=0.1,
top_k=5,
top_p=0.8
)

response = tokenizer.decode(generated_ids[0], skip_special_tokens=True)
cleaned_response = response[len(text):].strip()
if "// No changes needed" in cleaned_response:
cleaned_response = f"{filter_code} // No changes needed"

variants.append(cleaned_response)

return variants
[/code]
Я установил определенные параметры и использую фиксированное приглашение, и мне не разрешено настраивать текст приглашения или эти параметры генерации.
Это, например, одного из кодов для оптимизации:
[code]const GoogleCalendar = {
newEventAdded: {
Where: "[some street address]",
Starts: "9:00 AM",
Ends: "10:00 AM"
},
addDetailedEvent: {
skip: () => console.log("Event skipped."),
setDescription: (description) => console.log(`Description set: ${description}`),
setAllDay: (isAllDay) => console.log(`All-day set: ${isAllDay}`),
setStartTime: (startTime) => console.log(`Start time set: ${startTime}`),
setEndTime: (endTime) => console.log(`End time set: ${endTime}`)
}
};

if (GoogleCalendar.newEventAdded.Where.indexOf("[some street address]") < 0) {
GoogleCalendar.addDetailedEvent.skip();
} else {
GoogleCalendar.addDetailedEvent.setDescription("In the office from "
+ GoogleCalendar.newEventAdded.Starts
+ " to " + GoogleCalendar.newEventAdded.Ends);
GoogleCalendar.addDetailedEvent.setAllDay("true");
GoogleCalendar.addDetailedEvent.setStartTime(GoogleCalendar.newEventAdded.Starts);
GoogleCalendar.addDetailedEvent.setEndTime(GoogleCalendar.newEventAdded.Ends);
}
[/code]
Вывод всегда либо сгенерирован неправильно, либо всегда говорит: «Изменений не требуется», даже если это явно необходимо.
Что-то не так с моим шаблон подсказки? 

Подробнее здесь: [url]https://stackoverflow.com/questions/79188683/why-does-llama-3-1-fail-to-follow-instructions-in-a-data-minimization-prompt[/url]

Ответить

1 сообщение • Страница 1 из 1

Быстрый ответ

Заголовок:

Имя пользователя:

Изменение регистра текста:

Смайлики

Ещё смайлики…

К этому ответу прикреплено по крайней мере одно вложение.

Если вы не хотите добавлять вложения, оставьте поля пустыми. Можно прикреплять файлы, перетаскивая их в окно сообщения.

Максимально разрешённый размер вложения: 15 МБ.

Имя файла:

Комментарий к файлу:

Имя файла	Комментарий к файлу	Размер	Статус

Вернуться в «Python»