Promptfoo RAG metrics: context-faithfulness assertion requires string output from the provider

Posted by Anonymous » 05 Mar 2026, 18:44

I want to evaluate a fine-tuned LLM in the same RAG system as the base model, so I set up a promptfoo evaluation.

Along the way I hit an error that I can't make sense of. I hope someone can help; maybe I'm overlooking something. Thanks in advance!

I generate the tests from a JSONL file using a test generator implemented in create_tests.py.

After adding the context-faithfulness metric, I got the following error:

Code:

Provider call failed during eval
{
  "providerId": "file://providers/provider_base_model.py",
  "providerLabel": "base",
  "promptIdx": 0,
  "testIdx": 0,
  "error": {
    "name": "Error",
    "message": "Invariant failed: context-faithfulness assertion requires string output from the provider"
  }
}

Here is the code to reproduce it:
config.yaml

Code:

description: RAFT-Fine-Tuned-Adapter-Evaluation
commandLineOptions:
  envPath: .env.local
  cache: false
  repeat: 1
  maxConcurrency: 1
python:
  path: .venv

prompts:
  - "UNUSED_PROMPT"

providers:
  - id: 'file://providers/provider_base_model.py'
    label: 'base'
    config:
      url: 'http://localhost:8000/test-base'
  - id: 'file://providers/provider_base_model.py'
    label: 'adapter'
    config:
      url: 'http://localhost:8000/test-adapter'

defaultTest:
  options:
    provider:
      file://providers/code_model.yml

tests:
  - path: file://test_generators/create_tests.py:create_tests
    config:
      dataset: 'data/test_data.jsonl'

create_tests.py

Code:

import json


def load_test_data(path: str):
    """Read a JSONL file into a list of dicts."""
    json_lines = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            if line.strip():  # skip empty lines
                json_lines.append(json.loads(line))
    return json_lines


def generate_test_cases(dataset_path, model):
    test_cases = []
    test_data = load_test_data(dataset_path)

    for item in test_data:
        # Split "cot_answer" at the first ":" into reasoning and final answer
        cot_answer, final_answer = item["cot_answer"].split(":", 1)
        test_cases.append({
            "vars": {
                "cot_answer": cot_answer,
                "expected_answer": final_answer,
                "query": item["question"],
            },
            "assert": [{
                "type": "g-eval",
                "threshold": 0.8,
                "contextTransform": "output.answer",
                "value": f"""Compare the model output to this expected answer:
{final_answer}
Score 1.0 if meaning matches."""
            },
            {
                "type": "context-recall",
                "value": final_answer,
                "contextTransform": "output.context",
                "threshold": 0.8,
                "metric": "ctx_recall",
            },
            {
                "type": "context-relevance",
                "contextTransform": "output.context",
                "threshold": 0.3,
                "metric": "ctx_relevance",
            },
            {
                "type": "context-faithfulness",
                "contextTransform": "output.context",
                "threshold": 0.8,
                "metric": "faithfulness",
            },
            {
                "type": "answer-relevance",
                "threshold": 0.7,
                "metric": "answer_relevance",
            }]
        })

    return test_cases


def create_tests(config):
    # Entry point referenced from config.yaml (create_tests.py:create_tests)
    dataset_path = config.get('dataset', '/path/to/dataset')
    model = config.get('model', 'base')
    return generate_test_cases(dataset_path=dataset_path, model=model)
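To make the generator's input format concrete, here is a minimal sketch of how a single JSONL record gets split. The sample record and the helper name `split_cot_answer` are invented for illustration; the real dataset is not shown in this post.

```python
import json


def split_cot_answer(record: dict) -> tuple[str, str]:
    """Split "cot_answer" at the first colon, the same way create_tests.py does."""
    reasoning, final_answer = record["cot_answer"].split(":", 1)
    return reasoning, final_answer


# Hypothetical JSONL line; the field names match the ones create_tests.py reads.
line = (
    '{"question": "What is RAFT?", '
    '"cot_answer": "The context defines the term: RAFT is a fine-tuning recipe"}'
)
record = json.loads(line)
reasoning, final_answer = split_cot_answer(record)

print(repr(reasoning))
print(repr(final_answer))
```

Two things to keep in mind with this approach: the text after the colon keeps its leading space unless it is stripped, and a record whose "cot_answer" contains no colon raises a ValueError during unpacking.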

provider_base_model.py

Code:

import requests


def call_api(question, options, context):
    config = options.get("config", {}) or {}

    payload = context.get("vars", {}) or {}

    # The prompt text is unused; the question comes from the test's "query" var
    question = payload.get("query")

    url = config.get("url", "")
    params = {
        "question": question
    }

    resp = requests.get(url, params=params)

    try:
        data = resp.json()
    except ValueError:
        data = {"error": "Invalid JSON from server", "raw": resp.text}

    # promptfoo expects at least an "output" field
    return {
        "output": {
            "answer": data.get("output"),
            "context": data.get("contexts")
        },
        "metadata": {
            "status": resp.status_code,
            "raw": data
        },
    }
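The error message asks for string output, while the provider above returns a dict under "output". The sketch below shows a quick local check of a provider result's shape before running the eval. `output_is_string` is a hypothetical helper, not part of promptfoo, and both sample results are invented.

```python
def output_is_string(provider_result: dict) -> bool:
    """Return True if the provider result carries a plain string under "output"."""
    return isinstance(provider_result.get("output"), str)


# Same shape as the provider above: a dict nested under "output".
nested = {"output": {"answer": "42", "context": ["doc A", "doc B"]}}

# A flattened shape: string output, everything else moved into metadata.
flat = {"output": "42", "metadata": {"context": ["doc A", "doc B"]}}

print(output_is_string(nested))  # False
print(output_is_string(flat))    # True
```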

To work around the error, I changed my provider to return a single string under the output key and moved the answer and context fields into metadata.

I also changed contextTransform to metadata.context.
Example, in provider_base_model.py:

Code:

    return {
        "output": str(data),
        "metadata": {
            "answer": data.get("output"),
            "context": data.get("contexts"),
            "status": resp.status_code,
            "raw": data
        },
    }

promptfoo then cannot find the context field and fails with this error:

Code:

{
  "providerId": "file://providers/provider_base_model.py",
  "providerLabel": "base",
  "promptIdx": 0,
  "testIdx": 0,
  "error": {
    "name": "Error",
    "message": "Invariant failed: context-faithfulness assertion requires string output from the provider"
  }
}

Adding answer and context as top-level keys in my provider's return value, and putting just context or answer in contextTransform, led to the same error!
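For reference, the flattened return shape I was aiming for can be sketched as a pure function that is testable without the HTTP call. This is only a sketch under assumptions: `build_result` is a hypothetical helper, the server is assumed to respond with "output" and "contexts" keys as in the provider above, and unlike my str(data) attempt it puts only the answer text into "output". Whether contextTransform can read from metadata depends on the promptfoo version and is not confirmed here.

```python
def build_result(data: dict, status: int) -> dict:
    """Build a provider result with string output and the context kept in metadata."""
    contexts = data.get("contexts") or []
    if isinstance(contexts, str):
        # Tolerate a server that returns a single context string
        contexts = [contexts]
    answer = data.get("output")
    return {
        # Always a string, even when the server sent no answer
        "output": "" if answer is None else str(answer),
        "metadata": {
            # Joined into one string so the context is a single text blob
            "context": "\n".join(str(c) for c in contexts),
            "status": status,
            "raw": data,
        },
    }


# Invented server response for illustration
sample = {"output": "Paris", "contexts": ["France's capital is Paris.", "Paris is on the Seine."]}
result = build_result(sample, 200)
print(type(result["output"]).__name__)
print(result["metadata"]["context"])
```

In the real provider, call_api would pass resp.json() and resp.status_code into this function and return its result.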

More details here: https://stackoverflow.com/questions/79899098/promptfoo-rag-metrics-context-faithfulness-assertion-requires-string-output-f
