[INFO] 2024-11-01 04:58:52,085 - Index: testing-index -- Workflow (1/16): create_base_text_units started.
/usr/local/lib/python3.10/site-packages/numpy/core/fromnumeric.py:59: FutureWarning: 'DataFrame.swapaxes' is deprecated and will be removed in a future version. Please use 'DataFrame.transpose' instead.
  return bound(*args, **kwds)
[INFO] 2024-11-01 04:58:55,377 - Index: testing-index -- Workflow (1/16): create_base_text_units complete.
[INFO] 2024-11-01 04:58:56,901 - Index: testing-index -- Workflow (2/16): create_base_extracted_entities started.
[INFO] 2024-11-01 05:08:42,824 - Index: testing-index -- Workflow (2/16): create_base_extracted_entities complete.
[INFO] 2024-11-01 05:08:44,580 - Index: testing-index -- Workflow (3/16): create_final_covariates started.
/usr/local/lib/python3.10/site-packages/datashaper/engine/verbs/convert.py:65: FutureWarning: errors='ignore' is deprecated and will raise in a future version. Use to_numeric without passing `errors` and catch exceptions explicitly instead
  column_numeric = cast(pd.Series, pd.to_numeric(column, errors="ignore"))
[INFO] 2024-11-01 05:17:35,607 - Index: testing-index -- Workflow (3/16): create_final_covariates complete.
[INFO] 2024-11-01 05:17:37,188 - Index: testing-index -- Workflow (4/16): create_summarized_entities started.
[INFO] 2024-11-01 05:22:38,155 - Index: testing-index -- Workflow (4/16): create_summarized_entities complete.
[INFO] 2024-11-01 05:22:39,881 - Index: testing-index -- Workflow (5/16): join_text_units_to_covariate_ids started.
[INFO] 2024-11-01 05:22:40,033 - Index: testing-index -- Workflow (5/16): join_text_units_to_covariate_ids complete.
[INFO] 2024-11-01 05:22:41,396 - Index: testing-index -- Workflow (6/16): create_base_entity_graph started.
[INFO] 2024-11-01 05:22:49,056 - Index: testing-index -- Workflow (6/16): create_base_entity_graph complete.
[INFO] 2024-11-01 05:22:50,693 - Index: testing-index -- Workflow (7/16): create_final_entities started.
/usr/local/lib/python3.10/site-packages/numpy/core/fromnumeric.py:59: FutureWarning: 'DataFrame.swapaxes' is deprecated and will be removed in a future version. Please use 'DataFrame.transpose' instead.
  return bound(*args, **kwds)
Error executing verb "text_embed" in create_final_entities: : Failed to resolve 'gwcdfas0csrtxc09.search.windows.net' ([Errno -2] Name or service not known)
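The failure above is a DNS lookup error ("Name or service not known") for the Azure AI Search host, not an authentication or embedding error. A minimal sketch for reproducing the lookup from inside the same container follows. It assumes the endpoint is supplied through the `AI_SEARCH_URL` environment variable, as in the configuration in this post; the helper names `host_from_url` and `can_resolve` are hypothetical and not part of graphrag.

```python
import os
import socket
from urllib.parse import urlparse


def host_from_url(url: str) -> str:
    """Extract the host name from a URL, with or without a scheme."""
    # urlparse only fills in .hostname when a network location is present,
    # so a bare "host.example.net" is prefixed with "//" first.
    parsed = urlparse(url if "://" in url else f"//{url}")
    return parsed.hostname or ""


def can_resolve(host: str, port: int = 443) -> bool:
    """Return True if the host name resolves to at least one address."""
    if not host:
        return False
    try:
        socket.getaddrinfo(host, port)
        return True
    except (socket.gaierror, UnicodeError):
        # socket.gaierror is what surfaces as "[Errno -2] Name or service not known".
        return False


if __name__ == "__main__":
    # Assumption: the search endpoint comes from AI_SEARCH_URL.
    host = host_from_url(os.environ.get("AI_SEARCH_URL", ""))
    print(f"host={host!r} resolves={can_resolve(host)}")
```

If the host does not resolve here either, the likely causes are a wrong or stale service name in `AI_SEARCH_URL`, a search service that no longer exists, or DNS settings (for example a private endpoint) that the indexing container cannot see.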
This is the pipeline_settings.yaml:
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
# this yaml file serves as a configuration template for the graphrag indexing jobs
# some values are hardcoded while others denoted by PLACEHOLDER will be dynamically set
input:
  type: blob
  file_type: text
  file_pattern: .*\.txt$
  storage_account_blob_url: $STORAGE_ACCOUNT_BLOB_URL
  connection_string: $STORAGE_CONNECTION_STRING
  container_name: PLACEHOLDER
  base_dir: .
storage:
  type: blob
  storage_account_blob_url: $STORAGE_ACCOUNT_BLOB_URL
  connection_string: $STORAGE_CONNECTION_STRING
  container_name: PLACEHOLDER
  base_dir: output
reporting:
  type: blob
  storage_account_blob_url: $STORAGE_ACCOUNT_BLOB_URL
  connection_string: $STORAGE_CONNECTION_STRING
  container_name: PLACEHOLDER
  base_dir: logs
cache:
  type: blob
  storage_account_blob_url: $STORAGE_ACCOUNT_BLOB_URL
  connection_string: $STORAGE_CONNECTION_STRING
  container_name: PLACEHOLDER
  base_dir: cache
llm:
  type: azure_openai_chat
  api_base: $GRAPHRAG_API_BASE
  api_version: $GRAPHRAG_API_VERSION
  model: $GRAPHRAG_LLM_MODEL
  deployment_name: $GRAPHRAG_LLM_DEPLOYMENT_NAME
  cognitive_services_endpoint: $GRAPHRAG_COGNITIVE_SERVICES_ENDPOINT
  api_key: $OPENAI_API_KEY
  model_supports_json: True
  tokens_per_minute: 80000
  requests_per_minute: 480
  thread_count: 50
  concurrent_requests: 25
parallelization:
  stagger: 0.25
  num_threads: 10
async_mode: threaded
embeddings:
  async_mode: threaded
  llm:
    type: azure_openai_embedding
    api_base: $GRAPHRAG_API_BASE
    api_version: $GRAPHRAG_API_VERSION
    batch_size: 16
    model: $GRAPHRAG_EMBEDDING_MODEL
    deployment_name: $GRAPHRAG_EMBEDDING_DEPLOYMENT_NAME
    cognitive_services_endpoint: $GRAPHRAG_COGNITIVE_SERVICES_ENDPOINT
    api_key: $OPENAI_API_KEY
    tokens_per_minute: 350000
    concurrent_requests: 25
    requests_per_minute: 2100
    thread_count: 50
    max_retries: 50
  parallelization:
    stagger: 0.25
    num_threads: 10
  vector_store:
    type: azure_ai_search
    collection_name: PLACEHOLDER
    title_column: name
    overwrite: True
    url: $AI_SEARCH_URL
    audience: $AI_SEARCH_AUDIENCE
    api_key: $AI_SEARCH_SERVICE_KEY
entity_extraction:
  prompt: PLACEHOLDER
community_reports:
  prompt: PLACEHOLDER
summarize_descriptions:
  prompt: PLACEHOLDER
# claim extraction is disabled by default in the graphrag library so we enable it for the solution accelerator
claim_extraction:
  enabled: True
snapshots:
  graphml: True
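Since most values in this file are `$NAME` references, one thing worth ruling out is a variable that is unset or empty in the indexing container. The sketch below lists every `$NAME` reference in the settings text that has no non-empty value in the environment. It is only a diagnostic under assumptions: the function name `missing_env_vars` is hypothetical, the file path is taken from this post, and how graphrag itself substitutes these variables is not shown here.

```python
import os
import re
from pathlib import Path
from typing import Mapping

# Matches $NAME and ${NAME}; a bare "$" (as in the regex ".*\.txt$") is ignored.
ENV_REF = re.compile(r"\$\{?([A-Za-z_][A-Za-z0-9_]*)\}?")


def missing_env_vars(settings_text: str, env: Mapping[str, str]) -> list[str]:
    """Return the sorted names referenced in the text that are unset or empty in env."""
    referenced = set(ENV_REF.findall(settings_text))
    return sorted(name for name in referenced if not env.get(name))


if __name__ == "__main__":
    # Assumption: the settings file sits in the current working directory.
    text = Path("pipeline_settings.yaml").read_text(encoding="utf-8")
    for name in missing_env_vars(text, os.environ):
        print(f"unset or empty: {name}")
```

For the error in this post, `AI_SEARCH_URL` is the value to look at first: it is the only setting that should produce a `search.windows.net` host name.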
Should I include any additional information here?
More details here: https://stackoverflow.com/questions/791 ... me-or-serv