Description
Do you need to file an issue?
- I have searched the existing issues and this bug is not already filed.
- My model is hosted on OpenAI or Azure. If not, please look at the "model providers" issue and don't file a new one here.
- I believe this is a legitimate bug, not just a question. If this is a question, please use the Discussions area.
Describe the bug
Using the qwen-plus and text-embedding-v4 models, the indexing pipeline fails at the compute_communities step, so community reports cannot be created.
Steps to reproduce
(paper_agent) C:\app\program\longma\paper_agent>graphrag index --root ./
Logging enabled at C:\app\program\longma\paper_agent\logs\indexing-engine.log
Running standard indexing.
🚀 create_base_text_units
id ... n_tokens
0 fb44015ed53c2e2372c5cda5dae65c587b9549b0367f0e... ... 1200
1 2a0abeeaed50fa6e55cfc6c1afddff910f0db9d6f5aaca... ... 1200
2 c81094a29750491828c3084199cf9c4adda9877a2560aa... ... 1200
3 86a12353c7b60f3b7f240624ab11b5fe028125dab4d145... ... 1200
4 b3dc46401aa94564391faeeb15a48326c3e9a04bbcf88d... ... 1200
5 d0c7f7c73258d7184c3b26d3041e1ce9fb9283a7ab1580... ... 1200
6 9d2d2f266170558992529a926f83276d55b4ea7cc948d5... ... 1200
7 4c785ff88c7c52b0c6c34ee9881663e5be7b3cc0bb726b... ... 1200
8 d059491e86b14c8d43841180ff9de25c7a6f71ade1af68... ... 1200
9 5b9bfe1d14f3c695d1a33f70c927f24dbb95b5b24fe32a... ... 273
[10 rows x 4 columns]
🚀 create_final_documents
id ... text_unit_ids
0 d6fad6b52521c131aa3215edaca87e92c0474a9d442854... ... [fb44015ed53c2e2372c5cda5dae65c587b9549b0367f0...
[1 rows x 5 columns]
🚀 extract_graph
None
❌ compute_communities
None
⠏ GraphRAG Indexer
├── Loading Input (text) - 1 files loaded (0 filtered) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
├── create_base_text_units ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
├── create_final_documents ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
├── extract_graph ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
❌ Errors occurred during the pipeline run, see logs for more details.
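The run fails at the compute_communities step. Based on the traceback in the logs below, the error comes from importing graspologic (which pulls in numba and llvmlite), not from the qwen-plus or text-embedding-v4 calls. A minimal check of that import chain, independent of GraphRAG (my own diagnostic sketch, not part of the official repro steps):

import traceback

# Check each package in the chain that compute_communities depends on.
# If llvmlite's DLL cannot initialize, the failure reproduces here
# without any GraphRAG configuration involved.
for module in ("llvmlite.binding", "numba", "graspologic.partition"):
    try:
        __import__(module)
        print(f"OK: {module}")
    except Exception:
        print(f"FAILED: {module}")
        traceback.print_exc()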
Expected Behavior
The compute_communities workflow should complete successfully so that community reports can be generated.
GraphRAG Config Used
### This config file contains required core defaults that must be set, along with a handful of common optional settings.
### For a full list of available settings, see https://microsoft.github.io/graphrag/config/yaml/

### LLM settings ###
## There are a number of settings to tune the threading and token limits for LLM calls - check the docs.

encoding_model: cl100k_base # this needs to be matched to your model!

llm:
  api_key: ${GRAPHRAG_API_KEY} # set this in the generated .env file
  type: openai_chat # or azure_openai_chat
  model: qwen-plus
  model_supports_json: true # recommended if this is available for your model.
  audience: "https://cognitiveservices.azure.com/.default"
  api_base: https://dashscope.aliyuncs.com/compatible-mode/v1
  api_version: 2024-02-15-preview
  # organization: <organization_id>
  # deployment_name: <azure_model_deployment_name>

parallelization:
  stagger: 0.3
  num_threads: 50

async_mode: threaded # or asyncio

embeddings:
  async_mode: threaded # or asyncio
  vector_store:
    type: lancedb # one of [lancedb, azure_ai_search, cosmosdb]
    db_uri: 'output\lancedb'
    collection_name: default
    overwrite: true
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding # or azure_openai_embedding
    model: text-embedding-v4
    api_base: https://dashscope.aliyuncs.com/compatible-mode/v1
    # api_version: 2024-02-15-preview
    # audience: "https://cognitiveservices.azure.com/.default"
    # organization: <organization_id>
    # deployment_name: <azure_model_deployment_name>

### Input settings ###

input:
  type: file # or blob
  file_type: text # or csv
  base_dir: "input"
  file_encoding: utf-8
  file_pattern: ".*\.txt$"

chunks:
  size: 1200
  overlap: 100
  group_by_columns: [id]

### Storage settings ###
## If blob storage is specified in the following four sections,
## connection_string and container_name must be provided

cache:
  type: file # one of [blob, cosmosdb, file]
  base_dir: "cache"

reporting:
  type: file # or console, blob
  base_dir: "logs"

storage:
  type: file # one of [blob, cosmosdb, file]
  base_dir: "output"

## only turn this on if running graphrag index with custom settings
## we normally use graphrag update with the defaults
update_index_storage:
  type: file # or blob
  base_dir: "update_output"

### Workflow settings ###

skip_workflows: []

entity_extraction:
  prompt: "prompts/entity_extraction.txt"
  entity_types: [organization,person,geo,event]
  max_gleanings: 1

summarize_descriptions:
  prompt: "prompts/summarize_descriptions.txt"
  max_length: 500

claim_extraction:
  enabled: false
  prompt: "prompts/claim_extraction.txt"
  description: "Any claims or facts that could be relevant to information discovery."
  max_gleanings: 1

community_reports:
  prompt: "prompts/community_report.txt"
  max_length: 2000
  max_input_length: 8000

cluster_graph:
  enabled: false
  max_cluster_size: 10

embed_graph:
  enabled: false # if true, will generate node2vec embeddings for nodes

umap:
  enabled: false # if true, will generate UMAP embeddings for nodes (embed_graph must also be enabled)

snapshots:
  graphml: false
  embeddings: false
  transient: false

### Query settings ###
## The prompt locations are required here, but each search method has a number of optional knobs that can be tuned.
## See the config docs: https://microsoft.github.io/graphrag/config/yaml/#query

local_search:
  prompt: "prompts/local_search_system_prompt.txt"

global_search:
  map_prompt: "prompts/global_search_map_system_prompt.txt"
  reduce_prompt: "prompts/global_search_reduce_system_prompt.txt"
  knowledge_prompt: "prompts/global_search_knowledge_system_prompt.txt"

drift_search:
  prompt: "prompts/drift_search_system_prompt.txt"
  reduce_prompt: "prompts/drift_search_reduce_prompt.txt"

basic_search:
  prompt: "prompts/basic_search_system_prompt.txt"
Logs and screenshots
The relevant log output is as follows:
17:58:11,523 graphrag.index.run.run_workflows ERROR error running workflow compute_communities
Traceback (most recent call last):
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\llvmlite\binding\ffi.py", line 141, in __getattr__
    return self._fntab[name]
KeyError: 'LLVMPY_AddSymbol'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\llvmlite\binding\ffi.py", line 122, in _load_lib
    self.lib_handle = ctypes.CDLL(str(lib_path))
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\ctypes\__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: [WinError 1114] A dynamic link library (DLL) initialization routine failed.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\graphrag\index\run\run_workflows.py", line 166, in run_workflows
    result = await run_workflow(
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\graphrag\index\workflows\compute_communities.py", line 31, in run_workflow
    base_communities = compute_communities(
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\graphrag\index\flows\compute_communities.py", line 21, in compute_communities
    communities = cluster_graph(
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\graphrag\index\operations\cluster_graph.py", line 29, in cluster_graph
    node_id_to_community_map, parent_mapping = compute_leiden_communities(
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\graphrag\index\operations\cluster_graph.py", line 64, in compute_leiden_communities
    from graspologic.partition import hierarchical_leiden
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\graspologic\__init__.py", line 8, in <module>
    import graspologic.inference
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\graspologic\inference\__init__.py", line 6, in <module>
    from .latent_distribution_test import latent_distribution_test
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\graspologic\inference\latent_distribution_test.py", line 7, in <module>
    from hyppo.ksample import KSample
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\hyppo\__init__.py", line 1, in <module>
    import hyppo.discrim
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\hyppo\discrim\__init__.py", line 1, in <module>
    from .discrim_one_samp import DiscrimOneSample
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\hyppo\discrim\discrim_one_samp.py", line 5, in <module>
    from ._utils import _CheckInputs
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\hyppo\discrim\_utils.py", line 4, in <module>
    from ..tools import check_ndarray_xy, check_reps, contains_nan, convert_xy_float64
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\hyppo\tools\__init__.py", line 5, in <module>
    from .power import *
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\hyppo\tools\power.py", line 5, in <module>
    from ..conditional import COND_INDEP_TESTS
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\hyppo\conditional\__init__.py", line 5, in <module>
    from .pdcorr import PartialDcorr
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\hyppo\conditional\pdcorr.py", line 2, in <module>
    from numba import jit
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\numba\__init__.py", line 73, in <module>
    from numba.core import config
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\numba\core\config.py", line 17, in <module>
    import llvmlite.binding as ll
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\llvmlite\binding\__init__.py", line 4, in <module>
    from .dylib import *
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\llvmlite\binding\dylib.py", line 36, in <module>
    ffi.lib.LLVMPY_AddSymbol.argtypes = [
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\llvmlite\binding\ffi.py", line 144, in __getattr__
    cfn = getattr(self._lib, name)
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\llvmlite\binding\ffi.py", line 136, in _lib
    self._load_lib()
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\llvmlite\binding\ffi.py", line 130, in _load_lib
    raise OSError("Could not find/load shared object file") from e
OSError: Could not find/load shared object file
17:58:11,547 graphrag.callbacks.file_workflow_callbacks INFO Error running pipeline! details=None
17:58:11,642 graphrag.cli.index ERROR Errors occurred during the pipeline run, see logs for more details.
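The chained exceptions show that llvmlite's DLL fails to initialize when graspologic imports numba, so the hierarchical Leiden clustering used by compute_communities never runs. A small smoke test for just that code path (my own sketch, assuming graspologic and networkx are installed; on a healthy environment it prints cluster assignments, on this machine it should fail with the same OSError):

import networkx as nx
from graspologic.partition import hierarchical_leiden

# Tiny weighted graph with string node ids, mirroring the shape of the
# entity graph that GraphRAG clusters in compute_communities.
graph = nx.Graph()
graph.add_weighted_edges_from([
    ("a", "b", 1.0), ("b", "c", 1.0), ("c", "a", 1.0), ("c", "d", 1.0),
])

partitions = hierarchical_leiden(graph, max_cluster_size=10)
for p in partitions:
    print(p.node, p.cluster, p.level)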
Additional Information
- GraphRAG Version: 1.2.0
- Operating System: Windows
- Python Version: 3.10