
[Bug]: compute_communities: Errors occurred during the pipeline run, see logs for more details. #2074

@hope12122

Description


Do you need to file an issue?

  • I have searched the existing issues and this bug is not already filed.
  • My model is hosted on OpenAI or Azure. If not, please look at the "model providers" issue and don't file a new one here.
  • I believe this is a legitimate bug, not just a question. If this is a question, please use the Discussions area.

Describe the bug

Using the qwen-plus and text-embedding-v4 models, I am unable to create community reports; the pipeline fails at the compute_communities step.

Steps to reproduce

(paper_agent) C:\app\program\longma\paper_agent>graphrag index --root ./

Logging enabled at C:\app\program\longma\paper_agent\logs\indexing-engine.log
Running standard indexing.
🚀 create_base_text_units
id ... n_tokens
0 fb44015ed53c2e2372c5cda5dae65c587b9549b0367f0e... ... 1200
1 2a0abeeaed50fa6e55cfc6c1afddff910f0db9d6f5aaca... ... 1200
2 c81094a29750491828c3084199cf9c4adda9877a2560aa... ... 1200
3 86a12353c7b60f3b7f240624ab11b5fe028125dab4d145... ... 1200
4 b3dc46401aa94564391faeeb15a48326c3e9a04bbcf88d... ... 1200
5 d0c7f7c73258d7184c3b26d3041e1ce9fb9283a7ab1580... ... 1200
6 9d2d2f266170558992529a926f83276d55b4ea7cc948d5... ... 1200
7 4c785ff88c7c52b0c6c34ee9881663e5be7b3cc0bb726b... ... 1200
8 d059491e86b14c8d43841180ff9de25c7a6f71ade1af68... ... 1200
9 5b9bfe1d14f3c695d1a33f70c927f24dbb95b5b24fe32a... ... 273

[10 rows x 4 columns]
🚀 create_final_documents
id ... text_unit_ids
0 d6fad6b52521c131aa3215edaca87e92c0474a9d442854... ... [fb44015ed53c2e2372c5cda5dae65c587b9549b0367f0...

[1 rows x 5 columns]
🚀 extract_graph
None
❌ compute_communities
None
⠏ GraphRAG Indexer
├── Loading Input (text) - 1 files loaded (0 filtered) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
├── create_base_text_units ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
├── create_final_documents ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
├── extract_graph ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
❌ Errors occurred during the pipeline run, see logs for more details.

Expected Behavior

The compute_communities workflow should complete successfully so that community reports can be created.

GraphRAG Config Used

### This config file contains required core defaults that must be set, along with a handful of common optional settings.
### For a full list of available settings, see https://microsoft.github.io/graphrag/config/yaml/

### LLM settings ###
## There are a number of settings to tune the threading and token limits for LLM calls - check the docs.

encoding_model: cl100k_base # this needs to be matched to your model!

llm:
  api_key: ${GRAPHRAG_API_KEY} # set this in the generated .env file
  type: openai_chat # or azure_openai_chat
  model: qwen-plus
  model_supports_json: true # recommended if this is available for your model.
  # audience: "https://cognitiveservices.azure.com/.default"
  api_base: https://dashscope.aliyuncs.com/compatible-mode/v1
  # api_version: 2024-02-15-preview
  # organization: <organization_id>
  # deployment_name: <azure_model_deployment_name>

parallelization:
  stagger: 0.3
  # num_threads: 50

async_mode: threaded # or asyncio

embeddings:
  async_mode: threaded # or asyncio
  vector_store:
    type: lancedb # one of [lancedb, azure_ai_search, cosmosdb]
    db_uri: 'output\lancedb'
    collection_name: default
    overwrite: true
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding # or azure_openai_embedding
    model: text-embedding-v4
    api_base: https://dashscope.aliyuncs.com/compatible-mode/v1
    # api_version: 2024-02-15-preview
    # audience: "https://cognitiveservices.azure.com/.default"
    # organization: <organization_id>
    # deployment_name: <azure_model_deployment_name>

### Input settings ###

input:
  type: file # or blob
  file_type: text # or csv
  base_dir: "input"
  file_encoding: utf-8
  file_pattern: ".*\.txt$"

chunks:
  size: 1200
  overlap: 100
  group_by_columns: [id]

### Storage settings ###
## If blob storage is specified in the following four sections,
## connection_string and container_name must be provided

cache:
  type: file # one of [blob, cosmosdb, file]
  base_dir: "cache"

reporting:
  type: file # or console, blob
  base_dir: "logs"

storage:
  type: file # one of [blob, cosmosdb, file]
  base_dir: "output"

## only turn this on if running `graphrag index` with custom settings
## we normally use `graphrag update` with the defaults
update_index_storage:
  # type: file # or blob
  # base_dir: "update_output"

### Workflow settings ###

skip_workflows: []

entity_extraction:
  prompt: "prompts/entity_extraction.txt"
  entity_types: [organization, person, geo, event]
  max_gleanings: 1

summarize_descriptions:
  prompt: "prompts/summarize_descriptions.txt"
  max_length: 500

claim_extraction:
  enabled: false
  prompt: "prompts/claim_extraction.txt"
  description: "Any claims or facts that could be relevant to information discovery."
  max_gleanings: 1

community_reports:
  prompt: "prompts/community_report.txt"
  max_length: 2000
  max_input_length: 8000

cluster_graph:
  enabled: false
  max_cluster_size: 10

embed_graph:
  enabled: false # if true, will generate node2vec embeddings for nodes

umap:
  enabled: false # if true, will generate UMAP embeddings for nodes (embed_graph must also be enabled)

snapshots:
  graphml: false
  embeddings: false
  transient: false

### Query settings ###
## The prompt locations are required here, but each search method has a number of optional knobs that can be tuned.
## See the config docs: https://microsoft.github.io/graphrag/config/yaml/#query

local_search:
  prompt: "prompts/local_search_system_prompt.txt"

global_search:
  map_prompt: "prompts/global_search_map_system_prompt.txt"
  reduce_prompt: "prompts/global_search_reduce_system_prompt.txt"
  knowledge_prompt: "prompts/global_search_knowledge_system_prompt.txt"

drift_search:
  prompt: "prompts/drift_search_system_prompt.txt"
  reduce_prompt: "prompts/drift_search_reduce_prompt.txt"

basic_search:
  prompt: "prompts/basic_search_system_prompt.txt"

Logs and screenshots

Logs as follows:
17:58:11,523 graphrag.index.run.run_workflows ERROR error running workflow compute_communities
Traceback (most recent call last):
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\llvmlite\binding\ffi.py", line 141, in __getattr__
    return self._fntab[name]
KeyError: 'LLVMPY_AddSymbol'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\llvmlite\binding\ffi.py", line 122, in _load_lib
    self.lib_handle = ctypes.CDLL(str(lib_path))
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\ctypes\__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: [WinError 1114] A dynamic link library (DLL) initialization routine failed.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\graphrag\index\run\run_workflows.py", line 166, in run_workflows
    result = await run_workflow(
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\graphrag\index\workflows\compute_communities.py", line 31, in run_workflow
    base_communities = compute_communities(
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\graphrag\index\flows\compute_communities.py", line 21, in compute_communities
    communities = cluster_graph(
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\graphrag\index\operations\cluster_graph.py", line 29, in cluster_graph
    node_id_to_community_map, parent_mapping = compute_leiden_communities(
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\graphrag\index\operations\cluster_graph.py", line 64, in compute_leiden_communities
    from graspologic.partition import hierarchical_leiden
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\graspologic\__init__.py", line 8, in <module>
    import graspologic.inference
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\graspologic\inference\__init__.py", line 6, in <module>
    from .latent_distribution_test import latent_distribution_test
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\graspologic\inference\latent_distribution_test.py", line 7, in <module>
    from hyppo.ksample import KSample
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\hyppo\__init__.py", line 1, in <module>
    import hyppo.discrim
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\hyppo\discrim\__init__.py", line 1, in <module>
    from .discrim_one_samp import DiscrimOneSample
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\hyppo\discrim\discrim_one_samp.py", line 5, in <module>
    from ._utils import _CheckInputs
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\hyppo\discrim\_utils.py", line 4, in <module>
    from ..tools import check_ndarray_xy, check_reps, contains_nan, convert_xy_float64
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\hyppo\tools\__init__.py", line 5, in <module>
    from .power import *
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\hyppo\tools\power.py", line 5, in <module>
    from ..conditional import COND_INDEP_TESTS
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\hyppo\conditional\__init__.py", line 5, in <module>
    from .pdcorr import PartialDcorr
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\hyppo\conditional\pdcorr.py", line 2, in <module>
    from numba import jit
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\numba\__init__.py", line 73, in <module>
    from numba.core import config
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\numba\core\config.py", line 17, in <module>
    import llvmlite.binding as ll
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\llvmlite\binding\__init__.py", line 4, in <module>
    from .dylib import *
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\llvmlite\binding\dylib.py", line 36, in <module>
    ffi.lib.LLVMPY_AddSymbol.argtypes = [
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\llvmlite\binding\ffi.py", line 144, in __getattr__
    cfn = getattr(self._lib, name)
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\llvmlite\binding\ffi.py", line 136, in _lib
    self._load_lib()
  File "C:\Users\Administrator\.conda\envs\paper_agent\lib\site-packages\llvmlite\binding\ffi.py", line 130, in _load_lib
    raise OSError("Could not find/load shared object file") from e
OSError: Could not find/load shared object file
17:58:11,547 graphrag.callbacks.file_workflow_callbacks INFO Error running pipeline! details=None
17:58:11,642 graphrag.cli.index ERROR Errors occurred during the pipeline run, see logs for more details.
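Not part of the original report, but worth noting: the traceback shows the failure happens at import time, while compute_communities pulls in graspologic's dependency chain (graspologic → hyppo → numba → llvmlite), before any clustering runs. The problem can therefore be narrowed down without rerunning the whole pipeline. A minimal diagnostic sketch (the module names are taken from the traceback above; `check_import_chain` is a hypothetical helper, not part of GraphRAG):

```python
import importlib

def check_import_chain(modules):
    """Import each module in order; return (name, repr of error) for the
    first failure, or None if every import succeeds."""
    for name in modules:
        try:
            importlib.import_module(name)
        except Exception as exc:  # OSError for DLL failures, ImportError if missing
            return name, repr(exc)
    return None

# The import chain compute_communities walks, per the traceback above.
CHAIN = ["llvmlite.binding", "numba", "graspologic.partition"]

if __name__ == "__main__":
    failure = check_import_chain(CHAIN)
    print("all imports OK" if failure is None else f"first failure: {failure}")
```

If `llvmlite.binding` is the first failure, the root cause is the llvmlite DLL load (the WinError 1114 above) rather than GraphRAG itself; a force-reinstall of llvmlite and numba in the conda environment is a common first step for this class of error, though that is a general suggestion, not a confirmed fix for this issue.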

Additional Information

  • GraphRAG Version: 1.2.0
  • Operating System: Windows
  • Python Version: 3.10

Metadata

Assignees: no one assigned
Labels: bug (Something isn't working); triage (Default label assignment, indicates new issue needs reviewed by a maintainer)
Milestone: none
Development: no branches or pull requests