pip install symbolicai
Run symconfig to initialize the cache and print the path of the current configuration file, which you need to set up. This project requires an EMBEDDING_ENGINE_MODEL. We support OpenAI and local models through llamacpp. Read more about the configuration here and local models here.
Once the setup is done, install this package:
sympkg i ExtensityAI/LightRAG-symai --submodules
Next, install the LightRAG submodule. The package is located in the packages directory at [get-this-using-symconfig/].symai/packages/ExtensityAI/LightRAG-symai.
cd path/to/LightRAG-symai/LightRAG
pip install -e .
You'll also need PostgreSQL (with the psql client) and the pgvector extension installed. Once they are installed, create the database user and the database. Make sure the database name, user, and password (if set) match those specified in the configuration file.
createuser -s lightrag
createdb rag
psql -U lightrag -d rag
Lastly, enable the pgvector extension:
rag=# CREATE EXTENSION IF NOT EXISTS vector;
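The user, password, and database created above have to match the postgres block of the configuration file. As a rough sketch (the helper name postgres_dsn is hypothetical and not part of LightRAG-symai; the field names are taken from the example configuration in this README), the following builds a libpq-style connection URI from those settings so the same values can be tried directly with psql:

```python
import json
from urllib.parse import quote


def postgres_dsn(config: dict) -> str:
    """Build a libpq-style connection URI from the `postgres` block of a config dict.

    Hypothetical helper for illustration; LightRAG-symai may connect differently.
    """
    pg = config["postgres"]
    # Percent-encode values so special characters in credentials stay valid in a URI.
    user = quote(str(pg["user"]), safe="")
    password = quote(str(pg.get("password", "")), safe="")
    database = quote(str(pg["database"]), safe="")
    auth = f"{user}:{password}" if password else user
    return f"postgresql://{auth}@{pg['host']}:{pg['port']}/{database}"


# Values copied from the example configuration in this README.
example = json.loads("""
{"postgres": {"host": "localhost", "port": 5432, "user": "lightrag",
              "password": "lightrag1234", "database": "rag"}}
""")
dsn = postgres_dsn(example)
print(dsn)  # postgresql://lightrag:lightrag1234@localhost:5432/rag
```

Passing the printed URI to psql, for example with -c "SELECT extname FROM pg_extension WHERE extname = 'vector';", should return one row if the extension was enabled in that database.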
Create a config.json file in the root directory of the project:
cd path/to/LightRAG-symai
touch config.json
Then, set up your configuration. Here's an example:
{
  "backend": "postgres",
  "tokenizer_name": "Xenova/gpt-4o",
  "chunker_name": "RecursiveChunker",
  "local": {
    "working_dir": "./lightrag_cache"
  },
  "postgres": {
    "host": "localhost",
    "port": 5432,
    "user": "lightrag",
    "password": "lightrag1234",
    "database": "rag",
    "working_dir": "rag_store",
    "workspace": "default",
    "embedding_batch_num": 8
  },
  "rag_settings": {
    "embedding_dim": 1536,
    "max_token_size": 8191,
    "embedding_cache_enabled": false,
    "embedding_cache_sim_threshold": 0.90,
    "llm_cache_enabled": false,
    "batch_size": 20
  }
}
Then the plugin can be used as follows:
import asyncio
import logging

from lightrag import QueryParam
from symai import Import
from symai.components import FileReader


async def main():
    # Load configuration
    Config = Import.load_expression("ExtensityAI/LightRAG-symai", "Config")
    config = Config.load("path/to/config.json")

    # Initialize RAGManager with backend
    RAGManager = Import("ExtensityAI/LightRAG-symai", working_dir=config.local.working_dir)
    rag_manager = await RAGManager.init_with_backend(config=config)

    # Optional: Enable detailed logging
    logging.basicConfig(level=logging.DEBUG)
    logger = logging.getLogger("lightrag")
    logger.setLevel(logging.DEBUG)
    logger.disabled = False

    # Add documents to the RAG system
    reader = FileReader()
    document_contents = [
        reader("path/to/document1.pdf").value[0],
        reader("path/to/document2.pdf").value[0]
    ]
    doc_ids = ["doc1", "doc2"]  # Provide identifiers for the documents

    # Upsert documents (insert or update)
    await rag_manager.chunk_and_upsert(
        document_contents=document_contents[0],  # or `document_paths` or `document_urls`
        document_ids=doc_ids[0],
        workspace="default"  # Optional workspace name
    )

    # Query the RAG system
    query = "your query here"
    query_params = QueryParam(
        mode="naive",
        top_k=5,
        only_need_context=False,
        return_doc_names=True,
        response_type="Long answer with multiple sections and paragraphs; list facts and details."
    )
    result = await rag_manager.query(query, param=query_params, workspace="default")

    # Access results
    if isinstance(result, str):
        print(result)  # Cache hit
    else:
        print(result["doc_names"])  # List of relevant document names
        print(result["response"])  # Generated response based on the documents


if __name__ == "__main__":
    asyncio.run(main())
See the example notebook for more detailed usage examples and information on more advanced features, such as tagging: ➡️ Example.ipynb
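The usage example handles two result shapes: a plain string on a cache hit and a dictionary with "doc_names" and "response" otherwise. A minimal sketch of folding both into one shape is shown below. The helper normalize_result is hypothetical and not part of the plugin, and it assumes only the two shapes shown in the example:

```python
from typing import Any, Dict


def normalize_result(result: Any) -> Dict[str, Any]:
    """Return a dict with `response` and `doc_names` for either result shape.

    Hypothetical helper: assumes a str (cache hit) or a dict with a
    "response" key and an optional "doc_names" key, as in the usage example.
    """
    if isinstance(result, str):
        # Cache hits carry only the answer text, so no document names are available.
        return {"response": result, "doc_names": []}
    return {
        "response": result["response"],
        "doc_names": list(result.get("doc_names", [])),
    }


# Both shapes end up with the same keys.
print(normalize_result("cached answer"))
print(normalize_result({"response": "fresh answer", "doc_names": ["doc1"]}))
```

Downstream code can then read the "response" and "doc_names" keys without branching on the result type.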
- Document Insertion & Updates: Support for both inserting new documents and updating existing ones
- Workspace Management: Organize documents in different workspaces
- Tag System: Add tags to documents for organized insertion and querying
- Configurable Query Parameters: Fine-tune search results with customizable parameters
- Detailed Logging: Optional DEBUG-level logging for troubleshooting
- Postgres Backend: Robust storage and retrieval using PostgreSQL
- File Format Support: Works with various document formats through the FileReader component
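Since the Postgres backend and the RAG settings are driven by config.json, it can help to sanity-check that file before initializing the plugin. The sketch below is an illustration under stated assumptions, not part of LightRAG-symai: check_config is a hypothetical helper, the required keys are taken from the example configuration in this README, and the value ranges (a positive embedding_dim, a threshold between 0 and 1) are guesses at sensible values, not documented constraints.

```python
from typing import List

# Keys taken from the example config.json in this README (assumed to be required).
REQUIRED_POSTGRES_KEYS = ("host", "port", "user", "database")


def check_config(config: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []

    # The selected backend should have a settings block of the same name.
    backend = config.get("backend")
    if backend not in config:
        problems.append(f"no settings block for backend {backend!r}")
    elif backend == "postgres":
        for key in REQUIRED_POSTGRES_KEYS:
            if key not in config["postgres"]:
                problems.append(f"postgres.{key} is missing")

    settings = config.get("rag_settings", {})
    dim = settings.get("embedding_dim")
    if not isinstance(dim, int) or dim <= 0:
        problems.append("rag_settings.embedding_dim should be a positive integer")

    # Assumed range for a similarity threshold; not a documented constraint.
    threshold = settings.get("embedding_cache_sim_threshold", 1.0)
    if not 0.0 <= threshold <= 1.0:
        problems.append("rag_settings.embedding_cache_sim_threshold should be in [0, 1]")

    return problems


example = {
    "backend": "postgres",
    "postgres": {"host": "localhost", "port": 5432, "user": "lightrag", "database": "rag"},
    "rag_settings": {"embedding_dim": 1536, "embedding_cache_sim_threshold": 0.90},
}
print(check_config(example))  # prints []
```

In practice the dictionary would come from json.load on the config.json file. Returning a list of problems, not raising on the first one, lets all issues be reported at once.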