Open
Feature
Labels
- core: Related to the package `langchain-core`
- feature request: request for an enhancement / additional functionality
Description
Checked other resources
- This is a feature request, not a bug report or usage question.
- I added a clear and descriptive title that summarizes the feature request.
- I used the GitHub search to find a similar feature request and didn't find it.
- I checked the LangChain documentation and API reference to see if this feature already exists.
- This is not related to the langchain-community package.
Feature Description
Hi LangChain team,
While reviewing the code in `libs/core/langchain_core/output_parsers/xml.py`, I noticed that it currently uses Python’s built-in `xml` module for parsing.
The standard XML parser is known to be quite slow, especially when handling larger documents. I’d like to suggest evaluating a new library I’ve developed called pygixml, which could dramatically improve performance in this area.
Why pygixml
- Built on top of pugixml (C++) and Cython, designed for performance and clean Python integration.
- 16× to 33× faster than Python’s built-in `xml` parser (and around 5× faster than lxml in benchmarks).
- Offers a highly intuitive, Pythonic API with full XPath 1.0 support.
- Each node has additional utilities like `mem_id`, `xpath`, and recursive text access.
I’m the maintainer and author of pygixml, and I’d be happy to either:
- Submit a pull request integrating `pygixml` into LangChain’s XML output parser, or
- Help evaluate the potential performance benefits before merging.
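
As a starting point for that evaluation, here is a minimal sketch of a timing harness for the standard-library baseline. The synthetic document and the helper names `make_document` and `time_parser` are illustrative assumptions, not code from LangChain or pygixml. Only `xml.etree.ElementTree` is measured here.

```python
import timeit
import xml.etree.ElementTree as ET
from typing import Callable


def make_document(n_items: int) -> str:
    """Build a synthetic XML document with n_items child elements (hypothetical test data)."""
    items = "".join(
        f"<item id='{i}'><name>item {i}</name></item>" for i in range(n_items)
    )
    return f"<root>{items}</root>"


def time_parser(parse: Callable[[str], object], document: str, repeat: int = 5) -> float:
    """Return the best wall-clock time, in seconds, for a single parse of the document."""
    return min(timeit.repeat(lambda: parse(document), number=1, repeat=repeat))


if __name__ == "__main__":
    doc = make_document(10_000)
    baseline = time_parser(ET.fromstring, doc)
    print(f"xml.etree.ElementTree: {baseline:.4f} s")
    # A candidate parser can be timed the same way by passing its
    # string-parsing function in place of ET.fromstring.
```

Taking the best of several runs reduces noise from other processes. A realistic comparison would also use documents shaped like actual LLM output rather than this synthetic one.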
I would love to know if you’re open to a PR. I’m confident this change could significantly improve XML parsing speed in LangChain.
Thanks for the great work you do,
Mohammad Raziei
Author of pygixml
Use Case
Improve the speed of XML parsing.
Proposed Solution
Use pygixml in the XML output parser.
Alternatives Considered
No response
Additional Context
No response