Skip to content

Rust-based parsers and toolings used by django-components

License

Notifications You must be signed in to change notification settings

django-components/djc-core

Repository files navigation

djc-core

PyPI - Version PyPI - Python Version PyPI - License PyPI - Downloads GitHub Actions Workflow Status

Rust-based parsers and toolings used by django-components. Exposed as a Python package with maturin.

Installation

pip install djc-core

Packages

HTML transfomer

Transform HTML in a single pass. This is a simple implementation.

This implementation was found to be 40-50x faster than our Python implementation, taking ~90ms to parse 5 MB of HTML.

Usage

from djc_core import set_html_attributes

html = '<div><p>Hello</p></div>'
result, _ = set_html_attributes(
  html,
  # Add attributes to the root elements
  root_attributes=['data-root-id'],
  # Add attributes to all elements
  all_attributes=['data-v-123'],
)

To save ourselves from re-parsing the HTML, set_html_attributes returns not just the transformed HTML, but also a dictionary as the second item.

This dictionary contains a record of which HTML attributes were written to which elemenents.

To populate this dictionary, you need set watch_on_attribute to an attribute name.

Then, during the HTML transformation, we check each element for this attribute. And if the element HAS this attribute, we:

  1. Get the value of said attribute
  2. Record the attributes that were added to the element, using the value of the watched attribute as the key.
from djc_core import set_html_attributes

html = """
  <div data-watch-id="123">
    <p data-watch-id="456">
      Hello
    </p>
  </div>
"""

result, captured = set_html_attributes(
  html,
  # Add attributes to the root elements
  root_attributes=['data-root-id'],
  # Add attributes to all elements
  all_attributes=['data-djc-tag'],
  # Watch for this attribute on elements
  watch_on_attribute='data-watch-id',
)

print(captured)
# {
#   '123': ['data-root-id', 'data-djc-tag'],
#   '456': ['data-djc-tag'],
# }

Architecture

This project uses a multi-crate Rust workspace structure to maintain clean separation of concerns:

Crate structure

  • djc-html-transformer: Pure Rust library for HTML transformation
  • djc-template-parser: Pure Rust library for Django template parsing
  • djc-core: Python bindings that combines all other libraries

Design philosophy

To make sense of the code, the Python API and Rust logic are defined separately:

  1. Each crate (AKA Rust package) has lib.rs (which is like Python's __init__.py). These files do not define the main logic, but only the public API of the crate. So the API that's to be used by other crates.
  2. The djc-core crate imports other crates
  3. And it is only this djc-core where we define the Python API using PyO3.

Development

  1. Setup python env

    python -m venv .venv
  2. Install dependencies

    pip install -r requirements-dev.txt

    The dev requirements also include maturin which is used packaging a Rust project as Python package.

  3. Install Rust

    See https://www.rust-lang.org/tools/install

  4. Run Rust tests

    cargo test
  5. Build the Python package

    maturin develop

    To build the production-optimized package, use maturin develop --release.

  6. Run Python tests

    pytest

    NOTE: When running Python tests, you need to run maturin develop first.

Deployment

Deployment is done automatically via GitHub Actions.

To publish a new version of the package, you need to:

  1. Bump the version in pyproject.toml and Cargo.toml
  2. Open a PR and merge it to main.
  3. Create a new tag on the main branch with the new version number (e.g. 1.0.0), or create a new release in the GitHub UI.

About

Rust-based parsers and toolings used by django-components

Resources

License

Code of conduct

Stars

Watchers

Forks

Sponsor this project

  •  
  •  

Packages

No packages published

Contributors 2

  •  
  •