MRT is a Python-based framework for machine learning model quantization and compilation. It is built on top of CVMRuntime and is designed to convert pre-trained models into CVM-compatible formats, including fixed-point representations and zero-knowledge (ZK) circuits for verifiable inference.
The core of the framework is the Trace object, which represents the computational graph of a model and provides a high-level API for applying various transformations, such as quantization, calibration, and operator fusion.
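The role of the Trace object can be pictured with a minimal toy sketch. The class and method names below are illustrative only, not MRT's actual API: they show the general pattern of holding a graph and applying whole-graph passes to it.

```python
# Toy sketch of a Trace-like graph wrapper; names are illustrative,
# NOT MRT's real API.
class Node:
    def __init__(self, op, inputs):
        self.op = op          # operator name, e.g. "conv2d"
        self.inputs = inputs  # upstream Node objects

class Trace:
    """Holds a computational graph and applies whole-graph passes."""
    def __init__(self, outputs):
        self.outputs = outputs

    def nodes(self):
        # Post-order walk: inputs before the nodes that consume them.
        seen, order = set(), []
        def visit(n):
            if id(n) in seen:
                return
            seen.add(id(n))
            for i in n.inputs:
                visit(i)
            order.append(n)
        for out in self.outputs:
            visit(out)
        return order

    def transform(self, fn):
        # Apply a pass (e.g. quantization, fusion) to every node.
        for n in self.nodes():
            fn(n)
        return self

x = Node("input", [])
w = Node("weight", [])
y = Node("conv2d", [x, w])
trace = Trace([y])
ops = [n.op for n in trace.nodes()]   # ["input", "weight", "conv2d"]
```

Transformations such as quantization then become functions mapped over the graph in dependency order, which is why a single Trace object can drive the whole pipeline.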
- Model Ingestion: Imports models from PyTorch or TVM Relay.
- Quantization: Supports post-training quantization with calibration.
- Export: Exports models to various formats, including:
- Simulated quantized formats for accuracy evaluation.
- Fixed-point format for blockchain runtime deployment.
- Zero-Knowledge Integration: Generates Circom circuits from quantized models for use in ZK-SNARK-based verifiable inference.
- Frontend: The `mrt.frontend` module handles the import of models from other frameworks (e.g., PyTorch) into the MRT representation.
- MIR (Model Intermediate Representation): The `mrt.mir` module defines the core data structures for representing the model's computational graph.
- Quantization: The `mrt.quantization` module contains the logic for model quantization, including calibration, scaling, and precision revision.
- Runtime: The `mrt.runtime` module provides tools for model evaluation and analysis.
- ZKML: The `mrt.frontend.zkml` and `mrt.trace_to_circom` modules handle the conversion of models into Circom circuits.
- Trace: The central abstraction in MRT. A `Trace` object represents the model's computational graph and provides methods for applying transformations.
- Quantization: The process of converting a model's weights and activations to a lower-precision format (e.g., 8-bit integers). This is essential for deploying models on resource-constrained hardware and for use in ZK-SNARKs.
- Calibration: The process of determining the appropriate scaling factors for quantization. This is typically done by running the model on a small, representative dataset.
- Circom: A domain-specific language for writing arithmetic circuits for ZK-SNARKs. MRT can automatically generate Circom code from a quantized model, enabling verifiable inference.
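The three concepts above compose: calibration picks a scale from sample data, quantization maps floats onto an integer grid, and the resulting integer-only arithmetic (with power-of-two rescaling) is the kind of computation a Circom circuit can express. The sketch below uses a standard symmetric int8 scheme; MRT's actual scaling and rounding rules may differ.

```python
def calibrate_scale(samples, bits=8):
    # Calibration: derive the scale from the largest magnitude
    # observed on a small representative dataset.
    qmax = 2 ** (bits - 1) - 1           # 127 for int8
    return max(abs(v) for v in samples) / qmax

def quantize(x, scale, bits=8):
    # Quantization: map a float onto the signed integer grid,
    # saturating at the representable range.
    qmax = 2 ** (bits - 1) - 1
    return max(-qmax, min(qmax, round(x / scale)))

def int_dot_requant(xs, ws, shift):
    # Integer-only inference step, as it would appear in a circuit:
    # wide integer accumulate, then rescale by an arithmetic right
    # shift (power-of-two scaling assumed).
    acc = sum(x * w for x, w in zip(xs, ws))
    return acc >> shift

scale = calibrate_scale([-0.8, 0.3, 1.27, -1.0])    # 1.27 / 127 = 0.01
assert quantize(0.5, scale) == 50
assert quantize(2.0, scale) == 127                   # saturates at int8 max
assert int_dot_requant([100, -30, 70], [20, 50, 40], 5) == 103  # 3300 >> 5
```

Keeping every step in integers is what makes the computation provable: a ZK circuit constrains field elements, so no floating-point operation may survive quantization.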
To set up the Python environment, run the following command from the project root:
```shell
source env.sh
```

This script adds the `python` directory to your `PYTHONPATH`, allowing you to import the `mrt` package.
Run individual tests:
```shell
# Frontend tests
python tests/frontend/test_frontend_loading.py
python tests/frontend/pytorch/test.pytorch.py

# Classification model tests
python tests/classification/test.resnet.py
python tests/classification/test.mnist.py

# TVM/Relax tests
python tests/test.relax.py

# Template for new tests
python tests/test.template.py

# PyTest
pytest tests/frontend/pytorch/test_pytorch.py::test_conv_model -v
```

All test files should be located in the `tests/` directory, with subdirectories for different categories (frontend, classification, detection, nlp).
Git commit messages should be simple and start with the core feature name, for example:
[python]: add torch module
[tests]: fix test_frontend_loading
- The project follows standard Python coding conventions (PEP 8).
- Print essential info and exit immediately when any unsupported operation or corner case is encountered.
- Tests are located in the `tests` directory, with subdirectories for different model types and framework components.
- The tests demonstrate the intended usage of the framework and provide a good starting point for understanding its capabilities.
- The `tests/test.trace.py` file is particularly important, as it shows the end-to-end workflow from model import to Circom generation.