Thank you for your interest in contributing to Optimum ExecuTorch!

## Developing Optimum ExecuTorch

### Setting up the development environment
To install Optimum ExecuTorch for development:
```
python install_dev.py
```

### Testing local changes
Optimum ExecuTorch does not support an editable install at the moment, so to test your local changes you will need to reinstall the package.
To prevent the reinstall from overwriting other dependencies, some of which you may have modified, you can run the following ahead of your test:
```
pip install --no-deps --no-build-isolation .
```

An example command for testing local changes to Gemma3:
```
pip install --no-deps --no-build-isolation .
RUN_SLOW=1 python -m pytest tests/models/test_modeling_gemma3.py -s -k test_gemma3_image_vision_with_custom_sdpa_kv_cache_8da4w_8we --log-cli-level=INFO
```

To run tests marked with `@slow`, just set `RUN_SLOW=1`.
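
For reference, a minimal sketch of how a test can be gated behind `RUN_SLOW=1`, assuming the `slow` decorator from `transformers.testing_utils`; the existing files under `tests/models/` are the authoritative reference:
```
import unittest

from transformers.testing_utils import slow


class ExampleTest(unittest.TestCase):
    @slow  # skipped unless RUN_SLOW=1 is set in the environment
    def test_something_expensive(self):
        ...
```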

## Enabling a new model on Optimum

Our design philosophy is to have as little model-specific code as possible, which means all optimizations, export code, etc. are model-agnostic.
This allows us, in theory, to export any new model straight from the source, with a few caveats which will be explained later.
For example, most Large Language Models should be exportable with this library.

### 💡 How to "enable" a model on Optimum
❓ Currently, the [homepage README](README.md?tab=readme-ov-file#-supported-models) lists all of the "supported" models. What does this mean, and what about models not on this list?

👉 Each supported model has a test file associated with it, such as [Gemma3](https://github.com/huggingface/optimum-executorch/blob/main/tests/models/test_modeling_gemma3.py), which has been used to validate the model end-to-end (export, then run a generation loop on the exported artifact).
The test file is then used in CI to guard against potential regressions.
Once you have a PR up that adds the test to the repo, feel free to edit the homepage README to include the new model.
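
To make this concrete, here is a rough sketch of the kind of end-to-end check such a test performs: export the model, then run generation on the exported artifact. The class and method names (`ExecuTorchModelForCausalLM`, `text_generation`) follow the project README; treat this as an illustration rather than a copy of the real test file:
```
from transformers import AutoTokenizer

from optimum.executorch import ExecuTorchModelForCausalLM

model_id = "google/gemma-3-1b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Export the model with the xnnpack recipe and load the resulting artifact.
model = ExecuTorchModelForCausalLM.from_pretrained(model_id, recipe="xnnpack")

# Run a generation loop on the exported artifact and sanity-check the output.
generated = model.text_generation(
    tokenizer=tokenizer,
    prompt="Simply put, the theory of relativity states that",
    max_seq_len=64,
)
assert isinstance(generated, str) and len(generated) > 0
```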

As an example, the Gemma3 test file validates that the model exports and returns correct output for a test prompt across several export configurations. Other users can now trust that Gemma3 works and export the model like so:
```
optimum-cli export executorch \
  --model google/gemma-3-1b-it \
  --task text-generation \
  --recipe xnnpack \
  --use_custom_sdpa \
  --use_custom_kv_cache \
  --qlinear 8da4w \
  --qembedding 8w
```
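
Once the CLI export finishes, the exported artifact can be loaded back for inference. A short sketch following the README's loading API, assuming the command above was run with the standard `--output_dir` flag pointing at `./gemma3_xnnpack` (the directory name is a placeholder):
```
from transformers import AutoTokenizer

from optimum.executorch import ExecuTorchModelForCausalLM

# Load the previously exported artifact from the local output directory.
model = ExecuTorchModelForCausalLM.from_pretrained("./gemma3_xnnpack")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-3-1b-it")
print(model.text_generation(tokenizer=tokenizer, prompt="Hello!", max_seq_len=32))
```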

However, there are many models without test files in Optimum that probably still work - it's just that no one has gone through the trouble of validating them.
This is where you come in - feel free to contribute if there is a model you are interested in that does not yet have a test file!

If you run into any issues, they will most likely stem from one of the following:
- ❓ How much model-specific code is in Transformers for this model?
- ❓ Do we already have the model type supported in Optimum?
- ❓ Is the model itself torch.exportable?

### ❌ Model-specific code is in Transformers
To address this issue, we will need to upstream changes to the Transformers library, or update our code to match.
For instance, if Transformers hypothetically introduced a new type of cache and this cache is used in a new LLM, we would need to handle this new cache type in Optimum.
Or, if we are expecting a certain attribute in a Transformers model and it exists instead under a slightly different name, this may be an opportunity to upstream some naming standardization changes to Transformers.
[Here](https://github.com/huggingface/transformers/pull/40919) is an example of one such standardization.

### ❌ Model type is not supported in Optimum
All of the supported model types are in [integrations.py](https://github.com/huggingface/optimum-executorch/blob/main/optimum/exporters/executorch/integrations.py), which contains wrapper classes that facilitate torch.exporting a model:
- `CausalLMExportableModule` - LLMs (Large Language Models)
- `MultiModalTextToTextExportableModule` - Multimodal LLMs (Large Language Models with support for audio/image input)
- `VisionEncoderExportableModule` - Vision Encoder backbones (such as DiT or MobileViT)
- `MaskedLMExportableModule` - Masked language models (for predicting masked tokens)
- `Seq2SeqLMExportableModule` - General Seq2Seq encoder-decoder models (such as T5 and Whisper)

This is where most of the complexity around "enabling" a model on Optimum arises, since after torch.export() every model follows the same per-backend flow for transforming the torch.export() artifact into an ExecuTorch `.pte` artifact.
If the model type doesn't exist in Optimum, we will need to write a new wrapper class for it - a rough sketch of the general shape follows below.
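
For orientation, here is a hypothetical sketch of the shape such a wrapper tends to take: a `torch.nn.Module` that narrows the Transformers model down to an export-friendly `forward` and produces a `torch.export` artifact. The class name, method names, and signatures below are illustrative assumptions, not the actual Optimum API; the existing classes in `integrations.py` are the source of truth:
```
import torch


class MyNewTaskExportableModule(torch.nn.Module):
    """Hypothetical wrapper that makes a Transformers model torch.exportable."""

    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, input_ids, attention_mask):
        # Restrict the traced graph to the tensors the runtime will provide
        # and to a tensor-only output.
        return self.model(input_ids=input_ids, attention_mask=attention_mask).logits

    def export(self, example_input_ids, example_attention_mask):
        # Capture an ExportedProgram that the backend recipes can then lower
        # into an ExecuTorch .pte artifact.
        return torch.export.export(
            self,
            args=(example_input_ids, example_attention_mask),
        )
```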

### ❌ Model is not torch.exportable
To address this issue, we will need to upstream changes to the model's modeling file in Transformers to make the model exportable.
After doing this, it's a good idea to add a torch.export test to guard against future regressions (which tend to happen frequently, since Transformers moves fast).
[Here](https://github.com/huggingface/transformers/blob/87f38dbfcec48027d4bf2ea7ec8b8eecd5a7bc85/tests/models/smollm3/test_modeling_smollm3.py#L175) is an example.
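
As a rough illustration (not a copy of the linked test, which may wrap the model with cache-specific helpers), an exportability smoke test boils down to calling `torch.export.export` on the model with example inputs and asserting it succeeds. The model id and inputs below are placeholders:
```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM3-3B"  # placeholder: use the model being enabled
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).eval()

inputs = tokenizer("Optimum ExecuTorch is", return_tensors="pt")
with torch.no_grad():
    exported = torch.export.export(
        model,
        args=(),
        kwargs={
            "input_ids": inputs["input_ids"],
            "attention_mask": inputs["attention_mask"],
        },
        strict=False,  # non-strict tracing is usually more forgiving for HF models
    )

# If export succeeds, the resulting ExportedProgram can be inspected or re-run.
print(type(exported))
```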