Conversation

@gkumbhat (Contributor) commented Jul 25, 2025

Changes

  • Add a test verifying that compilation time is as expected for a given model and sequence length.
  • It currently only covers dynamic shapes and paged attention.

@gkumbhat gkumbhat force-pushed the add_compilation_time_test branch from 59888d0 to 8b67444 on July 29, 2025 02:50
@gkumbhat gkumbhat marked this pull request as ready for review July 29, 2025 02:51
gkumbhat added 2 commits July 30, 2025 10:33
Signed-off-by: gkumbhat <kumbhat.gaurav@gmail.com>
Signed-off-by: gkumbhat <kumbhat.gaurav@gmail.com>
@gkumbhat gkumbhat force-pushed the add_compilation_time_test branch from 8b67444 to e7c18f8 on July 30, 2025 15:37
dprint(f"PT compile complete, took {pt_compile_model_time:.3f}s")


def get_env_to_int_list(env_var_name, default):
@JRosenkranz (Contributor) commented Jul 30, 2025:

Can you add type hints to this, as well as a description of each parameter? Should `default`'s default here be `None`?

@gkumbhat (Author) replied:

Yep, I'll add hints and descriptions.

I thought of making this a required param, but `None` would work as well, since that's the default for `os.environ.get`.

dprint(f"PT compile complete, took {pt_compile_model_time:.3f}s")


def get_env_to_int_list(env_var_name, default):
@JRosenkranz (Contributor) commented Jul 30, 2025:

I could see this function serving much more utility than just ints. We may want a third optional parameter: a function that takes the raw value and returns what the user wants (by default it would just produce a list of strings). For instance, we may want a list of floats or of custom objects. If we don't want that much control, the third parameter could instead just specify the type to return.

@gkumbhat (Author) replied:

True, a function as the third parameter would require more careful error handling. But we can easily add a third parameter with a specific return type; I'll add that.

"SHARE_GPT_DATASET_PATH", os.path.expanduser("~/share_gpt.json")
)

ATTN_NAME = "spyre_paged_attn"
Contributor commented:

We should make this configurable.

@gkumbhat (Author) replied:

Yep, I'll add that.
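One way this could look, with a hypothetical environment variable name as the override hook:

```python
import os

# hypothetical override hook: let the attention implementation be chosen
# from the environment, keeping the current value as the default
ATTN_NAME = os.environ.get("FMS_TEST_ATTN_NAME", "spyre_paged_attn")
```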

Collaborator commented:

I assume when we make it configurable we'll want to reuse the ATTN_TYPE map for consistency, so should we move some of these constants outside?

from aiu_fms_testing_utils.utils.aiu_setup import dprint

GRANITE_3p3_8B_INSTRUCT = "ibm-granite/granite-3.3-8b-instruct"
SHARE_GPT_DATASET_PATH = os.environ.get(
Contributor commented:

Given this is just testing compilation time, I don't think we need real data.

@gkumbhat (Author) replied:

True, I thought of keeping it similar to what we do in other places for consistency.

We could do random text generation as well.


ATTN_NAME = "spyre_paged_attn"

COMPILE_DYNAMIC_SHAPE = True
Contributor commented:

We should make this configurable.

"COMMON_COMPILATION_EXPECTED_TIME", [10]
) # In minutes

COMMON_SHAPE_TYPE = "dynamic"
Contributor commented:

We should make this configurable.


# the compiler supports certain max context lengths (VLLM_DT_MAX_CONTEXT_LEN)
# this will ensure that we select smallest supported VLLM_DT_MAX_CONTEXT_LEN that fits the largest possible context (prompt size + max_new_tokens)
__largest_context = max(common_seq_lengths) + max(common_max_new_tokens)
Contributor commented:

For this test, given we are testing timing, we may want the largest context to be computed on a per-test basis (rather than spanning all tests). That way we can know the performance implications of changing these values.
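A sketch of that per-test variant, where the bound comes from the current case's parameters instead of the global maxima (the supported-length list is illustrative, not the compiler's actual table):

```python
# illustrative set of context lengths the compiler might support
SUPPORTED_CONTEXT_LENS = [256, 512, 1024, 2048, 4096, 8192]


def select_max_context_len(seq_length: int, max_new_tokens: int) -> int:
    """Smallest supported VLLM_DT_MAX_CONTEXT_LEN that fits this single
    test case (prompt size + max_new_tokens), computed per case rather
    than from max() across all parametrized tests."""
    needed = seq_length + max_new_tokens
    return min(c for c in SUPPORTED_CONTEXT_LENS if c >= needed)
```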



# TODO: This is copied from test_decoders.py would be good to consolidate
def __prepare_inputs(batch_size, seq_length, tokenizer, seed=0):
Contributor commented:

We can make this simpler and just use torch.arange with the required sizes.
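A sketch of that simplification, assuming the test only needs well-shaped token ids (the vocab bound is an arbitrary placeholder):

```python
import torch


def prepare_inputs(batch_size: int, seq_length: int, vocab_size: int = 32000):
    # deterministic ids from arange, wrapped into [0, vocab_size) so they
    # are valid token ids for any tokenizer; compilation time should not
    # depend on the actual token content
    ids = torch.arange(batch_size * seq_length) % vocab_size
    return ids.reshape(batch_size, seq_length)
```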

torch.set_default_dtype(torch.float16)
os.environ["COMPILATION_MODE"] = "offline_decoder"

dprint(
Contributor commented:

Include the attention type in this print.

COMPILE_DYNAMIC_SHAPE = False

model.compile(backend="sendnn", options={"sendnn.dynamic": COMPILE_DYNAMIC_SHAPE})
warmup_model(
@JRosenkranz (Contributor) commented Jul 30, 2025:

This only includes the initial compile, not the device warmup. Do we want to include that as well? inference.py has an example.

@gkumbhat (Author) replied:

I'll add it. Thanks for the pointers.

Signed-off-by: gkumbhat <kumbhat.gaurav@gmail.com>
@@ -0,0 +1,150 @@
"""This module contains test related to compilation operation"""
Collaborator commented:

The testing dir right now contains tests of tests, IIUC; this should live in models, and we can move it to integration_tests or something similar when we restructure?

"SHARE_GPT_DATASET_PATH", os.path.expanduser("~/share_gpt.json")
)

ATTN_NAME = "spyre_paged_attn"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume when we make it configurable we'll want to reuse the ATTN_TYPE map for consistency so should we move some of these constants outside?

Signed-off-by: gkumbhat <kumbhat.gaurav@gmail.com>