WhisperTT evals #899
base: main
Conversation
- Added `openslr_librispeech_other.yaml` and `openslr_librispeech.yaml` configuration files for task definitions.
- Implemented utility functions in `utils.py` for processing audio and text documents.
- Created `basic.py`, `english.py`, and `english.json` for English text normalization, including handling of spelling variations and number normalization.
- Enhanced the whisper normalizer with new functionalities for both Chinese and English text processing.
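As a rough illustration of what basic English normalization involves (a hypothetical minimal sketch, not the vendored whisper normalizer; the `SPELLING_VARIANTS` table stands in for what `english.json` provides):

```python
import re

# Minimal stand-in for the spelling-variant table loaded from english.json.
SPELLING_VARIANTS = {"colour": "color", "theatre": "theater"}


def basic_normalize(text: str) -> str:
    text = text.lower()
    text = re.sub(r"[^\w\s']", " ", text)                       # drop punctuation
    words = [SPELLING_VARIANTS.get(w, w) for w in text.split()]
    return " ".join(words)                                       # collapse whitespace


print(basic_normalize("The Theatre, in COLOUR!"))                # -> "the theater in color"
```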
- Changed dataset_path to 'parquet' and updated dataset_kwargs to include a specific data file URL.
- Modified test_split from 'test' to 'train' and set dataset_name to 'null' for task definition adjustments.
- Enhanced the `import_function` to first attempt a relative file import and fall back to an absolute module import if the relative path does not exist.
- Improved error handling by re-raising import errors with context for better debugging.
- Removed unused `openslr_librispeech/_default_yaml_template` and related whisper normalizer files to streamline the codebase.
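A plausible shape for that fallback (the signature, parameter names, and error message below are assumptions for illustration, not the PR's exact code):

```python
import importlib
import importlib.util
import os


def import_function(yaml_dir: str, function_spec: str):
    """Resolve a "module.function" spec, preferring a .py file next to the
    task YAML and falling back to an absolute import of an installed module."""
    module_name, function_name = function_spec.rsplit(".", 1)
    relative_path = os.path.join(yaml_dir, f"{module_name}.py")
    try:
        if os.path.exists(relative_path):
            spec = importlib.util.spec_from_file_location(module_name, relative_path)
            module = importlib.util.module_from_spec(spec)
            spec.loader.exec_module(module)
        else:
            module = importlib.import_module(module_name)
    except ImportError as err:
        # Re-raise with context so the failing task file is obvious.
        raise ImportError(f"Could not import '{function_spec}' (tried {relative_path})") from err
    return getattr(module, function_name)
```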
…ield names
- Updated the `librispeech_process_result` function to handle both "gt" and "transcript" as valid keys for ground truth in documents, improving compatibility with different LibriSpeech datasets.
- Added error handling to raise a KeyError if neither field is found, providing clearer feedback on document structure.
…fields
- Updated the `librispeech_process_result` function to safely retrieve the "source" field, defaulting to "unknown" if not present.
- Added logic to infer the "task" field from context, defaulting to "asr_en" for LibriSpeech datasets when not explicitly provided.
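Taken together, the field handling from these two commits looks roughly like this (a sketch only; the returned metric structure is an assumption, the fallback logic is the point):

```python
def _librispeech_ground_truth(doc):
    # Accept either ground-truth field name used across LibriSpeech variants.
    for key in ("gt", "transcript"):
        if key in doc:
            return doc[key]
    raise KeyError(f"No ground-truth field ('gt' or 'transcript') found; keys: {list(doc)}")


def librispeech_process_result(doc, result):
    gt = _librispeech_ground_truth(doc)
    source = doc.get("source", "unknown")   # default when the field is absent
    task = doc.get("task", "asr_en")        # LibriSpeech default when not provided
    return {"wer": {"gt": gt, "pred": result[0], "source": source, "task": task}}
```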
…eld names
- Updated the `librispeech_doc_to_audio` function to check for various field names ("audio", "file", "path", "audio_path") in the document, improving compatibility with different LibriSpeech datasets (see the sketch after this list).
- Added error handling to raise a KeyError if no valid audio field is found, providing clearer feedback on document structure.
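A minimal sketch of that lookup, assuming the document is a plain dict (how the HF Audio feature is unwrapped afterwards may differ):

```python
AUDIO_FIELDS = ("audio", "file", "path", "audio_path")


def librispeech_doc_to_audio(doc):
    # Try each known audio field name in turn for cross-dataset compatibility.
    for field in AUDIO_FIELDS:
        if field in doc and doc[field] is not None:
            return doc[field]
    raise KeyError(f"No audio field found; expected one of {AUDIO_FIELDS}, got keys: {list(doc)}")
```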
…ionality
- Simplified the `librispeech_doc_to_audio` function to directly return the "audio" field, removing unnecessary checks.
- Streamlined the `librispeech_process_result` function to directly access "gt", "source", and "task" fields without additional error handling, assuming their presence.
- Added a new `librispeech_doc_to_target` function to return the ground truth from the document, enhancing modularity.
…dularity
- Updated the `openasr_doc_to_audio` function to handle multiple audio field names ("audio", "file", "path", "audio_path"), enhancing compatibility with various datasets.
- Introduced a new `openasr_doc_to_target` function to normalize the retrieval of ground truth fields ("text", "transcript", "gt"), improving modularity and error handling in the `openasr_process_result` function (see the sketch below).
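A hypothetical sketch of that normalization and its use (the metric dictionary returned here is an assumption):

```python
def openasr_doc_to_target(doc):
    # Normalize ground-truth retrieval across OpenASR dataset variants.
    for field in ("text", "transcript", "gt"):
        if field in doc:
            return doc[field]
    raise KeyError(f"No ground-truth field ('text', 'transcript', 'gt'); keys: {list(doc)}")


def openasr_process_result(doc, result):
    gt = openasr_doc_to_target(doc)
    return {"wer": {"gt": gt, "pred": result[0]}}
```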
…sper model
- Updated the `warmup_model` function to create a mesh device instead of a single device, enabling compatibility with the mesh-enabled Whisper model.
- Added logging to indicate the creation and successful warming up of the Whisper model.
- Updated the `warmup_model` function to streamline the creation of the mesh device by removing unnecessary parameters, enhancing code clarity and maintainability.
- Refactored the WhisperTT class to utilize HTTP calls to the tt-media-server for audio transcription, allowing evaluations to run outside of Docker.
- Added methods for encoding audio to base64 and transcribing audio via the API.
- Updated model initialization to include parameters for base URL, timeout, and retries, enhancing flexibility and error handling.
- Added a new parameter `num_concurrent` to the WhisperTT class for improved concurrency handling.
- Updated the initialization method to log a warning for any unexpected keyword arguments instead of raising an assertion error, enhancing robustness and user feedback.
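Put together, the client described in these two commits could look roughly like the sketch below. The endpoint path, payload/response fields, and default values are assumptions for illustration; the real tt-media-server API may differ.

```python
import base64
import logging

import requests

logger = logging.getLogger(__name__)


class WhisperTT:
    def __init__(self, base_url="http://localhost:8000", timeout=300, retries=3, num_concurrent=1, **kwargs):
        self.base_url = base_url
        self.timeout = timeout
        self.retries = retries
        self.num_concurrent = num_concurrent
        if kwargs:
            # Warn rather than assert, so unexpected kwargs do not abort the run.
            logger.warning("WhisperTT received unexpected kwargs: %s", list(kwargs))

    @staticmethod
    def _encode_audio(wav_bytes: bytes) -> str:
        return base64.b64encode(wav_bytes).decode("utf-8")

    def transcribe(self, wav_bytes: bytes) -> str:
        payload = {"audio": self._encode_audio(wav_bytes)}  # field name is an assumption
        for attempt in range(self.retries):
            try:
                resp = requests.post(f"{self.base_url}/transcribe", json=payload, timeout=self.timeout)
                resp.raise_for_status()
                return resp.json().get("text", "")
            except requests.RequestException as err:
                logger.warning("Transcription attempt %d failed: %s", attempt + 1, err)
        raise RuntimeError("All transcription attempts failed")
```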
…TT class
- Updated the audio array conversion to float32 to prevent "Unsupported bit depth: 64" errors when creating WAV files, ensuring compatibility with server requirements.
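For reference, the fix amounts to casting before WAV encoding. A minimal sketch using soundfile (the in-memory buffering and function name are illustrative; the PR's exact I/O path may differ):

```python
import io

import numpy as np
import soundfile as sf


def audio_to_wav_bytes(audio_array, sample_rate: int) -> bytes:
    # Cast float64 HF audio arrays to float32 so the WAV writer / server
    # does not reject them with "Unsupported bit depth: 64".
    audio_array = np.asarray(audio_array, dtype=np.float32)
    buffer = io.BytesIO()
    sf.write(buffer, audio_array, sample_rate, format="WAV", subtype="FLOAT")
    return buffer.getvalue()
```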
…-eval into ben/samt/whisper-tt
I checked the OpenASR HF dataset, and the audio and answer keys are just `audio` and `text`. Is it necessary to add a fallback for this?
Hi, thanks for the contribution. I checked through your files, and it seems like you are using a very old version of lmms-eval. Is it possible to check out the main branch and make the edits on the newest main? Thanks!
dataset_path: parquet
dataset_kwargs:
  data_files:
    test: "https://huggingface.co/datasets/openslr/librispeech_asr/resolve/71cacbfb7e2354c4226d01e70d77d5fca3d04ba1/other/test/0000.parquet"
I think this can be configured as a regular dataset path, e.g. `load_dataset("openslr/librispeech_asr", "other", split="test")`, so it can be expressed much more cleanly in the YAML file.
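For illustration, that suggestion would reduce the config to something like the following (key names taken from the existing task YAMLs; whether `dataset_name` maps to the HF config name here is an assumption):

```yaml
dataset_path: openslr/librispeech_asr
dataset_name: other
test_split: test
```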
| "pycocoevalcap", | ||
| "tqdm-multiprocess", | ||
| "transformers>=4.39.2", | ||
| "transformers==4.38.0", |
Please consider removing the pinned transformers version, as we are incorporating more models.
Before you open a pull-request, please check if a similar issue already exists or has been closed before.
When you open a pull-request, please be sure to include the following
If you encounter lint warnings, you can use the following scripts to reformat the code.
Thank you for your contributions!