Changes from 13 commits
3 changes: 2 additions & 1 deletion .ci/ignore_treon_docker.txt
@@ -78,4 +78,5 @@ notebooks/qwen2.5-omni-chatbot/qwen2.5-omni-chatbot.ipynb
notebooks/intern-video2-classiciation/intern-video2-classification.ipynb
notebooks/flex.2-image-generation/flex.2-image-generation.ipynb
notebooks/wan2.1-text-to-video/wan2.1-text-to-video.ipynb
notebooks/ace-step-music-generation/ace-step-music-generation.ipynb
notebooks/ace-step-music-generation/ace-step-music-generation.ipynb
notebooks/fireredtts2/fireredtts2.ipynb
6 changes: 6 additions & 0 deletions .ci/skipped_notebooks.yml
@@ -538,6 +538,12 @@
  skips:
    - os:
        - macos-13
- notebook: notebooks/fireredtts2/fireredtts2.ipynb
  skips:
    - os:
        - macos-13
        - ubuntu-22.04
        - windows-2022
- notebook: notebooks/qwen3-vl/qwen3-vl.ipynb
  skips:
    - os:
3 changes: 3 additions & 0 deletions .ci/spellcheck/.pyspelling.wordlist.txt
@@ -92,6 +92,7 @@ BLACKBOX
boolean
CatVTON
CausVid
CER
CentOS
centric
CFG
@@ -302,6 +303,7 @@ feedforward
FeedForward
FFN
FFmpeg
FireRedTTS
FIL
FEIL
finetuned
@@ -912,6 +914,7 @@ Ruizhongtai
Runtime
runtime
runtimes
RVQ
Safetensors
SageMaker
sagittal
33 changes: 33 additions & 0 deletions notebooks/fireredtts2/README.md
@@ -0,0 +1,33 @@
# Multi-speaker dialogue generation with FireRedTTS‑2 and OpenVINO

FireRedTTS‑2 is a long-form streaming TTS system for multi-speaker dialogue generation, delivering stable, natural speech with reliable speaker switching and context-aware prosody. Its key features are:
- **Long Conversational Speech Generation**: It currently supports 3-minute dialogues with 4 speakers and can be scaled to longer conversations with more speakers by extending the training corpus.
- **Multilingual Support**: It supports multiple languages, including English, Chinese, Japanese, Korean, French, German, and Russian. It also supports zero-shot voice cloning for cross-lingual and code-switching scenarios.
- **Ultra-Low Latency**: Building on the new **12.5Hz streaming** speech tokenizer, we employ a dual-transformer architecture that operates on a text–speech interleaved sequence, enabling flexible sentence-by-sentence generation and reducing first-packet latency. Specifically, on an L20 GPU, first-packet latency is as low as 140ms while maintaining high-quality audio output.
- **Strong Stability**: Our model achieves high similarity and low WER/CER in both monologue and dialogue tests.
- **Random Timbre Generation**: Useful for creating ASR/speech interaction data.
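
To put the 12.5Hz tokenizer rate in perspective, the short sketch below works out how much audio one tokenizer frame covers and how many frames a 3-minute dialogue spans. It is plain arithmetic on the figures quoted above, not part of the FireRedTTS‑2 code; the function names are hypothetical, and it counts frames only (how many codebook tokens each frame carries is not assumed here).

```python
import math

# Frame rate of the streaming speech tokenizer, as quoted in the feature list.
FRAME_RATE_HZ = 12.5


def frame_duration_ms(frame_rate_hz: float = FRAME_RATE_HZ) -> float:
    """Audio duration covered by a single tokenizer frame, in milliseconds."""
    return 1000.0 / frame_rate_hz


def frames_for_duration(seconds: float, frame_rate_hz: float = FRAME_RATE_HZ) -> int:
    """Number of tokenizer frames needed to cover `seconds` of audio (rounded up)."""
    return math.ceil(seconds * frame_rate_hz)


print(frame_duration_ms())          # 80.0 ms of audio per frame
print(frames_for_duration(3 * 60))  # 2250 frames for a 3-minute dialogue
```

A lower frame rate means fewer autoregressive steps per second of audio, which is one reason a low-rate tokenizer helps with long dialogues and streaming latency.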

More details can be found in the [paper](https://arxiv.org/abs/2509.02020), the original [repository](https://github.com/FireRedTeam/FireRedTTS2), and the [model card](https://huggingface.co/FireRedTeam/FireRedTTS2).

This tutorial shows how to run and optimize FireRedTTS‑2 using OpenVINO.

## Notebook contents
The tutorial consists of the following steps:

- Install requirements
- Convert and optimize the model
- Run OpenVINO model inference
- Launch the interactive demo

In this demonstration, you'll create an interactive demo that clones voices and generates multi-speaker dialogue from text.

The image below illustrates an example of voice cloning and dialogue generation.

<img width="1862" height="1125" alt="image" src="https://github.com/user-attachments/assets/a7512db5-78cd-4379-956b-893c13534862" />

## Installation instructions
This is a self-contained example that relies solely on its own code.<br>
We recommend running the notebook in a virtual environment. You only need a Jupyter server to start.
For details, please refer to the [Installation Guide](../../README.md).
<img referrerpolicy="no-referrer-when-downgrade" src="https://static.scarf.sh/a.png?x-pxid=5b5a4db0-7875-4bfb-bdbd-01698b5b1a77&file=notebooks/fireredtts2/README.md" />