Commit acb7e00

FARGAN instructions
1 parent fdb198e commit acb7e00

1 file changed: 51 additions, 0 deletions

dnn/torch/fargan/README.md

Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
# Framewise Auto-Regressive GAN (FARGAN)

Implementation of FARGAN, a low-complexity neural vocoder.

## Data preparation

For data preparation, you need to build Opus as detailed in the top-level README.
You will need to use the --enable-deep-plc configure option.
The build will produce an executable named "dump_data".
To prepare the training data, run:

    ./dump_data -train in_speech.pcm out_features.f32 out_speech.pcm

Here, in_speech.pcm is a raw 16-bit PCM speech file sampled at 16 kHz.
The command writes out_features.f32 and out_speech.pcm, which are the inputs to the training commands.
The speech data used for training the model can be found at:
https://media.xiph.org/lpcnet/speech/tts_speech_negative_16k.sw
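
If your own speech is stored as WAV rather than raw PCM, the header has to be stripped before the file is passed to dump_data. The following is a minimal sketch of that step using only the Python standard library. The function name `wav_to_raw_pcm` is hypothetical and not part of this repository. It assumes a mono, 16-bit WAV already sampled at 16 kHz (it does not resample) and a little-endian host, so that the WAV sample layout matches what a native C reader would expect.

```python
import wave


def wav_to_raw_pcm(wav_path, pcm_path, expected_rate=16000):
    """Copy the samples of a mono 16-bit WAV file into a headerless PCM file.

    Hypothetical helper: it validates the format and refuses anything
    that is not mono, 16-bit, and sampled at expected_rate.
    """
    with wave.open(wav_path, "rb") as wav:
        if wav.getnchannels() != 1:
            raise ValueError("expected mono audio")
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        if wav.getframerate() != expected_rate:
            raise ValueError(
                f"expected {expected_rate} Hz, got {wav.getframerate()} Hz"
            )
        frames = wav.readframes(wav.getnframes())

    with open(pcm_path, "wb") as pcm:
        pcm.write(frames)

    # Two bytes per sample, so this is the number of samples written.
    return len(frames) // 2
```

For example, `wav_to_raw_pcm("my_speech.wav", "in_speech.pcm")` would produce a file usable as the first argument of the dump_data command, provided the assumptions above hold for your recording.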

## Training

To perform pre-training, run the following command:
```
python ./train_fargan.py out_features.f32 out_speech.pcm output_dir --epochs 400 --batch-size 4096 --lr 0.002 --cuda-visible-devices 0
```
Once pre-training is complete, run adversarial training, starting from the last pre-training checkpoint, using:
```
python adv_train_fargan.py out_features.f32 out_speech.pcm output_dir --lr 0.000002 --reg-weight 5 --batch-size 160 --cuda-visible-devices 0 --initial-checkpoint output_dir/checkpoints/fargan_400.pth
```
The final model will be in output_dir/checkpoints/fargan_adv_50.pth.

The model can optionally be converted to C using:
```
python dump_fargan_weights.py output_dir/checkpoints/fargan_adv_50.pth fargan_c_dir
```
This creates a fargan_data.c and a fargan_data.h file in the fargan_c_dir directory.
Copy these files to the opus/dnn/ directory (replacing the existing ones) and recompile Opus.
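
The copy step overwrites files that are tracked in the Opus tree, so it can be convenient to keep a backup of the originals. The sketch below shows one way to do that. `install_weights` and the `.bak` suffix are illustrative choices, not something the repository provides, and the directory arguments are whatever paths you used above.

```python
import shutil
from pathlib import Path


def install_weights(src_dir, dnn_dir,
                    names=("fargan_data.c", "fargan_data.h")):
    """Copy the generated weight files into dnn_dir, backing up old copies.

    Hypothetical helper: any existing target is first saved next to
    itself with a ".bak" suffix, then replaced by the new file.
    """
    src_dir, dnn_dir = Path(src_dir), Path(dnn_dir)

    # Fail before touching anything if the conversion output is incomplete.
    missing = [name for name in names if not (src_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing generated files: {missing}")

    for name in names:
        target = dnn_dir / name
        if target.exists():
            shutil.copy2(target, target.with_name(target.name + ".bak"))
        shutil.copy2(src_dir / name, target)

    return [dnn_dir / name for name in names]
```

A call such as `install_weights("fargan_c_dir", "opus/dnn")` would then be followed by recompiling Opus as described above.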

## Inference

To run inference, start by generating the features from the audio using:
```
./fargan_demo -features test_speech.pcm test_features.f32
```
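
Before synthesis, it can be useful to check that the feature file has a plausible size. The sketch below counts frames and reads one frame back, using only the standard library. Both function names are hypothetical. The default of 36 values per frame is an assumption about the file layout that this README does not state, so verify it against the dataset code before relying on it. The file is assumed to hold native-endian 32-bit floats, as the .f32 extension suggests.

```python
import os
from array import array

FLOAT32_BYTES = 4  # size of one stored feature value


def count_feature_frames(path, features_per_frame=36):
    """Return the number of whole frames in a float32 feature file.

    features_per_frame=36 is an assumed layout, not taken from the README.
    """
    size = os.path.getsize(path)
    frame_bytes = features_per_frame * FLOAT32_BYTES
    if size % frame_bytes:
        raise ValueError(
            f"{size} bytes is not a multiple of {frame_bytes}; "
            "the assumed frame size is probably wrong"
        )
    return size // frame_bytes


def read_feature_frame(path, index, features_per_frame=36):
    """Read frame number `index` (0-based) as a list of floats."""
    values = array("f")
    with open(path, "rb") as f:
        f.seek(index * features_per_frame * values.itemsize)
        values.fromfile(f, features_per_frame)  # raises EOFError if too short
    return list(values)
```

If `count_feature_frames("test_features.f32")` raises, either the file is truncated or the assumed frame size does not match your build.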
Synthesis can be performed with either the PyTorch code or the C code.
To synthesize from PyTorch, run:
```
python test_fargan.py output_dir/checkpoints/fargan_adv_50.pth test_features.f32 output_speech.pcm
```
To synthesize from the C code, run:
```
./fargan_demo -fargan-synthesis test_features.f32 output_speech.pcm
```
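
The synthesized file is headerless, so most audio players will not open it directly. As a rough sketch, it can be wrapped in a WAV header with the Python standard library. `raw_pcm_to_wav` is a hypothetical helper, and it assumes the output uses the same format as the input described under data preparation (mono, 16-bit, 16 kHz) on a little-endian host.

```python
import wave


def raw_pcm_to_wav(pcm_path, wav_path, sample_rate=16000):
    """Wrap headerless 16-bit mono PCM in a WAV container.

    Hypothetical helper: the sample format is assumed, not detected.
    Returns the duration of the audio in seconds.
    """
    with open(pcm_path, "rb") as pcm:
        data = pcm.read()
    if len(data) % 2:
        raise ValueError("odd byte count; file is not 16-bit PCM")

    with wave.open(wav_path, "wb") as wav:
        wav.setnchannels(1)        # mono
        wav.setsampwidth(2)        # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(data)

    return len(data) / 2 / sample_rate
```

For example, `raw_pcm_to_wav("output_speech.pcm", "output_speech.wav")` writes a file that can be played for a quick listening check.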