Training a multi-speaker model using voice embeddings as speaker features.
Compute embedding vectors with compute_embedding.py and feed them to your TTS network. (The TTS side still needs to be implemented, but it should be straightforward.)
Pruning bad examples from your TTS dataset.
Compute embedding vectors and plot them using the notebook provided. Thanks to @nmstoker for this!
Use as a speaker classification or verification system.
Speaker diarization for ASR systems.
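Most of the use cases above come down to comparing embedding vectors. The following is a minimal sketch of that idea, not code from this repository. It assumes embeddings are plain lists of floats and uses cosine similarity both for verification and for flagging clips that sit far from their speaker's centroid, which is a numeric alternative to inspecting the plot by eye. The function names, the toy 3-dimensional vectors, and the 0.75 threshold are all hypothetical and would need tuning on real data.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def same_speaker(emb_a, emb_b, threshold=0.75):
    """Verification: accept the pair if similarity reaches the threshold.

    The default threshold is an illustrative assumption, not a tuned value.
    """
    return cosine_similarity(emb_a, emb_b) >= threshold


def find_outliers(embeddings, threshold=0.75):
    """Pruning: indices of clips whose embedding is far from the centroid."""
    center = centroid(embeddings)
    return [
        i
        for i, emb in enumerate(embeddings)
        if cosine_similarity(emb, center) < threshold
    ]


# Toy vectors standing in for the embeddings of one speaker's clips.
speaker_clips = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.1],
    [1.0, 0.0, 0.1],
    [0.0, 1.0, 0.0],  # stands in for a mislabeled or noisy clip
]

print(find_outliers(speaker_clips))                        # → [3]
print(same_speaker(speaker_clips[0], speaker_clips[1]))    # → True
```

In practice the vectors would be loaded from wherever the embeddings were saved, and the threshold would be chosen by checking similarity scores on clips with known labels.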
The model provided here is half the size of the baseline model. I figured it is easier to train, and the final performance does not differ much from the larger version.