senga-LUK-aligned-speecht5

This model is a fine-tuned version of microsoft/speecht5_tts on an unspecified dataset (the card records the dataset as None). At the final evaluation (step 40000) it reaches a validation loss of 0.1379 on the evaluation set.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 8
eval_batch_size: 8
seed: 3407
gradient_accumulation_steps: 4
total_train_batch_size: 32
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 4000
training_steps: 40000
mixed_precision_training: Native AMP
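
The hyperparameters above imply two quantities worth checking by hand: the effective batch size and the shape of the learning-rate curve. The following is a minimal sketch in plain Python, not the training code itself. It assumes a single training device (consistent with 8 × 4 = 32) and a half-cosine decay to zero after linear warmup. The function names `effective_batch_size` and `cosine_lr` are hypothetical helpers introduced for illustration.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 4
WARMUP_STEPS = 4000
TRAINING_STEPS = 40000


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Samples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accum_steps * num_devices


def cosine_lr(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS, total=TRAINING_STEPS):
    """Linear warmup to base_lr, then half-cosine decay to zero (assumed shape)."""
    if step < warmup:
        return base_lr * step / warmup
    progress = (step - warmup) / (total - warmup)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for step in (0, 2000, 4000, 22000, 40000):
        print(step, cosine_lr(step))
```

Under these assumptions the learning rate peaks at 0.0001 at step 4000, the end of warmup, and decays to zero at step 40000.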

Training Loss	Epoch	Step	Validation Loss
0.112	90.9091	1000	0.0942
0.0891	181.8182	2000	0.0886
0.077	272.7273	3000	0.0871
0.071	363.6364	4000	0.0907
0.0644	454.5455	5000	0.0919
0.0623	545.4545	6000	0.1022
0.0556	636.3636	7000	0.1009
0.0516	727.2727	8000	0.1042
0.0519	818.1818	9000	0.1083
0.0478	909.0909	10000	0.1100
0.0448	1000.0	11000	0.1130
0.0453	1090.9091	12000	0.1121
0.0416	1181.8182	13000	0.1163
0.043	1272.7273	14000	0.1197
0.0391	1363.6364	15000	0.1215
0.0397	1454.5455	16000	0.1206
0.0387	1545.4545	17000	0.1236
0.0488	1636.3636	18000	0.1256
0.0373	1727.2727	19000	0.1260
0.0387	1818.1818	20000	0.1279
0.0336	1909.0909	21000	0.1279
0.0349	2000.0	22000	0.1297
0.0343	2090.9091	23000	0.1296
0.0342	2181.8182	24000	0.1328
0.0335	2272.7273	25000	0.1343
0.0329	2363.6364	26000	0.1328
0.0307	2454.5455	27000	0.1357
0.03	2545.4545	28000	0.1346
0.0318	2636.3636	29000	0.1357
0.0325	2727.2727	30000	0.1368
0.0405	2818.1818	31000	0.1364
0.0299	2909.0909	32000	0.1370
0.029	3000.0	33000	0.1357
0.0312	3090.9091	34000	0.1369
0.0286	3181.8182	35000	0.1372
0.0281	3272.7273	36000	0.1377
0.0306	3363.6364	37000	0.1382
0.0295	3454.5455	38000	0.1371
0.0306	3545.4545	39000	0.1379
0.0302	3636.3636	40000	0.1379
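
In the table above, validation loss is lowest at step 3000 (0.0871) and rises to 0.1379 by step 40000 while training loss keeps falling, a pattern usually associated with overfitting. The sketch below uses a subset of rows copied from the table to locate the lowest-validation-loss checkpoint and to back out the number of optimizer steps per epoch. The helper names `best_checkpoint` and `steps_per_epoch` are hypothetical, and the dataset-size remark in the comments is an inference, not something the card states.

```python
# A subset of (step, epoch, training loss, validation loss) rows from the table above.
ROWS = [
    (1000, 90.9091, 0.112, 0.0942),
    (2000, 181.8182, 0.0891, 0.0886),
    (3000, 272.7273, 0.077, 0.0871),
    (4000, 363.6364, 0.071, 0.0907),
    (5000, 454.5455, 0.0644, 0.0919),
    (10000, 909.0909, 0.0478, 0.1100),
    (20000, 1818.1818, 0.0387, 0.1279),
    (40000, 3636.3636, 0.0302, 0.1379),
]


def best_checkpoint(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[3])


def steps_per_epoch(step, epoch):
    """Optimizer steps per epoch, rounded to the nearest integer."""
    return round(step / epoch)


if __name__ == "__main__":
    step, epoch, train_loss, val_loss = best_checkpoint(ROWS)
    print("lowest validation loss:", val_loss, "at step", step)
    # With a total batch size of 32, N steps per epoch would suggest at most
    # N * 32 training samples. This is an inference, not a documented figure.
    print("optimizer steps per epoch:", steps_per_epoch(ROWS[0][0], ROWS[0][1]))
```

If the goal is the best held-out loss rather than the final weights, an earlier checkpoint near step 3000 may be preferable, provided it was saved.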

Safetensors

Model size: 0.1B params

Tensor type: F32
