Model evaluation result issue
Hello, I used the code you provided with the exact same URDU dataset and no additional processing; the only change I made was replacing the single-file test with a loop over the entire folder so that every file is evaluated. However, the results in the classification report are very different from the ones you report. The same happens with the other datasets (TESS, RAVDESS, and SAVEE): the scores for the individual emotion labels differ substantially. What could be the reason for this?
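For context, here is roughly what my evaluation loop looks like. The directory layout (one sub-folder per emotion label) and the predict_emotion helper are placeholders for my local setup and for your single-file inference code, not something taken from the article:

# Sketch of the evaluation loop: run the per-file inference on every .wav
# under the dataset folder and compare predictions with the labels implied
# by the folder names.
from pathlib import Path
from sklearn.metrics import classification_report

def predict_emotion(wav_path: str) -> str:
    # Placeholder for the article's single-file code
    # (feature extraction + model forward pass + argmax over emotion labels).
    raise NotImplementedError

data_root = Path("URDU-Dataset")  # hypothetical local path to the dataset
y_true, y_pred = [], []
for wav_file in sorted(data_root.rglob("*.wav")):
    y_true.append(wav_file.parent.name)            # e.g. "Angry", "Happy", "Neutral", "Sad"
    y_pred.append(predict_emotion(str(wav_file)))

print(classification_report(y_true, y_pred))

classification_report then prints per-label precision, recall, and F1, which is what I am comparing against your reported numbers.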



Please comment on this.
Thank you for this great article/code.
It was very helpful! I ran into one small detail when running it: the original loading call, which used sr=feature_extractor.sampling_rate, caused an error in my environment.
I changed it to audio_array, sampling_rate = librosa.load(audio_path, sr=None) and the code then ran successfully (on the Urdu data). It seems the feature extractor handles the resampling better on its own.
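For reference, here is roughly what the change looks like in context. The checkpoint name and file path are placeholders rather than values from the article, and the explicit resampling step is my own safeguard, not something you wrote:

import librosa
from transformers import Wav2Vec2FeatureExtractor

# Hypothetical extractor and file path, standing in for the article's objects.
feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base")
audio_path = "URDU-Dataset/Angry/SM1_F2_A005.wav"  # hypothetical file

# Original call (raised an error for me):
# audio_array, sampling_rate = librosa.load(audio_path, sr=feature_extractor.sampling_rate)

# Change described above: load at the file's native sampling rate instead.
audio_array, sampling_rate = librosa.load(audio_path, sr=None)

# My own safeguard (not from the comment above): resample explicitly if the
# file's rate differs from what the extractor expects.
if sampling_rate != feature_extractor.sampling_rate:
    audio_array = librosa.resample(
        audio_array, orig_sr=sampling_rate, target_sr=feature_extractor.sampling_rate
    )
    sampling_rate = feature_extractor.sampling_rate

inputs = feature_extractor(audio_array, sampling_rate=sampling_rate, return_tensors="pt")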
Thanks again, I hope this feedback is useful!
