The identification result is not accurate #3

Open
lzj520 opened this issue Aug 2, 2019 · 3 comments
Comments

@lzj520

lzj520 commented Aug 2, 2019

I trained on more than 3,000 male voice recordings and more than 3,000 female voice recordings, and the trained models identify most male voices as female.

@SuperKogito
Owner

Hello,

The issue description is quite vague. Could you quantify the accuracy (or inaccuracy) you are seeing?
At this stage, the issue looks similar to #1, so you should look into your recordings. They should all have the same characteristics (sample rate, mono, stereo or poly, etc.), which is why, for example, recordings made with different microphones can be challenging in similar recognition problems. The database used in the project is normalized: all files have the same sample rate and are all mono. You can verify this using ffmpeg -i filename.wav, which should print something like ..., 16000 Hz, mono, s16, 256 kb/s. If your recordings do not have the same characteristics as the ones in SLR45, use ffmpeg to convert and adjust them.

  • Furthermore, do you generate the gender GMMs from your own data, or do you use the ones based on SLR45?
  • If you do not mind sharing some or all of the voice files, that might help identify the problem.
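
The check and conversion described above can be sketched in Python. This is a minimal sketch, assuming the recordings are uncompressed PCM WAV files (the standard-library wave module cannot read other encodings). The function names wav_properties and ffmpeg_convert_command are hypothetical and are not part of the project; the 16000 Hz, mono, s16 target is taken from the ffmpeg output quoted above.

```python
import wave


def wav_properties(path):
    """Return (sample_rate_hz, channels, bits_per_sample, bit_rate_kbps).

    Assumes an uncompressed PCM WAV file; wave.open raises wave.Error otherwise.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        bits = wav.getsampwidth() * 8
    # For PCM, bit rate = sample rate * channels * bits per sample,
    # e.g. 16000 * 1 * 16 = 256000 b/s = 256 kb/s.
    return rate, channels, bits, rate * channels * bits // 1000


def ffmpeg_convert_command(src, dst, rate=16000):
    """Build an ffmpeg command that converts src to `rate` Hz, mono, signed 16-bit."""
    return [
        "ffmpeg", "-y",        # overwrite dst if it already exists
        "-i", src,
        "-ar", str(rate),      # output sample rate in Hz
        "-ac", "1",            # one channel (mono)
        "-sample_fmt", "s16",  # signed 16-bit samples
        dst,
    ]
```

With ffmpeg installed, the command list can be run through subprocess.run(ffmpeg_convert_command("in.wav", "out.wav"), check=True). Note that upsampling 8 kHz recordings to 16 kHz only makes the format match; it does not restore the frequency content that the lower rate never captured.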

@lzj520
Author

lzj520 commented Aug 12, 2019

[image]
All of my audio files have the format shown in the picture.

@SuperKogito
Owner

As you can see, your data has a lower sample rate (8 kHz instead of 16 kHz) and a lower bit rate (128 kb/s instead of 256 kb/s), so it holds less information than the data used in the project. This might explain the degradation in accuracy. Nevertheless, it should not cause a huge drop in accuracy, so please provide a percentage for how accurate the code is on your data-set.
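
One way to report such a percentage is per gender, since the complaint is specifically about male voices being labelled female. The following is a minimal sketch that assumes you have collected the true and predicted labels as two parallel lists; per_class_accuracy is a hypothetical helper, not a function from the project, and the labels in the example are made up for illustration.

```python
from collections import Counter


def per_class_accuracy(true_labels, predicted_labels):
    """Return {label: percentage of that label's files classified correctly}."""
    total = Counter(true_labels)
    correct = Counter(
        t for t, p in zip(true_labels, predicted_labels) if t == p
    )
    return {label: 100.0 * correct[label] / total[label] for label in total}


# Illustrative labels only (not real results).
true_labels = ["male", "male", "male", "male", "female", "female"]
predicted_labels = ["male", "female", "female", "male", "female", "female"]
accuracy = per_class_accuracy(true_labels, predicted_labels)
# accuracy == {"male": 50.0, "female": 100.0}
```

A low male accuracy next to a high female accuracy would point to a systematic bias toward the female model rather than a generally weak classifier.
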
One possible improvement is to make sure that all your data holds the same type of information, so please check that all files share the same sample rate and produce the same output from ffmpeg -i ...
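
Checking thousands of files by hand is tedious, so the comparison can be scripted. This is a minimal sketch, assuming all recordings are PCM WAV files with a .wav extension in a single folder; group_by_format is a hypothetical helper and not part of the project.

```python
import glob
import os
import wave
from collections import defaultdict


def group_by_format(folder):
    """Group the .wav files in `folder` by (sample_rate_hz, channels, bits_per_sample)."""
    groups = defaultdict(list)
    for path in sorted(glob.glob(os.path.join(folder, "*.wav"))):
        # wave.open only handles uncompressed PCM and raises wave.Error otherwise.
        with wave.open(path, "rb") as wav:
            key = (wav.getframerate(), wav.getnchannels(), wav.getsampwidth() * 8)
        groups[key].append(os.path.basename(path))
    return dict(groups)
```

If the returned dictionary has more than one key, the data-set mixes formats, and the files in the smaller groups are the ones to convert.
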
In general, differences in performance from one data-set to another are not uncommon, so using a reference data-set is advised.
