problem with best_match_for_query #25
Comments
Hi Lila,
How long is the query audio? Can you please send me the files audiofile and audioexpert? I will then try to reproduce your issue and see what the problem is.
Thanks, Andrew
Dear Andrew,
I still have a few questions for you:
Also, still in fp.py, can you explain to me what the purpose of the variable "slop" is in the function "actual_matches"?
I have also run some evaluations on matching two fingerprints computed from the same audio file with different "start" values. First, I noticed that the match rate is very low when the fingerprints are computed with different "start" values. Is there a way to improve this? I understand that the values can hardly coincide at the beginning and the end of the file, but I find it strange that the fingerprints do not match better in the middle of the file. The only way I have found to increase the recognition rate is to store in the database many fingerprints of the same audio file, each starting at a different instant :(
Also, I tried changing the start values by a very small time lapse (2 ms) to see whether that gave a better match. I thought the low match rate could come from the moving windows used when fingerprinting the reference and the query audio files not being aligned. I then found that there is no difference for " i < start < i+1 ", where i is an integer value. In other words, non-integer float values of "start" are not recognized. Is this normal? Am I doing something silly again?
Thanks for your help,
Hi Lila,
I fixed a bug in split_codes() a few days ago which addresses a related issue, so hopefully this should be fixed now. Can you please try again? The segments should be 60 seconds in length, with an overlap of 30 seconds.
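As a rough illustration of that segmenting scheme, here is a minimal sketch, not the actual split_codes() implementation. The helper name `split_into_segments`, the `(hash, time)` pair layout, and the constant name are assumptions made for the example; the 23.2 ms per time-code unit is taken from the segmentlength expression quoted later in this thread.

```python
# Hypothetical helper for illustration only; this is NOT the project's split_codes().
MS_PER_TIME_UNIT = 23.2  # assumed duration of one time-code unit, in milliseconds


def split_into_segments(codes, segment_s=60, overlap_s=30):
    """Group (hash, time_code) pairs into fixed-length, overlapping windows."""
    if not codes:
        return []

    # Convert seconds to time-code units.
    segment_len = segment_s * 1000.0 / MS_PER_TIME_UNIT
    hop = (segment_s - overlap_s) * 1000.0 / MS_PER_TIME_UNIT

    last_time = max(t for _, t in codes)
    segments = []
    start = 0.0
    while start <= last_time:
        end = start + segment_len
        # A code belongs to every window whose time range contains it.
        segments.append([(h, t) for h, t in codes if start <= t < end])
        start += hop
    return segments
```

With a 30 second hop, every code past the first 30 seconds lands in two consecutive segments, so a short query that straddles a segment boundary is still fully contained in one of them.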
The slop variable reduces the resolution of the time codes, which makes the time alignment between query and fingerprint less sensitive to timing jitter. It is a trade-off between time resolution and robustness to jitter. (Also, see below.)
One reason might be that the codegen takes a little while to warm up, so you need to let it run for long enough - right now this is at least 20 seconds, but we're working on getting this down. What sorts of accuracy rates are you seeing in the various cases? And what length of fingerprints are you using?
How are you shifting the start values? Are you doing this to the audio file at the signal level, or are you adjusting the time codes after the fingerprint has been generated? The absolute values of the time codes shouldn't really matter too much. The important thing is that the shifts between the query and database fingerprint time codes are consistent, i.e., it's the relative time shifts which are important to get right. This is where the slop factor above comes in; it makes the relative distances between time codes more "sloppy" so that the differences between time codes in query and database fingerprints match to a greater extent.
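To make the role of the slop factor concrete, here is a minimal sketch of offset-histogram scoring. It assumes fingerprints are lists of `(hash_code, time_code)` integer pairs; the function name `offset_histogram_score` is hypothetical and this is a simplification, not the code of actual_matches() in fp.py.

```python
from collections import Counter


def offset_histogram_score(query, reference, slop=2):
    """Count code matches that agree on one quantised time offset.

    query, reference: lists of (hash_code, time_code) integer pairs.
    slop: quantisation step applied to time codes (assumed semantics).
    """
    # Index the quantised query times by hash code.
    query_times = {}
    for code, t in query:
        query_times.setdefault(code, []).append(t // slop)

    # Build a histogram of quantised time differences for shared hash codes.
    diffs = Counter()
    for code, t in reference:
        for qt in query_times.get(code, ()):
            diffs[t // slop - qt] += 1

    # The most frequent offset is taken as the alignment; its count is the score.
    return diffs.most_common(1)[0][1] if diffs else 0
```

For example, with query codes at times 0, 10 and 21 and the same hashes in the reference at times 100, 111 and 120, the raw offsets are 100, 101 and 99, so no two matches agree. With slop=2 all three fall into the same offset bin. The absolute times never enter the score, only the differences do.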
I'm not sure what you mean here. What is start?
Best, Andrew
Thank you for your answers.
I fixed a bug in split_codes() a few days ago which addresses a related issue, so hopefully this should be fixed now. Can you please try again? The segments should be 60 seconds in length, with an overlap of 30 seconds.
--> So you changed the denominator in segmentlength = 60 * 1000.0 / 23.2?
What sorts of accuracy rates are you seeing in the various cases? And what length of fingerprints are you using?
--> I was trying to reduce the length of the query as much as possible (from 30 sec to 5 sec). The results are the following: (duration in sec | accuracy: percentage of audio excerpts correctly identified)
Concerning the time alignment, I noticed something weird: there is a time lag that increases linearly as the start value increases. (Just to make sure, what I call "start value" is the second argument of song.util.codegen(audioseg, start = 0, duration = 30)).
I'm not sure what you mean here. What is start?
Thank you for your time
Hi Lila,
It has been a while since I last used it, but it used to work perfectly.
Hi Lila,
Dear all,
Can someone explain to me what I am doing wrong?
I am trying to build my own database of fingerprints, on which I want to evaluate the audio identification performance under some specific degradations of the signal.
I have an audio file named : audiofile
I compute the fingerprint of this file using the entire song:
Using the function contained in dedup by lamere, I extract trid, raw_code and ingest_data using:
then I ingest the new data:
and I add the new song to my dictionary named done:
Now I want to run a test and identify the same song (as a first step). What I really want to do is retrieve the song given a short excerpt of it, but I cannot get that to work.
When I call :
I have
response.match() = False :(
What is wrong with what I am doing? I am not even trying to recognize a modified version of the song.
I have one last question: if I compute the codegen using the entire song, is the system supposed to identify the song when I query with a short excerpt of it?
Thank you for your help,
Lila