Dataset download #4
Comments
I would also be happy to see the dataset. In addition, it would be nice to have the training code available. In my test runs, the results seem sensitive to how the staves are cropped from a larger image; I would think this could be improved by adding more distortions to the training set.
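One distortion of the kind mentioned above would be jittering the crop margins during training, so the model sees the staff at slightly different positions and paddings each epoch. The helper below is a minimal sketch of that idea (the function name and parameters are hypothetical, not from the TrOMR code):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_crop_jitter(img, max_margin=12):
    """Hypothetical augmentation: vary the margins around a staff image.

    Pads the grayscale image with background (255) and cuts random
    margins back off, so the staff lands at a slightly different
    position in each training sample while the output size stays fixed.
    """
    h, w = img.shape
    pad = max_margin
    canvas = np.full((h + 2 * pad, w + 2 * pad), 255, dtype=img.dtype)
    canvas[pad:pad + h, pad:pad + w] = img
    top, left = rng.integers(0, 2 * pad + 1, size=2)
    return canvas[top:top + h, left:left + w]
```

Applied on the fly in the data loader, this keeps the dataset on disk unchanged while making crop placement a random variable during training.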
Will the dataset be made public?
From what I can see, the dataset was never published. While I still hope that this might change in the future, I started an attempt to train this model on a mix of the PrIMuS dataset and the GrandStaff dataset. The results aren't as robust yet as those I get with the weights provided in this repo, but in some cases it works well. I put my training code so far on my fork of this repo: https://github.com/liebharc/Polyphonic-TrOMR
You are right. I have also attempted to train TrOMR on the PrIMuS dataset, simply by scaling the images to a fixed size. My results show that TrOMR's performance does not exhibit a significant advantage, with a symbol error rate exceeding 3% on the Camera-PrIMuS dataset. Can you share your test results?
I haven't calculated a symbol error rate yet. Right now, I run inference on a small set of example images, such as https://github.com/BreezeWhite/oemer/blob/main/figures/tabi.jpg (after splitting it into single-staff images), to get a feeling for how well it performs. Is the code you use to calculate the SER available somewhere? To get meaningful results, I'd also need another dataset to calculate the SER on; since PrIMuS is used for training, I of course can't also use it to rate performance. At least for monophonic examples, it shouldn't be too hard to find another dataset.
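Splitting a page like the one linked above into single-staff images can be done with a horizontal ink-density projection. The sketch below is a simplified illustration of that approach, not the method actually used in either repo; the function name, `min_gap`, and `threshold` are assumptions:

```python
import numpy as np

def split_staves(binary, min_gap=20, threshold=0.05):
    """Hypothetical splitter: cut a binarized page (1 = ink) into staff strips.

    Rows whose ink density exceeds `threshold` count as staff content;
    runs of content separated by at least `min_gap` blank rows become
    separate strips.
    """
    density = binary.mean(axis=1)   # fraction of ink pixels per row
    active = density > threshold
    strips, start, gap = [], None, 0
    for y, a in enumerate(active):
        if a:
            if start is None:
                start = y           # a new staff region begins
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:      # enough blank rows: close the region
                strips.append((start, y - gap + 1))
                start, gap = None, 0
    if start is not None:
        strips.append((start, len(active)))
    return [binary[s:e] for s, e in strips]
```

Real scores would need some padding around each strip (for ledger lines and dynamics) and deskewing first, but this captures the basic projection idea.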
I will open-source my code once everything is ready, but it's still under development. You can calculate the symbol error rate by measuring the edit distance between the predicted sequence generated by the model and the ground truth; you can run 'pip install editdistance' to install a tool for computing the edit distance. Regarding the dataset, I trained the model on approximately 60,000 images from the PrIMuS dataset and then tested it on around 10,000 images. I also experimented with training on a smaller number of images and found that TrOMR may not fully demonstrate its capabilities when the dataset is small.
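For reference, the SER calculation described above is just the Levenshtein distance between the token sequences, normalized by the ground-truth length. A self-contained sketch (equivalent to what the `editdistance` package computes, without the dependency; the token names are made up):

```python
def edit_distance(ref, hyp):
    # Classic dynamic-programming Levenshtein distance over token
    # sequences, using a single rolling row of the DP table.
    m, n = len(ref), len(hyp)
    dp = list(range(n + 1))
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,                     # deletion
                        dp[j - 1] + 1,                 # insertion
                        prev + (ref[i - 1] != hyp[j - 1]))  # substitution
            prev = cur
    return dp[n]

def symbol_error_rate(reference, prediction):
    # SER = edit distance / length of the ground-truth sequence
    return edit_distance(reference, prediction) / max(len(reference), 1)

# Hypothetical token sequences: one substituted note symbol out of four
ref = ["clef-G2", "note-C4_quarter", "note-D4_quarter", "barline"]
hyp = ["clef-G2", "note-C4_quarter", "note-E4_quarter", "barline"]
print(symbol_error_rate(ref, hyp))  # 0.25
```

Averaging this per-sequence rate (or summing distances and lengths) over the ~10,000 test images would give the aggregate SER figure quoted above.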
Is the dataset open source? How can it be downloaded?