Segmentation Fine-Tuning leads to 0 line detection #683
Comments
👋 Hey @mittagessen, any idea? 😃 |
Sorry, I'm working on an application that is due tonight. I'll be able
to have a look at everything that accumulated over the last couple of
weeks afterwards.
|
Hey @mittagessen, any availability to check the problem? |
30 images can be ok but you'll need to train quite a bit longer than a
single epoch, probably 50+. Although I have to say it is weird that the
base model breaks so completely you're not seeing *any* line output
anymore after only 30 training steps, especially as your data seems to be
fairly similar to what the base model has been trained on.
To verify your training data there's a script
`contrib/segmentation_overlay.py` that you can feed your ALTO files into
to see what kraken makes out of them. You should also check that you
aren't introducing spurious new line classes. That's most easily seen
when running the training without all the verbosity switches which will
print a table of detected classes before training actually starts. There
should be one line class (default) and one text region if you weren't
planning on introducing a more complex typology.
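
As a complement to the class table that kraken prints, the type check described above can be approximated offline. This is a minimal sketch using only the Python standard library, not kraken's own parser: it assumes ALTO files where line and region types are attached through `TAGREFS` pointing at entries in `<Tags>`, and the helper names `local_name` and `count_types` as well as the `(no TAGREFS)` placeholder are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag):
    """Strip the XML namespace, e.g. '{ns}TextLine' -> 'TextLine'."""
    return tag.rsplit("}", 1)[-1]


def count_types(xml_text):
    """Return (line_counts, region_counts) keyed by tag label.

    Assumption: types are declared in <Tags> (OtherTag, LayoutTag, ...)
    and referenced from TextLine/TextBlock via the TAGREFS attribute.
    """
    root = ET.fromstring(xml_text)
    # Map tag IDs to their human-readable labels.
    labels = {
        el.get("ID"): el.get("LABEL")
        for el in root.iter()
        if local_name(el.tag).endswith("Tag") and el.get("ID") and el.get("LABEL")
    }
    lines, regions = Counter(), Counter()
    for el in root.iter():
        name = local_name(el.tag)
        if name not in ("TextLine", "TextBlock"):
            continue
        refs = el.get("TAGREFS", "").split()
        # Elements without TAGREFS are reported under a placeholder label.
        names = [labels.get(r, r) for r in refs] or ["(no TAGREFS)"]
        (lines if name == "TextLine" else regions).update(names)
    return lines, regions


# Tiny hand-written ALTO fragment, for illustration only.
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v4#">
  <Tags>
    <OtherTag ID="T1" LABEL="default"/>
    <OtherTag ID="T2" LABEL="text"/>
  </Tags>
  <Layout><Page><PrintSpace>
    <TextBlock ID="b1" TAGREFS="T2">
      <TextLine ID="l1" TAGREFS="T1" BASELINE="0 0 10 0"/>
      <TextLine ID="l2" BASELINE="0 5 10 5"/>
    </TextBlock>
  </PrintSpace></Page></Layout>
</alto>"""

lines, regions = count_types(SAMPLE)
print("line types:", dict(lines))
print("region types:", dict(regions))
```

Running `count_types` over every training file and summing the counters gives a quick view of how many distinct labels the annotation tool actually wrote. More than one line label, or a mix of tagged and untagged lines, would be worth a closer look before training.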
The validation metrics are fairly useless for segmentation training
unfortunately, but the current main branch of kraken prints training
losses as well. You should see those going down over time. If that isn't
the case, something is wrong™ and we'll have to investigate.
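
To illustrate why pixel-level validation metrics say little here, the toy calculation below shows how a very high accuracy can coexist with a low mean IU when almost every pixel is background. It is a hedged sketch, not kraken's metric code: the four-class layout, the pixel counts, and the function name `pixel_metrics` are assumptions chosen for the example.

```python
def pixel_metrics(pred, truth, n_classes):
    """Return (pixel accuracy, mean intersection-over-union)."""
    accuracy = sum(p == t for p, t in zip(pred, truth)) / len(truth)
    ious = []
    for c in range(n_classes):
        inter = sum(p == c and t == c for p, t in zip(pred, truth))
        union = sum(p == c or t == c for p, t in zip(pred, truth))
        if union:  # skip classes absent from both prediction and truth
            ious.append(inter / union)
    return accuracy, sum(ious) / len(ious)


# Hypothetical image of 1000 pixels: class 0 is background,
# classes 1-3 are thin foreground structures (10 pixels in total).
truth = [0] * 990 + [1] * 4 + [2] * 3 + [3] * 3
# A model that predicts background everywhere, i.e. no lines at all.
pred = [0] * 1000

accuracy, mean_iu = pixel_metrics(pred, truth, n_classes=4)
print(f"accuracy={accuracy:.4f} mean_iu={mean_iu:.4f}")
```

In this toy setup a model that outputs no lines at all still reaches 99% accuracy, while the mean IU sits near 0.25 because three of the four classes score zero. A similar pattern in a validation log is therefore compatible with a model that detects nothing, which is one reason to watch the training loss instead.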
|
Hey @mittagessen, I've checked the overlay: I see the lines as in the previous photo and, as I don't want to detect any regions, it shows the whole photo. Also, when launching the training I get:
WARNING Setting baseline location to centerline from unset model. train.py:1032
INFO Training line types: train.py:1038
INFO default 2 258 train.py:1040
INFO Training region types: train.py:1041
INFO text 3 30 train.py:1043
DEBUG Constructing Adam optimizer (lr: 0.0002, momentum: 0.9)
This seems to show a single class each for the line and region types. Any idea why there is a problem? |
Okay, I've spotted 🔎 an "invalid geometry" in 2 xml files once I ran the overlay properly on all the data. ➡ Now it seems to train without forgetting everything after one epoch. I'll wait for 50+ epochs to see whether it can overfit, just to confirm that it's learning. If so, I will scale up the annotation to around 200 images for fine-tuning. 📢 Will tell you soon! |
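
For anyone hitting the same problem, a rough pre-check for degenerate geometry can be scripted before running the overlay. The sketch below uses only the standard library and simple heuristics (parsable coordinates, at least 2 distinct baseline points, at least 3 distinct polygon points). These rules are an assumption for illustration and may not match the conditions under which kraken reports "invalid geometry". The helper names `parse_points` and `find_suspect_lines` are hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_points(raw):
    """Parse 'x1 y1 x2 y2' or 'x1,y1 x2,y2' into a list of (x, y) tuples."""
    values = [float(v) for v in raw.replace(",", " ").split()]
    if len(values) % 2:
        raise ValueError("odd number of coordinates")
    return list(zip(values[0::2], values[1::2]))


def find_suspect_lines(xml_text):
    """Return (line_id, message) pairs for lines failing the heuristics."""
    problems = []
    root = ET.fromstring(xml_text)
    for el in root.iter():
        if el.tag.rsplit("}", 1)[-1] != "TextLine":
            continue
        line_id = el.get("ID", "?")
        # Each check is (name, raw coordinate string, minimum distinct points).
        checks = [("BASELINE", el.get("BASELINE", ""), 2)]
        for child in el.iter():
            if child.tag.rsplit("}", 1)[-1] == "Polygon":
                checks.append(("Polygon", child.get("POINTS", ""), 3))
        for name, raw, minimum in checks:
            try:
                points = parse_points(raw)
            except ValueError as exc:
                problems.append((line_id, f"{name}: {exc}"))
                continue
            if len(set(points)) < minimum:
                problems.append(
                    (line_id, f"{name}: fewer than {minimum} distinct points")
                )
    return problems


# Hand-written fragment: line l1 is fine, line l2 is degenerate.
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v4#">
  <Layout><Page><PrintSpace><TextBlock ID="b1">
    <TextLine ID="l1" BASELINE="10 20 200 22">
      <Shape><Polygon POINTS="10 5 200 5 200 30 10 30"/></Shape>
    </TextLine>
    <TextLine ID="l2" BASELINE="10 50">
      <Shape><Polygon POINTS="10 40 200 40"/></Shape>
    </TextLine>
  </TextBlock></PrintSpace></Page></Layout>
</alto>"""

problems = find_suspect_lines(SAMPLE)
for line_id, message in problems:
    print(line_id, message)
```

This only catches the crudest cases. Self-intersecting polygons or baselines lying outside their polygon would need a real geometry library, and the overlay script remains the reference check.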
With 30 examples the training works. ✅
❎ But I tried to do it with 100 annotated images, then used the overlay to make sure all the data was well structured (a really painful process, to be honest, as I had to go through a lot of little things to avoid ...). Also, it looks like the image is loaded again at each epoch:
[03/04/25 19:21:26] DEBUG Attempting to load /home/ubuntu/trocr_handwritten/20250304_line_detection/alto_xml/../images/FRANOM22_COLH78_0458_0034_6.jpg segmentation.py:163
Would it be possible to load it only once to speed up the training? Best, Arnault |
Hey,
Thanks for the great work. I have some questions about fine-tuning; I think the problem may come from the format of my input data. I've been looking at this link to try to get well-formed `xml` files for my `jpg` images. But after fine-tuning (even after 1 epoch) I don't get any lines 👀. Here is an example of an `xml` file I have:
And here are the BASELINE points on the image:

Then I'm using:
ketos -vvv segtrain -i /home/ubuntu/models/blla.mlmodel -f xml /home/ubuntu/data/20250204_line_detection/alto_xml/*.xml -cl -o /home/ubuntu/models/ft_kraken -d cuda:0
Everything seems to train, but the mean_iu stays around 0.25 and even decreases.
[02/04/25 15:54:37] INFO validation run: accuracy 0.9899430871009827 mean_acc 0.9899430871009827 mean_iu 0.2532690465450287 freq_iu 0.96146160364151
After a few epochs, when I run inference, I don't get any lines at all.
Also, I'm using only 30 pictures to test the training before annotating more and scaling up the process. Do you have any idea why this is not working?