Bad YOLOv8 performance when compared to Ultralytics implementation #2442

jagogardiner opened this issue May 8, 2024 · 9 comments

@jagogardiner

jagogardiner commented May 8, 2024

Current Behavior:

We are currently seeing bad performance in both classification and box prediction with our Keras implementation of the model. When training for 10 epochs and evaluating on the same dataset, we see mAP50 as low as 0.05, compared to 0.94 for Ultralytics. Neither model uses pretrained weights, and both use the yolo_v8_s backbone. Even with 50-60 epochs of training, the performance of the Keras model doesn't seem right.
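
For context on the metric: mAP50 counts a predicted box as correct when its intersection over union (IoU) with a ground-truth box of the same class is at least 0.5. A minimal sketch of that check for `xyxy` boxes in plain Python follows. The function names are illustrative and are not part of KerasCV or Ultralytics.

```python
def iou_xyxy(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match_at_50(pred_box, gt_box):
    """True when a prediction overlaps a ground-truth box with IoU >= 0.5."""
    return iou_xyxy(pred_box, gt_box) >= 0.5
```

A score of 0.05 therefore means that very few confident predictions reach that overlap with a ground-truth box of the right class.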

Expected Behavior:

The Keras YOLOv8 detector should achieve results comparable to the Ultralytics implementation when trained on the same data.

Steps To Reproduce:

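import keras
import keras_cv

# class_mapping, lr, train_ds, val_ds and callbacks are defined elsewhere
# (see the full implementation linked below).

# YOLOV8Detector takes a backbone model rather than a preset name.
# This preset name builds the architecture without pretrained weights.
backbone = keras_cv.models.YOLOV8Backbone.from_preset("yolo_v8_xs_backbone")

model = keras_cv.models.YOLOV8Detector(
    num_classes=len(class_mapping),
    bounding_box_format="xyxy",  # the format name must be a string
    backbone=backbone,
    fpn_depth=2,
)

optimizer = keras.optimizers.SGD(
    learning_rate=lr, momentum=0.9, global_clipnorm=10.0
)

model.compile(
    optimizer=optimizer,
    classification_loss="binary_crossentropy",
    box_loss="ciou",
)

model.fit(
    train_ds,
    epochs=10,
    callbacks=[callbacks],
    validation_data=val_ds,
)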
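
One thing that may be worth ruling out before comparing implementations is a mismatch between the dataset's boxes and the declared `bounding_box_format`. The sketch below is a hypothetical helper (not part of KerasCV) that flags boxes which cannot be valid absolute-pixel `xyxy` boxes. It can catch some cases of `xywh` or relative coordinates passed by mistake, but not all of them. It assumes boxes are plain tuples and that the image size is known.

```python
def find_invalid_xyxy_boxes(boxes, image_width, image_height):
    """Return indices of boxes that are not valid absolute-pixel xyxy boxes."""
    invalid = []
    for index, (x1, y1, x2, y2) in enumerate(boxes):
        # In xyxy format the second corner lies strictly below and to the right.
        well_ordered = x1 < x2 and y1 < y2
        # Absolute-pixel boxes must fit inside the image.
        in_bounds = (
            0 <= x1 and 0 <= y1 and x2 <= image_width and y2 <= image_height
        )
        if not (well_ordered and in_bounds):
            invalid.append(index)
    return invalid
```

Running it over a sample of training annotations and getting an empty list does not prove the labels are right, but a non-empty list is a strong hint that the format is wrong.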

Version:

TensorFlow 2.16.1
Keras 3.2.0 (had to revert due to issue #2421)
Keras-CV 0.8.2

Anything else:

Followed tutorials such as: https://keras.io/guides/keras_cv/object_detection_keras_cv/
Issues such as #2333 and #2353 do concern me.

Full implementation: https://github.com/jagogardiner/Doodlecode

jagogardiner changed the title from "Bad YOLOv8 performance when compared to Ultralytics implementation." to "Bad YOLOv8 performance when compared to Ultralytics implementation" on May 8, 2024
@mushihuahua

real, me too bro

@legenda971

Have you considered using a different backbone for your model? The pretrained models from Ultralytics, such as those available in the YOLO series, are typically trained on datasets like COCO or Open Images V7.

@jagogardiner
Author

> Have you considered using a different backbone for your model? The pretrained models from Ultralytics, such as those available in the YOLO series, are typically trained on datasets like COCO or Open Images V7.

I have tried a few backbones:

  • yolo_v8_xs_backbone (and yolo_v8_xs_backbone_coco)
  • yolo_v8_s_backbone (and yolo_v8_s_backbone_coco)
  • yolo_v8_m_backbone (and yolo_v8_m_backbone_coco)

Going up to the larger backbones was not viable, as I could only train on my personal RTX 3070. However, I used both pretrained and untrained weights when testing the Ultralytics implementation, and both give me better results than the Keras implementation.

I honestly might be making a mistake somewhere, but the performance just doesn't seem right to me. Even with the COCO pretrained weights, I still see poor performance.

I also tried some different backbones from the documentation (https://keras.io/api/keras_cv/models/tasks/yolo_v8_detector/), but many of them, such as ResNet or EfficientNet, don't seem to work: Keras gives me an error saying the backbone is not valid.

Thanks!

@gregdaly

gregdaly commented May 10, 2024

I've replicated this with the KerasCV object detection example on PascalVOC with the TensorFlow backend (Keras 3.3.3, Keras-CV 0.9.0, TF 2.16.1). Even the pre-trained model has a high loss and a mAP of 0.005. Training for 50 epochs makes the predictions visibly worse.

I've created a Colab notebook where you can test this out: https://colab.research.google.com/drive/1tsRAHGZkifmQCWqYaMUYREOCTPrBfgS0?usp=sharing

Two other issues of note: the JAX backend is much slower than TF for object detection, and the model throws an error for any loss other than CIoU.
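
For readers unfamiliar with the loss named above: CIoU (complete IoU) adds two penalties to 1 - IoU, one for the distance between box centres and one for aspect-ratio mismatch. The following is a sketch of the published formula in plain Python for a single pair of `xyxy` boxes. It is not the KerasCV implementation, which may differ in details such as epsilon handling and batching.

```python
import math


def ciou_loss(pred, target, eps=1e-7):
    """CIoU loss for two boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Plain IoU term.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = pw * ph + tw * th - inter + eps
    iou = inter / union

    # Squared distance between the two box centres.
    center_dist_sq = (
        (px1 + px2 - tx1 - tx2) ** 2 + (py1 + py2 - ty1 - ty2) ** 2
    ) / 4.0

    # Squared diagonal of the smallest box enclosing both boxes.
    enclose_w = max(px2, tx2) - min(px1, tx1)
    enclose_h = max(py2, ty2) - min(py1, ty1)
    diag_sq = enclose_w ** 2 + enclose_h ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4.0 / math.pi ** 2) * (
        math.atan(tw / (th + eps)) - math.atan(pw / (ph + eps))
    ) ** 2
    alpha = v / (1.0 - iou + v + eps)

    return 1.0 - iou + center_dist_sq / diag_sq + alpha * v
```

Unlike plain IoU loss, this still gives a useful signal when the boxes do not overlap at all, because the centre-distance term keeps changing as the prediction moves.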

@tu6berk

tu6berk commented May 17, 2024

I would personally advise you to doubt the correctness of the PyCocoCallback you are possibly using in your code.

I opened an issue several weeks back, and it has been neither resolved nor closed. In particular, if your dataset has only a few unique labels, or the class distribution is significantly imbalanced, the current implementation might encode the ground truth incorrectly, which would lead to erroneous evaluation metric readings.

Note that even when this issue is resolved (which is very easy to do), it still does not guarantee the correctness of the YOLOv8 implementation in keras-cv. Concerns about that were raised here and are still not addressed. 🤷🏿
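
One way to act on that doubt is to compute the metric independently of the callback and compare the readings. Below is a minimal sketch of single-class AP at IoU 0.5 in plain Python, using all-point interpolation in the Pascal VOC style. It is an assumption-laden cross-check, not a reimplementation of pycocotools: COCO evaluation uses 101-point interpolation and extra rules (crowd regions, area ranges, detection limits), so the numbers will not match exactly. The input layout is hypothetical.

```python
def iou_xyxy(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision_50(predictions, ground_truths):
    """AP at IoU 0.5 for one class.

    predictions: list of (image_id, score, box) tuples.
    ground_truths: dict mapping image_id to a list of boxes.
    """
    total_gt = sum(len(boxes) for boxes in ground_truths.values())
    if total_gt == 0:
        return 0.0
    matched = {img: [False] * len(b) for img, b in ground_truths.items()}

    # Walk predictions from most to least confident, greedily matching
    # each one to the best still-unmatched ground-truth box in its image.
    tp_flags = []
    for image_id, _score, box in sorted(predictions, key=lambda p: -p[1]):
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truths.get(image_id, [])):
            if matched[image_id][idx]:
                continue
            overlap = iou_xyxy(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        is_tp = best_iou >= 0.5
        if is_tp:
            matched[image_id][best_idx] = True
        tp_flags.append(is_tp)

    # Precision and recall after each prediction.
    precisions, recalls, tp = [], [], 0
    for rank, flag in enumerate(tp_flags, start=1):
        tp += int(flag)
        precisions.append(tp / rank)
        recalls.append(tp / total_gt)

    # Make precision non-increasing, then integrate over recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap
```

Averaging this value over classes gives a rough mAP50. If it disagrees wildly with the callback on the same predictions, the metric code is suspect; if both are low, the model itself is.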

@jagogardiner
Author

> I've replicated this with the KerasCV object detection example on PascalVOC with the TensorFlow backend (Keras 3.3.3, Keras-CV 0.9.0, TF 2.16.1). Even the pre-trained model has a high loss and a mAP of 0.005. Training for 50 epochs makes the predictions visibly worse.
>
> I've created a Colab notebook where you can test this out: https://colab.research.google.com/drive/1tsRAHGZkifmQCWqYaMUYREOCTPrBfgS0?usp=sharing
>
> Two other issues of note: the JAX backend is much slower than TF for object detection, and the model throws an error for any loss other than CIoU.

Glad to know the same issues show up with PascalVOC. I do think there must be some error in the Keras implementation of YOLOv8, as the Ultralytics predictions on my exact same dataset are far better.

> I would personally advise you to doubt the correctness of the PyCocoCallback you are possibly using in your code.
>
> I opened an issue several weeks back, and it has been neither resolved nor closed. In particular, if your dataset has only a few unique labels, or the class distribution is significantly imbalanced, the current implementation might encode the ground truth incorrectly, which would lead to erroneous evaluation metric readings.
>
> Note that even when this issue is resolved (which is very easy to do), it still does not guarantee the correctness of the YOLOv8 implementation in keras-cv. Concerns about that were raised here and are still not addressed. 🤷🏿

I have seen these issues too. I could try testing with a fix, but unfortunately correcting the metrics would not make much of a difference. Even when just evaluating the model by eye on predicted images, you can tell it is not learning properly. The classification is way off, and the boxes go absolutely haywire below a confidence threshold of around 0.51. Even above that, it still picks up empty space on our drawings as objects.
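
To make the thresholding described above concrete, here is a small sketch of filtering detections by confidence and of sweeping the threshold to see how many boxes survive at each level. It assumes detections have already been flattened into a list of dicts with the hypothetical keys `box`, `confidence` and `class_id`; that layout is not the raw output format of either library.

```python
def filter_by_confidence(detections, threshold):
    """Keep only detections whose confidence is at least `threshold`."""
    return [d for d in detections if d["confidence"] >= threshold]


def sweep_thresholds(detections, thresholds):
    """Return (threshold, surviving_count) pairs for each threshold."""
    return [
        (t, len(filter_by_confidence(detections, t))) for t in thresholds
    ]
```

A model that has learned well usually shows a clear gap between confident true detections and low-confidence noise. A count that only drops to a sensible level near 0.5 fits the symptom described here.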

I understand YOLO is not particularly engineered towards static image object detection tasks, and is preferable when using video, but I would expect it to be able to work it out somewhat.

@alhaal

alhaal commented Oct 8, 2024

Are there any updates? I have the same issues.

@tarasboulba

Hello all,
Are there any updates on this issue? Thanks!

@aosaltik

aosaltik commented Jan 7, 2025

Did you try using binary or categorical focal crossentropy as the classification loss to address the class imbalance issue?

https://www.tensorflow.org/api_docs/python/tf/keras/losses/BinaryFocalCrossentropy
https://www.tensorflow.org/api_docs/python/tf/keras/losses/CategoricalFocalCrossentropy
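
As a rough illustration of why focal loss helps with imbalance, here is the binary focal loss formula for a single prediction in plain Python. It is a sketch of the formula only, not the Keras implementation linked above, and the default `gamma` and the optional `alpha` weighting are assumptions chosen for the example.

```python
import math


def binary_focal_crossentropy(y_true, p, gamma=2.0, alpha=None, eps=1e-7):
    """Focal loss for one binary label y_true (0 or 1) and probability p."""
    p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
    # Probability the model assigned to the correct class.
    p_t = p if y_true == 1 else 1.0 - p
    # The (1 - p_t) ** gamma factor shrinks the loss of easy examples.
    loss = -((1.0 - p_t) ** gamma) * math.log(p_t)
    if alpha is not None:
        # Optional class weighting: alpha for positives, 1 - alpha otherwise.
        loss *= alpha if y_true == 1 else 1.0 - alpha
    return loss
```

With `gamma=0` and no `alpha` this reduces to ordinary binary crossentropy. With `gamma=2`, a well-classified example (correct-class probability 0.9) contributes about one hundredth of its crossentropy loss, so the many easy background predictions stop dominating the gradient.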
