All datasets are licensed under ODbL for data and CC-BY-SA for images.
https://world.openfoodfacts.org/data
- https://static.openfoodfacts.org/exports/products.random-modulo-10000.tar.gz
- https://static.openfoodfacts.org/exports/products.random-modulo-10000.images.tar.gz
- https://static.openfoodfacts.org/exports/products.random-modulo-1000.tar.gz
- https://static.openfoodfacts.org/exports/products.random-modulo-1000.images.tar.gz
- Ground truth data is in the Product database sample
- Predicted categories are in the Robotoff insight dump which is accessible at ***
- An all-in-one package was generated by Alex: https://openfoodfacts.org/data/dataforgood2022/predict_categories_dataset_documentation.txt
A set of 20k French products with:
- original image containing a nutrition facts table
  - 3596710454181.nutrition.jpg
- rotation angle and bounding box coordinates of the cropped nutrition facts table
  - in the products.csv file
- cropped image of the nutrition facts table
  - 3596710454181.nutrition.cropped.jpg
- JSON file produced by Google Cloud Vision for the cropped image
  - 3596710454181.nutrition.cropped.jpg.json
- Nutrition values as entered by users in the OFF database
  - 3596710454181.nutriments.json
Location: https://static.openfoodfacts.org/exports/nutrition-lc-fr-country-fr-last-edit-date-2019-08.tar.gz (16.9 GB)
Command used to generate the test set: ./extract_nutrition_test_set.pl --lc fr --query countries_tags=en:france --query last_edit_dates_tags=2019-08 --dir /srv/off/html/exports/nutrition-lc-fr-country-fr-last-edit-date-2019-08
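The file naming described above makes it possible to regroup the files of each product by barcode. The following is a minimal Python sketch, not part of the export tooling: the function name `group_nutrition_files` and the labels in `NUTRITION_SUFFIXES` are hypothetical, and it assumes the archive has already been extracted to a local directory whose internal layout is not documented here (hence the recursive search).

```python
from collections import defaultdict
from pathlib import Path

# File suffixes taken from the list above; the labels on the right are
# illustrative names chosen for this sketch, not official field names.
NUTRITION_SUFFIXES = {
    ".nutrition.cropped.jpg.json": "ocr_json",
    ".nutrition.cropped.jpg": "cropped_image",
    ".nutrition.jpg": "original_image",
    ".nutriments.json": "nutriments",
}


def group_nutrition_files(directory):
    """Return {barcode: {label: Path}} for every recognised file under directory."""
    products = defaultdict(dict)
    # rglob is used because the directory layout inside the archive is an unknown.
    for path in Path(directory).rglob("*"):
        if not path.is_file():
            continue
        for suffix, label in NUTRITION_SUFFIXES.items():
            if path.name.endswith(suffix):
                barcode = path.name[: -len(suffix)]
                products[barcode][label] = path
                break
    return dict(products)
```

Files that match none of the suffixes, such as products.csv, are ignored by this sketch. Products missing one of the four files can then be found by checking the size of each inner dictionary.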
Two sets of 1k and 12k French products with:
- original image containing an ingredients list
  - 3596710454181.ingredients.jpg
- rotation angle and bounding box coordinates of the cropped ingredients list
  - in the products.csv file
- cropped image of the ingredients list
  - 3596710454181.ingredients.cropped.jpg
- JSON file produced by Google Cloud Vision for the cropped image
  - 3596710454181.ingredients.cropped.jpg.json
- Ingredients list as entered by users (possibly from OCR and possibly with errors) in the OFF database
  - in the products.csv file
Location 1k products: https://static.openfoodfacts.org/exports/ingredients-lc-fr-country-fr-last-edit-date-2019-08-1k.tar.gz ()
Location 12k products: https://static.openfoodfacts.org/exports/ingredients-lc-fr-country-fr-last-edit-date-2019-08.tar.gz ()
Command used to generate the test set: ./extract_ingredients_test_set.pl --lc fr --query countries_tags=en:france --query last_edit_dates_tags=2019-08 --dir /srv/off/html/exports/ingredients-lc-fr-country-fr-last-edit-date-2019-08-1k
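Since the sizes of the ingredients archives are not listed here, it can be useful to inspect an archive without unpacking it. The sketch below streams a downloaded .tar.gz with Python's standard `tarfile` module and counts members per file type. The function name `count_archive_members` is hypothetical, and it assumes that member names inside the archive end with the suffixes shown in the list above; the directory layout inside the archive is not assumed.

```python
import tarfile
from collections import Counter

# Suffixes taken from the ingredients file list above.
INGREDIENTS_SUFFIXES = (
    ".ingredients.cropped.jpg.json",
    ".ingredients.cropped.jpg",
    ".ingredients.jpg",
)


def count_archive_members(archive_path):
    """Count regular files per known suffix, reading the archive sequentially."""
    counts = Counter()
    # "r|gz" opens the gzip-compressed archive as a forward-only stream,
    # so nothing is extracted to disk.
    with tarfile.open(archive_path, mode="r|gz") as archive:
        for member in archive:
            if not member.isfile():
                continue
            for suffix in INGREDIENTS_SUFFIXES:
                if member.name.endswith(suffix):
                    counts[suffix] += 1
                    break
            else:
                # Anything else, for example products.csv.
                counts["other"] += 1
    return counts
```

If the three suffix counts differ, some products are missing an image or an OCR result and may need to be filtered out before use.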