diff --git a/README.md b/README.md
new file mode 100644
index 0000000..9ed53b0
--- /dev/null
+++ b/README.md
@@ -0,0 +1,110 @@
+# Wheat Grains Detector
+**For Celiac Disease sufferers**
+
+![App in action](ezgif-1-138e5d970f8d.gif)
+## Android + OpenCV + TensorFlow Object Detection API
+_**Please note:** these are side notes for my own personal use rather than detailed step-by-step instructions on how to build the project. However, an experienced developer should be able to follow what's going on._
+
+## Preparing dataset
+
+`LabelImg`/`VoTT`/`LabelBox` are all fine for segmentation and annotation. `LabelImg` is the best of the three.
+
+### Generate TFRecord
+
+```bash
+python create_pascal_tf_record_ex.py --annotations_dir dataset/train --label_map_path dataset/label_map.pbtxt --output_path train.record
+python create_pascal_tf_record_ex.py --annotations_dir dataset/val --label_map_path dataset/label_map.pbtxt --output_path val.record
+```
+
+## Training model
+
+Connect to [Colaboratory](https://colab.research.google.com) and upload _colab/Wheat_Grains_Detector.ipynb_.
+
+Files to upload:
+* _colab/ssd_mobilenet_v2_coco.config_
+* _dataset/label_map.pbtxt_
+* _train.record_
+* _val.record_
+
+Files that will be downloaded:
+* _frozen.zip_
+
+## Building Android app
+
+### Install JDK
+
+```bash
+sudo apt-get install openjdk-8-jdk-headless
+```
+or
+
+Download [Java SE Development Kit 8](https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html) and unpack to _``_. _jdk-8u181-linux-x64.tar.gz_ was used.
+```bash
+export PATH=$PATH:/bin/
+```
+
+### Install Android SDK
+
+Download [SDK tools](https://developer.android.com/studio/) and unpack to _``_. _sdk-tools-linux-4333796.zip_ was used.
+```bash
+export ANDROID_SDK_HOME=
+
+export PATH=$PATH:$ANDROID_SDK_HOME/tools/bin/
+
+sdkmanager "build-tools;28.0.2"
+export PATH=$PATH:$ANDROID_SDK_HOME/build-tools/28.0.2/
+
+sdkmanager "platforms;android-21"
+
+sudo apt-get install adb
+```
+Connect `adb` to a device.
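One way to confirm that the device is actually reachable is to inspect the output of `adb devices`. The helper below is a minimal sketch and not part of the original notes: `count_ready_devices` is a hypothetical name, and it assumes the usual listing format of a header line followed by one `serial<TAB>state` line per device.

```bash
# Hypothetical helper (not part of the project): count the devices that
# `adb devices` reports in the "device" state, i.e. connected and authorized.
# It reads the listing from stdin, so it can be tried without a device attached.
count_ready_devices() {
    # The second whitespace-separated field of each device line is its state.
    # The "List of devices attached" header never matches, since its second
    # field is "of".
    awk '$2 == "device" { n++ } END { print n + 0 }'
}

# Typical use once adb is installed:
#   adb devices | count_ready_devices
```

A result of `0` usually means the device is not plugged in, USB debugging is disabled, or the authorization prompt on the device has not been accepted yet (the state is then shown as `unauthorized`).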
+
+### Build OpenCV library
+
+Download [Android pack](https://opencv.org/releases.html) and unpack to _``_. _opencv-3.4.3-android-sdk.zip_ was used.
+```bash
+export OPENCV_SDK_JAVA=/sdk/java/
+pushd $OPENCV_SDK_JAVA
+mkdir -p build/gen/ build/obj/
+aapt package -m -J build/gen/ -M AndroidManifest.xml -S res/ -I $ANDROID_SDK_HOME/platforms/android-21/android.jar
+aidl -obuild/gen/ src/org/opencv/engine/OpenCVEngineInterface.aidl
+```
+Create _build/gen/BuildConfig.java_ with the following content:
+```java
+package org.opencv;
+
+public final class BuildConfig {
+    public static final boolean DEBUG = Boolean.parseBoolean("false");
+}
+```
+```bash
+javac -d build/obj/ -bootclasspath $ANDROID_SDK_HOME/platforms/android-21/android.jar build/gen/BuildConfig.java build/gen/org/opencv/R.java build/gen/org/opencv/*/*.java src/org/opencv/*/*.java
+aapt package -F build/opencv.jar -M AndroidManifest.xml -S res/ -I $ANDROID_SDK_HOME/platforms/android-21/android.jar build/obj/
+popd
+```
+
+### Build app
+
+```bash
+pushd android
+mkdir -p build/gen/ build/obj/ build/bin/lib/ build/bin/assets/
+aapt package -m -J build/gen/ -M AndroidManifest.xml -S $OPENCV_SDK_JAVA/res/ -S res/ -I $ANDROID_SDK_HOME/platforms/android-21/android.jar
+javac -d build/obj/ -bootclasspath $ANDROID_SDK_HOME/platforms/android-21/android.jar -classpath $OPENCV_SDK_JAVA/build/opencv.jar build/gen/com/github/failure_to_thrive/wheat_grains_detector/R.java src/com/github/failure_to_thrive/wheat_grains_detector/MainActivity.java
+dx --dex --output=build/bin/classes.dex $OPENCV_SDK_JAVA/build/opencv.jar build/obj/
+```
+Copy _``/sdk/native/libs/armeabi-v7a/_ to _build/bin/lib/_.
+Copy _``/sdk/native/libs/arm64-v8a/_ to _build/bin/lib/_.
+Unpack _frozen.zip_ to _build/bin/assets/_.
+```bash
+aapt package -F build/WheatGrainsDetector.unaligned.apk -M AndroidManifest.xml -S $OPENCV_SDK_JAVA/res/ -S res/ -I $ANDROID_SDK_HOME/platforms/android-21/android.jar build/bin/
+apksigner sign --key --cert
build/WheatGrainsDetector.unaligned.apk +zipalign -p 4 build/WheatGrainsDetector.unaligned.apk build/WheatGrainsDetector.apk +adb install build/WheatGrainsDetector.apk +popd +``` +```bash +adb logcat WGD:* *:S +``` + +That's all! diff --git a/android/AndroidManifest.xml b/android/AndroidManifest.xml new file mode 100644 index 0000000..270fd15 --- /dev/null +++ b/android/AndroidManifest.xml @@ -0,0 +1,38 @@ + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/android/res/drawable/gluten_free.png b/android/res/drawable/gluten_free.png new file mode 100644 index 0000000..ab59a7e Binary files /dev/null and b/android/res/drawable/gluten_free.png differ diff --git a/android/res/layout/activity_main.xml b/android/res/layout/activity_main.xml new file mode 100644 index 0000000..c92839b --- /dev/null +++ b/android/res/layout/activity_main.xml @@ -0,0 +1,15 @@ + + + + + diff --git a/android/res/values/strings.xml b/android/res/values/strings.xml new file mode 100644 index 0000000..9d679bc --- /dev/null +++ b/android/res/values/strings.xml @@ -0,0 +1,4 @@ + + + Wheat Grains Detector + diff --git a/android/src/com/github/failure_to_thrive/wheat_grains_detector/MainActivity.java b/android/src/com/github/failure_to_thrive/wheat_grains_detector/MainActivity.java new file mode 100644 index 0000000..9f92fa3 --- /dev/null +++ b/android/src/com/github/failure_to_thrive/wheat_grains_detector/MainActivity.java @@ -0,0 +1,198 @@ +package com.github.failure_to_thrive.wheat_grains_detector; + +import org.opencv.android.BaseLoaderCallback; +import org.opencv.android.CameraBridgeViewBase.CvCameraViewFrame; +import org.opencv.android.LoaderCallbackInterface; +import org.opencv.android.OpenCVLoader; +import org.opencv.core.Mat; +import org.opencv.android.CameraBridgeViewBase; +import org.opencv.android.CameraBridgeViewBase.CvCameraViewListener2; +import org.opencv.core.Core; +import org.opencv.core.Point; +import org.opencv.core.Scalar; +import org.opencv.core.Size; +import 
org.opencv.dnn.Net; +import org.opencv.dnn.Dnn; +import org.opencv.imgproc.Imgproc; + +import android.app.Activity; +import android.os.Bundle; +import android.util.Log; +import android.view.SurfaceView; +import android.view.WindowManager; +import android.content.res.AssetManager; + +import java.io.BufferedInputStream; +import java.io.File; +import java.io.FileOutputStream; +import java.io.IOException; + +public class MainActivity extends Activity implements CvCameraViewListener2 { + private static final String TAG = "WGD"; + + private CameraBridgeViewBase mOpenCvCameraView; + private Net net; +// private String path; + + private BaseLoaderCallback mLoaderCallback = new BaseLoaderCallback(this) { + @Override + public void onManagerConnected(int status) { + switch (status) { + case LoaderCallbackInterface.SUCCESS: + { + Log.i(TAG, "OpenCV loaded successfully"); + mOpenCvCameraView.enableView(); + } break; + default: + { + super.onManagerConnected(status); + } break; + } + } + }; + + /** Called when the activity is first created. */ + @Override + public void onCreate(Bundle savedInstanceState) { + super.onCreate(savedInstanceState); + getWindow().addFlags(WindowManager.LayoutParams.FLAG_KEEP_SCREEN_ON); + + setContentView(R.layout.activity_main); + + mOpenCvCameraView = (CameraBridgeViewBase) findViewById(R.id.CameraView); + + mOpenCvCameraView.setCvCameraViewListener(this); + } + + @Override + public void onPause() + { + super.onPause(); + if (mOpenCvCameraView != null) + mOpenCvCameraView.disableView(); + } + + @Override + public void onResume() + { + super.onResume(); + if (!OpenCVLoader.initDebug()) { + Log.d(TAG, "Internal OpenCV library not found. Using OpenCV Manager for initialization"); + OpenCVLoader.initAsync(OpenCVLoader.OPENCV_VERSION_3_0_0, this, mLoaderCallback); + } else { + Log.d(TAG, "OpenCV library found inside package. 
Using it!");
+            mLoaderCallback.onManagerConnected(LoaderCallbackInterface.SUCCESS);
+        }
+    }
+
+    public void onDestroy() {
+        super.onDestroy();
+        if (mOpenCvCameraView != null)
+            mOpenCvCameraView.disableView();
+    }
+
+    public void onCameraViewStarted(int width, int height) {
+        try {
+            net = Dnn.readNetFromTensorflow(Extract("frozen_inference_graph.pb"), Extract("config.pbtxt"));
+            Log.i(TAG, "Network loaded successfully");
+        } catch (IOException e) {
+            e.printStackTrace();
+            Log.d(TAG, e.getMessage());
+            finish();
+        }
+/*
+        File sd = new File(android.os.Environment.getExternalStorageDirectory(), "WGD_frames/");
+        if (!sd.exists())
+            sd.mkdir();
+        path = sd.getPath();
+*/
+    }
+
+    // Copy a file from the assets to the app's files directory and return its absolute path.
+    private String Extract(String filename) throws IOException {
+        AssetManager assetManager = getAssets();
+        BufferedInputStream inputStream = new BufferedInputStream(assetManager.open(filename));
+        byte[] data = new byte[inputStream.available()];
+        inputStream.read(data);
+        inputStream.close();
+        File outFile = new File(getFilesDir(), filename);
+        FileOutputStream outputStream = new FileOutputStream(outFile);
+        outputStream.write(data);
+        outputStream.close();
+        return outFile.getAbsolutePath();
+    }
+
+    public void onCameraViewStopped() {
+    }
+
+    public Mat onCameraFrame(CvCameraViewFrame inputFrame) {
+        Mat frame = inputFrame.rgba();
+        Imgproc.cvtColor(frame, frame, Imgproc.COLOR_RGBA2RGB);
+
+        int cols = frame.cols();
+        int rows = frame.rows();
+
+        // Pad the frame to make it square. Resizing without preserving the aspect ratio, common as it is, is a bad idea.
+ int largest = Math.max(cols, rows); + Mat square = new Mat(); + Core.copyMakeBorder(frame, square, 0, largest - rows, 0, largest - cols, Core.BORDER_CONSTANT, new Scalar(0, 0, 0)); + + Mat blob = Dnn.blobFromImage(square, 1, + new Size(300, 300), + new Scalar(0, 0, 0), false, false); + net.setInput(blob); + Mat detections = net.forward(); + + detections = detections.reshape(0, (int)detections.total() / 7); + for (int i = 0; i < detections.rows(); ++i) { + double confidence = detections.get(i, 2)[0]; + if (confidence > 0.3) { + int classId = (int)detections.get(i, 1)[0]; + int left = (int)(detections.get(i, 3)[0] * cols); + int top = (int)(detections.get(i, 4)[0] * rows); + int right = (int)(detections.get(i, 5)[0] * cols); + int bottom = (int)(detections.get(i, 6)[0] * rows); + // Bring coordinates back to the original frame. + if (cols > rows) { + top *= (double)cols/rows; + bottom *= (double)cols/rows; + } + else { + left *= (double)rows/cols; + right *= (double)rows/cols; + } + + Scalar color; + switch (classId) { + case 2: + case 3: + color = confidence > 0.6 ? new Scalar(255, 0, 0) : new Scalar(255, 255, 0); + break; + default: + color = new Scalar(127, 127, 127); + } + // Draw rectangle around detected object. + Imgproc.rectangle(frame, new Point(left, top), + new Point(right, bottom), + color, 2); +/* + // Print class and confidence. 
+ String label = String.format("%d %.3f", classId, confidence); + int[] baseline = new int[1]; + Size labelSize = Imgproc.getTextSize(label, Core.FONT_HERSHEY_SIMPLEX, 0.5, 1, baseline); + Imgproc.rectangle(frame, new Point(left, top), + new Point(left + labelSize.width, top + labelSize.height + baseline[0]), + color, Core.FILLED); + Imgproc.putText(frame, label, new Point(left, top + labelSize.height + baseline[0]/2), + Core.FONT_HERSHEY_SIMPLEX, 0.5, new Scalar(0, 0, 0)); +*/ + } + } +/* + Mat save = new Mat(); + Imgproc.cvtColor(frame, save, Imgproc.COLOR_RGB2BGR); + org.opencv.imgcodecs.Imgcodecs.imwrite(path + String.format("/%tj_%1$tH%1$tM%1$tS%1$tL.jpg", new java.util.Date()), save); +*/ + return frame; + } +} diff --git a/colab/Wheat_Grains_Detector.ipynb b/colab/Wheat_Grains_Detector.ipynb new file mode 100644 index 0000000..6b60ac4 --- /dev/null +++ b/colab/Wheat_Grains_Detector.ipynb @@ -0,0 +1,183 @@ +{ + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { + "colab": { + "name": "Wheat Grains Detector", + "version": "0.3.2", + "provenance": [], + "private_outputs": true, + "collapsed_sections": [] + }, + "kernelspec": { + "name": "python2", + "display_name": "Python 2" + }, + "accelerator": "GPU" + }, + "cells": [ + { + "metadata": { + "id": "B1kY1JS6hJw3", + "colab_type": "code", + "colab": {} + }, + "cell_type": "code", + "source": [ + "import tensorflow as tf\n", + "device_name = tf.test.gpu_device_name()\n", + "print (device_name)\n", + "import os\n", + "os.environ['COLAB_TPU_ADDR']" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "R20AobdUkxTX", + "colab_type": "code", + "colab": {} + }, + "cell_type": "code", + "source": [ + "! 
ls -l -a" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "n5mczoqelcVg", + "colab_type": "code", + "colab": {} + }, + "cell_type": "code", + "source": [ + "from google.colab import files\n", + "files.upload()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "a9kj7iAPpsgZ", + "colab_type": "code", + "colab": {} + }, + "cell_type": "code", + "source": [ + "! wget http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v2_coco_2018_03_29.tar.gz\n", + "! tar -xvf ssd_mobilenet_v2_coco_2018_03_29.tar.gz" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "7OpHN-WMtr3e", + "colab_type": "code", + "colab": {} + }, + "cell_type": "code", + "source": [ + "! git clone https://github.com/tensorflow/models.git\n", + "\n", + "! pip install Cython\n", + "! pip install pycocotools\n", + "! apt-get install protobuf-compiler\n", + "! cd models/research/ && protoc object_detection/protos/*.proto --python_out=." + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "ljvVhWFmIGfb", + "colab_type": "code", + "colab": {} + }, + "cell_type": "code", + "source": [ + "#%reset -f\n", + "#! rm -r model\n", + "import sys\n", + "sys.path.append(\"models/research/\")\n", + "sys.path.append(\"models/research/slim/\")\n", + "import os\n", + "tpu_name = os.environ.get('COLAB_TPU_ADDR', None)\n", + "if tpu_name is not None:\n", + " %run models/research/object_detection/model_tpu_main.py --tpu_name=grpc://$tpu_name --pipeline_config_path=ssd_mobilenet_v2_coco.config --model_dir=model/ --alsologtostderr\n", + "else:\n", + " %run models/research/object_detection/model_main.py --pipeline_config_path=ssd_mobilenet_v2_coco.config --model_dir=model/ --alsologtostderr" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "bM9FzIt8ogc8", + "colab_type": "code", + "colab": {} + }, + "cell_type": "code", + "source": [ + "#! 
XZ_OPT=-9 tar -cJvf model.tar.xz model\n", + "#from google.colab import files\n", + "#files.download(\"model.tar.xz\")" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "2Uv-Cobz-7rr", + "colab_type": "code", + "colab": {} + }, + "cell_type": "code", + "source": [ + "%reset -f\n", + "#! rm -r export\n", + "import sys\n", + "sys.path.append(\"models/research/\")\n", + "sys.path.append(\"models/research/slim/\")\n", + "%run models/research/object_detection/export_inference_graph.py --input_type=image_tensor --pipeline_config_path=ssd_mobilenet_v2_coco.config --trained_checkpoint_prefix=model/model.ckpt-5814 --output_directory=export/" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "__WmBDP27mw7", + "colab_type": "code", + "colab": {} + }, + "cell_type": "code", + "source": [ + "! wget https://raw.githubusercontent.com/opencv/opencv/master/samples/dnn/tf_text_graph_ssd.py\n", + "! wget https://raw.githubusercontent.com/opencv/opencv/master/samples/dnn/tf_text_graph_common.py\n", + "%run tf_text_graph_ssd.py --input export/frozen_inference_graph.pb --output config.pbtxt --config ssd_mobilenet_v2_coco.config\n", + "! zip -9 -j frozen export/frozen_inference_graph.pb config.pbtxt\n", + "from google.colab import files\n", + "files.download(\"frozen.zip\")" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "yU2oTJw3LryB", + "colab_type": "code", + "colab": {} + }, + "cell_type": "code", + "source": [ + "#! " + ], + "execution_count": 0, + "outputs": [] + } + ] +} \ No newline at end of file diff --git a/colab/ssd_mobilenet_v2_coco.config b/colab/ssd_mobilenet_v2_coco.config new file mode 100644 index 0000000..939158e --- /dev/null +++ b/colab/ssd_mobilenet_v2_coco.config @@ -0,0 +1,199 @@ +# SSD with Mobilenet v2 configuration for MSCOCO Dataset. 
+# Users should configure the fine_tune_checkpoint field in the train config as +# well as the label_map_path and input_path fields in the train_input_reader and +# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that +# should be configured. + +model { + ssd { + num_classes: 4 + box_coder { + faster_rcnn_box_coder { + y_scale: 10.0 + x_scale: 10.0 + height_scale: 5.0 + width_scale: 5.0 + } + } + matcher { + argmax_matcher { + matched_threshold: 0.5 + unmatched_threshold: 0.5 + ignore_thresholds: false + negatives_lower_than_unmatched: true + force_match_for_each_row: true + } + } + similarity_calculator { + iou_similarity { + } + } + anchor_generator { + ssd_anchor_generator { + num_layers: 6 + min_scale: 0.2 + max_scale: 0.95 + aspect_ratios: 1.0 + aspect_ratios: 2.0 + aspect_ratios: 0.5 + aspect_ratios: 3.0 + aspect_ratios: 0.3333 + } + } + image_resizer { + fixed_shape_resizer { + height: 300 + width: 300 + } + } + box_predictor { + convolutional_box_predictor { + min_depth: 0 + max_depth: 0 + num_layers_before_predictor: 0 + use_dropout: false + dropout_keep_probability: 0.8 + kernel_size: 1 + box_code_size: 4 + apply_sigmoid_to_scores: false + conv_hyperparams { + activation: RELU_6, + regularizer { + l2_regularizer { + weight: 0.00004 + } + } + initializer { + truncated_normal_initializer { + stddev: 0.03 + mean: 0.0 + } + } + batch_norm { + train: true, + scale: true, + center: true, + decay: 0.9997, + epsilon: 0.001, + } + } + } + } + feature_extractor { + type: 'ssd_mobilenet_v2' + min_depth: 16 + depth_multiplier: 1.0 + conv_hyperparams { + activation: RELU_6, + regularizer { + l2_regularizer { + weight: 0.00004 + } + } + initializer { + truncated_normal_initializer { + stddev: 0.03 + mean: 0.0 + } + } + batch_norm { + train: true, + scale: true, + center: true, + decay: 0.9997, + epsilon: 0.001, + } + } + } + loss { + classification_loss { + weighted_sigmoid { + } + } + localization_loss { + weighted_smooth_l1 { + } + } + 
hard_example_miner { + num_hard_examples: 3000 + iou_threshold: 0.99 + loss_type: CLASSIFICATION + max_negatives_per_positive: 3 + min_negatives_per_image: 3 + } + classification_weight: 1.0 + localization_weight: 1.0 + } + normalize_loss_by_num_matches: true + post_processing { + batch_non_max_suppression { + score_threshold: 1e-8 + iou_threshold: 0.6 + max_detections_per_class: 100 + max_total_detections: 100 + } + score_converter: SIGMOID + } + } +} + +train_config: { + batch_size: 24 + optimizer { + rms_prop_optimizer: { + learning_rate: { + exponential_decay_learning_rate { + initial_learning_rate: 0.004 + decay_steps: 800720 + decay_factor: 0.95 + } + } + momentum_optimizer_value: 0.9 + decay: 0.9 + epsilon: 1.0 + } + } +# fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt" + fine_tune_checkpoint: "ssd_mobilenet_v2_coco_2018_03_29/model.ckpt" + fine_tune_checkpoint_type: "detection" + # Note: The below line limits the training process to 200K steps, which we + # empirically found to be sufficient enough to train the pets dataset. This + # effectively bypasses the learning rate schedule (the learning rate will + # never decay). Remove the below line to train indefinitely. + num_steps: 200000 + data_augmentation_options { + random_horizontal_flip { + } + } + data_augmentation_options { + ssd_random_crop { + } + } +} + +train_input_reader: { + tf_record_input_reader { +# input_path: "PATH_TO_BE_CONFIGURED/mscoco_train.record-?????-of-00100" + input_path: "train.record" + } +# label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt" + label_map_path: "label_map.pbtxt" +} + +eval_config: { + num_examples: 8000 + # Note: The below line limits the evaluation process to 10 evaluations. + # Remove the below line to evaluate indefinitely. 
+ max_evals: 10 +} + +eval_input_reader: { + tf_record_input_reader { +# input_path: "PATH_TO_BE_CONFIGURED/mscoco_val.record-?????-of-00010" + input_path: "val.record" + } +# label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt" + label_map_path: "label_map.pbtxt" + shuffle: false + num_readers: 1 +} \ No newline at end of file diff --git a/create_pascal_tf_record_ex.py b/create_pascal_tf_record_ex.py new file mode 100644 index 0000000..245103b --- /dev/null +++ b/create_pascal_tf_record_ex.py @@ -0,0 +1,165 @@ +# Copyright 2017 The TensorFlow Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================== + +# This script has been modified. +# +# The original can be found here: +# https://github.com/tensorflow/models/blob/master/research/object_detection/dataset_tools/create_pascal_tf_record.py + +r"""Convert raw PASCAL dataset to TFRecord for object_detection. 
+ +Example usage: + python create_pascal_tf_record_ex.py \ + --annotations_dir=dataset/train/ \ + --label_map_path=dataset/label_map.pbtxt \ + --output_path=train.record +""" +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function + +import hashlib +import io +import logging +import os +import glob + +from lxml import etree +import PIL.Image +import tensorflow as tf + +from object_detection.utils import dataset_util +from object_detection.utils import label_map_util + + +flags = tf.app.flags +flags.DEFINE_string('annotations_dir', None, + 'Path to annotations directory.') +flags.mark_flag_as_required('annotations_dir') +flags.DEFINE_string('images_dir', None, 'Path to images directory. The same as annotations_dir if omitted.') +flags.DEFINE_string('label_map_path', 'label_map.pbtxt', + 'Path to label map proto') +flags.DEFINE_string('output_path', None, 'Path to output TFRecord') +flags.mark_flag_as_required('output_path') +FLAGS = flags.FLAGS + + +def dict_to_tf_example(data, + images_dir, + label_map_dict): + """Convert XML derived dict to tf.Example proto. + + Notice that this function normalizes the bounding box coordinates provided + by the raw data. + + Args: + data: dict holding PASCAL XML fields for a single image (obtained by + running dataset_util.recursive_parse_xml_to_dict) + images_dir: Path to image described by the PASCAL XML file + label_map_dict: A map from string label names to integers ids. + + Returns: + example: The converted tf.Example. 
+ + Raises: + ValueError: if the image pointed to by data['filename'] is not a valid JPEG + """ + full_path = os.path.join(images_dir, data['filename']) +# full_path = data['path'] + with tf.gfile.GFile(full_path, 'rb') as fid: + encoded_jpg = fid.read() + encoded_jpg_io = io.BytesIO(encoded_jpg) + image = PIL.Image.open(encoded_jpg_io) + if image.format != 'JPEG': + raise ValueError('Image format not JPEG') + key = hashlib.sha256(encoded_jpg).hexdigest() + + width = int(data['size']['width']) + height = int(data['size']['height']) + + xmin = [] + ymin = [] + xmax = [] + ymax = [] + classes = [] + classes_text = [] + truncated = [] + poses = [] + difficult_obj = [] + if 'object' in data: + for obj in data['object']: + difficult = bool(int(obj['difficult'])) + difficult_obj.append(int(difficult)) + + xmin.append(float(obj['bndbox']['xmin']) / width) + ymin.append(float(obj['bndbox']['ymin']) / height) + xmax.append(float(obj['bndbox']['xmax']) / width) + ymax.append(float(obj['bndbox']['ymax']) / height) + classes_text.append(obj['name'].encode('utf8')) + classes.append(label_map_dict[obj['name']]) + truncated.append(int(obj['truncated'])) + poses.append(obj['pose'].encode('utf8')) + + example = tf.train.Example(features=tf.train.Features(feature={ + 'image/height': dataset_util.int64_feature(height), + 'image/width': dataset_util.int64_feature(width), + 'image/filename': dataset_util.bytes_feature( + data['filename'].encode('utf8')), + 'image/source_id': dataset_util.bytes_feature( + data['filename'].encode('utf8')), + 'image/key/sha256': dataset_util.bytes_feature(key.encode('utf8')), + 'image/encoded': dataset_util.bytes_feature(encoded_jpg), + 'image/format': dataset_util.bytes_feature('jpeg'.encode('utf8')), + 'image/object/bbox/xmin': dataset_util.float_list_feature(xmin), + 'image/object/bbox/xmax': dataset_util.float_list_feature(xmax), + 'image/object/bbox/ymin': dataset_util.float_list_feature(ymin), + 'image/object/bbox/ymax': 
dataset_util.float_list_feature(ymax), + 'image/object/class/text': dataset_util.bytes_list_feature(classes_text), + 'image/object/class/label': dataset_util.int64_list_feature(classes), + 'image/object/difficult': dataset_util.int64_list_feature(difficult_obj), + 'image/object/truncated': dataset_util.int64_list_feature(truncated), + 'image/object/view': dataset_util.bytes_list_feature(poses), + })) + return example + + +def main(_): + annotations_dir = FLAGS.annotations_dir + + images_dir = FLAGS.images_dir + if not images_dir: + images_dir = annotations_dir + + writer = tf.python_io.TFRecordWriter(FLAGS.output_path) + + label_map_dict = label_map_util.get_label_map_dict(FLAGS.label_map_path) + + annotations_list = glob.glob(os.path.join(annotations_dir, '*.xml')) + for idx, xml_file in enumerate(annotations_list): + if idx % 100 == 0: + logging.info('On image %d of %d', idx, len(annotations_list)) + with tf.gfile.GFile(xml_file, 'r') as fid: + xml_str = fid.read() + xml = etree.fromstring(xml_str) + data = dataset_util.recursive_parse_xml_to_dict(xml)['annotation'] + + tf_example = dict_to_tf_example(data, images_dir, label_map_dict) + writer.write(tf_example.SerializeToString()) + + writer.close() + + +if __name__ == '__main__': + tf.app.run() diff --git a/dataset/label_map.pbtxt b/dataset/label_map.pbtxt new file mode 100644 index 0000000..78635ae --- /dev/null +++ b/dataset/label_map.pbtxt @@ -0,0 +1,16 @@ +item { + id: 1 + name: 'litter' +} +item { + id: 2 + name: 'wheat' +} +item { + id: 3 + name: 'oats' +} +item { + id: 4 + name: 'semka' +} diff --git a/dataset/train/P_20181107_130553_1.jpg b/dataset/train/P_20181107_130553_1.jpg new file mode 100644 index 0000000..f81dbbf Binary files /dev/null and b/dataset/train/P_20181107_130553_1.jpg differ diff --git a/dataset/train/P_20181107_130553_1.xml b/dataset/train/P_20181107_130553_1.xml new file mode 100644 index 0000000..8227e3d --- /dev/null +++ b/dataset/train/P_20181107_130553_1.xml @@ -0,0 +1,146 @@ + + 
train + P_20181107_130553_1.jpg + dataset/train/P_20181107_130553_1.jpg + + Unknown + + + 1152 + 1152 + 3 + + 0 + + wheat + Unspecified + 0 + 0 + + 233 + 565 + 329 + 636 + + + + wheat + Unspecified + 0 + 0 + + 373 + 405 + 482 + 471 + + + + wheat + Unspecified + 0 + 0 + + 510 + 329 + 573 + 444 + + + + wheat + Unspecified + 0 + 0 + + 748 + 200 + 872 + 263 + + + + wheat + Unspecified + 0 + 0 + + 709 + 765 + 763 + 874 + + + + wheat + Unspecified + 0 + 0 + + 688 + 436 + 773 + 532 + + + + wheat + Unspecified + 0 + 0 + + 800 + 469 + 888 + 556 + + + + wheat + Unspecified + 0 + 0 + + 842 + 691 + 940 + 757 + + + + wheat + Unspecified + 0 + 0 + + 770 + 323 + 836 + 422 + + + + wheat + Unspecified + 0 + 0 + + 567 + 836 + 650 + 910 + + + + semka + Unspecified + 0 + 0 + + 351 + 666 + 469 + 740 + + + diff --git a/dataset/train/P_20181107_130733_1.jpg b/dataset/train/P_20181107_130733_1.jpg new file mode 100644 index 0000000..1670025 Binary files /dev/null and b/dataset/train/P_20181107_130733_1.jpg differ diff --git a/dataset/train/P_20181107_130733_1.xml b/dataset/train/P_20181107_130733_1.xml new file mode 100644 index 0000000..4d4f273 --- /dev/null +++ b/dataset/train/P_20181107_130733_1.xml @@ -0,0 +1,146 @@ + + train + P_20181107_130733_1.jpg + dataset/train/P_20181107_130733_1.jpg + + Unknown + + + 1152 + 1152 + 3 + + 0 + + wheat + Unspecified + 0 + 0 + + 329 + 323 + 405 + 400 + + + + wheat + Unspecified + 0 + 0 + + 390 + 562 + 445 + 636 + + + + wheat + Unspecified + 0 + 0 + + 403 + 452 + 474 + 529 + + + + wheat + Unspecified + 0 + 0 + + 625 + 411 + 713 + 471 + + + + wheat + Unspecified + 0 + 0 + + 691 + 441 + 773 + 523 + + + + wheat + Unspecified + 0 + 0 + + 704 + 600 + 759 + 693 + + + + wheat + Unspecified + 0 + 0 + + 813 + 458 + 868 + 540 + + + + wheat + Unspecified + 0 + 0 + + 883 + 724 + 946 + 811 + + + + wheat + Unspecified + 0 + 0 + + 888 + 501 + 943 + 589 + + + + wheat + Unspecified + 0 + 0 + + 842 + 669 + 905 + 759 + + + + semka + Unspecified + 0 + 0 + + 597 + 548 + 
696 + 611 + + + diff --git a/dataset/train/P_20181107_130825_1.jpg b/dataset/train/P_20181107_130825_1.jpg new file mode 100644 index 0000000..a378799 Binary files /dev/null and b/dataset/train/P_20181107_130825_1.jpg differ diff --git a/dataset/train/P_20181107_130825_1.xml b/dataset/train/P_20181107_130825_1.xml new file mode 100644 index 0000000..085aaa7 --- /dev/null +++ b/dataset/train/P_20181107_130825_1.xml @@ -0,0 +1,146 @@ + + train + P_20181107_130825_1.jpg + dataset/train/P_20181107_130825_1.jpg + + Unknown + + + 1152 + 1152 + 3 + + 0 + + wheat + Unspecified + 0 + 0 + + 255 + 606 + 320 + 702 + + + + wheat + Unspecified + 0 + 0 + + 279 + 301 + 381 + 375 + + + + wheat + Unspecified + 0 + 0 + + 367 + 384 + 447 + 471 + + + + wheat + Unspecified + 0 + 0 + + 582 + 839 + 637 + 932 + + + + wheat + Unspecified + 0 + 0 + + 696 + 740 + 751 + 817 + + + + wheat + Unspecified + 0 + 0 + + 778 + 625 + 864 + 707 + + + + wheat + Unspecified + 0 + 0 + + 773 + 348 + 869 + 436 + + + + wheat + Unspecified + 0 + 0 + + 866 + 260 + 951 + 337 + + + + wheat + Unspecified + 0 + 0 + + 532 + 246 + 608 + 320 + + + + wheat + Unspecified + 0 + 0 + + 781 + 436 + 880 + 510 + + + + semka + Unspecified + 0 + 0 + + 342 + 850 + 447 + 918 + + + diff --git a/dataset/train/P_20181107_131007_1.jpg b/dataset/train/P_20181107_131007_1.jpg new file mode 100644 index 0000000..730fab2 Binary files /dev/null and b/dataset/train/P_20181107_131007_1.jpg differ diff --git a/dataset/train/P_20181107_131007_1.xml b/dataset/train/P_20181107_131007_1.xml new file mode 100644 index 0000000..89ff08b --- /dev/null +++ b/dataset/train/P_20181107_131007_1.xml @@ -0,0 +1,146 @@ + + train + P_20181107_131007_1.jpg + dataset/train/P_20181107_131007_1.jpg + + Unknown + + + 1152 + 1152 + 3 + + 0 + + semka + Unspecified + 0 + 0 + + 337 + 474 + 405 + 581 + + + + wheat + Unspecified + 0 + 0 + + 263 + 222 + 370 + 296 + + + + wheat + Unspecified + 0 + 0 + + 364 + 224 + 466 + 285 + + + + wheat + Unspecified + 0 + 0 + + 625 + 
255 + 729 + 331 + + + + wheat + Unspecified + 0 + 0 + + 554 + 614 + 647 + 672 + + + + wheat + Unspecified + 0 + 0 + + 581 + 342 + 636 + 460 + + + + wheat + Unspecified + 0 + 0 + + 704 + 364 + 784 + 458 + + + + wheat + Unspecified + 0 + 0 + + 743 + 507 + 814 + 606 + + + + wheat + Unspecified + 0 + 0 + + 440 + 501 + 495 + 608 + + + + wheat + Unspecified + 0 + 0 + + 672 + 628 + 762 + 688 + + + + wheat + Unspecified + 0 + 0 + + 700 + 298 + 755 + 400 + + + diff --git a/dataset/train/P_20181107_131146_1.jpg b/dataset/train/P_20181107_131146_1.jpg new file mode 100644 index 0000000..9ace024 Binary files /dev/null and b/dataset/train/P_20181107_131146_1.jpg differ diff --git a/dataset/train/P_20181107_131146_1.xml b/dataset/train/P_20181107_131146_1.xml new file mode 100644 index 0000000..25a59bc --- /dev/null +++ b/dataset/train/P_20181107_131146_1.xml @@ -0,0 +1,146 @@ + + train + P_20181107_131146_1.jpg + dataset/train/P_20181107_131146_1.jpg + + Unknown + + + 1152 + 1152 + 3 + + 0 + + semka + Unspecified + 0 + 0 + + 227 + 748 + 337 + 822 + + + + wheat + Unspecified + 0 + 0 + + 364 + 452 + 438 + 556 + + + + wheat + Unspecified + 0 + 0 + + 263 + 304 + 340 + 394 + + + + wheat + Unspecified + 0 + 0 + + 518 + 430 + 622 + 499 + + + + wheat + Unspecified + 0 + 0 + + 567 + 309 + 677 + 367 + + + + wheat + Unspecified + 0 + 0 + + 543 + 565 + 650 + 639 + + + + wheat + Unspecified + 0 + 0 + + 885 + 477 + 940 + 581 + + + + wheat + Unspecified + 0 + 0 + + 1053 + 554 + 1121 + 663 + + + + wheat + Unspecified + 0 + 0 + + 666 + 444 + 768 + 507 + + + + wheat + Unspecified + 0 + 0 + + 617 + 625 + 677 + 721 + + + + wheat + Unspecified + 0 + 0 + + 798 + 573 + 855 + 672 + + + diff --git a/dataset/train/P_20181107_131335_1.jpg b/dataset/train/P_20181107_131335_1.jpg new file mode 100644 index 0000000..199c3bf Binary files /dev/null and b/dataset/train/P_20181107_131335_1.jpg differ diff --git a/dataset/train/P_20181107_131335_1.xml b/dataset/train/P_20181107_131335_1.xml new file mode 100644 
index 0000000..63ff26b --- /dev/null +++ b/dataset/train/P_20181107_131335_1.xml @@ -0,0 +1,146 @@ + + train + P_20181107_131335_1.jpg + dataset/train/P_20181107_131335_1.jpg + + Unknown + + + 1152 + 1153 + 3 + + 0 + + wheat + Unspecified + 0 + 0 + + 298 + 763 + 367 + 886 + + + + wheat + Unspecified + 0 + 0 + + 413 + 991 + 493 + 1098 + + + + wheat + Unspecified + 0 + 0 + + 468 + 969 + 570 + 1076 + + + + wheat + Unspecified + 0 + 0 + + 507 + 741 + 597 + 853 + + + + wheat + Unspecified + 0 + 0 + + 578 + 815 + 658 + 905 + + + + wheat + Unspecified + 0 + 0 + + 639 + 686 + 738 + 760 + + + + wheat + Unspecified + 0 + 0 + + 639 + 754 + 743 + 840 + + + + wheat + Unspecified + 0 + 0 + + 570 + 738 + 680 + 793 + + + + wheat + Unspecified + 0 + 0 + + 188 + 236 + 257 + 348 + + + + wheat + Unspecified + 0 + 0 + + 259 + 274 + 364 + 367 + + + + semka + Unspecified + 0 + 0 + + 482 + 101 + 611 + 186 + + + diff --git a/dataset/train/P_20181107_131549_1.jpg b/dataset/train/P_20181107_131549_1.jpg new file mode 100644 index 0000000..cadf074 Binary files /dev/null and b/dataset/train/P_20181107_131549_1.jpg differ diff --git a/dataset/train/P_20181107_131549_1.xml b/dataset/train/P_20181107_131549_1.xml new file mode 100644 index 0000000..d6ea52d --- /dev/null +++ b/dataset/train/P_20181107_131549_1.xml @@ -0,0 +1,146 @@ + + train + P_20181107_131549_1.jpg + dataset/train/P_20181107_131549_1.jpg + + Unknown + + + 1152 + 1152 + 3 + + 0 + + wheat + Unspecified + 0 + 0 + + 323 + 556 + 403 + 663 + + + + wheat + Unspecified + 0 + 0 + + 373 + 249 + 455 + 362 + + + + wheat + Unspecified + 0 + 0 + + 304 + 504 + 416 + 567 + + + + wheat + Unspecified + 0 + 0 + + 444 + 386 + 548 + 480 + + + + wheat + Unspecified + 0 + 0 + + 370 + 455 + 480 + 526 + + + + wheat + Unspecified + 0 + 0 + + 589 + 650 + 710 + 737 + + + + wheat + Unspecified + 0 + 0 + + 685 + 362 + 754 + 469 + + + + wheat + Unspecified + 0 + 0 + + 765 + 562 + 842 + 666 + + + + wheat + Unspecified + 0 + 0 + + 576 + 200 + 663 + 290 + + + + 
wheat + Unspecified + 0 + 0 + + 718 + 822 + 792 + 916 + + + + semka + Unspecified + 0 + 0 + + 765 + 183 + 844 + 301 + + + diff --git a/dataset/train/P_20181107_131816_1.jpg b/dataset/train/P_20181107_131816_1.jpg new file mode 100644 index 0000000..00fa315 Binary files /dev/null and b/dataset/train/P_20181107_131816_1.jpg differ diff --git a/dataset/train/P_20181107_131816_1.xml b/dataset/train/P_20181107_131816_1.xml new file mode 100644 index 0000000..734ab67 --- /dev/null +++ b/dataset/train/P_20181107_131816_1.xml @@ -0,0 +1,146 @@ + + train + P_20181107_131816_1.jpg + dataset/train/P_20181107_131816_1.jpg + + Unknown + + + 1152 + 1153 + 3 + + 0 + + wheat + Unspecified + 0 + 0 + + 411 + 175 + 523 + 263 + + + + wheat + Unspecified + 0 + 0 + + 482 + 431 + 570 + 524 + + + + wheat + Unspecified + 0 + 0 + + 727 + 560 + 837 + 628 + + + + wheat + Unspecified + 0 + 0 + + 581 + 321 + 677 + 436 + + + + wheat + Unspecified + 0 + 0 + + 724 + 323 + 820 + 428 + + + + wheat + Unspecified + 0 + 0 + + 859 + 389 + 944 + 488 + + + + wheat + Unspecified + 0 + 0 + + 911 + 538 + 1024 + 601 + + + + wheat + Unspecified + 0 + 0 + + 1032 + 741 + 1090 + 853 + + + + wheat + Unspecified + 0 + 0 + + 499 + 181 + 614 + 258 + + + + wheat + Unspecified + 0 + 0 + + 284 + 332 + 350 + 444 + + + + semka + Unspecified + 0 + 0 + + 713 + 787 + 848 + 872 + + + diff --git a/dataset/val/P_20181107_132140_1.jpg b/dataset/val/P_20181107_132140_1.jpg new file mode 100644 index 0000000..8b7f050 Binary files /dev/null and b/dataset/val/P_20181107_132140_1.jpg differ diff --git a/dataset/val/P_20181107_132140_1.xml b/dataset/val/P_20181107_132140_1.xml new file mode 100644 index 0000000..93906c5 --- /dev/null +++ b/dataset/val/P_20181107_132140_1.xml @@ -0,0 +1,146 @@ + + val + P_20181107_132140_1.jpg + dataset/val/P_20181107_132140_1.jpg + + Unknown + + + 1152 + 1153 + 3 + + 0 + + wheat + Unspecified + 0 + 0 + + 276 + 807 + 408 + 900 + + + + wheat + Unspecified + 0 + 0 + + 320 + 560 + 457 + 647 + + + + wheat 
+ Unspecified + 0 + 0 + + 449 + 518 + 565 + 598 + + + + wheat + Unspecified + 0 + 0 + + 812 + 658 + 892 + 787 + + + + wheat + Unspecified + 0 + 0 + + 834 + 960 + 958 + 1062 + + + + wheat + Unspecified + 0 + 0 + + 680 + 579 + 826 + 667 + + + + wheat + Unspecified + 0 + 0 + + 334 + 271 + 460 + 345 + + + + wheat + Unspecified + 0 + 0 + + 837 + 343 + 914 + 477 + + + + wheat + Unspecified + 0 + 0 + + 894 + 439 + 974 + 571 + + + + wheat + Unspecified + 0 + 0 + + 636 + 617 + 716 + 738 + + + + semka + Unspecified + 0 + 0 + + 600 + 787 + 685 + 930 + + + diff --git a/dataset/val/P_20181107_132448_1.jpg b/dataset/val/P_20181107_132448_1.jpg new file mode 100644 index 0000000..64cfa23 Binary files /dev/null and b/dataset/val/P_20181107_132448_1.jpg differ diff --git a/dataset/val/P_20181107_132448_1.xml b/dataset/val/P_20181107_132448_1.xml new file mode 100644 index 0000000..6edfa78 --- /dev/null +++ b/dataset/val/P_20181107_132448_1.xml @@ -0,0 +1,146 @@ + + val + P_20181107_132448_1.jpg + dataset/val/P_20181107_132448_1.jpg + + Unknown + + + 1152 + 1152 + 3 + + 0 + + wheat + Unspecified + 1 + 0 + + 1 + 600 + 104 + 691 + + + + wheat + Unspecified + 0 + 0 + + 274 + 691 + 381 + 768 + + + + wheat + Unspecified + 0 + 0 + + 449 + 768 + 532 + 869 + + + + wheat + Unspecified + 0 + 0 + + 471 + 672 + 578 + 751 + + + + wheat + Unspecified + 0 + 0 + + 822 + 677 + 888 + 792 + + + + wheat + Unspecified + 0 + 0 + + 718 + 943 + 806 + 1058 + + + + wheat + Unspecified + 0 + 0 + + 208 + 386 + 318 + 455 + + + + wheat + Unspecified + 0 + 0 + + 488 + 315 + 551 + 438 + + + + wheat + Unspecified + 0 + 0 + + 597 + 551 + 707 + 614 + + + + wheat + Unspecified + 0 + 0 + + 397 + 356 + 474 + 449 + + + + semka + Unspecified + 0 + 0 + + 178 + 916 + 268 + 1036 + + + diff --git a/ezgif-1-138e5d970f8d.gif b/ezgif-1-138e5d970f8d.gif new file mode 100644 index 0000000..5534b8f Binary files /dev/null and b/ezgif-1-138e5d970f8d.gif differ diff --git a/frozen.zip b/frozen.zip new file mode 100644 index 
0000000..95c8d28 Binary files /dev/null and b/frozen.zip differ
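
The annotation files in this diff look like Pascal VOC XML of the kind `LabelImg` writes, although the tags themselves are not visible in this listing. The following is a minimal sketch, assuming the standard VOC element names (`size/width`, `size/height`, `object/name`, `bndbox/xmin` and so on), of how one might tally the labels (`wheat`, `semka`) and flag boxes that fall outside the image before generating the TFRecords. The helper name `summarize_annotation` and the inline `SAMPLE` document are hypothetical; only the numbers in `SAMPLE` are taken from _dataset/val/P_20181107_132448_1.xml_.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def summarize_annotation(xml_text):
    """Return (label_counts, problems) for one Pascal VOC style annotation.

    Assumes the standard VOC tag names; adjust them if the real files differ.
    """
    root = ET.fromstring(xml_text)
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))

    counts = Counter()
    problems = []
    for obj in root.iter("object"):
        label = obj.findtext("name")
        counts[label] += 1
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        # A usable box has positive area and lies inside the image.
        if not (0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height):
            problems.append((label, (xmin, ymin, xmax, ymax)))
    return counts, problems


# Hypothetical two-object document; tag layout is assumed, values come from the diff.
SAMPLE = """<annotation>
  <size><width>1152</width><height>1152</height><depth>3</depth></size>
  <object>
    <name>wheat</name>
    <bndbox><xmin>1</xmin><ymin>600</ymin><xmax>104</xmax><ymax>691</ymax></bndbox>
  </object>
  <object>
    <name>semka</name>
    <bndbox><xmin>178</xmin><ymin>916</ymin><xmax>268</xmax><ymax>1036</ymax></bndbox>
  </object>
</annotation>"""

counts, problems = summarize_annotation(SAMPLE)
print(dict(counts), problems)
```

Running the same function over every `.xml` file in _dataset/train_ and _dataset/val_ would give a quick check of the class balance between `wheat` and `semka` and of any box that exceeds the recorded image size.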