Merge pull request #102 from GDSC-Delft-Dev/dev
Draft: Sprint 4 merge
paulmis authored Mar 13, 2023
2 parents 7864f5f + 783a4b1 commit 3e49168
Showing 94 changed files with 16,431 additions and 596 deletions.
54 changes: 52 additions & 2 deletions .github/workflows/pipeline.yml
@@ -2,8 +2,8 @@ name: Pipeline build

on:
push:
paths:
- 'src/backend'
#paths:
# - 'src/backend/**'
branches:
- '*'

@@ -13,7 +13,57 @@ env:
GAR_REPOSITORY: pipelines-dev

jobs:
lint:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v3

- uses: actions/setup-python@v4
name: Setup Python 3.10
with:
python-version: "3.10"
cache: 'pip'
- run: |
cd src/backend
python -m pip install --upgrade pip
pip install mypy pylint
pip install -r requirements.txt
- name: Run pylint
run: |
cd src/backend
pylint ./pipeline --fail-under 9
- name: Run mypy
run: |
cd src/backend
mypy . --explicit-package-bases
test:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v2

- name: Set up Python 3.10
uses: actions/setup-python@v4
with:
python-version: "3.10"

- name: Install dependencies
run: |
cd src/backend
python -m pip install --upgrade pip
pip install pytest
pip install -r requirements.txt
- name: Run pytest
run: pytest

deploy:
needs: [lint, test]

# Add 'id-token' with the intended permissions for workload identity federation
permissions:
contents: 'read'
3 changes: 3 additions & 0 deletions .gitignore
@@ -6,5 +6,8 @@ __pycache__
.mypy_cache
.pytest_cache
expected_preprocess_masked.npy
.vscode
*json
nutrient_masks.npy
.vscode
.idea
96 changes: 87 additions & 9 deletions README.md
@@ -1,24 +1,102 @@
# apa
Autonomous precision agriculture with UAVs
<!-- TODO: add code coverage? -->
# Terrafarm

![example workflow](https://github.com/GDSC-Delft-Dev/apa/actions/workflows/pipeline.yml/badge.svg)

# Setup
## About

### The problem we're solving
Growing (high-quality) crops sustainably for an ever-increasing population is one of the biggest challenges we face today, as farmers all over the world face complex decision-making problems for a vast number of crops. To this end, a variety of parameters need to be traced - think of fertilizer application, soil humidity, or nutrient availability.

In traditional agriculture, fields are treated as homogeneous entities, which generally leads to sub-optimal treatment due to a lack of (localized) traceability. This is problematic, as an oversupply of agricultural inputs leads to environmental pollution. Moreover, unnecessarily large quantities can go to waste if produce is not harvested at its optimal time. Finally, this leads to low yield density and hence missed profits for farmers.

[Precision agriculture](https://en.wikipedia.org/wiki/Precision_agriculture), on the other hand, aims to produce more crops with fewer resources while maintaining quality. This sustainable agricultural model utilizes IT solutions to allow for localized treatment to a much finer degree. This paradigm shift is becoming increasingly urgent given the worldwide increase in food demand: the number of people who will require food in 2050 is estimated at nine billion.

### Our solution
Our **mobile app Terrafarm** allows farmers to perform **smart monitoring, analysis and planning** in an intuitive and affordable manner. In fact, our system uses **image processing and deep learning** to extract **actionable insights** from multispectral drone images. These insights - think of pest infestations, moisture content or nutrient deficiencies - are visualized to users, thereby providing full transparency. We aim to target both small- and medium-scale farmers. Detailed information about our image processing pipeline and Flutter mobile app can be found under `apa/src/backend` and `apa/src/frontend` respectively.

<div>
<img src="assets/Terrafarm-poster-0.jpg" alt="Image 1" width="500" style="display:inline-block;">
<img src="assets/Terrafarm-poster-1.jpg" alt="Image 2" width="500" style="display:inline-block;">
</div>

<p style="text-align:center;">Figure: Information poster presenting Terrafarm</p>



# Build Tools

![image](https://img.shields.io/badge/Flutter-02569B?style=for-the-badge&logo=flutter&logoColor=white)
<br/>
![image](https://img.shields.io/badge/Dart-0175C2?style=for-the-badge&logo=dart&logoColor=white)
<br/>
![image](https://img.shields.io/badge/Python-FFD43B?style=for-the-badge&logo=python&logoColor=blue)
<br/>
![image](https://img.shields.io/badge/firebase-ffca28?style=for-the-badge&logo=firebase&logoColor=black)
<br/>
![image](https://img.shields.io/badge/Google_Cloud-4285F4?style=for-the-badge&logo=google-cloud&logoColor=white)
<br/>
![image](https://img.shields.io/badge/TensorFlow-FF6F00?style=for-the-badge&logo=tensorflow&logoColor=white)
<br/>
![image](https://img.shields.io/badge/OpenCV-27338e?style=for-the-badge&logo=OpenCV&logoColor=white)
<br/>
![image](https://img.shields.io/badge/GitHub_Actions-2088FF?style=for-the-badge&logo=github-actions&logoColor=white)

# Getting Started

Follow these steps to set up your project locally.

Clone the repo
```
# Clone the repo
git clone https://github.com/GDSC-Delft-Dev/apa.git
```

## Setup backend

Set up a virtual Python environment
```
pip install virtualenv
virtualenv env
```

Activate on macOS or Linux
```
source env/bin/activate
```

Activate on Windows
```
source env/Scripts/activate
```

Install Python requirements
```
pip install -r requirements.txt
```
Please refer to `apa/src/backend/README.md` for detailed information on the image processing pipeline.
<!-- TODO: Perhaps more info? -->

## Setup frontend
Please refer to `apa/src/frontend/README.md`.


# Contributing
Anyone who is eager to contribute to this project is very welcome to do so. Simply take the following steps:
1. Fork the project
2. Create your own feature branch
3. Commit your changes
4. Push to the `dev` branch and open a PR

# Datasets
You can play with the datasets in the `notebooks` folder.


# License
Distributed under the MIT License. See `LICENSE.txt` for more information.

# Contact
- Google Developers Student Club Delft - [email protected]
- Paul Misterka - [email protected]
- Mircea Lica - [email protected]
- David Dinucu-Jianu - [email protected]
- Nadine Kuo - [email protected]
<!-- Not sure if I shou -->
Binary file added assets/Terrafarm-poster-0.jpg
Binary file added assets/Terrafarm-poster-1.jpg
6 changes: 3 additions & 3 deletions src/backend/README.md
@@ -32,15 +32,15 @@ Our project uses `mypy` and `pylint` to assert the quality of the code. You can

```
python -m mypy . --explicit-package-bases
python -m pylint ../pipeline
python -m pylint ./pipeline
```

### CI/CD
The CI/CD pushes the build from the latest commit to the `pipelines-dev` repository in the Google Artifact Registry.

### Modules
To make the code extendible, maintainable, and multithreaded, the pipeline is divided into modules. Modules are run sequentially, and each can have multiple implementations that execute different logic, but compute the same type of data. We distinguish the following modules:
- Mosaicing module - transforms the flyover images into a single farmland bird's eye view image
- Mosaicing module - transforms the flyover images into a single farmland bird's eye view image. Moreover, the module creates non-overlapping patches used in subsequent pipeline stages.
- Index module - computes pixel-level indices that provide general information about the field
- Insight module - evaluates the database and indices to provide actionable and localized insights that identify issues and improve farming efficiency
- Segmentation module - computes pixel-level masks for a number of different annotations that can be directly shown to the user
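The sequential module flow described above can be sketched as follows; the `Module` base class and `run` signature here are simplified assumptions for illustration, not the repository's actual API:

```python
from typing import Any


class Module:
    """Illustrative base class: each module transforms the shared pipeline data."""
    def run(self, data: dict[str, Any]) -> dict[str, Any]:
        raise NotImplementedError


class MosaicingModule(Module):
    def run(self, data: dict[str, Any]) -> dict[str, Any]:
        # Stitch flyover images into a single bird's eye view (stubbed)
        data["mosaic"] = "stitched image"
        return data


class IndexModule(Module):
    def run(self, data: dict[str, Any]) -> dict[str, Any]:
        # Compute pixel-level indices such as NDVI (stubbed)
        data["indices"] = {"NDVI": "per-pixel values"}
        return data


def run_pipeline(modules: list[Module], data: dict[str, Any]) -> dict[str, Any]:
    # Modules run sequentially; each consumes the previous module's output
    for module in modules:
        data = module.run(data)
    return data


result = run_pipeline([MosaicingModule(), IndexModule()], {})
```

Because every module computes the same kind of shared data object, swapping in an alternative implementation of a stage only requires changing the module list.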
@@ -87,4 +87,4 @@ The module is a logical part of the image processing pipeline, chained sequentially.
The parallel module is a module that can run multiple threads of execution at the same time, essentially allowing parallel module invocations. Parallel modules implement logical groups of functionalities, such as the calculation of all indices (e.g. `NDVI` and `Moisture`) that do not rely on each other.
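A minimal sketch of that idea using `asyncio` (the repository's `main.py` does call `asyncio.run(pipeline.run(imgs))`, but the function names below are hypothetical):

```python
import asyncio


async def compute_ndvi(data: dict) -> tuple[str, list[float]]:
    # Stand-in for a real NDVI computation; independent of the other indices
    await asyncio.sleep(0)
    return ("NDVI", [0.1, 0.5])


async def compute_moisture(data: dict) -> tuple[str, list[float]]:
    # Stand-in for a real moisture computation
    await asyncio.sleep(0)
    return ("Moisture", [0.3, 0.2])


async def run_parallel(data: dict) -> dict[str, list[float]]:
    # Run independent index calculations concurrently and collect the results
    results = await asyncio.gather(compute_ndvi(data), compute_moisture(data))
    return dict(results)


indices = asyncio.run(run_parallel({}))
```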

#### Pipeline data object
The data object contains all data relevant to the pipeline job. The pipeline initializes the data object dynamically through the use of the `prepare()` method. Note that, similarly to constructors, the preparation of your implementation should follow the preparation of the base class.
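The constructor-like ordering of `prepare()` calls can be sketched like this (the class and field names are hypothetical):

```python
class PipelineData:
    """Illustrative base data object for a pipeline job."""
    def prepare(self) -> None:
        # Base preparation runs first, like a base-class constructor
        self.images: list = []


class NutrientData(PipelineData):
    def prepare(self) -> None:
        super().prepare()          # base preparation precedes subclass preparation
        self.nutrient_masks: dict = {}


data = NutrientData()
data.prepare()
```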
23 changes: 16 additions & 7 deletions src/backend/main.py
@@ -1,20 +1,29 @@
import glob
from pipeline.templates import full_pipeline, default_pipeline, nutrient_pipeline
from pipeline.mat import Mat
from pipeline.templates import full_pipeline, default_pipeline, training_pipeline, nutrient_pipeline
import firebase_admin
from firebase_admin import credentials, firestore
import asyncio
from pipeline.mat import Mat, Channels
import numpy as np

def main():
"""Main entry point."""

# Get test data
import os
imgs = [Mat.read(file) for file in glob.glob("./pipeline/test/data/mosaicing/farm/D*.JPG")]
imgs = [Mat.read(file) for file in glob.glob("pipeline/test/data/mosaicing/farm/D*.JPG")]
imgs = imgs[:1]

# Run the pipeline
print(len(imgs))
pipeline = nutrient_pipeline()
pipeline.show()
res = pipeline.run(imgs)
# Print the result

# Authenticate to firebase
if pipeline.config.cloud.use_cloud:
cred = credentials.Certificate("terrafarm-378218-firebase-adminsdk-nept9-e49d1713c7.json")
firebase_admin.initialize_app(cred)

# Run the pipeline
res = asyncio.run(pipeline.run(imgs))
print(res)

if __name__ == "__main__":
3 changes: 2 additions & 1 deletion src/backend/mypy.ini
@@ -1,2 +1,3 @@
[mypy]
ignore_missing_imports = True
ignore_missing_imports = True
disable_error_code = attr-defined, call-overload
16 changes: 12 additions & 4 deletions src/backend/pipeline/config.py
@@ -2,16 +2,24 @@
from .modules.module import Module
from .modules.runnable import Runnable
from typing import Any, Type
from dataclasses import dataclass

@dataclass
class CloudConfig:
"""Configures cloud resources."""
use_cloud: bool = False
bucket_name: str = ""

class Config:
"""
Initializes the config.
Args:
use_cloud: whether use cloud resources, e.g. persist data to cloud storage.
If False, the user doesn't need to provide GCP credentials.
modules: dictionary of modules to initialize and their initialization data
"""
def __init__(self, modules: dict[Type[Module], Any],
parallel_modules: dict[Type[Module], Any]):
def __init__(self, modules: dict[Type[Module], Any], cloud: CloudConfig = CloudConfig()):
assert len(modules) > 0, "No modules specified"
self.modules: dict[Type[Module], Any] = modules
self.parallel_modules: dict[Type[Module], Any] = parallel_modules

self.cloud: CloudConfig = cloud
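Based on the diff above, constructing a local versus a cloud-backed configuration might look like this; the module names and bucket name are placeholders:

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class CloudConfig:
    """Configures cloud resources (mirrors the dataclass in the diff)."""
    use_cloud: bool = False
    bucket_name: str = ""


class Config:
    def __init__(self, modules: dict[str, Any], cloud: CloudConfig = CloudConfig()):
        assert len(modules) > 0, "No modules specified"
        self.modules = modules
        self.cloud = cloud


# Local run: use_cloud defaults to False, so no GCP credentials are required
local = Config(modules={"MosaicingModule": None})

# Cloud run: persist pipeline data to a storage bucket
remote = Config(modules={"MosaicingModule": None},
                cloud=CloudConfig(use_cloud=True, bucket_name="terrafarm-dev"))
```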
80 changes: 55 additions & 25 deletions src/backend/pipeline/mat.py
Original file line number Diff line number Diff line change
@@ -1,20 +1,19 @@
from __future__ import annotations
from enum import IntEnum
from enum import Enum
import cv2
import numpy as np

class Channels(IntEnum):
"""
Defines channel types for the input images
"""
R = 0
G = 1
B = 2
NIR = 3
FIR = 4
T = 5
A = 6
GREYSCALE = 7
class Channels(Enum):
"""Defines channel types for the input images"""
R = "Red"
G = "Green"
B = "Blue"
RE = "Red Edge"
NIR = "Near Infrared"
MIR = "Mid Infrared"
T = "Thermal"
GREYSCALE = "Grayscale"
A = "Alpha"

default_channels = [Channels.R, Channels.G, Channels.B]

@@ -51,10 +50,11 @@ def read(cls, path: str) -> Mat:
return cls(mat, channels = default_channels)

@classmethod
def fread(cls, paths: dict[str, list[Channels]]) -> Mat:
def fread(cls, paths: list[tuple[str, list[Channels]]]) -> Mat:
"""
UNTESTED. Full reads an image with an arbitrary number of
channels from multiple source paths.
Reads an image with an arbitrary number of
channels from multiple source paths that contain images
with different numbers of channels.
Args:
paths: a list of (path, channels) tuples
@@ -65,23 +65,53 @@ def fread(cls, paths: dict[str, list[Channels]]) -> Mat:
"""

# Load the images
mats = [cv2.imread(path) for path in paths.keys()]
mats = [cv2.imread(path[0], cv2.IMREAD_UNCHANGED) if len(path[1]) == 1
else cv2.imread(path[0])
for path in paths]

# Verify that the number of channels match and the
# dimensions match
# Verify input data integrity
assert len(mats) == len(paths), "Reading images failed"
shape = mats[0].shape[:2]
for mat, channels in zip(mats, paths.values()): #type: tuple[cv2.Mat, list[Channels]]
assert mat.ndim == len(channels)
for mat, channels in zip(mats, [path[1] for path in paths]): #type: tuple[cv2.Mat, list[Channels]]
# Check number of channels
if len(channels) == 1:
assert mat.ndim == 2, "Image is not grayscale"
else:
assert len(channels) == mat.shape[2], "Image has incorrect number of channels"

# Check image dimensions
assert shape == mat.shape[:2]

# Flatten channels
channels = sum(paths.values(), [])

# Combine arrays
arr = np.concatenate([np.asarray(mat[:,:,:]) for mat in mats], axis=2)
# Split multichannel mats into single-channel planes
planes = [plane for mat in mats
          for plane in ([mat] if mat.ndim == 2 else cv2.split(mat))]

# Stack the grayscale planes along the channel axis
arr = np.stack(planes, axis=2)

# Aggregate channels
channels = sum([path[1] for path in paths], [])

# Return the combined data
return cls(arr, channels)

@classmethod
def freads(cls, paths: list[str], channels: list[Channels]) -> Mat:
"""
Reads an image with an arbitrary number of channels from
multiple source paths containing grayscale images only.
Args:
paths: a list of paths to read the images from
channels: a list of channels in the images paths
(in the order they appear in the paths)
Returns:
The loaded Mat.
"""

assert len(paths) == len(channels), "Number of paths and channels must match"
return Mat.fread([(path, [channel]) for path, channel in zip(paths, channels)])
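What `freads` produces can be illustrated with plain numpy; the per-band arrays below stand in for grayscale files (real usage would be something like `Mat.freads([red_path, nir_path], [Channels.R, Channels.NIR])`, with hypothetical paths):

```python
import numpy as np

# Hypothetical single-band images, e.g. exported by a multispectral camera
red = np.zeros((4, 4), dtype=np.uint8)
nir = np.full((4, 4), 255, dtype=np.uint8)

# freads-style combination: stack grayscale planes along the channel axis
combined = np.stack([red, nir], axis=2)
```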

def get(self) -> cv2.Mat:
"""