Image Processing
The Open Source Computer Vision (OpenCV) library is a huge community effort to provide a single, consolidated source of tools and functions for performing all sorts of image processing techniques on images and other data.
OpenCV is developed as a stand-alone solution, which doesn't really help with integrating it into an existing ROS environment. Fortunately, ROS provides a selection of packages that allow us to do just that: vision_opencv. This package also provides a great range of tutorials that will assist with setting up some basic functionality between ROS and OpenCV.
Additionally, at the time of writing, it is important to note that there is no official OpenCV package in Ubuntu, and as such most examples online will suggest downloading and compiling OpenCV yourself. THIS IS NOT NECESSARY FOR USE IN ROS!
Note: The ROS repositories include a pre-compiled package for OpenCV that will automatically be installed when the vision_opencv package is installed. Manually installing OpenCV for use with ROS will cause issues!
So, make sure you use the pre-packaged solution, and follow the tutorials in the vision_opencv package to set up your OpenCV environment. To start the actual development of your image processing software, the QUTAS repository offers the package kinetic_sample_packages, which includes a very basic, but decent, C++ template to get an image processing node up and running.
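As a starting point, the following is a minimal sketch of a Python node using cv_bridge (part of vision_opencv) to convert incoming ROS images into OpenCV images. The topic name /camera/image_raw is an assumption here, and should be adjusted to match your camera driver.

```python
#!/usr/bin/env python
# Minimal cv_bridge sketch: subscribe to a ROS image topic and convert
# each message into an OpenCV (numpy) image for processing.
import rospy
from cv_bridge import CvBridge, CvBridgeError
from sensor_msgs.msg import Image

bridge = CvBridge()

def image_callback(msg):
    try:
        # Convert the sensor_msgs/Image into a BGR OpenCV image
        frame = bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")
    except CvBridgeError as e:
        rospy.logerr(e)
        return

    # ... image processing on 'frame' goes here ...
    rospy.loginfo("Got a %dx%d image", frame.shape[1], frame.shape[0])

if __name__ == "__main__":
    rospy.init_node("image_processor")
    rospy.Subscriber("/camera/image_raw", Image, image_callback)
    rospy.spin()
```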
There are multiple ways to approach the task of performing image processing, most of which boil down to a loose distinction between on-board and off-board systems (in this situation, relative to the airborne platform). For most small to mid-sized systems (those with payloads lighter than a person), there is likely to be a fairly tight constraint on the weight that can be allocated to computers or other hardware. For this reason, depending on the complexity of the image processing that must be done, it is quite possible that the processing cannot be performed on-board.
There are some inherent dangers to performing off-board processing, to list a few:
- The need to deal with disconnections
- Added latency to the system
- The need to transfer (potentially) a lot of data over the air

Off-board processing can, however, come with the following benefits:
- Access to much more powerful computers
- (Most likely) No constraints to power draw
- Much easier to perform user-reliant processing (such as the user selecting a location to inspect)
- Easier logging and data collection
Some typical approaches to systems that may work in either on-board or off-board configurations are:
- Everything on-board, including processing and logging, minimal feedback to the ground
- Mission-critical systems run on-board, usually with some form of high-bandwidth telemetry to log information to the ground when possible
- Non-flight-critical systems run off-board, usually with reliable high-bandwidth telemetry for data transfer (i.e. navigation interpretation and camera software on-board, everything else off-board)
When developing for an image processing system, particularly for an on-board solution, it is highly suggested that most of the development is performed in a Hardware-In-The-Loop (HITL) type environment. This would involve using a high-powered computer (relatively speaking, such as a desktop computer) to develop the initial software, while making sure it is compatible with the on-board system.
The typical development of this type of integrated solution may look like the following:
- Set up all hardware and similar base system on both the on-board computer and a development computer
- Pick out a final solution that will (ideally) work on both computers (or at least the on-board computer)
- Configure the on-board computer to capture images using ROS, and make sure they are accessible on the development computer
- Begin writing the image processing software on the development computer using the test images captured by the on-board computer
- Periodically run the image processing software on the on-board computer to ensure compatibility and assess performance
- Once performance criteria have been met, and the system is more-or-less outputting the intended results (e.g. not much more compiling or prototyping needs to be done, just adjusting parameters), move the software to the on-board computer
- Verify that the system is working as intended
If the image processing system itself is complex in design (e.g. it has multiple processing stages), it may be possible to offload individual stages to either the development or the on-board computer, depending on performance.
Regardless of the chosen method for development, keep in mind the following two points:
- Make life as easy as possible during the development stages
- Perform stress tests to ensure the chosen hardware is capable of running the developed software
Below you can find a few small snippets of code to help you out with some common image manipulation tasks.
For most of our work, we would prefer to use 2D arrays to access pixel locations in an image. Sometimes, however, this is not possible. As an example, you might receive a message containing the raw image data as a flattened (reshaped) array: just one long list of the image's data. If you simply want to extract specific data from this image, or overwrite specific pixels in it, you have to use a technique usually referred to as using the stride of an array.
In this context, the stride of an array is simply the width of the 2D array. That is, if we have a flattened image array and would like to treat it as 2D, we can use the width of the intended 2D array as the stride.
As an example, assume we have a 2D binary image that has been flattened such that each row is appended to the last, making one long list of pixels. If we wanted to change a specific pixel (x, y), we could do the following:
```python
# Create an empty (all-False) binary image as one flat list, row after row
width = 640
height = 480
img = [False] * (width * height)

# Set the pixel at (x, y), using the image width as the stride
x = 10
y = 40
img[x + (y * width)] = True
```
This allows us to easily jump to specific spots in the array. The method can also be expanded to account for additional data that may be packed in, such as a 3-channel (RGB) image compressed into one vector.
As an example, if we only wanted the red channel at some point in an image that was packed as [RGB, RGB, RGB, ...]:
```python
# Create an empty interleaved image vector ([RGB, RGB, RGB, ...])
channels = 3
width = 640
height = 480
img = [0] * (channels * width * height)

# Extract the red channel at pixel (x, y)
x = 10
y = 40
c = 0  # channel offset (red: 0, green: 1, blue: 2)
px_red = img[channels * (x + (y * width)) + c]
```
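If the data is available as a numpy array (as it is once converted through cv_bridge), the same lookup can be done without any manual stride arithmetic, since reshape returns a view onto the same memory. A small sketch, using the same dimensions as above:

```python
import numpy as np

# Same flattened [RGB, RGB, RGB, ...] layout as above, as a numpy array
channels, width, height = 3, 640, 480
flat = np.zeros(channels * width * height, dtype=np.uint8)

# Reshaping returns a view onto the same memory; no data is copied
img_2d = flat.reshape(height, width, channels)
px_red = img_2d[40, 10, 0]  # same pixel (x=10, y=40), red channel
```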
Within the field of image processing, it is quite easy to forget just how much data is being manipulated.
An 800x600 RGB image, for example, will take up roughly 1.4MB in memory, whereas a 1920x1080 RGB image will take up roughly 5.9MB. This may not seem like a lot when thinking about the gigabytes of RAM that may be available, but it still takes a long time to allocate room for, and move around, that amount of data.
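These figures follow directly from the pixel count, at 3 bytes per pixel for uncompressed 8-bit RGB (binary megabytes are used here):

```python
# Memory footprint of uncompressed 8-bit RGB images (3 bytes per pixel)
print(800 * 600 * 3 / float(2 ** 20))     # ~1.37MB for 800x600
print(1920 * 1080 * 3 / float(2 ** 20))   # ~5.93MB for 1920x1080
```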
For this reason, any sensible guide in the field of image processing (assuming it is for a real programming language) will make use of two techniques to avoid re-allocating or moving so much data around in memory: pre-allocation and pointers.
Pre-allocation is simply declaring a variable once (or as few times as possible), and then re-using that variable.
Pointers, on the other hand, make use of the fact that an image has already been allocated space in memory, so instead of making another copy, we simply tell the program to use that space in memory, rather than allocating more.
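A concrete sketch of both ideas in Python (the input filename is hypothetical): numpy arrays are handled by reference, and most OpenCV functions accept an optional dst argument that re-uses an existing output buffer instead of allocating a new one on every call.

```python
import numpy as np
import cv2

frame = cv2.imread("input.png")  # hypothetical test image (BGR)
h, w = frame.shape[:2]

# Pre-allocation: create the output buffer once, outside the loop
gray = np.empty((h, w), dtype=np.uint8)

for _ in range(100):
    # 'dst' re-uses the pre-allocated buffer rather than allocating again
    cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY, dst=gray)

# Pointer-like behaviour: assignment copies the reference, not the pixels
view = gray  # 'view' and 'gray' share the same memory
```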
More information on memory management can be found in the OpenCV Docs.
In image processing, the general flow of data through the program is to take a lot of data as an input and output a small amount of data as a result; a good analogue would be a funnel. We want to take megabytes of data and somehow compress it all down to some arbitrary, yet relevant, information about the image.
For this process, we ideally want to perform as little work as possible on the current set of data we are working with. For example, consider trying to find the location of a red circle, with a size between 200 and 400 pixels, in a 1920x1080x3 (RGB) image. One of the more optimal processes we could take is (a rough code sketch of this pipeline follows the list):
- Filter the input image to keep only the red spectrum (in our case, hue >225 or <30, as red wraps around the ends of the hue range)
- We are now left with an image of size 1920x1080x1 (a binary image)
- Adjust the image to reduce the size and complexity
- Do a rough-crop to remove excess areas where there are large patches with no red colour
- Do a smart-crop to remove areas where you don't expect the circle to be based on previous results
- Scale the image if it has a very high resolution (and the circle is relatively large)
- Use fill and erosion functions to simplify the image (but take care as this may not be that efficient)
- Perform the Hough circle detection (making sure that parameters such as min_dist and min/max_size are reasonably set)
- We are now left with a list of potential circles in our image
- Display the results (or apply any other relevant processing to this list)
- If we were looking for a red circle with a blue square in it, we could now inspect each of the regions found for those attributes
- If we were attempting to estimate circle location relative to the camera, this is the time to do it
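A rough sketch of this pipeline in Python is shown below. OpenCV stores 8-bit hue in the range 0-179, so the bounds here only approximate the thresholds quoted above (which assume a 0-255 hue scale); the input filename, radii, and Hough parameters are placeholder values that would need tuning.

```python
import numpy as np
import cv2

frame = cv2.imread("frame.png")  # hypothetical 1920x1080 BGR input
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

# Red wraps around hue 0, so threshold both ends and combine the masks
mask = cv2.bitwise_or(cv2.inRange(hsv, (0, 100, 100), (15, 255, 255)),
                      cv2.inRange(hsv, (160, 100, 100), (179, 255, 255)))

# Reduce size and complexity before the (expensive) circle detection
mask = cv2.resize(mask, None, fx=0.5, fy=0.5, interpolation=cv2.INTER_NEAREST)
kernel = np.ones((5, 5), np.uint8)
mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)  # fill small holes
mask = cv2.medianBlur(mask, 5)                          # smooth edges

# Hough circle detection; minDist and min/maxRadius must be set sensibly
circles = cv2.HoughCircles(mask, cv2.HOUGH_GRADIENT, dp=2, minDist=100,
                           param1=100, param2=30, minRadius=50, maxRadius=100)

if circles is not None:
    for x, y, r in circles[0]:
        # Results are in the half-scale image; multiply by 2 to undo the resize
        print("Circle at (%.0f, %.0f), radius %.0f px" % (2 * x, 2 * y, 2 * r))
```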
In the previous example, we performed two manual steps in filtering the image to prepare it for use with the Hough circle detection. It is also important to remember that internally, the detection method also uses a lot of filtering methods to improve its performance based on the parameters provided.