\begin{comment}
./texfix.py --outline --fpaths chapter1-intro.tex
fixtex --fpaths chapter1-intro.tex --outline --asmarkdown --numlines=999 --shortcite -w && ./checklang.py outline_chapter1-intro.md
./texfix.py --grep "\\\\[A-Za-z]*[^{a-zA-Z]"
./texfix.py --reformat --fpaths figdef1.tex
\end{comment}
\chapter{INTRODUCTION}\label{chap:intro}
\section{IMAGE-BASED IDENTIFICATION APPLIED TO POPULATION ECOLOGY}
Population ecology relies on estimating the number of individual animals that inhabit an
area~\cite{krebs_ecological_1999}.
Estimating a population size is done in two phases:
data collection and analysis.
Data are collected as sets of \glossterm{sighting} and \glossterm{resighting} observations.
A sighting is the first observation of an individual, and a resighting is a subsequent observation of a
previously sighted individual.
The observed data are then analyzed using software such as ``Program MARK''~\cite{white_program_1999,
schwarz_jolly_seber_2006} or Wildbook, which apply statistical models such as the Lincoln-Petersen
index~\cite{seber_estimation_1982}, the Jolly-Seber model~\cite{jolly_explicit_1965, seber_note_1965}, or other
related models~\cite{cormack_estimates_1964, chao_estimating_1987, kenneth._h._pollock_statistical_1990}.
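For example, the Lincoln-Petersen index estimates the population size $\hat{N}$ from the number of
individuals observed in a first sample, $n_1$, the number observed in a second sample, $n_2$, and the number
of resighted individuals appearing in both, $m$:
\begin{equation}
\hat{N} = \frac{n_1 \, n_2}{m}.
\end{equation}
Intuitively, the fraction of resightings in the second sample, $m / n_2$, estimates the fraction of the total
population covered by the first sample, $n_1 / N$.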
For an ecologist, recording that an individual has been observed is simple, but determining whether that
observation is a sighting or a resighting can be challenging.
The latter requires the ecologist to identify the individual by comparing it against all other observations
in the data set.
Current methods to estimate a population size are limited by the data collection
phase~\cite{sundaresan_network_2007, rubenstein_ecology_2010}. The statistical population models require an
observation sample size that grows with the size of the population being studied~\cite{seber_estimation_1982}.
As the number of observations increases, so does the difficulty of determining identity. Thus, the scope of a
population study is limited by the number of raw observations that can be made and by the rate at which
individual identity can be determined within a set of observations. Overcoming these limitations is of
particular importance to wildlife preservation because population statistics are necessary to guide
conservation decisions~\cite{rubenstein_behavioral_1998}.
Consider images as a source of sight-resight observations. They offer numerous advantages. Many observations
can be made rapidly and simultaneously due to the simplicity and availability of cameras. Recording an
observation is as cheap and simple as taking a picture. Camera traps can be employed for autonomous data
collection. In a wildlife conservancy or national park, observations can be crowd-sourced by gathering images
from safari tourists and citizen scientists. Images can be accumulated and stored in a large dynamic dataset
of observations that could grow by thousands of images each day. However, the challenge of identifying the
individuals in the images remains. Manual methods are infeasible due to the rapid rate at which images can be
collected. Therefore, we must turn towards computer-vision-based methods.
This \thesis{} develops the foundation of the image analysis component of the ``Image Based Ecological
Information System'' (IBEIS).
The purpose of this system is to gain ecological insight from images using computer vision.
We focus on estimating the size of a population of animals as just one example of ecological insight that
might be gained from images.
Thus, we come to the core problem addressed in this \thesis{}:
image-based identification of individual animals.
\section{CHALLENGES OF ANIMAL IDENTIFICATION}\label{sec:challenges}
In animal identification we are given a database of images.
This database may initially be empty.
Each image is cropped to a bounding box around an animal of interest and labeled with that animal's identity.
For a new query image, the goal is to determine if any other images of the individual are in the database.
If the query is matched, it is added to the database as a resighting of that individual.
If the query is not matched, then it is added as a new individual.
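The following is a minimal sketch of this decision protocol, assuming a hypothetical \texttt{match} function
that returns the name of the matched individual or \texttt{None}:
\begin{verbatim}
# Sketch only: `match` is a hypothetical function that returns the name
# of the matching individual in `database`, or None if nothing matches.
def identify(query, database):
    # database: dict mapping each name to its list of annotations
    name = match(query, database)
    if name is not None:
        database[name].append(query)      # resighting
    else:
        name = 'name_%d' % len(database)  # first sighting: new individual
        database[name] = [query]
    return name
\end{verbatim}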
In this work we focus on identifying individuals of species with distinguishing textures. Examples include
zebras, giraffes, humpback whales, lionfish, nautiluses, hyenas, whale sharks, wildebeest, wild dogs, jaguars,
cheetahs, leopards, frogs, toads, snails, and seals. The primary species that we will consider in this
\thesis{} are plains and Grévy's zebras, but we will maintain a secondary focus on Masai giraffes and humpback
whales. The difficulty of animal identification depends on the distinctiveness of the visual patterns that
distinguish an individual from others of its species. In addition, the images we identify are collected ``in
the wild'' and therefore contain occlusion, distracting features, and variations in viewpoint and image quality.
This section will present several examples to illustrate the challenges faced in animal identification. The
discussion will begin with the challenges posed by the three primary species. Then problems common to all
species will be described. These will be illustrated using plains zebras because they are the most challenging
species considered in this \thesis{}.
\subsection{Distinguishing textures of each species}
The plains zebra --- shown in~\cref{fig:PlainsFigure} --- is challenging to visually identify because
individuals have relatively few distinguishing texture features. For most plains zebras, the majority of distinctive
information lies in a small area on the front shoulder. \Cref{fig:HardCaseFigure} illustrates that the patterns
that distinguish two individuals can be subtle, even when the features are clearly visible. The matching
difficulty greatly increases when features are partially occluded, the viewpoint changes, or the image quality
is poor.
In contrast, Masai giraffes and Grévy's zebras, shown in~\cref{fig:GirMasaiFigure}
and~\cref{fig:GrevysFigure} respectively, have an abundance of distinctive features. Distinctive textures
that are unique to each individual are spread across the entire body of a Masai giraffe. For a Grévy's
zebra there is a high density of distinguishing information above both front and back legs, as well as a
moderate density of distinctive textures along the side of the body. The high density of distinctive
textures in Masai giraffes and Grévy's zebras increases the likelihood that the same distinctive features
can be seen from different viewpoints. Even so, the problem is still difficult due to ``in the wild''
conditions such as animal pose, occlusion, and image quality.
There are some species, like humpback whales, where some individuals have distinguishing textures
while others lack them entirely.
This means that only a subset of humpback whales can be identified with the texture-based
techniques that we consider in this \thesis{}.
However, other cues --- like the shape of the notches along the trailing edge of the fluke --- can be
used to distinguish between different individuals.
%The work of Hendrick Weideman~\cite{hendrick} addresses identifying humpback whales using shape features.
The work of Weideman and Jablons~\cite{jablons_identifying_2016} addresses identifying
humpback whales using trailing edge shape features.
The example in~\cref{fig:HumpbackFig} illustrates individual humpback whales with and without distinctive
textures.
\PlainsFigure{}
\HardCaseFigure{}
\GirMasaiFigure{}
\GrevysFigure{}
\HumpbackFig{}
\FloatBarrier{}
\subsection{Viewpoint and pose}
One of the most difficult challenges faced in the animal identification problem is viewpoint. Animals are seen
in a variety of poses and viewpoints, which can cause distinctive features to appear distorted. The patterns on
the left and right sides of animals are almost always asymmetric. Therefore, matches can only be established
between overlapping viewpoints, and only if those views contain distinctive information. Some viewpoints, such as the backs of
plains zebras, lack distinguishing information as shown in~\cref{fig:BacksFigure}. The effect of pose and
viewpoint variation can be seen in~\cref{fig:ThreeSixtyFigure} and~\cref{fig:PoseFigure}.
\BacksFigure{}
\ThreeSixtyFigure{}
\PoseFigure{}
\FloatBarrier{}
\subsection{Occluders and distractors}
Because images of animals are often taken ``in the wild'', other objects in the image can act as
\glossterm{occluders} or \glossterm{distractors}. Objects such as grass, bushes, trees, or other animals can act
as occluders by partially obscuring the features that distinguish one individual from another. The appearance
of other animals nearby can be distracting because features from these animals may match different animals in
the database. Distractors may also arise from non-animal features when multiple pictures are taken against the
same background as animals move through the same field of view. Several examples of occluders and distractors
are illustrated in~\cref{fig:OccludeFigure}.
\OccludeFigure{}
\FloatBarrier{}
\subsection{Image quality}
Image quality is influenced by lighting, shadows, the camera used, image resolution, and the size of the
animal in the image. Outdoor images will naturally have large variations in illumination. Different cameras
can produce visual differences between images of an object. Images taken out of focus, from far away, or
with a non-steady camera can cause animals to appear blurred. The effects of outdoor shadow and
illumination are illustrated in~\cref{fig:IlluminationFigure}. \Cref{fig:QualityFigure} illustrates five
categories of image quality that will be described later in~\cref{sub:viewqual}.
\IlluminationFigure{}
\QualityFigure{}
\FloatBarrier{}
\subsection{Aging and injuries}
The appearance of an individual changes over time due to aging and other factors including injuries. An example
of the difference between a juvenile and adult zebra is shown in~\cref{fig:AgeFigure}. An example of how
injuries can both remove distinctive features and add new ones is shown in~\cref{fig:GashFigure}.
\AgeFigure{}
\GashFigure{}
\FloatBarrier{}
\section{THE GREAT ZEBRA COUNT}\label{sec:introgzc}
To further illustrate the problems addressed in this \thesis{}, we consider the ``Great Zebra Count'' (\GZC{}),
held at Nairobi National Park on March 1\st{} and 2\nd{}, $2015$~\cite{rubenstein_great_2015}. This event was
designed with two purposes in mind: (1) to involve citizens in the scientific data collection effort, thereby
increasing their interest in conservation, and (2) to determine the number of plains zebras and Masai giraffes
in the park.
\subsection{Data collection}
Volunteer participants --- each with his or her own camera --- arrived by car at the park.
Some cars had more than one photographer.
Each car was assigned a route to drive through the park.
We attached a GPS dongle to each car to record time and location throughout the drive.
Correlating this with the time stamp on each image (after adding a correction offset for each camera)
allowed us to determine the geolocation of each image.
Each photographer was given instructions guiding them toward taking quality images of the left sides of
the animals they saw.
When the cars returned --- some after just an hour or two, others after the whole day --- the images were
copied from the cameras, a small sample of each photographer's images was immediately processed to
illustrate what we would do with the data, and the entire set of images was stored for further
processing.
The result of this crowd-sourced collection event was a $\SI{48}{\giga\byte}$ dataset consisting of
$9406$ images.
\subsection{Data processing}\label{subsec:introdataprocess}
After the event, the entire collection of images was processed using a preliminary version of the system in
order to generate the final count. The preliminary system followed this workflow: %
\begin{enumin}
%\item ingest images %
\item \occurrence{} grouping, %
\item animal detection, %
\item viewpoint and quality labeling, %
\item \intraoccurrence{} matching, %
\item \vsexemplar{} identification, %
\item consistency checks, and %
\item population estimation. %
\end{enumin}
%\Cref{chap:application} discusses this workflow
%in greater detail.
Here, we provide a brief overview of each step involved in the processing of the \GZC{} image data, and then we
will describe the challenges that arose.
\subsubsection{Occurrence grouping}
The images were first divided into \glossterm{\occurrences{}} --- a standard term defined by the Darwin
Core~\cite{wieczorek_darwin_2012} to denote a collection of evidence (\eg{} images) that an organism exists
within a defined location and time frame. In the scope of this application, an \occurrence{} is a cluster of
images taken within a small window of time and space. Images are grouped into \occurrences{} using the GPS
and time data. Details are provided in~\cref{app:occurgroup}.
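As an illustration only --- not the algorithm of~\cref{app:occurgroup} --- images could be grouped greedily
by thresholding the gaps in time and space between consecutive images; the thresholds below are hypothetical:
\begin{verbatim}
import math

def group_occurrences(images, max_gap_sec=600, max_dist_m=1000):
    # Each image is a dict with 'time' (seconds), 'lat', and 'lon'.
    def dist_m(a, b):
        # equirectangular approximation of GPS distance in meters
        dlat = math.radians(b['lat'] - a['lat'])
        dlon = math.radians(b['lon'] - a['lon'])
        mlat = math.radians((a['lat'] + b['lat']) / 2)
        return 6371000 * math.hypot(dlat, dlon * math.cos(mlat))
    occurrences = []
    for img in sorted(images, key=lambda x: x['time']):
        prev = occurrences[-1][-1] if occurrences else None
        if (prev is not None and
                img['time'] - prev['time'] <= max_gap_sec and
                dist_m(prev, img) <= max_dist_m):
            occurrences[-1].append(img)   # extend current occurrence
        else:
            occurrences.append([img])     # start a new occurrence
    return occurrences
\end{verbatim}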
These computed \occurrences{} are valuable measurements for multiple components of the IBEIS software.
At its core, an \occurrence{} describes \wquest{when} a group of animals was seen and \wquest{where} that
group was seen.
From this starting point, other algorithms can address questions like:
\wquest{how many} animals there were, \wquest{who} an animal is, \wquest{who else} an animal is with,
and \wquest{where else} these animals have been seen.
Furthermore, there are computational and algorithmic benefits to first grouping images into an
\occurrence{}.
One benefit is that an \occurrence{} can be used as a semantic processing unit to distribute
manageable chunks of work to users of the system.
Another is that \occurrences{} can be used to improve the results of identification.
Typically, there will be only a few individuals within an \occurrence{}, and it is not uncommon for
each individual to be photographed multiple times and from multiple viewpoints.
This redundancy in images will be exploited in \Cref{chap:graphid}.
\subsubsection{Animal detection}
Before matching begins, each image is cropped to focus on a particular animal and remove background
distractors.
A detection algorithm localizes animals within the images.
Each verified detection generates an \glossterm{\annot{}} --- a bounding box around a single animal
in an image.
An example illustrating detection of plains zebras is shown in~\cref{fig:DetectFigure}.
In the \GZC{} each detection was manually verified before becoming an \annot{}, but recent work
introduces an automatic verification mechanism and reduces the need for complete manual review.
The details of the detection algorithm are beyond the scope of this \thesis{}, and are described in
the work of Parham~\cite{parham_photographic_2015,parham_detecting_2016}.
\DetectFigure{}
\subsubsection{Viewpoint and quality labeling}\label{sub:viewqual}
When determining the number of animals in a population it is important to account for factors that can lead
to over-counting. If two \annots{} of the same individual are not matched, then that individual will be
counted twice. This could happen due to factors such as viewpoint and quality. For example, one \annot{}
showing only the left side of an animal and another \annot{} showing only the right side of the same animal
cannot be matched because they are \glossterm{incomparable}. Two \annots{} are comparable when they
share regions with distinguishing patterns that can be put in correspondence. Viewpoint is the primary
reason that two \annots{} are not comparable. However, other factors like image quality and heavy occlusion
can corrupt distinguishing patterns rendering the \annot{} unidentifiable --- not comparable with any other
\annot{}. We must define what it means for two \annots{} to be comparable before we can estimate a
population size.
Determining if an individual can be identified is analogous to the
notion of a marked individual~\cite{seber_estimation_1982}. For an
\annot{} to be identifiable, the patterns that can distinguish it
from the rest of the population must be clear and visible; otherwise
the \annot{} may be unable to find, or be compared to, potential
matches. This means an \annot{} is only identifiable if
\begin{enumin}
\item the image quality is high enough, and %
\item it has a viewpoint that is comparable to all potential
matches. %
\end{enumin}
To address this challenge we label each \annot{} with one of $5$ discrete quality labels and one of $8$
discrete viewpoint labels. The quality labels we define are: \qualJunk{}, \qualPoor{}, \qualOk{}, \qualGood{},
and \qualExcellent{}. The \qualJunk{} label is given to \annots{} that almost certainly cannot be identified,
and the \qualPoor{} label is given to \annots{} that will likely be unidentifiable for a computer vision
algorithm. The \qualGood{} and \qualExcellent{} labels are given to clear, well-illuminated \annots{} with
little to no occlusion, with \qualExcellent{} reserved for the best of the best. All other \annots{} are
labeled \qualOk{}. The viewpoint labels we define are: \vpFront{}, \vpFrontLeft{}, \vpLeft{}, \vpBackLeft{},
\vpBack{}, \vpBackRight{}, \vpRight{}, and \vpFrontRight{}. Note that additional viewpoint labels like
\vpUp{} and \vpDown{} may be necessary for animals such as lionfish or turtles. However, the $8$ labels we
use are sufficient for animals like zebras and giraffes because they are most commonly seen in upright
positions.
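A sketch of how these labels could drive a comparability test, under the simplifying assumption that
adjacent viewpoints on the $8$-way viewpoint wheel share visible regions:
\begin{verbatim}
QUALS = ['junk', 'poor', 'ok', 'good', 'excellent']
VIEWS = ['front', 'frontleft', 'left', 'backleft',
         'back', 'backright', 'right', 'frontright']

def comparable(annot1, annot2, min_qual='ok'):
    # both annotations must meet the minimum quality ...
    quals = [QUALS.index(annot1['qual']), QUALS.index(annot2['qual'])]
    if min(quals) < QUALS.index(min_qual):
        return False
    # ... and their viewpoints must be identical or adjacent on the wheel
    delta = abs(VIEWS.index(annot1['view']) - VIEWS.index(annot2['view']))
    return min(delta, len(VIEWS) - delta) <= 1
\end{verbatim}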
In an effort to ensure that all \annots{} used in the \GZC{} were comparable, we excluded any \annot{}
labeled with junk or poor quality. We also excluded \annots{} not labeled with a left or front-left
viewpoint, to account for limitations in the initial ranking algorithm. All labelings of
viewpoint and quality were generated manually. Since then, we have trained viewpoint and quality
classifiers using this manual data. Automatic detection of quality and viewpoint is discussed in the
work of Parham~\cite{parham_photographic_2015}.
\subsubsection{Matching within each \occurrence{}} %
Animals often have multiple redundant views within an \occurrence{}, each of which can be the same as,
better than, or complementary to the other views. The images in~\cref{fig:OccurrenceComplementFigure}
illustrate redundant and complementary views of an individual in an \occurrence{}. Merging all of an
individual's views is a challenge, but also potentially an advantage, as we can exploit redundancy to better
handle missing features, subtle viewpoint changes, and occlusions.
We exploit this redundancy to gain the benefit of complementary views by matching all \annots{} within an
\occurrence{} in a process called \glossterm{\intraoccurrence{} matching}. In the \GZC{}, each \annot{} was
queried against all other \annots{} in its \occurrence{}, returning a ranked list of candidate matches. The
person running the software made the final decisions about which \annots{} match. Details about the ranking
algorithm are given in~\cref{chap:ranking}.
The result of \intraoccurrence{} matching is a set of \glossterm{\encounters{}}. \Aan{\encounter{}} is a
group of \annots{} that were matched within an \occurrence{}. Each \encounter{} is either (1) the first
sighting of an individual or (2) a resighting. The task now becomes to determine which of these is the case
by identifying each \encounter{} against a \masterdatabase{}.
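Conceptually, the \encounters{} are the connected components of a graph whose vertices are the
\occurrence{}'s \annots{} and whose edges are the confirmed matches; a minimal union-find sketch:
\begin{verbatim}
def form_encounters(annots, confirmed_pairs):
    # Union-find: group an occurrence's annotations into encounters
    # using the (annot, annot) pairs confirmed as matches on review.
    parent = {a: a for a in annots}
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # path halving
            a = parent[a]
        return a
    for a, b in confirmed_pairs:
        parent[find(a)] = find(b)
    encounters = {}
    for a in annots:
        encounters.setdefault(find(a), []).append(a)
    return list(encounters.values())
\end{verbatim}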
\OccurrenceComplementFigure{}
\subsubsection{Matching against the \masterdatabase{}} %
To determine if \aan{\encounter{}} is a new sighting or a resighting of an individual, it is matched
against the \masterdatabase{} in a process called \glossterm{\vsexemplar{} matching}. Before matching
begins, the \masterdatabase{} is prepared for search. For each \name{} in the \masterdatabase{}, a subset
of \glossterm{\exemplar{}} \annots{} is chosen to represent the appearance of that individual. The
\exemplars{} are indexed using a search data structure.
After the \masterdatabase{} has been prepared, the ranking algorithm is able to issue a subset of the
\encounter{}'s \annots{} as a query.
The result is a ranked list of \exemplars{} that are visually similar to the \encounter{}.
The top \exemplars{} in the ranked list are used as candidate matches.
Then, the candidate matches are reviewed, and the \encounter{} is either merged into an existing
\mastername{} or added to the \masterdatabase{} as a new \mastername{}.
\subsubsection{Consistency checks}
When merging \encounters{} into the \masterdatabase{} it is possible that mistakes were made. Two
error cases commonly occur.
%%%
\begin{enumln}
\item A \glossterm{split case} occurs when a set of \annots{} from two or more different animals is
incorrectly labeled with the same \name{}. The main cause of this error is when distracting features are
matched, causing the \annots{} to appear visually similar.
%%/
\item A \glossterm{merge case} occurs when two sets of \annots{} from the same animal are incorrectly
labeled with different \names{}. This is caused by an algorithm or human error where a query \encounter{}
was not correctly matched to the database \exemplars{}.
\end{enumln}
%%%
These errors usually occur because the query and database \annots{} have a low degree of \emph{comparability} (\eg{}
differences in viewpoint or low quality). Of course, if no visual overlap exists between the two sets ---
such as one set exclusively from the left side and another exclusively from the right --- nothing can be
done. This is why an animal must be seen from a predetermined view in order to be counted. In the \GZC{},
this was the left side.
In the \GZC{}, suspect individuals were flagged for split checks using various heuristics, such as the
number of \annots{} in the \name{} or the apparent speed of the animal's movement as computed from GPS and
time data. To check a flagged individual, we used the ranking algorithm to search for pairs of \annots{}
with low matching scores that belong to the flagged \name{}. Low similarity between two \annots{} within a
\name{} suggested that an error had occurred. These low-scoring results were then manually reviewed.
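As a sketch of the speed heuristic, assuming each \annot{} carries time and GPS metadata and a
\texttt{dist\_m} helper as in the occurrence-grouping sketch above; the threshold is hypothetical:
\begin{verbatim}
def flag_for_split_check(annots_of_name, max_speed_mps=18.0):
    # Flag a name when consecutive sightings imply implausibly fast
    # movement (about 18 m/s is roughly a zebra's top sprint speed).
    annots = sorted(annots_of_name, key=lambda a: a['time'])
    for prev, curr in zip(annots, annots[1:]):
        dt = curr['time'] - prev['time']            # seconds
        if dt > 0 and dist_m(prev, curr) / dt > max_speed_mps:
            return True   # likely a split case: two animals, one name
    return False
\end{verbatim}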
When breaking apart split cases, care was taken to account for the fact that right and left images
should not match. Likewise, care was taken to ensure that an intermediate \annot{} linking two disjoint
\annots{} has enough information to establish the link.
Merge checks issued all \exemplars{} as queries against all other \exemplars{}.
High similarity between \annots{} belonging to two different \names{} suggested that a match was missed.
These high-scoring results were manually reviewed.
More sophisticated error detection and recovery will be discussed in \Cref{sec:incon}.
\subsubsection{Population estimation}
The final step of the \GZC{} workflow was to estimate the number of animals in the park.
Using the identification algorithm, we determined which \annots{} were sightings and which were
resightings.
Because we were using a preliminary version of the system, we were conservative in defining when an
animal was sighted, using only the left and front-left \annots{} with quality labels of ok, good, or
excellent.
Each individual that met these criteria was counted as a sighting.
If a sighted individual had an \annot{} from both days, then we counted that individual as resighted.
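Treating the two days as the two samples of the Lincoln-Petersen index, the final computation amounts to the
following sketch, where each \name{} is mapped to the set of days on which a qualifying \annot{} of it was
taken:
\begin{verbatim}
def petersen_estimate(days_by_name):
    # days_by_name: dict mapping each name to a subset of {1, 2}
    n1 = sum(1 for days in days_by_name.values() if 1 in days)
    n2 = sum(1 for days in days_by_name.values() if 2 in days)
    m = sum(1 for days in days_by_name.values() if days >= {1, 2})
    return n1 * n2 / m    # estimated population size
\end{verbatim}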
\subsection{Processing challenges}
Our experience with the Great Zebra Count has highlighted a number of challenges that must be addressed if this
system is to be applied in future events. These challenges include the number of manual reviews required, the
detection of and recovery from manual errors, and the overall lack of a systematic identification framework.
Perhaps the greatest challenge faced during the \GZC{} was the considerable amount of time required
to manually verify identification results.
It can take several seconds to manually verify whether a pair of \annots{} is a correct match, even when
the results are presented in a ranked list.
This task is illustrated in~\cref{fig:RankFigure}.
Requiring manual verification of each result is untenable for a system that accepts thousands of new
images a day.
The lack of a systematic approach to identification meant that whenever two \annots{} were matched, the
name labels of all \annots{} with those names were changed.
This made it difficult to tease apart errors when they occurred.
Furthermore, manual errors (likely caused by fatigue from the large number of manual reviews) resulted in
numerous identification errors that could not be detected and resolved until the end of the process.
Reviews of results were also done in order of matching score regardless of previous decisions, causing
the manual reviewer to inefficiently review redundant results for the same individual.
Additionally, no stopping criterion for reviews was defined, resulting in an ad hoc approach to
determining when all matches had been found.
Motivated by these observations, we seek to develop a semi-automatic approach to animal identification.
This approach should be governed by a system that reduces the number of manual reviews, detects and
recovers from errors, and determines when to stop searching for new matches.
%Furthermore, as new \exemplars{} are added to the system the search
% data structure must be updated before additional queries can be made.
%Rebuilding this data structure is another source of delays.
%We consider addressing this problem as two separate challenges.
%The first challenge is algorithmic, and the second challenge is system
% based.
%We will use these challenges to motivate the development a system that
%is able to dynamically detect and identify individual animals in large
%volumes of images.
%The algorithmic challenge is to develop a confidence-based decision
% mechanism.
%We will use these challenges to motivate a verification mechanism that
%automatically accepts or dismisses candidate matches.
%Only a subset of the most difficult identification results should be
% manually reviewed, the rest should be handled automatically.
%This motivates developing a
%On the system side, the challenge is to dynamically update the search
% data structure.
%This involves intelligent bookkeeping because the image analysis
% system is designed as a stateless API{}.
%Statelessness is essential if multiple users are to access the same
% instance of image analysis and makes the system compatible with web
% technologies.
%A stateless API is allowed to cache results, but it cannot maintain a
% single canonical object such as an indexer.
%Instead the API{} works by accepting and responding to requests.
%This has the effect of enforcing that objects are immutable, but also
% eliminates bugs due to race conditions, gives the program a large
% degree of thread safety, and encourages extensible and testable coding
% practices.
%Updating search structures dynamically is a challenging problem in a
% stateless framework, but it can be addressed with careful system
% design.
\RankFigure{}
\section{APPROACH}
The problem addressed in this \thesis{} is to identify individual animals ``in the wild'' and to count the
individuals in a population.
We are given a set of images containing \annots{} of the same species.
The images are collected in an uncontrolled environment and likely contain imaging challenges such as
occlusion, distracting features, viewpoint variations, pose variations, and quality variations.
Furthermore, the images may be collected either over many years or over just a few days as in the \GZC{}.
Each \annot{} is labeled with time, GPS, quality, and viewpoint.
We may also be given an initial partial \name{} labeling of the annotations --- \eg{} in the case where we
identify a new set of annotations against a previously identified set --- but this need not be the case.
We want to label each \annot{} with a \glossterm{\name{}} that uniquely identifies the individual.
In other words, our task is to label all \annots{} from the same individual with the same \name{} and give
\annots{} from different individuals different \names{}.
After this is complete, the resulting database will contain the information needed to estimate the size of
the population using techniques from sight-resight statistics.
The first step of the identification process is a ranking algorithm. The inputs to the algorithm are a single query
\annot{} and a set of database \annots{}. Sparse patch-based features are localized in all \annots{}, and a
descriptor vector is extracted for each feature. The descriptors of the database \annots{} are indexed for fast
nearest neighbor search. We then find a set of matches in the database for each descriptor in the query \annot{}.
The matches are scored based on visual similarity, distinctiveness within the database, and likelihood of belonging
to the foreground. Matches are combined across multiple \exemplar{} \annots{} to produce a matching score for each
\name{} in the database, resulting in a ranked list of results for each query.
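The following is a highly simplified sketch of this ranking step, scoring \names{} with a Lowe-style ratio
weight over generic descriptors; the actual algorithm, including the distinctiveness and foreground weights,
is detailed in~\cref{chap:ranking}:
\begin{verbatim}
from sklearn.neighbors import NearestNeighbors

def rank_names(query_descs, db_descs, db_names):
    # query_descs: (m, d) descriptors from the query annotation
    # db_descs:    (n, d) descriptors pooled over database annotations
    # db_names:    length-n list of the name owning each descriptor
    nn = NearestNeighbors(n_neighbors=2).fit(db_descs)
    dists, idxs = nn.kneighbors(query_descs)
    scores = {}
    for (d1, d2), (i1, _) in zip(dists, idxs):
        weight = max(0.0, 1.0 - d1 / (d2 + 1e-9))  # ratio-based weight
        name = db_names[i1]
        scores[name] = scores.get(name, 0.0) + weight
    return sorted(scores.items(), key=lambda kv: -kv[1])
\end{verbatim}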
We then extend the ranking algorithm by developing a classifier able to automatically review its results.
First, we construct a pairwise feature that captures relationships between two annotations using local
feature correspondence and global properties such as time and GPS.
Then, we learn a classifier to predict if a pair of annotations --- \ie{} a result in the ranked list --- is
correct or incorrect.
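As a sketch, with an off-the-shelf learner standing in for the classifier developed in~\cref{chap:pairclf}:
\begin{verbatim}
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def train_verifier(pair_feats, labels):
    # pair_feats: (n, d) pairwise feature vectors for reviewed pairs,
    #             e.g. correspondence scores plus time/GPS differences
    # labels[i]:  1 if pair i shows the same animal, else 0
    return RandomForestClassifier(n_estimators=100).fit(pair_feats, labels)

def p_same(clf, pair_feat):
    # probability that a candidate pair is a correct match
    return clf.predict_proba(np.asarray(pair_feat).reshape(1, -1))[0, 1]
\end{verbatim}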
In the final part of our approach, we place the problem of animal identification in a graph framework that
systematically guides the identification process. Each annotation is placed in a graph as a vertex, and
labeled edges between annotations represent how they are related. Using the graph framework we are able to
detect and recover from errors by taking advantage of the multiple images of each individual.
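A minimal sketch of this bookkeeping using the \texttt{networkx} library, with \annots{} as vertices and
reviewed pairs as edges labeled \texttt{'match'} or \texttt{'nonmatch'}:
\begin{verbatim}
import networkx as nx

def name_labeling(annots, decisions):
    # decisions: reviewed pairs as (a, b, 'match' | 'nonmatch') triples.
    # Names are the connected components under 'match' edges; a
    # 'nonmatch' edge inside a component flags an inconsistency.
    match_g = nx.Graph((a, b) for a, b, lbl in decisions if lbl == 'match')
    match_g.add_nodes_from(annots)          # keep unmatched singletons
    names = list(nx.connected_components(match_g))
    comp_of = {a: i for i, comp in enumerate(names) for a in comp}
    inconsistent = [(a, b) for a, b, lbl in decisions
                    if lbl == 'nonmatch' and comp_of[a] == comp_of[b]]
    return names, inconsistent
\end{verbatim}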
We evaluate the ranking, verification, and graph identification algorithms by performing experiments on two
main databases of plains zebras and Grévy's zebras.
Some additional experiments are also performed on databases of Masai giraffes and humpback whales.
First, the ranking experiments test the algorithm's ability to find potential matches of an individual animal
over long periods of time, across different viewpoints, and with databases of different sizes and different
numbers of \exemplars{}.
Then, the verification experiments test the extent to which the correct results from the ranking
algorithm can be separated from the incorrect results using our learned classifier.
Finally, the graph identification experiments demonstrate the algorithm's ability to reduce the number
of required manual reviews and to recover from errors.
We determine the configuration of each algorithm that works best for identifying each species.
%To do this we
%develop both a suite of algorithms and a software system. The algorithms
%will allow us to infer properties about images and \annots{}. The system
%will allow us to maintain the images, \annots{}, algorithms, and inferred
%properties in a controlled and reproducible manner.
%We build a workflow on top of the matching algorithm.
%This workflow accepts new \annots{} in groups defined by \occurrences{}.
%The matching algorithm groups \annots{} within the \occurrence{}, and
% then leverages redundant and multiple viewpoints to perform identification
% against the database.
%As the database grows we handle multiple views of each \exemplar{} by
% maintaining a set of \exemplars{} for each \name{}.
%We develop methods for recovering from any errors in identification when
% multiple individuals are grouped into the same \exemplar{} as well as when
% multiple \exemplars{} actually represent the same individual.
%To address the challenges introduced by this workflow we extend the core
% matching algorithm using a probabilistic graph-based inference algorithm.
%We will learn the probability of matching given two \annots{} as well as a
% confidence in that estimate.
%We will use this information build a weighted graph of potential matches.
%To perform inference on this graph we propose to develop a decision
% mechanism that will make probabilistic decisions about \intraoccurrence{}
% matching, \vsexemplar{} matching, and consistency checks.
%To support continuous and dynamic use of the system we develop a caching
%scheme that supports seamless invalidation of outdated data, computes
%requested data on the fly, and disallows duplicate data. We use this scheme
%to dynamically update the underlying data structures as more data is added
%to the system. This is all accomplished in a stateless framework which
%allows for the image analysis software to be used concurrently by web-based
%frameworks.
\section{ORGANIZATION} %
This \thesis{} is organized as follows:
%
\Cref{chap:relatedwork} describes related work.
The focus is on the details of techniques used in the system, while an overview is given for those that are
only indirectly related.
%
\Cref{chap:ranking} describes the ranking algorithm for identifying individual animals, one \annot{} at a
time, against a database of \exemplars{}.
This chapter includes an experimental evaluation of the ranking algorithm.
This is the algorithm that was used in the \GZC{}.
\Cref{chap:pairclf} addresses the problem of semi-automatic verification of results from the ranking
algorithm.
%
\Cref{chap:graphid} combines the ranking and verification algorithm into a semi-automatic framework that
detects and corrects errors while reducing the number of manual reviews.
%
\Cref{chap:conclusion} concludes this \thesis{} and summarizes its contributions.