%% Motion-Based Multiple Object Tracking
% This example shows how to perform automatic detection and motion-based
% tracking of moving objects in a video from a stationary camera.
%
% Copyright 2014 The MathWorks, Inc.
%%
% Detection of moving objects and motion-based tracking are important
% components of many computer vision applications, including activity
% recognition, traffic monitoring, and automotive safety. The problem of
% motion-based object tracking can be divided into two parts:
%
% # detecting moving objects in each frame
% # associating the detections corresponding to the same object over time
%
% The detection of moving objects uses a background subtraction algorithm
% based on Gaussian mixture models. Morphological operations are applied to
% the resulting foreground mask to eliminate noise. Finally, blob analysis
% detects groups of connected pixels, which are likely to correspond to
% moving objects.
%
% The association of detections to the same object is based solely on
% motion. The motion of each track is estimated by a Kalman filter. The
% filter is used to predict the track's location in each frame, and
% determine the likelihood of each detection being assigned to each
% track.
%
% Track maintenance becomes an important aspect of this example. In any
% given frame, some detections may be assigned to tracks, while other
% detections and tracks may remain unassigned. The assigned tracks are
% updated using the corresponding detections. The unassigned tracks are
% marked invisible. An unassigned detection begins a new track.
%
% Each track keeps count of the number of consecutive frames in which it
% remained unassigned. If the count exceeds a specified threshold, the
% example assumes that the object has left the field of view and deletes
% the track.
%
% For more information please see
% <matlab:helpview(fullfile(docroot,'toolbox','vision','vision.map'),'multipleObjectTracking') Multiple Object Tracking>.
%
% This example is a function with the main body at the top and helper
% routines in the form of
% <matlab:helpview(fullfile(docroot,'toolbox','matlab','matlab_prog','matlab_prog.map'),'nested_functions') nested functions>
% below.
% load('centroids_rcnn.mat')
% load('bboxes_rcnn.mat')
% load('filename_jpg.mat')
% Adapted from the multiObjectTracking() example function;
% see that function for more details.
function fusedMOT()
% Create System objects used for reading video, detecting moving objects,
% and displaying the results.
obj = setupSystemObjects();
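% Load the pre-computed Faster R-CNN detections. Each cell of
% centroids_rcnn / bboxes_rcnn holds the detections for one video frame.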
load('centroids_rcnn.mat')
load('bboxes_rcnn.mat')
tracks = initializeTracks(); % Create an empty array of tracks.
nextId = 1; % ID of the next track
cont = 1; % index into the pre-computed R-CNN detections (one cell per frame)
foldername = dir('E:\faster_rcnn_matlab\faster_rcnn-master\MOT\GGBEECS'); % list all entries in the image folder
for i = 1:length(foldername)-2 % skip the '.' and '..' entries
filename = strcat('E:\faster_rcnn_matlab\faster_rcnn-master\MOT\GGBEECS\', foldername(i+2).name); % build the subfolder path
filename_jpg = dir(strcat(filename, '\*.jpg')); % list the JPG images in the subfolder
% imagedata = imread(filename_jpg); % read an image
% save('<save path>\<name>.mat', imagedata);
end
% Detect moving objects, and track them across video frames.
while ~isDone(obj.reader)
frame = readFrame();
% frame = imread(fullfile(filename, filename_jpg(nextId).name));
% [centroids, bboxes, ~] = detectObjects(frame);
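% Use the pre-computed Faster R-CNN detections for this frame instead of
% the background-subtraction detector above.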
centroids=centroids_rcnn{cont};
bboxes=bboxes_rcnn{cont};
predictNewLocationsOfTracks();
[assignments, unassignedTracks, unassignedDetections] = ...
detectionToTrackAssignment();
updateAssignedTracks();
updateUnassignedTracks();
deleteLostTracks();
createNewTracks();
displayTrackingResults();
cont = cont + 1; % advance to the next frame's pre-computed detections
pause(0.05); % slow down playback slightly for display
end
%% Create System Objects
% Create System objects used for reading the video frames, detecting
% foreground objects, and displaying results.
function obj = setupSystemObjects()
% Initialize Video I/O
% Create objects for reading a video from a file, drawing the tracked
% objects in each frame, and playing the video.
% Create a video file reader.
obj.reader = vision.VideoFileReader('basketball.mp4');
obj.writer = vision.VideoFileWriter('basketball_rcnn_kalman.avi');
% E:\DanielBu\资料\硕士\EECS442 Mechine Vision机器视觉\project\Tracking\otherpeopleMOT\
% Create two video players, one to display the video,
% and one to display the foreground mask.
% obj.videoPlayer = vision.VideoPlayer('Position', [20, 100, 1380, 820]);
obj.videoPlayer = vision.VideoPlayer('Position', [20, 400, 700, 400]);
obj.maskPlayer = vision.VideoPlayer('Position', [740, 400, 700, 400]);
% Create System objects for foreground detection and blob analysis
% The foreground detector is used to segment moving objects from
% the background. It outputs a binary mask, where the pixel value
% of 1 corresponds to the foreground and the value of 0 corresponds
% to the background.
obj.detector = vision.ForegroundDetector('NumGaussians', 3, ...
'NumTrainingFrames', 40, 'MinimumBackgroundRatio', 0.7);
% Connected groups of foreground pixels are likely to correspond to moving
% objects. The blob analysis System object is used to find such groups
% (called 'blobs' or 'connected components'), and compute their
% characteristics, such as area, centroid, and the bounding box.
obj.blobAnalyser = vision.BlobAnalysis('BoundingBoxOutputPort', true, ...
'AreaOutputPort', true, 'CentroidOutputPort', true, ...
'MinimumBlobArea', 400);
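% Note: in this fused version the detections come from the pre-computed
% Faster R-CNN results, so the foreground detector and blob analyser are
% only needed if the detectObjects() path is re-enabled.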
end
%% Initialize Tracks
% The |initializeTracks| function creates an array of tracks, where each
% track is a structure representing a moving object in the video. The
% purpose of the structure is to maintain the state of a tracked object.
% The state consists of information used for detection to track assignment,
% track termination, and display.
%
% The structure contains the following fields:
%
% * |id| : the integer ID of the track
% * |bbox| : the current bounding box of the object; used
% for display
% * |kalmanFilter| : a Kalman filter object used for motion-based
% tracking
% * |age| : the number of frames since the track was first
% detected
% * |totalVisibleCount| : the total number of frames in which the track
% was detected (visible)
% * |consecutiveInvisibleCount| : the number of consecutive frames for
% which the track was not detected (invisible).
%
% Noisy detections tend to result in short-lived tracks. For this reason,
% the example only displays an object after it was tracked for some number
% of frames. This happens when |totalVisibleCount| exceeds a specified
% threshold.
%
% When no detections are associated with a track for several consecutive
% frames, the example assumes that the object has left the field of view
% and deletes the track. This happens when |consecutiveInvisibleCount|
% exceeds a specified threshold. A track may also get deleted as noise if
% it was tracked for only a short time and marked invisible for most of
% the frames.
function tracks = initializeTracks()
% create an empty array of tracks
tracks = struct(...
'id', {}, ...
'bbox', {}, ...
'kalmanFilter', {}, ...
'age', {}, ...
'totalVisibleCount', {}, ...
'consecutiveInvisibleCount', {});
end
%% Read a Video Frame
% Read the next video frame from the video file.
function frame = readFrame()
frame = obj.reader.step();
end
%% Detect Objects
% The |detectObjects| function returns the centroids and the bounding boxes
% of the detected objects. It also returns the binary mask, which has the
% same size as the input frame. Pixels with a value of 1 correspond to the
% foreground, and pixels with a value of 0 correspond to the background.
%
% The function performs motion segmentation using the foreground detector.
% It then performs morphological operations on the resulting binary mask to
% remove noisy pixels and to fill the holes in the remaining blobs.
function [centroids, bboxes, mask] = detectObjects(frame)
% Detect foreground.
mask = obj.detector.step(frame);
% Apply morphological operations to remove noise and fill in holes.
mask = imopen(mask, strel('rectangle', [3,3]));
mask = imclose(mask, strel('rectangle', [15, 15]));
mask = imfill(mask, 'holes');
% Perform blob analysis to find connected components.
[~, centroids, bboxes] = obj.blobAnalyser.step(mask);
end
%% Predict New Locations of Existing Tracks
% Use the Kalman filter to predict the centroid of each track in the
% current frame, and update its bounding box accordingly.
function predictNewLocationsOfTracks()
for i = 1:length(tracks)
bbox = tracks(i).bbox;
% Predict the current location of the track.
predictedCentroid = predict(tracks(i).kalmanFilter);
% Shift the bounding box so that its center is at
% the predicted location.
predictedCentroid = int32(predictedCentroid) - bbox(3:4) / 2;
tracks(i).bbox = [predictedCentroid, bbox(3:4)];
end
end
%% Assign Detections to Tracks
% Assigning object detections in the current frame to existing tracks is
% done by minimizing cost. The cost is defined as the negative
% log-likelihood of a detection corresponding to a track.
%
% The algorithm involves two steps:
%
% Step 1: Compute the cost of assigning every detection to each track using
% the |distance| method of the |vision.KalmanFilter| System object(TM). The
% cost takes into account the Euclidean distance between the predicted
% centroid of the track and the centroid of the detection. It also includes
% the confidence of the prediction, which is maintained by the Kalman
% filter. The results are stored in an MxN matrix, where M is the number of
% tracks, and N is the number of detections.
%
% Step 2: Solve the assignment problem represented by the cost matrix using
% the |assignDetectionsToTracks| function. The function takes the cost
% matrix and the cost of not assigning any detections to a track.
%
% The value for the cost of not assigning a detection to a track depends on
% the range of values returned by the |distance| method of the
% |vision.KalmanFilter|. This value must be tuned experimentally. Setting
% it too low increases the likelihood of creating a new track, and may
% result in track fragmentation. Setting it too high may result in a single
% track corresponding to a series of separate moving objects.
%
% The |assignDetectionsToTracks| function uses the Munkres' version of the
% Hungarian algorithm to compute an assignment which minimizes the total
% cost. It returns a two-column matrix containing the indices of the
% assigned tracks and detections in its rows. It also returns the
% indices of tracks and detections that remained unassigned.
function [assignments, unassignedTracks, unassignedDetections] = ...
detectionToTrackAssignment()
nTracks = length(tracks);
nDetections = size(centroids, 1);
% Compute the cost of assigning each detection to each track.
cost = zeros(nTracks, nDetections);
for i = 1:nTracks
cost(i, :) = distance(tracks(i).kalmanFilter, centroids);
end
% Solve the assignment problem.
costOfNonAssignment = 20;
[assignments, unassignedTracks, unassignedDetections] = ...
assignDetectionsToTracks(cost, costOfNonAssignment);
end
%% Update Assigned Tracks
% The |updateAssignedTracks| function updates each assigned track with the
% corresponding detection. It calls the |correct| method of
% |vision.KalmanFilter| to correct the location estimate. Next, it stores
% the new bounding box, and increases the age of the track and the total
% visible count by 1. Finally, the function sets the invisible count to 0.
function updateAssignedTracks()
numAssignedTracks = size(assignments, 1);
for i = 1:numAssignedTracks
trackIdx = assignments(i, 1);
detectionIdx = assignments(i, 2);
centroid = centroids(detectionIdx, :);
bbox = bboxes(detectionIdx, :);
% Correct the estimate of the object's location
% using the new detection.
correct(tracks(trackIdx).kalmanFilter, centroid);
% Replace predicted bounding box with detected
% bounding box.
tracks(trackIdx).bbox = bbox;
% Update track's age.
tracks(trackIdx).age = tracks(trackIdx).age + 1;
% Update visibility.
tracks(trackIdx).totalVisibleCount = ...
tracks(trackIdx).totalVisibleCount + 1;
tracks(trackIdx).consecutiveInvisibleCount = 0;
end
end
%% Update Unassigned Tracks
% Mark each unassigned track as invisible, and increase its age by 1.
function updateUnassignedTracks()
for i = 1:length(unassignedTracks)
ind = unassignedTracks(i);
tracks(ind).age = tracks(ind).age + 1;
tracks(ind).consecutiveInvisibleCount = ...
tracks(ind).consecutiveInvisibleCount + 1;
end
end
%% Delete Lost Tracks
% The |deleteLostTracks| function deletes tracks that have been invisible
% for too many consecutive frames. It also deletes recently created tracks
% that have been invisible for too many frames overall.
function deleteLostTracks()
if isempty(tracks)
return;
end
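% A track is deleted if it has been invisible for too many consecutive
% frames, or if it is still young but was visible in fewer than 60% of
% its frames.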
invisibleForTooLong = 20;
ageThreshold = 8;
% Compute the fraction of the track's age for which it was visible.
ages = [tracks(:).age];
totalVisibleCounts = [tracks(:).totalVisibleCount];
visibility = totalVisibleCounts ./ ages;
% Find the indices of 'lost' tracks.
lostInds = (ages < ageThreshold & visibility < 0.6) | ...
[tracks(:).consecutiveInvisibleCount] >= invisibleForTooLong;
% Delete lost tracks.
tracks = tracks(~lostInds);
end
%% Create New Tracks
% Create new tracks from unassigned detections. Assume that any unassigned
% detection is a start of a new track. In practice, you can use other cues
% to eliminate noisy detections, such as size, location, or appearance.
function createNewTracks()
centroids = centroids(unassignedDetections, :);
bboxes = bboxes(unassignedDetections, :);
for i = 1:size(centroids, 1)
centroid = centroids(i,:);
bbox = bboxes(i, :);
% Create a Kalman filter object.
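% configureKalmanFilter arguments: motion model, initial location,
% initial estimate error, motion noise, and measurement noise. The
% noise values below follow the original multiObjectTracking example
% and may need tuning for R-CNN detections.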
kalmanFilter = configureKalmanFilter('ConstantVelocity', ...
centroid, [200, 50], [100, 25], 100);
% Create a new track.
newTrack = struct(...
'id', nextId, ...
'bbox', bbox, ...
'kalmanFilter', kalmanFilter, ...
'age', 1, ...
'totalVisibleCount', 1, ...
'consecutiveInvisibleCount', 0);
% Add it to the array of tracks.
tracks(end + 1) = newTrack;
% Increment the next id.
nextId = nextId + 1;
end
end
%% Display Tracking Results
% The |displayTrackingResults| function draws a bounding box and label ID
% for each track on the video frame and the foreground mask. It then
% displays the frame and the mask in their respective video players.
function displayTrackingResults()
% Convert the frame and the mask to uint8 RGB.
frame = im2uint8(frame);
% mask = uint8(repmat(mask, [1, 1, 3])) .* 255;
minVisibleCount = 1;
if ~isempty(tracks)
% Noisy detections tend to result in short-lived tracks.
% Only display tracks that have been visible for more than
% a minimum number of frames.
reliableTrackInds = ...
[tracks(:).totalVisibleCount] > minVisibleCount;
reliableTracks = tracks(reliableTrackInds);
% Display the objects. If an object has not been detected
% in this frame, display its predicted bounding box.
if ~isempty(reliableTracks)
% Get bounding boxes.
bboxes = cat(1, reliableTracks.bbox);
% Get ids.
ids = int32([reliableTracks(:).id]);
% Create labels for objects indicating the ones for
% which we display the predicted rather than the actual
% location.
labels = cellstr(int2str(ids'));
predictedTrackInds = ...
[reliableTracks(:).consecutiveInvisibleCount] > 0;
isPredicted = cell(size(labels));
isPredicted(predictedTrackInds) = {' predicted'};
labels = strcat(labels, isPredicted);
% Draw the objects on the frame.
frame = insertObjectAnnotation(frame, 'rectangle', ...
bboxes, labels);
% imshow(frame,'border','tight','initialmagnification','fit');
% print(gcf,'-djpeg' ,['fusedMOT_ggbeecs',num2str(nextId),'.jpeg'],'-r400')
% Draw the objects on the mask.
% mask = insertObjectAnnotation(mask, 'rectangle', ...
% bboxes, labels);
end
end
% Display the mask and the frame.
% obj.maskPlayer.step(mask);
obj.videoPlayer.step(frame);
obj.writer.step(frame);
end
%% Summary
% This example created a motion-based system for detecting and
% tracking multiple moving objects. Try using a different video to see if
% you are able to detect and track objects. Try modifying the parameters
% for the detection, assignment, and deletion steps.
%
% The tracking in this example was solely based on motion with the
% assumption that all objects move in a straight line with constant speed.
% When the motion of an object significantly deviates from this model, the
% example may produce tracking errors. In the original example's video,
% notice the mistake in tracking the person labeled #12 when he is
% occluded by the tree.
%
% The likelihood of tracking errors can be reduced by using a more complex
% motion model, such as constant acceleration, or by using multiple Kalman
% filters for every object. Also, you can incorporate other cues for
% associating detections over time, such as size, shape, and color.
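%
% For example, a constant-acceleration motion model could be configured
% in createNewTracks() roughly as follows (a sketch only; the noise
% values are illustrative and would need tuning for this video):
%
%   kalmanFilter = configureKalmanFilter('ConstantAcceleration', ...
%       centroid, [200, 50, 5], [100, 25, 10], 100);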
displayEndOfDemoMessage(mfilename)
end