
sje397 t1_isrm4mi wrote

The difference is that when you're tracking, you need to decide whether a bounding box in one frame is the same object as a box in the previous frame, or a different object of the same type. There's a fair bit of complexity in that. One part is the linear sum assignment problem: if you greedily give the same object ID to the closest pair of boxes in two successive frames, you can end up with a worse solution than if you minimise the total distance between matched boxes over all pairs at once. Another is the choice of matching cost: whether you track the centres of the bounding boxes or look at something like IoU (intersection over union).
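To make the greedy-vs-overall point concrete, here's a rough sketch in plain Python. The boxes, the `(x1, y1, x2, y2)` format and all the function names are made up for illustration. The "optimal" matcher just brute-forces every permutation and assumes both frames have the same number of boxes, which is only sensible for a handful of objects; in practice you'd probably reach for something like `scipy.optimize.linear_sum_assignment`. An `iou` helper is included as the alternative matching cost.

```python
from itertools import permutations
from math import hypot

# Boxes are (x1, y1, x2, y2). This format is an assumption for the sketch.

def centre(box):
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def centre_distance(a, b):
    (ax, ay), (bx, by) = centre(a), centre(b)
    return hypot(ax - bx, ay - by)

def iou(a, b):
    # Intersection over union: an alternative cost to centre distance.
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def greedy_match(prev, curr):
    # Repeatedly take the closest still-unmatched (prev, curr) pair.
    pairs = sorted(
        (centre_distance(p, c), i, j)
        for i, p in enumerate(prev)
        for j, c in enumerate(curr)
    )
    match, used_curr = {}, set()
    for _, i, j in pairs:
        if i not in match and j not in used_curr:
            match[i] = j
            used_curr.add(j)
    return match

def optimal_match(prev, curr):
    # Brute force over all one-to-one assignments (assumes equal lengths).
    best = min(
        permutations(range(len(curr))),
        key=lambda perm: sum(centre_distance(prev[i], curr[j])
                             for i, j in enumerate(perm)),
    )
    return dict(enumerate(best))

def total_cost(match, prev, curr):
    return sum(centre_distance(prev[i], curr[j]) for i, j in match.items())

# Two objects both moving right; object 1 lands close to where object 0 is heading.
prev_boxes = [(-1, 0, 1, 2), (2, 0, 4, 2)]   # centres at x = 0 and x = 3
curr_boxes = [(1, 0, 3, 2), (5, 0, 7, 2)]    # centres at x = 2 and x = 6

greedy = greedy_match(prev_boxes, curr_boxes)    # swaps the two IDs
optimal = optimal_match(prev_boxes, curr_boxes)  # keeps the IDs in order
print(greedy, total_cost(greedy, prev_boxes, curr_boxes))    # total distance 7.0
print(optimal, total_cost(optimal, prev_boxes, curr_boxes))  # total distance 5.0
```

Greedy grabs the single closest pair first (old object 1 with new box 0), which forces the other object into a long match and swaps the IDs. Minimising the total gives the assignment where both objects simply moved right. Note that swapping in `1 - iou(a, b)` as the cost wouldn't necessarily rescue the greedy approach in this particular case; the cost function and the assignment strategy are separate choices.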
