Submitted by Dense-Smf-6032 t3_y6iu0l in MachineLearning
Hello All
How is video tracking different from image detection? From my understanding, tracking within a video can be done by simply running object detection on each frame and then using NMS to combine these objects (based on how much they overlap). However, my friend told me this might not be an efficient method (because it runs at the per-frame level).
What is the current norm for video tracking? Do trackers run at the per-frame level?
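For concreteness, here is a minimal sketch of the per-frame idea described above, assuming a detector has already produced boxes as `(x1, y1, x2, y2)` tuples for every frame. NMS is normally applied within a single frame to drop duplicate boxes; the cross-frame step sketched here is a greedy overlap (IoU) match against the previous frame. The names `iou`, `link_detections`, and `iou_threshold` are made up for illustration and do not come from any particular library.

```python
from itertools import count


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def link_detections(frames, iou_threshold=0.3):
    """Greedily give each box the ID of the best-overlapping box in the previous frame.

    frames: list of per-frame lists of boxes (hypothetical detector output).
    Returns a list of per-frame lists of track IDs, in the same order as the boxes.
    """
    next_id = count()
    previous = []  # (track_id, box) pairs from the previous frame
    all_ids = []
    for boxes in frames:
        current, used = [], set()
        for box in boxes:
            best_id, best_iou = None, iou_threshold
            for track_id, prev_box in previous:
                if track_id in used:
                    continue  # each previous track can be claimed only once
                overlap = iou(box, prev_box)
                if overlap >= best_iou:
                    best_id, best_iou = track_id, overlap
            if best_id is None:
                best_id = next(next_id)  # no good match: start a new track
            used.add(best_id)
            current.append((best_id, box))
        previous = current
        all_ids.append([track_id for track_id, _ in current])
    return all_ids


# Toy input: two objects drifting slightly, then a new object appearing.
frames = [
    [(0, 0, 10, 10), (50, 50, 60, 60)],
    [(1, 1, 11, 11), (51, 50, 61, 60)],
    [(2, 2, 12, 12), (100, 100, 110, 110)],
]
ids = link_detections(frames)
```

This only compares adjacent frames, so a track is lost as soon as the detector misses the object once, and the detector still runs on every frame, which is where the cost concern comes from.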
marcus_hk t1_ispjwcw wrote
Object detection involves a predetermined set of object classes.
Video tracking means tracking any arbitrary thing that has a bounding box drawn around it. There are no classes per se.
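To illustrate the class-free idea, here is a rough sketch of tracking an arbitrary patch from an initial bounding box by template matching (sum of squared differences over a small search window). This is a deliberately simple stand-in chosen for illustration, not how modern trackers work; the names `ssd`, `track_template`, and `search_radius` are hypothetical, and frames are assumed to be grayscale images stored as lists of rows.

```python
import random


def ssd(frame, template, x, y):
    """Sum of squared differences between the template and the frame patch at (x, y)."""
    return sum(
        (frame[y + r][x + c] - template[r][c]) ** 2
        for r in range(len(template))
        for c in range(len(template[0]))
    )


def track_template(frame, template, prev_xy, search_radius=4):
    """Return the top-left (x, y) near prev_xy where the template matches best."""
    th, tw = len(template), len(template[0])
    fh, fw = len(frame), len(frame[0])
    px, py = prev_xy
    best_xy, best_cost = prev_xy, float("inf")
    # Only search a small window around the previous position.
    for y in range(max(0, py - search_radius), min(fh - th, py + search_radius) + 1):
        for x in range(max(0, px - search_radius), min(fw - tw, px + search_radius) + 1):
            cost = ssd(frame, template, x, y)
            if cost < best_cost:
                best_xy, best_cost = (x, y), cost
    return best_xy


# Synthetic example: a random 32x32 "image" and the same image shifted.
random.seed(0)
frame0 = [[random.random() for _ in range(32)] for _ in range(32)]

# The user marks a 6x5 box whose top-left corner is at x=10, y=12 in the first frame.
template = [row[10:16] for row in frame0[12:17]]

# Second frame: the scene moved 2 px right and 3 px down (with wraparound at the edges).
frame1 = [[frame0[(y - 3) % 32][(x - 2) % 32] for x in range(32)] for y in range(32)]

new_xy = track_template(frame1, template, (10, 12))
```

Nothing here knows what the object is; the only input is the pixels inside the initial box, which is the sense in which there are no classes.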