
StephaneCharette t1_ir7nomj wrote

I cannot help but think, "oh yeah, this framework over here is 50x faster than anything else, but everyone has forgotten about it until just now..."

If <something> gave a 50x improvement, wouldn't everyone already be using it?

Having said that, the reason I use Darknet/YOLO is specifically because the whole thing compiles to a C++ library. DLL on Windows, and .a or .so on Linux. I can squeeze out a few more FPS by using the OpenCV implementation instead of Darknet directly, but the implementation is not trivial to use correctly.

However, if you're working with ONNX then I suspect you're already achieving speeds higher than using Darknet or OpenCV as the framework.

One thing to remember: resizing images (aka video frames) is SLOWER than inference. I don't know what your PyTorch and ONNX frameworks do when the input image is larger than the network, but when I take timing measurements with Darknet/YOLO and OpenCV's DNN, I end up spending more time resizing the video frames than I do in inference. This is a BIG deal, which most people ignore or trivialize. Each frame costs resize time plus inference time, so if the resize takes longer than inference, removing it more than doubles the frame rate. If you can size your network correctly, or you can adjust the video capture to avoid resizing, you'll likely more than double your FPS. See these performance numbers for example: https://www.ccoderun.ca/programming/2021-10-16_darknet_fps/#resize
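
If you want to check this on your own pipeline, here is a rough timing sketch. It is not how Darknet or OpenCV measure anything internally; the names `mean_ms`, `needs_resize` and `compare_stages` are made up for this example, and the two stages are plain callables you supply yourself (for instance a wrapper around your resize call and one around your network's forward pass).

```python
import time


def mean_ms(stage, frame, repeats=20):
    """Average wall-clock time of stage(frame), in milliseconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        stage(frame)
    return (time.perf_counter() - start) * 1000.0 / repeats


def needs_resize(frame_size, network_size):
    """True when the captured (width, height) differs from the network input."""
    return tuple(frame_size) != tuple(network_size)


def compare_stages(resize_stage, inference_stage, frame, repeats=20):
    """Time the two stages separately and report how much of the total is resizing.

    resize_stage and inference_stage are hypothetical placeholders:
    pass in whatever your own pipeline actually calls.
    """
    resize_ms = mean_ms(resize_stage, frame, repeats)
    resized = resize_stage(frame)  # inference is timed on the resized frame
    inference_ms = mean_ms(inference_stage, resized, repeats)
    total = resize_ms + inference_ms
    return {
        "resize_ms": resize_ms,
        "inference_ms": inference_ms,
        "resize_share": resize_ms / total if total > 0 else 0.0,
    }
```

If `resize_share` comes out above 0.5, resizing is costing you more than inference, and matching the capture size to the network size (so that `needs_resize` returns False) should roughly double your FPS or better.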
