Submitted by op_prabhuomkar t3_10iqeuh in MachineLearning
It’s an early version, and I’m looking for feedback on how to improve it and do it the “right way”.
Source Code and Results: https://github.com/prabhuomkar/bitbeast/tree/master/ptibench
kkchangisin t1_j5gcgbe wrote
Nice work! Triton already looks good, but have you tried optimizing with the Triton Model Analyzer?
https://github.com/triton-inference-server/model_analyzer
Across the various models I serve with Triton, I've found that the choice of output model format and Triton configuration can drastically improve performance, whether in throughput, latency, or other metrics.
Hopefully I get some time soon to try it out myself!
Again, nice work!
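For anyone comparing model formats or configurations on throughput and latency, here is a minimal Python sketch of that kind of measurement. It is not taken from the linked repo or from Model Analyzer: `benchmark` and `fake_infer` are hypothetical names, and `fake_infer` is only a stand-in for a real request to a served model.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(infer: Callable[[], object], n_requests: int = 100) -> Dict[str, float]:
    """Time n_requests sequential calls to `infer` and summarize the results."""
    latencies = []
    start = time.perf_counter()
    for _ in range(n_requests):
        t0 = time.perf_counter()
        infer()
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start

    # Nearest-rank style p95 over the sorted per-request latencies.
    latencies.sort()
    p95_index = min(len(latencies) - 1, int(0.95 * len(latencies)))
    return {
        "throughput_rps": n_requests / elapsed,
        "mean_latency_s": statistics.mean(latencies),
        "p95_latency_s": latencies[p95_index],
    }


def fake_infer() -> None:
    # Hypothetical stand-in: replace with a real call to the served model.
    time.sleep(0.001)


if __name__ == "__main__":
    print(benchmark(fake_infer, n_requests=50))
```

Running the same `benchmark` against each configuration gives comparable numbers. Note that this sketch sends requests one at a time, so it says nothing about behavior under concurrent load.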