kkchangisin

kkchangisin t1_j5gcgbe wrote

Nice work! Triton already looks good, but have you tried optimizing with the Triton Model Analyzer?

https://github.com/triton-inference-server/model_analyzer

Across the various models I serve with Triton, I've found that the choice of output model format and Triton configuration can drastically improve performance, whether that's throughput, latency, or something else.
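To make the idea concrete, here's a rough Python sketch of what a configuration sweep boils down to: try a grid of settings, measure each one, and keep the best that stays inside a latency budget. This is not Model Analyzer's API or its actual search logic. `Candidate`, `candidates`, `pick_best`, and the `measure` callback are all hypothetical names made up for illustration, and the knobs (instance count, max batch size, dynamic batching on or off) are just an assumption about what is worth sweeping.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Iterable, Iterator, Optional, Tuple


@dataclass(frozen=True)
class Candidate:
    # Hypothetical knobs for illustration; not literal Triton config keys.
    instance_count: int
    max_batch_size: int
    dynamic_batching: bool


def candidates(instance_counts: Iterable[int],
               batch_sizes: Iterable[int]) -> Iterator[Candidate]:
    """Enumerate every combination of the knobs above."""
    for n, b, d in product(instance_counts, batch_sizes, (False, True)):
        yield Candidate(n, b, d)


def pick_best(cands: Iterable[Candidate],
              measure: Callable[[Candidate], Tuple[float, float]],
              latency_budget_ms: float) -> Optional[Candidate]:
    """Return the highest-throughput candidate within the latency budget.

    `measure` is supplied by the caller and returns
    (throughput, p99_latency_ms) for one candidate. In practice it would
    load the config and run a real benchmark; nothing here does that.
    Returns None if no candidate meets the budget.
    """
    best = None
    best_throughput = float("-inf")
    for c in cands:
        throughput, p99_ms = measure(c)
        if p99_ms <= latency_budget_ms and throughput > best_throughput:
            best, best_throughput = c, throughput
    return best
```

You'd call it as `pick_best(candidates([1, 2], [1, 4, 8]), my_benchmark, 40.0)`, where `my_benchmark` is whatever you use to load-test a given configuration. The point of constraining on latency first and then maximizing throughput is that the fastest-looking config on paper is often the one that blows your latency target.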

Hopefully I get some time soon to try it out myself!

Again, nice work!
