Submitted by op_prabhuomkar in r/MachineLearning
kkchangisin wrote:
Reply to comment by NovaBom8 in [P] Benchmarking some PyTorch Inference Servers by op_prabhuomkar
Looking at the model configs in the repo, there's definitely dynamic batching going on.

What's really interesting is that even with the default dynamic batching parameters, the response times are superior and very consistent.
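For context, here's what enabling dynamic batching looks like in a Triton Inference Server `config.pbtxt`. This is a minimal sketch assuming the repo is using Triton's model config format; the model name, platform, and batch size are placeholders, not values from the repo:

```
# Hypothetical config.pbtxt -- name, platform, and max_batch_size
# are placeholders, not taken from the repo.
name: "resnet50"
platform: "pytorch_libtorch"
max_batch_size: 8

# An empty block enables dynamic batching with all default
# parameters (e.g. max_queue_delay_microseconds defaults to 0,
# so requests are batched only if they're already queued).
dynamic_batching { }

# Tuning example: prefer batches of 4 or 8, and wait up to 100us
# for more requests to accumulate before dispatching a batch.
# dynamic_batching {
#   preferred_batch_size: [ 4, 8 ]
#   max_queue_delay_microseconds: 100
# }
```

The empty `dynamic_batching { }` block is all it takes to turn the feature on with defaults, which is the "default parameters" behavior described above.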