It seems every GPU needs its own vendor-specific stack, and the same things get rebuilt over and over for each new GPU. We have CUDA, ROCm, and Metal, and will soon need something for Intel. I know there are already a lot of tools built on CUDA, which makes it hard to replace. But take something like Apple devices (Apple has a history of not giving a darn about compute unless it's for the iPhone or iPad): a ton of operations have to be implemented for each backend, and CUDA seems to be the only one you know will be reliably supported. I'm curious what you guys think about why a cross-vendor standard like Vulkan isn't a thing in ML, even though the game industry uses open standards like these all the time.

Edit: Shoot, I just realized PyTorch was prototyping Vulkan as a backend. https://pytorch.org/tutorials/prototype/vulkan_workflow.html

Comments


suflaj t1_j3haky4 wrote on January 8, 2023 at 4:02 PM

Why would it be used? It doesn't begin to compare to CUDA and cuDNN. Nothing really does. And Vulkan specifically is made for graphics pipelines, not for general purpose compute. To be cross compatible, it usually sends compute to be done on the CPU.

It's not that there is a conspiracy to use proprietary NVIDIA software - there just isn't anything better than it.

jacobgorm t1_j3nigl3 wrote on January 9, 2023 at 8:11 PM

Being cross-platform and not tied to a single vendor's hardware would be a great plus. Vulkan Compute is for general-purpose compute, not graphics.

suflaj t1_j3oepus wrote on January 9, 2023 at 11:33 PM

You underestimate how hard cross-platform is to achieve, especially with GPUs. First and foremost, there is no GPGPU API standard, so ensuring cross-platform support is a tedious task that essentially means either creating an API that has to accommodate every GPU, or writing "drivers" for every different GPU. GPUs can be vastly different between generations and models, unlike the x86 and x86-64 CPU architectures, which have mostly stayed the same for several decades now.
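To make the "one API vs. per-GPU drivers" trade-off concrete, here is a minimal sketch of what a framework-side backend abstraction might look like. Every name in it (`Backend`, `CpuBackend`, `FakeGpuBackend`, `pick_backend`) is hypothetical and not taken from any real library; the point is only that each operation has to be implemented once per backend, which is where the work multiplies.

```python
from abc import ABC, abstractmethod


class Backend(ABC):
    """Hypothetical interface that every device backend would have to implement."""

    name = "abstract"

    @abstractmethod
    def is_available(self):
        """Return True if this backend can run on the current machine."""

    @abstractmethod
    def add(self, a, b):
        """Elementwise add of two equal-length vectors."""


class CpuBackend(Backend):
    """Reference implementation that always works."""

    name = "cpu"

    def is_available(self):
        return True

    def add(self, a, b):
        return [x + y for x, y in zip(a, b)]


class FakeGpuBackend(Backend):
    """Stand-in for a vendor-specific backend; it pretends no device is present."""

    name = "fake_gpu"

    def is_available(self):
        return False

    def add(self, a, b):
        raise RuntimeError("no device available")


def pick_backend(backends):
    """Return the first available backend, in priority order."""
    for backend in backends:
        if backend.is_available():
            return backend
    raise RuntimeError("no usable backend")


# Prefer the (fake) GPU, fall back to the CPU when it is missing.
backend = pick_backend([FakeGpuBackend(), CpuBackend()])
result = backend.add([1, 2, 3], [10, 20, 30])
```

With N operations and M backends, an abstraction like this still needs roughly N x M implementations, each ideally tuned for its hardware.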

Vulkan Compute is nowhere near reaching feature parity with CUDA and cuDNN. ROCm's CUDA-like layer (HIP) is way better, and it is still too much of a pain to install and maintain.

Furthermore, open standards mean nothing when a graphics vendor can just gimp the API, like, ironically, NVIDIA already does with Vulkan.

There is an open variant called OpenCL. But it will probably never be as mature as CUDA, even though 3.0 is apparently making great strides. There is absolutely no reason to push for Vulkan due to how cancerous developing anything in it is.

olivierp9 t1_j3hc9x7 wrote on January 8, 2023 at 4:13 PM

Also, I think Vulkan compute shaders do not support tensors with more than 3 or 4 dimensions, but I'm not sure.
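For context on the dimension question: a compute dispatch grid is three-dimensional in Vulkan (and in CUDA), but tensor data normally lives in a flat buffer, so tensors of higher rank are usually handled by index arithmetic rather than by the grid itself. The sketch below is plain Python, not shader code, and its helper names are made up for illustration. It shows the row-major mapping a kernel invocation would compute.

```python
from math import prod


def row_major_strides(shape):
    """Element stride of each axis for a contiguous row-major tensor."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def flatten_index(index, shape):
    """Map an N-dimensional index to an offset into a flat buffer."""
    if len(index) != len(shape):
        raise ValueError("index and shape must have the same rank")
    for i, dim in zip(index, shape):
        if not 0 <= i < dim:
            raise IndexError("index out of range")
    return sum(i * s for i, s in zip(index, row_major_strides(shape)))


def unflatten_index(offset, shape):
    """Recover the N-dimensional index from a flat offset."""
    return tuple(
        (offset // stride) % dim
        for stride, dim in zip(row_major_strides(shape), shape)
    )


# A rank-5 tensor stored as one flat buffer of prod(shape) elements.
shape = (2, 3, 4, 5, 6)
total_elements = prod(shape)
last_offset = flatten_index((1, 2, 3, 4, 5), shape)
```

A 1D dispatch over `total_elements` invocations, each calling the equivalent of `unflatten_index` on its global ID, is one common way to cover a tensor of any rank.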

FastestLearner t1_j3i0bdo wrote on January 8, 2023 at 6:45 PM

But what can Vulkan do that CUDA can’t already do?

I_will_delete_myself OP t1_j3i2jv5 wrote on January 8, 2023 at 6:59 PM

But what about something more niche like MPS or ROCm?

CyberDainz t1_j3kv4f5 wrote on January 9, 2023 at 6:39 AM

ML is not just about the backend. Technically you can write and run ML programs on OpenCL or OpenGL, but they will be at least 2-4x slower than on a specialized backend like CUDA / ROCm.

It's all about tuning programs (kernels such as matmul) for each GPU model to achieve maximum performance. CUDA and ROCm already ship with tuned programs.
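As a rough illustration of what "tuning" means here, the sketch below times a tiled matrix multiply at several tile sizes and keeps the fastest. It runs on the CPU in pure Python, so the numbers say nothing about real GPU performance; `matmul_tiled` and `autotune` are hypothetical helpers, and real vendor libraries tune far more parameters than one tile size.

```python
import time


def matmul_tiled(a, b, tile):
    """Multiply matrices a (n x k) and b (k x m) by visiting tile-sized blocks."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, tile):
        for k0 in range(0, k, tile):
            for j0 in range(0, m, tile):
                for i in range(i0, min(i0 + tile, n)):
                    row_a, row_c = a[i], c[i]
                    for kk in range(k0, min(k0 + tile, k)):
                        a_ik, row_b = row_a[kk], b[kk]
                        for j in range(j0, min(j0 + tile, m)):
                            row_c[j] += a_ik * row_b[j]
    return c


def autotune(a, b, candidates, repeats=3):
    """Time each candidate tile size and return (best_tile, timings)."""
    timings = {}
    for tile in candidates:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            matmul_tiled(a, b, tile)
            best = min(best, time.perf_counter() - start)
        timings[tile] = best
    return min(timings, key=timings.get), timings


# Small example matrices; the winning tile size depends on the machine.
size = 16
a = [[float(i + j) for j in range(size)] for i in range(size)]
b = [[float(i - j) for j in range(size)] for i in range(size)]
best_tile, timings = autotune(a, b, candidates=[2, 4, 8, 16])
```

The result is only valid for the machine it was measured on, which is why a portable API alone does not give you the per-model tuning that the specialized backends already include.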