Submitted by I_will_delete_myself t3_1068gl6 in MachineLearning

Every GPU seems to need its own specific stack, so the same things get rebuilt over and over for each new vendor. We have CUDA, ROCm, and Metal, and will soon need something for Intel. I know there are already a lot of tools built on CUDA, which makes it hard to replace. But on something like Apple devices (and Apple has a history of not giving a darn about compute unless it's for the iPhone or iPad), there are a ton of operations that still have to be implemented, and CUDA seems to be the only thing you know will be reliably supported. I'm curious about your thoughts on why this isn't a thing in ML, even though the game industry uses open standards like these all the time.

Edit: Shoot, I just realized PyTorch was prototyping Vulkan as a backend. https://pytorch.org/tutorials/prototype/vulkan_workflow.html

8

Comments

suflaj t1_j3haky4 wrote

Why would it be used? It doesn't begin to compare to CUDA and cuDNN. Nothing really does. And Vulkan specifically is made for graphics pipelines, not for general purpose compute. To be cross compatible, it usually sends compute to be done on the CPU.

It's not that there is a conspiracy to use proprietary NVIDIA software; there just isn't anything better than it.

13

jacobgorm t1_j3nigl3 wrote

Being cross-platform and not tied to a single vendor's hardware would be a great plus. Vulkan Compute is for general-purpose compute, not graphics.

2

suflaj t1_j3oepus wrote

You underestimate how hard cross-platform is to achieve, especially with GPUs. First and foremost, there is no GPGPU API standard, so ensuring cross-platform support is a tedious task: it essentially means either creating an API that has to accommodate every GPU, or writing "drivers" for every different GPU. GPUs can be vastly different between generations and models, unlike the x86 and x86-64 CPU architectures, which have mostly stayed the same for several decades now.
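To make the "one API, many drivers" shape concrete, here is a minimal sketch in plain Python. Nothing in it touches a real GPU: `Backend`, `CpuBackend`, `PretendGpuBackend`, and `pick_backend` are hypothetical names made up for illustration, not part of any real library.

```python
from abc import ABC, abstractmethod


class Backend(ABC):
    """The common API every vendor-specific backend would have to implement."""

    name = "abstract"

    @abstractmethod
    def is_available(self) -> bool:
        """Report whether this backend can run on the current machine."""

    @abstractmethod
    def add(self, x, y):
        """Element-wise add of two equal-length lists."""


class CpuBackend(Backend):
    name = "cpu"

    def is_available(self) -> bool:
        return True  # assumption: the CPU fallback always exists

    def add(self, x, y):
        return [a + b for a, b in zip(x, y)]


class PretendGpuBackend(Backend):
    """Stand-in for a vendor backend; it never claims to be available."""

    name = "pretend-gpu"

    def is_available(self) -> bool:
        return False

    def add(self, x, y):
        raise RuntimeError("no such device on this machine")


def pick_backend(backends):
    """Return the first backend, in priority order, that reports itself available."""
    for backend in backends:
        if backend.is_available():
            return backend
    raise RuntimeError("no usable backend")


if __name__ == "__main__":
    backend = pick_backend([PretendGpuBackend(), CpuBackend()])
    print(backend.name, backend.add([1, 2], [3, 4]))
```

Even in this toy version, every new operation has to be added to the abstract class and then implemented once per backend, which is roughly where the tedium comes from.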

Vulkan Compute is nowhere near reaching feature parity with CUDA and cuDNN. ROCm's CUDA is way better and still too much of a pain to install and keep.

Furthermore, open standards mean nothing when a graphics vendor can just gimp the API, as, ironically, NVIDIA already does with Vulkan.

There is an open variant called OpenCL. But it will probably never be as mature as CUDA, even though 3.0 is apparently making great strides. There is absolutely no reason to push for Vulkan due to how cancerous developing anything in it is.

3

olivierp9 t1_j3hc9x7 wrote

Also, I think Vulkan compute shaders do not support tensors with more than 3 or 4 dimensions, but I'm not sure.

4

CyberDainz t1_j3kv4f5 wrote

ML is not just the backend. Technically you can code and run ML programs on OpenCL or OpenGL, but they will be at least 2x-4x slower than on a specialized backend like CUDA / ROCm.

It's all about tuning programs (such as matmul) for each GPU model to achieve maximum performance. CUDA and ROCm already contain tuned programs.
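A rough idea of what that tuning looks like, sketched in plain Python on the CPU rather than on a GPU: the same blocked matmul is timed with several tile sizes and the fastest one for the current machine is kept. The function names and the candidate tile sizes are assumptions for illustration; real vendor libraries tune far more parameters than this.

```python
import time


def matmul_tiled(a, b, tile):
    """Multiply two square matrices (lists of lists) in tile x tile blocks."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, tile):
        for k0 in range(0, n, tile):
            for j0 in range(0, n, tile):
                for i in range(i0, min(i0 + tile, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(k0, min(k0 + tile, n)):
                        aik, row_b = row_a[k], b[k]
                        for j in range(j0, min(j0 + tile, n)):
                            row_c[j] += aik * row_b[j]
    return c


def autotune(n, candidates=(4, 8, 16, 32)):
    """Time each candidate tile size on this machine and return the fastest."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    timings = {}
    for tile in candidates:
        start = time.perf_counter()
        matmul_tiled(a, b, tile)
        timings[tile] = time.perf_counter() - start
    return min(timings, key=timings.get), timings


if __name__ == "__main__":
    best_tile, timings = autotune(32)
    print("fastest tile size on this machine:", best_tile)
```

The winning tile size can differ from one machine to the next, which is the point: the result of the search has to be redone or shipped per device.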

3