
Pocok5 t1_ixpxqo9 wrote

A GPU can do a shitton of data-parallel stuff. If you find yourself doing the same operation over a ton of data points, it's worth thinking about whether you can do it all at the same time. Since you are doing python, check Numba https://carpentries-incubator.github.io/lesson-gpu-programming/03-numba/index.html

https://numba.readthedocs.io/en/stable/cuda/cudapysupported.html
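To illustrate the data-parallel idea: every "thread" runs the same kernel body on its own index, and the indices don't depend on each other. This is a plain-Python sketch (a real GPU version would decorate the kernel with `numba.cuda.jit` and launch it over a grid; here the launch is simulated with a loop so it runs without a GPU, and all names are illustrative):

```python
def kernel(idx, xs, ys, out):
    # Same operation at every index; each index is independent of the others,
    # which is what makes the work safe to run in parallel on a GPU.
    out[idx] = xs[idx] * 2 + ys[idx]

def launch(n_threads, xs, ys):
    # Simulated "grid launch": a GPU would run these thread bodies in parallel.
    out = [0] * n_threads
    for idx in range(n_threads):
        kernel(idx, xs, ys, out)
    return out

result = launch(4, [1, 2, 3, 4], [10, 20, 30, 40])  # [12, 24, 36, 48]
```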

2

blry2468 OP t1_ixpyuub wrote

Yes, except my for loops use the loop counter and the step size as variables, so they don't repeat the exact same operation over and over: the computation changes a bit on each iteration. So that's the issue.

0

Pocok5 t1_ixq2k5l wrote

If it's just that, then it's no issue; in fact, it's integral to how CUDA works (I'm assuming the loop step is constant over one run of the loop). You get the index of the current thread and you can use it in the computation. For example, the CUDA prime-check example is "check the first N integers for primality": start N threads and run a primality test on each thread's index. The only problem arises if loop iteration #n+1 uses data calculated during iteration #n.
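A sketch of that per-thread-index pattern, again in plain Python (hypothetical names; with `numba.cuda` each index would be one GPU thread reading its own index via `cuda.grid(1)` instead of a loop variable):

```python
def is_prime(n):
    # Trial division; this is the per-thread kernel body.
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def prime_flags(n):
    # Each index is checked independently of all the others, so on a GPU
    # all n checks could run at the same time, one thread per index.
    return [is_prime(i) for i in range(n)]

flags = prime_flags(10)  # index i is True iff i is prime
```

The key property is the same one the comment above names: no iteration reads a result produced by another iteration, so the loop can be replaced by a grid of threads.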

2