Submitted by NaturalGradient t3_10lot3v in MachineLearning

Find the release notes here:

A big highlight is how fast these implementations are! I genuinely believe GPU acceleration is the future of evolutionary algorithms, and EvoTorch, with its integration into the PyTorch ecosystem, is a fantastic enabler for this.

To demonstrate the raw speed provided by the new release, I compared EvoTorch's CMA-ES implementation to that provided by the popular pycma package on the 80-dimensional Rastrigin problem and tracked the run-time:

Performance was measured over 50 runs on the 80-dimensional Rastrigin problem
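
For reference, a minimal sketch of the benchmark function itself, evaluated for a whole population at once. It uses NumPy rather than EvoTorch or pycma, and the population size, seed, and sampling domain are assumptions for illustration, not the settings used in the comparison above.

```python
import numpy as np

def rastrigin(population: np.ndarray) -> np.ndarray:
    """Evaluate the Rastrigin function for each row of `population`.

    population: array of shape (popsize, n); returns an array of shape (popsize,).
    """
    n = population.shape[-1]
    return 10.0 * n + np.sum(
        population**2 - 10.0 * np.cos(2.0 * np.pi * population), axis=-1
    )

rng = np.random.default_rng(0)
# Hypothetical population: 1000 candidates in 80 dimensions, sampled
# uniformly from the customary domain [-5.12, 5.12].
population = rng.uniform(-5.12, 5.12, size=(1000, 80))
fitness = rastrigin(population)          # one fitness value per candidate
optimum = rastrigin(np.zeros((1, 80)))   # global minimum at the origin
```

Because the whole population is evaluated in one array expression, the same code shape carries over to GPU tensors, which is what makes very large populations cheap to evaluate on this kind of problem.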

The crazy thing to note is that when we switch to GPU (Tesla V100), we can efficiently run CMA-ES with population sizes going into 100k+!




lucidraisin t1_j5z7z6g wrote

CMA-ES! definitely playing around with this, thank you!


NaturalGradient OP t1_j60jc3z wrote

Great to hear! I actually led the CMA-ES effort and tried very hard to match the fine details of pycma so that the performance is comparable. If you run into any unexpected behavior, please do open a GitHub issue or reach out to me directly. There are a lot of fine details in a practical CMA-ES implementation, so I'd really like to know if I missed anything.


Ulfgardleo t1_j603u8t wrote

In my experience, this is never the bottleneck. Rastrigin does not cost much to evaluate; the real functions you would consider evolution for do. I did research on speeding up CMA-ES, and in the end it felt like a useless exercise in matrix algebra for that reason.

Yes, in theory being able to speed up matrix operations is nice, but doing stuff in higher dimensions (80 is kinda irrelevant computationally, even on a CPU) always has to fight against the O(1/n) convergence rate of all evo algorithms.

So all this is likely good for is benchmarking these algorithms in a regime that is practically irrelevant for evolution.


NaturalGradient OP t1_j60iyek wrote

It depends what you're trying to do :)

If you want to run GPU-accelerated neuroevolution in Brax or IsaacGym, then keeping everything on GPU is absolutely relevant. Similarly, if you're trying to do MPC or any optimisation of an NN input, then it's still very useful to be on the GPU. As you said, benchmarking is another place this GPU acceleration can be very helpful. Basically anywhere the fitness evaluation isn't the only bounding factor.

For expensive/CPU-bound fitness functions, we have other utilities too! For example, with a single flag you can distribute your fitness evaluation across multiple actors using Ray. This means you can scale to an entire CPU cluster effortlessly!
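
The general idea of farming out fitness evaluations to workers can be sketched with the standard library alone. This is not EvoTorch's or Ray's API: `expensive_fitness` and `evaluate_population` are hypothetical names, and a thread pool stands in for Ray actors purely to keep the sketch portable.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def expensive_fitness(x):
    """Hypothetical stand-in for a costly, CPU-bound fitness function."""
    return sum(xi * xi - 10.0 * math.cos(2.0 * math.pi * xi) + 10.0 for xi in x)

def evaluate_population(population, num_workers=4):
    """Evaluate every candidate, spreading the calls over a pool of workers."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # map() preserves input order, so fitnesses[i] belongs to population[i].
        return list(pool.map(expensive_fitness, population))

# Hypothetical three-candidate population in two dimensions.
population = [[0.0, 0.0], [1.0, 2.0], [-3.0, 0.5]]
fitnesses = evaluate_population(population)
```

Threads only illustrate the structure here. For pure-Python, CPU-bound fitness functions, an actual speedup needs separate processes or remote actors, which is the role Ray plays in the setup described above.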


Mefaso t1_j61zim5 wrote

>If you want to run GPU-accelerated neuroevolution in Brax or IsaacGym, then keeping everything on GPU is absolutely relevant

Do you have evidence for that?

I would assume that running Brax rollouts, for example, would take 100x as long as the actual CMA-ES update.
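
That assumption can be turned into a back-of-envelope bound. The sketch below is an Amdahl-style estimate with a hypothetical helper, `overall_speedup`. The 100:1 ratio is the assumed figure from the comment, not a measurement, and the model ignores any CPU-to-GPU transfer cost.

```python
def overall_speedup(eval_time, algo_time, algo_speedup):
    """Speedup of one generation when only the algorithm part gets faster.

    eval_time:    time spent in fitness evaluation (unchanged)
    algo_time:    time spent in the CMA-ES update itself
    algo_speedup: factor by which the CMA-ES update is accelerated
    """
    before = eval_time + algo_time
    after = eval_time + algo_time / algo_speedup
    return before / after

# Assumed ratio from the comment: rollouts cost 100x the CMA-ES update.
print(overall_speedup(eval_time=100.0, algo_time=1.0, algo_speedup=10.0))
# Limiting case: the CMA-ES update becomes free.
print(overall_speedup(eval_time=100.0, algo_time=1.0, algo_speedup=float("inf")))
```

Under that ratio, even an infinitely fast CMA-ES update caps the per-generation speedup at 101/100 = 1.01. The picture changes if moving the population between CPU and GPU every generation is itself expensive, which this model leaves out.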


pythonpeasant t1_j614dq6 wrote


Please go back to the AttentionNeuron and AttentionAgent papers and retrain them on GPU with big population sizes!


programmerChilli t1_j60s9pz wrote

Have you tried out the PyTorch 2.0 compilation feature (i.e. torch.compile)? It might help a lot for evolutionary computation stuff.


danielgafni t1_j62mh4o wrote

How does it compare to EvoJAX? A huge deal there is training all the networks in the population in parallel. This gives absolutely massive speedups, as you can imagine. Can EvoTorch do it?
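
What "all the networks in parallel" means can be sketched as one batched forward pass over a leading population axis. This is a NumPy illustration only, not how EvoJAX or EvoTorch implement it; the layer sizes and the names `forward_all` and `forward_loop` are made up for the example. In JAX the same effect is usually obtained with `jax.vmap`, and PyTorch offers `torch.func.vmap`.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical sizes: 16 networks, 8 observations each, a 4-32-2 MLP.
popsize, batch, n_in, n_hidden, n_out = 16, 8, 4, 32, 2

# One weight tensor per layer, with a leading population axis.
w1 = rng.normal(size=(popsize, n_in, n_hidden))
w2 = rng.normal(size=(popsize, n_hidden, n_out))
obs = rng.normal(size=(popsize, batch, n_in))  # each network sees its own inputs

def forward_all(obs, w1, w2):
    """Run every network in the population in a single batched call."""
    hidden = np.tanh(np.einsum("pbi,pih->pbh", obs, w1))
    return np.einsum("pbh,pho->pbo", hidden, w2)

def forward_loop(obs, w1, w2):
    """Reference version: one network at a time."""
    return np.stack([np.tanh(obs[p] @ w1[p]) @ w2[p] for p in range(len(w1))])

batched = forward_all(obs, w1, w2)
looped = forward_loop(obs, w1, w2)
```

Both versions compute the same outputs. The batched one hands the whole population to the array library in two calls, which is where the speedup on accelerators comes from.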


ML4Bratwurst t1_j5z42ky wrote

Call me picky, but I would not use an ML library that is not GPU accelerated. This should be the default.


ReginaldIII t1_j5zvqal wrote

Okay, you're picky :p

Try deploying a model for realtime online learning of streaming sensor data that needs to run on battery power and then insist it needs to run on GPUs.

Plenty of legitimate use cases for non GPU ML.


ML4Bratwurst t1_j5zxikl wrote

Can you give me one example of this? And even so, my point is still valid, because I did not say that you should delete the CPU support lol


ReginaldIII t1_j5zzhj1 wrote

Pick the tools that work for the problems you have. If you are online training a model on an embedded device you need something optimized for that hardware.

I gave you a generic example of a problem domain where this applies. You can search for online training on embedded devices if you are interested but I can't talk about specific applications because they are not public.

All I'm saying is drawing a line in the sand and saying you'd never use X if it doesn't have Y is silly because what if you end up working on something in the future where the constraints are different?


fernandocamargoti t1_j5z4qpc wrote

Evolutionary algorithms are not ML.


new_name_who_dis_ t1_j5zoc0t wrote

They are not gradient-descent based (so they don't need GPU acceleration as much, but sometimes still do, depending on the problem), but they are definitely ML.


fernandocamargoti t1_j5zs45e wrote

They are not about learning from data; they are about optimization. They are from the broader AI field of study, but I wouldn't say they are ML. They serve a different purpose. There is some research on using them to optimize models (instead of using gradient descent), but that is not their main use case.


ReginaldIII t1_j5zv9gz wrote

That's such a tenuous distinction, and you're wrong anyway, because you can pose any learning-from-data problem as a generic optimization problem.

They're very useful when your loss function is not differentiable but you still want to fit a model to input+output data pairs.
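
A minimal sketch of that situation: fitting a one-parameter threshold classifier under 0-1 loss, whose gradient is zero wherever it exists, with a bare-bones (1+lambda) evolution strategy. The data, the function names, and the fixed mutation scale are all assumptions made for the example. It is not CMA-ES and not any library's implementation.

```python
import random

# Toy data (hypothetical): label is 1 when x exceeds a hidden threshold of 0.3.
xs = [i / 50.0 - 1.0 for i in range(101)]
ys = [1 if x > 0.3 else 0 for x in xs]

def zero_one_loss(threshold):
    """Fraction of misclassified points. Piecewise constant in `threshold`,
    so gradient descent gets no signal from it."""
    wrong = sum((1 if x > threshold else 0) != y for x, y in zip(xs, ys))
    return wrong / len(xs)

def one_plus_lambda_es(start, sigma=0.5, offspring=20, generations=50, seed=0):
    """Minimal (1+lambda) evolution strategy with a fixed mutation scale."""
    rng = random.Random(seed)
    best, best_loss = start, zero_one_loss(start)
    for _ in range(generations):
        # Mutate the current parent to produce `offspring` children.
        children = [best + rng.gauss(0.0, sigma) for _ in range(offspring)]
        candidate = min(children, key=zero_one_loss)
        candidate_loss = zero_one_loss(candidate)
        # Elitist selection: the parent is replaced only if the child is no worse.
        if candidate_loss <= best_loss:
            best, best_loss = candidate, candidate_loss
    return best, best_loss

initial_loss = zero_one_loss(-1.0)
threshold, final_loss = one_plus_lambda_es(start=-1.0)
```

Because selection is elitist, the loss can never get worse than where it started, and the search only ever needs loss values, never derivatives.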

They're also useful when your model parameters have domain-specific meaning and you can derive rules for how two parameter sets can be meaningfully combined with one another.

Decision trees and random forests are ML too. What you probably mean is Deep Learning. But even that has a fuzzy boundary to surrounding methods.

Being a prescriptivist with these definitions is a waste of time because the research community as a whole cannot draw clear lines in the sand.


fernandocamargoti t1_j60xagg wrote

Well, what you are talking about are some ways to use evolutionary algorithms to optimize the parameters of an ML model. But in my eyes, that doesn't mean they are ML. They share a lot, but they aren't the same. For me, evolutionary algorithms are part of metaheuristics, which is part of AI (which ML is also part of). Different areas and sub-areas of research do interact with each other. I just mean that the "is" part is a bit too much in this case.


ReginaldIII t1_j61nlno wrote

Trying to force these things into a pure hierarchy sounds nothing short of an exercise in pedantry.

And to what end? You make up your own distinctions that no one else agrees with and you lose your ability to communicate ideas to people because you're talking a different language to them.

If you are so caught up on the "is a" part, have you studied any programming languages that support "multiple inheritance"?


new_name_who_dis_ t1_j601m4q wrote

Gradient descent is also about optimization... You can optimize even neural networks with a bunch of different methods other than gradient descent (including evolutionary methods). They don't work as well but you can still do it.