Submitted by [deleted] t3_11s58n4 in MachineLearning

Preview of the post since it's dropping in a few hours: https://deploy-preview-1313--pytorch-dot-org-preview.netlify.app/blog/pytorch-2.0-release/

Also a post about Accelerated Diffusers with 2.0: https://deploy-preview-1315--pytorch-dot-org-preview.netlify.app/blog/accelerated-diffusers-pt-20/

GPT Summary:

  • PyTorch 2.0 is a next generation release that offers faster performance and support for dynamic shapes and distributed training using torch.compile as the main API.

  • PyTorch 2.0 also includes a stable version of Accelerated Transformers, which use custom kernels for scaled dot product attention and are integrated with torch.compile.

  • Other beta features include PyTorch MPS Backend for GPU-accelerated training on Mac platforms, functorch APIs in the torch.func module, and AWS Graviton3 optimization for CPU inference.

  • The release also includes prototype features and technologies across TensorParallel, DTensor, 2D parallel, TorchDynamo, AOTAutograd, PrimTorch and TorchInductor.

210

Comments

ReginaldIII t1_jcdbpqz wrote

"GPT summary". Jesus wept. As if Reddit posts weren't already low-effort enough.

Neat news about PyTorch, though.

104

WH7EVR t1_jce683f wrote

Quality > Effort. I welcome the higher-quality comments and content we'll be getting by augmenting human laziness with AI speed and ability.

4

Philpax t1_jcdt8r0 wrote

Oh no, someone used a state of the art language model to summarise some text instead of doing it themselves. However will we live with this incalculable slight against norms of discussion on Reddit?

−4

ReginaldIII t1_jcdwasr wrote

LPT: copy-pasting the bullet-point change notes uses fewer GPUs. The more you know!

35

Philpax t1_jch72b8 wrote

I invite you to compare the GPT summary with the dot points in the article and tell me they are the same.

−5

dangpzanco t1_jccfj49 wrote

"Python 1.8 (deprecating Python 1.7)" links to "Deprecation of Cuda 11.6 and Python 1.7 support for PyTorch 2.0"

11

CyberDainz t1_jcfe382 wrote

torch.compile does not work on Windows :(

5

lostmsu t1_jcfmlbg wrote

It worked in the preview. Does it just not optimize? I didn't see significant improvements (under 5%).
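A minimal sketch of guarding against the unsupported-platform case discussed here: wrap a function or module with `torch.compile` when that works, and otherwise return it unchanged so it runs eagerly. The helper name `compile_if_supported` is hypothetical. The sketch assumes that an unsupported platform surfaces as a `RuntimeError` and a pre-2.0 install as an `AttributeError` when `torch.compile` is called, which may not hold for every build. Failures that only appear on the first call of the compiled object are not caught.

```python
def compile_if_supported(fn, **compile_kwargs):
    """Return torch.compile(fn) when possible, otherwise fn unchanged.

    Hypothetical helper, not part of PyTorch.
    """
    try:
        # Imported lazily so the helper has no hard dependency on torch.
        import torch

        return torch.compile(fn, **compile_kwargs)
    except (ImportError, AttributeError, RuntimeError):
        # ImportError: torch is not installed.
        # AttributeError: torch is older than 2.0 and has no torch.compile.
        # RuntimeError: assumed signal for an unsupported platform.
        return fn


def scale(x):
    return x * 2


# Either the compiled wrapper or the original function; same call either way.
fast_scale = compile_if_supported(scale)
result = fast_scale(21)
```

Whether the compiled version is actually faster depends on the model and hardware, so it is worth timing both paths before relying on it.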

1

LightbulbChanger25 t1_jcgir88 wrote

I think 2.0 is a good moment to add PyTorch to my list of skills. Are there any good resources to learn PyTorch 2.0 yet? I would consider myself between intermediate and advanced in TensorFlow.

3

throwawaychives t1_jcgnhuf wrote

The PyTorch docs are more than enough to learn torch, especially if you have good experience with other ML frameworks. Nothing beats implementing an actual model in torch, and there are plenty of GitHub repos out there you can use as a reference.

5

Competitive-Rub-1958 t1_jccyreq wrote

I may be misreading this, but is FlashAttention only used for the basic scaled dot product (QKV) attention op, and not embedded inside their MHA (multi-head attention) module?
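For reference, the "basic scaled QKV attention" in question is softmax(QKᵀ / √d) · V. Below is a library-free sketch of that computation for a single head with no batch dimension, masking, or dropout. The function name `sdpa_reference` is hypothetical. In PyTorch 2.0 the corresponding op is `torch.nn.functional.scaled_dot_product_attention`, which dispatches to fused kernels where it can. This sketch only shows the math, not how those kernels work.

```python
import math


def sdpa_reference(q, k, v):
    """Single-head attention over lists of rows: softmax(q k^T / sqrt(d)) v."""
    d = len(q[0])  # feature dimension shared by queries and keys
    out = []
    for q_row in q:
        # Scaled dot product of this query with every key.
        scores = [
            sum(a * b for a, b in zip(q_row, k_row)) / math.sqrt(d)
            for k_row in k
        ]
        # Numerically stable softmax over the scores.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of the value rows.
        out.append(
            [sum(w * v_row[j] for w, v_row in zip(weights, v))
             for j in range(len(v[0]))]
        )
    return out


# Two identical keys give equal weights, so the output is the mean of the values.
attended = sdpa_reference(
    q=[[1.0, 0.0]],
    k=[[1.0, 0.0], [1.0, 0.0]],
    v=[[1.0, 2.0], [3.0, 4.0]],
)
```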

2

logophobia t1_jch9ow5 wrote

torch.compile is a neat concept, but it still has some limitations for the models I tried it on (complex-valued tensors, pykeops, CUDA kernels). Some pretty great advancements otherwise. It will probably help when training transformers.

1

programmerChilli t1_jci4fyx wrote

I've actually had pretty good success using torch.compile for some of the stuff that KeOps works well for!

1

CosmosKrew t1_jcdyrd2 wrote

I could really get into PyTorch if they provided a functional interface like Keras. I find it mathematically pleasing.

−9

1F9 t1_jcdfbje wrote

I am concerned that moving more stuff up into Python is a mistake. It limits support for other languages, like Rust, which speak to the C++ core. Also, executing Python is slower, so it limits what the framework can do before being considered “too slow.”

Moving a bit to a high level language seems like a win, but when that inspires moving large parts of a big project to high-level languages, I’ve seen unfortunate results. It seems each piece in a high level language often imposes non-obvious costs on all the pieces.

This is nothing new. Way back in the day, Netscape gave up on Javagator, and Microsoft “reset” Windows Longhorn to rip out all the C#. Years of work by large teams thrown away.

−12

-Rizhiy- t1_jce09xx wrote

There is a reason it is called PyTorch :)

27

1F9 t1_jcfxc5b wrote

That reason is that they replaced Lua with Python as the high-level language that wrapped Torch's core, and needed to differentiate that from the original Torch. But it seems as though you believe the "py" prefix means the correct design decision for the project is to replace ever more parts of torch with Python. Perhaps you could elaborate more on your thinking there?

2

Philpax t1_jcdtj6o wrote

Agreed. It also complicates productionising the model if you're reliant on features that are only available in the Python interface. Of course, there are ways around that (like just rewriting the relevant bits), but it's still unfortunate.

7

programmerChilli t1_jcdykn2 wrote

The separation is that the "ML logic" is moving into Python, but you can still export the model to C++.
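As one illustration of getting a Python-defined model out to C++, here is a minimal sketch using TorchScript tracing, a long-standing export route. It is not necessarily the mechanism the comment has in mind for 2.0. The `TinyNet` module and the file name are hypothetical.

```python
import os
import tempfile

import torch
from torch import nn


class TinyNet(nn.Module):
    """Hypothetical toy model used only for this illustration."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)

    def forward(self, x):
        return torch.relu(self.linear(x))


model = TinyNet().eval()
example = torch.randn(1, 4)

# Record the ops executed for the example input as a TorchScript module.
traced = torch.jit.trace(model, example)

# Serialize to a file that does not need the Python class definition to load.
path = os.path.join(tempfile.mkdtemp(), "tiny_net.pt")
traced.save(path)

# Round-trip in Python to check that the saved artifact matches the original.
reloaded = torch.jit.load(path)
with torch.no_grad():
    matches = torch.allclose(model(example), reloaded(example))
```

On the C++ side, libtorch can load the same file with `torch::jit::load`. Tracing only records the path taken for the example input, so models with data-dependent control flow need a different approach.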

7

zbyte64 t1_jcdzvhh wrote

That's why all my ML is done in Objective-C /s. Production looks different for different use cases.

5

Exarctus t1_jcfmqqs wrote

I think you’ve entirely misunderstood what PyTorch is and how it functions.

PyTorch is a front-end to libtorch, which is the C++ backend. libtorch itself is a wrapper around various highly optimised libraries as well as CUDA implementations of specific ops. Virtually nothing computationally expensive is done in the Python layer.

5

duboispourlhiver t1_jcek81c wrote

IMHO this can only be answered on a case-by-case basis; there is no general rule. If anyone really understands what has been moved to Python and what the consequences are, their insight is welcome.

2