sayoonarachu t1_j4n2w5j wrote
Quite a bit, and even more if you use optimized frameworks and packages like VoltaML, PyTorch Lightning, ColossalAI, bitsandbytes, xformers, etc. Those are just the ones I'm familiar with.
Some libraries also let you balance the load between CPU, GPU, and system memory, though obviously that comes at a cost in speed.
As a general rule, the more parameters a model has, the higher the memory cost. So unless you're planning to train from scratch or fine-tune something with billions of parameters, you'll be fine.
It's going to take playing around with hyperparameters, switching between 32-, 16-, and 8-bit quantization with PyTorch or other Python packages, testing offloading weights between GPU and CPU, etc., to get a feel for what you can and can't do.
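For instance, here's a rough sketch (not a prescription) of loading a Hugging Face causal LM in 8-bit with automatic GPU/CPU offloading via bitsandbytes and accelerate; the model name is just a placeholder, swap in whatever fits your VRAM:

```python
# Sketch: 8-bit quantization plus automatic device offloading with the
# transformers + bitsandbytes + accelerate stack (assumed to be installed).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/gpt-neo-1.3B"  # placeholder model

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    load_in_8bit=True,   # quantize weights to 8-bit via bitsandbytes
    device_map="auto",   # let accelerate spread layers across GPU/CPU as needed
)

inputs = tokenizer("Hello, world", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```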
Also, if I remember correctly, PyTorch 2.0 will benefit the consumer NVIDIA 40 series to some extent once it's more mature.
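The headline feature there is torch.compile; a minimal sketch on a toy model, assuming a CUDA GPU is available:

```python
# Sketch: PyTorch 2.0's torch.compile on a toy model.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10)).cuda()
compiled = torch.compile(model)  # compiles via TorchDynamo/TorchInductor

x = torch.randn(32, 128, device="cuda")
y = compiled(x)  # first call triggers compilation; later calls reuse the optimized graph
```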
Edit: P.S. Supposedly the new Forward-Forward algorithm can be "helpful" for large models, since there's no backpropagation.
sayoonarachu t1_j3z22kh wrote
Other than Tortoise TTS as mentioned above, it's probably best to watch the Microsoft GitHub page. They have a section for VALL-E, and they do tend to release the source code for their other models.
It might take a while, as the paper was published only about a week ago and still says "work in progress."
https://github.com/microsoft/unilm/blob/master/valle/README.md
sayoonarachu t1_j2xypar wrote
Reply to comment by groman434 in [Discussion] If ML is based on data generated by humans, can it truly outperform humans? by groman434
Generally, it's a good idea to split your data into training, validation, and test sets, something like 80/10/10 or 80/20 depending on how much data you're feeding the neural network (NN).
So 80% of the data, randomly selected, would be used to train the NN, and with, say, every epoch or batch, it would be evaluated on the validation set to check what it has "learned."
Once you're happy with the model's performance, you can use the test set to see how well the model handles "new" data, in the sense that the 10% you set aside for testing was never shown to the model during training.
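As a rough sketch of the 80/10/10 idea with scikit-learn on toy data (nothing here is tied to a particular dataset):

```python
# Sketch: 80/10/10 train/validation/test split on random toy data.
import numpy as np
from sklearn.model_selection import train_test_split

X, y = np.random.rand(1000, 20), np.random.randint(0, 2, 1000)

# Carve off 20% as a temporary holdout, then split that holdout in half.
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.2, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=42)

print(len(X_train), len(X_val), len(X_test))  # roughly 800 / 100 / 100
```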
Of course, there are many, many other methods to minimize loss, improve performance, etc. But even if your network were "perfect," if the person building it didn't spend the time to "clean" the data, it will always end up with some higher degree of error.
Or something like that. I'm just a fledgling when it comes to deep learning.
sayoonarachu t1_j2dytbo wrote
Not sure if it's Tortoise TTS, but you should take a look at their examples.
sayoonarachu t1_j1408am wrote
If you're savvy enough, you can technically run BLOOM 176B. But as others stated, it'll take forever to be usable, i.e., 30 minutes for 10 tokens.
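A hedged sketch of what that usually looks like: let Hugging Face accelerate spill the weights over to CPU RAM and disk (the offload path is just an example, and generation at this scale is painfully slow):

```python
# Sketch: running BLOOM 176B with weights offloaded to CPU RAM and disk.
# Expect an enormous download and very slow generation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom")
model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom",
    device_map="auto",               # spread layers across GPU, CPU, and disk
    offload_folder="bloom_offload",  # example path for weights that don't fit in RAM
    torch_dtype=torch.bfloat16,
)

inputs = tokenizer("The meaning of life is", return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=10)[0]))
```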
sayoonarachu t1_j0zlw4i wrote
Reply to comment by macORnvidia in laptop for Data Science and Scientific Computing: proart vs legion 7i vs thinkpad p16/p1-gen5 by macORnvidia
No. I was just using pandas (CPU) for quick regex and for removing and replacing rows of text; it was just for a hobby project. The data was scraped from the Midjourney and Stable Diffusion Discords, so there were millions of rows of duplicate and poor-quality prompts, which I had pandas delete. In the end, about 700k unique rows with more than 50 characters were left, and those were used to train GPT-Neo 125M.
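Roughly, the cleanup looked something like this in plain pandas (the file and column names here are made up for illustration):

```python
# Sketch: de-duplicating scraped prompts and keeping only longer ones in pandas.
import pandas as pd

df = pd.read_csv("scraped_prompts.csv")    # hypothetical scraped dump

df["prompt"] = df["prompt"].str.strip()
df = df.drop_duplicates(subset="prompt")   # drop duplicate prompts
df = df[df["prompt"].str.len() > 50]       # keep prompts longer than 50 characters

df.to_csv("cleaned_prompts.csv", index=False)
print(f"{len(df):,} unique prompts kept")
```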
I didn't know about cuDF. Thanks 😅
sayoonarachu t1_j0yn75o wrote
Reply to comment by macORnvidia in laptop for Data Science and Scientific Computing: proart vs legion 7i vs thinkpad p16/p1-gen5 by macORnvidia
I only started learning DL a month ago, so I've mostly been doing simple ANNs. But running inference on larger-parameter NLP models, GANs, diffusion models, etc. is fine. It's no desktop 3090 or enterprise-grade GPU, but for a laptop it's by far the best on the market. The largest Parquet file I've cleaned in pandas was about 7 million rows and about 10 GB of just text, and it can run queries through it in a few seconds.
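In case it helps, a "query" like that in pandas is just something along these lines (the file and column names are made up):

```python
# Sketch: loading a large Parquet file and filtering it in pandas.
import pandas as pd

df = pd.read_parquet("prompts.parquet")  # hypothetical multi-million-row text dump
hits = df[df["prompt"].str.contains("landscape", case=False, na=False)]
print(f"{len(hits):,} matching rows out of {len(df):,}")
```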
Guess it depends on what kind of data science or DL you're looking to do. The 3080 probably won't be able to fine-tune something like the BLOOM model, but it can fine-tune Stable Diffusion models with enough optimization.
For modeling in Blender or procedural generation in something like Houdini, I haven't had issues. I've made procedurally generated 20 km heightmaps in Houdini and exported them to Unreal Engine without a problem.
sayoonarachu t1_j0wjj7v wrote
Reply to laptop for Data Science and Scientific Computing: proart vs legion 7i vs thinkpad p16/p1-gen5 by macORnvidia
You could probably look at the 11th-gen Legion 7i, which is cheaper than the new 12th-gen ones. They're not 3080 Ti machines, but the difference between the 3080 and the 3080 Ti, last I checked, was minimal, something like a 5% difference in performance.
I personally have the 11th-gen version after comparing a bunch of gaming laptops, and I use it for programming in Unreal Engine, deep learning, playing with Stable Diffusion, etc. Main pro? Like you said, the looks. I love the simple, minimal, non-gaming-laptop appeal of the Legions. 😅
Also, you'd probably want to research whether all the laptops you've listed can actually run the 3080 at its max rating of 150 W (previously known as Max-Q, I believe). Some OEMs won't advertise it. The Legion 7i 3080s can, though.
sayoonarachu t1_jcgjosz wrote
Reply to comment by CyberDainz in [N] PyTorch 2.0: Our next generation release that is faster, more Pythonic and Dynamic as ever by [deleted]
Doesn't work on Windows in 2.1 dev either, fyi.