
sayoonarachu t1_j0yn75o wrote

I've only started learning DL a month ago, so I've mostly been doing simple ANNs. But inferencing larger-parameter NLP models, GANs, diffusion models, etc. is fine. It's no desktop 3090 or enterprise-grade GPU, but for a laptop it's by far the best on the market. For example, the largest Parquet file I've cleaned in pandas was about 7 million rows and about 10 GB of just text. It can run queries through it in a few seconds.
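
A minimal sketch of that kind of query, assuming a hypothetical `prompts.parquet` file with a `prompt` text column:

```python
import pandas as pd

# Load the (hypothetical) ~10 GB Parquet file of scraped text.
df = pd.read_parquet("prompts.parquet")

# A typical query: filter millions of rows by a substring match.
matches = df[df["prompt"].str.contains("landscape", na=False)]
print(len(matches))
```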

Guess it depends on what kind of data science or DL you're looking to do. The 3080 probably won't be able to fine-tune something like the BLOOM model, but it can fine-tune Stable Diffusion models with enough optimizations.
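
For what it's worth, a rough sketch of the sort of memory optimizations that make Stable Diffusion fine-tuning feasible on a laptop GPU; the checkpoint id and learning rate are illustrative, and this isn't a full training loop:

```python
import bitsandbytes as bnb
from diffusers import UNet2DConditionModel

# Load just the UNet (the part usually fine-tuned) from a SD checkpoint.
unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet"
)

# Recompute activations during backprop instead of storing them all.
unet.enable_gradient_checkpointing()
unet.to("cuda")

# 8-bit AdamW keeps optimizer state in 8 bits instead of fp32 --
# a big saving, since Adam normally stores two fp32 tensors per weight.
optimizer = bnb.optim.AdamW8bit(unet.parameters(), lr=1e-5)
```

Mixed precision (e.g. via `accelerate`) and attention slicing cut memory further.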

For modeling in Blender or procedural generation in something like Houdini, I haven't had issues. I've made procedurally generated 20 km height maps in Houdini to export to Unreal Engine, and it was not a problem.


macORnvidia OP t1_j0z24ie wrote

> For example, the largest Parquet file I've cleaned in pandas was about 7 million rows and about 10 GB of just text. It can run queries through it in a few seconds.

Using RAPIDS? Like cuDF?
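
I mean something like this, where the same pandas-style calls run on the GPU (file and column names are just a guess):

```python
import cudf

# cuDF mirrors much of the pandas API, so filtering and dedup
# work the same way but execute on the GPU.
gdf = cudf.read_parquet("scraped_prompts.parquet")
gdf = gdf.drop_duplicates(subset="prompt")
gdf = gdf[gdf["prompt"].str.len() > 50]
```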


sayoonarachu t1_j0zlw4i wrote

No. I was just using pandas (CPU) for quick regex work, removing and replacing text rows. It was just for a hobby project. The data was scraped from the Midjourney and Stable Diffusion Discords, so there were millions of rows of duplicate and poor-quality prompts, which I had pandas delete. In the end, the number of unique rows with more than 50 characters came to about 700k, which was then used to train GPT-Neo 125M.
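
In pandas terms the cleaning pass looked roughly like this; the column name and the regex are hypothetical, since the actual scrape layout isn't shown:

```python
import pandas as pd

df = pd.read_parquet("scraped_prompts.parquet")  # hypothetical file name

# Strip Discord user mentions with a regex, then trim whitespace.
df["prompt"] = (
    df["prompt"]
    .str.replace(r"<@!?\d+>", "", regex=True)
    .str.strip()
)

# Drop duplicate prompts and keep only rows longer than 50 characters.
df = df.drop_duplicates(subset="prompt")
df = df[df["prompt"].str.len() > 50]

df.to_parquet("cleaned_prompts.parquet")  # ~700k rows used for GPT-Neo training
```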

I didn't know about cudf. Thanks 😅
