Dankmemexplorer
Dankmemexplorer t1_jb9xjl9 wrote
-stable diffusion would be fun to play with
-you can try simple computer vision tasks / finetune a model to detect your cat or something
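if you want to try the cat idea, a minimal fine-tuning sketch with torchvision could look something like this (the resnet18 choice, folder layout, and hyperparameters are all just assumptions to show the shape of it, not a tested recipe):

```python
# fine-tune a pretrained resnet18 as a cat / not-cat classifier
# assumes images laid out as data/train/cat/*.jpg and data/train/not_cat/*.jpg
import torch
from torch import nn
from torchvision import datasets, models, transforms

tfm = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
train_ds = datasets.ImageFolder("data/train", transform=tfm)
loader = torch.utils.data.DataLoader(train_ds, batch_size=32, shuffle=True)

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in model.parameters():  # freeze the pretrained backbone
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 2)  # fresh 2-class head

opt = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
for epoch in range(5):  # a few epochs is usually plenty with a frozen backbone
    for x, y in loader:
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
```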
Dankmemexplorer t1_j3mpmt0 wrote
Reply to comment by rockpooperscissors in Building an NBA game prediction model - failing to improve between epochs by vagartha
this is likely the problem
Dankmemexplorer t1_j27hf6g wrote
Reply to comment by artoftheproblem in [R] LAMBADA: Backward Chaining for Automated Reasoning in Natural Language - Google Research 2022 - Significantly outperforms Chain of Thought and Select Inference in terms of prediction accuracy and proof accuracy. by Singularian2501
that was like 4 months ago right???
Dankmemexplorer t1_j13k11f wrote
Reply to comment by farmingvillein in [R] Nonparametric Masked Language Modeling - MetaAi 2022 - NPM - 500x fewer parameters than GPT-3 while outperforming it on zero-shot tasks by Singularian2501
ain't that just the way
Dankmemexplorer t1_j123o1b wrote
Reply to [R] Nonparametric Masked Language Modeling - MetaAi 2022 - NPM - 500x fewer parameters than GPT-3 while outperforming it on zero-shot tasks by Singularian2501
time to train gpt-4 on my mom's laptop
Dankmemexplorer t1_iymsbgo wrote
Reply to comment by Deep-Station-1746 in [D] What advances need to happen for something like gpt3 to be able to run on consumer devices and laptops locally? Is it even a possibility? by aero_oliver2
my current gpu is 4 years old 😖
the state of the art has gotten a lot better since then, but not that much better
Dankmemexplorer t1_iymjsty wrote
Reply to comment by aero_oliver2 in [D] What advances need to happen for something like gpt3 to be able to run on consumer devices and laptops locally? Is it even a possibility? by aero_oliver2
running the full gpt-3 on a laptop would be like running crysis 3 on a commodore 64. you can't pare it down enough to run without ruining it
Dankmemexplorer t1_iymieav wrote
Reply to [D] What advances need to happen for something like gpt3 to be able to run on consumer devices and laptops locally? Is it even a possibility? by aero_oliver2
for a sense of scale, GPT-NeoX, a 20 billion parameter model, requires ~45GB of vram to run. gpt-3 davinci is 175 billion parameters.
unless these models can be pared down somehow (unlikely: the whole point of training models this big is that their performance scales with size), we will have to wait a decade or two for consumer electronics to catch up
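napkin math, if anyone wants it (just a sketch: assumes fp16 weights at 2 bytes per parameter and ignores activations, kv-cache, and other runtime overhead):

```python
# rough VRAM needed just to hold a model's weights in fp16
def weight_vram_gb(n_params, bytes_per_param=2):
    return n_params * bytes_per_param / 1024**3

print(weight_vram_gb(20e9))   # GPT-NeoX-20B: ~37 GB of weights, hence ~45GB with overhead
print(weight_vram_gb(175e9))  # gpt-3 davinci: ~326 GB, way beyond any consumer gpu
```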
Dankmemexplorer t1_iwqebgp wrote
Reply to comment by dat_cosmo_cat in [R] The Near Future of AI is Action-Driven by hardmaru
true, they do keep getting gooder and people are like "we solved it this year"
i think it got good enough for most things back in 2020 with gpt-3, the same way dall-e/SD is good enough for most things now
Dankmemexplorer t1_jchlw3t wrote
Reply to comment by currentscurrents in [P] nanoT5 - Inspired by Jonas Geiping's Cramming and Andrej Karpathy's nanoGPT, we fill the gap of a repository for pre-training T5-style "LLMs" under a limited budget in PyTorch by korec1234
man it's funny that 250M params is a toy now
how far we've come...