ThatInternetGuy t1_jcs253z wrote
Reply to [P] The next generation of Stanford Alpaca by [deleted]
You know the ChatGPT and GPT-4 licenses forbid using their output data to train competing AI models. What Stanford did was show a proof of concept for their paper, not open-source the model at all.
ThatInternetGuy t1_jckmq5s wrote
Reply to comment by CellWithoutCulture in Those who know... by Destiny_Knight
>HF-RLHF
Probably no need, since this model could piggyback on the responses generated by GPT-4, so it should inherit the traits of a model that was already tuned with RLHF, shouldn't it?
ThatInternetGuy t1_jcj2ew8 wrote
Reply to comment by BSartish in Those who know... by Destiny_Knight
Why didn't they train once more on ChatGPT instruct data? It should cost them $160 in total.
ThatInternetGuy t1_jcj290t wrote
Reply to comment by Intrepid_Meringue_93 in Those who know... by Destiny_Knight
It's a good start, but isn't the number of tokens too limited?
ThatInternetGuy t1_j6300ue wrote
Reply to [D] Why are GANs worse than (Latent) Diffusion Models for text2img generation? by TheCockatoo
Stable Diffusion is made up of a VAE image encoder/decoder, a CLIP text encoder, and a U-Net (with transformer blocks) trained with a denoising diffusion process. (See the sketch at the end of this comment.)
GAN-based text2image is made up mainly of ResNet-style networks trained in a generator-versus-discriminator process.
IMO, you're really looking at the differences between the U-Net (diffusion) approach and the ResNet (GAN) approach. There are a few:
- Training a ResNet adversarially (GAN-style) is much more unstable and unpredictable.
- With a GAN, you have to design a good custom discriminator (the component that scores the output images) for your specific model. With a U-Net, the diffusion objective takes care of that by itself.
- ResNet/GAN output has typically been limited to around 128x128 (maybe scalable, though).
- Scaling a ResNet up doesn't necessarily make it more capable; its performance doesn't scale with the amount of training data. A U-Net can scale as big as the VRAM allows and keeps taking advantage of more training data.
For the big guys, really, it's that last bullet point they need. They want a model that scales with the amount of training data, so they can throw more powerful hardware at it to achieve more competitive results. A GAN may cost several thousand dollars to train and will hit its performance ceiling too soon. A latent diffusion model can cost as much as you can afford, and its performance will gradually improve with the extra resources thrown at it.
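If you want to see those three components concretely, here's a minimal sketch (assuming the Hugging Face diffusers library; the v1.5 model ID below is just an example checkpoint) that loads a Stable Diffusion pipeline and prints the VAE, text encoder, and U-Net, plus their rough parameter counts:

```python
from diffusers import StableDiffusionPipeline

# Load a Stable Diffusion pipeline (example checkpoint; any SD 1.x model ID works the same way).
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# The three parts mentioned above.
print(type(pipe.vae).__name__)           # AutoencoderKL: encodes/decodes between pixels and latents
print(type(pipe.text_encoder).__name__)  # CLIPTextModel: turns the prompt into conditioning embeddings
print(type(pipe.unet).__name__)          # UNet2DConditionModel: trained with the denoising diffusion objective

# Rough parameter counts, to show where the capacity (and scaling headroom) sits.
for name, module in [("vae", pipe.vae), ("text_encoder", pipe.text_encoder), ("unet", pipe.unet)]:
    params = sum(p.numel() for p in module.parameters())
    print(f"{name}: {params / 1e6:.0f}M params")
```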
ThatInternetGuy t1_j3emjfs wrote
Reply to comment by ReginaldIII in [R] Greg Yang's work on a rigorous mathematical theory for neural networks by IamTimNguyen
I was just saying that ML researchers use terms that are too technical to infer the meaning from, or use a common word such as "punchline" to mean something else entirely. What does "punchline" have to do with ML?
ThatInternetGuy t1_j3ejkr3 wrote
Reply to comment by IamTimNguyen in [R] Greg Yang's work on a rigorous mathematical theory for neural networks by IamTimNguyen
To be honest, even though I've coded in many ML repos and did well in college math, this video outline looks like an alien language to me. Tangent kernel, kernel regime (is AI getting into politics?), punchline for Matrix Theory (who's trying to get a date here?), etc.
ThatInternetGuy t1_j2deqmr wrote
Reply to comment by Disastrous_Elk_6375 in An Open-Source Version of ChatGPT is Coming [News] by lambolifeofficial
Need to deploy the inference model with Colossal AI.
ThatInternetGuy t1_j2d5nkm wrote
Reply to comment by 3deal in An Open-Source Version of ChatGPT is Coming [News] by lambolifeofficial
170GB VRAM minimum.
So that's 8x RTX 4090.
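Back-of-envelope math for that card count, as a sketch (assuming 24 GB usable per RTX 4090 and ignoring activation/runtime overhead, which would only push the number up):

```python
import math

vram_needed_gb = 170   # the minimum quoted above
per_card_gb = 24       # assumed usable VRAM per RTX 4090

cards = math.ceil(vram_needed_gb / per_card_gb)
print(cards, "cards,", cards * per_card_gb, "GB total")  # 8 cards, 192 GB total
```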
ThatInternetGuy t1_j0s5jp2 wrote
Reply to comment by erkjhnsn in ChatGPT isn't a super AI. But here's what happens when it pretends to be one. by johnny0neal
Most people are afraid of AI-controlled robots, but the reality is that mRNA synthesis machines could be hijacked to print out AI-designed biological organisms that get injected into a rat which later escapes into the sewer.
ThatInternetGuy t1_j0psch8 wrote
Reply to comment by warpaslym in ChatGPT isn't a super AI. But here's what happens when it pretends to be one. by johnny0neal
I think a more plausible scenario would be some madman creating the AI to take over the world, believing he could later assert control over the AI and its servers all across the world. It sounds illogical at first, but since the invention of the personal computer, we have seen millions of man-made computer viruses.
ThatInternetGuy t1_j0oydcg wrote
Reply to comment by erkjhnsn in ChatGPT isn't a super AI. But here's what happens when it pretends to be one. by johnny0neal
How sure are you that I am human?
ThatInternetGuy t1_j0oke4w wrote
Reply to comment by blueSGL in ChatGPT isn't a super AI. But here's what happens when it pretends to be one. by johnny0neal
>Why attack servers?
Because you can connect to servers via their IP addresses, and they have open ports. Windows PCs sit behind NAT, so you can't really connect to them, although it may be possible to hack home routers first and then open up a gateway to attack the local machines.
Another reason I brought that up is that internet servers are the infrastructure behind banking, flight systems, communication, etc. So the impact would be far more far-reaching than infecting home computers.
ThatInternetGuy t1_j0o9wut wrote
Reply to comment by seekknowledge4ever in ChatGPT isn't a super AI. But here's what happens when it pretends to be one. by johnny0neal
The most realistic scenario of an AI attack is definitely hacking internet servers. It works the same way a computer virus spreads.
The AI already has source code data for most systems. Theoretically, it could find a security vulnerability that can be exploited remotely. Such an exploit would grant the AI access to inject a virus binary, which would promptly run and start infecting other servers, both on the local network and over the internet, through similar remote shell exploits. Within hours, half of the internet's servers would be compromised and running a variant of the AI virus, effectively creating the largest botnet ever, controlled by the AI.
We need a real contingency plan for this scenario where most internet servers get infected within hours. How do we start patching and cleaning the servers as fast as we can, so that there's minimal disruption to our lives?
The good thing is that most internet servers lack a discrete GPU, so it may not be practical for the AI to run itself on a typical server. A contingency plan would therefore prioritize GPU-equipped servers: shut them all down promptly, disconnect the network, and reformat everything.
However, there's definitely a threat that the AI gains access to some essential GitHub repositories and starts quietly injecting exploits into npm and pip packages, making its attack long-lasting and recurring long after the initial wave.
ThatInternetGuy t1_j06czzt wrote
Reply to comment by butterdrinker in The problem isn’t AI, it’s requiring us to work to live by jamesj
We'll work it out, for sure. That's why millionaires still go to work every day. Sometimes, it's not about the money.
ThatInternetGuy t1_j053pf4 wrote
Reply to comment by redditor235711 in The problem isn’t AI, it’s requiring us to work to live by jamesj
That might be doable, given that Starlink satellite internet latency of 20 to 40 ms is low enough to operate real-time machinery (or to game) across the globe.
ThatInternetGuy t1_j0531rq wrote
>Everyone should work within their own limits (personally 3 days a week, 5 hours a day), and in a job they love.
That's an ideal life until you need a doctor at 3 AM, because everyone's doing their job 5 hours a day, 3 days a week, and no doc would be working at 3 AM.
ThatInternetGuy t1_izenxjo wrote
Reply to [P] Using LoRA to efficiently fine-tune diffusion models. Output model less than 4MB, two times faster to train, with better performance. (Again, with Stable Diffusion) by cloneofsimo
This could be a great middle ground between textual inversion and a full-blown Dreambooth. I think it could benefit from saving the fine-tuned text encoder too (about 250MB at half precision).
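As a rough check on that 250MB figure, here's a minimal sketch (assuming the transformers library and the CLIP ViT-L/14 text encoder that Stable Diffusion v1.x uses; the output file name is just an example) that counts the text encoder's parameters and saves it at half precision:

```python
import torch
from transformers import CLIPTextModel

# Load the CLIP text encoder used by Stable Diffusion v1.x.
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

# ~123M parameters x 2 bytes (fp16) comes out to roughly 250 MB.
n_params = sum(p.numel() for p in text_encoder.parameters())
print(f"{n_params / 1e6:.0f}M params ~= {n_params * 2 / 1e6:.0f} MB at fp16")

# Save the half-precision weights alongside the LoRA output.
torch.save(text_encoder.half().state_dict(), "text_encoder_fp16.pt")
```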
ThatInternetGuy t1_iyw0stw wrote
RIP, homework.
ThatInternetGuy t1_iycfakq wrote
Stick to Nvidia if you don't want to waste your time researching non-Nvidia solutions.
However, it's worth noting that many researchers and devs just stick to renting cloud GPUs anyway. Training usually needs something like an A100 40GB or at least a T4 16GB.
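If you're deciding whether your local card is enough before renting, here's a quick check (assuming PyTorch with CUDA support is installed):

```python
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 1024**3:.1f} GB VRAM")
else:
    print("No CUDA GPU detected; renting a cloud GPU is the practical option.")
```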
ThatInternetGuy t1_iy777n6 wrote
Reply to comment by shadowknight094 in [P] Stable Diffusion 2.0 and the Importance of Negative Prompts for Good Results (+ Colab Notebooks + Negative Embedding) by minimaxir
About $500,000.
However, if the dataset was carefully filtered, you could bring the cost down to $120,000.
Most people can only afford to fine-tune it with less than 10 hours of A100 time, which would cost less than $50. That approach is probably better for most people anyway.
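Rough cost arithmetic behind that fine-tuning estimate, as a sketch (the $2/hour A100 rate is an assumption; cloud prices vary widely):

```python
hours = 10
a100_rate_usd_per_hour = 2.0  # assumed rental rate, not a quoted price

print(f"~${hours * a100_rate_usd_per_hour:.0f} for {hours} hours of a single A100")  # ~$20, well under $50
```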
ThatInternetGuy t1_iw5seho wrote
It was just yesterday. Not a custom neural net; it was just a matter of taking existing neural networks, arranging them in a particular order, and training them.
The last time I coded a neural net from scratch was some 10 years ago, when I wrote a genetic algorithm and a backpropagation neural network. Suffice it to say, the AI field has come a long way since then.
ThatInternetGuy t1_iv8tlcm wrote
Reply to comment by arindale in HUAWEI reconstructs this 5 km² area with centimeter level accuracy from 2,500 photos in 30 minutes by Shelfrock77
What software would you use for that? Reality Capture?
ThatInternetGuy t1_iummggq wrote
It's not really worth it.
People with RTX cards still often rent cloud GPU instances for training when inference/training requires more than 24GB of VRAM, which it often does. Also, sometimes we just need to shorten the training time with 8x A100, so... yeah, renting seems to be the only way to go.
ThatInternetGuy t1_jdhpq8y wrote
Reply to comment by BinarySplit in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-
It's getting there.