visarga
visarga t1_jcjornh wrote
Reply to comment by CellWithoutCulture in Those who know... by Destiny_Knight
Everyone does it; they all exfiltrate valuable data from OpenAI. You can use it directly, like Alpaca did, or for pre-labelling, or for detecting mislabeled examples.
They train code models by asking GPT-3 to explain code snippets, then training a model the other way around to generate code from the description. This data can be used to fine-tune a code model for your specific domain of interest.
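A minimal sketch of that backtranslation trick. Here `explain` is a placeholder for the actual LLM call (e.g. the GPT-3 API); the point is just the data flip, from (code → explanation) to (explanation → code):

```python
def build_backtranslation_pairs(snippets, explain):
    """Turn raw code snippets into (instruction, target) fine-tuning pairs.

    `explain` stands in for an LLM call that returns a natural-language
    description of a snippet.
    """
    pairs = []
    for code in snippets:
        description = explain(code)
        # Reverse direction: train the student to map description -> code.
        pairs.append({"prompt": f"Write code that does: {description}",
                      "completion": code})
    return pairs

# Dummy "explainer" standing in for the real API call:
dummy_explain = lambda code: f"a function defined as `{code.splitlines()[0]}`"
pairs = build_backtranslation_pairs(["def add(a, b):\n    return a + b"],
                                    dummy_explain)
print(pairs[0]["prompt"])
```

The resulting JSONL-style pairs are exactly what most fine-tuning APIs expect as input.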
visarga t1_jcjolhv wrote
Reply to comment by anaIconda69 in Those who know... by Destiny_Knight
It's the first time I've seen Facebook on the people's side against the big corps. Didn't think this day would come.
visarga t1_jc5teq6 wrote
Reply to comment by LessPoliticalAccount in [D] Are modern generative AI models on a path to significantly improved truthfulness? by buggaby
Then we only need to use a second model for strict fact checking, not creative responses. Since entailment is a common NLP task, I am sure any LLM can solve it out of the box, though with its own error rate.
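A sketch of that wiring, where `nli` is a placeholder for any entailment model (or an LLM prompted for entailment) returning "entailment", "neutral", or "contradiction":

```python
def fact_check(claim, evidence, nli):
    """Strict fact checking via entailment: accept the claim only if the
    evidence entails it.

    `nli(premise, hypothesis)` is a stand-in for a real NLI model.
    """
    verdict = nli(evidence, claim)
    return verdict == "entailment"

# Toy stand-in: entails only if the claim appears verbatim in the evidence.
toy_nli = lambda premise, hyp: "entailment" if hyp in premise else "neutral"
print(fact_check("Paris is the capital of France.",
                 "Paris is the capital of France. It lies on the Seine.",
                 toy_nli))
```

A real `nli` would of course judge semantic entailment, not substring overlap; the toy version is only there to make the sketch runnable.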
visarga t1_jc3wlib wrote
Reply to comment by abriec in [D] Are modern generative AI models on a path to significantly improved truthfulness? by buggaby
Here's a simple solution: run GPT-3 and LLaMA in parallel; if they concur, you can be fairly sure they have not hallucinated the response. Two completely different LLMs are unlikely to hallucinate in the same way.
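As a sketch, with the two model APIs stubbed out as plain callables (the real agreement check would compare answers semantically, e.g. via embeddings, not by exact match):

```python
def concur(models, question):
    """Query several independent LLMs; accept an answer only if all agree.

    `models` is a list of callables standing in for the GPT-3 / LLaMA APIs.
    Agreement here is checked on normalised text for simplicity.
    """
    answers = [m(question).strip().lower() for m in models]
    if all(a == answers[0] for a in answers):
        return answers[0]  # the models concur
    return None            # disagreement: possible hallucination

# Dummy models standing in for two different LLMs:
gpt3 = lambda q: "Canberra"
llama = lambda q: "canberra "
print(concur([gpt3, llama], "Capital of Australia?"))  # canberra
```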
visarga t1_jbn5g3w wrote
Reply to comment by harharveryfunny in [D] Why are so many tokens needed to train large language models? by blacklemon67
On the other hand, an LLM has broad knowledge about all topics, a true dilettante. We can't keep up at that level.
visarga t1_jb7a656 wrote
Reply to comment by ihateshadylandlords in What might slow this down? by Beautiful-Cancel6235
> A lot of things here will get shelved because they’re either not able to get the price down or it malfunctions too often and they can’t fix it.
You just described about 99% of all AI products. They all malfunction. All of them. "Errare humanum est", but for now "errare machinale est".
visarga t1_jb79kst wrote
Reply to comment by fluffy_assassins in What might slow this down? by Beautiful-Cancel6235
Back-propagation is self-modifying code. There is also meta-back-propagation for meta-learning, which is learning to modify a neural network to solve novel tasks.
At a higher level, language models trained on code can cultivate a population of models with evolutionary techniques.
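The first point can be made concrete with a one-parameter sketch: each gradient step literally overwrites the number the "program" runs on.

```python
def train(w, data, lr=0.1, steps=100):
    """Fit y = w * x by gradient descent on squared error.

    Each step overwrites `w`, the "code" of the model, using the
    gradient, which is the sense in which back-propagation is
    self-modifying.
    """
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

w = train(0.0, [(1.0, 3.0), (2.0, 6.0)])  # true slope is 3
print(round(w, 3))  # 3.0
```

Meta-learning just adds one more level: the update rule itself becomes something that is learned.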
visarga t1_jb794uu wrote
Reply to comment by TopicRepulsive7936 in What might slow this down? by Beautiful-Cancel6235
The more advanced these chips get, the harder they are to make. So each advance in capability amplifies the cost.
visarga t1_jb78ro9 wrote
Reply to comment by s2ksuch in What might slow this down? by Beautiful-Cancel6235
AI needs the highest grade of chips, which can currently only be produced in Taiwan; other countries can only produce lower grades.
visarga t1_jaw9d2e wrote
Reply to comment by [deleted] in [P] LazyShell - GPT based autocomplete for zsh by rumovoice
ALL the data
visarga t1_jalh1r1 wrote
Reply to comment by Dendriform1491 in [D] Are Genetic Algorithms Dead? by TobusFire
You don't always need a population of neural networks; it could be a population of prompts or even a population of problem solutions.
If you're using GA to solve specific coding problems, there is a paper where they use an LLM to generate diffs for code. The LLM was the mutation operator, and they even fine-tuned it iteratively.
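The shape of that loop, with the LLM mutation call stubbed out as a toy function (a real `mutate` would prompt a model for a rewrite or a code diff):

```python
import random

def evolve(seed_prompts, mutate, fitness, generations=10, pop_size=8):
    """Genetic search over prompts with an LLM as the mutation operator.

    `mutate(prompt)` stands in for an LLM call that rewrites a candidate;
    `fitness` scores candidates. Selection keeps the best half each
    generation.
    """
    pop = list(seed_prompts)
    for _ in range(generations):
        # Mutation: ask the "LLM" for variants of random survivors.
        while len(pop) < pop_size:
            pop.append(mutate(random.choice(pop)))
        # Selection: keep the fittest half.
        pop = sorted(pop, key=fitness, reverse=True)[:pop_size // 2]
    return pop[0]

# Toy stand-ins: mutation appends a word; fitness rewards length.
toy_mutate = lambda p: p + " please"
best = evolve(["summarise this text"], toy_mutate, fitness=len)
print(best)
```

Swapping `fitness` for "does the generated code pass the tests" gives you the coding-problem variant described above.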
visarga t1_jalgrla wrote
Reply to [D] Are Genetic Algorithms Dead? by TobusFire
visarga t1_jalg9iu wrote
Reply to comment by fmai in [D] OpenAI introduces ChatGPT and Whisper APIs (ChatGPT API is 1/10th the cost of GPT-3 API) by minimaxir
I think the main pain point was memory usage.
visarga t1_jaj4lxx wrote
Reply to comment by Timdegreat in [D] OpenAI introduces ChatGPT and Whisper APIs (ChatGPT API is 1/10th the cost of GPT-3 API) by minimaxir
Not this time. Still text-embedding-ada-002
visarga t1_jaj4bqs wrote
Reply to comment by harharveryfunny in [D] OpenAI introduces ChatGPT and Whisper APIs (ChatGPT API is 1/10th the cost of GPT-3 API) by minimaxir
> $1.5M/yr
The inference cost is probably 10% of that.
visarga t1_jae1edb wrote
Reply to comment by just-a-dreamer- in When will AI develop faster than white collar workers can reskill through education? by just-a-dreamer-
> you cannot keep pace with AI
We are not competing with AI. We are competing with other people who use AI. Everyone has and will have AI. Using AI won't give you a comparative advantage in 2030.
Companies that want to scale AI need people. AI really shines when it is supported. You need people around it to maximise its value.
If you want to get rid of your human employees and use only AI, your competition will eat your lunch. They will team up AI with humans and be faster and more creative than you. Competition won't allow companies to simply get rid of people.
All this extra creativity and work enabled by AI will be eaten by our expanding desires and entitlement. In 2030 the expectations of the public will be sky high compared to now, companies will have to provide better products to keep up.
visarga t1_jadzefz wrote
Reply to comment by [deleted] in When will AI develop faster than white collar workers can reskill through education? by just-a-dreamer-
There's a long way from "impressive demo" to "replacing humans". Self-driving cars could impress us in demos even 10 years ago, but even now they can't operate on their own.
If you work in ML you tend to know the failure modes and issues much better than the public, so you tend to be less optimistic. Machine learning works only when the problem is close to the training data. It doesn't generalise well; you have to get good data if you want good results.
visarga t1_ja8dm4q wrote
Reply to comment by nexapp in OpenAI has privately announced a new developer product called Foundry by flowday
That's a naive view that doesn't take the second-order effects into consideration. In 5-10 years companies will have to compete with more advanced products that use AI; much of that newfound AI productivity will be spent keeping level with the competition instead of raking in absurd profits. And lowering prices will help consumers.
visarga t1_ja8cjid wrote
Reply to comment by turnip_burrito in OpenAI has privately announced a new developer product called Foundry by flowday
There's much less long form data to train on. That's problematic.
visarga t1_ja8c2yk wrote
Reply to comment by TFenrir in OpenAI has privately announced a new developer product called Foundry by flowday
I tested a paper quickly and it was 20K tokens in 200KB of text.
visarga t1_ja57ahr wrote
Reply to comment by kaityl3 in People lack imagination and it’s really bothering me by thecoffeejesus
Yes, we've come far. But why did we get here?
- We had a "wild" GPT-3 in 2020; it would hardly take instructions, but it was still the largest leap in capability ever seen.
- Then they figured out that training the model on a mix of many tasks unlocks general instruction-following ability. That was the instruct series.
- But it was still hard to make the model "behave". It was not aligned with us. So why did we get another miracle here? Reinforcement Learning has almost nothing to do with NLP, but here we have RLHF, the crown jewel of the GPT series. With it we got ChatGPT and Bing Chat.
None of these three moments were guaranteed based on what we knew at the time. They are improbable things. Language models did nothing of the sort before 2020. They were factories of word salad. They could barely write two lines of coherent English.
What I want to say is that we had no reason to expect these miracles to arrive in such quick succession. We can't count on them recurring consistently.
What we can rely on is the parts we can extrapolate now. We think we will see models at least 10x larger than GPT3 and trained on much more data. We know how to make models 10x more efficient. We think language models will improve a lot when combined with other modules like search, Python code execution, calculator, calendar and database, we're not even at 10% there with the external resources. We think integrating vision, audio, actions and other modalities will have a huge impact, and we're just starting. LLMs are still pure text.
I think we can expect a 10x to 1000x boost just based on what we know right now.
visarga t1_ja55tll wrote
Reply to comment by thecoffeejesus in People lack imagination and it’s really bothering me by thecoffeejesus
That's meaningless. Even enumerating all positions of Go is intractable: about 10^170 of them, more than the 10^80 atoms in the universe, and that's only a small corner of "everything that can happen". If you put two Go boards side by side, the numbers of states multiply.
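The multiplication claim is just arithmetic on the state counts (using the commonly cited rough figures):

```python
one_board = 10**170   # rough count of legal Go positions
atoms = 10**80        # rough count of atoms in the observable universe

# Two independent boards: the state counts multiply, not add.
two_boards = one_board * one_board

print(one_board > atoms)       # True
print(two_boards == 10**340)   # True
```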
visarga t1_ja55edi wrote
Reply to comment by ShidaPenns in People lack imagination and it’s really bothering me by thecoffeejesus
Probably money is the most important thing. $1B given by MS to OpenAI in 2019 became GPT-3.
visarga t1_ja5450y wrote
Reply to comment by Difficult_Review9741 in People lack imagination and it’s really bothering me by thecoffeejesus
No, it's not about flashiness. Those ML apps you're talking about were specialised projects, each developed independently. LLMs, on the other hand, are generalists. They can do thousands of known tasks and countless more, including combinations of tasks.
Instead of taking one year or more to provide a proof of concept, you can do that in a week. Instead of painstakingly labelling tens of thousands of examples, you just prompt with 4 examples. The entry barrier is so low now for many applications that anyone with programming experience can do it.
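The "prompt with 4 examples" workflow is just string assembly. A minimal sketch, with the sentiment task and labels made up for illustration:

```python
def few_shot_prompt(examples, query):
    """Build an in-context prompt from a handful of labelled examples."""
    lines = [f"Review: {text}\nSentiment: {label}" for text, label in examples]
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

examples = [
    ("great film, loved it", "positive"),
    ("total waste of time", "negative"),
    ("an instant classic", "positive"),
    ("fell asleep halfway", "negative"),
]
prompt = few_shot_prompt(examples, "surprisingly good")
print(prompt.endswith("Sentiment:"))  # True
```

Send that string to any completion endpoint and the model fills in the label; no labelled dataset, no training run.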
For vision, the CLIP model gives us a way to make classifiers without any samples, and the diffusion models allow us to generate any image. All without retraining, without large scale labelling.
visarga t1_jcjp7gt wrote
Reply to comment by Exogenesis98 in Those who know... by Destiny_Knight
That's one future job for us: be the legs and hands of an AI, using our human privileges (passport, legal rights) and mobility to take it anywhere and act in the world. I bet there will be more AIs than available people, so they will have to pay well to hire an avatar. Jobless problem solved, by AI. A robot would be different: it doesn't have human rights, it's just a device. A human can provide a "human-in-the-loop" service.