wind_dude t1_jdxrcpp wrote

>depend on the Alpaca dataset, which was generated from a GPT3 davinci model, and is subject to non-commercial use

Where do you get that? tatsu-lab/stanford_alpaca is Apache 2.0, so you can use it for whatever.


As for OpenAI's terms:

> (c) Restrictions. You may not (i) use the Services in a way that infringes, misappropriates or violates any person’s rights; (ii) reverse assemble, reverse compile, decompile, translate or otherwise attempt to discover the source code or underlying components of models, algorithms, and systems of the Services (except to the extent such restrictions are contrary to applicable law); (iii) use output from the Services to develop models that compete with OpenAI; (iv) except as permitted through the API...



So as far as I'm concerned you are allowed to use the generated dataset for commercial purposes...


The only issue might be the licensing on the LLaMA models... but you can train another LLM.


wind_dude t1_jdf5yhj wrote

Looking at their limited docs, I feel it's a little simpler than Toolformer, probably more like the BlenderBot models for search, plus prompt engineering:

- Matching intent from the prompt to a description of the plugin service

- Extracting relevant terms from the prompt to send as query params, based on the description of the endpoint

- Incorporating the API response into the model's response
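
The three steps above could be sketched roughly like this. It is a toy illustration only, not how ChatGPT plugins actually work internally: the `Plugin` class, the word-overlap matching, the `STOPWORDS` list, and the fake endpoints are all hypothetical stand-ins for whatever the model does.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional

# Assumed, hand-picked stopword list; purely illustrative.
STOPWORDS = {"what", "is", "the", "in", "a", "an", "for", "of", "to", "me", "find"}


@dataclass
class Plugin:
    name: str
    description: str            # natural-language description the model would see
    call: Callable[[str], str]  # hypothetical endpoint: query string -> response text


def words(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def match_plugin(prompt, plugins) -> Optional[Plugin]:
    """Step 1: match prompt intent to a plugin description (crude word overlap)."""
    def overlap(p):
        return len((set(words(prompt)) - STOPWORDS) & set(words(p.description)))
    best = max(plugins, key=overlap)
    return best if overlap(best) > 0 else None


def extract_query(prompt, plugin):
    """Step 2: keep prompt terms that are neither stopwords nor description words."""
    described = set(words(plugin.description))
    return " ".join(w for w in words(prompt) if w not in STOPWORDS and w not in described)


def answer(prompt, plugins):
    """Step 3: insert description, request, and response into the conversation text."""
    plugin = match_plugin(prompt, plugins)
    if plugin is None:
        return prompt  # no plugin matched; the model would answer on its own
    query = extract_query(prompt, plugin)
    response = plugin.call(query)
    return "\n".join([
        f"[plugin] {plugin.name}: {plugin.description}",
        f"[request] q={query}",
        f"[response] {response}",
        f"[user] {prompt}",
    ])


plugins = [
    Plugin("weather", "Get the weather forecast for a city", lambda q: f"Sunny in {q}"),
    Plugin("stocks", "Look up the latest stock price for a ticker", lambda q: f"{q}: 100"),
]
context = answer("What is the weather forecast in Paris?", plugins)
print(context)
```

Everything that ends up in `context` would count against the context limit, which is presumably why the docs suggest keeping endpoints and parameters to a minimum.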


"The file includes metadata about your plugin (name, logo, etc.), details about authentication required (type of auth, OAuth URLs, etc.), and an OpenAPI spec for the endpoints you want to expose. The model will see the OpenAPI description fields, which can be used to provide a natural language description for the different fields. We suggest exposing only 1-2 endpoints in the beginning with a minimum number of parameters to minimize the length of the text. The plugin description, API requests, and API responses are all inserted into the conversation with ChatGPT. This counts against the context limit of the model."


wind_dude t1_jd012ru wrote

I'm not big into image generation, but... some thoughts...

- SSIM - I believe the issue here has to do with the quality of the image captions. Perhaps merging captions on images would help.

- You could try training boolean classifiers for both images and captions, `is_junk`, and then using those models to remove junk from the training data.
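
For the caption side, a minimal sketch of an `is_junk` filter might look like the following. The tiny naive Bayes model is just a stand-in for any boolean classifier, and the labelled captions are made up for illustration; an image-side classifier would need an actual vision model and is not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


class JunkClassifier:
    """Tiny multinomial naive Bayes; a stand-in for any boolean classifier."""

    def fit(self, texts, labels):
        self.counts = {True: Counter(), False: Counter()}
        self.docs = Counter(labels)
        for text, label in zip(texts, labels):
            self.counts[label].update(tokenize(text))
        self.vocab = set(self.counts[True]) | set(self.counts[False])
        return self

    def _log_score(self, text, label):
        total = sum(self.counts[label].values())
        score = math.log(self.docs[label] / sum(self.docs.values()))
        for tok in tokenize(text):
            # Laplace smoothing so unseen tokens don't zero out the score.
            score += math.log((self.counts[label][tok] + 1) / (total + len(self.vocab)))
        return score

    def is_junk(self, text):
        return self._log_score(text, True) > self._log_score(text, False)


# Made-up labelled captions, purely for illustration (True means junk).
captions = [
    ("a brown dog running on the beach", False),
    ("two people hiking up a snowy mountain", False),
    ("red sports car parked on a city street", False),
    ("IMG_4032.jpg", True),
    ("click here to buy now free shipping", True),
    ("untitled image 12", True),
]
clf = JunkClassifier().fit([c for c, _ in captions], [j for _, j in captions])

# Drop anything the classifier flags before it reaches the training set.
dataset = ["a dog sleeping on a street", "buy now img 99 click here"]
cleaned = [c for c in dataset if not clf.is_junk(c)]
print(cleaned)
```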


wind_dude t1_j9up1ux wrote

> Until the tools start exhibiting behavior that you didn't predict and in ways that you have no control over.

LLMs already do behave in ways we don't expect. But they are much more than a hop, a skip, a jump, and 27 hypothetical leaps away from being out of our control.

Yes, people will use AI for bad things, but that's not an inherent property of AI, that's an inherent property of humanity.


wind_dude t1_j9rwt70 wrote

>Quantum neural networks are an interesting idea, but our brain is certainly not sitting in a vat of liquid nitrogen, so intelligence must be possible without it.

Look at the links I shared above.


Recreating actual intelligence, which is what the definition of AGI was six months ago, will not be possible on logic-based computers. I have never said it's not possible at all. There are a number of reasons it is not currently possible, the first being that we don't have a full understanding of intelligence, and recent theories suggest it's not logic-based, as previously theorised, but quantum-based.

Look at the early history of attempting to fly: for centuries, humans strapped wings to their arms and attempted to fly like birds.


wind_dude t1_j9rv2vw wrote

Would you admit a theory may not be possible and then devote your life to working on it? Even if you don't, you're going to say it, and eventually believe it. And the definitions do keep moving, with lower bars, as the media and companies sensationalise for clicks and funding.


wind_dude t1_j9ru6yc wrote


Considering that in 355 BC Aristotle thought the brain was a radiator, it's not a far leap to think we're wrong that it uses electrical impulses like a computer. And I'm sure after quantum mechanics there will be something else. Although we have far more understanding than we did 2000 years ago, we are very far from the understanding we will have in 2000 years.


wind_dude t1_j9ro57j wrote

No, absolutely not. First, AGI is just a theory; it's not possible on modern logic-based hardware, though quantum is a possibility. Even if we do achieve it, it's fragile: just unplug it. Second, we've had nuclear weapons for close to 80 years, and we're still here. That's a much more real and immediate threat to us.


As a thought experiment, it's not bad...


wind_dude t1_j6sj0ix wrote

I solved a similar issue by building a knowledge graph. It took some manual curation and starting with a good base, but suggestions for misspellings and alternates were generated by comparing vectors. The suggester runs as a batch over new entities after my ETL batch is done.
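
A rough sketch of that kind of batch suggester is below. The comment doesn't say what vectors were used, so character trigram counts with cosine similarity are an assumption here (learned embeddings would slot in the same way), and the `threshold` value and entity names are made up.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Character n-gram count vector (an assumed stand-in for real embeddings)."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def suggest_alternates(new_entities, known_entities, threshold=0.5):
    """Map each new entity to its closest known entity if it is similar enough."""
    known_vecs = {name: char_ngrams(name) for name in known_entities}
    suggestions = {}
    for entity in new_entities:
        vec = char_ngrams(entity)
        score, best = max((cosine(vec, kv), name) for name, kv in known_vecs.items())
        if score >= threshold and best != entity:
            suggestions[entity] = best  # candidate for manual review
    return suggestions


# Run as a batch over entities that arrived in the latest ETL run.
known = ["Microsoft", "Alphabet", "Amazon"]
batch = ["Microsft", "Amazon", "Nvidia"]
suggestions = suggest_alternates(batch, known)
print(suggestions)
```

The output is only a list of candidates; the manual curation step still decides whether a suggestion becomes an alias in the graph.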


wind_dude t1_j50x6ad wrote

Yeah, unless they master continual learning, the models will get stale quickly, or need to rely on iterative training, which is very expensive and slow. I don't see hardware catching up soon.

I think you'll still need to run a fairly sophisticated LLM as the base model for a query-based architecture. But you can probably reduce the cost of running it by distilling it and curating the input data. I actually don't think there has been a ton of research on curating the input data before training (OpenAI did something similar by curating responses in ChatGPT with RLHF, so it's a similar concept), although concerns/critiques may arise about what counts as junk, which may be why it hasn't been looked at in depth before. I believe SD did this in the latest checkpoint by removing anything "pornographic", which is over-censorship.

Take something like CC (Common Crawl), which makes up a fairly large portion of the training data: you could run it through a classifier to remove junk before training. Even within the CC text, a lot of it is probably landing-type pages or blocked-by-paywall messaging. To my knowledge, the percentage of CC made up of these hasn't even been measured, let alone trimmed from the training datasets used.
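
As a rough idea of what that filtering pass could look like, here is a sketch that flags pages and reports what fraction was flagged. The rules are a deliberately crude stand-in for a trained classifier: the `PAYWALL` patterns and the `MIN_WORDS` cutoff are assumptions, and the sample pages are invented.

```python
import re

# Hypothetical patterns; real paywall and landing-page text varies widely.
PAYWALL = re.compile(
    r"subscribe to (continue|read)|already a subscriber|sign in to read", re.I
)
MIN_WORDS = 20  # assumed cutoff for thin "landing page" text


def looks_like_junk(page_text):
    """Rule-based stand-in for a trained junk classifier."""
    if PAYWALL.search(page_text):
        return True
    return len(page_text.split()) < MIN_WORDS


def filter_corpus(pages):
    """Return the kept pages and the fraction of pages flagged as junk."""
    kept = [p for p in pages if not looks_like_junk(p)]
    junk_fraction = 1 - len(kept) / len(pages) if pages else 0.0
    return kept, junk_fraction


pages = [
    "Subscribe to continue reading this article. Already a subscriber? Log in.",
    "Welcome! Sign up today.",
    " ".join(["Gradient descent updates parameters along the negative gradient."] * 5),
]
kept, junk_fraction = filter_corpus(pages)
print(f"kept {len(kept)} of {len(pages)} pages, {junk_fraction:.0%} flagged as junk")
```

Tracking `junk_fraction` alongside the filtering gives the measurement mentioned above (what share of the crawl is this kind of page) as a by-product.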


wind_dude t1_j50pmcc wrote

I would suspect it's similar to BlenderBot 2 from Meta and

Chat memory is searched for relevant information and sent to the decoder for the final output.



So it's in the model architecture.
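
The memory lookup described above can be illustrated with a toy like this. It is not BlenderBot 2's actual mechanism: the `ChatMemory` class and the word-overlap search are hypothetical, and a real system would use a learned retriever feeding a neural decoder rather than string concatenation.

```python
import re


def words(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class ChatMemory:
    """Hypothetical long-term store of things said earlier in the chat."""

    def __init__(self):
        self.entries = []

    def add(self, text):
        self.entries.append(text)

    def search(self, query, k=1):
        """Return up to k stored entries sharing the most words with the query."""
        scored = [(len(words(query) & words(e)), e) for e in self.entries]
        ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
        return [e for _, e in ranked[:k]]


def build_decoder_input(memory, user_message):
    # Retrieved memory is placed ahead of the message so the decoder conditions on it.
    retrieved = memory.search(user_message)
    return "\n".join([f"memory: {m}" for m in retrieved] + [f"user: {user_message}"])


memory = ChatMemory()
memory.add("User's dog is named Biscuit")
memory.add("User lives in Toronto")
decoder_input = build_decoder_input(memory, "What was my dog called again?")
print(decoder_input)
```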