wind_dude t1_jecct1i wrote
Reply to comment by KerfuffleV2 in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
Ahh, sorry, I was referring to the dataset pulled from ShareGPT that was used for finetuning. ShareGPT has disappeared since the media hype about Google using it for Bard.

Yes, the LLaMA weights are everywhere, including on HF in converted form for HF Transformers.
wind_dude t1_jecbli5 wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
What are the concerns with releasing the [ShareGPT] dataset? I really hope it does get released, since it looks like ShareGPT has shut down API access, and even web access.
wind_dude t1_jec9lb4 wrote
Reply to comment by lazybottle in [D] Instruct Datasets for Commercial Use by JohnyWalkerRed
Interesting, I didn't realise the dataset was on HF with a different license. The dataset (https://github.com/tatsu-lab/stanford_alpaca/blob/main/alpaca_data.json) is also in the code repo, which has the Apache 2.0 license, so the dataset would be covered by it.
wind_dude t1_je7qkue wrote
It will likely make them much, much worse.
wind_dude t1_jdxrcpp wrote
>depend on the Alpaca dataset, which was generated from a GPT3 davinci model, and is subject to non-commercial use
Where do you get that? tatsu-lab/stanford_alpaca is Apache 2.0, so you can use it for whatever.

As for OpenAI's terms:
"""
(c) Restrictions. You may not (i) use the Services in a way that infringes, misappropriates or violates any person’s rights; (ii) reverse assemble, reverse compile, decompile, translate or otherwise attempt to discover the source code or underlying components of models, algorithms, and systems of the Services (except to the extent such restrictions are contrary to applicable law); (iii) use output from the Services to develop models that compete with OpenAI; (iv) except as permitted through the API...
"""
So as far as I'm concerned, you are allowed to use the generated dataset for commercial purposes...

The only issue might be the licensing on the LLaMA models... but you can train another LLM.
wind_dude t1_jdxqp0v wrote
Reply to comment by Taenk in [D] Instruct Datasets for Commercial Use by JohnyWalkerRed
Last I checked they still hadn't open-sourced the training data... which is bizarre, since they used humans to train it, with all the talk of it being open source.
wind_dude t1_jdip38g wrote
Reply to comment by [deleted] in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-
access to GPT-4 with multimodal input
wind_dude t1_jdikwc4 wrote
Reply to [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-
I'm also curious about this; I reached out for developer access to try and test this on web screenshots for information extraction.
wind_dude t1_jdf5yhj wrote
Reply to comment by endless_sea_of_stars in [N] ChatGPT plugins by Singularian2501
Looking at their limited docs, I feel it's a little simpler than Toolformer; probably more like the BlenderBot models for search, plus prompt engineering (rough guess of the flow sketched below):
- Matching intent from the prompt to a description of the plugin service
- Extracting relevant terms from the prompt to send as query params, based on the description of the endpoint
- Incorporating the API response into the model's response
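Just a sketch of my guess at that flow, not OpenAI's actual implementation; the manifest fields, endpoint, and helper names are all made up:

```python
# Rough sketch of how I imagine the plugin flow working. Everything here
# (manifest fields, endpoint, helper names) is made up for illustration,
# not OpenAI's actual implementation.
import urllib.request

PLUGINS = [
    {
        "name": "weather",
        "description": "Get current weather and forecasts for a city.",
        "endpoint": "https://example.com/weather",  # hypothetical endpoint
    },
]

def match_plugin(prompt: str):
    """Step 1: match intent in the prompt against each plugin's description."""
    words = set(prompt.lower().split())
    best, best_score = None, 0
    for plugin in PLUGINS:
        score = len(words & set(plugin["description"].lower().split()))
        if score > best_score:
            best, best_score = plugin, score
    return best

def extract_params(prompt: str) -> dict:
    """Step 2: pull query params out of the prompt. A naive stand-in that just
    grabs the last capitalised token as the 'city' param."""
    tokens = [t.strip("?.,!") for t in prompt.split()]
    caps = [t for t in tokens if t.istitle()]
    return {"city": caps[-1] if caps else ""}

def answer_with_plugin(prompt: str) -> str:
    """Step 3: call the endpoint and insert the API response into the
    conversation context (which is what eats into the model's context limit)."""
    plugin = match_plugin(prompt)
    if plugin is None:
        return prompt
    params = extract_params(prompt)
    url = plugin["endpoint"] + "?" + "&".join(f"{k}={v}" for k, v in params.items())
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            api_response = resp.read().decode()
    except Exception:
        api_response = '{"error": "plugin call failed"}'
    return f"{prompt}\n\n[{plugin['name']} plugin response]\n{api_response}"

print(answer_with_plugin("What's the weather in Vancouver?"))
```

In the real thing the model itself presumably handles the intent matching and param extraction, guided by the OpenAPI description fields; the keyword matching above is just a stand-in. From their docs: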
"The file includes metadata about your plugin (name, logo, etc.), details about authentication required (type of auth, OAuth URLs, etc.), and an OpenAPI spec for the endpoints you want to expose. The model will see the OpenAPI description fields, which can be used to provide a natural language description for the different fields. We suggest exposing only 1-2 endpoints in the beginning with a minimum number of parameters to minimize the length of the text. The plugin description, API requests, and API responses are all inserted into the conversation with ChatGPT. This counts against the context limit of the model." - https://platform.openai.com/docs/plugins/introduction
wind_dude t1_jd012ru wrote
I'm not big into image generation, but... some thoughts...
- SSIM - I believe the issue here has to do with the quality of the image captions. Perhaps merging captions on images
- Could try training boolean classifiers for both images and captions, `is_junk`, and then using that model to remove junk from the training data (rough sketch below).
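For the caption side, something like this would be a start; a toy sketch with sklearn, where the example captions and labels are made up and real `is_junk` labels would have to come from manual annotation or heuristics:

```python
# Toy sketch of an `is_junk` caption classifier used to filter training data.
# The example captions and labels below are made up; real labels would come
# from manual annotation or heuristics.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

labelled = [
    ("DSC_0042.jpg", 1),                        # filename-as-caption: junk
    ("click here to view full size image", 1),  # boilerplate: junk
    ("a brown dog running through tall grass", 0),
    ("portrait of a woman in a red coat", 0),
]
texts, labels = zip(*labelled)

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(texts, labels)

# Filter the real caption set, keeping only pairs predicted as not-junk.
candidates = ["IMG_1234 copy", "two children playing soccer on a beach"]
kept = [c for c, is_junk in zip(candidates, clf.predict(candidates)) if not is_junk]
print(kept)
```

The same idea works on the image side with an image classifier; you'd just need a labelled set of junk vs. usable images.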
wind_dude t1_jcoqe5z wrote
Reply to comment by banatage in [Discussion] Future of ML after chatGPT. by [deleted]
Nor with statistical models. Accuracy has generally been higher with those, but LLMs are catching up in key NLP domains.
wind_dude t1_j9up1ux wrote
Reply to comment by SleekEagle in [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt
> Until the tools start exhibiting behavior that you didn't predict and in ways that you have no control over.
LLMs already do behave in ways we don't expect. But they are much more than a hop, a skip, a jump, and 27 hypothetical leaps away from being out of our control.
Yes, people will use AI for bad things, but that's not an inherent property of AI, that's an inherent property of humanity.
wind_dude t1_j9s1cr4 wrote
Reply to comment by royalemate357 in [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt
I'll see if I can find the benchmarks; I believe there are a few papers from IBM and DeepMind talking about it, and a benchmark study in relation to FLAN.
wind_dude t1_j9rwt70 wrote
Reply to comment by currentscurrents in [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt
>Quantum neural networks are an interesting idea, but our brain is certainly not sitting in a vat of liquid nitrogen, so intelligence must be possible without it.
Look at the links I shared above.

Recreating actual intelligence, which is what the definition of AGI was 6 months ago, will not be possible on logic-based computers. I never said it's not possible at all. There are a number of reasons it's not currently possible; number one is that we don't have a full understanding of intelligence, and recent theories suggest it's not logic-based like previously theorised, but quantum-based.
Look at the early history of attempting to fly: for centuries humans strapped wings to their arms and tried to fly like birds.
wind_dude t1_j9rwd41 wrote
Reply to [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt
No, absolutely not. It's fear mongering about something we aren't even remotely close to achieving.
wind_dude t1_j9rvmbb wrote
Reply to comment by royalemate357 in [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt
When they scale, they hallucinate more and produce more wrong information, thus arguably getting further from intelligence.
wind_dude t1_j9rvfo5 wrote
Reply to comment by royalemate357 in [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt
That's just how tools are used, has been since the dawn of time. You just want to be on the side with the largest club, warmest fire, etc.
wind_dude t1_j9rv2vw wrote
Reply to comment by VirtualHat in [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt
Would you admit a theory may not be possible and then devote your life to working on it? Even if you don't, you're going to say it is, and eventually believe it. And the definitions do keep moving, with lower bars, as the media and companies sensationalise for clicks and funding.
wind_dude t1_j9ru6yc wrote
Reply to comment by currentscurrents in [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt
https://plato.stanford.edu/entries/qt-consciousness/
https://www.nature.com/articles/s41566-021-00845-4
https://www.nature.com/articles/440611a
https://phys.org/news/2022-10-brains-quantum.html

Considering that in 355 BC Aristotle thought the brain was a radiator, it's not a far leap to think we're wrong that it uses electrical impulses like a computer. And I'm sure after quantum mechanics there will be something else. Although we have far more understanding than 2000 years ago, we are very far from the understanding we will have in 2000 years.
wind_dude t1_j9ro57j wrote
Reply to [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt
No, absolutely not. First, AGI is just a theory; it's not possible on modern logic-based hardware, though quantum is a possibility. Even if we do achieve it, it's fragile: just unplug it. Second, we've had nuclear weapons for close to 80 years and we're still here; that's a much more real and immediate threat to our demise.

As a thought experiment, it's not bad...
wind_dude t1_j6sj0ix wrote
Reply to [P] NER output label post processing by hasiemasie
I solved a similar issue by building a knowledge graph. It took some manual curation and starting with a good base, but misspellings and alternates were suggested by comparing vectors. The suggester runs as a batch on new entities after my ETL batch is done.
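A toy version of what the suggester does, using character n-gram vectors and cosine similarity; the entity names and the threshold here are made up, and the real thing compares against the curated entities in the graph:

```python
# Toy version of the misspelling/alternate suggester: compare new NER outputs
# against canonical entities using character n-gram vectors + cosine similarity.
# Entity names and the 0.5 threshold are made up for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

canonical = ["Pfizer", "Moderna", "AstraZeneca", "Johnson & Johnson"]  # curated base
new_entities = ["pfizzer", "astra zenica"]                             # raw NER output

vec = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 3))
canon_vecs = vec.fit_transform(canonical)

for ent in new_entities:
    sims = cosine_similarity(vec.transform([ent]), canon_vecs)[0]
    best = sims.argmax()
    if sims[best] > 0.5:
        print(f"{ent!r} -> suggest {canonical[best]!r} (score {sims[best]:.2f})")
    else:
        print(f"{ent!r} -> no confident match, queue for manual curation")
```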
wind_dude t1_j61rnjt wrote
huggingface
wind_dude t1_j50x6ad wrote
Yea, unless they master continual learning, the models will get stale quickly, or will need to rely on iterative retraining, which is very expensive and slow. I don't see hardware catching up soon.
I think you'll still need to run a fairly sophisticated LLM as the base model for a query-based architecture. But you can probably reduce the cost of running it by distilling it, and by curating the input data. I actually don't think there has been a ton of research on curating the input data before training (OpenAI did something similar by curating responses for ChatGPT with RLHF, so it's a similar concept), although concerns/critiques may arise over what counts as junk, which is why it hasn't been looked at in depth before. I believe SD did this in the latest checkpoint, removing anything "pornographic", which is over-censorship.
You could take something like CC, which makes up a fairly large portion of the training data, and run it through a classifier to remove junk before training. And even in the CC text, a lot of it is probably landing-type pages, or blocked-by-paywall messaging. To my knowledge the percentage of CC these make up hasn't even been looked at, let alone trimmed from the training datasets used.
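Even getting a rough number on that would be pretty easy; something like this over the extracted CC text would be a start (the patterns and the 50-word cutoff are just guesses for illustration):

```python
# Rough sketch: estimate what fraction of Common Crawl text records look like
# paywall blocks or landing-page boilerplate. The patterns and the 50-word
# cutoff are just guesses for illustration.
import re

JUNK_PATTERNS = [
    re.compile(r"subscribe to (continue|read)", re.I),
    re.compile(r"(sign in|log in) to (your account|continue)", re.I),
    re.compile(r"enable javascript", re.I),
    re.compile(r"accept (all )?cookies", re.I),
]

def looks_like_junk(text: str) -> bool:
    return any(p.search(text) for p in JUNK_PATTERNS) or len(text.split()) < 50

def junk_fraction(docs):
    flagged = sum(looks_like_junk(d) for d in docs)
    return flagged / max(len(docs), 1)

# docs would come from WET/WARC extraction; placeholder examples here
sample_docs = [
    "Please enable JavaScript to view this site.",
    "Subscribe to continue reading this article.",
    "A long-form article body " + "word " * 100,
]
print(f"junk fraction: {junk_fraction(sample_docs):.2%}")
```

A trained classifier would obviously beat regexes, but even crude heuristics like this would tell you how big the problem is.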
wind_dude t1_jedvs9b wrote
Reply to comment by gmork_13 in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
That'd actually be pretty cool to see; you could train some classifiers pretty quickly and pull some interesting stats on how people are using ChatGPT.
Hoping someone publishes the dataset.