wind_dude t1_jedvs9b wrote on March 31, 2023 at 8:57 AM

Reply to comment by gmork_13 in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679

That’d actually be pretty cool to see. You could train some classifiers pretty quickly and pull some interesting stats on how people are using ChatGPT.

Hoping someone publishes the dataset.

wind_dude t1_jecct1i wrote on March 30, 2023 at 11:51 PM

Reply to comment by KerfuffleV2 in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679

Ahh, sorry, I was referring to the dataset pulled from ShareGPT that was used for finetuning. Access to ShareGPT has disappeared since the media hype about Google using it for Bard.

Yes, the LLaMA weights are everywhere, including on HF in converted form for HF transformers.

wind_dude t1_jecbli5 wrote on March 30, 2023 at 11:42 PM

Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679

What are the concerns with the release of the [shareGPT] dataset? I really hope it does get released, since it looks like ShareGPT has shut down API access, and even web access.

wind_dude t1_jec9lb4 wrote on March 30, 2023 at 11:27 PM

Reply to comment by lazybottle in [D] Instruct Datasets for Commercial Use by JohnyWalkerRed

Interesting, I didn't realise the dataset was on HF with a different license. The dataset (https://github.com/tatsu-lab/stanford_alpaca/blob/main/alpaca_data.json) is also in the code repo, which has the Apache 2.0 license, so the dataset would be covered by it.

wind_dude t1_je7qkue wrote on March 30, 2023 at 12:41 AM

Reply to Do you guys think AGI will cure mental disorders? by Ok-Wing111

it will likely make them much much worse.

wind_dude t1_jdxrcpp wrote on March 27, 2023 at 11:32 PM

Reply to [D] Instruct Datasets for Commercial Use by JohnyWalkerRed

>depend on the Alpaca dataset, which was generated from a GPT3 davinci model, and is subject to non-commercial use

Where do you get that? tatsu-lab/stanford_alpaca is apache 2.0, so you can use it for whatever.

As for OpenAI's terms:

"""

(c) Restrictions. You may not (i) use the Services in a way that infringes, misappropriates or violates any person’s rights; (ii) reverse assemble, reverse compile, decompile, translate or otherwise attempt to discover the source code or underlying components of models, algorithms, and systems of the Services (except to the extent such restrictions are contrary to applicable law); (iii) use output from the Services to develop models that compete with OpenAI; (iv) except as permitted through the API...

"""

So as far as I'm concerned you are allowed to use the generated dataset for commercial purposes...

The only issue might be the licensing on the LLaMA models... but you can train another LLM.

wind_dude t1_jdxqp0v wrote on March 27, 2023 at 11:27 PM

Reply to comment by Taenk in [D] Instruct Datasets for Commercial Use by JohnyWalkerRed

Last I checked they still hadn't opensourced the training data... which is bizarre since they used humans to train it, with all the talk of it being opensource.

wind_dude t1_jdip38g wrote on March 24, 2023 at 5:50 PM

Reply to comment by [deleted] in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-

access to GPT-4 with multimodal

wind_dude t1_jdikwc4 wrote on March 24, 2023 at 5:24 PM

Reply to [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-

I'm also curious about this, I reached out for developer access to try and test this on web screenshots for information extraction.

wind_dude t1_jdf5yhj wrote on March 23, 2023 at 11:13 PM

Reply to comment by endless_sea_of_stars in [N] ChatGPT plugins by Singularian2501

Looking at their limited docs, I feel it's a little simpler than Toolformer, probably more like the BlenderBot models for search, plus prompt engineering.

- Matching intent from the prompt to a description of the plugin service

- extracting relevant terms from the prompts to send as query params based on description of the endpoint

- model incorporates API response into model response
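The three steps listed above can be sketched roughly as follows. This is only a toy illustration of the guessed flow, not how ChatGPT plugins are actually implemented: in the real system the model itself would presumably do the matching and extraction, whereas here word overlap and regexes stand in for it. The `Plugin` class, `param_patterns`, and the canned `call` functions are all hypothetical names made up for the example.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Plugin:
    name: str
    description: str                        # natural-language description of the service
    param_patterns: dict[str, str]          # hypothetical: one regex per query param
    call: Callable[[dict[str, str]], dict]  # stand-in for the real HTTP request


def words(text: str) -> set[str]:
    """Lowercased word set, used as a crude stand-in for the model's understanding."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def match_plugin(prompt: str, plugins: list[Plugin]) -> Optional[Plugin]:
    """Step 1: match prompt intent to a plugin description (here: word overlap)."""
    if not plugins:
        return None
    prompt_words = words(prompt)
    scored = [(len(prompt_words & words(p.description)), p) for p in plugins]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best if best_score > 0 else None


def extract_params(prompt: str, plugin: Plugin) -> dict[str, str]:
    """Step 2: pull terms out of the prompt to send as query params."""
    params = {}
    for name, pattern in plugin.param_patterns.items():
        found = re.search(pattern, prompt)
        if found:
            params[name] = found.group(1)
    return params


def answer(prompt: str, plugins: list[Plugin]) -> str:
    """Step 3: fold the API response into the final reply."""
    plugin = match_plugin(prompt, plugins)
    if plugin is None:
        return "No plugin matched; answering from the model alone."
    params = extract_params(prompt, plugin)
    response = plugin.call(params)
    return f"[{plugin.name}] called with {params}, API returned {response}"


# Two made-up plugins with canned responses instead of real endpoints.
weather = Plugin(
    name="weather",
    description="Get the current weather forecast for a city",
    param_patterns={"city": r"in ([A-Z][a-z]+)"},
    call=lambda params: {"city": params.get("city"), "temp_c": 21},
)
todo = Plugin(
    name="todo",
    description="Manage a list of tasks to do",
    param_patterns={},
    call=lambda params: {"tasks": []},
)

if __name__ == "__main__":
    print(answer("What is the weather forecast in Paris today?", [weather, todo]))
```

Since the plugin description, request, and response all end up in the conversation, keeping descriptions and parameter lists short matters for the context limit.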

"The file includes metadata about your plugin (name, logo, etc.), details about authentication required (type of auth, OAuth URLs, etc.), and an OpenAPI spec for the endpoints you want to expose. The model will see the OpenAPI description fields, which can be used to provide a natural language description for the different fields. We suggest exposing only 1-2 endpoints in the beginning with a minimum number of parameters to minimize the length of the text. The plugin description, API requests, and API responses are all inserted into the conversation with ChatGPT. This counts against the context limit of the model." - https://platform.openai.com/docs/plugins/introduction

wind_dude t1_jd012ru wrote on March 20, 2023 at 9:10 PM

Reply to [D] Determining quality of training images with some metrics by i_sanitize_my_hands

I'm not big into image generation, but... some thoughts...

- SSIM - I believe the issue here has to do with the quality of the image captions. Perhaps try merging captions on images.

- You could try training boolean classifiers for both images and captions, `is_junk`, and then using those models to remove junk from the training data.
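A minimal sketch of the `is_junk` idea, covering only the caption side (an image classifier would need a real vision model). The classifier here is a tiny hand-rolled Naive Bayes, chosen just to keep the example dependency-free; the class name, training captions, and labels are all invented for illustration, and in practice you would hand-label a real sample and probably use a stronger model.

```python
import math
import re
from collections import Counter


def tokens(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


class JunkCaptionClassifier:
    """Tiny multinomial Naive Bayes with add-one smoothing.

    Assumes the training set contains at least one junk and one clean caption.
    """

    def __init__(self) -> None:
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}
        self.vocab: set[str] = set()

    def fit(self, captions: list[str], labels: list[bool]) -> "JunkCaptionClassifier":
        for caption, is_junk in zip(captions, labels):
            self.doc_counts[is_junk] += 1
            for tok in tokens(caption):
                self.word_counts[is_junk][tok] += 1
                self.vocab.add(tok)
        return self

    def _log_score(self, caption: str, label: bool) -> float:
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        total_words = sum(self.word_counts[label].values())
        for tok in tokens(caption):
            smoothed = self.word_counts[label][tok] + 1
            score += math.log(smoothed / (total_words + len(self.vocab)))
        return score

    def is_junk(self, caption: str) -> bool:
        return self._log_score(caption, True) > self._log_score(caption, False)


def filter_pairs(pairs: list[tuple[str, str]], clf: JunkCaptionClassifier) -> list[tuple[str, str]]:
    """Keep only (image_path, caption) pairs whose caption is not flagged as junk."""
    return [(path, cap) for path, cap in pairs if not clf.is_junk(cap)]


# Made-up labelled captions; True means junk.
train_captions = [
    "img_0042.jpg", "click here to buy now", "untitled image", "stock photo buy now",
    "a brown dog running on a beach", "two people riding bicycles in a park",
    "a red car parked on a street", "a cat sleeping on a sofa",
]
train_labels = [True, True, True, True, False, False, False, False]
clf = JunkCaptionClassifier().fit(train_captions, train_labels)

if __name__ == "__main__":
    pairs = [("1.png", "buy now click here"), ("2.png", "a dog sleeping on a beach")]
    print(filter_pairs(pairs, clf))
```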

wind_dude t1_jcoqe5z wrote on March 18, 2023 at 11:58 AM

Reply to comment by banatage in [Discussion] Future of ML after chatGPT. by [deleted]

Nor with statistical models. Accuracy has generally been higher, but LLMs are catching up in key NLP domains.

wind_dude t1_j9up1ux wrote on February 24, 2023 at 6:14 PM

Reply to comment by SleekEagle in [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt

> Until the tools start exhibiting behavior that you didn't predict and in ways that you have no control over.

LLMs already do behave in ways we don't expect. But they are much more than a hop, a skip, a jump and 27 hypothetical leaps away from being out of our control.

Yes, people will use AI for bad things, but that's not an inherent property of AI, that's an inherent property of humanity.

wind_dude t1_j9s1cr4 wrote on February 24, 2023 at 3:34 AM

Reply to comment by royalemate357 in [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt

I'll see if I can find the benchmarks. I believe there are a few papers from IBM and DeepMind talking about it, and a benchmark study in relation to FLAN.

wind_dude t1_j9rwt70 wrote on February 24, 2023 at 2:59 AM

Reply to comment by currentscurrents in [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt

>Quantum neural networks are an interesting idea, but our brain is certainly not sitting in a vat of liquid nitrogen, so intelligence must be possible without it.

look at the links I shared above.

Recreating actual intelligence, which was the definition of AGI six months ago, will not be possible on logic-based computers. I have never said it's not possible at all. There are a number of reasons it is not currently possible, the number one being that we don't have a full understanding of intelligence, and recent theories suggest it's not logic based as previously theorised, but quantum based.

Look at the early history of attempting to fly: for centuries humans strapped wings to their arms and attempted to fly like birds.

wind_dude t1_j9rwd41 wrote on February 24, 2023 at 2:55 AM

Reply to [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt

No, absolutely not. It's fear mongering about something we aren't even remotely close to achieving.

wind_dude t1_j9rvmbb wrote on February 24, 2023 at 2:49 AM

Reply to comment by royalemate357 in [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt

When they scale they hallucinate more, produce more wrong information, thus arguably getting further from intelligence.

wind_dude t1_j9rvfo5 wrote on February 24, 2023 at 2:48 AM

Reply to comment by royalemate357 in [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt

That's just how tools are used, has been since the dawn of time. You just want to be on the side with the largest club, warmest fire, etc.

wind_dude t1_j9rv2vw wrote on February 24, 2023 at 2:45 AM

Reply to comment by VirtualHat in [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt

Would you admit a theory may not be possible and then devote your life to working on it? Even if you don't, you're going to say it, and eventually believe it. And the definitions do keep moving with lower bars as the media and companies sensationalise for clicks and funding.

wind_dude t1_j9ru6yc wrote on February 24, 2023 at 2:39 AM

Reply to comment by currentscurrents in [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt

https://plato.stanford.edu/entries/qt-consciousness/

https://www.nature.com/articles/s41566-021-00845-4

https://www.nature.com/articles/440611a

https://phys.org/news/2022-10-brains-quantum.html

https://www.newscientist.com/article/mg22830500-300-is-quantum-physics-behind-your-brains-ability-to-think/

Considering that in 355 BC Aristotle thought the brain was a radiator, it's not a far leap to think we're wrong that it uses electrical impulses like a computer. And I'm sure after quantum mechanics there will be something else. Although we have far more understanding than 2000 years ago, we are very far from the understanding we will have in 2000 years.

wind_dude t1_j9ro57j wrote on February 24, 2023 at 1:54 AM

Reply to [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt

No, absolutely not. First, AGI is just a theory; it's not possible on modern logic-based hardware, though quantum is a possibility. Even if we do achieve it, it's fragile: just unplug it. Second, we've had nuclear weapons for close to 80 years and we're still here, and that's a much more real and immediate threat to our survival.

As a thought experiment, it's not bad...

wind_dude t1_j6sj0ix wrote on February 1, 2023 at 4:18 PM

Reply to [P] NER output label post processing by hasiemasie

I solved a similar issue by building a knowledge graph. It took some manual curation and starting with a good base, but suggestions for misspellings and alternates were generated by comparing vectors. The suggester runs as a batch over new entities after my ETL batch is done.
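A rough sketch of what a batch suggester like this could look like. The kind of vectors isn't specified above, so this assumes character-trigram count vectors compared by cosine similarity, which tends to catch misspellings; real embeddings would be needed for true alternates like abbreviations. The function names and the 0.5 threshold are made up for illustration, and suggestions would still go to manual review.

```python
import math
from collections import Counter


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Character n-gram counts, padded with spaces so word edges count too."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def suggest_links(new_entities: list[str], known_entities: list[str],
                  threshold: float = 0.5) -> dict[str, tuple[str, float]]:
    """For each new entity, suggest the closest known graph node above the threshold."""
    if not known_entities:
        return {}
    known_vecs = {name: char_ngrams(name) for name in known_entities}
    suggestions = {}
    for entity in new_entities:
        vec = char_ngrams(entity)
        best_name, best_score = max(
            ((name, cosine(vec, known_vec)) for name, known_vec in known_vecs.items()),
            key=lambda pair: pair[1],
        )
        if best_score >= threshold:
            suggestions[entity] = (best_name, best_score)
    return suggestions


if __name__ == "__main__":
    known = ["Microsoft", "Alphabet", "Amazon"]   # nodes already in the graph
    batch = ["Microsft", "Tesla"]                 # entities from the latest NER run
    for entity, (match, score) in suggest_links(batch, known).items():
        print(f"{entity} -> {match} ({score:.2f})")
```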

wind_dude t1_j61rnjt wrote on January 27, 2023 at 2:25 AM

Reply to [Discussion] Github like alternative for ML? by angkhandelwal749

huggingface

wind_dude t1_j50x6ad wrote on January 19, 2023 at 5:08 PM

Reply to [D] is it time to investigate retrieval language models? by hapliniste

Yea, unless they master continual learning, the models will get stale quick, or need to rely on iterative training, which is very expensive and slow. I don't see hardware catching up soon.

I think you'll still need to run a fairly sophisticated LLM as the base model for a query-based architecture. But you can probably reduce the cost of running it by distilling it and curating the input data. I actually don't think there has been a ton of research on curating the input data before training (OpenAI did something similar by curating responses in ChatGPT with RLHF, so it's a similar concept), although concerns/critiques may arise over what counts as junk, which may be why it hasn't been looked at in depth before. I believe SD did this in the latest checkpoint by removing anything "pornographic", which is over-censorship.

Take something like CC, which makes up a fairly large portion of the training data, and run it through a classifier to remove junk before training. Even within CC text, a lot of it is probably landing-type pages, or even blocked-by-paywall messaging. To my knowledge, the percentage of CC these make up hasn't even been looked at, let alone trimmed from the training datasets used.
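As a rough sketch of how you might start measuring that percentage, here is a rule-based pass over crawled page text. It uses simple heuristics as a stand-in for a trained classifier, and the paywall phrases, the 20-word minimum, and the function names are all assumptions for illustration, not values taken from any real CC pipeline.

```python
import re

# Hypothetical phrases that suggest the crawler only saw a paywall notice.
PAYWALL_PATTERNS = [
    r"subscribe to (continue|read)",
    r"sign in to (continue|read)",
    r"already a subscriber",
]
MIN_WORDS = 20  # hypothetical cutoff for landing-type pages with almost no text


def junk_reason(text: str):
    """Return a junk label for a page, or None if it looks like real content."""
    lowered = text.lower()
    for pattern in PAYWALL_PATTERNS:
        if re.search(pattern, lowered):
            return "paywall"
    if len(lowered.split()) < MIN_WORDS:
        return "too_short"
    return None


def junk_report(docs: list[str]):
    """Split docs into kept pages and report the share of each category."""
    counts = {"paywall": 0, "too_short": 0, "kept": 0}
    kept = []
    for doc in docs:
        reason = junk_reason(doc)
        if reason is None:
            counts["kept"] += 1
            kept.append(doc)
        else:
            counts[reason] += 1
    total = len(docs)
    shares = {label: (n / total if total else 0.0) for label, n in counts.items()}
    return kept, shares


if __name__ == "__main__":
    sample = [
        "Subscribe to continue reading this article.",
        "Home | About | Contact",
        " ".join(["word"] * 50),  # placeholder for a full-length article
    ]
    kept, shares = junk_report(sample)
    print(len(kept), shares)
```

Running something like this over a sample of CC would at least give a first estimate of the junk share before deciding what to trim.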

wind_dude t1_j50pmcc wrote on January 19, 2023 at 4:21 PM

Reply to [D] Inner workings of the chatgpt memory by terserterseness

I would suspect it's similar to BlenderBot 2 from Meta and ParlAI.

Chat memory is searched for relevant information and sent to the decoder for the final output.

https://medium.com/ai-network/is-there-a-chatbot-that-goes-beyond-the-gpt-3-blenderbot-2-0-17e42e674824

https://ai.facebook.com/blog/blender-bot-2-an-open-source-chatbot-that-builds-long-term-memory-and-searches-the-internet/

So it's in the model architecture.
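A toy sketch of that search-then-decode idea, to make it concrete. This is speculation about the general pattern, not ChatGPT's or BlenderBot 2's actual implementation: word overlap stands in for a learned retriever, and the `ChatMemory` class and `build_decoder_input` function are hypothetical names.

```python
import re


def words(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class ChatMemory:
    """Stores earlier turns and retrieves the ones most relevant to a new message."""

    def __init__(self) -> None:
        self.turns: list[str] = []

    def add(self, turn: str) -> None:
        self.turns.append(turn)

    def search(self, query: str, k: int = 2) -> list[str]:
        query_words = words(query)
        scored = [(len(query_words & words(turn)), idx, turn)
                  for idx, turn in enumerate(self.turns)]
        hits = [item for item in scored if item[0] > 0]
        # Most overlapping words first; ties go to the more recent turn.
        hits.sort(key=lambda item: (-item[0], -item[1]))
        return [turn for _, _, turn in hits[:k]]


def build_decoder_input(memory: ChatMemory, user_message: str, k: int = 2) -> str:
    """Prepend retrieved memories to the message that would be sent to the decoder."""
    retrieved = memory.search(user_message, k)
    context = "\n".join(f"memory: {turn}" for turn in retrieved)
    return f"{context}\nuser: {user_message}" if context else f"user: {user_message}"


if __name__ == "__main__":
    memory = ChatMemory()
    memory.add("My dog is called Biscuit")
    memory.add("I live in Toronto")
    memory.add("I like hiking on weekends")
    print(build_decoder_input(memory, "What is my dog called?"))
```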