EmmyNoetherRing

EmmyNoetherRing t1_j6j7zq4 wrote

I wouldn’t mind being one of those folks. But you make a good point that the old rubrics may not be capturing it.

If you want to nail down what users are observing when they compare it to human performance, then practically speaking you may need to shift to diagnostics that were designed to evaluate humans, with the added challenge of avoiding tests whose answer sheets are already in its training data.

1

EmmyNoetherRing t1_j6i8xfv wrote

I hate to say it, but I think the actual answer to “as compared to what” is “as compared to my human professor”.

People using it to learn are having interactions that mimic interactions with teachers/experts. When they mention hallucinations, I think it’s often in that context.

4

EmmyNoetherRing t1_j5ulz6t wrote

So-- a few things

ChatGPT doesn't currently have access to the internet, although it's obviously working with data it scraped in the recent past, and I expect a 2021 snapshot of Wikipedia is sufficient to answer a wide array of queries, which is why it feels like it has internet access when you ask it questions.

ChatGPT is effective because it's been trained on an unimaginably large set of data, and because an unknown but large number of human hours have gone into supervised/interactive/online/reinforcement/(whatever) learning, with an army of contractors teaching it how to deal well with arbitrary human prompts. You don't really want an AI trained on just your data set by itself.

But ChatGPT (or just plain GPT-3) is already great at summarizing bodies of text as it is right now. I expect you can google how to ask GPT-3 nicely to summarize your notes, or to answer questions with respect to them.
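For example, something roughly along these lines (a minimal sketch against the 0.x `openai` Python client; the model name, prompt wording, and parameters are just assumptions you'd tune, and long notes would need to be split into chunks that fit the context window):

```python
# Minimal sketch: asking GPT-3 to summarize a chunk of notes via the OpenAI API.
# Assumes the pre-1.0 `openai` package and an API key; model name, prompt, and
# parameters are illustrative, not a recommendation.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

notes = """
(paste a chunk of your notes here)
"""

response = openai.Completion.create(
    model="text-davinci-003",   # GPT-3 completion model of the time
    prompt=f"Summarize the following notes in a few bullet points:\n\n{notes}\n\nSummary:",
    max_tokens=256,             # cap on the summary length
    temperature=0.2,            # keep it relatively literal
)

print(response["choices"][0]["text"].strip())
```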

7

EmmyNoetherRing t1_j5g8ogy wrote

So, not quite. You’re describing funny cases that a trained classifier will misclassify.

We're talking about what happens if you can intentionally inject bias into an AI's training data (since it's pulling that data from the web, if you know where it's pulling from you can theoretically influence how it's trained). That could cause it to misclassify many cases, or have other, more complex issues. It starts to seem weirdly feasible if you think about a future where a lot of online content is generated by AI, and where at least two competing companies/governments supply those AIs.

Say we've got two AIs, A and B. A can use secret proprietary watermarks to recognize its own text online and avoid using that text in its training data (it wants to train on human data), and of course B can do the same thing to recognize its own text. But since each AI is using its own secret watermarks, there's no good way to prevent A from accidentally training on B's output, and vice versa.
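As a toy illustration of the "recognize your own output" half of that: real watermarking schemes are statistical and designed to survive paraphrasing, but even a crude exact-match fingerprint log captures the shape of the filtering step, and its blind spot for the other AI's text.

```python
# Toy sketch: model A filtering its own prior output out of scraped training data.
# Real watermarks are statistical and much more robust; exact-match hashing is
# only a stand-in for the idea.
import hashlib

def fingerprint(text: str) -> str:
    """Normalize whitespace/case and hash the text."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# Everything A has ever emitted gets fingerprinted at generation time.
own_output_log = {fingerprint("The quick brown fox jumps over the lazy dog.")}

def keep_for_training(scraped_text: str) -> bool:
    """Drop documents A itself produced; keep everything else.
    Note: this does nothing about text produced by a different model B,
    which is exactly the gap described above."""
    return fingerprint(scraped_text) not in own_output_log

corpus = [
    "The quick brown fox jumps over the lazy dog.",  # A's own output: dropped
    "Some genuinely human-written sentence.",        # kept
]
training_data = [doc for doc in corpus if keep_for_training(doc)]
print(training_data)
```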

The AIs are supposed to train only on human data, to stay more like humans. But maybe there will be a point where they unavoidably start training on each other. And then a malicious actor might intentionally use their AI to flood a popular public text source with content that, if the other AIs ingest it, will cause them to behave the way the actor wants (biased against the actor's targets, or biased in the actor's favor).

Effectively, at some point we may have to deal with people secretly using AI to advertise to, radicalize, or scam other AI. Unless we get some fairly global regulations up in time. Should be interesting.

I wonder to what extent we’ll manage to get science fiction out about these things before we start seeing them in practice.

7

EmmyNoetherRing t1_j5er3xp wrote

I'd heard they had added one, actually, or were planning to. The concern they cited was that they didn't want the model accidentally training on its own output as more of it shows up online.

I have to imagine this is a situation where security by obscurity is unavoidable though, so if they do have a watermark we might not hear much about it. Otherwise malicious users would just clean it back out again.

We may end up with a situation where only a few people internal to OpenAI know how the watermark works, and they occasionally answer questions for law enforcement with the proper paperwork.

51

EmmyNoetherRing t1_j5253a8 wrote

>Softmax activation function

OK, got it. Huh (after reviewing Wikipedia). So, to rephrase the quoted paragraph: they find that the divergence between the training and testing distributions (between the compressed versions of the training and testing data sets, in my analogy) starts decreasing smoothly as the scale of the model increases, long before the actual final task performance locks into place.
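(Pinning the terms down for myself: this is just the textbook softmax and cross-entropy in numpy, nothing specific to the paper.)

```python
# Textbook softmax and cross-entropy, just to fix the definitions.
import numpy as np

def softmax(logits):
    """Turn raw model scores into a probability distribution."""
    z = logits - np.max(logits)   # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def cross_entropy(p, q, eps=1e-12):
    """H(p, q) = -sum_x p(x) log q(x): how surprised the model distribution q
    is, on average, by samples from the target distribution p."""
    return -np.sum(p * np.log(q + eps))

target = np.array([0.0, 1.0, 0.0])   # one-hot "right answer"
logits = np.array([1.0, 2.5, 0.3])   # raw model scores
print(cross_entropy(target, softmax(logits)))
```

The quoted point is that this loss keeps shrinking as the model puts more probability on the right answer, even while exact-match accuracy sits flat at chance, because accuracy only looks at the argmax.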

Hm. That seems to say more about task complexity (maybe, in some computability sense, a fundamental task complexity that we don't have well defined for these types of tasks yet?) than about imagination, I think. But I'm still with you on imagination being a factor, and of course the paper and the blog post both leave the cliff problem unsolved. Possibly there's a definition of imagination such that we can say degree X of it is needed to successfully complete those tasks.

1

EmmyNoetherRing t1_j51x98z wrote

> As an alternative evaluation, we measure cross-entropy loss, which is used in scaling laws for pre-training, for the six emergent BIG-Bench tasks, as detailed in Appendix A. This analysis follows the same experimental setup from BIG-Bench (2022) and affirms their conclusions for the six emergent tasks we consider. Namely, cross-entropy loss improves even for small model scales where the downstream metrics (exact match, BLEU, and accuracy) are close to random and do not improve, which shows that improvements in the log-likelihood of the target sequence can be masked by such downstream metrics. However, this analysis does not explain why downstream metrics are emergent or enable us to predict the scale at which emergence occurs. Overall, more work is needed to tease apart what enables scale to unlock emergent abilities.

Don't suppose you know what cross-entropy is?

1

EmmyNoetherRing t1_j510553 wrote

>Unfortunately, OpenAI aren't serious about publishing technical reports anymore.

Do OpenAI folks show up to any of the major research conferences? These days I mostly come into contact with AI when it wanders into the tech policy/governance world, and this seems like the sort of work that would get you invited to an OSTP workshop, but I'm not sure if that's actually happening.

OpenAI's latest not-so-technical report (on their website) has a few folks from Georgetown contributing to it, and since AAAI is in DC in a few weeks I was hoping OpenAI would be around and available for questions in some capacity, in some room at the conference.

5

EmmyNoetherRing t1_j0yrv8i wrote

Don't forget to check accuracy by illness category too. Humans have biases because of social issues; machines also pick up biases from the relative shapes/distributions of the concepts they're trying to learn, so they'll do better on the simpler and more common ones. You might get high accuracy on cold/flu cases that show up frequently in the corpus and have very simple treatment paths, and because they show up frequently they may bump up your overall accuracy. But at the same time you want to check how it's handling the less common cases whose diagnosis/treatment is likely spread across multiple records over a period of time, like cancer or autoimmune issues.
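A quick way to get that breakdown, as a sketch (the DataFrame and its `illness_category`/`y_true`/`y_pred` columns are hypothetical stand-ins for whatever your schema actually looks like):

```python
# Sketch: per-category accuracy alongside the overall number.
# Column names and values are hypothetical.
import pandas as pd

df = pd.DataFrame({
    "illness_category": ["flu", "flu", "flu", "cancer", "autoimmune"],
    "y_true":           ["flu", "flu", "flu", "cancer", "autoimmune"],
    "y_pred":           ["flu", "flu", "cold", "flu",   "autoimmune"],
})

per_category = (
    df.assign(correct=df["y_true"] == df["y_pred"])
      .groupby("illness_category")["correct"]
      .agg(accuracy="mean", n="size")   # accuracy and support per category
)
print(per_category)
print("overall accuracy:", (df["y_true"] == df["y_pred"]).mean())
```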

It's also a good idea to verify that your simulation process isn't accidentally stripping the diversity out of the original data by generating instances of the rarer or more complex cases that are biased toward traits of the simpler and more common ones (especially in this context, that might produce some nonsensical record paths for the more complex illnesses).
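Even a blunt before/after comparison of the category mix will catch the worst of it. It won't show common-case traits bleeding into the rare cases, but it will show the rare categories quietly shrinking (again with a hypothetical column name and made-up counts):

```python
# Sketch: compare how often each illness category appears in the original
# records vs. the simulated ones. Column name and counts are hypothetical.
import pandas as pd

original  = pd.DataFrame({"illness_category": ["flu"] * 80 + ["cancer"] * 15 + ["autoimmune"] * 5})
simulated = pd.DataFrame({"illness_category": ["flu"] * 95 + ["cancer"] * 4  + ["autoimmune"] * 1})

comparison = pd.DataFrame({
    "original":  original["illness_category"].value_counts(normalize=True),
    "simulated": simulated["illness_category"].value_counts(normalize=True),
}).fillna(0.0)
print(comparison)   # a big drop in the rare categories is the red flag
```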

3