dojoteef OP t1_jc6om7a wrote

Thanks for the vote of confidence!

Unfortunately, I recently deleted my twitter account 🫣. I was barely active there: a handful of tweets in nearly a decade and a half...

That said, I'll probably post my preprint on this sub when it's ready. I also need to recruit some play testers, so will probably post on r/discoelysium recruiting participants in the next few weeks (to ensure high quality evaluations we need people who have played the game before, rather than using typical crowdsourcing platforms like MTurk).

1

dojoteef OP t1_jc4hwyw wrote

If you actually want the NPCs to meaningfully add to the game rather than merely being mouthpieces then your approach won't work. How do you ensure what they say is consistent with the game world? E.g. what if they make up the location of a hidden treasure, offer to give you an item, etc. All of that needs to be accounted for in the game logic as well, otherwise they'll say things that make no sense in the game world.

It's actually a challenging problem and requires research. As far as I know, there are very few people actively researching this area; if they are, they certainly aren't publishing it. Hopefully my next paper, which investigates using LLMs in Disco Elysium, will help change that.
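To make the problem concrete, here's a hypothetical sketch (not the approach from my paper, and the names are invented) of the minimum bookkeeping involved: anything an NPC line references has to be checked against, and eventually written back into, the game state.

```python
# Hypothetical sketch: before an LLM-generated NPC line is surfaced, every item
# or location it references must already exist in the game state, otherwise the
# line is rejected and regenerated. Names and structure are illustrative only.
from dataclasses import dataclass, field

@dataclass
class GameState:
    items: set = field(default_factory=lambda: {"rusty key", "harbor pass"})
    locations: set = field(default_factory=lambda: {"pawnshop", "coast"})

def is_grounded(referenced: set, state: GameState) -> bool:
    """True only if every referenced entity exists in the game world."""
    return referenced <= (state.items | state.locations)

state = GameState()
# In a real pipeline an entity extractor (or a second model call) would produce
# this set from the generated dialogue; it's hard-coded here for illustration.
print(is_grounded({"rusty key"}, state))                  # True
print(is_grounded({"golden idol", "lighthouse"}, state))  # False -> regenerate the line
```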

18

dojoteef t1_j8sqm4i wrote

I commend what Huggingface is trying to do (be a consistent, easy-to-use source for the latest models), but every time I've used the library I've had to tackle bugs that were very time consuming to pinpoint, which is exacerbated by the structure of the code. The worst bugs have been subtle heisenbugs: the code seemed to work most of the time, but failed at other times. The heisenbugs are what made me stop using Huggingface altogether, unless it's my only option.

For example, I ran into a bug that only manifested when downloading a specific pretrained model for a task, which in turn downloads a config file that itself contained the bug. As a user it was super difficult to know where the source of the bug was without extensive spelunking. I've had many similarly difficult-to-diagnose issues each time I've used the Huggingface ecosystem.

I understand that what you're tasked with as a company is a huge undertaking for such a small team. Maybe splitting the package into a "stable" package and a "nightly" package could help (with stable being extensively bug tested more like an Ubuntu LTS release). My guess is that your team is likely too small to support that approach while adding new features at the same speed.

14

dojoteef t1_j8e2m8g wrote

Tbh, it's because I took a step back and haven't been moderating the sub the past week and a half. I've been the one mod doing the majority of the filtering of these posts over the past couple of years and the noise has just been going up exponentially over that time. It's very time consuming and I'm pretty burned out doing it, so I've taken some time away. I brought this up with the other mods before stepping back a bit.

It's probably good to try to get more mods, but I think the majority of the current mods are afraid to hire on new mods that might have a different philosophy of moderating, thus changing the feel of the sub.

1

dojoteef t1_j60evd7 wrote

I'd guess that it's an easier optimization problem. GANs are known to have stability issues during training, likely due to the adversarial formulation.

I think a more interesting question is why it also performs better than VAEs, since diffusion models also fall under the category of variational inference. Again, I'd assume it's an easier optimization problem due to having a large number of denoising steps. Perhaps a technique like DRAW could match diffusion models if used with more steps? Not sure.
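To spell out the connection (standard formulations, nothing specific to one paper): a one-step VAE optimizes a single reconstruction-plus-KL bound, whereas the DDPM-style diffusion bound splits the objective into many per-step KL terms between nearby Gaussians, which is plausibly the easier optimization target.

VAE:

$$\log p_\theta(x) \ge \mathbb{E}_{q_\phi(z\mid x)}\!\left[\log p_\theta(x\mid z)\right] - \mathrm{KL}\!\left(q_\phi(z\mid x)\,\|\,p(z)\right)$$

Diffusion (DDPM-style decomposition):

$$\log p_\theta(x_0) \ge \mathbb{E}_q\!\left[\log p_\theta(x_0\mid x_1) - \sum_{t=2}^{T}\mathrm{KL}\!\left(q(x_{t-1}\mid x_t,x_0)\,\|\,p_\theta(x_{t-1}\mid x_t)\right) - \mathrm{KL}\!\left(q(x_T\mid x_0)\,\|\,p(x_T)\right)\right]$$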

13

dojoteef t1_j5l399n wrote

This has been studied quite a bit. You can just follow the citation graph of the fastText paper: Enriching Word Vectors with Subword Information

For example, people have investigated sampling different subword tokenizations during training (Stochastic Tokenization with a Language Model for Neural Text Classification) and character-aware embeddings (CharBERT: Character-aware Pre-trained Language Model).
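As a toy illustration of the underlying fastText idea (my own sketch, not code from any of those papers): a word's vector is just the average of its character n-gram vectors, so rare or misspelled words still land near their well-formed neighbors.

```python
# Toy sketch of fastText-style subword embeddings (illustrative only, not the real library).
import zlib
import numpy as np

DIM, BUCKETS = 64, 2000
rng = np.random.default_rng(0)
ngram_table = rng.normal(size=(BUCKETS, DIM))  # one vector per hashed n-gram bucket

def char_ngrams(word, n_min=3, n_max=5):
    w = f"<{word}>"  # boundary markers, as in the fastText paper
    return [w[i:i + n] for n in range(n_min, n_max + 1) for i in range(len(w) - n + 1)]

def embed(word):
    """Word vector = mean of its hashed character n-gram vectors."""
    rows = [zlib.crc32(g.encode()) % BUCKETS for g in char_ngrams(word)]
    return ngram_table[rows].mean(axis=0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# The misspelling shares most of its n-grams with the correct spelling, so its
# vector ends up much closer to it than an unrelated word's vector does.
print(cosine(embed("tokenization"), embed("tokenizaton")),
      cosine(embed("tokenization"), embed("xylophone")))
```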

4

dojoteef t1_j2p0jrg wrote

Better late than never. Started my PhD in my mid thirties and I'm glad I did.

That said, I knew exactly what I wanted to work on (it's relatively niche) and have been fortunate enough to find an advisor willing to let me work in that area. If you're unsure, then it might make sense to work in industry for a while and later decide if you want to come back for a PhD.

9

dojoteef t1_j1v4j4r wrote

You don't need to tell them one is AI or model generated. Could be two model generated texts or two human written texts. Merely having another text for comparison allows people to better frame the task since otherwise they essentially need to imagine a baseline for comparison, which people rarely do.

−3

dojoteef t1_j1uy04f wrote

Very interesting idea. It could easily be applied to images since digital watermarks already exist. Not sure how feasible it is for AI generated text.

Tbh, I imagine it behooves companies to do this so they are less likely to train on media (text, images, audio, etc.) produced from a model. The more ubiquitous AI generation becomes, the more of an issue this poses. Currently that problem is likely quite minimal and probably acts to inject a small bit of noise into training (and the knowledge distillation effect could even slightly improve training efficiency).

I guess a new data cleaning step could be running a classifier to flag media that is likely AI generated, though that would likely be less efficient than checking a hash produced at the time of generation.
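As a rough illustration of what I mean by a hash produced at generation time (purely hypothetical; no such shared registry exists as far as I know): the generating service records a fingerprint of each output, and a data-cleaning pipeline later drops any document whose fingerprint appears in the registry.

```python
# Hypothetical sketch of a generation-time hash registry for filtering
# model-generated text out of future training data.
import hashlib

registry = set()  # in practice a shared database or Bloom filter

def fingerprint(text):
    # Normalize lightly so trivial whitespace edits don't defeat the lookup;
    # a real system would need something far more robust (e.g. fuzzy hashing).
    normalized = " ".join(text.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def record_generation(text):
    """Called by the generating service for every output it produces."""
    registry.add(fingerprint(text))

def keep_for_training(text):
    """Called by the data-cleaning pipeline."""
    return fingerprint(text) not in registry

record_generation("The quick brown fox   jumps over the lazy dog.")
print(keep_for_training("The quick brown fox jumps over the lazy dog."))  # False
print(keep_for_training("An entirely human-written sentence."))           # True
```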

0

dojoteef t1_j1uwubj wrote

Nice job!

Though, to produce a better comparison it's best to show two examples side-by-side (one by a human, the other by the model, in a randomized order of course). The reason is that most people are not trained to analyze short snippets of text out of context. People trained to do that, e.g. English teachers, can better distinguish generated text without a baseline to compare against, but most people (i.e. crowdsourced evaluators) will likely produce a very biased analysis that doesn't reflect humans' real ability to distinguish between the two.
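Concretely, the pairing setup can be as simple as the sketch below (illustrative only, not our exact evaluation interface):

```python
# Minimal sketch of building randomized side-by-side comparison items for
# human evaluation. Texts and field names are placeholders.
import random

human_texts = ["Human continuation A...", "Human continuation B..."]
model_texts = ["Model continuation A...", "Model continuation B..."]

def make_items(humans, models, seed=0):
    rng = random.Random(seed)
    items = []
    for h, m in zip(humans, models):
        pair = [("human", h), ("model", m)]
        rng.shuffle(pair)  # randomize left/right position to avoid order bias
        items.append({"left": pair[0], "right": pair[1]})
    return items

for item in make_items(human_texts, model_texts):
    # Annotators only ever see the texts, never the 'human'/'model' labels.
    print(item["left"][1], "||", item["right"][1])
```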

For a more thorough investigation of this phenomenon you can check out our research:

The Perils of Using Mechanical Turk to Evaluate Open-Ended Text Generation

26

dojoteef t1_j0ayqqq wrote

See the graphs in the paper that introduced nucleus sampling: The Curious Case of Neural Text Degeneration. They visualize how human-authored text has different statistical properties from machine-generated text. The choice of decoding strategy is mainly a tradeoff between fluency and coherence. Sampling procedures like top-k or nucleus sampling restrict the tokens that can be emitted, which introduces statistical bias into the generated text but produces more fluent output. In contrast, sampling from the full distribution gets closer to the distribution of human-authored text, but often degenerates into incoherence (hence the title of the paper).
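For anyone who wants the mechanics, here's a minimal NumPy sketch of the nucleus (top-p) truncation step (my own illustration, not the paper's code):

```python
# Nucleus (top-p) sampling over a single step's token distribution: keep the
# smallest set of tokens whose cumulative probability reaches p, renormalize,
# and sample. Truncating the tail trades coverage of the human distribution
# (diversity) for fluency.
import numpy as np

def nucleus_sample(probs, p=0.9, rng=None):
    rng = rng or np.random.default_rng()
    order = np.argsort(probs)[::-1]                   # tokens sorted by probability
    cumulative = np.cumsum(probs[order])
    cutoff = int(np.searchsorted(cumulative, p)) + 1  # smallest prefix with mass >= p
    kept = order[:cutoff]
    renormalized = probs[kept] / probs[kept].sum()
    return int(rng.choice(kept, p=renormalized))

vocab_probs = np.array([0.45, 0.25, 0.15, 0.10, 0.05])
print(nucleus_sample(vocab_probs, p=0.9))  # token 4 (the 0.05 tail) can never be chosen
```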

12

dojoteef t1_j02kku4 wrote

This is great! Is it realistically possible to train LLMs à la BLOOM from scratch using these, or just do finetuning? I guess I'm wondering how the training speed scales with more compute nodes.

Even if we assume high end GPUs/TPUs, a frequent bottleneck is throughput due to network latency. How big of an issue is that? For example, I had previously tried scaling to multi-node training on my University's cluster and it turned out that it was faster to do gradient accumulation on a single node than to do multi-node training because the network switches were not purchased with high-throughput in mind.
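For context, the single-node workaround I mean is plain gradient accumulation, roughly like this PyTorch sketch (toy model and data, just to show the pattern):

```python
# Gradient accumulation: simulate a large effective batch on one node instead
# of paying per-step all-reduce cost over a slow interconnect.
import torch
from torch import nn

model = nn.Linear(128, 2)                      # toy stand-in for the real model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()
accum_steps = 8                                # effective batch = 8 x micro-batch

# Toy data loader of (input, label) micro-batches.
loader = [(torch.randn(16, 128), torch.randint(0, 2, (16,))) for _ in range(32)]

optimizer.zero_grad()
for step, (x, y) in enumerate(loader, start=1):
    loss = loss_fn(model(x), y) / accum_steps  # scale so gradients average correctly
    loss.backward()                            # gradients accumulate in .grad
    if step % accum_steps == 0:
        optimizer.step()                       # one update per effective batch
        optimizer.zero_grad()
```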

1

dojoteef t1_j0275on wrote

While there is a field of research investigating federated learning, which might one day allow for an ML@Home type project, as it stands the current algorithms require too much memory, computation, and bandwidth to train very large models like GPT-3.

I'm hopeful that an improved approach will be devised that mitigates these issues (in fact, I have some ideas I'm considering for my next research project), but as it stands they render a real ML@Home type project infeasible.
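To give a sense of why bandwidth alone is prohibitive, here's a toy FedAvg-style round with NumPy stand-ins for models (numbers purely illustrative): every round ships a full copy of the weights to and from each participant.

```python
# Toy federated-averaging round to show the bandwidth problem: each round moves
# the full parameter vector down to and back up from every participant.
import numpy as np

def local_update(global_weights, seed):
    """Stand-in for local training: perturb the global weights slightly."""
    rng = np.random.default_rng(seed)
    return global_weights - 0.01 * rng.normal(size=global_weights.shape)

global_weights = np.zeros(175_000)  # a real GPT-3 has ~175B params, not 175K
for round_idx in range(3):
    client_weights = [local_update(global_weights, seed=round_idx * 10 + c) for c in range(4)]
    global_weights = np.mean(client_weights, axis=0)  # federated averaging
    # Per client per round: roughly 2 * params * 4 bytes (download + upload) in fp32.
    print(f"round {round_idx}: ~{2 * global_weights.size * 4 / 1e6:.1f} MB per client")
```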

1

dojoteef t1_iyw254f wrote

Mistakes happen. In this case the authors reported the issue publicly and should be commended for that.

The NeurIPS organizers can choose to address the issue in whatever way they deem appropriate, especially as the authors are not hiding the fact that their results were changed.

Of course you're free to assume it's malicious if you want (at least that seems to be the stance you're taking, but if it's not then I might have misinterpreted your response).

174