terath
terath t1_j9x6v7k wrote
Reply to comment by gt33m in [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt
My point is that you don’t need AI for this: you can already hire a hundred people to spread propaganda manually. That’s been going on for a few years now. AI makes it cheaper, yes, but banning or restricting AI in no way fixes it.
People are very enamoured with AI but seem to ignore the many existing technological tools already being used to disrupt things today.
terath t1_j9u4o7b wrote
Reply to comment by perspectiveiskey in [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt
If we're getting philosophical: in a weird way, if we ever do manage to build human-like AI, and I personally don't believe we're at all close yet, that AI may well be our legacy. Long after we've all died, that AI could potentially still survive in space or in environments we can't.
Even if we somehow survive for millennia, it will always be near infeasible for us to travel between the stars. But it would be pretty easy for an AI that can just put itself into sleep mode for the time it takes to move between systems.
If such a thing happens, I just hope we don't truly build them in our image. The universe doesn't need such an aggressive and illogical species spreading. It deserves something far better.
terath t1_j9sd368 wrote
Reply to comment by perspectiveiskey in [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt
This is already happening, but the problem is humans, not AI. Even without AI we are descending into an era of misinformation.
terath t1_j8oemyz wrote
Another key phrase to use with Google Scholar is "online learning". This is the setting where you have a stream of new examples and you update a model one example at a time. Usually you can use the model for inference at any point in this process, and some algorithms in this area are designed to be a bit more aggressive, or at least to control the update rate so the model adapts more quickly or more slowly to new data.
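To make that concrete, here's a minimal sketch of online learning using scikit-learn's `partial_fit`; the model choice, synthetic data, and learning-rate settings are just illustrative:

```python
# Minimal sketch of online learning: update a linear model one example at a time.
import numpy as np
from sklearn.linear_model import SGDClassifier

# constant learning rate (eta0) is one way to control how fast the model adapts
model = SGDClassifier(learning_rate="constant", eta0=0.01)
classes = np.array([0, 1])  # must be declared on the first partial_fit call

rng = np.random.default_rng(0)
for step in range(1000):
    # pretend each (x, y) pair arrives from a live stream
    x = rng.normal(size=(1, 4))
    y = np.array([int(x[0, 0] + x[0, 1] > 0)])
    model.partial_fit(x, y, classes=classes)

    # the model can be used for inference at any point in the stream
    if step % 250 == 0:
        print(step, model.predict(rng.normal(size=(1, 4))))
```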
terath t1_j77unze wrote
Reply to comment by EmbarrassedHelp in [N] GitHub CEO on why open source developers should be exempt from the EU’s AI Act by EmbarrassedHelp
Why can't they just block the EU ip address blocks and put a disclaimer that this is not authorized for download in the EU?
terath t1_j5l8t4k wrote
Reply to comment by WigglyHypersurface in [D] Embedding bags for LLMs by WigglyHypersurface
Oh I see what you mean. I remember that there were some character-level language models, but they fell out of favour to subwords, as I think the accuracy difference wasn't enough to justify the extra compute required at the character level.
Reviewing the fastText approach, they still end up hashing the character n-grams rather than training an embedding for each, which could introduce the same sorts of inconsistencies you're observing. That said, the final fastText embeddings are already the sum of the character n-gram embeddings, so I'm not clear on how your approach differs from just using the final fastText embeddings.
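For reference, here's a rough sketch of the fastText trick I mean: hash character n-grams into a fixed-size table and sum their vectors to get a word vector. The bucket count, dimension, and hash function are placeholders, not the real implementation:

```python
import numpy as np

NUM_BUCKETS = 2**20   # fixed hash table size; n-grams can collide by design
DIM = 100             # embedding dimension
table = np.random.default_rng(0).normal(scale=0.1, size=(NUM_BUCKETS, DIM))

def char_ngrams(word, n_min=3, n_max=6):
    w = f"<{word}>"  # boundary markers around the word
    return [w[i:i + n] for n in range(n_min, n_max + 1)
            for i in range(len(w) - n + 1)]

def word_vector(word):
    # hash each n-gram into a bucket and sum the bucket vectors;
    # note: Python's built-in hash is randomized per process, a real
    # implementation would use a fixed hash function
    idx = [hash(g) % NUM_BUCKETS for g in char_ngrams(word)]
    return table[idx].sum(axis=0)

print(word_vector("embedding").shape)  # (100,)
```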
terath t1_j5kz6tz wrote
Reply to [D] Embedding bags for LLMs by WigglyHypersurface
Have you not heard of byte pair encoding? There are plenty of subword tokenizers and many language models are built on them.
Here is a quick article on them: https://towardsdatascience.com/byte-pair-encoding-subword-based-tokenization-algorithm-77828a70bee0
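If it helps, here's a toy sketch of the core BPE training loop: repeatedly find the most frequent adjacent symbol pair in the corpus and merge it into a new symbol. The corpus and number of merges below are made up, and a real tokenizer would also handle word boundaries and an open vocabulary:

```python
from collections import Counter

def most_frequent_pair(vocab):
    # count adjacent symbol pairs, weighted by word frequency
    pairs = Counter()
    for symbols, freq in vocab.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0] if pairs else None

def merge_pair(vocab, pair):
    # replace every occurrence of the pair with a single merged symbol
    merged = {}
    for symbols, freq in vocab.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# toy corpus: word -> frequency, each word split into characters
vocab = {tuple("lower"): 5, tuple("lowest"): 2, tuple("newer"): 6, tuple("wider"): 3}
for _ in range(5):
    pair = most_frequent_pair(vocab)
    if pair is None:
        break
    vocab = merge_pair(vocab, pair)
    print("merged", pair)
```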
terath t1_j50rz6q wrote
Reply to [D] ML Researchers/Engineers in Industry: Why don't companies use open source models more often? by tennismlandguitar
They probably do use open-source architectures and maybe code, but they often train their own model on their own data. This is both because the research training sets don't match whatever domain a company needs to work in, and because many research data set licenses forbid commercial use.
terath t1_j9xdc0i wrote
Reply to comment by gt33m in [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt
AI has a great many positive uses. Guns not so much. It’s not a good comparison. Nuclear technology might be better, and I’m not for banning nuclear either.