DigThatData

DigThatData t1_jeb49b8 wrote

this is probably not a concern for whale vocalizations, but an issue with attempting to decode animal communication generally via LLMs is that animals are probably communicating as much information (if not more) non-vocally. for example, if we wanted to train an LLM to "understand" dog communication, it'd probably be more important to provide it with signals corresponding to changes in body and facial pose than with vocalizations. interesting stuff in any event.
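
to make that concrete, here's a rough sketch (purely hypothetical model, names, and dimensions; pytorch assumed) of one way to feed pose signals alongside audio into the same sequence model:

import torch
import torch.nn as nn

# hypothetical multimodal encoder: per-timestep pose keypoints and audio
# features are projected into a shared embedding space and fed to one
# transformer, so the model can attend over both channels jointly.
class DogCommsEncoder(nn.Module):
    def __init__(self, pose_dim=34, audio_dim=128, d_model=256):
        super().__init__()
        self.pose_proj = nn.Linear(pose_dim, d_model)
        self.audio_proj = nn.Linear(audio_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, pose_seq, audio_seq):
        # both inputs are (batch, time, dim); sum the projections per timestep
        x = self.pose_proj(pose_seq) + self.audio_proj(audio_seq)
        return self.encoder(x)

# toy usage: 8 clips, 50 timesteps each
enc = DogCommsEncoder()
out = enc(torch.randn(8, 50, 34), torch.randn(8, 50, 128))
print(out.shape)  # torch.Size([8, 50, 256])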

3

DigThatData t1_jea10sg wrote

yeah... i hate to say it but I agree with the other commenters. If you have access to medical support, I strongly recommend you get seen by a physician. I'm concerned you might be experiencing some kind of psychiatric episode. If you're skeptical that's fine, you can even tell them that.

> "Strangers on the internet expressed concern that I might be experiencing a psychiatric episode of some kind. I don't see it, but enough people suggested it that I felt it merited a professional opinion, so here I am."

6

DigThatData t1_jdsbb8w wrote

well, i was able to use ChatGPT to generate a novel, functional, complete software library for me, including a test suite, tutorial, and announcement blog post. crazy idea: maybe you just need to get a bit more creative with your prompting, or anticipate that certain applications might need multi-stage prompts (or god forbid: back-and-forth dialogue and iteration).
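
for what it's worth, the "multi-stage" part can be as simple as feeding the model's own output back in with a follow-up instruction. a minimal sketch, assuming the openai python client (>= 1.0) and placeholder prompt text:

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def ask(messages):
    # single round trip to the chat completions endpoint
    resp = client.chat.completions.create(model="gpt-4", messages=messages)
    return resp.choices[0].message.content

# stage 1: ask for a first draft of the library's core module
history = [{"role": "user", "content": "Write a small Python module that does X."}]
draft = ask(history)

# stage 2: iterate on the model's own output with a follow-up instruction
history += [
    {"role": "assistant", "content": draft},
    {"role": "user", "content": "Now write a pytest test suite for that module."},
]
print(ask(history))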

3

DigThatData t1_jdmvjyb wrote

dolly is important precisely because the foundation model is old. they were able to get chatgpt-level performance out of it, and they only trained it for three hours. just because the base model is old doesn't mean this isn't recent research. it demonstrates:

  • the efficacy of instruct finetuning
  • that instruct finetuning doesn't require the world's biggest, most modern model, or even all that much data (rough sketch below)

dolly isn't research from a year ago; it was only just described for the first time a few days ago.
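
for anyone curious what that looks like mechanically, here's a minimal sketch of instruct finetuning an old-ish base model on a modest instruction dataset (this is not the actual databricks training script; the model/dataset names and hyperparameters are just illustrative, and a 6B model realistically needs multiple GPUs plus something like deepspeed):

from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

# old base model + ~52k instruction/response pairs
base_model = "EleutherAI/gpt-j-6B"
tok = AutoTokenizer.from_pretrained(base_model)
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model)
ds = load_dataset("tatsu-lab/alpaca")["train"]

def format_and_tokenize(ex):
    # wrap each pair in a simple instruction/response template
    text = f"### Instruction:\n{ex['instruction']}\n\n### Response:\n{ex['output']}"
    return tok(text, truncation=True, max_length=512)

tokenized = ds.map(format_and_tokenize, remove_columns=ds.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="dolly-ish", num_train_epochs=1,
                           per_device_train_batch_size=4, fp16=True),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()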

EDIT: ok I just noticed you have an ERNIE model up there, so this "no old foundation models" thing is just inconsistent.

5

DigThatData t1_j9s23ds wrote

> Isn't there a difference between the two, because the latter concerns a human trying to pursue a certain goal (maximize user engagement), and giving the AI that goal.

in the paperclip maximization parable, "maximize paperclips" is a directive assigned to an AGI owned by a paperclip manufacturer; the AGI consequently concludes that things like "destabilize currency to make paperclip materials cheaper" and "convert resources necessary for human life to exist into paperclip factories" are good ideas. so no, maximizing engagement at the cost of the stability of human civilization is not "aligned", in exactly the same way that maximizing paperclip production isn't aligned.

8

DigThatData t1_j9rzrzd wrote

if a "sufficiently advanced AI" could achieve "its own goals" that included "humanity going extinct" (at least as a side effect) in such a fashion that humanity did the work of putting itself out of extinction on its own needing only the AGIs encouragement, it would. In other words, the issues I described are indistinguishable from the kinds of bedlam we could reasonably expect an "x-risk AGI" to impose upon us. ipso facto, if part of the alignment discussion is avoiding defining precisely what "AGI" even means and focusing only on potential risk scenarios, the situation we are currently in is one in which it is unclear that a hazardous-to-human-existence AGI doesn't already exist and is already driving us towards our own extinction.

instead of "maximizing paperclips," "it" is just trying to maximize engagement and click-through rate. and just like the paperclips thing, "it" is burning the world down trying to maximize the only metrics it cares about. "it" just isn't a specific agent, it's a broader system that includes a variety of interacting algorithms and platforms forming a kind of ecosystem of meta-organisms. but the nature of the ecosystem doesn't matter for the paperclip maximization parable to apply.

7

DigThatData t1_j9rux16 wrote

I think the whole "paperclip" metaphor describes problems that are already here. a lot of "alignment" discussion feels to me like passengers on a ship theorizing about what would happen if the ship became sentient, turned evil, and decided to crash into the rocks, all while the ship has already crashed into the rocks and is taking on water. It doesn't matter if the ship turns evil in the future: it's already taking us down, whether it crashed into the rocks on purpose or not. See also: the contribution of social media recommendation systems to self-destructive human behaviors including political radicalization, stochastic terrorism, xenophobia, fascism, and secessionism. Oh yeah, also we're arguing over the safety of vaccines during an epidemic and still ignoring global warming, but for some reason public health and environmental hazards don't count as "x-risks".

5

DigThatData t1_j9k17rr wrote

it's not. tree ensembles scale gloriously, as do approximate nearest neighbor methods. there are certain (and growing) classes of problems for which deep learning produces seemingly magical results, but that doesn't mean it's the only path to a functional solution. it'll probably give you the best solution, but it's not the only way to do things.

in any event, if you want to better understand scaling properties of DL algorithms, a good place to start is the "double descent" literature.
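
if you want a feel for how far the non-DL toolbox stretches, here's a minimal sketch (synthetic data, illustrative parameters, assumes scikit-learn and annoy are installed) pairing a gradient-boosted tree ensemble with an approximate nearest neighbor index:

from annoy import AnnoyIndex  # approximate nearest neighbors
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import train_test_split

# tree ensemble on a reasonably large synthetic tabular dataset
X, y = make_classification(n_samples=100_000, n_features=50, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = HistGradientBoostingClassifier().fit(X_tr, y_tr)
print("tree ensemble accuracy:", clf.score(X_te, y_te))

# approximate nearest neighbor index over the same feature vectors
index = AnnoyIndex(X.shape[1], "euclidean")
for i, v in enumerate(X_tr):
    index.add_item(i, v)
index.build(10)  # number of random projection trees
print("10 approximate neighbors of the first test point:",
      index.get_nns_by_vector(X_te[0], 10))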

3

DigThatData t1_j95gxlf wrote

Reply to comment by maxToTheJ in [D] Please stop by [deleted]

i think something changed in the past week though. /r/MLQuestions has recently been getting a lot of "can you recommend a free AI app that does <generic thing>?" posts. I'm wondering if a news piece went viral or something like that and turned a new flood of people on to what's been happening in AI.

1

DigThatData t1_j8xxnpp wrote

unrelated to OP: what is the "best practice" way for a notebook to detect whether it's running in a colab environment? i think the method I'm currently using is something like

# assume we're not in colab unless the import below succeeds
probably_colab = False
try:
    import google.colab  # this module only exists inside a Colab runtime
    probably_colab = True
except ImportError:
    pass

which I'm not a fan of for a variety of reasons. what would you recommend?
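
(the main alternative i'm aware of is sniffing sys.modules instead of forcing the import, roughly the sketch below, but that feels similarly hacky.)

import sys

# relies on colab having already imported its own machinery into the runtime
probably_colab = "google.colab" in sys.modules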

5