DigThatData t1_jea10sg wrote
Reply to comment by TheAdvisorZabeth in [R] LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention by floppy_llama
yeah... i hate to say it but I agree with the other commenters. If you have access to medical support, I strongly recommend you get seen by a physician. I'm concerned you might be experiencing some kind of psychiatric episode. If you're skeptical that's fine, you can even tell them that.
> "Strangers on the internet expressed concern that I might be experiencing a psychiatric episode of some kind. I don't see it, but enough people suggested it that I felt it merited a professional opinion, so here I am."
DigThatData t1_je8pm87 wrote
Reply to comment by dreaming_geometry in [R] LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention by floppy_llama
I've decided to just lean into it and am literally just giving my ideas away. https://github.com/dmarx/bench-warmers
DigThatData t1_je600q2 wrote
Reply to comment by alyflex in [D] Alternatives to fb Hydra? by alyflex
i misunderstood, i thought you were looking for an alternative config component. if you're looking for an alternative for managing hyperparameter search jobs, consider https://docs.ray.io/en/latest/tune/index.html . I think hydra might actually even integrate with ray.
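for example, a minimal tune sketch might look something like this (the objective function and search space here are made up purely for illustration):

    from ray import tune

    def objective(config):
        # stand-in for a real training run; returning a dict reports final metrics
        score = (config["lr"] - 0.01) ** 2
        return {"score": score}

    tuner = tune.Tuner(
        objective,
        param_space={"lr": tune.loguniform(1e-5, 1e-1)},
        tune_config=tune.TuneConfig(num_samples=10, metric="score", mode="min"),
    )
    results = tuner.fit()
    print(results.get_best_result().config)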
DigThatData t1_je5kfoc wrote
Reply to [D] Alternatives to fb Hydra? by alyflex
go closer to the metal and use omegaconf directly.
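a minimal omegaconf sketch, just to show the kind of thing it already handles on its own (keys and values are illustrative):

    from omegaconf import OmegaConf

    base = OmegaConf.create({
        "model": {"lr": 0.001, "layers": 4},
        "run_name": "lr=${model.lr}",  # value interpolation
    })
    overrides = OmegaConf.from_dotlist(["model.lr=0.0003"])  # e.g. parsed from CLI args
    cfg = OmegaConf.merge(base, overrides)

    print(cfg.run_name)            # "lr=0.0003"
    print(OmegaConf.to_yaml(cfg))

you get typed access, merging, and interpolation without the extra layers hydra adds on top.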
DigThatData t1_jdsbb8w wrote
Reply to [D] GPT4 and coding problems by enryu42
well, i was able to use ChatGPT to generate a novel, functional, complete software library for me, including a test suite, tutorial, and announcement blog post. crazy idea: maybe you just need to get a bit more creative with your prompting, or anticipate that certain applications might need multi-stage prompts (or, god forbid, back-and-forth dialogue and iteration).
DigThatData t1_jdpza0l wrote
Reply to comment by Puzzleheaded_Acadia1 in [N] March 2023 - Recent Instruction/Chat-Based Models and their parents by michaelthwan_ai
it's an RNN
DigThatData t1_jdmvquq wrote
Reply to comment by Puzzleheaded_Acadia1 in [N] March 2023 - Recent Instruction/Chat-Based Models and their parents by michaelthwan_ai
the fact that it's comparable at all is pretty wild and exciting
DigThatData t1_jdmvjyb wrote
Reply to comment by michaelthwan_ai in [N] March 2023 - Recent Instruction/Chat-Based Models and their parents by michaelthwan_ai
dolly is important precisely because the foundation model is old. they were able to get chatgpt-level performance out of it with only three hours of training. just because the base model is old doesn't mean this isn't recent research. it demonstrates:
- the efficacy of instruct finetuning
- that instruct finetuning doesn't require the world's biggest, most modern model, or even all that much data
dolly isn't research from a year ago; it was described for the first time only a few days ago.
EDIT: ok I just noticed you have an ERNIE model up there so this "no old foundation models" thing is just inconsistent.
DigThatData t1_jdmv87n wrote
don't forget Dolly, the databricks model that was successfully instruct-finetuned on gpt-j-6b in 3 hours
DigThatData t1_jcbdl00 wrote
Reply to comment by canopey in [D] What do people think about OpenAI not releasing its research but benefiting from others’ research? Should google meta enforce its patents against them? by [deleted]
it started with the GPT-2 non-release for "safety reasons"
DigThatData t1_jcbdeka wrote
Reply to comment by Deep-Station-1746 in [D] What do people think about OpenAI not releasing its research but benefiting from others’ research? Should google meta enforce its patents against them? by [deleted]
i don't see the analogy here, i'm wondering if maybe you're misunderstanding: they have a patent over the technique. not "a dropout", all dropout.
DigThatData t1_jbthgbj wrote
I think it might be easier to compare if you flip the vertical axis on one of them. you can just negate the values of that component; it won't change the topology (the relations of the points relative to each other).
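rough sketch of what i mean (random points standing in for the embedding):

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(0)
    emb = rng.normal(size=(100, 2))      # stand-in for a 2d embedding/projection
    flipped = emb * np.array([1, -1])    # negate the second component

    fig, axes = plt.subplots(1, 2, figsize=(8, 4))
    axes[0].scatter(emb[:, 0], emb[:, 1])
    axes[0].set_title("original")
    axes[1].scatter(flipped[:, 0], flipped[:, 1])
    axes[1].set_title("second component negated")
    plt.show()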
DigThatData t1_j9s23ds wrote
Reply to comment by royalemate357 in [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt
> Isn't there a difference between the two, because the latter concerns a human trying to pursue a certain goal (maximize user engagement), and giving the AI that goal.
in the paperclip maximization parable, "maximize paperclips" is a directive assigned to an AGI owned by a paperclip manufacturer, which consequently concludes that things like "destabilize currency to make paperclip materials cheaper" and "convert resources necessary for human life to exist into paperclip factories" are good ideas. so no: maximizing engagement at the cost of the stability of human civilization is misaligned in exactly the same way that maximizing paperclip production is misaligned.
DigThatData t1_j9rzrzd wrote
Reply to comment by royalemate357 in [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt
if a "sufficiently advanced AI" could achieve "its own goals" that included "humanity going extinct" (at least as a side effect) in such a fashion that humanity did the work of putting itself out of extinction on its own needing only the AGIs encouragement, it would. In other words, the issues I described are indistinguishable from the kinds of bedlam we could reasonably expect an "x-risk AGI" to impose upon us. ipso facto, if part of the alignment discussion is avoiding defining precisely what "AGI" even means and focusing only on potential risk scenarios, the situation we are currently in is one in which it is unclear that a hazardous-to-human-existence AGI doesn't already exist and is already driving us towards our own extinction.
instead of "maximizing paperclips," "it" is just trying to maximize engagement and click-through rate. and just like the paperclips thing, "it" is burning the world down trying to maximize the only metrics it cares about. "it" just isn't a specific agent, it's a broader system that includes a variety of interacting algorithms and platforms forming a kind of ecosystem of meta-organisms. but the nature of the ecosystem doesn't matter for the paperclip maximization parable to apply.
DigThatData t1_j9rux16 wrote
Reply to [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt
I think the whole "paperclip" metaphor describes problems that are already here. a lot of "alignment" discussion feels to me like passengers on a ship theorizing about what would happen if the ship became sentient, turned evil, and decided to crash into the rocks, all while the ship has already crashed into the rocks and is taking on water. It doesn't matter if the ship turns evil in the future: it's already taking us down, whether it crashed into the rocks on purpose or not. See also: the contribution of social media recommendation systems to self-destructive human behaviors including political radicalization, stochastic terrorism, xenophobia, fascism, and secessionism. Oh yeah, also we're arguing over the safety of vaccines during a pandemic and still ignoring global warming, but for some reason public health and environmental hazards don't count as "x-risks".
DigThatData t1_j9nlrii wrote
Reply to comment by JackBlemming in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
just to be clear: i'm not saying neural networks don't scale, i'm saying they're not the only class of learning algorithm that scales.
DigThatData t1_j9k17rr wrote
it's not. tree ensembles scale gloriously, as do approximations of nearest neighbors. there are certain (and growing) classes of problems for which deep learning produces seemingly magical results, but that doesn't mean it's the only path to a functional solution. it'll probably give you the best solution, but it isn't the only way to get one.
in any event, if you want to better understand scaling properties of DL algorithms, a good place to start is the "double descent" literature.
DigThatData t1_j95gxlf wrote
Reply to comment by maxToTheJ in [D] Please stop by [deleted]
i think something changed in the past week though. /r/MLQuestions has recently been getting a lot of "can you recommend a free AI app that does <generic thing>?". I'm wondering if a news piece went viral or something and turned a new flood of people on to what's been happening in AI.
DigThatData t1_j8xxnpp wrote
Reply to comment by ckperry in [N] Google is increasing the price of every Colab Pro tier by 10X! Pro is 95 Euro and Pro+ is 433 Euro per month! Without notifying users! by FreePenalties
unrelated to OP: what is the "best practice" method for a notebook to self-test if it's running in a colab environment? i think the method I'm currently using is something like
    probably_colab = False
    try:
        import google.colab
        probably_colab = True
    except ImportError:
        pass
which I'm not a fan of for a variety of reasons. what would you recommend?
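for comparison, one alternative i've seen that avoids wrapping an import in try/except (not claiming it's the officially recommended check):

    import importlib.util
    import sys

    # check whether the module is available, or whether it's already been imported
    probably_colab = importlib.util.find_spec("google.colab") is not None
    already_imported = "google.colab" in sys.modules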
DigThatData t1_j8qkl3f wrote
Reply to comment by autoraft in [P] Build data web apps in Jupyter Notebook with Python only by pp314159
all I know is voila works with panel, and panel works with basically everything (ipywidgets, bokeh, plotly...). not sure about streamlit/gradio.
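e.g. a tiny panel sketch (the widget and function are purely illustrative) that voila or `panel serve` could turn into a web app:

    import panel as pn

    pn.extension()

    n = pn.widgets.IntSlider(name="n", start=1, end=10, value=3)

    def squares(k):
        return f"first {k} squares: {[i ** 2 for i in range(1, k + 1)]}"

    pn.Column(n, pn.bind(squares, n)).servable()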
DigThatData t1_j8or4jr wrote
i feel like voila is pretty hard to beat, especially considering it already ships with jupyter. just change the word "tree" in your URL to "voila" and bam: your notebook's a webapp.
DigThatData t1_jeb49b8 wrote
Reply to comment by currentscurrents in [R] LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention by floppy_llama
this is probably not a concern for whale vocalizations, but an issue for attempting to decode animal communications generally via LLMs is that they're probably communicating as much information (if not more) non-vocally. for example, if we wanted to train an LLM to "understand" dog communication, it'd probably be more important to provide it with signals corresponding to changes in body and face pose than vocalizations. interesting stuff in any event.