sam__izdat t1_jdps8rk wrote

>Why would that be a disingenuous definition?

Doesn't matter if it's disingenuous. What it's implying is ridiculous. It would be more surprising if the linear regression model didn't work at all. The fact that it can correlate fMRI data better than random doesn't mean you've replicated how language works in the brain, let alone how it's acquired.

> In general, your defense of generative linguistics is very weak. It's just invective and strawmen, and it reeks of desperation.

I don't have any horse in the race or anything to be desperate about. It's just an astonishingly stupid proposition.

I should say, I am not qualified to defend or refute generative linguistics (though that clearly was no obstacle for the author), and I don't know anything about it. I do feel qualified (because I can read and check sources) to dismiss this embarrassing pile of nonsense, though, as it's just so plainly nonsense that it doesn't take an expert to dismiss its bombastic claims as pseudoscience -- and I'm talking about Piantadosi here and not his references, which, for all I know, are serious research misrepresented by a dunce. I'm not in academia and I don't feel the need to be any more diplomatic about this than he was toward linguists in his pdf-format blog post.


sam__izdat t1_jchg8nd wrote

I'll leave it to the linguists to debate UG and the specifics of what it does and doesn't mean, but commonalities like some sort of hierarchy, recursion, structure-dependence of rules, etc. clearly exist, whatever you want to call them. By shared I just mean there are specific things that human cognitive faculties are set up to do and then other (often computationally simpler) things they clearly don't do. But again, if you're just saying natural languages are not formal languages, I guess that's true by definition. It just sounded to me like you were implying something different.


sam__izdat t1_jch4kn0 wrote

It is a "structured thing" because it has concrete definable grammatical rules, shared across essentially every language and dialect, and common features, like an infinite range of expression and recursion. If language didn't have syntactic structure we'd just be yelling signals at each other, instead of doing what we're doing now. There would be nothing for GPT to capture.


sam__izdat t1_jch1c32 wrote

I'm familiar with the terms, but saying e.g. "imaginary numbers don't exist because they're called imaginary" is not making a meaningful statement. All you've said is that German is not C++, and we have a funny name for that. And that's definitely one of the fuzzier interactions you can have about this, but I'm not sure how it proves that natural languages (apparently? if I'm reading this right...) lack structure.


sam__izdat t1_jceowxm wrote

Ridiculously unfounded claim based on a just plain idiotic premise. Children don't learn language by cramming petabytes of text documents to statistically infer the most plausible next word in a sentence, nor do they accept input with arbitrary syntactic rules. Right or wrong, the minimalist program and Merge offer a plausible partial explanation for a recent explosion of material culture -- which did not happen gradually or across multiple species -- consistent with what we can observe in real human beings. GPT, on the other hand, is not a plausible explanation for anything in the natural world, and has basically nothing inherently to do with human language. He's not wrong that it's a bulldozer. It will just as happily accommodate a made-up grammar that has nothing in common with any that a person could ever use, as it would English or Japanese.

> Chomsky et al. 2023 tilt at an imagined version of these models, while ignoring the fact that the real ones so aptly capture syntax, a success Chomsky and others have persistently claimed was impossible.

Exactly the opposite is true. Transformers are general-purpose computers that will gobble up almost anything you can throw at them. His objection was to the "defect" that they will capture any arbitrary syntax, which means they aren't interesting or helpful to cognitive scientists -- just like a backhoe doesn't offer any insight into how people, in biological terms, are able to lift heavy objects. What he said was impossible, when approached about it decades ago, was to do these things without resorting to brute force in the absence of an actual theoretical framework and computational model for how language works in the brain. That statement is just as correct today as it was in the 1950s, and the rigorous theory of "let's keep cramming in data and stirring the big ol' pot of linear algebra until candy comes out" doesn't do anything to change that picture.
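A toy sketch of the indifference point (my own illustration, with a maximum-likelihood bigram model standing in for the transformer): a purely statistical learner has no notion of which symbols or grammars are humanly possible, so it fits a corpus relabeled into arbitrary nonsense tokens exactly as well as the English it came from.

```python
import math
from collections import Counter

def bigram_logloss(sentences):
    """Fit a maximum-likelihood bigram model on `sentences` and
    return its per-token log-loss on that same data."""
    bigrams, contexts = Counter(), Counter()
    for s in sentences:
        toks = ["<s>"] + s + ["</s>"]
        for a, b in zip(toks, toks[1:]):
            bigrams[(a, b)] += 1
            contexts[a] += 1
    total = sum(bigrams.values())
    return -sum(c * math.log(c / contexts[a])
                for (a, b), c in bigrams.items()) / total

corpus = [
    "the dog chased the cat".split(),
    "the cat saw the dog".split(),
    "the dog slept".split(),
]

# Relabel every word with an arbitrary nonsense symbol (a bijection).
vocab = sorted({w for s in corpus for w in s})
mapping = {w: f"tok{i}" for i, w in enumerate(vocab)}
scrambled = [[mapping[w] for w in s] for s in corpus]

# The learner fits the unpronounceable "language" exactly as well:
print(bigram_logloss(corpus), bigram_logloss(scrambled))
```

The two losses are identical, because nothing in the learner distinguishes "dog" from "tok2" -- it's a bulldozer, not a theory of the operator.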

> In particular, we may view each model or set of modeling assumptions as a possible hypothesis about how the mind may work. Testing how well a model matches humanlike behavior then provides a scientific test of that model’s assumptions.

I see. So, by that reasoning, Deep Blue affirmed our hypothesis that Kasparov got really good at chess by evaluating 200,000,000 positions per second. Nonsense on stilts.

> Hosseini et al. 2022 present evidence that models trained on developmentally-plausible amounts of data already capture human neural responses to language in the brain

If you define "developmentally plausible" as "100 million tokens" and "capture human neural responses" as "fasten together a linear regression model to make statistically significant correlations with some coarse fMRI activity."
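To spell out why regression-based "capture" claims deserve skepticism, here's a toy cautionary sketch (mine, not the paper's protocol, which does use held-out evaluation): when there are more regression features than data points, ordinary least squares can "predict" anything in-sample, including pure noise, so goodness of fit alone establishes very little.

```python
import numpy as np

rng = np.random.default_rng(0)
n_stimuli, n_features = 60, 200  # more features than data points

X = rng.normal(size=(n_stimuli, n_features))  # stand-in "model embeddings"
y = rng.normal(size=n_stimuli)                # pure-noise "fMRI response"

# Min-norm least-squares fit; with n_features > n_stimuli it interpolates.
w, *_ = np.linalg.lstsq(X, y, rcond=None)
r = np.corrcoef(X @ w, y)[0, 1]
print(f"in-sample correlation with pure noise: {r:.3f}")
```

The correlation comes out at essentially 1.0 despite there being no signal at all, which is why the strength of such a result lives entirely in the validation details, not in the headline correlation.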

> After decades of privilege and prominence in linguistics, Noam Chomsky’s approach to the science of language is experiencing a remarkable downfall. The story is, in part, a cautionary tale about what happens when an academic field isolates itself from what should be complementary endeavours.

Sound advice for overconfident, doe-eyed futurists guzzling the Silicon Valley Kool-Aid and assuming they can dispense with common sense and just wing it with anthropology, biology and cog sci.

edit - and, by the way, it's actually kind of impressive for a person to miss this point so completely:

> Generate ten other sentences like [colorless green ideas]

> Purple fluffy clouds dream wildly

> Blue glittery unicorns jump excitedly

> Orange sparkly penguins waddle energetically

> Red shimmering elephants trumpet melodiously

> Yellow dazzling lions roar majestically

> Green glittery monkeys swing energetically

> Pink sparkly giraffes walk gracefully

> Black shiny kangaroos hop playfully

> White glittery bears cuddle affectionately

> Brown shimmering rabbits hop merrily.

Literally none of these cut-and-paste color-texture-animal-verb-adverb responses satisfies the query, and to get even a little bit closer with GPT, you have to explicitly beg for examples of contradictions and category errors.


sam__izdat t1_jccyxl4 wrote

As a spectator, it's the standard story that's played out a million times now. I see ML as pre-scientific. If capital is allowed to take full control and call all the shots, it's not moving past that "pre" any time soon. It'll be a digital Taylorist discipline for PR industry surveillance and optimizing Amazon packers' pee breaks, and the brief flurry of actually useful progress is probably done.


sam__izdat t1_j9sj2zl wrote

> I think in the opposite way: if alignment is possible, then alignment is profoundly dangerous.

Exactly. What is this neoliberal fever dream? "But what if the computer doesn't do what they want?!" -- my god, what if it does? Are we living on the same planet? Have you seen what they want?

I love how the core of the panic is basically:

"Oh my god, what if some kind of machine emerged, misaligned with human interests and totally committed to extracting what it wants from the material world, no matter the cost, seeing human life and dignity as an obstruction to its function?!"

Yeah, wow... what if?! That'd be so crazy! Glad we don't have anything like that.


sam__izdat t1_j9si9wx wrote

Worry less about misalignment of Skynet, the impending singularity and the rise of the robots, which is science fiction, and worry more about misalignment of class interests and misalignment of power, which is our reality.

For the former, it's still mostly an ambitious long-term goal to simulate the world's simplest nematodes. There's hardly any reason to believe anyone's appreciably closer to AGI now than they were in the 1950s. For the latter, though, there are well-founded concerns that automation will be used for surveillance, disinformation, manipulation, class control, digital Taylorism and other horrifying purposes, as the species knowingly accelerates toward extinction by ignoring systemic failures like AGW and nuclear war, which pose actual, imminent and growing existential risks -- risks that will be compounded by giving state and capital tools to put the interests of power and short term ROI above even near-term human survival, let alone human dignity or potential.

"What if this pile of linear algebra does some asimov nonsense" is not a serious concern. The real concern is "what if it does exactly what was intended, and those intentions continue to see omnicide an acceptable side effect."


sam__izdat t1_j9ni8rd wrote

> They're still AI-assisted

the USCO has (correctly) repeatedly rejected copyright for the raw output of image generators, where you asked the computer to paint you a pretty picture

the parallels with photography are tenuous at best, and it's not about effort but rather the total absence of creative involvement -- it's less photography and more "I found this on google image search" except your database is the model's latent space

it is a good thing that they elected to forego a radical expansion of the already nightmarish, bloated IP regime, where being first-to-access would have granted users (not artists) a blackstonian property right to the results of a text query

i don't need whoever's hoarding the most compute to mine the commons and automatically pump out self-generating, legally-enforceable NFTs, at an industrial scale, in perpetuity... the world has enough parasites as it is, without a new clan of digital landlords, thank you


sam__izdat t1_j9ngakh wrote

I don't have any technical criticism that would be useful to you (and frankly it's above my pay grade), but to expand on what I meant when I said that it's a game of calvinball, there's some history here worth considering. Copyright has gone through myriad justifications.

If we wanted to detect offending content by the original standards of the Stationers' Company, then it may be useful to look for signs of sedition and heresy, since the stated purpose was "to stem the flow of seditious and heretical texts."

By the justification of the liberals who came after, typesetting, being a costly and error-prone process, forced their hand to protect the integrity of the text. So, if for some reason we wanted to take that goal seriously, it might make sense to look for certain kinds of dissimilarity instead: errors and distortions in reproductions. After all, that was the social purpose of the monopoly right.

If the purpose of the copyright regime today is to secure the profits of private capital in perpetuity, then simple metrics of similarity aren't enough to guarantee a virtual Blackstonian land right either.

For example:

> In our discussions, we refer to C ∈ C abstractly as a “piece of copyrighted data”, but do not specify it in more detail. For example, in an image generative model, does C correspond to a single artwork, or the full collected arts of some artists? The answer is the former. The reason is that if a generative model generates data that is influenced by the full collected artworks of X, but not by any single piece, then it is not considered a copyright violation. This is due to that it is not possible to copyright style or ideas, only a specific expression. Hence, we think of C as a piece of content that is of a similar scale to the outputs of the model.

That sounds reasonable. Is it true?

French and Belgian IP laws, for example, consider taking an original photo of a public space showing protected architecture a copyright violation. Prior to mid 2016, taking a panoramic photo with the Atomium in the background was copyright infringement. Distributing a night photo of the Eiffel tower is still copyright infringement today. So, how would you guarantee that a diffusion model falls within the boundaries of arbitrary rules when those tests of "substantial similarity" suddenly become a lot more ambiguous than anticipated?


sam__izdat t1_j9mgsbr wrote

There's no information-theoretic notion of character copyright, for example. It's a game of calvinball, and has been since the Stationers' Company. It's true that copyright is badly misunderstood and over-generalized to things that it has absolutely nothing to do with, like plagiarism and other notions of (nonexistent) authorship rights, but it isn't a measurable thing either and you can't guarantee that the law and policy will agree with your model.


sam__izdat t1_j9imyry wrote

I have never seen it generate any code that is correct-in-principle, let alone usable, for any non-trivial problem. It may be useful as a kind of impressionist painting of a solution, for those who are already programmers. And for trivial code, you'd frankly be better off just learning to code.

In other words, I don't really see this being remotely useful to someone who doesn't know how to code. If anything, the barrier to entry is higher, because you will need to debug unusable but convincing-looking programs. It's at best a hint or a template and at worst a hindrance.


sam__izdat t1_j99j0iu wrote

You're not likely to get much help there, unfortunately. With SD, your best bet would probably be Dreambooth, which you can get with the Huggingface diffusers library. It might be overcomplicating matters, if the site is representative of your training data, though. GANs can be notoriously difficult to train but it's probably worth a shot here -- it's a pretty basic use case. You might look into data augmentation and try a u-net with a single-channel output.

A slightly more advanced option might be ProGAN. Here's a good video tutorial if that's your thing.


sam__izdat t1_iziau6e wrote

What are you having trouble following? I'm not trying to be rude, but it's already a *less* technical method, because HF's diffusers and accelerate stuff will download everything for you and set it all up. I'd rather it were a little more technical, because it's a bit of a black box.

I was having problems with unhelpful error messages until I updated transformers. I'm still having CUDA illegal memory access errors at the start of training, but I think that's because support for old Tesla GPUs is just fading -- had the same issue with new pytorch trying to run any SD in full precision.


sam__izdat t1_iy6o4z0 wrote

  1. They had 4,000 A100s chewing on it, toward the end. I think it's 5,000 now. You can probably do the math from the info in the model card to figure out how much that is in power bills.

  2. It's licensed under RAIL-M. It is questionable whether this licensing has any legal basis because it's unclear whether models themselves are copyrightable. They allow permissive sublicensing with their inference code. You'll have to look at the wording of the license to see how this is reconciled with RAIL-M's usage-based restrictions.

  3. Yes. You can finetune it cheaply and pretty quickly (maybe an hour or two or even less, depending on GPU and settings) with DreamBooth. Retraining a general-purpose model from scratch is probably out of the reach of most people. There is some code available for training from scratch, though, and a special-purpose model might be doable without millions in resources. I think there's been one or two of those, if I'm not mistaken.


sam__izdat t1_iy5swts wrote

Fingers aside, I don't see much improvement, but if there is any -- and I am only guessing -- I reckon "blurry" and "ugly" are pulling a lot of weight. If you do something like:

> ugly, hands, blurry, low resolution, lowres, text, error, cropped, worst quality, low quality, jpeg artifacts, long neck [etc]

- it will definitely have a pronounced effect. Is it the one you want? Well - maybe, maybe not. But it does seem to make things more professional-looking and the subjects more conventionally attractive. It'll also try to obscure hands completely, which is probably the right call all things considered.

And on top of that there's also the blue car effect. It's entirely possible that putting in "close up photo of a plate of food, potatoes, meat stew, green beans, meatballs, indian women dressed in traditional red clothing, a red rug, donald trump, naked people kissing" will amplify some of what you want and cut out some of what's (presumably) a bunch of irrelevant or low-quality SEO spam. Here's somebody's hypothesis on what might be happening.