neuronexmachina t1_jcax06f wrote

2017 transformers paper for reference: "Attention is all you need" (cited 68K times)

>The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles, by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.0 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.
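
The attention mechanism at the heart of the paper reduces to a few lines. Here's a minimal NumPy sketch of scaled dot-product attention, softmax(QK^T / sqrt(d_k))V; the toy shapes and random inputs are just for illustration:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Core Transformer operation: softmax(QK^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    # Row-wise softmax (shifted by the max for numerical stability)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # each output row is a weighted sum of value rows

# Toy example: 3 tokens, head dimension d_k = 4
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # output keeps the (tokens, d_k) shape of the values
```

Because every token attends to every other token in one matrix multiply, the whole sequence can be processed in parallel, which is the parallelizability win over recurrent models the abstract mentions.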


neuronexmachina t1_j95cntq wrote

>However, my concern is the sophistication of the neural networks that are no doubt classified, most definitely in the hands of government and military.

Makes me wonder how adept the massively-parallel machines the NSA uses to crack encryption are when repurposed for training LLMs and other neural nets.

Or heck, if they secretly have a functioning quantum computer, there are probably some pretty crazy capabilities when combined with transformers/etc.

(I had a link to an article about quantum transformers, but the auto-mod ate it)


neuronexmachina t1_j867ome wrote

Link to MIT summary of study: Solving a machine-learning mystery: A new study shows how large language models like GPT-3 can learn a new task from just a few examples, without the need for any new training data.

Actual preprint and abstract: What learning algorithm is in-context learning? Investigations with linear models

>Neural sequence models, especially transformers, exhibit a remarkable capacity for in-context learning. They can construct new predictors from sequences of labeled examples (x,f(x)) presented in the input without further parameter updates. We investigate the hypothesis that transformer-based in-context learners implement standard learning algorithms implicitly, by encoding smaller models in their activations, and updating these implicit models as new examples appear in the context. Using linear regression as a prototypical problem, we offer three sources of evidence for this hypothesis. First, we prove by construction that transformers can implement learning algorithms for linear models based on gradient descent and closed-form ridge regression. Second, we show that trained in-context learners closely match the predictors computed by gradient descent, ridge regression, and exact least-squares regression, transitioning between different predictors as transformer depth and dataset noise vary, and converging to Bayesian estimators for large widths and depths. Third, we present preliminary evidence that in-context learners share algorithmic features with these predictors: learners' late layers non-linearly encode weight vectors and moment matrices. These results suggest that in-context learning is understandable in algorithmic terms, and that (at least in the linear case) learners may rediscover standard estimation algorithms. Code and reference implementations are released at this https URL.
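
The paper's prototypical setup is easy to reproduce outside a transformer: fit the same ridge objective once in closed form and once by gradient descent, and check that the two predictors coincide. This is an illustrative sketch of that baseline comparison (not the authors' released code); the data, regularizer, and step count are made up:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 3
X = rng.normal(size=(n, d))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=n)

lam = 0.1
# Closed-form ridge regression: w = (X^T X + lam*I)^-1 X^T y
A = X.T @ X + lam * np.eye(d)
w_ridge = np.linalg.solve(A, X.T @ y)

# Gradient descent on the same objective ||Xw - y||^2/2 + lam*||w||^2/2,
# with step size 1/L (L = largest eigenvalue of the Hessian A)
lr = 1.0 / np.linalg.eigvalsh(A).max()
w_gd = np.zeros(d)
for _ in range(5000):
    grad = X.T @ (X @ w_gd - y) + lam * w_gd
    w_gd -= lr * grad

print(np.abs(w_ridge - w_gd).max())  # the two predictors essentially agree
```

The paper's claim is that a trained in-context learner's predictions track exactly this family of estimators as depth and noise vary.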


neuronexmachina t1_j63etuk wrote

A few days ago Rep. Ted Lieu also used ChatGPT to write the first paragraph of his opinion piece in the NYT about AI:

>Imagine a world where autonomous weapons roam the streets, decisions about your life are made by AI systems that perpetuate societal biases and hackers use AI to launch devastating cyberattacks. This dystopian future may sound like science fiction, but the truth is that without proper regulations for the development and deployment of Artificial Intelligence (AI), it could become a reality. The rapid advancements in AI technology have made it clear that the time to act is now to ensure that AI is used in ways that are safe, ethical and beneficial for society. Failure to do so could lead to a future where the risks of AI far outweigh its benefits.

> I didn’t write the above paragraph. It was generated in a few seconds by an A.I. program called ChatGPT, which is available on the internet. I simply logged into the program and entered the following prompt: “Write an attention grabbing first paragraph of an Op-Ed on why artificial intelligence should be regulated.”

>I was surprised at how ChatGPT effectively drafted a compelling argument that reflected my views on A.I., and so quickly. As one of just three members of Congress with a computer science degree, I am enthralled by A.I. and excited about the incredible ways it will continue to advance society. And as a member of Congress, I am freaked out by A.I., specifically A.I. that is left unchecked and unregulated. ....


neuronexmachina t1_j6189hm wrote

It's real GDP, which adjusts for inflation. Without adjusting for inflation, the growth would be 6.5%:

> Real gross domestic product (GDP) increased at an annual rate of 2.9 percent in the fourth quarter of 2022 (table 1), according to the "advance" estimate released by the Bureau of Economic Analysis. In the third quarter, real GDP increased 3.2 percent.

> ... Current‑dollar GDP increased 6.5 percent at an annual rate, or $408.6 billion, in the fourth quarter to a level of $26.13 trillion. In the third quarter, GDP increased 7.7 percent, or $475.4 billion (tables 1 and 3).
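
The gap between the two figures is the implied price change. Nominal and real growth relate multiplicatively, (1 + nominal) = (1 + real)(1 + inflation), so the Q4 numbers imply roughly 3.5% inflation at an annual rate:

```python
# Implied GDP price change from BEA's Q4 2022 figures, all at annual rates:
# (1 + nominal) = (1 + real) * (1 + inflation)
nominal = 0.065
real = 0.029
inflation = (1 + nominal) / (1 + real) - 1
print(f"Implied inflation: {inflation:.1%}")  # roughly 3.5%
```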


neuronexmachina t1_j4wkojl wrote

Do you have a source for that? According to this they haven't had layoffs yet, but it's widely expected they'll announce them in the near future:

>Although several major companies across the tech industry have announced mass layoffs as rising inflation and uncertain macroeconomic conditions fuel recession fears, Alphabet Inc., the parent company of search giant Google, has so far avoided this trend. The tech behemoth is yet to announce any major layoff, despite several reports of upcoming job cuts inside the company.

>However, Alphabet employees will not be immune to these industry-wide layoffs for too long. Employees at Google anticipate an imminent announcement of huge layoffs, according to reports.

>In November, The Information had reported that Alphabet was planning to lay off nearly 10,000 low-performing employees starting early 2023 in order to curb costs.

>A slumping ad market has weighed heavily on Google's bottom line, which in turn has made investors urge the company to seek avenues to cut expenses. Executives at Alphabet are seemingly more focused on cutting project costs and making the company more efficient.


neuronexmachina t1_j4c2poz wrote

Interesting comment from last month's HackerNews thread about IvorySQL:

>It’s also likely in violation of arcane copyright law. The oracle wire protocol includes a handshake procedure that sends a poem in one of the initial messages. It’s untested legal theory whether copying that poem in a new work (e.g. a new driver or compatibility layer) would violate Oracle’s copyright on the poem.


neuronexmachina t1_j2ntanb wrote

I thought this part of the paper's conclusion was pretty interesting:

>Yet today’s geopolitical divide does not depend upon historical ties or cultural affinity. Rather, it finds its basis within politics and political ideology: namely, whether regimes are democratic or authoritarian, and whether societies are liberal or illiberal in their fundamental view of life. In the first category are maritime societies based on trade, the free flow of peoples and ideas, and the protection of individual rights: in this grouping we find the countries of western Europe, the settler societies of both North and South America and Australasia, as well as high-income insular democracies in North Pacific Asia. By contrast, the second category is comprised of historically land-based, continental empires: Iran, Russia, Central Asia, China, and the Arab Middle East. In that sense, comparisons with the Cold War are not entirely mistaken. For even though this latter grouping spans the full range of political institutions and ideologies – from Islamism to secular communism, and from traditionalist monarchism to mass movement populism – they are united in their rejection of western modernity, and its associated political and social alternative.


neuronexmachina t1_j1km7f9 wrote

I believe this builds on some of the Caltech-based research group's prior work on detecting online trolling, e.g. Finding Social Media Trolls: Dynamic Keyword Selection Methods for Rapidly-Evolving Online Debates

>Online harassment is a significant social problem. Prevention of online harassment requires rapid detection of harassing, offensive, and negative social media posts. In this paper, we propose the use of word embedding models to identify offensive and harassing social media messages in two aspects: detecting fast-changing topics for more effective data collection and representing word semantics in different domains. We demonstrate with preliminary results that using the GloVe (Global Vectors for Word Representation) model facilitates the discovery of new and relevant keywords to use for data collection and trolling detection. Our paper concludes with a discussion of a research agenda to further develop and test word embedding models for identification of social media harassment and trolling.
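
The keyword-expansion idea in the abstract is just nearest-neighbor search in embedding space: start from a seed keyword and pull in the words whose vectors are most similar. A toy sketch of the mechanism (the 3-dimensional vectors here are made up; real GloVe embeddings are 50-300 dimensional and pretrained on large corpora):

```python
import numpy as np

# Toy embedding table standing in for pretrained GloVe vectors
# (hypothetical values, chosen so harassment terms cluster together)
embeddings = {
    "troll":   np.array([0.9, 0.1, 0.0]),
    "harass":  np.array([0.8, 0.2, 0.1]),
    "abuse":   np.array([0.7, 0.3, 0.0]),
    "weather": np.array([0.0, 0.1, 0.9]),
    "recipe":  np.array([0.1, 0.0, 0.8]),
}

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def expand_keywords(seed, k=2):
    """Return the k words closest to `seed` in embedding space."""
    sims = {w: cosine(embeddings[seed], v)
            for w, v in embeddings.items() if w != seed}
    return sorted(sims, key=sims.get, reverse=True)[:k]

print(expand_keywords("troll"))  # ['harass', 'abuse']
```

Re-running the expansion on fresh data is what lets the collection keywords track a fast-moving online debate.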


neuronexmachina t1_j1dvgip wrote

Part of the problem is that FB's API at the time not only allowed an app access to a user's data, but also a lot of the personal data for their FB friends as well. That's how Cambridge Analytica got access to data for 50M users -- there certainly weren't 50M users of their app.
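
The friend-graph amplification is easy to see with rough numbers. The ~270,000 app-install figure was widely reported; the average friend count and the deduplication factor below are assumptions for illustration only:

```python
# Rough friend-graph amplification estimate. Only the ~270k installs
# figure is reported; the other two numbers are illustrative assumptions.
app_users = 270_000
avg_friends = 340        # assumed average Facebook friend count
unique_fraction = 0.55   # assumed fraction of friend profiles not double-counted

exposed = app_users * avg_friends * unique_fraction
print(f"~{exposed / 1e6:.0f}M profiles exposed")
```

Even with heavy overlap between friend lists, a few hundred thousand installs multiplied by a few hundred friends each lands in the tens of millions.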


neuronexmachina t1_j1d7w1y wrote

I assume the goal was to narrow down the list of potential leakers, which IP addresses would be useful for. Regarding your hotel chain example, they could just perform a reverse lookup to see it's an IP belonging to a hotel chain, and weight the information accordingly along with other information they have about their employees.

Also, the article doesn't mention this, but checking the Google Play Store and Apple App Store entries for TikTok, it looks like location data is part of what the app has access to.


neuronexmachina t1_j1ccq4v wrote

IP addresses could definitely be used to figure out if a journalist was connected to the same wifi network as a ByteDance employee, though:

>An internal investigation by ByteDance, the parent company of video-sharing platform TikTok, found that employees tracked multiple journalists covering the company, improperly gaining access to their IP addresses and user data in an attempt to identify whether they had been in the same locales as ByteDance employees.
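
The co-location heuristic described above can be sketched with Python's standard `ipaddress` module: if two devices' public IPs fall in the same network block (e.g. behind the same office or hotel NAT), they were plausibly on the same network. The addresses below are made-up documentation-range examples, and a fixed /24 is a simplifying assumption:

```python
import ipaddress

def same_network(ip_a: str, ip_b: str, prefix: int = 24) -> bool:
    """Heuristic co-location check: do two addresses share a /prefix block?"""
    net = ipaddress.ip_network(f"{ip_a}/{prefix}", strict=False)
    return ipaddress.ip_address(ip_b) in net

# Hypothetical addresses: a journalist's device vs. an employee's device
print(same_network("203.0.113.17", "203.0.113.42"))   # same /24 -> True
print(same_network("203.0.113.17", "198.51.100.42"))  # different block -> False
```

Real IP allocations don't always align to /24 boundaries, so in practice this would be weighted with other signals rather than treated as proof.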