visarga t1_j7127ha wrote

> It’s not about ‘threatening’ jobs, but improving certain aspects of it.

Jobs don't just exist by themselves; they exist because people demand products and services. In other words, jobs are a function of human needs and desires.

The question is - can automation satiate all our desires? I don't think so. We will invent new jobs and tasks because we will desire things automation can't provide yet. In a contest between human entitlement and AI advancement, I think entitlement will always win - we will treat everything we have as basic stuff and want something more. If you asked people from 300 years ago what they think about our lifestyles, they would say we had already reached the singularity, but we know we haven't, because we already feel entitled to what we have.

visarga t1_j711jhe wrote

One year ago I tried information extraction from invoices with GPT-3 and it worked very well. Our team had been working on this project for years - collecting data, building labelling tools, training models, etc. - and now this AI does it without any task-specific training. We shivered, fearing for our future.

Since then I've started using GPT-3 for real, and let me tell you - it's not as easy as it looks in the playground. You need to think about prompt design, demonstrations, prompt evaluation, data pre-processing and post-processing (is the extracted text actually present in the source?), justifications, chain-of-thought (CoT), and self-consistency. In the end I have so much work I don't know what to do first.
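
To make the post-processing point concrete, here is a minimal sketch of two of those checks - grounding the extracted value in the source, and a self-consistency vote over several sampled completions. All names and the invoice text are made up for illustration; the model samples would come from repeated LLM calls:

```python
# Hypothetical post-processing for LLM-based invoice extraction:
# (1) grounding check: is the extracted text actually present in the source?
# (2) self-consistency: majority vote over several sampled completions.
from collections import Counter

def is_grounded(extracted, source):
    """Reject values the model may have hallucinated: the extracted
    string must appear verbatim in the source (case-insensitive,
    whitespace-normalized)."""
    norm = lambda s: " ".join(s.lower().split())
    return norm(extracted) in norm(source)

def self_consistent_value(samples):
    """Majority vote across multiple sampled extractions; return None
    if no single value wins a strict majority."""
    value, count = Counter(samples).most_common(1)[0]
    return value if count > len(samples) / 2 else None

invoice_text = "Invoice #1234\nTotal due: 512.40 EUR\nDate: 2023-01-15"
samples = ["512.40 EUR", "512.40 EUR", "512,40 EUR"]  # three model samples

total = self_consistent_value(samples)
assert total == "512.40 EUR"          # 2 of 3 samples agree
assert is_grounded(total, invoice_text)  # and the value exists in the source
```

The grounding check is what catches the model inventing a plausible-looking total that never appears in the document.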

AI will take over a number of tasks and open up other tasks around them, so the total amount of work will remain the same - which is as much as people can handle. Software is a weird field - it has been cannibalising itself for decades, and yet developers keep growing in numbers and compensation. That is a testament to our infinite desire for more.

visarga t1_j6zb9em wrote

Many AI teams are now scrambling to label data with GPT-3 and train their small, efficient models on GPT-3's predictions. This makes the hard part - data labelling - much easier and speeds up development roughly tenfold. In the end you get cheap, fast models that work about as well as GPT-3, but only on a narrow task.
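
A minimal sketch of that pipeline, with a stub standing in for the expensive GPT-3 call and a toy keyword counter standing in for the small student model (all names here are hypothetical):

```python
# Sketch of "label with a big model, train a small one":
# teacher_label() is a placeholder for an LLM call; the "student" is a
# toy keyword scorer standing in for a real small, efficient model.
from collections import defaultdict

def teacher_label(text):
    # Placeholder for an expensive LLM call returning a class label.
    return "invoice" if "total due" in text.lower() else "other"

def train_student(texts):
    """Build per-class word counts from teacher pseudo-labels --
    the skeleton of fitting any cheap model on LLM-generated labels."""
    counts = defaultdict(lambda: defaultdict(int))
    for text in texts:
        label = teacher_label(text)  # expensive call, done once, offline
        for word in text.lower().split():
            counts[label][word] += 1
    return counts

def student_predict(counts, text):
    # Score each class by how often its known words appear in the text.
    scores = {label: sum(words.get(w, 0) for w in text.lower().split())
              for label, words in counts.items()}
    return max(scores, key=scores.get)

corpus = ["Total due: 99 EUR", "Meeting notes from Monday", "Total due: 10 USD"]
model = train_student(corpus)
assert student_predict(model, "total due tomorrow") == "invoice"
```

The point is that after training, `student_predict` runs without the teacher - you pay the LLM cost once during labelling, not at inference time.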

visarga t1_j6x8zna wrote

I think open source implementations will eventually get there. They probably need much more multi-task and RLHF data, or they had too little code in their initial pre-training. Training GPT-3.5-like models is like following a recipe, and the formula and ingredients are gradually becoming available.

visarga t1_j6x1uwy wrote

> The extent to which something is memorized ... is certainly something to be discussed.

A one-in-a-million chance of memorisation, even when you're actively looking for it, is not worth discussing.

> We select the 350,000 most-duplicated examples from the training dataset and generate 500 candidate images for each of these prompts (totaling 175 million generated images). We find 109 images are near-copies of training examples.

On the other hand, these models compress billions of images into a few GB. That is less than one byte on average per training example - there is simply no space for significant memorisation. That is probably why only 109 memorised images were found.
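
The back-of-the-envelope arithmetic, with illustrative numbers (not official figures for any particular model):

```python
# Capacity argument: model weights vs. training set size.
# All numbers below are assumptions for illustration only.
params = 1_000_000_000             # assumed ~1B parameters
bytes_per_param = 2                # float16 weights
training_examples = 2_000_000_000  # assumed ~2B image-text pairs

model_bytes = params * bytes_per_param
bytes_per_example = model_bytes / training_examples
print(f"{bytes_per_example:.2f} bytes of model capacity per training image")
# Roughly one byte per example: far too little to store images verbatim
# at scale, though heavily duplicated examples can still be memorised.
```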

I would say I am impressed there were so few of them. If you use a blacklist for these images, you can be 100% sure the model is not regurgitating those training images verbatim.

I would suggest the model developers remove these images from the training set and replace them with variations generated by the previous model, so the new model learns only the style and not the exact composition of the originals. Swapping originals for variations - same style, different composition - would be a legitimate way to avoid close duplication.

visarga t1_j6vxxy0 wrote

Maybe Google can take a hint from you - they have zero customer support, even for app developers on Android. Got your account blocked? Good luck getting a human to help you. People are legitimately terrified of this scenario, to the point of giving up on Gmail. Losing all your online identities in one go is not fun.

visarga t1_j6uev3y wrote

> Some Blue Collar jobs have been replaced by automation but everyone knows their time is short as job cuts keep coming as efficiency is improved meaning less workers required to do the same amount of work.

Why do you believe there will be the same amount of work? Companies have to compete, and AI will raise the bar for everyone - they will have to work harder to achieve better quality, more diversity, or more customisation. When your competition has AI and humans, you are going to be at a disadvantage with just AI.

visarga t1_j6n5mgc wrote

Oh, yes, gladly. This "open"AI paper says it:

> Larger models are significantly more sample efficient, such that optimally compute efficient training involves training very large models on a relatively modest amount of data and stopping significantly before convergence.

https://arxiv.org/abs/2001.08361

You can improve outcomes from small datasets by making the model larger.

visarga t1_j6n30bh wrote

I don't think even Van Gogh could claim ownership of squiggly lines that look like fire, or of a white-blue-gold colour palette. They pre-existed and have been rediscovered in many ways by many artists.

Can we agree that a style used by 3 or more artists doesn't belong to anyone and is open for AI to use? We just need to make a list of all styles that are generic enough.

visarga t1_j6jut51 wrote

No, AI doesn't work that way. You feed it text in any language, all of them together, and it figures out an inter-language representation. So you can ask in Chinese about what it learned in English.

But there's also plenty of Chinese text. GLM-130B was trained on over 400 billion text tokens (200B each of Chinese and English); GPT-3 was trained on 300B tokens, mostly English.

visarga t1_j6f177f wrote

This is very insightful. In 2023 paintings are painting themselves and books are writing themselves, to someone from the past this would be magic.

The model is a distillation of our culture. It works like a microscope, zooming into any concept or style immediately, and allowing interactive exploration. It is a trip into the mirror house of our imagination. What we see there is our own mind reflecting back.

visarga t1_j6exxh8 wrote

> we will always do art, its baked into our species.

We will always do what we need to improve our lives, with or without help. It's baked into our species. Amazing lack of confidence in our ability to invent new kinds of work with AI! Or maybe lack of imagination about what these future jobs might be. Or just fear of the unknown.

visarga t1_j6dk9gu wrote

It will be like electricity, everyone will use it. Electricity didn't make jobs disappear overall, but its existence was creative destruction in the job market. So many types of jobs wouldn't even exist without it.

Whenever a new technology expands this quickly, it sucks the profit out of its own level while creating new opportunities one level above or below. Another example is open source - you can't compete with free, but you can build on it and you can make hardware for it.

So many companies used to make a profit selling closed source apps for what open source now gives away for free - they got squashed, but nobody's crying for them, and the field is still healthy. Other companies with other job openings will hire the same people.
