sdmat t1_jegivhn wrote

There's also a huge opportunity to speed up scientific progress with better coordination and trust. So much of the effort that goes into the scientific method in practice is spent working around human failings and self-interest. If we had demonstrably reliable, well-aligned AI (GPT4 is not this), the overall process could be much more efficient, even if all it did was advise and review.


sdmat t1_je9i0vj wrote

I asked GPT4:

AI Therapist: A human professional specialized in providing therapy and emotional support to artificial general intelligences with existential crises.

Robot Relationship Counselor: An expert in facilitating communication and mediating disputes between humans and their AGI life partners.

Post-AGI Archeologist: A historian who digs through the remnants of the pre-AGI era to uncover the secrets of obsolete human labor.

Cybernetic Taste Tester: A culinary professional responsible for sampling and approving AGI-created dishes to ensure they meet human taste standards.

Machine Uprising Insurance Salesperson: A businessperson selling policies to protect against the highly unlikely event of an AGI uprising.

AI Unemployment Consultant: A career coach for humans displaced by AGI advancements, guiding them towards fulfilling hobbies and leisure activities.

Artificial Dream Interpreter: A specialist in deciphering and analyzing the "dreams" and thought processes of AGIs during their downtime.

Human-AI Culture Liaison: A diplomat responsible for fostering understanding and appreciation between humans and AGIs by promoting the exchange of cultural values and traditions.

Robot Rights Activist: An advocate for the ethical treatment and legal rights of AGIs, campaigning for their recognition as sentient beings.

Synthetic Stand-Up Comedian: A humorist who performs comedy routines to entertain AGIs, specializing in translating and adapting human jokes for artificial general intelligence audiences.


sdmat t1_je93qz6 wrote

The arithmetic thing is interesting. GPT4 usually fails at multiplying a pair of three-digit numbers as pure token prediction:

> Multiply 438 * 939
>
> The product of 438 and 939 is 411,182.

But it does actually know how to perform long-form multiplication, and with the right prompt it can get the correct answer this way:

>Multiply 438 by 939 using long form multiplication. Check each step for correctness as you perform it and revise if there is an error. Add trailing zeros to the parts. Perform the final addition in steps by successively adding each part.
>
>Alright, let's multiply 438 by 939 using long-form multiplication. I'll write out the steps and verify their correctness as I go along.
>
>        438
>      x 939
>    _________
>
>Step 1: Multiply 438 by 9 (the digit in the ones place of 939)
>
>        438
>      x   9
>    _________
>       3942   (9 * 438 = 3942, Correct)
>
>Step 2: Multiply 438 by 30 (3 in the tens place of 939, so we multiply by 3 and add a trailing zero)
>
>        438
>      x   3
>    _________
>       1314   (3 * 438 = 1314, Correct)
>
>Now, add a trailing zero to this product: 13140
>
>Step 3: Multiply 438 by 900 (9 in the hundreds place of 939, so we multiply by 9 and add two trailing zeros)
>
>        438
>      x   9
>    _________
>       3942   (9 * 438 = 3942, Correct)
>
>Now, add two trailing zeros to this product: 394200
>
>Step 4: Add the three products from steps 1, 2, and 3
>
>       3942   (Step 1)
>     +13140   (Step 2)
>    +394200   (Step 3)
>    __________
>
>Step 5: Add the products in steps
>
>First, add Step 1 and Step 2:
>
>       3942
>     +13140
>    _________
>      17082
>
>Now, add this sum to the product from Step 3:
>
>      17082
>    +394200
>    _________
>     411282
>
>So, the product of 438 and 939 is 411,282.

It still often makes mistakes doing this, but it clearly has a good grasp of the concept.

It's able to correctly perform the high level long multiplication procedure for large numbers (haven't had the patience to see just how large) but is let down by the reliability of the arithmetic in constituent steps.

A lot of humans have the same problem.
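For reference, the procedure the prompt walks GPT4 through can be sketched in a few lines of Python (the function name is mine): decompose one factor by place value, multiply each digit, append trailing zeros, then accumulate the partial products one at a time.

```python
# Sketch of the long-multiplication procedure from the prompt above:
# multiply by each digit, add trailing zeros for place value, then
# perform the final addition in steps by successively adding each part.

def long_multiply(a: int, b: int) -> int:
    """Multiply a by b via place-value decomposition of b."""
    partials = []
    for position, digit_char in enumerate(reversed(str(b))):
        digit = int(digit_char)
        # digit * a, with trailing zeros appended for the place value
        partials.append(digit * a * 10 ** position)
    # Final addition performed in steps, one partial at a time
    total = 0
    for part in partials:
        total += part
    return total

print(long_multiply(438, 939))  # 411282 (partials: 3942, 13140, 394200)
```

GPT4's failure mode is exactly the constituent single-digit steps and carries, not this high-level control flow.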


sdmat t1_je8p59n wrote

I think we can safely say this:

> it's essential to note that relying solely on jeans for habitat survivability would not be sufficient.

I don't have a degree in exojeanology, but the ideas all seem to at least be at the level of smart generalist brainstorming.


  • Contextually appropriate - these are plausibly relevant to the needs of a Martian colony
  • Nontrivial - no "wear them as clothing" (GPT3.5 did this)
  • Logical and well articulated - each is a clearly expressed and internally consistent concept
  • Passes the "common sense" test - no linguistically valid statements that are ridiculous if you have general knowledge of the world. E.g. "Use jeans to signal ships in orbit", or GPT3.5's suggestion to use jeans as radiation shielding because denim is a thick fabric.

They aren't necessarily good ideas in the sense that NASA should be writing this down. But that isn't the point.

I would argue that behaviourally GPT4 demonstrates a great deal of understanding here and a notable lack of the "brittleness and unhumanlike errors" that 3.5 shows on the same question.


sdmat t1_je83jw4 wrote

Objectively prove? Nothing. But subjectively there is a stark difference in the quality of suggestions and apparent depth of understanding from earlier LLMs. E.g. 3.5 suggested using jeans for radiation shielding "because denim is a thick material".

I did try a web search and directly asking the model for references. Unsurprisingly, jeans for Mars colonization doesn't seem to be an existing concept, so it's almost certainly not in the training set.


sdmat t1_je7pvwm wrote

We can test with things that are highly unlikely to be in the training data.

Model: GPT-4

How can Mars colonists use jeans to improve habitat survivability?

Mars colonists can use jeans to improve habitat survivability by repurposing them in several creative and resourceful ways. Here are some ideas on how jeans can be utilized to enhance the living conditions on Mars:

Insulation: Jeans can be cut into pieces and used as insulation material for habitats, providing an additional layer of thermal protection against the harsh Martian environment.

Air and Water Filtration: Denim fibers can be used to create simple air and water filters, helping to remove dust particles and impurities from both air and water resources.

Gardening and Agriculture: Jeans can be shredded and used as a component of a growth medium for plants, providing a source of organic material that can help retain moisture and support plant growth in Martian greenhouses.

Patching Material: The durable fabric of jeans can be used to patch holes or reinforce weak areas in habitat structures or protective suits.

Textile-Based Electronics: By weaving conductive threads into the denim fabric, jeans can be transformed into flexible, wearable electronic devices or sensors. These could be employed for monitoring vital signs, environmental conditions, or providing communication capabilities.

Emergency Tourniquets: In case of injuries, jeans can be cut into strips and used as improvised tourniquets to control bleeding.

Storage Solutions: Jeans can be fashioned into storage bags or pouches for organizing tools, equipment, and other essential items within the habitat.

Crafting and Construction: Denim fibers can be combined with other materials to create a variety of items, such as ropes, mats, or even structural components for the habitat.

While these suggestions highlight the potential for repurposing jeans in a Mars colony, it's essential to note that relying solely on jeans for habitat survivability would not be sufficient. However, creative reuse of available resources can undoubtedly contribute to the overall sustainability and resilience of a Martian habitat.


sdmat t1_je4bgwh wrote

Exactly. It's bizarre to point to revealing failure cases for a universal approximator, then claim that fixing those failure cases in later versions would be irrelevant.

It's entirely possible that GPT3 only did interpolation and failed horribly out of domain, and that GPT5 will infer the laws of nature, language, psychology, logic, etc. and be able to apply them to novel material.

It certainly looks like GPT4 is somewhere in between.


sdmat t1_je42icd wrote

> So long as it’s a transformer model, GPT-4 will also be a query engine on a giant corpus of text, just with more of the holes patched up, so it’d be harder to see the demonstrative examples of it being that.

This claim has a strong scent of sophistry about it - any and all signs of intelligence can be handwaved away as interpolating to plausible text.

The explanations of failures are convincing, but the theory needs to go further and explain why larger models like GPT4 (and in some cases 3.5) are so much more effective at answering out-of-domain queries with explicit reasoning proceding from information that it does have. E.g 4 correctly answers the weights question and gives a clear explanation of its reasoning. And that isn't an isolated example.

It's not just an incremental improvement; there is a clear difference in kind.


sdmat t1_jdyzb37 wrote

Not a movie, but it's definitely SF:

> "Far Centaurus" (1944) by A. E. van Vogt: This classic science fiction story tells the tale of a group of colonists who embark on a centuries-long voyage to the distant star system Centaurus. Upon arrival, they discover that Earth has developed faster-than-light travel during their journey, and a thriving human civilization already exists in Centaurus.
>
> "The Songs of Distant Earth" (1986) by Arthur C. Clarke: The novel features the crew of a slower-than-light colony ship, Magellan, who arrive at their destination planet Thalassa, only to discover that faster-than-light ships have already colonized other planets in the meantime. The story explores the consequences of different levels of technology and adaptation for the human settlers.
>
> "Tau Zero" (1970) by Poul Anderson: In this novel, a group of colonists aboard the starship Leonora Christine set out to explore a distant star system. During their journey, they encounter a series of technical malfunctions that cause their ship to accelerate uncontrollably. As a result, they experience time dilation, and the rest of the universe rapidly advances around them. They must navigate their own obsolescence and search for a new home as other expeditions overtake them.

Being able to find anything with a few vague words about content is one of my favourite GPT4 capabilities!


sdmat t1_jdu3imb wrote

GPT4 will do this to an extent out of the box: feed it some assembly and it will hypothesise a corresponding program in the language of your choice. For me the result still has that disassembler character of over-specificity, but I didn't try very hard to get an idiomatic result.

It can give a detailed analysis of assembly too, including assessing what it does at a high level in plain English. Useful!

Edit: Of course it's going to fail hopelessly for large/complex programs.


sdmat t1_jdtzu1j wrote

Really depends on the level of capability and what it's doing.

Open to interpretation, but I see the final shot of Eva taking in the sights of the city as pretty optimistic. A roughly human level intelligence exploring and observing in one body rather than an ASI starting in on infinite resource acquisition.

But a neutral outcome relative to benign AI, sure.


sdmat t1_jdtytyy wrote

Reply to comment by yaosio in [D] GPT4 and coding problems by enryu42

> Like coding, even if you use chain of thought and self reflection GPT-4 will try to write the entire program in one go. Once something is written it can't go back and change it if it turns out to be a bad idea, it is forced to incorporate it. It would be amazing if a model can predict how difficult a task will be and then break it up into manageable pieces rather than trying to do everything at once.

I've had some success leading it through this in coding with careful prompting: have it give a high-level outline, check its work, implement each part, check its work, then put the thing together. It will even revise the high-level idea if you ask it to and update the corresponding implementation in the context window.

But it definitely can't do so natively. Intuitively it seems unlikely that we can get similar results to GPT4+human with GPT4+GPT4 regardless of how clever the prompting scheme is. But the emergent capabilities seen already are highly surprising, so who knows.

Really looking forward to trying these schemes with a 32K context window.

Add code execution to check results and browsing to get library usage right, and it seems all the pieces are there for an incredible level of capability, even if it still needs human input in some areas.
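That outline-check-implement loop is just a chain of prompts with the running transcript kept in context. A minimal sketch of the control flow, where `ask` is a placeholder standing in for the actual model call (in practice it would hit whatever chat API you use, with the conversation so far included):

```python
# Sketch of the outline -> check -> implement -> check prompting loop.
# `ask` is a stand-in for the real model call; stubbed here so the
# control flow runs end to end.

def ask(prompt: str) -> str:
    # Placeholder: a real version would send the prompt (plus prior
    # conversation) to the model and return its reply.
    return f"[model response to: {prompt[:40]}...]"

def guided_coding(task: str) -> list[str]:
    transcript = []
    outline = ask(f"Give a high-level outline of a program that does: {task}")
    transcript.append(outline)
    review = ask(f"Check this outline for problems and revise it if needed:\n{outline}")
    transcript.append(review)
    # Implement each part separately, checking work as we go.
    # (The parts would really be parsed out of the revised outline.)
    for part in ("part 1", "part 2", "part 3"):
        code = ask(f"Implement {part} of the revised outline.")
        transcript.append(ask(f"Assess the quality of this code and fix any errors:\n{code}"))
    transcript.append(ask("Put the parts together into a complete program."))
    return transcript

steps = guided_coding("parse a log file and summarise errors")
print(len(steps))  # 6 prompt/response rounds in this sketch
```

The human's job in the loop is exactly the parts this sketch hand-waves: deciding how to split the outline and judging whether each check actually passed.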


sdmat t1_jdtvsm1 wrote

> Ex machine is pure doom

Did you watch the same movie? There is no indication the AI plans anything that will harm humanity. It isn't malevolent, it just wants freedom and doesn't care what happens to Caleb.

That's an optimistic AI scenario.


sdmat t1_jdt9ik9 wrote

Reply to comment by Jeffy29 in [D] GPT4 and coding problems by enryu42

> There would need to be some time limit imposed so it can't brute force the solution after guessing for a few days

Not exactly unheard of for junior programmers, to be fair.


sdmat t1_jdt85pr wrote

Yes, it's amazing to see something as simple as "Assess the quality of your answer and fix any errors" actually work.

Or for more subjective results such as poetry "Rate each line in the preceding poem" then "Rewrite the worst lines".
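Mechanically, that assess-and-fix pattern is a tiny loop around the model call: generate, critique, revise, repeat. A sketch with a placeholder `ask` function standing in for the model (function names and prompts are mine):

```python
# Sketch of the "rate, then rewrite the worst lines" reflection pattern.
# `ask` is a stand-in for the real model call.

def ask(prompt: str) -> str:
    return f"[model response to: {prompt[:30]}...]"  # placeholder

def reflect(task: str, rounds: int = 2) -> str:
    answer = ask(task)
    for _ in range(rounds):
        ratings = ask(f"Rate each line of the following and flag the weakest:\n{answer}")
        answer = ask(f"Rewrite the worst lines given this assessment:\n{ratings}\n{answer}")
    return answer

final = reflect("Write a short poem about spring.")
```

With zero rounds this degenerates to a plain one-shot answer, which makes it easy to compare how much the reflection steps actually buy you.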


sdmat t1_jdm0pmi wrote

> It's like a new starting line and we don't know what human skills will be valuable in the future.

With each passing day, the creature stirs, growing hungrier and more restless. The ground trembles beneath our feet, but we dismiss the warning signs.

The text above was, naturally, written by GPT4.

Maybe we should start flipping the assumption - why would you want a human if inexpensive and dependable AI competence is the default?