Destiny_Knight t1_j9dxbeq wrote

Aren't there only like 400 employees at OpenAI or something? That's like saying you have a friend who won the lottery. That's pretty amazing. What's their experience like? Anything they can share? Is it all secretive?

2

SoylentRox t1_j9dytn4 wrote

Several friends. Others are at AI startups. One of them is, somehow, self-taught: good at Python, with a framework that uses some cool hacks, including automated function memoization.
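
(I don't know the details of their framework, but automated function memoization in Python generally looks something like the sketch below; the names are made up for illustration, and the standard library's functools.lru_cache gives you roughly the same thing out of the box.)

```python
from functools import wraps

def memoize(func):
    """Automatically cache a function's results, keyed on its arguments."""
    cache = {}

    @wraps(func)
    def wrapper(*args):
        if args not in cache:          # compute only on a cache miss
            cache[args] = func(*args)
        return cache[args]

    return wrapper

@memoize
def slow_feature(n):
    # Stand-in for an expensive, repeatedly-called computation.
    return sum(i * i for i in range(n))

slow_feature(10_000_000)  # computed once
slow_feature(10_000_000)  # served instantly from the cache
```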

Note that until very recently (the last 2 months or so), OpenAI wasn't really the best option for elite programmers. It was all people on a passion project. The lottery winners were at DeepMind or Meta.

I have several friends there also. The Meta friends all have the usual background: a graduate degree and 15+ years of experience in high-performance GPU work.

2

fangfried t1_j9e14pq wrote

Why is Python so widely used in AI when it's a really inefficient language under the hood? Wouldn't Rust be better for optimizing models? Or do you only need that optimization at the infrastructure level, while the models are so high-level it doesn't matter?

Also, it's really cool that there are people at the forefront of AI on this sub. I'm at a big tech company right now, and I want to transfer into AI infrastructure there. Then, hopefully, I'll build a resume that gets me into a top PhD program. After that I could work in AI research.

2

SoylentRox t1_j9e39rk wrote

>Why is Python so widely used in AI when it's a really inefficient language under the hood? Wouldn't Rust be better for optimizing models? Or do you only need that optimization at the infrastructure level, while the models are so high-level it doesn't matter?

You make calls to a high-level framework, usually PyTorch, that have the effect of creating a pipeline: "take this shape of input, inference it through this architecture using this activation function, calculate the error, backprop using this optimizer".
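
A minimal sketch of that pipeline in PyTorch (the architecture, loss, and shapes are placeholders, not anything from a real project):

```python
import torch
import torch.nn as nn

# "This architecture, using this activation function"
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))
loss_fn = nn.CrossEntropyLoss()                    # "calculate the error"
optimizer = torch.optim.Adam(model.parameters())   # "using this optimizer"

x = torch.randn(32, 128)                # "take this shape of input"
target = torch.randint(0, 10, (32,))

logits = model(x)                       # inference through the architecture
loss = loss_fn(logits, target)
optimizer.zero_grad()
loss.backward()                         # backprop
optimizer.step()
```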

The Python calls can be translated to a graph. I usually see these in *.onnx files, though there are several other representations. These describe how the data will flow.
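
For example, PyTorch can write that graph out to an .onnx file directly (a rough sketch; the model and shapes are placeholders):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 10))
dummy_input = torch.randn(1, 128)   # the exporter traces the graph using a sample input

# Writes a *.onnx file describing how data will flow through the graph.
torch.onnx.export(model, dummy_input, "model.onnx",
                  input_names=["input"], output_names=["logits"])
```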

In the Python code, you form the model object, then call a function to actually run one step of inference.

So internally it's taking that graph, creating a GPU kernel that is specialized for the shapes of your data, compiling it, and then running it on the target GPU. (Or, on the project I work on, it compiles it for what is essentially a TPU.)

The compile step is slow, using a compiler that is likely written in C++. The loading step is slow too. But once it's all up and running, you get essentially the same performance as if all the code were in C/C++, while all the code you need to touch to do AI work is in Python.
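
You can see that tradeoff directly with PyTorch 2.x's torch.compile (a rough sketch; absolute timings depend on the backend and hardware):

```python
import time
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 1024))
compiled = torch.compile(model)   # sets up graph capture + kernel compilation
x = torch.randn(64, 1024)

t0 = time.perf_counter()
compiled(x)   # first call is slow: kernels are specialized to these shapes and compiled
t1 = time.perf_counter()
compiled(x)   # later calls reuse the compiled kernels and run at full speed
t2 = time.perf_counter()

print(f"first call (compile): {t1 - t0:.2f}s, second call: {t2 - t1:.4f}s")
```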

3

Destiny_Knight t1_j9dzwf0 wrote

What's your prediction for when a ChatGPT that doesn't make mistakes in its answers and has 10x more memory will arrive? What's your timeline for AGI and the singularity?

1

SoylentRox t1_j9e2m1x wrote

Mistakes: Depends on the outcome of efforts to reduce answering errors. If self-introspection works, months.

More context memory: Weeks to months. There are already papers that lay the groundwork: https://arxiv.org/abs/2302.04761. Searching the past log for this same session (beyond our token window) is easy to integrate with the Toolformer architecture.
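
A hypothetical sketch of what that tool could look like (everything here is made up for illustration and is not from the paper):

```python
# A "search the session log" tool in the spirit of Toolformer: when the model
# emits a call like [SEARCH_LOG("user's budget")], we run it over messages that
# have already fallen out of the context window and splice the hits back in.
def search_log(query: str, past_messages: list[str], top_k: int = 3) -> list[str]:
    """Naive keyword relevance over the out-of-window session log."""
    terms = query.lower().split()
    scored = [(sum(term in msg.lower() for term in terms), msg) for msg in past_messages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [msg for score, msg in scored[:top_k] if score > 0]
```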

There are also alternate architectures that may enormously increase the window.

AGI: It's possible within a few years. Whether it happens depends on the trajectory of outside investment. If Google and Microsoft go into an all-out AI war where each is spending $100B+ annually? A few years. If current approaches "cap out" and the hype diminishes? It could take decades.

Singularity: Shortly after AGI is good enough to control robotics for most tasks. So shortly after AGI, probably. ("Shortly" meaning a matter of months to a few years.)

3