
yaosio t1_j2g4wqx wrote

DeepMind put out a paper on discovering faster matrix multiplication algorithms. I only know enough machine learning to ask where the bathroom is, so I can't speak to the methods they used.

https://www.deepmind.com/blog/discovering-novel-algorithms-with-alphatensor

https://www.nature.com/articles/s41586-022-05172-4

38

Dylan_TMB t1_j2g6nlc wrote

Good suggestion, but not really applicable to this question. TL;DR: matrix multiplication done the naive way is slow. There is a way to decompose the problem and get an algebraic expression that is equivalent but has fewer multiplication steps, making it faster. Finding such a decomposition by brute-force search is too time consuming, so they trained a reinforcement learning model that treated finding the decomposition like a game, yielding a model that can find decompositions for matrices of a given size quickly. (I am not 100% sure "decomposition" is the right word.)
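
To make "fewer multiplication steps" concrete, here is a minimal sketch (not from the paper, just the textbook example) of Strassen's 1969 decomposition, which multiplies two 2x2 matrices with 7 scalar multiplications instead of the naive 8:

```python
import numpy as np

def strassen_2x2(A, B):
    # Strassen's 7 products instead of the naive 8.
    m1 = (A[0, 0] + A[1, 1]) * (B[0, 0] + B[1, 1])
    m2 = (A[1, 0] + A[1, 1]) * B[0, 0]
    m3 = A[0, 0] * (B[0, 1] - B[1, 1])
    m4 = A[1, 1] * (B[1, 0] - B[0, 0])
    m5 = (A[0, 0] + A[0, 1]) * B[1, 1]
    m6 = (A[1, 0] - A[0, 0]) * (B[0, 0] + B[0, 1])
    m7 = (A[0, 1] - A[1, 1]) * (B[1, 0] + B[1, 1])
    # Recombine the 7 products into the 4 output entries.
    return np.array([[m1 + m4 - m5 + m7, m3 + m5],
                     [m2 + m4, m1 - m2 + m3 + m6]])

A, B = np.random.rand(2, 2), np.random.rand(2, 2)
assert np.allclose(strassen_2x2(A, B), A @ B)
```

Applied recursively to block matrices, this is what drops the exponent below 3; AlphaTensor's game is searching for decompositions like this one for larger matrix sizes.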

I think what OP is really describing is almost a subfield of explainable AI. I think making an AI that learns to write an algorithm (say, a Python function) that solves a problem correctly could be attempted. You would be juggling multiple objectives here: 1) the output is valid code, 2) it solves the problem, 3) it is fast, and 4) it isn't just hard-coding the training data as a lookup table lol. However, even if a model has learned a function, I don't think an algorithm can necessarily be extracted from it.
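
As a rough illustration of juggling those objectives (everything here is hypothetical: the `solve` entry-point name, the weights, the lookup-table proxy), a scoring function for model-generated candidates might look like:

```python
import time

def score_candidate(source_code, test_cases, baseline_runtime):
    # Objective 1: the candidate must be valid, runnable code.
    try:
        namespace = {}
        exec(source_code, namespace)
        solve = namespace["solve"]  # assumed entry-point name
    except Exception:
        return None

    # Objective 2: it should actually solve the problem.
    correct = sum(solve(x) == y for x, y in test_cases)
    accuracy = correct / len(test_cases)

    # Objective 3: it should be fast relative to some baseline.
    start = time.perf_counter()
    for x, _ in test_cases:
        solve(x)
    speedup = baseline_runtime / (time.perf_counter() - start)

    # Objective 4: "not a lookup table" is hard to measure; a crude
    # proxy is penalizing source length, since a hard-coded table
    # grows with the training set.
    brevity = 1.0 / len(source_code)

    # Made-up weights; combining these is itself a design problem.
    return accuracy + 0.1 * speedup + 0.01 * brevity
```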

You have to remember a model is learning a FUNCTION that approximates the mapping between two spaces, so it isn't exact or always correct. Even if an algorithm could be extracted, it would only be approximately "correct", and I would assume that if it could be extracted at all, it would not be comprehensible.

7

currentscurrents OP t1_j2g9mvy wrote

Thanks, that is the question I'm trying to ask! I know explainability is a bit of a dead-end field right now, so it's a hard problem.

An approximate or incomprehensible algorithm could still be useful if it's faster or uses less memory. But I think to accomplish that you would need to convert it into higher-level ideas; otherwise you're just emulating the network.

Luckily neural networks are capable of converting things into higher-level ideas? It doesn't seem fundamentally impossible.

3

Dylan_TMB t1_j2gc9va wrote

I actually think you are looking for this:

https://arxiv.org/abs/2210.05189

A proof that all neural networks can be represented by decision trees. Navigating a decision tree is an algorithm, so this would be one representation of the "algorithm".
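
A toy sketch of the intuition, assuming a one-hidden-layer ReLU net with made-up weights: each hidden unit's on/off state acts like a split node, and within each region the network is just an affine map:

```python
import numpy as np

# Arbitrary weights for a 2-input, 2-hidden-unit ReLU net.
W1 = np.array([[1.0, -1.0], [0.5, 2.0]])
b1 = np.array([0.0, -1.0])
W2 = np.array([1.0, -1.0])

def net(x):
    return W2 @ np.maximum(W1 @ x + b1, 0.0)

def tree(x):
    # Each branch tests the sign of one pre-activation, exactly like
    # a split node; each leaf is the affine map for that region.
    z = W1 @ x + b1
    if z[0] > 0:
        if z[1] > 0:
            return W2 @ z            # both units active
        return W2[0] * z[0]          # only unit 0 active
    if z[1] > 0:
        return W2[1] * z[1]          # only unit 1 active
    return 0.0                       # neither unit active

x = np.random.randn(2)
assert np.isclose(net(x), tree(x))
```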

So a question to ask would be: is it the minimal decision tree?

2

currentscurrents OP t1_j2gctk4 wrote

Interesting!

This feels like it falls under emulating a neural network, since you've done equivalent computations - just in a different form.

I wonder if you could train a neural network with the objective of creating the minimal decision tree.

1

Dylan_TMB t1_j2gcz2v wrote

Or just learn to minimize a tree that's given as input.
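
For what it's worth, classical (non-learned) tree minimization already exists off the shelf; a minimal sketch using scikit-learn's cost-complexity pruning (the `ccp_alpha` value below is arbitrary):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# An unpruned tree vs. one shrunk by cost-complexity pruning.
full = DecisionTreeClassifier(random_state=0).fit(X, y)
pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=0.01).fit(X, y)

print(full.tree_.node_count, "nodes ->", pruned.tree_.node_count, "nodes")
```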

4

arg_max t1_j2hw6f8 wrote

That is an interesting paper, BUT their method relies heavily on the structure of the task. In general, if you want to create a method that outputs algorithms, choosing the output format is already non-trivial. For humans, pseudo-code is probably the most natural way to present algorithms, but then you will need some kind of language model, or at least a recurrent architecture, that can output solutions of different lengths (since not every program has a fixed length). And even once you get an output from the model, you first have to make sure it is a valid program and, more importantly, that it solves the task. This means you have to verify the correctness of every method your model creates before you can even measure runtime.

But matrix multiplication is different. If you read the paper, you will see that every matrix multiplication algorithm can be written as a higher-order tensor, and given a tensor decomposition it's trivial to check the correctness of the corresponding matrix multiplication algorithm. This is not even a super novel insight; people already knew you could formulate the search for better matrix multiplication algorithms as a tensor decomposition optimization problem, BUT that problem is super hard to solve.
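
A minimal sketch of that verification step (the flattening convention and function names here are my own, but the idea matches the paper): a candidate algorithm is a set of factor matrices U, V, W, and it is correct iff their rank-1 terms sum back to the matmul tensor. For n=2, Strassen's algorithm corresponds to factors with R=7 columns.

```python
import numpy as np

def matmul_tensor(n):
    # T[i, j, k] = 1 iff flattened entry i of A times flattened
    # entry j of B contributes to flattened entry k of C = A @ B.
    T = np.zeros((n * n, n * n, n * n))
    for a in range(n):
        for b in range(n):
            for c in range(n):
                T[a * n + b, b * n + c, a * n + c] = 1.0
    return T

def is_valid_algorithm(U, V, W, n):
    # A rank-R decomposition (U, V, W), each of shape (n*n, R),
    # encodes an n x n matmul algorithm using R multiplications.
    # It is correct iff it reconstructs the matmul tensor exactly.
    reconstruction = np.einsum("ir,jr,kr->ijk", U, V, W)
    return np.allclose(reconstruction, matmul_tensor(n))
```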

But not many real-world tasks are like this. For most problems you don't have such a nice output space, and at that point it becomes much, much harder to learn algorithms. I guess once people figure out a way to make models that can output verifiably correct pseudo-code, we will start to see tons of papers with new AI-generated heuristics for NP-hard and other problems that cannot yet be solved in optimal time.

6