Viewing a single comment thread. View all comments

OptimizedGarbage t1_j86zehe wrote

People talking about "can't you just do this in X?" are seriously underestimating how difficult a problem this actually is. There's a good reason why it's hard to do, which is that this kind of static type checking requires refinement types (where typechecking is NP-hard). Basically, by including values in the types, typechecking becomes way harder -- the types contain a limited program that involves arithmetic, and sound typing requires you to prove that the dimensions add up correctly. So your type system needs to include a notion of arithmetic, except that because of Godel's incompleteness theorem, any logical system that includes integer arithmetic is undecidable. So now this is basically stepping from traditional static typechecking to something like an automated theorem prover, unless you get very careful and clever with how you set up the problems. That can mean one of two things -- either you write a ton of type annotations (like more code than the actual program) to prove to the type system that your program is valid. Or you can hook up an automated theorem prover to prove the soundness of your program automatically, at the cost of type-checking being NP-hard or worse.

This can be worth it, and there are potentially ways of making this tractable, but it's very non-trivial -- you basically need a type system that's dedicated to this problem specifically, not something that you can stick onto an existing language's type system.

That said, there are some things that try to do this. Haskell has a port of torch called HaskTorch that includes this kind of typed tensor shapes, and calls the Z3 theorem prover on the backend to solve type inference. It can get away with this because of the LiquidHaskell compiler extension, which has refinement types capable of solving this kind of typing problem, and is already pretty mature. Dex is a research language from Google that's based on Haskell and built to explore this kind of typechecking. Really you'd want to do this in Rust, because that's where the tradeoff of speed and safety for convenience makes the most sense, but rust is just barely on the edge of having a type system capable of this. You have to get really clever with the type system to make it work at all, and there's been no sustained push from any company to develop this into a mature solution. Hopefully something better comes along soon

118

throwaway957280 OP t1_j89eiz0 wrote

Thank you for the detailed answer! This is really interesting.

6

DoxxThis1 t1_j87h9gx wrote

I wonder if GPT could be leveraged to create an NLP-based type system. The programmer annotates the types in plain English, and the AI hallucinates the appropriate theorem-proving axioms! It would be an interesting "dog-fooding" of AI/ML for easier AI/ML development.

EDIT Holy cow what did I say to deserve so many downvotes? The one response below makes me think it's not such a wild idea.

−23

OptimizedGarbage t1_j87jazn wrote

So, yes and no. You really do not want to make the type annotations be in plain English, because in the Curry-Howard correspondence, the types correspond to theorems, and the code corresponds to a proof of those theorems. It's one thing if you know the theorem, but don't see the proof. You can often trust that the proof is right without seeing it. But you really need to know what the theorem is. If you start with English and generate some type behind the scenes, you don't see what the actual theorem is, but just know that the system has proved 'some theorem' about your code. As the programmer you have no idea what this actually tells you, so it kinda defeats the point of using static typing in the first place.

That said, you *can* write down a desired type and have a system write down a ton of type annotations or generate a bunch of code to prove that the type you wrote down is satisfied by your program. There's been recent work on this in deep learning for theorem proving, such as this work which uses GPT for proving theorems in Lean, a dependently type programming language and theorem prover. A better approach though would be to combine this with an actual tree search algorithm to allow a more structured search over the space of proofs, instead of trying to generate full correct proofs in one shot. Hypertree Proof Search does this, using a variant of AlphaZero to search and fine-tune the neural net. Unfortunately it hasn't been open-sourced though, and it's pretty compute intensive, so we can't use this for actual type inference yet. But yeah there's active interest in doing this kind of thing, both as a proving ground for using RL for reasoning tasks and from mathematicians for theorem-proving.

15