xt-89 t1_jbrx1ss wrote

This project is interesting. The description, however, is hard to parse. I’d suggest going over your README and cleaning up a few things.

If I could also suggest a feature, if you could use this to generate UML diagrams that’d be great.

You mention that the code base can improve itself. I don’t see where that functionality is. Do you mean that if a person uses this tool for software analysis, productivity increases?

28

[deleted] t1_jbs3esk wrote

The full functionality has been constrained a bit by a refactor; it will be fixed soon.

I apologize for the messy README. The main idea is that I’m using a GNN layer as an inductive bias to improve the representational power of the sentence embeddings: I square the adjacency matrix (A^2) and then aggregate node features using message passing. Finally, I use the topic model to build a topic tree and feed it back into the system prompt to generate higher-quality semantic context. It’s also relatively easy with BERTopic to combine this with outlier detection or filters to remove low-quality data.
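To make the A^2 step concrete, here is a minimal NumPy sketch on toy data. The function name, the added self-loops, and the row normalization are assumptions for illustration, not necessarily what the repo does.

```python
import numpy as np

def two_hop_message_passing(adj: np.ndarray, features: np.ndarray) -> np.ndarray:
    """Aggregate node features over the squared adjacency matrix.

    Hypothetical helper: self-loops and mean normalization are assumed here.
    """
    n = adj.shape[0]
    a_hat = adj + np.eye(n)               # self-loops keep each node's own embedding
    a2 = a_hat @ a_hat                    # walks of length 2 in the self-looped graph (0, 1 and 2 hops)
    deg = a2.sum(axis=1, keepdims=True)   # row sums, used for normalization
    return (a2 / deg) @ features          # weighted mean of reachable node features

# Toy graph: three sentences in a chain 0 - 1 - 2 with 2-d "embeddings".
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
emb = np.array([[1.0, 0.0],
                [0.0, 1.0],
                [1.0, 1.0]])
smoothed = two_hop_message_passing(adj, emb)
```

In this toy case node 0 picks up information from node 2, which it is not directly connected to. That is the point of squaring the matrix.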

It’s a recurrent neural network in the sense that you’re feeding the output of the previous step back into the network.
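As a rough sketch of that feedback loop: `refine_context` and `toy_step` below are hypothetical names, and `toy_step` is only a stand-in for the real embed, GNN, topic tree, and prompt pipeline.

```python
from typing import Callable, List

def refine_context(step: Callable[[str], str], initial_context: str, n_rounds: int) -> List[str]:
    """Feed each round's output back in as the next round's input."""
    history = [initial_context]
    for _ in range(n_rounds):
        history.append(step(history[-1]))  # previous output becomes the new input
    return history

def toy_step(context: str) -> str:
    # Placeholder for one full pass of the pipeline (assumption, not the repo's code).
    return context + " +topics"

contexts = refine_context(toy_step, "repo summary", 3)
```

Keeping the whole history instead of only the last context makes it easy to compare rounds and stop once the output stops changing.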

You can also use another repo’s topic tree to suggest improvements. For example, I can add the DeepMind topic tree so it comments on where features such as a graph attention network could be added, or where code could be converted to JAX, and then generate that code on the fly.

Also, yes: I found it very useful for deconstructing complex repos, such as a knot theory one. That gave me a lot of insight and made it easier to narrow my research and study down to the principal components of the repo.

9

xt-89 t1_jbs8od5 wrote

Yeah, you’ve definitely set up a good representation bias for modeling entire software architectures.

I had a thought a while back that GitHub Copilot is eventually going to offer a feature where they suggest improvements to entire software architectures… and then eventually just write whole projects from a text description alone. I think that the solution for that would be pretty similar to what you’ve done if scaled up and applied that way.

If your plan is to scale up the system for more advanced features, that would be awesome.

Another suggestion is that if you integrated your tool with GitHub, it would be pretty useful for enterprise software development. Most companies are pretty crappy at documentation. Even with good documentation, a chatbot is better than a static document.

Good job!

7

[deleted] t1_jbsafxc wrote

Thank you! Yes I thought the topic tree would be a great complement to the commit tree. Would be great for stale repos with little to no documentation.

There’s also the option to mix in multiple repositories and message pass between them to help with brainstorming new features, or to message pass between your repo and its dependencies.
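One way to picture message passing between repositories is to join the two graphs with a few cross-repo edges and aggregate over the combined adjacency matrix. This is only a sketch under that assumption; `combine_graphs`, `mean_aggregate`, and the toy data are hypothetical.

```python
import numpy as np

def combine_graphs(adj_a: np.ndarray, adj_b: np.ndarray, cross: np.ndarray) -> np.ndarray:
    """Build the block adjacency [[A, C], [C^T, B]] from two graphs and cross edges C."""
    return np.block([[adj_a, cross], [cross.T, adj_b]])

def mean_aggregate(adj: np.ndarray, features: np.ndarray) -> np.ndarray:
    """One round of mean message passing with self-loops."""
    a_hat = adj + np.eye(adj.shape[0])
    return (a_hat / a_hat.sum(axis=1, keepdims=True)) @ features

adj_a = np.array([[0.0, 1.0],
                  [1.0, 0.0]])      # repo A: two connected nodes
adj_b = np.array([[0.0]])           # repo B: a single node
cross = np.array([[0.0],
                  [1.0]])           # A's node 1 is linked to B's node 0
adj = combine_graphs(adj_a, adj_b, cross)

feats = np.array([[1.0, 0.0],       # repo A nodes
                  [1.0, 0.0],
                  [0.0, 1.0]])      # repo B node
mixed = mean_aggregate(adj, feats)
```

After one round, repo B's node already carries half of repo A's signal, while A's node 0 is unchanged because it has no direct cross-repo edge.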

1

xt-89 t1_jbsaabf wrote

I also plan on applying the basic idea of a GNN with prompting to the thought loop of a cognitive entity (basically Open Assistant). I believe that if you take the tree you’re outputting for code but use it to aid CoT reasoning, that could be pretty powerful.

3

[deleted] t1_jbsama8 wrote

Yes, exactly! That’s a major goal of this project. I plan on incorporating the inference server that Yannic set up for Open Assistant.

1

NovelspaceOnly OP t1_jbsdr55 wrote

I have some preliminary generation scripts for SMILES chemical graphs, Feynman diagrams, storytelling with interleaved images, and testing compilation rates. Sorry for switching accounts; this one is logged in on my laptop lol.

1

xt-89 t1_jbt5yyd wrote

That’s cool. I assume you’re going to apply this to memories for the agent. There’s already relevant research on how to do that. Here’s one from Facebook AI: https://ai.facebook.com/blog/retrieval-augmented-generation-streamlining-the-creation-of-intelligent-natural-language-processing-models/

1

NovelspaceOnly OP t1_jbu2nki wrote

Yes! I would describe my repo as very aligned with ideas from Yann LeCun: "composition of clever abstractions".

1

[deleted] t1_jbs3u5o wrote

It’s also easy to retrieve representative docs for the topics in the tree
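BERTopic has its own helper for this (`get_representative_docs`, as far as I know). The NumPy sketch below only illustrates the idea under an assumed definition of "representative", namely the documents closest to the topic centroid by cosine similarity. The function name and toy data are hypothetical.

```python
import numpy as np

def representative_docs(embeddings, labels, docs, top_k=1):
    """Return, per topic, the top_k docs closest (cosine) to the topic centroid."""
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    result = {}
    for topic in sorted(set(labels)):
        if topic == -1:                        # -1 is BERTopic's label for outliers; skip them
            continue
        idx = np.array([i for i, t in enumerate(labels) if t == topic])
        centroid = unit[idx].mean(axis=0)      # mean of the unit vectors in this topic
        scores = unit[idx] @ centroid          # proportional to cosine similarity
        best = idx[np.argsort(-scores)[:top_k]]
        result[topic] = [docs[i] for i in best]
    return result

docs = ["parse config", "write logs", "load config file", "train model", "misc note"]
labels = [0, 0, 0, 1, -1]
embeddings = np.array([[1.0, 0.0],
                       [0.6, 0.8],
                       [0.8, 0.6],
                       [0.0, 1.0],
                       [-1.0, 0.0]])
reps = representative_docs(embeddings, labels, docs)
```

Here the doc in the middle of topic 0 is picked as its representative, and the outlier is dropped entirely.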

1