Submitted by Low-Mood3229 t3_10h5nfx in MachineLearning

Hello,

I really don't know how to frame this question, but I wanted to ask if there is a way to integrate the relationships and nodes of a knowledge graph with recorded data. For example, when a knowledge graph contains information about relationships between features, can it be integrated with a dataset containing recorded or measured quantities of those features? The goal is to "infuse" the recorded dataset with relationships already known in the knowledge graph for some data analysis purpose.

I know it sounds confusing, but you can ask for clarification on any details. Please help.

16

Comments

clvnmllr t1_j56kitl wrote

This is the use of knowledge graph embeddings as features for ML. “Graph embeddings” is your keyword and should help you find other resources.
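In code, the basic pattern might look like this minimal sketch (everything here is made up: the entity names, the embeddings, and the assumption that each record links to one KG entity; in practice you'd train the embeddings on the graph with something like node2vec or TransE):

```python
# Minimal sketch (all names/embeddings made up): each record is linked
# to a KG entity, and we concatenate that entity's pretrained embedding
# onto the record's raw measurements before fitting a classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical pretrained KG embeddings, keyed by entity.
kg_emb = {
    "diabetes":     np.array([0.1, 0.9, -0.3]),
    "hypertension": np.array([0.2, 0.8, -0.1]),
    "healthy":      np.array([-0.5, 0.4, 0.7]),
}

# Recorded data: (linked entity, raw measurements, label).
records = [
    ("diabetes",     [120.0, 7.1], 1),
    ("hypertension", [145.0, 5.4], 1),
    ("healthy",      [110.0, 5.0], 0),
]

# "Infuse" the KG: raw features + the linked entity's embedding.
X = np.array([np.concatenate([m, kg_emb[e]]) for e, m, _ in records])
y = np.array([label for *_, label in records])

clf = LogisticRegression().fit(X, y)
print(clf.predict(X))
```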

8

Low-Mood3229 OP t1_j56lqee wrote

I did look at resources about graph embeddings, but they all seem to talk about using them for link prediction or graph completion. My use case is more classification of datapoints (containing many seemingly unimportant features that may or may not have some relationship to each other; relationships that are captured in the knowledge graph).

2

axm92 t1_j57jfrk wrote

>My use case is more classification of datapoints (containing many seemingly unimportant features that may or may not have some relationship to each other; relationships that are captured in the knowledge graph)

Sounds eerily close to one of our papers: https://aclanthology.org/2021.emnlp-main.508.pdf

To solve commonsense reasoning questions, we first generate a graph that captures relationships between entities in the question (if you're thinking "chain-of-thought" prompting--yes, the idea is similar). Then we jointly train a mixture-of-experts model with a classifier (RoBERTa) to do three things: i) learn to discard useless nodes, ii) pool node representations from useful nodes into a single graph embedding, and iii) classify using question + graph embeddings.
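For intuition, a rough sketch of that gate-pool-classify idea (illustrative only, not the paper's actual code; a single learned gate stands in for the mixture-of-experts):

```python
# Illustrative sketch: gate node embeddings to "discard" useless nodes,
# pool survivors into one graph vector, classify on [question ; graph].
import torch
import torch.nn as nn

class GraphInfusedClassifier(nn.Module):
    def __init__(self, dim, n_classes):
        super().__init__()
        self.gate = nn.Linear(dim, 1)         # learns to down-weight useless nodes
        self.clf = nn.Linear(2 * dim, n_classes)

    def forward(self, question_emb, node_embs):
        # question_emb: (dim,), e.g. a RoBERTa [CLS] vector
        # node_embs: (n_nodes, dim), one row per graph node
        w = torch.sigmoid(self.gate(node_embs))       # (n_nodes, 1) keep-weights
        graph_emb = (w * node_embs).sum(0) / w.sum()  # weighted mean pooling
        return self.clf(torch.cat([question_emb, graph_emb]))

model = GraphInfusedClassifier(dim=768, n_classes=2)
logits = model(torch.randn(768), torch.randn(5, 768))
```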

This video may give a good TLDR too.

8

dancingnightly t1_j583tfa wrote

>we first generate a graph that can capture relationship between entities in the question

This is really impressive. What are your thoughts on the state of this kind of approach? Could it be extended from sentences to whole context paragraphs at some stage, with the entities dynamically being different graph items?

1

axm92 t1_j58761h wrote

>Could it be extended from sentences to whole context paragraphs at some stage, with the entities dynamically being different graph items?

Absolutely. Highly recommend that you try playing around with some examples here: https://beta.openai.com/playground.

3

dancingnightly t1_j58anv8 wrote

That's a great resource, thanks. I have studied how this kind of autoregressive model works and found attention fascinating, but here it's the graph-embedded entities you brought up that sound exciting. I have only skim-read your paper so far, so perhaps I've made a mistake, but what I mean is:

For graph embeddings, could you dynamically capture different entities/tokens across a much broader context than commonsense reasoning statements and questions? I.e., do entailment on a whole chapter (or a knowledge base entry with 50 triplets), where the graph embeddings meaningfully represent many entities (perhaps with sine positional embeddings for each additional text mention, in addition to the graph, just as for attention)?

[Why I'm interested: I presume it's impractical to scale this approach up in context, much as for autoregressive models, because a fully connected graph scales quadratically in the number of entities. But I'd love to know your thoughts; can a graph be strategically connected, etc.?]

1

axm92 t1_j5b2ug8 wrote

I’m not sure if I understand you, but you can generate these graphs over long documents, and then run a GNN.

For creating graphs over long documents, one trick I've used in past papers is to create a graph per three paragraphs and then merge those graphs (by fusing similar nodes).
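A toy version of that merge (my own sketch with networkx; here "similar nodes" is just a naive case-insensitive name match standing in for real entity resolution):

```python
# Toy chunk-and-merge: build one graph per chunk, canonicalize node
# names, then compose the graphs so equivalent mentions fuse.
import networkx as nx

def graph_for_chunk(triples):
    g = nx.DiGraph()
    for head, rel, tail in triples:
        # Lowercasing is a stand-in for real entity resolution.
        g.add_edge(head.lower(), tail.lower(), rel=rel)
    return g

# Hypothetical triples extracted from two 3-paragraph chunks.
chunk1 = [("Aspirin", "treats", "headache")]
chunk2 = [("aspirin", "thins", "blood")]

# nx.compose unions the graphs; nodes with identical keys are fused.
doc_graph = nx.compose(graph_for_chunk(chunk1), graph_for_chunk(chunk2))
print(list(doc_graph.nodes))  # ['aspirin', 'headache', 'blood']
```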

1

dancingnightly t1_j5c31u6 wrote

Oh, OK. Thank you for taking the time to explain. I see now that this graph approach isn't for extending beyond the existing context of RoBERTa/similar transformer models, but rather for enhancing performance within it.

I was hoping graphs could capture relational information (in a way compatible with transformer embeddings) between far-apart parts of the document (essentially: for each of doc.ents, connect them in a fully connected graph). It sounds like this dynamic graph size/structure per document input wouldn't work with the transformer embeddings for now, though.

1

Veggies-are-okay t1_j58jqyx wrote

Mayyyybe graph neural networks?

https://distill.pub/2021/gnn-intro/

3

deviantkindle t1_j58zhpx wrote

Thanks for that link! It was a great read above all else.

Their description is actually what I was thinking of doing with my graph-connecting project, but I had never heard of "graph neural networks" before. Looks like a cool rabbit hole.

So much for doing taxes this weekend...

1

FirstOrderCat t1_j58hvki wrote

SQL join?..

1

Low-Mood3229 OP t1_j59hurb wrote

😂😂

1

FirstOrderCat t1_j5aojpd wrote

I am serious: a knowledge graph is just a set of triples and can be stored in a relational DB, and the same goes for the other data; then you can join them.
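Taken literally, that could look like this sqlite3 sketch (the schema and rows are made up):

```python
# Store the KG as a triples table and join it against the measurements.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE triples(head TEXT, rel TEXT, tail TEXT);
    CREATE TABLE measurements(feature TEXT, value REAL);
    INSERT INTO triples VALUES ('heart_rate', 'indicator_of', 'stress');
    INSERT INTO measurements VALUES ('heart_rate', 98.0);
""")

# Each measurement row comes back annotated with what the KG
# says about its feature.
rows = db.execute("""
    SELECT m.feature, m.value, t.rel, t.tail
    FROM measurements m JOIN triples t ON m.feature = t.head
""").fetchall()
print(rows)  # [('heart_rate', 98.0, 'indicator_of', 'stress')]
```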

1

xquizitdecorum t1_j58xf15 wrote

Graph embeddings are mentioned elsewhere in this thread, but also explore graph convolutional neural networks and message-passing neural networks. These methods extend traditional CNNs to graph structures; after all, isn't an image just a lattice graph with pixels as nodes? They can be used, as also mentioned, for node and edge prediction/completion, but they can also make predictions over entire graphs. I've worked on graph-based prediction for molecular modeling, where I do whole-graph classification.
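For intuition, a bare-bones whole-graph classifier might look like this sketch (illustrative only; real work would use a library like PyTorch Geometric or DGL):

```python
# One message-passing step plus a mean "readout" over all nodes,
# so the prediction is for the whole graph rather than a single node.
import torch
import torch.nn as nn

class TinyGCN(nn.Module):
    def __init__(self, in_dim, hid_dim, n_classes):
        super().__init__()
        self.lin = nn.Linear(in_dim, hid_dim)
        self.out = nn.Linear(hid_dim, n_classes)

    def forward(self, x, adj):
        # Average each node with its neighbors (adj includes self-loops)...
        deg = adj.sum(1, keepdim=True).clamp(min=1)
        h = torch.relu(self.lin(adj @ x / deg))
        # ...then pool all nodes into one graph vector and classify.
        return self.out(h.mean(0))

# Toy "molecule": 4 atoms with 8 features each, chain-shaped adjacency.
x = torch.randn(4, 8)
adj = torch.eye(4)
adj[0, 1] = adj[1, 0] = adj[1, 2] = adj[2, 1] = adj[2, 3] = adj[3, 2] = 1
logits = TinyGCN(8, 16, 2)(x, adj)
```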

1

Low-Mood3229 OP t1_j59hxu2 wrote

Do you have a link to your work on molecular modeling?

1

inFamous_16 t1_j5a67of wrote

Read the TextGCN paper, which uses a graph neural network for the text classification task.

1