Submitted by ramya_1995 t3_10dv8rf in MachineLearning

Hello everyone,

I have a question about GCNs and would appreciate any thoughts. Do we typically use only one graph for GCN training/inference?

I'm asking because when I looked at the official DGL website, the loaded dataset contained only one example graph. Based on my experience with DNNs, I expected a batch of examples, but that was not the case for GCNs. I did find the PPI dataset, which has multiple graphs (24), but the other widely used datasets (e.g., Cora, Citeseer, and Pubmed) each contain only one.

Thank you!

3

Comments

laaweel t1_j4pm5q3 wrote

Hello,
It depends on the problem, but it is also possible to train over many graphs.
I am also a beginner, especially in the area of graph neural networks, and I found it very confusing that all the examples train on only one graph at a time.
But training on many graphs seems to be no problem. I am currently training a model on 200k+ example graphs to predict node features.
I collected the dataset myself, though. I think there are also datasets with many graphs in the fields of biology and medicine.
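
For what it's worth, here is a minimal sketch of how many small graphs are commonly combined into one training batch: the graphs are merged into a single large disconnected graph by shifting node ids. The function name `batch_graphs` and the tuple layout are made up for illustration; this is plain Python, not the DGL or PyTorch Geometric API.

```python
from typing import List, Tuple

Edge = Tuple[int, int]


def batch_graphs(graphs: List[Tuple[int, List[Edge]]]):
    """Merge small graphs into one big disconnected graph.

    Each input graph is (num_nodes, edges) with node ids 0..num_nodes-1.
    Returns (total_nodes, merged_edges, graph_id_per_node).
    """
    merged_edges: List[Edge] = []
    graph_id: List[int] = []
    offset = 0
    for gid, (num_nodes, edges) in enumerate(graphs):
        # Shift node ids so they do not collide with earlier graphs.
        merged_edges.extend((u + offset, v + offset) for u, v in edges)
        # Remember which original graph each node came from.
        graph_id.extend([gid] * num_nodes)
        offset += num_nodes
    return offset, merged_edges, graph_id


# Hypothetical toy data: a triangle and a two-node path.
triangle = (3, [(0, 1), (1, 2), (2, 0)])
path = (2, [(0, 1)])

total_nodes, edges, graph_id = batch_graphs([triangle, path])
print(total_nodes)  # 5
print(edges)        # [(0, 1), (1, 2), (2, 0), (3, 4)]
print(graph_id)     # [0, 0, 0, 1, 1]
```

Because no edges connect the merged graphs, message passing on the big graph never mixes information between examples, so a batch like this behaves like a batch of independent samples.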

Feel free to reach out if you need help :)

2

ramya_1995 OP t1_j56ovcx wrote

u/laaweel I have another quick question. The Cora dataset splits the labels into 140 for training, 500 for validation, and 1000 for testing (according to the DGL website). I found that these numbers are node counts, since this is a node classification problem. But any thoughts on why the sum (140+500+1000 = 1640) does not match the total number of nodes in the Cora dataset (2708)? Is it because the rest of the nodes are unlabeled? Thank you!
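
To make the arithmetic concrete, here is a small sketch that builds three boolean masks with those sizes and counts the nodes left out of every split. Only the counts (2708, 140, 500, 1000) come from the numbers above; which node indices land in which split is an assumption made up for this example.

```python
NUM_NODES = 2708
NUM_TRAIN, NUM_VAL, NUM_TEST = 140, 500, 1000

# Hypothetical layout: the assignment of indices to splits is invented here.
train_mask = [i < NUM_TRAIN for i in range(NUM_NODES)]
val_mask = [NUM_TRAIN <= i < NUM_TRAIN + NUM_VAL for i in range(NUM_NODES)]
test_mask = [i >= NUM_NODES - NUM_TEST for i in range(NUM_NODES)]

# A node is "used" if it belongs to at least one split.
in_some_split = [a or b or c for a, b, c in zip(train_mask, val_mask, test_mask)]
num_unused = NUM_NODES - sum(in_some_split)

print(sum(train_mask), sum(val_mask), sum(test_mask))  # 140 500 1000
print(num_unused)  # 1068
```

So 1068 nodes fall outside all three masks. The sketch only shows the count; it does not settle whether those nodes carry labels.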

1

clemda2 t1_j5eliix wrote

You CAN batch-train GCNs, or at least some of them are very amenable to it. Some of the most scalable GCNs rely on something like GraphSAGE convolution, which doesn't require the whole graph Laplacian for updates (this approach is used by Wikipedia, Uber, and Pinterest to train highly scalable GCNs). Other convolutional operators, like GAT, can also be batch-trained.
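
A rough sketch of why this kind of update does not need the whole graph: for each node in a mini-batch, sample a fixed number of neighbors and average their features. This is a simplified illustration in plain Python, with no learned weights or nonlinearity, and the names `sample_neighbors` and `sage_mean_step` are made up here rather than taken from any library.

```python
import random
from typing import Dict, List


def sample_neighbors(adj: Dict[int, List[int]], node: int,
                     fanout: int, rng: random.Random) -> List[int]:
    """Return at most `fanout` neighbors of `node`, sampled without replacement."""
    neighbors = adj[node]
    if len(neighbors) <= fanout:
        return list(neighbors)
    return rng.sample(neighbors, fanout)


def sage_mean_step(adj: Dict[int, List[int]], feats: Dict[int, List[float]],
                   batch: List[int], fanout: int,
                   rng: random.Random) -> Dict[int, List[float]]:
    """One simplified mean-aggregation step for the nodes in `batch` only."""
    out = {}
    for node in batch:
        sampled = sample_neighbors(adj, node, fanout, rng)
        dim = len(feats[node])
        if sampled:
            mean = [sum(feats[n][d] for n in sampled) / len(sampled)
                    for d in range(dim)]
        else:
            mean = [0.0] * dim
        # Concatenate the node's own features with the neighbor mean.
        # A real layer would follow this with a learned linear map.
        out[node] = feats[node] + mean
    return out


# Hypothetical toy graph: a star with node 0 in the center.
adj = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
feats = {i: [float(i)] for i in adj}
rng = random.Random(0)

result = sage_mean_step(adj, feats, batch=[1, 2], fanout=2, rng=rng)
print(result)  # {1: [1.0, 0.0], 2: [2.0, 0.0]}
```

Only the batch nodes and their sampled neighbors are read, so the cost of a step depends on the batch size and the fanout rather than on the size of the full graph.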

You can use the documentation for the Python package PyTorch Geometric as a jumping-off point for reading about practical graph subsampling.

2