GraphNews
Recently, there has been a lot of criticism of popular graph ML benchmark datasets: lack of practical relevance, low structural diversity (leaving most of the possible graph-structure space unrepresented), low application-domain diversity, graph structure that is not actually beneficial for the considered tasks, and potential bugs in the data collection process. Some of these criticisms have previously appeared on this channel.
To provide the community with better benchmarks, we present GraphLand: a collection of 14 graph datasets for node property prediction coming from diverse real-world industrial applications of graph ML. What makes this benchmark stand out?
Diverse application domains: social networks, web graphs, road networks, and more. Importantly, half of the datasets feature node-level regression tasks that are currently underrepresented in graph ML benchmarks, but are often encountered in real-world applications.
Range of sizes: from thousands to millions of nodes, providing opportunities for researchers with different computational resources.
Rich node attributes containing numerical and categorical features, which are more typical of industrial applications than the textual descriptions that are standard in current benchmarks.
Different learning scenarios. For every dataset, we provide two random data splits with low and high label rates. Moreover, many of our networks evolve over time; for these we additionally provide more challenging temporal data splits and the option to evaluate models in the inductive setting, where only an early snapshot of the evolving network is available at training time.
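To make the difference between random and temporal splits concrete, here is a minimal sketch of a temporal node split. The node timestamps and split quantiles are invented for illustration and do not reflect GraphLand's actual data format or API:

```python
import numpy as np

# Toy example: per-node appearance times (invented, not GraphLand data).
rng = np.random.default_rng(0)
timestamps = rng.integers(0, 100, size=1000)

# A random split would ignore time; a temporal split trains on the past
# and evaluates on the future, which is harder for evolving networks.
t_train, t_val = np.quantile(timestamps, [0.6, 0.8])
train_idx = np.where(timestamps <= t_train)[0]
val_idx = np.where((timestamps > t_train) & (timestamps <= t_val))[0]
test_idx = np.where(timestamps > t_val)[0]

# Every training node predates every test node.
assert timestamps[train_idx].max() < timestamps[test_idx].min()
```

In the inductive setting described above, the model would additionally only see the graph snapshot at the training cutoff, i.e., edges incident to later-arriving nodes are hidden at training time.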
We evaluated a range of models on our datasets and found that, while GNNs achieve strong performance on industrial datasets, they can sometimes be rivaled by gradient-boosted decision trees (popular in industry) when the latter are provided with additional graph-based input features.
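As a rough illustration of how a GBDT can be given access to graph structure, one common approach is to append hand-crafted structural features (e.g., node degree, aggregates of neighbor features) to each node's attribute vector. The edge list, features, and specific feature choices below are a toy sketch, not the setup used in the paper:

```python
import numpy as np

# Toy undirected graph and one-dimensional node features (invented).
edges = np.array([[0, 1], [1, 2], [2, 3], [3, 0], [0, 2]])
n = 4
x = np.array([[1.0], [2.0], [3.0], [4.0]])

# Dense symmetric adjacency matrix for this tiny example.
adj = np.zeros((n, n))
adj[edges[:, 0], edges[:, 1]] = 1.0
adj[edges[:, 1], edges[:, 0]] = 1.0

# Two simple graph-based features: degree and mean neighbor feature.
degree = adj.sum(axis=1, keepdims=True)
neighbor_mean = adj @ x / np.clip(degree, 1, None)

# The augmented matrix would then be fed to a GBDT library
# such as LightGBM, XGBoost, or CatBoost.
x_aug = np.hstack([x, degree, neighbor_mean])
print(x_aug.shape)  # (4, 3)
```

The appeal of this recipe is that the GBDT itself stays a plain tabular model; all graph awareness lives in the feature engineering step.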
Further, we evaluated several graph foundation models (GFMs). Despite the recent attention paid to GFMs, we found that only a few of them can currently handle arbitrary node features (which is required for true generalization across different graphs), and that these GFMs produce very weak results on our benchmark. The problem of developing general-purpose graph foundation models therefore seems far from solved, which motivated our research in this direction (see the next post).