Have you ever wondered how recommender systems at web-scale applications like Pinterest manage to surface relevant content to hundreds of millions of users from billions of items? It turns out that behind many of these systems are powerful deep learning architectures called Graph Convolutional Neural Networks (GCNs) that have revolutionized the field in recent years. By ingeniously combining ideas from graph theory with the representational power of deep neural networks, GCNs have achieved unprecedented performance on recommender system benchmarks, outperforming traditional approaches by significant margins.
However, deploying GCNs in real-world, web-scale recommendation settings poses immense challenges. The graphs encountered in these domains often contain billions of nodes and edges, rendering many of the core assumptions and techniques underlying GCNs impractical. Making GCNs truly scalable while retaining their empirical benefits has remained an open problem.
What are Graph Convolutional Neural Networks (GCNs)?
GCNs are a class of deep learning methods designed to perform inference on data described by graphs. They combine the mathematical machinery of graph theory with the feature learning capabilities of neural networks.
The core idea is to learn low-dimensional vector embeddings of nodes that capture the graph structure as well as node features. Intuitively, each node's embedding incorporates both its own features and information from its local neighborhood.
By operating directly on the graph structure, GCNs can effectively propagate information across edges, allowing them to uncover complex higher-order interactions. This has led to dramatic performance improvements on tasks like node classification and link prediction, with GCNs achieving state-of-the-art results on several recommender system benchmarks.
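To make this concrete, here is a minimal numpy sketch of a single graph-convolution layer: each node's new embedding combines its own features with an aggregate of its neighbors' features. The function and weight names are illustrative, and real GCNs stack several such layers and learn the weight matrices by gradient descent rather than using random ones.

```python
import numpy as np

def gcn_layer(features, adjacency, weight_self, weight_neigh):
    """One simplified graph-convolution layer: each node's new embedding
    combines its own features with the mean of its neighbors' features."""
    # Mean-aggregate neighbor features (row-normalize the adjacency matrix).
    degree = adjacency.sum(axis=1, keepdims=True)
    neigh_mean = (adjacency @ features) / np.maximum(degree, 1)
    # Transform the self and neighborhood representations, then apply a nonlinearity.
    hidden = features @ weight_self + neigh_mean @ weight_neigh
    return np.maximum(hidden, 0)  # ReLU

# Toy example: 4 nodes with 3-dimensional features and an undirected adjacency matrix.
features = np.random.rand(4, 3)
adjacency = np.array([[0, 1, 1, 0],
                      [1, 0, 0, 1],
                      [1, 0, 0, 1],
                      [0, 1, 1, 0]], dtype=float)
w_self = np.random.rand(3, 8)
w_neigh = np.random.rand(3, 8)
embeddings = gcn_layer(features, adjacency, w_self, w_neigh)  # shape (4, 8)
```

Stacking two such layers lets each node's embedding depend on its two-hop neighborhood, which is how GCNs capture the higher-order interactions described above.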
Challenges in scaling GCNs to web-scale recommendation tasks
Despite their conceptual appeal and remarkable performance on benchmark tasks, deploying GCNs for web-scale recommendation involves daunting challenges:
- Operating on billion-scale graphs is infeasible: Existing GCN approaches require operating on the full graph Laplacian during training. For graphs with billions of nodes and edges, this is clearly impractical due to memory constraints. The full graph Laplacian cannot fit on a single machine or even be easily sharded.
- Foundational assumptions are violated at scale: Many of the core techniques powering GCNs were developed with modestly sized graphs in mind. For instance, the idea of message passing over the entire graph breaks down when working with billions of nodes and edges. The locality assumptions underlying most GCN architectures also become increasingly tenuous in massive graphs.
- Inference latency is a major bottleneck: In order to generate embeddings for new nodes, GCNs need to perform expensive graph convolutions that can take several minutes or even hours for large graphs. Such high latency is unacceptable for generating real-time recommendations.
In summary, deploying GCNs for web-scale recommendation requires rethinking many of their fundamental building blocks. Overcoming these challenges to make GCNs practical for real-world, billion-scale graphs is a major open problem.
PinSage: A random-walk based GCN framework for web-scale recommendation
To tackle the challenge of scaling GCNs to massive graphs, the team at Pinterest developed PinSage, a random-walk based GCN framework capable of learning high-quality embeddings for nodes in graphs with billions of objects. PinSage incorporates several key innovations:
- Localized convolutions via neighborhood sampling: Instead of operating on the entire graph, PinSage performs localized convolutions by sampling the neighborhood around each node via random walks. For each node, it dynamically constructs a computation graph based on the sampled neighborhood, specifying how to perform the convolution. This eliminates the need to keep the full graph in memory during training.
- Importance-based neighborhoods: PinSage defines the neighborhood of a node by simulating short random walks starting from that node. Neighbors are ranked by their normalized visit counts, which measure their importance to the starting node. This provides a tractable way to identify relevant neighbors for convolution while naturally capturing the graph structure, and by convolving over a fixed number of top-ranked neighbors, PinSage tightly controls memory usage (see the sketch after this list).
- Substantial performance gains: Compared to convolving over traditional K-hop neighborhoods, PinSage's random-walk based approach yields a 46% improvement in offline evaluation metrics, demonstrating the value of selectively sampling the most important neighbors for convolution.
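To illustrate the idea behind importance-based neighborhoods, here is a small Python sketch that approximates a node's most important neighbors by simulating short random walks and keeping the most frequently visited nodes along with their normalized visit counts. The function name, walk parameters, and toy graph are purely illustrative; the production implementation differs in details such as walk length and biasing.

```python
import random
from collections import Counter

def importance_neighborhood(adj_list, start, num_walks=200, walk_length=2, top_t=3):
    """Approximate a node's most important neighbors by simulating short
    random walks from it and keeping the most frequently visited nodes."""
    visits = Counter()
    for _ in range(num_walks):
        node = start
        for _ in range(walk_length):
            neighbors = adj_list.get(node, [])
            if not neighbors:
                break
            node = random.choice(neighbors)
            if node != start:
                visits[node] += 1
    total = sum(visits.values()) or 1
    # Top-T visited nodes, with normalized visit counts as importance weights.
    return [(n, c / total) for n, c in visits.most_common(top_t)]

# Toy graph as an adjacency list.
adj_list = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 4], 3: [1], 4: [2]}
print(importance_neighborhood(adj_list, start=0))
```

The normalized visit counts can then serve as weights when pooling the selected neighbors' embeddings during the localized convolution.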
By performing localized convolutions on sampled neighborhoods, PinSage can efficiently learn embeddings for nodes in massive graphs. However, generating these embeddings for all nodes at inference time remains a challenge.
Efficient MapReduce inference for generating node embeddings on billion-scale graphs
Given a trained GCN model, generating embeddings for all nodes, including ones not seen during training, is non-trivial. Naively applying the localized convolutions node by node leads to a great deal of redundant computation, since nodes' neighborhoods overlap significantly.
PinSage addresses this by leveraging an efficient MapReduce pipeline that decomposes the embedding computation into three key operations:
- Map: The map operation projects each node into the embedding space without any neighborhood aggregation. This yields initial embeddings that don't incorporate any graph structure.
- Join: The join operation gathers the initial embeddings of all neighbors of each node. This sets the stage for neighborhood aggregation.
- Reduce: The reduce operation performs the actual aggregation over neighbor embeddings to generate the final embedding for each node.
By decomposing the embedding computation this way, PinSage ensures that each node's initial embedding is computed exactly once in the map phase. The join and reduce operations then efficiently aggregate the neighborhoods to produce the final embeddings.
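A single-process Python sketch of that decomposition might look like the following; in a real deployment each phase would run as a distributed MapReduce job over billions of nodes, and the function names and data structures here are assumptions made for illustration.

```python
import numpy as np

def map_phase(raw_features, project):
    """Map: project every node's raw features once, with no neighbor aggregation."""
    return {node: project(x) for node, x in raw_features.items()}

def join_phase(initial_emb, neighborhoods):
    """Join: attach each node's sampled (neighbor, weight) pairs to the neighbors'
    projected embeddings from the map phase."""
    return {node: [(initial_emb[nbr], w) for nbr, w in nbrs]
            for node, nbrs in neighborhoods.items()}

def reduce_phase(initial_emb, joined):
    """Reduce: importance-weighted aggregation of neighbor embeddings, combined
    with the node's own projection to produce the final embedding."""
    final = {}
    for node, nbrs in joined.items():
        if nbrs:
            agg = sum(w * emb for emb, w in nbrs) / sum(w for _, w in nbrs)
        else:
            agg = np.zeros_like(initial_emb[node])
        final[node] = np.concatenate([initial_emb[node], agg])
    return final

# Toy usage with a random linear projection and precomputed importance neighborhoods.
rng = np.random.default_rng(0)
W = rng.random((3, 4))
raw = {i: rng.random(3) for i in range(3)}
neighborhoods = {0: [(1, 0.7), (2, 0.3)], 1: [(0, 1.0)], 2: []}
initial = map_phase(raw, lambda x: x @ W)
joined = join_phase(initial, neighborhoods)
final = reduce_phase(initial, joined)  # each final embedding concatenates self and aggregated parts
```

Because the map phase touches each node exactly once, overlapping neighborhoods never trigger repeated feature projections; the join and reduce phases only shuffle and combine vectors that have already been computed.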
This MapReduce inference pipeline is remarkably efficient, enabling PinSage to generate embeddings for billions of nodes within a few hours using a cluster of just a few hundred machines. This is orders of magnitude faster than existing GCN inference approaches.
Applications and impact at Pinterest
PinSage powers several key recommendation tasks at Pinterest, including related pins, ads, and shopping. It is deployed on an internal graph with over 3 billion nodes representing pins and boards, connected by over 18 billion edges that encode relationships such as pins saved to boards.
The impact of PinSage has been substantial, leading to 25-30% improvements in user engagement compared to previous deep learning approaches in A/B tests. By learning high-quality embeddings that unify both graph structure and rich node features, PinSage can surface recommendations that are both relevant and engaging.
To the best of our knowledge, PinSage represents the largest-scale application of GCNs in a real-world production system to date. It exemplifies the potential of deep graph embeddings to power a new generation of web-scale recommender systems that can effectively leverage massive heterogeneous graphs.
As we've seen, Graph Convolutional Neural Networks and innovations like PinSage are pushing the boundaries of what's possible with web-scale recommender systems. If you're excited by the potential of these cutting-edge techniques to transform your business, we invite you to explore how Shaped can help. Get started with us today and unleash the power of deep graph embeddings for your recommender systems.