Snapchat Enhances Friend Recommendations with NebulaGraph
Snapchat, the camera-first social media platform, promotes authentic connections between its users through ephemeral photos, videos, and AR experiences. As of June 2024, Snapchat reported more than 432 million daily active users worldwide, ranking among the top mobile platforms in the US. This growth places increasing demands on the underlying infrastructure, which needs efficient tools to handle complex data workloads across a vast, dense, interconnected social graph. Snapchat’s Infrastructure team sought an efficient database to manage and analyze this massive volume of interconnected user data, with the goal of improving friend recommendations.
The Challenge: Performance Bottleneck Restricts Real-time Friend Recommendations
To produce better friend suggestions, the underlying system needs to compute heuristics over users and relationships, which are stored in aggregate and anonymized as metadata in vertices and edges. During the retrieval stage of the recommendation pipeline, these heuristics are used to compute affinity to the query user and often require multiple hops, a task where graph databases excel compared to traditional relational databases. The graph model is also a more natural way for users to express data relationships. At the end of 2022, Snapchat's Infrastructure team began evaluating several graph database solutions for feasibility and performance.
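To make the multi-hop retrieval step concrete, here is a minimal Python sketch of a two-hop friends-of-friends lookup that ranks candidates by mutual-friend count. The heuristic, the `friends_of_friends` function, and the toy graph are assumptions for illustration only; the article does not describe Snapchat's actual heuristics or queries, and in production this traversal would run inside the graph database rather than in application memory.

```python
from collections import Counter

def friends_of_friends(graph, user, limit=10):
    """Rank two-hop candidates for `user` by number of mutual friends.

    `graph` is a hypothetical adjacency map: user id -> set of friend ids.
    """
    direct = graph.get(user, set())
    mutual_counts = Counter()
    for friend in direct:                          # hop 1: the user's friends
        for candidate in graph.get(friend, set()):  # hop 2: their friends
            # Skip the user and anyone who is already a direct friend.
            if candidate != user and candidate not in direct:
                mutual_counts[candidate] += 1
    # Most mutual friends first; ties broken by id for a stable order.
    ranked = sorted(mutual_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:limit]

# Toy, made-up graph used only to exercise the function.
graph = {
    "ana": {"ben", "cy"},
    "ben": {"ana", "dee", "eli"},
    "cy": {"ana", "dee"},
    "dee": {"ben", "cy"},
    "eli": {"ben"},
}
suggestions = friends_of_friends(graph, "ana")
print(suggestions)
```

In a relational store the same lookup typically becomes a self-join per hop, which is one reason multi-hop affinity queries are often cited as a natural fit for graph databases.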
The Solution: NebulaGraph’s High Availability and Performance Make It the Ideal Choice
After a thorough evaluation of various graph databases, the team decided to adopt NebulaGraph as it integrates seamlessly with their in-house environment and uniquely satisfies their production requirements of serving diverse online requests.
NebulaGraph stood out as the team’s top choice with the following technical advantages:
- High availability and resilience to failures: NebulaGraph commits to delivering reliable performance with 99.95% uptime, ensuring smooth operation during updates and upgrades. Its availability remains unaffected by failures such as node or network breakdowns, and even zone failures. This resilience is critical, given how common failures are in large-scale clusters.
- High performance: NebulaGraph was among the very few top-performing databases, distinguished by its ability to store and process graphs with trillions of vertices and edges while delivering impressive QPS and latency. It supports highly concurrent access, fast graph traversals, and optimized memory utilization.
- Horizontal scalability: NebulaGraph's shared-nothing distributed architecture offers linear scalability. This feature allows Snapchat to independently scale computing and storage, achieving desired performance at a cost-effective rate.
- Seamless integration with Kubernetes: NebulaGraph, with its cloud-native nature, can be hosted within a Kubernetes environment, including AWS EKS, GCP GKE, Azure AKS, and more. NebulaGraph is also fully compatible with the Kubernetes Operator Framework. This compatibility significantly reduces Snapchat's maintenance costs while ensuring uninterrupted service. The ability to run seamlessly in a Kubernetes environment further enhances its high availability, offering customers peace of mind.
- AAA security compliance: NebulaGraph enhances Snapchat's network security through authentication, authorization, and accounting, safeguarding their user data. NebulaGraph supports both local and LDAP authentication and assigns each user a role comprising various privileges, including metadata and data read/write access, and even fine-grained vertex- and edge-level permission management. Additionally, NebulaGraph logs user activities and database operations for future auditing. It also secures client-to-NebulaGraph and intra-NebulaGraph node traffic with TLS encryption. To protect data privacy, NebulaGraph supports TTL (Time-To-Live), ensuring that expired data is automatically and permanently removed from the database. Users can also delete data by dropping and clearing the database, providing control over data retention and privacy.
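The TTL behavior mentioned in the security item above can be illustrated with a small Python sketch. This is a generic, in-memory model of time-based expiry, not NebulaGraph's implementation or API; the `created_at` field, the `is_expired` and `purge_expired` helpers, and the convention that a TTL of 0 disables expiry are all assumptions made for the example.

```python
def is_expired(record, ttl_seconds, now):
    """Return True if the record has outlived its TTL.

    Assumes `record["created_at"]` is a Unix timestamp in seconds and that
    a TTL of 0 (or less) means the record never expires.
    """
    return ttl_seconds > 0 and now > record["created_at"] + ttl_seconds

def purge_expired(records, ttl_seconds, now):
    """Return only the records that are still within their TTL."""
    return [r for r in records if not is_expired(r, ttl_seconds, now)]

# Made-up records used only to exercise the helpers.
records = [
    {"id": "e1", "created_at": 0},
    {"id": "e2", "created_at": 100},
]
kept = purge_expired(records, ttl_seconds=60, now=120)
print([r["id"] for r in kept])
```

A real database applies this rule itself, typically hiding expired rows from reads immediately and reclaiming the storage later, so applications do not need to run their own cleanup jobs.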
The Result: Enhanced Performance, Reduced Costs and Simplified Maintenance
With NebulaGraph, Snapchat has rolled out new friend recommendation workloads in production, with the following notable results:
- Performance Amplification: NebulaGraph's efficient handling of massive data sets has enabled Snapchat to employ new distributed online graph query algorithms, previously unattainable in traditional KV-store databases, while ensuring user data is stored in aggregate and anonymized as metadata in vertices and edges. This has led to a noticeable increase in new friendships formed through friends-of-friends recommendations.
- Large Graph Support: NebulaGraph’s sharded architecture allowed ingestion of 12 TB graphs and real-time queries with p90 latency under 20 ms.
- High Availability: NebulaGraph has maintained 99.95% uptime since its initial deployment and has proven capable of handling large-scale ingestion and online queries through multiple traffic spike events. This has reduced operational overhead for the Snapchat Infrastructure team.
- Cost-Effective Hardware: Leveraging more affordable machine types for scaling, Snapchat can host NebulaGraph more economically compared to using a single, powerful machine with equivalent capacity.
“We evaluated 9 graph DB solutions over the last year and found NebulaGraph capable of meeting our data size, query SLO, and QPS needs. NebulaGraph’s native Kubernetes support also makes it a nice fit to Snapchat’s underlying multi-cloud service mesh infrastructure. Since launch, we've thrown the book at NebulaGraph, adding 3 production clusters with more complex workloads and spiky online traffic patterns. Each time, it's scaled effortlessly, exceeding our expectations.” - Martin Qian, Sr. Manager Infrastructure at Snap.
Looking Ahead: Evolving the Entire Infrastructure with NebulaGraph
Moving beyond friend recommendations, Snapchat's Infrastructure team plans to leverage NebulaGraph in three new use cases:
- ML Edge Feature Storage: Eliminating the overhead of materializing costly multi-hop ML features, NebulaGraph can be used to power real-time inference and training queries.
- In-app User Engagement: Storing and querying relationships between users, content, lenses, locations and more will enable training graph convolution neural nets and relationship embeddings, boosting personalization.
- Off-app Analytics: Tackling identity resolution for Ads Attribution, and further data exploration opportunities beyond the app.
Additionally, NebulaGraph enables data sovereignty by allowing Snapchat to manage its own deployments of database instances. This shared-nothing data model ensures all of Snapchat's user data is maintained within Snapchat Infrastructure's controlled environments. This guarantees that enterprise data sovereignty is upheld and maximizes adoption flexibility.