Address Labels, Fund Flows & Graphs: BlockSec Builds a Crypto Risk Control System with NebulaGraph

The EU’s MiCA regulation and OFAC sanctions are raising the bar for crypto risk control. How can teams build secure, compliant systems in this new era? BlockSec, a global leader in blockchain security, offers one answer: a risk control system built on NebulaGraph that covers the full path from data ingestion to actionable risk insights. Yunfei Xie, a security R&D engineer at BlockSec, leads work on crypto address tagging and graph databases.

The Core Idea: From Transaction Data to a Risk Network

The Atomic Unit: The Transfer as the Core of Fund Flow

Every complex on-chain value transfer ultimately reduces to the most basic operation: a token transfer.

However, the raw on-chain data is incredibly complex, including batch processing of multiple transfers, complex swap logic, and cross-chain messaging.

Our goal is to turn these massive and complex on-chain “atomic units” into a foundation for risk analysis. That means filtering out noise (such as junk tokens) and building a Fund Flow Graph.
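
As a minimal sketch of that idea (not BlockSec's actual pipeline), the snippet below drops junk-token and zero-value transfers and groups the rest into a directed graph keyed by sender and receiver. The `Transfer` record, the `JUNK_TOKENS` denylist, and the sample data are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    """One token transfer: the 'atomic unit' of fund flow (hypothetical schema)."""
    tx_hash: str
    sender: str
    receiver: str
    token: str
    amount: float


# Hypothetical denylist; a real system would maintain this from token metadata.
JUNK_TOKENS = {"SPAM"}


def build_fund_flow_graph(transfers, junk_tokens=JUNK_TOKENS):
    """Return {sender: {receiver: [Transfer, ...]}} after noise filtering."""
    graph = defaultdict(lambda: defaultdict(list))
    for t in transfers:
        # Noise filter: skip junk tokens and non-positive amounts.
        if t.token in junk_tokens or t.amount <= 0:
            continue
        graph[t.sender][t.receiver].append(t)
    return graph


transfers = [
    Transfer("0xaa", "A", "B", "ETH", 1.5),
    Transfer("0xbb", "B", "C", "ETH", 1.0),
    Transfer("0xcc", "X", "B", "SPAM", 1_000_000.0),  # junk airdrop, filtered out
]
graph = build_fund_flow_graph(transfers)
```

Keeping the full `Transfer` on each edge (rather than a summed amount) preserves the per-transfer detail that later steps such as pricing and time ordering would need.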

Risk Network: The Graph is the Skeleton, Labels are the Soul

Imagine you have only raw blockchain data: connections between anonymous addresses, such as the transfer from “0x098b…2f96” to “0x1361…1f39” and then to “0xd90e…f31b.” This data forms an “anonymous skeleton”—we see structured connections but lack meaningful context. We know funds are flowing, but we don’t know who’s moving them, why, or what they represent.

When we assign labels to these addresses and transactions, the data comes to life.

For example, by labeling “0x098b…2f96” and “0x1361…1f39” as “Lazarus Group: Ronin Bridge Exploiter,” and “0xd90e…f31b” as “Tornado.Cash Router,” we can clearly see that the notorious North Korean hacker group, the Lazarus Group, laundered 3,000 ether through the sanctioned mixer Tornado.Cash after the Ronin Bridge attack.

Through labeling, the anonymous skeleton becomes an insightful risk map. Labeling gives meaning to the data, allowing us to accurately identify high-risk entities and behaviors from massive amounts of on-chain data.
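
The effect of labeling can be illustrated with a small sketch. It assumes a simple in-memory lookup table (the `LABELS` dictionary and `annotate_path` helper are hypothetical) and uses the truncated addresses exactly as they appear in the example above.

```python
# Hypothetical label store: address -> label, taken from the example in the text.
LABELS = {
    "0x098b…2f96": "Lazarus Group: Ronin Bridge Exploiter",
    "0x1361…1f39": "Lazarus Group: Ronin Bridge Exploiter",
    "0xd90e…f31b": "Tornado.Cash Router",
}


def annotate_path(path, labels):
    """Pair every address on a fund-flow path with its label, if one is known."""
    return [(address, labels.get(address, "Unlabeled")) for address in path]


path = ["0x098b…2f96", "0x1361…1f39", "0xd90e…f31b"]
annotated = annotate_path(path, LABELS)
for address, label in annotated:
    print(f"{address}  ->  {label}")
```

The same three hops that were an anonymous chain of addresses now read as an exploiter moving funds into a mixer.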

Accurate Portrait: On-Chain Risk Classification System

By combining graph analysis with a sophisticated tagging system, we can accurately profile on-chain risks. The system needs to be able to identify and tag multiple threat types to help us fully understand and respond to the “dark forest” of the crypto world.

As shown in the following figure:

By accurately identifying and classifying these threats, the risk control system can provide multi-dimensional risk assessment.

The Implementation: Building the Risk Control System with NebulaGraph

Reasons for Choosing NebulaGraph

First, performance was the primary consideration. NebulaGraph’s distributed architecture and strong concurrent processing capabilities let it handle queries over tens of billions of nodes and edges, so that complex graph traversals and analyses complete in milliseconds.

Secondly, a flexible data model allows us to better depict the complex relationships in the blockchain world. The property graph model in NebulaGraph is ideally suited to capture the rich, multi-layered connections in blockchain data.

Finally, ecosystem compatibility is also a key consideration. NebulaGraph natively supports ISO-GQL, which reduces the learning curve for our team. Its rich visualization tools and APIs allow us to quickly build intuitive fund flow analysis interfaces.

System Architecture: Four Pillars of the Risk Control System

BlockSec’s risk control system is built on four pillars, which work together to support the functionality of the entire system.

The Data Layer extracts on-chain fund flow data in real time from full nodes of various blockchain networks (such as BSC, ETH, and Tron). Data extractors pull the raw data, and a multi-filter pipeline cleans it to ensure accuracy and relevance.

The Intelligence Layer is the “brain” of the system. Through 24/7 intelligence collectors, it draws information from a variety of sources, including public APIs, open-source intelligence (OSINT) feeds, and regulatory agency lists. This raw intelligence is processed through intelligence processors to generate precise address tags, providing critical context for subsequent risk analysis.

The Storage Layer is the core infrastructure of the system and is powered by NebulaGraph. NebulaGraph was chosen for its superior graph data storage and query capabilities, which can efficiently handle massive amounts of addresses and transaction relationships, providing a solid foundation for building the risk network.

The Computation Layer deploys predefined and custom risk engines. These engines leverage graph data from the Storage Layer and labels provided by the Intelligence Layer to perform advanced risk pattern detection and threat assessment. These engines interact with the system through the nGQL (NebulaGraph Query Language) query interface to obtain real-time risk analysis results.

These four pillars together constitute a powerful crypto risk control system, ensuring efficient operation of the entire chain from data capture to risk insight.

Qualitative Analysis: Using Labels to Give Data Meaning

Where do labels come from? How can we ensure their accuracy? This is precisely the key to qualitative analysis.

Our label sources are multi-dimensional:

On-Chain Heuristics: Analyze behavioral patterns from blockchain activity, build phishing detection engines and attack detection engines, and identify phishing addresses and attack transactions.
Off-Chain Intelligence: This information is gathered from public sources, including social media, official legal documents, and collaboration with third-party intelligence partners. If an address is sanctioned by an official agency or widely identified as a hacker in the public community, it will be labeled accordingly.
Human Analysis: As an important supplement to the automated system, a team of experts conducts in-depth investigations to verify automated findings and discover new and complex crime patterns that the system may have missed.

We adopt the following strategies to ensure label accuracy:

Automated Cross-Validation: The system automatically aggregates and cross-references data from multiple intelligence sources. If an address is labeled as fraudulent by multiple independent sources, the confidence level of that label is higher.
Human-in-the-Loop Feedback: We use our product suite (e.g., MetaSleuth, Phalcon, and API services) to form a continuous feedback loop. When analysts investigate with these tools, any labeling errors or updates they discover can be corrected immediately and fed back into the system, continuously improving labeling accuracy and coverage.
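
The cross-validation idea can be sketched as follows. This is an illustration only: the source does not specify how confidence is computed, so the formula `1 - 0.5 ** n` (each independent source treated as hypothetically 50% reliable) and the source names are assumptions.

```python
from collections import defaultdict


def confidence(num_sources, per_source_reliability=0.5):
    """Assumed scoring rule: chance that at least one of n independent sources is right."""
    return 1 - (1 - per_source_reliability) ** num_sources


def aggregate_labels(reports):
    """Cross-reference (address, label, source) reports into per-label confidence.

    Repeated reports from the same source count once, so only independent
    sources raise confidence.
    """
    sources = defaultdict(set)
    for address, label, source in reports:
        sources[(address, label)].add(source)

    result = defaultdict(dict)
    for (address, label), srcs in sources.items():
        result[address][label] = confidence(len(srcs))
    return result


reports = [
    ("0xabc", "phishing", "osint_feed"),      # hypothetical source names
    ("0xabc", "phishing", "partner_intel"),
    ("0xabc", "phishing", "osint_feed"),      # duplicate source, ignored
    ("0xdef", "sanctioned", "regulator_list"),
]
scores = aggregate_labels(reports)
```

Here `0xabc` ends up with a higher confidence than `0xdef` because two distinct sources agree on its label; labels below some chosen confidence could then be routed to human analysts for review.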

Risk Quantification: Building a Computable Risk Model

To quantify the risk and trace the sources of funds, we apply a heuristic algorithm with the following key steps:

Fiat Exposure Calculation

Objective: Standardize the USD value of token transfers.

Approach: Use price oracles to calculate the USD equivalent for every token transfer.
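
A minimal sketch of this calculation, assuming the oracle is a simple lookup table: the `PRICES_USD` and `DECIMALS` dictionaries and the prices in them are hypothetical placeholders, not real market data. On-chain amounts are integers in the token's smallest unit, so they are first scaled by the token's decimals.

```python
from decimal import Decimal

# Hypothetical oracle snapshot and token metadata (placeholder values).
PRICES_USD = {"ETH": Decimal("2000"), "USDT": Decimal("1")}
DECIMALS = {"ETH": 18, "USDT": 6}


def usd_value(token, raw_amount):
    """Convert a raw integer on-chain amount into its USD equivalent."""
    units = Decimal(raw_amount) / (Decimal(10) ** DECIMALS[token])
    return units * PRICES_USD[token]


eth_transfer = usd_value("ETH", 3000 * 10**18)   # 3,000 ETH in wei
usdt_transfer = usd_value("USDT", 1_500_000)     # 1.5 USDT in base units
```

`Decimal` is used instead of `float` so that large raw amounts such as wei values do not lose precision. A production system would presumably also look up the price at the time of each transfer rather than use a single snapshot.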

Time Series Filtering

Objective: Ensure temporal consistency in tracing.

Approach: Only edges that satisfy a time-ordering rule are preserved: an address cannot spend funds that it has not yet received.
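
One way to sketch this rule, under the assumption that tracing runs forward from a single source address and that each edge is a `(sender, receiver, timestamp)` tuple (a hypothetical format): process transfers in time order and keep an outgoing transfer only if its sender has already been reached by traced funds.

```python
import math


def filter_by_time_order(source, edges):
    """Keep only transfers that could carry funds originating at `source`.

    edges: iterable of (sender, receiver, timestamp) tuples.
    """
    # Earliest time at which traced funds reached each address.
    earliest = {source: -math.inf}
    kept = []
    # sorted() is stable, so transfers sharing a timestamp keep their input order.
    for sender, receiver, ts in sorted(edges, key=lambda e: e[2]):
        # The rule itself: the sender must have received traced funds by time ts.
        if sender in earliest and ts >= earliest[sender]:
            kept.append((sender, receiver, ts))
            earliest[receiver] = min(earliest.get(receiver, math.inf), ts)
    return kept


edges = [
    ("A", "B", 10),
    ("B", "C", 5),    # B spends before receiving from A: dropped
    ("B", "D", 20),   # B spends after receiving from A: kept
    ("E", "F", 15),   # not connected to the source: dropped
]
kept = filter_by_time_order("A", edges)
```

The transfer from B to C is excluded even though B is on the path, because it happened before any funds from A arrived at B.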

Haircut Strategy

Objective: Focus on the most relevant paths.

Approach: Apply proportional weight allocation and introduce a threshold to prune low-contribution or irrelevant paths. This prevents contamination of the result by negligible or noisy funding sources.
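
The following is a simplified sketch of a haircut-style trace, not BlockSec's exact algorithm. It assumes time-ordered `(sender, receiver, usd_value)` transfers, treats funds from addresses with no modelled balance as clean, and uses a hypothetical threshold expressed as a fraction of the traced source amount.

```python
from collections import defaultdict


def haircut_trace(source, source_amount, transfers, threshold=0.01):
    """Propagate traced funds proportionally and prune small contributions.

    Returns {address: traced USD amount currently held}.
    """
    balance = defaultdict(float)   # modelled total balance per address
    tainted = defaultdict(float)   # portion of that balance traced to `source`
    balance[source] = tainted[source] = source_amount
    min_taint = threshold * source_amount

    for sender, receiver, usd in transfers:
        moved = min(usd, balance[sender])  # part covered by modelled funds
        ratio = tainted[sender] / balance[sender] if balance[sender] > 0 else 0.0
        taint_moved = moved * ratio        # proportional ("haircut") share

        balance[sender] -= moved
        tainted[sender] -= taint_moved
        balance[receiver] += usd           # any unmodelled remainder counts as clean
        if taint_moved >= min_taint:       # prune low-contribution paths
            tainted[receiver] += taint_moved

    return {addr: amount for addr, amount in tainted.items() if amount > 0}


transfers = [
    ("X", "A", 100.0),  # clean funds from an unmodelled address
    ("S", "A", 100.0),  # traced funds arrive: A is now 50% traced
    ("A", "B", 50.0),   # carries 25.0 traced
    ("A", "C", 1.0),    # carries 0.5 traced, below the 1.0 cutoff: pruned
]
exposure = haircut_trace("S", 100.0, transfers)
```

In this example A mixes 100 of clean funds with 100 of traced funds, so each later payment from A carries half its value as traced. The payment to C falls under the cutoff and is not followed further.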

Through these steps, we can build a relatively accurate and flexible risk quantification model to provide a quantitative basis for attributing high-risk transactions.

The Value of Risk Control Systems

Proactive Prevention: Building a Proactive Defense System

Core Capability: Real-time blocking of risky transactions.

Application Scenario: When processing user deposits on an exchange or interactions on a DeFi protocol, an API call can fetch the risk score of the associated addresses. If the score exceeds a threshold (e.g., because the funds come from a mixer or a sanctioned entity), the system can automatically halt or reject the transaction, stopping the risk at the gate.
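
A gate of this kind might look like the sketch below. The scoring rule (share of inbound USD value that comes from high-risk categories), the category names, and the threshold are all assumptions made for illustration, and the local `risk_score` function stands in for the API call described above.

```python
# Hypothetical set of categories treated as high risk.
HIGH_RISK_CATEGORIES = {"mixer", "sanctioned"}


def risk_score(exposure_usd):
    """Share of inbound USD value attributed to high-risk categories (0.0 to 1.0)."""
    total = sum(exposure_usd.values())
    if total == 0:
        return 0.0
    risky = sum(v for k, v in exposure_usd.items() if k in HIGH_RISK_CATEGORIES)
    return risky / total


def screen_deposit(exposure_usd, threshold=0.25):
    """Reject the deposit when the risk score exceeds the threshold."""
    return "REJECT" if risk_score(exposure_usd) > threshold else "ALLOW"


decision_risky = screen_deposit({"mixer": 300.0, "exchange": 700.0})
decision_clean = screen_deposit({"exchange": 1000.0})
```

The first deposit is rejected because 30% of its inbound value traces back to a mixer; the second passes because none of its value does.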

In-Process Control: Raising the Security Baseline of Partners

Core Capability: Comprehensive due diligence on counterparties and projects.

Application Scenario: When evaluating a new DeFi project for integration or conducting a large transaction with an institution, graph analysis can review the historical behavior and funding network of their contracts and wallets. This reveals whether they have close ties to "high-risk entities," providing critical data for partnership decisions.

Post-Mortem Forensics: Providing Precise Investigation & Evidence

Core Capability: Deep penetration and tracing of on-chain funds.

Application Scenario: After a security incident (like a hack), provide clear, visual fund flow reports for the project team and law enforcement. Our system can quickly trace the complete path of stolen funds through multiple hops, mixers, and into exchanges, securing the crucial time window for asset freezing and recovery.

Conclusion

Leveraging NebulaGraph, BlockSec has built a robust risk control system that spans the path from transaction data to risk networks. Looking ahead, we believe that NebulaGraph can provide even more stable and solid support for the compliant development of global crypto finance.