Fraud Detection with Graph Analytics
The 2023 State of Fraud report by Signifyd indicates that cases of online fraud are increasing at an alarming rate. E-commerce sites, for example, experienced a 71% increase in bot attacks, and fraudulent orders grew by 34%.
Even worse, scammers are now using new technology such as artificial intelligence (AI) to automate fraud. This is bad news for any business, regardless of size or industry. Our recommendation for decision makers like you is to start with the how.
How do you safeguard your assets when fraudsters are adopting new technologies to attack traditional security methods?
Not to worry: graph analytics is filling the gap where traditional approaches have been overwhelmed. This method of identifying and preventing online fraud is already working for the organizations that have embraced it, and it is worth considering the switch for yours.
Let’s understand why this is important.
What is graph analytics?
To understand graph analytics, it helps to briefly look at how graph databases work. A typical graph database, whether commercial or open source, stores data as data points and the connections between them rather than as rows in tables. Each data point is a node. Nodes can represent details like phone numbers, addresses, or devices. Nodes are linked to one another by lines, also called edges. These edges can represent data transfers, phone calls, email messages, or other information that connects the nodes. As a result, a graph database looks like a connected network.
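To make the node and edge idea concrete, here is a minimal sketch in plain Python rather than in any particular graph database. The account names, relationship types, and the `build_graph` helper are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical edges: (node, relationship, node). Every name here is made up.
edges = [
    ("account:alice", "USES_PHONE", "phone:555-0100"),
    ("account:bob", "USES_PHONE", "phone:555-0100"),
    ("account:bob", "USES_DEVICE", "device:D42"),
    ("account:carol", "LIVES_AT", "address:12 Elm St"),
]

def build_graph(edge_list):
    """Return an undirected adjacency map: node -> set of (relationship, neighbor)."""
    graph = defaultdict(set)
    for source, relationship, target in edge_list:
        graph[source].add((relationship, target))
        graph[target].add((relationship, source))
    return graph

graph = build_graph(edges)

# Walk the edges of one node to see what it is connected to.
neighbors_of_bob = sorted(neighbor for _, neighbor in graph["account:bob"])
print(neighbors_of_bob)
```

A real graph database stores and indexes this structure for you. The point of the sketch is only that each record is a connection, so following connections is the basic operation.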
Now, graph analytics is the technique of studying these connected data points to find relationships and patterns. Through graph analytics, analysts can see how different data points interact within a network. It reveals the strength of each interaction and how information flows throughout the network.
Graph analytics also prioritizes the context around data points. It assumes that no data point acts in isolation, and that certain data points can influence the rest to varying degrees. This is different from traditional analytics, which calculates averages and frequencies rather than how data points affect each other.
From what we have been seeing, fraudsters share information with each other on platforms in places like the dark web. An example of such information could be how-to tutorials for new hacking and scamming techniques. They also tend to cooperate with each other to capitalize on lucrative or high-stakes targets.
While the fraudsters’ activities may be difficult to trace, their behaviors in the real world are not. They show patterns that graph analysis can detect. This is why graph analytics has become crucial in fraud detection.
Strengths of graph analytics in fraud detection
Graph analytics is well suited to fraud detection because of the nature of fraudulent activities. Some online scammers act alone, but most frauds are committed by a group of people, or a fraud ring.
A typical fraud ring may comprise a handful of individuals or thousands of members across the world. Fraud rings can target a wide range of businesses, e.g., banks, retailers, insurance companies, currency exchanges, etc. The scammers typically use different techniques to hide their identities and locations, making them difficult to track. But they cannot hide from graph analytics.
Using machine learning algorithms, graph analytics can identify a fraud ring and reveal its connections. For example, it can spot fraudulent activities in credit card transactions or wire transfer records. Graph analytics empowers investigators to see how fraud rings operate.
Here is a brief breakdown of the specific strengths that make graph analytics superior:
- The visual representation of graph data is more intuitive to interpret than charts or spreadsheets
- The hidden relationships between fraudulent profiles and activities become clear
- Machine learning algorithms reduce the manual tasks of scanning and analyzing data
- Fraudulent actors are identified faster and more accurately based on real-time data
How a typical graph data model works in fraud detection
When used in fraud detection, a graph data model acts as a virtual detective that finds evidence of fraud. It links individuals, transactions, and financial institutions involved in crime. Then, it compares these connections with previously identified fraud patterns to find similarities or anomalies.
Let's consider a money laundering pattern as an example. One person wants to transfer large amounts of money without being detected by banking systems. The recipient also wants to get the funds without detection. The sender usually divides the funds into smaller portions. They also open several accounts with different banks to set up the transfers. Similarly, the recipient prepares several accounts to appear as though they are receiving payments from different clients.
At a glance, these transactions may appear normal and continue to flow without problems. But with graph analytics, the connections between the sender and recipient become clear. The graph shows how the money flows from a single sender to a single recipient through multiple channels. This is a common pattern of money laundering, where money generated by criminal activities is made to appear legitimate.
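The single sender, many channels, single recipient pattern described above can be sketched as a path search over a directed graph of transfers. This is an illustrative sketch only: the account names, amounts, the `find_paths` helper, and the "two or more routes" rule are assumptions, not a production detection rule.

```python
from collections import defaultdict

# Hypothetical transfers: (from_account, to_account, amount).
transfers = [
    ("sender", "acct_A", 9000),
    ("sender", "acct_B", 9500),
    ("sender", "acct_C", 8800),
    ("acct_A", "recipient", 9000),
    ("acct_B", "recipient", 9500),
    ("acct_C", "shop", 8800),  # an ordinary payment that never reaches the recipient
]

def find_paths(transfers, source, target, max_hops=4):
    """Return every simple path of at most max_hops transfers from source to target."""
    outgoing = defaultdict(list)
    for sender, receiver, _amount in transfers:
        outgoing[sender].append(receiver)
    paths, stack = [], [[source]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == target:
            paths.append(path)
            continue
        if len(path) - 1 >= max_hops:
            continue  # stop extending paths that are already too long
        for next_account in outgoing.get(node, []):
            if next_account not in path:  # avoid cycles
                stack.append(path + [next_account])
    return paths

paths = find_paths(transfers, "sender", "recipient")
# Assumed rule of thumb: several independent routes between the same two parties is worth a review.
is_suspicious = len(paths) >= 2
print(len(paths), is_suspicious)
```

Each transfer looks unremarkable on its own. Only when the transfers are treated as connected edges does it become visible that two separate routes lead from the same sender to the same recipient.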
Graph analytics can also determine if such fraudulent activities are expanding rapidly, e.g., new criminal rings emerging in unexpected locations. It reveals shared data points, e.g., the same devices or email addresses accessing multiple banks or websites.
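Finding shared data points amounts to grouping accounts by the identifiers they touch. The sketch below assumes a hypothetical list of (account, identifier) observations; the `shared_identifiers` helper and its `min_accounts` threshold are illustrative choices.

```python
from collections import defaultdict

# Hypothetical observations: which account was seen with which device or email.
observations = [
    ("acct_1", "device:D42"),
    ("acct_2", "device:D42"),
    ("acct_3", "device:D42"),
    ("acct_4", "device:D77"),
    ("acct_2", "email:x@example.com"),
    ("acct_5", "email:x@example.com"),
]

def shared_identifiers(pairs, min_accounts=2):
    """Return identifiers used by at least min_accounts distinct accounts."""
    accounts_by_identifier = defaultdict(set)
    for account, identifier in pairs:
        accounts_by_identifier[identifier].add(account)
    return {
        identifier: sorted(accounts)
        for identifier, accounts in accounts_by_identifier.items()
        if len(accounts) >= min_accounts
    }

shared = shared_identifiers(observations)
for identifier, accounts in shared.items():
    print(identifier, accounts)
```

Here `acct_2` connects the device cluster to the email cluster, which is the kind of link that is easy to miss when devices and emails are analyzed in separate tables.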
In terms of e-commerce fraud, graph analytics can alert businesses about frequent chargebacks or transaction reversals from specific customers. These patterns may indicate that the customers intend to keep purchased goods without paying for them, which is a criminal offense.
The key approaches that graph analytics employs to identify fraud
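A chargeback alert of this kind can be as simple as a per-customer rate check. The order data, the `flag_frequent_chargebacks` helper, and both thresholds below are assumptions chosen for illustration; real thresholds would depend on the business.

```python
from collections import Counter

# Hypothetical order outcomes: (customer, status).
orders = [
    ("cust_1", "paid"), ("cust_1", "chargeback"),
    ("cust_1", "chargeback"), ("cust_1", "chargeback"),
    ("cust_2", "paid"), ("cust_2", "paid"), ("cust_2", "chargeback"),
    ("cust_3", "paid"),
]

def flag_frequent_chargebacks(orders, min_orders=3, max_rate=0.5):
    """Return {customer: chargeback_rate} for customers above the assumed thresholds."""
    totals = Counter(customer for customer, _ in orders)
    chargebacks = Counter(customer for customer, status in orders if status == "chargeback")
    flagged = {}
    for customer, total in totals.items():
        rate = chargebacks[customer] / total
        if total >= min_orders and rate > max_rate:
            flagged[customer] = rate
    return flagged

flagged = flag_frequent_chargebacks(orders)
print(flagged)
```

A flag like this is a prompt for review rather than proof of intent, since chargebacks also have legitimate causes.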
There are four approaches that data scientists use to identify fraud using graph analytics, as we’ll explain below:
1. Graph queries
Graph databases require a graph query language to extract and modify data. An example of such a query language is NebulaGraph’s nGQL. These specialized languages enable graph analysts to ask questions and derive valuable answers from their databases, such as:
- How many individuals are transacting with a known fraudster?
- How far apart (in miles) are transactions of $10,000?
- How many of those transactions were conducted at 3 AM?
Using such questions, graph analysts can identify any data points of interest and determine how they are related. The queries can be as detailed as possible, e.g., differences between a shipping address and a billing address on specific orders. They can also analyze the contents of a checkout basket to find anomalies in purchasing patterns. These patterns may be difficult to see without graph technology.
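To show the shape of such questions, here is a sketch that answers two of them in plain Python over an in-memory list. It is not nGQL and does not use any graph database API; the transaction records and both helper functions are hypothetical.

```python
from datetime import datetime

# Hypothetical transaction records.
transactions = [
    {"sender": "u1", "receiver": "fraudster_9", "amount": 10000, "time": datetime(2023, 5, 1, 3, 12)},
    {"sender": "u2", "receiver": "fraudster_9", "amount": 250, "time": datetime(2023, 5, 1, 14, 0)},
    {"sender": "fraudster_9", "receiver": "u3", "amount": 10000, "time": datetime(2023, 5, 2, 3, 45)},
    {"sender": "u1", "receiver": "u4", "amount": 10000, "time": datetime(2023, 5, 2, 11, 30)},
]
known_fraudsters = {"fraudster_9"}

def counterparties_of(transactions, flagged):
    """Everyone who sent money to, or received money from, a flagged party."""
    people = set()
    for t in transactions:
        if t["sender"] in flagged and t["receiver"] not in flagged:
            people.add(t["receiver"])
        if t["receiver"] in flagged and t["sender"] not in flagged:
            people.add(t["sender"])
    return people

def count_at_hour(transactions, amount, hour):
    """Number of transactions of exactly `amount` made during the given hour (0-23)."""
    return sum(1 for t in transactions if t["amount"] == amount and t["time"].hour == hour)

contacts = counterparties_of(transactions, known_fraudsters)
late_night = count_at_hour(transactions, amount=10000, hour=3)
print(sorted(contacts), late_night)
```

In a graph query language the same questions are expressed as traversals from a starting node, which lets the database follow stored edges instead of scanning every record as this sketch does.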
2. Machine learning
The form of machine learning used in this context is supervised machine learning. In supervised learning, the data used to train a model is labeled in advance, so each training example comes with the correct answer. The model learns from these labeled examples and then applies what it has learned to new, unlabeled data.
A common example of supervised machine learning is image recognition. Data scientists label training photos of all kinds of vehicles (planes, cars, trains, buses, etc.), marking which ones show a bus. From these labeled photos, the model learns how to identify a bus.
In fraud detection, graph analysts can label training data as fraud or non-fraud. For example, a known fraudulent transaction can be labeled and fed into the ML model. The model can then learn which attributes tend to accompany fraud, such as transfers routed through a particular tax haven. Using what it has learned, the model can analyze all transaction data and pick out matching incidents.
This approach is particularly helpful for tracking money laundering activities. Data scientists can describe each labeled training example with attributes such as:
- User identities
- Frequency of purchases
- Transfer amounts
- Internet Protocol (IP) addresses, etc.
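As a minimal sketch of learning from labeled examples, the code below uses a nearest-centroid classifier, which is one of the simplest supervised methods and is swapped in here only for brevity. The feature choice (transfers per day, average amount in thousands, distinct IP addresses) and every number in the training set are invented for illustration.

```python
import math

# Hypothetical labeled examples:
# ([transfers_per_day, avg_amount_in_thousands, distinct_ip_addresses], label)
training = [
    ([2, 0.3, 1], "non-fraud"),
    ([1, 0.5, 1], "non-fraud"),
    ([3, 0.2, 2], "non-fraud"),
    ([25, 9.5, 8], "fraud"),
    ([30, 9.0, 12], "fraud"),
    ([20, 9.8, 6], "fraud"),
]

def fit_centroids(examples):
    """Average the feature vectors of each label to get one centroid per class."""
    sums, counts = {}, {}
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}

def predict(centroids, features):
    """Assign the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))

centroids = fit_centroids(training)
print(predict(centroids, [22, 9.1, 7]))
print(predict(centroids, [2, 0.4, 1]))
```

A real model would need far more data, feature scaling, and proper evaluation. In a graph setting, features derived from the graph itself, such as how many flagged accounts a user is connected to, are natural additions to this list.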
3. Graph algorithms
A graph algorithm is a mathematical tool used to detect patterns in graph databases automatically. It can identify specific qualities and connections between data points in ways that graph queries might miss. Some examples of graph algorithms include:
- Centrality algorithms: These detect the most influential individuals or data points in a network, e.g., accounts with the most frequent chargeback activities on a payment platform.
- Community detection algorithms: These identify groups of data points that are more densely connected to each other than to the rest of the network, e.g., transaction patterns that show signs of a fraud ring.
- Similarity algorithms: These identify data points that have matching qualities, e.g., clusters of accounts that are linked to the same email addresses.
- Link prediction algorithms: These estimate how likely it is that a connection exists or will form between two data points, e.g., how likely a suspicious account is linked to a known fraud ring, based on established patterns.
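The four families above can each be illustrated with a very small stand-in. In this sketch, degree centrality represents centrality, connected components serve as a deliberately crude substitute for community detection, Jaccard similarity of neighbor sets represents similarity, and a common-neighbors count represents link prediction. The graph and all function names are hypothetical, and production systems typically use more refined algorithms.

```python
from collections import defaultdict

# Hypothetical undirected links between accounts.
edges = [
    ("a1", "a2"), ("a1", "a3"), ("a2", "a3"), ("a3", "a4"),  # one linked group
    ("b1", "b2"),                                             # a separate pair
]

def build_adjacency(edges):
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency

def degree_centrality(adjacency):
    """Centrality: share of the other nodes that each node is directly linked to."""
    n = len(adjacency)
    return {node: (len(nb) / (n - 1) if n > 1 else 0.0) for node, nb in adjacency.items()}

def connected_components(adjacency):
    """Crude community stand-in: groups of nodes reachable from one another."""
    seen, groups = set(), []
    for start in adjacency:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(adjacency[node] - seen)
        groups.append(group)
    return groups

def jaccard_similarity(adjacency, u, v):
    """Similarity: overlap between the neighbor sets of two nodes."""
    union = adjacency[u] | adjacency[v]
    return len(adjacency[u] & adjacency[v]) / len(union) if union else 0.0

def common_neighbors(adjacency, u, v):
    """Link prediction score: more shared neighbors suggests a likelier link."""
    return len(adjacency[u] & adjacency[v])

adjacency = build_adjacency(edges)
centrality = degree_centrality(adjacency)
groups = connected_components(adjacency)
print(max(centrality, key=centrality.get), [sorted(g) for g in groups])
print(jaccard_similarity(adjacency, "a1", "a2"), common_neighbors(adjacency, "a1", "a4"))
```

In this toy graph, `a3` is the most central account, the two groups separate cleanly, and `a1` and `a4` share one neighbor even though they are not directly linked.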
There are numerous other algorithms that apply to graph analytics. Data scientists can customize these algorithms to match specific use cases and derive detailed insights. This means that they can keep up with changing fraud tactics before they impact an organization.
4. Real-time analytics
Some types of fraud are extremely time-sensitive, such as phone-based scams targeting senior citizens. With graph analytics, such types of fraud can be detected in real time. For example, phone communications can be scanned in real time and compared against the known patterns of scam callers.
Suspicious calls can then be automatically redirected to a fraud detection team. The call recipient can also see an alert on their device saying that the caller is a potential scammer.
According to the Federal Bureau of Investigation (FBI), the Internet Crime Complaint Center (IC3) received fewer scam reports in 2022 than in the previous year. However, the value of the money lost in these reports increased by almost 50%.
This increase is attributed to an emerging trend where scammers target people over 60 years old who hold large retirement savings and investments. What we learn here is that scammers are constantly evolving their techniques to increase their fraudulent returns.
Because of the increasing sophistication of fraudsters, it’s important that your organization’s fraud detection approaches are always a step ahead. As we have seen, graph analytics can help get the job done.
Remember that traditional fraud detection methods rely on stored data, which requires intensive human resources to categorize and analyze for patterns. Graph analytics, however, automates much of this analysis through machine learning. It reveals patterns in the data that traditional analysis methods may fail to detect. This is why the switch to graph analytics is essential for a robust organizational fraud detection strategy.
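One simple way to picture real-time scanning is a sliding window over incoming call events that updates each caller's recent connections as they arrive. The `CallMonitor` class, its thresholds, and the assumed pattern (one caller reaching many distinct recipients in a short time) are all illustrative. The sketch also assumes events arrive in timestamp order.

```python
from collections import defaultdict, deque

class CallMonitor:
    """Flags callers who reach too many distinct recipients within a time window."""

    def __init__(self, window_seconds=3600, max_distinct_recipients=3):
        self.window_seconds = window_seconds
        self.max_distinct_recipients = max_distinct_recipients
        self.recent_calls = defaultdict(deque)  # caller -> deque of (timestamp, recipient)

    def observe(self, timestamp, caller, recipient):
        """Record one call and return True if the caller now looks suspicious."""
        calls = self.recent_calls[caller]
        calls.append((timestamp, recipient))
        # Drop calls that have fallen out of the window.
        while calls and calls[0][0] <= timestamp - self.window_seconds:
            calls.popleft()
        distinct_recipients = {r for _, r in calls}
        return len(distinct_recipients) > self.max_distinct_recipients

monitor = CallMonitor()
events = [
    (0, "caller_X", "r1"), (60, "caller_X", "r2"),
    (120, "caller_X", "r3"), (180, "caller_X", "r4"),  # fourth distinct recipient
    (200, "caller_Y", "r1"), (4000, "caller_Y", "r1"),  # same recipient, far apart
]
alerts = [(caller, monitor.observe(ts, caller, recipient)) for ts, caller, recipient in events]
print(alerts)
```

Because each event only touches one caller's recent edges, the check stays cheap enough to run as calls come in, which is what makes redirecting a call or warning the recipient feasible.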