How machine learning is used to detect fraud

One of machine learning’s most well-known use cases is fraud detection, an area that has drawn the attention of a growing number of technology suppliers looking to develop the best algorithms and techniques to solve a problem that costs businesses millions of dollars each year.

According to a study by Vesta, a global payment service provider, fraud cost businesses an average of 8% of annual revenues in 2017. The biggest impact, however, has been on digital goods suppliers, which lost 9.7% of revenue on average to fraud – an increase of 13% from 2016.

The majority of fraud expenditures are for fraud management, which makes up 75% of fraud costs, triple the actual fraud losses themselves.

San Francisco-based Stripe, a payment technology company, believes its technology has what it takes to detect online fraud from the outset.

Consider a fraudster who uses credit card information bought off the dark web to buy a laptop from an online merchant. Upon realising that a fraudulent transaction has been made, the rightful cardholder files a dispute with his or her bank, which in turn levies the total cost of the fraudulent transaction on the merchant.

Michael Manapat, Stripe’s head of data and machine learning products, says that instead of having merchants review each transaction and write rules, as is the case with traditional fraud detection, the company uses machine learning to do all the heavy lifting.

There are several tell-tale signs of fraud that Stripe’s machine learning model looks out for. These include buying an item in multiple sizes, pasting credit card details into order forms rather than typing them out, and the number of distinct cards used by a single person over a period of time.
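To make the three signals above concrete, here is a minimal Python sketch of how they might be turned into features for a model. It is not Stripe's implementation: the `Order` fields, the `extract_signals` function and the seven-day window are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Order:
    # All field names are hypothetical, chosen only for this illustration.
    customer_id: str
    card_fingerprint: str
    items: list              # list of (product_id, size) tuples
    card_was_pasted: bool    # True if card details were pasted, not typed
    created_at: datetime


def extract_signals(order, past_orders, window=timedelta(days=7)):
    """Turn one order plus earlier orders into a dict of numeric features."""
    # Signal 1: the same product bought in more than one size.
    sizes_by_product = {}
    for product_id, size in order.items:
        sizes_by_product.setdefault(product_id, set()).add(size)
    multi_size = any(len(sizes) > 1 for sizes in sizes_by_product.values())

    # Signal 2: card details pasted into the order form.
    pasted = order.card_was_pasted

    # Signal 3: distinct cards used by this customer within the window
    # (the window length is an assumed parameter).
    cutoff = order.created_at - window
    cards = {order.card_fingerprint}
    for past in past_orders:
        same_customer = past.customer_id == order.customer_id
        in_window = cutoff <= past.created_at <= order.created_at
        if same_customer and in_window:
            cards.add(past.card_fingerprint)

    return {
        "multi_size_purchase": int(multi_size),
        "card_details_pasted": int(pasted),
        "distinct_cards_in_window": len(cards),
    }
```

In a real system, features like these would be fed to a trained classifier alongside many others rather than being used as rules on their own.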

With historical data on transactions and purchases made across its network of merchants, Stripe is then able to flag up potentially fraudulent transactions with higher confidence levels.

To reduce the number of false positives, Stripe uses human risk analysts to identify and fine-tune the fraud signals used in its machine learning model. Machine learning engineers also examine false positives by hand to understand why the classifier got something wrong. “All our systems are retrained every day automatically as more data arrives,” Manapat says.

Beyond building better machine learning models, it is just as important to ensure that users trust and understand the insights those models generate. Manapat claims that Stripe’s model delivers insights with a high degree of human interpretability.

“Users want to know why we think a transaction is fraudulent, so we’ve been providing explanations on all model decisions,” Manapat says. “When we say a transaction is at high risk of fraud, we’ll tell you that we saw a high volume of similar transactions over the past day. Or that it’s medium risk because the card was issued in the US but the user’s IP address was in Singapore.”
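The two explanations Manapat describes can be illustrated with a small Python sketch that attaches plain-language reasons to a risk level. This is a hand-written, rule-based stand-in, not how Stripe derives explanations from its model; the function name `explain_risk`, the dictionary keys and the volume threshold of 100 are all assumptions.

```python
def explain_risk(txn, volume_threshold=100):
    """Return (risk_level, reasons) for a transaction given as a dict.

    The keys and the threshold are hypothetical, for illustration only.
    """
    reasons = []
    level = "normal"

    # Geography mismatch: card issued in one country, request from another.
    if txn["card_country"] != txn["ip_country"]:
        level = "medium"
        reasons.append(
            f"card was issued in {txn['card_country']} "
            f"but the IP address was in {txn['ip_country']}"
        )

    # Burst of similar activity over the past day outranks the mismatch.
    if txn["similar_txns_past_day"] >= volume_threshold:
        level = "high"
        reasons.append(
            f"{txn['similar_txns_past_day']} similar transactions "
            "seen over the past day"
        )

    return level, reasons
```

For example, a US-issued card used from a Singapore IP address with little similar activity would come back as medium risk with the mismatch as its single reason.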

So far, Stripe appears to be gaining traction in the market, having blocked $4bn worth of fraudulent transactions in 2017. It recently upgraded its feature set, which now lets merchants block potential fraudsters based on a list of attributes, as well as set custom thresholds at which to block payments, among other enhancements.
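How a merchant-defined block list and a custom threshold might combine can be sketched in a few lines of Python. This does not use Stripe's actual API: `should_block`, the attribute names, the 0-to-1 score range and the default threshold of 0.75 are all hypothetical.

```python
def should_block(payment, risk_score, block_list, threshold=0.75):
    """Return (blocked, reason) for one payment.

    `payment` is a dict of attributes, `block_list` maps an attribute
    name to a set of banned values, and `risk_score` is assumed to be
    a model output between 0 and 1. All names are illustrative.
    """
    # Block-list rules: any attribute value the merchant has listed.
    for attribute, banned_values in block_list.items():
        if payment.get(attribute) in banned_values:
            return True, f"{attribute} is on the merchant's block list"

    # Threshold rule: block once the score reaches the merchant's cut-off.
    if risk_score >= threshold:
        return True, (
            f"risk score {risk_score:.2f} is at or above "
            f"threshold {threshold:.2f}"
        )

    return False, "allowed"
```

Lowering the threshold blocks more fraud at the cost of more false positives, which is the trade-off a custom setting lets each merchant make for itself.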
