Fraud Detection Using Neural Networks

Machine Learning in General Insurance

July 2022

Fraud detection is an important aspect of running a modern insurance company. Fraudulent claims worth several billion are reported and settled each year, causing losses for insurance companies and, in turn, inflated and unfair premiums. Let us therefore do our best to make life hard for fraudsters.

The Purpose of a Fraud Detection Model

Typically, the process of detecting fraudulent claims goes something like this: a suspicious claim is received and sent to triage. If there is sufficient information to prioritize the case, it is forwarded to investigation.

The purpose of a fraud detection model is to supplement the existing process by improving the identification of suspicious claims for further investigation. If we are able to do this well, we will catch more fraudulent claims using fewer resources.

Building a Fraud Detection Model

Building a fraud detection model requires several elements, each crucial for success. The most important ones are:

Define goals

Defining the goals of the model is extremely important. Do we simply want to catch as many fraudulent claims as possible? Or are there resource constraints to account for?
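
One way to make a resource constraint concrete is to ask how many of the claims we can afford to investigate turn out to be fraudulent. The sketch below assumes the model outputs a fraud score per claim and that investigators can handle a fixed number of cases; the function name, the scores and the labels are hypothetical and only serve as illustration.

```python
def precision_at_capacity(scores, labels, capacity):
    """Share of confirmed fraud among the `capacity` highest-scored claims."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    # Rank claims from most to least suspicious according to the model score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top = ranked[:capacity]
    return sum(label for _, label in top) / len(top)


# Hypothetical scores and investigation outcomes (1 = fraud confirmed).
scores = [0.91, 0.15, 0.78, 0.40, 0.66, 0.05]
labels = [1, 0, 0, 0, 1, 0]

# Assume the investigation team can handle three cases.
hit_rate = precision_at_capacity(scores, labels, capacity=3)
print(f"Share of investigated claims that were fraudulent: {hit_rate:.2f}")
```

If the goal is instead to catch as many fraudulent claims as possible, the capacity is effectively unlimited and a different measure would be more appropriate.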

Data

Data is the foundation of any Machine Learning model. Understanding the data and processing it appropriately is crucial. 

Performance

Most often, we will build several different models in order to compare them and assess which ones perform the best and most easily integrate into the business. Choosing the right methods and metrics is imperative. 
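
As an illustration of comparing models on a common metric, the sketch below computes precision and recall for two candidate models at the same decision threshold. The scores, labels and threshold are made up for the example, and these two metrics are an assumption about what one might choose, not a statement of what was used in the project.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when claims scoring at or above `threshold` are flagged."""
    flagged = [score >= threshold for score in scores]
    true_pos = sum(1 for f, y in zip(flagged, labels) if f and y == 1)
    false_pos = sum(1 for f, y in zip(flagged, labels) if f and y == 0)
    false_neg = sum(1 for f, y in zip(flagged, labels) if not f and y == 1)
    # Guard against division by zero when nothing is flagged or no fraud exists.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# Hypothetical ground truth (1 = fraud) and scores from two candidate models.
labels = [1, 0, 0, 1, 0, 0, 0, 0]
model_a = [0.9, 0.2, 0.6, 0.7, 0.1, 0.3, 0.2, 0.1]
model_b = [0.6, 0.1, 0.2, 0.4, 0.1, 0.7, 0.2, 0.1]

for name, scores in [("model A", model_a), ("model B", model_b)]:
    precision, recall = precision_recall(scores, labels, threshold=0.5)
    print(f"{name}: precision={precision:.2f}, recall={recall:.2f}")
```

Precision reflects how much investigation effort is wasted on legitimate claims, while recall reflects how much fraud slips through; which one matters more depends on the goals defined earlier.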

Modelling

Once data is in place, we can create the actual fraud detection model. This is an elaborate step consisting of preparing the data for use in a model, engineering features, and choosing and tuning a model.

Deployment

Creating a deployment strategy for how we best integrate the model into the business is again crucial. If we fail at this step, all of the awesome things we did in the previous steps will have been for almost nothing. 

Putting It Into Practice

The project was a proof of concept carried out in collaboration with a large Danish insurance company. The aim was to assess whether it would make sense to invest in building a sophisticated Machine Learning based fraud detection model rather than a simpler, low-tech approach.

We formulated the project as a supervised binary classification problem, involving the following model classes:

Feed Forward Neural Networks

This class of models is the workhorse of neural networks and is therefore well understood and well supported in statistical software.
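
To give a feel for what such a model computes, here is a minimal sketch of the forward pass of a feed forward network with one hidden layer, turning claim features into a fraud probability. The architecture, the two features and all weights are hypothetical; in practice the weights are learned from labelled claims and one would use an established library rather than hand-written code.

```python
import math


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(features, hidden_weights, hidden_biases, output_weights, output_bias):
    """One hidden ReLU layer followed by a sigmoid output unit."""
    # Each hidden unit takes a weighted sum of the inputs plus a bias.
    hidden = [
        relu(sum(w * x for w, x in zip(unit_weights, features)) + bias)
        for unit_weights, bias in zip(hidden_weights, hidden_biases)
    ]
    # The output unit combines the hidden activations into a single logit.
    logit = sum(w * h for w, h in zip(output_weights, hidden)) + output_bias
    return sigmoid(logit)


# Hypothetical standardized features, e.g. claim amount and policy age.
claim = [1.2, -0.5]

# Hand-picked weights purely for illustration; real weights come from training.
fraud_probability = forward(
    claim,
    hidden_weights=[[0.8, -0.4], [-0.3, 0.9]],
    hidden_biases=[0.1, 0.0],
    output_weights=[1.5, -1.0],
    output_bias=-0.5,
)
print(f"Predicted fraud probability: {fraud_probability:.3f}")
```

Because the output is a probability between 0 and 1, the supervised binary classification setup follows naturally: claims above a chosen threshold are flagged for investigation.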

Bayesian Neural Networks

Taking a Bayesian approach is compelling because the predictive distribution can provide valuable information compared with point predictions. It is, however, a class of models which is difficult to handle and very computationally expensive.
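
The difference between a point prediction and a predictive distribution can be sketched as follows. Assume we already have a handful of weight vectors sampled from the posterior (obtaining them is the expensive part and is not shown); each sample gives its own fraud probability, and the spread across samples indicates how uncertain the model is. For brevity the sketch uses a simple logistic model in place of a full neural network, and all numbers are invented.

```python
import math
import statistics


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predictive_summary(features, weight_samples):
    """Mean and spread of the fraud probability across posterior weight samples."""
    probabilities = [
        sigmoid(sum(w * x for w, x in zip(weights, features)))
        for weights in weight_samples
    ]
    return statistics.mean(probabilities), statistics.pstdev(probabilities)


# Hypothetical posterior samples of the weights for two features.
weight_samples = [[1.1, -0.2], [0.9, 0.1], [1.4, -0.6], [0.7, 0.3]]

# Two hypothetical claims with standardized features.
typical_claim = [0.5, 0.2]
unusual_claim = [2.0, 3.0]

for name, claim in [("typical", typical_claim), ("unusual", unusual_claim)]:
    mean, spread = predictive_summary(claim, weight_samples)
    print(f"{name} claim: mean probability={mean:.2f}, spread={spread:.2f}")
```

A point prediction would report only the mean; the spread is the extra information that could, for example, help triage decide whether a high score is trustworthy.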

Graph Convolutional Neural Networks

The nature of the data allowed us to include information on the relations between claimants, i.e. possible ways in which claimants might know each other. This is particularly interesting in terms of organized fraud. The relational data is encoded in a graph, which we can then use as input to a Machine Learning model; in this case a Graph Convolutional Neural Network.
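
To illustrate how a graph enters the model, the sketch below implements a single graph convolution step in one common formulation: each claimant's features are averaged with those of connected claimants (using a symmetrically normalized adjacency matrix with self-loops), followed by a linear transformation and a ReLU. Whether the project used exactly this variant is not stated, so treat it as an assumption; the graph, features and weights are invented.

```python
import math


def gcn_layer(adjacency, features, weights):
    """One graph convolution: normalize(A + I) @ features @ weights, then ReLU."""
    n = len(adjacency)
    # Add self-loops so each node keeps its own features.
    a_hat = [
        [adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    degree = [sum(row) for row in a_hat]
    # Symmetric normalization keeps well-connected nodes from dominating.
    norm = [
        [a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
        for i in range(n)
    ]
    n_in, n_out = len(weights), len(weights[0])
    # Aggregate features from each node's neighbourhood.
    aggregated = [
        [sum(norm[i][k] * features[k][c] for k in range(n)) for c in range(n_in)]
        for i in range(n)
    ]
    # Linear transformation followed by ReLU.
    return [
        [
            max(0.0, sum(aggregated[i][c] * weights[c][o] for c in range(n_in)))
            for o in range(n_out)
        ]
        for i in range(n)
    ]


# Hypothetical graph: claimants 0 and 1 are linked, claimant 2 is isolated.
adjacency = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
# One feature per claimant, e.g. a prior suspicion indicator.
features = [[1.0], [0.0], [0.0]]
weights = [[1.0]]

embeddings = gcn_layer(adjacency, features, weights)
print(embeddings)
```

In this toy example the signal on claimant 0 spreads to the linked claimant 1 but not to the isolated claimant 2, which is the mechanism that makes relational data useful for spotting organized fraud.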

The results of the project can be summarized as follows:

  • Feed Forward Neural Networks performed significantly better, by a large margin, than simpler solutions such as logistic regression.
  • Bayesian Neural Networks were too computationally expensive to be practically feasible. 
  • Including relational data through Graph Convolutional Neural Networks improved performance. Using relational data in conjunction with regular methods is an exciting way of improving performance beyond what might otherwise be possible.

Generally speaking, these methods are very versatile and are by no means limited to insurance. Coupling Machine Learning with experts within the specific domain and using “the best of both worlds” is very often the way to a better solution.