Credit Card Fraud Transaction Detector

In this article, we will be creating a basic custom neural network model designed for Kaggle’s Credit Card Fraud Detection dataset using Deep Learning Studio.

Deep Learning Studio

Instead of coding up the entire network, we will use the GUI-based, drag-and-drop software Deep Learning Studio Manager (DLS) to build the custom neural network. DLS will also be used to train and test the network on the dataset.

The software can be downloaded from deepcognition.ai by creating a free account. For this example, Windows v3.0.0 will be used.

Dataset

In this example, we will be using the Credit Card Fraud Detection dataset, which is available on Kaggle. The dataset is highly imbalanced, with only 492 fraudulent transactions among the 284,807 transactions provided, i.e. about 0.172% of all transactions. There are 30 features that determine whether a given transaction is fraudulent. Of these 30 features, 28 are masked for confidentiality reasons, while the other two are Time and Amount (in USD).
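The imbalance described above can be checked directly on the downloaded file before uploading it. The following is a minimal sketch using only the standard library; the function names are hypothetical, and it assumes the CSV has a header row with a Class column in which fraud is labelled 1.

```python
import csv
from collections import Counter


def class_counts(csv_path):
    """Count how many rows fall into each value of the Class column."""
    counts = Counter()
    with open(csv_path, newline="") as handle:
        for row in csv.DictReader(handle):
            counts[row["Class"]] += 1
    return counts


def fraud_percentage(counts):
    """Share of rows labelled as fraud (Class == "1"), in percent."""
    total = sum(counts.values())
    return 100.0 * counts.get("1", 0) / total if total else 0.0
```

Calling `fraud_percentage(class_counts("train.csv"))` on the full dataset should give a value close to the 0.17% quoted above.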

To download the dataset (~144 MB), visit the link below. Note that a Kaggle account may be required.

https://www.kaggle.com/mlg-ulb/creditcardfraud

We rename the CSV file containing the dataset to train.csv and archive it into a ZIP file to upload it to DLS.
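The rename-and-archive step can also be scripted. This is a small sketch with a hypothetical helper name; it assumes the downloaded file is called creditcard.csv and that DLS only needs a ZIP containing train.csv at its top level. Rather than renaming the file on disk, it stores the CSV under the new name inside the archive.

```python
import zipfile
from pathlib import Path


def package_for_dls(csv_path, zip_path, name_in_archive="train.csv"):
    """Write csv_path into a new ZIP archive under the name train.csv."""
    csv_path = Path(csv_path)
    zip_path = Path(zip_path)
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # arcname controls the file name stored inside the archive.
        archive.write(csv_path, arcname=name_in_archive)
    return zip_path
```

For example, `package_for_dls("creditcard.csv", "creditcard.zip")` produces an archive that can be selected in the upload prompt.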

Upload Dataset to DLS

To upload the dataset, go to the Datasets tab and click on the Upload Dataset button in the My Datasets tab. A prompt will open where you can either drag and drop the ZIP file or browse your file directory to select it. The dataset format will be DLS Native. Click on the Start Upload button to upload the dataset to DLS. Below is a screenshot of how it should look:

Custom Neural Network Model

Creating Project in DLS

We now proceed to build our custom neural network model. Click on the Projects tab. To create a new project, click on the Create Project button. A pop-up asking for the project name, project type and description will appear. The project type for our model is Custom Neural Network. You can set the name and description as you prefer. Click on the newly created project and you will see a screen with multiple tabs, as shown below:

Setting up the Dataset in DLS

In the Data tab (shown above), select the Credit Card Fraud Detection dataset that was uploaded to DLS. We will use a shuffled 90% / 5% / 5% train/validation/test split, i.e. we will train on 256,326 examples and use 14,240 examples for validation. The test set will also have 14,240 examples. The input (InputPort0) consists of the V1-V28 columns along with the Amount and Time columns, all with the numeric data type. The output (OutputPort0) is the Class column with the categorical data type, since the model is a binary classifier (fraud or not fraud). Here is a screenshot of how it should look:
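The split sizes quoted above follow from the dataset size of 284,807 rows. The sketch below reproduces them outside DLS; the function names are hypothetical, and truncating each share to an integer is an assumption about how DLS rounds, chosen because it matches the reported numbers.

```python
import random


def split_sizes(n_samples, train_frac=0.90, val_frac=0.05, test_frac=0.05):
    """Return (train, validation, test) sizes, truncating each share to an integer."""
    return (int(n_samples * train_frac),
            int(n_samples * val_frac),
            int(n_samples * test_frac))


def shuffled_split(n_samples, seed=0):
    """Shuffle row indices and cut them into train/validation/test index lists."""
    n_train, n_val, n_test = split_sizes(n_samples)
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:n_train + n_val + n_test]
    return train, val, test


print(split_sizes(284_807))
```

Under this truncation assumption the three parts add up to 284,806, so one row is left unassigned.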

Making Custom Neural Network Model

In the Model tab, we will create the neural network for our dataset. The network consists only of fully connected (Dense) layers and Dropout layers, with Batch Normalization applied to the input features. In DLS, you drag and drop the layers from the list of available layers in the left menu and then connect them in order. For layers that require additional parameters, such as the number of units or the activation function in a Dense layer, click on the layer and enter the values in the respective options. The following is how the model looks in DLS, along with the parameters for each layer (the code for this model is presented after the hyperparameters are selected in the next step):

  • Input_1

    • Shape: (None, 30)

  • BatchNormalization_1

  • Dense_1

    • Units: 128

    • Activation: relu

  • Dropout_1

    • Rate: 0.3

  • Dense_2

    • Units: 64

    • Activation: relu

  • Dropout_2

    • Rate: 0.3

  • Dense_3

    • Units: 16

    • Activation: relu

  • Dropout_3

    • Rate: 0.3

  • Dense_4

    • Units: 2

    • Activation: softmax

  • Output_1

    • Shape: (None, 2)
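To get a feel for the size of the network listed above, its parameter counts can be worked out by hand. The sketch below is illustrative arithmetic only, with hypothetical helper names. It assumes the standard Keras behaviour that a Dense layer has one weight per input-unit pair plus one bias per unit, that BatchNormalization keeps four values per feature (two trainable, two not), and that Dropout has no parameters.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units


def batch_norm_params(n_features):
    """Return (trainable, non_trainable) parameter counts for batch normalization."""
    trainable = 2 * n_features       # gamma and beta
    non_trainable = 2 * n_features   # moving mean and moving variance
    return trainable, non_trainable


layer_widths = [30, 128, 64, 16, 2]   # input size followed by the Dense layer units
dense_total = sum(dense_params(a, b) for a, b in zip(layer_widths, layer_widths[1:]))
bn_trainable, bn_non_trainable = batch_norm_params(30)

print("Dense parameters:", dense_total)
print("Trainable total:", dense_total + bn_trainable)
print("Non-trainable total:", bn_non_trainable)
```

The network is small: almost all of its parameters sit in the first two Dense layers.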

Setting up Hyperparameters

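On the Hyperparameters tab, we will set the hyperparameters using Show Advanced Hyperparameters. The following hyperparameters are used, as reflected in the generated model code at the end of this article:

  • Optimizer: Adam

  • Loss function: categorical_crossentropy

  • Batch size: 32

  • Number of epochs: 3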
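The generated code at the end of this article returns a batch size of 32 and 3 epochs. As a rough illustration of what that means for training length, the sketch below counts the weight updates involved. The helper name is hypothetical, and it assumes the final, partially filled batch of each epoch is still processed.

```python
import math


def steps_per_epoch(n_train, batch_size):
    """Number of mini-batches needed to pass over the training set once."""
    return math.ceil(n_train / batch_size)


batch_size = 32
epochs = 3
n_train = 256_326   # training examples from the 90% split

per_epoch = steps_per_epoch(n_train, batch_size)
total_updates = per_epoch * epochs
print(per_epoch, total_updates)
```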

Training the Custom Neural Network Model

We will now train the model from the Training tab. Select the device you want to train on and click on Start Training. An ETA is shown at the top left, along with other statistics on the tab. The following is a screenshot of DLS during model training:

Training Results

Once the model has finished training, go to the Results tab to check the results. Under the Graphs option, select RunX on the left to see the graphs for the training run. As the screenshot below shows, the training loss was close to 0.0042, the training accuracy was close to 0.9993 and the validation accuracy was close to 0.9996. This indicates the model fits the dataset very well, although with so few fraud cases accuracy alone says little about how many of them are caught.
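To put these accuracy figures in context, it helps to compare them with a trivial baseline that labels every transaction as not fraud. The sketch below computes that baseline from the class counts given in the Dataset section; the function name is hypothetical.

```python
def majority_baseline_accuracy(n_minority, n_total):
    """Accuracy of always predicting the majority class."""
    return (n_total - n_minority) / n_total


baseline = majority_baseline_accuracy(n_minority=492, n_total=284_807)
print(f"Always predicting 'not fraud' scores {baseline:.4%} accuracy")
```

The reported accuracies are above this baseline, but only by a small margin, which is why per-class errors are worth inspecting on the test set.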

On the Configuration option, you can see all the layers and hyperparameters that were used for the given training run.

Inference Testing Data

On the Inference/Deploy tab, we will now test the model on the test set of 14,240 examples. Select the Dataset Inference tab and select Testing as the Dataset Source. DLS shows the predicted and actual results side by side, and the results can also be downloaded to compute additional metrics if needed. The following is a screenshot:

On analyzing the downloaded results, we find that 6 predictions were incorrect, or about 0.04% of the test set. Of these six errors, two were false positives and four were false negatives.
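The kind of analysis described above can be sketched as follows. The function names and the toy labels are hypothetical and are not the article's results; the layout of the file DLS exports is not shown here, so the sketch simply assumes the actual and predicted classes have already been read into two lists, with 1 meaning fraud.

```python
def confusion_counts(actual, predicted, positive=1):
    """Return (tp, fp, fn, tn) for binary labels, treating `positive` as fraud."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall(tp, fp, fn):
    """Precision and recall, returning 0.0 when a denominator is empty."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy labels, NOT the article's results (1 = fraud, 0 = not fraud).
actual = [0, 0, 1, 1, 0, 1, 0, 0]
predicted = [0, 1, 1, 0, 0, 1, 0, 0]
tp, fp, fn, tn = confusion_counts(actual, predicted)
precision, recall = precision_recall(tp, fp, fn)

# Error rate implied by the figures reported above: 6 wrong out of 14,240.
reported_error_rate = 6 / 14_240
```

Precision and recall on the fraud class are more informative here than overall accuracy, because they are computed only from the transactions that were predicted or known to be fraud.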

Model Code

# Import from the public tensorflow.keras API; the private
# tensorflow.python.keras module paths can change between releases.
from tensorflow.keras.layers import BatchNormalization, Dense, Dropout, Input
from tensorflow.keras.models import Model


def get_model():
	aliases = {}
	# 30 input features: Time, V1-V28 and Amount.
	Input_1 = Input(shape=(30,), name='Input_1')
	# Normalize the raw inputs before the first Dense layer.
	BatchNormalization_1 = BatchNormalization(name='BatchNormalization_1')(Input_1)
	# Three Dense + Dropout stages of decreasing width.
	Dense_1 = Dense(name='Dense_1', units=128, activation='relu')(BatchNormalization_1)
	Dropout_1 = Dropout(name='Dropout_1', rate=0.3)(Dense_1)
	Dense_2 = Dense(name='Dense_2', units=64, activation='relu')(Dropout_1)
	Dropout_2 = Dropout(name='Dropout_2', rate=0.3)(Dense_2)
	Dense_3 = Dense(name='Dense_3', units=16, activation='relu')(Dropout_2)
	Dropout_3 = Dropout(name='Dropout_3', rate=0.3)(Dense_3)
	# Two-way softmax output: one unit per class.
	Dense_4 = Dense(name='Dense_4', units=2, activation='softmax')(Dropout_3)

	model = Model([Input_1], [Dense_4])
	return aliases, model


from tensorflow.keras.optimizers import Adam

def get_optimizer():
	return Adam()

def is_custom_loss_function():
	return False

def get_loss_function():
	return 'categorical_crossentropy'

def get_batch_size():
	return 32

def get_num_epoch():
	return 3

import json

def get_data_config():
	# Time, V1-V28 and Amount all feed InputPort0 as unscaled numeric columns.
	input_columns = ['Time'] + ['V%d' % i for i in range(1, 29)] + ['Amount']
	mapping = {
		name: {"type": "Numeric", "port": "InputPort0", "shape": "", "options": {"Scaling": 1, "Normalization": False}}
		for name in input_columns
	}
	# Class is the categorical target on OutputPort0.
	mapping["Class"] = {"type": "Categorical", "port": "OutputPort0", "shape": "", "options": {}}
	config = {
		"mapping": mapping,
		"numPorts": 1,
		"samples": {"training": 256326, "validation": 14240, "test": 14240, "split": 5},
		"dataset": {"name": "Credit Card", "type": "private", "samples": 284807},
		"datasetLoadOption": "batch",
		"shuffle": True,
		"kfold": 1,
	}
	# Return the configuration as a JSON string.
	return json.dumps(config)
