Credit card fraud detection using incremental feature learning

Date
2023-01
Publisher
Faculty of Graduate Studies and Research, University of Regina
Abstract

Credit cards are one of the most popular payment methods, so detecting credit card fraud is essential: fraud can cause huge losses for cardholders. Many studies have therefore proposed standard machine learning methods, with only limited use of incremental learning, to build a robust detection system. None of these studies addresses all of the challenges of credit card fraud detection together, because the real-world scenario and the available data are complicated. These challenges include the rapid data arrival rate, concept drift, which causes model performance to decline over time, and data sensitivity, which limits the number of instances available for training a model. We propose a chunk-based credit card fraud detection model based on incremental feature learning and transfer learning. This approach lets the model adjust its topology to find a near-optimal solution for the problem at hand. It creates sub-models for each chunk and builds the predictive model from the sub-models most relevant to the current data distribution. As a result, we do not need to store all the transactions, and we can avoid unbounded model growth by setting a limit on the number of sub-models used. Few datasets for credit card fraud detection are available because of the data sensitivity issue, so we evaluate our approach on two existing datasets: a mid-scale dataset consisting of two days of European cardholders’ transactions in September 2013, and a large-scale dataset consisting of six months of transactions in 2019. We split each dataset into a different number of chunks so that we can train and test our approach incrementally. We compare our approach with a static model trained on the initial chunk and with a model re-trained on each chunk. Moreover, we vary the number of sub-models to evaluate its impact on performance.
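
The chunk-based idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the nearest-centroid learner stands in for the incremental feature learning network, "relevance" is assumed here to mean accuracy on the newest chunk, one sub-model is fitted per chunk, and all names (`CentroidModel`, `ChunkEnsemble`, `max_submodels`) are hypothetical. It only shows how a cap on the number of sub-models bounds model growth without storing past transactions.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence

Row = Sequence[float]


@dataclass
class CentroidModel:
    """Toy sub-model (assumption): one centroid per class label."""
    centroids: Dict[int, List[float]]

    @classmethod
    def fit(cls, X: List[Row], y: List[int]) -> "CentroidModel":
        centroids = {}
        for label in sorted(set(y)):
            rows = [x for x, t in zip(X, y) if t == label]
            # Column-wise mean of all rows with this label.
            centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return cls(centroids)

    def predict(self, x: Row) -> int:
        def sq_dist(c: List[float]) -> float:
            return sum((a - b) ** 2 for a, b in zip(x, c))
        # Label of the nearest centroid.
        return min(self.centroids, key=lambda lbl: sq_dist(self.centroids[lbl]))


def accuracy(model: CentroidModel, X: List[Row], y: List[int]) -> float:
    return sum(model.predict(x) == t for x, t in zip(X, y)) / len(y)


class ChunkEnsemble:
    """Keeps at most `max_submodels` sub-models, ranked by relevance."""

    def __init__(self, max_submodels: int) -> None:
        self.max_submodels = max_submodels
        self.submodels: List[CentroidModel] = []

    def update(self, X: List[Row], y: List[int]) -> None:
        # Fit a sub-model on the new chunk only; the chunk can then be discarded.
        # Inserting at the front makes the stable sort prefer newer models on ties.
        self.submodels.insert(0, CentroidModel.fit(X, y))
        # Assumed relevance measure: accuracy on the newest chunk.
        self.submodels.sort(key=lambda m: accuracy(m, X, y), reverse=True)
        # Drop the least relevant sub-models so the ensemble cannot grow forever.
        del self.submodels[self.max_submodels:]

    def predict(self, x: Row) -> int:
        # Majority vote over 0/1 labels; a tie is flagged as fraud (1).
        votes = [m.predict(x) for m in self.submodels]
        return int(sum(votes) * 2 >= len(votes))


ensemble = ChunkEnsemble(max_submodels=2)
ensemble.update([[0.0], [1.0], [10.0], [11.0]], [0, 0, 1, 1])
ensemble.update([[0.5], [1.5], [10.5], [11.5]], [0, 0, 1, 1])
ensemble.update([[4.0], [5.0], [20.0], [21.0]], [0, 0, 1, 1])
```

After the third call to `update`, the ensemble holds only two sub-models even though three chunks have been seen, which is the bounded-growth property the abstract refers to.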

Description
A Thesis Submitted to the Faculty of Graduate Studies and Research In Partial Fulfillment of the Requirements for the Degree of Master of Science in Computer Science, University of Regina. xiii, 125 p.