
ymurong/DSP



UvA Deadlines

26-30 Sept: Meeting with the supervisor to discuss the current set of ideas.

24-28 Oct: End-ideation meeting. Groups arrange a meeting with the supervisor, and preferably with stakeholders, to discuss the idea they chose to work out until the end of the project.

21-25 Nov: Mid-proto meeting. Groups arrange a meeting with the supervisor to discuss the current state of the prototype.

1-9 Dec: End-proto meeting. Groups arrange a meeting with the supervisor (and preferably with stakeholders) to discuss how they intend to implement and evaluate the current prototype.

Presentation Slides

Ideation Phase Slides Link

Prototype Phase Slides Link

Pre-Final Phase Slides Link

Explorative Analysis

Does the dataset contain missing values, outliers, or bias?

Related Notebooks: descriptive
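The checks named above can be sketched with the standard library alone. This is a minimal illustration, not the code from the descriptive notebook: the toy rows, the `card_country` and `has_fraud` field names, and the choice of Tukey's 1.5 x IQR rule for outliers are all assumptions. Only `eur_amount` is a name taken from this README. The fraud share is included because class imbalance is one common source of bias in fraud data.

```python
from statistics import quantiles


def missing_share(rows, column):
    """Fraction of rows in which `column` is None."""
    return sum(row[column] is None for row in rows) / len(rows)


def iqr_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return [v for v in values if v < q1 - k * iqr or v > q3 + k * iqr]


# Toy data; field names other than eur_amount are hypothetical.
transactions = [
    {"eur_amount": 10.0, "card_country": "NL", "has_fraud": False},
    {"eur_amount": 12.0, "card_country": "NL", "has_fraud": False},
    {"eur_amount": 11.0, "card_country": None, "has_fraud": False},
    {"eur_amount": 13.0, "card_country": "BE", "has_fraud": False},
    {"eur_amount": 9.0, "card_country": "NL", "has_fraud": False},
    {"eur_amount": 500.0, "card_country": "NL", "has_fraud": True},
]

amounts = [tx["eur_amount"] for tx in transactions]
country_missing = missing_share(transactions, "card_country")
amount_outliers = iqr_outliers(amounts)
# Share of fraudulent rows: a quick look at class imbalance.
fraud_share = sum(tx["has_fraud"] for tx in transactions) / len(transactions)
```

On real data the same three numbers (missing share per column, outlier list per numeric column, fraud share) give a quick first picture before any modelling.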

Visualization & Inferential Analysis

This analysis allows us to select the most relevant features and, where possible, construct new features that correlate with fraud, based on the given dataset.
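As a rough illustration of how the association between a categorical feature and the fraud label can be scored, the sketch below implements Cramér's V (symmetric) and Theil's U (asymmetric) in plain Python. It is an assumption-laden sketch, not the notebooks' code: the function names and toy data are hypothetical, and no bias correction is applied to Cramér's V.

```python
from collections import Counter
from math import log, sqrt


def cramers_v(xs, ys):
    """Cramér's V between two equally long categorical sequences (uncorrected)."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_counts, y_counts = Counter(xs), Counter(ys)
    chi2 = 0.0
    for x, nx in x_counts.items():
        for y, ny in y_counts.items():
            expected = nx * ny / n  # cell count expected under independence
            chi2 += (joint[(x, y)] - expected) ** 2 / expected
    k = min(len(x_counts), len(y_counts)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


def entropy(values):
    """Shannon entropy (natural log) of a categorical sequence."""
    n = len(values)
    return -sum(c / n * log(c / n) for c in Counter(values).values())


def theils_u(xs, ys):
    """Uncertainty coefficient U(x|y): share of the entropy of x explained by y."""
    h_x = entropy(xs)
    if h_x == 0:
        return 1.0  # x is constant, so nothing is left to explain
    n = len(xs)
    y_counts = Counter(ys)
    # Conditional entropy H(x|y) = -sum over (x, y) of p(x, y) * log(p(x, y) / p(y)).
    h_x_given_y = -sum(
        c / n * log(c / y_counts[y]) for (_, y), c in Counter(zip(xs, ys)).items()
    )
    return (h_x - h_x_given_y) / h_x


# Toy data: a hypothetical categorical feature against a fraud flag.
fraud = [1, 1, 0, 0]
aligned_feature = ["a", "a", "b", "b"]    # determines the label completely
unrelated_feature = ["a", "b", "a", "b"]  # independent of the label

v_aligned = cramers_v(aligned_feature, fraud)
v_unrelated = cramers_v(unrelated_feature, fraud)
u_aligned = theils_u(fraud, aligned_feature)
u_unrelated = theils_u(fraud, unrelated_feature)
```

Both scores range from 0 (no association) to 1 (full association). Theil's U is directional, so `theils_u(fraud, feature)` answers how much of the uncertainty about fraud the feature removes, which is usually the direction of interest for feature selection.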

Related Notebooks:

  • Time Independent

    • Correlation matrix (Cramér's V, Theil's U) between categorical features and fraud
    • Rank-sum test on EUR amounts
    • PCA visualization to detect patterns between features and fraud (eur_amount included and excluded)
    • (WIP) Risk score based on historical transactions (new feature)
      • Does a client/IP whose transaction amount distribution differs from the general non-fraud distribution indicate a higher risk of fraud (odds ratio)? Justify this by finding the accounts/IPs whose distribution differs (by hypothesis testing) and visualizing their fraud cases. Based on this, possibly construct a risk score (high, medium, low) for each account/IP, then test the risk score's correlation with fraud.
      • Does a client/IP with previous fraud indicate a higher risk of fraud? Use weighted historical fraud counts.
  • Time Dependent (construct new features based on the given dataset)

    • Cumulative frauds within a given time window (same IP, same account)
    • EUR amount outliers within a given past time range
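The time-dependent idea of counting past frauds for the same IP or account within a window can be sketched as follows. This is a hypothetical illustration rather than the project's feature code: the `ip`, `timestamp`, and `has_fraud` field names, the `past_fraud_counts` helper, and the 24-hour window are assumptions. Only frauds that precede a transaction in time order are counted, so a transaction's own label does not leak into its feature.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def past_fraud_counts(transactions, group_field, window):
    """For each transaction (in time order), count the frauds that precede it,
    share its `group_field` value, and are less than `window` old."""
    recent_frauds = defaultdict(deque)  # group value -> timestamps of past frauds
    counts = []
    for tx in sorted(transactions, key=lambda t: t["timestamp"]):
        frauds = recent_frauds[tx[group_field]]
        # Drop frauds that have fallen out of the look-back window.
        while frauds and frauds[0] <= tx["timestamp"] - window:
            frauds.popleft()
        counts.append(len(frauds))
        # Record the current fraud only after computing its own feature.
        if tx["has_fraud"]:
            frauds.append(tx["timestamp"])
    return counts


# Toy data; all field names are hypothetical.
t0 = datetime(2022, 1, 1, 12, 0)
transactions = [
    {"ip": "A", "timestamp": t0, "has_fraud": True},
    {"ip": "A", "timestamp": t0 + timedelta(hours=1), "has_fraud": False},
    {"ip": "B", "timestamp": t0 + timedelta(hours=2), "has_fraud": True},
    {"ip": "A", "timestamp": t0 + timedelta(hours=3), "has_fraud": True},
    {"ip": "A", "timestamp": t0 + timedelta(hours=26), "has_fraud": False},
]

counts_24h = past_fraud_counts(transactions, "ip", timedelta(hours=24))
```

The returned list follows the time-sorted order of the transactions. Passing a different `group_field` (for example an account identifier) or a different `window` gives the other variants of the feature.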

Feature Engineering

Check the FE_README.md for details.

Classifier Training/Evaluation

Check the CLASSIFIER_README.md for details.

Backend

Check the BACKEND_README.md for details.

When running locally, the backend serves interactive OpenAPI documentation at http://127.0.0.1:8000/docs

Dashboard

Check the DASHBOARD_README.md for details.

References

"APATE: A novel approach for automated credit card transaction fraud detection using network-based extensions" by Véronique Van Vlasselaer, Cristián Bravo, Olivier Caelen, Tina Eliassi-Rad, Leman Akoglu, Monique Snoeck, and Bart Baesens

A graph-based, semi-supervised, credit card fraud detection system

Towards automated feature engineering for credit card fraud detection using multi-perspective HMMs

Assessment and mitigation of fairness issues in credit-card default models