Data & Automation Analyst

Liam Bennett

Turn messy data into better decisions

Python, SQL, Machine Learning, Elasticsearch, Fraud Detection, ETL Pipeline Development, Scikit-learn, Tableau, Predictive Modelling, Data Engineering, Flask, Statistical Analysis, XGBoost, Pandas, Decision Science

About Me

Bio

Data & Automation Analyst at Ditto Music, where I've built fraud detection pipelines, automated QC systems, and royalty reporting infrastructure from the ground up — saving hundreds of hours of manual work in the process. I'm drawn to problems where data can genuinely change an outcome, whether that's catching fraudulent streams or predicting what a customer does next. Currently developing my machine learning toolkit with a focus on predictive modelling and decision science.

Current Focus

  • Role: Data & Automation Analyst
  • Specialty: Analytics & Modelling
  • Location: Liverpool, UK
  • Education: University of Chester (UOC)

Experience

Ditto Music

Data & Automation Analyst

Ditto Music's first dedicated analyst, building fraud detection, automated QC, and commercial reporting infrastructure from scratch across a catalogue of 2M+ artists.

  • Built DEEPFRAUD, a 6-signal fraud detection pipeline flagging ~3,000 suspicious accounts and 40M+ streams per month.
  • Drove AutoQC automated pass rate from 1% to 19%, processing 20,000+ releases per week.
  • Designed a weekly top-1,000 earner tracking pipeline across a 2M+ artist database, used by global RLS clients.
  • Built multi-source Python ETL pipelines pulling from Elasticsearch, MySQL, and internal databases for rights and royalty reporting.
  • Developed a confidence scoring system combining multiple fraud signals into a single ranked output for the fraud review team.
University of Chester

BSc Computer Science

Studied core computer science fundamentals with a focus on data, machine learning, and software development — culminating in a dissertation on fraud detection using supervised and unsupervised ML.

  • Built and evaluated supervised (Random Forest) and unsupervised (Isolation Forest) models for fraud classification as part of dissertation research.
  • Developed full-stack web applications using Python, Flask, HTML and CSS.
  • Applied NLP techniques including transformer models (T5) and entity extraction with spaCy.
  • Gained foundational experience in data preprocessing, feature engineering, and model evaluation metrics.
  • Worked with SQL for data querying and manipulation across multiple projects.
  • Introduced to ETL concepts through real-time data pipeline and dashboard projects.

Projects

LIVE

DEEPFRAUD

Fraud detection pipeline analysing 40M+ streams/month across Ditto Music's catalogue. Six weighted signals produce a unified artist risk score.

~3K
Flagged/month
40M+
Streams/month
Python, Elasticsearch
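
A minimal sketch of how several weighted signals might be folded into one ranked risk score. The signal names, the weights, and the assumption that each signal is already scaled to the range 0 to 1 are all hypothetical; the six signals DEEPFRAUD actually uses are not described here.

```python
# Hypothetical signal names and weights, for illustration only.
# The weights sum to 1 so the combined score also stays in [0, 1].
WEIGHTS = {
    "signal_1": 0.25,
    "signal_2": 0.20,
    "signal_3": 0.20,
    "signal_4": 0.15,
    "signal_5": 0.10,
    "signal_6": 0.10,
}


def risk_score(signals):
    """Weighted sum of per-signal scores, each clamped to [0, 1].

    A signal missing from the input counts as 0.
    """
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = min(max(signals.get(name, 0.0), 0.0), 1.0)
        total += weight * value
    return total


def rank_artists(artist_signals):
    """Return (artist_id, score) pairs, highest risk first."""
    scored = [
        (artist_id, risk_score(signals))
        for artist_id, signals in artist_signals.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    example = {
        "artist_a": {"signal_1": 0.9, "signal_2": 0.8},
        "artist_b": {"signal_5": 0.2},
    }
    for artist_id, score in rank_artists(example):
        print(artist_id, round(score, 3))
```

A plain weighted sum keeps each signal's contribution easy to explain to a review team, which is one reason to prefer it over an opaque model for a first ranked output.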
LIVE

AutoQC Reporting Suite

QC analytics pipeline across 20K+ weekly releases at Ditto Music. Data-driven iterations drove the automated pass rate from 1% to 19%.

20K+
Weekly releases
19%
Pass rate achieved
Python, SQL, MySQL
LIVE

Account Performance Dashboard

Weekly pipeline surfacing Ditto Music's top 1,000 royalty earners from 2M+ artists. Powers commercial decisions for global RLS clients.

2M+
Artist database
1K
Top earners tracked
Python, Elasticsearch, Tableau
PROTOTYPE

Fraud Detection Tool

Dissertation ML app detecting fraud in CSV transaction data. Auto-selects Random Forest or Isolation Forest based on whether labels are present.

2
ML Models
Python, Flask, scikit-learn
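
The label-based model selection described above could look roughly like the sketch below. The label column name `is_fraud`, the rule that every row must carry a label, and the model parameters are assumptions for illustration, not the dissertation's actual code.

```python
import csv
import io

LABEL_COLUMN = "is_fraud"  # hypothetical column name


def read_rows(csv_text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def has_labels(rows, label_column=LABEL_COLUMN):
    """True if there is at least one row and every row has a non-empty label."""
    return bool(rows) and all(
        (row.get(label_column) or "").strip() for row in rows
    )


def choose_model_kind(rows, label_column=LABEL_COLUMN):
    """Supervised model when labels exist, anomaly detection otherwise."""
    if has_labels(rows, label_column):
        return "random_forest"
    return "isolation_forest"


def build_model(kind):
    """Instantiate the chosen scikit-learn estimator.

    Imported lazily so the selection logic above runs without
    scikit-learn installed.
    """
    from sklearn.ensemble import IsolationForest, RandomForestClassifier

    if kind == "random_forest":
        return RandomForestClassifier(n_estimators=100, random_state=0)
    return IsolationForest(random_state=0)


if __name__ == "__main__":
    labelled = read_rows("amount,is_fraud\n10,0\n999,1\n")
    unlabelled = read_rows("amount\n10\n999\n")
    print(choose_model_kind(labelled))
    print(choose_model_kind(unlabelled))
```

Keeping the selection step separate from model construction means the choice can be checked on a file's contents alone, before any training starts.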
PROTOTYPE

Loan Default Prediction

ML web app predicting loan default risk from applicant financial data. Returns a confidence score and the key risk factors driving the decision.

Python, Flask, scikit-learn
PROTOTYPE

Text2SQLAI

NLP tool converting plain English questions into SQL using a fine-tuned T5 transformer. Rule-based fallback ensures reliability when the model underperforms.

Python, Flask, T5, spaCy
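
One way a rule-based fallback can sit behind a model is sketched below. Everything here is an assumption for illustration: `model_generate` stands in for the fine-tuned T5 call, "underperforms" is taken to mean the output is not a SELECT statement that SQLite can compile against the schema, and the single "how many" rule is a placeholder for a real rule set.

```python
import re
import sqlite3


def is_valid_sql(sql, conn):
    """Accept only a SELECT that SQLite can compile against the schema."""
    if not sql.strip().lower().startswith("select"):
        return False
    try:
        # EXPLAIN compiles the statement without running the query itself.
        conn.execute("EXPLAIN " + sql)
        return True
    except (sqlite3.Error, sqlite3.Warning):
        return False


def rule_based_sql(question):
    """Hypothetical fallback covering one pattern: 'how many <table>'."""
    match = re.search(r"how many (\w+)", question.lower())
    if match:
        return f"SELECT COUNT(*) FROM {match.group(1)}"
    return None


def to_sql(question, model_generate, conn):
    """Return (sql, source), where source is 'model', 'rules' or 'none'."""
    candidate = model_generate(question)
    if candidate and is_valid_sql(candidate, conn):
        return candidate, "model"
    fallback = rule_based_sql(question)
    if fallback and is_valid_sql(fallback, conn):
        return fallback, "rules"
    return None, "none"


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")

    def broken_model(question):
        # Stand-in for a model call that returns malformed SQL.
        return "SELEC broken"

    print(to_sql("How many customers are there?", broken_model, conn))
```

Validating both the model output and the fallback against the schema means the caller either gets a query that compiles or an explicit signal that no query could be produced.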