Data & Automation Analyst

Liam Bennett

Turn messy data into better decisions

Python, SQL, Machine Learning, Elasticsearch, Fraud Detection, ETL Pipeline Development, Scikit-learn, Tableau, Predictive Modelling, Data Engineering, Flask, Statistical Analysis, XGBoost, Pandas, Decision Science

About Me

Bio

I'm a Data & Automation Analyst at Ditto Music, where I've built fraud detection pipelines, automated QC systems, and royalty reporting infrastructure from the ground up, saving hundreds of hours of manual work in the process. I'm drawn to problems where data can genuinely change an outcome, whether that's catching fraudulent streams or predicting what a customer does next. I'm currently developing my machine learning toolkit, with a focus on predictive modelling and decision science.

Current Focus

Role: Data & Automation Analyst
Specialty: Analytics & Modelling
Location: Liverpool, UK
Education: University of Chester

Experience

Data & Automation Analyst

Ditto Music

Ditto Music's first dedicated analyst, building fraud detection, automated QC, and commercial reporting infrastructure from scratch across a catalogue of 2M+ artists.

Built DEEPFRAUD, a 6-signal fraud detection pipeline flagging ~3,000 suspicious accounts and 40M+ streams per month.
Drove AutoQC automated pass rate from 1% to 19%, processing 20,000+ releases per week.
Designed a weekly top-1,000 earner tracking pipeline across a 2M+ artist database, used by global RLS clients.
Built multi-source Python ETL pipelines pulling from Elasticsearch, MySQL, and internal databases for rights and royalty reporting.
Developed a confidence scoring system combining multiple fraud signals into a single ranked output for the fraud review team.

BSc Computer Science

University of Chester

Studied core computer science fundamentals with a focus on data, machine learning, and software development — culminating in a dissertation on fraud detection using supervised and unsupervised ML.

Built and evaluated supervised (Random Forest) and unsupervised (Isolation Forest) models for fraud classification as part of dissertation research.
Developed full-stack web applications using Python, Flask, HTML and CSS.
Applied NLP techniques including transformer models (T5) and entity extraction with spaCy.
Gained foundational experience in data preprocessing, feature engineering, and model evaluation metrics.
Worked with SQL for data querying and manipulation across multiple projects.
Introduced to ETL concepts through real-time data pipeline and dashboard projects.

Projects

LIVE

DEEPFRAUD

Fraud detection pipeline analysing 40M+ streams/month across Ditto Music's catalogue. Six weighted signals produce a unified artist risk score.

Flagged/month: ~3,000 accounts
Streams/month: 40M+

Python, Elasticsearch
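As an illustration of how several weighted signals can be combined into one risk score, here is a minimal sketch. It is not the DEEPFRAUD implementation: the signal names (`signal_1` to `signal_6`), the weights, and the assumption that each signal is already scaled to the range 0 to 1 are all placeholders chosen for the example.

```python
from __future__ import annotations

# Hypothetical signal names and weights, for illustration only.
# The real pipeline's signals and weights are not described here.
WEIGHTS = {
    "signal_1": 0.25,
    "signal_2": 0.20,
    "signal_3": 0.20,
    "signal_4": 0.15,
    "signal_5": 0.10,
    "signal_6": 0.10,
}


def risk_score(signals: dict[str, float], weights: dict[str, float] = WEIGHTS) -> float:
    """Return the weighted average of the signals, each clamped to [0, 1].

    Missing signals are treated as 0.0 (an assumption made for this sketch).
    """
    total_weight = sum(weights.values())
    score = 0.0
    for name, weight in weights.items():
        value = min(max(signals.get(name, 0.0), 0.0), 1.0)
        score += weight * value
    return score / total_weight


def rank_artists(artist_signals: dict[str, dict[str, float]]) -> list[tuple[str, float]]:
    """Score every artist and return (artist, score) pairs, highest risk first."""
    scored = [(artist, risk_score(sig)) for artist, sig in artist_signals.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    example = {
        "artist_a": {"signal_1": 0.9, "signal_2": 0.8, "signal_4": 0.5},
        "artist_b": {"signal_3": 0.2},
    }
    for artist, score in rank_artists(example):
        print(f"{artist}: {score:.3f}")
```

Dividing by the total weight keeps the score between 0 and 1 even if the weights are later retuned and no longer sum to exactly 1, which makes a fixed review threshold easier to reason about.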

LIVE

AutoQC Reporting Suite

QC analytics pipeline across 20K+ weekly releases at Ditto Music. Data-driven iterations drove the automated pass rate from 1% to 19%.

Weekly releases: 20K+

Pass rate achieved: 19%

Python, SQL, MySQL

LIVE

Account Performance Dashboard

Weekly pipeline surfacing Ditto Music's top 1,000 royalty earners from 2M+ artists. Powers commercial decisions for global RLS clients.

Artist database: 2M+
Top earners tracked: 1,000

Python, Elasticsearch, Tableau

PROTOTYPE

Fraud Detection Tool

Dissertation ML app detecting fraud in CSV transaction data. Auto-selects Random Forest or Isolation Forest based on whether labels are present.

Python, Flask, scikit-learn
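The label-based model selection described for this tool could look roughly like the sketch below. The label column name `is_fraud`, the function names, and the hyperparameters are assumptions made for illustration; the dissertation app's actual logic may differ.

```python
from typing import Sequence


def choose_model_name(columns: Sequence[str], label_column: str = "is_fraud") -> str:
    """Pick a model family based on whether the CSV has a label column.

    `label_column` is a hypothetical name; a labelled dataset allows supervised
    learning, otherwise fall back to unsupervised anomaly detection.
    """
    return "random_forest" if label_column in columns else "isolation_forest"


def build_model(name: str):
    """Instantiate the chosen scikit-learn estimator."""
    # Imported lazily so the selection logic works without scikit-learn installed.
    from sklearn.ensemble import IsolationForest, RandomForestClassifier

    if name == "random_forest":
        return RandomForestClassifier(n_estimators=100, random_state=0)
    if name == "isolation_forest":
        return IsolationForest(contamination="auto", random_state=0)
    raise ValueError(f"Unknown model name: {name}")


if __name__ == "__main__":
    print(choose_model_name(["amount", "merchant", "is_fraud"]))
    print(choose_model_name(["amount", "merchant"]))
```

Keeping the decision in a small pure function makes it easy to unit test separately from model training.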

PROTOTYPE

Loan Default Prediction

ML web app predicting loan default risk from applicant financial data. Returns a confidence score and the key risk factors driving the decision.

Python, Flask, scikit-learn

PROTOTYPE

Text2SQLAI

NLP tool converting plain English questions into SQL using a fine-tuned T5 transformer. A rule-based fallback handles queries where the model underperforms.

Python, Flask, T5, spaCy
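One way a rule-based fallback can wrap a text-to-SQL model is sketched below. Everything here is hypothetical: the `model` argument stands in for the fine-tuned T5 call, the validity check is a deliberately simple regular expression, and the two fallback rules are examples rather than the project's actual rule set.

```python
import re
from typing import Callable


def looks_like_select(sql: str) -> bool:
    """Very rough check that the text has the shape SELECT ... FROM <table>."""
    pattern = r"^\s*SELECT\s.+\sFROM\s+\w+"
    return bool(re.match(pattern, sql, flags=re.IGNORECASE | re.DOTALL))


def rule_based_sql(question: str, table: str) -> str:
    """Tiny illustrative rule set: counting questions versus everything else."""
    q = question.lower()
    if "how many" in q or "count" in q:
        return f"SELECT COUNT(*) FROM {table}"
    return f"SELECT * FROM {table}"


def text_to_sql(question: str, table: str, model: Callable[[str], str]) -> str:
    """Try the model first; use the rules if it fails or returns malformed SQL."""
    try:
        candidate = model(question)
    except Exception:
        candidate = ""
    if looks_like_select(candidate):
        return candidate.strip()
    return rule_based_sql(question, table)


if __name__ == "__main__":
    # Stand-in "models" for demonstration only.
    def broken_model(question: str) -> str:
        return "users name ???"

    def good_model(question: str) -> str:
        return "SELECT name FROM users"

    print(text_to_sql("How many users are there?", "users", broken_model))
    print(text_to_sql("List user names", "users", good_model))
```

A production version would more likely validate the candidate with a real SQL parser and against the known schema; the regular expression only keeps this sketch self-contained.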