Open Source | Python | University

Chess Anomaly Detection

Machine learning system for behavioral anomaly detection in online chess, comparing six unsupervised detectors and an ensemble review workflow.

Role: ML Project Contributor
Period: 2026
Signal: 100/100 on the machine learning final project
Why it matters

LOF reached 0.971 test AUC on subtle synthetic injection; ensemble flagged 312 of 17,909 players for review.

Chess Anomaly Detection is a machine learning project for detecting unusual behavioral patterns in online chess. Lichess game data is reshaped into player-level records, behavioral and engine-accuracy features are engineered, and six unsupervised detectors are compared against synthetic anomaly injections.

The goal is deliberately conservative: surface statistically unusual behavior clusters for human review, not issue definitive cheating labels. The strongest single detector was LOF (Local Outlier Factor), with 0.971 test AUC on the subtle synthetic injection, while the ensemble flagged 312 of 17,909 players (about 1.7%) for human review.

Tech Stack

Python, scikit-learn, PyTorch, pandas, UMAP, SHAP, pytest, Jupyter

Key Features

  • Player-level behavioral feature engineering from chess game data
  • Six unsupervised anomaly detectors compared with shared validation
  • Synthetic anomaly injection for measurable evaluation
  • Majority-vote ensemble for review-oriented flagging
  • Precomputed results, charts, report, poster, and presentation deliverables
  • 100/100 machine learning final project result

Technical Highlights

  • LOF achieved 0.971 test AUC on subtle synthetic injection
  • Ensemble flagged 312 of 17,909 players for review
  • Models compared: LOF, Isolation Forest, One-Class SVM, Autoencoder, ACPLSubAutoencoder, and a Z-score baseline
  • Feature importance, UMAP, ROC curves, model agreement, and learning curves documented
  • Unit tests verify feature engineering and validation logic
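
To make the synthetic-injection AUC figure concrete, here is a minimal sketch of how such an evaluation might be set up with scikit-learn. Everything in it is an assumption for illustration: the random feature matrix, the `inject_anomalies` helper, the shift size, and the LOF settings are hypothetical and are not the project's actual data or configuration.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.neighbors import LocalOutlierFactor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical stand-in for player-level features (unit-variance columns).
X_train = rng.normal(size=(2000, 5))
X_test_normal = rng.normal(size=(500, 5))


def inject_anomalies(X, n, shift, rng):
    """Copy n random rows and shift their first two features by `shift`."""
    rows = X[rng.choice(len(X), size=n, replace=False)].copy()
    rows[:, :2] += shift
    return rows


# Build a labeled test set: 0 = original player, 1 = injected anomaly.
X_injected = inject_anomalies(X_test_normal, n=50, shift=3.0, rng=rng)
X_test = np.vstack([X_test_normal, X_injected])
y_test = np.concatenate([np.zeros(len(X_test_normal)), np.ones(len(X_injected))])

# Fit the scaler and detector on unlabeled training data only.
scaler = StandardScaler().fit(X_train)
lof = LocalOutlierFactor(n_neighbors=20, novelty=True).fit(scaler.transform(X_train))

# score_samples is higher for inliers, so negate it to get an anomaly score.
anomaly_score = -lof.score_samples(scaler.transform(X_test))
auc = roc_auc_score(y_test, anomaly_score)
print(f"test AUC on injected anomalies: {auc:.3f}")
```

Because the labels exist only for the injected rows, the AUC measures how well a detector ranks the synthetic anomalies above ordinary players; it says nothing directly about real cheating.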

Architecture

Data Pipeline

  • Load raw chess game data
  • Aggregate games into player-level records
  • Engineer behavioral and engine-accuracy features
  • Split and validate without leaking labels across phases
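
The aggregation step above can be sketched with a pandas `groupby`. The column names (`player`, `acpl`, `won`, `move_time_s`) and the chosen aggregates are assumptions for illustration, not the project's actual schema or feature set.

```python
import pandas as pd

# Hypothetical per-game rows; the real schema may differ.
games = pd.DataFrame(
    {
        "player": ["alice", "alice", "bob", "bob", "bob"],
        "acpl": [35.0, 45.0, 12.0, 10.0, 14.0],  # average centipawn loss per game
        "won": [1, 0, 1, 1, 1],
        "move_time_s": [8.0, 12.0, 5.0, 5.5, 4.5],  # mean seconds per move
    }
)

# One row per player, with behavioral and engine-accuracy summaries.
players = (
    games.groupby("player")
    .agg(
        n_games=("acpl", "size"),
        mean_acpl=("acpl", "mean"),
        win_rate=("won", "mean"),
        move_time_std=("move_time_s", "std"),
    )
    .reset_index()
)
print(players)
```

Aggregating before any train/test split by player keeps every game from one player on the same side of the split, which is one simple way to avoid leakage between phases.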

Modeling Layer

  • Compare anomaly detectors under common evaluation
  • Use synthetic injection to measure recall on subtle behaviors
  • Combine model outputs through ensemble review logic
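
One plausible shape for the majority-vote step is sketched below. The per-detector quantile threshold, the `contamination` parameter, and the `majority_vote` function itself are assumptions; the project's actual thresholds and voting rule are not described here.

```python
import numpy as np


def majority_vote(scores, contamination=0.05):
    """Flag players that more than half of the detectors consider anomalous.

    scores: dict mapping detector name -> 1-D array of anomaly scores,
            where higher means more anomalous.
    Each detector votes for the players above its own (1 - contamination)
    score quantile.
    """
    votes = []
    for s in scores.values():
        s = np.asarray(s, dtype=float)
        threshold = np.quantile(s, 1.0 - contamination)
        votes.append(s > threshold)
    votes = np.vstack(votes)
    n_votes = votes.sum(axis=0)
    flagged = n_votes > votes.shape[0] / 2
    return flagged, n_votes


# Hypothetical scores: six noisy detectors that share an underlying signal.
rng = np.random.default_rng(1)
base = rng.normal(size=1000)
base[:10] += 6.0  # ten clearly unusual players
scores = {f"detector_{i}": base + rng.normal(scale=0.3, size=1000) for i in range(6)}

flagged, n_votes = majority_vote(scores, contamination=0.02)
print(f"{flagged.sum()} players flagged for review")
```

Returning `n_votes` alongside the boolean flags lets a reviewer sort flagged players by how many detectors agreed, which ties into the model agreement analysis.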

Interpretability

  • Permutation importance for feature signal
  • UMAP views for cluster inspection
  • Model agreement analysis for review confidence
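
For an unsupervised detector, permutation importance can be approximated by shuffling one feature at a time and measuring how much the AUC against synthetic labels drops. The sketch below assumes that setup and uses a simple sum-of-squared-z-scores detector as a stand-in; the helper names, the data, and the choice of AUC drop as the importance measure are all illustrative assumptions.

```python
import numpy as np
from sklearn.metrics import roc_auc_score


def zscore_anomaly_score(X, mean, std):
    """Sum of squared z-scores per row: a simple baseline anomaly score."""
    return (((X - mean) / std) ** 2).sum(axis=1)


def permutation_importance_auc(score_fn, X, y, n_repeats=10, seed=0):
    """Mean AUC drop when each feature column is shuffled independently."""
    rng = np.random.default_rng(seed)
    baseline = roc_auc_score(y, score_fn(X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops.append(baseline - roc_auc_score(y, score_fn(X_perm)))
        importances[j] = np.mean(drops)
    return baseline, importances


# Hypothetical data: anomalies are injected into feature 0 only.
rng = np.random.default_rng(2)
X_train = rng.normal(size=(2000, 4))
X_test = rng.normal(size=(600, 4))
y_test = np.zeros(600, dtype=int)
y_test[:60] = 1
X_test[:60, 0] += 4.0

mean, std = X_train.mean(axis=0), X_train.std(axis=0)
baseline, importances = permutation_importance_auc(
    lambda X: zscore_anomaly_score(X, mean, std), X_test, y_test
)
print(f"baseline AUC: {baseline:.3f}")
print("AUC drop per feature:", np.round(importances, 3))
```

In this toy setup the shuffled feature 0 should show by far the largest drop, since it is the only column that separates injected rows from ordinary ones.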

Challenges & Solutions

1. Avoiding overclaiming in a sensitive cheating-detection domain
2. Designing meaningful synthetic anomalies for evaluation
3. Comparing unsupervised models fairly
4. Making results understandable through charts and deliverables
