Backtest Methodology
The headline hit rate and ROI you see on our signals come from a walk-forward backtest against historical matches and closing bookmaker odds. This page explains what that means, what the numbers exclude, and why they shouldn't be taken as a guaranteed forward return.
What we measure
Hit rate: The percentage of picks where the selection won. If we made 100 picks and 55 of them settled in our favour, the hit rate is 55%.
ROI: Profit and loss per unit staked, assuming a flat 1-unit bet on every qualifying pick at the best closing price across the books we track. +5% ROI means 100 units staked returned 5 units of profit overall.
Edge: Our model's probability multiplied by the offered decimal odds, minus 1. Example: model says 55% on Home at 1.90 odds → edge = 0.55 × 1.90 − 1 = 4.5%. We only bet picks with edge ≥ 2%.
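The edge, hit rate, and ROI definitions above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not our production code: the `Pick` class, the function names, and the three sample picks are hypothetical, and only the formulas and the 2% threshold come from the text.

```python
from dataclasses import dataclass


@dataclass
class Pick:
    model_prob: float    # model probability for the selection, 0..1
    decimal_odds: float  # best closing decimal odds for the selection
    won: bool            # True if the selection settled as a winner


def edge(model_prob: float, decimal_odds: float) -> float:
    """Edge = model probability x decimal odds - 1."""
    return model_prob * decimal_odds - 1.0


def hit_rate(picks: list[Pick]) -> float:
    """Share of picks whose selection won."""
    return sum(p.won for p in picks) / len(picks)


def flat_stake_roi(picks: list[Pick]) -> float:
    """Profit per unit staked, with a flat 1-unit stake on every pick."""
    profit = sum((p.decimal_odds - 1.0) if p.won else -1.0 for p in picks)
    return profit / len(picks)


MIN_EDGE = 0.02  # the 2% threshold from the text

# Made-up sample picks, for illustration only.
candidates = [
    Pick(0.55, 1.90, True),   # edge 4.5%: qualifies
    Pick(0.50, 2.00, False),  # edge 0%: filtered out
    Pick(0.40, 2.75, False),  # edge 10%: qualifies
]
qualifying = [p for p in candidates if edge(p.model_prob, p.decimal_odds) >= MIN_EDGE]

print(f"hit rate: {hit_rate(qualifying):.2f}")
print(f"ROI: {flat_stake_roi(qualifying):+.3f}")
```

Note that hit rate and ROI are computed only over picks that pass the edge filter, so a positive edge on every pick does not by itself imply a positive ROI on a small sample.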
A combined measure of how accurate and well-calibrated a probability forecast is. Lower is better; 0 is perfect.
Calibration: The average gap between predicted probability and actual frequency. If the model says 70% on a batch of picks, we want roughly 70% of them to win.
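One way to measure the gap between predicted probability and actual frequency is to bucket picks by predicted probability and compare each bucket's average prediction with its observed win rate. The sketch below assumes ten equal-width buckets weighted by pick count; the bucketing scheme and the function name `calibration_gap` are assumptions for illustration, since this page does not specify how the published figure is binned.

```python
def calibration_gap(probs: list[float], outcomes: list[int], n_bins: int = 10) -> float:
    """Count-weighted mean of |avg predicted prob - observed win rate| over equal-width bins."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and the same length")
    bins: dict[int, list[tuple[float, int]]] = {}
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the top bin
        bins.setdefault(idx, []).append((p, y))
    total = len(probs)
    gap = 0.0
    for members in bins.values():
        avg_pred = sum(p for p, _ in members) / len(members)
        win_rate = sum(y for _, y in members) / len(members)
        gap += (len(members) / total) * abs(avg_pred - win_rate)
    return gap


# Ten picks at 70%: seven winners matches the prediction, five winners is 20 points off.
well = calibration_gap([0.7] * 10, [1] * 7 + [0] * 3)
over = calibration_gap([0.7] * 10, [1] * 5 + [0] * 5)
print(round(well, 3), round(over, 3))
```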
How the backtest works
The backtest is walk-forward: when we score matches in a given season, the model is trained only on matches that finished before that season began. For the 2023 season, the model sees every match that finished before that season's first kickoff and nothing after. This prevents the model from "learning" the season it's being graded on.
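The season-level split described above can be sketched as follows. This is an illustration under assumptions, not the pipeline itself: the `Match` fields, the `walk_forward_folds` helper, and the season start dates are all hypothetical.

```python
from collections.abc import Iterator
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Match:
    match_id: str
    season: int        # season the match belongs to, e.g. 2023
    finished_on: date  # date the match was completed


def walk_forward_folds(
    matches: list[Match], season_start: dict[int, date]
) -> Iterator[tuple[int, list[Match], list[Match]]]:
    """Yield (season, train, test); train holds only matches finished before the season began."""
    for season in sorted(season_start):
        cutoff = season_start[season]
        train = [m for m in matches if m.finished_on < cutoff]
        test = [m for m in matches if m.season == season]
        if train and test:
            yield season, train, test


# Made-up matches and season start dates, for illustration only.
matches = [
    Match("a", 2021, date(2021, 9, 1)),
    Match("b", 2022, date(2022, 9, 1)),
    Match("c", 2022, date(2023, 5, 20)),
    Match("d", 2023, date(2023, 8, 19)),
]
season_start = {2022: date(2022, 8, 5), 2023: date(2023, 8, 11)}

folds = list(walk_forward_folds(matches, season_start))
for season, train, test in folds:
    print(season, [m.match_id for m in train], [m.match_id for m in test])
```

Filtering the training set by completion date rather than by season label is what keeps a late-finishing match from the previous season out of a model it should not yet know about.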
Team-level features (form, rolling xG, possession, etc.) are looked up point-in-time: the snapshot used for a match is the latest one dated on or before that match's kickoff. Our Elo rating is strictly pre-match: the rating used for a team in match N never includes the result of match N itself.
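A point-in-time lookup of this kind can be sketched with a binary search over a team's snapshot timestamps. The `latest_snapshot` helper, the snapshot layout, and the `rolling_xg` values below are assumptions made for illustration; the only rule taken from the text is "latest snapshot dated on or before kickoff".

```python
from bisect import bisect_right
from datetime import datetime
from typing import Optional


def latest_snapshot(
    snapshots: list[tuple[datetime, dict]], kickoff: datetime
) -> Optional[dict]:
    """Return the newest snapshot dated on or before kickoff, or None if there is none.

    `snapshots` must be sorted by timestamp, oldest first.
    """
    timestamps = [ts for ts, _ in snapshots]
    idx = bisect_right(timestamps, kickoff)  # number of snapshots with ts <= kickoff
    return snapshots[idx - 1][1] if idx else None


# Made-up feature snapshots for one team.
snaps = [
    (datetime(2023, 8, 1), {"rolling_xg": 1.4}),
    (datetime(2023, 8, 12, 15, 0), {"rolling_xg": 1.6}),
    (datetime(2023, 8, 20), {"rolling_xg": 1.9}),
]

print(latest_snapshot(snaps, datetime(2023, 8, 19, 12, 30)))  # snapshot from 12 Aug
print(latest_snapshot(snaps, datetime(2023, 7, 1)))           # no snapshot yet
```

Using `bisect_right` makes the cutoff inclusive, so a snapshot stamped exactly at kickoff is still eligible, while anything stamped later is never returned.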
We currently refit once per season rather than after every matchweek. A per-matchweek refit would be more rigorous (a week-20 pick would know week-19 results) but ~10–30× slower. It's on the roadmap; for now the season-granularity figure is the one we publish.
Where the odds come from
Historical bookmaker prices are sourced from football-data.co.uk's free CSV archives. These are closing odds — the last price published before kickoff — across a panel of major books (Bet365, Pinnacle, William Hill, Betway, and others).
When we compute ROI we use the best available price across that panel for each pick (marked "max" internally). This is an optimistic baseline — a retail account wouldn't always get the top of the market — but it's internally consistent across every season and league in the dataset.
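To illustrate why the best-of-panel price is an optimistic baseline, here is a small sketch that picks the highest closing price for one selection and compares the winning profit with the lowest price on the same panel. The `best_price` helper and the quoted odds are hypothetical; they are not taken from the dataset.

```python
def best_price(quotes: dict[str, float]) -> tuple[str, float]:
    """Return (book, odds) for the highest decimal odds quoted on one selection."""
    if not quotes:
        raise ValueError("no quotes for this selection")
    book = max(quotes, key=quotes.get)
    return book, quotes[book]


# Made-up closing prices for a single Home selection.
closing_home = {"Bet365": 1.87, "Pinnacle": 1.90, "William Hill": 1.85}

book, odds = best_price(closing_home)
profit_at_best = odds - 1.0                          # winning 1-unit bet at the best price
profit_at_worst = min(closing_home.values()) - 1.0   # same bet at the lowest price

print(book, odds)
print(f"profit if it wins: {profit_at_best:.2f} vs {profit_at_worst:.2f}")
```

A bettor tied to a single book would collect something between those two figures on each winning pick, which is why the published ROI should be read as an upper-end estimate.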
Leagues with odds coverage, included in the ROI figures: Premier League, Bundesliga, Serie A, La Liga, Ligue 1, Eredivisie, Primeira Liga, Belgian Pro League, Turkish Süper Lig.
Competitions without odds coverage: FA Cup, Champions League, Europa League, and Conference League. football-data.co.uk doesn't publish odds for these, so they appear in model-level metrics but not in the ROI figures.
What isn't included
- No transaction costs — no commissions, no limits, no stake-size shrinkage.
- Flat 1-unit stakes only — no Kelly, no variable sizing, no bankroll management.
- Closing prices only — not opening lines, not in-play, not a retail-achievable mid-market.
- No cup or continental competitions in the ROI figures (odds not available in the source).
- Evidence multiplier held at 1.0 in the backtest. The production composite score boosts picks that are backed by qualitative insight patterns; historical insight rows are sparse, so we hold that multiplier neutral rather than fabricating it. This is the main reason a live signal's top pick can differ from what the historical ranker would have chosen.
Breakdown
| Season | Picks | Hit rate | ROI |
|---|---|---|---|
| No data yet. | | | |
FAQ
The production composite score applies an evidence multiplier when supporting insight patterns are present. The backtest holds that multiplier at 1.0 because historical insight data is sparse, so the two rankings can differ on any given match.
Our historical odds source (football-data.co.uk) publishes closing lines only. Opening prices and mid-market quotes aren't backfilled. If a source for opening prices becomes viable we'll publish a second ROI track against that.
Cup and continental competitions are missing from the ROI figures because football-data.co.uk doesn't cover them. The model still produces forecasts for them, but we don't have a settled-odds trail to compute ROI against.
The backtest figures refresh when we re-run the full backtest pipeline against a new git revision, typically after a model, feature, or composite-score change. The figure above updates automatically when the backtest artifact regenerates.
These figures do not guarantee future returns; past performance isn't a guarantee of future results. They use optimistic closing-line prices, ignore transaction costs, and assume flat staking. Treat them as a sanity check on the model's discipline, not as a forecast of realised P&L.