From a hand-tuned model that beat IESO once to a self-improving platform that keeps tuning your model, continuously, across your whole fleet.
You already proved you can beat IESO — by hand, once, for Ontario.
AHOY makes that automatic, continuous, and fleet-wide.
We are not selling a better forecast. You build excellent models. We sell the self-improving platform around your model — one that keeps improving it as conditions drift, retargets to every asset without new ML hires, and proves every change is safe and explainable.
| Signal | What we found |
|---|---|
| Your model | StandardScaler → XGBoost, 267 features, 966 trees @ depth 4, Optuna-tuned, leak-aware residual design. Mature and competent. |
| IESO baseline today | 2.0% MAPE, MAE 321 MW. Good, but it systematically under-forecasts by ~250 MW, with the bias structured by hour (HE07 +552 MW, evening peak HE17 +388 MW). |
| Where it hurts | Peak-hour P95 error 930 MW; during heat waves P95 ≈ 1,031 MW. The tail blows out in the costliest hours. |
| Frozen eval window Jul 1–14, 2025 | IESO MAE 423 MW (2.27%), RMSE 538, P95 ≈ 1,019 MW. Data clean — 6 zeroed rows, no nulls. |
Computed directly from best_model.pkl + ontario.parquet — exact figures, not claims.
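For reference, the kind of calculation behind the figures above can be sketched in a few lines. This is a minimal illustration, not the script we ran: the `(hour_ending, actual_mw, forecast_mw)` row layout, the function names, and the sign convention (error = actual minus forecast, so a positive value means under-forecast) are all assumptions made for the example.

```python
import math
from collections import defaultdict


def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def baseline_metrics(rows):
    """Summarise forecast error.

    rows: iterable of (hour_ending, actual_mw, forecast_mw) tuples.
    This layout is hypothetical; adapt it to the real data schema.
    """
    errors, pct_errors = [], []
    by_hour = defaultdict(list)
    for hour_ending, actual, forecast in rows:
        err = actual - forecast  # assumed sign: positive = under-forecast
        errors.append(err)
        by_hour[hour_ending].append(err)
        if actual != 0:  # zeroed rows cannot contribute a percentage error
            pct_errors.append(abs(err) / abs(actual))
    abs_errors = [abs(e) for e in errors]
    n = len(errors)
    return {
        "mae": sum(abs_errors) / n,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "mape_pct": 100.0 * sum(pct_errors) / len(pct_errors) if pct_errors else None,
        "bias": sum(errors) / n,
        "p95_abs": percentile(abs_errors, 95),
        "bias_by_hour": {h: sum(v) / len(v) for h, v in sorted(by_hour.items())},
    }
```

Splitting the bias by hour-ending is what exposes structure such as a larger morning or evening offset, which a single average would hide.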
The tail error reaches ~1 GW in the hardest, costliest hours, where even a well-tuned single model struggles and an autonomous search has the most room.
Honesty note: we did not fabricate a "model beats IESO by X%" figure; producing one requires your feature and eval code. The model's residual_roll_* features carry leakage risk, which we would audit as part of the loop.
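To make the leakage concern concrete, here is a minimal sketch of one way such an audit could work. We have not seen how the residual_roll_* features are built, so `lagged_rolling_mean` and `leaks_future` are hypothetical helpers, and the right `lag` depends on which residuals are actually known at forecast time.

```python
def lagged_rolling_mean(residuals, window, lag):
    """Rolling mean of past residuals that never touches index t or later.

    The value at position t averages residuals[t-lag-window+1 .. t-lag].
    Positions without enough history get None.
    """
    if window < 1 or lag < 1:
        raise ValueError("window and lag must both be >= 1")
    out = []
    for t in range(len(residuals)):
        end = t - lag + 1      # exclusive end, so the last index used is t - lag
        start = end - window
        if start < 0:
            out.append(None)
        else:
            out.append(sum(residuals[start:end]) / window)
    return out


def leaks_future(feature_fn, series, t, bump=1e6):
    """Return True if changing values at index >= t alters the feature at t."""
    baseline = feature_fn(series)[t]
    perturbed = list(series)
    for i in range(t, len(perturbed)):
        perturbed[i] += bump
    return feature_fn(perturbed)[t] != baseline
```

The perturbation check treats the feature builder as a black box: if the value at time t moves when only data from t onward is changed, that feature is using information it would not have in production.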
Honest framing: single-model average-RMSE gains are modest. The improvement is on the operational KPI (weighted-peak + P95), on a second out-of-window regime, provably leak-safe, and produced autonomously. It is not a one-shot accuracy miracle.
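As an illustration only, an operational KPI of this shape could combine a peak-weighted MAE with the P95 absolute error. The function name `operational_kpi`, the choice of peak hours, and both weights are placeholders for this sketch; the real definition would be agreed with your operations team.

```python
import math


def operational_kpi(rows, peak_hours, peak_weight=3.0, p95_weight=0.5):
    """Peak-weighted MAE plus a scaled P95 absolute error (hypothetical form).

    rows: iterable of (hour_ending, actual_mw, forecast_mw) tuples.
    peak_hours: set of hour-ending values treated as peak.
    """
    abs_errors, weights = [], []
    for hour_ending, actual, forecast in rows:
        abs_errors.append(abs(actual - forecast))
        weights.append(peak_weight if hour_ending in peak_hours else 1.0)

    # Errors in peak hours count peak_weight times as much as off-peak errors.
    weighted_mae = sum(w * e for w, e in zip(weights, abs_errors)) / sum(weights)

    # P95 of absolute error, using linear interpolation between order statistics.
    xs = sorted(abs_errors)
    pos = (len(xs) - 1) * 0.95
    lo, hi = math.floor(pos), math.ceil(pos)
    p95 = xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)

    return weighted_mae + p95_weight * p95
```

A metric like this can move even when average RMSE barely changes, because it concentrates on the peak hours and on the tail.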
The operational layer for Physical AI — perceive, decide, execute.