Forecasting Ridership Without the Data Science Theater

One district ran morning buses sized as if every enrolled rider appeared every Tuesday. Planners could see the half-empty rows pulling away; they did not have a number attached to the waste, so nothing changed. Once they started tracking who actually boarded by stop and weekday, they could shrink buffers without pretending they had a crystal ball.

That is the whole argument for forecasting in plain clothes: not a slide that says "machine learning," but a better guess than "assume 100%."

Pair this with operational tactics for rider no-shows. For seat-level cost language, see empty seat miles.

Worst-case planning has a bill

If every route assumes peak enrollment forever, you buy vehicles, hours, and fuel for ghosts. Forecasting flips the question to: what usually shows up, and how wrong can we afford to be? A range beats a single heroic number when you are deciding whether to float a spare driver or merge two light runs.

Data you can actually collect

Start boring: counts per stop or per trip, by day of week, for a couple of months. Add which vehicle ran so you are not mixing unlike routes. Schools have enrollment lists; shuttles have shift rosters — use them as ceilings, not as demand.

Things that sharpen the picture without a warehouse: half-days and holidays, obvious weather days, big on-site events for corporate sites. If riders can confirm or cancel in the app or by SMS, that feed is gold: it turns "maybe" into "told you no."

Dwell at busy stops often explains mystery lateness better than traffic alone; dwell time is worth fixing in parallel.

Start with a rolling average, not a model zoo

For many fleets, "this stop averages six riders on Wednesdays over the last eight Wednesdays" is enough to stop sizing a fourteen-passenger run as if it carried thirty. Layer a simple band on top: plan for the average plus a small cushion. The cushion is a policy choice, not a neural network.
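The rolling average plus cushion can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the function name, the tuple layout, the stop ID, and the default cushion of two seats are all assumptions made up for the example. The optional ceiling stands in for an enrollment list or shift roster.

```python
from datetime import date, timedelta
from math import ceil
from statistics import mean


def planned_seats(boardings, stop, weekday, window=8, cushion=2, ceiling=None):
    """Seats to plan for one stop on one weekday.

    boardings: iterable of (service_date, stop_id, riders) tuples.
    weekday:   0 = Monday ... 6 = Sunday, matching date.weekday().
    cushion:   extra seats on top of the average (a policy choice, assumed here).
    ceiling:   optional roster or enrollment cap for this stop.
    """
    # Keep only this stop's counts on the requested weekday, oldest first.
    history = sorted(
        (d, n) for d, s, n in boardings if s == stop and d.weekday() == weekday
    )
    if not history:
        return ceiling  # no history yet: fall back to the roster (may be None)
    recent = [n for _, n in history[-window:]]
    plan = ceil(mean(recent) + cushion)
    return min(plan, ceiling) if ceiling is not None else plan


# Made-up example: eight Wednesdays of counts at one hypothetical stop.
first_wednesday = date(2024, 1, 3)
counts = [5, 7, 6, 6, 5, 7, 6, 6]
history = [
    (first_wednesday + timedelta(weeks=i), "ELM-04", n)
    for i, n in enumerate(counts)
]
print(planned_seats(history, "ELM-04", weekday=2))  # Wednesday is 2
```

With these invented counts the average is six riders, so the plan comes out at eight seats. Changing the cushion is a one-argument edit, which keeps the policy decision visible instead of buried in a model.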

When you have months of clean history and noisy, interacting factors, richer models can help — but the decision layer stays the same: you need thresholds a dispatcher can defend at 6:12 AM. If nobody trusts the output, the spreadsheet version wins.

Low-volume stops are noisy; grouping a few neighbors and forecasting the cluster often behaves better than pretending one curb has statistics.
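To see why pooling neighbors helps, compare how much a single stop bounces around against how much the cluster total does. The sketch below uses invented weekly counts for three hypothetical stops. The "relative spread" measure (standard deviation divided by the mean) is one reasonable choice assumed for illustration, not the only one.

```python
from statistics import mean, pstdev


def cluster_totals(counts_by_stop, cluster):
    """Sum riders across the stops in `cluster`, period by period."""
    series = [counts_by_stop[stop] for stop in cluster]
    return [sum(period) for period in zip(*series)]


def relative_spread(series):
    """Standard deviation divided by the mean (0.0 when the mean is zero)."""
    m = mean(series)
    return pstdev(series) / m if m else 0.0


# Made-up Wednesday counts for three neighboring low-volume stops.
counts_by_stop = {
    "OAK-01": [1, 3, 0, 2, 4, 1, 2, 3],
    "OAK-02": [2, 0, 3, 1, 0, 2, 1, 1],
    "OAK-03": [1, 2, 1, 2, 1, 1, 2, 0],
}
totals = cluster_totals(counts_by_stop, ["OAK-01", "OAK-02", "OAK-03"])

print("single stop:", round(relative_spread(counts_by_stop["OAK-01"]), 2))
print("cluster:    ", round(relative_spread(totals), 2))
```

In this toy data the cluster total is far steadier than any one curb. Real neighbors will not cancel out this neatly, so check the spread on your own counts before you trust a grouping.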

Turn numbers into actions

Write rules everyone agrees on before the numbers arrive. Examples: if a run's expected load sits below X% of capacity for Y consecutive weeks, flag it for consolidation; if variance is high, keep a standby instead of cutting. Re-run routes the night before with expected loads, then adjust morning-of when confirmations diverge.
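Rules like these are easier to defend when they are written down as code that anyone can read. Here is one possible sketch. The class, the field names, and the thresholds (50% fill, four weeks, a spread cutoff of 0.35) are placeholders invented for the example, standing in for whatever X and Y your team agrees on.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RunForecast:
    run_id: str
    capacity: int
    weekly_expected_loads: List[float]  # oldest first, one value per week
    relative_spread: float              # e.g. std dev / mean of recent loads


def recommend(run, min_fill=0.5, weeks_required=4, max_spread=0.35):
    """Return 'keep_standby', 'flag_consolidation', or 'no_change'."""
    # High variance wins: keep a standby rather than cutting.
    if run.relative_spread > max_spread:
        return "keep_standby"
    recent = run.weekly_expected_loads[-weeks_required:]
    below = [load < min_fill * run.capacity for load in recent]
    # Only flag when every one of the last `weeks_required` weeks is low.
    if len(recent) == weeks_required and all(below):
        return "flag_consolidation"
    return "no_change"


# Hypothetical runs with made-up numbers.
light = RunForecast("AM-12", 30, [12, 11, 13, 10], 0.10)
jumpy = RunForecast("AM-14", 30, [12, 11, 13, 10], 0.60)
mixed = RunForecast("AM-16", 30, [12, 20, 13, 10], 0.10)

for run in (light, jumpy, mixed):
    print(run.run_id, recommend(run))
```

The variance check runs first on purpose: a run that is light on average but swings widely gets a standby, not a cut. A flag is only a prompt for a human to review the run, not an automatic merge.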

Nudges ("still riding tomorrow?") improve both behavior and data quality. Message design lives in the school bus notification playbook. When reality breaks the forecast, route contingency planning keeps you from improvising fresh every time.

For how optimization and review fit together in the product, see how RouteBot optimizes school and employee transport.

Risks worth naming

Under-sizing scares every planner; mitigate it with conservative bands and a small on-call pool during the learning window. Models go stale after boundary or policy changes, so revisit them seasonally, not never. On equity: never "optimize" away mandated SPED capacity or push riders onto unsafe walks; hard constraints stay hard.

Pilot without drama

Pick a slice of routes with stable geography and messy attendance. Run night-before plans off forecasted demand for four to six weeks; watch on-time performance, complaints, and overtime. Expand when the pattern holds, not when the slide deck says so.

Try the live demo if you want to see how data, routes, and comms sit in one stack.

— Emrah G.