Time Series Forecasting
Time series forecasting predicts future values based on historical patterns. It's essential for demand planning, financial predictions, capacity planning, and resource allocation.
Time Series Components
Trend
Long-term direction:
↗ Upward trend: Sales growing over years
↘ Downward trend: Declining market
→ No trend: Stationary
Seasonality
Regular periodic patterns:
Daily: Traffic peaks at rush hour
Weekly: Restaurant sales higher on weekends
Yearly: Retail spikes at holidays
Cyclical
Irregular long-term patterns:
Business cycles (years)
Economic expansions/recessions
Noise
Random variation:
Unexplainable fluctuations
Classical Methods
Moving Average
def moving_average(data, window):
    # Assumes `data` is a pandas Series
    return data.rolling(window).mean()

# Forecast = average of last n values
forecast = data[-window:].mean()
Simple, good baseline.
Exponential Smoothing
ŷₜ₊₁ = α × yₜ + (1-α) × ŷₜ
α = 0: Use only old forecasts
α = 1: Use only most recent value
More weight on recent observations.
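The recursion above can be sketched in a few lines of plain Python. Initializing the forecast with the first observation is a common convention, assumed here:

```python
def ses_forecast(data, alpha):
    # Simple exponential smoothing: ŷ_{t+1} = α·y_t + (1-α)·ŷ_t
    f = data[0]  # initialize with the first observation
    for y in data[1:]:
        f = alpha * y + (1 - alpha) * f
    return f  # one-step-ahead forecast

ses_forecast([3, 5, 9], alpha=0.5)  # → 6.5
```

Note the extremes: `alpha=1.0` returns the last value (naive forecast), `alpha=0.0` never moves off the initial value.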
Holt-Winters
Handles trend and seasonality:
Level: lₜ = α(yₜ - sₜ₋ₘ) + (1-α)(lₜ₋₁ + bₜ₋₁)
Trend: bₜ = β(lₜ - lₜ₋₁) + (1-β)bₜ₋₁
Seasonality: sₜ = γ(yₜ - lₜ) + (1-γ)sₜ₋ₘ
Forecast: ŷₜ₊ₕ = lₜ + h×bₜ + sₜ₊ₕ₋ₘ
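The recursions above can be sketched directly in Python. The initialization used here (first-season mean for the level, first-to-second-season change for the trend, first-season deviations for the seasonal indices) is a crude assumption for illustration; libraries fit these by optimization.

```python
def holt_winters_forecast(y, m, alpha, beta, gamma, h):
    # Additive Holt-Winters with seasonal period m, forecasting h steps ahead.
    level = sum(y[:m]) / m
    trend = (sum(y[m:2*m]) - sum(y[:m])) / (m * m)
    season = [y[i] - level for i in range(m)]
    for t in range(m, len(y)):
        last_level = level
        # l_t = α(y_t - s_{t-m}) + (1-α)(l_{t-1} + b_{t-1})
        level = alpha * (y[t] - season[t % m]) + (1 - alpha) * (level + trend)
        # b_t = β(l_t - l_{t-1}) + (1-β)b_{t-1}
        trend = beta * (level - last_level) + (1 - beta) * trend
        # s_t = γ(y_t - l_t) + (1-γ)s_{t-m}
        season[t % m] = gamma * (y[t] - level) + (1 - gamma) * season[t % m]
    # ŷ_{t+h} = l_t + h·b_t + s_{t+h-m}
    return level + h * trend + season[(len(y) + h - 1) % m]

holt_winters_forecast([10, 20] * 10, m=2, alpha=0.5, beta=0.5, gamma=0.5, h=1)
```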
ARIMA
AutoRegressive Integrated Moving Average
ARIMA(p, d, q)
p: AR terms (past values)
d: Differencing (for stationarity)
q: MA terms (past errors)
from statsmodels.tsa.arima.model import ARIMA
model = ARIMA(data, order=(1, 1, 1))
results = model.fit()
forecast = results.forecast(steps=30)
SARIMA
Seasonal ARIMA:
SARIMA(p, d, q)(P, D, Q, m)
m: seasonal period (e.g., 12 for monthly data with yearly seasonality)
Modern Methods
Prophet
Facebook's library, handles:
- Multiple seasonalities
- Holidays
- Trend changepoints
import pandas as pd
from prophet import Prophet

df = pd.DataFrame({'ds': dates, 'y': values})
model = Prophet()
model.fit(df)
future = model.make_future_dataframe(periods=30)
forecast = model.predict(future)
LSTM
Recurrent neural network:
import torch.nn as nn

class LSTM_Forecaster(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        super().__init__()
        self.lstm = nn.LSTM(input_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        out, _ = self.lstm(x)          # out: (batch, seq_len, hidden_dim)
        return self.fc(out[:, -1, :])  # forecast from the last time step
Good for complex patterns.
Temporal Fusion Transformer
State-of-the-art for multi-horizon forecasting:
- Attention over time
- Static and dynamic features
- Quantile forecasts
N-BEATS
Neural network with interpretable components:
- Trend and seasonality blocks
- Residual connections
Feature Engineering
Lag Features
df['lag_1'] = df['value'].shift(1)
df['lag_7'] = df['value'].shift(7) # Same day last week
df['lag_365'] = df['value'].shift(365) # Same day last year
Rolling Statistics
df['rolling_mean_7'] = df['value'].rolling(7).mean()
df['rolling_std_7'] = df['value'].rolling(7).std()
df['rolling_min_30'] = df['value'].rolling(30).min()
Date Features
df['dayofweek'] = df['date'].dt.dayofweek
df['month'] = df['date'].dt.month
df['is_weekend'] = df['dayofweek'].isin([5, 6])
df['is_holiday'] = df['date'].isin(holidays)
Evaluation
Metrics
| Metric | Formula | Notes |
|---|---|---|
| MAE | Σ\|y − ŷ\| / n | Same units as the data |
| RMSE | √(Σ(y − ŷ)² / n) | Penalizes large errors |
| MAPE | (100/n) Σ\|y − ŷ\| / \|y\| | Undefined when y = 0 |
| sMAPE | Symmetric MAPE | Handles zeros better |
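These metrics are a few lines each in plain Python (a sketch; `zip` pairs actuals with forecasts):

```python
import math

def mae(y, yhat):
    return sum(abs(a - p) for a, p in zip(y, yhat)) / len(y)

def rmse(y, yhat):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(y, yhat)) / len(y))

def mape(y, yhat):
    # Undefined where any actual value is 0
    return 100 / len(y) * sum(abs(a - p) / abs(a) for a, p in zip(y, yhat))

def smape(y, yhat):
    # Symmetric variant; bounded, tolerates zero actuals (unless both are 0)
    return 100 / len(y) * sum(
        2 * abs(p - a) / (abs(a) + abs(p)) for a, p in zip(y, yhat))

mae([100, 200], [110, 190])  # → 10.0
```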
Time Series Cross-Validation
Fold 1: Train [1-100], Test [101-110]
Fold 2: Train [1-110], Test [111-120]
Fold 3: Train [1-120], Test [121-130]
from sklearn.model_selection import TimeSeriesSplit

tscv = TimeSeriesSplit(n_splits=5)
for train_idx, test_idx in tscv.split(X):
    # Always train on past, test on future
    X_train, X_test = X[train_idx], X[test_idx]
Walk-Forward Validation
for i in range(len(test)):
    train = data[:train_end + i]  # expanding training window
    model.fit(train)
    pred = model.predict(1)       # one-step-ahead forecast
    # Retrain as new data becomes available
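A concrete, self-contained version of the loop, using a hypothetical naive last-value forecaster as the model:

```python
def naive_forecast(history):
    # Naive baseline: predict the most recently observed value
    return history[-1]

def walk_forward(data, train_size):
    preds, actuals = [], []
    for i in range(train_size, len(data)):
        history = data[:i]           # everything observed so far
        preds.append(naive_forecast(history))
        actuals.append(data[i])      # then the true value arrives
    return preds, actuals

preds, actuals = walk_forward([1, 2, 3, 5, 8], train_size=2)
# preds = [2, 3, 5], actuals = [3, 5, 8]
```

Replacing `naive_forecast` with a real model (refit on `history` each step) gives standard walk-forward validation.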
Multi-Step Forecasting
Recursive
Predict t+1, use it to predict t+2, ...
Errors accumulate.
Direct
Train separate model for each horizon:
Model_1 predicts t+1
Model_7 predicts t+7
No error propagation.
Multi-Output
Single model outputs [t+1, t+2, ..., t+n]
Joint learning of horizons.
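A minimal sketch of the recursive strategy, using a toy drift model (hypothetical: next value = last value + mean historical step) as the one-step forecaster:

```python
def fit_drift(history):
    # Toy one-step model: next value = last value + mean step size
    steps = [b - a for a, b in zip(history, history[1:])]
    drift = sum(steps) / len(steps)
    return lambda h: h[-1] + drift

def recursive_forecast(history, horizon):
    model = fit_drift(history)
    h = list(history)
    preds = []
    for _ in range(horizon):
        nxt = model(h)
        preds.append(nxt)
        h.append(nxt)  # feed the prediction back in: errors can compound
    return preds

recursive_forecast([10, 12, 14, 16], horizon=3)  # → [18.0, 20.0, 22.0]
```

The direct strategy would instead call `fit` once per horizon, each time with targets shifted h steps ahead, so no prediction is ever fed back in.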
Common Pitfalls
Look-Ahead Bias
# ❌ Wrong: Uses future in feature
df['future_avg'] = df['value'].shift(-7).rolling(7).mean()
# ✓ Correct: Only uses past
df['past_avg'] = df['value'].shift(1).rolling(7).mean()
Incorrect Train/Test Split
# ❌ Wrong: Random split
train, test = train_test_split(data, random_state=42)
# ✓ Correct: Temporal split
train, test = data[:split_idx], data[split_idx:]
Ignoring Stationarity
- Many models assume stationarity
- Test with ADF (Augmented Dickey-Fuller)
- Difference if needed: diff = data[t] - data[t-1]
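First-order differencing is a one-liner; as a sketch:

```python
def difference(data, lag=1):
    # Subtract the value `lag` steps back. lag=1 removes a linear trend;
    # a seasonal lag (e.g., lag=12 for monthly data) removes seasonality.
    return [data[t] - data[t - lag] for t in range(lag, len(data))]

difference([3, 5, 7, 9])  # → [2, 2, 2]: a linear trend becomes constant
```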
Production Considerations
Retraining Frequency
- Daily? Weekly? As data arrives?
- Balance freshness vs stability
Confidence Intervals
# Quantile forecasts (the exact API varies by library)
p10, p50, p90 = model.predict_quantiles([0.1, 0.5, 0.9])
Don't just give point forecasts.
Monitoring
- Track forecast accuracy over time
- Alert on degradation
- Detect distribution shift
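One simple degradation check, as a sketch (the window size and threshold factor are illustrative assumptions, not recommendations):

```python
def check_degradation(errors, window=30, baseline_mae=None, factor=1.5):
    # Alert when the recent window's MAE exceeds the baseline MAE by `factor`
    recent = errors[-window:]
    recent_mae = sum(abs(e) for e in recent) / len(recent)
    return baseline_mae is not None and recent_mae > factor * baseline_mae

check_degradation([1.0] * 30, baseline_mae=0.5)  # → True: errors doubled
```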
Key Takeaways
- Decompose: trend + seasonality + noise
- Classical: ARIMA, Exponential Smoothing
- Modern: Prophet, LSTM, Transformers
- Feature engineering: lags, rolling stats, date features
- Always use temporal train/test splits
- Provide uncertainty estimates, not just point forecasts