Evgueni Poloukarov committed on
Commit 27ce714 · 1 Parent(s): 2e13800

docs: correct feature count (2,514 actual, not 3,043 claimed)

Fixed a documentation mismatch: docstrings, print statements, and UI text
claimed 3,043 features, but the actual implementation uses 2,514 features:
- Dataset reality: 2,553 columns = 1 timestamp + 38 targets + 2,514 features
- Breakdown: 603 full-horizon + 12 partial + 1,899 historical
- Context shape: (504, 2517) = 504h × (ts + border + target + 2,514 features)
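
The counts above can be cross-checked with a few lines of arithmetic. This is only a sketch: the variable names are illustrative and do not come from the repository, and the numbers are copied from this commit message.

```python
# Sanity check of the counts quoted in this commit message.
# All names are illustrative; none are taken from the repository.
N_TIMESTAMP = 1
N_TARGETS = 38
FEATURE_GROUPS = {
    "full_horizon": 603,   # known for the whole D+1..D+14 horizon
    "partial_d1": 12,      # known for D+1 only
    "historical": 1899,    # past-only (masked in the future)
}

n_features = sum(FEATURE_GROUPS.values())
n_columns = N_TIMESTAMP + N_TARGETS + n_features
# Per-border context row: timestamp + border id + one target + all features.
context_width = 1 + 1 + 1 + n_features
known_future = FEATURE_GROUPS["full_horizon"] + FEATURE_GROUPS["partial_d1"]

print(n_features, n_columns, context_width, known_future)
```

The 615 "known-future" figure that appears in the UI text and print statements is the sum of the full-horizon and partial D+1 groups.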

Updated documentation in:
- src/forecasting/dynamic_forecast.py (header + docstring)
- src/forecasting/chronos_inference.py (header + print statements)
- app.py (UI description)

This ensures documentation matches code reality for accurate handover.

app.py CHANGED
@@ -2,7 +2,7 @@
  """
  FBMC Chronos-2 Forecasting API
  HuggingFace Space Gradio Interface
- Version: 1.4.0 (past-only masking with 3,043 features)
+ Version: 1.4.0 (past-only masking with 2,514 features)
  """
 
  # CRITICAL: Set PyTorch memory allocator config BEFORE any imports
@@ -140,9 +140,9 @@ with gr.Blocks(title="FBMC Chronos-2 Forecasting") as demo:
  generalizes directly to FBMC cross-border flows using historical patterns and future covariates.
 
  **Features**:
- - 3,043 engineered features using past-only covariate masking
+ - 2,514 engineered features using past-only covariate masking
  - Known-future: weather, generation, load forecasts (615 features)
- - Past-only masked: CNEC outages, volatility, flows (~2,428 features)
+ - Past-only masked: CNEC outages, volatility, flows (1,899 features)
  - 24-month historical context (Oct 2023 - Oct 2025)
  - Time-aware extraction (prevents data leakage)
  - Probabilistic forecasts (9 quantiles: 1st/5th/10th/25th/50th/75th/90th/95th/99th)
src/forecasting/chronos_inference.py CHANGED
@@ -2,7 +2,7 @@
  """
  Chronos-2 Inference Pipeline with Past-Only Covariate Masking
  Standalone inference script for HuggingFace Space deployment.
- Uses predict_df() API with ALL 3,043 features leveraging Chronos-2's mask-based attention.
+ Uses predict_df() API with ALL 2,514 features leveraging Chronos-2's mask-based attention.
  FORCE REBUILD: v1.4.0 - Past-only covariates + batch_size=128 for volatility capture
  """
 
@@ -173,10 +173,10 @@ class ChronosInferencePipeline:
  total_start = time.time()
 
  # PER-BORDER INFERENCE WITH PAST-ONLY COVARIATE MASKING
- # Using predict_df() API with ALL 3,043 features (known-future + past-only masked)
- print(f"\n[PAST-ONLY MASKING] Running inference for {len(forecast_borders)} borders with 3,043 features...")
+ # Using predict_df() API with ALL 2,514 features (known-future + past-only masked)
+ print(f"\n[PAST-ONLY MASKING] Running inference for {len(forecast_borders)} borders with 2,514 features...")
  print(f" Known-future: weather, generation, load forecasts (615 features)")
- print(f" Past-only masked: CNEC outages, volatility, historical flows (~2,428 features)")
+ print(f" Past-only masked: CNEC outages, volatility, historical flows (1,899 features)")
 
  for i, border in enumerate(forecast_borders, 1):
  # Clear GPU cache BEFORE each border to prevent memory accumulation
src/forecasting/dynamic_forecast.py CHANGED
@@ -9,10 +9,10 @@ Key Concepts:
  - run_date: When the forecast is made (e.g., "2025-09-30 23:00")
  - forecast_horizon: Always 14 days (D+1 to D+14, fixed at 336 hours)
  - context_window: Historical data before run_date (typically 512 hours)
- - future_covariates: ALL 3,043 features (leveraging Chronos-2 past-only masking)
+ - future_covariates: ALL 2,514 features (leveraging Chronos-2 past-only masking)
  * 603 full-horizon (known future)
  * 12 partial D+1 (masked D+2-D+14)
- * ~2,428 historical (masked as past-only covariates)
+ * 1,899 historical (masked as past-only covariates)
 
  Chronos-2 Past-Only Covariate Masking:
  - Historical features have NaN future values → Chronos-2 sets mask=0
@@ -170,10 +170,10 @@ class DynamicForecast:
  """
  Extract future covariate data for D+1 to D+14.
 
- Future covariates include ALL 3,043 features using Chronos-2's past-only masking:
+ Future covariates include ALL 2,514 features using Chronos-2's past-only masking:
  - Full-horizon D+14: 603 features (known future values)
  - Partial D+1: 12 features (load forecasts, masked D+2-D+14)
- - Historical: 1,899 features (MASKED as past-only covariates)
+ - Historical: 1,899 features (MASKED as past-only covariates)
 
  Past-only covariates leverage Chronos-2's mask-based attention:
  - Future values are NaN (unknown)
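
As a rough illustration of the three covariate groups named in these docstrings, the sketch below builds a future-covariate table in which full-horizon features carry values for all 336 hours, partial D+1 features carry values only for the first day, and historical features are NaN throughout. It uses plain Python lists, not the project's data structures. The function name, the column names, and the assumption that D+1 spans the first 24 horizon hours are hypothetical. That Chronos-2 treats NaN future values as masked is taken from the docstring, not verified here.

```python
import math

HORIZON_HOURS = 336  # 14 days, as stated in the docstring
D1_HOURS = 24        # assumption: D+1 is the first 24 horizon hours


def build_future_covariates(full_horizon, partial_d1, historical_names):
    """Return {column: list of HORIZON_HOURS floats}; NaN marks unknown values.

    Hypothetical helper for illustration only.
    """
    nan = float("nan")
    out = {}
    # Known for the whole horizon: copy values through unchanged.
    for name, values in full_horizon.items():
        if len(values) != HORIZON_HOURS:
            raise ValueError(f"{name}: expected {HORIZON_HOURS} values")
        out[name] = [float(v) for v in values]
    # Known for D+1 only: pad D+2..D+14 with NaN.
    for name, values in partial_d1.items():
        if len(values) != D1_HOURS:
            raise ValueError(f"{name}: expected {D1_HOURS} values")
        out[name] = [float(v) for v in values] + [nan] * (HORIZON_HOURS - D1_HOURS)
    # Past-only: no future values at all.
    for name in historical_names:
        out[name] = [nan] * HORIZON_HOURS
    return out


covs = build_future_covariates(
    full_horizon={"temp_forecast": [10.0] * HORIZON_HOURS},
    partial_d1={"load_forecast": [500.0] * D1_HOURS},
    historical_names=["cnec_outage_flag"],
)
# Count the non-NaN (known) future hours per column.
known_counts = {k: sum(not math.isnan(v) for v in col) for k, col in covs.items()}
print(known_counts)
```

Every column has the same length, so the three groups can sit in one table; only the pattern of NaN values distinguishes them.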