Version: 1.0 | Last Updated: April 11, 2026 Status: Active hackathon build — use this as the single source of truth for all decisions
- Hackathon Problem Statement — Decoded
- Strategic Analysis — Winner's Framework
- What Most Teams Will Build vs What We Build
- Real-World Precedents — What Already Exists
- Firebird Deep Dive — Our Primary Inspiration
- Updated Architecture — Dual-Granularity Risk Engine
- Databricks Platform — Low-Level Technical Plan
- Gold Tables — Full Schema
- ML Strategy — Formula + Challenger Model
- Geospatial Command Center — UI Design
- Demo Strategy — Three Clicks to Win
- Judge Psychology & Objections
- 24-Hour Execution Plan
- 10 Differentiators Over Prior Systems
- Winning Narrative & Pitch
You are given 5 years of real San Francisco Fire Department operational data across five open datasets. In 14 days (hackathon time), build a production-ready Databricks lakehouse that helps the fire department:
- Predict where fires and losses are most likely to happen (risk scoring)
- Deploy units and inspection resources smarter (response optimization)
- Close compliance gaps faster (violation tracking and inspection prioritization)
SFFD currently schedules inspections using a combination of complaint-driven dispatch, fixed annual cycles by property type, and inspector discretion. There is no cross-dataset risk scoring system — violations, incidents, call history, and permit status are tracked in four separate tools with no unified property identity. The result: a battalion chief cannot answer "which 20 buildings in my district need inspection this week" using data. This project builds that answer for the first time.
| Dataset | Records | Socrata ID | Primary Use |
|---|---|---|---|
| Fire Incidents | 250K+ | wr8u-xric |
Historical fire events, loss, type |
| Calls for Service | 1M+ | nuek-vuh3 |
Dispatch, response times, unit deployment |
| Fire Violations | 150K+ | fey4-yub6 |
Open/closed violations per property |
| Fire Inspections | 80K+ | wb4c-6hwj |
Inspection history, pass/fail, recency |
| Fire Permits | 50K+ | 893e-7rap |
Permit type (occupancy risk proxy) |
- 5+ Gold Delta tables with a clean medallion architecture
- Risk prediction layer (per property or area)
- Response optimization layer (unit/station coverage)
- Compliance tracking system
- Operational dashboard dataset
- Full documentation and notebooks
Ship an integrated Databricks-native platform with medallion architecture, ML risk scoring, response optimization, compliance analytics, and an operational dashboard that a fire chief or prevention lead could actually use.
Hidden Goal
Prove you can think like a real enterprise data platform team:
- Strong data governance (Unity Catalog, lineage)
- Reproducible ML (MLflow, model registry)
- Orchestrated pipelines (Workflows/DLT)
- Production-grade Gold tables, not just notebook outputs
- A solution pattern that generalizes to other cities and risk domains
- Platform depth: Did you use Databricks the way a real customer would — medallion, UC, MLflow, Model Serving, Workflows?
- Operational value: Are your insights actionable to an overworked fire chief, not just impressive to a data scientist?
- Generalizability: Does this look like a reusable pattern for other cities and risk domains, not a one-off hack?
- Demo magic: Can you create a "wow" moment in under 90 seconds?
| Ambiguity | Strategic Opportunity |
|---|---|
| "Risk" is undefined | Define a composite "Fire Harm Index" — you own the narrative |
| "Response optimization" is vague | Use NFPA 1710 4-minute travel standard as a concrete, authoritative benchmark |
| No explicit fairness requirement | Add fairness/coverage analysis proactively — massive governance cred |
| No LLM requirement (stretch) | Add a narrow, governed explanation assistant — a RAG system with zero free-form generation: it only composes sentences using structured values already in gold.property_risk_twin. No external knowledge, no hallucination surface. Input: property_id. Output: a templated 3-sentence plain-English risk summary grounded entirely in Gold table columns. Evaluated using a custom mlflow.evaluate() metric checking for field-grounding. |
| No multi-city requirement | Show the pattern generalizes — same medallion, different city, same Gold schema |
The painful, expensive, frequent, urgent problem:
- US structure fires: 494,000/year causing 2,800 deaths and $9.8B in property damage (2014 NFPA data)
- Municipal fire departments operate with flat budgets and staffing while risk grows with climate, urban densification, and aging infrastructure
- SFFD conducts thousands of commercial inspections/year but has no data-driven prioritization — gut feel and tradition dominate
- Inspection resources are finite; high-risk properties fall through the cracks
- San Francisco has ~7,500 commercial properties requiring annual fire inspections; SFFD has historically completed ~4,000–5,000/year — a structural gap of 30–40%
- The 2016 Ghost Ship warehouse fire (Oakland, 36 deaths) — 30 miles from SF — was a property with unresolved violations and zero inspections in years: exactly the "dark property" pattern our system targets
- SF's dense SoMa/Mission/Tenderloin neighborhoods combine aging building stock, high occupancy, and historically understaffed inspection coverage — the highest-risk confluence in the dataset
The ROI argument:
- Preventing one major structural fire (multi-million loss + injuries) pays for the entire Databricks platform several times over
- Dark properties (buildings never inspected despite active permits) represent the greatest unaddressed risk
| User | Pain | What They Get |
|---|---|---|
| Fire Prevention Officer | "Which buildings do I inspect this week?" | Ranked inspection priority queue with formula-based explanations |
| Battalion Chief | "Where are my coverage gaps?" | NFPA compliance map by station area |
| Compliance Lead | "Which violations never get closed?" | Compliance funnel: violations → inspections → closures |
| City Executive | "Can I defend our resource allocation?" | Fairness/coverage chart by district |
| Analytics Lead | "Is our model still accurate?" | Drift monitoring dashboard |
- Scale: 1M+ calls-for-service records require distributed Spark processing, not serial pandas
- Geospatial: H3 indexing in Databricks (via Mosaic or h3-py) enables property-to-hex aggregation at city scale
- Governance: Unity Catalog provides lineage from raw Socrata to Gold risk table — essential for a public agency with audit requirements
- ML lifecycle: MLflow experiment tracking, model registry, and champion/challenger comparisons are native
- Orchestration: Databricks Workflows connects Bronze → Silver → Gold → scoring → dashboard refresh as a governed pipeline
- Reusability: Medallion pattern + UC schema can be lifted to any city with open fire data
- Dual-granularity H3 + property entity architecture
- Nearest-neighbor distance features (not just raw counts)
- Spatial cross-validation (not random k-fold — statistically wrong for geographic data)
- Weighted formula primary + MLflow-tracked ML challenger
- PSI-style feature drift monitoring over time windows
- Fairness/coverage analysis by district
- One or two dashboards: incident heatmaps, average response time by district, top violator properties
- A basic XGBoost risk score with random k-fold CV
- No entity resolution across datasets (each dataset analyzed in isolation)
- No Unity Catalog governance, minimal MLflow tracking
- No drift monitoring, no fairness analysis
- No deployment or model registry
- A "pretty map" with random colored dots and no decision layer
- A generic chatbot on top of the data with no evaluation
- Treats this as a reusable "Public Safety Risk Lakehouse" pattern that generalizes to dropout, credit risk, and other cities
- Builds a property-centric "Fire Risk Twin" with per-property risk scores, explanations, and proactive recommendations
- Implements dual-granularity architecture: H3 cells for the map surface, property entities for the inspection queue
- Discovers "dark properties" — buildings never inspected despite active permits or open violations
- Uses NFPA 1710 4-minute travel standard as a concrete, legally-grounded benchmark for response optimization
- Adds spatial cross-validation, nearest-neighbor distance features, and weighted explainable formula
- Deploys a geospatial command center with 5 layered overlays, time slider, and drill-down panel
- Governs everything in Unity Catalog with automatic lineage, access policies, and drift/fairness metrics
-- Surfaces permit expiry risk — properties with expired fire safety permits that have not renewed are a prospective leading indicator not used by any cited prior system; derivable entirely from
bronze.sf_permitswith zero external data
Status: Active, in production since 2010 (currently v3.0) Scale: 330,000 buildings, 17 city agencies, 7,500 risk factors, daily score updates
Architecture:
- Layer 1: Data warehouse pulling nightly ETL from 17 agencies (Dept of Buildings, Finance, Health, DEP, 311, etc.)
- Layer 2: FireCast risk scoring engine — 7,500 variables, proprietary ML, daily updates
- Layer 3: RBIS operational interface — ranked inspection lists, building profiles, 26 inspection types, closed-loop feedback
Key challenges FDNY faced (lessons for us):
| Challenge | Takeaway |
|---|---|
| Data silos across 17 agencies, no shared ID | Our Silver entity resolution layer solves this |
| "Crying wolf" — early models over-predicted risk, inspectors lost trust | Calibrate scores: only ~5-10% of properties should be CRITICAL |
| Cultural resistance from veteran inspectors | Frame as "decision support" not "decision replacement" — show the formula, not just the score |
| Class imbalance (~99% of buildings never had a fire) | Use weighted scoring formula + class_weight='balanced' in ML challenger |
| Model bias on demographic features | Use building-level features only, add fairness/coverage audit |
Key stats for our presentation:
- 330,000 buildings analyzed
- Increased serious violation detection by ~25% over manual scheduling
- 32,000 building inspections/year prioritized by FireCast
Status: Published 2016, open-source MIT, peer-reviewed at KDD 2016 Team: Georgia Tech Data Science for Social Good (DSSG) + Atlanta Fire Rescue Dept NFPA recognition: Highlighted as national best practice for data-driven fire inspections
Key result: 71.36% TPR at 20% FPR, discovered 6,096 previously unknown commercial properties
Three repos:
DSSG-Firebird/property-joins— Python, dual-key address join (proximity + fuzzywuzzy)DSSG-Firebird/risk-model— R, SVM/RF with time-based validationDSSG-Firebird/interactive-map— JS, Mapbox + D3.js
The dual-key join pattern:
Step 1: Geocode all addresses → lat/lon (Google Geocoding API)
Step 2: Spatial filter — only consider records within ~50 meter tolerance
Step 3: Fuzzy name match — fuzzywuzzy string similarity on business names
Step 4: Take best match above threshold → assign canonical PropertyID
Step 5: Flag source presence (in_incidents, in_violations, in_inspections, in_permits)
Step 6: Dark property discovery — permit=True AND inspections=False → never inspected
58 features used: Building age, size, construction materials, building use type (SIC code), historical fire count, inspection results, violation count, business license type, parcel condition, floors, station proximity, socioeconomic indicators, and 46 more engineered features.
Validation: Time-based — train on first 4 years, test on final year, 10 bootstrapped samples
Key lessons:
-
Address matching is hard — ~15% missed even with their approach. Accept imperfection, use district-level fallback.
-
Our fallback strategy: if Levenshtein join confidence < 0.7 threshold, fall back to H3 cell-level district aggregation. District-level scores (33 districts in SF) still power all four deliverables and lose only the per-property ranking precision. Log match quality as a Silver metadata column:
match_confidence(0.0–1.0) so judges can see the join quality directly. -
SVM = accurate but black box. Inspectors wanted to know WHY. Weighted formula is better for adoption.
-
Missing data: convert NAs to explicit "NA" category for categorical, impute median for continuous — never drop.
Status: Active, maintained by City of Pittsburgh, CMU Metro21 partnership
GitHub: CityofPittsburgh/fire_risk_analysis
Full pipeline in one shell script:
getdata.py → riskmodel.py → merger.py → R Shiny Dashboard
What Pittsburgh maps to our project:
getdata.py→ Our Bronze ingest notebooksriskmodel.py→ Our MLflow experiment + Gold risk scoringmerger.py→ Our Gold table joins- R Shiny Dashboard → Our geospatial command center
Status: Published 2020, educational/academic workshop by Ken Steif (UPenn)
GitHub: urbanSpatial/PredictingFireRisk
Critical: Uses San Francisco Fire Incidents (wr8u-xric) — the EXACT same dataset as ours
Key innovations we steal:
Instead of only property-level modeling, the SF system creates 500ft × 500ft grid cells (we use H3 hexagons instead). This gives a continuous risk surface — even areas with no known properties still have a risk level based on surrounding evidence.
vacants.nn → Average distance to 3 nearest vacant parcels
graf.nn → Average distance to 5 nearest graffiti reports
Why better: A property with zero violations next to a cluster of highly-violated buildings is still at risk. Raw count within buffer misses this. Distance to nearest hazard captures area-level exposure.
In our implementation:
nearest_violation_h3_dist → Distance to nearest H3 cell with open violations
nearest_incident_h3_dist → Distance to nearest H3 cell with fire incidents
"Fire risk is greater than actual fires observed."
- Properties never inspected have lower apparent incident rates even if their structural risk is high
- Standard supervised learning systematically underrates dark properties
- Our weighted formula + dark-property penalty corrects for this observation bias
- This framing immediately impresses ML-savvy judges
- Standard k-fold CV is statistically wrong for spatial data (spatially autocorrelated neighbors leak information)
- Use neighborhood-based folds — train on some neighborhoods, validate on others
- Even mentioning this scores major points with ML-engineer judges
SF system used 311 graffiti reports as blight proxies. We don't need external data — our five datasets contain better, fire-specific blight signals:
- Open violation density by area
- Low permit activity vs high incident density (ratio)
- High call-to-incident ratio (nuisance/deterioration signal)
- Inspection recency gaps
GitHub: areevesman/SFFD-response-times
Language: Python + Apache Spark
Uses: SFFD Calls for Service (nuek-vuh3) — SAME dataset as ours
Key finding: Response time ≠ just distance. Traffic, time of day, and simultaneous call volume matter MORE than distance.
What we borrow:
- Timestamp parsing logic for SFFD call data
- Response time calculation:
total_response_min = (on_scene_ts - received_ts) - Travel time specifically:
travel_time_min = (on_scene_ts - dispatch_ts)— compare to NFPA 4-minute standard - Finding: time-of-day and concurrent call volume are the biggest compliance risk factors
| Component | Benchmark | Notes |
|---|---|---|
| Travel Time (1st Engine) | 240 seconds (4 minutes) | Compare on_scene_ts - dispatch_ts |
| Full Alarm Assignment | 480 seconds (8 minutes) | Full response complement |
| Compliance measurement | 90th percentile | Not average — 90th percentile compliance |
Formula:
travel_time_min = (on_scene_dttm - dispatch_dttm).total_seconds() / 60.0
nfpa_pass = (travel_time_min <= 4.0)
nfpa_compliance_rate = nfpa_pass.mean() # % of calls meeting 4-min standardThis transforms "response optimization" from a vague concept into a concrete, legally-grounded, auditable metric.
| Firebird Component | Our Databricks Equivalent | Why Better |
|---|---|---|
address_to_xy_geocoding.py (serial Python) |
Silver transform: PySpark address normalization at scale | Distributed, 100x faster |
fuzzywuzzy dual-key join |
PySpark levenshtein() join with H3 spatial bucketing |
Distributed, governed in UC |
| Flat CSV property list | silver.property_entity_master Delta table |
Versioned, lineaged, queryable |
Property_short_list_generator.py |
gold.property_risk_twin |
SHAP factors added, UC governed |
SVM.R with time-based validation |
MLflow experiment: XGBoost + spatial CV | Tracked, reproducible, champion/challenger |
| Manual 1-10 score mapping by visual inspection | PR-curve threshold optimization | Data-driven, not eyeballed |
| Risk score CSV output | Gold table + optional Model Serving endpoint | SQL-queryable, REST-servable |
| Static Leaflet/GIS map | H3-backed geospatial command center | Live, layered, temporal playback |
| No fairness analysis | Explicit fairness Gold table: risk vs coverage by district | Production-grade trust layer |
- No medallion architecture — flat CSVs and R scripts. We have Bronze → Silver → Gold with Delta, DLT, Workflows.
- No model governance — no registry, no versioning, no champion/challenger. We have MLflow + UC Model Registry.
- No drift detection — static model built once. We add PSI-style drift metrics tracked over time.
- No fairness analysis — paper acknowledged bias concerns but didn't instrument them. We have a Fairness Gold table.
- No lineage or audit trail — no governance at all. We get UC automatic lineage.
- No explanations — output was a score, no per-property reasons. We add formula decomposition + SHAP factors.
- No H3 indexing — used lat/lon points only. We build H3 hex cells for scalable, layered rendering.
- No dual granularity — only property-level. We have H3 macro surface + property micro layer simultaneously.
- No NFPA benchmark — no response time analysis at all. We compute NFPA 1710 compliance per station area.
- No cross-dataset dark property discovery — their join found new properties. We do the same for SF but flag dark properties by type.
Atlanta: 6,096 properties discovered that AFRD didn't know about. Pittsburgh: Similar gaps found. San Francisco: We expect hundreds to thousands of "dark properties."
Four types of dark properties we will discover:
# Type 1: Has active permit, never inspected
dark_type_1 = properties where in_permits=True AND total_inspections=0
# Type 2: Has open violations, no follow-up inspection
dark_type_2 = properties where open_violation_count > 0 AND inspections_after_violation=0
# Type 3: Has incident history, no permit on file
dark_type_3 = properties where incident_count > 0 AND in_permits=False
# Type 4: Appears in 3+ datasets but has never been flagged for inspection
dark_type_4 = properties where source_count >= 3 AND inspection_priority_rank IS NULLThe demo moment: "These 600+ buildings have active business licenses and zero inspections in five years. The department doesn't even know they exist in their inspection queue."
The single biggest architectural decision that separates us from all prior systems:
We run two parallel granularities simultaneously:
| Layer | Granularity | Unit | Primary Purpose | Databricks Table |
|---|---|---|---|---|
| Macro | H3 Resolution 9 (~175m hex) | H3 cell | Map surface, hotspot detection, NFPA coverage analysis | gold.h3_risk_surface |
| Micro | Property entity | Canonical address/building | Inspection priority queue, action card, dark property flag | gold.property_risk_twin |
Why this wins:
- Property level alone: can't render a smooth risk surface on a map (too many points, irregular shapes)
- H3 level alone: can't give an inspector a specific building address with specific reasoning
- Both together: the map is the navigation, the property card is the action. This is exactly what operational systems like FDNY RBIS do.
SF Open Data (Socrata)
5 datasets × ~500K-1M total records
│
▼
┌─────────────────────────────────────────────┐
│ BRONZE LAYER │
│ sf_fire.bronze.* │
│ │
│ bronze.sf_incidents (250K records) │
│ bronze.sf_calls (1M+ records) │
│ bronze.sf_violations (150K records) │
│ bronze.sf_inspections (80K records) │
│ bronze.sf_permits (50K records) │
│ │
│ All: Delta managed tables, append-only │
│ Partitioned by date field │
│ Unity Catalog governed │
│ Auto Loader or batch Spark ingest │
└──────────────────┬──────────────────────────┘
│
▼
┌─────────────────────────────────────────────┐
│ SILVER LAYER │
│ sf_fire.silver.* │
│ │
│ silver.address_entity_master │
│ → Canonical PropertyID via dual-key │
│ join (H3 spatial + Levenshtein name) │
│ → Source presence flags for 5 datasets │
│ → lat/lon, h3_index (resolution 9) │
│ → dark_property_type (1-4) │
│ │
│ silver.property_features │
│ → Per-property: incident_count_3yr, │
│ call_frequency_12mo, violation_count, │
│ inspection_failure_rate, │
│ days_since_last_inspection, │
│ permit_type_risk_tier, │
│ source_count (1-5 confidence) │
│ │
│ silver.h3_features │
│ → Per H3 cell: fire_count_3yr, │
│ call_density_12mo, open_violations, │
│ nearest_violation_h3_dist, ← NEW │
│ nearest_incident_h3_dist, ← NEW │
│ local_moran_i (spatial autocorr) │
│ mean_days_since_inspection │
│ │
│ silver.response_performance │
│ → Per call: travel_time_min, │
│ nfpa_pass (≤4min), time_of_day, │
│ concurrent_calls, station_id │
│ → Per station: compliance_rate, │
│ p90_travel_time, avg_travel_time │
└──────────────────┬──────────────────────────┘
│
▼
┌─────────────────────────────────────────────┐
│ GOLD LAYER │
│ sf_fire.gold.* │
│ │
│ gold.property_risk_twin ← CORE │
│ gold.h3_risk_surface ← NEW │
│ gold.dark_property_discovery ← NEW │
│ gold.nfpa_response_compliance ← NEW │
│ gold.inspection_priority_queue │
│ gold.compliance_funnel │
│ gold.fairness_coverage │
│ gold.model_health_drift │
└──────────────────┬──────────────────────────┘
│
▼
┌─────────────────────────────────────────────┐
│ ML + SERVING LAYER │
│ │
│ MLflow Experiments: │
│ → Formula baseline (interpretable) │
│ → RF/XGBoost challenger model │
│ → Spatial CV validation metrics │
│ → SHAP feature importance │
│ │
│ UC Model Registry: │
│ → Champion: weighted formula │
│ → Challenger: XGBoost │
│ │
│ Model Serving (optional): │
│ → score_property(address_id) endpoint │
│ → explain_property(address_id) endpoint │
└──────────────────┬──────────────────────────┘
│
▼
┌─────────────────────────────────────────────┐
│ ORCHESTRATION + GOVERNANCE │
│ │
│ Databricks Workflows: │
│ → Weekly pipeline: Bronze→Silver→Gold │
│ → Drift monitoring refresh │
│ → Dashboard dataset refresh │
│ │
│ Unity Catalog: │
│ → All tables governed under │
│ sf_fire_prod.{bronze,silver,gold} │
│ → Automatic lineage from Socrata │
│ to Gold risk tables │
│ → Row/column-level access control │
│ │
│ Monitoring: │
│ → gold.model_health_drift │
│ → Alerts on feature drift / data lag │
└──────────────────┬──────────────────────────┘
│
▼
┌─────────────────────────────────────────────┐
│ GEOSPATIAL COMMAND CENTER UI │
│ │
│ Full-screen SF map (Mapbox / deck.gl) │
│ Backed by Gold tables │
│ 5 layered overlays │
│ Property drill-down panel │
│ NFPA compliance overlay │
│ Dark property discovery layer │
│ Time slider + drift view │
│ Trust/health strip │
└─────────────────────────────────────────────┘
# Catalog + schema structure
spark.sql("CREATE CATALOG IF NOT EXISTS sf_fire_prod")
spark.sql("CREATE SCHEMA IF NOT EXISTS sf_fire_prod.bronze")
spark.sql("CREATE SCHEMA IF NOT EXISTS sf_fire_prod.silver")
spark.sql("CREATE SCHEMA IF NOT EXISTS sf_fire_prod.gold")
spark.sql("CREATE SCHEMA IF NOT EXISTS sf_fire_prod.ml")All tables created as sf_fire_prod.{layer}.{table_name} — UC-managed Delta tables with automatic lineage.
# Auto Loader for incremental ingest (or batch for hackathon)
from pyspark.sql.functions import current_timestamp
df = (spark.read
.format("csv")
.option("header", "true")
.option("inferSchema", "true")
.load("/Volumes/sf_fire_prod/landing/sf_incidents/")
.withColumn("_ingest_timestamp", current_timestamp()))
df.write.format("delta") \
.mode("append") \
.partitionBy("incident_date") \
.saveAsTable("sf_fire_prod.bronze.sf_incidents")from pyspark.sql.functions import levenshtein, greatest, length, col, row_number
from pyspark.sql.window import Window
import h3
# Step 1: Add H3 index to all datasets (resolution 9 = ~175m cells)
h3_udf = udf(lambda lat, lon: h3.geo_to_h3(lat, lon, 9), StringType())
incidents_h3 = incidents_silver.withColumn("h3_index", h3_udf(col("latitude"), col("longitude")))
violations_h3 = violations_silver.withColumn("h3_index", h3_udf(col("address_lat"), col("address_lon")))
# ... same for all 5 datasets
# Step 2: Spatial bucket join — only consider rows in same H3 cell
cross = incidents_h3.join(permits_h3, on="h3_index")
# Step 3: Fuzzy name similarity
cross = cross.withColumn("name_sim",
1 - levenshtein(col("business_name_a"), col("business_name_b")) /
greatest(length(col("business_name_a")), length(col("business_name_b"))))
# Step 4: Best match per entity
w = Window.partitionBy("incident_entity_id").orderBy(col("name_sim").desc())
best = cross.withColumn("rank", row_number().over(w)).filter(col("rank") == 1)# For each H3 cell, find distance to 3 nearest cells with open violations
# This is the key innovation from PredictingFireRisk
from pyspark.sql.functions import pandas_udf
import h3
@pandas_udf("double")
def nearest_violation_h3_dist(h3_indices: pd.Series, violation_h3_set: pd.Series) -> pd.Series:
"""Compute average distance to k nearest H3 cells with violations."""
violation_cells = set(violation_h3_set)
k = 3
distances = []
for h3_idx in h3_indices:
rings = []
for ring in range(1, 15): # expand rings outward
neighbors = h3.k_ring(h3_idx, ring) - h3.k_ring(h3_idx, ring-1)
found = neighbors & violation_cells
if found:
rings.extend([ring] * len(found))
if len(rings) >= k:
break
distances.append(sum(rings[:k]) / k if rings else 15.0)
return pd.Series(distances)from pyspark.sql.functions import col, unix_timestamp, when
calls = spark.table("sf_fire_prod.silver.sf_calls")
response_df = calls.withColumn(
"travel_time_min",
(unix_timestamp("on_scene_dttm") - unix_timestamp("dispatch_dttm")) / 60.0
).withColumn(
"total_response_min",
(unix_timestamp("on_scene_dttm") - unix_timestamp("received_dttm")) / 60.0
).withColumn(
"nfpa_pass",
when(col("travel_time_min") <= 4.0, 1).otherwise(0)
).withColumn(
"dispatch_delay_min",
(unix_timestamp("dispatch_dttm") - unix_timestamp("received_dttm")) / 60.0
)
# Per-station NFPA compliance (90th percentile — NFPA measures at P90)
station_compliance = response_df.groupBy("station_area").agg(
F.avg("travel_time_min").alias("avg_travel_time_min"),
F.expr("percentile(travel_time_min, 0.90)").alias("p90_travel_time_min"),
F.avg("nfpa_pass").alias("nfpa_compliance_rate"),
F.count("*").alias("total_calls")
)import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
mlflow.set_experiment("/sf-fire/risk-model-challenger")
with mlflow.start_run(run_name="rf_spatial_cv_v1"):
mlflow.log_param("model_type", "RandomForestClassifier")
mlflow.log_param("n_estimators", 200)
mlflow.log_param("validation", "spatial_neighborhood_cv")
mlflow.log_param("temporal_split", "train:2016-2022 test:2023")
model = RandomForestClassifier(n_estimators=200, class_weight='balanced')
# CRITICAL: Use spatial CV, not random k-fold
# Split by neighborhood to prevent spatial data leakage
cv_scores = spatial_cross_val_score(model, X_train, y_train, neighborhoods)
mlflow.log_metric("cv_auc_mean", cv_scores.mean())
mlflow.log_metric("cv_auc_std", cv_scores.std())
mlflow.log_metric("temporal_test_auc", test_auc)
mlflow.log_metric("tpr_at_20pct_fpr", tpr)
# Log SHAP values
import shap
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)
mlflow.log_artifact("shap_summary.png")
# Register model in UC Model Registry
mlflow.sklearn.log_model(
model,
"fire_risk_challenger",
registered_model_name="sf_fire_prod.ml.fire_risk_model"
)# Workflow: sf_fire_weekly_refresh
tasks:
- task_key: bronze_ingest
notebook_path: /notebooks/01_bronze_ingest
- task_key: silver_entity_resolution
depends_on: [bronze_ingest]
notebook_path: /notebooks/02_silver_entity_resolution
- task_key: silver_h3_features
depends_on: [silver_entity_resolution]
notebook_path: /notebooks/03_silver_h3_features
- task_key: silver_response_performance
depends_on: [bronze_ingest]
notebook_path: /notebooks/04_silver_response_performance
- task_key: gold_risk_twin
depends_on: [silver_entity_resolution, silver_h3_features]
notebook_path: /notebooks/05_gold_risk_twin
- task_key: gold_h3_surface
depends_on: [silver_h3_features]
notebook_path: /notebooks/06_gold_h3_surface
- task_key: gold_nfpa_compliance
depends_on: [silver_response_performance]
notebook_path: /notebooks/07_gold_nfpa_compliance
- task_key: gold_dark_properties
depends_on: [gold_risk_twin]
notebook_path: /notebooks/08_gold_dark_properties
- task_key: gold_model_health
depends_on: [gold_risk_twin, gold_h3_surface]
notebook_path: /notebooks/09_gold_model_health_driftCREATE TABLE sf_fire_prod.gold.property_risk_twin (
property_id STRING, -- Canonical entity ID from Silver join
canonical_address STRING, -- Normalized address
latitude DOUBLE,
longitude DOUBLE,
h3_index STRING, -- H3 resolution 9 cell membership
district STRING,
neighborhood STRING,
-- Primary Score (Weighted Formula)
weighted_risk_score DOUBLE, -- 0-100 composite score
risk_tier STRING, -- LOW / MEDIUM / HIGH / CRITICAL
-- Formula Decomposition (Explainability)
violation_score DOUBLE, -- Contribution from violations (0-30)
incident_score DOUBLE, -- Contribution from incidents (0-25)
inspection_score DOUBLE, -- Contribution from inspection gaps (0-25)
call_score DOUBLE, -- Contribution from call frequency (0-10)
permit_type_score DOUBLE, -- Contribution from permit type (0-10)
-- Secondary Score (MLflow Challenger)
ml_risk_probability DOUBLE, -- XGBoost probability 0-1
ml_risk_tier STRING, -- Challenger tier
-- SHAP Explanations
top_factor_1 STRING, -- "2 structure fires in 3 years"
top_factor_2 STRING, -- "3 open violations unresolved"
top_factor_3 STRING, -- "847 days since last inspection"
-- Operational Fields
open_violation_count INT,
severe_violation_count INT,
days_since_inspection INT,
total_inspections INT,
inspection_pass_rate DOUBLE,
incident_count_3yr INT,
permit_type STRING,
permit_type_risk_tier STRING,
-- Dark Property Flags
is_dark_property BOOLEAN, -- Never been inspected
dark_property_type INT, -- 1-4 (see discovery types above)
-- Action Layer
recommended_action STRING, -- "Priority Inspection" / "Escalate Violation" / "Verify Permit"
inspection_priority_rank INT, -- Rank within district
-- Source Confidence
source_count INT, -- How many of 5 datasets contain this entity (1-5)
-- Metadata
score_computed_at TIMESTAMP,
model_version STRING
)
USING DELTA
PARTITIONED BY (district)CREATE TABLE sf_fire_prod.gold.h3_risk_surface (
h3_index STRING, -- H3 resolution 9
h3_center_lat DOUBLE,
h3_center_lon DOUBLE,
district STRING,
neighborhood STRING,
-- Aggregated Risk Metrics
weighted_risk_score_avg DOUBLE, -- Avg property risk in cell
max_property_risk DOUBLE, -- Max risk in cell (worst case)
critical_property_count INT, -- Properties with risk_tier=CRITICAL
high_risk_property_count INT,
-- Incident Signal
fire_count_3yr INT,
fire_count_1yr INT,
call_density_12mo DOUBLE,
-- Compliance Signal
open_violation_count INT,
severe_violation_count INT,
mean_days_since_inspection DOUBLE,
-- Nearest Neighbor Distance Features (PredictingFireRisk pattern)
nearest_violation_h3_dist INT, -- Rings to nearest violation cell
nearest_incident_h3_dist INT, -- Rings to nearest incident cell
-- Spatial Autocorrelation
local_moran_i DOUBLE, -- Hotspot indicator
-- NFPA Coverage
nfpa_compliant_pct DOUBLE, -- % of calls in cell meeting 4-min standard
station_coverage_gap BOOLEAN, -- TRUE if beyond 4-min from nearest station
nearest_station_id STRING,
-- Combined Crisis Score (HIGH RISK + NFPA FAIL = crisis zone)
is_crisis_zone BOOLEAN,
-- Temporal Snapshots (for time slider)
snapshot_month DATE,
-- Metadata
computed_at TIMESTAMP
)
USING DELTA
PARTITIONED BY (snapshot_month)CREATE TABLE sf_fire_prod.gold.dark_property_discovery (
property_id STRING,
canonical_address STRING,
latitude DOUBLE,
longitude DOUBLE,
h3_index STRING,
district STRING,
dark_property_type INT, -- 1: permit+no inspection, 2: violations+no followup
-- 3: incidents+no permit, 4: 3+ sources+never queued
dark_reason STRING, -- Human-readable explanation
weighted_risk_score DOUBLE, -- Despite being dark, their risk score
risk_tier STRING,
years_without_inspection INT,
open_violation_count INT,
incident_count_3yr INT,
source_count INT,
permit_type STRING,
business_type STRING,
recommended_action STRING,
urgency_rank INT
)
USING DELTACREATE TABLE sf_fire_prod.gold.nfpa_response_compliance (
station_id STRING,
station_area STRING,
district STRING,
station_lat DOUBLE,
station_lon DOUBLE,
-- NFPA 1710 Compliance (90th percentile standard)
total_calls INT,
nfpa_compliance_rate DOUBLE, -- % calls with travel_time ≤ 4.0 min
p90_travel_time_min DOUBLE, -- 90th percentile travel time
avg_travel_time_min DOUBLE,
avg_dispatch_delay_min DOUBLE,
-- Time-of-Day Breakdown
peak_hour_compliance DOUBLE, -- 7am-7pm compliance rate
offpeak_compliance DOUBLE,
-- Call Volume Effect
low_volume_compliance DOUBLE, -- Compliance when few concurrent calls
high_volume_compliance DOUBLE, -- Compliance during multi-call periods
-- Risk Exposure in Coverage Area
high_risk_properties_count INT,
critical_properties_count INT,
dark_properties_count INT,
h3_cells_outside_nfpa INT, -- H3 cells with high risk beyond 4-min
-- Snapshot
analysis_period_start DATE,
analysis_period_end DATE,
computed_at TIMESTAMP
)
USING DELTACREATE TABLE sf_fire_prod.gold.compliance_funnel (
district STRING,
neighborhood STRING,
risk_tier STRING,
total_violations INT,
open_violations INT,
closed_violations INT,
violations_inspected INT, -- Violations that triggered an inspection
violations_uninspected INT, -- Violations with no follow-up inspection
avg_days_to_closure DOUBLE,
median_days_to_closure DOUBLE,
closure_rate DOUBLE,
severe_open_violations INT,
repeat_offender_properties INT, -- Properties with 3+ violations
snapshot_month DATE,
computed_at TIMESTAMP
)
USING DELTA
PARTITIONED BY (district)CREATE TABLE sf_fire_prod.gold.fairness_coverage (
district STRING,
neighborhood STRING,
-- Risk-side
avg_risk_score DOUBLE,
high_risk_property_count INT,
critical_property_count INT,
expected_inspections_by_risk INT, -- Proportional to risk share
-- Inspection-side
actual_inspections INT,
inspection_coverage_rate DOUBLE,
-- Coverage gap
coverage_gap DOUBLE, -- actual - expected (negative = under-inspected)
coverage_gap_pct DOUBLE, -- % under/over relative to risk
under_inspected BOOLEAN,
over_inspected BOOLEAN,
-- Trust flags
fairness_flag STRING, -- "UNDER_SERVED" / "OVER_SERVED" / "BALANCED"
computed_at TIMESTAMP
)
USING DELTACREATE TABLE sf_fire_prod.gold.model_health_drift (
feature_name STRING,
snapshot_month DATE,
-- Distribution stats
mean_value DOUBLE,
p25_value DOUBLE,
p50_value DOUBLE,
p75_value DOUBLE,
p95_value DOUBLE,
-- PSI-style drift metric (vs baseline period)
psi_score DOUBLE, -- Population Stability Index
drift_status STRING, -- "STABLE" (PSI<0.1) / "MONITOR" / "DRIFT"
-- Model performance (where labels available)
auc_monthly DOUBLE,
calibration_error DOUBLE,
-- Data freshness
data_lag_days INT, -- How stale is our Bronze data?
model_version STRING,
computed_at TIMESTAMP
)
USING DELTA
PARTITIONED BY (feature_name)Why formula-first, not ML-first:
- FDNY's biggest challenge: "When everything is 'high risk,' inspectors get alarm fatigue and stop trusting the system"
- Cultural resistance: "I know my district better than any algorithm"
- Solution adopted by all successful real-world systems: interpretable primary output + ML as validation
The formula:
def compute_fire_risk_score(row):
# Component 1: Violation Score (0-30 points)
violation_score = min(
row['severe_violation_count'] * 10 +
row['open_violation_count'] * 4 +
(10 if row['violations_uninspected'] > 0 else 0),
30
)
# Component 2: Incident Score (0-25 points)
incident_score = min(
row['structure_fire_count_3yr'] * 10 +
row['other_incident_count_3yr'] * 2,
25
)
# Component 3: Inspection Score (0-25 points)
inspection_recency = min(row['days_since_last_inspection'] / 365 * 8, 15)
inspection_failure = row['inspection_failure_count'] * 5
inspection_score = min(inspection_recency + inspection_failure, 25)
# Component 4: Call Frequency Score (0-10 points)
call_score = min(row['call_frequency_12mo'] / 5 * 10, 10)
# Component 5: Permit Type Risk (0-10 points)
permit_type_lookup = {
'Assembly/Restaurant': 10, 'Industrial/Warehouse': 9,
'Mixed Use': 7, 'Retail': 5, 'Office': 3, 'Residential': 2
}
permit_score = permit_type_lookup.get(row['permit_type'], 5)
# Component 6: Permit Expiry Risk (0-8 points) — NEW SIGNAL
permit_expired = (row['permit_status'] == 'Expired') and (row['days_since_permit_expiry'] > 0)
permit_expiry_score = min(row['days_since_permit_expiry'] / 180 * 8, 8) if permit_expired else 0
# Dark property penalty — addresses latent risk undercount
dark_penalty = 8 if row['is_dark_property'] else 0
total = violation_score + incident_score + inspection_score + call_score + permit_score + dark_penalty
return min(total, 100)
# Tier mapping
def risk_tier(score):
if score >= 81: return "CRITICAL"
elif score >= 61: return "HIGH"
elif score >= 31: return "MEDIUM"
else: return "LOW"A static score ignores that risk grows with time. We apply an exponential decay to the inspection recency component: a building last inspected 1 year ago scores moderately; one last inspected 3 years ago scores near-maximum on that component. This is not modeled by any cited prior system.
import math
def inspection_recency_decay(days_since_inspection: int, max_score: int = 15) -> float:
"""
Exponential decay: score grows toward max_score as days increase.
Half-max reached at ~365 days, near-max at ~730 days.
"""
lambda_decay = 0.0019 # tunable: controls how fast score rises
return max_score * (1 - math.exp(-lambda_decay * days_since_inspection))This replaces the linear days_since_last_inspection / 365 * 8 term in Component 3 with a more realistic decay curve. Log lambda_decay as an MLflow parameter so it's tunable without code changes.
Formula output in the demo:
"Property at 123 Mission St scored 82/100 — CRITICAL because:
- 2 structure fires in 3 years (+20 pts)
- 3 open severe violations unresolved (+25 pts)
- 847 days since last inspection (+14 pts)
- Restaurant permit type (+10 pts)
- Dark property — never in inspection queue (+8 pts)
- High call frequency in area (+5 pts)"
# Temporal split — train on earlier years, test on most recent
# CRITICAL: Spatial cross-validation, NOT random k-fold
# Spatial CV prevents leakage from spatially autocorrelated neighbors
from sklearn.model_selection import GroupKFold
# Groups = neighborhoods — ensure no neighborhood appears in both train and test
gkf = GroupKFold(n_splits=5)
neighborhood_groups = X_train['neighborhood'].values
cv_scores = cross_val_score(
xgb_model, X_train, y_train,
groups=neighborhood_groups,
cv=gkf,
scoring='roc_auc'
)
# SHAP for feature importance → feeds back into Gold table
explainer = shap.TreeExplainer(xgb_model)
shap_values = explainer.shap_values(X_test)
# Top 3 features per property stored in gold.property_risk_twin.top_factor_*MLflow Experiment: sf_fire_prod/risk-model-comparison
Run 1: "formula_baseline"
→ AUC: computed by ranking formula scores and evaluating against labels
→ TPR at 20% FPR: target > 60%
→ Explainability: FULL
→ Latency: ~0ms (pure formula)
Run 2: "xgboost_spatial_cv"
→ AUC: expected 0.72-0.78
→ TPR at 20% FPR: target > 65% (Firebird achieved 71%)
→ Explainability: SHAP-based, post-hoc
→ Spatial CV AUC: ~0.68-0.74 (lower than random CV = honest)
Registered in Model Registry:
Champion: formula_baseline (production)
Challenger: xgboost_spatial_cv (staging)
Not a "world map." Not a generic dashboard with a map widget. A city-scale operational cockpit where the map is the primary canvas and every layer maps to one decision.
Dark theme, full-screen, command center aesthetic. This is the "world monitor style" but scoped to San Francisco.
┌────────────────────────────────────────────────────────────────────┐
│ 🔥 FireScope SF [LAYERS] [TIME: ▶ 2024-01] [⚠ DRIFT OK] │ ← Top bar
│─────────────────────────────────────────────────────────────────── │
│ │ │
│ LEFT SIDEBAR │ MAIN MAP CANVAS │
│ ────────────── │ (Full height SF map) │
│ LAYER TOGGLES: │ │
│ ☑ Risk Surface (H3 hex) │ H3 hex heatmap │
│ ☑ Dark Properties │ (red = CRITICAL) │
│ ☑ NFPA Coverage Gaps │ │
│ ☐ Incident Points │ Station coverage halos │
│ ☐ Violation Severity │ │
│ │ Dark property markers │
│ FILTERS: │ (pulsing red dots) │
│ District: [All ▼] │ │
│ Risk Tier: [All ▼] │ NFPA gap zones │
│ Property Type: [All ▼] │ (hatched overlay) │
│ │ │
│ SCENARIO: │ │
│ ☐ What-if: Relocate Station │ │
│ │ │
│────────────────────────────────────────│───────────────────────────│
│ RIGHT PANEL (on click) │
│ ───────────────────── │
│ [Property/District card appears here] │
│ Risk score decomposition, open violations, │
│ days since last inspection, NFPA status, │
│ recommended action button │
│────────────────────────────────────────────────────────────────────│
│ BOTTOM STRIP │
│ Timeline: ◄ 2022 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2024 ► │
│ KPIs: Critical Properties: 847 | NFPA Gaps: 23 cells | Dark: 634 │
└────────────────────────────────────────────────────────────────────┘
| Layer | Data Source | Decision It Enables |
|---|---|---|
| Risk Surface (H3 hex heatmap) | gold.h3_risk_surface |
Where to focus prevention resources |
| Dark Properties (pulsing red markers) | gold.dark_property_discovery |
Which buildings to add to inspection queue immediately |
| NFPA Coverage Gaps (hatched zones) | gold.nfpa_response_compliance |
Where to consider station/unit redeployment |
| Incident Points (historical dots) | bronze.sf_incidents |
Where fires actually happened — model validation |
| Violation Severity (colored rings) | gold.property_risk_twin |
Where compliance enforcement is most urgent |
Rule: Five layers maximum. More layers = cognitive overload = weak team signal.
┌─────────────────────────────────────┐
│ 123 Mission St, SoMa District │
│ ████████████████████░░ 82/100 │
│ 🔴 CRITICAL │
│─────────────────────────────────────│
│ Why CRITICAL: │
│ - 2 structure fires (2021, 2023) │
│ - 3 open severe violations │
│ - 847 days since last inspection │
│ - Restaurant permit (high risk) │
│ - ⚠ DARK PROPERTY │
│─────────────────────────────────────│
│ NFPA Status: ⚠ 4.2 min (FAIL) │
│ Nearest Station: Station 7 │
│─────────────────────────────────────│
│ [ 🔍 Priority Inspect ] │
│ [ 📋 View Full History ] │
└─────────────────────────────────────┘
Backend: Databricks Gold tables (all data)
Map rendering: Mapbox GL JS or deck.gl with H3 layer
Property cards: JavaScript/React or simple HTML/CSS served from static build
Data access: Either JDBC to Databricks SQL or pre-exported GeoJSON from Gold tables for hackathon speed
H3 rendering: deck.gl H3HexagonLayer for the risk surface
Kepler.gl alternative: Kepler (open-source, Uber) can load H3 data directly — fastest path for hackathon
"Right now, San Francisco has hundreds of high-risk commercial buildings that no inspector has ever visited — and this command center shows exactly which ones they are, why they're dangerous, and where to send someone first."
- Open H3 risk surface layer
- Show city-wide heatmap: red clusters in SoMa, Tenderloin, Mission
- Overlay NFPA coverage gaps: "These hatched zones are high-risk AND beyond 4-minute response coverage — the crisis zones"
- Animate time slider: "Risk has been building in this cluster over the past 18 months"
What this shows judges: H3 geospatial architecture, temporal analysis, NFPA benchmark integration
- Toggle on Dark Properties layer
- 600+ pulsing red markers appear in unexpected locations
- "These buildings have active business licenses or open violations — and zero inspections in three years. The department doesn't have them in any inspection queue."
- Show the count: "Our entity resolution across all five datasets surfaced 634 previously uncrossed properties"
What this shows judges: Cross-dataset entity resolution, property discovery (Firebird-validated methodology), operational impact
- Click one dark, high-risk property near a NFPA gap zone
- Action card appears: 82/100 CRITICAL, formula breakdown visible
- "This building is #3 on our priority list for next week — Restaurant, 847 days without inspection, 3 open violations, 4.2-minute response time from Station 7 (FAILS NFPA standard)"
- Hit "Priority Inspect" button → shows what the updated inspection queue looks like
- Brief flip to drift/fairness panel: "We're not targeting any district unfairly — inspection allocation aligns with risk share"
What this shows judges: Explainability, actionability, fairness governance, end-to-end platform
"If SFFD had used this engine last year, they would have had 634 additional high-risk properties in their inspection queue — properties the current system had no knowledge of. Any city with open fire data can stand this up on Databricks in days, not years. That's the platform argument."
If live demo fails at any point:
- Pre-recorded 60-second screen capture of all three clicks
- Static screenshots of each panel embedded in the deck
- Databricks SQL dashboard as backup for Gold table metrics
- All Gold tables are pre-computed — worst case, open a notebook and run
display(spark.table("gold.property_risk_twin").orderBy("weighted_risk_score", ascending=False).limit(20))
| Judge Type | What They Care About | What to Emphasize | What to Avoid |
|---|---|---|---|
| ML Engineer | Correct modeling choices, validation rigor | Spatial CV (not random k-fold), latent risk framing, SHAP, formula vs. ML tradeoff | Overclaiming accuracy, random splits for spatial data |
| Data Engineer | Platform depth, architecture quality | Medallion layers, UC lineage, Workflows orchestration, Auto Loader | Flat files, no governance, no orchestration |
| Enterprise/Platform Judge | Production readiness, scalability, generalizability | UC governance, model registry, drift monitoring, multi-city pattern | Prototype-only framing, missing governance |
| Product/Business Judge | User value, ROI, clarity of impact | Dark property discovery demo, NFPA compliance gaps, formula breakdown, $ROI estimate | Technical architecture without business story |
| Domain Expert | Fire safety accuracy, NFPA compliance, real-world feasibility | NFPA 1710 standard, Firebird/FDNY citations, latent risk concept | Fake precision, overclaiming prediction accuracy |
| # | Objection | Rebuttal | Fatal? |
|---|---|---|---|
| 1 | "This is just a fancy dashboard" | Show MLflow registry, UC lineage, spatial CV validation, formula decomposition — the dashboard is 5% of the system | Fatal if not addressed — must show platform depth |
| 2 | "Why Databricks? Any database could do this" | H3 distributed spatial indexing, UC lineage, MLflow challenger framework, Workflows orchestration — all native | Fatal if not addressed |
| 3 | "Your risk labels are weak — you can't predict fires reliably" | We rank, not predict. Formula output is interpretable inspection priority, not fire probability. Firebird achieved 71% TPR at KDD 2016 with same approach. | Manageable — acknowledge and contextualize |
| 4 | "What about response optimization? You mostly did prevention" | NFPA compliance panel + crisis zone (high risk + NFPA fail) IS the response optimization layer — it's specific, benchmarked, and actionable | Manageable with strong NFPA story |
| 5 | "This could be biased against certain neighborhoods" | Show fairness/coverage Gold table: inspection allocation aligns with risk share by district, no structural over/under | Manageable with one fairness chart |
| 6 | "Your model will drift over time" | Show drift metrics Gold table, PSI scores, model freshness timestamp in health strip, champion/challenger framework ready for retraining | Manageable — show it's designed for this |
| 7 | "Address matching is too messy — your joins are unreliable" | Acknowledge it proactively. Show match quality metrics. Use conservative thresholds. Show source_count as confidence signal. |
Manageable — being honest scores more than pretending it's clean |
| 8 | "How does this generalize to other cities?" | Same medallion pattern, same Gold schema, swap Bronze sources. Show Pittsburgh and Atlanta as peer-validated implementations. | Manageable with one "Future Work" slide |
| 9 | "You're overcounting dark properties" | Show the discovery logic explicitly — only flagging clear cases (permit + zero inspections, violations + zero follow-up). Conservative criteria, peer-validated by Firebird methodology. | Manageable |
| 10 | "This isn't production-ready" | Show Workflows, UC governance, model registry, drift monitoring. Acknowledge what's missing (CI/CD, real-time ingest). Frame as "production-oriented design, not production deployment" | Manageable |
| Role | Person | Primary Responsibility |
|---|---|---|
| Data/Pipeline Lead | Engineer A | Bronze + Silver layers, entity resolution, H3 features, NFPA timestamps |
| ML/Gold Lead | Engineer B | Weighted formula, MLflow experiment, SHAP, Gold table scoring, drift/fairness |
| UI/Demo Lead | Engineer C | Geospatial command center, demo script, fallback screenshots |
⚠️ The full ring-expansion nearest-neighbor UDF (Section 7) is a Target/Stretch feature only.
In MVP (Hours 0–16), replace it with a simpler H3 k-ring buffer count:
# MVP version — same signal, 10x simpler, no loop
from pyspark.sql.functions import udf
from pyspark.sql.types import IntegerType
import h3
@udf(IntegerType())
def violation_count_in_kring(h3_idx, k=2):
"""Count violation H3 cells within k rings. No ring-expansion loop needed."""
neighbors = h3.k_ring(h3_idx, k)
return len(neighbors & violation_h3_set_broadcast.value)Swap to the full ring-expansion UDF only after MVP is stable. Both produce the same directional insight; the full version just gives an exact distance number instead of a bucket count.