Feature: Model discovery API, generic executor support, and Streamlit Explorer service #20

@profsergiocosta

Summary

Once SimulationRegistry, generic executors, and parameter_schema() land in the
dissmodel core (tracked in dissmodel#<core-issue>), the platform needs three
complementary pieces:

  1. /models API routes — serve model listings and parameter schemas by reading
    the TOML catalogue directly, with no pip install and no runtime import of model
    packages.
  2. Generic executor support in the worker — ensure SimulationRegistry is
    populated in the subprocess before a generic executor resolves its model class.
  3. Streamlit Explorer service — a containerised Streamlit application that
    consumes the platform API to provide the same interactive experience as the
    existing ca_all.py and run_all_sysdyn.py explorers, backed by the full
    platform infrastructure.

Depends on: dissmodel core issue (SimulationRegistry + generic executors +
parameter_schema).


Architecture: each service has a single source of truth

The three services interact with model metadata in completely different ways.
Understanding this separation is essential before implementing any of the pieces:

SOURCE OF TRUTH: dissmodel-configs/*.toml (mounted as a read-only volume; only the API reads it directly)

┌─────────────────────────────────────────────────────────────────────────┐
│ services/api (FastAPI)                                                  │
│                                                                         │
│   Reads *.toml from the mounted volume.                                 │
│   Serves /models routes from TOML data alone.                          │
│   Never installs packages. Never imports model code.                   │
│   The model package may not even be published yet — the TOML is        │
│   the only thing the API needs.                                         │
└─────────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────────┐
│ services/worker (job_runner)                                            │
│                                                                         │
│   Reads the resolved_spec from ExperimentRecord (already merged        │
│   from TOML by the API at submission time).                             │
│   pip-installs the package from spec["model"]["package"].              │
│   Imports the package → __init_subclass__ fires →                      │
│   SimulationRegistry populated in memory.                              │
│   GenericCAExecutor resolves model class from SimulationRegistry.      │
│   SimulationRegistry is ephemeral — lives only for this subprocess.    │
└─────────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────────┐
│ services/streamlit                                                      │
│                                                                         │
│   Pure HTTP client. Never imports dissmodel. Never reads TOML.         │
│   GET /models/ca          → model list from API                        │
│   GET /models/ca/{}/schema → parameter schema from API                 │
│   POST /experiments        → submit job to worker via API              │
│   GET /experiments/{}      → poll result                               │
└─────────────────────────────────────────────────────────────────────────┘

This separation means:

  • A new model from an external repository can appear in the Streamlit Explorer
    as soon as its TOML entry is merged into dissmodel-configs — no deployment of
    the API or Streamlit service is needed, and the model package does not need to
    be installed anywhere except the worker at job execution time.
  • The API has no dependency on external model packages. It cannot fail to start
    because a model package is unavailable or has a broken dependency.
  • The Streamlit service has no dependency on dissmodel at all — it is a
    lightweight web app that could be replaced by any other client.

1. TOML format — [schema.model_params]

The TOML entries in dissmodel-configs gain a new [schema.model_params] block
that declares the parameter schema statically. This block is the single source of
truth for the API /schema endpoint. The contributor must keep it in sync with the
annotated attributes of the Python class, and reviewers enforce this during PR review.

# dissmodel-configs/models/forest_fire.toml

[model]
class   = "GenericCAExecutor"
package = "my-ca-models>=1.0.0"

[parameters]
model_class = "ForestFireModel"
grid_size   = 40
steps       = 100

[schema.model_params]
ignition_prob = { type = "float", default = 0.001 }
spread_prob   = { type = "float", default = 0.3   }

For models with no dynamic parameters (e.g. Conway's Game of Life with fixed rules),
the [schema.model_params] block is omitted.

Reference TOML entries for built-in dissmodel models:

# dissmodel-configs/models/game_of_life.toml
[model]
class   = "GenericCAExecutor"
package = "dissmodel"           # built-in, no install needed

[parameters]
model_class = "GameOfLife"
grid_size   = 40
steps       = 100

# dissmodel-configs/models/sir_epidemic.toml
[model]
class   = "GenericSysDynExecutor"
package = "dissmodel"

[parameters]
model_class = "SIREpidemic"
steps       = 200

[schema.model_params]
beta  = { type = "float", default = 0.3  }
gamma = { type = "float", default = 0.05 }

2. /models API routes

New read-only routes in services/api/routes/models.py. They read TOML files from
the mounted dissmodel-configs volume and return the data directly — no model
package is imported.

GET /models                            → combined listing by family
GET /models/ca                         → CA model names
GET /models/sysdyn                     → SysDyn model names
GET /models/ca/{model_name}/schema     → parameter schema for a CA model
GET /models/sysdyn/{model_name}/schema → parameter schema for a SysDyn model

Implementation

# services/api/routes/models.py

import os
import tomllib
from pathlib import Path

from fastapi import APIRouter, HTTPException

router   = APIRouter(prefix="/models", tags=["models"])
# Directory holding the catalogue TOML files; overridable via the environment.
TOML_DIR = Path(os.environ.get("DISSMODEL_CONFIGS_PATH", "/configs/models"))


def _load_all_entries() -> list[dict]:
    entries = []
    for path in TOML_DIR.glob("*.toml"):
        with open(path, "rb") as f:
            entries.append(tomllib.load(f))
    return entries


def _family(entry: dict) -> str:
    """Infer model family from executor class name."""
    cls = entry.get("model", {}).get("class", "")
    if "SysDyn" in cls:
        return "sysdyn"
    return "ca"


@router.get("/ca")
def list_ca():
    names = [
        e["parameters"]["model_class"]
        for e in _load_all_entries()
        if _family(e) == "ca" and "model_class" in e.get("parameters", {})
    ]
    return {"family": "ca", "models": sorted(names)}


@router.get("/ca/{model_name}/schema")
def ca_schema(model_name: str):
    for entry in _load_all_entries():
        if (
            _family(entry) == "ca"
            and entry.get("parameters", {}).get("model_class") == model_name
        ):
            schema = entry.get("schema", {}).get("model_params", {})
            return {"model": model_name, "family": "ca", "parameters": schema}
    raise HTTPException(status_code=404, detail=f"Model '{model_name}' not found.")

/models/sysdyn and /models/sysdyn/{name}/schema follow the same pattern.
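One way to avoid duplicating the CA handlers for SysDyn is to parameterise the lookup by family and keep it free of FastAPI, so it can be unit-tested on plain dicts. The sketch below assumes this refactoring; family_of mirrors _family above, while list_models, find_schema, and combined_listing are hypothetical helper names, and the response shape of the combined GET /models listing is a guess, since the issue does not specify it.

```python
from typing import Optional


def family_of(entry: dict) -> str:
    """Same rule as _family: executor class names containing 'SysDyn'."""
    cls = entry.get("model", {}).get("class", "")
    return "sysdyn" if "SysDyn" in cls else "ca"


def list_models(entries: list, family: str) -> list:
    """Sorted model_class names for one family."""
    return sorted(
        e["parameters"]["model_class"]
        for e in entries
        if family_of(e) == family and "model_class" in e.get("parameters", {})
    )


def find_schema(entries: list, family: str, model_name: str) -> Optional[dict]:
    """Schema dict for a known model ({} if it has no params), else None."""
    for e in entries:
        if (
            family_of(e) == family
            and e.get("parameters", {}).get("model_class") == model_name
        ):
            return e.get("schema", {}).get("model_params", {})
    return None


def combined_listing(entries: list) -> dict:
    """Assumed shape for GET /models: one sorted name list per family."""
    return {family: list_models(entries, family) for family in ("ca", "sysdyn")}


# Entries as tomllib would return them for the reference TOML files.
ENTRIES = [
    {"model": {"class": "GenericCAExecutor"},
     "parameters": {"model_class": "GameOfLife"}},
    {"model": {"class": "GenericCAExecutor"},
     "parameters": {"model_class": "ForestFireModel"},
     "schema": {"model_params": {"spread_prob": {"type": "float", "default": 0.3}}}},
    {"model": {"class": "GenericSysDynExecutor"},
     "parameters": {"model_class": "SIREpidemic"},
     "schema": {"model_params": {"beta": {"type": "float", "default": 0.3}}}},
]

print(combined_listing(ENTRIES))
```

A route handler would then call find_schema with the entries from _load_all_entries() and raise the 404 when it returns None. Returning None for "unknown model" and {} for "known model without parameters" keeps those two cases distinguishable.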


3. Worker — generic executor support

The worker's _import_executor_package already handles the import chain that
populates ExecutorRegistry. The same import now also populates SimulationRegistry
as a side effect, provided the external package exports its model classes in
__init__.py (this is a documented convention on the dissmodel side).

One explicit addition is needed: when model.class is GenericCAExecutor or
GenericSysDynExecutor, the worker must ensure the package in
spec["model"]["package"] is installed and imported before executor.validate()
is called, because validate() calls SimulationRegistry.get_ca() to check
model_params keys. This is already the case today — _import_executor_package
runs before executor_cls = ExecutorRegistry.get(model_class) — so no change
to the orchestration order is required.

A log line should be added to make this explicit:

# job_runner.py — after _import_executor_package

if package:
    _import_executor_package(package)
    record.add_log(f"Imported package: {package}")
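The import-time registration that the worker relies on can be illustrated with a toy example. This is a sketch of the general __init_subclass__ technique only, not the dissmodel implementation: the SimulationRegistry and CellularAutomaton classes below are simplified stand-ins, and register_ca is a hypothetical method name (only get_ca is mentioned in this issue).

```python
class SimulationRegistry:
    """Toy stand-in for the registry described in this issue."""

    _ca = {}

    @classmethod
    def register_ca(cls, model_cls):
        # Hypothetical registration hook, keyed by class name.
        cls._ca[model_cls.__name__] = model_cls

    @classmethod
    def get_ca(cls, name):
        try:
            return cls._ca[name]
        except KeyError:
            raise KeyError(
                f"CA model {name!r} is not registered; was its package imported?"
            ) from None


class CellularAutomaton:
    """Toy base class: every subclass registers itself when it is defined."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        SimulationRegistry.register_ca(cls)


# Executing this class statement (for example, as a side effect of importing
# the package whose __init__.py exports it) is what fills the registry.
class ForestFireModel(CellularAutomaton):
    ignition_prob: float = 0.001
    spread_prob: float = 0.3


print(SimulationRegistry.get_ca("ForestFireModel").__name__)  # prints ForestFireModel
```

The sketch also shows why the export convention matters: a model class defined in a submodule that is never imported never runs its class statement, so the lookup fails even though the package is installed.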

4. Streamlit Explorer service

A new services/streamlit/ directory. The application is a pure API client —
it does not import dissmodel, does not read TOML files, and does not interact
with SimulationRegistry in any way.

Service structure

services/streamlit/
├── Dockerfile
├── requirements.txt       # streamlit, requests, geopandas, matplotlib, folium
├── app.py                 # entrypoint, page config and routing
└── pages/
    ├── ca_explorer.py     # CellularAutomaton explorer
    └── sysdyn_explorer.py # System Dynamics explorer

ca_explorer.py — key logic

# services/streamlit/pages/ca_explorer.py

import os, time
import requests
import streamlit as st

PLATFORM_URL = os.environ["PLATFORM_URL"]
HEADERS      = {"X-API-Key": os.environ["PLATFORM_API_KEY"]}

st.title("Cellular Automata Explorer")

# ── 1. Discover models from the API (reads TOML on the API side) ──────────
resp = requests.get(f"{PLATFORM_URL}/models/ca", headers=HEADERS, timeout=10)
resp.raise_for_status()  # surface API errors instead of a KeyError below
models   = resp.json()["models"]
selected = st.sidebar.selectbox("Model", models)

# ── 2. Render parameter form from schema (reads TOML on the API side) ─────
resp = requests.get(
    f"{PLATFORM_URL}/models/ca/{selected}/schema", headers=HEADERS, timeout=10
)
resp.raise_for_status()
schema       = resp.json()
model_params = {}

st.sidebar.markdown(f"**{selected} parameters**")
for param, info in schema.get("parameters", {}).items():
    default = info.get("default")  # may be absent; fall back per type below
    if info["type"] == "float":
        # explicit format: the default two-decimal display would show 0.001 as 0.00
        model_params[param] = st.sidebar.number_input(
            param, value=float(default or 0.0), format="%.4f"
        )
    elif info["type"] == "int":
        model_params[param] = st.sidebar.number_input(param, value=int(default or 0), step=1)
    elif info["type"] == "bool":
        model_params[param] = st.sidebar.checkbox(param, value=bool(default))

# ── 3. Common simulation parameters ──────────────────────────────────────
steps     = st.sidebar.slider("Steps",     1, 500, 50)
grid_size = st.sidebar.slider("Grid size", 5, 100, 20)

# ── 4. Submit job and poll ────────────────────────────────────────────────
if st.button("Run Simulation"):
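    payload = {
        # NOTE: assumes the platform resolves the catalogue entry from the
        # lowercased class name; the reference TOML files use snake_case
        # names (e.g. game_of_life.toml), so confirm the expected key.
        "model": selected.lower(),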
        "parameters": {
            "model_class":  selected,
            "steps":        steps,
            "grid_size":    grid_size,
            "model_params": model_params,
        },
    }
    resp   = requests.post(
        f"{PLATFORM_URL}/experiments", json=payload, headers=HEADERS, timeout=10
    )
    resp.raise_for_status()  # a rejected submission has no experiment_id
    exp_id = resp.json()["experiment_id"]

    with st.spinner(f"Running {exp_id[:8]}…"):
        while True:
            status = requests.get(
                f"{PLATFORM_URL}/experiments/{exp_id}", headers=HEADERS
            ).json()
            if status["status"] in {"completed", "failed"}:
                break
            time.sleep(1)

    if status["status"] == "completed":
        # metrics may be missing from the record; avoid a KeyError
        elapsed = status.get("metrics", {}).get("time_total_sec", "?")
        st.success(f"Completed in {elapsed}s")
        st.json(status)
    else:
        st.error("Simulation failed.")
        st.json(status)
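The polling loop above runs until the job reaches a terminal state, so a stuck job would keep the page spinning indefinitely. A bounded variant could look like the sketch below. poll_until_done is a hypothetical helper, the 300-second default is an arbitrary assumption, and the status fetch is passed in as a callable so the logic can be tested without the API.

```python
import time
from typing import Callable

TERMINAL_STATES = {"completed", "failed"}  # same states as the loop above


def poll_until_done(
    fetch_status: Callable[[], dict],
    timeout_sec: float = 300.0,   # assumption: tune to the longest expected job
    interval_sec: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Call fetch_status until it reports a terminal state or time runs out."""
    deadline = clock() + timeout_sec
    while True:
        status = fetch_status()
        if status.get("status") in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job still {status.get('status')!r} after {timeout_sec}s")
        sleep(interval_sec)


# Stand-in for the HTTP call: two "running" responses, then "completed".
responses = iter([{"status": "running"}, {"status": "running"}, {"status": "completed"}])
result = poll_until_done(lambda: next(responses), sleep=lambda _: None)
print(result["status"])  # prints completed
```

In the explorer, fetch_status would wrap the existing requests.get(f"{PLATFORM_URL}/experiments/{exp_id}", headers=HEADERS).json() call, and a TimeoutError could be reported with st.error.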

docker-compose.yml addition

streamlit:
  build: ./services/streamlit
  ports:
    - "8501:8501"
  environment:
    PLATFORM_URL:      http://api:8000
    PLATFORM_API_KEY:  ${API_KEY}
  depends_on:
    - api
  volumes:
    - ./dissmodel-configs:/configs:ro   # not strictly needed — Streamlit never reads TOML
                                         # but kept for consistency with other services

Note: only the API service requires the dissmodel-configs volume mount. The
Streamlit service never reads it; the mount shown above is kept only for
consistency with the other services.


Checklist

services/api

  • Add services/api/routes/models.py with the /models, /models/ca, /models/sysdyn, and per-model /schema routes
  • Register the new router in the FastAPI app
  • Add DISSMODEL_CONFIGS_PATH to .env.example
  • Integration tests: list endpoints return correct names; schema endpoint returns [schema.model_params] from TOML; 404 for unknown model name

services/worker

  • Add log line after _import_executor_package confirming the package was imported
  • Verify that importing an external package populates SimulationRegistry when the package exports models in __init__.py
  • Add time_load_sec to profiling Markdown table (dependency on lifecycle refactor issue)

services/streamlit

  • Create services/streamlit/Dockerfile and requirements.txt
  • Implement app.py, pages/ca_explorer.py, pages/sysdyn_explorer.py
  • Add streamlit service to docker-compose.yml
  • Add PLATFORM_URL and PLATFORM_API_KEY to .env.example
  • Manual smoke test: select a built-in CA model, adjust parameters, submit, verify result

dissmodel-configs

  • Add [schema.model_params] block to existing TOML entries that use generic executors
  • Add game_of_life.toml as reference entry for GenericCAExecutor
  • Add sir_epidemic.toml as reference entry for GenericSysDynExecutor
  • Update PR review checklist to require [schema.model_params] sync for new model contributions

Open questions

Result visualisation in Streamlit. For CA models the output is a GeoDataFrame
serialised to GeoJSON, which the explorer can render with folium (st.map only
plots points from latitude/longitude columns). For SysDyn the output is a time
series. A small _display_result(record) helper that branches on output format is
needed. This can be a follow-up if it would otherwise block shipping.
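The branching part of such a helper might be sketched as follows. The record layout is not defined in this issue, so everything here is an assumption: result_kind is a hypothetical function, CA output is assumed to be a parsed GeoJSON FeatureCollection, and SysDyn output is assumed to be a mapping from variable name to a list of values.

```python
def result_kind(output) -> str:
    """Classify a job output so the explorer can pick a renderer."""
    if isinstance(output, dict) and output.get("type") == "FeatureCollection":
        return "geojson"      # CA result: render on a map, e.g. with folium
    if (
        isinstance(output, dict)
        and output
        and all(isinstance(series, list) for series in output.values())
    ):
        return "timeseries"   # SysDyn result: render as a line chart
    return "unknown"          # fall back to showing the raw JSON


ca_output = {"type": "FeatureCollection", "features": []}
sysdyn_output = {"S": [990, 970], "I": [10, 25], "R": [0, 5]}

print(result_kind(ca_output), result_kind(sysdyn_output))  # prints geojson timeseries
```

A _display_result(record) helper would call this on the record's output field and dispatch to the matching Streamlit widget, keeping the "unknown" branch as st.json so that new output formats degrade gracefully.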

Live step-by-step visualisation. The existing local explorers render each step
as it runs. The platform-backed version submits a job and polls for completion —
there is no streaming. Supporting live visualisation would require WebSocket progress
events from the worker. Explicitly out of scope for this issue.

Labels

feature platform streamlit api docker depends-on-core-issue
