87 changes: 87 additions & 0 deletions CHANGELOG.md
@@ -24,6 +24,93 @@ and this file MUST be updated together whenever `__version__` changes.

---

## [0.8.0-dev11] — SAML SSO + IP allow-listing in front of the UI (security)

Builds on the dev10 hardening: puts a **SAML 2.0 login (Okta)** in front
of human access to the UI/API, and adds **per-surface IP allow-listing**
at the ingress. Machine traffic is untouched: MCP callers, `api_secret`
bearer callers, and the webhook/telemetry receivers keep their existing
auth and bypass SSO.

### SAML Service Provider (in-app)

- New `netcortex/auth/` package: `saml.py` (python3-saml SP), `session.py`
(Redis-backed sessions), `router.py` (the SSO endpoints).
- Endpoints (self-disable with 404 when `saml_enabled=false`):
`GET /saml/login`, `POST /saml/acs`, `GET /saml/metadata`,
`GET /saml/logout`, `GET|POST /saml/sls`.
- The `_api_auth` middleware now accepts **either** a valid `api_secret`
bearer (machines) **or** a valid SAML session cookie (humans). An
unauthenticated browser navigation is 302-redirected to `/saml/login`
(preserving the target via `?next=`); an unauthenticated API/XHR call
gets 401. `/webhooks`, `/ingest`, `/health`, `/saml`, and the MCP mount
stay public (they authenticate themselves).
- Hardened SAML: `strict=True`, `wantAssertionsSigned`, SHA-256
signatures/digests, `rejectUnsolicitedResponsesWithInResponseTo`
(SP-initiated only), audience/Destination/timestamp validation, and
optional email-domain / group authorization. Destination/ACS URLs are
derived from the public base URL so validation works behind a
TLS-terminating ingress.
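
The middleware rule described above can be sketched as a pure decision
function. This is a minimal illustration, not the actual `_api_auth` code:
the `decide` function, the `Decision` type, the `wants_html` flag, and the
`/mcp` mount path are all assumptions made for the example.

```python
import hmac
from dataclasses import dataclass
from typing import Optional
from urllib.parse import quote

# Prefixes that authenticate themselves; "/mcp" is an assumed mount path.
PUBLIC_PREFIXES = ("/webhooks", "/ingest", "/health", "/saml", "/mcp")


@dataclass
class Decision:
    status: int                     # 200 = allow, 302 = redirect, 401 = reject
    location: Optional[str] = None  # redirect target when status == 302


def decide(path, *, bearer, api_secret, has_session, wants_html):
    """Return the auth outcome for one request (hypothetical helper)."""
    # Public surfaces skip the bearer/session check entirely.
    if any(path == p or path.startswith(p + "/") for p in PUBLIC_PREFIXES):
        return Decision(200)
    # Machines: constant-time comparison of the bearer token.
    if bearer is not None and api_secret and hmac.compare_digest(
        bearer.encode(), api_secret.encode()
    ):
        return Decision(200)
    # Humans: a valid server-side SAML session.
    if has_session:
        return Decision(200)
    # Browser navigation goes to the login endpoint, keeping the target.
    if wants_html:
        return Decision(302, "/saml/login?next=" + quote(path, safe=""))
    # API/XHR callers get a plain 401.
    return Decision(401)
```

Keeping the decision free of framework objects makes the four outcomes
(public, bearer, session, redirect or 401) easy to test in isolation.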

### Server-side sessions

- Opaque CSPRNG session id (`secrets.token_urlsafe(32)`) in a
`Secure` + `HttpOnly` + `SameSite=Lax`, non-persistent cookie; all
subject/email/group data lives in Redis, never the cookie.
- Idle timeout (default 30 min, slid forward per request) **and**
absolute timeout (default 8 h, non-extendable). Lifecycle events log a
salted hash of the session id, never the raw token.
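
The two timeouts interact as sketched below. This is an illustration under
stated assumptions: an in-memory dict stands in for Redis, and the
`SessionStore` class, its method names, and the injectable `clock` are
hypothetical, not the contents of `session.py`.

```python
import hashlib
import secrets
import time


class SessionStore:
    """In-memory stand-in for the Redis-backed store (assumption)."""

    def __init__(self, idle=1800, absolute=28800, clock=time.time):
        self._data = {}
        self._idle = idle
        self._absolute = absolute
        self._clock = clock
        self._log_salt = secrets.token_bytes(16)

    def create(self, subject):
        # Opaque CSPRNG id; user data stays server-side.
        sid = secrets.token_urlsafe(32)
        now = self._clock()
        self._data[sid] = {"subject": subject, "created": now, "last_seen": now}
        return sid

    def load(self, sid):
        record = self._data.get(sid)
        if record is None:
            return None
        now = self._clock()
        expired_absolute = now - record["created"] >= self._absolute
        expired_idle = now - record["last_seen"] >= self._idle
        if expired_absolute or expired_idle:
            del self._data[sid]
            return None
        # Slide the idle window; the absolute deadline never moves.
        record["last_seen"] = now
        return record

    def log_id(self, sid):
        # Salted hash for lifecycle logs, never the raw token.
        return hashlib.sha256(self._log_salt + sid.encode()).hexdigest()[:16]
```

Passing a fake `clock` lets tests advance time without sleeping.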

### IP allow-listing (ingress)

- Ingress split into a **receiver** ingress (`/webhooks`, `/ingest`,
`/health` — always present) and an **admin** ingress (`/` — only when
`exposeApi=true`), so the two surfaces can have different allowed
source ranges via `nginx whitelist-source-range`.
- `ingress.adminAllowSourceRanges` (UI/API/MCP) and
`ingress.webhookAllowSourceRanges` (receivers). Both default empty;
per this change set only the admin surface is expected to be locked to
office/VPN CIDRs, with webhooks left open and HMAC/token-protected.
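
The allow-list semantics (empty list means no filter, otherwise the source
must fall inside one of the CIDRs) can be illustrated in a few lines. The
real enforcement happens in nginx via the annotation; `source_allowed` is a
hypothetical helper that only mirrors that behavior for reasoning and tests.

```python
import ipaddress


def source_allowed(client_ip, allow_ranges):
    """Return True if client_ip may pass an allow-list of CIDR strings."""
    # An empty list mirrors the chart default: no IP filter is applied.
    if not allow_ranges:
        return True
    addr = ipaddress.ip_address(client_ip)
    # Membership is False when the address and network versions differ.
    return any(addr in ipaddress.ip_network(cidr) for cidr in allow_ranges)
```

For example, with `["203.0.113.0/24", "198.51.100.7/32"]` an office address
such as `203.0.113.9` passes while `192.0.2.1` does not.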

### New configuration (core secret / env)

`saml_enabled`, `saml_sp_base_url`, `saml_sp_entity_id`,
`saml_idp_entity_id`, `saml_idp_sso_url`, `saml_idp_slo_url`,
`saml_idp_x509_cert`, `saml_sp_x509_cert`, `saml_sp_private_key`,
`saml_allowed_email_domains`, `saml_allowed_groups`, `saml_attr_groups`,
`session_cookie_name`, `session_cookie_secure`,
`session_idle_timeout_seconds`, `session_absolute_timeout_seconds`.
Startup logs an error if `saml_enabled` is true but required IdP fields
are missing.
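
A startup check of this kind might look like the sketch below. Which keys
count as "required" is an assumption here (entity id, SSO URL, and signing
cert); the function names are hypothetical.

```python
import logging

logger = logging.getLogger("netcortex.auth")

# Assumed set of required IdP keys; the source does not enumerate them.
REQUIRED_IDP_FIELDS = (
    "saml_idp_entity_id",
    "saml_idp_sso_url",
    "saml_idp_x509_cert",
)


def missing_saml_fields(cfg):
    """List required IdP keys that are absent or blank when SAML is on."""
    if not cfg.get("saml_enabled"):
        return []
    return [k for k in REQUIRED_IDP_FIELDS if not str(cfg.get(k) or "").strip()]


def log_startup_check(cfg):
    """Log an error (without raising) if SAML is enabled but incomplete."""
    missing = missing_saml_fields(cfg)
    if missing:
        logger.error("saml_enabled is true but missing: %s", ", ".join(missing))
    return missing
```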

### Packaging

- `python3-saml` is a new optional extra (`pip install '.[saml]'`,
included in `all`); imported lazily so non-SSO deployments don't need
it. The Docker image installs `libxmlsec1`/`libxml2` (build + runtime)
for xmlsec signature verification.
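
A lazy import along these lines keeps non-SSO deployments free of the
dependency. The helper name `load_saml_auth_class` and the error wording are
assumptions; `onelogin.saml2.auth.OneLogin_Saml2_Auth` is the class shipped
by python3-saml.

```python
import importlib


def load_saml_auth_class():
    """Import python3-saml only when SSO is actually used."""
    try:
        module = importlib.import_module("onelogin.saml2.auth")
    except ImportError as exc:
        # Point the operator at the optional extra instead of a bare traceback.
        raise RuntimeError(
            "SAML support requires the optional extra: pip install '.[saml]'"
        ) from exc
    return module.OneLogin_Saml2_Auth
```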

### Tests

- `tests/auth/test_session.py` — session create/load/destroy, idle slide,
absolute expiry, secure cookie flags.
- `tests/auth/test_saml.py` — hardened settings, authz (domain/group),
group extraction, open-redirect-safe return paths.
- `tests/test_api_auth.py` — extended for SAML browser-redirect, XHR 401,
valid-session allow, and bearer-still-works-with-SAML-on.
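
An open-redirect-safe return path, as exercised by the SAML tests, could be
validated roughly like this. `safe_next` is a hypothetical helper, not the
function under test; it accepts only same-origin absolute paths.

```python
from urllib.parse import urlsplit


def safe_next(target, default="/"):
    """Return target if it is a local path, otherwise the default."""
    # Reject empty values, relative paths, scheme-relative URLs ("//host"),
    # and backslashes that some browsers treat as slashes.
    if (
        not target
        or not target.startswith("/")
        or target.startswith("//")
        or "\\" in target
    ):
        return default
    parts = urlsplit(target)
    # A local path must carry neither a scheme nor a host.
    if parts.scheme or parts.netloc:
        return default
    return target
```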

### Operational notes

- To enable: store the SAML keys in `netcortex/core`, set
`saml_enabled=true` and `ingress.exposeApi=true`, set
`ingress.adminAllowSourceRanges` to your office/VPN CIDRs, and paste
`GET /saml/metadata` (or the SP entityId/ACS URL) into the Okta app.
- Okta config: SSO URL / ACS = `https://<host>/saml/acs`, Audience /
SP entityId = `https://<host>/saml/metadata`, NameID = email; map a
`groups` attribute if you use group-based authorization.
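
The URLs to paste into the IdP follow directly from the public base URL. A
small sketch, assuming a hypothetical `sp_urls` helper and the endpoint
paths listed earlier in this entry:

```python
def sp_urls(base_url, entity_id=""):
    """Derive the SP entityId and endpoint URLs from the public base URL."""
    base = base_url.rstrip("/")
    return {
        # Empty entity id falls back to the metadata URL.
        "entity_id": entity_id or f"{base}/saml/metadata",
        "acs": f"{base}/saml/acs",
        "sls": f"{base}/saml/sls",
    }
```

Deriving these from the public origin, rather than from the request seen by
the pod, is what keeps Destination validation working behind a
TLS-terminating ingress.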

---

## [0.8.0-dev10] — Webhook & API security hardening (security)

A full security review of the inbound webhook surface (requested before
33 changes: 33 additions & 0 deletions deploy/helm/templates/deployment-web.yaml
@@ -45,6 +45,39 @@ spec:
{{- with .Values.web.extraEnv }}
{{- toYaml . | nindent 12 }}
{{- end }}
{{- if .Values.saml.enabled }}
# SAML SSO (0.8.0-dev11) — non-secret IdP config from values.yaml.
- name: NETCORTEX_SAML_ENABLED
value: "true"
- name: NETCORTEX_SAML_SP_BASE_URL
value: {{ .Values.saml.spBaseUrl | default (printf "https://%s" .Values.ingress.hostname) | quote }}
{{- if .Values.saml.spEntityId }}
- name: NETCORTEX_SAML_SP_ENTITY_ID
value: {{ .Values.saml.spEntityId | quote }}
{{- end }}
- name: NETCORTEX_SAML_IDP_ENTITY_ID
value: {{ .Values.saml.idp.entityId | quote }}
- name: NETCORTEX_SAML_IDP_SSO_URL
value: {{ .Values.saml.idp.ssoUrl | quote }}
- name: NETCORTEX_SAML_IDP_SLO_URL
value: {{ .Values.saml.idp.sloUrl | quote }}
- name: NETCORTEX_SAML_IDP_X509_CERT
value: {{ .Values.saml.idp.x509cert | quote }}
- name: NETCORTEX_SAML_ALLOWED_EMAIL_DOMAINS
value: {{ join "," .Values.saml.allowedEmailDomains | quote }}
- name: NETCORTEX_SAML_ALLOWED_GROUPS
value: {{ join "," .Values.saml.allowedGroups | quote }}
- name: NETCORTEX_SAML_ATTR_GROUPS
value: {{ .Values.saml.attrGroups | quote }}
- name: NETCORTEX_SESSION_COOKIE_NAME
value: {{ .Values.session.cookieName | quote }}
- name: NETCORTEX_SESSION_COOKIE_SECURE
value: {{ .Values.session.cookieSecure | quote }}
- name: NETCORTEX_SESSION_IDLE_TIMEOUT_SECONDS
value: {{ .Values.session.idleTimeoutSeconds | quote }}
- name: NETCORTEX_SESSION_ABSOLUTE_TIMEOUT_SECONDS
value: {{ .Values.session.absoluteTimeoutSeconds | quote }}
{{- end }}
volumeMounts:
- name: data
mountPath: /app/data
88 changes: 65 additions & 23 deletions deploy/helm/templates/ingress.yaml
@@ -1,20 +1,30 @@
{{- if .Values.ingress.enabled }}
{{- $fullName := include "netcortex.fullname" . }}
{{- $svc := printf "%s-web" $fullName }}
{{- $port := .Values.service.port }}
# =============================================================================
# Receiver ingress — the machine-facing surface (webhooks + telemetry +
# health). Always created. No SAML, no IP allow-list by default (vendor
# webhook clouds use dynamic egress IPs; they authenticate with HMAC/token
# per the 0.8.0-dev10 hardening). Add ingress.webhookAllowSourceRanges to
# restrict later.
# =============================================================================
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: {{ include "netcortex.fullname" . }}
name: {{ $fullName }}-receiver
labels:
{{- include "netcortex.labels" . | nindent 4 }}
app.kubernetes.io/component: ingress-receiver
annotations:
# Body-size cap (F3, 0.8.0-dev10): outermost defense against
# memory-exhaustion via large unauthenticated POST bodies. Keep aligned
# with the in-app webhook_max_body_bytes guard (default 1 MiB).
nginx.ingress.kubernetes.io/proxy-body-size: {{ .Values.ingress.proxyBodySize | quote }}
# Per-source request-rate cap (defense-in-depth against webhook floods).
{{- if .Values.ingress.rateLimit.enabled }}
nginx.ingress.kubernetes.io/limit-rps: {{ .Values.ingress.rateLimit.rps | quote }}
nginx.ingress.kubernetes.io/limit-connections: {{ .Values.ingress.rateLimit.connections | quote }}
{{- end }}
{{- if .Values.ingress.webhookAllowSourceRanges }}
nginx.ingress.kubernetes.io/whitelist-source-range: {{ join "," .Values.ingress.webhookAllowSourceRanges | quote }}
{{- end }}
{{- with .Values.ingress.annotations }}
{{- toYaml . | nindent 4 }}
{{- end }}
@@ -32,30 +42,62 @@ spec:
- host: {{ .Values.ingress.hostname }}
http:
paths:
{{- if .Values.ingress.exposeApi }}
# exposeApi=true: the whole app (status UI, /api, /metrics, /mcp)
# is reachable on this host. ONLY do this on a trusted network or
# with api_secret set so /api and /metrics require a bearer token.
- path: /
{{- range .Values.ingress.publicPaths }}
- path: {{ . }}
pathType: Prefix
backend:
service:
name: {{ include "netcortex.fullname" . }}-web
name: {{ $svc }}
port:
number: {{ .Values.service.port }}
{{- else }}
# Default (F2, 0.8.0-dev10): expose ONLY the public receiver and
# health surface. The status UI, topology/inventory API, metrics,
# and MCP endpoint stay cluster-internal (reach them via
# `kubectl port-forward` or a separate internal-only ingress).
{{- range .Values.ingress.publicPaths }}
- path: {{ . }}
number: {{ $port }}
{{- end }}
{{- if .Values.ingress.exposeApi }}
---
# =============================================================================
# Admin ingress — the human-facing UI/API/MCP surface. Only created when
# exposeApi=true. Two layers of defense in front of it:
# 1. IP allow-list (whitelist-source-range) — ingress.adminAllowSourceRanges
# 2. SAML SSO — enforced in-app (saml_enabled) for browser sessions;
# machine callers still use the api_secret bearer.
# The catch-all "/" path is LESS specific than the receiver's /webhooks,
# /ingest, /health, so nginx routes those paths to the receiver ingress (no
# IP filter) and everything else to this one (UI/API/MCP, behind the allow-list).
# =============================================================================
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: {{ $fullName }}-admin
labels:
{{- include "netcortex.labels" . | nindent 4 }}
app.kubernetes.io/component: ingress-admin
annotations:
nginx.ingress.kubernetes.io/proxy-body-size: {{ .Values.ingress.proxyBodySize | quote }}
{{- if .Values.ingress.adminAllowSourceRanges }}
nginx.ingress.kubernetes.io/whitelist-source-range: {{ join "," .Values.ingress.adminAllowSourceRanges | quote }}
{{- end }}
{{- with .Values.ingress.annotations }}
{{- toYaml . | nindent 4 }}
{{- end }}
spec:
{{- if .Values.ingress.className }}
ingressClassName: {{ .Values.ingress.className }}
{{- end }}
{{- if .Values.ingress.tls.enabled }}
tls:
- hosts:
- {{ .Values.ingress.hostname }}
secretName: {{ .Values.ingress.tls.secretName }}
{{- end }}
rules:
- host: {{ .Values.ingress.hostname }}
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: {{ include "netcortex.fullname" $ }}-web
name: {{ $svc }}
port:
number: {{ $.Values.service.port }}
{{- end }}
{{- end }}
number: {{ $port }}
{{- end }}
{{- end }}
49 changes: 49 additions & 0 deletions deploy/helm/values.yaml
@@ -83,6 +83,40 @@ web:
# - name: LOG_LEVEL
# value: DEBUG

# -----------------------------------------------------------------------------
# SAML SSO (0.8.0-dev11) — gates human UI/API access behind an IdP login.
# All values here are NON-SECRET (IdP URLs + the IdP's public signing cert),
# so they live in values, not the secret backend. They are injected as
# NETCORTEX_SAML_* env on the web pod and can still be overridden by the
# netcortex/core secret if a key is present there. The optional SP private
# key (for signing AuthnRequests) is the only sensitive piece and is read
# from the secret backend only.
# -----------------------------------------------------------------------------
saml:
enabled: false
# Public https origin of this app. Empty → defaults to https://<ingress.hostname>.
spBaseUrl: ""
# SP entityId. Empty → defaults to <spBaseUrl>/saml/metadata.
spEntityId: ""
idp:
entityId: "" # IdP Issuer / entityID from the IdP metadata
ssoUrl: "" # IdP SingleSignOnService Location
sloUrl: "" # IdP SingleLogoutService (optional; blank if none)
x509cert: "" # IdP signing cert, single-line base64 (public)
# Optional coarse authorization. Empty = any authenticated IdP user.
allowedEmailDomains: []
allowedGroups: []
attrGroups: "groups" # SAML attribute name carrying group membership

# -----------------------------------------------------------------------------
# Session policy for SAML-authenticated UI sessions (server-side, Redis).
# -----------------------------------------------------------------------------
session:
cookieName: "nc_session"
cookieSecure: true # set false ONLY for local http dev
idleTimeoutSeconds: 1800 # 30 min
absoluteTimeoutSeconds: 28800 # 8 h

# -----------------------------------------------------------------------------
# Worker — Celery async task runner
# -----------------------------------------------------------------------------
@@ -238,6 +272,21 @@ ingress:
- /ingest
- /health

# IP allow-lists (0.8.0-dev11). nginx whitelist-source-range, applied
# per-ingress so the human UI and the machine receivers can have
# different allowed sources.
#
# adminAllowSourceRanges: CIDRs permitted to reach the UI/API/MCP admin
# ingress (only created when exposeApi=true). Empty = no IP filter
# (rely on SAML SSO + api_secret). Set to your office/VPN egress.
# webhookAllowSourceRanges: CIDRs permitted to reach /webhooks + /ingest.
# Empty = open (vendor clouds use dynamic IPs; HMAC/token is the
# control). Populate with vendor egress ranges to lock down later.
adminAllowSourceRanges: []
# - "203.0.113.0/24" # office
# - "198.51.100.7/32" # admin VPN
webhookAllowSourceRanges: []

# Hard cap on request body size at the ingress (F3). Mirror the in-app
# webhook_max_body_bytes (default 1 MiB).
proxyBodySize: "1m"
25 changes: 25 additions & 0 deletions deploy/values-local.yaml
@@ -32,6 +32,31 @@ ingress:
secretName: "netcortex-tls"
annotations:
cert-manager.io/cluster-issuer: letsencrypt-prod
# Expose the UI/API (admin) ingress in addition to the receiver paths.
# Required for SAML SSO to have something to protect.
exposeApi: true
# Restrict the admin UI/API to these source CIDRs (SAML is the second
# layer). Empty = no IP filter (SAML only). Fill with your office/VPN
# egress ranges, e.g.:
adminAllowSourceRanges: []
# - "203.0.113.0/24"
# - "198.51.100.7/32"

# -----------------------------------------------------------------------------
# SAML SSO via Duo Security (IdP). Non-secret IdP values from the Duo
# "Generic SAML Service Provider" metadata. spBaseUrl is omitted → defaults
# to https://<ingress.hostname>.
# -----------------------------------------------------------------------------
saml:
enabled: true
idp:
entityId: "https://sso-dbbfec7f.sso.duosecurity.com/saml2/sp/DIRC8ET0RCU9RTD7A4AP/metadata"
ssoUrl: "https://sso-dbbfec7f.sso.duosecurity.com/saml2/sp/DIRC8ET0RCU9RTD7A4AP/sso"
sloUrl: "" # Duo metadata advertises no SingleLogoutService
x509cert: "MIIDDTCCAfWgAwIBAgIUIsTRaCpPjS2L5tYmwu4BQyBfuRAwDQYJKoZIhvcNAQELBQAwNjEVMBMGA1UECgwMRHVvIFNlY3VyaXR5MR0wGwYDVQQDDBRESVJDOEVUMFJDVTlSVEQ3QTRBUDAeFw0yNjA2MDYxODMwMzFaFw0zODAxMTkwMzE0MDdaMDYxFTATBgNVBAoMDER1byBTZWN1cml0eTEdMBsGA1UEAwwURElSQzhFVDBSQ1U5UlREN0E0QVAwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQDI357wyTOFqIIM5OJdlH4azPaOkau/ALfytCsspSVwQu3LR+3B6vBYOW17dubSRaGoscifQVxfZAHKtKIMw+kVscQswEhmgQSxPsjTtNlO79NTEnQgaO1VnUYreWcAdJsX4lZNQYrwrxsH7whktDowKI7PhFIIFVjZzMrcsitry4cMpbRbzXAufPlTCpgljW9qm7OIVN3zld/wBJcaMxtrYIod0K0aL0co19w/xe3aX0ka10k1+IJrxOJstqlES43YPgchuQV9SwdetN/qjnpjzK3FC3DTuNsfNnXCFfRXYUFH6A7J+/9e33YYdE3sojddI+sXVT7u5EWiimmZyb19AgMBAAGjEzARMA8GA1UdEwEB/wQFMAMBAf8wDQYJKoZIhvcNAQELBQADggEBAK1bn0j7zVY34xeIHuEzcJ5TXSiOEzUMYU1FaqlPj3OvFYXEYYb/K8X3ZLd5D757YV4Lr1zq35tdzbdnYigGSxzlq4RGhGdsAmY9SFusQqH/P2JsCika86//UdmDfV5FJBgYfr01DjDK2UZ6ts0rb90DGXUmPWlArxN4gWn5xHYXHZyeHVl/Y91BwLGWBqPI7UPN7RNUvGCDSNMI+jI4gWn8QV7i2hCekvthZ6XlE9PpRtPexJJmOlAofL5k3xhQbJgwJpFE4zOG9OYlTJ1vUDQZiRtZGS4xU1c2Icry2JHUnn89B/7Osh1vvbEzK1dDsSTN59PVh1pRFo18D1u9dhM="
# Optional: lock to email domain(s). Empty = any authenticated Duo user.
allowedEmailDomains:
- "cisco.com"

# Tighten resources for a single-node dev cluster
web:
16 changes: 15 additions & 1 deletion docker/Dockerfile
@@ -12,8 +12,18 @@ FROM base AS builder

ARG EXTRAS="all"

# gcc/libssl for native wheels; libxmlsec1/libxml2 + pkg-config so the
# `xmlsec` wheel (pulled in by python3-saml, the in-app SAML SP) builds
# and links against the system crypto libraries.
RUN apt-get update \
&& apt-get install -y --no-install-recommends gcc libssl-dev || true \
&& apt-get install -y --no-install-recommends \
gcc \
pkg-config \
libssl-dev \
libxml2-dev \
libxmlsec1-dev \
libxmlsec1-openssl \
|| true \
&& rm -rf /var/lib/apt/lists/*

COPY pyproject.toml README.md ./
@@ -24,11 +34,15 @@ RUN pip install --upgrade pip && pip install ".[${EXTRAS}]"
# --- Runtime stage ---
FROM base AS runtime

# Runtime shared libs for xmlsec/lxml (SAML signature verification).
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
curl \
openssh-client \
snmp \
libxml2 \
libxmlsec1 \
libxmlsec1-openssl \
&& rm -rf /var/lib/apt/lists/*

# Non-root user with a fixed numeric UID so Kubernetes can verify runAsNonRoot.