Docs/project guide by johnnybabs · Pull Request #11 · N4si/microservices-python-app

johnnybabs · 2026-06-03T19:36:54Z

No description provided.

- Added comprehensive .gitignore covering Terraform state, k8s secrets, build artifacts, Python cache, Node modules, and IDE files
- Untracked 6 secret.yaml files that should never be in git history
- Created directory structure for terraform/, monitoring/, docs/, src/frontend/, .github/workflows/
- Added terraform.tfvars.example template
- Added CLAUDE.md and VIDCAST_UPGRADE_PLAN.md project context files

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- VPC module: VPC, 2 public subnets (eu-west-2a/b), IGW, route table
- IAM module: EKS cluster role + node role with correct policy attachments
- EKS module: cluster v1.31, managed node group, OIDC provider for IRSA
- Validation block rejects T-type instances (blocked by account SCP)
- Security groups module: NodePort rules for ports 30002-30008
- Dev environment: root module wiring all child modules + S3/DynamoDB backend
- All resources tagged: Project=vidcast, ManagedBy=terraform, Environment=dev

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…g + Trivy)

- ci.yml: matrix build for 4 services: ruff lint, Trivy CRITICAL/HIGH scan, Docker build + push tagged with short git SHA (never :latest)
- cd.yml: EKS deployment triggered by workflow_run on CI success
- Jenkinsfile: parallel builds, Trivy scan, Docker Hub push, Swarm staging deploy, smoke test via /healthz, manual approval gate, EKS production deploy with automatic rollback on pipeline failure
- docker-compose.swarm.yml: overlay network, named volumes, rollback on failure for all services; mirrors EKS deployment for staging parity
- GITHUB_SECRETS_REQUIRED.md: documents all secrets needed for CI/CD

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…port

Auth service:
- Added /healthz endpoint testing PostgreSQL connectivity (200 ok / 503 error)

Gateway service:
- Added /healthz endpoint testing MongoDB + RabbitMQ connectivity
- Added flask-cors to requirements.txt; CORS(server) for frontend support

Converter + Notification services:
- Added pathlib.Path('/tmp/healthy').touch() after each successful message

All 4 deployment manifests:
- Liveness + readiness probes (HTTP for auth/gateway, exec for converter/notification)
- Resource requests/limits: auth 50m/200m 64Mi/128Mi, gateway 100m/300m 128Mi/256Mi, converter 250m/500m 256Mi/512Mi, notification 50m/100m 64Mi/128Mi
- securityContext: runAsNonRoot, runAsUser=1000, readOnlyRootFilesystem, allowPrivilegeEscalation=false, capabilities.drop ALL
- Converter + notification: emptyDir volume mounted at /tmp for temp file writes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

… alerts

- monitoring/values.yaml: kube-prometheus-stack config with Grafana NodePort 30007 (admin/vidcast-demo), Alertmanager NodePort 30008, 7d retention, 10Gi storage, etcd/scheduler/controller-manager disabled (EKS manages these)
- monitoring/dashboards/vidcast-operations.json: custom Grafana dashboard with pod status, restart counts, node CPU/memory gauges, RabbitMQ queue depth timeseries, per-pod CPU and memory usage
- monitoring/alerts/vidcast-alerts.yaml: PrometheusRule CRD with 5 alerts: PodCrashLoopBackOff (critical), HighNodeMemory >85% (warning), HighNodeCPU >85% (warning), RabbitMQQueueBacklog >10 msgs (warning), RabbitMQUnavailable (critical)
- monitoring/README.md: install, access, and uninstall instructions

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…rchitecture

- React 18 + Vite + Tailwind CSS single-page application
- Pages: Login (JWT auth), Upload (drag-and-drop MP4), Download (file ID input), Dashboard (Grafana iframe + links), Architecture (interactive service diagram)
- src/api.js: axios wrapper for login, uploadVideo, downloadMp3
- Dockerfile: multi-stage, with a Node 18 build and nginx 1.25 serving as non-root (uid 1001)
- nginx.conf: proxy /api/ to gateway service, SPA routing, security headers
- manifest/: Deployment (NodePort 30006), Service, ConfigMap

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…notes

- README.md: rewritten for public GitHub, covering product overview, architecture diagram, quick-start deploy guide, CI/CD overview, security summary, teardown
- docs/architecture.md: full service inventory, data flow walkthrough (13-step upload path), port map, security architecture (implemented vs discussed-but-not-built)
- docs/deployment-guide.md: step-by-step guide for Terraform, Helm, PostgreSQL init, RabbitMQ queues, secret creation, microservice deploy, E2E test, monitoring install, operational commands, cost management, full teardown
- docs/presentation-notes.md: 12-15 min timing guide, opening script, architecture analogies (restaurant/post office/security badge), platform engineering walkthrough, what-I'd-do-next talking points, 7 common interview questions with full model answers

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

This edit triggers the CI process for Docker image builds.

Removed a line indicating an edit to trigger CI.

Split all multi-import lines (E401) across 7 files.

Additional fixes:
- auth/server.py: bare except → except Exception (E722)
- auth/validate.py: not "x" in → "x" not in (E713)
- gateway/server.py: remove unused DispatcherMiddleware import (F401)
- converter/consumer.py: remove unused time import (F401)
- converter/to_mp3.py: remove unused err variable in except clause (F841)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

python:3.10-slim-bullseye (Debian 11) has CRITICAL/HIGH CVEs with fixes available, causing Trivy to fail CI. python:3.10-slim-bookworm (Debian 12, current stable) resolves these. Applied to all 4 service Dockerfiles. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

prometheus-client was declared in requirements.txt but never imported or initialised. The only intended consumer was the unauth_count counter, whose call sites (unauth_count.inc()) were already removed as a NameError crash fix. Dropping the dependency shrinks the image and removes a dead transitive. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The notification service only reads the mp3 queue and sends email via smtplib. It has no media-processing code path, so the ffmpeg install (~100MB) was pure waste copied from the converter Dockerfile. Removing it shrinks the image and reduces the CVE surface Trivy has to scan. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

None of the four Python service Dockerfiles dropped privileges; the final image ran as root. Added USER 1000 before CMD in each, matching the Kubernetes securityContext (runAsNonRoot: true, runAsUser: 1000) already enforced on the deployments. This makes the images non-root by default even outside k8s (e.g. the Docker Swarm staging environment). All listen ports are >1024 and the only runtime writes target /tmp (1777), so no privileged access is required. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

No service had a .dockerignore, so docker build sent the entire context (including manifest/, secret.yaml files, __pycache__, .git, and docs) to the daemon. The new files exclude that cruft, keeping build contexts small and ensuring Kubernetes secrets can never be baked into an image layer by accident. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The MongoDB connection strings (with embedded username/password) lived in gateway-configmap and converter-configmap. ConfigMaps are not treated as sensitive: they are trivially dumped via `kubectl get configmap -o yaml` and were committed in plaintext. Moved them to the gateway-secret / converter-secret Secret objects. Env var names are unchanged and the deployments already mount both configMapRef and secretRef via envFrom, so this is transparent to the apps.

Also in this change:
- Removed unused VIDEO_QUEUE from notification-configmap (consumer only reads MP3_QUEUE; the video queue is the converter's).
- Added secret.yaml.example templates for all four services (committed) so operators have the key structure without any real secret entering git.
- Added imagePullPolicy: IfNotPresent to the four backend deployments, which CD re-tags with immutable git-SHA images. Left the frontend on the default (Always) since it still uses a mutable :latest tag.
- Updated the deployment guide's secret-creation step for the moved keys.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…ning Comment-only changes documenting known issues that cannot be safely fixed in a surgical pass without coordinated schema/data work: - auth-service/server.py + Postgres/init.sql: flag plaintext password storage and comparison; recommend bcrypt/argon2 + constant-time verify for production. - MongoDB pvc.yaml: flag that the 1Gi claim binds a 10Gi PV, leaving ~9Gi unused. No behaviour changes; these guide the next engineer toward the proper fixes. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Trivy (CRITICAL,HIGH, ignore-unfixed) was failing on vulnerabilities that the bookworm base-image bump alone did not clear, at two layers below the app deps:
- OS packages: added `apt-get upgrade -y` to pull patched libgnutls30 (CRITICAL CVE-2026-33845, CVE-2026-42010) and the libkrb5* family (HIGH).
- Build toolchain: added `pip install --upgrade setuptools wheel` so the image ships patched wheel (CVE-2026-24049) and setuptools-vendored jaraco.context (CVE-2026-23949), neither of which the app imports but Trivy still scans.

Also: dropped the unused build-essential/libpq-dev/python3-dev from the notification image (its deps are pure-Python wheels), and added apt-cache cleanup (`rm -rf /var/lib/apt/lists/*`) to keep the images slim. Verified the debian target reports 0 vulnerabilities on all four images locally.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Rewrote all four requirements.txt as minimal >= floors so pip resolves patched transitive deps (Jinja2, MarkupSafe, idna, charset-normalizer, etc.) instead of the old fully-frozen 2022 pins. Dropped dev-only tooling (pylint/astroid/jedi/isort) that was never imported at runtime, and auth's cryptography (the service signs JWTs with HS256 = stdlib hmac; cryptography is only needed for RS256).

Key version floors (each clears a Trivy-flagged fixable CVE):
- Flask >=3.0.3 / Werkzeug >=3.0.3: CVE-2024-34069 (debugger RCE) is only fixed in Werkzeug 3.0.3, which requires Flask 3. gateway's flask-pymongo bumped to >=3.0.1 for Flask-3 compatibility (the .db API it uses is unchanged).
- Flask-Cors >=4.0.2: CVE-2024-6221 (CORS bypass).
- requests >=2.31.0: CVE-2023-32681.
- certifi >=2023.7.22: CVE-2023-37920.
- urllib3 >=2.6.0: the latest 1.26.x still has 4 fixable HIGH CVEs (e.g. CVE-2025-66418) patched only in the 2.x line; safe because requests supports urllib3 2.x and no app code uses urllib3 directly.
- converter: numpy <2.0 (moviepy 1.0.3 compat) + Pillow >=10.3.0 (CVE-2023-44271 / CVE-2023-50447, CRITICAL).

Verified locally: all four images pass `trivy image --severity CRITICAL,HIGH --ignore-unfixed --exit-code 1` (0 findings), and Flask-3/Flask-PyMongo-3 and moviepy imports were smoke-tested in-container.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…aform

Replaces static AWS access keys in the CD pipeline with short-lived, OIDC-issued credentials, so no long-lived secrets are stored in GitHub.

Terraform:
- New module terraform/modules/github-oidc: creates the GitHub Actions OIDC identity provider and a deploy IAM role whose trust policy is scoped to repo:johnnybabs/microservices-python-app:* (aud sts.amazonaws.com). The role grants only eks:DescribeCluster (for `aws eks update-kubeconfig`).
- eks module: set access_config.authentication_mode = API_AND_CONFIG_MAP so EKS access entries work alongside aws-auth.
- root module: wire the github-oidc module and add an aws_eks_access_entry + access_policy_association granting the deploy role AmazonEKSEditPolicy at cluster scope; this is what lets `kubectl set image` actually run. Added github_org/github_repo variables and a github_actions_role_arn output.

Workflow:
- cd.yml now uses aws-actions/configure-aws-credentials@v4 with role-to-assume and adds `permissions: id-token: write` to request the OIDC token. Drops the AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY inputs.
- GITHUB_SECRETS_REQUIRED.md: CD secrets section rewritten for OIDC (AWS_DEPLOY_ROLE_ARN from `terraform output github_actions_role_arn`).

Validated with `terraform fmt` + `terraform validate` (backend=false). Not yet applied; cluster provisioning runs next.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Both StatefulSets referenced a Secret (mongodb-secret, rabbitmq-secret) that no chart template produced. Fresh helm installs hung in ContainerCreating (Mongo: FailedMount) or CreateContainerConfigError (RabbitMQ: secret not found) until the secrets were created manually.
- MongoDB: 5 keys (MONGO_ROOT_USERNAME/PASSWORD, MONGO_USERNAME/PASSWORD, MONGO_USERS_LIST) sourced from values.yaml.secret.*
- RabbitMQ: 2 keys (RABBITMQ_DEFAULT_USER/PASS) sourced from values.yaml.secret.* (new section; values.yaml had no secret config)

Postgres chart intentionally untouched: it has no referenced-but-missing secret; it injects POSTGRES_USER/PASSWORD/DB directly as env vars from values.yaml, so it renders and runs cleanly as-is.

.gitignore: the blanket **/secret.yaml rule (meant for real app-manifest secrets) was also hiding these chart templates. Added scoped negations so the templates are tracked; they hold no literal credentials, only {{ .Values.secret.* }} references.

Manual secrets remain in place for the current deployment to avoid Helm ownership conflicts. Charts are now self-contained for the next clean install.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Without bootstrap_cluster_creator_admin_permissions=true, the principal that runs terraform apply has no kubectl access to the resulting cluster and must manually create their own access entry. This locked out johnadmin today after the first terraform apply. Fix makes the access grant automatic on cluster creation, preventing recurrence on rebuild. NOT applied to the live cluster: this attribute is creation-only (ForceNew in the AWS provider), so applying against the existing vidcast-cluster would force-replace it. The fix takes effect on the next greenfield rebuild. terraform CLI is also not present in this operator environment, so fmt/validate/plan were not re-run here; the edit is a single aligned attribute addition matching terraform fmt style. Also gitignore the local 'tfplan'/'*.tfplan' binary plan artifacts. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Previously the pika connection was constructed with no credentials, which silently defaulted to guest:guest. With the RabbitMQ Helm chart now configuring rabbituser as the only user, connections failed with ACCESS_REFUSED. This change reads RABBITMQ_DEFAULT_USER and RABBITMQ_DEFAULT_PASS from the container environment, with a guest:guest fallback so local development without a secret still works. The env vars are injected in production via envFrom: secretRef: rabbitmq-secret in each deployment manifest. Gateway has two connection sites (module-level publish channel and the /healthz probe); both now use a shared PlainCredentials object. Resolves the credential mismatch between the chart and the running application code. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
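The environment lookup with a guest:guest fallback might look roughly like the following. This is a sketch under assumptions: the helper name `rabbitmq_credentials` is hypothetical, and the pika wiring is shown only in a comment so the snippet runs without pika installed.

```python
import os
from typing import Mapping, Tuple


def rabbitmq_credentials(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (user, password), falling back to guest:guest when unset.

    In production the two variables come from the rabbitmq-secret via
    envFrom; locally they are usually absent, so the defaults apply.
    """
    user = env.get("RABBITMQ_DEFAULT_USER", "guest")
    password = env.get("RABBITMQ_DEFAULT_PASS", "guest")
    return user, password


if __name__ == "__main__":
    user, password = rabbitmq_credentials()
    # With pika, the pair would typically be wrapped once and shared by every
    # connection site, for example:
    #   credentials = pika.PlainCredentials(user, password)
    #   pika.ConnectionParameters(host="rabbitmq", credentials=credentials)
    # The host name "rabbitmq" is an assumption for illustration.
    print(user)
```

Passing the environment as a parameter keeps the lookup easy to exercise in tests without mutating `os.environ`.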

- Image references updated from nasi101/* (upstream tutorial) to johnbaabalola/*-service (this fork's CI-built images), pinned to commit SHA c91216a for deterministic deploys. Image names match the CI matrix (auth-service, gateway-service, etc.), not the short nasi101 names.
- Gateway, converter, and notification deployments now load RabbitMQ credentials from rabbitmq-secret via an additional envFrom: secretRef (appended to existing envFrom blocks, not replacing them).
- Auth service image bumped but no RabbitMQ secret added (it does not connect to RabbitMQ).

Works with the prior commit that reads RABBITMQ_DEFAULT_USER/PASS from the environment.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The CVE dependency bump (5c224a3) upgraded PyMongo to a release that requires MongoDB >= 4.2 (wire version 8). The chart pinned mongo:4.0.8 (wire version 7), so gateway and converter failed at runtime with: 'Server at mongodb:27017 reports wire version 7, but this version of PyMongo requires at least 8 (MongoDB 4.2).' This surfaced as gateway /healthz 503 (mongodb check) and would have broken all GridFS upload/download. mongo:4.2 is the minimum compatible version and the supported single-step upgrade from 4.0 (a direct jump to 4.4+ refuses to start against a 4.0 feature-compatibility-version data dir). Live cluster already bumped via 'kubectl set image statefulset/mongodb' (no app data existed, so the in-place upgrade was non-destructive). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The converter and notification deployments use an exec liveness probe (test -f /tmp/healthy), but the file was only created AFTER a message was successfully processed. An idle consumer with no traffic therefore never created the file and was killed by the probe (~45s), crash-looping forever. For notification this was unrecoverable: with a placeholder Gmail password, email.notification() always errors -> basic_nack -> the per-message touch never runs, so the pod could never become healthy. Now each consumer touches /tmp/healthy once immediately after connecting to RabbitMQ and being ready to consume (a meaningful 'connected and consuming' signal), and still refreshes it after each processed message. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
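The startup-plus-per-message touch can be illustrated with a small stand-in loop. This is a sketch, not the consumers' real code: `run_consumer`, `handle`, and the list of messages are hypothetical replacements for the pika channel and callback, and a temporary directory is used instead of /tmp/healthy so the example is safe to run anywhere.

```python
import pathlib
import tempfile


def mark_healthy(marker: pathlib.Path) -> None:
    """Create or refresh the file that the exec liveness probe tests for."""
    marker.touch()


def run_consumer(messages, handle, marker: pathlib.Path) -> int:
    """Hypothetical consumer loop: healthy once connected, refreshed per message."""
    # Touched before any message arrives, so an idle consumer passes the probe.
    mark_healthy(marker)
    processed = 0
    for message in messages:
        if handle(message):
            # Refreshed only after a message was processed successfully.
            mark_healthy(marker)
            processed += 1
    return processed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        marker = pathlib.Path(tmp) / "healthy"
        run_consumer([], lambda m: True, marker)  # idle consumer, no traffic
        print(marker.exists())
```

With the old behaviour (touch only inside the success branch) the idle case would leave the marker missing, which is exactly the crash loop the commit describes.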

… to 16f49a0

Three deploy-time fixes found during the live rollout to vidcast-cluster:
- gateway: add an emptyDir volume mounted at /tmp. With readOnlyRootFilesystem=true and no writable temp dir, Werkzeug's multipart upload buffering failed -> POST /upload returned 500 ('No usable temporary directory found'). Other consumers already had this volume; gateway was missing it.
- converter: 4 -> 2 replicas (and maxSurge 8 -> 1). The single m7i-flex.large node (2 vCPU) could not schedule 4 converters @ 250m CPU request alongside the rest; the extra pods sat Pending with 'Insufficient cpu'. 2 replicas comfortably handle demo throughput.
- all four services pinned to johnbaabalola/<svc>:16f49a0 (the SHA that includes the RabbitMQ-credential and /tmp/healthy startup fixes).

End-to-end verified: login -> upload -> convert (MoviePy) -> mp3 queue -> notification consume. Email itself fails by design (placeholder Gmail App Password).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Uploads through the frontend /api proxy failed with 413 Request Entity Too Large: nginx defaults client_max_body_size to 1m, but VidCast uploads MP4s (the bundled test asset alone is 2.8MB). Direct gateway uploads (NodePort 30002) were unaffected because they bypass nginx; only the frontend path (30006 -> /api/) hit the limit. Raised to 256m. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

CI does not build the frontend (matrix covers only the 4 backend services), so johnbaabalola/frontend:latest never existed on Docker Hub. Built locally and pushed to this account's ECR (501562869470.dkr.ecr.eu-west-2.amazonaws.com/vidcast-frontend); the EKS node IAM role can pull from ECR in-account, so no registry credentials or imagePullSecret are needed. Pinned to commit fd35335 (includes the nginx client_max_body_size upload fix). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Adds an account-creation flow so new users aren't limited to the single seeded login.
- auth-service: new POST /register (JSON email+password). Rejects duplicates with 409, inserts into auth_user, and returns a JWT so the new user is signed in immediately. Password stored plaintext to match the existing /login comparison and seeded schema (hashing is a separate, coordinated change touching /login too).
- gateway: public POST /register proxying to auth-service via access.register().
- frontend: api.register() and a Sign In / Sign Up toggle on the Login page (with confirm-password + duplicate/mismatch error handling).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Fix 1 of the frontend-improvements plan. Replaces the "every JWT says admin=true" lie with genuine role-based access control, and closes a privilege-escalation hole in self-registration.

auth-service:
- JWT now carries the user's real role: emits both admin (bool, back-comp for existing gateway/frontend readers) and role (string, forward-comp).
- /login verifies against a bcrypt hash with checkpw (constant-time) and issues the role from the DB. Also fixes a latent psycopg2 bug: execute() always returns None, so the old `if res is None` made unknown users 500 instead of 401; login could not reliably say "no".
- /register hashes with bcrypt and inserts role='user'; returns a non-admin token. Previously it minted an admin JWT for anyone who signed up.
- add bcrypt>=4.1.2.

Postgres init.sql:
- add role (default 'user'), UNIQUE(email), created_at.
- seed admins (baabalola@, johnbsignups@) with bcrypt hashes + role=admin, idempotent via ON CONFLICT. Hashes generated locally from the gitignored plaintext; only the hashes are committed.

gateway:
- /upload and /download now require authentication, not admin (if not access -> 401). They were gated on access["admin"], which only worked while every token lied; real RBAC would have locked out all users.

frontend:
- auth.js decodes the JWT; App.jsx shows Dashboard/Architecture and routes to them only for admins (previously always shown, routes unguarded).

Breaking at deploy time: the bcrypt auth image and the new DB seed must land together (a bcrypt image against a plaintext DB breaks all logins). Migration runbook in src/auth-service/RBAC_EXPLAINED.md; run with John at merge.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
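To make the dual-claim token concrete, here is a standard-library-only sketch of an HS256 token carrying both `admin` and `role`. It is an illustration under assumptions, not the auth service's code: the service most likely uses a JWT library, the helper names are hypothetical, the `username` claim name is inferred from the later commits, and expiry claims are omitted for brevity.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    """Base64url without padding, as used by JWT segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_claims(email: str, role: str) -> dict:
    """Claims carrying the legacy bool and the new role string side by side."""
    return {"username": email, "role": role, "admin": role == "admin"}


def sign_hs256(claims: dict, secret: str) -> str:
    """Produce header.payload.signature using HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(
        secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return f"{signing_input}.{_b64url(signature)}"


def decode_payload(token: str) -> dict:
    """Decode the payload WITHOUT verifying the signature.

    This mirrors what a UI does to decide which menus to show; the server
    must still verify the signature before trusting any claim.
    """
    payload = token.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


if __name__ == "__main__":
    token = sign_hs256(build_claims("new.user@example.com", "user"), "dev-secret")
    print(decode_payload(token))
```

Deriving `admin` from `role` in one place keeps the two claims from drifting apart while older readers still depend on the boolean.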

…rashing

Fix 3 of the frontend-improvements plan. Per-user email routing already worked end-to-end (gateway puts the JWT username on the video message, converter forwards it to the mp3 message, send/email.py uses it as the recipient), so this commit is the robustness half the routing was missing.

send/email.py now obeys a clear contract and never raises:
- returns None -> consumer ACKs (success, or a deliberate skip)
- returns a str -> consumer NACKs (retryable failure)

Changes:
- json.loads wrapped: unparseable bodies are dropped (ACK), not looped on.
- message.get("username"): messages from before per-user routing (no username) are skipped (ACK) instead of raising KeyError. Backward compatible.
- SMTP send wrapped in try/except: a send failure returns an error string so the consumer nacks gracefully. This removes the CrashLoopBackOff root cause (a bad/placeholder Gmail password let SMTPAuthenticationError propagate out of the callback and kill the pod; with a stuck message that was an infinite crash loop).
- friendlier subject/body.

Known limitation (documented): a permanently-bad credential requeues in a loop (poison message). Bounding that needs a dead-letter queue + max-retry, which is deliberately out of scope (no new infra). Not reachable today now that the real Gmail app password is in the secret.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
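The None/str contract can be sketched as below. This is an illustration only: `send` is a hypothetical stand-in for the SMTP call, the `mp3_fid` key is an assumed field name, and `on_message` stands in for the pika callback that would call basic_ack or basic_nack.

```python
import json
from typing import Callable, Optional


def notification(body: bytes, send: Callable[[str, str], None]) -> Optional[str]:
    """Return None to ACK (sent or deliberately skipped), or a str to NACK.

    The function never raises, so a bad message or a bad credential cannot
    escape the consumer callback and kill the process.
    """
    try:
        message = json.loads(body)
    except ValueError:  # covers JSONDecodeError and UnicodeDecodeError
        return None  # unparseable body: drop it rather than loop on it
    if not isinstance(message, dict):
        return None
    recipient = message.get("username")
    if not recipient:
        return None  # message predates per-user routing: skip
    try:
        send(recipient, message.get("mp3_fid", ""))
    except Exception as exc:  # e.g. smtplib.SMTPAuthenticationError
        return f"send failed: {exc}"
    return None


def on_message(body: bytes, send: Callable[[str, str], None]) -> str:
    """Map the contract onto the consumer's decision."""
    return "ack" if notification(body, send) is None else "nack"


if __name__ == "__main__":
    print(on_message(b"not json", lambda recipient, fid: None))
```

Separating "skip" (ACK) from "retryable failure" (NACK) is what keeps malformed or legacy messages from being redelivered forever, while a transient SMTP error still gets another attempt.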

…bcrypt hash Review follow-up F1-F. bcrypt.checkpw raises ValueError("Invalid salt") if the stored password isn't a bcrypt hash — e.g. a legacy plaintext row from before the migration. The unguarded call made /login 500 (and leak a stack trace) for such a row. Wrap it: on ValueError/TypeError, log and treat as a failed login (401). Defence-in-depth on top of the merge runbook, which ensures all rows are bcrypt. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Review follow-up F1-K. The runbook previously lived only in RBAC_EXPLAINED.md, which is gitignored (*_EXPLAINED.md = local study aids), so it would not travel with the branch/PR. Move it to a tracked operational doc. Parameterised — reads PGPASSWORD from the gitignored config, commits no credentials. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…R + live) Review follow-up BH-A. The frontend image 8582bf1 exists in account ECR and is the image the live deployment is already running; the manifest just hadn't been updated from fd35335. Commit it so the manifest matches reality. Confirmed deliberate (not applied by CD — CD only set-images the 4 backends). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Fix 2 of the frontend-improvements plan. Adds file ownership and an in-app notification so users see when their conversion is ready without refreshing.

Ownership (metadata.owner_email, sourced from the uploader's JWT username):
- gateway storage/util.py: tag the stored video with owner_email + filename.
- converter to_mp3.py: copy the tag onto the resulting mp3 (.get so legacy messages without a username don't crash) + give it a filename.

Gateway endpoints (auth required, scoped to the caller's own files):
- GET /notifications/unseen-count?since=<ISO> -> {count} of the user's mp3s created after `since`. Uses count_documents on the GridFS files collection (PyMongo 4 removed Cursor.count()); bad `since` falls back to epoch.
- GET /my-files -> {files:[{fid,filename,size,created}]} newest first (feeds the My Conversions page in Feature 1).

Frontend:
- api.js: unseenCount() + myFiles() helpers.
- hooks/useUnseenCount.js: 5s polling hook (deliberately polling, not SSE/WS, for a single-user demo), cancels cleanly on unmount/token change.
- App.jsx: a `since` "last seen" marker (resets on login and on visiting the Download tab); red badge on the Download nav link when count > 0.

No backfill for pre-ownership files (no correct owner to assign); they simply don't appear in any user's list.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
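The `since` handling and the owner-scoped count can be sketched in memory. This is an illustration under assumptions: `parse_since` and `unseen_count` are hypothetical names, a plain list of dicts stands in for the GridFS files collection and its count_documents query, and the sketch uses timezone-aware datetimes throughout, whereas a real query must match however the driver returns dates.

```python
from datetime import datetime, timezone
from typing import Iterable, Optional

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_since(raw: Optional[str]) -> datetime:
    """Parse an ISO-8601 timestamp; fall back to the epoch on bad input."""
    if not raw:
        return EPOCH
    try:
        # Older Python versions reject a trailing "Z", so normalise it first.
        parsed = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    except ValueError:
        return EPOCH
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def unseen_count(files: Iterable[dict], owner_email: str, since: datetime) -> int:
    """Count the caller's files created after `since`.

    Files without an owner tag (pre-ownership uploads) never match, which is
    why they do not appear in any user's list.
    """
    return sum(
        1
        for f in files
        if f.get("metadata", {}).get("owner_email") == owner_email
        and f["uploadDate"] > since
    )


if __name__ == "__main__":
    print(parse_since("not-a-date") == EPOCH)
```

Falling back to the epoch means a malformed `since` errs on the side of counting everything the user owns rather than returning an error to the polling hook.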

Feature 1 of the frontend-improvements plan. A token-guarded /my-files page listing the user's converted MP3s (filename, date, size) newest-first, each with a Download button. Almost entirely a view over Fix 2's work: it calls the existing myFiles() helper / gateway /my-files endpoint and reuses Download.jsx's blob-download pattern. No new backend or infra.
- pages/MyConversions.jsx: fetch on mount (with unmount-cancel guard), loading/error/empty states, per-row download with a per-row spinner, null-safe size/date formatting.
- App.jsx: "My Conversions" nav link + /my-files route (redirects to / if logged out).

The page is the concrete demo of per-user ownership: the gateway scopes results to the caller's owner_email, so a user only ever sees their own files.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Generated by npm install while building the frontend locally to verify the RBAC/notifications + My Conversions changes. Committing it pins transitive dependency versions so local and (future) CI builds resolve identically. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…ns note

Fix 4 polish. The sign-up endpoint, gateway proxy and React form already existed (commit 8582bf1 + the RBAC hardening); this adds the spec's remaining bits:
- auth /register: reject passwords shorter than 8 chars with a 400 (server-side is the real guard).
- Login.jsx: matching client-side length check (fails fast before the request), an "At least 8 characters" hint under the password field in signup mode, and an "About email notifications" info box explaining that the download link is emailed to the address they sign up with.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Feature 4 of the frontend-improvements plan. An admin-only /admin/users page that makes RBAC concrete: list every user with role, signup date, and conversion count, and promote/demote between user and admin.

auth-service (internal, ClusterIP; no role check of its own, the gateway enforces admin):
- GET /users -> [{email, role, created_at}]
- PATCH /users/<email> -> validate role in {user,admin}, UPDATE ... RETURNING; 404 if no such email.

gateway (enforces admin + guardrails):
- GET /admin/users -> admin only; merges the auth user list with per-user conversion counts (Mongo aggregation on fs.files by metadata.owner_email).
- PATCH /admin/users/<email> -> admin only; guardrails before proxying:
  * self-demotion -> 403 (no accidental self-lockout)
  * last-admin demotion -> 409 (no cluster-wide admin lockout)
  * unknown email -> 404 (passed through from auth)
  Emits an audit line: AUDIT admin_role_change admin=<caller> target=<email> new_role=<role> result=<status>.

frontend:
- api.js: adminUsers() + setUserRole().
- pages/AdminUsers.jsx: table with role badges + Promote/Demote buttons; disables the button on your own row (mirrors the 403 guard); maps 403/409/404 to clear messages and reloads after a change.
- App.jsx: admin-only "Users" nav link + admin-guarded /admin/users route.

No new dependencies, no new deployments. Known limitations (in-cluster trust gap; stdout audit is not tamper-evident) documented in ADMIN_USERS_EXPLAINED.md with the "real fix would be" framing.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
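The guardrail ordering and the audit line can be sketched as a pure function. This is an illustration under assumptions: `check_role_change` and `audit_line` are hypothetical names, a dict of email to role stands in for the auth user list, and the 400 for an invalid role is an assumption (the commit only says the role is validated). The caller's own admin status is taken as already established from the token.

```python
from typing import Dict, Tuple

VALID_ROLES = {"user", "admin"}


def check_role_change(
    caller: str, target: str, new_role: str, roles: Dict[str, str]
) -> Tuple[int, str]:
    """Return (http_status, reason) for a proposed role change."""
    if new_role not in VALID_ROLES:
        return 400, "invalid role"  # assumed status code
    demoting = roles.get(target) == "admin" and new_role != "admin"
    if demoting and target == caller:
        return 403, "self-demotion"
    admin_count = sum(1 for role in roles.values() if role == "admin")
    if demoting and admin_count <= 1:
        return 409, "last admin"
    if target not in roles:
        return 404, "unknown email"
    return 200, "ok"


def audit_line(caller: str, target: str, new_role: str, status: int) -> str:
    """Format the audit record in the shape quoted in the commit message."""
    return (
        f"AUDIT admin_role_change admin={caller} target={target} "
        f"new_role={new_role} result={status}"
    )


if __name__ == "__main__":
    roles = {"admin@example.com": "admin", "user@example.com": "user"}
    status, _ = check_role_change("admin@example.com", "user@example.com", "admin", roles)
    print(audit_line("admin@example.com", "user@example.com", "admin", status))
```

The 409 branch is reachable only when the caller is not the target, for example when a stale admin token is used to demote the one remaining admin, which is why it complements the 403 check rather than duplicating it.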

…tion)

Six decisions made on this branch, each as choose/alternatives/trade-off/where-it-breaks/real-fix: bcrypt-now, polling-vs-SSE, stats-panel-skip, in-cluster trust gap, stdout audit, admin guardrails.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Feature/rbac and notifications

…ication Python print() to stdout is block-buffered in the containers, so diagnostics — notably the gateway admin role-change AUDIT line — never reached `kubectl logs` (Werkzeug access logs did, because they go through logging->stderr). Setting PYTHONUNBUFFERED=1 flushes stdout per line so the audit trail is visible immediately. Same one-line env on all three Python services that print at runtime. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…ary) Two learnings from the integration test: (A) the bcrypt migration is forward-only — once Postgres holds bcrypt hashes the pre-bcrypt auth image can't verify them, so post-migration recovery is fix-forward not rollback; (B) the self-demote 403 and last-admin 409 guards are complementary, not redundant — 409 is the defense for the stale-admin-token case that 403 doesn't cover. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Fix/unbuffered audit logs

A single self-contained guide explaining VidCast from inception to current state, written for three audiences at once (group members, technical assessors, non-technical guests) with analogies inline rather than segregated. 16 sections: what it does, architecture, the microservices, data layer, the upload->download journey, an authn/authz deep dive, infrastructure, the CI and CD pipelines stage-by-stage, the Docker-Hub<->Git trust chain, dev-vs-prod (GitHub Actions vs the written-but-not-yet-running Jenkins pipeline), observability, the eight problems-faced stories, decisions & trade-offs, known limitations, and a glossary. Synthesised from the code, git history, DECISIONS_MADE.md, the merge runbook, and the *_EXPLAINED companions — not stitched. Corrects several aspirational points to match reality (no unit-test stage, SHA-only image tags, MoviePy drives ffmpeg, cluster-level monitoring) and parks genuine gaps honestly in section 15. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

johnnybabs and others added 30 commits June 1, 2026 09:11

Trigger CI for Docker image builds

983174e

This edit triggers the CI process for Docker image builds.

Remove CI trigger comment from README

be63d88

Removed a line indicating an edit to trigger CI.

Edit Readme to trigger CI pipeline

a47207a

johnnybabs and others added 17 commits June 2, 2026 22:36

Merge pull request #1 from johnnybabs/feature/rbac-and-notifications

d9e4282

Feature/rbac and notifications

Merge pull request #2 from johnnybabs/fix/unbuffered-audit-logs

c36b319

Fix/unbuffered audit logs

johnnybabs wants to merge 47 commits into N4si:main from johnnybabs:docs/project-guide
