AI Cloud Engineer Bootcamp — Syllabus

The week-by-week topics, labs, and reference resources I use to run my live bootcamp.

For the current cohort dates, format, price, and registration, see the course page: → becloudready.com/programs/ai-cloud-engineer-bootcamp

Format note (updated): Earlier cohorts ran as a 5-day Mon–Fri intensive. Based on feedback that the pace was overwhelming, the program now runs across 6 weeks, every Monday, 6:00–8:00 PM EST. The total instructional time (~12 hours live) is the same, just spread out so you can actually finish the labs between sessions. You also get ongoing Slack support across all 6 weeks, plus a 1-hour interview prep session after completion.

Prerequisites: basic Linux CLI, comfort with Bash or Python, an AWS Free Tier account.

Week 0 — Before You Show Up

Get these out of the way so we don't burn live time on setup.

AWS Free Tier account, IAM admin user with access keys, AWS CLI installed
Docker Desktop installed and docker run hello-world works
Terraform CLI installed (terraform -v)
kubectl and helm installed
VS Code (or your editor) + Git configured

Light pre-reading if you have time:

LFS101 — Intro to Linux Chapters 1–5
Git & GitHub — my YouTube playlist
AWS Cloud Practitioner Essentials — first 2 modules

Week 1 — Cloud Foundations

Topics

VPC, IAM, EC2, S3, RDS — deep dive
Networking, Security Groups, NACLs
Cost optimisation & billing
AWS CLI & SDK fundamentals

Lab (this week) Stand up a 3-tier VPC (public/private subnets, NAT, IGW), launch an EC2 instance into the private subnet, reach it through a bastion host in the public subnet, query S3 via CLI from the instance, and write a least-privilege IAM policy that scopes access to a single bucket. Lab support runs in Slack all week.
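
To make the "least-privilege policy scoped to a single bucket" part of the lab concrete, here is a minimal sketch that builds such a policy document in Python. The function name single_bucket_policy, the bucket name my-lab-bucket, and the choice of actions (list, get, put) are assumptions for illustration; the lab may call for a narrower or different set.

```python
import json


def single_bucket_policy(bucket: str, prefix: str = "") -> dict:
    """Build an IAM policy document limited to one S3 bucket.

    Listing is granted on the bucket ARN, object access on the object ARN.
    Mixing those two resource types up is a common cause of AccessDenied.
    """
    bucket_arn = f"arn:aws:s3:::{bucket}"
    object_arn = f"{bucket_arn}/{prefix}*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListOneBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {
                "Sid": "ReadWriteObjects",
                "Effect": "Allow",
                # Assumed action set; trim to s3:GetObject for read-only access.
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": [object_arn],
            },
        ],
    }


if __name__ == "__main__":
    # "my-lab-bucket" is a placeholder bucket name.
    print(json.dumps(single_bucket_policy("my-lab-bucket"), indent=2))
```

The printed JSON can be saved to a file and attached to the instance role. Nothing here grants access to any other bucket, which is the point of the exercise.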

References

AWS Technical Essentials
VPC docs · IAM docs · EC2 docs · S3 docs
Well-Architected Labs
quick-labs — my repo for fast instance launchers

Week 2 — Containers

Topics

Docker — images, layers, networking, volumes, multi-stage builds
Kubernetes architecture — pods, deployments, services, ingress
EKS cluster setup on AWS
Helm charts and GitOps with Argo CD

Lab (this week) Containerise a Python/FastAPI app with a multi-stage Dockerfile (<50 MB final image), push to ECR, deploy to an EKS cluster via a Helm chart, then wire Argo CD to auto-sync the deployment from a Git repo.
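
When pushing to ECR, the image reference has to follow the registry's naming format, and a typo there is a frequent cause of failed pushes. The sketch below assembles that reference; the helper name ecr_image_uri and the example account ID, region, repository, and tag are all placeholders, not values from the lab.

```python
def ecr_image_uri(account_id: str, region: str, repository: str, tag: str) -> str:
    """Return the full image reference for a private ECR repository."""
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError("AWS account IDs are 12 digits")
    if not tag:
        raise ValueError("tag must not be empty")
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


if __name__ == "__main__":
    # All four arguments are placeholder values.
    print(ecr_image_uri("123456789012", "us-east-1", "fastapi-demo", "v1"))
```

Tagging with a Git commit SHA rather than latest tends to make the later Argo CD sync easier to reason about, since every deployed image maps back to one commit.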

References

Week 3 — Terraform

Topics

Terraform modules & workspaces
Remote state with S3 + DynamoDB locking
Drift detection and remediation
Security scanning with tfsec

Lab (this week) Rewrite Week 1's manually-built VPC + EC2 as Terraform modules with a dev and prod workspace, store state remotely with locking, and demo terraform plan catching drift.
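
Conceptually, drift detection is a comparison between the attributes you declared and the attributes that actually exist in the cloud. The sketch below illustrates that idea with plain dictionaries. It is not how Terraform is implemented, and the function name detect_drift and the sample attributes are made up for the example.

```python
from typing import Any

ABSENT = "<absent>"  # marker for an attribute present on only one side


def detect_drift(desired: dict[str, Any], actual: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
    """Return {attribute: (desired_value, actual_value)} for every difference."""
    drift = {}
    for key in sorted(desired.keys() | actual.keys()):
        want = desired.get(key, ABSENT)
        have = actual.get(key, ABSENT)
        if want != have:
            drift[key] = (want, have)
    return drift


if __name__ == "__main__":
    # Hypothetical declared configuration vs. what someone changed in the console.
    declared = {"instance_type": "t3.micro", "subnet": "private-a"}
    observed = {"instance_type": "t3.small", "subnet": "private-a"}
    for attr, (want, have) in detect_drift(declared, observed).items():
        print(f"{attr}: declared {want!r}, found {have!r}")
```

In the lab, the equivalent signal is a non-empty terraform plan after someone edits a resource by hand.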

References

Week 4 — DevOps / SRE

Topics

GitHub Actions — build, test, deploy workflows
Blue-green and canary deployment strategies
Container registry workflows + image scanning
Secrets management (AWS Secrets Manager / Vault)
SLOs, error budgets, alerting with Grafana / Datadog
Log aggregation and distributed tracing (OpenTelemetry)
Incident response playbooks

Lab (this week) Wire a GitHub Actions workflow that, on push to main: runs tests → builds + scans a container → pushes to ECR (via OIDC, no static keys) → triggers Argo CD to roll out a blue-green deploy to EKS. Add Prometheus/Grafana dashboards and an OpenTelemetry-instrumented service, define one SLO with an error budget, and write a one-page runbook for a simulated incident.
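
The "one SLO with an error budget" step is mostly arithmetic, so a small worked example may help. The sketch below assumes an availability-style SLO measured either in time or in requests; the function names and the sample numbers are illustrative only.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Minutes of allowed unavailability for an SLO over a rolling window."""
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be a fraction between 0 and 1")
    return (1 - slo_target) * window_days * 24 * 60


def budget_remaining(slo_target: float, total_requests: int, failed_requests: int) -> float:
    """Fraction of the request-based error budget still unspent (negative if blown)."""
    allowed_failures = (1 - slo_target) * total_requests
    if allowed_failures == 0:
        return 0.0
    return 1 - failed_requests / allowed_failures


if __name__ == "__main__":
    print(f"{error_budget_minutes(0.999):.1f} minutes of budget per 30 days")
    print(f"{budget_remaining(0.999, 1_000_000, 250):.0%} of budget remaining")
```

A 99.9% target over 30 days leaves about 43.2 minutes of budget. With one million requests and 250 failures, roughly three quarters of the request budget is still available, which is the kind of number an alerting rule or a release freeze policy can key off.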

References

GitHub Actions docs · Quickstart
Google SRE Book · Site Reliability Workbook
Prometheus — First Steps
Grafana Labs tutorials
OpenTelemetry documentation
DORA — DevOps Research & Assessment — the four key metrics
OWASP DevSecOps Guideline
Snyk Learn

Week 5 — AI, RAG, and AgentOps

Topics

Why every production AI agent is a distributed system, not a notebook
LLM serving — hosted endpoints vs. self-hosted (vLLM, TGI) on GPU nodes
Vector databases — pgvector, Pinecone; embedding workers; index lifecycle
RAG pipeline architecture — retrieval API, reranking, evals
AgentOps — agent orchestration, tool registries, observability for LLM calls, cost guardrails, prompt versioning
SQL-safety patterns for agents that touch real data (read-only roles, query budgets, schema allow-lists)
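
The retrieval step in the RAG topic above comes down to ranking stored embeddings by similarity to a query embedding. The pure-Python sketch below shows that ranking with cosine similarity on toy two-dimensional vectors. In the lab, pgvector does this inside Postgres (its <=> operator is cosine distance), and real embeddings have hundreds of dimensions. The function names and sample vectors are assumptions for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query: list[float], documents: dict[str, list[float]], k: int = 3) -> list[tuple[str, float]]:
    """Return the k document ids most similar to the query, best first."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in documents.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy embeddings; the ids and vectors are made up.
    docs = {"vpc-notes": [1.0, 0.0], "helm-notes": [0.0, 1.0], "eks-notes": [1.0, 1.0]}
    for doc_id, score in top_k([1.0, 0.0], docs, k=2):
        print(doc_id, round(score, 3))
```

A brute-force scan like this is fine for a few thousand chunks. An index in the vector store is what keeps retrieval latency flat as the corpus grows.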

Lab (this week) Layer a small RAG service on top of the EKS cluster from earlier weeks: pgvector as the vector store, a small embedding worker, a FastAPI retrieval service, and an OpenAI-compatible LLM endpoint (vLLM or a hosted model). Instrument the LLM calls with OpenTelemetry, set a per-request token budget, and add a Grafana panel for cost-per-request.
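
For the per-request token budget and the cost-per-request panel, the logic is small enough to sketch directly. Everything named here is hypothetical: the Pricing dataclass, the helper functions, and especially the prices, which are placeholders rather than any provider's real rates. Token counts are assumed to come from the usage figures your LLM endpoint reports.

```python
from dataclasses import dataclass


class TokenBudgetExceeded(Exception):
    """Raised when the prompt alone uses up the per-request budget."""


@dataclass(frozen=True)
class Pricing:
    input_per_1k: float   # currency units per 1,000 prompt tokens (placeholder)
    output_per_1k: float  # currency units per 1,000 completion tokens (placeholder)


def cap_output_tokens(prompt_tokens: int, requested_output: int, budget: int) -> int:
    """Return the output-token limit that keeps the whole request within budget."""
    if prompt_tokens >= budget:
        raise TokenBudgetExceeded(f"prompt uses {prompt_tokens} of {budget} tokens")
    return min(requested_output, budget - prompt_tokens)


def request_cost(prompt_tokens: int, completion_tokens: int, pricing: Pricing) -> float:
    """Cost of one request, suitable for emitting as a metric."""
    return (
        prompt_tokens / 1000 * pricing.input_per_1k
        + completion_tokens / 1000 * pricing.output_per_1k
    )


if __name__ == "__main__":
    pricing = Pricing(input_per_1k=0.5, output_per_1k=1.5)  # made-up rates
    limit = cap_output_tokens(prompt_tokens=3000, requested_output=2000, budget=4000)
    print("max output tokens:", limit)
    print("cost:", request_cost(3000, limit, pricing))
```

The capped value is what you would pass as the maximum output length on the LLM call, and the cost figure is what you would record on the trace span or export as a metric for the Grafana panel.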

References

LangChain docs
LlamaIndex docs
pgvector · Pinecone docs
vLLM · Text Generation Inference (TGI)
OpenTelemetry for LLMs (Traceloop / OpenLLMetry)
db-agent — my open-source reference for the SQL-safety + agent pattern

Week 6 — Project: End-to-End Agentic AI Data Engineering (db-agent)

The capstone. You take everything from Weeks 1–5 and ship one working system: an agentic text-to-SQL data engineering pipeline, deployed on the cloud you've been building all along.

What you'll build

A working clone of the open-source db-agent pattern on your own AWS account
Backend: FastAPI retrieval service + LLM endpoint (Week 5) running on the EKS cluster (Week 2), provisioned by Terraform (Week 3), deployed via the GitHub Actions + Argo CD pipeline (Week 4)
Read-only SQL execution path with query budgets and a schema allow-list (the safety pattern that keeps agents out of production data)
Observability — traces, token-cost dashboards, an SLO on retrieval latency
A one-page architecture diagram, a runbook, and a short demo video
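
The read-only execution path, query budget, and schema allow-list listed above can be sketched in a few lines. This is not db-agent's code. It uses SQLite from the standard library as a stand-in for the real database (where a read-only role would do the same job), a deliberately naive regex instead of a SQL parser, and hypothetical names throughout (ALLOWED_TABLES, MAX_ROWS, run_guarded, make_demo_db).

```python
import os
import re
import sqlite3
import tempfile

ALLOWED_TABLES = {"orders"}  # hypothetical schema allow-list
MAX_ROWS = 100               # hypothetical per-query row budget


def referenced_tables(sql: str) -> set[str]:
    """Naively collect table names that follow FROM or JOIN (not a real parser)."""
    pattern = r"\b(?:from|join)\s+([A-Za-z_][A-Za-z0-9_]*)"
    return {name.lower() for name in re.findall(pattern, sql, flags=re.IGNORECASE)}


def run_guarded(db_path: str, sql: str, max_rows: int = MAX_ROWS) -> list[tuple]:
    """Run agent-generated SQL with three guards: SELECT only, allow-list, row cap."""
    if not sql.lstrip().lower().startswith("select"):
        raise PermissionError("only SELECT statements are allowed")
    tables = referenced_tables(sql)
    if not tables or tables - ALLOWED_TABLES:
        raise PermissionError(f"tables not on the allow-list: {sorted(tables - ALLOWED_TABLES)}")
    # mode=ro makes the connection itself read-only, so a write fails
    # even if the string checks above were somehow bypassed.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        return conn.execute(sql).fetchmany(max_rows)
    finally:
        conn.close()


def make_demo_db() -> str:
    """Create a throwaway database with one allowed and one disallowed table."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE orders (id INTEGER, total REAL)")
    conn.execute("CREATE TABLE users (id INTEGER, email TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", [(i, i * 1.5) for i in range(500)])
    conn.commit()
    conn.close()
    return path


if __name__ == "__main__":
    db = make_demo_db()
    print(len(run_guarded(db, "SELECT id, total FROM orders")), "rows returned")
```

The layering is the design choice worth noticing: the string checks give the agent a clear error message, while the read-only connection is the guard that still holds when the checks miss something.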

This is the project you point recruiters at. End-to-end, real cloud, real AI infra, your code.

References

db-agent repo — the pattern, three deployment variants, five progressive modules
AAAI-25 workshop materials — context on the SQL-safety design

After the Bootcamp

1-hour interview prep session the week after Week 6 — resume review, LinkedIn audit, technical-interview question patterns for Cloud / DevOps / AI Cloud Engineer roles.
Lifetime access to the beCloudReady Slack community.
Eligible to re-attend future cohorts at no extra cost as the curriculum evolves.

Also useful:

Interview prep practice repos: k8s-interview-action · interview-challenges

— Chandan Kumar, beCloudReady
