fastmachinelearning · bo3z · May 8, 2026 · May 11, 2026 · May 12, 2026 · May 12, 2026
diff --git a/.gitignore b/.gitignore
@@ -3,7 +3,9 @@ __pycache__
 *~
 *.npy
 _build
-model_1
-model_2
-model_3
 .DS_Store
+.claude/
+hls4ml_prjs/
+data/
+models/
+6_more_models/outputs/
diff --git a/1_getting_started/1a_train_keras.ipynb b/1_getting_started/1a_train_keras.ipynb
@@ -0,0 +1,263 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Part 1a: Getting started with Keras\n",
+    "\n",
+    "In this notebook we will train a small neural network on the LHC jet tagging dataset using Keras v3. When you are done, head straight to **`1c_hls4ml_synth.ipynb`** to convert the trained model to an FPGA design with hls4ml."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "48fc9aa8",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import os\n",
+    "\n",
+    "os.environ['KERAS_BACKEND'] = 'tensorflow'\n",
+    "\n",
+    "import numpy as np\n",
+    "import sys\n",
+    "\n",
+    "sys.path.append('..')\n",
+    "\n",
+    "from sklearn.datasets import fetch_openml\n",
+    "from sklearn.model_selection import train_test_split\n",
+    "from sklearn.preprocessing import LabelEncoder, StandardScaler\n",
+    "\n",
+    "%matplotlib inline\n",
+    "np.random.seed(0)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "1ea16a9e",
+   "metadata": {},
+   "source": [
+    "## Fetch the jet tagging dataset from OpenML\n",
+    "\n",
+    "The [HLS4ML LHC jet dataset](https://openml.org/search?type=data&id=42468) was introduced by [Duarte et al. (2018)](https://arxiv.org/abs/1804.06913) to benchmark fast neural network inference on FPGAs for particle physics applications.\n",
+    "\n",
+    "Jets are collimated sprays of particles produced when quarks or gluons are knocked out of colliding protons at the LHC. Identifying the origin of a jet in real time is a core task for LHC trigger systems, which must decide within a few microseconds whether to keep or discard each collision event.\n",
+    "\n",
+    "The dataset contains 16 high-level jet substructure observables derived from simulated proton-proton collisions at √s = 13 TeV. These include energy correlation functions, N-subjettiness ratios, a groomed jet mass, and constituent multiplicity. The goal is to classify each jet into one of five categories:\n",
+    "\n",
+    "| Label | Jet origin |\n",
+    "|-------|------------|\n",
+    "| `g`   | Gluon |\n",
+    "| `q`   | Light quark |\n",
+    "| `w`   | W boson decay (W → qq') |\n",
+    "| `z`   | Z boson decay (Z → qq̄) |\n",
+    "| `t`   | Top quark decay (t → bqq') |"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "4c7e8da3",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "data = fetch_openml('hls4ml_lhc_jets_hlf')\n",
+    "X, y = data['data'], data['target']"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "77c862eb",
+   "metadata": {},
+   "source": [
+    "### Let's print some information about the dataset\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "77e2e0b9",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "print(data['feature_names'])\n",
+    "print(X.shape, y.shape)\n",
+    "print(X[:5])\n",
+    "print(y[:5])"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "41b77e9a",
+   "metadata": {},
+   "source": [
+    "As you saw above, the `y` target is an array of strings, e.g. `['g', 'w', ...]`.\n",
+    "We need to convert it to a one-hot encoding for training.\n",
+    "Then we split the dataset into a combined training/validation set and a held-out test set:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "31201326",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "le = LabelEncoder()\n",
+    "y_encoded = le.fit_transform(y)\n",
+    "y = np.eye(5)[y_encoded]  # one-hot encode\n",
+    "X_train_val, X_test, y_train_val, y_test = train_test_split(X, y, test_size=0.2, random_state=42)\n",
+    "print(y[:5])"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f3b4d39e",
+   "metadata": {},
+   "outputs": [],
+   "source": "scaler = StandardScaler()\nX_train_val = scaler.fit_transform(X_train_val)\nX_test = scaler.transform(X_test)\n\nos.makedirs('../data/jet-tagging', exist_ok=True)\nnp.save('../data/jet-tagging/X_train_val.npy', X_train_val)\nnp.save('../data/jet-tagging/X_test.npy', X_test)\nnp.save('../data/jet-tagging/y_train_val.npy', y_train_val)\nnp.save('../data/jet-tagging/y_test.npy', y_test)\nnp.save('../data/jet-tagging/classes.npy', le.classes_)"
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f9238a75",
+   "metadata": {},
+   "source": [
+    "## Now construct a model\n",
+    "We'll use three hidden layers with 64, 32, and 32 neurons, each with ReLU activation.\n",
+    "Finally, we add an output layer with 5 neurons and softmax activation to calculate the probability of each of the five classes."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "7c50f7cd",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from keras.models import Sequential\n",
+    "from keras.layers import Dense, Input\n",
+    "from keras.optimizers import Adam\n",
+    "\n",
+    "model = Sequential([Input(shape=(16,))])  # declare the 16 input features explicitly (Keras 3 style)\n",
+    "model.add(Dense(64, name='fc1', activation='relu'))\n",
+    "model.add(Dense(32, name='fc2', activation='relu'))\n",
+    "model.add(Dense(32, name='fc3', activation='relu'))\n",
+    "model.add(Dense(5, name='output', activation='softmax'))\n",
+    "model.summary()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "da38ba67",
+   "metadata": {},
+   "source": [
+    "## Train the model\n",
+    "We'll use the Adam optimiser with categorical crossentropy loss.\n",
+    "The model isn't very complex, so this should take just a few minutes even on a CPU."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "520e2fc5",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "model.compile(optimizer=Adam(learning_rate=1e-3), loss='categorical_crossentropy', metrics=['accuracy'])\n",
+    "model.fit(\n",
+    "    X_train_val,\n",
+    "    y_train_val,\n",
+    "    batch_size=1024,\n",
+    "    epochs=20,\n",
+    "    validation_split=0.25,\n",
+    "    shuffle=True,\n",
+    ")\n",
+    "os.makedirs('../models', exist_ok=True)\n",
+    "model.save('../models/keras_model_part1.h5')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Check performance\n",
+    "Compute the accuracy on the test set and plot a ROC curve for each class:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import plotting\n",
+    "import matplotlib.pyplot as plt\n",
+    "from sklearn.metrics import accuracy_score\n",
+    "\n",
+    "y_keras = model.predict(X_test)\n",
+    "print(\"Accuracy: {}\".format(accuracy_score(np.argmax(y_test, axis=1), np.argmax(y_keras, axis=1))))\n",
+    "plt.figure(figsize=(9, 9))\n",
+    "_ = plotting.makeRoc(y_test, y_keras, le.classes_)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "An accuracy of ~75% is expected for this 5-class problem. Random guessing gives only 20%, and some classes (notably gluon vs. light quark) are physically very similar and genuinely hard to separate even with more sophisticated methods.\n",
+    "\n",
+    "The ROC (Receiver Operating Characteristic) curve shows, for each class, the trade-off between signal efficiency (true positive rate) and background efficiency (false positive rate) as the decision threshold is varied. The area under the curve (AUC) ranges from 0.5 (random classifier) to 1.0 (perfect classifier). A better classifier reaches a higher signal efficiency at a lower background efficiency."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "713edc21",
+   "metadata": {},
+   "source": [
+    "**N.B.** This notebook trains a full-precision (32-bit floating-point) model. When converting to an FPGA design, hls4ml applies post-training quantization (PTQ) by default, which works well at 16-bit precision but struggles to match the floating-point accuracy below ~8 bits. For the most resource-efficient FPGA designs, **quantization-aware training (QAT)** gives substantially better results. See **Part 2** for QKeras (Keras) and Brevitas (PyTorch) QAT examples."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Next step\n",
+    "\n",
+    "Your model is trained and saved. Open **`1c_hls4ml_synth.ipynb`** to convert it to an FPGA design with hls4ml."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "1c38f1d6",
+   "metadata": {},
+   "source": [
+    "## Further reading\n",
+    "\n",
+    "For more details, see: Duarte, Han, Harris et al., \"Fast inference of deep neural networks in FPGAs for particle physics\", JINST 13 P07027 (2018), [arXiv:1804.06913](https://arxiv.org/abs/1804.06913)"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "hls4ml-tutorial",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.16"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}