Commit 39eb0d4

Add Elasticsearch instructions to README
1 parent 9b0c7d6 commit 39eb0d4

1 file changed

Lines changed: 60 additions & 0 deletions

asap-tools/execution-utilities/benchmark/README.md

@@ -264,7 +264,67 @@ python run_benchmark.py \
--output-dir ./results \
--output-prefix h2o
```
---
## Elasticsearch End-to-End Example using H2O Dataset

### Steps 1-5
Follow the same instructions as in the H2O GroupBy example above.

### Step 6 — Launch Arroyo sketch pipeline

```bash
python export_to_arroyo.py \
  --streaming-config ./configs/h2o_streaming.yaml \
  --source-type file \
  --input-file ./data/h2o_arroyo.json \
  --file-format json \
  --ts-format unix_millis \
  --pipeline-name h2o_pipeline \
  --arroyosketch-dir ~/ASAPQuery/asap-summary-ingest \
  --output-dir ./arroyo_outputs
```

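The `--ts-format unix_millis` flag presumably means that timestamps in the input file are integer milliseconds since the Unix epoch; that reading is an assumption, not something this README states. Under that assumption, a minimal Python sketch for converting timestamps when preparing or inspecting the input (the helper names `to_unix_millis` and `from_unix_millis` are hypothetical, not part of the benchmark tools):

```python
from datetime import datetime, timedelta, timezone

# Reference point for Unix time, as a timezone-aware datetime.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_unix_millis(dt: datetime) -> int:
    """Convert a timezone-aware datetime to whole milliseconds since the epoch.

    Integer division on timedeltas avoids floating-point rounding.
    """
    return (dt - EPOCH) // timedelta(milliseconds=1)


def from_unix_millis(ms: int) -> datetime:
    """Convert milliseconds since the epoch back to a UTC datetime."""
    return EPOCH + timedelta(milliseconds=ms)
```

Both helpers expect timezone-aware values; passing a naive `datetime` to `to_unix_millis` raises a `TypeError`.
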
### Step 7 — Start QueryEngineRust

```bash
cd ~/ASAPQuery/asap-query-engine

# Runs in the background; stdout and stderr go to /tmp/query_engine.log
./target/release/query_engine_rust \
  --kafka-topic sketch_topic \
  --input-format json \
  --config ~/ASAPQuery/asap-tools/execution-utilities/benchmark/configs/h2o_inference.yaml \
  --streaming-config ~/ASAPQuery/asap-tools/execution-utilities/benchmark/configs/h2o_streaming.yaml \
  --http-port 8088 --delete-existing-db --log-level DEBUG \
  --output-dir ./output --streaming-engine arroyo \
  --query-language SQL --lock-strategy per-key \
  --prometheus-scrape-interval 1 > /tmp/query_engine.log 2>&1 &
```

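Because the query engine is started in the background, it may be useful to confirm that it is accepting connections before moving on. The following is a minimal sketch, not part of the benchmark tools: the helper name `wait_for_port` is hypothetical, and it only checks that the TCP port from `--http-port` is open, not that the engine has finished initializing.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 30.0,
                  interval_s: float = 0.5) -> bool:
    """Poll until a TCP connection to host:port succeeds or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            # Connection refused or timed out: wait briefly and retry.
            time.sleep(interval_s)
    return False
```

For example, `wait_for_port("localhost", 8088)` returns `True` once the port is open; if it returns `False`, `/tmp/query_engine.log` is the place to look for startup errors.
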
### Step 8 — Load data into Elasticsearch (baseline)

```bash
# Replace your-api-key with a valid Elasticsearch API key
python export_to_database.py \
  --dataset h2o \
  --file-path ./data/G1_1e7_1e2_0_0.csv \
  --es-host localhost \
  --es-port 9200 \
  --es-index h2o_groupby \
  --es-api-key your-api-key \
  --es-bulk-size 5000
```

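After the load finishes, the document count of the index can be checked with Elasticsearch's `_count` API. The sketch below is illustrative only: the helper names are hypothetical, the `http` scheme is an assumption (a secured cluster would need `https`), and it assumes the API key is already in the encoded form that Elasticsearch accepts in an `Authorization: ApiKey ...` header, which may or may not be the form `export_to_database.py` expects.

```python
import json
import urllib.request


def build_count_request(host: str, port: int, index: str, api_key: str,
                        scheme: str = "http") -> urllib.request.Request:
    """Build a GET request for /<index>/_count with API-key authentication."""
    url = f"{scheme}://{host}:{port}/{index}/_count"
    return urllib.request.Request(
        url, headers={"Authorization": f"ApiKey {api_key}"}
    )


def count_documents(request: urllib.request.Request,
                    timeout_s: float = 10.0) -> int:
    """Send the request and return the "count" field of the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.load(response)["count"]
```

With the values from the command above, `count_documents(build_count_request("localhost", 9200, "h2o_groupby", "your-api-key"))` should return the number of indexed rows once Elasticsearch has refreshed the index.
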
### Step 9 — Run benchmark

```bash
python run_benchmark.py \
  --mode asap \
  --asap-sql-file ./queries/h2o_asap.sql \
  --baseline-sql-file ./queries/h2o_elasticsearch.sql \
  --elastic-host localhost \
  --elastic-port 9200 \
  --elastic-api-key your-api-key \
  --output-dir ./results --output-prefix h2o
```
---

## Custom Dataset
