2 changes: 1 addition & 1 deletion docs/ingestion/native-batch.md
@@ -46,7 +46,7 @@ For related information on batch indexing, see:
To run either kind of JSON-based batch indexing task, you can:

- Use the **Load Data** UI in the web console to define and submit an ingestion spec.
- Define an ingestion spec in JSON based upon the [examples](#parallel-indexing-example) and reference topics for batch indexing. Then POST the ingestion spec to the [Tasks API endpoint](../api-reference/tasks-api.md), `/druid/indexer/v1/task`, the Overlord service. Alternatively, you can use the indexing script included with Druid at `bin/post-index-task`.
- Define an ingestion spec in JSON based upon the [examples](#parallel-indexing-example) and reference topics for batch indexing. Then POST the ingestion spec to the [Tasks API endpoint](../api-reference/tasks-api.md), `/druid/indexer/v1/task`, on the Overlord service.
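
As a rough illustration of the second option, the sketch below wraps the POST in a small helper and prints the task ID from the response. It assumes the Overlord listens on `http://localhost:8081` and that the response body has the shape `{"task":"<taskId>"}`; the names `OVERLORD_URL`, `extract_task_id`, and `submit_spec` are hypothetical and not part of Druid.

```bash
#!/usr/bin/env bash
# Sketch only: OVERLORD_URL and both helper names are illustrative assumptions.
OVERLORD_URL="${OVERLORD_URL:-http://localhost:8081}"

# Read a response body shaped like {"task":"<taskId>"} on stdin
# and print the task ID.
extract_task_id() {
  sed -n 's/.*"task"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# POST the spec file given as $1 to the Tasks API endpoint
# and print the ID of the task that was created.
submit_spec() {
  curl -sS -X POST "${OVERLORD_URL}/druid/indexer/v1/task" \
    -H "Content-Type: application/json" \
    -d @"$1" | extract_task_id
}
```

For example, after sourcing the snippet, `submit_spec my-spec.json` (a placeholder file name) would submit the spec and print the new task ID, which you can then use to look the task up in the web console.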

## Parallel task indexing

8 changes: 4 additions & 4 deletions docs/tutorials/tutorial-batch.md
@@ -118,14 +118,14 @@ Once the spec is submitted, wait a few moments for the data to load, after which

## Loading data with a spec (via command line)

For convenience, the Druid package includes a batch ingestion helper script at `bin/post-index-task`.

This script will POST an ingestion task to the Druid Overlord and poll Druid until the data is available for querying.
To load data with a spec, you need to POST an ingestion task to the Druid Overlord and poll Druid until the data is available for querying.

Run the following command from the Druid package root:

```bash
bin/post-index-task --file quickstart/tutorial/wikipedia-index.json --url http://localhost:8081
curl -X POST http://localhost:8081/druid/indexer/v1/task \
-H "Content-Type: application/json" \
-d @quickstart/tutorial/wikipedia-index.json
```

You should see output like the following:
16 changes: 11 additions & 5 deletions docs/tutorials/tutorial-compaction.md
@@ -45,10 +45,12 @@ This tutorial uses the Wikipedia edits sample data included with the Druid distr
To load the initial data, you use an ingestion spec that loads batch data with segment granularity of `HOUR` and creates between one and three segments per hour.

You can review the ingestion spec at `quickstart/tutorial/compaction-init-index.json`.
Submit the spec as follows to create a datasource called `compaction-tutorial`:
Submit the spec as follows to the Druid Overlord API to create a datasource called `compaction-tutorial`:

```bash
bin/post-index-task --file quickstart/tutorial/compaction-init-index.json --url http://localhost:8081
curl -X POST http://localhost:8081/druid/indexer/v1/task \
-H "Content-Type: application/json" \
-d @quickstart/tutorial/compaction-init-index.json
```

:::info
@@ -106,7 +108,9 @@ This datasource only has 39,244 rows. 39,244 is below the default limit of 5,000
Submit the compaction task now:

```bash
bin/post-index-task --file quickstart/tutorial/compaction-keep-granularity.json --url http://localhost:8081
curl -X POST http://localhost:8081/druid/indexer/v1/task \
-H "Content-Type: application/json" \
-d @quickstart/tutorial/compaction-keep-granularity.json
```

After the task finishes, refresh the [segments view](http://localhost:8888/unified-console.html#segments).
@@ -169,10 +173,12 @@ The Druid distribution includes a compaction task spec to create `DAY` granulari

Note that `segmentGranularity` is set to `DAY` in this compaction task spec.

Submit this task now:
Now, submit this task:

```bash
bin/post-index-task --file quickstart/tutorial/compaction-day-granularity.json --url http://localhost:8081
curl -X POST http://localhost:8081/druid/indexer/v1/task \
-H "Content-Type: application/json" \
-d @quickstart/tutorial/compaction-day-granularity.json
```

It takes some time before the Coordinator marks the old input segments as unused, so you may see an intermediate state with 25 total segments. Eventually, only one DAY granularity segment remains:
6 changes: 4 additions & 2 deletions docs/tutorials/tutorial-delete-data.md
@@ -34,10 +34,12 @@ This tutorial requires the following:

In this tutorial, we will use the Wikipedia edits data, with an indexing spec that creates hourly segments. This spec is located at `quickstart/tutorial/deletion-index.json`, and it creates a datasource called `deletion-tutorial`.

Let's load this initial data:
Let's load our initial data by calling the Druid Overlord:

```bash
bin/post-index-task --file quickstart/tutorial/deletion-index.json --url http://localhost:8081
curl -X POST http://localhost:8081/druid/indexer/v1/task \
-H "Content-Type: application/json" \
-d @quickstart/tutorial/deletion-index.json
```

When the load finishes, open [http://localhost:8888/unified-console.html#datasources](http://localhost:8888/unified-console.html#datasources) in a browser.
10 changes: 6 additions & 4 deletions docs/tutorials/tutorial-ingestion-spec.md
@@ -580,13 +580,15 @@ We've finished defining the ingestion spec, it should now look like the followin

## Submit the task and query the data

From the `apache-druid-{{DRUIDVERSION}}` package root, run the following command:
From the `apache-druid-{{DRUIDVERSION}}` package root, run the following command to create a datasource called `ingestion-tutorial`:

```bash
bin/post-index-task --file quickstart/ingestion-tutorial-index.json --url http://localhost:8081
curl -X POST http://localhost:8081/druid/indexer/v1/task \
-H "Content-Type: application/json" \
-d @quickstart/tutorial/ingestion-tutorial-index.json
```

After the script completes, we will query the data.
After the ingestion completes, we will query the data.
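
Because `curl` returns as soon as the task is accepted, you may want to check when the ingestion has actually finished. The sketch below polls the Tasks API status endpoint, `/druid/indexer/v1/task/{taskId}/status`, until the task reports `SUCCESS` or `FAILED`. It assumes the Overlord is at `http://localhost:8081`; `OVERLORD_URL`, `POLL_SECONDS`, `extract_task_state`, and `wait_for_task` are hypothetical names used only for illustration.

```bash
#!/usr/bin/env bash
# Sketch only: variable and function names here are illustrative assumptions.
OVERLORD_URL="${OVERLORD_URL:-http://localhost:8081}"

# Read a task status response on stdin and print the task state,
# taken from the inner "status" field (for example RUNNING, SUCCESS, FAILED).
extract_task_state() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([A-Z]*\)".*/\1/p'
}

# Poll the status of the task whose ID is $1.
# Returns 0 once the task succeeds and 1 once it fails.
wait_for_task() {
  local task_id="$1" state
  while true; do
    state="$(curl -sS "${OVERLORD_URL}/druid/indexer/v1/task/${task_id}/status" \
      | extract_task_state)"
    case "$state" in
      SUCCESS) return 0 ;;
      FAILED)  return 1 ;;
    esac
    sleep "${POLL_SECONDS:-5}"
  done
}
```

A call such as `wait_for_task "<taskId>"`, using the task ID returned by the POST, blocks until the task ends. A `SUCCESS` state means the task itself finished; the new segments can take a little longer to load before the data is queryable.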

In the web console, open a new tab in the **Query** view. Run the following query to view the ingested data:

@@ -602,4 +604,4 @@ Returns the following:
| `2018-01-01T01:02:00.000Z` | `9000` | `18.1` | `2` | `2.2.2.2` | `7000` | `90` | `6` | `1.1.1.1` | `5000` |
| `2018-01-01T01:03:00.000Z` | `6000` | `4.3` | `1` | `2.2.2.2` | `7000` | `60` | `6` | `1.1.1.1` | `5000` |
| `2018-01-01T02:33:00.000Z` | `30000` | `56.9` | `2` | `8.8.8.8` | `5000` | `300` | `17` | `7.7.7.7` | `4000` |
| `2018-01-01T02:35:00.000Z` | `30000` | `46.3` | `1` | `8.8.8.8` | `5000` | `300` | `17` | `7.7.7.7` | `4000` |
| `2018-01-01T02:35:00.000Z` | `30000` | `46.3` | `1` | `8.8.8.8` | `5000` | `300` | `17` | `7.7.7.7` | `4000` |
6 changes: 4 additions & 2 deletions docs/tutorials/tutorial-retention.md
@@ -35,10 +35,12 @@ It will also be helpful to have finished [Load a file](../tutorials/tutorial-bat

For this tutorial, we'll be using the Wikipedia edits sample data, with an ingestion task spec that will create a separate segment for each hour in the input data.

The ingestion spec can be found at `quickstart/tutorial/retention-index.json`. Let's submit that spec, which will create a datasource called `retention-tutorial`:
The ingestion spec can be found at `quickstart/tutorial/retention-index.json`. Let's submit that spec to the Druid Overlord to create a datasource called `retention-tutorial`:

```bash
bin/post-index-task --file quickstart/tutorial/retention-index.json --url http://localhost:8081
curl -X POST http://localhost:8081/druid/indexer/v1/task \
-H "Content-Type: application/json" \
-d @quickstart/tutorial/retention-index.json
```

After the ingestion completes, go to [http://localhost:8888/unified-console.html#datasources](http://localhost:8888/unified-console.html#datasources) in a browser to access the web console's datasource view.