Commit 0f18e42

Update DATA_PIPELINE.md with recent changes
Updated the last updated date and total views in the documentation. Adjusted environment variable table formatting for clarity.
1 parent 9ac1e98 commit 0f18e42

1 file changed

Lines changed: 14 additions & 19 deletions

src/DATA_PIPELINE.md

@@ -5,7 +5,7 @@ Costa Rica
 [![GitHub](https://img.shields.io/badge/--181717?logo=github&logoColor=ffffff)](https://github.com/)
 [brown9804](https://github.com/brown9804)
 
-Last updated: 2025-11-24
+Last updated: 2025-11-12
 
 ----------
 
@@ -113,7 +113,6 @@ curl -o src/data/updated_product_catalog(in).csv https://raw.githubusercontent.c
 
 ## Scripts
 
-
 <details>
 <summary><b> pipelines/ingest_to_cosmos.py </b> (Click to expand)</summary>
 
@@ -142,7 +141,6 @@ curl -o src/data/updated_product_catalog(in).csv https://raw.githubusercontent.c
 </details>
 
 
-<details>
 <details>
 <summary><b> pipelines/create_search_index.py </b> (Click to expand)</summary>
 
@@ -154,10 +152,8 @@ curl -o src/data/updated_product_catalog(in).csv https://raw.githubusercontent.c
 
 </details>
 
-
 <details>
-<parameter name="oldString"><details>
-<summary><b> pipelines/create_search_index.py </b> (Click to expand)</summary>
+<summary><b> pipelines/create_search_index.py </b> (Click to expand)</summary>
 
 - Creates Azure AI Search index with vector search
 - Configures HNSW algorithm for vector search
@@ -166,9 +162,8 @@ curl -o src/data/updated_product_catalog(in).csv https://raw.githubusercontent.c
 
 </details>
 
-
 <details>
-<parameter name="newString"><summary><b> pipelines/upload_to_search.py </b> (Click to expand)</summary>
+<summary><b> pipelines/upload_to_search.py </b> (Click to expand)</summary>
 
 - Reads all documents from Cosmos DB container
 - Authenticates using AAD or key-based auth (auto-fallback)
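
The "auto-fallback" behaviour noted for `pipelines/upload_to_search.py` is not shown in this diff. As a rough illustration only, the sketch below tries a list of credential strategies in order and returns the first one that succeeds. The function name `connect_with_fallback`, the strategy labels, and the AAD-first ordering are all assumptions made for this example, not the script's actual code.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def connect_with_fallback(
    strategies: Sequence[Tuple[str, Callable[[], T]]],
) -> Tuple[str, T]:
    """Try each (label, factory) pair in order; return the first that succeeds.

    Hypothetical helper: each factory is expected to build and return a
    client, raising an exception if its authentication method is unavailable.
    """
    errors: List[str] = []
    for label, factory in strategies:
        try:
            return label, factory()
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            errors.append(f"{label}: {exc}")
    raise RuntimeError("all authentication strategies failed: " + "; ".join(errors))


# Stand-in factories; a real script would construct SDK clients here.
def _aad_client() -> str:
    raise PermissionError("no AAD token available")


def _key_client() -> str:
    return "client-via-key"


if __name__ == "__main__":
    used, client = connect_with_fallback([("aad", _aad_client), ("key", _key_client)])
    print(f"authenticated with: {used}")
```

Collecting the per-strategy errors keeps the final failure message useful when neither method works, which is usually the hardest case to debug.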
@@ -209,14 +204,14 @@ AZURE_OPENAI_EMBEDDING_DEPLOYMENT=text-embedding-3-small
 
 ### Environment Variable Reference
 
-| Variable | Default | Description |
-|----------|---------|-------------|
-| `COSMOS_SKIP_IF_EXISTS` | `true` | Skip import if container already has data |
-| `COSMOS_FORCE_INGEST` | `false` | Force re-import even if data exists (overrides skip) |
-| `COSMOS_DB_ENDPOINT` | - | Cosmos DB account endpoint URL |
-| `COSMOS_DB_KEY` | - | Cosmos DB account key (optional if using AAD) |
-| `COSMOS_DB_NAME` | - | Database name |
-| `COSMOS_DB_CONTAINER_NAME` | - | Container name for product catalog |
+| Variable                   | Default | Description                                            |
+|----------------------------|---------|--------------------------------------------------------|
+| `COSMOS_SKIP_IF_EXISTS`    | `true`  | Skip import if container already has data              |
+| `COSMOS_FORCE_INGEST`      | `false` | Force re-import even if data exists (overrides skip)   |
+| `COSMOS_DB_ENDPOINT`       | -       | Cosmos DB account endpoint URL                         |
+| `COSMOS_DB_KEY`            | -       | Cosmos DB account key (optional if using AAD)          |
+| `COSMOS_DB_NAME`           | -       | Database name                                          |
+| `COSMOS_DB_CONTAINER_NAME` | -       | Container name for product catalog                     |
 
 ## Verification
 
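
The environment variable table states that `COSMOS_FORCE_INGEST` overrides `COSMOS_SKIP_IF_EXISTS`. One way that precedence could be resolved is sketched below. The helper names `env_flag` and `should_ingest` and the accepted truthy spellings are assumptions for illustration; the diff does not show how the ingest script actually parses these variables.

```python
import os
from typing import Mapping, Optional

# Assumed set of spellings treated as "true"; the real script may differ.
_TRUTHY = {"1", "true", "yes", "on"}


def env_flag(name: str, default: bool, env: Optional[Mapping[str, str]] = None) -> bool:
    """Read a boolean flag from the environment, using default when unset or blank."""
    env = os.environ if env is None else env
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    return raw.strip().lower() in _TRUTHY


def should_ingest(container_has_data: bool, env: Optional[Mapping[str, str]] = None) -> bool:
    """Decide whether to run the import, with defaults taken from the table."""
    force = env_flag("COSMOS_FORCE_INGEST", False, env)
    skip_if_exists = env_flag("COSMOS_SKIP_IF_EXISTS", True, env)
    if force:
        return True  # force overrides skip
    if container_has_data and skip_if_exists:
        return False  # data already present and skipping is enabled
    return True


if __name__ == "__main__":
    print(should_ingest(True, {}))
    print(should_ingest(True, {"COSMOS_FORCE_INGEST": "true"}))
```

Passing the environment as a mapping keeps the decision testable without touching the real process environment.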
@@ -259,7 +254,7 @@ az search index show-statistics \
 
 <!-- START BADGE -->
 <div align="center">
-<img src="https://img.shields.io/badge/Total%20views-1410-limegreen" alt="Total views">
-<p>Refresh Date: 2025-11-24</p>
+<img src="https://img.shields.io/badge/Total%20views-1386-limegreen" alt="Total views">
+<p>Refresh Date: 2025-11-12</p>
 </div>
-<!-- END BADGE -->
+<!-- END BADGE -->
