Commit 92daf0c

Add feature helpers and insights to Python client
1 parent 281da9b commit 92daf0c

8 files changed

Lines changed: 403 additions & 4 deletions

CHANGELOG.md

Lines changed: 13 additions & 0 deletions
@@ -2,6 +2,19 @@
 
 All notable changes to this project will be documented in this file.
 
+## [0.2.0] - 2025-12-28
+
+### Added
+- Feature helpers: `add_features()` for lags, rolling stats, MoM/YoY %, and z-scores
+- ML-ready bundles: `get_ml_ready()` to fetch multiple series, align dates, impute gaps, and add per-series features
+- Lightweight insights: `get_insight()` returns summary text with latest value, MoM/YoY changes, volatility, and trend
+- Semantic search option: `search(..., mode="semantic")` (where supported by the API)
+- Documentation updates for the new helpers and quotas mapped to existing plans
+
+### Changed
+- Expose new helpers via top-level package; version bump to 0.2.0
+- README updated with examples for features, insights, ML-ready bundles, and search modes
+
 ## [0.1.1] - 2025-12-17
 
 ### Fixed

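The changelog entry for `add_features()` names four feature families: lags, rolling stats, MoM/YoY percent changes, and z-scores. The commit page does not show `datasetiq/features.py`, so the following is only a minimal pandas sketch of how such features are commonly derived. The function name `add_features_sketch`, the rolling and z-score column names, the whole-sample z-score, and the assumption of monthly data (12 rows per year) are all illustrative and not taken from the library; only `value_lag_1`, `value_mom_pct`, and `value_yoy_pct` appear in the README example.

```python
import numpy as np
import pandas as pd


def add_features_sketch(series, lags=(1, 3, 12), windows=(3, 6, 12)):
    """Hypothetical stand-in for add_features(): derive common modeling features."""
    df = pd.DataFrame({"value": series.astype(float)})

    # Lagged copies of the raw value.
    for lag in lags:
        df[f"value_lag_{lag}"] = df["value"].shift(lag)

    # Rolling mean and standard deviation over each window (assumed column names).
    for window in windows:
        rolling = df["value"].rolling(window)
        df[f"value_roll_mean_{window}"] = rolling.mean()
        df[f"value_roll_std_{window}"] = rolling.std()

    # Percent change vs. the previous row (MoM) and 12 rows back (YoY).
    # This assumes a monthly series with no missing months.
    df["value_mom_pct"] = df["value"].pct_change(1, fill_method=None) * 100
    df["value_yoy_pct"] = df["value"].pct_change(12, fill_method=None) * 100

    # Whole-sample z-score (an assumption; a rolling z-score is equally plausible).
    df["value_zscore"] = (df["value"] - df["value"].mean()) / df["value"].std()
    return df


# Synthetic monthly data: 100, 101, ..., 123.
dates = pd.date_range("2022-01-01", periods=24, freq="MS")
values = pd.Series(np.arange(100.0, 124.0), index=dates)
features = add_features_sketch(values)
print(features[["value", "value_lag_1", "value_mom_pct", "value_yoy_pct"]].tail(3))
```

The first rows of each lag, rolling, and percent-change column are `NaN` because there is not yet enough history, which is presumably what the `dropna` parameter in the README signature is meant to control.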
README.md

Lines changed: 49 additions & 0 deletions
@@ -111,6 +111,7 @@ Search for datasets by keyword.
 - `query` (str): Search term (searches titles, descriptions, IDs)
 - `limit` (int): Max results to return (default: `10`, max: `10`)
 - `offset` (int): Pagination offset (default: `0`)
+- `mode` (str): `"keyword"` (default) or `"semantic"` (where supported by API)
 
 **Returns:** `pd.DataFrame` with columns: `id`, `slug`, `title`, `description`, `provider`, `frequency`, `start_date`, `end_date`, `last_updated`
 
@@ -127,6 +128,54 @@ print(results[["id", "title", "provider"]])
 
 ---
 
+### Feature Engineering Helpers
+
+#### `add_features(series, lags=(1,3,12), windows=(3,6,12), include=None, dropna=False)`
+
+Generate common modeling features (lags, rolling stats, MoM/YoY %, z-scores) for a single series.
+
+```python
+df = iq.add_features("fred-cpi", lags=[1, 3, 12], windows=[3, 12])
+print(df[["value", "value_yoy_pct", "value_mom_pct", "value_lag_1"]].tail())
+```
+
+---
+
+### Lightweight Insights
+
+#### `get_insight(series, window="1y")`
+
+Return a small dict with summary text + key metrics (latest value, MoM, YoY, volatility, trend).
+
+```python
+insight = iq.get_insight("fred-cpi", window="1y")
+print(insight["summary"])
+# fred-cpi: latest 311.17 on 2023-12-01 | +0.24% vs prior | +3.12% YoY | trend upward | volatility (std) 1.23
+```
+
+---
+
+### ML-Ready Bundles
+
+#### `get_ml_ready(series_ids, align="inner", impute="ffill+median", features="default")`
+
+Fetch multiple series, align on date, impute gaps, and add per-series features (lags, rolling stats, MoM/YoY %, z-score).
+
+```python
+df = iq.get_ml_ready(
+    ["fred-cpi", "fred-gdp"],
+    align="inner",
+    impute="ffill+median",
+    features="default",
+    lags=[1, 3, 12],
+    windows=[3, 12],
+)
+
+print(df.head())
+```
+
+---
+
 ### Configuration
 
 #### `set_api_key(api_key)`

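The README text for `get_ml_ready` describes aligning several series on date and imputing gaps with `impute="ffill+median"`, but the diff does not show how that is done. One plausible reading, sketched below with plain pandas, is an index join followed by a forward fill and then a per-column median fill for anything the forward fill could not reach. The helper name `bundle_sketch` and the exact order of operations are assumptions, not the library's confirmed behaviour.

```python
import numpy as np
import pandas as pd


def bundle_sketch(series_by_id, align="inner"):
    """Hypothetical stand-in for the align + impute steps of get_ml_ready()."""
    # join="inner" keeps only dates present in every series;
    # join="outer" would keep the union of dates instead.
    wide = pd.concat(series_by_id, axis=1, join=align).sort_index()

    # "ffill+median" read as: carry the last known value forward, then fill
    # whatever is still missing (e.g. leading gaps) with the column median.
    wide = wide.ffill()
    wide = wide.fillna(wide.median())
    return wide


# Synthetic data: CPI has a gap in March, GDP starts a month later with a leading gap.
cpi = pd.Series(
    [300.0, 301.0, np.nan, 303.0],
    index=pd.date_range("2023-01-01", periods=4, freq="MS"),
)
gdp = pd.Series(
    [np.nan, 101.0, 102.0],
    index=pd.date_range("2023-02-01", periods=3, freq="MS"),
)

bundle = bundle_sketch({"fred-cpi": cpi, "fred-gdp": gdp}, align="inner")
print(bundle)
```

With `align="inner"` the January CPI observation is dropped because GDP has no row for that date. The March CPI gap is forward-filled from February, while the leading GDP gap has nothing to carry forward and falls back to the median.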
datasetiq/__init__.py

Lines changed: 5 additions & 1 deletion
@@ -21,12 +21,13 @@
 - Pricing: https://www.datasetiq.com/pricing
 """
 
-__version__ = "0.1.2"
+__version__ = "0.2.0"
 
 # Public API exports
 from .client import get, search
 from .config import configure, set_api_key
 from .cache import clear as clear_cache, get_cache_size
+from .features import add_features, get_insight, get_ml_ready
 
 # Exception exports (for type checking)
 from .exceptions import (
@@ -45,6 +46,9 @@
     # Core functions
     "get",
     "search",
+    "add_features",
+    "get_insight",
+    "get_ml_ready",
     "configure",
     "set_api_key",
     # Cache management

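Of the three helpers exported here, `get_insight` is documented as returning a small dict with summary text and key metrics. Since `datasetiq/features.py` is not visible on this page, the sketch below only illustrates how a summary in the README's format could be assembled from a monthly series. The function name `insight_sketch`, the dict keys other than `summary`, the fixed 12-row window, and the first-vs-last trend rule are all assumptions.

```python
import pandas as pd


def insight_sketch(series_id, series):
    """Hypothetical stand-in for get_insight(): summarise a monthly series."""
    if len(series) < 13:
        raise ValueError("need at least 13 monthly points for a YoY comparison")

    latest = float(series.iloc[-1])
    mom_pct = (series.iloc[-1] / series.iloc[-2] - 1) * 100
    yoy_pct = (series.iloc[-1] / series.iloc[-13] - 1) * 100

    # Treat the last 12 rows as the "1y" window (assumes monthly data).
    window = series.iloc[-12:]
    volatility = float(window.std())

    # Naive trend rule: compare the last and first values in the window.
    if window.iloc[-1] > window.iloc[0]:
        trend = "upward"
    elif window.iloc[-1] < window.iloc[0]:
        trend = "downward"
    else:
        trend = "flat"

    summary = (
        f"{series_id}: latest {latest:.2f} on {series.index[-1].date()} | "
        f"{mom_pct:+.2f}% vs prior | {yoy_pct:+.2f}% YoY | "
        f"trend {trend} | volatility (std) {volatility:.2f}"
    )
    return {
        "summary": summary,
        "latest": latest,
        "mom_pct": float(mom_pct),
        "yoy_pct": float(yoy_pct),
        "volatility": volatility,
        "trend": trend,
    }


# Synthetic monthly data: 100, 101, ..., 112.
dates = pd.date_range("2023-01-01", periods=13, freq="MS")
demo = pd.Series([100.0 + i for i in range(13)], index=dates)
insight = insight_sketch("demo-series", demo)
print(insight["summary"])
```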
datasetiq/client.py

Lines changed: 3 additions & 1 deletion
@@ -402,6 +402,7 @@ def search(
     query: str,
     limit: int = 10,
     offset: int = 0,
+    mode: str = "keyword",
 ) -> pd.DataFrame:
     """
     Search for datasets by keyword.
@@ -410,6 +411,7 @@ def search(
         query: Search query (searches titles, descriptions, IDs)
         limit: Maximum results to return (default: 10, max: 10)
         offset: Pagination offset (default: 0)
+        mode: Search mode ('keyword' default, or 'semantic' where supported)
 
     Returns:
         DataFrame with columns: id, slug, title, description, provider,
@@ -438,7 +440,7 @@ def search(
 
     # Make request
     url = f"{config.base_url}/search"
-    params = {"q": query, "limit": min(limit, 10), "offset": offset}
+    params = {"q": query, "limit": min(limit, 10), "offset": offset, "mode": mode}
 
     response = _make_request_with_retry("GET", url, headers=headers, params=params)
     result = response.json()

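As committed, `search()` forwards `mode` to the API unchanged, so a typo such as `mode="semantc"` would only surface as whatever the server returns. A client-side check is one possible tightening. The sketch below is not part of this commit: `build_search_params` and `VALID_MODES` are hypothetical names, and the set of accepted modes is inferred only from the docstring.

```python
# Hypothetical: the two modes named in the docstring above.
VALID_MODES = ("keyword", "semantic")


def build_search_params(query, limit=10, offset=0, mode="keyword"):
    """Build the query-string dict for /search, rejecting unknown modes early."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    # Same shape as the params dict in the diff, including the limit cap of 10.
    return {"q": query, "limit": min(limit, 10), "offset": offset, "mode": mode}


params = build_search_params("inflation", limit=50, mode="semantic")
print(params)

try:
    build_search_params("inflation", mode="semantc")
except ValueError as exc:
    print(f"rejected: {exc}")
```

Whether to validate on the client is a design choice: it gives faster, clearer errors, but it also means the client must be updated whenever the API adds a new mode.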