Commit d8f071b

sqlite-diffable objects command, closes #7
1 parent dc78897 commit d8f071b

2 files changed

Lines changed: 93 additions & 3 deletions

README.md

Lines changed: 39 additions & 3 deletions
@@ -10,6 +10,10 @@ Tools for dumping/loading a SQLite database to diffable directory structure
 
     pip install sqlite-diffable
 
+## Demo
+
+The repository at [simonw/simonwillisonblog-backup](https://github.com/simonw/simonwillisonblog-backup) contains a backup of the database on my blog, https://simonwillison.net/ - created using this tool.
+
 ## Dumping a database
 
 Given a SQLite database called `fixtures.db` containing a table `facetable`, the following will dump out that table to the `dump/` directory:
@@ -32,11 +36,43 @@ You can replace those tables (dropping them before restoring them) using the `--
 
     sqlite-diffable load restored.db dump/ --replace
 
-## Demo
+## Converting to JSON objects
 
-The repository at [simonw/simonwillisonblog-backup](https://github.com/simonw/simonwillisonblog-backup) contains a backup of the database on my blog, https://simonwillison.net/ - created using this tool.
+Table rows are stored in the `.ndjson` files as newline-delimited JSON arrays, like this:
+
+```
+["a", "a", "a-a", 63, null, 0.7364712141640124, "$null"]
+["a", "b", "a-b", 51, null, 0.6020187290499803, "$null"]
+```
+
+Sometimes it can be more convenient to work with a list of JSON objects.
+
+The `sqlite-diffable objects` command can read a `.ndjson` file and its accompanying `.metadata.json` file and output JSON objects to standard output:
+
+    sqlite-diffable objects dump/sortable.ndjson
+
+The output of that command looks something like this:
+```
+{"pk1": "a", "pk2": "a", "content": "a-a", "sortable": 63, "sortable_with_nulls": null, "sortable_with_nulls_2": 0.7364712141640124, "text": "$null"}
+{"pk1": "a", "pk2": "b", "content": "a-b", "sortable": 51, "sortable_with_nulls": null, "sortable_with_nulls_2": 0.6020187290499803, "text": "$null"}
+```
+
+Add `-o` to write that output to a file:
+
+    sqlite-diffable objects dump/sortable.ndjson -o output.txt
+
+Add `--array` to output a JSON array of objects, as opposed to a newline-delimited file:
+
+    sqlite-diffable objects dump/sortable.ndjson --array
+Output:
+```
+[
+{"pk1": "a", "pk2": "a", "content": "a-a", "sortable": 63, "sortable_with_nulls": null, "sortable_with_nulls_2": 0.7364712141640124, "text": "$null"},
+{"pk1": "a", "pk2": "b", "content": "a-b", "sortable": 51, "sortable_with_nulls": null, "sortable_with_nulls_2": 0.6020187290499803, "text": "$null"}
+]
+```
 
-## Format
+## Storage format
 
 Each table is represented as two files. The first, `table_name.metadata.json`, contains metadata describing the structure of the table. For a table called `redirects_redirect` that file might look like this:
 
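To illustrate the conversion that the new README section describes, here is a minimal sketch that pairs each row array with the column names using `zip`. It is not the code from this commit: the column list is copied from the example output in the README rather than read from a real `.metadata.json` file, and the helper name `rows_to_objects` is hypothetical.

```python
import json

# Column names as they would appear under "columns" in the .metadata.json
# file. Assumed here for illustration; copied from the README example output.
columns = [
    "pk1", "pk2", "content", "sortable",
    "sortable_with_nulls", "sortable_with_nulls_2", "text",
]

# Two rows in the newline-delimited array format stored in the .ndjson file.
ndjson_text = (
    '["a", "a", "a-a", 63, null, 0.7364712141640124, "$null"]\n'
    '["a", "b", "a-b", 51, null, 0.6020187290499803, "$null"]\n'
)


def rows_to_objects(columns, lines):
    """Yield one dict per non-blank line, keyed by column name (hypothetical helper)."""
    for line in lines:
        if line.strip():
            yield dict(zip(columns, json.loads(line)))


objects = list(rows_to_objects(columns, ndjson_text.splitlines()))

# Print one JSON object per line, mirroring the default output mode.
for obj in objects:
    print(json.dumps(obj))
```

Because `zip` stops at the shorter of its inputs, a row with fewer values than there are columns would silently produce an object with missing keys; a stricter version might compare lengths first.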

sqlite_diffable/cli.py

Lines changed: 54 additions & 0 deletions
@@ -3,6 +3,7 @@
 import pathlib
 import sqlite_utils
 import sqlite3
+import sys
 
 
 @click.group()
@@ -106,3 +107,56 @@ def load(dbpath, directory, replace):
             if line.strip()
         )
         db[info["name"]].insert_all(rows)
+
+
+@cli.command()
+@click.argument(
+    "filepath",
+    type=click.Path(file_okay=True, allow_dash=False, dir_okay=False, exists=True),
+)
+@click.option(
+    "-o",
+    "--output",
+    type=click.Path(file_okay=True, allow_dash=True, dir_okay=False),
+)
+@click.option(
+    "--array",
+    is_flag=True,
+    help="Output JSON array instead of newline-delimited objects",
+)
+def objects(filepath, output, array):
+    """
+    Output rows from a .ndjson file as newline-delimited JSON objects
+
+    Usage:
+
+        sqlite-diffable objects dump-location/mytable.ndjson
+
+    This will read the column names from the accompanying .metadata.json file.
+    """
+    if not filepath.endswith(".ndjson"):
+        raise click.ClickException("Must be a .ndjson file")
+    path = pathlib.Path(filepath)
+    metadata = path.parent / (path.stem + ".metadata.json")
+    if not metadata.exists():
+        raise click.ClickException("No accompanying .metadata.json file")
+    # Read the column names
+    info = json.loads(metadata.read_text())
+    columns = info["columns"]
+    # Output the rows
+    out = sys.stdout if output is None else open(output, "w")
+    if array:
+        out.write("[")
+    first = True
+    for line in path.open():
+        row = json.loads(line)
+        if array and not first:
+            out.write(",\n")
+        else:
+            out.write("\n")
+        out.write(json.dumps(dict(zip(columns, row))))
+        first = False
+    if array:
+        out.write("\n]\n")
+    else:
+        out.write("\n")
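As written, the loop in `objects` appears to take the `else` branch for the first row as well, so the newline-delimited mode would start its output with a blank line, and a blank line in the input would make `json.loads` raise. The sketch below is one possible variant of the same output logic, not the commit's code: `write_objects` is a hypothetical helper that writes to any file-like object, skips blank lines the way `load` does with `if line.strip()`, and emits separators only between objects.

```python
import io
import json


def write_objects(columns, lines, out, array=False):
    """Write rows as JSON objects to `out` (hypothetical helper).

    With array=False, emit one object per line. With array=True, emit a
    single JSON array with one object per line inside the brackets.
    """
    if array:
        out.write("[\n")
    first = True
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines in the input
        if not first:
            # Separator goes between objects only, never before the first one
            out.write(",\n" if array else "\n")
        out.write(json.dumps(dict(zip(columns, json.loads(line)))))
        first = False
    out.write("\n]\n" if array else "\n")


# Small in-memory example; the column names and rows are made up.
columns = ["pk1", "pk2"]
lines = ['["a", "a"]\n', "\n", '["a", "b"]\n']

ndjson_out = io.StringIO()
write_objects(columns, lines, ndjson_out)

array_out = io.StringIO()
write_objects(columns, lines, array_out, array=True)

print(ndjson_out.getvalue(), end="")
print(array_out.getvalue(), end="")
```

Taking a file-like `out` parameter keeps the helper testable with `io.StringIO`; a caller that opens a real file for `-o` could wrap the call in a `with` block so the file is closed afterwards.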
