From de43b22c6d3c2814bf76df21898179a92e375488 Mon Sep 17 00:00:00 2001
From: lepy
Date: Mon, 29 Jun 2026 16:22:56 +0200
Subject: [PATCH] docs: CHANGELOG (Keep a Changelog) + docs page
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Until now there was no CHANGELOG; master is ~48 commits (12 features) ahead of
v1.2.0. CHANGELOG.md follows the Keep a Changelog style:

- [Unreleased]: strictly additive increment: Blob foundation (RFC 0003), shared
  integrity mixin + DataFrame.as_blob (RFC 0004), extended DataFrame
  serialization portfolio (Arrow field metadata, Data Package, HDF5/RFC 0002) and
  native, cross-format image metadata across 6 containers + sidecar (RFC 0005).
- [1.2.0]: machine-readable metadata backbone (JSON-LD/RDF, schema, Verifiable
  Credentials, interactive), self-describing DataFrame, MkDocs docs, lean core deps.

docs/changelog.md includes the root file via pymdownx-snippets (no duplicate
maintenance); mkdocs nav: "Changelog". The version bump is left to the release
decision (recommendation: 1.3.0). mkdocs build --strict is green.
---
 CHANGELOG.md      | 69 +++++++++++++++++++++++++++++++++++++++++++++++
 docs/changelog.md |  1 +
 mkdocs.yml        |  1 +
 3 files changed, 71 insertions(+)
 create mode 100644 CHANGELOG.md
 create mode 100644 docs/changelog.md

diff --git a/CHANGELOG.md b/CHANGELOG.md
new file mode 100644
index 0000000..23fdcb2
--- /dev/null
+++ b/CHANGELOG.md
@@ -0,0 +1,69 @@
+# Changelog
+
+All notable changes to **sdata** are documented here. The format is based on
+[Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project adheres to
+[Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+
+## [Unreleased]
+
+A large, strictly **additive** increment: a content/integrity foundation under all
+data containers (`Blob`), a much broader `DataFrame` serialization portfolio, and
+native, format-agnostic metadata embedding for images.
+Core dependencies remain `numpy`, `pandas`, `suuid`; every new backend stays optional with a pure-Python path.
+
+### Added
+
+- **Native image metadata (RFC 0005).** New pure-Python, Pillow-free module
+  `sdata.imagemeta` embeds/reads sdata metadata **natively** into six containers with
+  one API (`detect_format`/`embed`/`extract`/`supported_formats`): **PNG** (`iTXt`),
+  **JPEG** (`APP1`), **JPEG 2000** (`uuid` box), **GIF** (comment extension),
+  **WebP** (`sdAT` chunk) and **TIFF** (private IFD tag, original bytes untouched).
+  `Image` gains a uniform `save`/`from_file` flow, `embedded_metadata()`, and a
+  lossless `.meta.json` **sidecar fallback** for formats without a native
+  carrier (e.g. BMP), controllable via `save(sidecar=True|False|None)`.
+- **`Blob` as the content/integrity/provenance foundation (RFC 0003).** Hardened
+  `Blob` with `sha256`/`sha1`/`md5`, `size`, `verify()`/`update_checksum()`, a lazy
+  `content_bytes` cache, `exists()`, `write(uri)`/`open()` (fsspec), standard-vocabulary
+  provenance metadata (`dcat:mediaType`, `dcterms:*`, `schema:sha256`) and
+  mime/creation-date autofill. `FileReference` and `Image` now build on `Blob`.
+- **Shared integrity mixin (RFC 0004, Option B).** `sdata.sclass.content.ContentIntegrityMixin`
+  provides the hash/`verify`/`size` layer to both `Blob` and `DataFrame` via a
+  `content_bytes` hook (no inheritance between them).
+- **`DataFrame.as_blob(fmt)` (RFC 0004, Option C).** Render a table as a standalone
+  `Blob` in a chosen format (`parquet`/`csv`/`arrow`/`feather`), a composition that
+  grants hash/`verify`/`size`/`write`/`open` without changing the base class.
+- **`DataFrame` serialization portfolio.** Native per-column field metadata in
+  **Arrow/Feather** (`to_arrow`/`from_arrow`/`to_feather`/`from_feather`), a
+  Frictionless **Data Package** bundle (`to_datapackage`/`from_datapackage`, `.zip`),
+  and **HDF5** I/O (`to_hdf`/`from_hdf`, optional `sdata[hdf]`, RFC 0002).
+- **RFCs.** 0002 (HDF5), 0003 (Blob foundation), 0004 (DataFrame vs. Blob),
+  0005 (native image metadata); MkDocs nav, API reference and usage guides extended.
+
+### Changed
+
+- `DataFrame.content_bytes` hashes the **data only** (plain Parquet), so storing the
+  checksum in the metadata does not change the hash (no self-reference).
+- Documentation reorganized around the full DataFrame serialization portfolio and the
+  image-metadata workflow (new `usage/image-metadata.md`).
+
+### Notes
+
+- 100 % line coverage maintained; `mkdocs build --strict` green. `sdata.imagemeta` is
+  measured (100 %) via synthetic, Pillow-free tests, so coverage holds even without
+  Pillow installed.
+
+## [1.2.0] - 2026-06-26
+
+- **Machine-readable metadata backbone.** Typed dtype registry, a registered JSON-LD
+  `@context` (vocab/units/BFO), `to_jsonld`/`to_rdf`/`to_turtle` with `.meta.jsonld`
+  sidecars, declarative `MetadataSchema`/`TableSchema` validation, an interactive
+  Jupyter layer (`_repr_html_`, attribute autocomplete) and signed metadata as W3C
+  **Verifiable Credentials** over the pure-Python EdDSA stack (`sdata.did`).
+- **Self-describing `DataFrame` container** with per-column metadata and Parquet/CSV/
+  dict/JSON-LD serialization, superseding the deprecated `Data` class.
+- **Docs & packaging.** MkDocs Material + mkdocstrings documentation site; core
+  dependencies reduced to `numpy`/`pandas`/`suuid` (stdlib `zoneinfo`); warning-free
+  test suite.
+
+[Unreleased]: https://github.com/lepy/sdata/compare/v1.2.0...HEAD
+[1.2.0]: https://github.com/lepy/sdata/releases/tag/v1.2.0
diff --git a/docs/changelog.md b/docs/changelog.md
new file mode 100644
index 0000000..786b75d
--- /dev/null
+++ b/docs/changelog.md
@@ -0,0 +1 @@
+--8<-- "CHANGELOG.md"
diff --git a/mkdocs.yml b/mkdocs.yml
index fd46930..63f4d0d 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -73,3 +73,4 @@ nav:
     - "0004 — DataFrame and Blob": rfc/0004-dataframe-and-blob.md
     - "0005 — Native image metadata": rfc/0005-native-image-metadata.md
   - Releasing: releasing.md
+  - Changelog: changelog.md