
docs: add DuckDB transformation documentation #916

Open
MonikaFeigler wants to merge 8 commits into main from devin/1777279792-duckdb-transformation-docs


@MonikaFeigler MonikaFeigler commented Apr 27, 2026

Jira issue(s): N/A

Changes:

  • Add new DuckDB Transformation documentation page (transformations/duckdb/index.md) covering creation, configuration, block-based orchestration, dynamic backends, Parquet format, data type inference, sync actions, DuckDB version selection, examples, best practices, and DuckDB vs Snowflake guidance
  • Add Snowflake to DuckDB Migration Guide (transformations/duckdb/snowflake-migration.md) covering identifier case sensitivity, data type mapping, SQL function differences (QUALIFY, NVL, IFF, DATEADD, LISTAGG, etc.), CREATE TABLE syntax differences, and migration tips
  • Add step-by-step tutorial (5 steps with screenshots) for the experimental Transformation Migration component (keboola.app-transformation-migration) that automates Snowflake → DuckDB migration using SQLGlot, including a prominent warning that ~25% of migrated transformations will require manual adjustments and a dedicated Limitations section
  • Add DuckDB to the sidebar navigation in _data/navigation.yml
  • Update transformations/index.md to list DuckDB as an available SQL backend (marked as beta) and add it to the features table
  • Include 15 UI screenshots illustrating transformation creation, configuration, version selection, data type inference behavior, job errors/success, query examples, and the full migration component workflow (configuration → select transformations → save & run → job detail → migrated result)
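
To give reviewers a feel for the kind of function differences the migration guide lists (NVL, IFF), here is a minimal Python sketch of two rewrites. It is not how the Transformation Migration component works (that component uses SQLGlot); the function name `rewrite_snowflake_functions` and the regex rules are hypothetical and written only for illustration. It assumes simple, non-nested arguments with no commas or parentheses inside them, and it does not skip string literals or comments.

```python
import re

# Hypothetical, hand-written rules for illustration only; they are not
# taken from the migration component or from SQLGlot.
_NVL = re.compile(r"\bNVL\s*\(", re.IGNORECASE)
_IFF = re.compile(
    r"\bIFF\s*\(\s*([^(),]+?)\s*,\s*([^(),]+?)\s*,\s*([^(),]+?)\s*\)",
    re.IGNORECASE,
)


def rewrite_snowflake_functions(sql: str) -> str:
    """Rewrite two Snowflake-only functions into standard SQL equivalents.

    NVL(a, b)        -> COALESCE(a, b)
    IFF(cond, a, b)  -> CASE WHEN cond THEN a ELSE b END

    Only flat argument lists are handled; nested calls are left untouched.
    """
    sql = _NVL.sub("COALESCE(", sql)
    sql = _IFF.sub(r"CASE WHEN \1 THEN \2 ELSE \3 END", sql)
    return sql


if __name__ == "__main__":
    print(rewrite_snowflake_functions(
        "SELECT NVL(a, 0), IFF(b > 1, 'x', 'y') FROM t"
    ))
```

A regex pass like this breaks on nested expressions, which is one reason a parser-based tool is the better fit for real migrations and why some migrated transformations still need manual review.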

Updates since last revision:

  • Add Semicolons Between Statements best practice — documents the requirement that each SQL statement must be terminated with ; when multiple statements are in a single script
  • Add Real-World Example: CRM Data Transformation — a multi-table HubSpot CRM example demonstrating common DuckDB patterns: TRY_CAST(NULLIF(...) AS TYPE) for safe type conversion of empty strings, ::BOOLEAN shorthand casting, and proper semicolon separation between statements
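
The two additions above can be sketched in plain Python. This is only an approximation of the semantics, not DuckDB itself: `split_statements`, `nullif`, and `try_cast_int` are hypothetical helper names, the splitter ignores SQL comments, and Python's `int()` parsing rules are assumed to be close enough to an INTEGER cast for illustration.

```python
from typing import List, Optional


def split_statements(script: str) -> List[str]:
    """Split a SQL script on ';' while ignoring semicolons inside
    single-quoted string literals. Comments are not handled."""
    statements: List[str] = []
    current: List[str] = []
    in_string = False
    for ch in script:
        if ch == "'":
            # A doubled quote ('') toggles twice, so the state stays correct.
            in_string = not in_string
        if ch == ";" and not in_string:
            text = "".join(current).strip()
            if text:
                statements.append(text)
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        # A trailing statement without ';' is still returned here.
        statements.append(tail)
    return statements


def nullif(value: Optional[str], sentinel: str) -> Optional[str]:
    """Mimic SQL NULLIF: None when value equals sentinel, else value."""
    return None if value == sentinel else value


def try_cast_int(value: Optional[str]) -> Optional[int]:
    """Mimic the intent of TRY_CAST(... AS INTEGER): return None
    instead of raising when the value cannot be converted."""
    if value is None:
        return None
    try:
        return int(value)
    except ValueError:
        return None


if __name__ == "__main__":
    print(split_statements("CREATE TABLE a AS SELECT ';' AS s; SELECT 1;"))
    for raw in ["42", "", "abc"]:
        print(repr(raw), "->", try_cast_int(nullif(raw, "")))
```

The point of the `TRY_CAST(NULLIF(...))` pattern is that empty strings and malformed values both end up as NULL rather than failing the job, which the last loop mirrors with `None`.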

Notes for reviewers:

  • Technical details (backend memory sizes, SQL syntax differences, case sensitivity behavior) were provided by the author and cross-referenced with the component README.
  • The case sensitivity section states that column names are always case-sensitive regardless of quoting — this is a Keboola-component-specific nuance sourced from the component README. Worth verifying this matches actual runtime behavior.
  • The migration component tutorial is based on the component-transformation-migration README. The component ID referenced is keboola.app-transformation-migration.
  • The "~85% automatic conversion" and "~25% need manual fixes" claims are user-provided estimates. Consider whether these should be softened (e.g., "most" / "some") if there's no hard data behind them.
  • The CRM example uses HubSpot-like table/column names — verify this is appropriate for public-facing docs or if it should be more generic.
  • Jekyll build passes locally with no errors.

Release Notes

Justification, description

Adds user-facing documentation for the new DuckDB Transformation backend (beta), including a main guide, a Snowflake-to-DuckDB migration reference with SQL dialect differences, and a step-by-step tutorial for the automated Transformation Migration component.

Plans for Customer Communication

N/A

Impact Analysis

Documentation-only change. No code or infrastructure impact.

Deployment Plan

Standard docs deployment on merge to main.

Rollback Plan

Revert the merge commit.

Post-Release Support Plan

N/A

Link to Devin session: https://app.devin.ai/sessions/ae19dd2027b64ddc87769f9a9dd7538e
Requested by: @MonikaFeigler

- Add main DuckDB transformation page with configuration, backend sizes,
  block orchestration, Parquet format, data type inference, examples,
  and best practices
- Add Snowflake to DuckDB migration guide with data type mapping,
  SQL function differences, case sensitivity, and CREATE TABLE syntax
- Add DuckDB to navigation sidebar and transformations features table
- Include UI screenshots for configuration, version selection, job results,
  data type inference, and query examples

Co-Authored-By: Monika Feigler <monika@feigler.cz>
@devin-ai-integration

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

devin-ai-integration Bot and others added 2 commits April 27, 2026 09:02
- Add sync actions section (syntax_check, lineage, execution plan, expected inputs)
- Clarify DuckDB version selection with latest alias and pinning
- Fix case sensitivity: columns always case-sensitive per component README
- Separate table vs column case sensitivity rules in migration guide
- Improve debug mode and configuration descriptions

Co-Authored-By: Monika Feigler <monika@feigler.cz>
Co-Authored-By: Monika Feigler <monika@feigler.cz>
@devin-ai-integration

Testing Results

Ran local Jekyll server and verified all new documentation pages render correctly.

All 5 tests passed

| Test | Result |
| --- | --- |
| DuckDB main page renders with all sections and TOC | Passed |
| All 10 screenshots display on DuckDB main page | Passed |
| Migration guide renders with Transformation Migration tutorial | Passed |
| Code blocks render with SQL syntax highlighting | Passed |
| DuckDB appears in transformations index and sidebar navigation | Passed |
Screenshots

DuckDB Main Page (screenshot): title, TOC, and sidebar nav all render.

Migration Guide (screenshot): Transformation Migration component tutorial with screenshot, steps, and experimental warning.

Transformations Index (screenshot of the features table): DuckDB appears in the features table with the (beta) label.

Devin session

devin-ai-integration Bot and others added 4 commits April 27, 2026 10:47
… manual fix warning

Co-Authored-By: Monika Feigler <monika@feigler.cz>
Removed Snowflake to DuckDB Migration from DuckDB Transformations.
Co-Authored-By: Monika Feigler <monika@feigler.cz>
@MonikaFeigler MonikaFeigler marked this pull request as ready for review April 27, 2026 10:59
@MonikaFeigler MonikaFeigler enabled auto-merge April 27, 2026 10:59
@MonikaFeigler MonikaFeigler requested a review from kudj April 27, 2026 10:59
@MonikaFeigler

@kudj can we release this, please?
