
feat(keboola-cli): add DuckDB transformation skill #69

Open
MonikaFeigler wants to merge 1 commit into main from devin/1777287521-duckdb-transformation-skill

Conversation

@MonikaFeigler

Summary

Adds a new duckdb-transformation skill to the keboola-cli plugin, providing AI agents with comprehensive knowledge for writing, optimizing, and migrating DuckDB transformations in Keboola.

The skill (plugins/keboola-cli/skills/duckdb-transformation/SKILL.md) covers:

  • Configuration structure and component ID (keboola.duckdb-transformation)
  • Block-based orchestration and parallel execution model
  • Dynamic backends (XSmall through Large), Parquet format, data type inference
  • SQL writing rules: semicolons, case sensitivity, type casting patterns, DuckDB extensions
  • Snowflake → DuckDB migration: data type mapping, function differences, case sensitivity changes, QUALIFY rewriting, temp table behavior
  • Transformation Migration Component (keboola.app-transformation-migration) usage and limitations
  • Real-world CRM (HubSpot-like) example demonstrating TRY_CAST/NULLIF patterns
  • Best practices and DuckDB vs Snowflake decision guidance
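
The TRY_CAST/NULLIF pattern named in the list is not shown on this page, so the following is only a rough plain-Python analogue of its intent: treat empty strings as NULL, and return NULL instead of failing when a value cannot be converted. The helper name `try_cast_int` is hypothetical, and Python's `int()` parsing rules are not identical to DuckDB's, so this illustrates the NULL handling only.

```python
def try_cast_int(value):
    """Rough analogue of TRY_CAST(NULLIF(value, '') AS INTEGER).

    Hypothetical helper for illustration; not taken from the SKILL.md.
    """
    if value is None or value == "":
        # NULLIF(value, '') maps empty strings to NULL.
        return None
    try:
        return int(value)
    except (TypeError, ValueError):
        # TRY_CAST yields NULL where a plain CAST would raise an error.
        return None


# Typical messy source values: a valid number, an empty string, and free text.
cleaned = [try_cast_int(v) for v in ["42", "", "n/a"]]
```

In a transformation, the same idea lets a load continue past dirty rows and leaves the unparseable values as NULL for later inspection.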

Also bumps the keboola-cli plugin version from 1.0.0 to 1.1.0 and updates READMEs per CLAUDE.md instructions.

Related: content is aligned with connection-docs PR #916 which adds the user-facing DuckDB documentation.

Review & Testing Checklist for Human

  • Verify technical accuracy of case sensitivity claims — the skill states "column names are always case-sensitive regardless of quoting." This comes from the component README, not standard DuckDB docs. Confirm this matches actual runtime behavior in Keboola.
  • Review the ~85% / ~25% migration estimates — the skill states SQLGlot converts ~85% of Snowflake SQL and ~25% of migrations need manual fixes. These are user-provided estimates. Consider whether AI agents should present these as hard numbers or soften the language.
  • Check HubSpot CRM example appropriateness — the real-world example uses HubSpot-like field names (companyId, hs_analytics_source, etc.) in a public repo. Confirm this is acceptable or if it should be made more generic.

Suggested test: Install the plugin locally and ask Claude Code a few DuckDB transformation questions (e.g., "how do I migrate a Snowflake QUALIFY query to DuckDB?", "what backend size should I use for a 7GB dataset?") to verify the skill triggers correctly and produces accurate guidance.

Notes

  • This is a knowledge-only change (no executable code). CI can validate JSON syntax but not content accuracy.
  • The SKILL.md frontmatter follows the same pattern as the existing keboola-config skill.
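
As a sketch of the kind of JSON syntax check the first note refers to, the function below walks a directory and reports every `*.json` file that fails to parse. The repository's actual CI setup is not shown here, so the function name `find_invalid_json` and the directory layout are assumptions made for illustration.

```python
import json
from pathlib import Path


def find_invalid_json(root):
    """Return (path, error message) pairs for each *.json file under root that fails to parse.

    Hypothetical helper; this checks syntax only, not content accuracy.
    """
    failures = []
    for path in sorted(Path(root).rglob("*.json")):
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError) as exc:
            failures.append((str(path), str(exc)))
    return failures
```

Calling `find_invalid_json("plugins")` from the repository root would return an empty list when every JSON file parses. As the note says, a check like this cannot tell whether the skill's guidance is correct.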

Link to Devin session: https://app.devin.ai/sessions/ae19dd2027b64ddc87769f9a9dd7538e
Requested by: @MonikaFeigler

…migration, and best practices

Co-Authored-By: Monika Feigler <monika@feigler.cz>
@devin-ai-integration
Contributor

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@MonikaFeigler marked this pull request as ready for review on April 27, 2026, 11:04
