README: clarify catalog/schema addressing, upload sync, and entry-point registration #18

@elviskahoro


Feedback from integrating hotdata-ibis into a dlt + Hotdata demo. Ordered by how much time each item would have saved.

High impact

1. Explain the ("your-db-name", schema) vs. ("default", schema) asymmetry

The addressing-summary table shows it but never says why. Suggest adding one sentence:

Managed databases are exposed for reading under a synthetic default catalog, regardless of the name you used to create them. The original name is only used for write/admin ops (create_table, drop_table).

This is the single most confusing thing about the API surface.

2. Say where the schema name ("main", "public", …) comes from

Every example uses "main" with no explanation. When data lands via dlt (or any external loader), the schema is often "public". Either show one non-"main" example, or add:

schema is the schema the table lives in inside the managed database — main for tables created via create_table, or whatever schema your external loader used (e.g. dlt writes to public).

3. Drop or replace time.sleep(2) in the Quickstart

A literal sleep in a quickstart reads as a code smell and leaves users guessing what "briefly" means. Options:

  • Expose a con.wait_for_table(...) helper
  • Document real polling knobs for uploads (poll_interval_s / poll_timeout_s are shown for queries but it's unclear if they apply to uploads)
  • Replace with a con.list_tables(...)-until-present loop and add: "upload completion is eventually consistent."
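For reference, the third option could look roughly like the sketch below. The helper name wait_for_table and its timeout_s / interval_s parameters are my own placeholders, not existing hotdata-ibis API; it takes any zero-argument callable that returns table names, so the exact con.list_tables(...) arguments stay out of it.

```python
import time
from typing import Callable, Iterable


def wait_for_table(
    list_tables: Callable[[], Iterable[str]],
    name: str,
    timeout_s: float = 30.0,
    interval_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll list_tables() until `name` appears; return False on timeout.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    deadline = clock() + timeout_s
    while True:
        # Check first, so a table that is already visible returns immediately.
        if name in list_tables():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Assumed usage in the Quickstart (arguments to list_tables not verified):
#   if not wait_for_table(lambda: con.list_tables(), "orders"):
#       raise TimeoutError("upload did not become visible in time")
```

Returning a bool rather than raising leaves the caller to decide how to treat a slow upload; the README could go either way.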

4. Note that ibis.hotdata is registered via entry point

Add one line after Install:

No import hotdata_ibis is needed — installing the package registers the backend, and you call it as ibis.hotdata.connect(...).

I hit ModuleNotFoundError: hotdata_ibis before realizing this.

Medium impact

5. Clarify database_id= at connect time

It's listed but never demonstrated. Add a short example showing when to use it — e.g. "if you already have a managed database id from another tool (a dlt run, the Hotdata CLI), bind it directly and skip the name lookup" — and what it unlocks (con.table("x", database=("default", schema)) works without a create_database round-trip).

6. Define session_id / sandbox

Currently described as "sandbox id (X-Session-Id header)", which is circular. Add one sentence on what a sandbox is and when a reader would care.

7. Move "What's supported" up

Put it right after the Quickstart. Readers want to know if temp tables / memtables / UDFs work before reading 200 lines of API surface.

8. Spell out the ergonomic win of default_connection + default_schema

In the existing-sources section, t = con.table("orders") works without the database=(...) tuple — call that out explicitly. Right now it's only visible if you read the code carefully.

Low impact

9. One-line description per example script

examples/04_ibis_table_workflows.py etc. — file names alone don't say what's inside.

10. Link to "where do I find my workspace_id?"

The Hotdata UI/CLI path, in one sentence.


If only two land, #1 and #2 are the ones that would have saved me the most time.
