feat(catalog): hadoop table and namespace CRUD operations#969

Open
tanmayrauth wants to merge 7 commits into apache:main from tanmayrauth:feat/hadoop-table-crud

Conversation

Contributor

@tanmayrauth tanmayrauth commented May 1, 2026

4: CreateTable + LoadTable + CheckTableExists
Implement the three core table operations:

- CreateTable validates that the namespace exists, rejects custom locations, builds metadata via table.NewMetadata, writes v1.metadata.json through a temp-file-plus-rename pattern, and does a best-effort version-hint write.
- LoadTable calls findVersion to get the current version, builds the metadata path, and delegates to table.NewFromLocation.
- CheckTableExists delegates to isTableDir.

Tests cover: create-and-load round-trip, create with partition spec / sort order / properties, reject custom location, create in a non-existent namespace, create duplicate, load non-existent, load with stale hint, and check-exists true/false.

Depends on #968 #963
Relates to #798

Implement CreateNamespace, DropNamespace, CheckNamespaceExists,
ListNamespaces, LoadNamespaceProperties, and UpdateNamespaceProperties
(unsupported, matching Java).

Relates to apache#798

Depends-on: apache#953 (scaffold)
Depended-on-by: PR 4 (table CRUD), PR 5 (list/drop/rename)
Set up Docker and Spark infrastructure for Hadoop catalog
cross-compatibility testing with Java's HadoopCatalog.

- Add hadoop_validation.py: SparkSession configured with
  spark.sql.catalog.hadoop_test (type=hadoop, warehouse=/home/iceberg/hadoop-warehouse)
- Add shared volume mount in docker-compose.yml:
  /tmp/iceberg-hadoop-warehouse (host) <-> /home/iceberg/hadoop-warehouse (Spark)
- Copy hadoop_validation.py into Spark container via Dockerfile
- Add make integration-hadoop target

No Go code — purely infrastructure so subsequent PRs can add
integration test cases that validate Go ↔ Spark interop.

Depends-on: nothing (parallel with PR 1)
Depended-on-by: PRs 4, 5, 6 (integration test cases)
…p catalog

Implement the three core table operations:

- CreateTable: validates namespace exists, rejects custom locations,
  writes v1.metadata.json via temp-file+rename, updates version hint
- LoadTable: uses findVersion with three-tier fallback, delegates to
  table.NewFromLocation for metadata parsing
- CheckTableExists: delegates to isTableDir

Relates to apache#798

Depends-on: PR 2 (version-hint), PR 3 (namespace-ops)
Depended-on-by: PR 6 (CommitTable)
@tanmayrauth tanmayrauth requested a review from zeroshade as a code owner May 1, 2026 22:08
Add cross-compatibility integration tests verifying CreateTable,
LoadTable, and CheckTableExists work between Go and Spark Hadoop
catalogs. Pre-create the hadoop-warehouse directory before Docker
compose to ensure runner ownership in CI.
@tanmayrauth tanmayrauth force-pushed the feat/hadoop-table-crud branch from 3305012 to fb3dd76 on May 2, 2026 01:52
@tanmayrauth
Contributor Author

@laskoviymishka @zeroshade can you please review this PR?

Comment thread catalog/hadoop/hadoop.go
Comment on lines +253 to +256
info, err := os.Stat(nsPath)
if os.IsNotExist(err) || (err == nil && !info.IsDir()) {
	return nil, fmt.Errorf("%w: %s", catalog.ErrNoSuchNamespace, strings.Join(ns, "."))
}
Member

shouldn't this support customizable file systems beyond just local? i.e. shouldn't this use the io package?

Contributor Author

This is intentionally local-only for now to match the scoped plan (local parity with Spark's Java HadoopCatalog first). The io.IO interface doesn't currently have Stat or MkdirAll equivalents needed for directory-based namespace operations, so switching to it would require extending the interface. I'll open a follow-up issue to add something like StatableIO and refactor to use icebergio.IO throughout for HDFS/cloud support.

Member

Fair enough. Let's continue to use pkg.go.dev/io/fs as inspiration for any changes we make to the IO package.

Member

@zeroshade zeroshade left a comment

LGTM just update the docstrings for NewCatalog/Catalog to specify that this only supports local filesystem for now

Update Catalog and NewCatalog docstrings to note that only local
filesystem paths are currently supported.
@tanmayrauth
Contributor Author
Contributor Author

Updated the docstring.

@zeroshade
Member
Member

looks good, just need to resolve the conflicts!
