Skip to content

loader(dm): update import-into external ID handling#12716

Merged
ti-chi-bot[bot] merged 1 commit into
pingcap:masterfrom
GMHDBJD:update-tidb-importinto-external-id-20260615173725
Jun 16, 2026
Merged

loader(dm): update import-into external ID handling#12716
ti-chi-bot[bot] merged 1 commit into
pingcap:masterfrom
GMHDBJD:update-tidb-importinto-external-id-20260615173725

Conversation

@GMHDBJD

@GMHDBJD GMHDBJD commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

What problem does this PR solve?

Issue Number: close #12699

What is changed and how it works?

  • Update TiDB dependencies to include lightning, importinto: add external ID compatibility flag tidb#69193.
  • Revert the previous DM-side early stripping of external-id from the Lightning Mydumper.SourceDir, so Dumpling and Lightning storage access still use the configured S3 external ID.
  • For DM import-into mode, enable TiDB Lightning's compatibility config TikvImporter.StripS3ExternalIDForImportSQL on the per-task Lightning config. TiDB then strips explicit S3 external IDs only from generated IMPORT INTO SQL resource parameters.
  • Keep DM import-into on the existing Lightning RunOnceWithOptions path instead of running the import-into importer directly.
  • Remove the obsolete TiFlow-side storage.StripS3ExternalID helper and its tests.

Check List

Tests

  • Unit test
    • go test ./dm/loader ./dm/pkg/storage
    • make dm_unit_test_pkg PKG=github.com/pingcap/tiflow/dm/loader
  • Manual test (add detailed scripts or steps below)
    • make fmt
    • git diff --check

Questions

Will it cause performance regression or break compatibility?

No performance regression is expected. This keeps the original S3 URI for Dumpling and Lightning storage access, and only enables the TiDB Lightning import-into SQL compatibility path for DM import-into mode.

Do you need to update user documentation, design documentation or monitoring documentation?

No.

Release note

Fix DM import-into mode to preserve S3 external-id for Lightning storage access while keeping compatibility with older IMPORT INTO planners.

@ti-chi-bot ti-chi-bot Bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. area/dm Issues or PRs related to DM. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jun 15, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the DM loader to run the import-into backend directly, allowing compatibility options to be passed while keeping the source directory intact. It also introduces progress tracking for import-into using atomic byte counters and removes the now-redundant StripS3ExternalID utility. Feedback on the changes highlights a potential database connection leak in runImportInto where the downstream database connection pool is not closed on successful execution, which should be resolved using a deferred close.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread dm/loader/lightning.go Outdated
Comment on lines +359 to +376
importDB, err := conn.GetDownstreamDB(&l.cfg.To)
if err != nil {
return err
}

l.importIntoTotalBytes.Store(0)
l.importIntoFinishedBytes.Store(0)
importer, err := importinto.NewImporter(ctx, cfg, importDB.DB,
importinto.WithProgressUpdater(&importIntoProgressUpdater{loader: l}),
importinto.WithStripS3ExternalIDForImportSQL(),
)
if err != nil {
if closeErr := importDB.Close(); closeErr != nil {
l.logger.Warn("failed to close import-into DB connection", zap.Error(closeErr))
}
return err
}
defer importer.Close()

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The database connection pool importDB created by conn.GetDownstreamDB is only closed when importinto.NewImporter returns an error. If NewImporter succeeds, importDB is never closed, leading to a connection leak every time runImportInto is executed.

Using a deferred function right after successfully obtaining importDB ensures that the connection pool is always closed when runImportInto exits.

Suggested change
importDB, err := conn.GetDownstreamDB(&l.cfg.To)
if err != nil {
return err
}
l.importIntoTotalBytes.Store(0)
l.importIntoFinishedBytes.Store(0)
importer, err := importinto.NewImporter(ctx, cfg, importDB.DB,
importinto.WithProgressUpdater(&importIntoProgressUpdater{loader: l}),
importinto.WithStripS3ExternalIDForImportSQL(),
)
if err != nil {
if closeErr := importDB.Close(); closeErr != nil {
l.logger.Warn("failed to close import-into DB connection", zap.Error(closeErr))
}
return err
}
defer importer.Close()
importDB, err := conn.GetDownstreamDB(&l.cfg.To)
if err != nil {
return err
}
defer func() {
if closeErr := importDB.Close(); closeErr != nil {
l.logger.Warn("failed to close import-into DB connection", zap.Error(closeErr))
}
}()
l.importIntoTotalBytes.Store(0)
l.importIntoFinishedBytes.Store(0)
importer, err := importinto.NewImporter(ctx, cfg, importDB.DB,
importinto.WithProgressUpdater(&importIntoProgressUpdater{loader: l}),
importinto.WithStripS3ExternalIDForImportSQL(),
)
if err != nil {
return err
}
defer importer.Close()

@GMHDBJD GMHDBJD force-pushed the update-tidb-importinto-external-id-20260615173725 branch from e87f862 to 3211cdf Compare June 16, 2026 04:47
@GMHDBJD

GMHDBJD commented Jun 16, 2026

Copy link
Copy Markdown
Contributor Author

/test pull-check

@GMHDBJD GMHDBJD force-pushed the update-tidb-importinto-external-id-20260615173725 branch from 3211cdf to c31bac8 Compare June 16, 2026 05:40
@ti-chi-bot ti-chi-bot Bot added needs-1-more-lgtm Indicates a PR needs 1 more LGTM. approved labels Jun 16, 2026
@ti-chi-bot

ti-chi-bot Bot commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: D3Hunter, joechenrh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot Bot added lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Jun 16, 2026
@ti-chi-bot

ti-chi-bot Bot commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

[LGTM Timeline notifier]

Timeline:

  • 2026-06-16 06:08:14.703359099 +0000 UTC m=+1458595.773676479: ☑️ agreed by D3Hunter.
  • 2026-06-16 06:11:24.324163125 +0000 UTC m=+1458785.394480505: ☑️ agreed by joechenrh.

@GMHDBJD

GMHDBJD commented Jun 16, 2026

Copy link
Copy Markdown
Contributor Author

/retest

@ti-chi-bot ti-chi-bot Bot merged commit f721ec0 into pingcap:master Jun 16, 2026
35 checks passed
dveeden added a commit to dveeden/tiflow that referenced this pull request Jun 16, 2026
Resolve conflict in dm/pkg/storage/utils_test.go: upstream PR pingcap#12716
(loader(dm): update import-into external ID handling) removed the
StripS3ExternalID helper from dm/pkg/storage/utils.go in favor of the
lightning TikvImporter.StripS3ExternalIDForImportSQL config flag. Drop
the now-orphaned TestStripS3ExternalID test accordingly.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved area/dm Issues or PRs related to DM. lgtm release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

DM import-into should not pass explicit S3 external ID to TiDB

3 participants