bug(openlineage): add namespace-based DB service resolution for db_table lookups#27005

Merged
TeddyCr merged 10 commits into open-metadata:main from jsingh-yelp:openlineage-namespace-db-service-resolution
Apr 8, 2026

Contributor

@jsingh-yelp jsingh-yelp commented Apr 2, 2026

Describe your changes:

  • Port the namespaceToServiceMapping changes from the Java API to the Python OpenLineage connector so that it can resolve db/table_name. This change was introduced in e9784ac.
  • Extend it with automatic scheme-based resolution: use the URL scheme of the OpenLineage dataset namespace (e.g. mysql://, redshift://) to narrow table lookups to services of the matching type, eliminating false matches when the same table name exists across multiple services.
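
The scheme-based narrowing described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the names `SCHEME_TO_SERVICE_TYPE`, `scheme_of`, and `narrow_services` are hypothetical, the scheme table is an assumed subset, and falling back to all services when the scheme is missing or unknown is a simplification made for the sketch.

```python
from typing import Dict, List, Optional
from urllib.parse import urlparse

# Hypothetical scheme -> service-type table; the connector's real table may differ.
SCHEME_TO_SERVICE_TYPE: Dict[str, str] = {
    "mysql": "Mysql",
    "redshift": "Redshift",
    "postgres": "Postgres",
}


def scheme_of(namespace: str) -> Optional[str]:
    """Return the lowercase URL scheme of a namespace, or None if it has none."""
    if "://" not in namespace:
        return None
    return urlparse(namespace).scheme.lower() or None


def narrow_services(namespace: str, services: Dict[str, str]) -> List[str]:
    """Keep only the services whose type matches the namespace scheme.

    `services` maps service name -> service type. When the scheme is missing
    or not in the table, every service is returned (assumed fallback).
    """
    wanted = SCHEME_TO_SERVICE_TYPE.get(scheme_of(namespace) or "")
    if wanted is None:
        return list(services)
    return [name for name, svc_type in services.items() if svc_type == wanted]
```

With two services of different types that both contain an `orders` table, a `mysql://` namespace would then only consider the MySQL service, which is the false-match case the description targets.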

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

@jsingh-yelp jsingh-yelp requested a review from a team as a code owner April 2, 2026 22:25
Contributor

github-actions Bot commented Apr 2, 2026

Hi there 👋 Thanks for your contribution!

The OpenMetadata team will review the PR shortly! Once it has been labeled as safe to test, the CI workflows
will start executing and we'll be able to make sure everything is working as expected.

Let us know if you need any help!

@jsingh-yelp jsingh-yelp force-pushed the openlineage-namespace-db-service-resolution branch from 4da573e to ba89dc7 Compare April 2, 2026 22:26
@jsingh-yelp jsingh-yelp force-pushed the openlineage-namespace-db-service-resolution branch from ba89dc7 to c9fb800 Compare April 2, 2026 22:48
@jsingh-yelp jsingh-yelp force-pushed the openlineage-namespace-db-service-resolution branch from c9fb800 to 3a5a4d5 Compare April 2, 2026 22:49
@jsingh-yelp jsingh-yelp changed the title from "[openLineage] add namespace-based DB service resolution for db_table lookups" to "bug(openlineage) add namespace-based DB service resolution for db_table lookups" Apr 2, 2026
@jsingh-yelp jsingh-yelp changed the title from "bug(openlineage) add namespace-based DB service resolution for db_table lookups" to "bug(openlineage): add namespace-based DB service resolution for db_table lookups" Apr 2, 2026
Comment thread ingestion/src/metadata/ingestion/source/pipeline/openlineage/table_resolver.py Outdated
@harshach harshach added the "safe to test" label Apr 2, 2026
Contributor

github-actions Bot commented Apr 2, 2026

⚠️ TypeScript Types Need Update

The generated TypeScript types are out of sync with the JSON schema changes.

Since this is a pull request from a forked repository, the types cannot be automatically committed.
Please generate and commit the types manually:

cd openmetadata-ui/src/main/resources/ui
./json2ts-generate-all.sh -l true
git add src/generated/
git commit -m "Update generated TypeScript types"
git push

After pushing the changes, this check will pass automatically.

Contributor

github-actions Bot commented Apr 2, 2026

🛡️ TRIVY SCAN RESULT 🛡️

Target: openmetadata-ingestion:trivy (debian 12.13)

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Java

Vulnerabilities (37)

Package Vulnerability ID Severity Installed Version Fixed Version
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.12.7 2.15.0
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.13.4 2.15.0
com.fasterxml.jackson.core:jackson-databind CVE-2022-42003 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4.2
com.fasterxml.jackson.core:jackson-databind CVE-2022-42004 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4
com.google.code.gson:gson CVE-2022-25647 🚨 HIGH 2.2.4 2.8.9
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.3.0 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.3.0 3.25.5, 4.27.5, 4.28.2
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.7.1 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.7.1 3.25.5, 4.27.5, 4.28.2
com.nimbusds:nimbus-jose-jwt CVE-2023-52428 🚨 HIGH 9.8.1 9.37.2
com.squareup.okhttp3:okhttp CVE-2021-0341 🚨 HIGH 3.12.12 4.9.2
commons-beanutils:commons-beanutils CVE-2025-48734 🚨 HIGH 1.9.4 1.11.0
commons-io:commons-io CVE-2024-47554 🚨 HIGH 2.8.0 2.14.0
dnsjava:dnsjava CVE-2024-25638 🚨 HIGH 2.1.7 3.6.0
io.airlift:aircompressor CVE-2025-67721 🚨 HIGH 0.27 2.0.3
io.netty:netty-codec-http CVE-2026-33870 🚨 HIGH 4.1.96.Final 4.1.132.Final, 4.2.10.Final
io.netty:netty-codec-http2 CVE-2025-55163 🚨 HIGH 4.1.96.Final 4.2.4.Final, 4.1.124.Final
io.netty:netty-codec-http2 CVE-2026-33871 🚨 HIGH 4.1.96.Final 4.1.132.Final, 4.2.11.Final
io.netty:netty-codec-http2 GHSA-xpw8-rcwv-8f8p 🚨 HIGH 4.1.96.Final 4.1.100.Final
io.netty:netty-handler CVE-2025-24970 🚨 HIGH 4.1.96.Final 4.1.118.Final
net.minidev:json-smart CVE-2021-31684 🚨 HIGH 1.3.2 1.3.3, 2.4.4
net.minidev:json-smart CVE-2023-1370 🚨 HIGH 1.3.2 2.4.9
org.apache.avro:avro CVE-2024-47561 🔥 CRITICAL 1.7.7 1.11.4
org.apache.avro:avro CVE-2023-39410 🚨 HIGH 1.7.7 1.11.3
org.apache.derby:derby CVE-2022-46337 🔥 CRITICAL 10.14.2.0 10.14.3, 10.15.2.1, 10.16.1.2, 10.17.1.0
org.apache.ivy:ivy CVE-2022-46751 🚨 HIGH 2.5.1 2.5.2
org.apache.mesos:mesos CVE-2018-1330 🚨 HIGH 1.4.3 1.6.0
org.apache.spark:spark-core_2.12 CVE-2025-54920 🚨 HIGH 3.5.6 3.5.7
org.apache.thrift:libthrift CVE-2019-0205 🚨 HIGH 0.12.0 0.13.0
org.apache.thrift:libthrift CVE-2020-13949 🚨 HIGH 0.12.0 0.14.0
org.apache.zookeeper:zookeeper CVE-2023-44981 🔥 CRITICAL 3.6.3 3.7.2, 3.8.3, 3.9.1
org.eclipse.jetty:jetty-server CVE-2024-13009 🚨 HIGH 9.4.56.v20240826 9.4.57.v20241219
org.lz4:lz4-java CVE-2025-12183 🚨 HIGH 1.8.0 1.8.1

🛡️ TRIVY SCAN RESULT 🛡️

Target: Node.js

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Python

Vulnerabilities (27)

Package Vulnerability ID Severity Installed Version Fixed Version
Authlib CVE-2026-27962 🔥 CRITICAL 1.6.6 1.6.9
Authlib CVE-2026-28490 🚨 HIGH 1.6.6 1.6.9
Authlib CVE-2026-28498 🚨 HIGH 1.6.6 1.6.9
Authlib CVE-2026-28802 🚨 HIGH 1.6.6 1.6.7
PyJWT CVE-2026-32597 🚨 HIGH 2.11.0 2.12.0
Werkzeug CVE-2024-34069 🚨 HIGH 2.2.3 3.0.3
aiohttp CVE-2025-69223 🚨 HIGH 3.12.12 3.13.3
apache-airflow CVE-2026-26929 🚨 HIGH 3.1.7 3.1.8
apache-airflow CVE-2026-28779 🚨 HIGH 3.1.7 3.1.8
apache-airflow CVE-2026-30911 🚨 HIGH 3.1.7 3.1.8
apache-airflow-providers-http CVE-2025-69219 🚨 HIGH 5.6.4 6.0.0
cryptography CVE-2026-26007 🚨 HIGH 42.0.8 46.0.5
jaraco.context CVE-2026-23949 🚨 HIGH 5.3.0 6.1.0
jaraco.context CVE-2026-23949 🚨 HIGH 6.0.1 6.1.0
litellm CVE-2026-35030 🔥 CRITICAL 1.81.6 1.83.0
litellm CVE-2026-35029 🚨 HIGH 1.81.6 1.83.0
protobuf CVE-2026-0994 🚨 HIGH 4.25.8 6.33.5, 5.29.6
pyOpenSSL CVE-2026-27459 🚨 HIGH 24.1.0 26.0.0
pyasn1 CVE-2026-30922 🚨 HIGH 0.6.2 0.6.3
ray CVE-2025-62593 🔥 CRITICAL 2.47.1 2.52.0
starlette CVE-2025-62727 🚨 HIGH 0.48.0 0.49.1
tornado CVE-2026-31958 🚨 HIGH 6.5.4 6.5.5
tornado CVE-2026-35536 🚨 HIGH 6.5.4 6.5.5
urllib3 CVE-2025-66418 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2025-66471 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2026-21441 🚨 HIGH 1.26.20 2.6.3
wheel CVE-2026-24049 🚨 HIGH 0.45.1 0.46.2

🛡️ TRIVY SCAN RESULT 🛡️

Target: usr/bin/docker

Vulnerabilities (2)

Package Vulnerability ID Severity Installed Version Fixed Version
stdlib CVE-2025-68121 🔥 CRITICAL v1.25.6 1.24.13, 1.25.7, 1.26.0-rc.3
stdlib CVE-2026-25679 🚨 HIGH v1.25.6 1.25.8, 1.26.1

🛡️ TRIVY SCAN RESULT 🛡️

Target: /etc/ssl/private/ssl-cert-snakeoil.key

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /home/airflow/openmetadata-airflow-apis/openmetadata_managed_apis.egg-info/PKG-INFO

No Vulnerabilities Found

Contributor

github-actions Bot commented Apr 2, 2026

🛡️ TRIVY SCAN RESULT 🛡️

Target: openmetadata-ingestion-base-slim:trivy (debian 12.13)

Vulnerabilities (4)

Package Vulnerability ID Severity Installed Version Fixed Version
libpng-dev CVE-2026-33416 🚨 HIGH 1.6.39-2+deb12u3 1.6.39-2+deb12u4
libpng-dev CVE-2026-33636 🚨 HIGH 1.6.39-2+deb12u3 1.6.39-2+deb12u4
libpng16-16 CVE-2026-33416 🚨 HIGH 1.6.39-2+deb12u3 1.6.39-2+deb12u4
libpng16-16 CVE-2026-33636 🚨 HIGH 1.6.39-2+deb12u3 1.6.39-2+deb12u4

🛡️ TRIVY SCAN RESULT 🛡️

Target: Java

Vulnerabilities (37)

Package Vulnerability ID Severity Installed Version Fixed Version
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.12.7 2.15.0
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.13.4 2.15.0
com.fasterxml.jackson.core:jackson-databind CVE-2022-42003 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4.2
com.fasterxml.jackson.core:jackson-databind CVE-2022-42004 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4
com.google.code.gson:gson CVE-2022-25647 🚨 HIGH 2.2.4 2.8.9
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.3.0 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.3.0 3.25.5, 4.27.5, 4.28.2
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.7.1 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.7.1 3.25.5, 4.27.5, 4.28.2
com.nimbusds:nimbus-jose-jwt CVE-2023-52428 🚨 HIGH 9.8.1 9.37.2
com.squareup.okhttp3:okhttp CVE-2021-0341 🚨 HIGH 3.12.12 4.9.2
commons-beanutils:commons-beanutils CVE-2025-48734 🚨 HIGH 1.9.4 1.11.0
commons-io:commons-io CVE-2024-47554 🚨 HIGH 2.8.0 2.14.0
dnsjava:dnsjava CVE-2024-25638 🚨 HIGH 2.1.7 3.6.0
io.airlift:aircompressor CVE-2025-67721 🚨 HIGH 0.27 2.0.3
io.netty:netty-codec-http CVE-2026-33870 🚨 HIGH 4.1.96.Final 4.1.132.Final, 4.2.10.Final
io.netty:netty-codec-http2 CVE-2025-55163 🚨 HIGH 4.1.96.Final 4.2.4.Final, 4.1.124.Final
io.netty:netty-codec-http2 CVE-2026-33871 🚨 HIGH 4.1.96.Final 4.1.132.Final, 4.2.11.Final
io.netty:netty-codec-http2 GHSA-xpw8-rcwv-8f8p 🚨 HIGH 4.1.96.Final 4.1.100.Final
io.netty:netty-handler CVE-2025-24970 🚨 HIGH 4.1.96.Final 4.1.118.Final
net.minidev:json-smart CVE-2021-31684 🚨 HIGH 1.3.2 1.3.3, 2.4.4
net.minidev:json-smart CVE-2023-1370 🚨 HIGH 1.3.2 2.4.9
org.apache.avro:avro CVE-2024-47561 🔥 CRITICAL 1.7.7 1.11.4
org.apache.avro:avro CVE-2023-39410 🚨 HIGH 1.7.7 1.11.3
org.apache.derby:derby CVE-2022-46337 🔥 CRITICAL 10.14.2.0 10.14.3, 10.15.2.1, 10.16.1.2, 10.17.1.0
org.apache.ivy:ivy CVE-2022-46751 🚨 HIGH 2.5.1 2.5.2
org.apache.mesos:mesos CVE-2018-1330 🚨 HIGH 1.4.3 1.6.0
org.apache.spark:spark-core_2.12 CVE-2025-54920 🚨 HIGH 3.5.6 3.5.7
org.apache.thrift:libthrift CVE-2019-0205 🚨 HIGH 0.12.0 0.13.0
org.apache.thrift:libthrift CVE-2020-13949 🚨 HIGH 0.12.0 0.14.0
org.apache.zookeeper:zookeeper CVE-2023-44981 🔥 CRITICAL 3.6.3 3.7.2, 3.8.3, 3.9.1
org.eclipse.jetty:jetty-server CVE-2024-13009 🚨 HIGH 9.4.56.v20240826 9.4.57.v20241219
org.lz4:lz4-java CVE-2025-12183 🚨 HIGH 1.8.0 1.8.1

🛡️ TRIVY SCAN RESULT 🛡️

Target: Node.js

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Python

Vulnerabilities (13)

Package Vulnerability ID Severity Installed Version Fixed Version
apache-airflow CVE-2026-26929 🚨 HIGH 3.1.7 3.1.8
apache-airflow CVE-2026-28779 🚨 HIGH 3.1.7 3.1.8
apache-airflow CVE-2026-30911 🚨 HIGH 3.1.7 3.1.8
cryptography CVE-2026-26007 🚨 HIGH 42.0.8 46.0.5
jaraco.context CVE-2026-23949 🚨 HIGH 5.3.0 6.1.0
jaraco.context CVE-2026-23949 🚨 HIGH 6.0.1 6.1.0
pyOpenSSL CVE-2026-27459 🚨 HIGH 24.1.0 26.0.0
starlette CVE-2025-62727 🚨 HIGH 0.48.0 0.49.1
urllib3 CVE-2025-66418 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2025-66471 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2026-21441 🚨 HIGH 1.26.20 2.6.3
wheel CVE-2026-24049 🚨 HIGH 0.45.1 0.46.2
wheel CVE-2026-24049 🚨 HIGH 0.45.1 0.46.2

🛡️ TRIVY SCAN RESULT 🛡️

Target: /etc/ssl/private/ssl-cert-snakeoil.key

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/extended_sample_data.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/lineage.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data.json

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data_aut.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage.json

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage_aut.yaml

No Vulnerabilities Found

auto-merge was automatically disabled April 4, 2026 16:51

Head branch was pushed to by a user without write access

Contributor Author

@harshach I resolved two of them; the others are either false positives or non-issues IMO.

aniketkatkar97 previously approved these changes Apr 6, 2026
@mohittilala mohittilala commented
Contributor

Manually cherry-picked to 1.12.7 470d0de


gitar-bot Bot commented Apr 24, 2026

Code Review ✅ Approved 14 resolved / 14 findings

Implements namespace-based DB service resolution for OpenLineage table lookups. The update consolidates redundant methods, resolves multiple ingestion pipeline crashes, and optimizes API usage by correcting caching and ambiguity handling.

✅ 14 resolved
Quality: Duplicate methods: _find_table_fqn and _get_table_fqn_from_om are identical

📄 ingestion/src/metadata/ingestion/source/pipeline/openlineage/metadata.py:292-306 📄 ingestion/src/metadata/ingestion/source/pipeline/openlineage/metadata.py:337-351 📄 ingestion/src/metadata/ingestion/source/pipeline/openlineage/metadata.py:548
_find_table_fqn (lines 292-309) and _get_table_fqn_from_om (lines 337-354) have identical logic: iterate through DB services, call fqn.build(), return first match or raise FQNNotFoundException. The only difference is the error message string. _find_table_fqn is called from _get_table_fqn, while _get_table_fqn_from_om is called from get_create_table_request. This duplication will diverge over time and is confusing for maintainers.

Additionally, _get_table_fqn_from_om at line 548 doesn't benefit from the new namespace-based service resolution, so table creation lookups still search all services — which may be intentional but is inconsistent with the lookup path.
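
One way to remove the duplication described in this finding is a single helper that takes the error message as a parameter. The sketch below is an assumption-laden illustration, not the merged code: `resolve_table_fqn`, `FQNNotFound`, and the `build_fqn` callable are hypothetical stand-ins for the connector's own method, exception, and `fqn.build()` call.

```python
from typing import Callable, Iterable, Optional


class FQNNotFound(Exception):
    """Stand-in for the connector's own not-found exception."""


def resolve_table_fqn(
    services: Iterable[str],
    build_fqn: Callable[[str, str], Optional[str]],
    table_name: str,
    not_found_message: str,
) -> str:
    """Return the first FQN built for `table_name` across `services`.

    Both call sites can share this helper and differ only in the
    message passed as `not_found_message`.
    """
    for service in services:
        fqn = build_fqn(service, table_name)
        if fqn:
            return fqn
    raise FQNNotFound(not_found_message)
```

Passing the already narrowed service list as `services` would also let the table-creation path reuse the namespace-based resolution, if that is the intended behavior.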

Edge Case: Bidirectional prefix match may produce false positives

📄 ingestion/src/metadata/ingestion/source/pipeline/openlineage/table_resolver.py:106-109
In find_service_by_namespace_mapping (line 108), the condition key.startswith(namespace) allows a short namespace like "mysql" or "mysql://host" to match a longer key like "mysql://host:3306/specific_db". This bidirectional matching is documented but could produce unexpected matches when multiple mapping entries share a common prefix. The first match wins due to dict iteration order, which is insertion-order in Python 3.7+ but may not be obvious to users.

For example, with mapping {"mysql://cluster-a:3306": "svc_a", "mysql://cluster-a:3306/db1": "svc_b"} and namespace "mysql://cluster-a:3306", the exact match returns svc_a. But if namespace were "mysql://cluster-a" (truncated), the first prefix match wins arbitrarily.
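
The order dependence in this finding can be reproduced with a small sketch. It models only the prefix step (no exact-match pass) and assumes nothing about the real signature of find_service_by_namespace_mapping; `first_match` and `most_specific_match` are hypothetical names, and the second function is just one possible deterministic alternative.

```python
from typing import Dict, Optional


def first_match(namespace: str, mapping: Dict[str, str]) -> Optional[str]:
    """Bidirectional prefix match: the first key in insertion order wins."""
    for key, service in mapping.items():
        if namespace.startswith(key) or key.startswith(namespace):
            return service
    return None


def most_specific_match(namespace: str, mapping: Dict[str, str]) -> Optional[str]:
    """Assumed alternative: exact match, else the longest key prefixing the namespace."""
    if namespace in mapping:
        return mapping[namespace]
    candidates = [key for key in mapping if namespace.startswith(key)]
    if not candidates:
        return None
    return mapping[max(candidates, key=len)]


mapping = {
    "mysql://cluster-a:3306": "svc_a",
    "mysql://cluster-a:3306/db1": "svc_b",
}
# Same entries, opposite insertion order.
reversed_mapping = dict(reversed(list(mapping.items())))
```

Here `first_match("mysql://cluster-a", mapping)` and `first_match("mysql://cluster-a", reversed_mapping)` return different services, while `most_specific_match` refuses the truncated namespace and resolves longer namespaces to the most specific key regardless of order.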

Edge Case: _get_by_name_cached permanently caches None (entity-not-found)

📄 ingestion/src/metadata/ingestion/source/pipeline/openlineage/metadata.py:226-235 📄 ingestion/src/metadata/ingestion/source/pipeline/openlineage/metadata.py:744
The _get_by_name_cached method (lines 231-235) caches the result of metadata.get_by_name() including None results. If get_create_table_request creates a table during the same pipeline run, subsequent lookups for that table via _get_by_name_cached at line 744 will still return the cached None, causing an AttributeError when accessing .id.root.

This is a timing edge case: the table is created by get_create_table_request (line 573-579), then looked up via cache in yield_pipeline_lineage_details (line 744) in the same event processing loop.
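
A common way to avoid this kind of stale miss is to memoize successful lookups only. The sketch below assumes a plain callable in place of metadata.get_by_name(); `NameCache` and `fetch` are hypothetical names, and this is not necessarily how the finding was resolved in the PR.

```python
from typing import Callable, Dict, Optional


class NameCache:
    """Caches successful lookups only, so an entity created later is still found."""

    def __init__(self, fetch: Callable[[str], Optional[object]]) -> None:
        self._fetch = fetch
        self._cache: Dict[str, object] = {}

    def get(self, fqn: str) -> Optional[object]:
        if fqn in self._cache:
            return self._cache[fqn]
        entity = self._fetch(fqn)
        if entity is not None:  # never memoize "not found"
            self._cache[fqn] = entity
        return entity
```

The trade-off is that repeated misses hit the API each time; an explicit invalidation after table creation would be another option.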

Quality: Docstring references wrong function name in table_resolver.py

📄 ingestion/src/metadata/ingestion/source/pipeline/openlineage/table_resolver.py:62-67
The docstring for extract_db_scheme_from_namespace shows examples referencing extract_namespace_scheme (a non-existent function name), which will confuse readers and break if used as a doctest.

Edge Case: Unknown namespace schemes match all non-standard DB services

📄 ingestion/src/metadata/ingestion/source/pipeline/openlineage/table_resolver.py:146-152 📄 ingestion/src/metadata/ingestion/source/pipeline/openlineage/metadata.py:277-286
find_services_by_scheme returns all services whose type is NOT in the known scheme map when the namespace scheme is unrecognized. If a user has multiple custom/non-standard DB services configured, an unknown scheme like custom:// will match all of them, and _get_table_fqn_from_om will return the first one that has the table — which may be wrong.

This is documented behavior and the warning log helps, but there's no test covering the multiple-custom-services scenario. Consider adding a test and/or documenting that namespaceToServiceMapping should be used in this case.
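
The missing multiple-custom-services case could be exercised along these lines. The sketch models the behavior described in this finding with a hypothetical `services_for_scheme` function and an assumed two-entry scheme table; it does not call the real find_services_by_scheme, whose signature is not shown here.

```python
from typing import Dict, List

# Assumed subset of the known scheme -> service-type table.
KNOWN_SCHEMES: Dict[str, str] = {"mysql": "Mysql", "redshift": "Redshift"}


def services_for_scheme(scheme: str, services: Dict[str, str]) -> List[str]:
    """Model of the described behavior.

    A known scheme selects services of the matching type; an unknown
    scheme selects every service whose type is not in the known table.
    """
    wanted = KNOWN_SCHEMES.get(scheme)
    if wanted is not None:
        return [name for name, svc_type in services.items() if svc_type == wanted]
    known_types = set(KNOWN_SCHEMES.values())
    return [name for name, svc_type in services.items() if svc_type not in known_types]


def test_unknown_scheme_matches_all_custom_services() -> None:
    services = {
        "custom_a": "CustomDatabase",
        "custom_b": "CustomDatabase",
        "prod_mysql": "Mysql",
    }
    # Both custom services are candidates, so the first one holding the table wins.
    assert services_for_scheme("custom", services) == ["custom_a", "custom_b"]
```

A test of this shape makes the ambiguity visible and documents why an explicit namespaceToServiceMapping entry is the safer choice for custom schemes.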

...and 9 more resolved from earlier reviews

Labels

Openlineage, safe to test

7 participants