
[Do Not Merge] Testing with random cluster alias#27826

Open
mohityadav766 wants to merge 13 commits into main from random-alias-in-test

Member

@mohityadav766 mohityadav766 commented Apr 29, 2026

Describe your changes:

Fixes

I worked on ... because ...

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

Summary by Gitar

  • Error Handling:
    • Added EntityNotFoundException handling in EntityRepository.setInheritedFields to allow graceful fallback when parent entities are hard-deleted.
    • Refactored GlossaryTermRepository to use bulk entity resolution, preventing 404 errors for orphaned parent or glossary references.
  • Resilience:
    • Updated inheritance logic to treat missing parent entities as "no inheritance" instead of failing history response requests.

This will update automatically on new commits.

Copilot AI review requested due to automatic review settings April 29, 2026 15:15
@github-actions github-actions bot added the backend and safe to test labels Apr 29, 2026
Contributor

Copilot AI left a comment

Pull request overview

Updates OpenMetadata integration tests to support a per-test-session (optionally randomized) search cluster alias, so test code that talks directly to OpenSearch/Elasticsearch can derive the correct index names instead of assuming a fixed openmetadata_* prefix.

Changes:

  • Resolve the search clusterAlias in TestSuiteBootstrap from -DclusterAlias (or a randomized default) and pass it into the app’s search configuration.
  • Update ITs that directly reference index names to prefix with TestSuiteBootstrap.getClusterAlias().
  • Adjust inline documentation around index naming in a couple of tests.
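
As a rough illustration of the second bullet, the sketch below shows how a test might build a physical index name from the session alias instead of hardcoding the openmetadata prefix. The class `IndexNames`, the helper `indexName`, and the sample alias value are hypothetical; in the PR the alias comes from TestSuiteBootstrap.getClusterAlias().

```java
class IndexNames {
  /**
   * Joins the cluster alias and an index suffix with an underscore,
   * e.g. alias "omtest_1a2b3c4d" + "table_search_index".
   * Hypothetical helper, not part of the PR.
   */
  static String indexName(String clusterAlias, String indexSuffix) {
    return clusterAlias + "_" + indexSuffix;
  }

  public static void main(String[] args) {
    // Sample alias for illustration only; a real session would ask the bootstrap class.
    String alias = "omtest_1a2b3c4d";
    System.out.println(indexName(alias, "table_search_index"));
    System.out.println(indexName(alias, "test_suite_search_index"));
  }
}
```

Centralizing the concatenation in one helper would keep tests from drifting if the naming scheme changes, though the PR itself inlines the concatenation in each test.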

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/TestSuiteResourceIT.java | Build the test-suite index name using the resolved cluster alias. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/TableResourceIT.java | Build the table index name using the resolved cluster alias; update method Javadoc. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/SearchIndexFieldLimitIT.java | Use the resolved cluster alias for the table index constant. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/bootstrap/TestSuiteBootstrap.java | Introduce per-session cluster alias resolution (overrideable) and wire it into the search config + logging. |

Comment on lines +125 to +126
private static final String ELASTIC_SEARCH_CLUSTER_ALIAS = resolveClusterAlias();

Comment on lines +127 to +133
private static String resolveClusterAlias() {
  String override = System.getProperty("clusterAlias");
  if (override != null && !override.isBlank()) {
    return override.trim().toLowerCase(java.util.Locale.ROOT);
  }
  return "omtest_" + java.util.UUID.randomUUID().toString().replace("-", "").substring(0, 8);
}
Comment on lines +4589 to +4590
* Get the full Elasticsearch index name with cluster alias prefix. The alias is randomized
* per JUnit session by {@link org.openmetadata.it.bootstrap.TestSuiteBootstrap}.
private static final int NUM_CUSTOM_PROPERTIES = 50;
// Index name with cluster alias prefix (from TestSuiteBootstrap.ELASTIC_SEARCH_CLUSTER_ALIAS)
private static final String TABLE_INDEX = "openmetadata_table_search_index";
// Index name uses the cluster alias resolved by TestSuiteBootstrap (randomized per session).
Copilot AI review requested due to automatic review settings April 29, 2026 15:51
Contributor

Copilot AI left a comment

Pull request overview

This PR updates the integration test infrastructure to avoid search-index cross-test pollution by introducing a per-JUnit-session randomized search clusterAlias (with an optional -DclusterAlias=... override), and adjusts tests that directly query OpenSearch/Elasticsearch to use the dynamic index prefix.

Changes:

  • Randomize ELASTIC_SEARCH_CLUSTER_ALIAS per launcher session in TestSuiteBootstrap (overrideable via -DclusterAlias with validation).
  • Update integration tests that query search indices directly to compute full index names using TestSuiteBootstrap.getClusterAlias().
  • Update a few tests to use the resolved alias constant instead of hardcoding "openmetadata".

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Show a summary per file
| File | Description |
| --- | --- |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/bootstrap/TestSuiteBootstrap.java | Adds dynamic cluster alias resolution/validation and logs the chosen alias; passes alias into search configuration. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/TestSuiteResourceIT.java | Builds the test suite index name using the resolved cluster alias. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/TableResourceIT.java | Builds the table index name using the resolved cluster alias and updates related documentation. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/SearchIndexFieldLimitIT.java | Uses the resolved cluster alias to build the mapping/index name used in low-level mapping assertions. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/OrphanedIndexCleanerScopedCleanupIT.java | Replaces hardcoded cluster alias with the resolved session alias for scoped orphan-index tests. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/IndexTemplateIT.java | Replaces hardcoded cluster alias with the resolved session alias for index-template assertions. |

@github-actions
Contributor

The Java checkstyle failed.

Please run mvn spotless:apply in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Java code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

Copilot AI review requested due to automatic review settings April 30, 2026 06:19
Contributor

Copilot AI left a comment

Pull request overview

This PR primarily improves integration-test isolation and stability by introducing a per-session randomized search clusterAlias (with an override for reproducibility) and by reducing test flakiness in a few timing-/async-sensitive tests. It also updates the tag index mapping to include classification.displayName.

Changes:

  • Randomize the OpenSearch/Elasticsearch clusterAlias per JUnit launcher session (overrideable via -DclusterAlias=...) and update ITs that depend on concrete index names.
  • Stabilize flaky ITs by adjusting cache-performance sampling logic and by refreshing entity state before patching in async-workflow scenarios.
  • Increase GlossaryOntologyExportIT request timeout and extend tag index mappings with classification.displayName.
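
The warmup plus median-of-samples idea mentioned for the cache-performance test can be sketched roughly as follows. This is not the code from UserResourceIT: `MedianTiming`, `sampleMillis`, and this `median` are hypothetical names, and the timed action is a placeholder for whatever request the test measures.

```java
import java.util.Arrays;

class MedianTiming {
  /** Median of the samples; averages the two middle values when the count is even. */
  static double median(double[] samples) {
    if (samples.length == 0) {
      throw new IllegalArgumentException("no samples");
    }
    double[] sorted = samples.clone();
    Arrays.sort(sorted);
    int mid = sorted.length / 2;
    return sorted.length % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0;
  }

  /** Runs the action `warmup` times without measuring, then returns `count` timings in ms. */
  static double[] sampleMillis(Runnable action, int warmup, int count) {
    for (int i = 0; i < warmup; i++) {
      action.run();
    }
    double[] timings = new double[count];
    for (int i = 0; i < count; i++) {
      long start = System.nanoTime();
      action.run();
      timings[i] = (System.nanoTime() - start) / 1_000_000.0;
    }
    return timings;
  }

  public static void main(String[] args) {
    // Placeholder workload; a real test would issue the cached and uncached requests here.
    double[] timings = sampleMillis(() -> Math.sqrt(12345.678), 5, 9);
    System.out.printf("median=%.4f ms%n", median(timings));
  }
}
```

A median is less sensitive than a mean to a single slow sample caused by GC or scheduling, which is presumably why it suits a shared CI runner.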

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.

Show a summary per file
| File | Description |
| --- | --- |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/bootstrap/TestSuiteBootstrap.java | Introduces randomized/overrideable cluster alias and logs it; used to prefix search indices in IT runs. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/IndexTemplateIT.java | Uses dynamic cluster alias for template/index assertions. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/OrphanedIndexCleanerScopedCleanupIT.java | Uses dynamic cluster alias to scope “our” indices vs “foreign” indices in assertions. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/SearchIndexFieldLimitIT.java | Uses dynamic cluster alias when directly querying table index mappings. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/TableResourceIT.java | Uses dynamic cluster alias when referencing the table search index directly. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/TestSuiteResourceIT.java | Uses dynamic cluster alias when referencing the test suite search index directly. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/UserResourceIT.java | Reduces cache-performance test flakiness via warmup + median-of-samples approach. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/GlossaryTermResourceIT.java | Refreshes server-side entity state before patching to avoid async status-transition related failures. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/GlossaryOntologyExportIT.java | Increases export timeout for slower RDF/XML serialization scenarios. |
| openmetadata-spec/src/main/resources/elasticsearch/en/tag_index_mapping.json | Adds classification.displayName field mapping for tag documents. |
| openmetadata-spec/src/main/resources/elasticsearch/jp/tag_index_mapping.json | Adds classification.displayName field mapping for tag documents. |
| openmetadata-spec/src/main/resources/elasticsearch/ru/tag_index_mapping.json | Adds classification.displayName field mapping for tag documents. |
| openmetadata-spec/src/main/resources/elasticsearch/zh/tag_index_mapping.json | Adds classification.displayName field mapping for tag documents. |

Comment on lines +297 to +307
"displayName": {
  "type": "text",
  "analyzer": "om_analyzer",
  "fields": {
    "keyword": {
      "type": "keyword",
      "normalizer": "lowercase_normalizer",
      "ignore_above": 256
    }
  }
},

Comment on lines +4594 to +4595
return org.openmetadata.it.bootstrap.TestSuiteBootstrap.getClusterAlias()
    + "_table_search_index";

Comment on lines +1142 to +1143
return org.openmetadata.it.bootstrap.TestSuiteBootstrap.getClusterAlias()
    + "_test_suite_search_index";

// Index name uses the cluster alias resolved by TestSuiteBootstrap (randomized per session by
// default; pin with -DclusterAlias=... for reproducible debugging).
private static final String TABLE_INDEX =
    org.openmetadata.it.bootstrap.TestSuiteBootstrap.getClusterAlias() + "_table_search_index";

Copilot AI review requested due to automatic review settings April 30, 2026 09:36
Contributor

Copilot AI left a comment

Pull request overview

This PR updates OpenMetadata integration tests and search index mappings to reduce test flakiness in parallel CI runs by (1) randomizing the search cluster alias per JUnit session and (2) making search lookups exact-match on UUIDs via id.keyword. It also aligns tag index mappings to include classification.displayName and hardens a few flaky tests.

Changes:

  • Randomize the search cluster alias per JUnit session (overrideable via -DclusterAlias) and update ITs that talk to physical index names to use the alias.
  • Standardize IT search lookups to use id.keyword:<uuid> for exact-match behavior (avoids tokenized UUID matches).
  • Add classification.displayName to tag index mappings (multiple languages) and adjust a few flaky ITs (timing, patch-state refresh, longer ontology export timeout).
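
To make the id.keyword point concrete, here is a small sketch of why matching a UUID against an analyzed text field can return the wrong document. It assumes the analyzed field breaks values at hyphens, which is an assumption about the analyzer and not something confirmed in this PR; `IdKeywordQuery`, `exactIdQuery`, `hyphenTokens`, and `sharesToken` are hypothetical helpers.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.UUID;

class IdKeywordQuery {
  /** Builds the exact-match query string form described above: id.keyword:<uuid>. */
  static String exactIdQuery(UUID id) {
    return "id.keyword:" + id;
  }

  /** Crude stand-in for an analyzer that splits on hyphens (assumption, see lead-in). */
  static Set<String> hyphenTokens(String value) {
    return new HashSet<>(Arrays.asList(value.toLowerCase(Locale.ROOT).split("-")));
  }

  /** True when two ids would share at least one token under that tokenization. */
  static boolean sharesToken(UUID a, UUID b) {
    Set<String> common = hyphenTokens(a.toString());
    common.retainAll(hyphenTokens(b.toString()));
    return !common.isEmpty();
  }

  public static void main(String[] args) {
    UUID a = UUID.fromString("11111111-aaaa-4bbb-8ccc-000000000001");
    UUID b = UUID.fromString("22222222-aaaa-4ddd-9eee-000000000002");
    System.out.println(exactIdQuery(a));
    // The two ids differ but share the "aaaa" segment, so a tokenized match could confuse them.
    System.out.println(sharesToken(a, b));
  }
}
```

With randomly generated UUIDs a shared segment is unlikely in any one pair, but across thousands of entities created by parallel tests the chance of a spurious hit is no longer negligible.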

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 1 comment.

Show a summary per file
| File | Description |
| --- | --- |
| openmetadata-spec/src/main/resources/elasticsearch/en/tag_index_mapping.json | Add classification.displayName mapping for tag documents (EN). |
| openmetadata-spec/src/main/resources/elasticsearch/jp/tag_index_mapping.json | Add classification.displayName mapping for tag documents (JP). |
| openmetadata-spec/src/main/resources/elasticsearch/ru/tag_index_mapping.json | Add classification.displayName mapping for tag documents (RU). |
| openmetadata-spec/src/main/resources/elasticsearch/zh/tag_index_mapping.json | Add classification.displayName mapping for tag documents (ZH). |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/bootstrap/TestSuiteBootstrap.java | Randomize per-session search cluster alias (+ override validation/logging). |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/BaseEntityIT.java | Use id.keyword for entity ID search helper. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/DataProductResourceIT.java | Use id.keyword for UUID-based search lookups. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/DomainResourceIT.java | Use id.keyword for exact domain ID search assertions. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/GlossaryOntologyExportIT.java | Increase export HTTP timeout to reduce CI flakiness. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/GlossaryTermResourceIT.java | Refresh entity before patch operations to avoid async workflow status races; use id.keyword in search assertions. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/IndexTemplateIT.java | Use randomized cluster alias when asserting index template names/patterns. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/MultiDomainHasDomainIT.java | Use id.keyword for table ID search verification. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/OrphanedIndexCleanerScopedCleanupIT.java | Use per-session cluster alias for scoped orphan index tests. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/SearchIndexFieldLimitIT.java | Use per-session cluster alias when addressing the physical table index. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/TableResourceIT.java | Use id.keyword in table search assertions; compute physical index name from cluster alias. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/TagResourceIT.java | Use id.keyword for exact tag ID lookups in search index checks. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/TestCaseResourceIT.java | Use id.keyword for exact test case ID lookups in search index checks. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/TestSuiteResourceIT.java | Use per-session cluster alias for physical test-suite index refresh/search via low-level client. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/UserResourceIT.java | Make cache performance test less flaky by sampling and asserting median hit vs miss; add median() helper. |

Comment on lines +110 to +159
  /**
   * Pattern allowed for {@code -DclusterAlias} overrides — must be a valid OpenSearch /
   * Elasticsearch index name prefix (lowercase alphanumeric, underscore, or hyphen; must start
   * with a letter or digit; max 63 chars).
   *
   * <p>Declared <em>before</em> {@link #ELASTIC_SEARCH_CLUSTER_ALIAS} on purpose: static fields
   * initialize in declaration order, and {@link #resolveClusterAlias()} reads this pattern. If
   * this declaration moved below, override validation would NPE on the only path that uses it.
   */
  private static final java.util.regex.Pattern CLUSTER_ALIAS_PATTERN =
      java.util.regex.Pattern.compile("[a-z0-9][a-z0-9_\\-]{0,62}");

  /**
   * Cluster alias used as the prefix for all search indices in this test session.
   *
   * <p>The OpenSearch / Elasticsearch testcontainer is shared across the entire JUnit launcher
   * session (single static container, see {@link #SEARCH_CONTAINER}). When tests run in parallel
   * (the {@code parallel-tests} profile sets {@code junit.jupiter.execution.parallel.enabled=true}
   * and {@code reuseForks=true} keeps everything in one JVM), every test reads and writes against
   * the same set of indices. {@link org.openmetadata.it.util.TestNamespace} only isolates entity
   * FQNs in the database — it does not isolate documents in the search index.
   *
   * <p>To prevent cross-test pollution between concurrent CI runs that share the cluster, the alias
   * is randomized per session by default so each session writes to its own {@code <alias>_*}
   * indices. Set {@code -DclusterAlias=openmetadata} (or any fixed value matching {@link
   * #CLUSTER_ALIAS_PATTERN}) to pin the alias for reproducible debugging.
   */
  private static final String ELASTIC_SEARCH_CLUSTER_ALIAS = resolveClusterAlias();

  private static String resolveClusterAlias() {
    String override = System.getProperty("clusterAlias");
    if (override == null || override.isBlank()) {
      return "omtest_" + java.util.UUID.randomUUID().toString().replace("-", "").substring(0, 8);
    }
    String normalized = override.trim().toLowerCase(java.util.Locale.ROOT);
    if (!CLUSTER_ALIAS_PATTERN.matcher(normalized).matches()) {
      throw new IllegalArgumentException(
          "Invalid -DclusterAlias='"
              + override
              + "'. Must match "
              + CLUSTER_ALIAS_PATTERN.pattern()
              + " (lowercase alphanumeric, underscore, or hyphen; must start with a letter or"
              + " digit; max 63 chars) so it forms a valid OpenSearch/Elasticsearch index prefix.");
    }
    return normalized;
  }

  public static String getClusterAlias() {
    return ELASTIC_SEARCH_CLUSTER_ALIAS;
  }
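
For readers who want to see which override values the pattern in the snippet above accepts, a small standalone check might look like this. The regular expression is copied from CLUSTER_ALIAS_PATTERN; the class `AliasPatternDemo`, the helper `isValid`, and the sample inputs are hypothetical.

```java
import java.util.regex.Pattern;

class AliasPatternDemo {
  // Same expression as CLUSTER_ALIAS_PATTERN: one leading [a-z0-9], then up to 62 more
  // characters from [a-z0-9_-], for a maximum length of 63.
  static final Pattern ALIAS = Pattern.compile("[a-z0-9][a-z0-9_\\-]{0,62}");

  /** True when the whole string matches the alias pattern. */
  static boolean isValid(String alias) {
    return ALIAS.matcher(alias).matches();
  }

  public static void main(String[] args) {
    String[] samples = {"openmetadata", "omtest_1a2b3c4d", "a-b", "_leading", "Upper", ""};
    for (String sample : samples) {
      System.out.println("'" + sample + "' valid=" + isValid(sample));
    }
  }
}
```

Note that resolveClusterAlias() lowercases and trims the override before matching, so an input such as "Upper" would be normalized first and then accepted; the direct pattern check here skips that step.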
@gitar-bot

gitar-bot Bot commented Apr 30, 2026

Code Review 👍 Approved with suggestions 1 resolved / 2 findings

The static initialization order issue that left CLUSTER_ALIAS_PATTERN null has been fixed. The median cache-hit versus cache-miss assertion, however, remains prone to flakiness at sub-millisecond scales.

💡 Edge Case: Median cache-hit <= cache-miss assertion can still flake

📄 openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/UserResourceIT.java:2214-2218

At sub-millisecond scale, medianHit <= medianMiss (a comparison with no tolerance) can still flake: with a warm JIT and hot CPU caches, a cache-miss request whose data stays resident can complete as fast as a hit, and GC or scheduling noise can push one median past the other. Consider adding a small tolerance (e.g., medianHit <= medianMiss + 0.5) or dropping the ordering assertion entirely and relying solely on the absolute < 200 ms bound, which is the meaningful regression gate.

Suggested fix
assertTrue(
    medianHit <= medianMiss + 0.5,
    String.format(
        "Cache hit should not be materially slower than miss (miss=%.3fms hit=%.3fms)",
        medianMiss, medianHit));
✅ 1 resolved
Bug: Static init ordering: CLUSTER_ALIAS_PATTERN is null when used

📄 openmetadata-integration-tests/src/test/java/org/openmetadata/it/bootstrap/TestSuiteBootstrap.java:125-128 📄 openmetadata-integration-tests/src/test/java/org/openmetadata/it/bootstrap/TestSuiteBootstrap.java:136
The field ELASTIC_SEARCH_CLUSTER_ALIAS (line 125) is initialized before CLUSTER_ALIAS_PATTERN (line 127). Static fields are initialized in declaration order. When resolveClusterAlias() is called at line 125, it references CLUSTER_ALIAS_PATTERN (line 136), which is still null at that point. This causes a NullPointerException whenever -DclusterAlias is set to a non-blank value, crashing the entire test suite bootstrap.

The default path (no system property) does not hit the pattern check, so the bug only manifests when someone pins the alias for debugging — exactly the reproducibility scenario the PR intends to support.
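
The declaration-order behavior behind this bug can be reproduced in isolation. The sketch below is not taken from TestSuiteBootstrap; `StaticInitOrder` and its fields are hypothetical names chosen to mirror the shape of the problem.

```java
import java.util.regex.Pattern;

class StaticInitOrder {
  // Initialized first: describe() runs while PATTERN still holds its default value, null.
  static final String EARLY = describe();

  // Not a compile-time constant, so it is assigned only when this initializer executes.
  static final Pattern PATTERN = Pattern.compile("[a-z]+");

  // Initialized after PATTERN, so describe() now sees the compiled pattern.
  static final String LATE = describe();

  static String describe() {
    return PATTERN == null ? "pattern is null" : "pattern is set";
  }

  public static void main(String[] args) {
    System.out.println("EARLY: " + EARLY); // prints "EARLY: pattern is null"
    System.out.println("LATE: " + LATE);   // prints "LATE: pattern is set"
  }
}
```

The compiler does not flag this because the forward reference happens inside a method body, not directly in the field initializer, which is why the ordering has to be protected by convention or a comment.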


