ATLAS-5032: Fix basic search when querying by long attribute values #650
Open
saksenasonali wants to merge 1 commit into
What changes were proposed in this pull request?
ATLAS-5032: Fix basic search for long qualifiedName with startsWith / endsWith / contains
Problem
Basic search with attribute filters on qualifiedName returns no results when filter values exceed Solr’s default max token length (255). This affects startsWith, endsWith, and contains, especially when multiple criteria on the same attribute are combined with AND (e.g. qualifiedName starts with a long prefix and ends with @primary).
Root cause: Solr ignores tokens longer than maxTokenLength, so index-based search does not match even though the entity exists and can be retrieved by GUID.
Solution (Approach 2 from ATLAS-5032)
For indexed string attributes, when the filter value length exceeds the configured Solr token limit, do not use the Solr index for STARTS_WITH, ENDS_WITH, or CONTAINS. Search falls back to JanusGraph instead.
Also ensure index and graph query paths stay consistent when the same attribute appears in multiple AND criteria:
Skip graph filter construction when the criterion is still index-searchable.
Skip index query construction when the criterion is not index-searchable.
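The routing rule described above can be sketched as follows. This is a minimal illustration, not the code in this patch: the class name, the Operator enum, the route helper, and the hard-coded limit of 255 are all hypothetical stand-ins for whatever the search processor and its configuration actually use.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch of the per-criterion routing decision; names are not from the patch.
final class LongValueSearchSketch {
    // Assumed limit, mirroring the Solr default of 255 mentioned in the description.
    static final int MAX_TOKEN_LENGTH = 255;

    enum Operator { EQ, STARTS_WITH, ENDS_WITH, CONTAINS }

    // Operators that stop being index-searchable once the value is too long.
    private static final Set<Operator> LENGTH_SENSITIVE =
            EnumSet.of(Operator.STARTS_WITH, Operator.ENDS_WITH, Operator.CONTAINS);

    // A criterion stays on the index unless it pairs a length-sensitive
    // operator with a value longer than the token limit.
    static boolean isIndexSearchable(Operator op, String value) {
        boolean tooLong = value != null && value.length() > MAX_TOKEN_LENGTH;
        return !(tooLong && LENGTH_SENSITIVE.contains(op));
    }

    // Each criterion goes to exactly one path, so the index query and the
    // graph filter never both handle (or both skip) the same criterion.
    static String route(Operator op, String value) {
        return isIndexSearchable(op, value) ? "INDEX" : "GRAPH";
    }

    public static void main(String[] args) {
        String longPrefix = "default." + "a".repeat(370);
        System.out.println(route(Operator.STARTS_WITH, longPrefix)); // prints GRAPH
        System.out.println(route(Operator.ENDS_WITH, "@primary"));   // prints INDEX
    }
}
```

In this sketch, two AND criteria on the same attribute can land on different paths, which is why the index and graph query builders each need to skip the criteria that belong to the other.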
How was this patch tested?
Unit / module tests
EntitySearchProcessorTest — 48 tests, including 6 new ATLAS-5032 scenarios (short and long qualifiedName, hive_table and hive_column, tokenized name, CONTAINS + ENDS_WITH).
Full repository module: mvn -pl repository test — 2391 tests, 0 failures.
mvn -pl common,repository -DskipTests install — build success.
Manual / REST (local Docker Atlas)
Reproduced the JIRA flow against http://localhost:21000:
Created a hive_table with a ~370-character name and qualifiedName default.@primary.
Basic search with AND:
qualifiedName startsWith default.<370-char-name>
qualifiedName endsWith @primary
Result: 1 matching entity (previously empty).
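For anyone repeating the manual check, the request body might look like the output of the sketch below. It assumes the standard Atlas v2 basic search endpoint (POST /api/atlas/v2/search/basic) and its entityFilters structure; neither the endpoint nor the exact payload is stated in this PR, and the table name built from 370 repeated characters is a placeholder. The helper names are hypothetical, and the builder does no JSON escaping, so it only suits values without quotes or backslashes.

```java
// Hypothetical helper that builds a basic search body with AND-combined criteria.
final class BasicSearchPayloadSketch {
    // One filter criterion; values are assumed to need no JSON escaping.
    static String criterion(String attributeName, String operator, String value) {
        return String.format(
                "{\"attributeName\":\"%s\",\"operator\":\"%s\",\"attributeValue\":\"%s\"}",
                attributeName, operator, value);
    }

    // Wraps the criteria in an AND condition for the given entity type.
    static String payload(String typeName, String... criteria) {
        return String.format(
                "{\"typeName\":\"%s\",\"entityFilters\":{\"condition\":\"AND\",\"criterion\":[%s]}}",
                typeName, String.join(",", criteria));
    }

    public static void main(String[] args) {
        String longName = "a".repeat(370); // placeholder for the ~370-character table name
        String body = payload("hive_table",
                criterion("qualifiedName", "startsWith", "default." + longName),
                criterion("qualifiedName", "endsWith", "@primary"));
        // Send this as the JSON body of the basic search request.
        System.out.println(body);
    }
}
```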