public class CommonTermsQuery extends Query
This query builds two queries from the added terms: low-frequency
terms are added to a required boolean clause and high-frequency terms are
added to an optional boolean clause. The optional clause is only executed if
the required "low-frequency" clause matches. In most cases, high-frequency terms are
unlikely to contribute significantly to the document score unless at least
one of the low-frequency terms is matched. Where it applies, this query can
significantly reduce query execution times.
CommonTermsQuery has several advantages over stopword filtering at
index or query time: a term is "classified" based on its actual
document frequency in the index, so the query can prevent slow queries even across
domains without specialized stopword files.
Note: if the query contains only high-frequency terms, it is rewritten into a plain conjunction query, i.e., all high-frequency terms must match for a document to match.
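The classification described above can be illustrated without Lucene. The following is a minimal plain-Java sketch, assuming the threshold is read the way the constructor documentation describes maxTermFrequency: a fraction of the document count when it lies in [0..1), an absolute document count when it is >= 1. The class `TermClassifierSketch`, its methods, and the exact comparison and rounding are hypothetical names and choices for illustration; they are not taken from the Lucene implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Illustrative only: a plain-Java model of frequency-based term classification. */
final class TermClassifierSketch {

    /** Holds the two groups produced by classification. */
    static final class Split {
        final List<String> lowFreq = new ArrayList<>();
        final List<String> highFreq = new ArrayList<>();
    }

    /**
     * Returns true if docFreq exceeds the threshold.
     * Assumption: values >= 1 are an absolute document count,
     * values in [0..1) are a fraction of maxDoc.
     */
    static boolean isHighFrequency(int docFreq, int maxDoc, float maxTermFrequency) {
        if (maxTermFrequency >= 1f) {
            return docFreq > maxTermFrequency;
        }
        return docFreq > (int) Math.ceil(maxTermFrequency * (float) maxDoc);
    }

    /** Sorts each term into the low- or high-frequency group by its document frequency. */
    static Split classify(Map<String, Integer> docFreqs, int maxDoc, float maxTermFrequency) {
        Split split = new Split();
        for (Map.Entry<String, Integer> entry : docFreqs.entrySet()) {
            if (isHighFrequency(entry.getValue(), maxDoc, maxTermFrequency)) {
                split.highFreq.add(entry.getKey());
            } else {
                split.lowFreq.add(entry.getKey());
            }
        }
        return split;
    }
}
```

Under these assumptions, with maxDoc = 1000 and maxTermFrequency = 0.1f, a term that appears in 950 documents lands in the high-frequency group, while a term that appears in 12 documents stays in the low-frequency group.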
| Modifier and Type | Field and Description |
|---|---|
| protected float | highFreqBoost |
| protected float | highFreqMinNrShouldMatch |
| protected BooleanClause.Occur | highFreqOccur |
| protected float | lowFreqBoost |
| protected float | lowFreqMinNrShouldMatch |
| protected BooleanClause.Occur | lowFreqOccur |
| protected float | maxTermFrequency |
| protected java.util.List<Term> | terms |
| Constructor and Description |
|---|
| CommonTermsQuery(BooleanClause.Occur highFreqOccur, BooleanClause.Occur lowFreqOccur, float maxTermFrequency): Creates a new CommonTermsQuery. |
| Modifier and Type | Method and Description |
|---|---|
| void | add(Term term): Adds a term to the CommonTermsQuery. |
| protected Query | buildQuery(int maxDoc, TermContext[] contextArray, Term[] queryTerms) |
| protected int | calcHighFreqMinimumNumberShouldMatch(int numOptional) |
| protected int | calcLowFreqMinimumNumberShouldMatch(int numOptional) |
| void | collectTermContext(IndexReader reader, java.util.List<LeafReaderContext> leaves, TermContext[] contextArray, Term[] queryTerms) |
| boolean | equals(java.lang.Object other): Override and implement query instance equivalence properly in a subclass. |
| float | getHighFreqBoost(): Gets the boost used for high frequency terms. |
| float | getHighFreqMinimumNumberShouldMatch(): Gets the minimum number of the optional high frequent BooleanClauses which must be satisfied. |
| BooleanClause.Occur | getHighFreqOccur(): Gets the BooleanClause.Occur used for high frequency terms. |
| float | getLowFreqBoost(): Gets the boost used for low frequency terms. |
| float | getLowFreqMinimumNumberShouldMatch(): Gets the minimum number of the optional low frequent BooleanClauses which must be satisfied. |
| BooleanClause.Occur | getLowFreqOccur(): Gets the BooleanClause.Occur used for low frequency terms. |
| float | getMaxTermFrequency(): Gets the maximum threshold of a term's document frequency to be considered a low frequency term. |
| java.util.List<Term> | getTerms(): Gets the list of terms. |
| int | hashCode(): Override and implement query hash code properly in a subclass. |
| protected Query | newTermQuery(Term term, TermContext context): Builds a new TermQuery instance. |
| Query | rewrite(IndexReader reader): Expert: called to re-write queries into primitive queries. |
| void | setHighFreqMinimumNumberShouldMatch(float min): Specifies a minimum number of the high frequent optional BooleanClauses which must be satisfied in order to produce a match on the low frequency terms query part. |
| void | setLowFreqMinimumNumberShouldMatch(float min): Specifies a minimum number of the low frequent optional BooleanClauses which must be satisfied in order to produce a match on the low frequency terms query part. |
| java.lang.String | toString(java.lang.String field): Prints a query to a string, with field assumed to be the default field and omitted. |
Methods inherited from class Query: classHash, createWeight, sameClassAs, toString
protected final java.util.List<Term> terms
protected final float maxTermFrequency
protected final BooleanClause.Occur lowFreqOccur
protected final BooleanClause.Occur highFreqOccur
protected float lowFreqBoost
protected float highFreqBoost
protected float lowFreqMinNrShouldMatch
protected float highFreqMinNrShouldMatch
public CommonTermsQuery(BooleanClause.Occur highFreqOccur, BooleanClause.Occur lowFreqOccur, float maxTermFrequency)
Creates a new CommonTermsQuery.
Parameters:
highFreqOccur - BooleanClause.Occur used for high frequency terms
lowFreqOccur - BooleanClause.Occur used for low frequency terms
maxTermFrequency - a value in [0..1) (or absolute number >=1) representing the maximum threshold of a term's document frequency to be considered a low frequency term.
Throws:
java.lang.IllegalArgumentException - if BooleanClause.Occur.MUST_NOT is passed as lowFreqOccur or highFreqOccur

public void add(Term term)
Adds a term to the CommonTermsQuery.
Parameters:
term - the term to add

public Query rewrite(IndexReader reader) throws java.io.IOException
Description copied from class: Query
Expert: called to re-write queries into primitive queries.
Throws:
java.io.IOException

protected int calcLowFreqMinimumNumberShouldMatch(int numOptional)

protected int calcHighFreqMinimumNumberShouldMatch(int numOptional)

protected Query buildQuery(int maxDoc, TermContext[] contextArray, Term[] queryTerms)

public void collectTermContext(IndexReader reader, java.util.List<LeafReaderContext> leaves, TermContext[] contextArray, Term[] queryTerms) throws java.io.IOException
Throws:
java.io.IOException

public void setLowFreqMinimumNumberShouldMatch(float min)
Specifies a minimum number of the low frequent optional BooleanClauses which must be satisfied in order to produce a match on the low frequency terms query part.
By default no optional clauses are necessary for a match (unless there are no required clauses). If this method is used, then the specified number of clauses is required.
Parameters:
min - the number of optional clauses that must match

public float getLowFreqMinimumNumberShouldMatch()
Gets the minimum number of the optional low frequent BooleanClauses which must be satisfied.

public void setHighFreqMinimumNumberShouldMatch(float min)
Specifies a minimum number of the high frequent optional BooleanClauses which must be satisfied in order to produce a match on the low frequency terms query part.
By default no optional clauses are necessary for a match (unless there are no required clauses). If this method is used, then the specified number of clauses is required.
Parameters:
min - the number of optional clauses that must match

public float getHighFreqMinimumNumberShouldMatch()
Gets the minimum number of the optional high frequent BooleanClauses which must be satisfied.
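The setters above take a float while the protected calc...MinimumNumberShouldMatch methods return an int, so some conversion has to happen in between. The text here does not say how. The sketch below shows one plausible reading, assuming that a value of 0 or >= 1 is an absolute clause count and that a value strictly between 0 and 1 is a fraction of the optional clauses. The class `MinShouldMatchSketch` and the method `toClauseCount` are hypothetical names, and the rounding rule is an assumption rather than documented behavior.

```java
/** Illustrative only: one possible float-to-clause-count conversion. */
final class MinShouldMatchSketch {

    /**
     * Converts a minimum-should-match setting into a clause count.
     * Assumption: 0 or values >= 1 are absolute counts (truncated to int);
     * values strictly between 0 and 1 are a fraction of numOptional, rounded
     * to the nearest integer.
     */
    static int toClauseCount(float minNrShouldMatch, int numOptional) {
        if (minNrShouldMatch >= 1.0f || minNrShouldMatch == 0.0f) {
            return (int) minNrShouldMatch;
        }
        return Math.round(minNrShouldMatch * numOptional);
    }
}
```

Under this reading, a setting of 0.5f with four optional clauses would require two of them to match, while a setting of 2f would require two regardless of how many optional clauses exist.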
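public java.util.List<Term> getTerms()
Gets the list of terms.

public float getMaxTermFrequency()
Gets the maximum threshold of a term's document frequency to be considered a low frequency term.

public BooleanClause.Occur getLowFreqOccur()
Gets the BooleanClause.Occur used for low frequency terms.

public BooleanClause.Occur getHighFreqOccur()
Gets the BooleanClause.Occur used for high frequency terms.

public float getLowFreqBoost()
Gets the boost used for low frequency terms.

public float getHighFreqBoost()
Gets the boost used for high frequency terms.

public java.lang.String toString(java.lang.String field)
Description copied from class: Query
Prints a query to a string, with field assumed to be the default field and omitted.

public int hashCode()
Description copied from class: Query
Override and implement query hash code properly in a subclass. This is required so that QueryCache works properly.
Overrides:
hashCode in class Query
See Also:
Query.equals(Object)

public boolean equals(java.lang.Object other)
Description copied from class: Query
Override and implement query instance equivalence properly in a subclass. This is required so that QueryCache works properly.
Typically a query will be equal to another only if it's an instance of the same class and its document-filtering properties are identical to those of the other instance. Utility methods are provided for certain repetitive code.
Overrides:
equals in class Query
See Also:
Query.sameClassAs(Object), Query.classHash()

protected Query newTermQuery(Term term, TermContext context)
Builds a new TermQuery instance.
This is intended for subclasses that wish to customize the generated queries.
Parameters:
term - term
context - the TermContext to be used to create the low level term query. Can be null.

Copyright © 2000–2025 The Apache Software Foundation. All rights reserved.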