Searching¶
Lucene indexes are queried using a custom JSON syntax defining the kind of search to be done.
You can find samples source code at Hawkore's Apache Ignite extensions sample project.
Syntax:
SELECT ( <fields> | * ) FROM ("<schema_name>".)?<table_name> ( USE INDEX(<table_name>_LUCENE_IDX) )? WHERE lucene = '{ (filter: ( <filter> )* )? (, query: ( <query> )* )? (, sort: ( <sort> )* )? (, refresh: ( true | false ) )? }';
where <filter>
and <query>
are a JSON object:
<filter>:= {type: <type> (, <option>: ( <value> | <value_list> ) )* } <query>:= {type: <type> (, <option>: ( <value> | <value_list> ) )* }
and <sort>
is another JSON object:
<sort>:= <simple_sort_field> | <geo_distance_sort_field> <simple_sort_field>:= { field: <field> (, type: "simple" )? (, reverse: <reverse> )? } <geo_distance_sort_field>:= { type: "geo_distance", field: <field>, latitude: <Double>, longitude: <Double> (, reverse: <reverse> )? }
USE INDEX(<table_name>_LUCENE_IDX)
is an index hint to instruct the query optimizer to choose table's lucene index instead of others.
When searching by filter
, without any query
or sort
defined,
then the results are returned in the Apache Ignite’s natural order, which is
defined by the partitioner and the column name comparator. When searching
by query
, results are returned sorted by descending relevance. Sort option is used
to specify the order in which the indexed rows will be traversed. When
simple_sort_field
sorting is used, the query scoring is delayed.
geo_distance_sort_field
is use to sort Rows by min distance to point
indicating the GeoPointMapper to use by mapper field.
Relevance queries must touch all the nodes in the ring in order to find the globally best results, so you should prefer filters over queries when no relevance nor sorting are needed.
Tip
Take a look to Affinity Collocation to improve search performance.
The refresh
boolean option indicates if the search must commit pending
writes and refresh the Lucene IndexSearcher before being performed. This
way a search with refresh
set to true will view the most recent changes
done to the index, independently of the index auto-refresh time.
Please note that it is a costly operation, so you should not use it
unless it is strictly necessary. The default value is false. You can
explicitly refresh all the index shards (all cluster nodes) with an empty search:
SELECT * FROM ("<schema_name>".)?<table_name> WHERE lucene = '{refresh:true}';
This way the subsequent searches will view all the writes done before this operation, without needing to wait for the index auto refresh. It is useful to perform this operation before searching after a bulk data load.
Types of search and their options are summarized in the table below. Details for each of them are available in individual sections.
You can find samples source code at Hawkore's Apache Ignite extensions sample project.
In addition to the options described in the table, all search types have a “ boost ” option that acts as a weight on the resulting score.
Search type |
Option |
Value type |
Default value |
Mandatory |
---|---|---|---|---|
field |
string |
Yes |
||
vt_from |
string/long |
0L |
No |
|
vt_to |
string/long |
Long.MAX_VALUE |
No |
|
tt_from |
string/long |
0L |
No |
|
tt_to |
string/long |
Long.MAX_VALUE |
No |
|
must |
search |
No |
||
should |
search |
No |
||
not |
search |
No |
||
field |
string |
Yes |
||
values |
array |
Yes |
||
doc_values |
boolean |
false |
No |
|
field |
string |
Yes |
||
from |
string/long |
0 |
No |
|
to |
string/long |
Long.MAX_VALUE |
No |
|
operation |
string |
intersects |
No |
|
field |
string |
Yes |
||
value |
string |
Yes |
||
max_edits |
integer |
2 |
No |
|
prefix_length |
integer |
0 |
No |
|
max_expansions |
integer |
50 |
No |
|
transpositions |
boolean |
true |
No |
|
query |
string |
Yes |
||
default_field |
string |
"lucene" |
No |
|
field |
string |
Yes |
||
min_latitude |
double |
Yes |
||
max_latitude |
double |
Yes |
||
min_longitude |
double |
Yes |
||
max_longitude |
double |
Yes |
||
field |
string |
Yes |
||
latitude |
double |
Yes |
||
longitude |
double |
Yes |
||
max_distance |
string |
Yes |
||
min_distance |
string |
No |
||
field |
string |
Yes |
||
shape |
string (WKT) |
Yes |
||
operation |
string |
is_within |
No |
|
field |
string |
Yes |
||
value |
any |
Yes |
||
doc_values |
boolean |
false |
No |
|
field |
string |
Yes |
||
value |
string |
Yes |
||
slop |
integer |
0 |
No |
|
field |
string |
Yes |
||
value |
string |
Yes |
||
field |
string |
Yes |
||
lower |
any |
No |
||
upper |
any |
No |
||
include_lower |
boolean |
false |
No |
|
include_upper |
boolean |
false |
No |
|
doc_values |
boolean |
false |
No |
|
field |
string |
Yes |
||
value |
string |
Yes |
||
field |
string |
Yes |
||
value |
string |
Yes |
All search¶
Search for all the indexed rows.
Syntax:
SELECT ( <fields> | * ) FROM ("<schema_name>".)?<table_name> ( USE INDEX(<table_name>_LUCENE_IDX) )? WHERE lucene = '{ (filter | query): {type: "all"} }';
Example: search for all the indexed rows:
SELECT * FROM "test".user WHERE lucene = '{ {filter: {type: "all"} }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(all()).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Bitemporal search¶
Search for bitemporally-indexed rows according to the specified transaction time and valid time ranges.
Syntax:
SELECT ( <fields> | * ) FROM ("<schema_name>".)?<table_name> ( USE INDEX(<table_name>_LUCENE_IDX) )? WHERE lucene = '{ (filter | query): { type: "bitemporal", (vt_from: <vt_from> ,)? (vt_to: <vt_to> ,)? (tt_from: <tt_from> ,)? (tt_to: <tt_to> ,)? (operation: <operation> )? } }';
where:
- vt_from (default = 0L): a string or a number being the beginning of the valid date range.
- vt_to (default = Long.MAX_VALUE): a string or a number being the end of the valid date range.
- tt_from (default = 0L): a string or a number being the beginning of the transaction date range.
- tt_to (default = Long.MAX_VALUE): a string or a number being the end of the transaction date range.
- operation (default = intersects): the spatial operation to be performed, it can be intersects, contains and is_within.
Bitemporal searching is so complex that we want to stay an example.
We want to implement a system for census bureau to track where resides a citizen and when the censyus bureau knows this.
First we create the Census query entity (a.k.a table) where all this data resides:
public class CensusKey { @QuerySqlField @AffinityKeyMapped private String name; @QuerySqlField private String vtFrom; @QuerySqlField private String ttFrom; ... getters and setters }
@QueryTextField( // Index configuration indexOptions = @IndexOptions(refreshSeconds = 60, partitions = 10), bitemporalMappers = @BitemporalMapper( name= "bitemporal", tt_from = "ttFrom", tt_to = "ttTo", vt_from = "vtFrom", vt_to = "vtTo", pattern = "yyyy/MM/dd", now_value = "2200/12/31" ) ) public class Census { @QuerySqlField private String city; @QuerySqlField private String vtTo; @QuerySqlField private String ttTo; ... getters and setters }
<bean id="testCache" class="org.apache.ignite.configuration.CacheConfiguration"> <property name="name" value="test" /> <property name="rebalanceMode" value="ASYNC"/> <property name="cacheMode" value="PARTITIONED" /> <property name="indexedTypes"> <array> <value>com.hawkore.ignite.examples.entities.CensusKey</value> <value>com.hawkore.ignite.examples.entities.Census</value> </array> </property> </bean>
We insert the population of 5 citizens lives in each city from 2015/01/01 until now
INSERT INTO "test".census(name, city, vtFrom, vtTo, ttFrom, ttTo) VALUES ('John', 'Madrid', '2015/01/01', '2200/12/31', '2015/01/01', '2200/12/31'); INSERT INTO "test".census(name, city, vtFrom, vtTo, ttFrom, ttTo) VALUES ('Margaret', 'Barcelona', '2015/01/01', '2200/12/31', '2015/01/01', '2200/12/31'); INSERT INTO "test".census(name, city, vtFrom, vtTo, ttFrom, ttTo) VALUES ('Cristian', 'Ceuta', '2015/01/01', '2200/12/31', '2015/01/01', '2200/12/31'); INSERT INTO "test".census(name, city, vtFrom, vtTo, ttFrom, ttTo) VALUES ('Edward', 'New York','2015/01/01', '2200/12/31', '2015/01/01', '2200/12/31'); INSERT INTO "test".census(name, city, vtFrom, vtTo, ttFrom, ttTo) VALUES ('Johnatan', 'San Francisco', '2015/01/01', '2200/12/31', '2015/01/01', '2200/12/31');
John moves to Amsterdam in '2015/03/05' but he does not communicate this to census bureau until '2015/06/29' because he need it to apply for taxes reduction.
So, the system need to update last information from John, and insert the new. This is done with batch execution updating the transaction time end of previous data and inserting new.
-- This update until when the system believed in this false information UPDATE "test".census SET ttTo = '2015/06/29' WHERE name = 'John' AND vtFrom = '2015/01/01' AND ttFrom = '2015/01/01' AND ttTo = '2200/12/31'; -- Here inserts the new knowledge about the period where john resided in Madrid INSERT INTO "test".census(name, city, vtFrom, vtTo, ttFrom, ttTo) VALUES ('John', 'Madrid', '2015/01/01', '2015/03/04', '2015/06/30', '2200/12/31'); -- This inserts the new knowledge about the period where john resides in Amsterdam INSERT INTO "test".census(name, city, vtFrom, vtTo, ttFrom, ttTo) VALUES ('John', 'Amsterdam', '2015/03/05', '2200/12/31', '2015/06/30', '2200/12/31');
Now , we can see the main difference between valid time and transaction time. The system knows from '2015/01/01' to '2015/06/29' that John resides in Madrid from '2015/01/01' until now, and resides in Amsterdam from '2015/03/05' until now.
There are several types of queries concerning this type of indexing
If its needed to get all the data in the table:
SELECT name, city, vtFrom, vtTo, ttFrom, ttTo FROM "test".census USE INDEX (census_lucene_idx) WHERE lucene = '{ refresh: true }';
If you want to know what is the last info about where John resides, you perform a query with tt_from and tt_to set to now_value:
SELECT name, city, vtFrom, vtTo, ttFrom, ttTo FROM "test".census USE INDEX (census_lucene_idx) WHERE lucene = '{ filter : { type : "bitemporal", field : "bitemporal", vt_from : 0, vt_to : "2200/12/31", tt_from : "2200/12/31", tt_to : "2200/12/31" } }' AND name='John';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT name, city, vtFrom, vtTo, ttFrom, ttTo FROM \"test\".census USE INDEX(census_lucene_idx) WHERE lucene = ? and name = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(bitemporal("bitemporal").ttFrom("2200/12/31") .ttTo("2200/12/31") .vtFrom(0) .vtTo("2200/12/31").build(), 'John'); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
If you want to know what is the last info about where John resides now, you perform a query with tt_from, tt_to, vt_from, vt_to set to now_value:
SELECT name, city, vtFrom, vtTo, ttFrom, ttTo FROM "test".census USE INDEX (census_lucene_idx) WHERE lucene = '{ filter : { type : "bitemporal", field : "bitemporal", vt_from : "2200/12/31", vt_to : "2200/12/31", tt_from : "2200/12/31", tt_to : "2200/12/31" } }' AND name='John';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT name, city, vtFrom, vtTo, ttFrom, ttTo FROM \"test\".census USE INDEX(census_lucene_idx) WHERE lucene = ? and name = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(bitemporal("bitemporal").ttFrom("2200/12/31") .ttTo("2200/12/31") .vtFrom("2200/12/31") .vtTo("2200/12/31")).build(), 'John'); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
If the test case needs to know what the system was thinking at '2015/03/01' about where John resides in "2015/03/01".
SELECT name, city, vtFrom, vtTo, ttFrom, ttTo FROM "test".census USE INDEX (census_lucene_idx) WHERE lucene = '{ filter : { type : "bitemporal", field : "bitemporal", vt_from: "2015/03/01", vt_to : "2015/03/01", tt_from : "2015/03/01", tt_to : "2015/03/01" } }' AND name='John';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT name, city, vtFrom, vtTo, ttFrom, ttTo FROM \"test\".census USE INDEX(census_lucene_idx) WHERE lucene = ? and name = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(bitemporal("bitemporal").ttFrom("2015/03/01") .ttTo("2015/03/01") .vtFrom("2015/03/01") .vtTo("2015/03/01")).build(), 'John'); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
If the test case needs to know what the system was thinking at '2015/07/05' about where John resides:
SELECT name, city, vtFrom, vtTo, ttFrom, ttTo FROM "test".census USE INDEX (census_lucene_idx) WHERE lucene = '{ filter : { type : "bitemporal", field : "bitemporal", tt_from : "2015/07/05", tt_to : "2015/07/05" } }' AND name='John';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT name, city, vtFrom, vtTo, ttFrom, ttTo FROM \"test\".census USE INDEX(census_lucene_idx) WHERE lucene = ? and name = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(bitemporal("bitemporal").ttFrom("2015/07/05").ttTo("2015/07/05").build(), 'John'); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
This code is available in SQL script at Hawkore's Apache Ignite extensions sample project.
Boolean search¶
Searches for rows matching boolean combinations of other searches.
Syntax:
SELECT ( <fields> | * ) FROM ("<schema_name>".)?<table_name> ( USE INDEX(<table_name>_LUCENE_IDX) )? WHERE lucene = '{ (filter | query): { ( type: "boolean" )? (, must: [(search,)?] )? (, should: [(search,)?] )? (, not: [(search,)?] )? } }';
where:
- must: represents the conjunction of searches: search_1 AND search_2 AND … AND search_n
- should: represents the disjunction of searches: search_1 OR search_2 OR … OR search_n
- not: represents the negation of the disjunction of searches: NOT(search_1 OR search_2 OR … OR search_n)
Example 1: search for rows where name ends with “a” AND food starts with “tu”:
SELECT * FROM "test".user WHERE lucene = '{ filter: { type: "boolean", must: [ {type: "wildcard", field: "name", value: "*a"}, {type: "wildcard", field: "food", value: "tu*"} ] } }');
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(bool().must(wildcard("name", "*a"), wildcard("food", "tu*"))).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
You can also write this search without the type
attribute:
SELECT * FROM "test".user WHERE lucene = '{ filter: { must: [ {type: "wildcard", field: "name", value: "*a"}, {type: "wildcard", field: "food", value: "tu*"} ] } }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(must(wildcard("name", "*a"), wildcard("food", "tu*"))).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Or inside the base filter path:
SELECT * FROM "test".user WHERE lucene = '{ filter: [ {type: "wildcard", field: "name", value: "*a"}, {type: "wildcard", field: "food", value: "tu*"} ] }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(wildcard("name", "*a"), wildcard("food", "tu*")).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Example 2: search for rows where food starts with “tu” but name does not end with “a”:
SELECT * FROM "test".user WHERE lucene = '{ filter: { type: "boolean", not: [{type: "wildcard", field: "name", value: "*a"}], must: [{type: "wildcard", field: "food", value: "tu*"}] } }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(bool().must(wildcard("food", "tu*")).not(wildcard("name", "*a"))).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
You can also write this search without the type
attribute:
SELECT * FROM "test".user WHERE lucene = '{ filter: { not: [{type: "wildcard", field: "name", value: "*a"}], must: [{type: "wildcard", field: "food", value: "tu*"}] } }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(must(wildcard("food", "tu*")).not(wildcard("name", "*a"))).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
It is also possible to write the search this way:
SELECT * FROM "test".user WHERE lucene = '{ filter: [ {type: "wildcard", field: "food", value: "tu*"}, {not: {type: "wildcard", field: "name", value: "*a"}} ] }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(wildcard("food", "tu*"), not(wildcard("name", "*a"))).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Example 3: search for rows where name ends with “a” or food starts with “tu”:
SELECT * FROM "test".user WHERE lucene = '{ filter: { type: "boolean", should: [ {type: "wildcard", field: "name", value: "*a"}, {type: "wildcard", field: "food", value: "tu*"} ] } }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(bool().should(wildcard("name", "*a"), wildcard("food", "tu*"))).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
You can also write this search without the type
attribute:
SELECT * FROM "test".user WHERE lucene = '{ filter: { should: [ {type: "wildcard", field: "name", value: "*a"}, {type: "wildcard", field: "food", value: "tu*"} ] } }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(should(wildcard("name", "*a"), wildcard("food", "tu*"))).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Example 4: will return zero rows independently of the index contents:
SELECT * FROM "test".user WHERE lucene = '{ filter: {type: "boolean"} }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(search().filter(bool()).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Example 5: search for rows where name does not end with “a”, which is a resource-intensive pure negation search:
SELECT * FROM "test".user WHERE lucene = '{ filter: {not: [{type: "wildcard", field: "name", value: "*a"}]} }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(bool().not(wildcard("name", "*a"))).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Contains search¶
Searches for rows matching one or more of the specified terms.
Syntax:
SELECT ( <fields> | * ) FROM ("<schema_name>".)?<table_name> ( USE INDEX(<table_name>_LUCENE_IDX) )? WHERE lucene = '{ ( filter | query ): { type: "contains", field: <field_name>, values: <value_list> (, doc_values: <doc_values> )? } }';
where:
- doc_values (default = false): if the generated Lucene query should use doc values instead of inverted index. Doc values searches are typically slower, but they can be faster in the dense case where most rows match the search.
Example 1: search for rows where name matches “Alicia” or “mancha”:
SELECT * FROM "test".user WHERE lucene = '{ filter: { type: "contains", field: "name", values: ["Alicia", "mancha"] } }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(contains("name", "Alicia", "mancha").build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Example 2: search for rows where date matches “2014/01/01″, “2014/01/02″ or “2014/01/03″:
SELECT * FROM "test".user WHERE lucene = '{ filter: { type: "contains", field: "date", values: ["2014/01/01", "2014/01/02", "2014/01/03"] } }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(contains("date", "2014/01/01", "2014/01/02", "2014/01/03")).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Date range search¶
Searches for date ranges/durations indexed by a date range mapper, using a spatial approach. This allows you to use spatial operators such as intersects, contains and is_within. If you just want to search for single-column dates (points in time) within a certain time range, you should index them use a range search.
Syntax:
SELECT ( <fields> | * ) FROM ("<schema_name>".)?<table_name> ( USE INDEX(<table_name>_LUCENE_IDX) )? WHERE lucene = '{ (filter | query): { type: "date_range", (from: <from> ,)? (to: <to> ,)? (operation: <operation> )? } }';
where:
- from: a string or a number being the beginning of the date range.
- to: a string or a number being the end of the date range.
- operation: the spatial operation to be performed, it can be intersects, contains and is_within.
Example 1: will return rows where duration intersects "2014/01/01" and "2014/12/31":
SELECT * FROM "test".user WHERE lucene = '{ filter: { type: "date_range", field: "duration", from: "2014/01/01", to: "2014/12/31", operation: "intersects" } }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(search().filter(dateRange("duration").from("2014/01/01").to("2014/12/31").operation("intersects")).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Example 2: search for rows where duration contains "2014/06/01" and "2014/06/02":
SELECT * FROM "test".user WHERE lucene = '{ filter: { type: "date_range", field: "duration", from: "2014/06/01", to: "2014/06/02", operation: "contains" } }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(dateRange("duration").from("2014/06/01").to("2014/06/02").operation("contains")).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Example 3: search for rows where duration is within "2014/01/01" and "2014/12/31":
SELECT * FROM "test".user WHERE lucene = '{ filter: { type: "date_range", field: "duration", from: "2014/01/01", to: "2014/12/31", operation: "is_within" } }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(dateRange("duration").from("2014/01/01").to("2014/12/31").operation("is_within")).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Fuzzy search¶
Searches for rows matching a term using similarity based on Damerau-Levenshtein distance.
Syntax:
SELECT ( <fields> | * ) FROM ("<schema_name>".)?<table_name> ( USE INDEX(<table_name>_LUCENE_IDX) )? WHERE lucene = '{ (filter | query): { type: "fuzzy", field: <field_name> , value: <value> (, max_edits: <max_edits> )? (, prefix_length: <prefix_length> )? (, max_expansions: <max_expansion> )? (, transpositions: <transposition> )? } }';
where:
- max_edits (default = 2): a integer value between 0 and 2. Will
return rows which distance from
to content has a distance of at most . Distance will be interpreted according to the value of “transpositions”. - prefix_length (default = 0): an integer value being the length of the common non-fuzzy prefix
- max_expansions (default = 50): an integer for the maximum number of terms to match
- transpositions (default = true): if transpositions should be treated as a primitive edit operation (Damerau-Levenshtein distance). When false, comparisons will implement the classic Levenshtein distance.
Example 1: search for any rows where “phrase” contains a word that differs in one edit operation from “puma”, such as “pumas”:
SELECT * FROM "test".user WHERE lucene = '{ filter: { type: "fuzzy", field: "phrase", value: "puma", max_edits: 1 } }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(fuzzy("phrase", "puma").maxEdits(1)).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Example 2: same as example 1 but will limit the results to rows where phrase contains a word that starts with “pu”:
SELECT * FROM "test".user WHERE lucene = '{ filter: { type: "fuzzy", field: "phrase", value: "puma", max_edits: 1, prefix_length: 2 } }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(fuzzy("phrase", "puma").maxEdits(1).prefixLength(2)).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Lucene search¶
Searches for rows with columns match a lucene search expression using Lucene classic query parser syntax.
Syntax:
SELECT ( <fields> | * ) FROM ("<schema_name>".)?<table_name> ( USE INDEX(<table_name>_LUCENE_IDX) )? WHERE lucene = '{ (filter | query): { type: "lucene", query: <lucene search expression> (, default_field: <field_name> )? } }');
where:
- query: lucene search expression.
- default_field (default = "lucene"): The name of the field where the clauses will be applied.
Example 1: search for any rows where “phrase” contains a word that differs in one edit operation from “puma”, such as “pumas”:
SELECT * FROM "test".user WHERE lucene = '{ filter: { type: "lucene", query: "puma~1", default_field: "phrase" } }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(lucene("puma~1").defaultField("phrase")).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Geo bbox search¶
Searches for rows with geographical points or geographical shapes contained in the specified bounding box.
Syntax:
SELECT ( <fields> | * ) FROM ("<schema_name>".)?<table_name> ( USE INDEX(<table_name>_LUCENE_IDX) )? WHERE lucene = '{ (filter | query): { type: "geo_bbox", field: <field_name>, min_latitude: <min_latitude>, max_latitude: <max_latitude>, min_longitude: <min_longitude>, max_longitude: <max_longitude> } }';
where:
- min_latitude: a double value between -90 and 90 being the min allowed latitude.
- max_latitude: a double value between -90 and 90 being the max allowed latitude.
- min_longitude: a double value between -180 and 180 being the min allowed longitude.
- max_longitude: a double value between -180 and 180 being the max allowed longitude.
Example 1: search for any rows where “place” is formed by a latitude between -90.0 and 90.0, and a longitude between -180.0 and 180.0:
SELECT * FROM "test".user WHERE lucene = '{ filter: { type: "geo_bbox", field: "place", min_latitude: -90.0, max_latitude: 90.0, min_longitude: -180.0, max_longitude: 180.0 } }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(geoBBox("place", -180.0, 180.0, -90.0, 90.0)).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Example 2: search for any rows where “place” is formed by a latitude between -90.0 and 90.0, and a longitude between 0.0 and 10.0:
SELECT * FROM "test".user WHERE lucene = '{ filter: { type: "geo_bbox", field: "place", min_latitude: -90.0, max_latitude: 90.0, min_longitude: 0.0, max_longitude: 10.0 } }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(geoBBox("place", 0.0, 10.0, -90.0, 90.0)).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Example 3: search for any rows where “place” is formed by a latitude between 0.0 and 10.0, and a longitude between -180.0 and 180.0 sorted by min distance to point [0.0, 0.0]:
SELECT * FROM "test".user WHERE lucene = '{ filter: { type: "geo_bbox", field: "place", min_latitude: 0.0, max_latitude: 10.0, min_longitude: -180.0, max_longitude: 180.0 }, sort: { type: "geo_distance", field: "geo_point", reverse: false, latitude: 0.0, longitude: 0.0 } }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(geoBBox("place", -180.0, 180.0, 0.0, 10.0)) .sort(geoDistanceSortField("geo_point", 0.0, 0.0).reverse(false) .build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Geo distance search¶
Searches for rows with geographical points or geographical shapes within a distance range from a specified point.
Syntax:
SELECT ( <fields> | * ) FROM ("<schema_name>".)?<table_name> ( USE INDEX(<table_name>_LUCENE_IDX) )? WHERE lucene = '{ (filter | query): { type: "geo_distance", field: <field_name> , latitude: <latitude> , longitude: <longitude> , max_distance: <max_distance> (, min_distance: <min_distance> )? } }';
where:
- latitude: a double value between -90 and 90 being the latitude of the reference point.
- longitude: a double value between -180 and 180 being the longitude of the reference point.
- max_distance: a string value being the max allowed
distance <#distance>
__ from the reference point. - min_distance: a string value being the min allowed
distance <#distance>
__ from the reference point.
Example 1: search for any rows where “place” is within one kilometer from the geo point (40.225479, -3.999278):
SELECT * FROM "test".user WHERE lucene = '{ filter: { type: "geo_distance", field: "place", latitude: 40.225479, longitude: -3.999278, max_distance: "1km" } }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(geoDistance("place", -3.999278d, 40.225479d, "1km").build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Example 2: search for any rows where “place” is within one yard and ten yards from the geo point (40.225479, -3.999278) sorted by min distance to point (40.225479, -3.999278):
SELECT * FROM "test".user WHERE lucene = '{ filter: { type: "geo_distance", field: "place", latitude: 40.225479, longitude: -3.999278, max_distance: "10yd", min_distance: "1yd" }, sort: { fields: [ { type: "geo_distance", field: "geo_point", reverse: false, latitude: 40.225479, longitude: -3.999278 } ] } }' LIMIT 100;
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM "test".user WHERE lucene = ? LIMIT 100"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(geoDistance("place", -3.999278d, 40.225479d, "10yd").minDistance("1yd")) .sort(geoDistanceSortField("geo_point", -3.999278, 40.225479).reverse(false)) .build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Geo shape search¶
Searches for rows with geographical points or geographical shapes related to a specified shape. Search shapes can be either shapes with Well Known Text (WKT) format or transformations over WKT shapes. The supported WKT shapes are point, linestring, polygon, multipoint, multilinestring and multipolygon.
This search type depends on Java Topology Suite (JTS).
Syntax:
SELECT ( <fields> | * ) FROM ("<schema_name>".)?<table_name> ( USE INDEX(<table_name>_LUCENE_IDX) )? WHERE lucene = '{ (filter | query): { type : "geo_shape", field: <fieldname> , shape: <shape> (, operation: <operation>)? } }';
where:
- shape: a geospatial shape.
- operation: the type of spatial operation to be performed. The possible values are
"intersects"
,"is_within"
and"contains"
. Defaults to"is_within"
.
Example 1: search for shapes within a polygon:
SELECT * FROM "shapes".block WHERE lucene = '{ refresh: true, filter: { type: "geo_shape", field: "place", shape: { type: "wkt", value: "POLYGON((-80.14247278479951 25.795756477689594, -66.11315588592869 18.47447597127288, -64.82713517019887 32.33019640254669, -80.14247278479951 25.795756477689594))" } } }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"shapes\".block WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); String shape = "POLYGON((-80.14247278479951 25.795756477689594, -66.11315588592869 18.47447597127288, -64.82713517019887 32.33019640254669, -80.14247278479951 25.795756477689594))"; q.setArgs(search().filter(geoShape("place", wkt(shape))).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Example 2: search for shapes intersecting with a shape defined by a buffer 10 kilometers around a segment of the Florida's coastline:
SELECT * FROM "shapes".block WHERE lucene = '{ refresh: true, filter: { type: "geo_shape", field: "place", operation: "intersects", shape: { type: "buffer", max_distance: "10km", shape: { type: "wkt", value: "LINESTRING(-80.90 29.05, -80.51 28.47, -80.60 28.12, -80.00 26.85, -80.05 26.37)" } } } }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"shapes\".block WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); String coastLine = "LINESTRING(-80.90 29.05, -80.51 28.47, -80.60 28.12, -80.00 26.85, -80.05 26.37)"; q.setArgs(search().filter(geoShape("place", buffer(coastLine).maxDistance("10km")).operation("intersects")).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Match search¶
Searches for rows with columns containing the specified term. The matching depends on the used analyzer.
Syntax:
SELECT ( <fields> | * ) FROM ("<schema_name>".)?<table_name> ( USE INDEX(<table_name>_LUCENE_IDX) )? WHERE lucene = '{ (filter | query): { type: "match", field: <field_name>, value: <value> (, doc_values: <doc_values> )? } }';
where:
- doc_values (default = false): if the generated Lucene query should use doc values instead of inverted index. Doc values searches are typically slower, but they can be faster in the dense case where most rows match the search.
Example 1: search for rows where name matches “Alicia”:
SELECT * FROM "test".user WHERE lucene = '{ filter: { type: "match", field: "name", value: "Alicia" } }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(match("name", "Alicia")).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Example 2: search for any rows where phrase contains “mancha”:
SELECT * FROM "test".user WHERE lucene = '{ filter: { type: "match", field: "phrase", value: "mancha" } }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(match("phrase", "mancha").build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Example 3: search for rows where date matches "2014/01/01":
SELECT * FROM "test".user WHERE lucene = '{ filter: { type: "match", field: "date", value: "2014/01/01", doc_values: true } }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(match("date", "2014/01/01").docValues(true)).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
None search¶
Returns no results.
Syntax:
SELECT ( <fields> | * ) FROM ("<schema_name>".)?<table_name> ( USE INDEX(<table_name>_LUCENE_IDX) )? WHERE lucene = '{ (filter | query): {type: "none"} }';
Example: will return no one of the indexed rows:
SELECT * FROM "test".user WHERE lucene = '{ filter: {type: "none"} }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(none()).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Phrase search¶
Searches for rows with columns containing a particular sequence of terms.
Syntax:
SELECT ( <fields> | * ) FROM ("<schema_name>".)?<table_name> ( USE INDEX(<table_name>_LUCENE_IDX) )? WHERE lucene = '{ (filter | query): { type: "phrase", field: <field_name>, value: <value> (, slop: <slop> )? } }');
where:
- values: an ordered list of values.
- slop (default = 0): number of words permitted between words.
Example 1: search for rows where “phrase” contains the word “camisa” followed by the word “manchada”:
SELECT * FROM "test".user WHERE lucene = '{ filter: { type: "phrase", field: "phrase", values: "camisa manchada" } }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(phrase("phrase", "camisa manchada")).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Example 2: search for rows where “phrase” contains the word “mancha” followed by the word “camisa” having 0 to 2 words in between:
SELECT * FROM "test".user WHERE lucene = '{ filter: { type: "phrase", field: "phrase", values: "mancha camisa", slop: 2 } }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(phrase("phrase", "mancha camisa").slop(2)).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Prefix search¶
Searches for rows with columns with terms starting with the specified prefix.
Syntax:
SELECT ( <fields> | * ) FROM ("<schema_name>".)?<table_name> ( USE INDEX(<table_name>_LUCENE_IDX) )? WHERE lucene = '{ (filter | query): { type: "prefix", field: <field_name> , value: <value> } }';
Example: search for rows where “phrase” contains a word starting with “lu”. If the column is indexed as “text” and uses an analyzer, words ignored by the analyzer will not be retrieved:
SELECT * FROM "test".user WHERE lucene = '{ filter: { type: "prefix", field: "phrase", value: "lu" } }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(prefix("phrase", "lu")).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Range search¶
Searches for rows with columns with terms within the specified term range.
Syntax:
SELECT ( <fields> | * ) FROM ("<schema_name>".)?<table_name> ( USE INDEX(<table_name>_LUCENE_IDX) )? WHERE lucene = '{ (filter | query): { type: "range", field: <field_name> (, lower: <lower>)? (, upper: <upper>)? (, include_lower: <include_lower> )? (, include_upper: <include_upper> )? (, doc_values: <doc_values> )? } }';
where:
- lower: lower bound of the range.
- upper: upper bound of the range.
- include_lower (default = false): if the lower bound is included (left-closed range).
- include_upper (default = false): if the upper bound is included (right-closed range).
- doc_values (default = false): if the generated Lucene query should use doc values instead of inverted index. Doc values searches are typically slower, but they can be faster in the dense case where most rows match the search.
Lower and upper will default to: \pm\infty for number. In the case of byte and string like data, all values from lower up to upper will be returned if both are specified. If only “lower” is specified, all rows with values from “lower” will be returned. If only “upper” is specified then all rows with field values up to “upper” will be returned. If both are omitted than all rows will be returned.
Example 1: search for rows where age is in [65, ∞):
SELECT * FROM "test".user WHERE lucene = '{ filter: { type: "range", field: "age", lower: 65, include_lower: true } }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(range("age").lower(65).includeLower(true)).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Example 2: search for rows where age is in (-∞, 0]:
SELECT * FROM "test".user WHERE lucene = '{ filter: { type: "range", field: "age", upper: 0, include_upper: true, doc_values: true } }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(range("age").upper(0).includeUpper(true).docValues(true)).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Example 3: search for rows where age is in [-1, 1]:
SELECT * FROM "test".user WHERE lucene = '{ filter: { type: "range", field: "age", lower: -1, upper: 1, include_lower: true, include_upper: true, doc_values: false } }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(range("age").lower(-1).upper(1) .includeLower(true) .includeUpper(true) .docValues(true)).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Example 4: search for rows where date is in [2014/01/01, 2014/01/02]:
SELECT * FROM "test".user WHERE lucene = '{ filter: { type: "range", field: "date", lower: "2014/01/01", upper: "2014/01/02", include_lower: true, include_upper: true } }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(range("date").lower("2014/01/01") .upper( "2014/01/02") .includeLower(true) .includeUpper(true)).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Regexp search¶
Searches for rows with columns with terms satisfying the specified regular expression.
Syntax:
SELECT ( <fields> | * ) FROM ("<schema_name>".)?<table_name> ( USE INDEX(<table_name>_LUCENE_IDX) )? WHERE lucene = '{ (filter | query): { type: "regexp", field: <field_name>, value: <regexp> } }';
where:
- value: a regular expression. See org.apache.lucene.util.automaton.RegExp for syntax reference.
Example: search for rows where name contains a word that starts with “J” and a vowel repeated twice (e.g. “Jane”):
SELECT * FROM "test".user WHERE lucene = '{ filter: { type: "regexp", field: "name", value: "[J][aeiou]{2}.*" } }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(regexp("name", "[J][aeiou]{2}.*")).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();
Wildcard search¶
Searches for rows with columns with terms satisfying the specified wildcard pattern.
Syntax:
SELECT ( <fields> | * ) FROM ("<schema_name>".)?<table_name> ( USE INDEX(<table_name>_LUCENE_IDX) )? WHERE lucene = '{ (filter | query): { type: "wildcard", field: <field_name>, value: <wildcard_exp> } }';
where:
- value: a wildcard expression. Supported wildcards are
*
, which matches any character sequence (including the empty one), and?
, which matches any single character.\
is the escape character.
Example: search for rows where food starts with or is “tu”:
SELECT * FROM "test".user WHERE lucene = '{ filter: { type: "wildcard", field: "food", value: "tu*" } }';
import static org.hawkore.ignite.lucene.builder.Builder.*; (...) String query = "SELECT * FROM \"test\".user WHERE lucene = ?"; SqlFieldsQuery q = new SqlFieldsQuery(query); q.setArgs(search().filter(wildcard("food", "tu*")).build()); List<List<?>> resultSet = ignite.context().query().querySqlFields(q, false).getAll();