7.14.16. query

7.14.16.1. Summary

query provides --match_columns and --query parameters of select feature as function. You can specify multiple query functions in --filter parameter in select.

Because of such flexibility, you can control full text search behavior by combination of multiple query functions.

query can be used in only --filter in select.

7.14.16.2. Syntax

query requires two arguments - match_columns and query_string.

The parameter query_expander, substitution_table and options are optional.

query(match_columns, query_string)
query(match_columns, query_string, query_expander)
query(match_columns, query_string, substitution_table)
query(match_columns, query_string, options)

options accepts the following keys:

{
  "expander": query_expander,
  "default_mode": default_mode,
  "flags": flags
}

7.14.16.3. Usage

Here are a schema definition and sample data to show usage.

Sample schema:

Execution example:

table_create Documents TABLE_NO_KEY
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Documents content COLUMN_SCALAR Text
# [[0, 1337566253.89858, 0.000355720520019531], true]
table_create Terms TABLE_PAT_KEY ShortText --default_tokenizer TokenBigram  --normalizer NormalizerAuto
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Terms documents_content_index COLUMN_INDEX|WITH_POSITION Documents content
# [[0, 1337566253.89858, 0.000355720520019531], true]
table_create Users TABLE_NO_KEY
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Users name COLUMN_SCALAR ShortText
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Users memo COLUMN_SCALAR ShortText
# [[0, 1337566253.89858, 0.000355720520019531], true]
table_create Lexicon TABLE_HASH_KEY ShortText \
  --default_tokenizer TokenBigramSplitSymbolAlphaDigit \
  --normalizer NormalizerAuto
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Lexicon users_name COLUMN_INDEX|WITH_POSITION Users name
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Lexicon users_memo COLUMN_INDEX|WITH_POSITION Users memo
# [[0, 1337566253.89858, 0.000355720520019531], true]

Sample data:

Execution example:

load --table Users
[
{"name": "Alice", "memo": "groonga user"},
{"name": "Alisa", "memo": "mroonga user"},
{"name": "Bob",   "memo": "rroonga user"},
{"name": "Tom",   "memo": "nroonga user"},
{"name": "Tobby", "memo": "groonga and mroonga user. mroonga is ..."},
]
# [[0, 1337566253.89858, 0.000355720520019531], 5]

Here is the simple usage of query function which execute full text search by keyword 'alice' without using --match_columns and --query arguments in --filter.

Execution example:

select Users --output_columns name,_score --filter 'query("name * 10", "alice")'
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         1
#       ],
#       [
#         [
#           "name",
#           "ShortText"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ],
#       [
#         "Alice",
#         10
#       ]
#     ]
#   ]
# ]

When executing above query, the keyword alice is weighted to the value 10.

Here are the contrasting examples with/without query.

Execution example:

select Users --output_columns name,memo,_score --match_columns "memo * 10" --query "memo:@groonga OR memo:@mroonga OR memo:@user" --sort_keys -_score
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "name",
#           "ShortText"
#         ],
#         [
#           "memo",
#           "ShortText"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ],
#       [
#         "Tobby",
#         "groonga and mroonga user. mroonga is ...",
#         4
#       ],
#       [
#         "Alice",
#         "groonga user",
#         2
#       ],
#       [
#         "Alisa",
#         "mroonga user",
#         2
#       ],
#       [
#         "Bob",
#         "rroonga user",
#         1
#       ],
#       [
#         "Tom",
#         "nroonga user",
#         1
#       ]
#     ]
#   ]
# ]

In this case, the all keywords groonga, mroonga and user use the default weight. You can't pass different weight value to each keyword in this way.

Execution example:

select Users --output_columns name,memo,_score --filter 'query("memo * 10", "groonga") || query("memo * 20", "mroonga") || query("memo * 1", "user")' --sort_keys -_score
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   [
#     [
#       [
#         5
#       ],
#       [
#         [
#           "name",
#           "ShortText"
#         ],
#         [
#           "memo",
#           "ShortText"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ],
#       [
#         "Tobby",
#         "groonga and mroonga user. mroonga is ...",
#         51
#       ],
#       [
#         "Alisa",
#         "mroonga user",
#         21
#       ],
#       [
#         "Alice",
#         "groonga user",
#         11
#       ],
#       [
#         "Tom",
#         "nroonga user",
#         1
#       ],
#       [
#         "Bob",
#         "rroonga user",
#         1
#       ]
#     ]
#   ]
# ]

On the other hand, by specifying multiple query, the keywords groonga, mroonga and user use different weight.

As a result, you can control full text search result score by specifying different weight to the keywords on your purpose.

7.14.16.4. Parameters

7.14.16.4.1. Required parameters

There are two required parameter, match_columns and query_string.

7.14.16.4.1.1. match_columns

Specifies the default target column for fulltext search by query_string parameter value. It is the same role as match_columns parameter in select.

7.14.16.4.1.2. query_string

Specifies the search condition in Query syntax. It is the same role as query parameter in select.

See match_columns about query parameter in select.

7.14.16.4.2. Optional parameter

There are some optional parameters.

7.14.16.4.2.1. query_expander

Specifies the plugin name for query expansion.

There is one plugin bundled in official release - QueryExpanderTSV.

See QueryExpanderTSV about details.

7.14.16.4.2.2. substitution_table

Specifies the substitution table and substitution column name by following format such as ${TABLE}.${COLUMN} for query expansion.

See query_expander about details.

7.14.16.4.2.3. default_mode

Specifies the default search mode. You can custom the default search mode by column:@keyword like syntax. The default search mode is used when you just specify keyword instead of column:@keyword. See Query syntax for more syntax details.

Here are available modes. The default is MATCH mode. It does full text search.

Mode Aliases Description
EQUAL == It uses Equal condition as the default mode.
NOT_EQUAL != It uses Not equal condition as the default mode.
LESS < It uses Less than condition as the default mode.
GREATER > It uses Greater than condition as the default mode.
LESS_EQUAL <= It uses Less than or equal to condition as the default mode.
GREATER_EQUAL >= It uses Greater than or equal to condition as the default mode.
MATCH @

It uses Full text search condition as the default mode.

It's the default.

NEAR *N It uses Near search condition as the default mode.
SIMILAR *S It uses Similar search condition as the default mode.
PREFIX ^, @^ It uses Prefix search condition as the default mode.
SUFFIX $, @$ It uses Suffix search condition as the default mode.
REGEXP ~, @~ It uses Regular expression condition as the default mode.

7.14.16.4.2.4. flags

Specifies the flags that customizes how to parse query.

You can specify multiple flags by separating each flags by |. Here is the example to specify multiple flags:

query("title * 10 || content",
      "keyword",
      {"flags": "ALLOW_COLUMN|ALLOW_LEADING_NOT"})

See query_flags for available flags.

7.14.16.5. Return value

query returns whether any record is matched or not. If one or more records are matched, it returns true. Otherwise, it returns false.

7.14.16.6. See also