groonga - An open-source fulltext search engine and column store.

8.3.23. suggest

Note

The specification of the suggest feature isn't stable yet and may change.

8.3.23.1. NAME

suggest - returns completion, correction and/or suggestion for a query.

8.3.23.2. SYNOPSIS

suggest types table column query [sortby [output_columns [offset [limit [frequency_threshold [conditional_probability_threshold [prefix_search]]]]]]]

8.3.23.3. DESCRIPTION

The suggest command returns completion, correction and/or suggestion for a specified query.

See Introduction for details about completion, correction and suggestion.

8.3.23.4. OPTIONS

types

It specifies what types are returned by the suggest command.

Here are available types:

complete
The suggest command does completion.
correct
The suggest command does correction.
suggest
The suggest command does suggestion.

You can specify one or more types separated by |. Here are examples:

It returns correction:

correct

It returns correction and suggestion:

correct|suggest

It returns completion, correction and suggestion:

complete|correct|suggest
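
As a minimal sketch of how a client might assemble the types value, the following Python helper joins the requested types with | and rejects names outside the three listed above. The helper name build_types is hypothetical; it is not part of Groonga.

```python
VALID_TYPES = ("complete", "correct", "suggest")


def build_types(*requested):
    """Join suggest types with "|", rejecting unknown names.

    Hypothetical client-side helper, not a Groonga API.
    """
    if not requested:
        raise ValueError("at least one type is required")
    for name in requested:
        if name not in VALID_TYPES:
            raise ValueError(f"unknown suggest type: {name!r}")
    # Drop duplicates while keeping the caller's order.
    unique = list(dict.fromkeys(requested))
    return "|".join(unique)


print(build_types("correct", "suggest"))
```

Validating on the client side keeps a typo such as "corect" from being sent to the server as a types value.
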
table

It specifies the name of a table in item_${DATA_SET_NAME} format. For example, the table name is item_query if you created the dataset with the following command:

groonga-suggest-create-dataset /tmp/db-path query
column

It specifies the name of the column in table that holds furigana in katakana.
query

It specifies the query for completion, correction and/or suggestion.
sortby

It specifies the sort key.

Default:
-_score
output_columns

It specifies output columns.

Default:
_key,_score
offset

It specifies the offset of the returned records.

Default:
0
limit

It specifies the maximum number of returned records.

Default:
10
frequency_threshold

It specifies the threshold for item frequency. Returned records must have a _score that is greater than or equal to frequency_threshold.

Default:
100
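
The way frequency_threshold, sortby, offset and limit combine under their defaults can be illustrated with a small Python model. This is only a sketch of the documented behaviour, not Groonga's implementation; the function name select_candidates and the sample records are hypothetical.

```python
def select_candidates(records, frequency_threshold=100, offset=0, limit=10):
    """Model the documented defaults: filter by _score, sort by -_score, then page.

    Hypothetical illustration; Groonga performs this inside the server.
    """
    # Keep only records whose _score is >= frequency_threshold.
    kept = [r for r in records if r["_score"] >= frequency_threshold]
    # The default sort key is -_score, that is, descending score.
    kept.sort(key=lambda r: r["_score"], reverse=True)
    # offset skips leading records; limit caps how many are returned.
    return kept[offset:offset + limit]


# Made-up sample data for illustration only.
records = [
    {"_key": "engine", "_score": 150},
    {"_key": "english", "_score": 99},
    {"_key": "energy", "_score": 300},
]
print([r["_key"] for r in select_candidates(records)])
```

With the default threshold of 100, "english" (score 99) is dropped and the remaining two records come back in descending score order.
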

conditional_probability_threshold

It specifies the threshold for conditional probability. Conditional probability is used for learned data. It is the probability that a query is submitted given that query has occurred. Returned records must have a conditional probability that is greater than or equal to conditional_probability_threshold.

Default:
0.2
prefix_search

It specifies whether optional prefix search is used or not in completion.

Here are available values:

yes
Prefix search is always used.
no
Prefix search is never used.
auto
Prefix search is used only when other searches can't find any records.

Default:
auto
similar_search

It specifies whether optional similar search is used or not in correction.

Here are available values:

yes
Similar search is always used.
no
Similar search is never used.
auto
Similar search is used only when other searches can't find any records.

Default:
auto
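
The yes, no and auto values of prefix_search and similar_search follow the same pattern, which can be modelled in a few lines of Python. This is a sketch under assumptions: the function name and the two callables are hypothetical, and merging the optional results into the primary ones for yes is an assumption about how "always used" behaves, not something this document states.

```python
def run_optional_search(primary, optional, mode="auto"):
    """Model the yes/no/auto switch for an optional search.

    primary and optional are zero-argument callables returning lists of
    candidates. Both are hypothetical stand-ins for Groonga's internal searches.
    """
    results = list(primary())
    if mode == "no":
        # The optional search is never used.
        return results
    if mode == "auto":
        # The optional search is used only when nothing was found.
        return results if results else list(optional())
    if mode == "yes":
        # Assumption: optional results are appended, skipping duplicates.
        for candidate in optional():
            if candidate not in results:
                results.append(candidate)
        return results
    raise ValueError(f"mode must be yes, no or auto: {mode!r}")


no_hits = lambda: []
prefix_hits = lambda: ["engine"]
print(run_optional_search(no_hits, prefix_hits, "auto"))
print(run_optional_search(no_hits, prefix_hits, "no"))
```
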

8.3.23.5. RETURN VALUE

8.3.23.5.1. JSON format

Here is the returned JSON format:

{"type1": [["candidate1", score of candidate1],
           ["candidate2", score of candidate2],
           ...],
 "type2": [["candidate1", score of candidate1],
           ["candidate2", score of candidate2],
           ...],
 ...}

type

A type specified by types.

candidate

A candidate for completion, correction or suggestion.

score of candidate

The score of the corresponding candidate. A candidate with a higher score is a more likely candidate for completion, correction or suggestion. By default, returned candidates are sorted by score in descending order.
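
The execution examples in this document show each type's value as a hit count, then a list of column descriptions, then one row per candidate. Assuming that layout, a client could turn a response into (candidate, score) pairs as sketched below. The function name parse_suggest_response is hypothetical, and the sample text is copied from the completion example.

```python
import json


def parse_suggest_response(text):
    """Return {type: [(candidate, score), ...]} from a suggest response.

    Assumes the layout shown in the execution examples:
    [header, {type: [[n_hits], [[column, column_type], ...], row, ...]}]
    """
    header, body = json.loads(text)
    if header[0] != 0:
        raise RuntimeError(f"suggest failed with return code {header[0]}")
    parsed = {}
    for type_name, result in body.items():
        # result[0] is the hit count, result[1] describes the columns.
        columns = [name for name, _column_type in result[1]]
        key_index = columns.index("_key")
        score_index = columns.index("_score")
        # Everything after the first two entries is a candidate row.
        parsed[type_name] = [
            (row[key_index], row[score_index]) for row in result[2:]
        ]
    return parsed


sample = """
[[0, 1337566253.89858, 0.000355720520019531],
 {"complete": [[1],
               [["_key", "ShortText"], ["_score", "Int32"]],
               ["engine", 1]]}]
"""
print(parse_suggest_response(sample))
```

Looking up the column positions by name rather than hard-coding them keeps the sketch working if output_columns is given in a different order, as long as both _key and _score are requested.
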

8.3.23.6. EXAMPLE

Here is the learning data for completion.

Execution example:

load --table event_query --each 'suggest_preparer(_id, type, item, sequence, time, pair_query)'
[
{"sequence": "1", "time": 1312950803.86057, "item": "e"},
{"sequence": "1", "time": 1312950803.96857, "item": "en"},
{"sequence": "1", "time": 1312950804.26057, "item": "eng"},
{"sequence": "1", "time": 1312950804.56057, "item": "engi"},
{"sequence": "1", "time": 1312950804.76057, "item": "engin"},
{"sequence": "1", "time": 1312950805.86057, "item": "engine", "type": "submit"}
]
# [[0, 1337566253.89858, 0.000355720520019531], 6]
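
The records above follow a simple pattern: one event per typed prefix, with the final event marked as submit. As a sketch, such records could be generated in Python as follows. The function name keystroke_events and the fixed interval between keystrokes are assumptions made for illustration; real input events would carry their actual timestamps.

```python
import json


def keystroke_events(text, sequence, start_time, interval=0.2):
    """Build one event per typed prefix of text; the last one is a submit.

    Hypothetical helper that mirrors the shape of the records loaded above.
    """
    events = []
    for length in range(1, len(text) + 1):
        event = {
            "sequence": sequence,
            # Assumption: keystrokes are evenly spaced by `interval` seconds.
            "time": round(start_time + interval * (length - 1), 5),
            "item": text[:length],
        }
        if length == len(text):
            event["type"] = "submit"
        events.append(event)
    return events


events = keystroke_events("engine", "1", 1312950803.86057)
for event in events:
    print(json.dumps(event))
```
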

Here is the learning data for correction.

Execution example:

load --table event_query --each 'suggest_preparer(_id, type, item, sequence, time, pair_query)'
[
{"sequence": "2", "time": 1312950803.86057, "item": "s"},
{"sequence": "2", "time": 1312950803.96857, "item": "sa"},
{"sequence": "2", "time": 1312950804.26057, "item": "sae"},
{"sequence": "2", "time": 1312950804.56057, "item": "saer"},
{"sequence": "2", "time": 1312950804.76057, "item": "saerc"},
{"sequence": "2", "time": 1312950805.76057, "item": "saerch", "type": "submit"},
{"sequence": "2", "time": 1312950809.76057, "item": "serch"},
{"sequence": "2", "time": 1312950810.86057, "item": "search", "type": "submit"}
]
# [[0, 1337566253.89858, 0.000355720520019531], 8]

Here is the learning data for suggestion.

Execution example:

load --table event_query --each 'suggest_preparer(_id, type, item, sequence, time, pair_query)'
[
{"sequence": "3", "time": 1312950803.86057, "item": "search engine", "type": "submit"},
{"sequence": "3", "time": 1312950808.86057, "item": "web search realtime", "type": "submit"}
]
# [[0, 1337566253.89858, 0.000355720520019531], 2]

Here is a completion example.

Execution example:

suggest --table item_query --column kana --types complete --frequency_threshold 1 --query en
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   {
#     "complete": [
#       [
#         1
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ],
#       [
#         "engine",
#         1
#       ]
#     ]
#   }
# ]

Here is a correction example.

Execution example:

suggest --table item_query --column kana --types correct --frequency_threshold 1 --query saerch
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   {
#     "correct": [
#       [
#         1
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ],
#       [
#         "search",
#         1
#       ]
#     ]
#   }
# ]

Here is a suggestion example.

Execution example:

suggest --table item_query --column kana --types suggest --frequency_threshold 1 --query search
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   {
#     "suggest": [
#       [
#         2
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ],
#       [
#         "search engine",
#         1
#       ],
#       [
#         "web search realtime",
#         1
#       ]
#     ]
#   }
# ]

Here is a mixed example.

Execution example:

suggest --table item_query --column kana --types complete|correct|suggest --frequency_threshold 1 --query search
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   {
#     "suggest": [
#       [
#         2
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ],
#       [
#         "search engine",
#         1
#       ],
#       [
#         "web search realtime",
#         1
#       ]
#     ],
#     "complete": [
#       [
#         2
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ],
#       [
#         "search",
#         2
#       ],
#       [
#         "search engine",
#         2
#       ]
#     ],
#     "correct": [
#       [
#         1
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ],
#       [
#         "search",
#         2
#       ]
#     ]
#   }
# ]
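
When the command is sent over HTTP instead of being typed at the Groonga prompt, the | separators in types need percent-encoding. The sketch below builds such a URL with the Python standard library. It assumes a Groonga HTTP server reachable at the hypothetical address http://localhost:10041 and that commands are exposed under the /d/ path; the function name suggest_url is also hypothetical.

```python
from urllib.parse import urlencode


def suggest_url(base, **params):
    """Build a URL for the suggest command from keyword parameters.

    Assumes commands are served under <base>/d/<command>.
    """
    # urlencode percent-encodes reserved characters such as "|".
    return f"{base}/d/suggest?{urlencode(params)}"


url = suggest_url(
    "http://localhost:10041",  # hypothetical server address
    table="item_query",
    column="kana",
    types="complete|correct|suggest",
    frequency_threshold=1,
    query="search",
)
print(url)
```
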