groonga - An open-source fulltext search engine and column store.

5.3. Completion

This section describes the following aspects of the completion feature:

  • How it works
  • How to use
  • How it learns

5.3.1. How it works

The completion feature uses three searches to compute completed words:

  1. Prefix RK search against registered words.
  2. Cooccurrence search against learned data.
  3. Prefix search against registered words. (optional)
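
The three searches above can be pictured with a small toy model. This is only a sketch under stated assumptions, not Groonga's implementation: the `readings` and `learned` dictionaries, the `complete` function, and the way scores are added together are all hypothetical, and the prefix RK search is reduced to a plain prefix match on a reading string.

```python
from collections import Counter


def complete(query, readings, learned, frequency_threshold=1, prefix_search=False):
    """Toy model of the three searches; not Groonga's actual algorithm."""
    scores = Counter()

    # 1. Stand-in for prefix RK search: match the query against each
    #    registered word's reading (assumed to be stored per word).
    for word, reading in readings.items():
        if reading.startswith(query):
            scores[word] += 1

    # 2. Cooccurrence search: words that users submitted after typing
    #    `query`, kept only if seen at least `frequency_threshold` times.
    for word, count in learned.get(query, {}).items():
        if count >= frequency_threshold:
            scores[word] += count

    # 3. Optional plain prefix search against the registered words themselves.
    if prefix_search:
        for word in readings:
            if word.startswith(query) and word not in scores:
                scores[word] += 1

    # Highest score first; ties keep insertion order.
    return scores.most_common()


# Hypothetical data: registered words with readings, and learned pairs.
readings = {"engine": "エンジン", "english": "イングリッシュ"}
learned = {"en": {"engine": 1}}
candidates = complete("en", readings, learned)
```

In this toy data only the cooccurrence search produces a hit for "en", because the readings are written in katakana and the optional prefix search is turned off.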

5.3.2. How to use

Groonga provides the suggest command for completion. The --types complete option requests completion.

For example, here is a command that gets completion results for "en":

Execution example:

suggest --table item_query --column kana --types complete --frequency_threshold 1 --query en
# [
#   [
#     0,
#     1337566253.89858,
#     0.000355720520019531
#   ],
#   {
#     "complete": [
#       [
#         1
#       ],
#       [
#         [
#           "_key",
#           "ShortText"
#         ],
#         [
#           "_score",
#           "Int32"
#         ]
#       ],
#       [
#         "engine",
#         1
#       ]
#     ]
#   }
# ]
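
A client usually needs only the candidate words and their scores from this response. The following sketch shows one way to pull them out, assuming the response has exactly the shape shown in the execution example (a header array followed by a body whose "complete" entry holds the hit count, the column definitions, and then one array per record). The function name `parse_completions` and the `SAMPLE` constant are illustrative, not part of Groonga.

```python
import json

# The response from the execution example above, as a JSON string.
SAMPLE = """
[
  [0, 1337566253.89858, 0.000355720520019531],
  {"complete": [[1],
                [["_key", "ShortText"], ["_score", "Int32"]],
                ["engine", 1]]}
]
"""


def parse_completions(response_text):
    """Return the completion records as a list of dicts keyed by column name."""
    header, body = json.loads(response_text)
    return_code = header[0]
    if return_code != 0:
        raise RuntimeError(f"suggest failed with return code {return_code}")

    # First element: hit count; second: column definitions; rest: records.
    _n_hits, columns, *records = body["complete"]
    names = [name for name, _type in columns]
    return [dict(zip(names, record)) for record in records]


completions = parse_completions(SAMPLE)
```

For the sample above, `completions` holds a single record whose `_key` is "engine" and whose `_score` is 1.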

5.3.3. How it learns

Cooccurrence search uses learned data, which is built from query logs, access logs and so on. To create learned data, Groonga needs the sequence of a user's inputs with their time stamps and the input the user finally submitted, also with its time stamp.

For example, suppose a user wants to search for "engine". The user types the query in the following sequence:

  1. 2011-08-10T13:33:23+09:00: e
  2. 2011-08-10T13:33:23+09:00: en
  3. 2011-08-10T13:33:24+09:00: eng
  4. 2011-08-10T13:33:24+09:00: engi
  5. 2011-08-10T13:33:24+09:00: engin
  6. 2011-08-10T13:33:25+09:00: engine (submit!)

Groonga can learn from this input sequence with the following command:

load --table event_query --each 'suggest_preparer(_id, type, item, sequence, time, pair_query)'
[
{"sequence": "1", "time": 1312950803.86057, "item": "e"},
{"sequence": "1", "time": 1312950803.96857, "item": "en"},
{"sequence": "1", "time": 1312950804.26057, "item": "eng"},
{"sequence": "1", "time": 1312950804.56057, "item": "engi"},
{"sequence": "1", "time": 1312950804.76057, "item": "engin"},
{"sequence": "1", "time": 1312950805.86057, "item": "engine", "type": "submit"}
]
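
The records passed to load can be generated from a keystroke log. The sketch below, with a hypothetical helper named `build_events`, converts ISO 8601 time stamps like the ones in the input sequence into the UNIX-time floats used in the "time" field and marks the last input as the submit event. It assumes one sequence per call and that the final entry is always the submitted query. The listed time stamps have one-second resolution, so the fractional parts shown in the load example are not reproduced.

```python
import json
from datetime import datetime


def build_events(sequence_id, keystrokes):
    """Turn (ISO 8601 time stamp, input text) pairs into load records."""
    events = []
    last = len(keystrokes) - 1
    for i, (stamp, text) in enumerate(keystrokes):
        event = {
            "sequence": sequence_id,
            # fromisoformat() honors the +09:00 offset, so timestamp()
            # yields seconds since the UNIX epoch in UTC.
            "time": datetime.fromisoformat(stamp).timestamp(),
            "item": text,
        }
        if i == last:
            # Assumption: the final input is the one the user submitted.
            event["type"] = "submit"
        events.append(event)
    return events


keystrokes = [
    ("2011-08-10T13:33:23+09:00", "e"),
    ("2011-08-10T13:33:23+09:00", "en"),
    ("2011-08-10T13:33:24+09:00", "eng"),
    ("2011-08-10T13:33:24+09:00", "engi"),
    ("2011-08-10T13:33:24+09:00", "engin"),
    ("2011-08-10T13:33:25+09:00", "engine"),
]
events = build_events("1", keystrokes)
payload = json.dumps(events, indent=0)  # JSON array to pass after the load command
```

The resulting `payload` is a JSON array in the same form as the one in the load command above.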
