7.17. Tuning

7.17.1. Summary

There are some tuning parameters for improving Groonga performance or handling a large database.

7.17.2. Parameters

This section describes all parameters.

7.17.2.1. The max number of open files per process

This parameter is for handling a large database.

Groonga creates one or more files per table and column. If your database has many tables and columns, the groonga process needs to open many files.

The system limits the maximum number of open files per process, so you need to relax that limit.

Here is an expression that computes how many files groonga opens:

3 (for DB) +
  N tables +
  N columns (except index columns) +
  (N index columns * 2) +
  X (the number of plugins etc.)

Here is an example schema:

table_create Entries TABLE_HASH_KEY ShortText
column_create Entries content COLUMN_SCALAR Text
column_create Entries n_likes COLUMN_SCALAR UInt32
table_create Terms TABLE_PAT_KEY|KEY_NORMALIZE ShortText --default_tokenizer TokenBigram
column_create Terms entries_key_index COLUMN_INDEX|WITH_POSITION Entries _key
column_create Terms entries_content_index COLUMN_INDEX|WITH_POSITION Entries content

This example opens at least 11 files:

3 +
  2 (Entries and Terms) +
  2 (Entries.content and Entries.n_likes) +
  4 (Terms.entries_key_index and Terms.entries_content_index) +
  X = 11 + X
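The expression above can be written as a small helper. This is a minimal sketch: the function name and its parameter names are illustrative and are not part of Groonga.

```python
def estimate_open_files(n_tables, n_columns, n_index_columns, n_others=0):
    """Estimate how many files a groonga process opens for one database.

    n_columns counts non-index columns only.
    n_others is X in the expression (plugins etc.).
    """
    db_files = 3  # files for the database itself
    # Each index column needs two files; tables and other columns need one.
    return db_files + n_tables + n_columns + n_index_columns * 2 + n_others


# The example schema: 2 tables, 2 non-index columns, 2 index columns.
print(estimate_open_files(n_tables=2, n_columns=2, n_index_columns=2))  # 11
```

The result is a lower bound because X depends on which plugins your groonga process loads.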

7.17.2.2. Memory usage

This parameter is for handling a large database.

Groonga maps database files onto memory and accesses them there. Groonga doesn't map a file onto memory until it is needed.

If you access all data in the database, all database files are mapped onto memory. If the total size of your database files is 6GiB, your groonga process uses 6GiB of memory.

Normally, not all of your database files are mapped onto memory, but it can happen. One example is dumping your database.

You must have memory and swap whose total size is larger than the database.
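To compare the database size with your memory and swap, you can sum the sizes of the database files. The following is a rough sketch that assumes every database file is either the database path itself or has the database path followed by "." as a prefix (for example db and db.0000000). The function name is illustrative.

```python
import glob
import os


def total_database_size(db_path):
    """Return the total size in bytes of db_path and every file named db_path.*"""
    # Assumption: all database files share db_path as their name prefix.
    paths = [db_path] + glob.glob(glob.escape(db_path) + ".*")
    return sum(os.path.getsize(path) for path in paths if os.path.isfile(path))
```

Call it with the path you pass to groonga, and divide the result by 1024 ** 3 to get the size in GiB.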

7.17.3. Linux

This section describes how to customize the parameters on Linux.

7.17.3.1. nofile

You can relax the limit described in The max number of open files per process by creating a configuration file /etc/security/limits.d/groonga.conf that has the following content:

${USER} soft nofile ${MAX_VALUE}
${USER} hard nofile ${MAX_VALUE}

If you run the groonga process as the groonga user and the process needs to open fewer than 10000 files, use the following configuration:

groonga soft nofile 10000
groonga hard nofile 10000

The configuration is applied after you restart your groonga service or log in again as your groonga user.
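To confirm that the new limit is in effect, you can inspect the limits of a process started by the groonga user. The following sketch uses Python's standard resource module. It reports the limits of the Python process itself, so run it as the groonga user in a fresh login session. The helper names are illustrative.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) limits on open files for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def is_enough(required, soft_limit):
    """Return True if soft_limit allows at least `required` open files."""
    return soft_limit == resource.RLIM_INFINITY or soft_limit >= required


soft, hard = nofile_limits()
print("soft:", soft, "hard:", hard)
print("enough for 10000 files:", is_enough(10000, soft))
```

This does not inspect an already running groonga service, which keeps the limits it was started with.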

7.17.3.2. vm.overcommit_memory

This parameter is related to Memory usage. You can handle a database that is larger than your memory and swap by setting the vm.overcommit_memory kernel parameter to 1. A value of 1 means that Groonga can always map database files onto memory. This causes no problem until groonga touches mapped database files whose total size is larger than memory and swap. Groonga recommends this configuration.

See the Linux kernel documentation about overcommit for details of the vm.overcommit_memory parameter.

You can set the configuration by putting a configuration file /etc/sysctl.d/groonga.conf that has the following content:

vm.overcommit_memory = 1

The configuration is applied after you restart the system or run the following command:

% sudo sysctl --system
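To check the value that the kernel is currently using, you can read it back from /proc/sys. The following is a small sketch, and the helper name is illustrative. On Linux, each kernel parameter is exposed as a file under /proc/sys, with the dots in its name replaced by slashes.

```python
import os


def read_kernel_parameter(path):
    """Read an integer kernel parameter from a /proc/sys file."""
    with open(path) as parameter_file:
        return int(parameter_file.read().strip())


# vm.overcommit_memory is exposed at this path on Linux.
OVERCOMMIT_PATH = "/proc/sys/vm/overcommit_memory"
if os.path.exists(OVERCOMMIT_PATH):
    print("vm.overcommit_memory =", read_kernel_parameter(OVERCOMMIT_PATH))
```

If the printed value is 1, the configuration has been applied.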

7.17.3.3. vm.max_map_count

This parameter is related to Memory usage. You can handle a database of 16GiB or larger by increasing the vm.max_map_count kernel parameter. The parameter limits the maximum number of memory maps.

The default value of the kernel parameter may be 65530 or 65536. Groonga maps memory in 256KiB chunks. If a database is larger than 16GiB, groonga reaches the limit. (256KiB * 65536 = 16GiB)

You need to increase the value of the kernel parameter to handle a database of 16GiB or larger. For example, you can handle a database of almost 32GiB by setting it to 65536 * 2 = 131072. You can set the configuration by putting a configuration file /etc/sysctl.d/groonga.conf that has the following content:
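The arithmetic above can be checked with a short sketch. The function names are illustrative. The result is a lower bound because the process also uses memory maps for other purposes, such as shared libraries.

```python
CHUNK_SIZE = 256 * 1024  # Groonga maps 256KiB at one time
GIB = 1024 ** 3


def max_mappable_bytes(max_map_count):
    """Return how many bytes can be mapped with the given vm.max_map_count."""
    return max_map_count * CHUNK_SIZE


def required_max_map_count(database_bytes):
    """Return the smallest map count that covers database_bytes."""
    return -(-database_bytes // CHUNK_SIZE)  # ceiling division


print(max_mappable_bytes(65536) // GIB)   # 16
print(required_max_map_count(32 * GIB))   # 131072
```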

vm.max_map_count = 131072

Note that your real configuration file will look like the following because you already have the vm.overcommit_memory configuration:

vm.overcommit_memory = 1
vm.max_map_count = 131072

The configuration is applied after you restart the system or run the following command:

% sudo sysctl --system

7.17.4. FreeBSD

This section describes how to customize the parameters on FreeBSD.

7.17.4.1. kern.maxfileperproc

TODO