---
title: Search Engine
title_extention: Integrated site search based on Lunr
tagline: QuickSearch Module

date: 2024-02-08
#last_modified: 2023-01-01
description: >
  The search option for the J1 Template is based on the search engine Lunr
  and is fully integrated with the template. Lunr is designed to be
  lightweight yet full-featured to provide users with a great search
  experience. With Lunr on a Jekyll website, there is no need to integrate
  complex external, server-side search engines like Google or Bing.
keywords: >
  open source, free, template, jekyll, jekyllone, web,
  sites, static, jamstack, bootstrap,
  lunr, site search, quick search, find, google, bing

categories: [ Roundtrip ]
tags: [ Module, Site search ]

image:
  path: /assets/images/modules/attics/lunr-1280x800.jpg
  width: 1920
  height: 1280

regenerate: false
permalink: /pages/public/learn/roundtrip/quicksearch/

resources: [ animate, lightbox, rouge ]
resource_options:
  - attic:
      slides:
        - url: /assets/images/modules/attics/lunr-1280x800.jpg
          alt: Lunr Search Engine
---

// Page Initializer
// =============================================================================
// Enable the Liquid Preprocessor
:page-liquid:

// Set (local) page attributes here
// -----------------------------------------------------------------------------
// :page--attr:

// Load Liquid procedures
// -----------------------------------------------------------------------------
{% capture load_attributes %}themes/{{site.template.name}}/procedures/global/attributes_loader.proc{%endcapture%}

// Load page attributes
// -----------------------------------------------------------------------------
{% include {{load_attributes}} scope="all" %}

// Page content
// ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[role="dropcap"]
The search option for the J1 Template is based on Lunr, a great search engine.
*Lunr*, written in JavaScript by http://lunrjs.com[Oliver Nightingale, {browser-window--new}],
is fully integrated with the template system. The engine is designed to be
small in code size yet full-featured to provide users with a great search
experience.

mdi:clock-time-five-outline[24px, md-gray mt-5 mr-2]
*5-10 Minutes* to read

// Include sub-documents (if any)
// -----------------------------------------------------------------------------

[role="mt-5"]
With the J1 Search module, it is *no* longer necessary to integrate complex
external search engines like _Bing_ or _Google_ into your website.

Searching a website using QuickSearch is a little different from using
Internet search engines. Search platforms hide complex algorithms behind a
simple interface: they need a lot of artificial intelligence methods to make
sense of the handful of words given for a search, and to inject advertising
for their customers.

Nevertheless, the J1 implementation of Lunr is as simple as searching with
_Google_ or _Bing_, but offers additional features for more specific searches.
QuickSearch provides an easy-to-use query language for better results, with
no advertising included.

[role="mt-4"]
== Core concepts

Understanding some of the concepts and terminology used by the search engine
of the J1 Template allows users to write powerful search expressions that
return more relevant results.

[role="mt-4"]
=== Indexing documents

QuickSearch searches *all* documents of the website generated by J1. The
advantage: no internet access is needed for searches. Searches are based on a
pre-built local *full-text* index that the browser loads for all pages.

The index for a site is generated by the Jekyll index plugin `lunr_index.rb`,
located in the plugins folder `_plugins`. The full-text index is always
generated by Jekyll at build time:

.Index creation at build time
[source, text, role="noclip"]
----
Startup the site ..
Configuration file: ...
Incremental build: enabled
Generating...
J1 QuickSearch: creating search index ...
J1 QuickSearch: finished, index ready.
....
----

Or, if you're running a website in development mode and incremental builds
are enabled, the index is refreshed whenever files are added or modified.

.Index creation if files are added or modified
[source, text, role="noclip"]
----
site: Regenerating: n file(s) changed at ...
site: ...
site: J1 QuickSearch: creating search index ...
site: J1 QuickSearch: finished, index ready.
...
----

[role="mt-4"]
=== Documents

The searchable data in an index is organized as documents containing the text
and the words you want to search on. A document is a JSON-based data set with
fields that are processed to create the result list for a search.

A document data set might look like this:

.JSON document
[source, json, role="noclip"]
----
{
  "id": 3,
  "title": "Roundtrip",
  "tagline": "present images",
  "url": "/pages/public/learn/roundtrip/present_images/",
  "date": "2020-11-03 +0100",
  "tags": [
    "Introduction",
    "Module",
    "Image"
  ],
  "categories": [
    "Roundtrip"
  ],
  "description": "Welcome to the preview page ... and galleries.\n",
  "is_post": false
}
----

This document has several fields, like *title*, *tagline*, or *description*,
that can be used for full-text searches. Additional fields, like *tags* or
*categories*, can be used for more specific searches.

[NOTE]
====
The document content is collected from the HTML body element `<body>`. To
limit the size of the index data loaded by the browser, the body's content is
removed from the document source, but the *content* is still fully searchable.
====

For simple full-text searches as well as more specific ones, the search core
engine *Lunr* offers a query language, a DSL (domain-specific language). Find
more about *QuickSearch* (Lunr DSL) queries in the section <<Searching>>.

[role="mt-4"]
=== Scoring

The relevance, the *score*, is calculated based on an algorithm called *BM25*,
along with other factors.
You don't need to worry too much about the details of how this technique
works. To summarize: the more often a search term occurs in a document, the
more that term increases the document's score, but the more often a search
term occurs in the *overall* collection of documents, the less that term
increases a document's score. In other words: rare words count more and
increase the score.

Scoring information generated by the BM25 algorithm is added to the search
index and allows a very fast calculation of the relevance of documents for
queries.

Imagine your website contains documents about Jekyll. The term *Jekyll* may
occur very frequently throughout the entire website because it is used quite
often in the content. So finding a document that mentions the term Jekyll
isn't very significant for a search.

However, if you're searching for *Jekyll Generator*, only some documents of
the website have the word *Generator* in them. This raises the score, the
relevance, of documents containing both words and moves them higher up in the
search results.

Matching and scoring are used by all search engines, and J1 QuickSearch is no
exception. QuickSearch *sorts* search results in a way you already know from
commercial internet search engines like Google: the top results are the more
relevant ones.

[role="mt-5"]
== Searching

To access QuickSearch, a magnifier button is available in the *Quicklinks*
area in the menu bar at the top-right of every page.

.Search button (magnifier) in the quick access area
lightbox::quicksearch-icon[ 1024, {data-quicksearch--icon} ]

A mouse-click on the magnifier button opens the search input and disables all
other navigation to focus on what you intend to do: searching.
.Input for a QuickSearch
lightbox::quicksearch-input[ 1024, {data-quicksearch--input} ]

The results of searching for the word *Jekyll* may look like this:

.Results for a QuickSearch
lightbox::quicksearch-results[ 1024, {data-quicksearch--results} ]

Search queries look like simple text, but the search engine always transforms
the given search string into a *search query*. Search queries support a
special syntax, the DSL, for defining more complex queries that give better
results.

[role="mt-4"]
=== Simple searches

The simplest way to run a search is to pass the words you want to search for.

[source, text]
----
jekyll
----

The above will return all documents that match the term `jekyll`. Searches
for *multiple* terms (words) are also supported. If a document matches
*at least* one of the search terms, it will show up in the results. The search
terms are combined by a logical *OR*.

[source, text]
----
jekyll tutorial
----

The above example will match documents that contain either *jekyll* or
*tutorial*. Documents that contain *both* words get a higher score and are
returned first.

[NOTE]
====
In contrast to a Google search, which combines words by a logical *and*,
QuickSearch combines the terms by a logical *or*.
====

To combine search terms in a QuickSearch query by a logical *and*, prepend a
plus sign `+` to each word. This marks the words as *required* in the search
query (DSL).

[source, text]
----
+jekyll +tutorial
----

[role="mt-4"]
=== Wildcards

QuickSearch supports *wildcards* when performing searches. A wildcard is
represented by a star character `*` and can appear anywhere in a search term.
For example, the following will match all documents with words beginning
with **Jek**.

[source, text, role="noclip"]
----
jek*
----

[NOTE]
====
Language grammar rules are not relevant for searches. For simplification, all
words are transformed to lower case.
As a result, the word *Jekyll* is the same as the lowercase word *jekyll*
from a search engine's perspective. Language variations of *Jekyll* or plurals
like *Generators* are reduced to their base form.

For searches, you don't need to care about grammar rules, only about the
*spelling*. If you're unsure about the spelling of a word, use *wildcards*.
====

[role="mt-4"]
=== Fields

By default, Lunr will search *all* fields in a document for the given query
terms. It is also possible to *restrict* a term to a specific *field*. The
following example searches for the term *jekyll* in the field *title*:

[source, text]
----
title:jekyll
----

The search term is prefixed with the field's name, followed by a colon `:`.

[CAUTION]
====
The field *must* be one of the fields defined when building the index.
*Unknown* fields will lead to an *error*.
====

Search queries based on fields can be combined with all other term modifiers
like *wildcards*. For example, to search for words beginning with *jek* in
the title *and* the wildcard *coll** in a document, the following query can
be used.

[source, text]
----
+title:jek* +coll*
----

Besides the document *content*, some *specific* fields are available for
searches.

.Available fields
[cols="3a,3a,6a", options="header", width="100%", role="rtable mt-3"]
|===
|Name |Value |Description and examples

|`title`
|`string`
|The headline of a document (article, post)

Example(s): QuickSearch

[source, text]
----
title:QuickSearch
----

|`tagline`
|`string`
|The subtitle of a document (article, post)

Example(s): full index search

|`tags`
|`string`
|Tags describe the content of a document.

Example(s): Roundtrip, QuickSearch

|`categories`
|`string`
|Categories describe the group of documents a document belongs to.

Example(s): Search

|`description`
|`string`
|The description is given by the author for a document. It gives a brief
summary of what the document is all about.

Example(s): QuickSearch is based on the search engine Lunr, fully integrated
with J1 Template ...

|===

////
=== Boosts

In multi-term searches, a single term may be more important than others. For
these cases Lunr supports term-level boosts. Any document that matches a
boosted term will get a higher relevance score and appear higher up in the
results. A boost is applied by appending a caret (`^`) and then a positive
integer to a term.

[source, javascript]
----
idx.search('foo^10 bar')
----

The above example weights the term “foo” 10 times higher than the term “bar”.
The boost value can be any positive integer, and different terms can have
different boosts:

[source, javascript]
----
idx.search('foo^10 bar^5 baz')
----

=== Fuzzy Matches

Lunr supports fuzzy matching of search terms in documents, which can be
helpful if the spelling of a term is unclear, or to increase the number of
search results that are returned. The amount of fuzziness to allow when
searching can also be controlled. Fuzziness is applied by appending a tilde
(`~`) and then a positive integer to a term. The following search matches all
documents that have a word within 1 edit distance of “foo”:

[source, javascript]
----
idx.search('foo~1')
----

An edit distance of 1 allows words to match if either adding, removing,
changing or transposing a character in the word would lead to a match. For
example “boo” requires a single edit (replacing “f” with “b”) and would match,
but “boot” would not as it also requires an additional “t” at the end.
////

[role="mt-4"]
=== Term presence

By default, Lunr combines multiple terms in a search with a logical *or*.
A search for *jekyll* and *collections* will match documents that contain the
word *jekyll*, or contain *collections*, or contain *both*. This behavior is
controllable: the presence of each term in matching documents can be
specified.

A document must have at least *one* matching term to be returned as a result.
It is possible to specify that a term must be present in documents or that it
must be absent.
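
The presence rules can be pictured as a small filter over the terms found in each document. The following is a minimal sketch in plain JavaScript under stated assumptions, not Lunr's implementation: the function `matchesPresence`, the clause shape `{ term, presence }`, and the sample documents are hypothetical names chosen for illustration only.

```javascript
// Hypothetical model of term presence; this is NOT Lunr's code.
// A clause is { term, presence }, where presence is one of
// "required", "prohibited" or "optional".
function matchesPresence(docTerms, clauses) {
  const terms = new Set(docTerms);
  let matchedAny = false;

  for (const { term, presence } of clauses) {
    const found = terms.has(term);
    if (presence === "required" && !found) return false;  // must be present
    if (presence === "prohibited" && found) return false; // must be absent
    if (presence !== "prohibited" && found) matchedAny = true;
  }
  // As described in the text: at least one term has to match.
  return matchedAny;
}

// Hypothetical sample documents, already reduced to lowercase terms.
const docs = {
  a: ["jekyll", "collection"],
  b: ["jekyll", "generator"],
  c: ["collection"],
};

// Return the ids of all sample documents that satisfy the clauses.
function search(clauses) {
  return Object.keys(docs).filter((id) => matchesPresence(docs[id], clauses));
}

// Default behavior: terms are optional and combined by a logical "or".
const either = search([
  { term: "jekyll", presence: "optional" },
  { term: "collection", presence: "optional" },
]);

// One term must be present, the other must be absent.
const jekyllOnly = search([
  { term: "jekyll", presence: "required" },
  { term: "collection", presence: "prohibited" },
]);

// Both terms must be present: simulates a logical "and".
const both = search([
  { term: "jekyll", presence: "required" },
  { term: "collection", presence: "required" },
]);

console.log(either, jekyllOnly, both);
```

In this sketch, optional terms widen the result list, while required and prohibited terms narrow it. Marking every term as required yields the behavior of a logical *and*.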
To indicate that a term must be *present* in matching documents, prefix the
term with a plus sign `+`. To indicate that a term must be *absent*, prefix
the term with a minus sign `-`.

The example below searches for documents that *must* contain the word
*jekyll* and must *not* contain the word *collection*.

[source, text]
----
+jekyll -collection
----

To simulate a logical *and*, that is, to find documents that contain the word
*jekyll* and the word *collection*, prefix both words with a plus sign `+`.

[source, text]
----
+jekyll +collection
----

[role="mt-5"]
== What next

You have reached the end of the overview series presenting what J1 can do.
I hope you enjoyed exploring what the Template System can do for your new
website. To learn more about using J1 for your site, I recommend going for
*J1 in a Day* next.

J1 in a Day is a *tutorial* that teaches how to create modern websites using
the J1 Template. The tutorial focuses on the basics of Jekyll *and* J1, which
everyone should know on the way to a modern static website. Jekyll (and J1)
is quite different from classic Content Management Systems (CMS). Knowledge
of CMS systems can help to some extent, but generators for static websites
like Jekyll work quite differently.

If you would like to learn more about the use of Jekyll and the J1 Template,
the tutorials present what you need to know for a successful start in
creating modern websites using Jekyll and J1:

* The basics of modern static webs
* Creating an awesome site in minutes
* Learning the Development System of J1 Template
* Introduction to the Project Management for a static web
* Content creation for J1 based static websites

Spending a whole day getting to know Jekyll and J1 sounds like a lot. Yes, it
is a lot. But it really makes sense to get a full overview of what you can
achieve on your own with modern static websites.
[role="mb-7"]
It's a promise: you'll have a pleasant journey to learn what modern static
webs can offer today. Start your experience from here:
link:{url-j1-kickstarter--web-in-a-day}[J1 in a Day, {browser-window--new}].