/*! Provides packed multiple substring search, principally for a small number of patterns. This sub-module provides vectorized routines for quickly finding matches of a small number of patterns. In general, users of this crate shouldn't need to interface with this module directly, as the primary [`AhoCorasick`](crate::AhoCorasick) searcher will use these routines automatically as a prefilter when applicable. However, in some cases, callers may want to bypass the Aho-Corasick machinery entirely and use this vectorized searcher directly. # Overview The primary types in this sub-module are: * [`Searcher`] executes the actual search algorithm to report matches in a haystack. * [`Builder`] accumulates patterns incrementally and can construct a `Searcher`. * [`Config`] permits tuning the searcher, and itself will produce a `Builder` (which can then be used to build a `Searcher`). Currently, the only tuneable knob are the match semantics, but this may be expanded in the future. # Examples This example shows how to create a searcher from an iterator of patterns. By default, leftmost-first match semantics are used. (See the top-level [`MatchKind`] type for more details about match semantics, which apply similarly to packed substring search.) ``` use aho_corasick::{packed::{MatchKind, Searcher}, PatternID}; # fn example() -> Option<()> { let searcher = Searcher::new(["foobar", "foo"].iter().cloned())?; let matches: Vec = searcher .find_iter("foobar") .map(|mat| mat.pattern()) .collect(); assert_eq!(vec![PatternID::ZERO], matches); # Some(()) } # if cfg!(all(feature = "std", any( # target_arch = "x86_64", target_arch = "aarch64", # ))) { # example().unwrap() # } else { # assert!(example().is_none()); # } ``` This example shows how to use [`Config`] to change the match semantics to leftmost-longest: ``` use aho_corasick::{packed::{Config, MatchKind}, PatternID}; # fn example() -> Option<()> { let searcher = Config::new() .match_kind(MatchKind::LeftmostLongest) .builder() .add("foo") .add("foobar") .build()?; let matches: Vec = searcher .find_iter("foobar") .map(|mat| mat.pattern()) .collect(); assert_eq!(vec![PatternID::must(1)], matches); # Some(()) } # if cfg!(all(feature = "std", any( # target_arch = "x86_64", target_arch = "aarch64", # ))) { # example().unwrap() # } else { # assert!(example().is_none()); # } ``` # Packed substring searching Packed substring searching refers to the use of SIMD (Single Instruction, Multiple Data) to accelerate the detection of matches in a haystack. Unlike conventional algorithms, such as Aho-Corasick, SIMD algorithms for substring search tend to do better with a small number of patterns, where as Aho-Corasick generally maintains reasonably consistent performance regardless of the number of patterns you give it. Because of this, the vectorized searcher in this sub-module cannot be used as a general purpose searcher, since building the searcher may fail even when given a small number of patterns. However, in exchange, when searching for a small number of patterns, searching can be quite a bit faster than Aho-Corasick (sometimes by an order of magnitude). The key take away here is that constructing a searcher from a list of patterns is a fallible operation with no clear rules for when it will fail. While the precise conditions under which building a searcher can fail is specifically an implementation detail, here are some common reasons: * Too many patterns were given. Typically, the limit is on the order of 100 or so, but this limit may fluctuate based on available CPU features. * The available packed algorithms require CPU features that aren't available. For example, currently, this crate only provides packed algorithms for `x86_64` and `aarch64`. Therefore, constructing a packed searcher on any other target will always fail. * Zero patterns were given, or one of the patterns given was empty. Packed searchers require at least one pattern and that all patterns are non-empty. * Something else about the nature of the patterns (typically based on heuristics) suggests that a packed searcher would perform very poorly, so no searcher is built. */ pub use crate::packed::api::{Builder, Config, FindIter, MatchKind, Searcher}; mod api; mod ext; mod pattern; mod rabinkarp; mod teddy; #[cfg(all(feature = "std", test))] mod tests; mod vector;