README.md in regexp-examples-0.4.0 vs README.md in regexp-examples-0.4.1

- old
+ new

@@ -7,12 +7,13 @@
 This method generates a list of (some\*) strings that will match the given regular expression
 \* If the regex has an infinite number of possible srings that match it, such as `/a*b+c{2,}/`, or a huge number of possible matches, such as `/.\w/`, then only a subset of these will be listed.
-For more detail on this, see [configuration options](#configuration_options).
+For more detail on this, see [configuration options](#configuration-options).
+
 ## Usage
 ```ruby
 /a*/.examples #=> [''. 'a', 'aa']
 /ab+/.examples #=> ['ab', 'abb', 'abbb']
@@ -39,14 +40,34 @@
 * Control characters, e.g. `/\ca/`, `/\cZ/`, `/\C-9/`
 * Escape sequences, e.g. `/\x42/`, `/\x3D/`, `/\x5word/`, `/#{"\x80".force_encoding("ASCII-8BIT")}/`
 * Unicode characters, e.g. `/\u0123/`, `/\uabcd/`, `/\u{789}/`
 * **Arbitrarily complex combinations of all the above!**
-## Not-Yet-Supported syntax
+## Bugs and Not-Yet-Supported syntax
-* Options, e.g. `/pattern/i`, `/foo.*bar/m` - Using options will currently just be ignored, e.g. `/test/i.examples` will NOT include `"TEST"`
+* Backreferences are replaced by the _first_ occurance of the group, not the _last_ (as it should be). This is quite a rare occurance, but for example:
+ * `/(a|b){2} \1/.examples` incorrectly includes: `"ba b"` rather than the correct: `"ba a"`
+* Options, e.g. `/pattern/i`, `/foo.*bar/m` - Using options will currently just be ignored, for example:
+ * `/test/i.examples` will NOT include `"TEST"`
+ * `/white space/x.examples` will not strip out the whitespace from the pattern, i.e. this incorrectly returns `["white space"]` rather than `["whitespace"]`
+
+* Nested character classes, and the use of set intersection ([See here](http://www.ruby-doc.org/core-2.2.0/Regexp.html#class-Regexp-label-Character+Classes) for the official documentation on this.) For example:
+ * `/[[abc]]/.examples` (which _should_ return `["a", "b", "c"]`)
+ * `/[[a-d]&&[c-f]]/.examples` (which _should_ return: `["c", "d"]`)
+
+* Extended groups are not yet supported, such as:
+ * Including comments inside the pattern, i.e. `/(?#...)/`
+ * Conditional capture groups, such as `/(group1) (?(1)yes|no)`
+ * Options toggling, i.e. `/(?imx)/`, `/(?-imx)/`, `/(?imx: re)/` and `/(?-imx: re)/`
+
+* Possessive quantifiers, i.e. `/.?+/`, `/.*+/`, `/.++/`
+
+* The patterns: `/\10/` ... `/\77/` should match the octal representation of their character code, if there is no nth grouped subexpression. For example, `/\10/.examples` should return `["\x08"]`. Funnily enough, I did not think of this when writing my regexp parser.
+
+Full documentation on all the various other obscurities in the ruby (version 2.x) regexp parser can be found [here](https://raw.githubusercontent.com/k-takata/Onigmo/master/doc/RE).
+
 Using any of the following will raise a RegexpExamples::UnsupportedSyntax exception (until such time as they are implemented!):
 * POSIX bracket expressions, e.g. `/[[:alnum:]]/`, `/[[:space:]]/`
 * Named properties, e.g. `/\p{L}/` ("Letter"), `/\p{Arabic}/` ("Arabic character"), `/\p{^Ll}/` ("Not a lowercase letter")
 * Subexpression calls, e.g. `/(?<name> ... \g<name>* )/` (Note: These could get _really_ ugly to implement, and may even be impossible, so I highly doubt it's worth the effort!)
@@ -62,11 +83,10 @@
 * [Anchors](http://ruby-doc.org/core-2.2.0/Regexp.html#class-Regexp-label-Anchors) (`\b`, `\B`, `\G`, `^`, `\A`, `$`, `\z`, `\Z`), e.g. `/\bword\b/`, `/line1\n^line2/`
  * However, a special case has been made to allow `^` and `\A` at the start of a pattern; and to allow `$`, `\z` and `\Z` at the end of pattern. In such cases, the characters are effectively just ignored.
 (Note: Backreferences are not really "regular" either, but I got these to work with a bit of hackery!)
-<a name="configuration_options"/>
 ##Configuration Options
 When generating examples, the gem uses 2 configurable values to limit how many examples are listed:
 * `max_repeater_variance` (default = `2`) restricts how many examples to return for each repeater. For example:
@@ -87,28 +107,20 @@
 ```ruby
 /a*/.examples(max_repeater_variance: 5) #=> [''. 'a', 'aa', 'aaa', 'aaaa' 'aaaaa']
 /[F-X]/.examples(max_group_results: 10) #=> ['F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O']
 ```
-**_WARNING_**: Choosing huge numbers, along with a "complex" regex, could easily cause your system to freeze!
+_**WARNING**: Choosing huge numbers, along with a "complex" regex, could easily cause your system to freeze!_
 For example, if you try to generate a list of _all_ 5-letter words: `/\w{5}/.examples(max_group_results: 999)`, then since there are actually `63` "word" characters (upper/lower case letters, numbers and "\_"), this will try to generate `63**5 #=> 992436543` (almost 1 _trillion_) examples!
 In other words, think twice before playing around with this config!
 A more sensible use case might be, for example, to generate one random 1-4 digit string: `/\d{1,4}/.examples(max_repeater_variance: 3, max_group_results: 10).sample(1)`
 (Note: I may develop a much more efficient way to "generate one example" in a later release of this gem.)
-
-## Known Bugs
-
-There are a few obscure bugs that have yet to be resolved:
-
-* Various (weird!) legal patterns do not get parsed correctly, such as `/[[wtf]]/.examples` - To solve this, I'll probably have to dig deep into the Ruby source code and imitate the actual Regex parser more closely.
-
-* Backreferences are replaced by the _first_ occurance of the group, not the _last_ (as it should be). This is quite a rare occurance, but for example: `/(a|b){2} \1/.examples` incorrectly includes: `"ba b"` rather than the correct: `"ba a"`
 ## Installation
 Add this line to your application's Gemfile: