Today I learned: How to match a portion of text that spans multiple lines with a JavaScript regular expression

Problem

Let's say you have the following five-line quote by Mark Twain:

Keep away from people who try to belittle your ambitions.
Small people always do that,
but the really great make you feel that you,
too,
can become great.

Now you want a regular expression that matches everything from people on line 1 through really on line 3.

Hint: Spotlight on a very useful set of tokens

The token [^] matches any character, including a newline. It is an empty negated character class: "any character that is not in the empty set", which is every character.
Append the * quantifier and it matches that same "any character" zero or more times, as many times as possible.

Beware, though: the token [^] seems to work only in JavaScript.
An alternative seems to be [\s\S], a character class that combines two shorthands which together cover every character:

  1. \s matches any whitespace character
  2. \S matches any non-whitespace character
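To see the difference in practice, here is a minimal sketch that tests each token against a single newline character. The variable names are only illustrative.

```javascript
// A string that contains nothing but one newline character.
const newline = "\n";

// The dot does not match line terminators (no flags are set here).
const dotMatchesNewline = /./.test(newline);

// The empty negated class and the whitespace/non-whitespace class both do.
const emptyNegatedMatchesNewline = /[^]/.test(newline);
const spaceOrNonSpaceMatchesNewline = /[\s\S]/.test(newline);

console.log(dotMatchesNewline);             // false
console.log(emptyNegatedMatchesNewline);    // true
console.log(spaceOrNonSpaceMatchesNewline); // true

// The same thing with a quantifier: only the last two can cross a line break.
console.log(/a.*b/.test("a\nb"));      // false
console.log(/a[^]*b/.test("a\nb"));    // true
console.log(/a[\s\S]*b/.test("a\nb")); // true
```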

Solution

Using [^]*

/people[^]*.*really/m

According to regex101.com, this breaks down as follows:

  1. people matches the characters people literally (case sensitive)
  2. [^] matches any character, including newline
  3. * matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
  4. . matches any character (except for line terminators)
  5. * matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
  6. really matches the characters really literally (case sensitive)

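Global pattern flag:
m modifier: multiline. Causes ^ and $ to match the beginning/end of each line (not only the beginning/end of the string). This pattern contains neither ^ nor $, so the flag does not change what it matches here.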
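The pattern can be tried against the quote like this. It is a sketch that assumes the five lines are joined with "\n"; the variable names are illustrative.

```javascript
// The five-line quote from the Problem section, joined with newline characters.
const quote = [
  "Keep away from people who try to belittle your ambitions.",
  "Small people always do that,",
  "but the really great make you feel that you,",
  "too,",
  "can become great.",
].join("\n");

// The pattern from this section, exactly as written above.
const match = quote.match(/people[^]*.*really/m);

// match[0] is the matched text; splitting on "\n" shows how many lines it spans.
const matchedLines = match[0].split("\n");
console.log(matchedLines.length); // 3
console.log(matchedLines[0]);     // people who try to belittle your ambitions.
console.log(matchedLines[2]);     // but the really

// A shorter variant for comparison: with no ^ or $ the m flag is unused,
// and [^]* already covers anything .* could match.
const simpler = quote.match(/people[^]*really/);
console.log(simpler[0] === match[0]); // true
```

For this quote the shorter /people[^]*really/ returns the same text, which suggests the trailing .* and the m flag are optional here.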

Using [\s\S]*

/people[\s\S]*.*really/m

According to regex101.com, this breaks down as follows:

  1. people matches the characters people literally (case sensitive)
  2. \s matches any whitespace character (equivalent to [\r\n\t\f\v \u00a0\u1680\u2000-\u200a\u2028\u2029\u202f\u205f\u3000\ufeff])
  3. \S matches any non-whitespace character (equivalent to [^\r\n\t\f\v \u00a0\u1680\u2000-\u200a\u2028\u2029\u202f\u205f\u3000\ufeff])
  4. * matches the previous token (here the whole [\s\S] class) between zero and unlimited times, as many times as possible, giving back as needed (greedy)
  5. . matches any character (except for line terminators)
  6. * matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
  7. really matches the characters really literally (case sensitive)

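Global pattern flag:
m modifier: multiline. Causes ^ and $ to match the beginning/end of each line (not only the beginning/end of the string). This pattern contains neither ^ nor $, so the flag does not change what it matches here.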
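As a quick check, the sketch below runs the [\s\S]* pattern on the quote and compares it with the [^]* version. It then illustrates what "greedy" means using a made-up sample string (not part of the quote) in which really appears twice. All variable names are illustrative.

```javascript
// The five-line quote from the Problem section, joined with newline characters.
const twainQuote = [
  "Keep away from people who try to belittle your ambitions.",
  "Small people always do that,",
  "but the really great make you feel that you,",
  "too,",
  "can become great.",
].join("\n");

// Both spellings of "any character" should capture the same text.
const portableMatch = twainQuote.match(/people[\s\S]*.*really/m);
const nativeMatch = twainQuote.match(/people[^]*.*really/m);
console.log(portableMatch[0] === nativeMatch[0]); // true

// Hypothetical sample with two occurrences of "really" to show greediness.
const sample = "people\nreally\nreally";

// Greedy: * takes as much as possible, so the match runs to the LAST "really".
const greedy = sample.match(/people[\s\S]*really/)[0];

// Lazy: *? takes as little as possible, so the match stops at the FIRST one.
const lazy = sample.match(/people[\s\S]*?really/)[0];

console.log(JSON.stringify(greedy)); // "people\nreally\nreally"
console.log(JSON.stringify(lazy));   // "people\nreally"
```

The quote contains really only once, so greedy and lazy quantifiers give the same result there. The distinction matters only when the closing word can appear more than once.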
