
Javascript Regular Expression For Word Boundaries, Tolerating In-word Hyphens And Apostrophes

I'm looking for a Regular Expression for JavaScript that will identify word boundaries in English, while accepting hyphens and apostrophes that appear inside words, but excluding t

Solution 1:

You can organize your word-boundary characters into two groups.

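  1. Characters that cannot be alone: they count as a boundary only in a run of two or more boundary characters (here , ' and -).
  2. Characters that can be alone: a single one is already a boundary (here whitespace and the period).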

A regex that works with your example would be:

[\s.,'-]{2,}|[\s.]
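To see how the pattern behaves, here is a minimal sketch that uses it with split(). The sample sentence and the helper name splitWords are made up for illustration; they are not from the question.

```javascript
// Separator pattern from the answer: a run of two or more boundary
// characters, or a single whitespace character or period.
const basicSeparator = /[\s.,'-]{2,}|[\s.]/;

// Hypothetical helper: split on the separator and drop empty strings
// (a separator at the very end of the text leaves one behind).
function splitWords(text) {
  return text.split(basicSeparator).filter(Boolean);
}

// Made-up sample sentence, not taken from the question.
const sample = "Don't stop - the well-known 'test' ends.";
console.log(splitWords(sample));
// → [ "Don't", 'stop', 'the', 'well-known', 'test', 'ends' ]
```

The apostrophe in "Don't" and the hyphen in "well-known" survive because each stands alone between letters, while " - " and " '" are runs of two or more boundary characters.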


From here, keep sorting the remaining non-word characters into those two groups until the pattern fits your needs. You might start by adding symbols and more punctuation to the character classes.
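One way to keep the two groups manageable as they grow is to store each as a string and build the pattern from them. This is only a sketch: the variable names and the choice of which extra punctuation goes into which group are assumptions, not part of the original answer.

```javascript
// Assumed grouping for illustration; adjust both strings to your own text.
// Characters that are a boundary even when single (character-class syntax).
const standsAlone = "\\s.!?;:\"()";
// Characters that are a boundary only inside a run of two or more.
const needsCompany = ",'\\-";

// Same shape as the answer's pattern: union{2,} | single stand-alone char.
const extendedSeparator = new RegExp(
  `[${standsAlone}${needsCompany}]{2,}|[${standsAlone}]`
);

// Hypothetical helper, not from the original answer.
function splitWordsExtended(text) {
  return text.split(extendedSeparator).filter(Boolean);
}

console.log(splitWordsExtended("Wait (really?) - that's a so-called 'fix'; done!"));
// → [ 'Wait', 'really', "that's", 'a', 'so-called', 'fix', 'done' ]
```

A character from the second group that sits alone at the very start or end of the text (a leading quote, for example) is not matched and stays attached to its word.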

Solution 2:

You could write something like this:

(\s|[!-/]|[:-@]|[\[-`]|[\{-~])*\s(\s|[!-/]|[:-@]|[\[-`]|[\{-~])*

Or the compact version:

(\s|[!-/:-@\[-`\{-~])*\s(\s|[!-/:-@\[-`\{-~])*

The RegExp requires one \s (a whitespace character) and selects all whitespace and non-alphanumeric characters before and after it.

https://regex101.com/r/bR8sV1/4

  • \s matches any whitespace character
  • !-/ every char from ! to /
  • :-@ every char from : to @
  • \[-` every char from [ to ` (this range includes the underscore)
  • \{-~ every char from { to ~
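A minimal sketch of using the compact version with split() follows. Two adaptations are assumptions of this example rather than part of the answer: the groups are made non-capturing, because split() would otherwise insert the captured text into its result, and the / inside the character class is escaped for the regex literal. The sample sentence and helper name are made up.

```javascript
// Compact pattern from the answer, adapted for split():
// (?:...) instead of (...), and "/" written as "\/" inside the literal.
const spacedBoundary =
  /(?:\s|[!-\/:-@\[-`\{-~])*\s(?:\s|[!-\/:-@\[-`\{-~])*/;

// Hypothetical helper, not from the original answer.
function splitOnSpacedBoundaries(text) {
  return text.split(spacedBoundary);
}

console.log(splitOnSpacedBoundaries("Hello, world! It's a well-known (fact)."));
// → [ 'Hello', 'world', "It's", 'a', 'well-known', 'fact).' ]
```

Because every match must contain at least one whitespace character, the punctuation at the very end of the sample (").") is not removed. Trimming such characters from the first and last token would need a separate step.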
