Noodle of Death - 2 months ago
Swift Question

Regular expression to add a prefix to each line in a match

I want to reformat my Swift documentation from

/**
Lorem Ipsum is simply dummy text of the printing and typesetting industry.
Lorem Ipsum has been the industry's standard dummy text ever since the
1500s, when an unknown printer took a galley of type and scrambled it to
make a type specimen book. It has survived not only five centuries, but
also the leap into electronic typesetting, remaining essentially
unchanged. It was popularised in the 1960s with the release of Letraset
sheets containing Lorem Ipsum passages, and more recently with desktop
publishing software like Aldus PageMaker including versions of Lorem Ipsum.
*/


to

/// Lorem Ipsum is simply dummy text of the printing and typesetting industry.
/// Lorem Ipsum has been the industry's standard dummy text ever since the
/// 1500s, when an unknown printer took a galley of type and scrambled it to
/// make a type specimen book. It has survived not only five centuries, but
/// also the leap into electronic typesetting, remaining essentially
/// unchanged. It was popularised in the 1960s with the release of Letraset
/// sheets containing Lorem Ipsum passages, and more recently with desktop
/// publishing software like Aldus PageMaker including versions of Lorem Ipsum.


A regular expression find and replace, using Xcode's find/replace regex engine, would be preferred, because God only knows how many doc notes I have in my library of over 200 different classes. The expression to find these blocks is simple, but I do not know how to write my replace expression so that it adds a prefix to each line.

Current Search Expression:
(?s:/\*\*(.*?)\*/)
- matches all text in between /** and */


Current Replace Expression:
/// $1


Obviously, the expressions above do not achieve exactly what I am looking for. I appreciate any help, in advance! Thanks.

Sam
Answer

This is the nicest solution I could come up with, but it relies on some specialized PCRE anchors (most importantly \G, but I also went with \A and \K), so it may not work in your flavor of regex. This is also a two-step solution, and I don't think it'd be possible to get it down to one -- would love to see someone prove me wrong here!


First, you want to match every line starting with whitespace in between /** and */ and replace the whitespace with ///.

Find:

~(?<=/[*]{2}|(?<!\A)\G)\n\K^\s*+(?![*]/)(.*)$~gm

Replace:

/// $1



Then we want to take the result of that replacement and remove the left over lines denoting the old comment, /** and */.

Find:

~^.*(?:/[*]{2}|[*]/).*$\n?~gm

Replace:

(empty string)

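The second expression uses nothing PCRE-specific, so it can be tried outside a PCRE tool as well. Below is a minimal sketch with Python's built-in re module, assuming step one has already run; the sample text and the names after_step_one, delimiter_line and cleaned are made up for illustration.

```python
import re

# Assumed intermediate text: what step one is expected to leave behind
# (shortened from the question's example).
after_step_one = (
    "/**\n"
    "/// Lorem Ipsum is simply dummy text.\n"
    "/// It has survived five centuries.\n"
    " */\n"
    "func foo() {}\n"
)

# Same pattern as the second Find expression, minus the ~...~gm delimiters.
# re.MULTILINE plays the role of the m flag; sub() already replaces globally.
delimiter_line = re.compile(r"^.*(?:/[*]{2}|[*]/).*$\n?", re.MULTILINE)

# Replace every line that contains /** or */ with the empty string.
cleaned = delimiter_line.sub("", after_step_one)
print(cleaned)
```

Note that the pattern removes the whole line on which a delimiter appears, so a one-line comment such as /** text */, or any other line containing */, would be deleted as well. It is worth checking the matches before replacing across a whole project.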


Most of the above should be straightforward if you have a basic understanding of regex, which you appear to have. The one that stands out is the first expression. Let's break it down (yay!)...

(?<=       # start a zero-length lookbehind
  /[*]{2}  # look for the start of a comment
 |         # or...
  (?<!\A)  # negate this part if we're at the beginning of the string
  \G       # start at the end of the last match (or beginning of string)
)          # end that lookbehind and move on to each line
\n\K       # find the newline and then reset the match for clarity
^\s*+      # match whitespace at the beginning of the line
(?![*]/)   # negate this match if we're at the end of the comment
(.*)$      # capture everything up until the end of the line

The only thing in the above that may need an extra explanation is the (?<!\A)\G "hack" I use. \G lets us start a match at the end of the last match, which is quite necessary (for an all-encompassing solution such as this) in a repeating problem like we have here. However, \G also matches the beginning of the string, which we don't want (we deal with that in the first half of the lookbehind, where we match the start of the comment), so we negate matching the beginning of the string with (?<!\A). Boom!
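To make the chaining idea concrete, here is a rough sketch in Python. Python's built-in re module has no \G, so this is an illustration of the idea rather than a port of the expression: Pattern.match(string, pos) succeeds only at pos, which lets a loop continue exactly where the previous match ended. The names OPENER, BODY_LINE and body_lines are made up for this sketch, and the closing-delimiter guard is placed before the indentation because older Python versions lack possessive quantifiers.

```python
import re

OPENER = re.compile(r"/\*\*")
# One comment body line: a newline, then (unless the line is the closing */)
# optional indentation and the text. Putting the guard before the indentation
# stands in for the possessive \s*+ of the original expression.
BODY_LINE = re.compile(r"\n(?![ \t]*\*/)[ \t]*(.*)")

def body_lines(source):
    """Collect the text lines of every /** ... */ block in source."""
    found = []
    for opener in OPENER.finditer(source):
        pos = opener.end()  # a chain may only begin right after /**
        while True:
            # match() is anchored at pos, the role \G plays in the original:
            # continue exactly where the previous match ended.
            m = BODY_LINE.match(source, pos)
            if m is None:
                break  # reached the closing */ line, the chain stops
            found.append(m.group(1))
            pos = m.end()
    return found

sample = "/**\nfoo\n  bar\n */\nlet x = 1\n/**\nbaz\n*/\n"
print(body_lines(sample))
```

Because a chain can only begin right after /**, text at the very start of the input or outside a comment is never picked up, which is the job (?<!\A) does in the original expression.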

\K isn't necessary, but makes for a cleaner expression when we get this complicated. Without it, the \n is part of the match and we would need to put it back ourselves by replacing with \n/// $1 instead.
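If your regex flavor supports neither \G nor \K, the same end result can be scripted in one pass by handing each comment block to a callback. This swaps the pure find/replace technique above for a small program, so treat it as an alternative sketch only. It assumes every /** is the first non-whitespace text on its line, and DOC_BLOCK, to_line_comments and convert are hypothetical names.

```python
import re

# Leading indentation, then /** ... */ with the body captured lazily.
DOC_BLOCK = re.compile(r"^([ \t]*)/\*\*(.*?)\*/", re.DOTALL | re.MULTILINE)

def to_line_comments(match):
    """Rewrite one /** ... */ block as a run of /// lines."""
    indent, body = match.group(1), match.group(2)
    # Drop the blank edges of the body, then trim each line's own whitespace.
    lines = [line.strip() for line in body.strip().splitlines()]
    # Blank lines inside the comment become a bare /// so the block stays joined.
    return "\n".join(indent + ("/// " + line if line else "///") for line in lines)

def convert(source):
    """Convert every block doc comment in source to line doc comments."""
    return DOC_BLOCK.sub(to_line_comments, source)

sample = "/**\nLorem Ipsum is simply dummy text.\nIt has survived.\n*/\nfunc foo() {}\n"
print(convert(sample))
```

Trimming every line discards any relative indentation inside the comment, such as nested lists or code samples, so this sketch fits flat prose comments like the one in the question.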
