Ed B Ed B - 1 month ago 12
C# Question

How do you remove repeated characters in a string

I have a website which allows users to comment on photos.
Of course, users leave comments like:


'OMGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG!!!!!!!!!!!!!!!'


or


'YOU SUCCCCCCCCCCCCCCCCCKKKKKKKKKKKKKKKKKK'


You get it.

Basically, I want to shorten those comments by removing at least most of those excess repeated characters.
I'm sure there's a way to do it with Regex..i just can't figure it out.

Any ideas?

Answer

Keeping in mind that the English language uses double letters often you probably don't want to blindly eliminate them. Here is a regex that will get rid of anything beyond a double.

Regex r = new Regex("(.)(?<=\\1\\1\\1)", RegexOptions.IgnoreCase | RegexOptions.CultureInvariant | RegexOptions.Compiled);

var x = r.Replace("YOU SUCCCCCCCCCCCCCCCCCKKKKKKKKKKKKKKKKKK", String.Empty);
// x = "YOU SUCCKK"

var y = r.Replace("OMGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG!!!!!!!!!!!!!!!", String.Empty);
// y = "OMGG!!"