epoch epoch - 3 months ago 19
Java Question

Split String at n-th character preserving words

Expanding on this answer, which uses this regex:

(?<=\\G.{" + count + "})

I would also like to modify the expression so that it does not split words in the middle.


String string = "Hello I would like to split this string preserving these words";

If I split on every 10 characters, the result looks like this:

[Hello I wo, uld like t, o split th, is string , preserving, these wor, ds]
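
For reference, here is a minimal sketch of the fixed-width split that produces the list above. It assumes the regex from the linked answer is passed straight to String.split; the class name FixedWidthSplit and the helper splitEvery are only illustrative and do not come from that answer.

```java
import java.util.Arrays;

public class FixedWidthSplit {
    // Illustrative helper (an assumption, not code from the linked answer):
    // cuts the text every `count` characters, ignoring word boundaries.
    static String[] splitEvery(String text, int count) {
        // The lookbehind matches the empty position that lies exactly
        // `count` characters after the end of the previous match (\G).
        return text.split("(?<=\\G.{" + count + "})");
    }

    public static void main(String[] args) {
        String string = "Hello I would like to split this string preserving these words";
        // Words such as "would" and "split" end up cut in the middle.
        System.out.println(Arrays.toString(splitEvery(string, 10)));
    }
}
```

Every piece except the last is exactly 10 characters long, which is why the cuts land inside words.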


Is this even possible using only a regex, or would a lexer or some other string manipulation be needed?


This is what I want to use it on:

+ -------------------------------------------JVM Information------------------------------------------ +
| sun.boot.class.path : C:\Program Files\Java\jdk1.6.0_33\jre\lib\resources.jar;C:\Program Files\Java\ |
| jdk1.6.0_33\jre\lib\rt.jar;C:\Program Files\Java\jdk1.6.0_33\jre\lib\sunrsasig |
| n.jar;C:\Program Files\Java\jdk1.6.0_33\jre\lib\jsse.jar;C:\Program Files\Java |
| \jdk1.6.0_33\jre\lib\jce.jar;C:\Program Files\Java\jdk1.6.0_33\jre\lib\charset |
| s.jar;C:\Program Files\Java\jdk1.6.0_33\jre\lib\modules\jdk.boot.jar;C:\Progra |
| m Files\Java\jdk1.6.0_33\jre\classes |
+ ---------------------------------------------------------------------------------------------------- +

The text inside the box is wrapped at the character limit minus the key width; however, this does not look good. This example is also not the only use case; I use that box for multiple types of information.


"not split words in the middle" does not define what should happen in case of "not splitting".

Given a split length of 10 and the string:

Hello I would like to split this string preserving these words

If you want to split right after the word that crosses the 10-character mark, the result is the list:

Hello I would, like to split, this string, preserving, these words

You can accomplish all kinds of tricky "splits" by using plain matching.

Simply match all occurrences of this expression:

(?s)\G.{10,}?\b

(Using (?s) to turn on the DOTALL flag.)

In Perl it's as simple as @array = $str =~ /\G.{10,}?\b/gs, but Java seems to lack a quick function to return all matches, so you'd probably have to use a matcher and push the results on to an array/list.
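
A minimal sketch of that matcher loop in Java, assuming the expression (?s)\G.{10,}?\b (the Java form of the Perl one-liner). The class name WordSplit, the helper splitAfterWords, the trimming of each piece, and the handling of a leftover tail are assumptions added for illustration, not part of the approach described above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordSplit {
    // Hypothetical helper: collects pieces of at least `minLength` characters,
    // each extended lazily to the next word boundary.
    static List<String> splitAfterWords(String text, int minLength) {
        Pattern pattern = Pattern.compile("(?s)\\G.{" + minLength + ",}?\\b");
        Matcher matcher = pattern.matcher(text);
        List<String> parts = new ArrayList<String>();
        int end = 0;
        while (matcher.find()) {
            // Each match after the first starts with the separating space; trim it.
            parts.add(matcher.group().trim());
            end = matcher.end();
        }
        // Assumption: a tail too short to match is kept rather than dropped.
        if (end < text.length()) {
            String tail = text.substring(end).trim();
            if (!tail.isEmpty()) {
                parts.add(tail);
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        String string = "Hello I would like to split this string preserving these words";
        // prints [Hello I would, like to split, this string, preserving, these words]
        System.out.println(splitAfterWords(string, 10));
    }
}
```

Because \G anchors every match to the end of the previous one, the loop stops at the first position where no further piece of the minimum length can be matched; the tail check is there so that a short final fragment is not silently lost.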