Hasen Hasen - 1 year ago 32
PHP Question

PHP Split sentence by full stop before last if over a certain length

I want to first detect whether a line is over a certain length (obviously not that hard) and, if so, split it at the full stop (or question mark/exclamation mark) before the last one. So for example:

This line is going to be too long. This is why. Also I don't know what's next!

If there are no full stops in the sentence it should not split despite the length.

A shorter sentence, this is fine.

This line will surely also be too long? I'm not sure why though.


It should be split to:

This line is going to be too long. This is why.

Also I don't know what's next!

If there are no full stops in the sentence it should not split despite the length.

A shorter sentence, this is fine.

This line will surely also be too long?

I'm not sure why though.


In other words, a line whose only full stop is the one at the end should not be split.

EDIT: I didn't think there'd be much point to add this code in but since it was requested:

if (strlen($array[0]) > 44) {
    // split the line here
}


I don't know how to detect the second-to-last ?, . or ! mark in a line.

Answer Source

If your sentences don't include decimal numbers (a decimal point would be mistaken for a full stop), this should work:

$lines = array(
"This line is going to be too long. This is why. Also I don't know what's next!",
"If there are no full stops in the sentence it should not split despite the length.",
"A shorter sentence, this is fine.",
"This line will surely also be too long? I'm not sure why though.",
);

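$result = array();
foreach ($lines as $line) {
    // Group 1: everything up to the second-to-last terminator; group 2: the last sentence.
    if (preg_match('/^(.*?[^.!?]*[.!?])\h*([^.!?]*[.!?])$/', $line, $matches)) {
        $result[] = $matches[1];
        $result[] = $matches[2];
    } else {
        // No match (for example, fewer than two terminators): keep the line whole.
        $result[] = $line;
    }
}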
print_r($result);

Output:

Array
(
    [0] => This line is going to be too long. This is why.
    [1] => Also I don't know what's next!
    [2] => If there are no full stops in the sentence it should not split despite the length.
    [3] => A shorter sentence, this is fine.
    [4] => This line will surely also be too long?
    [5] => I'm not sure why though.
)
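
Note that the loop above splits every line that contains two or more terminators, whatever its length. A minimal sketch that adds the length check from the question, assuming the 44-byte threshold from the question's snippet; the function name splitBeforeLastSentence is hypothetical and not part of the original answer:

```php
<?php
// Hypothetical helper: splits $line before its last sentence, but only
// when the line is longer than $maxLength bytes (44 comes from the
// question's snippet; adjust as needed).
function splitBeforeLastSentence(string $line, int $maxLength = 44): array
{
    // Short lines are returned untouched, whatever punctuation they contain.
    if (strlen($line) <= $maxLength) {
        return array($line);
    }
    // Same pattern as in the answer above.
    if (preg_match('/^(.*?[^.!?]*[.!?])\h*([^.!?]*[.!?])$/', $line, $matches)) {
        return array($matches[1], $matches[2]);
    }
    // Long line without a terminator in the middle: leave it alone.
    return array($line);
}

$lines = array(
    "This line is going to be too long. This is why. Also I don't know what's next!",
    "If there are no full stops in the sentence it should not split despite the length.",
    "A shorter sentence, this is fine.",
    "This line will surely also be too long? I'm not sure why though.",
);

$result = array();
foreach ($lines as $line) {
    // Append one or two pieces per input line.
    foreach (splitBeforeLastSentence($line) as $part) {
        $result[] = $part;
    }
}
print_r($result);
```

For the four sample lines this gives the same six entries as the output above, because the only line of 44 bytes or fewer has a single terminator anyway. The difference shows on a short line with two sentences, such as "Short one. Also short!", which is now returned whole. strlen counts bytes, so for multibyte text mb_strlen may be the better choice.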

Regex explanation:

/               : regex delimiter
  ^             : beginning of line
    (           : start capture group 1
      .*?       : 0 or more of any character except newline, not greedy
      [^.!?]*   : 0 or more characters that are not . ! ?
      [.!?]     : 1 of these characters
    )           : end group 1
    \h*         : optional horizontal whitespace
    (           : start capture group 2
      [^.!?]*   : 0 or more characters that are not . ! ?
      [.!?]     : 1 of these characters
    )           : end group 2
  $             : end of line
/               : regex delimiter
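
To illustrate the decimal-number caveat: because \h* allows zero spaces, the "." inside a number counts as a split point. The sketch below compares the answer's pattern with an assumed variant (not from the original answer) that uses \h+ so a terminator only counts when whitespace follows it; the variable names are illustrative.

```php
<?php
// Pattern from the answer: whitespace after the first terminator is optional.
$original = '/^(.*?[^.!?]*[.!?])\h*([^.!?]*[.!?])$/';
// Assumed variant: \h+ requires at least one horizontal space after the
// terminator, so the "." inside "3.50" is no longer a split point.
$strict = '/^(.*?[^.!?]*[.!?])\h+([^.!?]*[.!?])$/';

$line = "It costs 3.50 today.";

// The original pattern matches and cuts the number in half:
// $m1[1] is "It costs 3." and $m1[2] is "50 today."
$originalMatched = preg_match($original, $line, $m1);

// The strict pattern finds no terminator followed by a space, so it
// returns 0 and the line would be kept whole.
$strictMatched = preg_match($strict, $line, $m2);

var_dump($originalMatched, $m1, $strictMatched);
```

The trade-off is that the stricter pattern never splits sentences written without a space between them ("First.Second."), and a decimal inside the last sentence ("Buy now. It costs 3.50!") leaves the line unsplit rather than split correctly.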