Mary Mary - 1 month ago 6
PHP Question

Apply 2 preg_split with regex to text

Context:
Every day I receive an email containing the reservation details of several customers, and I have to split it according to a set of rules. This is an example of the email:

A N K U N F T 11.08.15
*** NEUBUCHUNG ***
11.08.15 xxx xxx X3 2830 14:25 17:50
18.08.15 xxx xxx X3 2831 18:40
F882129 dsdsaidsaia
F882129 xxxyxyagydaysd
sadsdsdsdsadsadadssda
sadsdsdsdsadsadadssda
**«CUT HERE2»**


A N K U N F T 18.08.15
*** NEUBUCHUNG ***
11.08.15 xxx xxx X3 2830 14:25 17:50
18.08.15 xxx xxx X3 2831 18:40
F881554 ZXCXZCXCXZCCXZ
F881554 xcvcxvcxvcvxc
F881554 xvcxvcxcvxxvccvxxcv

**«CUT HERE»**


11.08.15 xxx xxx X3 2830 14:25 17:50
18.08.15 xxx xxx X3 2831 18:40
F881605 xczxcdfsfdsdfs
F881605 zxccxzxzdffdsfds

**«CUT HERE»**


So the text basically has to be cut after the last F999999 line of each reservation (where 9 can be any digit), because F999999 is the reservation code.*
I inserted the text «CUT HERE» only to show where the cut should happen.

*NOTE: the reservation code may have one of the following formats: F999999, A999999, E999999 or 999999.

So I apply a working preg_split with the following regex:

Regex1 = "/(?:\\s(F|A|E)?\\d{6}\\s?+.*?\r\n\\s?\r\n)\\K/ms";


However, I sometimes have to cut where «CUT HERE2» appears, because there can be some extra text after the last reservation code line.

So I created this regex:

Regex2 = "/^\h*(F|A|E)?\d{6}.*?\R{2}\K/ms"
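For reference, here is a minimal sketch of how Regex2 could be applied with preg_split. The sample string is a shortened, made-up mail (not real data), and the PREG_SPLIT_NO_EMPTY flag is an added assumption so that the empty chunk after the last cut is dropped.

```php
<?php
// Regex2 from above: optional F/A/E prefix, six digits, then everything up to
// the first two consecutive line breaks. \K keeps the matched text in the chunk.
$regex2 = '/^\h*(F|A|E)?\d{6}.*?\R{2}\K/ms';

// Hypothetical, shortened sample with two reservations.
$mail = "11.08.15 xxx xxx X3 2830\nF882129 first guest\nsome free text\n\n"
      . "18.08.15 xxx xxx X3 2831\nA025808 second guest\n\n";

// PREG_SPLIT_NO_EMPTY (an assumption here) drops the empty trailing chunk.
$chunks = preg_split($regex2, $mail, -1, PREG_SPLIT_NO_EMPTY);

print_r($chunks); // two chunks, one per reservation
```

Because of \K nothing is removed from the text, so joining the chunks gives back the original mail.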


Yet I sometimes get the following format, with blank lines between the F999999 lines of the same reservation, which makes my previous regex (Regex2) cut where it says «NOT CUT HERE»:

A N K U N F T 11.08.15
*** NEUBUCHUNG ***
11.08.15 xxx xxx X3 2830 14:25 17:50
18.08.15 xxx xxx X3 2831 18:40
F882129 dsdsaidsaia

<<NOT CUT HERE>>

F882129 xxxyxyagydaysd
sadsdsdsdsadsadadssda
sadsdsdsdsadsadadssda
**«CUT HERE»**


A N K U N F T 18.08.15
*** NEUBUCHUNG ***
11.08.15 xxx xxx X3 2830 14:25 17:50
18.08.15 xxx xxx X3 2831 18:40
F881554 ZXCXZCXCXZCCXZ

<<NOT CUT HERE>>

F881554 xcvcxvcxvcvxc
F881554 xvcxvcxcvxxvccvxxcv

**«CUT HERE»**


11.08.15 xxx xxx X3 2830 14:25 17:50
18.08.15 xxx xxx X3 2831 18:40
F881605 xczxcdfsfdsdfs
F881605 zxccxzxzdffdsfds

**«CUT HERE»**


I just want it to cut where «CUT HERE» appears.
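The unwanted split can be reproduced with a short sketch. The sample string below is invented for illustration and contains a single reservation with a blank line between its two code lines; PREG_SPLIT_NO_EMPTY is an added assumption to ignore empty chunks.

```php
<?php
// Regex2 from above.
$regex2 = '/^\h*(F|A|E)?\d{6}.*?\R{2}\K/ms';

// Hypothetical sample: ONE reservation, but with a blank line between
// the two lines that carry the same reservation code.
$mail = "11.08.15 xxx xxx X3 2830\nF882129 guest one\n\nF882129 guest two\n\n";

$chunks = preg_split($regex2, $mail, -1, PREG_SPLIT_NO_EMPTY);

// The pattern stops at the FIRST pair of line breaks after a code line,
// so the single reservation is split into two chunks.
var_dump(count($chunks)); // int(2)
```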

For example, the error happens with this input:

***NEUBUCHUNG ***
23.02.17 DUS FNC DE 1414 12:05 15:10
09.03.17 FNC DUS DE 1415 16:40
FNC011 Enotel Baia 9360-215 Ponta do Sol
1 DZ Typ I Meerblick 2Erw. Frühstück
am 03.10.16 CRS: MX - PNR: 1290689
Fluggeber: Condor Flugdienst / PNR: 1290689 Frühbucher 10% inkl. Reiseleitung und Transfer ab/bis
A025808 HERR Berg, Ulrich 62


<<NOT CUT HERE>>

Anfrage.
A025808 FRAU Berghaus, Petra 58

**«CUT HERE»**

***S T O R N O **
04.10.16 STR X3 2810
11.10.16 FNC STR X3 2811 18:15
FNC036 The Flame Tree Funchal
1 DZ Meerblick 2Erw. H
A987025 FRAU BURG, GERTRUD *** STORNO *** O


<<NOT CUT HERE>>


A987025 HERR BURG, WALTER *** STORNO *** O

**«CUT HERE»**

***ÄNDERUNG ***
NEU:01.11.16 FRA X3 2806 13:35 16:50
08.11.16 FNC FRA X3 2807 17:40
FNC813 Golden Residence/Wanderk. 9000-105 Funchal
1 Suite seitl. Meerblick 3Erw. F
A982512 FRAU KROST, SIMONE
Frühbucher 15%


<<NOT CUT HERE>>

inkl. Reiseleitung
und Transfer ab/bis
Im Reisepreis bereits enthalten: Drei
geführte Wanderungen (1 Ganztags- und 2
Halbtagswanderungen) inkl. aller
Transfers.

**«SHOULD CUT HERE»**

***ÄNDERUNG ***
ALT:01.11.16 FRA X3 2806 13:35 16:50
08.11.16 FNC FRA X3 2807 17:40
FNC813 Golden Residence/Wanderk. 9000-105 Funchal
1 Suite seitl. Meerblick 3Erw. F
A982512 HERR KROST, SIMONE

**«CUT HERE»**


25.04.17 DRS FNC ST 1602 13:25 17:15
09.05.17 FNC DRS ST 1607 00:00
FNC076 Baia Azul 9004-530 Funchal
1 DZ Typ I Meerblick 2Erw. Halbpension
am 03.10.16 CRS: MX - PNR: 15326821
Fluggeber: alltours / PNR: 15326821
inkl. Reiseleitung
und Transfer ab/bis Flughafen
A025986 HERR Schulze, Steffen 55
A025986 FRAU Schulze, Kerstin 54

**«CUT HERE»**

***S T O R N O **
13.11.16 FRA X3 2806
20.11.16 FNC FRA X3 2807 17:35
FNC096 Pestana Village & Miramar Funchal
1 Studio 2Erw. H
A976918 FRAU HEBING, BETTINA *** STORNO *** O

<<NOT CUT HERE>>

A976918 HERR HEBING, LUDGER *** STORNO *** O

**«CUT HERE»**


I put «NOT CUT HERE» where it splits but shouldn’t, «SHOULD CUT HERE» where it should cut, and «CUT HERE» where it cuts correctly.

Answer

You may use

'~^\h*F\d{6}.*?\R{2}\K~sm'

See the regex demo

Details:

  • ^ - start of a line
  • \h* - 0+ horizontal whitespaces
  • F\d{6} - F + 6 digits
  • .*? - any 0+ chars up to the first
  • \R{2} - 2 linebreaks
  • \K - omit the whole text matched so far, so the split happens after it and nothing is removed.
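To see what \K does inside preg_split, here is a tiny sketch on a toy string (the comma-separated input is only an illustration and has nothing to do with the mail format):

```php
<?php
// Without \K the matched delimiter is removed from the result.
$plain = preg_split('~,~', 'a,b,c');

// With \K the match is reduced to an empty string located right after the
// comma, so the text is cut at that position and the comma stays in the chunk.
$kept = preg_split('~,\K~', 'a,b,c');

print_r($plain); // a / b / c
print_r($kept);  // a, / b, / c
```

The same idea is used in the pattern above: everything up to and including the two line breaks stays in the chunk that precedes the cut.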

See PHP demo:

$re = '~^\h*F\d{6}.*?\R{2}\K~ms'; 
$str = "A N K U N F T   11.08.15\n*** NEUBUCHUNG ***\n 11.08.15  xxx  xxx  X3 2830  14:25   17:50\n 18.08.15  xxx  xxx  X3 2831  18:40\n F882129  dsdsaidsaia\n F882129  xxxyxyagydaysd\nsadsdsdsdsadsadadssda\nsadsdsdsdsadsadadssda\n\nA N K U N F T   18.08.15\n*** NEUBUCHUNG ***\n 11.08.15  xxx  xxx  X3 2830  14:25   17:50\n 18.08.15  xxx  xxx  X3 2831  18:40\n F881554  ZXCXZCXCXZCCXZ\n F881554  xcvcxvcxvcvxc\n F881554  xvcxvcxcvxxvccvxxcv\n\n\n11.08.15  xxx  xxx  X3 2830  14:25   17:50\n 18.08.15  xxx  xxx  X3 2831  18:40\n F881605  xczxcdfsfdsdfs\n F881605  zxccxzxzdffdsfds\n\n"; 
print_r(preg_split($re, $str));
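Note that this pattern still cuts at a blank line that sits between two code lines of the same reservation, and it only covers the F prefix. One possible workaround, which is a different technique and not part of the answer above, is to cut at blank lines only when the next line looks like the start of a new reservation. The sketch below assumes that a new reservation always starts with a `***` header, an `A N K U N F T` line, or a date line (optionally prefixed with NEU: or ALT:). That assumption is inferred only from the samples in the question, and the names $startOfBlock and $mail are hypothetical.

```php
<?php
// Cut after a blank line, but only if the following line looks like the
// start of a new reservation (assumption based on the samples only).
$startOfBlock = '~\R\h*\R\K(?=\h*(?:\*{3}|A N K U N F T|(?:NEU:|ALT:)?\d{2}\.\d{2}\.\d{2}\b))~';

// Hypothetical, shortened sample built from the formats in the question.
$mail = "***S T O R N O **\n"
      . "04.10.16 STR X3 2810\n"
      . "A987025 FRAU BURG, GERTRUD\n"
      . "\n\n"                          // blank lines inside one reservation
      . "A987025 HERR BURG, WALTER\n"
      . "\n"
      . "***ÄNDERUNG ***\n"
      . "NEU:01.11.16 FRA X3 2806\n"
      . "A982512 FRAU KROST, SIMONE\n"
      . "\n\n"                          // blank lines before free text
      . "inkl. Reiseleitung\n"
      . "\n"
      . "25.04.17 DRS FNC ST 1602\n"
      . "A025986 HERR Schulze, Steffen\n";

$parts = preg_split($startOfBlock, $mail, -1, PREG_SPLIT_NO_EMPTY);

print_r($parts); // three chunks: STORNO, ÄNDERUNG, and the 25.04.17 booking
```

This ignores the reservation code entirely, so it would need adjusting if a mail contains free text that starts with a date or with `***` right after a blank line.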