GSxxx GSxxx - 23 days ago 9
Perl Question

grep -P (PCRE): How do I compare two values? #university_exercise

Given this data

A 1.20 GBP 1.2 GBP
B 1.2 GBP 1.20 GBP
C 01 GBP 1 GBP
D 1 GBP 01 GBP
E 1.0 GBP 1 GBP
F 1 GBP 1.0 GBP
G 2.10 GBP 3.2 GBP
H 4.1 GBP 3.20 GBP
I 04 GBP 3 GBP
J 4 GBP 03 GBP
K 4.0 GBP 3 GBP
L 4 GBP 3.0 GBP


I have to find lines where the values are different (using grep -P).

There is exactly one space between fields, and the values are compared numerically, so
3.2 = 03.20 and 3.0 = 3.


I tried this

grep -P '([1-9][0-9]*(?:\.[0-9]*[1-9])?)(\.?0*) ([A-Z]{3}) 0*(?!\1).* \3' filename


Unfortunately, it doesn't work properly. I'm not sure I'm using the negative lookahead correctly.

Edit:

I know that there are many better ways to achieve this result.

However, I'm a student, and this is an exercise that I have to do using grep with regular expressions.

What I have tried works until the tests get trickier, so if you can help, just tell me what I'm doing wrong.

The result should be:

G 2.10 GBP 3.2 GBP
H 4.1 GBP 3.20 GBP
I 04 GBP 3 GBP
J 4 GBP 03 GBP
K 4.0 GBP 3 GBP
L 4 GBP 3.0 GBP
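
As a cross-check of that expected list (not a grep solution, which the exercise requires), here is a minimal Python sketch that compares the two amounts numerically. It assumes every line has exactly five space-separated fields, as in the sample data; the names `DATA` and `differing_lines` are made up for this illustration.

```python
from decimal import Decimal

# Sample data copied from the question.
DATA = """\
A 1.20 GBP 1.2 GBP
B 1.2 GBP 1.20 GBP
C 01 GBP 1 GBP
D 1 GBP 01 GBP
E 1.0 GBP 1 GBP
F 1 GBP 1.0 GBP
G 2.10 GBP 3.2 GBP
H 4.1 GBP 3.20 GBP
I 04 GBP 3 GBP
J 4 GBP 03 GBP
K 4.0 GBP 3 GBP
L 4 GBP 3.0 GBP
"""


def differing_lines(text):
    """Return the lines whose two amounts differ numerically."""
    result = []
    for line in text.splitlines():
        # Assumed layout: label, amount, currency, amount, currency.
        _label, first, _cur1, second, _cur2 = line.split()
        # Decimal ignores leading zeros and trailing fractional zeros
        # when comparing, so 1.20 == 1.2 and 01 == 1.
        if Decimal(first) != Decimal(second):
            result.append(line)
    return result


for line in differing_lines(DATA):
    print(line)
```

Run on the sample data, this prints lines G through L, which agrees with the list above.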


I have tested my solution and it additionally returns:

A 1.20 GBP 1.2 GBP
B 1.2 GBP 1.20 GBP
D 1 GBP 01 GBP


I have also checked the regular expression on https://regex101.com/, and the result was surprising: for lines A and B, the regular expression matches only the digits after the decimal point. Try it there to see what I mean.
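
The behaviour described above can be reproduced outside regex101. The sketch below runs the same pattern through Python's `re` module, which is not PCRE but, as far as I can tell, treats this particular pattern the same way; the name `ATTEMPT` is made up for the illustration.

```python
import re

# The pattern from the question, unchanged.
ATTEMPT = re.compile(
    r'([1-9][0-9]*(?:\.[0-9]*[1-9])?)(\.?0*) ([A-Z]{3}) 0*(?!\1).* \3'
)

for line in ("A 1.20 GBP 1.2 GBP", "B 1.2 GBP 1.20 GBP", "D 1 GBP 01 GBP"):
    m = ATTEMPT.search(line)
    # Show where the match starts and what group 1 captured.
    print(f"{line!r}: matched {m.group(0)!r}, group 1 = {m.group(1)!r}")

# 'A 1.20 GBP 1.2 GBP': matched '20 GBP 1.2 GBP', group 1 = '20'
# 'B 1.2 GBP 1.20 GBP': matched '2 GBP 1.20 GBP', group 1 = '2'
# 'D 1 GBP 01 GBP': matched '1 GBP 01 GBP', group 1 = '1'
```

Two things seem to go wrong. The pattern is not anchored, so after the attempt starting at "1.20" fails, the engine retries further right, captures "20" as group 1, and `(?!20)` succeeds against "1.2". On line D, `0*` before the lookahead can give back the zero it consumed, so the lookahead is tested at "01" instead of "1" and succeeds as well.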

Another edit for those who suggest grep -v:
I did not present the whole exercise. Every number is followed by a currency, and there is an additional requirement that the two currencies be the same. So when I use grep -v it still doesn't work, for the obvious reason. There has to be a single negation.

Answer

You can use this somewhat complex regex for this task:

grep -P '\h+0*(?:(?:(\d+)\.?0*\h+0*\1\.?0*|(\d+\.\d*[1-9])0*\h+\g{2}0*)(*SKIP)(*F)|.*)$' file

G 2.10 3.2
H 4.1 3.20
I 04 3
J 4 03
K 4.0 3
L 4 3.0

RegEx Demo

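The PCRE verbs (*SKIP)(*F) discard whatever the first alternative matched (here, a pair of equal values) and force a failure at that point, so only lines that fall through to the .* alternative are printed.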


Alternatively, you can use this negative lookahead regex:

grep -P '^\S+\h+(?!0*(?:(\d+)\.?0*\h+0*\1\.?0*|(\d+\.\d*[1-9])0*\h+\g{2}0*)$)' file

G 2.10 3.2
H 4.1 3.20
I 04 3
J 4 03
K 4.0 3
L 4 3.0

RegEx Demo 2
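
To try the lookahead variant without GNU grep, here is a hedged translation to Python's `re` module. It is only a sketch: `\h` is written as `[ \t]` and `\g{2}` as a named backreference because Python does not accept the PCRE spellings, the group names `whole` and `frac` are made up, and the sample data has no currency column, following the output shown above.

```python
import re

# Same structure as the grep -P lookahead regex above, in Python syntax.
DIFFERENT = re.compile(
    r'^\S+[ \t]+(?!0*(?:'
    # Integer part equal; optional ".000" tails on either side.
    r'(?P<whole>\d+)\.?0*[ \t]+0*(?P=whole)\.?0*'
    # Same significant decimal digits; trailing zeros ignored.
    r'|(?P<frac>\d+\.\d*[1-9])0*[ \t]+(?P=frac)0*'
    r')$)'
)

# Sample data without currencies, as in the answer's output.
DATA = """\
A 1.20 1.2
B 1.2 1.20
C 01 1
D 1 01
E 1.0 1
F 1 1.0
G 2.10 3.2
H 4.1 3.20
I 04 3
J 4 03
K 4.0 3
L 4 3.0
"""

# Keep only the lines where the lookahead does NOT find two equal values.
matches = [line for line in DATA.splitlines() if DIFFERENT.search(line)]
print("\n".join(matches))
```

On this sample it keeps lines G through L. Because the lookahead ends in `$`, the pattern would need another group for the currencies before it could be applied to the data with the GBP columns.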
