Arash Rabiee - 3 months ago 13x

SQL Question

One of my fields contains a strange section of text, but there is something in it that is important for me:

EX: my field is something like this:

\"'Way Out\"(How to be one) (1961) {Side Show (#1.12)}\"'t Schaep Met

De 5 Pooten\"

All the text is garbage, I just want 1961.

Also, my table has more than 200K rows.

The year is inside of the parentheses. The query which is shown below works at every position, but when I have something inside of parentheses before year there is a problem.

All of the text is garbage but the year is required. I already used:

`UPDATE `myTable``

SET `myField` = (SUBSTRING_INDEX(`myField`, '(', -3));

UPDATE `myTable`

SET `myField` = (SUBSTRING_INDEX(`myField`, ')', 3));

UPDATE `myTable`

SET `myField` = (SUBSTRING_INDEX(`myField`, ')', 2));

UPDATE `myTable`

SET `myField` = (SUBSTRING_INDEX(`myField`, ')', 1));

Is there any way to make a query that saves just the value inside of any parentheses which contain 4 digits?

In the above ex the result should be: 1961

Answer

You may give it a try ( I guess it's a one time job hence leave the performance issue):

```
UPDATE myTable MT
CROSS JOIN
(
Select (th*1000+h*100+t*10+u+1) x from
(select 0 th union select 1 union select 2 union select 3 union select 4 union
select 5 union select 6 union select 7 union select 8 union select 9) A,
(select 0 h union select 1 union select 2 union select 3 union select 4 union
select 5 union select 6 union select 7 union select 8 union select 9) B,
(select 0 t union select 1 union select 2 union select 3 union select 4 union
select 5 union select 6 union select 7 union select 8 union select 9) C,
(select 0 u union select 1 union select 2 union select 3 union select 4 union
select 5 union select 6 union select 7 union select 8 union select 9) D
WHERE (th*1000+h*100+t*10+u+1) >= 1800
AND (th*1000+h*100+t*10+u+1) <= 3000
order by x
) t
SET MT.myField = t.x
WHERE `myField` LIKE CONCAT('%',t.x,'%');
```

**Note:** Assuming your year string is within this range : `from 1800 to 3000`

Better put your minimum and maximum year in the where clause

```
WHERE (th*1000+h*100+t*10+u+1) >= MINIMUM_YEAR
AND (th*1000+h*100+t*10+u+1) <= MAXIMUM_YEAR
```

**Before running the above query:**

**After running the above query:**

Source (Stackoverflow)

Comments