info perto info perto - 5 months ago 11x
SQL Question

How to remove specific words in database?

I will try to explain with an explain because it is bit complicated.I would like to remove words from my SQL table if the word's root already exist in table. So my table structure is like this

Words Scores
car 5
book 11
cars 2
pen 10
tool 4
car's 8
tools 2

So in this case, since car is root of car's and cars, tool is root of tools, I would like to remove them and add their score on the root word and get my table like this;

Words Scores
car 15
book 11
pen 10
tool 6

Not: if there is "a" in the list then all words that starting with a going to be remove: for this I thought about having if condition like;
if (string[i].Length>=3)
but ofcourse it won't avoid all possibilities maybe there will be word "book"and "booking" so booking is going to be deleted but it is OK.


Well, I can't think of an easier solution if you don't have a relations table containing root and childs, but you can try something like this - In two steps, first step is update the scores, the second delete the childs :

UPDATE YourTable t
SET t.Scores =(SELECT sum(s.scores) FROM YourTable s
               WHERE s.words like concat('%',t.words,'%'))

EDIT: or this

update t
set t.score=sum(s.score) 
from YourTable t
INNER JOIN YourTable s
 ON (s.words like concat('%',t.words,'%'))

This will update every one to their childs(look alike) total score.

Then delete :

WHERE t.words in(SELECT s.words FROM YourTable s
                 WHERE t.words like concat('%',s.words,'%')
                 AND t.words <> s.words)

This will delete all the records that are a child(look alike) of another word. It won't work on any DBMS, so here is another version of it with a join(Update join syntax is different from one db to another):

INNER JOIN YourTable s
 ON(t.words like concat('%',s.words,'%')
    AND t.words <> s.words)

You didn't provide your RDBMS, so this is an answer for ANSI-SQL. This is untested , so check if it works.

Edit: Remeber that without a root-child table, there will be some exceptions that won't work and may lead to unnecessary update/deletion. You have to make rules of when a word is a child of another word, which will have no expection(I don't know if its even possible using sql).

My best suggestion - populate by yourself such table, insert all the root-child options, and use this table for the delete/update , this will ensure no errors will be made.