How to compare string data to table data in SQL Server - I need to know if a value in a string doesn't exist in a column


I have two tables: an import table, and a second table that is the target of a FK constraint on the table the imported data will eventually be loaded into. In the import table, a user can provide a list of semicolon-separated values that correspond to values in the second table.

So we're looking at something like this:

TABLE 1

ID | Column1
1  | A; B; C; D

TABLE 2

ID | Column2
1  | A
2  | B
3  | D
4  | E

The requirement is:

Rows in TABLE 1 with a value not in TABLE 2 (C in our example) should be marked as invalid for manual cleanup by the user. Rows where all values are valid are handled by another script that already works.

In production we'll be dealing with 6 columns that need to be checked and imports of AT LEAST 100k rows at a time. As a result I'd like to do all the work in the DB, not in another app.

BTW, it's SQL Server 2008.

I'm stuck. Does anyone have any ideas? Thanks!

By : Kyle West


Here is a straightforward query that returns the IDs of the invalid rows. It will not be fast, because the join condition relies on string manipulation.

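select T1.ID
from [TABLE 1] T1
    -- a TABLE 2 row matches when its value appears as an item in the list
    left join [TABLE 2] T2
        on ('; ' + T1.COLUMN1 + '; ') like ('%; ' + T2.COLUMN2 + '; %')
where T1.COLUMN1 is not null
-- COLUMN1 has to be grouped so that HAVING can reference it
group by T1.ID, T1.COLUMN1
-- count(T2.COLUMN2) ignores the NULL row a left join returns when nothing matches;
-- the right-hand side is the number of items in the list (semicolons + 1)
having count(T2.COLUMN2) < len(T1.COLUMN1) - len(replace(T1.COLUMN1, ';', '')) + 1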

There are two assumptions:

  1. The semicolon-separated list does not contain duplicates.
  2. TABLE 2 does not contain duplicates in COLUMN2.

The second assumption can easily be dropped by joining to (select distinct COLUMN2 from [TABLE 2]) rather than to [TABLE 2].
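
For reference, here is roughly what the query looks like with that derived table in place. This is a sketch, not tested against your data; it uses the table and column names from the example and assumes items are always separated by a semicolon followed by exactly one space. It groups by COLUMN1 as well as ID so that HAVING can reference it, and it counts T2.COLUMN2 rather than rows so that a list with no matches at all is still reported.

```sql
-- Sketch only: [TABLE 1], [TABLE 2], ID, COLUMN1 and COLUMN2 are the
-- names from the example, not a real schema.
select T1.ID
from [TABLE 1] T1
    -- the derived table removes duplicate values in COLUMN2 (assumption 2)
    left join (select distinct COLUMN2 from [TABLE 2]) T2
        on ('; ' + T1.COLUMN1 + '; ') like ('%; ' + T2.COLUMN2 + '; %')
where T1.COLUMN1 is not null
group by T1.ID, T1.COLUMN1
-- fewer distinct matches than items in the list means at least one item is unknown
having count(T2.COLUMN2) < len(T1.COLUMN1) - len(replace(T1.COLUMN1, ';', '')) + 1
```

The first assumption still applies: a list such as 'A; A; C' has three items but at most one distinct match, so it is reported as invalid whether or not C exists.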

If possible, put each value in its own row when importing, instead of storing the values as a semicolon-separated list.
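
If the import format can't change, the list can still be split into rows inside the database. SQL Server 2008 has no built-in split function, so the sketch below uses the common XML trick instead. It is only an illustration: it uses the table and column names from the example, assumes COLUMN1 never contains characters that are special in XML (&, <, >), and skips empty items such as the one produced by a trailing semicolon. The aliases S, N, I and ITEM are made up for the example.

```sql
-- Sketch only: splits COLUMN1 on ';' and returns the IDs of rows that
-- contain at least one item missing from [TABLE 2].
select distinct T1.ID
from [TABLE 1] T1
    -- turn 'A; B; C' into <v>A</v><v> B</v><v> C</v>
    cross apply (
        select cast('<v>' + replace(T1.COLUMN1, ';', '</v><v>') + '</v>' as xml) as X
    ) S
    -- one output row per <v> element
    cross apply S.X.nodes('/v') as N(V)
    -- trim the space that follows each semicolon
    cross apply (
        select ltrim(rtrim(N.V.value('.', 'nvarchar(4000)'))) as ITEM
    ) I
where T1.COLUMN1 is not null
    and I.ITEM <> ''
    and not exists (
        select 1 from [TABLE 2] T2 where T2.COLUMN2 = I.ITEM
    )
```

Because each item is checked on its own with NOT EXISTS, duplicates in the list or in TABLE 2 do not matter here. Whether this is fast enough for 100k rows and 6 columns is something you would need to measure.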

This might help.
