Sunday, February 19, 2012

How can I fix bad design?

I've been asked to work on database for my employer. One field in one table
utilizes one of the worst violations of relational databases: multiple value
s
are stored in the same database field. Now I have a varchar(20) field with
values like ADEJG, BFGMR, GJNV etc., where each letter inidates a piece of
information about the record, uniquely identified by a recordID. Is there a
n
easy way to separate all these values out into recordID:value pairs other
than running "...select recordid,'A' where datafield like '%A%' 26 times,
once for each letter a to z?
The only thing I can tell you about the data is that if "A" is present it is
always first and "V" (currently last letter being used) is always last.
Position means nothing and any letter can be in any position save A up to it
s
order in the alphabet.
TIAWhat about something like this (See script below)
You don't have to query on Like A, B, C etc as you can rely on the string
you're interested in being only one character wide.
Let me know how you get on.
Damien
DROP TABLE #tmp
CREATE TABLE #tmp ( record_id INT IDENTITY PRIMARY KEY, multiple_values
VARCHAR(20) NOT NULL )
SET NOCOUNT ON
-- Dummy values for demo
INSERT INTO #tmp ( multiple_values ) VALUES ( 'ADEJG' )
INSERT INTO #tmp ( multiple_values ) VALUES ( 'BFGMR' )
INSERT INTO #tmp ( multiple_values ) VALUES ( 'GJNV' )
-- Create a control table based on the max width of multiple_values
DROP TABLE #control
CREATE TABLE #control ( control_no INT PRIMARY KEY )
DECLARE @.i INT
SET @.i = 1
WHILE @.i <= ( SELECT MAX( LEN ( multiple_values ) ) FROM #tmp )
BEGIN
INSERT INTO #control VALUES ( @.i )
SET @.i = @.i + 1
END
SET NOCOUNT OFF
-- Now, split your combined field into a temp table ( or make it permanent )
DROP TABLE #single_values
CREATE TABLE #single_values ( sv_id INT IDENTITY UNIQUE NOT NULL, record_id
INT NOT NULL, single_value CHAR(1) NOT NULL, PRIMARY KEY( record_id,
single_value ) )
INSERT INTO #single_values ( record_id, single_value )
SELECT t.record_id, SUBSTRING( t.multiple_values, x.control_no, 1 )
FROM #tmp t, #control x
WHERE LEN( t.multiple_values ) >= x.control_no
-- List normalized values
SELECT *
FROM #single_values
"Matthew Speed" wrote:

> I've been asked to work on database for my employer. One field in one tab
le
> utilizes one of the worst violations of relational databases: multiple val
ues
> are stored in the same database field. Now I have a varchar(20) field wi
th
> values like ADEJG, BFGMR, GJNV etc., where each letter inidates a piece of
> information about the record, uniquely identified by a recordID. Is there
an
> easy way to separate all these values out into recordID:value pairs other
> than running "...select recordid,'A' where datafield like '%A%' 26 times,
> once for each letter a to z?
> The only thing I can tell you about the data is that if "A" is present it
is
> always first and "V" (currently last letter being used) is always last.
> Position means nothing and any letter can be in any position save A up to
its
> order in the alphabet.
> TIA

No comments:

Post a Comment