All Forums Database
jld@fitc 14 posts Joined 04/12
22 Mar 2013
Error 2803 Secondary index uniqueness violation error utf-8 impact?

Hi there,
I assume I have a pretty naughty case .
I have a dimension table with a secondary unique index (COUNTRY_ISO_A2 ,CITY_TXT ,ZIP_CODE) from columns with datatype
VARCHAR() CHARACTER SET UNICODE NOT CASESPECIFIC
and for a while this table was fed with not properly converted utf-8 string since the feeding apps driver did not support utf-8, now the  feeding apps driver is utf-8 compatible and I get (on first run) the error above when loading my dimension table.
I have like a first guess that the the index representation might not be the same as the "restituted" representation of the column´s value and that  in CITY_TXT the earlier value  ???????? (on Target) and the new ΠΕΙΡΑΙΑΣ (InFLow)  when comparing on COUNTRY_ISO_A2 , ZIP_CODE  would have the same index representation
How can I check that?
Any other suggestion welcome since I do not see after "functional" control not other reason
thanks
Rgds
JL D
 

Tags:
dnoeth 4628 posts Joined 11/04
22 Mar 2013

Are you shure the duplicate is not in your staging table?

sel COUNTRY_ISO_A2 ,CITY_TXT ,ZIP_CODE 
from staging 
group by 1 
having count(*) > 1

And to find out which rows might cause that error:

sel COUNTRY_ISO_A2 ,CITY_TXT ,ZIP_CODE from staging
intersect
sel COUNTRY_ISO_A2 ,CITY_TXT ,ZIP_CODE from target

If those queries don't return any rows, it might be due to some rules for comparing strings in UTF?
To find out which rows caused the error you could create a copy of the target table with a UPI on those columns, do a CREATE ERROR TABLE and then use a MERGE or INS/SEL with LOGGING ALL ERRORS. Thus there's no unspecific "a row caused an error", but the rows are recorded within the error table.
Dieter

Dieter

jld@fitc 14 posts Joined 04/12
22 Mar 2013

Hi Daniel,
thanks
both queries returned no occurrence, so I am left with the next control.
rgds
JL D
 

jld@fitc 14 posts Joined 04/12
22 Mar 2013

Sorry Dieter for the wrong name
JL

jld@fitc 14 posts Joined 04/12
04 Apr 2013

Hi there,
probably this explains the problem
CREATE MULTISET TABLE test.geo_dim2 ,NO FALLBACK ,
     NO BEFORE JOURNAL,
     NO AFTER JOURNAL,
     CHECKSUM = DEFAULT,
     DEFAULT MERGEBLOCKRATIO
     (
      GEO_SK BIGINT GENERATED ALWAYS AS IDENTITY
           (START WITH 1
            INCREMENT BY 1
            MINVALUE 1
            MAXVALUE 2147483647
            NO CYCLE),
      COUNTRY_ISO_A2 CHAR(2) CHARACTER SET UNICODE NOT CASESPECIFIC NOT NULL,
      CITY_TXT VARCHAR(100) CHARACTER SET UNICODE NOT CASESPECIFIC NOT NULL,
      ZIP_CODE VARCHAR(30) CHARACTER SET UNICODE NOT CASESPECIFIC NOT NULL)
UNIQUE PRIMARY INDEX ( GEO_SK )
UNIQUE INDEX ( COUNTRY_ISO_A2 ,CITY_TXT ,ZIP_CODE );
insert  into test.geo_dim2 (country_iso_a2, city_txt, zip_code) values ('GR','ΚΩΣ' ,'85300');
  OK
insert  into test.geo_dim2 (country_iso_a2, city_txt, zip_code) values ('GR','Κως' ,'85300');
  Error: [Teradata Database] [TeraJDBC 14.00.00.30] [Error 2803] [SQLState 23000] Secondary index uniqueness violation in EP_DEV1.geo_dim2
insert  into test.geo_dim2 (country_iso_a2, city_txt, zip_code) values ('BE','ΚORTRIJK' ,'8500');
  OK
insert  into test.geo_dim2 (country_iso_a2, city_txt, zip_code) values ('BE','Κortrijk' ,'8500');
  Error: [Teradata Database] [TeraJDBC 14.00.00.30] [Error 2803] [SQLState 23000] Secondary index uniqueness violation in EP_DEV1.geo_dim2
 
select geo_sk, country_iso_a2, city_txt, zip_code from test.geo_dim2 GEO where GEO.COUNTRY_ISO_A2='GR' and GEO.ZIP_CODE='85300' AND GEO.CITY_TXT='ΚΩΣ'
  returns 1 row
select geo_sk, country_iso_a2, city_txt, zip_code from test.geo_dim2 GEO where GEO.COUNTRY_ISO_A2='GR' and GEO.ZIP_CODE='85300' AND GEO.CITY_TXT='Κως'
  returns 0 row   <<<<===================
  the process discussed earlier is doing an
  insert into target
  select from staging
  where staging column part of the key not in (select column part of the key from target table)
select geo_sk, country_iso_a2, city_txt, zip_code from test.geo_dim2 GEO where GEO.COUNTRY_ISO_A2='BE' and GEO.ZIP_CODE='8500' AND GEO.CITY_TXT='ΚORTRIJK'
  returns 1 row
select geo_sk, country_iso_a2, city_txt, zip_code from test.geo_dim2 GEO where GEO.COUNTRY_ISO_A2='BE' and GEO.ZIP_CODE='8500' AND GEO.CITY_TXT='Κortrijk'
  returns 1 row
Nevertheless, I still do not know why tests on another Teradata box returned 1 row in all case
and 'help session' cmd on both machines return the same setup
?
rgds
JL D

You must sign in to leave a comment.