All Forums Database
vidya46 4 posts Joined 09/10
30 Mar 2012
SQL to query 3NF "Traited" Dimension

Hello experts

We are designing a dimension of ITEMS (in a Point of sales DWH) as a table with Name-Value Pairs i.e. ITEM attributes stored as ROWS instead of COLUMNS.This ITEM table is joined to a fact table containing SALES $ and QUANTITY.

CREATE TABLE ITEM_TRAIT
(ITEM_KEY INTEGER NOT NULL,
TRAIT_CODE CHAR(10),
TRAIT_VALUE VARCHAR(100))
unique primary index (ITEM_KEY,TRAIT_CODE);


collect stats BAA_D_R2W_QA_WKY_SEM_N_01. ITEM_TRAIT column(item_key);
collect stats BAA_D_R2W_QA_WKY_SEM_N_01. ITEM_TRAIT column(trait_code);
collect stats BAA_D_R2W_QA_WKY_SEM_N_01. ITEM_TRAIT column(item_key,trait_code);
INSERT into ITEM_TRAIT Values('111','BRAND','SHAMPOO');
INSERT into ITEM_TRAIT Values('111','DESC','Shampoo 16OZ');

++++SQL to query ITEM/FACT+++
SELECT
                                CAL.WEEK AS WEEK,
                                (case when item.trait_code='BRAND' then trait_value end ) AS BRAND,
                                SUM(FACT.POSSLS) AS POSSALES,
                                SUM(FACT.POSQTY) AS POSQTY
                FROM     
                               FACT,
                               CALENDAR CAL,
                               ITEM_TRAIT ITEM
                WHERE
                                FACT.PERIOD_KEY = CAL.PERIOD_KEY
                                AND  FACT.ITEM_KEY=ITEM.ITEM_KEY
                                AND  CAL.PERIOD_KEY  between 20100101 and 20100228
                                AND  POSSLS <> 0
                GROUP BY
                                1,2 
++++++++++++++++++++++++++
  • So for 10000 ITEMS each with 30 attributes,this table contains 90K rows.
  • Typical queries ask for sales/quantity (from fact) grouped by BRAND for a time period

 

  1. How does the optimiser know that the ITEM table contains repeated ITEM_KEY?
  2. When ITEM table joins to FACT,the optimiser does not seem to "know" that BRAND is an attribute,unlike a normal dimension and seems to incur a lot more I/O ...This is compared to a similar SQL to a "normal" ITEM table and the response time  and I/O are significantly less.

TIA

 

You must sign in to leave a comment.