All Forums Database
tdouglas 4 posts Joined 07/12
05 Jul 2012
How to limit rows to only those where there are duplicates of a specific column?

I've seen how to select distinct information, but I haven't successfully found anything for the opposite.  I'm looking to limit my search only to items with duplicate UPCs.  My query essentially looks like this...

SELECT
item_number
,upc
,description
,store_number

FROM
table1

WHERE
product_category IN (1,2,3)

...and my results look like:

 

item_num

upc

descrip

store_num

5

44444-44444

fish

23

3

33333-33333

cat

17

6

33333-33333

cat

42

2

22222-22222

dog

17

4

22222-22222

dog

42

1

11111-11111

snake

8

I only want to see rows with a UPC that exists more than once, but I want to see each row.  I don't want a count function to tell me there are two each of 22222-22222 and 33333-33333, but I want to restrict my results to see only and exactly the rows containing UPCs which appear more than once.  In this scenario, I want to see four rows--both cat rows and both dog rows.

 

Thank you in advance!

dnoeth 4628 posts Joined 11/04
05 Jul 2012

select * from table1
qualify count(*) over (partition by upc) > 1

Dieter

Dieter

tdouglas 4 posts Joined 07/12
05 Jul 2012

Thank you!  This does exactly what I want, and surprisingly more easily than I thought.

The last thing I need to do is repeat this step, with a column other than UPC.  Let's pretend store_num is 17 for both of my 33333-33333 UPCs, but my 22222-22222 has the store_num as shown.  How would I limit it to show only and exactly the rows there both UPC and store_num are duplicate?

I tried just adding, but this did not change m

AND COUNT(*) OVER (PARTITION BY store_num)>1

Then I tried this, but it keeps asking for something like ")" betwen "upc" and "AND".

COUNT(*) OVER (PARTITION BY upc AND store_num)>1

 

At this point I would be satisfied to play with it in Excel and filter it as I need it, but even after the first qualify my query is returning 170,000 rows...quite a few more than the 65,000 I'm allowed.

dnoeth 4628 posts Joined 11/04
05 Jul 2012

PARTITION BY is similar to GROUP BY for aggregates,
if i understand you correctly you need:

COUNT(*) OVER (PARTITION BY upc, store_num)>1

 

Dieter

 

Dieter

tdouglas 4 posts Joined 07/12
05 Jul 2012

This works perfectly--thanks for all of your help.  For the most part I understand how the PARTITION BY works...you're partitioning the data by a certain criterion.

But when you PARTITION BY more than one column, does it know to partition by one or the other first?  or does it not matter?

dnoeth 4628 posts Joined 11/04
05 Jul 2012

It's like GROUP BY col1, col2 vs. GROUP BY col2, col1, it doesn't matter.

Dieter

Dieter

geetika 1 post Joined 11/12
17 Nov 2012

I have a teradata query which uses aggregate function and GROUP BY. GROUP BY restrictes the results to half for eg. without using aggregate/group by, query fetches 26,000 records and with aggreagate/group by it restrictes the resulting rows to 13,000 Only. How I can display all duplicate records in my table with aggregate function and GROUP BY.

dnoeth 4628 posts Joined 11/04
19 Nov 2012

Try
HAVING COUNT(*) > 1
Dieter

Dieter

Chandan 3 posts Joined 08/09
12 Aug 2014

Hi Dieter,
Can you please help for below requirement.
I have below records in my table,

  ACCT    ID    CD     

1)  123   2500   AB1         

2)  123   24386  AB2

3)  123   2500   AB3

4)  123   2500   AB4

5)  123   2500   AB5

6)  123   24386  AB6

7)  123   2500   AB7

8)  123   2500   AB8

 

order by CD after then IF my ID 24386 is in between 2500(2 nd row &6 rows are in between ) then i need to take the cd of next rows till i found another 24386.

 

expected result for column  DER_CD

 

  ACCT    ID    CD     DER_CD

1)  123   2500   AB1   AB1       

3)  123   2500   AB3   AB3

4)  123   2500   AB4   AB3

5)  123   2500   AB5   AB3

7)  123   2500   AB7   AB7

8)  123   2500   AB8   AB7

 

Please help on that

 

Thanks,

Chandan

 

dnoeth 4628 posts Joined 11/04
14 Aug 2014

Hi Chandan,
based on your result set (I didn't fully understand your explanantion) this should work:

CREATE VOLATILE TABLE vt (  ACCT INT,    ID  INT,  code     CHAR(3))
ON COMMIT PRESERVE ROWS;

INS vt(123 ,  2500   ,'AB1');         
INS vt(123 ,  24386  ,'AB2');
INS vt(123 ,  2500   ,'AB3');
INS vt(123 ,  2500   ,'AB4');
INS vt(123 ,  2500   ,'AB5');
INS vt(123 ,  24386  ,'AB6');
INS vt(123 ,  2400   ,'AB7');
INS vt(123 ,  2400   ,'AB8');

SELECT dt.*,
   COALESCE(MAX(x) 
                OVER (PARTITION BY acct
                      ORDER BY code 
                      ROWS UNBOUNDED PRECEDING), code) 
FROM
 (
   SELECT vt.*,
      CASE WHEN id = 24386
           THEN MIN(code) 
                OVER (PARTITION BY acct
                      ORDER BY code 
                      ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) 
      END AS x
   FROM vt
 ) AS dt
QUALIFY id <> 24386

 

Dieter

lijianguonew 5 posts Joined 07/14
14 Aug 2014

select t1.acct,t1.id,t1.code,t2.acct,t2.id,t2.code,t2.row_idnew,t1.row_id,t2.row_idnew-t1.row_id aa,
min(t1.code) over(partition by t2.row_idnew-t1.row_id)
from
(select acct,id,code,row_number() over(order by code) 
from avt
where id<>24386
) t1(acct,id,code,row_id)
inner join
(select acct,id,code,row_number() over(order by code) 
from avt
qualify id<>24386
) t2(acct,id,code,row_idnew)
on t1.acct=t2.acct
and t1.code=t2.code
order by t1.code

Chandan 3 posts Joined 08/09
16 Aug 2014

Thanks much Dieter.. It's working...
 

Chandan 3 posts Joined 08/09
31 Oct 2014

Hi Dieter,
There is slight change in requirement.
I have below scenario:

  ACCT    ID        NUM               CODE    

 

1)  123   2500     1111              AB1         

2)  123   24386   1111              AB2

3)  123   2500     1111              AB3

4)  123   2500     1111              AB4

5)  123   2500     1111              AB5

6)  123   24386   1111              AB6

7)  123   24386   1111              AB7

8)  123   2500     1111              AB8

9)  123   2500     1111              AB9

 

 

 

my expected result will be :

 

  ACCT    ID      NUM             CODE     DER_CODE

 

1)  123   2500    1111           AB1       1111       

3)  123   2500    1111           AB3       1111_AB3

4)  123   2500    1111           AB4       1111_AB3

5)  123   2500    1111           AB5       1111_AB3

8)  123   2500    1111           AB8       1111

9)  123   2500    1111           AB9       1111

 

Thanks in advance,Your help will be appreciate.

 

Thanks,

Chandan

 

PramodhG 2 posts Joined 04/13
05 Jul 2015

Hi,
Can someone help me here ... My Current table has the below data --
Rank, Type
1  S
2  S
3  U
4  U
5  S
6  S
7  S
 
From the above table, I need to pull the record of type 'U' with max Rank and the one before it and also the one after it. The result set should show as below...
3  U
4  U
5  S

 
 

manib0907 61 posts Joined 04/15
05 Jul 2015

SELECT
B.RANK,B.TYPE
(SELECT
MAX(RANK) OVER(PARTITION BY TYPE ORDER BY RANK) MAX_RANK
FROM
TABLE
WHERE TYPE='U') A,
TABLE B
WHERE B.RANK=A.MAX_RANK OR B.RANK=A.MAX_RANK+1 OR B.RANK=A.MAX_RANK-1
Try this...

Cheers,
BM

 

Cheers,
Mani

PramodhG 2 posts Joined 04/13
05 Jul 2015

Hi BM,
Thanks for your reply... But, i am getting the below error message..

Ordered Analytical Fuctions are not allowed in Subqueries

 

Is there any other way of finding the solution for this ?

dnoeth 4628 posts Joined 11/04
05 Jul 2015

Is the rank actually gapless? Created by a ROW_NUMBER?

SELECT Rank, Type
FROM tab
QUALIFY Rank = MAX(CASE WHEN Type = 'U' THEN Rank END) OVER () - 1
     OR Rank = MAX(CASE WHEN Type = 'U' THEN Rank END) OVER ()
     OR Rank = MAX(CASE WHEN Type = 'U' THEN Rank END) OVER () + 1

 

Dieter

ELMABUKOWSKI 1 post Joined 06/16
02 Jun 2016

Creative suggestions , I am thankful for the points - Does someone know if my business might be able to grab a template CA DE-120 example to work with ?

You must sign in to leave a comment.