All Forums Database
15 Nov 2014
Help on Below SQL

drop table retail.distance;

 

drop table  retail.distance1; 

create table retail.distance

 (

 source_Place varchar(20),

 destination_Place varchar(20),

 distance  integer

 );

 

 insert into retail.distance values ('Pune', 'Mumbai',100);

 insert into retail.distance values ('Mumbai', 'Pune',100);

 insert into retail.distance values ('Pune', 'Delhi',300);

 insert into retail.distance values ('Delhi', 'Pune',300);

 insert into retail.distance values ('Delhi', 'JAmmu',500);

 insert into retail.distance values ('Delhi', 'Agra',600);

 insert into retail.distance values ('Agra', 'Delhi',600);

insert into retail.distance values ('Agra', 'Bhopal',700);

 

/*

   source_Place destination_Place distance

1 Mumbai Pune 100

2 Pune Mumbai 100

3 Delhi Pune 300

4 Pune Delhi 300

5 Delhi JAmmu 500

6 Delhi Agra 600

7 Agra Delhi 600

8 Agra Bhopal 700

 

*/

 

Now in above result set we can see clearly row 1 & 2 are duplicate(i.e. same information in both row).

In same way row 3 & 4, and 6 & 7 are duplicate.

 

I have to remove these duplicate and want result set like below.

 

/*

source_Place destination_Place distance

1 Mumbai Pune 100

--2 Pune Mumbai 100

3 Delhi Pune 300

--4 Pune Delhi 300

5 Delhi JAmmu 500

6 Delhi Agra 600

--7 Agra Delhi 600

8 Agra Bhopal 700

 

*/

 

Desire output should have either row 1,3,5,6,8 or 2,4,5,7,8.

Could any of expert help me on above.

 

 

 

Tags:
dnoeth 4628 posts Joined 11/04
15 Nov 2014

Hey, this is one of the labs I do in my trainings ;-)
There are several solutions to this problem, their performance mainly depends on the number of duplicates. Usually a good one is a ROW_NUMBER:

SELECT
   source_Place,
   destination_Place,
   distance
FROM distance
QUALIFY
   ROW_NUMBER()
   OVER (PARTITION BY
           CASE WHEN destination_Place < source_Place THEN destination_Place ELSE source_Place END,
           CASE WHEN destination_Place < source_Place THEN source_Place ELSE destination_Place END
         ORDER BY source_Place) = 1

Dieter

15 Nov 2014

Thanks Dieter for your quick response...:)
Could you please below query once and Please suggest to optimize this query.

SELECT	SOURCE_PLACE,DESTINATION_PLACE,distance 
FROM	RETAIL.DISTANCE a
 WHERE	(SOURCE_PLACE,DESTINATION_PLACE)  
  NOT IN  (
SELECT	DESTINATION_PLACE,SOURCE_PLACE FROM RETAIL.DISTANCE b  )
  
UNION
 
SELECT	SOURCE_PLACE,DESTINATION_PLACE,distance 
FROM	RETAIL.DISTANCE a
 WHERE	(SOURCE_PLACE,DESTINATION_PLACE)  
  IN  (
SELECT	DESTINATION_PLACE,SOURCE_PLACE FROM RETAIL.DISTANCE b 
where	 a.SOURCE_PLACE < b.SOURCE_PLACE );

 

Thanks,
Abhijeet

dnoeth 4628 posts Joined 11/04
15 Nov 2014

Hi Abhijeet,
simply compare Explain and actual cpu/io in the query log.
Better performing than NOT IN/UNION should be this:

SELECT DISTINCT source_Place, destination_Place, distance
FROM distance t1
WHERE NOT EXISTS(
  SELECT * FROM distance t2
  WHERE t1.source_Place = t2.destination_Place
    AND t1.destination_Place = t2.source_Place
    AND t1.source_Place > t2.source_Place
);

 

Dieter

You must sign in to leave a comment.