All Forums Database
taruntrehan 43 posts Joined 10/12
20 Sep 2013
Volatile Table In View ??

Hi,
I am writing a view within which i have some restriction to apply.
Following is my current working query :

REPLACE VIEW person_view
AS LOCKING ROW FOR ACCESS

SELECT distinct
date_time ,
person_id ,
person_key ,
name ,
address 

FROM _person a 

inner join 

(select max(date_time) as max_date_time , person_id as max_person_id , person_key as max_person_key 
from _person group by max_person_id , max_person_key) b

on a.date_time = b.max_date_time
and a.person_id = b.max_person_id
and a.person_key = b.max_person_key;

 

The above query works but i am not sure about:

  1. Performance of the query ?
  2. Is there a possibility to create a volatile table. It should be better as its indexed.
  3. How does TERADATA handle the inner table ?

 
Appreciate your inputs.

Regards, Tarun Trehan http://allzhere.in
dnoeth 4628 posts Joined 11/04
20 Sep 2013

#1: Those MAX/MIN queries can usually be rewritten using RANK/ROW_NUMBER:

SELECT *
FROM _person
QUALIFY 
  RANK() 
  OVER (PARTITION BY person_id, person_key
        ORDER BY date_time DESC) = 1

In most cases this is (much) faster, only in some specific cases when the GROUP BY/JOIN is the PI and you rewrite it as a Correlated Subquery it might be a bit worse. I always prefer the OLAP version as it's also much easier to write.
 
#2: You can't create a Volatile Table within a View. You could something similar in a MACRO: INSERT/SELECT into a Global Temporary Table and then use it.
 
#3: See Explain. It's materializing the result of the Derived Table. This is similar to a CREATE VOLATILE TABLE, that's why you usually don't care about it. Just let the optimizer do it's work.
 
Dieter

Dieter

taruntrehan 43 posts Joined 10/12
01 Oct 2013

Hi,
Thanks for the inputs. It helped me resolve the issue. However, i would like to add the following:
I tried the 

SELECT *
FROM _person
QUALIFY 
  RANK() 
  OVER (PARTITION BY person_id, person_key
        ORDER BY date_time DESC) = 1

option as well as the following :

where 
a.date_time = ( select max(b.date_time) 
	from _person b 
	where  a.person_id = b.person_id
	and a.person_key = b.person_key )

 
Though both are giving response times that are acceptable for me; the IO & Spool Usage by the former query is higher.
Just wanted to know the reason.

Regards,
Tarun Trehan
http://allzhere.in

You must sign in to leave a comment.