standalone 17 posts Joined 08/08
14 Oct 2008
Improving Loading Mechanism using Fastload

Hi all! I am new to Teradata and this is my first topic. First of all, I'd like to share my problem (perhaps a challenge for those who are advanced in Teradata :) ).

Okay! I have to load a lot of files from system A into Teradata. System A produces 40,000 files per day. Each file consists of 40,000 records delimited by "|", with 250 columns, and each file is in gz compression format (so it has to be gunzipped first before the file can be read). The goal is to load the 40,000 files in less than a day.

What I have done so far (a sketch follows at the end of this post):
- gunzip -c 15 files at a time > a "special" mkfifo
- run FastLoad from system A, reading that "special" mkfifo
- run BTEQ to summarize the data into a new table (the FACT table) using a MERGE INTO statement

So far the performance is around 12 minutes per 15 files. I am using a 36-AMP Teradata system and 32 FastLoad sessions. At this rate I can only load 1,800 files in a day, max.

My questions:
- Is there any configuration that I need to set to make it run faster?
- Is it possible to use more than 1 FastLoad on 1 system? Say the throughput is 1,800 files per FastLoad; then I could load with 30 FastLoads into 30 different tables, insert from those 30 FastLoad tables into just 1 single table, and then summarize from that single table into the fact table.

Thanks in advance. Btw, ask me if the illustration is not clear enough.
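For clarity, here is roughly what one batch of my current pipeline looks like (paths and script names are placeholders):

  # decompress 15 files into a named pipe that FastLoad reads
  mkfifo /data/LOAD_FIFO_01
  gunzip -c /landing/batch01/*.gz > /data/LOAD_FIFO_01 &
  fastload < load_batch.fld   # the FastLoad script's DEFINE points FILE= at the fifo
  bteq < summarize.btq        # MERGE INTO the FACT table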

dnoeth 4628 posts Joined 11/04
15 Oct 2008

40,000 × 40,000 = 1,600,000,000 rows/day. That's a lot for a small 36-AMP system: roughly 44,000,000 rows per AMP per day. Hopefully "summarizing" reduces that number significantly, otherwise you'll run out of space within a few weeks.

32 sessions is far too many for 36 AMPs; try a multiple of the number of nodes.

What's the size of the target table, and is it partitioned?

What are the timings for:
- FastLoad Phase 1 (is it constrained by load-server CPU, network, or Teradata I/O?)
- FastLoad Phase 2
- BTEQ: the standalone SELECT vs. the INSERT/SELECT

"12 minutes per 15 files" -> 50,000 rows/min seems to be quite slow.

You should probably increase the number of files per load and then run 2 jobs rotationally (sketched below):
job 1 -> FastLoad -> BTEQ -> FastLoad -> BTEQ -> ...
job 2 -> FastLoad -> BTEQ -> FastLoad -> ...
This way there's always a FastLoad and an INSERT/SELECT running at the same time.
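A sketch of that rotation with two shell jobs (script names are placeholders, and how the incoming files are dispatched between the two jobs is omitted; job 2 is identical but uses its own stage table and is started once job 1's first FastLoad has finished):

  #!/bin/sh
  # job1.sh: FastLoad a batch into stage table 1, then INSERT/SELECT it onward
  while ls /landing/*.gz >/dev/null 2>&1; do
      fastload < fl_stage1.fld    # load the next batch into stage_1
      bteq     < ins_stage1.btq   # INSERT ... SELECT FROM stage_1; then empty stage_1
  done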

Dieter

standalone 17 posts Joined 08/08
15 Oct 2008

Since the source files keep coming to the landing point, instead of increasing the number of sessions it seems wiser to increase the number of FastLoad jobs. For that reason I came up with this scenario.

(1) FASTLOAD scenario:
1. gunzip -c (a number of files) > a named pipe created with mkfifo.
2. FastLoad into target(n) table, where the target is not partitioned and has no index (since I can't determine the index — but is it possible to have a NUPI in FastLoad?).
3. Run BTEQ to move everything to a single temporary_detail table that is optimized for the summary (has a NUPI and a NUSI). A sketch follows at the end of this post.
4. Populate a date_load column on the temporary_detail table.
5. This scenario runs every 5 minutes (but will not run if the same process is already running).
6. This scenario runs on parallel nodes.

(2) SUMMARY scenario:
1. Run BTEQ to summarize from the temporary_detail table for records where date_load is between 2 hours ago and 1 hour ago.
2. Run BTEQ to delete records where date_load is more than 3 hours old.
3. This scenario runs every hour.
4. This scenario does not run in parallel.

40,000 files -> (1) fastload_1, fastload_2, ..., fastload_20 tables -> temporary_detail table -> (2) fact table. So: run 20 FastLoad scenarios and 1 summary scenario.

With this FastLoad scenario I can use around 20 FastLoad jobs, so I can keep up with loading the 40,000 files per day. The summary scenario is not a problem anymore once all of the day's records have been loaded into Teradata.

Actually, the slow loading is not caused by the FastLoad configuration; it's the gunzip process that takes time to finish.

I am interested in your statement that 32 sessions is far too many: what is the better number of sessions for 36 AMPs if I use 20 FastLoad jobs?

FYI, the summarizing time for about 2 million records is about 2 minutes, and this is still not good, because the source table doesn't use any index, so the spool for this query lands on only 1 AMP. I am hoping that with the new scenario I can make the table structure of the summary source optimal, so the spool is distributed evenly across the AMPs.

Thanks Dieter
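A rough sketch of steps 3 and 4 together (all table names and field positions are placeholders):

  bteq <<'EOF'
  .LOGON tdpid/loaduser,password
  INSERT INTO work_db.temporary_detail
    (call_time, subscriber_number, call_charge, date_load)
  SELECT field1, field2, field5, CURRENT_TIMESTAMP(0)
  FROM work_db.fastload_target_1;
  .QUIT
  EOF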

dnoeth 4628 posts Joined 11/04
18 Oct 2008

"2. fastload to target(n) table where the target is not partitioned and no index (since i can determine the index, but is it possible to have NUPI in fastload?)."Each and every table in Teradata always has a Primary Index, if *you* don't specify it, the system will.What's the PI of the target table and how the stage table will be summarized? You should choose that PI to improve the following aggregate. "3. run BTEQ to put to a single point of temporary_detail table that optimized for summary (have NUPI and NUSI)."Instead of running 20 FastLoads just run 1 directly into that stage table, so you can skip that.Btw, the number of parallel FastLoad jobs is usually limited to less than 20. Depending on your release that number might be 5, but dba might incease it to 15 (or beyond)."4. add column date_load on temporary_detail table."No. Just define column date_load with a default current_date and you can skip that step, too."5. this scenario will run every 5 minute (but will not run if the same process already run)."Better increase that interval"6. this scenario will run in parallel node."Does that mean you don't have load server and run that process on a Teradata node?This is bad, gunzip using a lot of cpu will slow down the system."(2)SUMMARY scenario1. run bteq to summarize from temporary_detail table for record where date_load between 2 hours before and 1 hour before."Do you need that previously added column just for that purpose? Then it's better to use distinct stage tables for each FrastLoad, and you can drop that column, because there's no WHERE-condition. "2. run bteq to delete record where date_load more the 3 hours before."No. Use different stage tables."3. this scenario will run every 1 hour."Then this is approx. the time for that single concurrent FastLoad."4. this scenario will not be run in parallel."If a BTEQ job has finished you "actually, the slow loading is not because the fastload configuration, but the gunzip process take a time to finish."So you have to increase the cpu power of the load system until that bottleneck is removed.If the network is the next bottleneck you have to increase it's speed until finally Teradata is the bottleneck.And that bottleneck is easily removed by adding some Teradata nodes ;-)"fyi, the summarizing time for about 2 million records is about 2 minutes.and this is still not good because the source table is not using any index.so it will take only 1 AMP in spool space to run this query.i am hoping with the new scenario i can make the table structure for summarizing source is optimize.so it can distribute the AMP evenly in spool space."Probably the automatically created PI (most likely the first column) is totally skewed.Could you provide some additional information?- target table DDL (at least info about SET/MULTISET, PI, partitioning), table size/rowcount and cardinality of the PI - # of nodes/AMPs of your systemDieter

Dieter

standalone 17 posts Joined 08/08
21 Oct 2008

I am still using the previously mentioned scenario, with some changes:

1. Add a primary index to the FastLoad target table. I have put a primary index on the FastLoad target table using the multiple columns that are used in the GROUP BY of the summary scenario, as you recommended. The result: the AMP distribution is better than before; it is not distributed perfectly equally, but it uses all AMPs.

2. Use multiple staging tables. Instead of using a single temporary table, I use (n) temporary tables, where (n) is the number of FastLoad tables used. I am still using the date_load column as mentioned previously. This strategy was chosen because a single temporary table had insert-performance issues.

Some information on what the tables look like:
- Each of the (n) FastLoad target tables has columns FIELD1, ..., FIELD235, all of type VARCHAR(200).
- Each of the (n) staging tables has call_time, subscriber_number, subscriber_target_number, call_location, call_service, call_charge, call_duration, date_load (rough DDL sketched at the end of this post).
- The fact table has call_time, subscriber_number, subscriber_target_number, call_location, call_service, total_call_charge, total_call_duration, total_trx.

Btw, I still can't figure out how to get by with only 1 FastLoad. It is slow and loads only 15 files per 5 minutes, max. So what I have done is run 15 FastLoads in parallel. Right now the loading speed has increased from 15 files per ~10 minutes to 15 files per 5 minutes × 15 parallel FastLoads. That works out to around 225 files per 15 minutes, 900 files per hour, 21,600 files per day — still far from the target of 40,000 files per day. The loading server is a separate box.
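For reference, the staging DDL now looks roughly like this (the column types are my guesses, and the PI columns mirror part of the summary's GROUP BY):

  bteq <<'EOF'
  .LOGON tdpid/loaduser,password
  CREATE MULTISET TABLE work_db.stage_01 (
    call_time                TIMESTAMP(0),
    subscriber_number        VARCHAR(24),
    subscriber_target_number VARCHAR(24),
    call_location            VARCHAR(40),
    call_service             VARCHAR(20),
    call_charge              DECIMAL(18,4),
    call_duration            INTEGER,
    date_load                TIMESTAMP(0) DEFAULT CURRENT_TIMESTAMP(0)
  ) PRIMARY INDEX (call_time, subscriber_number, subscriber_target_number);
  .QUIT
  EOF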

dnoeth 4628 posts Joined 11/04
21 Oct 2008

"- (n) fastload target table will consist column name FIELD1, ..., FIELD235 where their type are varchar(200)."1. Are those fields really 200 byte each or is it just "i don't know/care about the actual size"?This calculates as 235*200 = 47000 bytes and FastLoad uses the maximum possible size to calculate the number of records per message, so this results in exactly 1 record per message sent to Teradata.2. In your input file there are 235 fields, but your target table only got 8 colums?If this is true, then *all* fields are sent across the network, but only those 8 are really needed.Try to get rid of the unused fields (e.g. awk/sed after gunzip) or switch to MLoad which allows FILLER (but probably uses more resources).Could you post the output of a FastLoad and the DDL of the target/stage tables?Dieter

Dieter

standalone 17 posts Joined 08/08
21 Oct 2008

It's not that I don't care about each column's length and type; it's that the content varies depending on the record profile. Say there are 3 types of record profile:

Type 1: column 1 = value A, column 2 = value B, column 3 = value C, column 4 = value D
Type 2: column 1 = value A, column 2 = value C, column 3 = value D, column 4 = null
Type 3: column 1 = value A, column 2 = null, column 3 = value E, column 4 = value F

The profile is determined by column 1, and the values A through F differ in type. That is why I can't put an exact type on a column: for profile type 1 it might be an integer, but for type 2 it might be a date with format YY-MM-DD HH:mm:ss.

I have tried sending only the columns I need, but it did not affect the performance. Please find the DDL in the attachment.

standalone 17 posts Joined 08/08
21 Oct 2008

Please find attached the logs for the FastLoad and for the insert into the stage table.

standalone 17 posts Joined 08/08
22 Oct 2008

Okay, I think I know what the problem was: I set the arguments of the SESSIONS command in the wrong order. I used

SESSIONS 4 8;

but it has to be

SESSIONS 8 4;

(the first number is the maximum number of sessions, the second the minimum). Right now I can process 20 files per 5 minutes, and I have set the sessions to

SESSIONS 12 8;

But if there are other performance tips I can use, feel free to share :)
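For context, this is where the command sits in the script — SESSIONS has to come before LOGON; everything else here is just a placeholder skeleton:

  fastload <<'EOF'
  SESSIONS 12 8;
  LOGON tdpid/loaduser,password;
  BEGIN LOADING work_db.stage_01
    ERRORFILES work_db.stage_01_e1, work_db.stage_01_e2;
  SET RECORD VARTEXT "|";
  DEFINE field1 (VARCHAR(40)),
         field2 (VARCHAR(40))
    FILE=/data/LOAD_FIFO_01;
  INSERT INTO work_db.stage_01 VALUES (:field1, :field2);
  END LOADING;
  LOGOFF;
  EOF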

dnoeth 4628 posts Joined 11/04
22 Oct 2008

**** 00:35:21 Number of recs/msg: 1
**** 00:35:21 Starting to send to RDBMS with record 1
**** 00:36:05 Sending row 100000
**** 00:40:50 Sending row 200000
**** 00:43:51 Sending row 300000
**** 00:45:43 Sending row 375291
**** 00:45:43 Finished sending rows to the RDBMS

There's the bottleneck: 10+ minutes for 375,000 rows.

"i have tried sending only the columns i need, but it did not affect the performance."
Because FastLoad still sends all the defined fields even if they're not used in the INSERT (just don't ask me why). You need about 15 fields, but you send 235. As I already wrote, you have to remove those unnecessary fields *before* FastLoading. That alone results in about 20 recs/msg, and if you also change that VARCHAR(200) to the actual maximum size of each field, it's probably much more (the largest field is COSTBAND, VARCHAR(40)). In the end this will improve the performance of FastLoad's insert phase by more than 20x.

But I'd switch to MultiLoad and use FILLER for those fields: easier to maintain, and no more FastLoad table (a sketch follows below). Your BTEQ constructs map like this:
Insert/Select -> insert directly into the stage table in an MLoad LABEL
UNION ALL -> 2 APPLY ... WHERE ...
TIMESTAMP '2008-10-22 00:45:52' -> SYSDATE4/SYSTIME
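Under those assumptions, a minimal MultiLoad sketch (field names, sizes, and the fifo path are placeholders; only a few of the 235 input fields are shown, the rest would be more FILLERs):

  mload <<'EOF'
  .LOGTABLE work_db.stage_01_lt;
  .LOGON tdpid/loaduser,password;
  .BEGIN IMPORT MLOAD TABLES work_db.stage_01;
  .LAYOUT lay01;
    .FIELD  call_time         * VARCHAR(19);
    .FILLER unused_002        * VARCHAR(40); /* parsed, but never sent to the DBMS */
    .FIELD  subscriber_number * VARCHAR(24);
    .FILLER unused_004        * VARCHAR(40);
    .FIELD  call_charge       * VARCHAR(18);
  .DML LABEL ins01;
    INSERT INTO work_db.stage_01 (call_time, subscriber_number, call_charge)
    VALUES (:call_time, :subscriber_number, :call_charge);
  .IMPORT INFILE /data/LOAD_FIFO_01
    FORMAT VARTEXT '|'
    LAYOUT lay01
    APPLY ins01;
  .END MLOAD;
  .LOGOFF;
  EOF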

Dieter

standalone 17 posts Joined 08/08
22 Oct 2008

This is the log when I run a single FastLoad (not parallel FastLoads):

**** 11:53:59 Processing starting at: Wed Oct 22 11:53:58 2008
**** 11:53:59 FDL4866 SESSIONS command accepted
**** 11:53:59 Error limit set to: 1
**** 11:53:59 Teradata Database Release: 12.00.01.13
**** 11:53:59 Teradata Database Version: 12.00.01.13
**** 11:53:59 Current CLI or RDBMS allows maximum row size: 64K
**** 11:53:59 Character set for this job: ASCII
**** 11:54:00 Number of FastLoad sessions requested = 8
**** 11:54:00 Number of FastLoad sessions connected = 8
**** 11:54:00 FDL4808 LOGON successful
**** 11:54:01 Command completed successfully
**** 11:54:01 RDBMS error 3807: Object 'IN_CAT_LOAD_PAR_BENCH_ERR_1'
**** 11:54:01 RDBMS error 3807: Object 'IN_CAT_LOAD_PAR_BENCH_ERR_2'
**** 11:54:01 Command completed successfully
**** 11:54:02 Number of AMPs available: 12
**** 11:54:02 BEGIN LOADING COMPLETE
**** 11:54:02 Now set to read 'Variable-Length Text' records
**** 11:54:02 Delimiter character(s) is set to '|'
**** 11:54:02 Command completed successfully
**** 11:54:02 FDL4803 DEFINE statement processed
**** 11:54:02 Number of recs/msg: 1
**** 11:54:02 Starting to send to RDBMS with record 1
**** 11:54:17 Sending row 100000
**** 11:54:32 Sending row 200000
**** 11:54:42 Sending row 272375
**** 11:54:42 Finished sending rows to the RDBMS
**** 11:54:46 END LOADING COMPLETE
**** 11:54:47 Logging off all sessions
**** 11:54:47 Total processor time used = '41.28 Seconds'
**** 11:54:47 FDL4818 FastLoad Terminated

The weird thing is that it took less than a minute.

dnoeth 4628 posts Joined 11/04
22 Oct 2008

The bottleneck is FastLoad's insert phase; if you run several FastLoads in parallel, they will compete for the same resources.

Dieter

novice 38 posts Joined 07/07
13 May 2009

Hi Dieter/Standalone, I have been following this thread and really appreciate the knowledge being shared. I have one question about the FastLoad log that was posted. We have the statement:

axsmod np_axsmod.sl "";

What does it mean? Looking through the FastLoad/MultiLoad manuals suggests it's the name of the access module file to be used to import data. But does this mean that the file specified in the DEFINE statement, i.e. FILE=/data2/TERADATA/IN/FACT_IN_CALL/LOAD_FIFO_08;, is not used, and instead the output of np_axsmod.sl is used as the input stream?

Does "gunzip -c of 15 files > the special mkfifo" create a stream of data instead of creating a merged file (and eating more space), with axsmod np_axsmod.sl looking into this stream? If yes, how does axsmod np_axsmod.sl come to know that it has to take data from the "special" stream and not another stream?

joedsilva 505 posts Joined 07/05
13 May 2009

That's the Named Pipe Access Module.

Teradata utilities are capable of reading/writing a FIFO (aka named pipe). This reduces the need for filesystem storage etc., especially when transferring data from one system to another.

But there's a catch. Traditional FIFOs don't support the "seek" operation, i.e. you can't reposition to a particular point in the data. Seeking is essential for the restart capabilities of the Teradata utilities.

The Named Pipe Access Module solves this problem by being an intermediate link between the FIFO and the utility: it reads data from the FIFO and keeps track of its progress (i.e. checkpoints etc.) in a data file of its own. This lets it recover if the job needs to be restarted (or, say, there was a DB restart) without doing a seek against the FIFO (which would fail if attempted).

The access module knows it needs to read from that FIFO because FastLoad knows it (from the FILE= name in the DEFINE statement) and tells the access module.

You can find more info on this module in the Access Module Reference manual.
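Putting the pieces together, the flow looks roughly like this (paths, names, and credentials are placeholders, and I'm assuming the usual command order with AXSMOD before LOGON):

  # create the named pipe once, then feed it in the background
  mkfifo /data2/TERADATA/IN/FACT_IN_CALL/LOAD_FIFO_08
  gunzip -c /landing/batch08/*.gz > /data2/TERADATA/IN/FACT_IN_CALL/LOAD_FIFO_08 &

  fastload <<'EOF'
  SESSIONS 8 4;
  AXSMOD np_axsmod.sl "";
  LOGON tdpid/loaduser,password;
  BEGIN LOADING work_db.stage_08
    ERRORFILES work_db.stage_08_e1, work_db.stage_08_e2;
  SET RECORD VARTEXT "|";
  DEFINE field1 (VARCHAR(40)),
         field2 (VARCHAR(40))
    FILE=/data2/TERADATA/IN/FACT_IN_CALL/LOAD_FIFO_08;
  INSERT INTO work_db.stage_08 VALUES (:field1, :field2);
  END LOADING;
  LOGOFF;
  EOF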

feinholz 1234 posts Joined 05/08
03 Jun 2009

What was not clear (and I did not read every reply) is whether the data from these 40,000 files must be loaded into a single table. If so, has any thought been given to using TPT (Teradata Parallel Transporter)? TPT has the ability to read multiple files at one time; in fact, TPT can be configured to read an entire directory of files in one job (a sketch follows below).
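A rough sketch of such a directory-reading job (schema, names, and credentials are all placeholders; the attributes shown are the common ones, not a complete list):

  cat > load_dir.tpt <<'EOF'
  DEFINE JOB load_dir
  (
    DEFINE SCHEMA s01
    ( field1 VARCHAR(40),
      field2 VARCHAR(40) );

    DEFINE OPERATOR reader
    TYPE DATACONNECTOR PRODUCER
    SCHEMA s01
    ATTRIBUTES
    ( VARCHAR DirectoryPath = '/landing/in',
      VARCHAR FileName      = '*.dat',      /* wildcard: the whole directory */
      VARCHAR Format        = 'Delimited',
      VARCHAR TextDelimiter = '|' );

    DEFINE OPERATOR loader
    TYPE LOAD
    SCHEMA *
    ATTRIBUTES
    ( VARCHAR TdpId        = 'tdpid',
      VARCHAR UserName     = 'loaduser',
      VARCHAR UserPassword = 'password',
      VARCHAR TargetTable  = 'work_db.stage_01',
      VARCHAR LogTable     = 'work_db.stage_01_log',
      VARCHAR ErrorTable1  = 'work_db.stage_01_e1',
      VARCHAR ErrorTable2  = 'work_db.stage_01_e2' );

    APPLY ('INSERT INTO work_db.stage_01 VALUES (:field1, :field2);')
    TO OPERATOR (loader[1])
    SELECT * FROM OPERATOR (reader[4]);   /* 4 parallel file-reader instances */
  );
  EOF
  tbuild -f load_dir.tpt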

--SteveF

standalone 17 posts Joined 08/08
13 Jul 2009

Hi everyone. Actually, I had already implemented 10 parallel FastLoads to load these 40,000 files, and with that configuration we managed to load 60,000 files per day. But that configuration overburdened the box with too many I/O processes. After 3 months we decided to change the architecture: instead of loading the raw data and processing it inside Teradata, we now pre-process it outside Teradata first, so Teradata only receives the final data. And we changed the FastLoad to MultiLoad because of the intensive I/O. After a week or so, it seems the MultiLoad has become slower and slower. So today we decided to try TPT. Has anyone used TPT in parallel mode, inserting directly into 1 table?

dnoeth 4628 posts Joined 11/04
14 Jul 2009

TPT uses the same load protocols as FastLoad or MLoad; whether TPT will be faster depends on where the bottleneck is. Could you post a log output again?

Dieter

standalone 17 posts Joined 08/08
15 Jul 2009

Hi Dieter, actually yesterday we found out that TPT uses the MultiLoad protocol as well. So we postponed the TPT development and instead focused on finding the bottleneck. We found that we had a NUSI on our target table. After we dropped the NUSI, MultiLoad performance increased drastically (sketch of the change below). Right now we are still monitoring this change, to see whether the performance stays stable.
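In case it helps someone else, the change boils down to this (index columns and names are placeholders):

  bteq <<'EOF'
  .LOGON tdpid/loaduser,password
  DROP INDEX (call_location) ON work_db.fact_table /* remove the NUSI before loading */;
  .QUIT
  EOF

  mload < load_fact.mld   # the regular MultiLoad job

  bteq <<'EOF'
  .LOGON tdpid/loaduser,password
  CREATE INDEX (call_location) ON work_db.fact_table /* recreate it afterwards if queries need it */;
  .QUIT
  EOF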

feinholz 1234 posts Joined 05/08
24 Jul 2009

Depending on where the bottleneck is, TPT could still produce faster results (unless you are totally I/O-bound or CPU-bound). TPT supports the MultiLoad protocol as well. If you have the CPU and I/O bandwidth, TPT will improve performance thanks to its ability to read multiple files in parallel within a single job (much easier to script and manage than running parallel utility jobs yourself).

--SteveF

vasudev 24 posts Joined 12/12
21 Mar 2013

Hi Feinholz,
I have to load multiple input data files as they appear in a Unix file directory, using a single load operation. Does TPT support this?
Thanks in advance.

feinholz 1234 posts Joined 05/08
21 Mar 2013

Yes. TPT supports it.
Please refer to the TPT documentation (user guide).

--SteveF
