b.saipavankumar 12 posts Joined 03/12
02 Nov 2015
Difference in Acquisition Phase of MLoad & FastLoad

Hello Everyone,
I have a question on Mload & FastLoad.
Let's assume that I have an empty table and I am trying to load a file into it using MLoad or FastLoad. Based on the understanding below, I want to work out which of these utilities performs better in the application phase.
To my knowledge, in MLoad, data gets loaded into a worktable in the acquisition phase and is then moved to the actual table in the application phase. My questions are:

  1. Will there be a copy of the worktable on all AMPs?
  2. Will the parser hash the data from the file and pass each record to the worktable on the corresponding AMP, or will it push the data randomly, as in FastLoad, and then redistribute it in the application phase?

In FastLoad, data first gets pushed to the AMPs in the acquisition phase and then, in the application phase, gets redistributed across AMPs. Does this mean that hashing in FastLoad happens in the application phase, and that the data gets moved across AMPs over the BYNET? If so, the application phase of FastLoad would take more time than that of MLoad, since hashing is involved there.
Please help and correct me if my understanding is wrong.

Thanks, Sai Pavan Kumar Bhamidipati
Fred 1096 posts Joined 08/04
02 Nov 2015

In FastLoad Phase 1, blocks of data are sent to arbitrary AMPs which deblock the data, compute the rowhash, and send each row to the proper destination AMP. So "redistribution" is part of Phase 1. Phase 2 is AMP-local (sort by ROWID).
The MultiLoad worktable has the same PI as the target table, and is always Fallback protected (so the rows are actually written twice). Again, the hash computation and row (re)distribution happens in the first (Acquisition) phase. The Application phase is AMP-local (with no NUSIs, essentially merges worktable to target table).
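
The two FastLoad phases described above can be sketched as a toy simulation. This is only an illustration under loud assumptions: `NUM_AMPS`, `row_hash`, `owner_amp`, `phase1`, and `phase2` are hypothetical names, `zlib.crc32` is a stand-in for Teradata's real row hash, and the modulo mapping simplifies how a hash is actually mapped to an AMP.

```python
import zlib

NUM_AMPS = 4  # hypothetical system size, chosen only for illustration


def row_hash(pi_value: str) -> int:
    # Stand-in hash (assumption): Teradata's real row hash is different.
    return zlib.crc32(pi_value.encode("utf-8"))


def owner_amp(pi_value: str) -> int:
    # Simplified mapping (assumption): hash modulo the number of AMPs.
    return row_hash(pi_value) % NUM_AMPS


def phase1(blocks):
    """Phase 1: each block arrives at an arbitrary AMP, which de-blocks it,
    computes the row hash, and sends every row to the AMP that owns it.
    The receiving AMP has no influence on where a row ends up."""
    amps = {amp: [] for amp in range(NUM_AMPS)}
    for block in blocks:
        for pi_value, payload in block:
            amps[owner_amp(pi_value)].append((row_hash(pi_value), pi_value, payload))
    return amps


def phase2(amps):
    """Phase 2: AMP-local sort by row hash; no row crosses AMPs."""
    return {amp: sorted(rows, key=lambda r: r[0]) for amp, rows in amps.items()}
```

In this sketch all cross-AMP movement happens inside `phase1`; `phase2` only reorders rows that an AMP already holds, which matches the point that redistribution belongs to Phase 1.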

b.saipavankumar 12 posts Joined 03/12
02 Nov 2015

Thanks, Fred. So, if I understood your point correctly, in both Phase 2 of FastLoad and the application phase of MLoad, data is just committed to the actual target table. The hashing and redistribution of data to the corresponding AMP happens in Phase 1 (the acquisition phase) itself.
Also, could you please clarify the points below?

  • In FastLoad Phase 1, will each AMP have the complete set of data before it is hashed? That is, does each AMP receive all the data in the file we are loading, after which hashing starts and unwanted data is purged from that AMP? Or is the file just split into blocks that are sent to AMPs at random and then hashed and redistributed properly?
  • What about MLoad? Does the hashing happen as each block is pushed to the worktable (since MLoad loads data in blocks), or, as in FastLoad, are the blocks pushed arbitrarily to the worktable on different AMPs, with hashing and redistribution happening afterwards?


Sai Pavan Kumar Bhamidipati

Fred 1096 posts Joined 08/04
04 Nov 2015

In both cases, an entire data block from the client is sent to an AMP, which immediately de-blocks the data, builds individual rows, and sends each row to the appropriate target AMP, which then appends the row to the table (FastLoad) or inserts it to the worktable (MultiLoad).
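
A rough sketch of that per-block flow, for both utilities, might look like the following. Everything here is an assumption made for illustration: the names `load`, `owner_amp`, and `NUM_AMPS` are hypothetical, `zlib.crc32` stands in for the real row hash, blocks are handed to receiving AMPs round-robin, and the Fallback copy of a MultiLoad worktable row is simply placed on the next AMP.

```python
import zlib
from collections import defaultdict

NUM_AMPS = 4  # hypothetical system size


def owner_amp(pi_value: str) -> int:
    # Stand-in hash and simplified hash-to-AMP mapping (assumptions).
    return zlib.crc32(pi_value.encode("utf-8")) % NUM_AMPS


def load(blocks, utility):
    """Route every row of every client block to its owning AMP.
    utility is "fastload" (append to target) or "multiload" (insert into
    the worktable plus a Fallback copy on another AMP)."""
    if utility not in ("fastload", "multiload"):
        raise ValueError("utility must be 'fastload' or 'multiload'")
    received = defaultdict(int)    # rows each receiving AMP had to de-block
    target = defaultdict(list)     # FastLoad: rows appended to the target table
    worktable = defaultdict(list)  # MultiLoad: primary worktable rows
    fallback = defaultdict(list)   # MultiLoad: Fallback worktable rows
    for i, block in enumerate(blocks):
        receiver = i % NUM_AMPS    # assumption: round-robin choice of receiver
        for row in block:
            received[receiver] += 1
            dest = owner_amp(row[0])
            if utility == "fastload":
                target[dest].append(row)
            else:
                worktable[dest].append(row)
                # Simplified Fallback placement (assumption): the next AMP.
                fallback[(dest + 1) % NUM_AMPS].append(row)
    return {"received": dict(received), "target": dict(target),
            "worktable": dict(worktable), "fallback": dict(fallback)}
```

In this model a receiving AMP only ever sees the blocks handed to it, never the whole file, and each row is forwarded as soon as it is de-blocked, so there is no later purge step.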
