apataki 1 post Joined 04/12
14 May 2012
Stream API for R

I found a tutorial on how to use R with Aster Data. The main body of the mapper looks like this:

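while (1)
{
 # stdin is assumed to be a connection opened before the loop,
 # e.g. stdin = file("stdin", open="r")
 input_list = scan(stdin, what=list(stock_id=" ", open_price=0), nlines=1, quiet=TRUE)
 id = input_list$stock_id

 # Stop when there is no more input
 if (length(id) == 0)
  break

 score = score_function(input_list)

 # Output original tuple with attached score
 result = c(id, score)

 write(result, stdout(), sep=DELIMITER, ncolumns = length(result))
}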


In this example the R code scans and processes the input (stdin) line by line and calculates a score for each line. Now suppose that I want to score 5 loan products and I have a sample of size n for each product, so 5 x n lines altogether. If n = 200, I can set nlines = 200. If the input is ordered by product, each cycle of the loop then processes one product. But normally the structure is not that symmetric, and n can differ between products.

How should I write the SQL-MR query so that the stream function is called 5 times, once for each product?
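
Independently of how the query is written, the script itself can avoid depending on a fixed n by reading all the lines it receives and grouping them on the product key. Below is a minimal sketch of that idea, not the tutorial's code. It assumes a tab-delimited input whose first column is loan_product; the column layout, score_partition (a stand-in for score_function that just averages open_price) and score_by_product are hypothetical names introduced for illustration.

```r
# Hypothetical stand-in for score_function: mean open_price of one product.
score_partition <- function(rows) {
  mean(rows$open_price)
}

# Parse delimited lines and score each loan_product group, whatever its size.
score_by_product <- function(lines, delimiter = "\t") {
  if (length(lines) == 0) {
    return(data.frame(loan_product = character(0), score = numeric(0),
                      stringsAsFactors = FALSE))
  }
  rows <- read.table(text = lines, sep = delimiter, header = FALSE,
                     col.names = c("loan_product", "stock_id", "open_price"),
                     colClasses = c("character", "character", "numeric"),
                     quote = "", comment.char = "")
  # split() builds one data frame per product, so n may differ per product.
  groups <- split(rows, rows$loan_product)
  scores <- vapply(groups, score_partition, numeric(1))
  data.frame(loan_product = names(scores), score = unname(scores),
             stringsAsFactors = FALSE)
}

# Example with unequal group sizes: 3 rows for product A, 1 row for product B.
example_lines <- c("A\ts1\t10", "A\ts2\t20", "B\ts3\t5", "A\ts4\t30")
result <- score_by_product(example_lines)

# In the streamed script the lines would come from standard input instead:
# lines <- readLines(file("stdin"))
# out <- score_by_product(lines)
# write.table(out, stdout(), sep = "\t", quote = FALSE,
#             row.names = FALSE, col.names = FALSE)
```

Grouping inside the script means the input does not have to arrive ordered by product, and the same code works whether the script sees one product or several.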









13 Jul 2013

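SELECT score
FROM STREAM (
        ON (
                -- <input_table> is a placeholder for your table;
                -- loan_product is selected so that PARTITION BY can use it
                SELECT loan_product, stock_id, open_price
                FROM <input_table>
        )
        PARTITION BY loan_product
        SCRIPT ('RSCRIPT my_rscript.R "arg1" "arg2" ')
        OUTPUTS ('score numeric(5,2)')
);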
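
On the script side, a rough sketch of how my_rscript.R might pick up the two arguments and emit values that fit the declared score numeric(5,2) column is shown below. The helper format_scores and the sample values are hypothetical; commandArgs(trailingOnly = TRUE) is the standard R way to read arguments passed after the script name.

```r
# Hypothetical helper: render scores with two decimals to match numeric(5,2).
format_scores <- function(scores) {
  formatC(round(scores, 2), format = "f", digits = 2)
}

# Arguments given after the script name ("arg1" "arg2" in the query) land here.
# When run without arguments this is simply an empty character vector.
args <- commandArgs(trailingOnly = TRUE)

# Sample values for illustration only.
formatted <- format_scores(c(0.5, 12.3456, 100))

# One score per output line, matching the single column declared in OUTPUTS.
write(formatted, stdout(), ncolumns = 1)
```

Note that numeric(5,2) leaves room for three digits before the decimal point, so scores of 1000 or more would not fit the declared column.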






