Pradi 11 posts Joined 04/14
07 Jul 2014
GLM SQL-MR predict function gives empty fitted_value

Hi,
I'm applying the Generalized Linear Model (GLM) for prediction on the sample data below, using the AsterAnalytics_Beta__indep_indep.05.10.00.00 library.
Sample training data:
 origin | dest | year | month | dayofmonth | dayofweek | uniquecarrier | flightnum | deptime | crsdeptime | distance | depdelay | isdelayed
--------+------+------+-------+------------+-----------+---------------+-----------+---------+------------+----------+----------+-----------
 IAD    | TPA  | 2008 |     1 |          3 |         4 | WN            |       335 |    2003 |       1955 |      810 |        8 |         0
 IAD    | TPA  | 2008 |     1 |          3 |         4 | WN            |      3231 |     754 |        735 |      810 |       19 |         1
 IND    | BWI  | 2008 |     1 |          3 |         4 | WN            |       448 |     628 |        620 |      515 |        8 |         0
 IND    | BWI  | 2008 |     1 |          3 |         4 | WN            |      3920 |    1829 |       1755 |      515 |       34 |         1
 IND    | JAX  | 2008 |     1 |          3 |         4 | WN            |       378 |    1940 |       1915 |      688 |       25 |         1
Query for creating the model:
 

SELECT * FROM GLM (
ON (SELECT 1)
PARTITION BY 1
database('beehive')
userid('db_superuser')
password('db_superuser')
inputTable('airline_final_nneg')
outputTable('airline_ouput3')
columnNames('isdelayed', 'origin','year', 'month', 'dayofmonth', 'uniquecarrier', 'flightnum','dest', 'deptime', 'crsdeptime', 'distance' , 'depdelay')
categoricalColumns('origin', 'dest', 'uniquecarrier')
family('LOGISTIC')
link('CANONICAL')
weight('1')
threshold('0.01')
maxIterNum('10')
);
 
ERROR:  SQL-MR function GLM failed: [AsterData][ASTERJDBCDSII](34) ERROR: SQL-MR function GLM_REDUCESOLVEANDUPDATE failed: The input data results in a singular matrix and hence there is no solution ()
ERROR:  SQL-MR function GLM failed: Connection to jdbc:ncluster://192.168.100.100/beehive could not be established
After I removed the columns 'year', 'month', and 'dayofmonth', the model was created.
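The singular-matrix error would be consistent with predictor columns that hold only one value: in the sample rows above, year, month, and dayofmonth are always 2008, 1, and 3, and a constant column cannot be distinguished from the intercept. A minimal sketch of a check, assuming the full training table (airline_final_nneg, as named in the GLM call) resembles the sample:

```sql
-- Count distinct values per candidate predictor in the training table.
-- A count of 1 means the column is constant, which makes it collinear
-- with the intercept term.
SELECT COUNT(DISTINCT year)       AS year_values,
       COUNT(DISTINCT month)      AS month_values,
       COUNT(DISTINCT dayofmonth) AS dayofmonth_values
FROM airline_final_nneg;
```

If any of these counts is 1, that would explain why dropping the column lets the model fit.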
 
But now the prediction function does not give proper output: fitted_value is empty.
Test data:
 origin | dest | year | month | dayofmonth | dayofweek | uniquecarrier | flightnum | deptime | crsdeptime | distance | depdelay
--------+------+------+-------+------------+-----------+---------------+-----------+---------+------------+----------+----------
 XNA    | IAH  | 2008 |     2 |         29 |         5 | XE            |      2577 |    1154 |       1157 |      438 |       -3
 JAX    | IAH  | 2008 |     2 |          1 |         5 | XE            |      2157 |    1645 |       1525 |      817 |       80
 IAH    | PBI  | 2008 |     2 |         11 |         1 | XE            |      2522 |     707 |        710 |      956 |       -3
 IAH    | BTR  | 2008 |     2 |          5 |         2 | XE            |      2472 |    1432 |       1435 |      253 |       -3
 EWR    | JAN  | 2008 |     2 |         16 |         6 | XE            |      2117 |    1908 |       1905 |     1055 |        3
 
Prediction function:
SELECT * FROM GLMPREDICT (
ON airline_input
DATABASE('beehive')
USERID('db_superuser')
PASSWORD('db_superuser')
MODELTABLE ('airline_ouput1')
ACCUMULATE ('Origin','dest','Year','Month','DayofMonth','DayOfWeek','UniqueCarrier','FlightNum','CRSDepTime','Distance','depdelay')
FAMILY('LOGISTIC')
LINK('LOGIT')
)
ORDER BY  Year, Month ;
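One thing that may be worth ruling out: in the samples shown, the training rows only contain origins IAD and IND and carrier WN, while the test rows contain XNA, JAX, IAH, EWR and carrier XE. Whether GLMPREDICT leaves fitted_value empty for categorical levels the model has never seen is an assumption here, not something confirmed by the output above. A sketch of a check, using the table names from the queries in this post (airline_final_nneg for training, airline_input for test):

```sql
-- Origins that occur in the test table but never in the training table.
-- Rows returned here have a categorical level the model has no
-- coefficient for.
SELECT DISTINCT t.origin
FROM airline_input t
LEFT JOIN (SELECT DISTINCT origin FROM airline_final_nneg) tr
       ON t.origin = tr.origin
WHERE tr.origin IS NULL;
```

The same query can be repeated for dest and uniquecarrier, the other two columns listed in categoricalColumns.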
Could someone please let me know where I'm going wrong?
 
Thank you.
 
