machine learning - SVM different results in R with same input and parameters
I have developed an SVM model for fraud detection on a training data set using the following parameters:
    library(e1071)

    set.seed(1234)
    gamma.optimal <- 0.02
    cost.optimal <- 4
    svm_model1 <- svm(log(response + 0.00012345) ~ ., data_test, kernel = "radial",
                      gamma = gamma.optimal, cost = cost.optimal)

After creating the SVM, I evaluated svm_model1 on a test data set to obtain the total fraud amount, sum(response), which is equal to 30.080 USD:

    predictions <- exp(predict(svm_model1, testing))

This result is the same on my laptop (local mode, R GUI) and on a small cluster using SparkR (4 nodes and 1 master, Cloudera 5.6).
Happy with these results, I tried to run the same R script with the same test data set and the same svm_model1 (saved with set.seed(1234) in an .RData file), this time on 2 different systems: an Oracle BDA (6 slave nodes and 1 master) and one with 4 slave nodes, on Cloudera 5.7.

The results on these 2 final systems were: sum(response) equal to 30.130 USD, using the same call:

    predictions <- exp(predict(svm_model1, testing))

My questions are:
1) If I used the same script, the same model saved in an .RData file, and the same data, how is it possible that the e1071 SVM with a radial kernel gives me different results?
2) Are these results related to the nature of the radial kernel, to parallel processing issues, or to different hardware characteristics? Or, if I use set.seed(), must the result of predict() be the same in R no matter what hardware I have?
Thank you in advance for your time and help. Best regards.
As the prediction function is (obviously) deterministic, the results should be the same. However, e1071 uses libsvm, i.e. C++ code, and floating point operations can (and will) vary between hardware platforms (and between different compilers and/or compiler flags). Try to write your own R prediction function, which should give (for a fixed model) the same answer on all platforms.
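A minimal sketch of such a hand-written prediction function follows. It is not taken from the original post. It assumes the model is available as a plain list whose fields (SV, coefs, rho, gamma, x.scale, y.scale) mirror those of an e1071 svm object, and that the decision value is the usual radial-kernel sum minus rho. The names predict_rbf_svm, toy_model and toy_data are hypothetical. Since exp() and sum() still rely on the platform's math library, results may still differ in the last bits.

```r
# Pure-R prediction for a radial-kernel SVM regression model.
# `model` is a plain list; its field names are ASSUMED to mirror those of an
# e1071 `svm` object (SV, coefs, rho, gamma, x.scale, y.scale).
predict_rbf_svm <- function(model, newdata) {
  x <- as.matrix(newdata)

  # Apply the same centring/scaling that was used on the training inputs.
  if (!is.null(model$x.scale)) {
    x <- sweep(x, 2, model$x.scale[["scaled:center"]], "-")
    x <- sweep(x, 2, model$x.scale[["scaled:scale"]], "/")
  }

  sv    <- as.matrix(model$SV)
  coefs <- as.numeric(model$coefs)

  # Decision value per row:
  #   sum_i coefs[i] * exp(-gamma * ||sv_i - x||^2) - rho
  f <- apply(x, 1, function(row) {
    d2 <- rowSums(sweep(sv, 2, row, "-")^2)
    sum(coefs * exp(-model$gamma * d2)) - model$rho
  })

  # Undo the response scaling, if the model stores one.
  if (!is.null(model$y.scale)) {
    f <- f * model$y.scale[["scaled:scale"]] + model$y.scale[["scaled:center"]]
  }
  unname(f)
}

# Hypothetical toy model: two support vectors, no scaling.
toy_model <- list(
  SV    = matrix(c(0, 0,
                   1, 1), ncol = 2, byrow = TRUE),
  coefs = c(1, -0.5),
  rho   = 0.1,
  gamma = 0.02
)
toy_data <- matrix(c(0, 0,
                     1, 0), ncol = 2, byrow = TRUE)
toy_pred <- predict_rbf_svm(toy_model, toy_data)
```

For the model in the question, the test data would first have to be converted to a numeric matrix with the same columns, in the same order, as svm_model1$SV. The output could then be passed through exp() and compared with exp(predict(svm_model1, testing)) on each system using all.equal(), which reports whether the two vectors agree within a numerical tolerance.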