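The following code loads a LIBSVM-format file with Spark. It also includes code that randomly splits the data into a training set and a test set: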
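// See http://spark.apache.org/docs/latest/mllib-optimization.html
import org.apache.spark.mllib.util.MLUtils

// Load the LIBSVM-format data (sc is an existing SparkContext)
val data = MLUtils.loadLibSVMFile(sc, "/*/date=20170324/*").cache()
// Randomly split into roughly 80% training and 20% test data, with a fixed seed
val splits = data.randomSplit(Array(0.8, 0.2), seed = 11L)
val training = splits(0).cache()
val test = splits(1).cache()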
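For reference, each line of a LIBSVM-format file has the form `label index1:value1 index2:value2 ...`, with one-based feature indices that `loadLibSVMFile` converts to zero-based when loading. The sketch below illustrates that layout in plain Scala, without Spark. The names `LibSvmLineSketch`, `Parsed`, and `parseLine` are hypothetical helpers made up for this illustration, and the sample line is invented; this is not how Spark itself is implemented.

```scala
// Hypothetical helper for illustration only; not part of Spark.
object LibSvmLineSketch {
  // A label plus sparse features stored as (zero-based index, value) pairs.
  final case class Parsed(label: Double, features: List[(Int, Double)])

  // Parses one line of the form "label index1:value1 index2:value2 ...".
  def parseLine(line: String): Parsed = {
    val tokens = line.trim.split("\\s+")
    val label = tokens.head.toDouble
    val features = tokens.tail.toList.map { tok =>
      val Array(idx, value) = tok.split(":")
      // Indices in the file are one-based; shift them to zero-based.
      (idx.toInt - 1, value.toDouble)
    }
    Parsed(label, features)
  }

  def main(args: Array[String]): Unit = {
    // Made-up sample line: label 1, feature 3 = 0.5, feature 7 = 1.25.
    println(parseLine("1 3:0.5 7:1.25"))
  }
}
```

This sketch assumes well-formed input and does no validation (for example, it does not check that indices are in ascending order).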