Scala - How to provide CSV input to a naive Bayes classifier


Hi, I am working on disease classification using a naïve Bayes model. I have a CSV file that lists diseases along with their symptoms. The columns of the CSV are: symptom-1, symptom-2, symptom-3, disease. How do I provide this CSV to the naïve Bayes model and classify the disease based on the symptoms? Is there standard code to read a CSV and provide it to a naïve Bayes model to perform classification? I am using the Spark machine learning library for this.

code

This is a modified version of the example in the MLlib documentation.

    import org.apache.spark.mllib.classification.{NaiveBayes, NaiveBayesModel}
    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.regression.LabeledPoint

    // sc is the SparkContext (predefined in spark-shell)
    val data = sc.textFile("your csv path")
    val parsedData = data.map { line =>
      val parts = line.split(',')
      // LabeledPoint(disease, (symptom 1, 2, 3)), assuming all of them are numeric
      LabeledPoint(parts(3).toDouble, Vectors.dense(parts(0).toDouble, parts(1).toDouble, parts(2).toDouble))
    }

    // Split data into training (60%) and test (40%).
    val splits = parsedData.randomSplit(Array(0.6, 0.4), seed = 11L)
    val training = splits(0)
    val test = splits(1)

    val model = NaiveBayes.train(training, lambda = 1.0, modelType = "multinomial")

    // Fraction of test rows whose predicted label matches the true label
    val predictionAndLabel = test.map(p => (model.predict(p.features), p.label))
    val accuracy = 1.0 * predictionAndLabel.filter(x => x._1 == x._2).count() / test.count()

    // Save and load the model
    model.save(sc, "target/tmp/naivebayesmodel")
    val sameModel = NaiveBayesModel.load(sc, "target/tmp/naivebayesmodel")
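The code above assumes every column is already numeric. If the CSV holds symptom and disease names as text instead, one possible approach is to turn each row into a 0/1 vector with one position per distinct symptom, and to map each disease name to a numeric label. The following is only a sketch in plain Scala (no Spark needed to try it); the object `SymptomEncoding`, its methods `buildIndex` and `encodeRow`, and the sample rows are hypothetical names and data made up for illustration, not part of MLlib.

```scala
// Hypothetical helper, not part of MLlib: names and sample data are illustrative only.
object SymptomEncoding {
  // Map each distinct string to a position 0, 1, 2, ... in sorted order.
  def buildIndex(values: Seq[String]): Map[String, Int] =
    values.distinct.sorted.zipWithIndex.toMap

  // Encode one CSV line "symptom,...,disease" as (label, features),
  // where features(i) is 1.0 when symptom i appears in the row.
  def encodeRow(line: String,
                symptomIndex: Map[String, Int],
                diseaseIndex: Map[String, Int]): (Double, Array[Double]) = {
    val parts = line.split(',').map(_.trim)
    val features = Array.fill(symptomIndex.size)(0.0)
    parts.init.foreach(s => features(symptomIndex(s)) = 1.0)
    (diseaseIndex(parts.last).toDouble, features)
  }
}

// Made-up sample rows in the symptom-1, symptom-2, symptom-3, disease layout.
val lines = Seq("fever,cough,fatigue,flu", "rash,itching,fever,allergy")
val rows = lines.map(_.split(',').map(_.trim))

val symptomIndex = SymptomEncoding.buildIndex(rows.flatMap(_.init))
val diseaseIndex = SymptomEncoding.buildIndex(rows.map(_.last))
val encoded = lines.map(SymptomEncoding.encodeRow(_, symptomIndex, diseaseIndex))
```

Each resulting pair could then be wrapped as `LabeledPoint(label, Vectors.dense(features))` before training. The 0/1 features are non-negative, which naive Bayes in MLlib requires. For a large file the two indexes would have to be built from the distinct values of the RDD rather than from a local `Seq`.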
