Error In Labelled Point Object Pyspark
I am writing a function that takes an RDD as input, splits the comma-separated values, converts each row into a LabeledPoint object, and finally returns the output as a DataFrame.
Solution 1:
You saw no errors until you executed the action output.take(5) because Spark is lazy: transformations are only recorded, and nothing is actually executed until an action such as "take(5)" runs.
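To make the lazy behaviour concrete, here is a minimal plain-Python analogy, not Spark itself. Python's built-in map() also defers work until its result is consumed, so a bad row only raises once something pulls the data, much like take(5) on an RDD. The rows and the parse helper are hypothetical, made up for illustration.

```python
# Plain-Python analogy (not Spark): map() is lazy, like an RDD transformation.
rows = ["1.0,2.0,3.0", "0.0,oops,5.0"]  # hypothetical input; the second row is malformed


def parse(line):
    """Split a comma-separated line and convert every field to float."""
    return [float(x) for x in line.split(",")]


lazy = map(parse, rows)  # nothing is parsed here, so no error is raised yet

try:
    parsed = list(lazy)  # consuming the iterator plays the role of take(5)
    error = None
except ValueError as exc:
    parsed = None
    error = exc  # the failure only shows up at consumption time
```

The same pattern explains why defining the function and calling it appeared to work: the failing code path was not reached until the action ran.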
Your code has a few issues, and I think it is failing because of the extra "[" and "]" in [line[1:]]. Remove them and keep only line[1:], so that the features are passed as a flat list rather than as a list nested inside another list.
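The difference between the two expressions can be seen in plain Python, without Spark. This is a small sketch using a hypothetical row, with the label assumed to be the first field and the features the rest. The original code is not shown here, so the variable names are illustrative only.

```python
# Hypothetical row after splitting on commas: label first, then the features.
line = "1.0,2.5,3.5".split(",")

label = float(line[0])

nested = [line[1:]]  # [['2.5', '3.5']]: a single element that is itself a list
flat = line[1:]      # ['2.5', '3.5']: one element per feature

# A flat list converts cleanly to numeric features.
features = [float(x) for x in flat]
```

With the brackets removed, the features argument is a flat sequence of values, which is the shape the fix above relies on.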
Another issue you may need to address is the missing DataFrame schema: replace "toDF()" with "toDF(["features","label"])" to give the DataFrame named columns.