
TypeError When Converting a Pandas DataFrame to a Spark DataFrame in PySpark

I did my research but didn't find anything on this. I want to convert a simple pandas.DataFrame to a Spark DataFrame, like this: df = pd.DataFrame({'col1': ['a', 'b', 'c'], 'col2':
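The snippet above is cut off, but the error in question typically comes from pandas dtypes that Spark's type inference cannot map cleanly. A small pandas-only illustration (the col2 values are an assumption, since the original snippet is truncated):

```python
import pandas as pd

# A numeric column containing a missing value is promoted to float64,
# and string columns are stored as generic "object" dtype -- both of
# which older Spark versions struggled to infer a Spark type for.
df = pd.DataFrame({"col1": ["a", "b", "c"], "col2": [1, None, 3]})
print(df.dtypes)
```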

Solution 1:

It's related to your Spark version; recent Spark releases make type inference more intelligent. You can fix this by supplying the schema explicitly:

from pyspark.sql.types import StructType, StructField, StringType, IntegerType

mySchema = StructType([
    StructField("col1", StringType(), True),
    StructField("col2", IntegerType(), True),
])
sc_sql.createDataFrame(df, schema=mySchema)
