Showing posts with the label Rdd

PySpark Application Fails With java.lang.OutOfMemoryError: Java Heap Space

I'm running Spark via PyCharm and the pyspark shell, respectively. I'm stuck with this error:…

Spark: How To "reduceByKey" When The Keys Are NumPy Arrays Which Are Not Hashable?

I have an RDD of (key, value) elements. The keys are NumPy arrays. NumPy arrays are not hashable, an…

How To Classify Images Using Spark And Caffe

I am using Caffe to do image classification, and I am using Mac OS X and Python. Right now I know how …

How Can I Use reduceByKey Instead Of groupByKey To Construct A List?

My RDD is made of many items, each of which is a tuple as follows: (key1, (val1_key1, val2_key1)) (…
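
One common way to approach the question above is to wrap every value in a one-element list and then concatenate lists per key. The sketch below is a minimal illustration, not the method from the post: the sample pairs are made up, and `to_list`, `merge_lists`, and `reduce_by_key_local` are hypothetical helper names. `reduce_by_key_local` is only a plain-Python stand-in for the behaviour of `reduceByKey`, so the example runs without a Spark installation.

```python
def to_list(value):
    """Wrap a single value in a one-element list so lists can be concatenated."""
    return [value]


def merge_lists(left, right):
    """Combine two partial lists that belong to the same key."""
    return left + right


def reduce_by_key_local(pairs, func):
    """Plain-Python stand-in for reduceByKey (illustration only, not Spark)."""
    reduced = {}
    for key, value in pairs:
        # First value seen for a key is stored as-is; later ones are merged in.
        reduced[key] = func(reduced[key], value) if key in reduced else value
    return reduced


# Hypothetical sample data in the (key, (val1, val2)) shape described above.
pairs = [("key1", (1, 2)), ("key2", (3, 4)), ("key1", (5, 6))]

wrapped = [(key, to_list(value)) for key, value in pairs]
result = reduce_by_key_local(wrapped, merge_lists)

# With a real RDD, the equivalent chain would be (assuming `rdd` holds the pairs):
#   rdd.mapValues(to_list).reduceByKey(merge_lists)
```

Because every value still ends up in the final list, this pattern moves roughly as much data through the shuffle as `groupByKey` does; it mainly helps when the merge step can shrink the data, for example by deduplicating or aggregating.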