How To Read A CSV File Subset By Subset With Pandas?
Solution 1:
You can also use the nrows and skiprows parameters of read_csv to break the file up into chunks. I would recommend against using iterrows, since it is typically very slow. If you chunk the data while reading it in and save each chunk separately, you can skip the iterrows step entirely. This covers the file-reading side, if you want to split the file into chunks (which seems to be an intermediate step in what you're trying to do).
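As a rough sketch of the nrows/skiprows idea, something like the following could work. The function name read_csv_in_chunks, the path argument and the chunk size of 360 are illustrative assumptions, not part of the original question, and the sketch assumes the file has a single header line.

```python
import pandas as pd


def read_csv_in_chunks(path, chunk_size=360):
    """Yield DataFrames of at most `chunk_size` rows each (hypothetical helper)."""
    start = 0  # number of data rows already read
    while True:
        chunk = pd.read_csv(
            path,
            # Skip the data rows read so far; line 0 (the header) is kept.
            skiprows=range(1, start + 1),
            nrows=chunk_size,
        )
        if chunk.empty:
            # Only the header was left, so the file is exhausted.
            break
        yield chunk
        start += chunk_size
```

Each chunk can then be processed or saved on its own, for example with `for chunk in read_csv_in_chunks("data.csv"): ...`. Note that every call re-scans the file from the top, so for large files the built-in chunksize argument of read_csv, which returns an iterator of DataFrames, is probably the more efficient choice.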
Another way is to subset using generators, by checking which set each index belongs to: [[1..360], [360..712], ..., [12640..13000]]
So write a function that splits the rows at indices divisible by 360 and, for a given index, returns the particular subset whose range contains it.
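The generator idea could look roughly like this. The names iter_subsets and pick_subset are made up for illustration, the sketch assumes the data is already in a DataFrame, and it uses zero-based positional row numbers rather than the one-based ranges listed above.

```python
import pandas as pd


def iter_subsets(df, size=360):
    """Yield (start, subset) pairs, one per block of `size` consecutive rows."""
    # Block boundaries fall on row positions divisible by `size`.
    for start in range(0, len(df), size):
        yield start, df.iloc[start:start + size]


def pick_subset(df, row, size=360):
    """Return the block whose row range contains positional row `row`."""
    for start, subset in iter_subsets(df, size):
        if start <= row < start + size:
            return subset
    raise IndexError(f"row {row} is outside the DataFrame")
```

Because iter_subsets is a generator, pick_subset stops as soon as it finds the matching block, so the later blocks are never sliced.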
I just wrote these approaches down as alternative ideas you might want to play around with, since in some cases you may only want a subset and not all of the chunks for calculation purposes.