
Converting A Very Large Json File To Csv

I have a JSON file that is about 8GB in size. When I try to convert the file using this script:

import csv
import json

infile = open('filename.json', 'r')
outfile = open('data.csv'

Solution 1:

Would it be easier to convert each of these individually and then combine them into one csv?

Yes, it certainly would.

For example, this will put each JSON object/array (whatever is loaded from the file) onto its own line of a single CSV.

import json, csv
from glob import glob

with open('out.csv', 'w') as f:
    for fname in glob("*.json"):  # read every JSON file in the current directory
        with open(fname) as j:
            f.write(str(json.load(j)))
            f.write('\n')

Use the glob pattern **/*.json (and pass recursive=True to glob) to find all JSON files in nested folders.

It's not really clear what for row in ... was doing for your data, since you don't have an array. Unless you wanted each JSON key to be a CSV column?
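If that is the goal, here is a minimal sketch (assuming each file holds a single flat JSON object; out.csv and the glob pattern are placeholders) that collects the keys and writes one CSV row per file:

import json
from csv import DictWriter
from glob import glob

# Assumes each *.json file contains one flat JSON object (no nesting).
rows = []
for fname in glob("*.json"):
    with open(fname) as j:
        rows.append(json.load(j))

# Use the union of all keys seen across the files as the CSV columns.
fieldnames = sorted({key for row in rows for key in row})

with open('out.csv', 'w', newline='') as f:
    writer = DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)  # keys missing from a row are written as empty cells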

Solution 2:

Yes, it absolutely can be done, and quite easily. I opened a 4GB JSON file in a few seconds this way. I didn't need to convert to CSV myself, but that can be done just as easily; a fuller command sketch follows the steps below.

  1. start MongoDB with Docker.
  2. create a temporary database in MongoDB, e.g. test
  3. copy the JSON file into the Docker container
  4. run the mongoimport command

    docker exec -it container_id mongoimport --db test --collection data --file /tmp/data.json --jsonArray

  5. run the mongoexport command to export to CSV

    docker exec -it container_id mongoexport --db test --collection data --csv --out data.csv --fields id,objectType
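For reference, a rough end-to-end sketch of the steps above (the container name mongo_tmp, the test database, and the file paths are placeholders; newer mongoexport releases replace --csv with --type=csv):

    # 1. Start a MongoDB container (the name mongo_tmp is arbitrary)
    docker run -d --name mongo_tmp mongo

    # 2./3. Copy the JSON file into the container; the "test" database is
    #       created automatically on first import
    docker cp data.json mongo_tmp:/tmp/data.json

    # 4. Import the file into the test.data collection
    docker exec -it mongo_tmp mongoimport --db test --collection data --file /tmp/data.json --jsonArray

    # 5. Export to CSV (use --type=csv on newer mongoexport versions)
    docker exec -it mongo_tmp mongoexport --db test --collection data --type=csv --fields id,objectType --out /tmp/data.csv

    # Copy the result back to the host
    docker cp mongo_tmp:/tmp/data.csv data.csv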
