How Do You Save A Google Sheets File As Csv From Python 3 (or 2)?
Solution 1:
While the requests library is the gold standard for HTTP requests from Python, this style of download (while not yet deprecated) is not likely to last, because it relies on export links and on managing cookies and redirects yourself. One reason not to prefer links is that they are less secure; such access should generally require authorization. Instead, the currently accepted way of exporting Google Sheets as CSV is to use the Google Drive API.
So why the Drive API? Isn't this supposed to be something for the Sheets API instead? Well, the Sheets API is for spreadsheet-oriented functionality, i.e., data formatting, column resize, creating charts, cell validation, etc., while the Drive API is for file-oriented functionality, i.e., import/export, copy, rename, etc.
Below is a complete cmd-line solution. (If you don't do Python, you can use it as pseudocode and pick any language supported by the Google APIs Client Libraries.) For the code snippet, assume the most current Sheet named inventory (older files with that name are ignored) and that DRIVE is the API service endpoint:
from __future__ import print_function  # lets print(..., end='') work on Python 2
import os

FILENAME = 'inventory'
SRC_MIMETYPE = 'application/vnd.google-apps.spreadsheet'
DST_MIMETYPE = 'text/csv'

# query for latest file named FILENAME
files = DRIVE.files().list(
    q='name="%s" and mimeType="%s"' % (FILENAME, SRC_MIMETYPE),
    orderBy='modifiedTime desc,name').execute().get('files', [])

# if found, export Sheets file as CSV
if files:
    fn = '%s.csv' % os.path.splitext(files[0]['name'].replace(' ', '_'))[0]
    print('Exporting "%s" as "%s"... ' % (files[0]['name'], fn), end='')
    data = DRIVE.files().export(fileId=files[0]['id'], mimeType=DST_MIMETYPE).execute()

    # if non-empty file
    if data:
        with open(fn, 'wb') as f:
            f.write(data)
        print('DONE')
If your Sheet is large, you may have to export it in chunks -- see this page on how to do that. If you're generally new to Google APIs, I have a (somewhat dated but) user-friendly intro video for you. (There are 2 videos after that which may be useful too.)
Solution 2:
Google responds to the initial request with a series of cookie-setting 302 redirects. If you don't store and resubmit the cookies between requests, it redirects you to the login page.
So, the problem is not with the User-Agent header. It's the fact that, by default, urllib.request.urlopen doesn't store cookies, but it will follow the HTTP 302 redirects.
The following code works just fine on a public spreadsheet available at the location specified by DOC_URL:
>>> from http.cookiejar import CookieJar
>>> from urllib.request import build_opener, HTTPCookieProcessor
>>> opener = build_opener(HTTPCookieProcessor(CookieJar()))
>>> resp = opener.open(DOC_URL)
>>> # should really parse resp.getheader('content-type') for encoding.
>>> csv_content = resp.read().decode('utf-8')
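The comment in the snippet above notes that the encoding should really come from the Content-Type header rather than being hard-coded. A minimal sketch of one way to do that with the standard library follows; the helper name charset_from_content_type and the fallback to 'utf-8' are assumptions for illustration, not part of the original answer.

```python
from email.message import Message


def charset_from_content_type(header_value, default='utf-8'):
    """Return the charset named in a Content-Type header value, or `default`.

    Hypothetical helper: the name and the default are illustrative choices.
    """
    if not header_value:
        # No Content-Type header at all: fall back to the default.
        return default
    msg = Message()
    msg['Content-Type'] = header_value
    # get_content_charset() lower-cases the charset and returns failobj
    # when the header has no charset parameter.
    return msg.get_content_charset(failobj=default)
```

With this in place, the last line of the snippet could become csv_content = resp.read().decode(charset_from_content_type(resp.getheader('content-type'))).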
Having shown you how to do it in vanilla Python, I'll now say that the Right Way™ to go about this is to use the most-excellent requests library. It is extremely well documented and makes these sorts of tasks incredibly pleasant to complete.
For instance, getting the same csv_content as above with the requests library is as simple as:
>>> import requests
>>> csv_content = requests.get(DOC_URL).text
That single line expresses your intent more clearly. It's easier to write and easier to read. Do yourself, and anyone else who shares your codebase, a favor and just use requests.
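Both snippets above stop once csv_content holds the decoded text, while the question asks about saving it. A minimal Python 3 sketch of the remaining step might look like the following; the function names save_csv_text and parse_csv_text are hypothetical, and writing the file as UTF-8 is an assumption rather than something the answer specifies.

```python
import csv
import io


def save_csv_text(csv_content, path):
    """Write already-decoded CSV text to `path` as UTF-8 (assumed encoding)."""
    # newline='' stops Python from translating the line endings in the export.
    with io.open(path, 'w', encoding='utf-8', newline='') as f:
        f.write(csv_content)


def parse_csv_text(csv_content):
    """Return the rows of CSV text as a list of lists of strings."""
    # newline='' lets the csv module handle line breaks inside quoted fields.
    return list(csv.reader(io.StringIO(csv_content, newline='')))
```

For example, save_csv_text(csv_content, 'inventory.csv') would write the download to disk, and parse_csv_text(csv_content) would give the rows for further processing.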