How To Download And Save All PDFs From A Dynamic Website?

I am trying to download and save to a folder all the PDFs linked from some websites with dynamic elements, e.g. https://www.bankinter.com/banca/nav/documentos-datos-fundamentales Every

Solution 1:

You have to send an HTTP POST request with the appropriate JSON payload. Once you get the response, parse the two fields objectId and nombreFichero from each item and use them to build the correct links to the PDFs. The following should work:

import os
import json
import requests

url = 'https://bancaonline.bankinter.com/publico/rs/documentacionPrix/list'
base = 'https://bancaonline.bankinter.com/publico/DocumentacionPrixGet?doc={}&nameDoc={}'
payload = {"cod_categoria": 2,"cod_familia": 3,"divisaDestino": None,"vencimiento": None,"edadActuarial": None}

# Target folder on the Windows desktop; create it if it does not exist yet
dirf = os.path.join(os.environ['USERPROFILE'], "Desktop", "PdfFolder")
if not os.path.exists(dirf):
    os.makedirs(dirf)
os.chdir(dirf)

r = requests.post(url,json=payload)
for item in r.json():
    objectId = item['objectId']
    nombreFichero = item['nombreFichero'].replace(" ","_")
    # splitext keeps any extra dots in the name and copes with a missing extension
    filename = os.path.splitext(nombreFichero)[0] + ".PDF"
    link = base.format(objectId,nombreFichero)
    with open(filename, 'wb') as f:
        f.write(requests.get(link).content)

After running the above script, give it some time to finish, as the site is very slow.
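Since the site is slow, it may help to set explicit timeouts and stream each file to disk instead of holding it in memory. Below is a minimal sketch of that idea, not a tested replacement for the script: the helper names `target_for` and `download_all` are hypothetical, and the 60-second timeout and 8192-byte chunk size are arbitrary assumptions you would tune yourself.

```python
import os


def target_for(item, base):
    """Return (filename, link) for one entry of the JSON listing."""
    name = item['nombreFichero'].replace(" ", "_")
    # splitext keeps dots inside the name and copes with a missing extension
    filename = os.path.splitext(name)[0] + ".PDF"
    return filename, base.format(item['objectId'], name)


def download_all(url, base, payload, folder, timeout=60):
    """POST for the listing, then stream every PDF into `folder`."""
    # Third-party dependency, imported here so target_for works without it
    import requests

    os.makedirs(folder, exist_ok=True)
    # One session reuses the connection across all requests
    with requests.Session() as session:
        listing = session.post(url, json=payload, timeout=timeout)
        listing.raise_for_status()  # stop early on an HTTP error
        for item in listing.json():
            filename, link = target_for(item, base)
            # stream=True avoids loading the whole PDF into memory
            with session.get(link, stream=True, timeout=timeout) as resp:
                resp.raise_for_status()
                with open(os.path.join(folder, filename), 'wb') as f:
                    for chunk in resp.iter_content(chunk_size=8192):
                        f.write(chunk)
```

Calling `download_all(url, base, payload, dirf)` with the values from the script would fetch the same files. The difference is that a request that stalls raises a timeout error instead of hanging indefinitely.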