
How To Find All Elements On The Webpage Through Scrolling Using Selenium WebDriver And Python

I can't seem to get all elements on a webpage, no matter what I have tried using Selenium. I am sure I am missing something. Here's my code. The URL has at least 30 elements, yet when I scrape it with Selenium I only get a handful of them back.

Solution 1:

You have to scroll down the page slowly. The site only requests price data via AJAX when a product scrolls into view.

import time

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument('--start-maximized')
driver = webdriver.Chrome(options=options)

url = 'https://www.adidas.com/us/men-shoes-new_arrivals'
driver.get(url)

# one scroll step per row of products (4 products per row)
scroll_times = len(driver.find_elements_by_class_name('col-s-6')) / 4
scrolled = 0
scroll_size = 400
while scrolled < scroll_times:
    driver.execute_script('window.scrollTo(0, arguments[0]);', scroll_size)
    scrolled += 1
    scroll_size += 400
    time.sleep(1)  # give the AJAX price request time to finish

shoe_prices = driver.find_elements_by_class_name('gl-price')

for price in shoe_prices:
    print(price.text)

print(len(shoe_prices))
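
Note that find_elements_by_class_name was removed in Selenium 4, so the snippet above only runs on Selenium 3. If you are on Selenium 4, here is a minimal sketch of the same row-by-row scrolling idea rewritten against the find_elements(By...) API; the col-s-6 / gl-price class names and the 400-pixel step are carried over from the answer above as assumptions and may need adjusting if the page layout has changed.

    import time

    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By

    options = Options()
    options.add_argument('--start-maximized')
    driver = webdriver.Chrome(options=options)

    driver.get('https://www.adidas.com/us/men-shoes-new_arrivals')

    # one scroll step per row of products (4 products per row)
    scroll_times = len(driver.find_elements(By.CLASS_NAME, 'col-s-6')) // 4

    scroll_size = 400
    for _ in range(scroll_times):
        driver.execute_script('window.scrollTo(0, arguments[0]);', scroll_size)
        scroll_size += 400
        time.sleep(1)  # give the AJAX price request time to finish

    shoe_prices = driver.find_elements(By.CLASS_NAME, 'gl-price')
    for price in shoe_prices:
        print(price.text)
    print(len(shoe_prices))
    driver.quit()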

Solution 2:

So there seems to be some difference in the results when using your code trial:

  • You found 30 items with requests and 6 items with Selenium
  • Whereas I found 40 items with requests and 4 items with Selenium

The items on this website are generated dynamically through lazy loading, so you have to scroll down and wait for the new elements to render within the HTML DOM. You can use the following solution (a scroll-height based variant is sketched after the console output below):

  • Code Block:

    import requests
    import webbrowser
    from bs4 import BeautifulSoup as bs
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.common.by import By
    from selenium.common.exceptions import NoSuchElementException, TimeoutException

    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36'}
    url = 'https://www.adidas.com/us/men-shoes-new_arrivals'
    res = requests.get(url, headers=headers)
    page_soup = bs(res.text, "html.parser")
    containers = page_soup.findAll("div", {"class": "gl-product-card-container show-variation-carousel"})
    print(len(containers))
    shoe_colors = []
    for container in containers:
        if container.find("div", {'class': 'gl-product-card__reviews-number'}) is not None:
            shoe_model = container.div.div.img["title"]
            review = container.find('div', {'class': 'gl-product-card__reviews-number'})
            review = int(review.text)
    options = Options()
    options.add_argument('start-maximized')
    options.add_argument('disable-infobars')
    options.add_argument('--disable-extensions')
    driver = webdriver.Chrome(chrome_options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
    driver.get(url)
    # wait for the first batch of price elements, then keep scrolling until
    # no new ones are rendered within 20 seconds
    myLength = len(WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "span.gl-price"))))
    while True:
        driver.execute_script("window.scrollBy(0,400)", "")
        try:
            WebDriverWait(driver, 20).until(lambda driver: len(driver.find_elements_by_css_selector("span.gl-price")) > myLength)
            titles = driver.find_elements_by_css_selector("span.gl-price")
            myLength = len(titles)
        except TimeoutException:
            break
    print(myLength)
    for title in titles:
        print(title.text)
    driver.quit()
    
  • Console Output:

    47
    $100 $100 $100 $100 $100 $100 $180 $180 $180 $180 $130 $180 $180 $130 $180 $130 $200 $180 $180 $130 $60 $100 $30 $65 $120 $100 $85 $180 $150 $130 $100 $100 $80 $100 $120 $180 $200 $130 $130 $100 $120 $120 $100 $180 $90 $140 $100
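
The count-based wait above stops as soon as no new span.gl-price elements appear within 20 seconds. A common alternative for lazy-loading pages, sketched below as a hedged variant rather than part of the original answer, is to keep scrolling to the bottom and stop once document.body.scrollHeight stops growing; the 2-second pause is an assumption and may need tuning for slower connections.

    import time

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    driver.get('https://www.adidas.com/us/men-shoes-new_arrivals')

    last_height = driver.execute_script('return document.body.scrollHeight')
    while True:
        # jump to the current bottom and give lazily loaded products time to render
        driver.execute_script('window.scrollTo(0, document.body.scrollHeight);')
        time.sleep(2)  # assumed delay; increase on slow connections
        new_height = driver.execute_script('return document.body.scrollHeight')
        if new_height == last_height:
            break  # page height stopped growing, so no more products are loading
        last_height = new_height

    prices = driver.find_elements(By.CSS_SELECTOR, 'span.gl-price')
    print(len(prices))
    for price in prices:
        print(price.text)
    driver.quit()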
