
Using BeautifulSoup to get_text of td Tags Within a ResultSet

I am extracting table data using BeautifulSoup from this website: https://afltables.com/afl/stats/teams/adelaide/2018_gbg.html There are many tables, each with a unique table id, that I h

Solution 1:

You might find the following approach a bit easier:

import pandas as pd

# read_html returns one DataFrame per <table> element on the page
tables = pd.read_html("https://afltables.com/afl/stats/teams/adelaide/2018_gbg.html")

for df in tables:
    df.drop(df.columns[9:], axis=1, inplace=True)   # remove unwanted columns
    df.columns = df.columns.droplevel(0)    # remove extra index level

for table in tables:
    print(table.head(3), '\n')  # show first 3 rows

This gives you a list of pandas DataFrames, one per table on the page, each holding all the data from its table. The output starts like this, with the first table containing Disposals:

         Player    R1    R2    R3    R4    R5    R6    R7  Tot
0  Atkins, Rory  14.0  17.0  22.0  28.0  24.0  28.0  16.0  149
1  Betts, Eddie  14.0  20.0  16.0   6.0   NaN   NaN  10.0   66
2   Brown, Luke  15.0  23.0  23.0  16.0  16.0  24.0  11.0  128 

         Player    R1    R2    R3    R4    R5    R6    R7  Tot
0  Atkins, Rory   8.0  13.0  12.0  16.0  17.0  18.0  10.0   94
1  Betts, Eddie   7.0   6.0  10.0   2.0   NaN   NaN   7.0   32
2   Brown, Luke  10.0  17.0  17.0  10.0  11.0  16.0   9.0   90

You could then use pandas to work with the data.
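
As a rough illustration of that last step, here is a minimal sketch that summarises each player's disposals. To stay runnable without network access it rebuilds the three Disposals rows shown above by hand instead of using tables[0]. The names disposals, round_cols and summary are made up for this example, and treating NaN as "did not play that round" is an assumption about the data rather than something the page states.

```python
import numpy as np
import pandas as pd

# Hand-built copy of the first three Disposals rows shown above,
# standing in for tables[0] so the sketch runs offline.
disposals = pd.DataFrame(
    {
        "Player": ["Atkins, Rory", "Betts, Eddie", "Brown, Luke"],
        "R1": [14.0, 14.0, 15.0],
        "R2": [17.0, 20.0, 23.0],
        "R3": [22.0, 16.0, 23.0],
        "R4": [28.0, 6.0, 16.0],
        "R5": [24.0, np.nan, 16.0],
        "R6": [28.0, np.nan, 24.0],
        "R7": [16.0, 10.0, 11.0],
        "Tot": [149, 66, 128],
    }
).set_index("Player")

# Round columns are the ones named R1, R2, ...
round_cols = [c for c in disposals.columns if c.startswith("R")]

# count() and mean() skip NaN by default, so rounds with no value
# (assumed here to be missed games) do not drag the average down.
summary = pd.DataFrame(
    {
        "games": disposals[round_cols].count(axis=1),
        "average": disposals[round_cols].mean(axis=1).round(1),
        "total": disposals["Tot"],
    }
)

print(summary)
```

With the real data you would replace the hand-built frame with tables[0] after the cleanup loop and set "Player" as the index in the same way.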
