How to Extract Text from Nested Tags in BeautifulSoup Loop?


I’m trying to scrape metadata from https://yellowpages.com.eg/en/category/abrasives using Selenium and BeautifulSoup. I can extract some of the data, but inside the loop I’m having trouble getting the text of an a tag nested inside a div. Here’s my current code:

[We specialize in surface treatment using abrasive products. We can offer you consultations on how to use the product and what is the best system to achieve (matte/mirror) finish]



import time

from bs4 import BeautifulSoup
from selenium import webdriver

base_url = "https://yellowpages.com.eg"  # assumed from the URL in the question
pagecount = 1
driver = webdriver.Chrome()
driver.implicitly_wait(10)  # set before loading so element lookups can wait
page_url = f"{base_url}/en/category/abrasives/p{pagecount}"
driver.get(page_url)
time.sleep(1)  # give the page a moment to render before reading the source
page_source = driver.page_source
driver.quit()  # the browser is no longer needed once the HTML is captured
bs = BeautifulSoup(page_source, 'html.parser')
divs = bs.find_all('div', class_='col-xs-12 item-details')
for div in divs:
    img_tag = div.find('img')
    if img_tag:
        # .get() returns None instead of raising KeyError if data-src is missing
        img_src = img_tag.get('data-src')
        print(img_src)
    title = div.find('a', class_='item-title').text.strip()
    print(title)
    address = div.find('a', class_='address-text').find('span').text.strip()
    print(address)
    # find_all returns a list of Tag objects, so this prints markup, not text
    descriptions = div.find_all('div', class_='item-aboutUs')
    print(descriptions)

Issue:
I want to ensure that I’m correctly extracting the text from the a tag inside the item-aboutUs div. Is there a better way to handle this, especially if there are multiple item-aboutUs divs?
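
One possible approach, sketched under assumptions: a CSS selector such as `div.item-aboutUs a` matches every a tag nested in any item-aboutUs div, and `get_text(strip=True)` returns its text. The `SAMPLE_HTML` markup and the `extract_descriptions` helper below are hypothetical, made up to resemble one listing card. The real page structure may differ, so treat this as a starting point rather than a confirmed fix.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: assumed to resemble one listing card on the page.
SAMPLE_HTML = """
<div class="col-xs-12 item-details">
  <a class="item-title" href="/en/profile/example">Example Abrasives Co.</a>
  <div class="item-aboutUs"><a href="/en/profile/example"> We specialize in surface treatment. </a></div>
  <div class="item-aboutUs"><a href="/en/profile/example">Second description.</a></div>
  <div class="item-aboutUs"></div>
</div>
"""


def extract_descriptions(card):
    """Return the stripped text of every <a> nested in an item-aboutUs div."""
    texts = []
    for link in card.select("div.item-aboutUs a"):
        text = link.get_text(strip=True)
        if text:  # skip links that contain no text
            texts.append(text)
    return texts


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
for card in soup.select("div.col-xs-12.item-details"):
    print(extract_descriptions(card))
```

Because the helper returns a list, a card with several item-aboutUs divs yields one entry per link, and a card with none yields an empty list instead of raising an AttributeError.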


