
Scraping the hulkapps table using Selenium or Beautiful Soup


I have this URL that I am trying to scrape: https://papemelroti.com/products/live-free-badge

But it seems that I can’t find this table by its class:

<table class="hulkapps-table table"><thead><tr><th style="border-top-left-radius: 0px;">Quantity</th><th style="border-top-right-radius: 0px;">Bulk Discount</th><th style="display: none">Add to Cart</th></tr></thead><tbody><tr><td style="border-bottom-left-radius: 0px;">Buy 50 +   <span class="hulk-offer-text"></span></td><td style="border-bottom-right-radius: 0px;"><span class="hulkapps-price"><span class="money"><span class="money"> ₱1.00 </span></span> Off</span></td><td style="display: none;"><button type="button" class="AddToCart_0" style="cursor: pointer; font-weight: 600; letter-spacing: .08em; font-size: 11px; padding: 5px 15px; border-color: #171515; border-width: 2px; color: #ffffff; background: #161212;" onclick="add_to_cart(50)">Add to Cart</button></td></tr></tbody></table>
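To rule out the selector itself, I parsed the snippet above directly with BeautifulSoup (the snippet here is trimmed to the parts that matter) — it finds the table fine, so the class name isn’t the problem:

```python
from bs4 import BeautifulSoup

# Trimmed copy of the table markup from the page
snippet = """
<table class="hulkapps-table table">
  <thead><tr><th>Quantity</th><th>Bulk Discount</th></tr></thead>
  <tbody><tr>
    <td>Buy 50 +</td>
    <td><span class="hulkapps-price">₱1.00 Off</span></td>
  </tr></tbody>
</table>
"""

soup = BeautifulSoup(snippet, "html.parser")
# class_='hulkapps-table' matches because bs4 treats class as multi-valued
table = soup.find("table", class_="hulkapps-table")
print(table is not None)  # True
```

So the lookup works when the table is actually in the HTML — which makes me think the table just isn’t present in the page source Selenium hands back.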

I already have my Selenium code, but it still isn’t finding the table. Here’s my code:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from bs4 import BeautifulSoup
import time

# Set up Chrome options
chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument("--disable-dev-shm-usage")

service = Service('/usr/local/bin/chromedriver')  # Adjust path if necessary
driver = webdriver.Chrome(service=service, options=chrome_options)

def get_page_html(url):
    driver.get(url)
    time.sleep(3)  # Wait for JS to load
    return driver.page_source

def scrape_discount_quantity(url):
    page_html = get_page_html(url)
    soup = BeautifulSoup(page_html, "html.parser")

    # Locate the table containing the quantity and discount
    table = soup.find('table', class_='hulkapps-table')
    print(page_html)  # Debug: inspect the HTML Selenium actually received

    if table:
        table_rows = table.find_all('tr')
        for row in table_rows:
            quantity_cells = row.find_all('td')
            if len(quantity_cells) >= 2:  # Check if there are at least two cells
                quantity_cell = quantity_cells[0].get_text(strip=True)  # Get quantity text
                discount_cell = quantity_cells[1].get_text(strip=True)  # Get discount text
                return quantity_cell, discount_cell
    return None, None

# Example usage
url = "https://papemelroti.com/products/live-free-badge"
quantity, discount = scrape_discount_quantity(url)
print(f"Quantity: {quantity}, Discount: {discount}")

driver.quit()  # Close the browser when done

It keeps returning None.

