OiO.lk Community platform!

Selenium WebDriver returns empty DataFrame when scraping CoinGecko in headless mode

  • Thread starter: HamidBee (Guest)

I'm trying to scrape Bitcoin market data from CoinGecko using Selenium in headless mode, but the script returns an empty DataFrame. The table rows are not detected even though I've added a wait. Below is a simplified version of the code I use to set up the WebDriver, navigate to the page, and extract the table data with XPath. The logs indicate that the requests are made correctly, but no elements are found. What could be causing this, and how can I make sure the table data is scraped correctly in headless mode?

Code:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
import pandas as pd
import time

# Path to your ChromeDriver
chrome_driver_path = 'C:\\Users\\hamid\\OneDrive\\Desktop\\chromedriver-win64\\chromedriver.exe'

# Set up headless mode
options = Options()
options.headless = True
options.add_argument("--window-size=1920,1080")

# Set up the WebDriver
driver = webdriver.Chrome(executable_path=chrome_driver_path, options=options)

# Navigate to the CoinGecko Bitcoin page
driver.get('https://www.coingecko.com/en/coins/bitcoin')

# Wait for the page to load
time.sleep(5)

# Extract data from the page
rows = driver.find_elements(By.XPATH, '//table[@class="table"]/tbody/tr')
market_data = []

for row in rows:
    exchange = row.find_element(By.XPATH, './/td[2]/a').text
    pair = row.find_element(By.XPATH, './/td[3]/a/b').text
    price = row.find_element(By.XPATH, './/td[4]/span').text
    volume_24h = row.find_element(By.XPATH, './/td[5]/span').text
    volume_percentage = row.find_element(By.XPATH, './/td[6]').text
    category = row.find_element(By.XPATH, './/td[7]').text
    updated = row.find_element(By.XPATH, './/td[8]').text

    market_data.append({
        'exchange': exchange,
        'pair': pair,
        'price': price,
        'volume_24h': volume_24h,
        'volume_percentage': volume_percentage,
        'category': category,
        'updated': updated
    })

# Close the WebDriver
driver.quit()

# Convert to DataFrame
df = pd.DataFrame(market_data)
print(df)

When I run the script, I get the following output:

Code:
Empty DataFrame
Columns: []
Index: []
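
An empty DataFrame with no columns means `find_elements` returned an empty list, so the loop body never ran. The following is only a diagnostic sketch, not a confirmed fix: it replaces the fixed `time.sleep(5)` with an explicit wait and, if no rows appear, saves what headless Chrome actually received so it can be compared with the page in a normal browser. It assumes Selenium 4.6 or later (which can locate ChromeDriver itself) and a Chrome version that accepts `--headless=new`. The names `build_headless_args` and `fetch_rows`, the output file names, and the looser default XPath are hypothetical. Whether CoinGecko serves different markup to headless browsers, or uses a different class on the table, is not known here.

```python
def build_headless_args(user_agent=None):
    """Return Chrome command-line flags for a headless session (hypothetical helper)."""
    args = ["--headless=new", "--window-size=1920,1080"]
    if user_agent:
        # Optional: send an explicit user agent instead of the headless default.
        args.append(f"--user-agent={user_agent}")
    return args


def fetch_rows(url, row_xpath="//table//tbody/tr", timeout=20, user_agent=None):
    """Return the text of every <td> per matched row, or [] if no rows show up in time."""
    # Selenium is imported here so build_headless_args works without it installed.
    from selenium import webdriver
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    for arg in build_headless_args(user_agent):
        options.add_argument(arg)

    # Assumption: Selenium 4.6+ resolves the driver binary, so no path is passed.
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        try:
            # Wait until at least one row is present instead of sleeping a fixed time.
            rows = WebDriverWait(driver, timeout).until(
                EC.presence_of_all_elements_located((By.XPATH, row_xpath))
            )
        except TimeoutException:
            # Nothing matched: keep a screenshot and the HTML for inspection.
            driver.save_screenshot("headless_page.png")
            with open("headless_page.html", "w", encoding="utf-8") as fh:
                fh.write(driver.page_source)
            return []
        # Read the cell text before the driver is closed.
        return [
            [cell.text for cell in row.find_elements(By.XPATH, "./td")]
            for row in rows
        ]
    finally:
        driver.quit()
```

Calling `fetch_rows('https://www.coingecko.com/en/coins/bitcoin')` and passing the result to `pd.DataFrame` would show whether any rows are found. If the list is still empty, the saved `headless_page.html` should indicate whether the table is missing entirely or simply does not match the XPath.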