LookupError: unknown encoding: 'b'utf8''

I don’t know why, but I am getting a LookupError with the message unknown encoding: 'b'utf8'' when I try to scrape and parse a Walmart product page.

I have already set the encoding to utf-8 and also tried removing the BOM, as suggested in this post: lxml LookupError occured. Arguments: ("unknown encoding: 'b'utf-8-sig''",).

Appreciate any help or pointers!

Complete code:

import httpx
from parsel import Selector
import json

# Fake browser-like headers
BASE_HEADERS = {
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36",
    "accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
    "accept-language": "en-US,en;q=0.9",
    "accept-encoding": "gzip, deflate, br",
}

response = httpx.get("https://www.walmart.com/product-page-url", headers=BASE_HEADERS)
if response.encoding is None:
    response.encoding = 'utf-8' 

# Remove BOM if present
content = response.content
if content.startswith(b'\xef\xbb\xbf'):
    content = content[3:]  # Remove the BOM

response_text = content.decode('utf-8')
sel = Selector(text=response_text)
data = sel.xpath('//script[@id="__NEXT_DATA__"]/text()').get()

if data:
    data = json.loads(data)
    product = data["props"]["pageProps"]["initialData"]["data"]["product"]
    print(product)
else:
    print("No product data found.")
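
As a side note on the BOM handling above: the manual startswith check can probably be replaced by Python's built-in "utf-8-sig" codec, which skips a leading BOM during decoding. The sketch below is only an illustration of that decoding step. The helper name decode_utf8_dropping_bom is hypothetical, the sample bytes are made up, and it is not expected to resolve the LookupError on its own.

```python
def decode_utf8_dropping_bom(raw: bytes) -> str:
    """Decode UTF-8 bytes, discarding a leading BOM if one is present.

    Hypothetical helper for illustration; not part of httpx or parsel.
    """
    # The standard-library "utf-8-sig" codec skips a leading EF BB BF on decode
    # and behaves like plain "utf-8" when no BOM is there.
    return raw.decode("utf-8-sig")


# Made-up sample payloads, with and without a BOM
with_bom = b"\xef\xbb\xbf<html></html>"
without_bom = b"<html></html>"

# Both inputs decode to the same string
print(decode_utf8_dropping_bom(with_bom) == decode_utf8_dropping_bom(without_bom))
```

In the code above this would stand in for the startswith check and the content.decode('utf-8') call, for example response_text = decode_utf8_dropping_bom(response.content).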
