How to Check Bookmark URLs in Parallel Using Playwright with External Browser Connection

  • Thread starter: 348705ochnui983245 (Guest)

I have a task where I need to validate up to 1000 bookmark URLs using Playwright. I need to check the validity of these URLs in parallel (ideally 3-5 concurrent checks). I'm using an external Firefox browser instance and connecting to it with Playwright.

I want to ensure the following:

  1. Only create the browser instance once.
  2. Check the URLs in parallel (3-8 concurrent checks).
  3. Efficiently handle up to 1000 URLs.
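
The three requirements above amount to bounded concurrency: one shared resource, many jobs, and only a few of them in flight at any moment. A minimal sketch of that pattern, independent of Playwright, could use `asyncio.Semaphore`. The names `run_limited` and `check_one` are made up for illustration; `check_one` stands for whatever coroutine actually validates a single URL.

```python
import asyncio


async def run_limited(urls, check_one, limit=5):
    """Run check_one(url) for every URL with at most `limit` checks in flight.

    Returns a dict mapping each URL to True/False. A check that raises
    is recorded as False instead of aborting the whole run.
    """
    semaphore = asyncio.Semaphore(limit)
    results = {}

    async def guarded(url):
        # The semaphore blocks here once `limit` checks are already running.
        async with semaphore:
            try:
                results[url] = bool(await check_one(url))
            except Exception:
                results[url] = False

    await asyncio.gather(*(guarded(url) for url in urls))
    return results
```

With this shape, changing the concurrency from 3 to 8 is a single argument, and 1000 URLs only ever hold `limit` checks open at once.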

Could someone provide guidance or a code snippet on how to achieve this? Either JavaScript or Python is fine.

Additional Information

I attempted to create multiple browser contexts, each handling a URL validation task using Playwright. My expectation was that each browser context would independently navigate to the specified URL and check its availability (HTTP status 200).
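
For the "HTTP status 200" part of that expectation, one option is to read the status from the response that `page.goto` returns, since `document.readyState` only says whether the document finished loading, not which status the server sent. A small sketch, where the helper name `url_is_ok` and the 15 second default timeout are my own assumptions:

```python
async def url_is_ok(page, url, timeout_ms=15000):
    """Return True if navigating `page` to `url` ends in an HTTP 200 response."""
    try:
        # goto() resolves with the main resource response (after redirects).
        response = await page.goto(
            url, timeout=timeout_ms, wait_until="domcontentloaded"
        )
    except Exception:
        # Timeouts, DNS errors, refused connections, closed targets, ...
        return False
    # goto() can return None, e.g. for about:blank.
    return response is not None and response.status == 200
```

Whether redirects or other 2xx codes should also count as valid is a policy decision; `response.ok` would accept the whole 200-299 range.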

My prototype:

Code:
import asyncio

from playwright.async_api import async_playwright


async def check_bookmark(context, url):
    page = await context.new_page()
    try:
        await page.goto(url)
        status = await page.evaluate('() => document.readyState')
        if status == 'complete':
            print(f"{url} is valid")
        else:
            print(f"{url} is invalid")
    except Exception as e:
        print(f"{url} is invalid: {str(e)}")
    finally:
        await page.close()

async def main():
    bookmark_urls = ["https://example.com"]

    async with async_playwright() as p:
        browser = await p.firefox.connect('ws://localhost:3000/playwright/firefox')
        
        contexts = []

        for i in range(0, 3):
            context = await browser.new_context()
            contexts.append(context)

        tasks = []
        num_contexts = len(contexts)

        for i in range(num_contexts):
            # Round-robin split, so no URL is dropped when the number of
            # URLs is not a multiple of the number of contexts.
            urls_subset = bookmark_urls[i::num_contexts]

            for url in urls_subset:
                tasks.append(check_bookmark(contexts[i], url))

        await asyncio.gather(*tasks)

        await browser.close()

if __name__ == '__main__':
    asyncio.run(main())

However, this always fails with the error: Target page, context or browser has been closed
The external browser instance is browserless (https://github.com/browserless/browserless).
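
One way the whole flow could be restructured is sketched below: connect once, create a single context, and let a fixed number of workers pull URLs from a queue, so only a handful of pages are open at a time instead of one per URL. This is a sketch under assumptions, not a confirmed fix for the error above. The helper names `check_url` and `check_all` are hypothetical, the WebSocket endpoint is copied from the question, and each page is closed defensively in case the remote connection has already dropped.

```python
import asyncio


async def check_url(context, url):
    """Open a fresh page, load `url`, and return (url, ok, detail)."""
    page = await context.new_page()
    try:
        response = await page.goto(url, wait_until="domcontentloaded")
        if response is None:
            return url, False, "no response"
        return url, response.status == 200, f"HTTP {response.status}"
    except Exception as exc:  # timeouts, DNS failures, closed targets, ...
        return url, False, str(exc)
    finally:
        try:
            await page.close()
        except Exception:
            pass  # the page may already be gone if the connection dropped


async def check_all(browser, urls, concurrency=5):
    """Check every URL using `concurrency` workers that share one queue."""
    queue = asyncio.Queue()
    for url in urls:
        queue.put_nowait(url)

    context = await browser.new_context()
    results = []

    async def worker():
        # Each worker handles one URL at a time until the queue is empty.
        while True:
            try:
                url = queue.get_nowait()
            except asyncio.QueueEmpty:
                return
            results.append(await check_url(context, url))

    try:
        await asyncio.gather(*(worker() for _ in range(concurrency)))
    finally:
        await context.close()
    return results


async def main():
    # Imported here so the helpers above can be exercised without Playwright.
    from playwright.async_api import async_playwright

    urls = ["https://example.com"]  # placeholder list
    async with async_playwright() as p:
        # Endpoint taken from the question; adjust it for your setup.
        browser = await p.firefox.connect("ws://localhost:3000/playwright/firefox")
        try:
            for url, ok, detail in await check_all(browser, urls, concurrency=5):
                print(f"{url}: {'valid' if ok else 'invalid'} ({detail})")
        finally:
            await browser.close()
```

Run it with `asyncio.run(main())`. If the "has been closed" error still appears, one thing worth checking (an assumption on my part, not verified against this setup) is whether the remote browser service ends sessions after a timeout, since that would close the browser while checks are still running.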
 