
Playwright, Python, Requests-HTML

I was trying to lift and shift a web scraping script from my laptop to a Linux VM. The script uses Requests-HTML and works fine on my machine. On the VM, not so much. I just couldn’t get the right Ubuntu dependencies installed for a headless Chrome browser to integrate correctly.

Enter Requests-HTML-Playwright, which uses playwright-python to wrap the Playwright library. Should do the trick, right? Wrong. The HTMLSession from the requests_html_playwright module always failed.

HOWEVER! Installing playwright-python apparently brings in the correct Ubuntu packages to make plain Requests-HTML operate properly. Go figure.
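For anyone retracing this, here is a rough sketch of how the VM could be checked, not the actual script from this post. It assumes the browser dependencies are already on the box; one plausible way to get them is Playwright's `playwright install --with-deps`, though that exact command is my assumption. The helper names `missing_modules` and `rendered_title` are made up for illustration. The Requests-HTML calls (`HTMLSession`, `html.render()`, `html.find()`) are the library's regular ones.

```python
import importlib.util


def missing_modules(names):
    """Return the entries of `names` that cannot be imported here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def rendered_title(url):
    """Fetch `url`, render its JavaScript headlessly, and return the <title> text."""
    # Imported lazily so missing_modules() stays usable when the library is absent.
    from requests_html import HTMLSession

    session = HTMLSession()
    try:
        response = session.get(url)
        # render() launches headless Chromium; this is the step that
        # needs the right OS packages on the VM.
        response.html.render()
        title = response.html.find("title", first=True)
        return title.text if title is not None else None
    finally:
        session.close()
```

On the VM, `missing_modules(["requests_html", "playwright"])` should come back empty once both packages are installed. After that, `rendered_title("https://example.com")` is a quick smoke test that rendering actually works.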

Victory?

© 2008-2024 C. Ross Jam. Built using Pelican. Theme based upon Giulio Fidente’s original svbhack, and slightly modified by crossjam.