
Crawley

Link parkin’: Crawley Project

Crawley is a Pythonic Crawling / Scraping framework intended to change the way you think about extracting data from the internet.

User-grade, commodity-cost focused crawlers have never actually materialized, but I’m still hoping they become a reality. It seems like the tech has caught up with the concept. Maybe something like Crawley plus Python’s copious machine learning and data mining toolkits could provide the foundation.

I believe one of the key tricks is not actually telling folks you’re exploiting focused crawling.

© 2008-2024 C. Ross Jam. Built using Pelican. Theme based upon Giulio Fidente’s original svbhack, and slightly modified by crossjam.