Production-ready project

Squirrel is a Data Web Crawler

About the project

Squirrel is a crawler for Linked Data that can be used to exploit the content of the Linked Data web. Starting from a set of initial seeds, it follows all the links it finds to gather more Linked Data, using a load-balancing strategy. Squirrel has an extensible architecture that lets users customise the type of content to be crawled. An example of this extensibility is Squirrel's scraper component, which transforms structured and semi-structured data from HTML into RDF. The distributed architecture of Squirrel allows users to scale the crawler to their needs.
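
To make the seed-and-follow idea concrete, here is a minimal, illustrative sketch in Python. It is not Squirrel's implementation or API: the `crawl` function, the `fetch_links` callback, the per-host round-robin assignment, and the `example` URIs are all assumptions chosen for illustration. It starts from seed URIs, follows every newly discovered link once, and spreads hosts across a fixed number of workers as a simple stand-in for a load-balancing strategy.

```python
from collections import deque
from urllib.parse import urlparse


def crawl(seeds, fetch_links, num_workers=2):
    """Breadth-first crawl from `seeds` (illustrative only).

    `fetch_links(uri)` is a hypothetical callback that returns the URIs
    found in the document at `uri`. Returns a dict mapping each worker
    index to the list of URIs assigned to that worker.
    """
    frontier = deque()
    seen = set()
    for seed in seeds:
        if seed not in seen:
            seen.add(seed)
            frontier.append(seed)

    host_to_worker = {}
    assignments = {worker: [] for worker in range(num_workers)}

    while frontier:
        uri = frontier.popleft()
        host = urlparse(uri).netloc
        # Assumed strategy: each newly seen host goes to the next worker
        # in round-robin order, and all URIs of that host stay with it.
        worker = host_to_worker.setdefault(host, len(host_to_worker) % num_workers)
        assignments[worker].append(uri)

        # Follow every link that has not been queued before.
        for link in fetch_links(uri):
            if link not in seen:
                seen.add(link)
                frontier.append(link)

    return assignments


# A tiny in-memory "web" standing in for real HTTP requests.
graph = {
    "http://a.example/1": ["http://a.example/2", "http://b.example/1"],
    "http://a.example/2": ["http://a.example/1"],
    "http://b.example/1": ["http://c.example/1"],
    "http://c.example/1": [],
}

result = crawl(["http://a.example/1"], lambda uri: graph.get(uri, []))
```

Keeping all URIs of one host on the same worker is only one possible design choice; it makes per-host politeness easy to enforce, but the actual strategy used by Squirrel may differ.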
