ants
An open source, distributed, RESTful crawler engine.
scrapy
This project is based on Scrapy. The crawler part reuses most of Scrapy's code.
elasticsearch
I designed the distributed architecture by imitating Elasticsearch, which inspired me to build an engine for distributed crawling.
basic components
- cluster
- node
- webservice
- schedule
- crawl
- engine
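
To give a rough idea of how the cluster, node and schedule components could fit together, here is a minimal sketch. It is not code from this project: the `Node` and `Cluster` classes, the `node_for` and `schedule` methods, and the hash-based assignment are all assumptions made for illustration only.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical cluster member holding its own queue of URLs to crawl."""
    name: str
    queue: list = field(default_factory=list)


@dataclass
class Cluster:
    """Hypothetical cluster: a group of nodes that share the crawl work."""
    nodes: list

    def node_for(self, url: str) -> Node:
        # Use a stable hash so the same URL always maps to the same node.
        digest = hashlib.md5(url.encode("utf-8")).hexdigest()
        return self.nodes[int(digest, 16) % len(self.nodes)]

    def schedule(self, urls):
        # Hand each URL to exactly one node's queue.
        for url in urls:
            self.node_for(url).queue.append(url)


cluster = Cluster([Node("node-1"), Node("node-2"), Node("node-3")])
urls = [f"http://example.com/page/{i}" for i in range(10)]
cluster.schedule(urls)

for node in cluster.nodes:
    print(node.name, len(node.queue))
```

In this sketch every URL lands on exactly one node, so no page is crawled twice. A real scheduler would also have to handle nodes joining and leaving the cluster, which this example ignores.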
job schedule
You can find the job schedule in the project's issues.
user guide
The user guide will be published in the wiki, so please look forward to it.