Age         Commit message                                        Author
2019-10-07  Add a vendor dependency used for tests                ale
2019-10-07  Parse links in inline style blocks                    ale
2019-09-26  Switch to latest Go image for CI test                 ale
2019-09-26  Add Go module support                                 ale
2019-09-26  Update vendored dependencies                          ale
2019-01-20  Refactor Handlers in terms of a Publisher interface   ale
2019-01-19  Replace URLInfo with a simple URL presence check      ale
2019-01-02  Add multi-file output                                 ale
2018-12-28  Updated dependencies                                  ale
2018-12-27  Normalize URLs before checking if they are in scope   ale
2018-12-27  Merge branch 'master' of git.autistici.org:ale/crawl  ale
2018-12-06  Apply --excludes to related resources too             ale
2018-09-02  Fix typo                                              ale
2018-09-02  Explicitly mention the crawler limitations            ale
2018-09-02  Add --exclude and --exclude-file options              ale
2018-09-02  Minimal support for <video> and <object> tags         ale
2018-08-31  Do not drop /index.html at the end of URLs            ale
2018-08-31  Add a simple test for the full WARC crawler           ale
2018-08-31  Explicitly delegate retry logic to handlers           ale
2018-08-31  Improve error handling, part two                      ale
2018-08-31  Use a buffered Writer for WARC output                 ale
2018-08-31  Improve error checking                                ale
2018-08-31  Update dependencies                                   ale
2018-08-30  Mention trickle as a possible bandwidth limiter       ale
2018-08-30  Improve install instructions a bit more               ale
2018-08-30  Update installation instructions                      ale
2017-12-19  Provide better defaults for command-line options      ale
2017-12-19  Merge branch 'master' of git.autistici.org:ale/crawl  ale
2017-12-19  Exit gracefully on signals                            ale
2017-12-19  Add a README                                          ale
2017-12-19  Use a global http.Client with sane settings           ale
2017-12-19  Crawl IFRAMEs as related resources                    ale
2017-12-19  Simplify redirectHandler.Handle                       ale
2017-12-19  Add license                                           ale
2017-12-19  Update cmd/links to new scope syntax                  ale
2017-12-19  Skip data: URLs                                       ale
2017-12-19  Add tags (primary/related) to links                   ale
2017-12-18  Add CI configuration (test only)                      ale
2017-12-18  Add support for @import syntax in css                 ale
2017-12-18  Update location of the uuid package                   ale
2017-12-18  Add vendor deps                                       ale
2017-12-18  Switch to github.com/syndtr/goleveldb                 ale
2015-07-03  minor golint fixes                                    ale
2015-06-29  clean up the state directory when done                ale
2015-06-29  improve queue code; golint fixes                      ale
2015-06-28  add ignore list from ArchiveBot                       ale
2015-06-28  fix timestamp format                                  ale
2014-12-20  move URLInfo logic into the Crawler itself            ale
2014-12-20  add a prefix iterator to gobDb                        ale
2014-12-20  add tests to scope.go                                 ale