path: root/cmd
Age  Commit message  Author
2022-03-24  misc: update handler signatures, tests, housekeeping  Jordan
2022-03-24  links, crawl: dramatically reduce memory usage  Jordan
2022-02-14  client, crawl: fix/simplify net.Dialer overrides  Jordan
2022-02-14  crawl, readme: record assembled seed URLs to seed_urls file  Jordan
2022-02-10  crawl, readme: max default WARC size 100 MB -> 5 GB  Jordan
2022-02-10  misc: update crawl paths to reflect fork location  Jordan
2022-02-10  client, crawl: --bind, support making outbound requests from a particular add...  Jordan
2022-02-10  crawl: set User-Agent header to appear like Firefox on Windows  Jordan
2022-02-10  crawl: include crawl start date in directory name  Jordan
2022-02-10  crawl: create new directory to store crawl contents, resume param  Jordan
2022-02-10  crawl, scope: recurse infinitely by default  Jordan
2020-08-26  Minor logging fixes  ale
2020-08-23  Fix the crawl.go tests  ale
2020-08-23  Allow setting DNS overrides using the --resolve option  ale
2020-07-30  Retry requests on transport-level errors  ale
2020-02-17  Fix the Handler in cmd/links  ale
2020-02-17  Propagate the link tag through redirects  ale
2019-01-20  Refactor Handlers in terms of a Publisher interface  ale
2019-01-02  Add multi-file output  ale
2018-12-06  Apply --excludes to related resources too  ale
2018-09-02  Add --exclude and --exclude-file options  ale
2018-08-31  Add a simple test for the full WARC crawler  ale
2018-08-31  Explicitly delegate retry logic to handlers  ale
2018-08-31  Improve error handling, part two  ale
2018-08-31  Improve error checking  ale
2017-12-19  Provide better defaults for command-line options  ale
2017-12-19  Exit gracefully on signals  ale
2017-12-19  Use a global http.Client with sane settings  ale
2017-12-19  Update cmd/links to new scope syntax  ale
2017-12-19  Add tags (primary/related) to links  ale
2015-07-03  minor golint fixes  ale
2015-06-29  clean up the state directory when done  ale
2015-06-28  add ignore list from ArchiveBot  ale
2014-12-20  move URLInfo logic into the Crawler itself  ale
2014-12-20  make Scope checking more modular  ale
2014-12-20  move link extraction to a common location  ale
2014-12-20  move the WARC code into its own package  ale
2014-12-19  initial commit  ale