Age | Commit message | Author |
|
To prevent excessive memory usage and OOM crashes, store response bodies
temporarily on the filesystem, wget-style, and delete them once they are
processed, rather than storing and passing them around in memory buffers.
|
backreferences
|
address
|
Update module github.com/PuerkitoBio/goquery to v1.7.1
See merge request ale/crawl!3
|
Update module github.com/google/go-cmp to v0.5.6
See merge request ale/crawl!5
|
Update module github.com/pborman/uuid to v1.2.1
See merge request ale/crawl!2
|
Update module github.com/PuerkitoBio/purell to v0.1.0
See merge request ale/crawl!4
|
Configure Renovate
See merge request ale/crawl!1
|
This is an internal inconsistency that should be investigated.
|
This allows users of crawl-as-a-library to recover from unexpected
errors as a last resort.
|
In order to do this we have to plumb it through the queue and the
Handler interface, but it should allow fetches of the resources
associated with a page via the IncludeRelatedScope even if it's behind
a redirect.
|