Age | Commit message | Author |
|
To prevent excessive memory usage and OOM crashes, store response bodies
temporarily on the filesystem (wget-style) and delete them once processed,
rather than storing and passing them around in memory buffers.
|
Detect write errors (both to the database and to the WARC output) and
abort with an error message.
Also fix a bunch of harmless lint warnings.
|
This change allows more complex scope boundaries, including loosening
edges a bit to include related resources of HTML pages (which makes
for more complete archives if desired).
|
The queuing code now performs proper lease accounting, and it will not
return a URL twice if the page load is slow.
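
A minimal in-memory sketch of lease accounting, assuming a queue that tracks which URLs are currently handed out to a worker. The names `leaseQueue`, `lease`, `done`, and `retry` are hypothetical, and the real queue presumably persists its state in the database rather than in memory.

```go
package main

import (
	"fmt"
	"sync"
)

// leaseQueue (hypothetical) hands out each URL to at most one worker
// at a time. A leased URL stays out of circulation until the worker
// calls done or retry, however long the page load takes.
type leaseQueue struct {
	mu      sync.Mutex
	pending []string
	leased  map[string]bool
}

func newLeaseQueue(urls ...string) *leaseQueue {
	return &leaseQueue{
		pending: append([]string(nil), urls...),
		leased:  make(map[string]bool),
	}
}

// lease returns the next pending URL and marks it as leased.
// It returns false when nothing is pending.
func (q *leaseQueue) lease() (string, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.pending) == 0 {
		return "", false
	}
	u := q.pending[0]
	q.pending = q.pending[1:]
	q.leased[u] = true
	return u, true
}

// done releases the lease after successful processing.
func (q *leaseQueue) done(u string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	delete(q.leased, u)
}

// retry releases the lease and puts the URL back in the queue.
// It is a no-op for URLs that are not currently leased.
func (q *leaseQueue) retry(u string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.leased[u] {
		delete(q.leased, u)
		q.pending = append(q.pending, u)
	}
}

func main() {
	q := newLeaseQueue("https://example.com/a", "https://example.com/b")
	first, _ := q.lease()
	second, _ := q.lease()
	_, ok := q.lease() // both URLs are leased, so nothing is returned
	fmt.Println(first, second, ok)
	q.done(first)
	q.retry(second)
	again, _ := q.lease()
	fmt.Println(again)
}
```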
|