author | Jordan <me@jordan.im> | 2022-03-24 09:08:13 -0700 |
---|---|---|
committer | Jordan <me@jordan.im> | 2022-03-24 09:08:13 -0700 |
commit | 6355aa4310ff0c32b056580e812ca6f0e2a5ee2f (patch) | |
tree | 3a3008d4d50e5e19f6805b1e1e03460e202048f9 /README.md | |
parent | a39310f111cef49ff630cc12fdebabc4df37ec28 (diff) | |
links, crawl: dramatically reduce memory usage
To prevent excessive memory usage and OOM crashes, store response bodies
temporarily on the filesystem, wget-style, and delete them once processed,
rather than storing and passing them around in memory buffers.
Diffstat (limited to 'README.md')
-rw-r--r-- | README.md | 2 |
1 files changed, 2 insertions, 0 deletions
@@ -6,6 +6,8 @@ which make crawl more amenable to serve as a drop-in replacement for
 [wpull](https://github.com/ArchiveTeam/wpull)/[grab-site](https://github.com/ArchiveTeam/grab-site).
 Notable changes include:
 
+* dramatically reduce memory usage; (temporarily) write responses to
+  the filesystem rather than pass data around in memory buffers
 * --bind, support making outbound requests from a particular interface
 * --resume, directory containing the crawl state to continue from
 * infinite recursion depth by default