author Jordan <me@jordan.im> 2022-03-24 09:08:13 -0700
committer Jordan <me@jordan.im> 2022-03-24 09:08:13 -0700
commit 6355aa4310ff0c32b056580e812ca6f0e2a5ee2f
tree 3a3008d4d50e5e19f6805b1e1e03460e202048f9
parent a39310f111cef49ff630cc12fdebabc4df37ec28
links, crawl: dramatically reduce memory usage
To prevent excessive memory usage and OOM crashes, store response bodies temporarily on the filesystem, wget-style, and delete them once processed, rather than storing and passing them around in memory buffers.
Diffstat (limited to 'README.md')
-rw-r--r-- README.md | 2
1 files changed, 2 insertions, 0 deletions
diff --git a/README.md b/README.md
index 128088c..d187cab 100644
--- a/README.md
+++ b/README.md
@@ -6,6 +6,8 @@ which make crawl more amenable to serve as a drop-in replacement for
[wpull](https://github.com/ArchiveTeam/wpull)/[grab-site](https://github.com/ArchiveTeam/grab-site).
Notable changes include:
+* dramatically reduce memory usage; (temporarily) write responses to
+ the filesystem rather than pass data around in memory buffers
* --bind, support making outbound requests from a particular interface
* --resume, directory containing the crawl state to continue from
* infinite recursion depth by default