From fd60e00118d107e1d53fb57acc64aceb29628760 Mon Sep 17 00:00:00 2001
From: Jordan
Date: Thu, 10 Feb 2022 19:35:23 -0700
Subject: readme: document changes from upstream

---
 README.md | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index bcf1bba..07720f2 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,20 @@
 A very simple crawler
 =====================
 
+This is a fork of [crawl](https://git.autistici.org/ale/crawl) with
+changes that make crawl better suited to serve as a drop-in
+replacement for [wpull](https://github.com/ArchiveTeam/wpull)/
+[grab-site](https://github.com/ArchiveTeam/grab-site). Notable changes
+include:
+
+* --bind, make outbound requests from a particular local interface
+* --resume, resume a crawl from a directory containing saved state
+* infinite recursion depth by default
+* set the User-Agent to a Firefox-on-Windows fingerprint to look
+  more like a browser
+* store crawl contents in a dated directory
+* update the ignore regex set to track changes to [ArchiveBot](https://github.com/ArchiveTeam/ArchiveBot)
+
 This tool can crawl a bunch of URLs for HTML content, and save the
 results in a nice WARC file. It has little control over its traffic,
 save for a limit on concurrent outbound requests. An external tool
@@ -17,7 +31,7 @@ interrupted and restarted without issues.
 Assuming you have a proper [Go](https://golang.org/) environment setup,
 you can install this package by running:
 
-    $ go get git.autistici.org/ale/crawl/cmd/crawl
+    $ go get git.jordan.im/crawl/cmd/crawl
 
 This should install the *crawl* binary in your $GOPATH/bin directory.
 
@@ -82,5 +96,5 @@ Like most crawlers, this one has a number of limitations:
 
 # Contact
 
-Send bugs and patches to ale@incal.net.
+Send bugs and patches to me@jordan.im.
 
-- 
cgit v1.2.3-54-g00ecf
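
As an aside on the --bind change documented above: in Go, binding
outbound requests to a particular local address is typically done by
setting LocalAddr on the net.Dialer that backs the HTTP transport. The
sketch below is illustrative only, not the fork's actual code; it
assumes --bind takes a local IP address (the fork may instead accept
an interface name), and the flag wiring here is hypothetical.

    package main

    import (
    	"flag"
    	"fmt"
    	"net"
    	"net/http"
    	"time"
    )

    func main() {
    	// Hypothetical flag wiring; the fork's real option parsing may differ.
    	bind := flag.String("bind", "", "local IP address to make outbound requests from")
    	flag.Parse()

    	dialer := &net.Dialer{Timeout: 30 * time.Second}
    	if *bind != "" {
    		// Bind outbound TCP connections to the given local address.
    		// Port 0 lets the kernel pick an ephemeral source port.
    		// (Error handling for an unparseable IP is omitted here.)
    		dialer.LocalAddr = &net.TCPAddr{IP: net.ParseIP(*bind), Port: 0}
    	}

    	// All of the crawler's fetches would go through this client, so
    	// every connection is dialed from the chosen source address.
    	client := &http.Client{
    		Transport: &http.Transport{DialContext: dialer.DialContext},
    	}

    	resp, err := client.Get("https://example.com/")
    	if err != nil {
    		fmt.Println("fetch failed:", err)
    		return
    	}
    	defer resp.Body.Close()
    	fmt.Println("status:", resp.Status)
    }

The key point is simply that the source address is fixed on the dialer
before any connection is made, so no per-request code needs to change.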