Diffstat (limited to 'spec/control-spec/message-format.md')
-rw-r--r-- spec/control-spec/message-format.md | 185
1 files changed, 185 insertions, 0 deletions
diff --git a/spec/control-spec/message-format.md b/spec/control-spec/message-format.md
new file mode 100644
index 0000000..a9af669
--- /dev/null
+++ b/spec/control-spec/message-format.md
@@ -0,0 +1,185 @@
+<a id="control-spec.txt-2"></a>
+
+# Message format
+
+<a id="control-spec.txt-2.1"></a>
+
+## Description format
+
+The message formats listed below use ABNF as described in RFC 2234.
+The protocol itself is loosely based on SMTP (see RFC 2821).
+
+We use the following nonterminals from RFC 2822: atom, qcontent.
+
+We define the following general-use nonterminals:
+
+QuotedString = DQUOTE \*qcontent DQUOTE
+
+There are explicitly no limits on line length. All 8-bit characters
+are permitted unless explicitly disallowed. In QuotedStrings,
+backslashes and quotes must be escaped; other characters need not be
+escaped.
+
+Wherever CRLF is specified to be accepted from the controller, Tor MAY also
+accept LF. Tor, however, MUST NOT generate LF instead of CRLF.
+Controllers SHOULD always send CRLF.
+
+<a id="control-spec.txt-2.1.1"></a>
+
+### Notes on an escaping bug
+
+CString = DQUOTE \*qcontent DQUOTE
+
+Note that although QuotedString and CString have the same grammar, they
+are interpreted differently. In a QuotedString, a backslash
+followed by any character represents that character. But
+in a CString, the escapes "\\n", "\\t", "\\r", and the octal escapes
+"\\0" ... "\\377" represent newline, tab, carriage return, and the
+256 possible octet values respectively.
+
+The uses of CString in this document reflect a bug in Tor;
+they should have been QuotedStrings instead. In the future, they
+may migrate to QuotedString. If they do, the
+QuotedString implementation will never place a backslash before an
+"n", "t", "r", or digit, to ensure that old controllers don't get
+confused.
+
+For future-proofing, controller implementors MAY use the following
+rules to be compatible with buggy Tor implementations and with
+future ones that implement the spec as intended:
+
+```text
+ Read \n \t \r and \0 ... \377 as C escapes.
+ Treat a backslash followed by any other character as that character.
+```
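+
+As an illustration of the two rules above, the following sketch shows how a
+controller applying them might decode a few escape sequences. The sample
+inputs are made up for this example and do not come from Tor output.
+
+```text
+  Escape in string    Decoded octet(s)
+  \n                  newline (decimal 10)
+  \t                  tab (decimal 9)
+  \r                  carriage return (decimal 13)
+  \101                "A" (octal 101 = decimal 65)
+  \377                the octet with decimal value 255
+  \"                  "
+  \\                  \
+  \x                  x
+```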
+
+Currently, many of the QuotedString instances below that Tor
+outputs are in fact CStrings. We intend to fix this in future
+versions of Tor, and document which ones were broken. (See
+bugtracker ticket #14555 for a bit more information.)
+
+Note that this bug exists only in strings generated by Tor for the
+Tor controller; Tor should parse input QuotedStrings from the
+controller correctly.
+
+<a id="control-spec.txt-2.2"></a>
+
+## Commands from controller to Tor { #commands }
+
+```text
+ Command = Keyword OptArguments CRLF / "+" Keyword OptArguments CRLF CmdData
+ Keyword = 1*ALPHA
+ OptArguments = [ SP *(SP / VCHAR) ]
+```
+
+A command is either a single line containing a Keyword and arguments, or a
+multiline command whose initial Keyword is prefixed with "+", and whose data
+section ends with a single "." on a line of its own. (We use a special
+character to distinguish multiline commands so that Tor can correctly parse
+multiline commands that it does not recognize.) Specific commands and
+their arguments are described below in section 3.
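+
+For illustration, a single-line command and a multiline command might look
+like the following sketch. It assumes the GETINFO and LOADCONF commands from
+section 3; the argument and the data line are made up. The "C:" prefix marks
+lines sent by the controller and is not part of the protocol. Every line
+shown ends with CRLF.
+
+```text
+  C: GETINFO version
+
+  C: +LOADCONF
+  C: SocksPort 9050
+  C: .
+```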
+
+<a id="control-spec.txt-2.3"></a>
+
+## Replies from Tor to the controller { #replies }
+
+```text
+ Reply = SyncReply / AsyncReply
+ SyncReply = *(MidReplyLine / DataReplyLine) EndReplyLine
+ AsyncReply = *(MidReplyLine / DataReplyLine) EndReplyLine
+
+ MidReplyLine = StatusCode "-" ReplyLine
+ DataReplyLine = StatusCode "+" ReplyLine CmdData
+ EndReplyLine = StatusCode SP ReplyLine
+ ReplyLine = [ReplyText] CRLF
+ ReplyText = XXXX
+ StatusCode = 3DIGIT
+```
+
+Unless specified otherwise, multiple lines in a single reply from
+Tor to the controller are guaranteed to share the same status
+code. Specific replies are mentioned below in section 3, and
+described more fully in section 4.
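+
+As a sketch, a multi-line synchronous reply and an asynchronous reply might
+look as follows. The "S:" prefix marks lines sent by Tor, and the notes in
+parentheses are annotations; neither is part of the protocol. The 250 and
+650 status codes are those described in section 4, and the reply text is
+illustrative only. Every line ends with CRLF.
+
+```text
+  S: 250-version=0.4.8.10        (MidReplyLine)
+  S: 250+config-text=            (DataReplyLine, followed by CmdData)
+  S: SocksPort 9050
+  S: .
+  S: 250 OK                      (EndReplyLine)
+
+  S: 650 BW 1024 2048            (AsyncReply made up of one EndReplyLine)
+```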
+
+\[Compatibility note: versions of Tor before 0.2.0.3-alpha sometimes
+generate AsyncReplies of the form "\*(MidReplyLine / DataReplyLine)".
+This is incorrect, but controllers that need to work with these
+versions of Tor should be prepared to get multi-line AsyncReplies with
+the final line (usually "650 OK") omitted.\]
+
+<a id="control-spec.txt-2.4"></a>
+
+## General-use tokens { #tokens }
+
+; CRLF means, "the ASCII Carriage Return character (decimal value 13)
+; followed by the ASCII Linefeed character (decimal value 10)."
+CRLF = CR LF
+
+; How a controller tells Tor about a particular OR. There are four
+; possible formats:
+; $Fingerprint -- The router whose identity key hashes to the fingerprint.
+; This is the preferred way to refer to an OR.
+; $Fingerprint~Nickname -- The router whose identity key hashes to the
+; given fingerprint, but only if the router has the given nickname.
+; $Fingerprint=Nickname -- The router whose identity key hashes to the
+; given fingerprint, but only if the router is Named and has the given
+; nickname.
+; Nickname -- The Named router with the given nickname, or, if no such
+; router exists, any router whose nickname matches the one given.
+; This is not a safe way to refer to routers, since Named status
+; could under some circumstances change over time.
+;
+; The tokens that implement the above follow:
+
+ServerSpec = LongName / Nickname
+LongName = Fingerprint \[ "~" Nickname \]
+
+; For Tor versions older than 0.3.1.3-alpha, LongName may have included an equal
+; sign ("=") in lieu of a tilde ("~"). The presence of an equal sign
+; denoted that the OR possessed the "Named" flag:
+
+LongName = Fingerprint \[ ( "=" / "~" ) Nickname \]
+
+Fingerprint = "$" 40HEXDIG
+NicknameChar = "a"-"z" / "A"-"Z" / "0"-"9"
+Nickname = 1*19 NicknameChar
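+
+; For illustration, the following made-up values match the ServerSpec
+; forms described above. The fingerprint and nickname are placeholders
+; and do not refer to a real OR.
+; example $Fingerprint: "$0123456789ABCDEF0123456789ABCDEF01234567"
+; example $Fingerprint~Nickname:
+; "$0123456789ABCDEF0123456789ABCDEF01234567~ExampleRelay"
+; example $Fingerprint=Nickname (older Tor versions only):
+; "$0123456789ABCDEF0123456789ABCDEF01234567=ExampleRelay"
+; example Nickname: "ExampleRelay"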
+
+; What follows is an outdated way to refer to ORs.
+; Feature VERBOSE_NAMES replaces ServerID with LongName in events and
+; GETINFO results. VERBOSE_NAMES can be enabled starting in Tor version
+; 0.1.2.2-alpha and it is always-on in 0.2.2.1-alpha and later.
+ServerID = Nickname / Fingerprint
+
+; Unique identifiers for streams or circuits. Currently, Tor only
+; uses digits, but this may change.
+StreamID = 1*16 IDChar
+CircuitID = 1*16 IDChar
+ConnID = 1*16 IDChar
+QueueID = 1*16 IDChar
+IDChar = ALPHA / DIGIT
+
+Address = ip4-address / ip6-address / hostname (XXXX Define these)
+
+; A "CmdData" section is a sequence of octets concluded by the terminating
+; sequence CRLF "." CRLF. The terminating sequence may not appear in the
+; body of the data. Leading periods on lines in the data are escaped with
+; an additional leading period as in RFC 2821 section 4.5.2.
+CmdData = *DataLine "." CRLF
+DataLine = CRLF / "." 1*LineItem CRLF / NonDotItem *LineItem CRLF
+LineItem = NonCR / 1*CR NonCRLF
+NonDotItem = NonDotCR / 1\*CR NonCRLF
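+
+; For illustration, a made-up body consisting of the three lines "hello",
+; ".hidden", and "." would be sent as the four CmdData lines shown below,
+; each ending with CRLF. The receiver removes one leading period from
+; every line that starts with a period and contains further characters,
+; and treats the final lone "." as the terminator.
+
+```text
+  hello
+  ..hidden
+  ..
+  .
+```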
+
+; ISOTime, ISOTime2, and ISOTime2Frac are time formats as specified in
+; ISO 8601.
+; example ISOTime: "2012-01-11 12:15:33"
+; example ISOTime2: "2012-01-11T12:15:33"
+; example ISOTime2Frac: "2012-01-11T12:15:33.51"
+IsoDatePart = 4*DIGIT "-" 2*DIGIT "-" 2*DIGIT
+IsoTimePart = 2*DIGIT ":" 2*DIGIT ":" 2*DIGIT
+ISOTime = IsoDatePart " " IsoTimePart
+ISOTime2 = IsoDatePart "T" IsoTimePart
+ISOTime2Frac = ISOTime2 \[ "." 1\*DIGIT \]
+
+; Numbers
+LeadingDigit = "1" - "9"
+UInt = LeadingDigit \*DIGIT