Diffstat (limited to 'spec/control-spec/message-format.md')
-rw-r--r-- | spec/control-spec/message-format.md | 185 |
1 files changed, 185 insertions, 0 deletions
diff --git a/spec/control-spec/message-format.md b/spec/control-spec/message-format.md
new file mode 100644
index 0000000..a9af669
--- /dev/null
+++ b/spec/control-spec/message-format.md
@@ -0,0 +1,185 @@
+<a id="control-spec.txt-2"></a>
+
+# Message format
+
+<a id="control-spec.txt-2.1"></a>
+
+## Description format
+
+The message formats listed below use ABNF as described in RFC 2234.
+The protocol itself is loosely based on SMTP (see RFC 2821).
+
+We use the following nonterminals from RFC 2822: atom, qcontent
+
+We define the following general-use nonterminals:
+
+QuotedString = DQUOTE \*qcontent DQUOTE
+
+There are explicitly no limits on line length. All 8-bit characters
+are permitted unless explicitly disallowed. In QuotedStrings,
+backslashes and quotes must be escaped; other characters need not be
+escaped.
+
+Wherever CRLF is specified to be accepted from the controller, Tor MAY also
+accept LF. Tor, however, MUST NOT generate LF instead of CRLF.
+Controllers SHOULD always send CRLF.
+
+<a id="control-spec.txt-2.1.1"></a>
+
+### Notes on an escaping bug
+
+CString = DQUOTE \*qcontent DQUOTE
+
+Note that although these nonterminals have the same grammar, they
+are interpreted differently. In a QuotedString, a backslash
+followed by any character represents that character. But
+in a CString, the escapes "\\n", "\\t", "\\r", and the octal escapes
+"\\0" ... "\\377" represent newline, tab, carriage return, and the
+256 possible octet values respectively.
+
+The use of CString in this document reflects a bug in Tor;
+they should have been QuotedString instead. In the future, they
+may migrate to use QuotedString instead. If they do, the
+QuotedString implementation will never place a backslash before a
+"n", "t", "r", or digit, to ensure that old controllers don't get
+confused.
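The difference between the two interpretations described above can be illustrated with a minimal Python sketch. It is not taken from Tor or from any controller library: the function names are hypothetical, the input is assumed to be the string body with the surrounding DQUOTE characters already removed, and passing through any escape other than `\n`, `\t`, `\r`, or an octal escape in the CString case is an assumption of this sketch.

```python
import re

# One backslash escape: up to three octal digits (value <= 0o377),
# or any other single octet.
_ESCAPE = re.compile(rb"\\([0-3][0-7]{0,2}|[4-7][0-7]?|.)", re.DOTALL)

_C_ESCAPES = {b"n": b"\n", b"t": b"\t", b"r": b"\r"}


def unescape_quoted_string(body: bytes) -> bytes:
    """QuotedString rule: a backslash followed by any character is that character."""
    return re.sub(rb"\\(.)", lambda m: m.group(1), body, flags=re.DOTALL)


def unescape_cstring(body: bytes) -> bytes:
    """CString rule: \\n, \\t, \\r and octal escapes are read as C escapes."""

    def replace(match):
        esc = match.group(1)
        if esc in _C_ESCAPES:
            return _C_ESCAPES[esc]
        if esc[:1] in b"01234567":
            return bytes([int(esc, 8)])
        # Assumption of this sketch: any other escaped octet stands for itself.
        return esc

    return _ESCAPE.sub(replace, body)
```

With these helpers, the body `\n` decodes to the letter "n" as a QuotedString but to a newline as a CString, which is exactly the ambiguity the note describes.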
+
+For future-proofing, controller implementors MAY use the following
+rules to be compatible with buggy Tor implementations and with
+future ones that implement the spec as intended:
+
+```text
+  Read \n \t \r and \0 ... \377 as C escapes.
+  Treat a backslash followed by any other character as that character.
+```
+
+Currently, many of the QuotedString instances below that Tor
+outputs are in fact CStrings. We intend to fix this in future
+versions of Tor, and document which ones were broken. (See
+bugtracker ticket #14555 for a bit more information.)
+
+Note that this bug exists only in strings generated by Tor for the
+Tor controller; Tor should parse input QuotedStrings from the
+controller correctly.
+
+<a id="control-spec.txt-2.2"></a>
+
+## Commands from controller to Tor { #commands }
+
+```text
+  Command = Keyword OptArguments CRLF / "+" Keyword OptArguments CRLF CmdData
+  Keyword = 1*ALPHA
+  OptArguments = [ SP *(SP / VCHAR) ]
+```
+
+A command is either a single line containing a Keyword and arguments, or a
+multiline command whose initial keyword begins with +, and whose data
+section ends with a single "." on a line of its own. (We use a special
+character to distinguish multiline commands so that Tor can correctly parse
+multi-line commands that it does not recognize.) Specific commands and
+their arguments are described below in section 3.
+
+<a id="control-spec.txt-2.3"></a>
+
+## Replies from Tor to the controller { #replies }
+
+```text
+  Reply = SyncReply / AsyncReply
+  SyncReply = *(MidReplyLine / DataReplyLine) EndReplyLine
+  AsyncReply = *(MidReplyLine / DataReplyLine) EndReplyLine
+
+  MidReplyLine = StatusCode "-" ReplyLine
+  DataReplyLine = StatusCode "+" ReplyLine CmdData
+  EndReplyLine = StatusCode SP ReplyLine
+  ReplyLine = [ReplyText] CRLF
+  ReplyText = XXXX
+  StatusCode = 3DIGIT
+```
+
+Unless specified otherwise, multiple lines in a single reply from
+Tor to the controller are guaranteed to share the same status
+code. Specific replies are mentioned below in section 3, and
+described more fully in section 4.
+
+\[Compatibility note: versions of Tor before 0.2.0.3-alpha sometimes
+generate AsyncReplies of the form "\*(MidReplyLine / DataReplyLine)".
+This is incorrect, but controllers that need to work with these
+versions of Tor should be prepared to get multi-line AsyncReplies with
+the final line (usually "650 OK") omitted.\]
+
+<a id="control-spec.txt-2.4"></a>
+
+## General-use tokens { #tokens }
+
+; CRLF means, "the ASCII Carriage Return character (decimal value 13)
+; followed by the ASCII Linefeed character (decimal value 10)."
+CRLF = CR LF
+
+; How a controller tells Tor about a particular OR. There are four
+; possible formats:
+; $Fingerprint -- The router whose identity key hashes to the fingerprint.
+; This is the preferred way to refer to an OR.
+; $Fingerprint~Nickname -- The router whose identity key hashes to the
+; given fingerprint, but only if the router has the given nickname.
+; $Fingerprint=Nickname -- The router whose identity key hashes to the
+; given fingerprint, but only if the router is Named and has the given
+; nickname.
+; Nickname -- The Named router with the given nickname, or, if no such
+; router exists, any router whose nickname matches the one given.
+; This is not a safe way to refer to routers, since Named status
+; could under some circumstances change over time.
+;
+; The tokens that implement the above follow:
+
+ServerSpec = LongName / Nickname
+LongName = Fingerprint \[ "~" Nickname \]
+
+; For tors older than 0.3.1.3-alpha, LongName may have included an equal
+; sign ("=") in lieu of a tilde ("~"). The presence of an equal sign
+; denoted that the OR possessed the "Named" flag:
+
+LongName = Fingerprint \[ ( "=" / "~" ) Nickname \]
+
+Fingerprint = "$" 40*HEXDIG
+NicknameChar = "a"-"z" / "A"-"Z" / "0" - "9"
+Nickname = 1*19 NicknameChar
+
+; What follows is an outdated way to refer to ORs.
+; Feature VERBOSE_NAMES replaces ServerID with LongName in events and
+; GETINFO results. VERBOSE_NAMES can be enabled starting in Tor version
+; 0.1.2.2-alpha and it is always-on in 0.2.2.1-alpha and later.
+ServerID = Nickname / Fingerprint
+
+; Unique identifiers for streams or circuits. Currently, Tor only
+; uses digits, but this may change
+StreamID = 1*16 IDChar
+CircuitID = 1*16 IDChar
+ConnID = 1*16 IDChar
+QueueID = 1*16 IDChar
+IDChar = ALPHA / DIGIT
+
+Address = ip4-address / ip6-address / hostname (XXXX Define these)
+
+; A "CmdData" section is a sequence of octets concluded by the terminating
+; sequence CRLF "." CRLF. The terminating sequence may not appear in the
+; body of the data. Leading periods on lines in the data are escaped with
+; an additional leading period as in RFC 2821 section 4.5.2.
+CmdData = *DataLine "." CRLF
+DataLine = CRLF / "." 1*LineItem CRLF / NonDotItem *LineItem CRLF
+LineItem = NonCR / 1*CR NonCRLF
+NonDotItem = NonDotCR / 1\*CR NonCRLF
+
+; ISOTime, ISOTime2, and ISOTime2Frac are time formats as specified in
+; ISO8601.
+; example ISOTime: "2012-01-11 12:15:33"
+; example ISOTime2: "2012-01-11T12:15:33"
+; example ISOTime2Frac: "2012-01-11T12:15:33.51"
+IsoDatePart = 4*DIGIT "-" 2*DIGIT "-" 2*DIGIT
+IsoTimePart = 2*DIGIT ":" 2*DIGIT ":" 2*DIGIT
+ISOTime = IsoDatePart " " IsoTimePart
+ISOTime2 = IsoDatePart "T" IsoTimePart
+ISOTime2Frac = IsoTime2 \[ "." 1\*DIGIT \]
+
+; Numbers
+LeadingDigit = "1" - "9"
+UInt = LeadingDigit \*Digit
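The CmdData rules in this hunk (a terminating "." CRLF line, and a leading period doubled as in RFC 2821 section 4.5.2) can be sketched in Python as below. This is an illustration under stated assumptions, not code from Tor: `encode_cmd_data` and `decode_cmd_data` are hypothetical names, and the payload is assumed to consist of CRLF-separated lines.

```python
def encode_cmd_data(payload: bytes) -> bytes:
    """Dot-stuff CRLF-separated payload lines and append the "." CRLF terminator."""
    lines = payload.split(b"\r\n")
    # A payload that already ends in CRLF leaves one empty trailing chunk; drop it.
    if lines and lines[-1] == b"":
        lines.pop()
    out = bytearray()
    for line in lines:
        if line.startswith(b"."):
            out += b"."  # escape a leading period with an extra one
        out += line + b"\r\n"
    out += b".\r\n"
    return bytes(out)


def decode_cmd_data(wire: bytes) -> bytes:
    """Undo the dot-stuffing; stop at the first line that is a lone "."."""
    body = []
    for line in wire.split(b"\r\n"):
        if line == b".":
            break
        body.append(line[1:] if line.startswith(b".") else line)
    else:
        raise ValueError("missing terminating '.' line")
    return b"".join(line + b"\r\n" for line in body)
```

In this sketch a decoded payload always ends in CRLF, so a round trip is exact only for payloads that already end in CRLF (or are empty).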