The beware P10 protocol definition
http://ircd.bircd.org/bewarep10.txt
The aim of this document is to give a *complete* definition of a protocol which is compatible with existing implementations of the P10 protocol. It should allow writing a complete implementation, based on this document alone, without anything left uncertain.
It is based on:
- Undernet P10 Protocol and Interface Specification
- "The P10 server-server protocol" by Carlo Wood
- Raw data sent by ircu
- Ircu source code
- My own ideas about the protocol
Existing documentation about P10 is far from complete and leaves a lot of things uncertain.
Some definitions used in this document:
- beware
- The nickname of the author of this document.
- byte
- A unit of 8 bits of data.
- character
- char
- One byte, notated as a decimal number in the range 0-255 or a printable ascii character (example: 65, 'A')
- string
- Sequence of bytes
- parser
- The implementation which receives and processes the stream
- generate
- sending data which has not been received, as opposed to passing data on which has been received.
- TS
- "TimeStamp". Notation of a date+time: the ASCII decimal notation of the number of seconds, not counting leap seconds, since Jan 1, 1970, 00:00:00 UTC.
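The TS notation can be illustrated with a minimal Python sketch. The function names `format_ts` and `parse_ts` are hypothetical and chosen only for this example; the sketch assumes the caller passes a timezone-aware datetime.

```python
from datetime import datetime, timezone

def format_ts(moment: datetime) -> str:
    """Render a timezone-aware datetime as a TS string (ASCII decimal seconds)."""
    return str(int(moment.timestamp()))

def parse_ts(ts: str) -> datetime:
    """Turn a TS string back into a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(int(ts), tz=timezone.utc)
```

For instance, one day after the epoch is written as the string "86400".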
"must", "must not", "should", "may" are as described in RFC 2119. Interpret "disallowed" as "must not".
Hexadecimal numbers in this document use Pascal notation: a $ prefix. The number of hex digits (nibbles) represents the size of the data; for example, a byte is anything between $00 and $ff.
"nick" without "num" or "numeric" refers to a nickname.
THE STREAM OF DATA, LINES, LINE TERMINATION
P10 is a "text" protocol. it is human readable/writeable.
- CR
- Carriage Return. character 13.
- LF
- Line Feed. character 10.
- CRLF
- <CR><LF>
- NULL
- Character 0.
- EOL
- End Of Line (line termination)
- Definition of the stream
- <line><EOL><line><EOL> .... <garbage>
- Line termination (EOL)
- When sending, line termination may be either <CRLF> or <LF>. It must not be anything else. The parser *must* accept <LF> and <CRLF> as line termination. It *may* accept any other sequence of <CR> and <LF> as EOL. It must not parse anything else as "line termination".
- line
- a sequence of characters, minimum length 1 byte, maximum length 510 bytes, *not* including the EOL. if a parser encounters a line with a length of 0 bytes, it must be silently ignored, and it must not do anything else. a line which is longer than the maximum length is disallowed.
NULL, CR, and LF are disallowed in a line, any other character is allowed.
- NULL character note
- A parser can encounter a line which contains a NULL character. it *may* terminate the line at the first NULL character (remove anything after and including the first NULL character from the line).
- garbage
- Any data between the last EOL and the end of the stream. it must not be parsed as a line.
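The stream rules above can be sketched in Python as follows. This is only an illustration, not a reference implementation: the name `split_stream` is hypothetical, the sketch takes the permissive option of treating any run of CR and LF as EOL, it uses the optional NULL truncation, and raising `ValueError` for an overlong line is an assumption (the text only says such a line is disallowed). Checking the length before NULL truncation, and ignoring a line that becomes empty after truncation, are also assumptions.

```python
MAX_LINE = 510  # maximum line length in bytes, EOL not included

def split_stream(data: bytes):
    """Split a received buffer into (lines, remainder).

    remainder is whatever follows the last EOL. A caller would keep it
    for the next read, and drop it unparsed (as garbage) at end of stream.
    """
    # Permissive choice: every CR or LF counts as line termination.
    parts = data.replace(b"\r", b"\n").split(b"\n")
    remainder = parts.pop()
    lines = []
    for raw in parts:
        if len(raw) > MAX_LINE:
            # Assumption: reject overlong lines with an exception.
            raise ValueError("line longer than 510 bytes")
        nul = raw.find(b"\x00")
        if nul != -1:
            raw = raw[:nul]  # optional: cut the line at the first NULL
        if not raw:
            continue  # lines of length 0 are silently ignored
        lines.append(raw)
    return lines, remainder
```

Because empty lines are ignored, treating CR and LF individually as EOL gives the same result for <CRLF> as for <LF>.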
P10 BASE64
P10 protocol uses a base64 notation for numeric nicks, and for the IP parameter in the N token. it uses the following set of 64 characters, in the sequence from 0 to 63:
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789[]
They are from now referred to as the P10 base64 characters.
a P10 base64 string is a sequence of P10 base64 characters, with a minimum length of 1 character. it must not contain any other character. if the string has a length of more than one character, the string begins with the most significant character and ends with the least significant character.
Note: P10 base64 strings are case sensitive. Whenever this document mentions "base64", read it as "P10 base64".
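As an illustration of the notation, here is a small Python sketch that converts between integers and P10 base64 strings. The names `P10_B64`, `b64_decode` and `b64_encode` are made up for this example; the fixed-width `length` parameter is an assumption that matches how numerics are written.

```python
P10_B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789[]"

def b64_decode(text: str) -> int:
    """Decode a P10 base64 string, most significant character first."""
    if not text:
        raise ValueError("a P10 base64 string has at least 1 character")
    value = 0
    for ch in text:
        index = P10_B64.find(ch)
        if index < 0:
            raise ValueError("not a P10 base64 character: %r" % ch)
        value = value * 64 + index
    return value

def b64_encode(value: int, length: int) -> str:
    """Encode value as exactly `length` P10 base64 characters."""
    if length < 1 or not 0 <= value < 64 ** length:
        raise ValueError("value does not fit in the requested length")
    chars = []
    for _ in range(length):
        chars.append(P10_B64[value % 64])  # least significant digit first
        value //= 64
    return "".join(reversed(chars))
```

With these helpers, "AB" decodes to 1 and 4095 encodes to "]]" at length 2.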
NUMERIC NICKS
P10 uses "numeric nicks" to identify users and servers on the network, as opposed to names. the numeric is a base64 string or 2 concatenated base64 strings.
- a server numeric is 2 base64 characters; there can be a maximum of 4096 servers on the network:
AA (0), AB (1), ... ]] (4095).
- a client numeric is a server numeric + the number of the client on that server. a total of 5 base64 chars. for example ABAAC is client #2 (AAC) on server #1 (AB). one server can have a maximum of 262144 clients.
A server has a "max client numeric", which is sent in SERVER messages. A client numeric on a server, ANDed with that server's max client numeric, must be unique. For example, if server YY's max client numeric is YYA]] (4095), clients YYBXX and YYCXX can't exist at the same time; doing this would cause a "numeric collision", which is a protocol violation. But such numerics which occupy the same "slot" are not identical: a message sent to user YYBXX in this example must not reach user YYCXX. The max client numeric has to be of the form 2^n-1.
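The slot rule can be sketched in Python. The helper names `split_client_numeric` and `slot` are hypothetical, and the sketch assumes the max client numeric is given in the same 5-character SSCCC form as in the example above.

```python
P10_B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789[]"

def b64_decode(text: str) -> int:
    """Decode a P10 base64 string (raises ValueError on a bad character)."""
    value = 0
    for ch in text:
        value = value * 64 + P10_B64.index(ch)
    return value

def split_client_numeric(numeric: str):
    """Split an extended client numeric SSCCC into (server, client) integers."""
    if len(numeric) != 5:
        raise ValueError("extended client numerics are 5 characters")
    return b64_decode(numeric[:2]), b64_decode(numeric[2:])

def slot(numeric: str, max_client_numeric: str) -> int:
    """Return the slot: the client number ANDed with the max client number."""
    _, client = split_client_numeric(numeric)
    _, mask = split_client_numeric(max_client_numeric)
    if mask & (mask + 1):
        raise ValueError("max client numeric must be of the form 2^n-1")
    return client & mask
```

Here `slot("YYBXX", "YYA]]")` and `slot("YYCXX", "YYA]]")` return the same value, which is the numeric collision described above, although the two numerics themselves remain different strings.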
Described above are "extended numerics", as used on undernet.
Short numerics: P10 short numerics use 1 char for server numerics, allowing 64 servers on the net, and 3 chars for client numerics: server numeric + 2 characters for the client on that server, allowing 4096 clients on one server.
Notation used for numerics in examples:
- SS = extended numerics, server
- SSCCC = extended numerics, client
- S = short numerics, server
- SCC = short numerics, client
In any example numeric with S and C, interpret "S" as a character of the server number and "C" as a character of the client number on that server.
I describe 2 different standards which are not compatible with each other:
- "undernet" P10 (extended numerics only)
- The P10 protocol as used on undernet, and probably other networks, such as quakenet. One *must* parse and send extended numerics. one *may* parse short numerics. one must not generate short numerics.
- general purpose (mixed short/extended numeric)
- One must parse both short and extended numerics, and may generate both short and extended numerics. This also implies that any numeric which is ASACC or AS may be sent as short numeric.
- if one can parse short numerics, it *must* consider short numeric SCC and extended numeric ASACC, and also short numeric S and extended numeric AS, as being equivalent; both can identify the same thing.
- an implementation complies with both standards, if it can parse short numerics and extended numerics, and generates only extended numerics. this is true for undernet-ircu (version 2.10.10, 2.10.11), and beware ircd version 1.4.0 and later.
- Note
- Universal-ircu can send 4 character numerics (SCCC). this is *not* valid according to this protocol definition, one *must not* send them. right now, one *may* parse them, if they are translated to ASCCC. doing this allows a P10 implementation to link to universal. this may change later, 4 char numerics may later be used for something different, such as services.
- Note
- An implementation may or may not be transparent to numerics, that is, send them as it receives them, preserving short/extended. ircu is transparent to numerics. This means it can't be placed between something which sends short numerics and something which can't parse short numerics.
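The equivalences between short and extended numerics can be sketched as a normalisation step in Python. The function name `to_extended` is hypothetical. The sketch does not validate that the characters are P10 base64, and its handling of 4 character numerics follows the optional Universal-ircu translation (SCCC to ASCCC) mentioned in the note above.

```python
def to_extended(numeric: str) -> str:
    """Translate a received numeric to its extended (SS or SSCCC) form."""
    size = len(numeric)
    if size in (2, 5):
        return numeric  # already extended
    if size == 1:
        return "A" + numeric  # short server S -> AS
    if size == 3:
        # short client SCC -> ASACC
        return "A" + numeric[0] + "A" + numeric[1:]
    if size == 4:
        # optional: Universal-ircu SCCC -> ASCCC
        return "A" + numeric
    raise ValueError("not a valid numeric length: %d" % size)
```

An implementation that normalises on input and always sends the extended form would generate only extended numerics while still parsing short ones.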
SYNTAX OF A LINE
- Space is character 32 ($20)
The source, command, and parameters, are separated by spaces.
<source> <command> [<parameters>]
- One must send only the short command token. One may parse both short and long command tokens, and if one does, they must be considered equivalent; for example N = NICK. So when this document says "receives a NICK line", the line may actually carry the N token.
- Command tokens are uppercase. one must not send lowercase command tokens. one may parse them.
- If source begins with a colon, it (except for the colon) is the name. Otherwise, it is a numeric. A P10 implementation must only send lines with a numeric source prefix.
- If the source does not exist: If the command is SQUIT or KILL (or short token), the line must be parsed anyway, with the directly linked server from which the message came as the source. Otherwise the line must be ignored.
- If the source exists but the message comes from the wrong direction, it must be ignored.
- A line may have up to 15 parameters. Parameters are separated by spaces.
- The last parameter may be prefixed by a colon; this allows the last parameter to have spaces, or to have a length of 0 characters:
<source> <command> <param1> <paramN> :<last parameter>
A parser must be able to parse lines with colon prefixed last parameter, and without. for example parameters "a b c" and "a b :c" are equivalent.
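The line syntax can be illustrated with a minimal Python sketch. The name `parse_line` and the shape of its return value are invented for this example. Tolerating repeated spaces between fields, uppercasing the command, and raising `ValueError` for more than 15 parameters are assumptions of the sketch, not requirements stated above.

```python
MAX_PARAMS = 15

def parse_line(line: str):
    """Parse one line into (source, source_is_name, command, params)."""
    # The first " :" starts the colon prefixed last parameter, if any.
    head, found, trailing = line.partition(" :")
    words = [word for word in head.split(" ") if word]
    if len(words) < 2:
        raise ValueError("a line needs a source and a command")
    source, command, params = words[0], words[1].upper(), words[2:]
    if found:
        params.append(trailing)  # may contain spaces or be empty
    if len(params) > MAX_PARAMS:
        raise ValueError("more than 15 parameters")
    source_is_name = source.startswith(":")
    if source_is_name:
        source = source[1:]  # drop the colon, keep the name
    return source, source_is_name, command, params
```

With this sketch, the lines "AB X a b c" and "AB X a b :c" yield the same parameter list, as required.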
P10 Command Tokens
Short Token | Long Token |
A | AWAY |
AC | ACCOUNT |
AD | ADMIN |
B | BURST |
C | CREATE |
CM | CLEARMODE |
D | KILL |
EB | END_OF_BURST |
EA | EOB_ACK |
G | PING |
GL | GLINE |
I | INVITE |
J | JOIN |
K | KICK |
L | PART |
M | MODE |
MO | MOTD |
N | NICK |
O | NOTICE |
OM | OPMODE |
P | PRIVMSG |
Q | QUIT |
R | STATS |
RI | RPING |
RO | RPONG |
S | SERVER |
SQ | SQUIT |
T | TOPIC |
V | VERSION |
Y | ERROR |
Z | PONG |
CLOSE | CLOSE |
CN | CNOTICE |
CO | CONNECT |
CP | CPRIVMSG |
DIE | DIE |
DNS | DNS |
E | NAMES |
F | INFO |
GET | GET |
H | WHO |
HASH | HASH |
HELP | HELP |
ISON | ISON |
JU | JUPE |
LI | LINKS |
LIST | LIST |
LL | ASLL |
LU | LUSERS |
MAP | MAP |
OPER | OPER |
PA | PASS |
POST | POST |
PRIVS | PRIVS |
PROTO | PROTO |
REHASH | REHASH |
RESET | RESET |
RESTART | RESTART |
SE | SETTIME |
SET | SET |
SJ | SVSJOIN |
SN | SVSNICK |
TI | TIME |
TR | TRACE |
U | SILENCE |
UP | UPING |
USER | USER |
USERHOST | USERHOST |
USERIP | USERIP |
W | WHOIS |
WA | WALLOPS |
WC | WALLCHOPS |
WU | WALLUSERS |
WV | WALLVOICES |
X | WHOWAS |
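A parser that accepts both forms could map long tokens to short ones with a lookup like the Python sketch below. Only a handful of rows from the table are included to keep the example short; the names `LONG_TO_SHORT` and `short_token` are hypothetical.

```python
# Subset of the token table above; a full implementation would list every row.
LONG_TO_SHORT = {
    "NICK": "N",
    "PRIVMSG": "P",
    "KILL": "D",
    "PING": "G",
    "PONG": "Z",
    "SERVER": "S",
    "SQUIT": "SQ",
    "END_OF_BURST": "EB",
}

def short_token(command: str) -> str:
    """Return the short token for a received command, in uppercase."""
    command = command.upper()  # lowercase tokens may be parsed
    return LONG_TO_SHORT.get(command, command)
```

Commands that are not in the mapping are returned unchanged apart from uppercasing, which also covers tokens whose short and long forms are identical, such as CLOSE.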