This is a text protocol for a DC-style network that I could support. What I'm after is a simple protocol that requires little effort in either hub or client, yet is extensible. It addresses some of the issues in the NMDC protocol, the most interesting being extensibility and hub bandwidth. The same protocol structure is used for both client-hub and client-client communication. This document is split into two parts: the first shows the structure of the protocol, while the second implements a specific system using this structure. ADC stands for anything you would like it to stand for; Advanced DC is the first neutral thing that springs to mind, apart from the obvious =).

1.2 Credits

Many ideas for this I’ve taken from Jan Vidar Krey’s DCTNG draft, others come from the DC dev hub people (notably cologic, fusbar and sedulus). Oh, and not to forget, Jon Hess for the original DC idea.

1.3 Some syntactic notes

<text> means a mandatory field, sometimes conditionally mandatory
a|b means a or b, either a or b may be used
anything outside brackets is to be sent literally

2 Line protocol

2.1 General

All messages consist of a pseudo-FOURCC, where the first letter designates how the message should be sent, and the other three specify what to do.
Parameters are separated by space, and each message is ended by newline (0x0a). Space is escaped using the string “\s”, newline “\n” and backslash “\\”. All other escapes using the backslash are reserved for future use, and any message containing unknown escapes must be discarded.
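The escaping rules above can be sketched as a pair of helper functions. This is only an illustration; the function names are made up for this example, and raising ValueError is one possible way to signal that the whole message must be discarded.

```python
def escape(param: str) -> str:
    """Escape one parameter for the wire; the backslash must be handled first."""
    return param.replace("\\", "\\\\").replace(" ", "\\s").replace("\n", "\\n")


def unescape(param: str) -> str:
    """Reverse escape(); raise ValueError on any escape that is not defined."""
    out = []
    i = 0
    while i < len(param):
        ch = param[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(param):
            raise ValueError("dangling backslash")
        code = param[i + 1]
        if code == "s":
            out.append(" ")
        elif code == "n":
            out.append("\n")
        elif code == "\\":
            out.append("\\")
        else:
            # Reserved for future use: the message must be discarded.
            raise ValueError("unknown escape \\" + code)
        i += 2
    return "".join(out)
```

Because literal spaces never appear inside an escaped parameter, a receiver can split a line on space first and unescape each parameter afterwards.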
All text must be sent as UTF-8, including file lists, searches, nicks, hub lists etc.
Clients must ignore unknown/badly formatted messages completely. Hubs must ignore invalid commands, and should dispatch unknown commands according to their type (possibly filtered by rules in a configuration file). This way, clients can support new features without updating either the hub or all other clients. The empty command (a single newline) can be used for keep-alive.
Addresses are always sent as x.x.x.x:port for IPv4 and in RFC2732 form ([x:x:x:x:x:x:x:x]:port) for IPv6. The port must always be specified; default ports don’t exist. Hub addresses must always be specified in URL form, with “adc” as protocol specifier (“adc://x.x.x.x:port/”). DNS names are only allowed in hub lists, hub links etc. Clients may never use DNS names as their IP, to avoid spoofing and resolving problems.
The connecting party always speaks first.
Numbers are sent as strings of [0-9], using ‘.’ as decimal separator and no thousands separator. Integers are numbers without decimal separator. Applications should be prepared to handle at least 64-bit (signed) integers, and >= 64-bit floats for numbers. Negative numbers are signaled by a ‘-‘ in front of them.
Short binary data is sent as base32 encoded strings. Longer binary data transfers should use the file transfer mechanism with special naming.
All text data in the protocol that is not entered by the user, including protocol names, extensions etc may only contain viewable characters that may be encoded by one byte in the UTF-8 encoding (ASCII codes 33-127). Although the protocol is case-sensitive, names distinguished only by case are disallowed (upper case is preferred).

2.2 Message Layout

The typical message looks like this:

XYYY <myCID> <targetCID> [p0] [p1] ... [pn] [AAq0] [BBq1] ... [XXqn]\n

X	Message type
YYY	Action
myCID	CID of the sender
targetCID	Target CID, type D messages only
p0 … pn	Positional parameters, these are always mandatory
q0 … qn	Named parameters, these are optional unless otherwise noted. Each name is a two-character code that identifies the named parameter, followed by its value. Named parameters may also take the empty value, usually meaning that their effect is being withdrawn (for example OP rights removed for an op). Names may be reused with different meanings in different commands, but developers must strive to avoid name clashes in the relatively limited namespace within the context of a command, for example by publishing names and their effects on the ADC board (wherever that may be =).

Since action is separated from the message type, the client should ignore the type, and only look at the three action letters, although some sanity check filtering should be done to ensure proper operation even with buggy clients / hubs. This allows clients to support features sent in new ways without changing the hub (search targeted at one user for instance).

It is valid to send unknown messages, but it is preferred that they’re preceded with proper SUP to avoid sending garbage that nobody understands anyway. Other clients can be notified of extended features by adding flags to the INF.

All messages must have the originating CID specified as first parameter. Type D must also have the target CID as second parameter.

Each named parameter has the form AAyyy where AA are two upper-case letters and yyy some arbitrary data associated with the parameter. Named parameters are used to add special processing options to commands, and if a flag requires that the other party interprets the command in a non-standard way (compression for instance), a SUP is required to make sure both parties understand the flag correctly.
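As a rough sketch of the layout described in this section, the following splits one line into its parts. The names Message and parse_message are hypothetical, and the caller is assumed to know how many positional parameters the action takes, since the layout itself does not mark where positional parameters end. Tokens are left in their escaped form, and the empty keep-alive line is assumed to be filtered out before this is called.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Message:
    msg_type: str                  # first letter of the pseudo-FOURCC
    action: str                    # remaining three letters
    my_cid: str                    # originating CID, always first
    target_cid: Optional[str]      # only present for type D
    positional: List[str]
    named: List[Tuple[str, str]]   # (two-character name, value), order kept


def parse_message(line: str, n_positional: int) -> Message:
    tokens = line.rstrip("\n").split(" ")
    fourcc = tokens[0]
    if len(fourcc) != 4:
        raise ValueError("expected a four-character command")
    rest = tokens[1:]
    if not rest:
        raise ValueError("missing originating CID")
    my_cid = rest.pop(0)
    target_cid = None
    if fourcc[0] == "D":
        if not rest:
            raise ValueError("type D needs a target CID")
        target_cid = rest.pop(0)
    if len(rest) < n_positional:
        raise ValueError("missing positional parameter")
    named = []
    for token in rest[n_positional:]:
        if len(token) < 2:
            raise ValueError("named parameter needs a two-character name")
        named.append((token[:2], token[2:]))  # the value may be empty
    return Message(fourcc[0], fourcc[1:], my_cid, target_cid,
                   rest[:n_positional], named)
```

Named parameters are kept as a list rather than a dictionary because some commands (SCH, for instance) may repeat the same name several times.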

2.2.1 Message types

A	Active broadcast. Message should be broadcast to all UDP active clients.
B	Broadcast. Hub should send message to all connected clients.
C	Client message. All TCP client-client messages are sent like this (hubs will never see this type).
D	Direct message. The target CID must be inserted after the myCID but before the other parameters of the action. Apart from sending the message to the target, an exact copy must always be sent to the source to confirm that the hub has correctly processed the message.
I	Info message. This message originated from the hub. myCID will always be the CID of the hub.
H	Hub message. This message is intended for the hub only, not relayed to other clients.
P	Passive broadcast. Message should be broadcast to all UDP passive clients.
U	UDP message. Message is sent directly with UDP to the target client (hubs will never see this type).

2.2.2 Client identification (CID)

Connected clients are identified by a CID (Client IDentification), which globally and uniquely identifies a particular user. It is invalid for two clients with the same CID to connect to the same hub, and hubs must enforce this. Clients should also use the same CID when connecting to multiple hubs. If clients offer different shares on different hubs, they must keep track of where a connecting client comes from so that the correct files always will be available. Clients should also strive to keep the same CID between sessions, to ease the implementation of favorite users and queue handling.

CIDs are 64 bits in length, and should be generated by creating a UUID according to the DCE UUID standard (several libraries exist for this) and then XOR’ing its high and low 64-bit parts together.
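A minimal sketch of the generation step, using Python's standard uuid module. The choice of a random (version 4) UUID, the big-endian byte order and the stripping of base32 padding are assumptions made for this example; the text above does not specify them.

```python
import base64
import uuid


def cid_from_uuid(u: uuid.UUID) -> bytes:
    """Fold a 128-bit UUID into 64 bits by XOR'ing its two halves."""
    high = u.int >> 64
    low = u.int & 0xFFFFFFFFFFFFFFFF
    return (high ^ low).to_bytes(8, "big")   # byte order is an assumption


def cid_to_text(cid: bytes) -> str:
    """Base32 text form; dropping the '=' padding is an assumption."""
    return base64.b32encode(cid).decode("ascii").rstrip("=")


def new_cid() -> bytes:
    # uuid4() is one of several UUID variants; which one to use is not
    # mandated by the text above.
    return cid_from_uuid(uuid.uuid4())
```

A client would typically call new_cid() once and store the result, so that the same CID is reused between sessions.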

It is up to the hub developer to decide whether to base hub registration on CID or nickname (during login, the client (usually) provides both, but only the CID is mandated by the protocol), but the latter is probably more convenient for the users.

3 Files

3.1 File names and structure

Filenames are relative to a fictive root in the user’s share. ‘/’ is used to separate directories, and each file or directory name must be unique in a case-insensitive context. All viewable characters (char code >= 32, including space) are valid in file names; ‘/’ and ‘\’ are escaped by ‘\’. Clients must take care to properly filter the filename for the target file system, but must be ready to request filenames from other clients according to these rules. The special names ‘.’ and ‘..’ may not occur as a directory or filename; any file list received containing them must be completely ignored. The shared files are identified relative to the unnamed root ‘/’ (“/dir/subdir/filename.ext”), while extensions can extend this namespace by adding a named root, preferably using their SUP name. “TTH/<root-base32>” is, for example, used to locate a file in the share by TTH root instead of filename. Rootless filenames are treated as special (they may not appear in the file listing) and can be used to supply binary transfers of arbitrary data, but to avoid polluting the namespace, extensions should use a named root instead. All directory names must always end with a ‘/’. Names in the unnamed root are generally not used for identifying files in the share, as the TTH root does a much better job. Hence, commands that get files or file data may never use the unnamed root for selecting a file.

The special, rootless filename “files.xml” specifies the full file listing, uncompressed, in XML using the utf-8 encoding. Clients can then compress this list and offer the compressed one on a SUP basis. I recommend bzip2 or generic zlib compressed transfers for this task, although the uncompressed list must always be available.

The special type “list” is used to browse partial lists. A partial file list has the same structure as a normal list, but some directories in it may be tagged with an attribute ‘Incomplete="1"’ to specify that the directory has sub-entries that are not listed. Only directory names in the unnamed root may be requested, for instance “/” and “/share/”. The content of that directory will then be sent to the requesting client, to a depth chosen by the sending client (it should normally only send the directory level requested, but may choose to send more if there are few entries, for example a directory only containing 2-3 subdirectories and a few files). The “Base” attribute of “FileListing” specifies which directory a particular file list represents. Clients must always allow list browsing regardless of available slots.

3.2 Hashes

Hashing is mandatory for files shared in an ADC client. For files, the Merkle hash tree, as described by http://www.open-content.net/specs/draft-jchapweske-thex-02.html (this is the same that DC++ uses), is used to create a full tree of hashes. The Tiger algorithm, as described by http://www.cs.technion.ac.il/~biham/Reports/Tiger/, is used as hash algorithm, and a base segment size of 1024 bytes must be used when generating the tree. Clients may discard as many levels as they see fit to reduce the space needed for the tree data. A leaf granularity of 64 KiB is reasonable for smaller files; for large files, at least 7 levels of leaf data must be kept (large files are those that aren’t covered by 7 levels of 64 KiB blocks). Clients must always offer the tree regardless of available slots.

Generally, the root of the tree is used to identify a file uniquely within the network. It is used for searches and must always be present in the file list (incidentally, the root of the file list must also be available, and is discoverable by using GFI). The rest of the tree may be requested at a later stage using the normal client-client transfer procedure. The root is always encoded using base-32 encoding when converted to text.

3.3 File list

files.xml is the list of files intended for browsing. It has the following general structure:

<?xml version="1.0" encoding="utf-8" standalone="yes"?>

<FileListing Version="1" CID="<my-CID>" Generator="DC++ 0.401" Base="/">

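<Directory Name="<dir-name>">

<Directory Name="<subdir-name>">

<File Name="<file-name>" Size="<size>" TTH="<tth-root>"/>

</Directory>

</Directory>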

</FileListing>

“encoding” must always be set to utf-8. Clients must be prepared to handle xml files both with and without a BOM (byte order mark).

“Version” is not intended to change unless a breaking change is done to the structure of the file.

“CID” is the CID of the client that generated the list.

“Generator” is for statistical and informative purposes only and should not be used for extra content discovery.

“Base” is used for partial file lists, and must be present even in the normal list.

“TTH” is the base32 encoded TTH root of the file.

“Incomplete” is used with partial file lists to signal that a directory contains unlisted items. “0” means that the directory is complete, but normally does not need to be specified.

More information may be added to the file by extensions, but is not guaranteed to be interpreted by other clients.

4 BASE messages

Each message is specified as the action code and the message type contexts under which it is valid. This particular implementation is known as BASE, as far as protocol identification is concerned. All ADC clients/hubs should support this minimum of functionality, extending as necessary. The connecting party will from now on be known as client, the other as server. It is always the server that controls state transitions.

The message types are merely a pointer to where the commands are most likely to appear, but clients should be prepared that they might arrive in other ways (for example type D or C searches to search a particular client).

For client-client communication, this protocol is identified by the string “ADC/1.0”.

In the descriptions of the commands, the mandatory <from-CID> and trailing named parameters have been omitted.

4.1 Client – Hub communication

During login, the client goes through a number of stages. An action is valid only in the NORMAL stage unless otherwise noted. The stages, in login order, are PROTOCOL (feature support discovery), IDENTIFY (user identification, static checks), VERIFY (password check), NORMAL (normal operation) and DATA (for binary transfers). Any error in hub communication means disconnection, hopefully preceded by an STA action.

4.2 Client – Client communication

The client – client messages use essentially the same stages as client – hub, but probably without VERIFY (Client access passwords are not supported in BASE), and an additional DATA state.

4.3 Actions

4.3.1 STA

STA <code> <description>

Types: C, D, I

States: All

Code

Status code in the form “xyy” where x specifies severity, and yy the specific error code. The severity and error code are treated separately, i.e. the same error could occur at different severity levels.

Severity values:

0 Success (used for confirming commands)

1 Recoverable (error but no disconnect)

2 Fatal (disconnect)
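Splitting a status code into its two parts could look like the sketch below. The function names are invented for this example; only the “xyy” layout and the three severity values come from the text.

```python
from typing import Tuple

SEVERITY_NAMES = {0: "success", 1: "recoverable", 2: "fatal"}


def split_status(code: str) -> Tuple[int, int]:
    """Split an 'xyy' status code into (severity, error code)."""
    if len(code) != 3 or not (code.isascii() and code.isdigit()):
        raise ValueError("status code must be three digits")
    severity = int(code[0])
    error = int(code[1:])
    if severity not in SEVERITY_NAMES:
        raise ValueError("unknown severity")
    return severity, error


def must_disconnect(code: str) -> bool:
    """Only severity 2 (fatal) implies a disconnect."""
    return split_status(code)[0] == 2
```

Since the two parts are independent, a client can decide how to react from the severity alone and still show the description for an error code it does not know.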

Error codes:

00 Generic, show description

x0 Same as 00, but categorized according to the rough structure set below

10 Generic hub error

11 Hub full

12 Hub disabled

20 Generic login/access error

21 Nick invalid

22 Nick taken

23 Invalid password

24 CID taken

25 Access denied, flag “FC” is the FOURCC of the offending command. Sent when a user is not allowed to execute a particular command

26 Registered users only

30 Kicks/bans/disconnects generic

31 Permanently banned

32 Temporarily banned, flag “TL” is an integer specifying the number of seconds left until it expires (This is used for kick as well…).

40 Protocol error

41 Transfer protocol unsupported, flag “TO” the token, flag “PR” the protocol string. The client receiving a CTM or RCM should send this if it doesn’t support the C-C protocol.

42 Direct connection failed, flag “TO” the token, flag “PR” the protocol string. The client receiving a CTM or RCM should send this if it tried but couldn’t connect.

43 Required INF field missing/bad, flag “FL” specifies the name of field.

44 Invalid state, flag “FC” the FOURCC of the offending command.

50 Client-client / file transfer error

51 File not available

52 File part not available

53 Slots full

Description

Text description of the error, suitable for viewing directly to the user

Even if an error code is unknown by the client, it should display the text message alone. Error codes are used so that the client can take different action on different errors. Most error codes don’t have parameters and only make sense in C and I types. Error responses should not be sent for obvious errors (a passive client sending a CTM for example). Some codes

4.3.2 SUP

SUP +|-<feature1> ... +|-<featureN>

Types: C, H, I

States: PROTOCOL, NORMAL

This command identifies which features a specific client / hub supports. The feature name should use only upper case letters, and possibly a number to signal a revised feature. A central register of known features should be kept, to avoid clashes. All ADC clients should support the BASE feature (unless a future revision takes its place), which is this protocol. The resulting features used by two peers should be the intersection of features sent by the respective parties.

This command can also be used to dynamically add / remove features, ‘+’ meaning add and ‘-’ remove. For those commands that break or modify compatibility in some way (compression for example), the receiving end must verify with an equivalent SUP command, and the new feature set will be valid from that point. No other commands may be sent until the response has been received, to determine whether the other end actually supports the feature.
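The feature bookkeeping described above can be sketched with plain sets. The helper names are hypothetical, and rejecting a token without a leading ‘+’ or ‘-’ is a choice made for this example.

```python
from typing import Iterable, Set


def apply_sup(current: Set[str], tokens: Iterable[str]) -> Set[str]:
    """Return the feature set after applying +NAME / -NAME tokens."""
    features = set(current)
    for token in tokens:
        if len(token) < 2 or token[0] not in "+-":
            raise ValueError("feature must be prefixed with + or -")
        name = token[1:]
        if token[0] == "+":
            features.add(name)
        else:
            features.discard(name)   # removing an unknown feature is a no-op
    return features


def negotiated(mine: Set[str], theirs: Set[str]) -> Set[str]:
    """Features in effect between two peers: the intersection of both sets."""
    return mine & theirs
```

For example, a peer that announced “+BASE +ZLIF” and later sends “-ZLIF” ends up with only BASE in its set.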

When the server receives this message the first time, it should reply with the same, send an INF about itself and move to the IDENTIFY state. The client, when it receives it the first time, should send an INF about itself.

4.3.3 INF

INF

Types: B, C, I, H

States: IDENTIFY, NORMAL

This command updates the information about a client. Each time this is received, it means that the fields specified have been added or updated. Each field is identified by two characters, directly followed by the data associated with that field. A field (and the effects of its presence) can be canceled by sending the field name without data. Clients should ignore any fields they don’t know, so that fields safely can be added in the future. Most of these fields are only interesting in the client-hub communication, during client-client this command is mainly used for identification purposes. Hubs can choose to require or ignore any or all of these fields; clients must work without any of them. Many of these fields, such as share size or client version, are purely informative heuristics, and should be taken with a grain of salt, as it is very easy to fake them. On the other hand, clients should strive to provide accurate data for the general health of the system, as providing invalid information probably will annoy a great deal of people. Updates are made in an incremental manner, by sending only the fields that have changed.
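The incremental update rule can be sketched as a merge over a dictionary of known fields. The function name apply_inf is made up for this example, and the fields are assumed to be already split and unescaped.

```python
from typing import Dict, Iterable


def apply_inf(known: Dict[str, str], fields: Iterable[str]) -> Dict[str, str]:
    """Merge the fields of one INF into what is already known about a client."""
    updated = dict(known)
    for field in fields:
        if len(field) < 2:
            # Badly formatted: the caller should ignore the whole message.
            raise ValueError("field needs a two-character name")
        name, value = field[:2], field[2:]
        if value == "":
            updated.pop(name, None)   # name without data cancels the field
        else:
            updated[name] = value     # added or updated
    return updated
```

Unknown field names are simply stored here, which is one way to satisfy the requirement that clients ignore fields they don’t know without losing them.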

Fields:

I4	IPv4 address without port. A zero address (0.0.0.0) means that the server should replace it with the real IP of the client. Hubs must check a specified address so that it corresponds to what the client is connecting from to avoid DoS attacks, and only allow trusted clients to specify an address different from the one they’re connecting from. Thus it is best for clients to send the zero address and have it automatically detected to avoid complications for the user.
I6	IPv6 address without port. A zero address ([::]) means that the server should replace it with the real IP of the client.
U4	Client UDP port. Sending this field to the hub with a port means that this client wants to run in active mode. If this field is missing (or empty if changing modes), it means that the client should be treated as passive.
U6	Same as U4, but for IPv6.
SS	Share size in bytes, integer.
SF	Number of shared files, integer
VE	Client identification, version (client specific, recommended a short identifier then a float for version number). It is important that hubs don’t discriminate clients based on their VE tag but instead rely on SUP when it comes to which clients should be allowed (for example, “we only want clients that can hash”). VE is there mainly for informative reasons, and can perhaps be used to warn users that they’re using a known buggy or vulnerable client.
US	Maximum upload speed, bits/sec, integer
SL	Upload slots open, integer
AS	Automatic slot allocator speed limit, bytes/sec, integer. This is the recommended method of slot allocation, the client keeps opening slots as long as its total upload speed doesn’t exceed this value. SL then serves as a minimum number of slots open.
AM	Maximum number of slots open in automatic slot manager mode, integer.
EM	E-Mail, string.
NI	Nickname, string. Hub must ensure that this is unique (case sensitive) in each hub, to avoid confusion. Valid are all displayable characters (char code > 32) apart from space, although hubs are free to limit this further as they like with an appropriate error message.
DE	Description, string. Valid are all displayable characters (char code >= 32).
HN	Hubs where user is a normal user, integer.
HR	Hubs where user is registered (had to supply password), integer.
HO	Hubs where user is op in, integer.
TO	Token (used with CTM) in the c-c connection.
OP	1=op
AW	1=Away 2=Extended away, don’t care about main chat either (hubs can skip sending MSG commands if they want) (Other away modes are reserved for the future)
BO	1=Bot (in particular, this means that the client does not support file transfers, and thus should never be queried for direct connections)
HI	1=Hidden, should not be shown on the user list.
HU	1=Hub, this INF is about the hub itself

Hubs are welcome to mandate or discard any and all fields, but obviously the more the merrier (and clients could be disconnected for not sending some of them…).

Note: normally one should only accept an IP (I4 or I6) that is the same as the source IP of the connecting peer. Only trusted users may be allowed to specify a different IP, or an IP from a different domain (IPv4 or IPv6). Use caution when accepting unknown IPs; if you fail to do this, your hub can be used as a medium for DDoS attacks.

When a server receives this in the IDENTIFY state, it should either proceed to the VERIFY state by sending a GPA request, or to the NORMAL state by starting to send the INF of all clients, where the INF of the connecting client must come last. When the hub sends an INF about itself, NI becomes the hub name, VE the hub version etc.

4.3.4 MSG

MSG <text>

Types: A, B, D, I, P

A chat message. The receiving clients should precede it with ‘<’ nick ‘>’, to allow for uniform displaying of messages. The client should not send its own nick in the text.

Flags:

PM<group-CID>	Private message, <group-CID> is the reply-to CID, and should be shown as header for the chat (window title, etc). This is used to implement group discussions such as op-chat. Must contain the originating CID if this is a normal private conversation.
ME	1 = message should be displayed as /me in IRC (“*nick text”)

4.3.5 SCH

SCH

Types: P, U, D, (B), (A)

Search. Each parameter is an operator followed by a term. Each term is a two-letter code followed by the data to search for. Clients that don’t recognize a field should ignore the search.

++, --, EX	String search term, where ++ is include, -- is exclude, and EX is extension. Each filename (including the path to it) should be matched using case insensitive substring search as follows: match all ++, remove those that match any --, and make sure the extension matches at least one of the EX (if it is present). Extensions must be sent without the leading ‘.’.
<=	Smaller than or equal size in bytes
>=	Larger than or equal size in bytes
==	Exact size in bytes
TO	Token, string. Used by the client to tell one search from the other. If present, the responding client must copy this field exactly to each search result.
TR	Tiger tree hash root, encoded with base32.
TY	File type, to be chosen from the following (none specified = any type): 1 = File 2 = Directory

Note that hubs normally only relay searches to passive clients (type P) and clients send searches to active clients by themselves using type U, which should prove a massive bandwidth saver for the hubs. Should ISP’s dislike this, a switch to type B searching is easily done.
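The matching rules for the search terms can be sketched as follows. The function name and argument layout are invented for this example; terms are assumed to arrive as already parsed (operator, value) pairs, and treating EX as a suffix match on the file name is one possible reading of “extension matches”.

```python
from typing import Iterable, Tuple


def matches(path: str, size: int, tth: str,
            terms: Iterable[Tuple[str, str]]) -> bool:
    """Check one shared entry against the terms of a SCH."""
    folded = path.casefold()
    name = folded.rstrip("/").rsplit("/", 1)[-1]
    is_dir = path.endswith("/")
    extensions = []
    for op, value in terms:
        if op == "++":
            if value.casefold() not in folded:
                return False
        elif op == "--":
            if value.casefold() in folded:
                return False
        elif op == "EX":
            extensions.append(value.casefold())
        elif op == "<=":
            if size > int(value):
                return False
        elif op == ">=":
            if size < int(value):
                return False
        elif op == "==":
            if size != int(value):
                return False
        elif op == "TR":
            if tth != value:
                return False
        elif op == "TY":
            if (value == "1" and is_dir) or (value == "2" and not is_dir):
                return False
        elif op == "TO":
            continue          # token, only copied into the results
        else:
            return False      # unrecognized field: ignore the search
    if extensions and not any(name.endswith("." + e) for e in extensions):
        return False
    return True
```

Returning False for an unrecognized operator mirrors the rule that a client which doesn’t recognize a field should ignore the search.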

4.3.6 RES

RES

Types: D, U

Search result, made up of fields similar to the INF ones. It is of course better for the network if the client sends all it knows about a file, unless it’s a lot of data. Search results without size and filename are obviously useless, but if a client has hashing or any other meta-data to add, that’s only good. Passive results should be limited to 5, active to 10. To return a directory as result, make sure the name ends with a path separator.

FN	Full filename including path
SI	Size, in bytes
SL	Slots currently available
TO	Token
TR	Tiger tree hash root, encoded with base32.
TD	Tiger tree depth, index of the highest level of tree data available, only root = 0, first level (2 leaves) = 1, second level = 2, etc…(this is useful when we want to verify a file and search for the most detailed tree)

4.3.7 CTM

CTM <proto> <port> TO<token>

Types: D

Connect to me. Used by active clients that want to connect to someone, or in answer to RCM. Only TCP active clients may send this. TO is a string that can be used to identify the connection once a direct connection has been made, but is not mandatory. If present, it must be passed with the initial INF during client-client connect. It is recommended that clients use tokens to avoid malicious clients that connect directly without going through the hub. <proto> is an arbitrary string specifying the protocol to connect with; in the case of an ADC compliant connection attempt, this should be the string “ADC/1.0”. If this is a response to a RCM, the <token> and <proto> fields should just be copied directly (if the protocol is supported of course). If a protocol is not supported, a DSTA must be sent indicating this.

4.3.8 RCM

RCM <proto> TO<token>

Types: D

Reverse CTM. Used by passive clients to signal that they want a connection token from an active client.

4.3.9 GPA

GPA <data>

Types: I

States: VERIFY

Get Password. The data parameter is at least 24 random bytes (base32 encoded), used to avoid replay attacks on the password.

4.3.10 PAS

PAS <password>

Types: H

States: VERIFY

Password. The value sent is computed by concatenating the CID (in base32), the password and the random data from GPA (binary, decoded byte-by-byte), passing the result through the Tiger hash algorithm (not Tiger Tree), and base32 encoding the resulting hash. When validated, this moves the server into NORMAL state.

4.3.11 QUI

QUI <CID>

Types: I

States: IDENTIFY, VERIFY, NORMAL

The client identified by CID disconnected from the hub. If the CID is the same as that of the client receiving the QUI, it means that it should take action according to the reason (i.e. redirect or not reconnect in case of ban). The hub must never send any more data after the QUI to the client being disconnected.

The following flags may be present:

ID	CID of the initiator of the disconnect (for example the one that issued a kick)
TL	Time Left until reconnect is allowed, in seconds. -1 = forever.
MS	Message
RD	Redirect server url
DI	Any client that has this flag in the QUI message should have its transfers terminated by other clients connected to it, as it is unwanted in the system.

4.3.12 DSC

DSC <victim-CID>

Types: H

This is the friendly disconnect command. Kicks / Bans / Redirects etc can be implemented as User Commands to avoid locking onto a particular topology for who should receive the reasons, kick times etc.

4.3.13 GET

GET <type> <identifier> <start-pos> <bytes>

Types: C, H, I

Requests for a certain file or binary data to be transmitted. <start-pos> counts 0 as the first byte. <bytes> may be set to -1 to indicate that it is unknown. <type> is a string of [a-zA-Z0-9] characters that specifies the namespace for identifier; BASE requires that clients recognize the types “file”, “tthl” and “list”. Extensions may add to the identifier names as well as add new types.

“file” transfers transfer the file data in binary, starting at <start-pos> and sending <bytes> bytes. Identifier must be a TTH root value from the “TTH/” root.

“tthl” transfers send the highest level of leaves available (the one containing the most leaves) as a binary stream of leaves, right-to-left, with no spacing in between them. <start-pos> must be set to 0 and <bytes> to -1 when requesting the data. <bytes> must contain the total binary size of the leaf stream in SND, and by dividing this length by the individual hash length, the number of leaves, and thus the leaf level, can be deduced. The received leaves can then be used to reconstruct the entire tree, and the resulting root must match the root of the file (this verifies the integrity of the tree itself). Identifier must be a TTH root value from the “TTH/” root.

“list” transfers are used for partial file lists and have a directory as identifier. <start-pos> is always 0 and <bytes> will contain the uncompressed length of the generated XML text in the corresponding SND. An optional flag “RE1” means that the client is requesting a recursive list and that the sending client should send the directory itself and all subdirectories as well. If this is too much, the sending client may choose to send only parts. The flag should be taken as a hint that the requesting client will be getting the subdirs as well, so they might as well be sent in one go. Identifier must be a directory in the unnamed root, ending (and beginning) with ‘/’.

Passive clients depend on the “no slots” error being recoverable. If a client gets a recoverable error after a GET command and has nothing else to do, it must send NTD; otherwise the passive client will never get a chance at downloading if the other client has a file queued.
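Deducing the leaf level from the length of the leaf stream could be done roughly like this. The function name is hypothetical, and the file size is assumed to be known already (from a search result, for instance). A Tiger hash is 24 bytes, and each tree level doubles the block size starting from the 1024-byte base segment.

```python
from typing import Tuple

TIGER_HASH_BYTES = 24   # Tiger produces 192-bit digests
BASE_SEGMENT = 1024     # base segment size required for the tree


def leaf_layout(file_size: int, stream_bytes: int) -> Tuple[int, int]:
    """Return (number of leaves, bytes of file data covered by each leaf)."""
    if stream_bytes <= 0 or stream_bytes % TIGER_HASH_BYTES != 0:
        raise ValueError("leaf stream is not a whole number of hashes")
    leaves = stream_bytes // TIGER_HASH_BYTES
    block = BASE_SEGMENT
    while True:
        # Number of nodes on the level where each node covers `block` bytes;
        # an empty file still has one leaf.
        count = max(1, -(-file_size // block))
        if count == leaves:
            return leaves, block
        if count == 1:
            raise ValueError("leaf count does not match any tree level")
        block *= 2
```

For a 1 MiB file and a 384-byte leaf stream, this yields 16 leaves of 64 KiB each.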

Note that GET can also be used for binary transfers between hub and client.

4.3.14 GFI

GFI <type> <identifier>

Types: C

Get File Information, request that the other client returns a RES with relevant file data, for example size. Type and identifier are the same as for GET.

4.3.15 SND

SND <type> <identifier> <start-pos> <bytes>

Types: C, H, I

State transition to DATA state. The sender will keep on sending until <bytes> bytes of binary data have been sent, and then will put itself back to NORMAL state. The parameters are essentially a mirror of the GET parameters, but bytes must always be specified (may not be -1, no streaming support).

4.3.16 NTD

NTD

Types: C

Nothing to do. This is sent by the server, to indicate that it passes control over the NORMAL state over to the other client, effectively making it the server. It is always the server that has the first say in who will transfer files, this way we don’t have to remember if we’re connecting because of a CTM or because we want to download. A client that receives NTD and has nothing to do itself should disconnect.

5 Examples

5.1 Client – Hub connection

Client	Hub
HSUP <Client-CID> +BASE <other-features>
	ISUP <Hub-CID> +BASE <other-features> IINF <Hub-CID> HU1 …
BINF <Client-CID> …
	IGPA <Hub-CID> …
HPAS <Client-CID> …
	BINF <all clients> BINF <Client-CID> …
…	…

5.2 Client – Client connection

Client	Server
CSUP <CID> +BASE <other-features>
	CSUP <CID> +BASE <other-features> CINF <CID>
CINF <CID> TO<token>
	CGET <CID> …
CSND <CID> … <data>
	CNTD <CID>
CGET <CID> …
	CSND <CID> … <data>
CNTD <CID>
	<disconnect>

6 Standard Extensions

6.1 REGX

Regular expressions in searches. Extends the SCH command with an additional operator RE that takes a regular expression in the Perl form (must be compatible with libpcre, http://www.pcre.org/). Clients that support this must add a flag RE1 in their INF, and send “REGX” in their support string to the hub. Full support for Unicode regular expression (for case insensitive regex’es for instance) is required.

6.2 ZLIB

ZLib compressed communication. There are two variants of zlib support, FULL and GET, and only one should be used on each communication channel set up.

6.2.1 ZLIB-FULL

If, during initial SUP negotiation, both ends send “ZLIF” in their support string, it means that all subsequent message passing will be tunneled in one long zlib stream. Care must be taken to partially flush the zlib buffer when needed to ensure that the commands are in a decompressible state when they arrive at the other end.
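The partial flush mentioned above can be sketched with Python's zlib module. The class names are invented for this example; Z_SYNC_FLUSH is used so that everything written so far can be decompressed by the receiver without ending the stream.

```python
import zlib
from typing import List


class ZlibSender:
    """Compress outgoing lines into one continuous zlib stream."""

    def __init__(self) -> None:
        self._compressor = zlib.compressobj()

    def pack(self, line: str) -> bytes:
        data = self._compressor.compress(line.encode("utf-8"))
        # Partial flush: the bytes returned so far decode to complete lines.
        return data + self._compressor.flush(zlib.Z_SYNC_FLUSH)


class ZlibReceiver:
    """Decompress incoming chunks and split them back into lines."""

    def __init__(self) -> None:
        self._decompressor = zlib.decompressobj()
        self._pending = b""

    def feed(self, chunk: bytes) -> List[str]:
        self._pending += self._decompressor.decompress(chunk)
        *lines, self._pending = self._pending.split(b"\n")
        return [raw.decode("utf-8") for raw in lines]
```

Flushing after every command costs some compression ratio; a sender could batch several commands before flushing when latency is less important.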

6.2.2 ZLIB-GET

The alternative is to send “ZLIG” to indicate that ZLib is supported for binary transfers using the GET command, but not otherwise (memory constraints in the hub for example). A flag “ZL1” is added to the SND command to indicate that the data will come compressed, and the receiving client requests it by adding the same flag to GET (the sending client may ignore a request for a compressed transfer, but may also use it even when not requested by the receiver). The <bytes> parameter of the GET and SND commands is to be interpreted as the number of uncompressed bytes to be transferred.

6.3 UCMD

User commands. User commands are used to send hub-specific commands to the client, to provide useful shortcuts for the user. These commands contain strings which are sent back to the hub, and parameter substitutions in the strings. They are limited to sending back simple information to the hub to avoid security issues where the hub could send malicious commands to the user. Each user command has a display name, a string to be sent to the hub, and one or more categories where it may appear. Parameters are essentially dictionaries that map one string to another, and each context has its own maps. The strings passed to the hub must first be passed through a dictionary replacement that replaces all keywords in the string, and then through the equivalent of the C standard function “strftime”, with the current local time specified.

6.3.1 CMD

CMD <name>

Types: I

Name uniquely (per hub) identifies a particular user command. The name may contain ‘/’ to indicate a menu structure on the viewing hub, where each ‘/’ introduces a new level of submenus. Other than name, the command also has a number of flags that further detail what to do with it.

RM	1 = Remove command
CT	Context, the following flags added together: 1 = Hub command, client parameters only 2 = User list command, client and user parameters 4 = Search result command, client, user and file parameters 8 = File list command, client, user and file parameters
TT	Text to be sent to hub
CO	1 = Constrained, when sending this command on multiple users (for example in search results), constrain it to once per CID only
SP	1 = Insert separator instead of command name (name must still be present to uniquely identify the command)

Keywords are specified using “%[keyword]”, similar to strftime but with brackets enclosing the keyword. The same rules apply to the ‘%’ as in strftime, it is escaped by itself (but the keyword substitution function should not touch it, as strftime will). Unknown keywords must be replaced by the empty string.
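The two substitution passes can be sketched as follows. The function name expand and the optional when argument (used to pin the time) are assumptions of this example. Note that, following the order given in the text, a ‘%’ inside a substituted value is still seen by strftime.

```python
import re
import time
from typing import Dict, Optional

# Match an escaped percent first so that "%%[x]" is not treated as a keyword.
_KEYWORD = re.compile(r"%%|%\[([^\]]*)\]")


def expand(text: str, params: Dict[str, str],
           when: Optional[time.struct_time] = None) -> str:
    """Replace %[keyword] from params, then run the result through strftime."""

    def replace(match: "re.Match[str]") -> str:
        if match.group(0) == "%%":
            return "%%"                          # left for strftime to unescape
        return params.get(match.group(1), "")    # unknown keyword: empty string

    substituted = _KEYWORD.sub(replace, text)
    moment = when if when is not None else time.localtime()
    return time.strftime(substituted, moment)
```

A command text such as “HDSC %[myCID] %[userCID]” would be expanded with the client and user parameter maps merged into one dictionary for the context in which the command was invoked.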

The following tables specify the parameters that must be available, as a minimum.

Client parameters

myCID	CID of the client
myXX	One for each of the flags on that particular hub, for example myI4, myNI, etc.

User parameters

userCID	CID of the user
userXX	One for each of the flags the user sent, for example userI4, userNI, etc.

File parameters

fileXX	One for each of the flags received in a search result or the corresponding data taken from a file list (see RES)