Difference between revisions of "EC Protocol HOWTO"

From AMule Project FAQ

Revision as of 11:23, 30 June 2008

Work in progress, this site is under heavy construction.

Basic Protocol Structure

Protocol definition

Short description:

The EC protocol consists of two layers: a low-level transmission layer and a high-level application layer.

The transmission layer consists of two uint32 values.
The first is a flags value that specifies the format of the message, e.g. whether the packet uses UTF-8 encoded numbers or is compressed with zlib.
The second gives the size of the application layer data.

The application layer consists of an op-code and a tag counter, followed by a tag structure.
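
As a rough illustration of the two-value header described above, the following Python sketch splits a received buffer into the flags value and the application layer data. It assumes both values are sent as four bytes each in network byte order, which is how they appear in the hex examples at the end of this page. The function name read_header is made up for this illustration and is not part of aMule.

```python
import struct

def read_header(buf):
    """Return (flags, body) from one complete EC message (hypothetical helper)."""
    # Assumption: flags and length are both big-endian uint32 values.
    flags, length = struct.unpack_from(">II", buf, 0)
    body = buf[8:8 + length]
    if len(body) != length:
        raise ValueError("truncated message")
    return flags, body

# Header bytes taken from the authorization example on this page,
# followed by 54 dummy bytes standing in for the real body.
flags, body = read_header(bytes.fromhex("00000022 00000036") + bytes(54))
print(hex(flags), len(body))
```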

Transmission layer

The transmission layer is completely independent of the application layer,
and holds only transport-related information.

The transmission layer actually consists of a uint32 number, referred to below as flags,
which describes the flags for the current transmission session (send/receive operation).

This four-byte value is the only one in the whole protocol that is transmitted LSB first,
with zero bytes omitted (so an empty transmission flags value is sent as 0x20, not 0x20 0x0 0x0 0x0).

Bit description:

bit 0: Compression flag. When set, zlib compression is applied to the application layer's data.
bit 1: Compressed numbers. When set (presumably on small packets that are not worth compressing with zlib), all numbers used
in the protocol are encoded as a wide character converted to UTF-8, so that some zero bytes need not be sent over the network.
bit 2: Has ID. When this flag is set, a uint32 number follows the flags, which is the ID of this packet. The response to this
packet must carry the same ID. The only requirement for ID values is that they should be unique within one session (or at
least not repeat for a reasonably long time).
bit 3: Reserved for later use.
bit 4: Accepts value present. A client sets this flag and sends another uint32 value (encoded as above, LSB first, zero
bytes omitted), which is a fully constructed flags value whose set bits mean that the client can accept those extensions.
No extension may be used until the other side has sent an accept value for it. It is not defined when this value
should be sent; the first transfer is best, but it can be sent at any later time, even changing the previously announced flags.
bit 5: Always set to 1, to distinguish from older (pre-rc8) clients.
bit 6: Always set to 0, to distinguish from older (pre-rc8) clients.
bits 7,15,23: Extension flag, means that the next byte of the flags is present.
bits 8-14,16-22,24-31: Reserved for later use.


Transmission layer example:

0x30 0x23 <appdata> - Client uses no extensions on this packet, and indicates that it can accept zlib compression and compressed numbers.

Notes:

Note 1: On the "accepts" value, the predefined flags must be set to their predefined values, because this can be used as a sort of a sanity check.
Note 2: Bits marked as "reserved" should always be set to 0.
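
The flag layout above can be sketched in Python as follows. The sketch follows the description in this section literally (least significant byte first, a further byte only when the extension bit of the previous byte is set). The constant and function names are invented for the example and do not come from the aMule sources, and the handling of the packet ID (bit 2) is left out.

```python
# Bit masks for the flags value, as listed in the bit description.
FLAG_ZLIB = 0x01          # bit 0
FLAG_UTF8_NUMBERS = 0x02  # bit 1
FLAG_HAS_ID = 0x04        # bit 2
FLAG_ACCEPTS = 0x10       # bit 4
FLAG_ALWAYS_SET = 0x20    # bit 5
FLAG_ALWAYS_CLEAR = 0x40  # bit 6
EXTENSION_BIT = 0x80      # bit 7 of each byte: another flags byte follows

def read_flags(buf, pos=0):
    """Read one flags value; return (value, position after it)."""
    value = 0
    for shift in (0, 8, 16, 24):
        byte = buf[pos]
        pos += 1
        value |= byte << shift
        if not byte & EXTENSION_BIT:
            break
    return value, pos

def looks_sane(flags):
    """Sanity check on the two predefined bits (bit 5 set, bit 6 clear)."""
    return bool(flags & FLAG_ALWAYS_SET) and not flags & FLAG_ALWAYS_CLEAR

# The transmission layer example: flags 0x30 followed by the accepts value 0x23.
wire = bytes([0x30, 0x23])
flags, pos = read_flags(wire)
accepts = None
if flags & FLAG_ACCEPTS:
    accepts, pos = read_flags(wire, pos)
```

Here accepts ends up with bits 0, 1 and 5 set, i.e. the sender announces that it can accept zlib compression and compressed numbers.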


Application layer

Data transmission is done in packets. A packet can be considered as
a special tag - with no data, no tagLen field, and with the tagCount
field always present. All numbers that are part of the application layer are
transmitted in network byte order, i.e. MSB first.

A packet contains the following:
[ec_opcode_t] OPCODE
[uint16] TAGCOUNT
<tags>

In detail: The opcode tells what to do or what the data fields contain.
Its type is ec_opcode_t, which is currently a uint8.
TagCount is the number of first-level tags this packet has. The tags
themselves follow.

A tag consists of:
[ec_tagname_t] TAGNAME
[ec_tagtype_t] TAGTYPE
[ec_taglen_t] TAGLEN
<[uint16] TAGCOUNT>?
<sub-tags>
<tag data>

ec_tagname_t is currently defined as a uint16, ec_taglen_t as a uint32
and ec_tagtype_t as a uint8.
TagName tells what the tag contains (see ECCodes.h for details).
TagType gives the type of this tag (see ECPacket.h for types).
TagLen contains the whole length of the tag, including the lengths of any
sub-tags, but without the size of the tagName, tagType and
tagLen fields. Note that the lowest bit of the tagName field doesn't belong to the
tag name itself, so it has to be cleared before checking the name.

Tags may contain sub-tags to store information, and a tagCount field
is present only for such tags. The presence of the tagCount field can
be tested by checking the lowest bit of the tagName field: when it is
set, the tagCount field is present.

When a tag contains sub-tags, the sub-tags are sent before the tag's own
data. The tag data length can therefore be calculated by subtracting the
length of all sub-tags from the tagLen value; the remainder, if non-zero,
is the data length.
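
The packet and tag layout described above could be parsed roughly like this. It is a minimal sketch for the plain format only (no zlib, no UTF-8 compressed numbers). It assumes that tagLen also covers the tagCount field, which is what the byte counts of the search request example on this page suggest. The function names and the dictionary layout are invented for the illustration.

```python
import struct

def parse_tag(buf, pos):
    """Parse one tag starting at pos; return (tag, position after the tag)."""
    raw_name, tag_type, tag_len = struct.unpack_from(">HBI", buf, pos)
    pos += 7                      # tagName (2) + tagType (1) + tagLen (4)
    end = pos + tag_len           # tagLen counts everything after the header
    subtags = []
    if raw_name & 1:              # lowest bit set: a tagCount field follows
        (count,) = struct.unpack_from(">H", buf, pos)
        pos += 2
        for _ in range(count):
            sub, pos = parse_tag(buf, pos)
            subtags.append(sub)
    tag = {
        "name": raw_name >> 1,    # tag code without the sub-tag bit
        "type": tag_type,
        "subtags": subtags,
        "data": buf[pos:end],     # whatever remains after the sub-tags
    }
    return tag, end

def parse_packet(body):
    """Parse application layer data; return (opcode, list of tags)."""
    opcode, tag_count = struct.unpack_from(">BH", body, 0)
    pos = 3
    tags = []
    for _ in range(tag_count):
        tag, pos = parse_tag(body, pos)
        tags.append(tag)
    return opcode, tags

# Application layer bytes of the search request example on this page.
body = bytes.fromhex(
    "26 0001 0e03 02 00000017 0002"
    " 0e04 06 00000005 7465737400"
    " 0e0a 06 00000001 00"
    " 00"
)
opcode, tags = parse_packet(body)
```

For the search request this yields one first-level tag with two sub-tags and a single byte of its own data.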


Future Changes

Future changes of the EC protocol (probably after 2.2.0) may be:

  • no more \0 for string termination
  • last bit of flag byte indicates a following flag byte, and so on


Resources

Definitions of the op-codes and tag codes can be found at these locations in the source:

  • ./src/lib/ec/[c#|cpp|java]/ECCodes.[cs|h|java]
  • ./docs/EC_Protocol.txt (outdated, but contains much useful information)


Examples

Notes:

  • aMule sends EC packets in two flavours, depending on the packet size (although it would understand other flag combinations as well):
    • zlib compressed application data that doesn't use UTF-8 compressed numbers when decompressed.
    • UTF-8 compressed numbers in the application data.
  • The tag size doesn't take into account the size of UTF-8 compressed numbers in subtags. When parsing, you may want to ignore the length field entirely and derive the length from the size of the subtags plus the size of the value field (determined by the value type).




This is a packet, in hex values, that is sent to aMule for authorization:

00 00 00 22 //flag
00 00 00 36 //packet body length 54
02      //EC_OP_AUTH_REQ
04      //tag count

c8 80          //EC_TAG_CLIENT_NAME
06             //EC_TAGTYPE_STRING
0d             //value length 13
61 6d 75 6c 65 2d 72 65 6d 6f 74 65 00 //"amule-remote\0"

c8 82          //EC_TAG_CLIENT_VERSION
06             //EC_TAGTYPE_STRING
07             //value length 7
30 78 30 30 30 31 00 // "0x0001\0"

04             //EC_TAG_PROTOCOL_VERSION
03             //EC_TAGTYPE_UINT16
02             //value length 2
02 00          //value is defined by EC_CURRENT_PROTOCOL_VERSION

02              //EC_TAG_PASSWD_HASH
09             //EC_TAGTYPE_HASH16
10             //value length 16
47 bc e5 c7 4f 58 9f 48 //md5 hashed password string
67 db d5 7e 9c a9 f8 08 //password "aaa" was used
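
The 16 bytes of the EC_TAG_PASSWD_HASH value above can be reproduced with a few lines of Python. This sketch assumes the password is hashed as plain ASCII bytes, with no terminating \0 and no salt, which is consistent with the value shown for "aaa".

```python
import hashlib

# MD5 digest of the password "aaa", as raw bytes (16 bytes long).
password_hash = hashlib.md5(b"aaa").digest()
print(password_hash.hex(" "))
```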

c8 80 is in fact a UTF-8 encoded number. It decodes to 02 00 (512 in decimal).
Like every tag code, it has been shifted one bit to the left to make room for a bit that indicates the presence of subtags.
The lowest bit of 02 00 is 0, so this tag doesn't have subtags.
Shifting the value one bit to the right (i.e. dividing by 2) gives 01 00.
That is the value that can be found in ECCodes.h.
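
The decoding steps just described can be written down as a small Python sketch. It decodes the number by hand, following the standard UTF-8 bit layout for sequences of up to four bytes. The function names are made up for this example and are not taken from the aMule sources.

```python
def read_utf8_number(buf, pos=0):
    """Decode one UTF-8 encoded number; return (value, position after it)."""
    first = buf[pos]
    if first < 0x80:                 # single byte, value 0..127
        return first, pos + 1
    if first >> 5 == 0b110:          # two-byte sequence
        length, value = 2, first & 0x1F
    elif first >> 4 == 0b1110:       # three-byte sequence
        length, value = 3, first & 0x0F
    elif first >> 3 == 0b11110:      # four-byte sequence
        length, value = 4, first & 0x07
    else:
        raise ValueError("invalid UTF-8 lead byte")
    tail = buf[pos + 1:pos + length]
    if len(tail) != length - 1:
        raise ValueError("truncated UTF-8 sequence")
    for byte in tail:
        if byte >> 6 != 0b10:
            raise ValueError("invalid UTF-8 continuation byte")
        value = (value << 6) | (byte & 0x3F)
    return value, pos + length

def split_tag_name(raw):
    """Split a raw tag name into (tag code, has_subtags)."""
    return raw >> 1, bool(raw & 1)

raw, _ = read_utf8_number(bytes.fromhex("c880"))
code, has_subtags = split_tag_name(raw)
print(hex(raw), hex(code), has_subtags)
```

Applied to the second tag of the packet, c8 82 decodes to 02 02 and gives the tag code 01 01 in the same way.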




This is a simple search request that is sent without UTF-8 compressed numbers.

00 00 00 20 //plain format, no compression
00 00 00 21 //message length: 33
 
26	//EC_OP_SEARCH_START
00 01	//tag count
	0e 03	//EC_TAG_SEARCH_TYPE
	02	//EC_TAGTYPE_UINT8
	00 00 00 17	//tag length: 23
	00 02	//subtag count

		0e 04		//EC_TAG_SEARCH_NAME
		06		//EC_TAGTYPE_STRING
		00 00 00 05	//tag length
		74 65 73 74 00 	//"test\0"
 
		0e 0a		//EC_TAG_SEARCH_FILE_TYPE
		06		//EC_TAGTYPE_STRING
		00 00 00 01	//tag length
		00		//"\0"
 
	00			//uint8 search type (local)
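
To show how such a request might be assembled, here is a minimal Python sketch that rebuilds the search request above byte for byte. It assumes the plain format (flags 0x20, all numbers in network byte order) and takes the numeric op-code and tag codes directly from the hex dump. The helper names build_tag and build_packet are invented for this illustration.

```python
import struct

def build_tag(name, tag_type, data=b"", subtags=()):
    """Serialize one tag; subtags are already serialized byte strings."""
    raw_name = name << 1              # make room for the sub-tag bit
    body = b""
    if subtags:
        raw_name |= 1                 # lowest bit: tagCount field present
        body += struct.pack(">H", len(subtags)) + b"".join(subtags)
    body += data                      # own data comes after the sub-tags
    return struct.pack(">HBI", raw_name, tag_type, len(body)) + body

def build_packet(opcode, tags, flags=0x20):
    """Serialize a whole message: flags, length, opcode, tagCount, tags."""
    body = struct.pack(">BH", opcode, len(tags)) + b"".join(tags)
    return struct.pack(">II", flags, len(body)) + body

# Values read off the hex dump above (tag codes are the raw names shifted right).
EC_OP_SEARCH_START = 0x26
EC_TAG_SEARCH_TYPE = 0x0701
EC_TAG_SEARCH_NAME = 0x0702
EC_TAG_SEARCH_FILE_TYPE = 0x0705
EC_TAGTYPE_UINT8 = 0x02
EC_TAGTYPE_STRING = 0x06

name_tag = build_tag(EC_TAG_SEARCH_NAME, EC_TAGTYPE_STRING, b"test\x00")
file_type_tag = build_tag(EC_TAG_SEARCH_FILE_TYPE, EC_TAGTYPE_STRING, b"\x00")
search_tag = build_tag(EC_TAG_SEARCH_TYPE, EC_TAGTYPE_UINT8, b"\x00",
                       [name_tag, file_type_tag])
packet = build_packet(EC_OP_SEARCH_START, [search_tag])
print(packet.hex(" "))
```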