Writing a BitTorrent Client

This post is a short overview of what is required for a minimal, download-only BitTorrent implementation.

Earlier this year, I wrote a BitTorrent client as an excuse to practice concurrency and networking concepts. But the resources and documentation I found while researching the protocol felt scattered, so I’m distilling my understanding here as a starting point for others.

1. Parse the `.torrent` metainfo file

The `.torrent` file contains information about the torrent tracker and the files to be downloaded. The data is encoded in a serialization format called bencoding. Parsing bencoded data is not significantly more difficult than parsing JSON, and there is likely a bencoding library available for your language.
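To give a feel for how small the format is, here is a minimal sketch of a bencoding decoder. The function name `bdecode` is my own, and it skips the strictness checks a real library would perform (for example, rejecting integers with leading zeros or unsorted dictionary keys).

```python
def bdecode(data: bytes, i: int = 0):
    """Decode one bencoded value starting at index i.

    Returns (value, index_after_value). Strings stay as bytes because
    fields such as the piece hashes are not valid text.
    """
    c = data[i:i + 1]
    if c == b"i":                          # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if c == b"l":                          # list: l<item>...e
        i += 1
        items = []
        while data[i:i + 1] != b"e":
            item, i = bdecode(data, i)
            items.append(item)
        return items, i + 1
    if c == b"d":                          # dictionary: d<key><value>...e
        i += 1
        result = {}
        while data[i:i + 1] != b"e":
            key, i = bdecode(data, i)
            value, i = bdecode(data, i)
            result[key] = value
        return result, i + 1
    if c.isdigit():                        # byte string: <length>:<bytes>
        colon = data.index(b":", i)
        length = int(data[i:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError(f"invalid bencoded data at index {i}")
```

Calling `bdecode(raw)[0]` on the raw bytes of a metainfo file should yield a dictionary whose keys include `b"announce"` and `b"info"`.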

2. Connect to the tracker

To join a torrent, the client sends an HTTP GET request to the tracker's announce URL. The response, which is itself bencoded, provides a list of available peers.
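As a rough sketch of this step, the snippet below builds an announce URL and decodes a peer list in the compact format (6 bytes per peer: a 4-byte IPv4 address followed by a 2-byte big-endian port). The helper names `build_announce_url` and `parse_compact_peers` are hypothetical, the default port 6881 is only a common convention, and the code assumes you have already computed `info_hash` as the SHA-1 digest of the bencoded `info` dictionary.

```python
import socket
import struct
from urllib.parse import urlencode


def build_announce_url(announce, info_hash, peer_id, left, port=6881):
    """Return the tracker GET URL for a client that has downloaded nothing yet."""
    params = {
        "info_hash": info_hash,   # 20 raw bytes, percent-encoded by urlencode
        "peer_id": peer_id,       # 20 bytes chosen by the client
        "port": port,
        "uploaded": 0,
        "downloaded": 0,
        "left": left,             # bytes still needed
        "compact": 1,             # ask for the compact peer list format
    }
    separator = "&" if "?" in announce else "?"
    return announce + separator + urlencode(params)


def parse_compact_peers(blob):
    """Split a compact peer string into a list of (ip, port) tuples."""
    peers = []
    for offset in range(0, len(blob) - len(blob) % 6, 6):
        ip = socket.inet_ntoa(blob[offset:offset + 4])
        (port,) = struct.unpack(">H", blob[offset + 4:offset + 6])
        peers.append((ip, port))
    return peers
```

The URL can then be fetched with any HTTP client, and the bencoded response body decoded to reach its `peers` field.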

3. Concurrent peer network connections

The client connects to peers over TCP sockets. To support multiple simultaneous connections, the client should be able to handle network operations asynchronously. There are two fundamental ways to do this in Python: (1) using threads, or (2) using an event loop with select() (or a library like Twisted, which does so internally).
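To illustrate the event loop option, here is a small sketch built on the standard library `selectors` module, which wraps select() and its platform-specific equivalents. The function name `poll_once` and the idea of tagging each socket with a label through the `data` argument are my own choices for the example, not part of any BitTorrent library.

```python
import selectors


def poll_once(sel, timeout=1.0):
    """Wait for readable sockets once and return {label: bytes_received}.

    Each socket is expected to have been registered with
    sel.register(sock, selectors.EVENT_READ, data=label).
    """
    received = {}
    for key, _events in sel.select(timeout):
        sock = key.fileobj
        data = sock.recv(4096)
        if data:
            received[key.data] = data
        else:
            # An empty read means the peer closed the connection.
            sel.unregister(sock)
            sock.close()
    return received
```

A client would call `poll_once` in a loop, appending the received bytes to a per-peer buffer and parsing complete messages out of it. One thread can serve many peers this way, at the cost of never being allowed to block on any single socket.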

4. Peer protocol

The spec defines a number of messages that each peer must be prepared to send and receive. A minimal BitTorrent client may not need to implement all of these messages. In order to start downloading from a peer, a client needs to send a handshake, wait for a handshake response, send an ‘interested’ message, and wait for an ‘unchoke’ message. It can then start sending ‘request’ messages to request blocks. The peer will respond with ‘piece’ messages which contain the block data.
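The messages in that sequence are simple to build by hand. The sketch below covers only the ones mentioned above; the helper names are mine. It relies on the wire format from the spec: a 68-byte handshake, and after that, messages made of a 4-byte big-endian length prefix, a 1-byte message ID, and a payload.

```python
import struct

PSTR = b"BitTorrent protocol"
UNCHOKE, INTERESTED, REQUEST, PIECE = 1, 2, 6, 7   # message IDs from the spec


def build_handshake(info_hash, peer_id):
    """<pstrlen><pstr><8 reserved bytes><info_hash><peer_id>, 68 bytes in total."""
    return bytes([len(PSTR)]) + PSTR + bytes(8) + info_hash + peer_id


def build_message(msg_id, payload=b""):
    """Length prefix counts the ID byte plus the payload."""
    return struct.pack(">IB", len(payload) + 1, msg_id) + payload


def build_request(index, begin, length):
    """Ask for `length` bytes at offset `begin` within piece `index`."""
    return build_message(REQUEST, struct.pack(">III", index, begin, length))


def parse_message(buf):
    """Return (msg_id, payload, rest), or None if buf holds no complete message.

    A keep-alive (length prefix of zero) is reported with msg_id None.
    """
    if len(buf) < 4:
        return None
    (length,) = struct.unpack(">I", buf[:4])
    if len(buf) < 4 + length:
        return None
    if length == 0:
        return None, b"", buf[4:]
    return buf[4], buf[5:4 + length], buf[4 + length:]


def parse_piece(payload):
    """Split a 'piece' payload into (index, begin, block_bytes)."""
    index, begin = struct.unpack(">II", payload[:8])
    return index, begin, payload[8:]
```

Because TCP delivers a byte stream rather than discrete messages, `parse_message` is written to work against a growing buffer: it returns `None` until a full message has arrived and hands back the unconsumed remainder.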

5. Torrent strategy

The client must download all blocks of all pieces and assemble them into the complete output file set. If any peers disconnect or fail to provide a block, the client must request from another peer. A more ambitious client may also attempt to further optimize its download strategy to improve download times.
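One possible way to organize this bookkeeping is sketched below. It assumes the conventional 16 KiB block size and that the metainfo `pieces` field is a concatenation of 20-byte SHA-1 hashes, one per piece. The names `block_requests`, `piece_is_valid`, and `BlockQueue` are hypothetical, and the "retry lost blocks first" policy is just one simple choice.

```python
import hashlib
from collections import deque

BLOCK_SIZE = 16384  # 16 KiB, the block size clients conventionally request


def block_requests(piece_index, piece_length, total_length):
    """Yield (index, begin, length) for every block of one piece.

    The final piece of a torrent is usually shorter than piece_length.
    """
    start = piece_index * piece_length
    size = min(piece_length, total_length - start)
    for begin in range(0, size, BLOCK_SIZE):
        yield piece_index, begin, min(BLOCK_SIZE, size - begin)


def piece_is_valid(piece_data, pieces_field, piece_index):
    """Compare a finished piece against its SHA-1 hash from the metainfo."""
    expected = pieces_field[piece_index * 20:(piece_index + 1) * 20]
    return hashlib.sha1(piece_data).digest() == expected


class BlockQueue:
    """Tracks which blocks are still needed and which peer was asked for each."""

    def __init__(self, blocks):
        self.pending = deque(blocks)
        self.in_flight = {}              # block -> peer it was requested from

    def assign(self, peer):
        """Hand the next pending block to a peer, or None if nothing is pending."""
        if not self.pending:
            return None
        block = self.pending.popleft()
        self.in_flight[block] = peer
        return block

    def complete(self, block):
        self.in_flight.pop(block, None)

    def peer_failed(self, peer):
        """Put every block the peer still owed back at the front of the queue."""
        lost = [b for b, p in self.in_flight.items() if p == peer]
        for block in lost:
            del self.in_flight[block]
            self.pending.appendleft(block)
```

When a piece fails its hash check, its blocks can simply be pushed back into the queue and fetched again, ideally from a different peer.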

1. Parse the `.torrent` metainfo file