Note: This review is adapted from the notes I took at the August SFOBUG meeting and has not been formed into a proper article. I may clean it up if I get some time later on, but I wanted to get it out there now for people to read. BitTorrent is cool!
San Francisco Bay Area resident Bram Cohen is the developer of (among other things) BitTorrent, and is one of the brains behind the very cool CodeCon, a conference for open source/free/crypto software programmers.
Bram gives the talk on BitTorrent, his peer-to-peer filesharing system. BT is designed to highly optimize bandwidth utilization by having each downloader also be an uploader to a few peers. The server component, known as the Tracker, gives each client a random list of peers from which it can retrieve chunks of the file to be downloaded. Each client downloads different chunks of the file from various peers, in parallel.
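To make the Tracker's role concrete, here is a minimal sketch of a server that remembers which peers are fetching a file and hands each new client a random handful of them. It is not the real tracker code; the `Tracker` class, the `announce` method, the `max_peers` cap and the addresses are all made up for illustration.

```python
import random


class Tracker:
    """Toy in-memory tracker: remembers peers per file, returns random subsets."""

    def __init__(self, max_peers=3, rng=None):
        self.max_peers = max_peers        # hypothetical cap on peers per reply
        self.rng = rng or random.Random()
        self.swarms = {}                  # file id -> set of peer addresses

    def announce(self, file_id, peer):
        """Register `peer` for `file_id` and return a random list of other peers."""
        swarm = self.swarms.setdefault(file_id, set())
        others = sorted(swarm - {peer})   # everyone already known, except the caller
        swarm.add(peer)
        count = min(self.max_peers, len(others))
        return self.rng.sample(others, count)


tracker = Tracker(max_peers=3)
for n in range(1, 6):
    tracker.announce("example.iso", f"10.0.0.{n}:6881")

# A newcomer gets three of the five existing peers, chosen at random.
peers = tracker.announce("example.iso", "10.0.0.99:6881")
print(peers)
```

The client would then connect to the returned peers directly; in this sketch the tracker never touches the file data itself.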
BT is integrated with the web; it “doesn’t have the standard warez console interface”. Cued by a special MIME type, your web browser launches the BT client component as a helper app, which then does the job of downloading the file. At first getting it to work nicely with Mozilla was hard, because Mozilla had a bad bug in how it worked with helper apps. But the Mozilla people fixed it and now it’s happy.
BT is implemented in Python, and works on Unix/Linux, Windows, OS X and “at least one person has it working on Mac OS 9.” It uses the wxWindows cross-platform GUI toolkit, which Bram says is sub-optimal (overlarge) but functional. BT has been in development for over two years; the protocol was finalized in October 2002. This is the year in which it really started to take off.
BT has a large userbase of people who use it to distribute content for which they actually have distribution rights (bonus!). Bram tells people not to use it for copyright infringement. Anime fans distribute translations of movies, as well as movies that are not available in stores yet (or ever); fans of “hippy bands” (e.g. Phish) distribute live recordings (such bands generally encourage field recordings); and Linux distributors offer ISO filesystem images of their software via BT. And so on.
Some technical aspects of BT are discussed. Against the advice of some people, Bram refuses to turn off the Nagle algorithm (which coalesces small writes into fewer, larger TCP packets to cut per-packet overhead) for bulk TCP transfers. Why anyone would think turning it off is a good idea is confusing to me. On a separate topic, Bram says “multicast on the Internet is never going to happen because it’s a stupid idea.”
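For reference, the Nagle switch mentioned above is the standard `TCP_NODELAY` socket option. The sketch below only shows what the knob looks like from Python; it is the thing Bram declines to do, not something taken from the BT source. It assumes a typical platform where a fresh TCP socket starts with the option off, meaning Nagle is enabled.

```python
import socket

# A fresh TCP socket; nothing is connected, we only inspect its options.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# TCP_NODELAY == 0 means the Nagle algorithm is active (the usual default).
nagle_on_by_default = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) == 0

# Setting TCP_NODELAY to 1 turns Nagle off, which is what Bram refuses to do.
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
nagle_disabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0

sock.close()
print(nagle_on_by_default, nagle_disabled)
```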
Bram is “making pretty decent money on donations”, which is great!
Anonymity: BT has no anonymity features at all. “File distribution is one of the hardest situations” in which to implement anonymity. Proxy-based solutions would obviously defeat the primary goal, which is to maximize bandwidth efficiency. Besides, anonymity features would make BT look too “warezy”, according to Bram.
When the client is downloading a file, at the same time it is listed as a peer other clients can download file chunks from. There is a tit-for-tat system in place to prevent people from leeching — if you don’t offer chunks for download, your own download rate will plummet. A user-configurable option allows users to continue to offer file chunks even after the program has completed the download of the file.
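The tit-for-tat idea can be sketched as a ranking problem: keep uploading to the peers that have recently sent you the most data, and cut off the ones that send nothing. This is only an illustration of the principle; the real client's policy is not described in these notes and is probably more involved. The function name `choose_unchoked`, the `slots` parameter and the peer names are invented.

```python
def choose_unchoked(received_bytes, slots=4):
    """Pick the peers we keep uploading to: the best recent contributors.

    `received_bytes` maps a peer name to the bytes it sent us recently.
    Peers that sent nothing are never picked, so leeching earns no upload.
    """
    contributors = [peer for peer, n in received_bytes.items() if n > 0]
    # Highest contribution first; ties broken by name for a stable order.
    ranked = sorted(contributors, key=lambda peer: (-received_bytes[peer], peer))
    return ranked[:slots]


recent = {"alice": 900_000, "bob": 0, "carol": 350_000, "dave": 120_000}
print(choose_unchoked(recent, slots=2))  # ['alice', 'carol']
```

Here bob offers nothing, so he is never selected no matter how many slots are free, which is the "your own download rate will plummet" effect in miniature.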
Disk fragmentation was an early problem, and it killed client performance. Writing a 1GB file in 16KB chunks, which arrive in no particular order, is obviously going to cause extreme fragmentation. The solution was to pre-allocate the entire file, so the file is full size immediately after the download begins, but it’s all zeros. The chunks are then written into the file as they are downloaded.
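The pre-allocation trick can be sketched in a few lines: fill the file with zeros up front, then seek to each chunk's offset and overwrite it as the chunk arrives. This is a sketch of the idea, not BT's actual code; the helper names `preallocate` and `write_chunk` are hypothetical. It writes the zeros explicitly because simply extending the file may produce a sparse file on some filesystems, which would not reserve the disk space.

```python
import os
import tempfile

CHUNK_SIZE = 16 * 1024  # 16KB, the chunk size mentioned above


def preallocate(path, total_size, block=1024 * 1024):
    """Create `path` at its final size, filled with zeros."""
    zeros = bytes(block)
    with open(path, "wb") as f:
        remaining = total_size
        while remaining > 0:
            n = min(block, remaining)
            f.write(zeros[:n])
            remaining -= n


def write_chunk(path, index, data):
    """Overwrite chunk number `index` in place, in whatever order chunks arrive."""
    with open(path, "r+b") as f:
        f.seek(index * CHUNK_SIZE)
        f.write(data)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "download.bin")
    preallocate(path, 4 * CHUNK_SIZE)
    size_before = os.path.getsize(path)         # already full size

    write_chunk(path, 2, b"C" * CHUNK_SIZE)     # chunks arrive out of order
    write_chunk(path, 0, b"A" * CHUNK_SIZE)

    with open(path, "rb") as f:
        contents = f.read()

print(size_before, len(contents))
```

The file size never changes after the first step; chunks 1 and 3 simply stay zeroed until they are downloaded.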
Python itself has not proven to be a limiting factor on performance. Any slowness is algorithmic; Bram has isolated and fixed some such problems.
The most common tech support problem Bram gets is that BT bluescreens some Windows machines. It turns out that BT is like a torture test for Ethernet card drivers, and some driver writers never performed such tests themselves. BT can open twenty or more sockets to serve file chunks to peers, and blasts data out through all of them. This can make Windows upset.