Rodi

This page has moved. If your browser doesn't automatically redirect to its new location, click http://rodi.sourceforge.net/


Rodi Introduction

Rodi or Rodia (Ρόδι or Ροδιά) means pomegranate in Greek. The Rodi program is a tiny P2P client/host (under 300K of binary code) implemented in pure Java. Its network use is similar to the BitTorrent concept. The program will serve the file-sharing community with fast data delivery and the Open Source community by facilitating faster software deployment.

Data distribution networks today provide search only in file names (if at all) and no content search. They were originally created for delivery of binary or unsearchable content. The Rodi network's functional requirements include context-sensitive content search. Because Rodi is a decentralized network, keyword rating, and consequently search results, can differ from publisher to publisher. These differences make the Rodi network a group of loosely related or completely unrelated search engines. Publishers belonging to the same Rodi House can use the same function when calculating keyword ratings.
Existing search engines do not search previous versions of indexed files such as HTML pages, only the cached and supposedly recent version of each file. We argue that the content of the Web is becoming more dynamic and is updated much more frequently than in the past. Rodi's functional requirements include a file version manager that will support content search in previous versions of a file as well as in the current one.

Security is a huge problem for the existing BitTorrent network. In most cases BitTorrent trackers accept any client; in some cases the client must go through a registration procedure, run by a regular Web server, before it gains access to the tracker. Part of the registration procedure is saving the client's IP address, which is assumed to be unique. Many questions immediately arise. It is not clear how the system can work if the client is behind a proxy server and its real IP address is invisible to any third party. What happens with dynamic IP addresses? How can the tracker be sure that the current request arrived from the client registered on the server and not from one with the same (faked?) IP address? How can the host make sure that a request arrived from an authorized client? How can the client make sure that the host answering a data request is authorized?
Traffic analyzers use simple rules based on IP address and port number to collect statistics, or even drop packets if the ISP decides that the traffic is illegal or parasitic. More advanced analyzers support "deep inspection of packets, including the identification of layer-7 patterns and sequences". A P2P network can use a simple encoding algorithm, for example XOR with a long key. The strength of the scheme is governed by the length of the key, how frequently keys are renewed, and the total number of keys. Let's assume that each key is 1M characters long and that there are 1M different keys (hosts generate different keys for the published files). A reliable analyzer would then be expected to store and actively use about 1T characters of keys (1M keys × 1M characters each). Let's also suppose that keys are made accessible to registered clients over different protocols, such as e-mail, FTP, and HTTP. Because typical high-speed analyzers are real-time embedded devices, they cannot reach the goal of collecting 1 Tbyte of keys.
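
The XOR encoding mentioned above could look roughly like the following. This is only a sketch under assumptions: the class name XorCodec, the randomKey helper, and the idea of a key offset carried alongside the data are hypothetical and are not taken from the Rodi sources.

```java
import java.security.SecureRandom;

/** Hypothetical helper (not part of Rodi): masks a payload by XOR with a long key. */
final class XorCodec {
    private final byte[] key;

    XorCodec(byte[] key) {
        if (key.length == 0) {
            throw new IllegalArgumentException("key must not be empty");
        }
        this.key = key.clone();
    }

    /** Generates a random key; the text's example uses keys of about 1M characters. */
    static byte[] randomKey(int length) {
        byte[] k = new byte[length];
        new SecureRandom().nextBytes(k);
        return k;
    }

    /**
     * XORs data with the key, starting at keyOffset and wrapping around the key.
     * Calling it again with the same offset restores the original bytes.
     */
    byte[] apply(byte[] data, long keyOffset) {
        if (keyOffset < 0) {
            throw new IllegalArgumentException("keyOffset must be non-negative");
        }
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            int k = (int) ((keyOffset + i) % key.length);
            out[i] = (byte) (data[i] ^ key[k]);
        }
        return out;
    }
}
```

Both sides would need the same key and offset, so the scheme only hides traffic from an observer who does not hold the key. It is obfuscation against pattern matching, not strong encryption.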
In the case of DDoS, the solution is to use a network of friendly bouncers behind different ISPs using different types of equipment. This way a DDoS attack requires more resources from the adversary than attacking a single host, and the adversary cannot attack the publisher directly because the source IP of arriving packets cannot be relied upon. This comes at a relatively low bandwidth cost to the bouncers, because Rodi streams data directly between participating nodes and only Rodi control messages are routed by the bouncers. Publishers are expected to spoof the source IP address or to use dynamic, ever-changing IP addresses and ports known only to the bouncers.
Publishers can get an authorized signature. Downloaders are expected to learn which publishers are reliable, and they recognize publishers by nickname. The publisher generates a pair of SSH2 DSA keys. The publisher stores the first, private key locally and posts the second, public key on the key server. The key server prompts the publisher to enter a unique nickname and the public key. The key server then checks that the nickname is unique on this server and makes the key and its fingerprint accessible to the public via a regular Web interface. The key server will not log the publisher's IP address or any information besides the nickname and public key. Optionally, the publisher can receive from the key server a signature encrypted with the key server's private key. The key server signature contains the publisher's nickname, the publisher's public key, and the credentials of the server, for example its URL. The publisher can then attach a signature information element to every packet sent. A signature information element contains the following parts:

A binary part with key server signature and MD5 of the packet

The nickname of the publisher

The URL of the key server (optional)

The public key of the publisher (optional)

The binary payload is encrypted using the publisher's private key. The downloader is expected to decrypt the binary payload using the public key found on the trusted key server and then decrypt the key server signature using the public key of the key server. The downloader then makes sure that the binary payload was indeed encrypted with the publisher's private key and that the MD5 of the packet is correct.
The key server provides an XML-based interface to the database containing the nicknames and public keys of the publishers. The Rodi client can load the database and periodically check for updates.
Rodi Houses can run their own key servers and provide a House Signature to the publishers belonging to the house.
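
The publisher side of the scheme above (sign the MD5 of each packet with the private key, verify with the public key from the key server) can be sketched with the standard java.security API. This is an illustration under assumptions, not Rodi's actual wire format: the class name PacketSigner is hypothetical, "encrypting with the private key" is modelled as an ordinary DSA signature, the SHA256withDSA algorithm and 2048-bit key size are choices made for the example, and the key server's own signature is left out.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.MessageDigest;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

/** Hypothetical sketch: DSA signature over the MD5 of a packet. */
final class PacketSigner {

    /** Generates the publisher's DSA key pair (key size is an assumption). */
    static KeyPair newPublisherKeys() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("DSA");
            gen.initialize(2048);
            return gen.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("DSA not available", e);
        }
    }

    /** MD5 of the packet, as named in the text. */
    static byte[] md5(byte[] packet) {
        try {
            return MessageDigest.getInstance("MD5").digest(packet);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /** Publisher side: signs the packet's MD5 with the private key. */
    static byte[] sign(byte[] packet, PrivateKey privateKey) {
        try {
            Signature s = Signature.getInstance("SHA256withDSA");
            s.initSign(privateKey);
            s.update(md5(packet));
            return s.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("signing failed", e);
        }
    }

    /** Downloader side: checks the signature with the publisher's public key. */
    static boolean verify(byte[] packet, byte[] signature, PublicKey publicKey) {
        try {
            Signature s = Signature.getInstance("SHA256withDSA");
            s.initVerify(publicKey);
            s.update(md5(packet));
            return s.verify(signature);
        } catch (GeneralSecurityException e) {
            return false; // malformed signature or unusable key: treat as not verified
        }
    }
}
```

A downloader would take the public key from the key server it trusts, look it up by the publisher's nickname, and drop any packet for which verify returns false.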
It can be argued that modern networks are reliable and require only a minimal set of flow control and retransmission features on top of a datagram transport (UDP). There is no reason to send data over TCP if a client is not going to use the flow control capabilities of TCP. The TCP layer provides flow control and retransmission, which ensure packet delivery, but we contend that packet delivery is not an issue on modern networks. The real issue is delay and jitter, not packet loss. Yet another reason for avoiding TCP in file-sharing applications is the limited window size of the TCP layer, when the application can in fact retransmit any block because the data is stored on media supporting random access, i.e. the hard disk. TCP cannot assume that a block sent by the application can easily be reproduced by the application, so it keeps a copy of every unacknowledged packet in the so-called retransmission window, an approach that is impossible even in theory for fat links with large round-trip times (RTT). Some flavor of streaming protocol is more suitable for file-sharing applications.
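
The window-size argument can be made concrete with a little arithmetic. The sketch below is illustrative only: the class name WindowMath and the numbers in the note after it (a 1 Gbit/s link, a 200 ms RTT, a 65,535-byte window, which is the TCP maximum when window scaling is not used) are assumptions chosen for the example, not measurements from Rodi.

```java
/** Hypothetical helper: relates window size, bandwidth and round-trip time. */
final class WindowMath {

    /** Bytes that must be in flight to keep the link busy: bandwidth * RTT / 8. */
    static long bandwidthDelayBytes(long bitsPerSecond, double rttSeconds) {
        return Math.round(bitsPerSecond * rttSeconds / 8.0);
    }

    /** Upper bound on throughput (bits/s) when at most windowBytes may be unacknowledged. */
    static double maxThroughputBitsPerSecond(long windowBytes, double rttSeconds) {
        return windowBytes * 8.0 / rttSeconds;
    }
}
```

With these example numbers, bandwidthDelayBytes(1_000_000_000L, 0.2) is 25,000,000 bytes that the sender would have to buffer, while maxThroughputBitsPerSecond(65_535L, 0.2) is about 2.6 Mbit/s. A sender that can re-read any block from disk does not need to hold such a buffer.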
In the simplest scenario the host can use a best-effort scheme when sending packets to the client, with no timeout for acknowledgement. If the client fails to receive a packet, a new request can be issued at any time, assuming that the packet can still be found on the host. The client can optionally specify in its request to the host the optimal burst size (window size), packet size, inter-burst delay, and inter-packet delay. In the best scenario the client will not issue any requests to the host besides the initial request. Because UDP is connectionless, no time is spent establishing peer-to-peer connections.
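
The optional request parameters could be carried in a small fixed-size record. The sketch below is an assumption-laden illustration: the class name DataRequest, the four 32-bit fields, their order on the wire, and the pacing formula are all hypothetical and do not describe the real Rodi protocol.

```java
import java.nio.ByteBuffer;

/** Hypothetical request record: pacing hints a client might send to the host. */
final class DataRequest {
    final int burstSize;          // packets per burst (window size)
    final int packetSize;         // payload bytes per packet
    final int interBurstDelayMs;  // pause after each burst
    final int interPacketDelayMs; // pause between packets inside a burst

    DataRequest(int burstSize, int packetSize, int interBurstDelayMs, int interPacketDelayMs) {
        if (burstSize <= 0 || packetSize <= 0) {
            throw new IllegalArgumentException("burstSize and packetSize must be positive");
        }
        this.burstSize = burstSize;
        this.packetSize = packetSize;
        this.interBurstDelayMs = interBurstDelayMs;
        this.interPacketDelayMs = interPacketDelayMs;
    }

    /** Serializes the four fields as big-endian 32-bit integers (16 bytes in total). */
    byte[] encode() {
        return ByteBuffer.allocate(16)
                .putInt(burstSize)
                .putInt(packetSize)
                .putInt(interBurstDelayMs)
                .putInt(interPacketDelayMs)
                .array();
    }

    /** Parses a record produced by encode(). */
    static DataRequest decode(byte[] wire) {
        ByteBuffer b = ByteBuffer.wrap(wire);
        return new DataRequest(b.getInt(), b.getInt(), b.getInt(), b.getInt());
    }

    /** Offset in milliseconds at which the host would send packet n (0-based). */
    long sendOffsetMs(long n) {
        long burst = n / burstSize;
        long inBurst = n % burstSize;
        long burstPeriod = (burstSize - 1L) * interPacketDelayMs + interBurstDelayMs;
        return burst * burstPeriod + inBurst * interPacketDelayMs;
    }
}
```

The encoded bytes could be sent as the payload of a single java.net.DatagramPacket. The host would then stream packets on the schedule given by sendOffsetMs without waiting for acknowledgements, and the client would only send another request for blocks that never arrived.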

I will appreciate any comments on the project, especially from IT professionals, Web managers, and ISPs.
