|
Rodi Introduction

Data distribution networks today provide search only in file names (if any) and no content search. They were originally created for delivery of binary or otherwise unsearchable content. Rodi's functional requirements include context-sensitive content search. Because Rodi is a decentralized network, keyword rating, and consequently search results, can differ from publisher to publisher. These differences make the Rodi network a group of loosely related or completely unrelated search engines. Publishers belonging to the same Rodi House can use the same function when calculating keyword ratings.
Existing search engines do not provide search in previous versions of indexed files such as HTML pages, but only in the cached and supposedly recent version of the file. We argue that the content of the Web is becoming more and more dynamic and is updated much more frequently than in the past. Rodi's functional requirements include a file version manager that will support content searches in previous versions of a file as well as in the current one.

Security is a huge problem for the existing BitTorrent network. In most cases BitTorrent trackers accept any client; in some cases the client must go through a registration procedure, run by a regular Web server, before it gains access to the tracker. Part of the registration procedure is saving the client's IP address, which is assumed to be unique. Many questions immediately arise. It is not clear how the system can work if the client is behind a proxy server and its real IP address is invisible to any third party. What happens with dynamic IP addresses? How can the tracker be sure that the current request arrived from the client registered on the server and not from one with the same (faked?) IP address? How can the host make sure that a request arrived from an authorized client? How can the client make sure that the host answering a data request is authorized?

Traffic analyzers use simple rules based on IP address and port number to collect statistics, or even to drop packets if the ISP decides that the traffic is illegal or parasitic. More advanced analyzers support "deep inspection of packets, including the identification of layer-7 patterns and sequences". A P2P network can use a simple encoding algorithm, for example XOR with a long key. The strength of the scheme is regulated by the length of the key, how frequently keys are renewed and the total number of keys. Let's assume that the length of a key is 1M characters and that there are 1M different keys (hosts generate different keys for the published files). At this point a reliable analyzer is expected to store and actively use about 1T characters of keys. Let's also suppose that keys are made accessible to registered clients using different protocols, such as e-mail, FTP and HTTP. Because typical high-speed analyzers are real-time embedded devices, they cannot reach the goal of collecting 1 Tbyte of keys.
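As a rough illustration of the XOR scheme described above, the following sketch encodes a payload with a repeating key and works out the key storage an analyzer would need under the figures quoted in the text. The function name `xor_encode`, the constant names and the short demonstration key are assumptions made for illustration only; this is not taken from the Rodi implementation.

```python
import os
from itertools import cycle


def xor_encode(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the key, repeating the key as needed.

    Applying the function twice with the same key restores the input.
    """
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# Figures from the text: 1M keys of 1M characters each.
KEY_LENGTH = 10**6
KEY_COUNT = 10**6
ANALYZER_STORAGE = KEY_LENGTH * KEY_COUNT  # 10**12 characters, about 1T

if __name__ == "__main__":
    key = os.urandom(1024)  # short demo key; the text assumes 1M characters
    packet = b"Rodi data block"
    encoded = xor_encode(packet, key)
    assert xor_encode(encoded, key) == packet  # decoding is the same operation
    print(ANALYZER_STORAGE)
```

The same function serves for encoding and decoding, so a registered client only needs the key that the host generated for the published file.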
In the case of DDoS, the solution is to use a network of friendly bouncers located behind different ISPs and using different types of equipment. This way a DDoS attack requires more resources from the adversary than attacking a single host, and the adversary cannot attack the publisher directly because the source IP of the arriving packets cannot be relied upon. This comes at relatively low bandwidth cost to the bouncers, because Rodi streams data directly between participating nodes and only Rodi control messages are routed by the bouncers. Publishers are expected to spoof the source IP or to use dynamic, ever-changing IP addresses and ports known only to the bouncers.

Publishers can obtain an authorized signature. Downloaders are expected to learn which publishers are reliable, and they recognize publishers by nickname. The publisher generates a pair of SSH2 DSA keys. The first, the private key, the publisher stores locally; the second, the public key, the publisher posts on the key server. The key server prompts the publisher to enter a unique nickname and the public key. It then checks that the nickname is unique on this server and makes the key and its fingerprint accessible to the public via a regular Web interface. The key server will not log the IP address of the publisher or any other information besides the nickname and public key. Optionally, the publisher can receive from the key server a signature encrypted with the private key of the key server. The key server signature contains the nickname of the publisher, the public key of the publisher and the credentials of the server, for example its URL. The publisher can then attach a signature information element to every packet sent. A signature information element contains the following parts
The key server provides an XML-based interface to the database containing the nicknames and public keys of the publishers.
The Rodi client can load the database and periodically check for updates.
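The registration step on the key server can be sketched roughly as follows. This is a minimal in-memory stand-in, not the real key server: the class name `KeyServer`, its method names and the choice of SHA-256 for the fingerprint are all assumptions, and the public key is treated as opaque bytes (generating the DSA key pair is left out).

```python
import hashlib


class KeyServer:
    """Hypothetical in-memory stand-in for the key server database."""

    def __init__(self):
        self._keys = {}  # nickname -> public key bytes

    def register(self, nickname: str, public_key: bytes) -> str:
        """Store the key under a unique nickname and return its fingerprint.

        Only the nickname and the public key are kept; nothing else
        about the publisher is recorded.
        """
        if nickname in self._keys:
            raise ValueError(f"nickname {nickname!r} is already taken")
        self._keys[nickname] = public_key
        return self.fingerprint(nickname)

    def fingerprint(self, nickname: str) -> str:
        """Return the SHA-256 hex digest of the stored public key (assumed hash)."""
        return hashlib.sha256(self._keys[nickname]).hexdigest()

    def lookup(self, nickname: str) -> bytes:
        """Return the public key a downloader would use to check a publisher."""
        return self._keys[nickname]
```

A client that has loaded the database could then call `lookup` with the nickname found in a packet and compare the fingerprint it computes locally with the published one.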
It can be argued that modern networks are reliable and require only a minimal set of flow control and retransmission features on top of a datagram service (UDP). There is no reason to send data over TCP if a client is not going to use the flow control capabilities of TCP. The TCP layer provides flow control and retransmission, which ensure packet delivery, but we contend that packet delivery is not an issue for the modern network. The real issue is delay and jitter, not packet loss. Yet another reason for avoiding TCP in file sharing applications is the limited window size in the TCP layer, when the application can actually retransmit any block because the data is stored on media supporting random access, i.e. the hard disk. TCP cannot assume that a block sent by the application can easily be reproduced by the application, so it keeps a copy of every packet sent in the so-called retransmission window, which becomes impractical for fat links with large round trip times (RTT).

Some flavor of streaming protocol is more suitable for file sharing applications. In the simplest scenario the host can use a best-effort scheme, sending packets to the client with no timeout for acknowledgement. If the client fails to receive a packet, a new request can be issued at any time, assuming that the packet can still be found on the host. The client can optionally specify in its request to the host the optimal burst size (window size), packet size, and inter-burst and inter-packet delays. In the best scenario the client will not issue any requests to the host besides the initial request. Because UDP is connectionless, no time is spent establishing peer-to-peer connections.

I will appreciate any comments on the project, especially from IT professionals, Web managers and ISPs.
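The best-effort request scheme described above can be sketched as a small in-memory simulation. No sockets are used, and the names `Request`, `host_send` and `download`, the field names and the `is_lost` callback are hypothetical, chosen only to mirror the parameters mentioned in the text; the actual Rodi message format is not shown here.

```python
import time
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical client request; field names are illustrative only."""
    blocks: list                     # block numbers the client still needs
    burst_size: int = 4              # packets per burst (window size)
    packet_size: int = 1024          # payload bytes per packet
    inter_burst_delay: float = 0.0   # seconds between bursts
    inter_packet_delay: float = 0.0  # seconds between packets in a burst


def host_send(data: bytes, request: Request):
    """Yield (block number, payload) pairs without waiting for acknowledgements."""
    for i, number in enumerate(request.blocks):
        if i:  # pace the stream as the client asked
            at_burst_edge = i % request.burst_size == 0
            time.sleep(request.inter_burst_delay if at_burst_edge
                       else request.inter_packet_delay)
        start = number * request.packet_size
        yield number, data[start:start + request.packet_size]


def download(data_on_host: bytes, packet_size: int, is_lost):
    """Fetch all blocks, re-requesting only those that never arrived.

    is_lost(request_number, block_number) simulates the network dropping
    a packet. Returns the reassembled data and the number of requests issued.
    """
    total = -(-len(data_on_host) // packet_size)  # ceiling division
    received = {}
    requests = 0
    while len(received) < total:
        missing = [n for n in range(total) if n not in received]
        requests += 1
        request = Request(blocks=missing, packet_size=packet_size)
        for number, payload in host_send(data_on_host, request):
            if not is_lost(requests, number):
                received[number] = payload
    return b"".join(received[n] for n in range(total)), requests
```

With a loss-free channel, `download` issues exactly one request, which corresponds to the best scenario in the text; every lost packet costs at most one further request listing only the missing blocks.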
|