Internet.com ISP-Planet

 



Building a Better P2P Delivery System

Cachelogic unveils a brilliant system that dramatically reduces P2P traffic by making searching and file delivery more efficient. But is this English company ready for the U.S. legal system and the RIAA's exploitation of it?

by Alex Goldman
ISP-Planet Associate Editor
[June 12, 2003]

Cambridge, UK-based Cachelogic claims that its Cachepliance system provides the following benefits to ISPs whose networks are flooded by peer-to-peer (P2P) traffic:

  • Reduces overall bandwidth used by P2P programs by approximately 50 percent
  • Reduces upstream bandwidth used by P2P programs by up to 80 percent
  • Provides detailed data about P2P bandwidth usage

Each Cachepliance 4000 is designed to support up to 50,000 subscribers; each Cachepliance 2000 supports up to 30,000 subscribers.

Technical specifications, where disclosed, are impressive. Cache storage is a hot-swappable RAID-5 array of 15,000 RPM SCSI disks. The smaller appliance features 700 GB of storage, and the larger has 1.45 TB. The smaller appliance has a single hot-swappable power supply (with an optional second power supply), while the larger appliance has two. Each box contains 2 GB of 266 MHz ECC RAM. Processor type and speed were not disclosed, but the company does claim that its "multiple hyper-threaded CPUs . . . provide extremely high throughput."

Andrew Parker, founder and president of Cachelogic, is better known as the lead technical consultant to the developers of the Zeus webserver, which currently ranks number three in the server wars with a 2.03 percent market share, according to the Netcraft Server Survey results for March 2003.

If anyone could provide an innovative technological approach in a nascent space like P2P bandwidth control, it is someone with a proven software track record and a product on the market whose own development, reliability, and support are matters of public record.

Astonishingly bad software
Parker has examined the architecture of distributed P2P networks like Gnutella (as opposed to the original, centralized P2P network, Napster, which was declared illegal). He claims that distributed architectures are inefficient. Although Gnutella, the example he cites, may well be the worst of the P2P programs, the flaws he identifies are astonishing and significant. It is possible that some of these flaws will be remedied in future versions of the various P2P programs, but some of the problems are inherent to any decentralized architecture.

Gnutella traffic consists of the following: Pings are requests for other nodes, Pongs are responses to those requests, Queries are searches for specific files, and Push and Pull are two methods for delivering files (Push lets the host send a file when the host is behind a firewall and the recipient cannot connect to it to download).

Two academic papers published on the architecture of Gnutella (see References at the end of this article) support Parker's contention. They say:

1) Gnutella is not truly distributed. 70 percent of Gnutella users share no files, and nearly 50 percent of all searches are served by the top 1 percent of sharing hosts.

2) Gnutella's architecture bears no relation to the topology of the network. This means that search traffic is routed by Gnutella clients in a perverse, inefficient manner. A theoretical search that would take two hops if routed directly could take six hops if rerouted in this inefficient manner.

3) The Gnutella "network" is composed of nodes that disconnect frequently. A study found that most nodes disconnect from the network within four hours, and only 4 percent remain connected to it for 24 hours or more.

4) Because nodes disconnect frequently, a significant portion of Gnutella P2P traffic is simply nodes pinging each other, both to make sure that nodes previously accessed are still connected, and to connect to nodes that have only recently connected.

As a result of these flaws, the report "Mapping the Gnutella Network" says, "Based on our measurements, the total traffic for a large (50,000 node) Gnutella network is 1 Gbit per second: 170,000 connections at 6 Kbps per connection, or about 330 Tbytes per month. To put this traffic volume into perspective, we note that it amounts to about 1.7 percent of the total traffic estimated over the U.S. Internet backbone in December 2000."
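The figures in that quotation can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 Kbit = 1,000 bits, 1 Tbyte = 10^12 bytes) and a 30-day month; neither assumption is stated in the report, and the function names are ours.

```python
# Inputs taken from the quoted report.
CONNECTIONS = 170_000
KBPS_PER_CONNECTION = 6

# Assumption: a 30-day month (the report does not say which it used).
SECONDS_PER_MONTH = 30 * 24 * 3600


def total_gbps(connections, kbps_each):
    """Aggregate rate in Gbit/s, using decimal units."""
    return connections * kbps_each * 1_000 / 1e9


def tbytes_per_month(gbps, seconds=SECONDS_PER_MONTH):
    """Convert a sustained Gbit/s rate into Tbytes transferred per month."""
    bytes_per_second = gbps * 1e9 / 8
    return bytes_per_second * seconds / 1e12


gbps = total_gbps(CONNECTIONS, KBPS_PER_CONNECTION)
print(f"{gbps:.2f} Gbit/s, about {tbytes_per_month(gbps):.0f} Tbytes/month")
```

Under those assumptions the result is roughly 1 Gbit per second and 330 Tbytes per month, which matches the numbers the report gives.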

How is that possible? When a Gnutella client receives a search query, it broadcasts that search to all the nodes it is connected to. Users place a limit on the number of hops a query can travel, but if that limit is seven, and each of the six clients the query reaches is connected to only 10 hosts (as well as the originator), the network will propagate 1 million copies of the query (this also assumes that none of the 1 million hosts are connected to each other).
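To see how quickly flooding grows, here is a minimal sketch of that count. It assumes an idealized tree-shaped overlay in which every relaying client forwards the query to 10 new hosts and no two clients share a neighbor; real Gnutella overlays contain cycles and drop duplicate queries, so this is an upper-bound illustration rather than a measurement. The function and parameter names are ours.

```python
def flood_copies(fanout, relay_hops):
    """Return (copies sent at the final hop, total copies over all hops).

    Assumes a tree: each relaying client forwards to `fanout` hosts that
    no other client has reached, for `relay_hops` rounds of forwarding.
    """
    per_hop = [fanout ** hop for hop in range(1, relay_hops + 1)]
    return per_hop[-1], sum(per_hop)


last_hop, total = flood_copies(fanout=10, relay_hops=6)
print(last_hop, total)
```

With a fan-out of 10 and six rounds of relaying, the final round alone sends 10^6 copies, which is the 1 million figure above; counting the earlier rounds as well adds a further 111,110.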

Researchers are attempting to improve the protocol. A specification called "Gnutella-Pro" would allow a node to ping only two other nodes, and those nodes would send a pong only if they were ready to share a file (under the current specification, a node can be fully occupied, having run out of bandwidth or CPU processing power, and still return a pong).

Even the improved Gnutella did not spend most of its bandwidth delivering files. The original Gnutella spent 55 percent of its bandwidth on overhead such as pings and pongs, and 35 percent on queries. Gnutella-Pro eliminated the ping and pong traffic, but the problem remained that queries occupied the vast majority of all traffic.

In either case, the traffic on a node rises rapidly as the size of the Gnutella network grows.

A gated P2P community
Parker's appliances are designed to solve this problem by operating as a gateway node, separating the external P2P network from the ISP's network. The node makes all peers of P2P applications local to the ISP, rather than external. This should reduce the distance traffic has to travel. Parker notes that the intent is to force the P2P network to be structured in accordance with the physical layout of the underlying network.
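Cachelogic has not disclosed how the appliance works internally, so the following is only a toy sketch of the general gateway-cache idea: the first local request for a file is fetched across the ISP's external links, and later local requests are served from the gateway. The class, method, and counter names are hypothetical and are not Cachelogic's API.

```python
class GatewayCache:
    """Toy gateway node: fetch a file from outside once, then serve it locally."""

    def __init__(self):
        self._store = {}          # file_id -> bytes held at the gateway
        self.upstream_bytes = 0   # bytes pulled across external links
        self.local_bytes = 0      # bytes delivered to local subscribers

    def fetch(self, file_id, fetch_external):
        """Return the file, calling fetch_external(file_id) only on a miss."""
        if file_id not in self._store:
            data = fetch_external(file_id)
            self.upstream_bytes += len(data)
            self._store[file_id] = data
        data = self._store[file_id]
        self.local_bytes += len(data)
        return data


def fake_external_peer(file_id):
    # Hypothetical stand-in for a peer outside the ISP's network.
    return b"x" * 1000


cache = GatewayCache()
for _ in range(3):                # three local subscribers want the same file
    cache.fetch("song", fake_external_peer)
print(cache.upstream_bytes, cache.local_bytes)
```

In this toy case three local downloads cost one external transfer. The actual savings depend on how often subscribers request the same files, which the sketch does not model.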

Additional features of the appliances make the P2P network recognize the costs of the underlying physical network. An ISP can simply program rules to allow more P2P traffic on less expensive connections or direct traffic to networks with better peering agreements.
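The article does not describe the appliance's rule language, so the following is a purely illustrative sketch of what a cost-aware rule could look like: P2P demand is assigned to the cheapest links first, up to a per-link cap. The link names, costs, and caps are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    name: str
    cost_per_gb: float   # hypothetical cost figure for this connection
    p2p_cap_mbps: int    # hypothetical ceiling on P2P traffic for this link


def p2p_budget(links, demand_mbps):
    """Fill the cheapest links first; return (plan, unserved demand in Mbps)."""
    plan = {}
    remaining = demand_mbps
    for link in sorted(links, key=lambda l: l.cost_per_gb):
        share = min(link.p2p_cap_mbps, remaining)
        plan[link.name] = share
        remaining -= share
    return plan, remaining


links = [
    Link("transit", cost_per_gb=0.50, p2p_cap_mbps=100),
    Link("peering", cost_per_gb=0.05, p2p_cap_mbps=400),
    Link("local", cost_per_gb=0.00, p2p_cap_mbps=1000),
]
plan, unserved = p2p_budget(links, demand_mbps=1200)
print(plan, unserved)
```

Here the free local link absorbs 1,000 Mbps, the peering link takes the remaining 200 Mbps, and the expensive transit link carries no P2P traffic at all.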

These features should, in fact, save a great deal of money for ISPs that have a significant number of P2P users on their networks. ISPs, especially broadband ISPs, may well prefer to limit the impact of P2P traffic on their physical network in this way, as opposed to throttling, blocking, or limiting the monthly allowance per user of P2P traffic. This is probably the only solution on the market that will not harm an ISP's relationship with its users.

While it seems clear that Cachelogic's product can go far to smooth the way to cost-efficient file sharing, there is one problem that Cachelogic is not, in our opinion, prepared to deal with: the American legal system and the RIAA's assault on fundamental Internet technologies.

Ridiculous Internet Assault Association
When we spoke to Parker, we agreed that there was demand for a project like this, but said that he might have legal problems. He knew about the DMCA. He said that caching is illegal in Canada, even though one of his competitors is based there. But when we asked if he knew about SuperDMCA laws that make all caching and routing (even NAT, DHCP, and firewalls) illegal, he said, "that's ridiculous."

It may be ridiculous, but it's happening. The Electronic Frontier Foundation maintains a list of states considering or adopting this legislation. Pressured by the RIAA, eight states have passed the law so far. In those states, it is illegal to use a firewall to protect yourself from hackers, it is illegal to distribute a limited number of IP addresses among many users, and it is even illegal to publish or have information about such devices. For all practical purposes, the Internet itself, and everything published about it, is illegal in eight U.S. states.

From outside the U.S., the laws currently being debated, the activities of the RIAA, and the lawsuits currently winding their slow ways through the courts, must look ridiculous. They are ridiculous—and they probably violate fundamental rights granted by the Constitution (that's a discussion for another publication)—but their threat cannot be ignored.

References
"Mapping the Gnutella Network" by Matei Ripeanu, Adriana Iamnitchi, and Ian Foster, pub. in IEEE Internet Computing, Jan-Feb 2002.

"Free Riding on Gnutella" by Eytan Adar and Bernardo A. Huberman

"Gnutella-Pro: What bandwidth barrier?" by Richard Massey, Shriram Bharath, and Ankur Jain

"Chapter 2: Music, Movies, and Monopoly" by George Ziemann


 

 

 
