He who laughs last thinks slowest

Sunday, August 23, 2009

Looking Up Data in P2P Systems

P2P systems are are made up individual computers that are independent of each other, but it still reliable and robust as the number of participants grow. These P2P systems are attractive because they usually do not require any administrative or financial agreements, allows us of storage and computation resources of many computers over the internet, is decentralized and distributed -- making it robust, less susceptible to attacks, and fit for long computations and storage.

The authors of the paper discuss some algorithms that aim to look up data in a large P2P system in a scalable manner, without any centralized servers or hierarchy. The design points and trade-offs for each algorithm are also highlighted.

The first such algorithm discussed in length was that of Chord, which uses a successor list to ensure correct lookup even if nodes frequently join or leave the system. The data structure that Chord uses resembles a skip list. (The Chord protocol has already been lengthily discussed in a previous blog article, so I will not repeat it here)

Another way of searching is using tree-like routing wherein for each prefix, each node in the tree-based algorithm records the location of some node with that prefix. Examples of algorithms that use a tree-like data structure are: Pastry, Tapestry, and Kademlia. With Pastry, each node is given a random ID that marks its position in the identifier circle. Messages are routed with a key to the live node with a node ID closest to the key. In turn, there is a leaf set in each node with half the set having values greater than the node, and the other half having values less than the node. Queries are forwarded among the nodes and leaf sets until it reaches a point closest to the key's ID. Then, the routing table is used and the message is sent to the node that has a longer shared prefix with the sought key. The number of hops in Pastry for a query is at most log(base2^b)N.

The last algorithm discussed in detail is in routing in multiple dimensions. An example is CAN, wherein a d-dimensional Cartesian coordinate apce is used to implement the distributed hash table. The coordinate space is divided into rectangular spaces called zones. And each node is responsible for a specific zone. A key is assigned to a coordinate point, and thus to a specific node. Each node maintains a routing table containing its neigbors. When a query is made, a node forwards the query to a neighbor with coordinates closest to the destination key. The lookup cost for such a system is O(dN^(1/d)). The method for joining and leaving nodes in this system were also discussed.

In conclusion, the authors stated that the discussed algorithms in the article are still all under development. Some issues that should be looked into when using the algorithms are: the distance function, operation costs, fault tolerance and concurrent changes, proximity routing, malicious nodes, and indexing and keyword search.

Reference:

Balakrishnan, H. et al.2003.Looking Up Data in P2P Systems.Communications of the ACM

Saturday, August 22, 2009

Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications

The paper discusses Chord, a network protocol that looks up nodes in a distributed network. Given a key, the Chord algorithm searches for the node storing the key's value. A good quality of Chord is that it can quickly adapt to nodes that enter and leave the system. The lookup service will still work even if the nodes change frequently.
Chord uses a distributed hash function in assigning specific hash keys to nodes, and each node maintains a small routing table. In this way, the keys are balanced among the nodes and key assignments do not change much when nodes join or leave the system. Because Chord is distributed and all nodes are equal to each other, it is robust and is fit for supporting loosely-organized peer-to-peer applications. Chord is scalable, and the cost of Chord only grows in proportion to the log of the number of nodes, which makes large systems feasible. Keys that are looked up using Chord can also be structured in any way -- making the naming of keys flexible. Among its potential uses are cooperative mirroring, time-shared storage, distibuted indices, and large scale combinatorial searches.

There are many applications that can use Chord. Cooperative mirroring can allow developers to balance the load of the distributions they want published. Time-shared storage allows for a computer's data to be distributed and available to the network even if that computer is not connected to the network. When a computer joins the network, it can offer to distribute other computer's data when it is connected, so that other computers will also distribute its data when it is offline. Distributed indices allow for keyword searches in a database, as in P2P file transfer clients. Large scale combinatorial searches works by having keys as the candidate solutions, and these keys in turn point to the node that should evaluate if it is a solution or not.

The Chord algorithm is different from other peer-to-peer lookup protocols because of its simplicity, provable correctness, and provable performance. During queries, the node which initiates the lookup will search its finger table for the closest finger f preceding the key. Node n will send a request to f, asking for the closest entry preceding k and f will send back the result to n. This process will continue and results to n learning about nodes that bring it closer and closer to its target ID/ destination.

In general, the system needs to stabilize and continuously do the following in order to ensure that lookups yield correct results: (1) initialize the predecessor and fingers; (2) update the fingers and predecessors of existing nodes to reflect the change in network topology caused by the addition of n; and (3) copy all keys for which node n has become their successor to n. (http://www.inf.ed.ac.uk/teaching/courses/ip/chord-desc.html)

The authors of the paper continued by discussing in detail tests that were performed to evaluate how Chord fares in aspects of load balance, number of nodes traversed during lookup, simulateous node failures, performance of lookups when the network nodes are stabilizing, and lookup latency on an internet prototype. The test show satisfactory results, and assert that the Chord protocol is scalable.

Future work for Chord include the recommendation to create a mechanism for healing partitioned rings during stabilization. This mechanism can also help identify malicious or buggy Chord participants. Another recommendation is to change the distances of fingers from powers of 2 to powers of 1 + 1/d in order to decrease the number of messages per lookup from log N to log(base 1+d)N hops. However, this will increase the needed number of nodes to O(dlogN). Lookup latency may also be improved by server selection, wherein lookups are forwarded to nodes with the lowest delay.

References:
Stoica,I. et. al. 2001. Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications. MIT Laboratory for Computer Science
http://en.wikipedia.org/wiki/Chord_(peer-to-peer)
http://www.inf.ed.ac.uk/teaching/courses/ip/chord-desc.html

Thursday, July 23, 2009

Summary and Critique: Interdomain Internet Routing

The paper started by identifying questions that would need to be answered. These are: How do the relationships between ASs affect the routing information they share with each other? What are these relationships? How does this affect the technical research agenda of internet routing?

The internet is composed of ASs (Autonomous Systems) that exchange reachability / routing information. These ASs are operated by a commercial entity (such as ISPs, companies, schools), which define the routing policies that map out the rules for exporting and importing routes. The policies to import, export, and select routes depend on the relationship between ASs, if it is provider-to-customer, customer-to-provider, peer-to-peer, or sibling-to sibling.

Within each AS, different Interior Gateway Protocols (IGPs) operate. These protocols are concerned with optimizing a path metric. The paper did not discuss IGPs, but was focused on inter-AS relationships and features of BGP (Border Gateway Protocol), the current wide-area routing protocol on the internet.

There are two main classifications of AS relationships. These are: transit and peering. Simply put, a transit relationship is one where financial settlement is involved. For example, the provider provides access to all/ most of the destinations in its routing tables. Meanwhile, a peering relationship involves providing mutual access to routing tables. This is also a business deal, but there is no financial settlement.

So why can’t all relationships just be peering?

There are 2 main reasons: First, peering relationships do not generate revenue. Second, peering relationships need to be renegotiated often.

Also, import and export policies are determined based on AS relationships. Export policies are important and should be determined with care because an ISP does not want to be providing transit to packets it will not be earning from. On the other hand, the most important consideration in import policies is knowing who advertised the route. As such, routes are typically imported in the following order of priority: customer > peer > provider.

The next part of the paper discussed in detail the route selection process of BGP. In essence the write-up explained how the BGP routers are configured to implement policies which are dependent on the relationships between ASs.

So what do I think about all this?

The author was thorough enough in explaining the technical details of how interdomain internet routing worked. After reading the article, I had a better appreciation of how flexible yet complicated BGP can be. And the configurations get more complex because of ‘economic relationship’ considerations. Still, it would be interesting if somebody could figure out how to make things simpler, or the business model that would do away with the transit relationship constraints and allow for sharing of the best routes with each and every AS.

Reference:

Balakrishnan,H. Feamster,N. 2001-2005. Interdomain Internet Routing

Wednesday, July 15, 2009

Summary and Critique: On Inferring Autonomous System relationships in the Internet

The paper by Lixin Gao studies the structure of the internet, its components, and the rules which determine the routing of information across the internet. The internet is composed of Autonomous Systems (ASs) that determines it own routing policies in selecting the best route, and exporting and importing routes. Administrative domains such as ISPs, companies, and universities operate these ASs. The routing policies implemented by each AS are based on the contractual commercial agreements in effect between the administrative domains which operate it. Thus, these contractual agreements have a great influence on how the internet is structured and affect its end-to-end performance

Relationships between ASs can be classified into: peer-to-peer, provider-to-customer, customer-to-provider, or sibling-to-sibling relationship. These relationships are the basis for the import and export policies that an administrative domain can implement on its ASs. Also, there are patterns in the BGP routing entries, and these entries can potentially show what the relationship between ASs are.

Wait, what’s BGP? According to Kurose and Ross (2003), the Border Gateway Protocol (BGP) is the de facto standard interdomain routing protocol of the internet. Neighboring BGP routers exchange path information about routes to destinations. That is, the information propagates in an AS-by-AS manner.
Given these patterns and knowing that import and export policies rely on AS relationships, the author of the paper proceeded to present a heuristic that would determine the relationship between ASs.

There were algorithms presented that would determine provider-to-customer, sibling-to-sibling, and peer-to-peer relationships. When the solutions were tested and compared with AT&T internal information, the Refined algorithm (L=1) and Final algorithm (R=60) showed the best results, with 99.1% of the inference results confirmed. Tweaking of R (ratio between the degrees of two peering ASs) from the original value of ∞ to 60 gave better results. Results were also verified using the WHOIS Server, but only 54.3% of the inferred sibling relationships were confirmed. It was noted that the WHOIS database might also be out of date.
Lastly, the author identified some reasons why some inferred sibling relationships were not confirmed by AT&T or the WHOIS lookup service. These are router configuration typo, misconfiguration of small ISPs, unusual AS relationships, and inaccuracy of the heuristic (the assumption that the top provider has the highest degree may not be true).

But beyond the algorithms and academic exercise why are all these things important?
According to the author, the routing policies are manually configured and prone to error. These errors can affect other ASs and spread throughout the network and may cause outages. It is thus important to monitor routing announcements and filter out erroneous ones. The routing tables may also be checked periodically and the AS that propagated the error can be informed. Thus, these heuristics are aimed to build tools that would improve the reliability of internet routing. The relationship information can also be used by ISPs/ administrative in determining the terms of their contractual commercial agreements.

The article was an extremely long read, but very informative. The author explains the nitty-gritty of the algorithms, and provided ample introduction at the beginning to explain the basis of the calculations. I thought at first that the contractual commercial agreements between ASs was being suggested by the author as a negative constraint. However, I’ve come to realize by the end of reading that these commercial agreements are necessary for the whole thing to work – in a business/ economic point of view. There will always be providers and customers – and each has to protect their own interests. That is, I can’t think of an alternative model/ set-up. It would be good if I can learn more about that in our advanced networking class.

References:
Gao, Lixin. 2001. On Inferring System Relationships in the Internet. IEEE/ ACM Transactions on Networking, Vol 9 No 6
Kurose, J., Ross, K. 2003. Computer Networking: A Top-Down Approach Featuring the Internet

Friday, June 26, 2009

Subject: Article Review 2: End-to-End Arguments

Summary

The paper by Saltzer discusses the end-to-end argument in a distributed communication system. The end-to-end argument proposes that functions at the low levels of a system may not be cost-efficient because these functions may not be used by higher level applications. These functions may also be redundant, when higher-layer clients have to re-implement these features to suit its own requirements. Providing the functions at the lower level will then become more costly and not useful.

The authors proceed to give examples of the end-to end argument in delivery guarantees/ acknowledgements, data transmission and encryption, suppressing duplicate messages, correct arrival order of messages (FIFO delivery), and data storage systems. The main question in these examples is on where to place the functions – at the low or high level? To arrive at an answer, it is necessary to evaluate the trade-offs between performance, reliability, and costs when comparing the feasibility of placing functions on each of the identified levels.

In conclusion, the authors state that end-to-end arguments can be thought of as “Occam’s Razor”. That is, only the functions that are absolutely necessary must be included in the designed communication subsystem.

Background

History of End-to-End Arguments (http://en.wikipedia.org/wiki/End-to-end_principle#History)

The concept and research of end-to-end connectivity and network intelligence at the end-nodes reaches back to packet-switching networks in the 1970's, cf. CYCLADES. A 1981 presentation entitled End-to-end arguments in system design^[1] by Jerome H. Saltzer,David P. Reed, and David D. Clark, argued that reliable systems tend to require end-to-end processing to operate correctly, in addition to any processing in the intermediate system. They pointed out that most features in the lowest level of a communications system have costs for all higher-layer clients, even if those clients do not need the features, and are redundant if the clients have to reimplement the features on an end-to-end basis.

This leads to the model of a "dumb, minimal network" with smart terminals, a completely different model from the previous paradigm of the smart network with dumb terminals. However, the End-to-end principle was always meant to be a pragmatic engineering philosophy for network system design that merely prefers putting intelligence towards the end points. It does not forbid intelligence in the network itself if it makes more practical sense to put certain intelligence in the network rather than the end-points. David D. Clark along with Marjory S. Blumenthal in "Rethinking the design of the Internet: The end to end arguments vs. the brave new world" wrote in 2000:

From the beginning, the end to end arguments revolved around requirements that could be implemented correctly at the end-points; if implementation inside the network is the only way to accomplish the requirement, then an end to end argument isn't appropriate in the first place.

Occam’s Razor (http://en.wikipedia.org/wiki/Occam%27s_Razor)

William Ockham (c. 1285–1349) is remembered as an influential nominalist but his popular fame as a great logician rests chiefly on the maxim attributed to him and known as Occam's razor: Entia non sunt multiplicanda praeter necessitatem or "Entities should not be multiplied unnecessarily." The term razor refers to the act of shaving away unnecessary assumptions to get to the simplest explanation.

Critique

I found the paper very effective in explaining the merits of applying the end-to-end principle in system design. I agree that system designers must not be overzealous in providing functions at the low-level subsystem in the attempt to “help” the users at the higher level. A good system is one that provides only the functions that are absolutely necessary, and this will result in overall cost-effectiveness. Ultimately, which functions should be provided in each layer must be decided based on requirements analysis of the application that will use the system. Entities should not be multiplied unnecessarily.

Main Reference:

End-to End Arguments in System Design

by Saltzer, et al

Thursday, June 25, 2009

Article Review 1: DARPA

Summary

The paper discusses in detail how the DARPA internet protocols used today were developed.

The fundamental goal of the DARPA Internet Program was to develop a technique for utilizing the currently existing networks of the Department of Defense. This fundamental goal can be described in detail by seven second-level goals, which identifies in order of importance the desirable features of an effective interconnection. The most important goal in the list states that internet communication must continue despite of loss of networks or gateways. To achieve this goal, the fate-sharing approach, wherein it is acceptable to lose the state information of an entity being transmitted over the network if the entity itself is also lost, was chosen over replication. Another goal is that the internet should be able to support a variety of services. As a result, the TCP and IP protocols -- which were originally just 1 protocol, were created as two separate layers. A third goal was to incorporate and utilize a wide variety of network technologies – including military and commercial facilities. The basic assumption in achieving this flexibility is to have a datagram or packet of reasonable size with acceptable delivery.

The top 3 goals of survivability, supporting a variety of types of service, and accommodating a variety of networks, had the most effect in the existing architecture of the internet today. The other remaining goals were not met as effectively. The challenge today is to design networks using the existing internet protocols that would best fit the desired performance levels and type of service required. The author suggests exploring alternative building blocks from the datagram, and one he specified is called “flow”. The gateways will monitor the flow state to remember the nature of the flows that are passing through them. Then endpoints will enforce what type of service will be associated with the flow by sending messages. The DARPA Internet Program is currently conducting explorations of alternative building blocks .

Background

What is DARPA?

The Defense Advanced Research Projects Agency (DARPA) is an agency of the United States Department of Defense responsible for the development of new technology for use by the military. DARPA has been responsible for funding the development of many technologies which have had a major impact on the world, including computer networking, as well as NLS, which was both the first hypertext system, and an important precursor to the contemporary ubiquitous graphical user interface.

DARPA is independent from other more conventional military R&D and reports directly to senior Department of Defense management. DARPA has around 240 personnel (about 140 technical) directly managing a $3.2 billion budget. These figures are "on average" since DARPA focuses on short-term (two to four-year) projects run by small, purpose-built teams.

http://en.wikipedia.org/wiki/Defense_Advanced_Research_Projects_Agency

Critique

I honestly found the beginning of this paper boring. It was a struggle just getting past the first page. Surprisingly enough the material becomes a more interesting read when the goals were being discussed and the pros and cons for choosing one method over another were being explained. Overall, the DARPA internet project is a good example that not all project goals can be achieved (whether this is planned failure or a sad reality, is subject to argument). And many times, certain aspects have to be sacrificed in order to make way for more important objectives.

As a last note, I found it really funny on how the author subtly injected humor by saying that he does not have an idea on how to guide the procurement person (at DARPA?) who did not make sure that the performance specifications were in the procurement documents, but let slide limitations that would limit the performance goals.