Tuesday, November 25, 2008

MapReduce scheduling

MapReduce is fairly new, and the Hadoop implementation, as described in the paper, seems rather primitive and has a lot of room for improvement. I am not saying the creators of Hadoop did not do their job, but rather that systems like these take time to evolve. LATE is a big step towards improving its performance. The best part about this paper is that it is easy to understand, unlike the usual scheduling paper where lots of math is used to describe the scheduling policy.

Enough praise. I didn't like the parameters. I know the authors spent two pages of sensitivity analysis justifying them, and I respect their diligence; I just don't trust that such static parameters can adapt to such a wide range of workloads. When things get overly complex, can we autotune them the way the parallel programming community has done? It is almost impossible to predict performance because there are so many variable factors involved.
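
To make the role of those static parameters concrete, here is a rough sketch of the LATE heuristic as I understand it. The threshold names (and, if I recall correctly, their default values) come from the paper, but the surrounding code and the cluster/task helper methods are entirely my own illustration, not the Hadoop implementation.

```python
# Rough sketch of LATE speculative scheduling; only the threshold names and
# defaults are from the paper, everything else is illustrative.
SPECULATIVE_CAP = 0.10      # max fraction of slots running speculative copies
SLOW_NODE_THRESHOLD = 0.25  # don't launch backups on the slowest 25% of nodes
SLOW_TASK_THRESHOLD = 0.25  # only back up the slowest 25% of running tasks

def maybe_speculate(node, running_tasks, cluster, now):
    """Called when `node` asks for work and no fresh tasks remain."""
    if cluster.speculative_fraction() >= SPECULATIVE_CAP:
        return None
    if cluster.node_progress_percentile(node) < SLOW_NODE_THRESHOLD:
        return None  # slow nodes don't get speculative work

    def time_left(task):
        rate = task.progress_score / max(now - task.start_time, 1e-6)
        return (1.0 - task.progress_score) / rate

    candidates = [t for t in running_tasks
                  if cluster.task_rate_percentile(t) < SLOW_TASK_THRESHOLD
                  and not t.already_speculated]
    if not candidates:
        return None
    # Back up the task expected to finish farthest in the future.
    return max(candidates, key=time_left)
```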

This work has much potential, especially because MapReduce is merely scratching the surface of many powerful parallel distributed computation models. Many of these are only possible because of the rapid increase in network bandwidth. However, there are still significant differences in network bandwidth between different hosts in a data center, and the scheduling decision here does not account for network topology or current machine workload. There are risks to using machines near the straggler as backups, though: there is a higher chance of correlated failure due to a network issue or a machine issue (for a different thread or core on the same physical host).

All in all, this paper is great, but surely there is a lot more to be done.

Policy-aware switching layer for data centers

This paper describes a way to insert middleboxes into the network logically rather than physically. It certainly relates to a previous paper we have read that argues that middleboxes are disliked by the academic community because they break certain assumptions made during the design of the Internet.
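
As a concrete picture of what "logical rather than physical" insertion might look like, here is a hypothetical policy table in the spirit of the paper: the operator declares which middlebox sequence each class of traffic must traverse, and the switching layer forwards frames accordingly. The field names and API below are my own invention, not the paper's.

```python
# Hypothetical policy table: each rule maps a traffic class to the ordered
# sequence of middlebox types its frames must traverse, regardless of where
# those boxes physically sit. Names and API are invented for illustration.
POLICIES = [
    # (src segment, dst segment, dst port) -> required middlebox sequence
    (("internet",    "web_servers", 80),   ["firewall", "load_balancer"]),
    (("web_servers", "app_servers", 8080), ["firewall"]),
]

def next_middlebox(traffic_class, traversed):
    """Return the next middlebox type this frame still needs, or None."""
    for match, sequence in POLICIES:
        if traffic_class == match:
            remaining = [m for m in sequence if m not in traversed]
            return remaining[0] if remaining else None
    return None   # unmatched traffic goes straight to its destination

# A frame from the Internet to a web server first visits the firewall,
# then the load balancer, then the server itself.
print(next_middlebox(("internet", "web_servers", 80), traversed=[]))
print(next_middlebox(("internet", "web_servers", 80), traversed=["firewall"]))
```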

Given a controlled environment such as an Internet data center, the designer essentially has the freedom to start from a clean slate. This gives the work a chance to actually be deployed, as opposed to remaining a pure academic exercise, which I am really looking forward to. I suspect only real deployment will tell whether the complexity of the design is going to be an issue in daily operation.

In addition, I am worried that the performance overhead of the design (showing up as reduced throughput as well as increased latency) will render the system not cost-effective. I am not sure how much of it comes from the fact that the prototype is Click-based, with its software overhead, but it would be nice if the authors had attributed the overhead in more detail (or perhaps our course project will help).

Tuesday, November 18, 2008

Delay Tolerant Network

The author argues that TCP rests on fundamental assumptions that are broken in "challenged" networks, both in network characteristics and in endpoint capabilities. Instead, the author proposes an architecture that tolerates delay and link loss, using a store-and-forward overlay to interconnect these networks.

I liked the ideas behind this paper, and more importantly, the ideas have seen real deployments, which is always a validation of the work, although I am not sure whether rural areas are truly connected using this sort of technology. To get a seamless transition from a normal network to a "challenged" network, the interface needs to be much more general. Most applications written today rarely handle network failure gracefully, and to make these networks as functional as well-connected ones, I believe the network infrastructure is merely a first step: an application framework that makes failure handling explicit is also needed, something like the sketch below.
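
For instance, instead of a blocking socket that assumes an end-to-end path exists, a DTN-style application interface might hand the network a self-contained bundle and register callbacks for forwarding progress and expiry. This is purely my own sketch of such a framework; none of the names come from the paper.

```python
# Purely illustrative sketch of a failure-explicit, store-and-forward send API;
# none of these names come from the paper.
import time

class Bundle:
    def __init__(self, destination, payload, lifetime_s):
        self.destination = destination
        self.payload = payload
        self.expires_at = time.time() + lifetime_s   # drop bundle after this

class DTNEndpoint:
    def __init__(self):
        self.queue = []   # bundles waiting for a contact opportunity

    def send(self, bundle, on_forwarded, on_expired):
        """Queue a bundle; the caller must supply both callbacks, so the
        failure path cannot be silently ignored."""
        self.queue.append((bundle, on_forwarded, on_expired))

    def on_contact(self, forward):
        """Called when a link comes up; `forward(bundle)` hands custody to the
        next hop and returns True if it was accepted."""
        still_waiting = []
        for bundle, on_forwarded, on_expired in self.queue:
            if time.time() > bundle.expires_at:
                on_expired(bundle)            # explicit failure notification
            elif forward(bundle):
                on_forwarded(bundle)          # custody moved downstream
            else:
                still_waiting.append((bundle, on_forwarded, on_expired))
        self.queue = still_waiting

# Usage: queue a message, then simulate a contact that accepts everything.
ep = DTNEndpoint()
ep.send(Bundle("village-school", b"lesson plan", lifetime_s=3600),
        on_forwarded=lambda b: print("forwarded to", b.destination),
        on_expired=lambda b: print("gave up on", b.destination))
ep.on_contact(forward=lambda b: True)
```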

Sunday, November 16, 2008

An Architecture for Internet Data Transfer

This paper describes an architecture that separates the data transfer mechanism from the setup of the transfer, enabling the use of innovative transfer methods such as multipath and peer-to-peer protocols.

Overall a fairly straightforward idea: essentially a level of indirection for transferring large amounts of data using an asynchronous I/O model (receiver pull). It uses plugins to add new transfer mechanisms to the system while maintaining backward compatibility, and it divides data into chunks so that optimizations such as caching are more effective.
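
A rough sketch of what such a plugin layer could look like (the interface below is my guess at the shape of the design, not the paper's actual API): the receiver pulls each chunk through whichever plugin can supply it, so a cache or a peer-to-peer source can transparently substitute for the direct path.

```python
# Illustrative sketch of a receiver-pull, plugin-based transfer layer.
# The interface is invented; only the ideas (indirection, chunking,
# pluggable transfer methods) come from the paper.
from abc import ABC, abstractmethod

class TransferPlugin(ABC):
    @abstractmethod
    def can_fetch(self, chunk_id) -> bool: ...
    @abstractmethod
    def fetch(self, chunk_id) -> bytes: ...

class LocalCachePlugin(TransferPlugin):
    def __init__(self):
        self.cache = {}                  # chunk_id -> bytes seen before
    def can_fetch(self, chunk_id):
        return chunk_id in self.cache
    def fetch(self, chunk_id):
        return self.cache[chunk_id]

class DirectTCPPlugin(TransferPlugin):
    def can_fetch(self, chunk_id):
        return True                      # fallback: always available
    def fetch(self, chunk_id):
        return b"...bytes pulled from the sender over TCP..."

def receive(chunk_ids, plugins):
    """Pull each chunk through the first plugin that can supply it."""
    return b"".join(next(p for p in plugins if p.can_fetch(cid)).fetch(cid)
                    for cid in chunk_ids)

print(receive(["chunk-1", "chunk-2"], [LocalCachePlugin(), DirectTCPPlugin()]))
```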

I think the main advantage is that this architecture allows features such as multipath and caching to be added behind a very general interface. This is very powerful, but it allows little application control over caching and multipath policies. The plugins are ultimately shared resources between different applications, and I don't think the current design exposes any application control over them. For example, if a bulk transfer happens between the US and UK militaries, they probably don't want their data cached on intermediate nodes in other countries, and they may want to control the path selection process.

Thursday, November 13, 2008

XTrace

This paper talks about tagging data and tracing it through different protocol layers to provide better diagnostic tools for distributed applications. Overall a very good idea, as many applications are very difficult to analyze when they break down, and faster diagnosis can lead to shorter downtime.
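
The core mechanism, as I read it, is that each layer propagates a small piece of trace metadata (a task identifier plus the identifier of the operation that caused the current one) and reports an event to an out-of-band collector, which can then reconstruct the causal tree offline. The sketch below is only my approximation of that idea, with invented names.

```python
# Approximate sketch of cross-layer trace propagation, with invented names.
# Each operation carries the task id and its parent operation id, so the
# collector can rebuild the causal tree offline.
import uuid

class TraceMetadata:
    def __init__(self, task_id=None, parent_op=None):
        self.task_id = task_id or uuid.uuid4().hex
        self.parent_op = parent_op
        self.op_id = uuid.uuid4().hex

    def child(self):
        """Metadata to embed in the next layer's (or next hop's) message."""
        return TraceMetadata(self.task_id, parent_op=self.op_id)

def report(md, layer, event):
    # A real system would send this to an out-of-band collector, not stdout.
    print(md.task_id, md.parent_op, md.op_id, layer, event)

# Example: an HTTP request whose TCP layer reports under the same task id.
http_md = TraceMetadata()
report(http_md, "HTTP", "request sent")
tcp_md = http_md.child()
report(tcp_md, "TCP", "segment transmitted")
```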

I assume this is to be used in an online fashion, so it works like insurance against potential failures. There is some cost to instrumenting code to use X-Trace, but it will likely reduce the pain when a problem happens. Is this motivation strong enough for a company to adopt it? How does it compare to the Google approach of running multiple copies to buy diagnosis time?

Also, because this is an online approach, performance overhead can be a problem, but it is justified if it reduces the number of failover copies a deployment needs.

Because X-Trace is designed to be deployed across many ADs, a bug in X-Trace could introduce correlated failures, which worries me a bit. Performance bugs are multiplied, and security bugs would let attackers hit multiple ADs at once, across the entire application.

Wednesday, November 12, 2008

End to End Internet Packet Dynamics

This paper reports a large-scale study of Internet performance using measurement daemons installed on more than thirty nodes. The study concludes with a list of results about various irregularities seen in Internet routing. It clears up some misconceptions and shows that these irregularities do happen, sometimes frequently, and that many of them are site-dependent.

Many of these results are interesting and could lead to further investigations of Internet irregularities. However, I feel the more valuable lesson from this particular study is the process of evaluating Internet performance at large scale: they obtained a controlled experimental environment by embedding network daemons at various sites. I suspect it will become more and more difficult to convince organizations to participate in today's world, especially in non-academic settings.

They mentioned that they used 100 KB transfers to test bulk transfer. That seems quite low for any network today, and perhaps even low for the Internet in 1995.

Thursday, November 6, 2008

i3

This paper presents an Internet overlay that delivers packets in a rendezvous fashion. I think it is a very elegant solution to the problems of mobility and multicast, because ultimately the data is what the receiver is after; its location is only a means of obtaining that data. This sort of reminds me of webmail, but with a much tighter latency constraint. Webmail gets mobility from a pull model instead of a push model, but doing the same thing at the packet level would certainly be too slow. So i3 essentially uses the webmail model as a control plane and establishes a fast data path for the packets themselves.
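
The rendezvous primitive itself is simple to sketch: receivers insert a trigger (id, address) into the overlay, senders send (id, data), and the i3 node responsible for that id forwards the data to whatever addresses are currently registered. The code below is only a toy, single-node model of that idea; real i3 partitions the id space across a Chord DHT.

```python
# Toy model of i3-style rendezvous: receivers register triggers for an id,
# senders address packets to the id, and the overlay forwards copies to all
# currently registered receivers.
triggers = {}   # id -> set of receiver addresses

def insert_trigger(packet_id, receiver_addr):
    triggers.setdefault(packet_id, set()).add(receiver_addr)

def remove_trigger(packet_id, receiver_addr):
    triggers.get(packet_id, set()).discard(receiver_addr)

def send(packet_id, data, deliver):
    # Mobility: the sender never learns the receiver's address.
    # Multicast: every registered receiver gets a copy.
    for addr in triggers.get(packet_id, set()):
        deliver(addr, data)

# A receiver that moves simply re-inserts its trigger with the new address.
insert_trigger("id42", "10.0.0.5")
send("id42", b"hello", deliver=lambda addr, data: print(addr, data))
```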

I like the basic idea behind i3, but I am not certain the implementation can be widely deployed because of scaling problems. They are already seeing some performance overhead from using a DHT to maintain triggers. Nevertheless, this model of communication opens up a wide range of possibilities, and hopefully, over time, innovation in the implementation details of such systems will resolve the overhead and scaling problems.

Middleboxes no longer considered harmful

This paper presents a new routing and naming architecture that makes incorporating middleboxes into the Internet easier. The authors argue that although middleboxes violate two of the Internet's tenets, the real reason is that the original Internet design is not flexible enough.

The new design uses the notion of an EID as a globally unique identifier, and uses mechanisms such as delegation to hop from one EID to another. This is how middleboxes that are not on the physical path are enabled. It depends on a DHT for name lookup, translating EIDs into IP addresses or other EIDs.
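
A rough sketch of how I picture the resolution process: an EID resolves either directly to an IP address or to another EID that its owner has delegated to (for example, an off-path firewall), and the sender follows the chain until it reaches an address. The lookup table and loop below are my own simplification, with an in-memory dict standing in for the DHT.

```python
# Simplified picture of EID resolution by delegation; a dict stands in for
# the DHT, and all names here are invented for illustration.
resolution = {
    "eid:host-A":     ("delegate", "eid:firewall-X"),  # host delegates to an
    "eid:firewall-X": ("address",  "192.0.2.7"),       # off-path middlebox
}

def resolve(eid, max_hops=8):
    """Follow delegations until an IP address is found."""
    for _ in range(max_hops):
        kind, value = resolution[eid]   # a real system would query the DHT
        if kind == "address":
            return value
        eid = value                     # hop to the delegated EID
    raise RuntimeError("delegation chain too long or cyclic")

print(resolve("eid:host-A"))  # packets for host-A are sent via the firewall
```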

It wasn't particularly clear to me that the problems with NATs and middleboxes are very serious. As ugly as middleboxes and NATs are, they work fairly well: everyone is buying them, and manufacturers are coming up with various configuration tools to make sure they are set up correctly. This architecture has a clean design, but at what cost? Having an entire routing architecture depend on a dynamic system like a DHT just sounds like a bad idea, particularly because the DHT is on the critical path of every flow until the lookup is cached. The authors recognize DHT lookup latency as a problem and throw out caching as a general solution, but I think it is more difficult than that. Also, the distributed nature of the DHT makes it hard to track down problems should one of the nodes respond slowly or fail.

Tuesday, November 4, 2008

DNS Performance and Effectiveness of Caching

This paper studies the performance of DNS by taking traces at two campus networks. The results were surprising because I rarely have issues with DNS lookups. Even more surprising is the fact that a large percentage of lookups are never answered. I suspect the situation has improved since this study was done; perhaps we need another round of studies of this sort.

The authors conclude that since short TTLs are employed for applications such as load balancing, NS-record caching is far more effective than caching the actual results. The number of referrals has a significant impact on lookup latency.
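
To illustrate why NS caching matters even when the final answers have short TTLs: with the zone's nameserver cached, a fresh lookup costs one query instead of a walk down from the root. The toy iterative resolver below is only a schematic of the standard resolution process (with a hard-coded delegation chain), not code or data from the paper.

```python
# Toy iterative resolver showing why caching NS records cuts referrals even
# when the final A record has a short TTL. Purely schematic.
ZONES = {                       # fake delegation data
    ".":            ("referral", "com.", "ns.com"),
    "com.":         ("referral", "example.com.", "ns.example.com"),
    "example.com.": ("answer", "192.0.2.10"),
}
ns_cache = {}                   # zone -> authoritative nameserver we learned

def resolve(name):
    # Start at the deepest zone whose nameserver we already know.
    zone = next((z for z in ("example.com.", "com.", ".") if z in ns_cache), ".")
    referrals = 0
    while True:
        kind, *rest = ZONES[zone]
        if kind == "answer":
            return rest[0], referrals       # A record (short TTL, not cached here)
        child_zone, ns = rest
        ns_cache[child_zone] = ns           # NS records are long-lived, so cache them
        zone, referrals = child_zone, referrals + 1

print(resolve("www.example.com"))  # first lookup: walks from the root, 2 referrals
print(resolve("www.example.com"))  # repeat lookup: NS cache hit, 0 referrals
```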

I like this paper and this sort of large-scale study of the Internet, because there are just so many factors that affect a typical user's Internet experience. Users often do not know where the problem is when they experience one; DNS certainly would not be the first place I look when I have connectivity issues.

Monday, November 3, 2008

Development of the Domain Name System

This paper describes the Domain Name System, the de facto standard Internet naming protocol. It describes the original goals of DNS, how it evolved, and the design decisions behind it. Compared to other descriptions of DNS I have seen, this paper focuses much more on the broader case of resource lookup rather than the narrower sense of translating domain names into IP addresses. Unfortunately, some of these features, such as multiple types and classes, have not seen significant use even today. This is not an isolated instance of a system being designed overly generically.

In general, I liked this paper because it gave me a lot of historical perspective on the shortcomings and surprises the authors did not originally expect. However, the description of DNS in this paper reads too much like a functional specification; the lack of typical usage patterns can cause some confusion for someone who is not very familiar with DNS (me). In this respect, I found the second paper more helpful.