Cosmos: Big Data and Big Challenges
The underlying plumbing for Bing includes Cosmos, a massively parallel storage and computation layer running over tens of thousands of servers in many data centers. Cosmos stores data in simple byte streams, triply replicated for high availability. The storage layer of Cosmos holds hundreds of petabytes of data. Applications are written in SCOPE, a high-level, set-oriented language similar to SQL. SCOPE applications execute across the servers in a massively parallel, distributed fashion. Optimizations try to ship the computation close to the data. Cosmos is an ongoing service with hundreds of departments using it for the back-end calculations supporting Bing. Recent innovations have included the development of structured streams. With structured streams, Cosmos records information about the meta-data, indexing, and data affinitization within the stream as it is being generated. SCOPE jobs that read the data then use this meta-data to dramatically improve the overall performance of the system.
This talk will describe Cosmos, its mechanisms for handling huge amounts of data and computation, and the challenges faced in supporting this real, live service.
Condos and Clouds
Over the last 100+ years, the way people design, build, and use buildings has evolved. It is now normal to construct a building without knowing in advance who will occupy it. In addition, we increasingly have shared occupancy of our homes (apartments and condos), retail, and office space. To accomplish this change, the way we use buildings has evolved. New trust relationships, customs, and laws now establish the relationship between the occupants and the building managers.
Recently, our industry has been moving to implement Cloud Computing. This has been very successful in some applications and very challenging in others. This talk posits that many of the challenges we've seen in cloud computing can be understood by looking at what has happened in buildings and their occupancy. Standardization, usage patterns, and the legal establishment of rights and responsibilities are all nascent in the area of cloud computing. We examine a very common pattern in the implementation of "software as a service" and propose ways in which this pattern may be better supported in a multi-tenant fashion.
Download the slides for the Cosmos talk in PDF format.
Download the slides for the Condos & Clouds talk in PDF format.
About the speaker:
Pat Helland has over 30 years of experience in scalable and highly available computing. He spent most of the 1980s at Tandem Computers as the chief architect for the transaction processing plumbing of the NonStop System. He worked at HAL Computer Systems in the early 1990s, where he architected a NUMA multiprocessor. In the mid-1990s, he started at Microsoft, where he built Microsoft Transaction Server, Distributed Transaction Coordinator, and SQL Service Broker. He spent a couple of years at Amazon working on the product catalog and then returned to Microsoft in 2005. For the last few years, Pat has been the architect for Cosmos running inside Bing. He recently left Microsoft to move to San Francisco to be close to his grandkids.
email: phelland0215 (at) hotmail (dot) com