Fifth Scientist from Microsoft Research Wins Turing Award

Microsoft Research originally published this article on March 18, 2014: Link

Leslie Lamport first began dabbling in computers while he was still in high school. Nothing too unusual about that—until you consider that this was in the mid-1950s. Lamport was attending Leslie_Lamportthe Bronx High School of Science in New York, and he and a friend used to scrounge around, looking for discarded vacuum tubes to build a digital circuit.

The path to greatness begins with baby steps, and for Lamport, a principal researcher with Microsoft Research, that teenage curiosity has yet to be quenched. Over the ensuing decades, he has become a veritable legend in computing circles. His work in the theory of distributed computing is foundational. His 1978 paper Time, Clocks, and the Ordering of Events in a Distributed System is one of the most cited in the history of computer science. And he has contributed core principles to the field of specification and verification of concurrent systems.

On March 18, in recognition of these invaluable advances, the Association for Computing Machinery (ACM) named Lamport the 2013 winner of the A.M. Turing Award, widely regarded as the Nobel Prize of computing.

For many, such recognition couldn’t be more deserved:

  • Bill Gates: “This is well-deserved recognition for a remarkable scientist. As a leader in defining many of the key concepts of distributed computing that enable today’s mission-critical computer systems, Leslie has done great things not just for the field of computer science, but also in helping make the world a safer place. Countless people around the world benefit from his work without ever hearing his name. I like to think this award is also recognition of the amazing work of Microsoft Research, which has become a great home for scientists and engineers who want to tackle the industry’s most difficult challenges. Leslie is a fantastic example of what can happen when the world’s brightest minds are encouraged to push the boundaries of what’s possible.”
  • Butler Lampson, a technical fellow at Microsoft Research New England and the Turing winner in 1992: “Leslie’s contributions to both the theory and the practice of concurrent systems are unsurpassed for quality, scope, and importance. They are on a par with those of Dijkstra, Hoare, Milner, and Pneuli, all previous winners of the Turing Award. Although he can prove theorems with the best of them, his greatest strength is as an applied mathematician with an extraordinary sense of how to apply the tools of mathematics to problems of great practical importance.”
  • Harry Shum, Microsoft executive vice president of Technology and Research: “I really started to appreciate the incredible contribution his work has made to our industry, especially in cloud computing and distributed systems, when I worked at Bing. At Bing, we studied his paper on Paxos and applied his technology to build the distributed systems that we still use today.”
  • Nancy Lynch, head of the Theory of Distributed Systems research group for the Computer Science and Artificial Intelligence Laboratory at the Massachusetts Institute of Technology: “I’m very delighted that Leslie was selected for the Turing Award this year. In fact, I think the award is very long overdue. He has made many foundational contributions to the field of distributed computing, starting in the 1970s.”
  • Ed Lazowska, Bill & Melinda Gates Chair in the Computer Science & Engineering department in the Computer Science & Engineering department at the University of Washington: “When I think about Leslie’s impact, I think about the fact that the algorithms that he designed, which in many cases were viewed as only of theoretical interest, are now fundamental to the way we build web-scale systems, the systems that all of us use every day. Without Leslie’s innovations, we wouldn’t have the computing environment that we have today.”
  • Bob Taylor, founder and manager of the Xerox Palo Alto Research Center and then founder and manager of Digital Equipment Corp.’s Systems Research Center: “The Internet is based on distributed-systems technology, which is, in turn, based on a theoretical foundation invented by Leslie. So if you enjoy using the Internet, then you owe Leslie.”

Lamport, 73, becomes the fifth scientist from Microsoft Research to have won the Turing Award, joining previous recipients Tony Hoare (1980), Lampson, Jim Gray (1998), and Chuck Thacker (2009). But despite more than 40 years of nonpareil achievements, Lamport’s self-assessment is not quite so glowing.

“I wasn’t that good at finding solutions,” he states, “but I was really good at asking questions.”

Few who know him would agree. Take Roy Levin, a Microsoft distinguished engineer and managing director of Microsoft Research Silicon Valley, where Lamport now works.

“When we started the lab here,” Levin says, “Leslie was one of the early people to join. We knew we were building a distributed-systems lab, and what better than to get the person who is the father of principled distributed computing, which Leslie most certainly is. We were delighted to get him.”

So have been Microsoft’s product groups, which have repeatedly reaped the benefits of Lamport’s expertise. His work on Paxos has been implemented in many products, in uses such as Windows Azure storage, Azure’s Rest Availability Proxy, and the Cosmos data storage and query system. He has contributed to the correctness of the Windows Server Transaction Protocol, and the modeling in the Oslo platform for model-driven applications was inspired by his work on Temporal Logic of Actions (TLA). And many at Microsoft have benefited from the LaTeX document-preparation system, created by Lamport.

But his contributions to Microsoft are not merely measured in bytes, a fact to which David Langworthy can attest.

“Leslie taught me how to think,” says Langworthy, a principal development lead for Microsoft. “Using simple math that I learned in high school, I found flaws in my programs that would have been next to impossible to debug on a live server—and found them years earlier, when we still had plenty of time to fix them.”

A Glorious Résumé

That sort of influence is not surprising. Lamport’s career is littered with glittering honors, making his Turing Award seem merely inevitable. Consider:

  • Election to the National Academy of Engineering, 1991
  • Winner of the inaugural Influential Paper Award, for the Time, Clocks paper, from the ACM Symposium on Principles of Distributed Computing, 2000.
  • The IEEE Emanuel R. Piore Award for outstanding contributions in the field of information processing in relation to computer science, 2004.
  • The Edsger W. Dijkstra Prize in Distributed Computing, 2005.
  • Three Hall of Fame Awards from the ACM Special Interest Group on Operating Systems. The awards recognize the most influential operating-systems papers published at least 10 years in the past. The award was established in 2005; Lamport won in 2007, 2012, and 2013.
  • The Test of Time Award from the IEEE Symposium on Logic in Computer Science (LICS), 2008. The award is given annually to the best of the 20-year-old LICS papers that have best met the test of time.
  • The IEEE John von Neumann Medal, 2008.
  • Election to the National Academy of Sciences, 2011.
  • The Jean-Claude Laprie Award in Dependable Computing, 2013.

In addition, Lamport, who received his Ph.D. from Brandeis University in 1972, has been bestowed with honorary doctorates from France’s Université de Rennes in 2003, Germany’s Christian-Albrechts-Universität zu Kiel in 2003, Switzerland’s École Polytechnique Fédérale de Lausanne in 2004, the Università della Svizzera italiana in 2006, and France’s Université Henri Poincaré in 2007.

Time, Clocks, and Relativity

This eventful legacy, forged in part by Lamport’s puckish sense of humor, began to take shape in 1978, when his massively influential Time, Clocks paper was published. Written when Lamport was working for Massachusetts Computer Associates, known as Compass, it resulted from his introduction to a paper called The Maintenance of Duplicate Databases, by Robert Thomas and Paul Johnson, that argued that keeping identical databases updated when a change is made to one of them requires the use of timestamps.

“I realized,” Lamport recalls, “that it did not preserve causality. Things would appear in the system as if they were done in an order that was not logically consistent with the order of the events in which the commands were being issued. And I realized this could be corrected fairly easily by changing the way these timestamps were being generated.”

His insight, stemming from his interest in physics and special relativity, was that it is problematic to identify the temporal order of two events unless there is a causal connection between them—if a message passed between them. That led to an understanding that if the timestamps of such messages could be used to generate an order of events, then all events that occurred in the system could be placed in an order. It then followed that anything that a computing system does could be described with a state machine—something that has a certain state, receives an input, then produces an output and changes its state. This concept, Lamport deduced, could be applied to complex systems, such as banking or airline reservations.

“The classic paper, of course, is his paper on clocks,” Levin says, “which introduced a new way of principled thinking about distributed computation, synchronization, and communication among asynchronous entities that was new at the time and became the basis for any attempt to reason about how parallel systems behave. It’s one of the seminal papers.”

Lamport recalls feedback of a slightly different nature.

“Shortly after the Time, Clocks paper was published,” Lamport recalls, “Jim Gray told me that he had heard two reactions to the paper: Some people thought that it was brilliant, and some people thought that it was trivial.

“I think they’re both right, but I’m disinclined to disagree with people who think it’s brilliant.”

The Bakery Algorithm

Another product of Lamport’s Compass years was the bakery algorithm, described in his A New Solution of Dijkstra’s Concurrent Programming Problem, which aimed to tackle the problem of mutual exclusion: making sure that data corruption doesn’t result from multiple threads trying to write to the same memory location or that one thread doesn’t read that location before another is finished writing to it. The name was an allusion to the ordering system commonly used in a bakery, in which customers select a numbered ticket when they enter.

“When I was at Compass, I read a paper in Communications of the ACM about a mutual-exclusion algorithm,” Lamport recalls. “It was the first time I had seen the mutual-exclusion problem, and I looked at it and said, ‘Well, that doesn’t seem very difficult.’”

So he wrote a quick algorithm and a brief paper and sent them to the editor, who promptly responded by explaining why Lamport’s algorithm wouldn’t work.

“That taught me a lesson,” he now says, “that I should not write a concurrent algorithm without having a proof of its correctness.”

It also encouraged him to go back and fully solve the problem, and he remains proud today of what he achieved with the bakery algorithm.

“That,” he says, “is the one algorithm that I have the feeling that I discovered, rather than invented.”

Byzantine Generals

In 1972, Lamport moved to the Bay Area as the vanguard of what Compass planned to be a West Coast outpost, but that office never materialized, and for five years, he was the only Compass employee based in California. Finally, he was asked to return to the East Coast. Instead, he decided to join SRI International, founded as the Stanford Research Institute.

SRI had a project to build a fault-tolerant avionics computer system for NASA. Given the nature of the system, failures had to be avoidable, and that led to a pair of papers, co-written by Lamport and SRI colleagues Marshall Pease and Robert Shostak, to address Byzantine failures.

In computing terms, “ordinary” failures might result in lost messages or halted processes, but they don’t get corrupted—and if they do, the use of redundancy enables the corrupted message to be discarded. Processes might stop, but they don’t give an incorrect answer.

Byzantine failures, though, can make mistakes or send incorrect messages.

The commonly used technique, called triple modular redundancy, uses three separate computers to “vote,” in a sort of majority rule, and even if one delivers an incorrect result, the other two still will provide the right answer. To prove that this worked, though, a proof was needed, and in trying to write a proof, the researchers encountered a problem: The “wrong” computer could send a different, incorrect value to each of the other two, and neither of those would know. That would necessitate the use of four computers to tackle one fault.

“Except if you use digital signatures,” Lamport says. “Then you can do it with three, because if the bad computer sends one incorrect value with a signature to one computer and a different incorrect value with a signature to the other, the other two computers will exchange the messages and see what’s happening, because two different values have been sent and signed.”

Lamport had heard Gray talk about another problem of the same general nature called the Chinese Generals Problem. That got Lamport to thinking about generals giving orders and generals being traitors, and he dubbed the problem and its solution with the name The Byzantine Generals Problem.

“I remember sitting at a café in Berkeley with Whit Diffie, a friend of mine, when he described the problem of constructing a digital signature,” Lamport recalls. “He said, ‘This would be really useful if people could do it.’ I said, ‘That doesn’t sound very difficult,’ and, on a napkin, I sketched to him what was the first digital-signature algorithm. It wasn’t very practical at the time, although these days, it turns out that it is feasible.”

That napkin, alas, has been lost to the sands of time.

Greek Comedy

“Leslie’s schemes for dealing with faults was a major area of investigation in distributed computing,” Levin says. “His work was fundamental there. Then it led to work on agreement protocols, a key part of the notion of getting processes to converge on a common answer. This is what has come to be known as Paxos.”

Incidently, that work was independently invented at about the same time by Barbara Liskov, Turing Award winner in 2008, and her student Brian Oki.

“Lamport’s paper [The Part-Time Parliament],” Levin adds, “explained things through the use of a mythical Greek island and its legislative body. Partly because he chose this metaphor, it was not appreciated for quite some time what was really going on there.”

Lamport has a less diplomatic assessment.

“With the success of Byzantine Generals, I thought, ‘We need a story.’ I created a story, and that was an utter disaster.

“It was also an attempt to add some humor. It’s the same problem: implementing a state machine to handle the presence of failures. The state machine, in this case, was a parliament passing a series of laws. What’s different is this was not for handling Byzantine faults, but for handling ordinary, simple failures, but in an asynchronous setting. You weren’t dealing with malicious failures. Some things were simple engineering problems, so instead of bothering to go into that, I would simply say, ‘The archeological records don’t say exactly how they did this.’ In another case, to illustrate how a particular optimization could be done, I used the story of a cheese merchant, illustrating how you could make it more efficient. In all the stories, I gave the characters in them Greek names, written in a pseudo-Greek, using transliterated names of computer scientists. The cheese merchant was a computer scientist named Gouda [for Mohamed Gouda].”

It also gave Lamport a chance to have a joke at the expense of one of his buddies at Digital Equipment Corp. (DEC), at whose Systems Research Center Lamport worked from 1985 until joining Microsoft Research in 2001.

“The ending of the paper was that there was something that could happen to cause the system to crash, just completely disintegrate. You could use the system to reconfigure itself, but if you did it carelessly, you could run into a situation where no more laws could be passed. So in the story, this is what happened, and a general named Lampson took over power and became dictator.

“It turned out that computer scientists weren’t as familiar with the Greek alphabet as mathematicians. … Butler Lampson was the only person who got it and understood that it was about how to build distributed systems.”

The paper eventually was published—a decade later, updated to account for intervening advancements. Lamport later wrote Paxos Made Simple to explain the simplicity of the algorithm—without employing Greek wordplay. Over time, the work has become known as a true advance.

“Paxos has become a primary piece of technology in modern distributed systems,” Levin says. “No one would think of building a large-scale distributed system that has to be robust and reliable without having Paxos or something very similar to it at the core.”

Abstracted Correctness

Lamport also introduced the properties of safety and liveness as the best ways to generalize about partial correctness and termination in concurrent systems. “Safety” means that nothing bad happens in a system, “liveness” that something good happens.

“I believed that all you needed was safety and liveness,” Lamport says. “About a decade later, Fred Schneider and his student Bowen Alpern formally defined these concepts and proved that my hunch was correct.

“Shortly after I introduced the concepts of safety and liveness, Amir Pnueli [Turing Award winner in 1996] showed how to use temporal logic to reason about programs. You could represent the program and its properties in the same logic. Proving correctness simply became a matter of saying that a program logically implied its specification.”

Lamport and Susan Owicki showed that temporal logic was good for reasoning about liveness.   But the idea of representing the program in temporal logic didn’t work.

“It was a wonderful idea,” Lamport says, “but it just didn’t work in practice.”

Some thought the solution was to add more complicated temporal-logic operators, but Lamport took a different direction.

“The original logic Amir Pneuli used had one temporal operator: always,” Lamport says. “Something’s always true. People added different, complicated temporal operators, such as ‘until something is true,’ ‘until something else is true.’”

Things got unwieldy, and they weren’t easy to understand.

“I had the brilliant idea of sticking to that one temporal operator but changing the basic non-temporal formulas to which that operator was applied,” he says. “Instead of using basic formulas that just talked about one state, I added formulas that talk about a pair of states, a current state and a next state. By adding those formulas that talked about two states, I was able to describe a state machine in this logic. If I could describe the state machine in the logic and I could describe the liveness properties as temporal formulas, then I could describe the entire program or the entire specification as a single formula, one that really worked. That was TLA.”

TLA underlies the TLA+ specification language, the PlusCal algorithm, and the tools associated with them. TLA and TLA+ are explained in a book written by Lamport and published in 2002: Specifying Systems: The TLA+ Language and Tools for Hardware and Software Engineers. The basic tenet is that the best way to describe things formally is with simple mathematics, with the corollary that a specification language should contain as little as possible beyond what is needed to write simple mathematics precisely.

From DEC to Microsoft

During Lamport’s time at DEC, he had the opportunity of working with other giants of the computing industry—many of whom had worked at Xerox’s legendary Palo Alto Research Center. Among Lamport’s DEC colleagues were Levin, Thacker, Luca Cardelli, Gray, and Lampson, all of whom eventually found their way to Microsoft after DEC’s research labs withered following the sale of the company to Compaq. Lamport retains a particular fondness for his manager at DEC, Bob Taylor.

“The thing I think is so remarkable about Bob is his lack of ego,” Lamport says. “This is something that he taught to Roy Levin, that it was his job to make us happy and productive. He was working for us; we weren’t working for him, in a sense.

“In the 10 years I worked for Bob, his door was always open, and I would quite often wander in there and chat with him. There was one time during that period when he said: ‘Can you come back later? I’m busy now.’ I remember that because it was so extraordinary.”

But nothing in Lamport’s storied career prepared him for what he encountered when he joined Microsoft.

“The whole corporate attitude toward research was just so completely different at Microsoft than anyplace else,” he says. “Microsoft realizes that it’s the researchers who have to generate the research ideas.

“What the corporation can do is to provide the communication channels between the researchers and the engineers, so the engineers will know what the researchers are doing and the researchers will know what the engineers’ problems are. Microsoft has set up its program-manager system as a mechanism to do that.”

And Lamport’s illustrious career has given him the perspective to appreciate Microsoft’s research approach.

“I like working in an industrial research lab, because of the input,” he says. “[French film director] Jean Renoir wrote in his autobiography that when someone asked his father, [Impressionist painter] Pierre-Auguste, why he painted from life, taking his easel outdoors rather than painting in a studio, he said: ‘If I were to paint a tree in the studio, I would be able to paint five or six different kinds of leaves. That’s all that would come to me. But when I go out in nature, there are millions of different kinds of leaves. I need nature for that inspiration.’

“It’s the same with me. If I just work by myself and come up with problems, I’d come up with some small number of things, but if I go out into the world, where people are working on real computer systems, there are a million problems out there. When I look back on most of the things I worked on—Byzantine Generals, Paxos—they came from real-world problems.”

On Feb. 13, Lamport again found himself in the real world, delivering a tribute to close LampsonFest, an event held at Microsoft Research New England to celebrate the 70th birthday of another computing legend, Lamport’s colleague and friend, Butler Lampson.

At the outset of his talk, Lamport displayed a slide with a photo of himself, followed by one of Lampson.

“This is me,” Lamport said, “and this is Butler.

“I realized that there was some confusion about this when someone sent me email asking about my Turing Award lecture. So you should learn to distinguish us. I’m the one with the beard. Butler’s the one with the Turing Award.”

Savor the joke. We won’t be hearing that one anymore.