# Stanford Software Seminar

The Stanford Software Seminar is usually held on Mondays in various rooms in the Gates building. Talks are open to anyone.

To subscribe to the seminar mailing list, visit its mailman page or send an email to software-research-join@lists.stanford.edu from the email address you wish to subscribe. Likewise, to unsubscribe, send an email to software-research-leave@lists.stanford.edu from the subscribed email address. In either case, the subject and body of your email will be ignored.

You can also subscribe to the seminar's Google calendar.

 Upcoming Talks
 Date Monday, June 5 2017, 3-4pm Place Gates 104 Speaker Adam Chlipala, MIT Title A Case for Integrating Proof Assistants into Daily Programming Abstract Computer proof assistants are IDEs for developing rigorous mathematical proofs that are checked algorithmically. They are just beginning to be accepted as useful for "real work" in domains from systems infrastructure to applications. Still, most observers think of a mechanized proof as a kind of penance that programmers can pay to be especially virtuous. Are machine-checked proofs just an extra cost, bolted on to a standard software-development process, to increase assurance? I will argue that, instead, mechanized proofs are a key enabler of new patterns of abstraction and modularity, and these new patterns deserve to permeate the whole life cycle of development. I will draw examples from a few different efforts I'm involved with, applying the Coq proof assistant. One reach target, for the different aspects combined, is to synthesize efficient and secure Internet servers from specifications. Key challenge areas included persistent data management, parsing of network wire formats, and cryptography. I will describe how we are able to generate performance-competitive code automatically from reasonable specifications in each domain -- and how to do it in a way that generates proofs of correctness. The resulting programmer experience is something between "program synthesis" and "programming with well-designed libraries." The paradigm we're pushing is code libraries that include specification notations and procedures that know how to compile those notations efficiently, with proofs. Speaker Bio Adam Chlipala has been on the faculty in computer science at MIT since 2011. He did his undergrad at Carnegie Mellon and his PhD at Berkeley. His research focus is applying mechanized logical reasoning to the programming process (both software and hardware) to improve how we implement, compile, specify, verify, and synthesize code. Much of his work uses the Coq proof assistant, about which he has written a popular book, "Certified Programming with Dependent Types." Current projects involve verification and synthesis for processors, file systems, cryptography, databases, and Internet servers. He is designer and implementer of the Ur/Web programming language, a domain-specific functional language for building modern Web applications, which has a few commercial users.
 Date Monday, May 15 2017, 3:00pm - 4:00pm Place Gates 415 Speaker Yu Feng, UT Austin Title Component-based Synthesis of Table Consolidation and Transformation Tasks from Examples Abstract Programs that manipulate tabular data are extremely important in an era where data analytics has gained enormous popularity. Since raw data is rarely in a form that is immediately amenable to an analytics or visualization task, data scientists typically spend over 80% of their time performing tedious data preparation tasks. Such tasks include consolidating multiple data sources into a single table, reshaping data from one format into another, or adding new rows or columns to an existing table. In this talk, I will present a novel component-based synthesis algorithm that marries the power of type-directed search with lightweight SMT-based deduction and partial evaluation. Given a set of components together with their over-approximate first-order specification, our method first generates a program sketch over a subset of the components and checks its feasibility using an SMT solver. Since a program sketch typically represents many concrete programs, the use of SMT-based deduction greatly increases the scalability of the algorithm. Once a feasible program sketch is found, our algorithm completes the sketch in a bottom-up fashion, using partial evaluation to further increase the power of deduction for rejecting partially-filled program sketches. We apply the proposed synthesis methodology for automating a large class of data preparations tasks that commonly arise in data science. We have evaluated our synthesis algorithm on dozens of data wrangling and consolidation tasks obtained from on-line forums, and we show that our approach can automatically solve a large class of problems encountered by R users. Speaker Bio Yu Feng is a 5th year Ph.D. student at UT Austin advised by Isil Dillig and he was a member of the STAMP group led by Alex Aiken. Yu has interests spanning programming languages, security, and software engineering. For instance, he has published topics about Android Malware Detection (fse'14, ndss'17), Program Synthesis (popl'17, pldi'17) and Program Analysis (oopsla'15, aplas'15).
 Past Talks
 Date Thursday, April 20 2017, 2:00pm - 3:00pm Place Gates 463A Speaker Michael Pradel, Assistant Professor, TU Darmstadt Title Systematic Black-Box Analysis of Collaborative Web Applications Abstract Web applications, such as collaborative editors that allow multiple clients to concurrently interact on a shared resource, are difficult to implement correctly. Existing techniques for analyzing concurrent software do not scale to such complex systems or do not consider multiple interacting clients. This paper presents Simian, the first fully automated technique for systematically analyzing multi-client web applications. Naively exploring all possible interactions between a set of clients of such applications is practically infeasible. Simian obtains scalability for real-world applications by using a two-phase black-box approach. The application code remains unknown to the analysis and is first explored systematically using a single client to infer potential conflicts between client events triggered in a specific context. The second phase synthesizes multi-client interactions targeted at triggering misbehavior that may result from the potential conflicts, and reports an inconsistency if the clients do not converge to a consistent state. We evaluate the analysis on three widely used systems, Google Docs, Firepad, and ownCloud Documents, where it reports a variety of inconsistencies, such as incorrect formatting and misplaced text fragments. Moreover, we find that the two-phase approach runs 10x faster compared to exhaustive exploration, making systematic analysis practically applicable. (Joint work with Marina Billes and Anders Moller. Paper at PLDI'17.) Speaker Bio Michael Pradel is an assistant professor at TU Darmstadt, which he joined after a PhD at ETH Zurich and a post-doc at UC Berkeley. His research interests span software engineering and programming languages, with a focus on tools and techniques for building reliable, efficient, and secure software. In particular, he is interested in dynamic program analysis, test generation, concurrency, performance profiling, and JavaScript-based web applications.
 Date Monday, March 6 2017, 2:00pm - 3:00pm Place Gates 415 Speaker Thomas Ball, Principal Researcher and Research Manager, Microsoft Research Title Physical Computing for Everyone Abstract Thanks to Moore’s Law, embeddable microcontroller-based devices continue to get cheaper, faster, and include more integrated sensors and networking options. In 2016, the BBC and a host of technical partners, including Microsoft, delivered such a physical computing device, the micro:bit, to every 5th grader in the UK. Microsoft Research helped to make the micro:bit easy to program for novices. The non-profit Micro:bit Education Foundation (microbit.org), of which Microsoft is a founding partner, was recently created to take the micro:bit global. Over the last year, Microsoft has invested in a new web-based programming platform for physical computing, called PXT, with the micro:bit being the first target (pxt.microbit.org). In this talk, I’ll describe the design and implementation of PXT, focusing specifically on its web-based approach to physical computing. PXT supports rapid script development and testing within the confines of a modern web browser, via a novel combination of Blockly, TypeScript and hardware simulation. A browser-based compilation toolchain targets both the Thumb and AVR instruction sets and links against pre-compiled C++ code. PXT uses a bespoke C++ runtime from Lancaster University that provides a set of useful abstractions, including events, a message bus, and fibers. Speaker Bio Thomas (Tom) Ball is a principal researcher and manager at Microsoft Research. In 1999, Tom initiated the SLAM software model-checking project with Sriram Rajamani. This led to the creation of the Static Driver Verifier tool for finding defects in Windows device drivers. Tom is a 2011 ACM Fellow for “contributions to software analysis and defect detection.” As a manager, he has nurtured research areas such as automated theorem proving, program testing/verification, and empirical software engineering. His current focus is CS education and the PXT platform for physical computing.
 Date Wednesday, September 21, 4:00pm - 5:00pm Place Gates 104 Speaker Tom Henzinger, IST Austria Title The Quest for Average Response Time Abstract Responsiveness -the requirement that every request to a system be eventually handled- is one of the fundamental liveness properties of a reactive system and lies at the heart of all methods for specifying and verifying liveness. Average response time is a quantitative measure for the responsiveness requirement used commonly in performance evaluation. The static computation of average response time has proved remarkably elusive even for finite-state models of reactive systems. We present, for the first time, a robust formalism that allows the specification and computation of quantitative temporal properties including average response time. The formalism is based on nested weighted automata, which can serve as monitors for measuring the response time of a reactive system. We show that the average response time can be computed in exponential space for nondeterministic finite-state models of reactive systems and in polynomial time for probabilistic finite-state models. This work is joint with Krishnendu Chatterjee and Jan Otop.
 Date Thursday, January 21, 3:00pm - 4:00pm Place Gates 415 Speaker Julian Shun, UC Berkeley Title A Framework for Processing Large Graphs in Shared Memory Abstract In this talk, I will discuss Ligra, a shared-memory graph processing framework that has two very simple routines, one for mapping over edges and one for mapping over vertices. The routines can be applied to any subset of the vertices, which makes the framework useful for many graph traversal algorithms that operate on subsets of the vertices. Based on recent ideas used in a very fast algorithm for breadth-first search, the routines automatically adapt to the density of vertex sets. Ligra is able to express a broad class of graph algorithms including BFS, betweenness centrality, eccentricity estimation, connectivity, PageRank, and single-source shortest paths. The algorithms expressed using this framework are very simple and concise, and perform almost as well as highly optimized code. Furthermore, they get good speedups on a modern 40-core machine and are sometimes much more efficient than previously reported results using graph frameworks on machines with many more cores. I will also discuss Ligra+, an extension of Ligra that uses graph compression to reduce space usage and improve parallel performance. Speaker Bio Julian Shun (http://www.eecs.berkeley.edu/~jshun) is currently a Miller postdoc at UC Berkeley. He obtained his Ph.D. in Computer Science from Carnegie Mellon University, and his undergraduate degree in Computer Science from UC Berkeley. He is interested in developing large-scale parallel algorithms for graph processing, and parallel text algorithms and data structures. He is also interested in designing methods for writing deterministic parallel programs and benchmarking parallel programs.
 Date Tuesday, September 22, 11:00am-12:00pm Place Gates 415 Speaker Nate Foster, Cornell University Title NetKAT: Semantic Foundations for Networks Abstract Formal specification and verification of computer networks has become a reality in recent years, with the emergence of domain-specific programming languages and automated verification tools. But the design of these languages and tools has been largely ad hoc, driven more by the needs of applications and the capabilities of hardware than by any foundational principles. This talk will present NetKAT, a language for programming networks based on a well-studied mathematical foundation, Kleene Algebra with Tests (KAT). The first part of the talk will describe the design of the language, including primitives for filtering, modifying, and forwarding packets; operators for combining programs in parallel and in sequence; and a Kleene star operator for iterating programs. The next part of the talk will explore the semantic underpinnings of the language, developing an equational deductive reasoning system as well as a formal connection to finite automata. The third part of the talk will show how NetKAT can be compiled to low-level configurations for network devices. Lastly, I will discuss a recent extension of NetKAT with new features designed to support probabilistic programming. NetKAT is joint work with colleagues at Cornell, Facebook, Inhabited Type, Princeton, Samsung, UCL, and UMass Amherst. Speaker Bio Nate Foster is an Assistant Professor of Computer Science at Cornell University. The goal of his search is developing programming languages and tools for building reliable systems. He received a PhD in Computer Science from the University of Pennsylvania in 2009, an MPhil in History and Philosophy of Science from Cambridge University in 2008, and a BA in Computer Science from Williams College in 2001. His awards include a Sloan Research Fellowship, an NSF CAREER Award, a Most Influential POPL Paper Award, a Tien '72 Teaching Award, a Google Research Award, a Yahoo! Academic Career Enhancement Award, and the Morris and Dorothy Rubinoff Award.
 Date Friday, June 19, 11:00am-12:00pm Place Gates 415 Speaker Martin Vechev, ETH Zurich Title Machine Learning for Programming Abstract The increased availability of massive codebases (“Big Code”) creates an exciting opportunity for new kinds of programming tools based on probabilistic models. Enabled by these models, tomorrow’s tools will provide probabilistically likely solutions to programming tasks that are difficult or impossible to solve with traditional techniques. I will present a new approach for building such tools based on structured prediction with graphical models, and in particular, conditional random fields. These are powerful machine learning techniques popular in computer vision – by connecting these techniques to programs, our work enables new applications not previously possible. As an example, I will discuss JSNice (http://jsnice.org), a system that automatically de-minifies JavaScript programs by predicting statistically likely variable names and types. Since its release a year ago, JSNice has become a popular tool in the JavaScript community and is regularly used by thousands of developers worldwide. Speaker Bio Martin Vechev (http://www.srl.inf.ethz.ch/vechev.php) is a tenure-track assistant professor at the Department of Computer Science, ETH Zurich. Previously, he was a Research Staff Member at the IBM T.J. Watson Research Center, New York (2007-2011). He obtained his PhD from Cambridge University in 2008. His research interests are in program analysis, program synthesis, application of machine learning to programming languages, and concurrency.
 Date Thursday, June 11, 2:00pm-3:00pm Place Gates 104 Speaker Ofri Ziv Title Composing Concurrency Control Abstract Concurrency control poses significant challenges when composing computations over multiple data-structures (objects) with different concurrency-control implementations. We formalize the usually desired requirements (serializability, abort-safety, deadlock-safety, and opacity) as well as stronger versions of these properties that enable composition. We show how to compose synchronization protocols in a way which preserves these properties. Our approach generalizes well-known synchronization protocols (such as two-phase-locking and two-phase-commit) and leads to new synchronization protocols. We apply this theory to show how we can safely compose optimistic and pessimistic concurrency control and demonstrate the practical value of such a composition. For example, we show how we can execute a transaction that accesses two objects, one controlled by an STM and another by locking.
 Date Thursday, June 4, 1:00pm-2:00pm Place Gates 104 Speaker Mayur Naik, Georgia Tech Title Petablox: Declarative Program Analysis for Big Code Abstract Most software development today leverages the world's massive collection of open source software. There is significant room for program analyses to similarly leverage Big Code, the collective knowledge amassed from analyzing existing programs, to automatically infer or predict salient behaviors and vulnerabilities in new programs. We present Petablox, a framework for automatically synthesizing use-cases of arbitrary declarative program analyses for Big Code tasks such as efficiently finding good abstractions, transferring analysis results across programs, and adapting analyses to user feedback. Despite their diversity, all these tasks entail solving large instances of MaxSAT, the maximum satisfiability problem which comprises a mix of hard (inviolable, logical) constraints and soft (violable, probabilistic) constraints. We describe demand-driven, compositional, and learning-based MaxSAT optimizations in Petablox for scaling these tasks to large code bases. Speaker Bio Mayur Naik is an Assistant Professor in Computer Science at Georgia Tech since 2011. His research interests are in programming languages and software engineering, with a current emphasis on program analysis techniques and systems for improving software quality and programmer productivity on modern computing platforms such as parallel, mobile, and cloud computing. He holds a Ph.D. in Computer Science from Stanford University (2007) and was a research scientist at Intel Labs, Berkeley from 2008 to 2011.
 Date Friday, May 8, 11:00-12:00 Place Gates 415 Speaker Philip Wadler, University of Edinburgh Title A Practical Theory of Language-Integrated Query Abstract Language-integrated query is receiving renewed attention, in part because of its support through Microsoft's LINQ framework. We present a theory of language-integrated query based on quotation and normalisation of quoted terms. Our technique supports abstraction over values and predicates, composition of queries, dynamic generation of queries, and queries with nested intermediate data. Higher-order features prove useful even for constructing first-order queries. We prove that normalisation always succeeds in translating any query of flat relation type to SQL. We present experimental results confirming our technique works, even in situations where Microsoft's LINQ framework either fails to produce an SQL query or, in one case, produces an avalanche of SQL queries.
 Date Monday, March 23, 2:00-3:00 Place Gates 415 Speaker Per Stenström, Chalmers University of Technology, Sweden Title Efficient Statistical-based Cache Compression Abstract Low utilization of on-chip cache capacity limits performance and causes energy wastages because of the long latency, the limited bandwidth, and the energy consumption associated with off-chip memory accesses. Value replication ­- the same value appears in multiple memory locations ­- is an important source of low capacity utilization. While cache compression techniques in the past manage to code frequent values densely, they trade off a high compression ratio for low decompression latency, thus missing opportunities to utilize on-chip cache capacity more effectively. This talk presents, for the first time, a detailed design-space exploration of statistical-based cache compression. We show that more aggressive, statistical-based compression approaches, such as Huffman, that have been excluded in the past due to the processing overhead for compression and decompression, are prime candidates for cache and memory compression. We first find that the overhead of statistics acquisition to generate new codewords is low because value locality varies little over time and across applications so new encodings need to be generated rarely, making it possible to off-load it to software routines. We then show that the high compression ratio obtained by Huffman-based cache compression makes it possible to enjoy the performance benefits of 4X larger last-level caches at a power consumption that is about 50% lower than 4X times larger caches.
 Date Friday, February 27, 3:00-4:00 Place Gates 104 Speaker Armando Solar-Lezama, MIT Title Making synthesis practical. Are we there yet? Abstract In this talk, I will describe recent advances in our ability to synthesize programs that satisfy a specification, and some of the new applications that have been enabled by these advances. The first part of the talk will focus on the Sketch synthesis system and the algorithms that allow it to scale to challenging synthesis problems. The second part of the talk will focus on applications of synthesis to program optimization, automated tutoring and the development of the synthesizer itself. Speaker Bio Armando Solar-Lezama is an associate professor without tenure at MIT where he leads the Computer Aided Programming Group. His research interests include software synthesis and its applications, as well as high-performance computing, information flow security and probabilistic programming.
 Date Tuesday, July 22, 3:00-4:00 Place Gates 104 Speaker Martin Monperrus, University of Lille Title On the Search Space of Automatic Software Repair Abstract Automatic software repair consists of fixing software bugs automatically. A fix is a small point in a huge space of all possible program modifications. In this talk, I will talk about the topology of this search space, and how one can exploit it to speed up automatic software repair. The talk is based on content from: Mining Software Repair Models for Reasoning on the Search Space of Automated Program Fixing (Matias Martinez, Martin Monperrus), In Empirical Software Engineering, Springer, 2013. Do the Fix Ingredients Already Exist? An Empirical Inquiry into the Redundancy Assumptions of Program Repair Approaches (Matias Martinez, Westley Weimer, Martin Monperrus), In Proceedings of the International Conference on Software Engineering, 2014. A Critical Review of "Automatic Patch Generation Learned from Human-Written Patches": Essay on the Problem Statement and the Evaluation of Automatic Software Repair (Martin Monperrus), In Proceedings of the International Conference on Software Engineering, 2014. Speaker Bio The speaker is an Associate Professor at the University of Lille (France) and a member of INRIA's research group SPIRALS. In 2008-2011, he was a research associate at the Darmstadt University of Technology (Germany), and received his Ph.D. from the University of Rennes (France) in 2008.
 Date Monday, July 21, 3:00-4:00 Place Gates 104 Speaker Emina Torlak, UC Berkeley Title Programming for Everyone: From Solvers to Solver-Aided Languages and Beyond Abstract In this talk, I will present a new approach to constructing programs, which exploits advances in constraint solving to make programming easier for experts and more accessible to everyone else. The approach is based on two observations. First, much of everyday programming involves the use of domain-specific languages (DSLs) that are embedded, in the form of APIs and interpreters, into modern host languages (for example, JavaScript, Scala or Racket). Second, productivity tools based on constraint solvers (such as verification or synthesis) work best when specialized to a given domain. Rosette is a new kind of host language, designed for easy creation of DSLs that are equipped with solver-based tools. These Solver-Aided DSLs (SDSLs) use Rosette's symbolic virtual machine (SVM) to automate hard programming tasks, including verification, debugging, synthesis, and programming with angelic oracles. The SVM works by compiling SDSL programs to logical constraints understood by SMT solvers, and then translating the solver's output to counterexamples (in the case of verification), traces (in the case of angelic execution), or code snippets (in the case of synthesis and debugging). Rosette has hosted several new SDSLs, including imperative SDSLs for data-parallel and spatial programming; a functional SDSL for specifying executable semantics of secure stack machines; and a declarative SDSL for web scraping by example. Speaker Bio
 Date Thursday, May 15, 4:00-5:00 Place Gates 104 Speaker Adam Chlipala, MIT Title Bedrock: A Platform for Practical Proof-Carrying Code with Rich Policies Abstract Today more than ever, the computing ecosystem is full of examples of execution of untrusted code. From mobile applications, to JavaScript-based Web applications in browsers, to virtual-machine images run by cloud providers, users from consumers to hosting providers are running programs produced by untrusted third parties. It is essential to enforce security policies on these pieces of untrusted code, but the common techniques of today are often too weak, and they often impose high run-time costs in power and so forth, which matter especially much for resource-constrained mobile devices. One alternative paradigm is proof-carrying code (PCC), where developers distribute software with machine-checkable proofs of adherence to policies. In this talk, I will give an overview of Bedrock, a platform to make PCC more practical along a number of dimensions. Ideally we want to apply PCC with rich behavioral policies, but, without careful tool design, it can be intractable to build rigorous conformance proofs for realistic programs. Past PCC frameworks supporting rich policies have required on the order of 100 lines of manual proof per 1 line of assembly code. Bedrock, a library for the Coq proof assistant, brings this ratio closer to 1-to-1. We can do mostly automated assembly-level proofs of functional correctness for tricky program modules that mix features like heap-allocated data structures, function pointers, and concurrency. In this domain, the challenge is not just the algorithmic design of verification tools, but also the generation of machine-checkable proofs from first principles. I will focus on three broad challenges that are important to scale up the approach: "horizontal" modularity in the form of program decomposition into modules, "vertical" modularity via verified compilers, and effective proof automation for individual program modules written in particular languages. Speaker Bio Adam Chlipala is an assistant professor in computer science at MIT. He completed his PhD in CS at Berkeley in 2007 and his BS in CS at CMU in 2003. His research interests, broadly speaking, are in applied logic for software development tools. Much of his group's work uses the Coq theorem-proving software, about which he has written a popular book "Certified Programming with Dependent Types." Current projects in theorem-proving deal with verification of compilers and low-level imperative code. He also designed the Ur/Web domain-specific programming language for modern Web applications, which applies advanced type system ideas in a form practical enough to be used by several production Web applications.
 Date Friday March 7, 2:00-3:00 Place Gates 392 Speaker Mooly Sagiv, Tel Aviv University Title VeriCon: Towards Verifying Controller Programs in Software-Defined Networks Abstract Software-defined networking (SDN) is a new paradigm for operating and managing computer networks. SDN enables logically-centralized control over network devices through a software controller'' program that operates independently from the network hardware. We present VeriCon, the first system for verifying network-wide invariants of SDN controller programs. VeriCon uses first-order logic to define admissible network topologies and desired network-wide invariants (e.g., routing correctness. correct access control, and consistency of the controller's data structures). VeriCon either confirms the correctness of the controller program on ALL admissible network topologies or outputs a concrete example that violates an invariant, and so can be used for debugging controller code. We show that VeriCon, which implements classical Floyd-Hoare-Dijkstra deductive verification, is practical for a large repertoire of controller programs. In addition, as VeriCon is compositional, in the sense that it checks the correctness of each network event independently against the specified invariants, it can scale to handle complex systems. We view VeriCon as a first step en route to practical mechanisms for verifying network-wide invariants of controller code. This is a joint work with Thomas Ball and Nikolaj Bjorner (MSR), Aaron Gember (Wisc), Shachar Itzhaky (TAU), Aleksandr Karbyshev (TUM), Michael Schapira and Asaf Valdarsky (HUJI) Speaker Bio
 Date Friday Jan 17, 3-4pm Place Gates 498 Speaker Stavros Aronis, Uppsala University Title Optimal Dynamic Partial Order Reduction Abstract Stateless model checking is a powerful technique for program verification, which however suffers from an exponential growth in the number of explored executions. A successful technique for reducing this number, while still maintaining complete coverage, is Dynamic Partial Order Reduction (DPOR). We present a new DPOR algorithm, which is the first to be provably optimal in that it always explores the minimal number of executions. It is based on a novel class of sets, called source sets, which replace the role of persistent sets in previous algorithms. First, we show how to modify an existing DPOR algorithm to work with source sets, resulting in an efficient and simple to implement algorithm. Second, we extend this algorithm with a novel mechanism, called wakeup trees, that allows to achieve optimality. We have implemented both algorithms in a stateless model checking tool for Erlang programs. Experiments show that source sets significantly increase the performance and that wakeup trees incur only a small overhead in both time and space. Speaker Bio
 Date 3:00-4:00, Monday, November 18 Place Gates 415 Speaker Junfeng Yang, Columbia University Title Determinism Is Not Enough: Making Parallel Programs Reliable with Stable Multithreading Abstract Our accelerating computational demand and the rise of multicore hardware have made parallel programs, especially shared-memory multithreaded programs, increasingly pervasive and critical. Yet, these programs remain extremely difficult to write, test, analyze, debug, and verify. Conventional wisdom has attributed these difficulties to nondeterminism (i.e., repeated executions of the same program on the same input may show different behaviors), and researchers have recently dedicated much effort to bringing determinism into multithreading. In this talk, I argue that determinism is not as useful as commonly perceived: it is neither sufficient nor necessary for reliability. We present our view on why multithreaded programs are difficult to get right, describe a promising approach we call stable multithreading to dramatically improve reliability, and summarize our last four years research on building and applying stable multithreading systems. (More details are at http://www.cs.columbia.edu/~junfeng/.) Speaker Bio
 Date 3:00-4:00, Monday, December 2 Place Gates 415 Speaker Todd Millstein, UCLA Title Toward a "Safe" Semantics for Multithreaded Programming Languages Abstract "Safe" programming languages enforce fundamental language abstractions, thereby providing strong guarantees for all program executions and obviating large classes of subtle and dangerous errors. Modern languages have embraced the compelling programmability benefits of (memory and type) safety despite the additional run-time overhead. Unfortunately, recent work to standardize multithreading semantics in mainstream programming languages is reversing this trend. While a significant improvement over prior informally-specified semantics, the current standards allow a small programming error or omission to violate program safety in ways that are difficult to understand, detect, and correct. In this talk I will argue that a safe multithreaded programming language should support the simple interleaving semantics of threads known as sequential consistency (SC). I'll then describe the results of our research over the past few years, which challenges the idea that the SC semantics is inconsistent with high performance. Perhaps surprisingly, restricting a modern compiler's optimizations to respect SC introduces minimal runtime overhead. While modern hardware relies upon important optimizations that potentially violate SC, a small extension to such hardware can preserve the SC semantics while retaining the lion's share of the benefit of these optimizations. Further, various factors will conspire to lower the cost of SC on stock hardware in the coming years. This work is joint with Dan Marino (Symantec Research Labs, formerly UCLA), Abhay Singh (U Michigan, Ann Arbor), Madan Musuvathi (Microsoft Research, Redmond), and Satish Narayanasamy (U Michigan, Ann Arbor). Speaker Bio Todd Millstein is an Associate Professor in the Computer Science Department at the University of California, Los Angeles. Todd received his Ph.D. and M.S. from the University of Washington and his A.B. from Brown University, all in Computer Science. Todd received an NSF CAREER award in 2006, an IBM Faculty Award in 2008, an ACM SIGPLAN Most Influential PLDI Paper Award in 2011, and an IEEE Micro Top Picks selection in 2012.
 Date 2:00-3:00, Monday, September 16 Place 104 Gates Speaker Mohsen Lesani, UCLA Title On Testing and Verification of Transactional Memory Algorithms Abstract A transactional memory (TM) is an object composed of a set of base objects such as registers and locks. The safety of the TM algorithm is the result of the safety of the composed base objects and the logic of the algorithm. We define a language that captures the type of the base objects and the algorithm. We define a history semantics for the language that characterizes the set of histories that a program can result. Based on the history semantics, we propose techniques for both testing and verification of TM algorithms. First, we identify two problems that lead to violation of opacity, a safety criterion for TM. We present an automatic tool that given a violating history, finds program traces that result in that history. We show that the well-known TM algorithms DSTM and McRT don't satisfy opacity. DSTM suffers from a write-skew anomaly, while McRT suffers from a write-exposure anomaly. Second, we present a program logic with novel propositions about execution order and linearization orders of base objects. We prove that our inference rules are sound i.e. if we can derive that a program satisfies a property, then every history of the program satisfies the property. Our logic is composable as it can be augmented with new inference rules to support reasoning about new object types. We have used our logic to prove that TL2 TM algorithm satisfies opacity. We are formalizing our logic and proofs in PVS. Speaker Bio Mohsen Lesani is a Phd candidate at UCLA advised by professor Palsberg. He has research experience with IBM Research, Oracle (Sun) Labs, HP Labs and EPFL. He is interested in the design, implementation, testing and verification of synchronization algorithms.
 Date 4:00-5:00, Tuesday, May 28 Place 415 Gates Speaker Vijay Ganesh Title SMT Solvers for Software Reliability and Security Abstract SMT solvers increasingly play a central role in the construction of reliable and secure software, regardless of whether such reliability/security is ensured through formal methods, program analysis or testing. This dramatic influence of SMT solvers on software engineering as a discipline is a recent phenomenon, largely attributable to impressive gains in solver efficiency and expressive power. In my talk, I will motivate the need for SMT solvers, sketch out their research story thus far, and then describe my contributions to solver research. Specifically, I will talk about two SMT solvers that I designed and implemented, namely, STP and HAMPI, currently being used in 100+ research projects. I will talk about real-world applications enabled by my solvers, and the techniques I developed that helped make them efficient. Time permitting, I will also talk about some theoretical results in the context of SMT solving. Speaker Bio Vijay Ganesh is an assistant professor at the University of Waterloo, Canada, since September 2012. Prior to that he was a Research Scientist at MIT, and completed his PhD in computer science from Stanford University.
 Date 2:00-3:00, Monday, May 20 Place Gates 104 Speaker Cristian Cadar, Imperial College London Title Safe Software Updates via Multi-version Execution Abstract Software systems are constantly evolving, with new versions and patches being released on a continuous basis. Unfortunately, software updates present a high risk, with many releases introducing new bugs and security vulnerabilities. We tackle this problem using a simple but effective multi-version based approach. Whenever a new update becomes available, instead of upgrading the software to the new version, we run the new version in parallel with the old one; by carefully coordinating their executions and selecting the behavior of the more reliable version when they diverge, we create a more secure and dependable multi-version application. We implemented this technique in Mx, a system targeting Linux applications running on multi-core processors, and show that it can be applied successfully to several real applications, such as Lighttpd and Redis.
 Date 3:00-4:00, Monday, May 6 Place Gates 392 Speaker Rupak Majumdar, Max Planck Institute Title Static Provenance Verification for Message-Passing Programs Abstract Provenance information records the source and ownership history of an object. We study the problem of static provenance tracking in concurrent programs in which several principals execute concurrent processes and exchange messages over unbounded but unordered channels. The provenance of a message, roughly, is a function of the sequence of principals that have transmitted the message in the past. The provenance verification problem is to statically decide, given a message passing program and a set of allowed provenances, whether the provenance of all messages in all possible program executions, belongs to the allowed set. We formalize the provenance verification problem abstractly in terms of well-structured provenance domains, and show a general decidability result for it. In particular, we show that if the provenance of a message is a sequence of principals who have sent the message, and a provenance query asks if the provenance lies in a regular set, the problem is decidable and EXPSPACE-complete. We describe an implementation of our technique to check provenances of messages in Firefox extensions. (Joint work with Roland Meyer and Zilong Wang)
 Speaker Bio
 Date 2:30-3:30, Thurs., Dec. 6 Place 392 Gates Speaker Christoph Kirsch, University of Salzburg Title Distributed Queues: Faster Pools and Better Queues Abstract Designing and implementing high-performance concurrent data structures whose access performance scales on multicore hardware is difficult. An emerging remedy to scalability issues is to relax the sequential semantics of the data structure and exploit the resulting potential for parallel access in relaxed implementations. However, a major obstacle in the adoption of relaxed implementations is the belief that their behavior becomes unpredictable. We therefore aim at relaxing existing implementations systematically for better scalability and performance without incurring a cost in predictability. We present distributed queues (DQ), a new family of relaxed concurrent queue implementations. DQ implement bounded or unbounded out-of-order relaxed queues with strict (i.e. linearizable) emptiness check. Our comparison of DQ against existing pool, and strict and relaxed queue implementations reveals that DQ outperform and outscale the state-of-the-art implementations. We also empirically show that the shorter execution time of queue operations of fast but relaxed implementations such as DQ (i.e. the degree of reordering through overlapping operations) may offset the effect of semantical relaxations (i.e. the degree of reordering through relaxation) making them appear as behaving as or sometimes even more FIFO than strict but slow implementations. This is joint work with A. Haas, T.A. Henzinger, M. Lippautz, H. Payer, A. Sezgin, and A. Sokolova Speaker Bio Christoph Kirsch is full professor and holds a chair at the Department of Computer Sciences of the University of Salzburg, Austria. Since 2008 he is also a visiting scholar at the Department of Civil and Environmental Engineering of the University of California, Berkeley. He received his Dr.Ing. degree from Saarland University, Saarbruecken, Germany, in 1999 while at the Max Planck Institute for Computer Science. From 1999 to 2004 he worked as Postdoctoral Researcher at the Department of Electrical Engineering and Computer Sciences of the University of California, Berkeley. His research interests are in concurrent programming and systems, virtual execution environments, and embedded software. Dr. Kirsch co-invented the Giotto and HTL languages, and leads the JAviator UAV project for which he received an IBM faculty award in 2007. He co-founded the International Conference on Embedded Software (EMSOFT), has been elected ACM SIGBED chair in 2011, and is currently associate editor of ACM TODAES.
 Date 2:00-3:00, July 16, 2012 Place Gates 260 Speaker Jesse Tov, Harvard Title Practical Programming with Substructural Types Abstract Substructural logics remove from classical logic rules for reordering, duplication, or dropping of assumptions. Because propositions in such a logic may no longer be freely copied or ignored, this suggests understanding propositions in substructural logics as representing resources rather than truth. For the programming language designer, substructural logics thus provide a framework for considering type systems that can track the changing states of logical and physical resources. While several substructural type systems have been proposed and implemented, many of these have targeted substructural types at a particular purpose, rather than offering them as a general facility. The more general substructural type systems have been theoretical in nature and too unwieldy for practical use. This talk presents the design of a general purpose language with substructural types, and discusses several language design problems that had to be solved in order to make substructural types useful in practice. Speaker Bio
 Date 2:00-3:00, July 18, 2012 Place Gates 260 Speaker Aditya Thakur, U. Wisconsin Title A Deductive Algorithm for Symbolic Abstraction with Applications to SMT Abstract This talk presents connections between logic and abstract interpretation. In particular, I will present a new algorithm for the problem of "symbolic abstraction": Given a formula \phi in a logic L and an abstract domain A, the symbolic abstraction of \phi is the best abstract value in A that over-approximates the meaning of \phi. When \phi represents a concrete transformer, algorithms for symbolic abstraction can be used to automatically synthesize the corresponding abstract transformer. Furthermore, if the symbolic abstraction of \phi is bottom, then \phi is proved unsatisfiable. The bottom line is that our algorithm is "dual-use": (i) it can be used by an abstract interpreter to compute abstract transformers, and (ii) it can be used in an SMT (Satisfiability Modulo Theories) solver to determine whether a formula is satisfiable. The key insight behind the algorithm is that Staalmarck's method for satisfiability checking of propositional-logic formulas can be explained using concepts from the field of abstract interpretation. This insight then led to the discovery of the connection between Staalmarck's method and symbolic abstraction, and the extension of Staalmarck's method to richer logics, such as quantifier-free linear real arithmetic. This is joint work with Prof. Thomas Reps. Speaker Bio
 Date 3:00-4:00, November 7, 2011 Place Gates 260 Speaker Anupam Datta, CMU Title Policy Auditing over Incomplete Logs Abstract We present the design, implementation and evaluation of an algorithm that checks audit logs for compliance with privacy and security policies. The algorithm, which we name reduce, addresses two fundamental challenges in compliance checking that arise in practice. First, in order to be applicable to realistic policies, reduce operates on policies expressed in a first-order logic that allows restricted quantification over infinite domains. We build on ideas from logic programming to identify the restricted form of quantified formulas. The resulting logic is more expressive than prior logics used for compliance-checking, including propositional temporal logics and metric first-order temporal logic, and, in contrast to these logics, can express all 84 disclosure-related clauses in the HIPAA Privacy Rule. Second, since audit logs are inherently incomplete (they may not contain sufficient information to determine whether a policy is violated or not), reduce proceeds iteratively: in each iteration, it provably checks as much of the policy as possible over the current log and outputs a residual policy that can only be checked when the log is extended with additional information. We prove correctness, termination, time and space complexity results for reduce.We implement reduce and evaluate it by checking simulated audit logs for compliance with the HIPAA Privacy Rule. Our experimental results demonstrate that the algorithm is fast enough to be used in practice. Speaker Bio Anupam Datta is an Assistant Research Professor at Carnegie Mellon University where he is affiliated with CyLab, the Electrical and Computer Engineering Department, and (by courtesy) the Computer research focuses on foundations of security and privacy. One area of focus has been on programming language methods for compositional security. His work on Protocol Composition Logic and the Logic of Secure Systems has uncovered new principles for compositional security and has been applied successfully to find attacks in and verify properties of a number of practical cryptographic protocols and secure systems. A second area of focus has been on formalizing and enforcing privacy policies. He has worked on a Logic of Privacy that formalizes concepts from contextual integrity --- a philosophical theory of privacy as a right to appropriate flows of personal information. His group has produced the first complete formalization of the HIPAA Privacy Rule using this logic and developed principled audit mechanisms for enforcing policies expressed in the logic. Dr. Datta has co-authored a book and over 30 publications in conferences and journals on these topics. He serves on the Steering Committee of the IEEE Computer Security Foundations Symposium. He has served as Program Co-Chair of the 2011 Formal Aspects of Security and Trust Workshop and the 2008 Formal and Computational Cryptography Workshop. Dr. Datta obtained MS and PhD degrees from Stanford University and a BTech from IIT Kharagpur, all in Computer Science.
 Date 3:00-4:00, Oct 4, 2011 Place 498 Gates Speaker David Basin, ETH Zurich Title Policy Monitoring in First-order Temporal Logic Abstract In security and compliance, it is often necessary to ensure that agents and systems comply to complex policies. An example from financial reporting is the requirement that every transaction t of a customer c, who has within the last 30 days been involved in a suspicious transaction t', must be reported as suspicious within 2 days. We present an approach to monitoring such policies formulated in an expressive fragment of metric first-order temporal logic. We also report on case studies in security and compliance monitoring and use these to evaluate both the suitability of this fragment for expressing complex, realistic policies and the efficiency of our monitoring algorithm. Speaker Bio David Basin is a full professor and has the chair for Information Security at the Department of Computer Science, ETH Zurich since 2003. He is also the director of the ZISC, the Zurich Information Security Center. He received his bachelors degree in mathematics from Reed College in 1984, his Ph.D. from Cornell University in 1989, and his Habilitation from the University of Saarbr|cken in 1996. His appointments include a postdoctoral research position at the University of Edinburgh (1990 - 1991), and afterwards he led a subgroup, within the programming logics research group, at the Max-Planck-Institut f|r Informatik (1992 - 1997). From 1997 - 2002 he was a full professor at the University of Freiburg where he held the chair for software engineering. His research focuses on information security, in particular methods and tools for modeling, building, and validating secure and reliable systems.
 Date 1:00-2:00, Friday June 3 Place 463a Gates Speaker Andreas Zeller, Saarland University Title Mining Precise Specifications Abstract Recent advances in software validation and verification make it possible to widely automate the check whether a specification is satisfied. This progress is hampered, though, by the persistent difficulty of writing specifications. Are we facing a "specification crisis"? By mining specifications from existing systems, we can alleviate this burden, reusing and extending the knowledge of 60 years of programming, and bridging the gap between formal methods and real-world software. But mining specifications has its challenges: We need good usage examples to learn expected behavior; we need to cope with the approximations of static and dynamic analysis; and we need specifications that are readable and relevant to users. In this talk, I present the state of the art in specification mining, its challenges, and its potential, up to a vision of seamless integration of specification and programming. Speaker Bio Andreas Zeller is a full professor for Software Engineering at Saarland University in Saarbr|cken, Germany. His research concerns the analysis of large software systems and their development process; his students are funded by companies like Google, Microsoft, or SAP. In June 2011, Zeller will be inducted as Fellow of the ACM for his contributions to automated debugging and mining software archives.