osa1

Papers I read in 2015-2016

November 15, 2016 - Tagged as: en.

This is my last week at Indiana University. It’ll be almost two years (1 year 11 months) since I came here, and it’s time for me to move on. While cleaning my messy desk covered with papers today I thought maybe I should make list of papers that I read (or at least printed) during this two-year period, so here it is.

Papers are sorted by publication dates. ??? indicates that the publication date is unknown (or may not be published at all), and those are listed last. I’ve added some very brief notes below some of the papers. Only last names of authors are listed.

On computable numbers, with an application to Entscheidungsproblem – Turing. 1936.
Computing Machinery and Intelligence – Turing. 1950.

Defines “the imitation game” aka. “Turing test”. Tries to answer some philosophical questions. I remember finding some of the arguments pretty weak.
Recursive Functions of Symbolic Expressions and Their Computation by Machine, Part I – McCarthy. 1960.

A classic. Defines “garbage collection”, “free list” and “stack” for the first time, without actually using those words.
A basis for a mathematical theory of computation – McCarthy. 1963.

Haven’t read.
The Next 700 Programming Languages – Landin. 1966.

Classic.
A Nonrecursive List Compacting Algorithm – Cheney. 1970.

A classic. Only two pages. Algorithm can easily be generalized to collect arbitrary heap objects. Known as “Cheney’s algorithm” in GC literature.
Monotone Data Flow Analysis Frameworks – Kam, Ullman. 1975.
Global Data Flow Analysis and Iterative Algorithms – Kam, Ullman. 1976.
Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints – Cousot, Cousot. 1977.

Cousot & Cousot, so avoid. Read “Principles of Program Analysis” (book) by Nielson, Nielson and Hanking.
A Real-Time Garbage Collector Based on the Lifetimes of Objects – Lieberman, Hewitt. 1983.

An early generational GC paper. Defines “scavenging” and “evacuation”, some of the terms used in GHC’s GC.
Generation Scavenging: A Non-disruptive High Performance Storage Reclamation Algorithm – Ungar. 1984.
Compiling Pattern Matching – Augustsson. 1985.

A very early paper on lazy pattern matching in LML. IIRC the idea is very simple and maybe even obvious, but this is a very early publication on the topic.
How to Replace Failure by a List of Successes - A method for exception handling, backtracking, and pattern matching in lazy functional languages – Wadler. 1985.

I don’t remember reading this.
Multilisp: A anguage for Concurrent Symbolic Computation – Halstead. 1985.

Haven’t read this.
Programming as Theory Building – Naur. 1985.

All I remember about this paper is that I disagreed very strongly with the points made. I see some notes on the paper about why analogies don’t work and are harmful.
Control Operators, the SECD-Machine, and the λ-Calculus – Felleisen, Friedman. 1986.
Hygienic Macro Expansion – Kohlbecker, Friedman, Felleisen, Duba. 1986.
No Silver Bullet - Essence and Accident in Software Engineering – Brooks. 1986.
ORBIT: An Optimizing Compiler for Scheme. Kranz, Kelsey, Rees, Hudak, Philbin, Adams. 1986.
The Concept of a Supercompiler – Turchin. 1986.

This is one of the earliest supercompilation papers published in English.

First supercompilation papers were in Russian. First supercompilation paper in English was “A supercompiler system based on the langauge Refal”, Turchin, 1979. I don’t know what happened in the 7 year period, but I think the next paper was this one.
Efficient Compilation of Pattern-Matching (The Implementation of Functional Programming Languages book chapter) – Wadler. 1987.
Re-opening Closures – Appel. 1987.

The idea is to optimize closures in runtime using captured variables. I’ve heard that Mu uses some variant of this idea.
The Concatenate Vanishes – Wadler, 1987.

An even earlier work than the deforestation one. Wadler shows how to “deforest” intermediate lists in some programs. I don’t remember how the idea worked, but it must be just a very simple case-of-case transformation.
Theorems for free! – Wadler. 1989.

Great paper, shows how parametricity leads to theorems.
Deforestation: Transforming programs to eliminate trees – Wadler. 1990.

I think this is another classic and coined the term “deforestation”. The paper makes a huge simplifying assumption and assumes that variables are used linearly in function bodies, which makes beta reduction safe in terms of work duplication. So not very useful for practical purposes.
Linear types can change the world! – Wadler. 1990.
Implementing Projection-based Strictness Analysis – Kubiak, Hughes, Launchbury. 1992.
Undecidability of Static Analysis – Landi. 1992.

IIRC this paper shows how interesting static analysis problems are undecidable, by means of reductions from known undecidable problems like the halting problem.
Unboxed objects and polymorphic typing – Leroy. 1992.
A Short Cut to Deforestation – Gill, Launchbury, Jones. 1993.

Introduces fold/build rules. Still in use today in GHC.
Metaobject protocols: Why we want them and what else they can do – Kiczales, Ashley, Rodriguez, Vahdat, Bobrow. 1993.
Occam’s Razor in Metacomputation: the Notion of a Perfect Process Tree – Glück, Klimov. 1993.

An even earlier paper on “perfect information propagation”. Uses “process trees” which are only used in very early supercompilation papers (including Turchin ones IIRC).
Syntactic Abstraction in Scheme – Dybvig, Hieb, Bruggeman. 1993.

Go IU!
What is a Purely Functional Language? – Sabry. 1993.
Partial Deduction and Driving are Equivalent – Glück, Sørensen. 1994.
Mix Ten Years Later – Jones. 1995.

I love Neil Jones’s partial evaluation work (and especially the book “Partial Evaluation and Automatic Program Generation” by Jones, Gomard, Sestoft).
The Design of a Pretty-printing Library – Hughes. 1995.
A Roadmap to Metacomputation by Supercompilation – Glück, Sørensen. 1996.

Shows how to use supercompilation for metaprogramming tasks “specialization”, “composition” and “inversion”. I don’t even remember what are the latter two though.
Building Domain-Specific Embedded Languages – Hudak. 1996.
On Perfect Supercompilation – Secher, Sørensen. 1996.

This paper shows how to do “negative information propagation” during “driving” (see Jones, 2000). “Negative” means “what forms a value cannot take”.

Now that I think about this idea again, I’m wondering in what sense “negative” information is special. Assuming the set of values of a type is finite, you can always express negative information as positive information, e.g. if the set contains {X, Y, Z} cannot be X => is one of {Y, Z}.
Realistic Compilation by Partial Evaluator – Sperber, Thiemann. 1996.

I love this paper. It inspired a project of mine. I met the first author at ICFP’16 after coincidentally sitting next to him at lunch :) .
A transformation-based optimiser for Haskell – Jones, Santos. 1997.

Describes “let-no-escape” optimizations that is still in used today in GHC.
Improvement Theory and its Application – Sands. 1997.

The idea is used in Bolingbroke’s PhD thesis, that’s why I printed this, but I haven’t read it yet.
MetaML and Multi-Stage Programming with Explicit Annotations – Taha, Sheard. 1997.

Introduction to MetaML.
Definitional Interpreters for Higher-Order Programming Languages – Reynolds. 1998.

Classic.
Growing a Language – Steele. 1998.

Watch his talk with the same title instead, it’s awesome.
Positive supercompilation for a higher-order call-by-value language – Jonsson, Nordlander. 1998.

Funny how the pages are full of notes, yet I don’t remember anything about this paper.
Type Inference with Constrained Types – Odersky, Sulzmann, Wehr. 1998.
On the Power of Homeomorphic Embedding for Online Termination – Leuschel, 1998.
A semantics for imprecise exceptions – Jones, Reid, Hoare, Marlow, Henderson. 1999.
Call-By-Push-Value: A Subsuming Paradigm (extended abstract) – Levy. 1999.

Another paper that I just couldn’t finish.
Introduction to Supercompilation – Sørensen, Glück. 1999.

This is a really good introduction. It uses “process trees” which help a lot when trying to understand the idea for the first time. I made a presentation here at IU using process trees in this paper.
Linear Scan Register Allocation – Poletto, Sarkar. 1999.

Very famous algorithm, used in many JIT compilers. I think GHC also has a variant of this (see compiler/nativeGen/RegAlloc/Linear directory).
Partial Evaluation and Automatic Program Generation – Jones, Gomard, Sestoft. 1999.

I really like this book. I tried to implement every language in this book, but it became tiresome real quick. Best introduction to partial evaluation.
Secrets of the Glasgow Haskell Compiler inliner – Jones, Marlow. 1999.
The STG runtime system (revised) – Jones, Marlow. 1999.
Compiling Embedded Languages – Elliott, Finne, de Moor. 2000.
Type-Preserving Garbage Collectors – Wang, Appel. 2001.

Haven’t read this.
The Essence of Program Transformation by Partial Evaluation and Driving – Jones. 2000.

This paper defines the concept of “driving” which IIRC is also used in some early (before Mitchell and Bolingbroke papers) supercompilation papers.
Composable and Compilable Macros - When Want it When? – Flatt. 2002.
Template Meta-programming for Haskell – Sheard, Jones. 2002.
A Functional Correspondence between Evaluators and Abstract Machines – Ager, Biernacki, Danvy, Midtgaard. 2003.
From Interpreter to Compiler and Virtual Machine: A Functional Derivation – Ager, Biernacki, Danvy, Midtgaard. 2003.
Garbage-First Garbage Collection – Detlefts, Flood, Heller, Printezis. 2004.
Self-Adjusting Computation (PhD thesis) – Acar. 2005.

I started reading this a couple of time, failed every time. I was looking for some “automated” way of getting incremental compilation out of compiler passes that don’t take incremental compilation into account.
Gaussian Elimination: a case study in efficient genericity with MetaOCaml – Carette. 2005.
Fast and Loose Reasoning is Morally Correct – Danielsson, Hughes, Hansson, Gibbons. 2006.
Haskell Is Not Not ML – Rudiak-Gould, Mycroft, Jones. 2006.

I remember trying to read this paper many times, but I couldn’t succeed. I was thinking about an intermediate language that support both lazy and strict programs equally well.
Out of the Tar Pit – Moseley, Marks. 2006.

Defines “essential complexity” and “accidental complexity”.
The Development of Chez Scheme – Dybvig. 2006.

Chez in now open source. The runtime performance of Chez programs is good, but the compiler was a huge disappointment for me. Unreadable, undocumented mess, and the passes I was expecting to see is not there (sub-0CFA).
Call-pattern Specialisation for Haskell Programs – Jones. 2007.
Faster Laziness Using Dynamic Pointer Tagging – Marlow, Yakushev, Jones. 2007.

The idea is very simple and very effective. It’s in use in GHC today, and it’s saving us a lot of jumps. One of the reasons why unpacking sum types are not as useful as one might expect.

Over the summer 2016 one of the projects I worked on was to improve this in some cases. The problem was that in some cases (when strict fields are involved) GHC is currently not tagging some pointers even though in theory it could and that’d save us some instructions in the binaries. Unfortunately the rabbit hole goes quite deep and I had to drop it to focus on unboxed sums.
System F with Type Equality Coercions – Sulzmann, Chakravarty, Jones, Donnelly. 2007.
Generational Real-Time Garbage Collection - A Three-Part Invention for Young Objects – Frampton, Bacon, Cheng, Grove. 2007.

Haven’t read yet.
Bitcoin: A Peer-to-Peer Electronic Cash System – Satoshi Nakamoto. 2008.
Computation and State Machines – Lamport. 2008.
Finally Tagless, Partially Evaluate - Tagless Staged Interpreters for Simpler Typed Languages – Carette, Kiselyov, Shan. 2009.
Idempotent Work Stealing – Michael, Vechev, Saraswat. 2009.

This idea is used in LVish.
Positive Supercompilation for a Higher Order Call-By-Value Language – Jonsson, Nordlander. 2009.

An attempt at supercompiling a call-by-value language.
Supercompiler HOSC 1.0: under the hood – Klyuchnikov, 2009.

The author’s another paper (also listed here) is my favorite introduction to supercompilation.
Tracing the Meta-Level: PyPy’s Tracing JIT Compiler – Bolz, Cuni, Fijalkowski, Rigo. 2009.

Everything done by RPython/PyPy team is great.
Types are calling conventions – Bolingbroke, Jones. 2009.

Yet another great Bolingbroke work. I worked on a variant of this during the summer of 2016.
Abstracting Abstract Machines – Horn, Might. 2010.

I was told that this is very cool work, but I haven’t read it yet.
A Play on Regular Expressions – Fischer, Huch, Wilke. 2010.
Lightweight Modular Staging: A Pragmatic Approach to Runtime Code Generation and Compiled DSLs – Rompf, Odersky. 2010.

Scala LMS paper.
Proving the Equivalence of Higher-Order Terms by Means of Supercompilation – Klyuchnikov, Romanenko. 2010.
Rethinking Supercompilation – Mitchell. 2010.

I remember that I really liked this paper at the time. Unlike most other papers, this paper can be implemented in a couple of hours, and it works good enough. IIRC the main novelty here was the tag-bag based termination criterion, which was later developed further by Bolingbroke in first “Supercompilation by evaluation” and then in his PhD thesis.
Scrapping your Inefficient Engine: Using Partial Evaluation to Improve Domain-Specific Language Implementation – Brady, Hammond. 2010.

Introduces Idris’s partial evaluator.
Supercompilation by Evaluation – Bolingbroke, Jones. 2010.

Best supercompilation paper at the time. The record was later broken again by the author of this paper in his PhD thesis (also listed here). The author is amazing and I was depressed for a very long time after reading his papers (another notable one is “Termination Combinators Forever” which is also listed here). I’ll never be this good as a researcher.
Ur: Statically-Typed Metaprogramming with Type-Level Record Computation – Chilpala. 2010.
C4: The Continuously Concurrent Compacting Collector – Tene, Iyengar, Wolf. 2011.

Haven’t read yet.
Flow-Sensitive Type Recovery in Linear-Log Time – Adams, Keep, Midtgaard, Might, Chauhan, Dybvig. 2011.

Sub-0CFA for type inference. Not used in current Chez implementation as far as I can see. Goal is to eliminate runtime type checks.
How To Write Shared Libraries – Drepper. 2011.
Practical aspects of evidence-based compilation in System FC – Vytiniotis, Jones. 2011.

IIRC this stuff was introduced with GHC 7.2. For some reason that I can’t recall, it’s sometimes useful to return coercions in functions. This paper is about that. I don’t remember much although the paper has some notes on it.
Termination Combinators Forever – Bolingbroke, Jones, Vytiniotis. 2011.

Shows how to build online termination checkers using some primitives and combinators. Checkers are correct by construction. The implementation is used in the author’s implementation of his PhD thesis.
Challenges for a Trace-Based Just-In-Time Compiler for Haskell – Schilling. 2012.

Read his PhD thesis instead.
Chaperones and Impersonators: Run-time Support for Reasonable Interposition – Strickland, Tobin-Hochstadt, Findler, Flatt. 2012.

I have a big “NO” marked on the paper in red.
Explicitly Heterogeneous Metaprogramming with MetaHaskell – Mainland. 2012.
Self-Optimizing AST Interpreters – Würthinger, Wöß, Stadler, Duboscq, Simon, Wimmer. 2012.
The HERMIT in the Machine - A Plugin for the Interactive Transformation of GHC Core Language Programs – Farmer, Gill, Komp, Sculthorpe. 2012.
Call-by-need supercompilation (PhD thesis) – Bolingbroke. 2013.
Forge: Generating a High Performance DSL Implementation from a Declarative Specification – Sujeeth, Gibbons, Brown, Lee, Rompf, Odersky, Olukotun. 2013.

After having played with LMS and Delite a little bit, I have nothing good to say about this work. Things may be improved since then though.
Hybrid Partial Evaluation – Shali, Cook. 2013.

Cook has an introductory blog post / web page about partial evaluation which is pretty good.
Optimizing Data Structures in High-Level Programs – Rompf, Sujeeth, Amin, Brown, Jovanovic, Lee, Jonnalagedda, Olukotun, Odersky. 2013.

A Scala LMS paper. All I remember about these papers is that I didn’t like them.
One VM to Rule Them All – Würthinger, Wimmer, Wöß, Stadler, Duboscq, Humer, Richards, Simon, Wolczko. 2013.

Has 9 authors.
Simple and Efficient Construction of Static Single Assignment Form – Braun, Buchwalk, Hack, Leißa, Malloc, Zwinkau. 2013.

I think the algorithm was already known in the compilers community, but this paper publishes it first and benchmarks it.
Supercompiling Erlang (MSc thesis) – Weinholt. 2013.
Terra: A Multi-Stage Language for High-Performance Computing – DeVito, Hegarty, Aiken, Hanrahan, Vitek. 2013.

Terra is awesome. I talked a little bit about it in my blog posts 1, 2.
Trace-based Just-in-time Compilation for Lazy Functional Programming Languages (PhD thesis) – Schilling, 2013.

Two good theses published in 2013, Bolingbroke and this one. Too bad neither of them were pursued further.

IIRC the JIT introduction in this thesis is very easy to read, doesn’t assume a lot of background.
Combinators for Impure yet Hygienic Code Generation – Kameyama, Kiselyov, Shan. 2014.
Compiling a Reflective Language using MetaOCaml – Asai. 2014.

I love everything related with MetaOCaml.
Freeze After Writing - Quasi-Deterministic Parallel Programming with LVars – Kuper, Turon, Krishnaswami, Newton. 2014.
Macros that Work Together - Compile-Time Bindings, Partial Expression, and Definition Contexts – Flatt, Culpepper, Darais, Findler. 2014.
Modular, Higher-Order Cardinality Analysis in Theory and Practice – Sergey, Vytiniotis, Jones. 2014.

This is still in use in GHC today. Cardinality analysis is done as a part of demand analysis (lattice is extended to handle cardinalities).
Optimizing SYB Is Easy! – Adams, Farmer, Magalhaes. 2014.
Safe Zero-cost Coercions for Haskell (extended edition) – Breitner, Eisenberg, Jones, Weirich. 2014.

Another paper full of notes yet I don’t remember anything.
Supercompilation: Ideas and Methods – Klyuchnikov, Krustev. 2014.

Published in “The Monad.Reader” issue 23, this is without a doubt the best introduction to supercompilation. It’s very well written, provides great bibliography for further reading, and comes with a simple implementation. Great stuff, although the language is too simple to be useful.
Taming the Parallel Effect Zoo – Kuper, Todd, Tobin-Hochstadt, Newton. 2014.
The Kansas University Rewrite Engine - A Haskell-Embedded Strategic Programming Language with Custom Closed Universes – Sculthorpe, Frisby, Gill. 2014.

This is about KURE, rewrite engine that powers HERMIT. I remember that at the time I decided to implement my own rewrite combinators than fighting HERMIT API to do whatever I wanted to do.
Go 1.5 concurrent garbage collector pacing – Celements. 2015.
Static Program Analysis (lecture notes) – Møller, Schwartzbach. 2015.

https://cs.au.dk/~amoeller/spa/
Static Single Assignment Book (in-progress book) – “Lots of authors”. 2015 (ongoing)

I’m still reading this.
Shallow Embedding of DSLs via Online Partial Evaluation – Leißa, Boesche, Hack, Membarth, Slusallek. 2015.
The Design of Terra: Harnessing the Best Features of High-Level and Low-Level Languages – DeVito, Hanrahan. 2015.
Improving Implicit Parallelism – Trilla, Runciman. 2015.
K-Java: A Complete Semantics of Java – Bogdanas, Rosu. 2015.
Zero-Overhead Metaprogramming – Marr, Seaton, Ducasse. 2015.

Related with JITs, partial evaluation, and metaobject protocols. I don’t remember reading this.
GADTs Meet Their Match: Pattern-Matching Warnings That Account for GADTs, Guards, and Laziness – Karachalias, Schrivers, Vytiniotis, Jones. 2016.
Staging Generic Programming – Yallop. 2016.
State Machines All The Way Down - An Architecture for Dependently Typed Applications – Brady. 2016.

Common Subexpression Elimination in a Lazy Functional Language – Chitil. ???

Shows how CSE can lead to more memory residency and is not always beneficial.
Constructed Product Result Analysis for Haskell – Baker-Finch, Glynn, Jones. ???

This is still a part of GHC, done during demand analysis. We recently extended this to work on sum types.
Demand analysis (draft, not published) – Jones, Sestoft, Hughes

I hate this paper because it wasted so much of my time. GHC wiki links to that, even though this is not relevant to the implementation. Read Sergey et al. instead.
Lecture Notes: Control Flow Analysis for Functional Languages (lecture notes) – Aldrich. ???
Supercompiling with Staging – Inoue. ???

Shows how to use MetaOCaml’s staging operators to do supercompilation-like transformations.
Space and Time Efficient Supercompilation (PhD thesis) – Jonsson. ???.
The Design and Implementation of BER MetaOCaml – Kiselyov. ???
The Next Stage of Staging – Inoue, Kiselyov, Kameyama. ???
Two Techniques for Compiling Lazy Pattern Matching – Maranget. ???
Warnings for pattern matching – Maranget. ???

(Show comments)