osa1

The issue with work sharing (common subexpression elimination)

August 13, 2015 - Tagged as: en, supercompilation, haskell.

I’d expect more work sharing to be always more beneficial. But apparently this is not the case, as pointed out in (Chitil, 1997)¹.

Here’s an example from the paper: (slightly changed)

sum [1 .. 1000] + sum [-1000 .. -1] + prod [1 .. 1000]

We can evaluate this expression to WHNF using heap space enough for a single list(to be more specific, we only need a single cons cell at any time). After evaluating a subexpression, we can deallocate and allocate for the next list etc.

However, if we eliminate common subexpressions, and generate this code:

let v = [1 .. 1000]
 in sum v + sum [-1000 .. -1] + prod v

Now v has to live until the let body is evaluated to a value. We win in allocation/deallocation side, but we lose in residency side. In paper’s words: “Whereas the transformation always decreases total heap usage, it may considerably influence heap residency.”

In general, we can’t do this transformation, without risking increased residency:

\[ e’[e,e] \leadsto \texttt{let}\; x = e\; \texttt{in}\; e’[x,x] \]

As a solution, the paper suggests this:

We always do CSE if the subexpressions’ WHNF == NF(i.e. if it’s a “safe type” in paper’s terms). According to the paper, “a partially evaluated expression is certain to require only a small, fixed amount of space if it’s not a function, whose environment may refer to arbitrary large data structures, and its WHNF is already its normal form”.
We always do CSE when a named expression is syntactically dominating another equal expression:

\[ \texttt{let}\; x = e\; \texttt{in}\; e’[e] \leadsto \texttt{let}\; x = e\; \texttt{in}\; e’[x] \]

Note that (1) is not always true, assume an expression with type ForeignPtr a where a is a huge FFI object. This has WHNF == NF property, but it may increase residency significantly. Maybe GHC didn’t have FFI at the time the paper is written.

Also, I’m wondering how is CSE is handled in current GHC.

In supercompilation, we want to avoid evaluating same expressions in a loop forever, so we keep some kind of “history”, and when we come across a term that we evaluated before, we fold the process tree and avoid evaluating same term again.

(fib 1000, fib 1000)

Unless we make sure to split it in a way that branches of the process tree are unaware of each other, we may end up eliminating common subexpressions. However, since there are lots of cases where we may want CSE, a splitter that always prevents it is not always desirable. We should instead allow CSE in a controlled way.

Olaf Chitil, “Common Subexpression Elimination in a Lazy Functional Language”, section 3.5.↩︎