July 25, 2016 - Tagged as: en, haskell.
In this post we’ll take a look at internals of GHC’s mutable variables, and how they’re used by IORef
and STRef
. The code is copied from GHC, with minor changes for clarity.
λ> :m + Data.IORef
λ> :info IORef
newtype IORef a
= GHC.IORef.IORef (GHC.STRef.STRef GHC.Prim.RealWorld a)
-- Defined in ‘GHC.IORef’
instance Eq (IORef a) -- Defined in ‘GHC.IORef’
GHC.IORef
is defined in libraries/base/GHC/IORef.hs
:
-- | A mutable variable in the 'IO' monad
newtype IORef a = IORef (STRef RealWorld a)
We’ll look at 3 operations: read, write, and atomic modify.
-- | Read the value of an 'IORef'
readIORef :: IORef a -> IO a
IORef var) = stToIO (readSTRef var)
readIORef (
-- | Write a new value into an 'IORef'
writeIORef :: IORef a -> a -> IO ()
IORef var) v = stToIO (writeSTRef var v)
writeIORef (
atomicModifyIORef :: IORef a -> (a -> (a,b)) -> IO b
IORef (STRef r#)) f = IO $ \s -> atomicModifyMutVar# r# f s atomicModifyIORef (
STRef
is defined in libraries/base/GHC/STRef.hs
:
-- | A value of type `STRef s a` is a mutable variable in state thread `s`,
-- containing a value of type `a`
data STRef s a = STRef (MutVar# s a)
-- | Read the value of an 'STRef'
readSTRef :: STRef s a -> ST s a
STRef var#) = ST $ \s1# -> readMutVar# var# s1#
readSTRef (
-- | Write a new value into an 'STRef'
writeSTRef :: STRef s a -> a -> ST s ()
STRef var#) val = ST $ \s1# ->
writeSTRef (case writeMutVar# var# val s1# of
# -> (# s2#, () #) s2
Note that there’s no atomicModifySTRef
, because that only makes sense in IO context. So atomicModifyIORef
directly calls the primop.
In summary:
MutVar#
, wrapped with STRef
. When we unpack an IORef
in data constructor fields, internally we store a MutVar#
.writeMutVar#
readMutVar#
atomicModifyMutVar#
Primop definition:
primop ReadMutVarOp "readMutVar#" GenPrimOp
MutVar# s a -> State# s -> (# State# s, a #)
{Read contents of {\tt MutVar\#}. Result is not yet evaluated.}
with
has_side_effects = True
can_fail = True
Code generation is handled by StgCmmPrim.emitPrimOp
:
emitPrimOp :: DynFlags
-> [LocalReg] -- where to put the results
-> PrimOp -- the op
-> [CmmExpr] -- arguments
-> FCode ()
...
ReadMutVarOp [mutv]
emitPrimOp dflags [res] = emitAssign (CmmLocal res) (cmmLoadIndexW dflags mutv (fixedHdrSizeW dflags) (gcWord dflags))
This is just relative addressing, base is the MutVar#
we’re reading, and we skip the closure header to read the contents.
Primop definition:
primop WriteMutVarOp "writeMutVar#" GenPrimOp
MutVar# s a -> a -> State# s -> State# s
{Write contents of {\tt MutVar\#}.}
with
has_side_effects = True
code_size = { primOpCodeSizeForeignCall }
-- for the write barrier
can_fail = True
Code generation is again implemented in emitPrimOp
:
WriteMutVarOp [mutv,var]
emitPrimOp dflags [] = do emitStore (cmmOffsetW dflags mutv (fixedHdrSizeW dflags)) var
{-no results-}]
emitCCall [CmmLit (CmmLabel mkDirty_MUT_VAR_Label))
(CmmReg (CmmGlobal BaseReg), AddrHint), (mutv,AddrHint)] [(
This is more involved than readMutVar#
. First, we write the variable to the MutVar#
in the first line (emitStore
). Then, we call a function, specified by the variable mkDirty_MUT_VAR_Label
, passing two arguments: a global called BaseReg
, and the MutVar#
. mkDirty_MUT_VAR_Label
just holds the name of this function: (defined in rts/sm/Storage.c
)
/*
This is the write barrier for MUT_VARs, a.k.a. IORefs. A
MUT_VAR_CLEAN object is not on the mutable list; a MUT_VAR_DIRTY
is. When written to, a MUT_VAR_CLEAN turns into a MUT_VAR_DIRTY
and is put on the mutable list.
*/
void dirty_MUT_VAR(StgRegTable *reg, StgClosure *p)
{
Capability *cap = regTableToCapability(reg);if (p->header.info == &stg_MUT_VAR_CLEAN_info) {
p->header.info = &stg_MUT_VAR_DIRTY_info;
recordClosureMutated(cap,p);
} }
Remember that for the first argument we passed something called BaseReg
, and for the second argument we passed the MutVar#
.
This function gets a Capability
from the register table, and if the MutVar#
is “clean”, it marks it as “dirty” and records in the capability that it’s now mutated.
Capability
lacks documentation, but it’s not too important, so we just skip that and look at recordClsoureMutated
.
void recordClosureMutated(Capability *cap, StgClosure *p)
{
bdescr *bd = Bdescr((StgPtr)p);if (bd->gen_no != 0) recordMutableCap(p,cap,bd->gen_no);
}
p
is our MutVar#
here. bdescr
stands for “block descriptor”. GHC RTS allocates memory in blocks, and every block belongs to a generation. First generation is special in that if a MutVar#
is in the first generation, it can’t point to a younger generation, as the first generation is already the youngest generation. This is from includes/rts/storage/GC.h
:
- generation 0 is the allocation area. It is given a fixed set of blocks
during initialisation, and these blocks normally stay in G0S0. In parallel
execution, each Capability has its own nursery.
This code basically checks if the MutVar#
belongs to first generation (generation 0). If that’s not the case, we record the MutVar#
in the generation’s “mut list”:
void recordMutableCap(const StgClosure *p, Capability *cap, uint32_t gen)
{
bdescr* bd = cap->mut_lists[gen];if (bd->free >= bd->start + BLOCK_SIZE_W) {
bdescr *new_bd = allocBlockOnNode_lock(cap->node);
new_bd->link = bd;
bd = new_bd;
cap->mut_lists[gen] = bd;
}
*bd->free++ = (StgWord)p; }
Garbage collector then checks that list when collecting younger generations, to avoid collecting young objects kept alive by older generations (i.e. pointers from older generations to younger generations, see scavenge_capability_mut_lists
in rts/sm/Scav.c
).
We saw in dirty_MUT_VAR
that the MutVar#
is marked as “dirty” when it’s mutated. When is it marked as “clean” again?
When a MutVar#
is copied during GC, the object pointed by it is also copied to the same generation, and then the MutVar#
becomes clean again, because it no longer points to a younger generation. This is the related code:
static void
scavenge_block(bdescr *bd)
{
...case MUT_VAR_DIRTY:
gct->eager_promotion = rtsFalse;
evacuate(&((StgMutVar *)p)->var);
gct->eager_promotion = saved_eager_promotion;if (gct->failed_to_evac) {
((StgClosure *)q)->header.info = &stg_MUT_VAR_DIRTY_info;else {
}
((StgClosure *)q)->header.info = &stg_MUT_VAR_CLEAN_info;
}
p += sizeofW(StgMutVar);break;
... }
Primop definition:
primop AtomicModifyMutVarOp "atomicModifyMutVar#" GenPrimOp
MutVar# s a -> (a -> b) -> State# s -> (# State# s, c #)
with
out_of_line = True
has_side_effects = True
can_fail = True
out_of_line = True
basically tells code generator that this primop is implemented as a function. Code generator then just generates a function call:
cgOpApp :: StgOp -- The op
-> [StgArg] -- Arguments
-> Type -- Result type (always an unboxed tuple)
-> FCode ReturnKind
...
StgPrimOp primop) args res_ty = do
cgOpApp (<- getDynFlags
dflags <- getNonVoidArgAmodes args
cmm_args case shouldInlinePrimOp dflags primop cmm_args of
Nothing -> do -- out-of-line
let fun = CmmLit (CmmLabel (mkRtsPrimOpLabel primop))
NativeNodeCall, NativeReturn) fun cmm_args
emitCall (...
The primop is implemented in Cmm, in rts/PrimOps.cmm
. The code is a mess, but here’s the important part:
stg_atomicModifyMutVarzh ( gcptr mv, gcptr f )
{
...
retry:
x = StgMutVar_var(mv);1) = x;
StgThunk_payload(z,#ifdef THREADED_RTS
(h) = ccall cas(mv + SIZEOF_StgHeader + OFFSET_StgMutVar_var, x, y);if (h != x) { goto retry; }
#else
StgMutVar_var(mv) = y;#endif
if (GET_INFO(mv) == stg_MUT_VAR_CLEAN_info) {
"ptr", mv "ptr");
ccall dirty_MUT_VAR(BaseReg
}
return (r);
}
It’s basically a compare-and-swap loop, and in the end it marks the MutVar#
as “dirty”, using the same dirty_MUT_VAR
function used by the code generated for writeMutVar#
.
MutVar#
structAs the last thing, we look at the definition of MutVar#
: (in includes/rts/storage/Closures.h
)
typedef struct {
StgHeader header;
StgClosure *var;
} StgMutVar;
Nothing interesting here. See this Wiki page for GHC’s heap object layout. In our case, payload contains a single closure.
This concludes our MutVar#
(which is used under the hood for IORef
and STRef
) tour. I guess lessons here are:
readIORef
is fast, but writeIORef
is one function call in the best case. In the worst case, it does an expensive allocation (this is not just a heap pointer bump). If you have a tight loop with some state variables, prefer parameter passing instead.
Unpacking an IORef
in a data constructor field does not really make the constructor mutable. Instead, it inlines the MutVar#
, which has a mutable pointer field.
If you think about it a little bit, you may realize that optimizing (2) is actually quite tricky. Imagine having something like this:
data D = D { f1 :: {-# UNPACK #-} !(IORef Int)
f2 :: Int
,
}
bumpf1 :: D -> IO ()
D f1 _) = modifyIORef f1 (+ 1)
bumpf1 (
bumpf2 :: D -> D
D f1 f2) = D f1 (f2 + 1) bumpf2 (
You’d expect this to print True
:
do ref <- newIORef 0
let d1 = D ref 0
= bumpD2 d1
d2
bumpf1 d1<- readIORef (f1 d1)
rv1 <- readIORef (f1 d2)
rv2 print (rv1 == rv2)
If D
becomes mutable after the UNPACK
, this code doesn’t work anymore, because we lose sharing after the functional update in line bumpD2 d1
.
See also this discussion for how other compilers improve this.