osa1 github gitlab twitter cv rss

Debugging #15038

April 28, 2018 - Tagged as: en, haskell, ghc.

I recently spent some time debugging GHC bug #15038 and to help with context switching between debugging and other tasks I took notes during debugging. After successfully debugging it and reading my notes from the beginning I thought it may be an interesting read, so I’m publishing it. It may not make much sense to anyone other than me, and it’s very lightly edited, so keep your expectations low :)


Crashes at this point:

>>> bt
#11 0x00000000008d6650 in
    barf (s=0x94fe90 "evacuate(static): strange closure type %d") at rts/RtsMessages.c:47
#12 0x00000000008e7e83 in
    evacuate (p=0x981640) at rts/sm/Evac.c:574
#13 0x0000000000911069 in
    scavenge_static () at rts/sm/Scav.c:1831
#14 0x0000000000911638 in
    scavenge_loop () at rts/sm/Scav.c:2185
#15 0x00000000008e38fb in
    scavenge_until_all_done () at rts/sm/GC.c:1092
#16 0x00000000008e2558 in
    GarbageCollect (collect_gen=1, do_heap_census=false, gc_type=0,
                    cap=0xad40c0 <MainCapability>, idle_cap=0x0) at rts/sm/GC.c:418
#17 0x00000000008d523b in
    scheduleDoGC (pcap=0x7ffeced3a4a0, task=0x2620cc0, force_major=false) at rts/Schedule.c:1799
#18 0x00000000008d4787 in
    schedule (initialCapability=0xad40c0 <MainCapability>, task=0x2620cc0) at rts/Schedule.c:545
#19 0x00000000008d5be1 in
    scheduleWaitThread (tso=0x4200105388, ret=0x0, pcap=0x7ffeced3a590) at rts/Schedule.c:2533
#20 0x00000000008d7dd7 in
    rts_evalLazyIO (cap=0x7ffeced3a590, p=0x981730, ret=0x0) at rts/RtsAPI.c:530
#21 0x00000000008d84ca in
    hs_main (argc=1, argv=0x7ffeced3a788, main_closure=0x981730, rts_config=...) at rts/RtsMain.c:72
#22 0x000000000040d7da in
    main ()

While evacuating object at location

p = (StgClosure **) 0x981640

Info table of the object:

>>> print *get_itbl(q)
$5 = {
  layout = {
    payload = {
      ptrs = 0,
      nptrs = 134217728
    },
    bitmap = 576460752303423488,
    large_bitmap_offset = 0,
    __pad_large_bitmap_offset = 0,
    selector_offset = 576460752303423488
  },
  type = 16585,
  srt_bitmap = 419430400,
  code = 0x981619 "\026\230"
}

Untagged object:

q = (StgClosure *) 0x981618

Tagged:

*p = (StgClosure *) 0x981619

Where does this object appear? Tagged:

>>> find 0x4200000000, 0x14200000000, (void*)0x981619
(nothing)

Untagged:

>>> find 0x4200000000, 0x14200000000, (void*)0x981618
(nothing)

Not in the heap!

>>> info symbol 0x981618
rctv_closure in section .data of /home/omer/.local/share/rr/Main-31/mmap_hardlink_3_Main

According to STG output references should be

lvl4_rctv referenced by
lvl5_rctw referenced by
lvl6_rctx referenced by
$wbyteParserBadOnce

Looking at each closure:

rctv :: ([Char], Packed.Bytes.Parser.Parser Word)

>>> print (StgClosure)rctv_closure (0x981618)
{
  header = {
    info = 0x8c2960 <ghczmprim_GHCziTuple_Z2T_con_info>
  },
  payload = [0x0, 0x0] -- not evaluated yet?
}
>>> print *get_itbl(&rctv_closure)
{
  layout = {
    payload = {
      ptrs = 2,
      nptrs = 0
    },
    bitmap = 2,
    large_bitmap_offset = 2,
    __pad_large_bitmap_offset = 2,
    selector_offset = 2
  },
  type = 4, -- CONSTR_2_0
  srt_bitmap = 0,
  code = 0x8c2960 <ghczmprim_GHCziTuple_Z2T_con_info> "H\377\303\377e"
}


rctw :: [([Char], Packed.Bytes.Parser.Parser Word)]

>>> print (StgClosure)rctw_closure (0x981638)
{
  header = {
    info = 0x8c3b80 <ghczmprim_GHCziTypes_ZC_con_info>
  },
  payload = [0x0, 0x0] -- not evaluated yet?
}
>>> print *get_itbl(&rctw_closure)
{
  layout = {
    payload = {
      ptrs = 2,
      nptrs = 0
    },
    bitmap = 2,
    large_bitmap_offset = 2,
    __pad_large_bitmap_offset = 2,
    selector_offset = 2
  },
  type = 4, -- CONSTR_2_0
  srt_bitmap = 1,
  code = 0x8c3b80 <ghczmprim_GHCziTypes_ZC_con_info> "H\203\303\002\377e"
}


rctx :: Packed.Bytes.Parser.Parser Word

>>> print (StgClosure)rctx_closure (0x981658)
{
  header = {
    info = 0x40ca10 <rctx_info>
        -- Packed.Bytes.Parser.Parser GHC.Types.Word
  },
  payload = []
}
>>> print *get_itbl(&rctx_closure)
{
  layout = {
    payload = {
      ptrs = 0,
      nptrs = 0
    },
    bitmap = 0,
    large_bitmap_offset = 0,
    __pad_large_bitmap_offset = 0,
    selector_offset = 0
  },
  type = 21, -- THUNK_STATIC
  srt_bitmap = 3,
  code = 0x40ca10 <rctx_info> "H\215E\350L9\370r@H\203\354\bL\211\350H\211\336H\211\307\061\300\350S\314M"
}

$wbyteParserBadOnce :: Int -> Int#

>>> print (StgClosure)Parser_zdwbyteParserBadOnce_closure (0x981680)
{
  header = {
    info = 0x40d0b0 <Parser_zdwbyteParserBadOnce_info>
  },
  payload = 0x8
}
>>> print *get_itbl(&Parser_zdwbyteParserBadOnce_closure)
{
  layout = {
    payload = {
      ptrs = 0,
      nptrs = 0
    },
    bitmap = 0,
    large_bitmap_offset = 0,
    __pad_large_bitmap_offset = 0,
    selector_offset = 0
  },
  type = 14, -- FUN_STATIC
  srt_bitmap = 113,
  code = 0x40d0b0 <Parser_zdwbyteParserBadOnce_info> "H\215E\320L9\370\017\202\342\001"
}

Back to the backtrace. Adding watchpoints makes it run too slow, so first figure out when this happens:

>>> break GarbageCollect
Breakpoint 2 at 0x8e2144: file rts/sm/GC.c, line 226.
>>> ignore 2 10000000
Will ignore next 10000000 crossings of breakpoint 2.
>>> c
>>> info breakpoints
Num     Type           Disp Enb Address            What
2       breakpoint     keep y   0x00000000008e2144 in GarbageCollect at rts/sm/GC.c:226
        breakpoint already hit 6 times
        ignore next 9999994 hits

So 6th time we run GC we see this error. Let’s see if 0x981640 is a valid closure by the beginning of 6th GC:

>>> call LOOKS_LIKE_CLOSURE_PTR(0x981640)
$5 = true

However at this point it already has wrong type:

>>> print *get_itbl((StgClosure*)0x981640)
$4 = {
  layout = {
    payload = {
      ptrs = 0,
      nptrs = 134217728
    },
    bitmap = 576460752303423488,
    large_bitmap_offset = 0,
    __pad_large_bitmap_offset = 0,
    selector_offset = 576460752303423488
  },
  type = 16585,
  srt_bitmap = 419430400,
  code = 0x981619 "\026\230"
}

In fact, this object has this type since the beginning.

>>> info symbol 0x981640
rctw_closure + 8 in section .data of /home/omer/.local/share/rr/Main-31/mmap_hardlink_3_Main

This object is in payload[0] of rctw?

lvl5_rctw
  :: [(GHC.Base.String, Packed.Bytes.Parser.Parser GHC.Types.Word)]
[GblId, Str=m2, Unf=OtherCon []] =
    CCS_DONT_CARE :! [lvl4_rctv GHC.Types.[]];

So it should be rctv. Something is wrong with first payload of rctw. We should focus on evacuation/scavenging of rctw.

>>> break evacuate_static_object if q == 0x981638
>>> c
** hits breakpoint **
>>> call printClosure(q)
ghc-prim:GHC.Types.:(0x981619, 0xace111)
>>> info symbol 0x981619
rctv_closure + 1 in section .data of /home/omer/.local/share/rr/Main-31/mmap_hardlink_3_Main

So indeed we have rctv_closure in first payload (but tagged).

>>> call printClosure(0x981619)
ghc-prim:GHC.Tuple.(,)(0x9815f0, 0x981616)
>>> call printClosure(0xace111)
ghc-prim:GHC.Types.[](0x8c3450#)

It looks OK. Let’s see if it breaks after this GC.

>>> call printClosure(0x981638)
ghc-prim:GHC.Types.:(0x981619, 0xace111)

Nope. Continue until first payload of 0x981638 (rctw) changes.

It turns out the payload doesn’t change. So rctv itself changes!

On 4th GarbageCollect() this fails:

>>> call printClosure(0x981619)
*** printClosure: unknown type 16585 ****
Main: internal error: printClosure 16585
    (GHC version 8.5.20180425 for x86_64_unknown_linux)
    Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug

So break at 3rd:

>>> break GarbageCollect
Breakpoint 1 at 0x8e2144: file rts/sm/GC.c, line 226.
>>> ignore 1 2
Will ignore next 2 crossings of breakpoint 1.
>>> c
>>> call printClosure(0x981619)
ghc-prim:GHC.Tuple.(,)(0x9815f0, 0x981616)

(Remember that 0x981619 is tagged 0x981618 which is rctv)

Now watch the closure:

>>> watch (StgClosure)rctv_closure
Watchpoint 2: (StgClosure)rctv_closure
>>> disable breakpoint 1
>>> c

This takes forever, gdb starts leaking memory, fills 16G and swaps.

Let’s take a look at static objects again. rctv’s info table pointer somehow gets overwritten. What could cause this? Perhaps an object before rctv has an overlapping field.

>>> info symbol 0x981617
rctu_closure + 7 in section .data of /home/omer/.local/share/rr/Main-31/mmap_hardlink_3_Main

It seems like we have rctu before rctv.

lvl3_rctu
  :: forall s.
     Packed.Bytes.Parser.Maybe# (Packed.Bytes.Parser.Leftovers# s)
     -> GHC.Prim.State# s
     -> (# GHC.Prim.State# s,
           Packed.Bytes.Parser.Result# s GHC.Types.Word #)

>>> print (StgClosure)rctu_closure
{
  header = {
    info = 0x40c908 <rctu_info>
  },
  payload = 0x8
}
>>> print *get_itbl(&rctu_closure)
{
  layout = {
    payload = {
      ptrs = 0,
      nptrs = 0
    },
    bitmap = 0,
    large_bitmap_offset = 0,
    __pad_large_bitmap_offset = 0,
    selector_offset = 0
  },
  type = 14, -- FUN_STATIC
  srt_bitmap = 1,
  code = 0x40c908 <rctu_info> "H\215E\370L9\370\017\202\277"
}

So locations of rctu and rctv

rctu: 0x981610
rctv: 0x981618

There’s only 8 bytes in between. This doesn’t look right. Because rctu is a static object with no payload, so it needs at least two words: info table pointer and static link field. Static link field is missing in the layout, so info table pointer of rctv is used as static link.

At this point I remember different types of objects have link fields in different places so let’s check:

#define FUN_STATIC_LINK(p)   (&(p)->payload[0])

Because I can’t add watchpoint to this location so I do this:

>>> break evacuate_static_object if q == 0x981610

Just to count which GC we’re at

>>> break GarbageCollect
>>> ignore 2 10000

First hit

>>> print (StgClosure)rctv_closure
$1 = {
  header = {
    info = 0x8c2960 <ghczmprim_GHCziTuple_Z2T_con_info>
  },
  payload = 0x8
}
>>> info breakpoints
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   0x00000000008e7a73 in evacuate_static_object at rts/sm/Evac.c:340
        stop only if q == 0x981610 (target evals)
        breakpoint already hit 1 time
2       breakpoint     keep y   0x00000000008e2144 in GarbageCollect at rts/sm/GC.c:226
        breakpoint already hit 3 times
        ignore next 9997 hits

Let’s see if this breaks it:

>>> fin
>>> print (StgClosure)rctv_closure
$2 = {
  header = {
    info = 0x981561
  },
  payload = 0x8
}

Yep! So this function call breaks rctv:

>>> bt
#0  evacuate_static_object (link_field=0x981618, q=0x981610 (rctu)) at rts/sm/Evac.c:340
#1  0x00000000008e7e20 in
    evacuate (p=0x981628) at rts/sm/Evac.c:546
#2  0x0000000000911069 in
    scavenge_static () at rts/sm/Scav.c:1831
#3  0x0000000000911638 in
    scavenge_loop () at rts/sm/Scav.c:2185
#4  0x00000000008e38fb in
    scavenge_until_all_done () at rts/sm/GC.c:1092
#5  0x00000000008e2558 in
    GarbageCollect (collect_gen=1, do_heap_census=false, gc_type=0,
                    cap=0xad40c0 <MainCapability>, idle_cap=0x0) at rts/sm/GC.c:418
#6  0x00000000008d523b in
    scheduleDoGC (pcap=0x7ffeced3a4a0, task=0x2620cc0, force_major=false) at rts/Schedule.c:1799
#7  0x00000000008d4787 in
    schedule (initialCapability=0xad40c0 <MainCapability>, task=0x2620cc0) at rts/Schedule.c:545
#8  0x00000000008d5be1 in
    scheduleWaitThread (tso=0x4200105388, ret=0x0, pcap=0x7ffeced3a590) at rts/Schedule.c:2533
#9  0x00000000008d7dd7 in
    rts_evalLazyIO (cap=0x7ffeced3a590, p=0x981730, ret=0x0) at rts/RtsAPI.c:530
#10 0x00000000008d84ca in
    hs_main (argc=1, argv=0x7ffeced3a788, main_closure=0x981730, rts_config=...) at rts/RtsMain.c:72
#11 0x000000000040d7da in
    main ()

I’m convinced that layout of rctu is wrong. I’ll add some debug prints to GHC to see why the layout is generated this way, but before that I’m saving all the binaries and dump files of this session.

The bug should be in cgTopRhsClosure, the Cmm code generator for top-level closures.

Good thing GHC is deterministic enough these days so I get the same symbols when I recompile (later I realized that top-level symbols are deterministic enough but local ids usually change).

gen_code
  closure_label: lvl3_rctu_closure
  descr: <Parser.lvl3_rctu>
  info_tbl: label: lvl3_rctu_info
            rep:HeapRep static {
                  Fun {arity: 6 fun_type: ArgGen [True, False, False, True, True]} }
  fv_details: []

This looks fine. Let’s check mkStaticClosureFields:

mkStaticClosureFields
  caf_refs: NoCafRefs
  info_lbl: lvl3_rctu_info
  is_caf: False
  padding: []
  static_link_field: []
  saved_info_field: []
  static_link_value: 3

So this object doesn’t get a static link field. This is because the condition

staticClosureNeedsLink (mayHaveCafRefs caf_refs) info_tbl

doesn’t hold.

staticClosureNeedsLink :: Bool -> CmmInfoTable -> Bool
-- A static closure needs a link field to aid the GC when traversing
-- the static closure graph.  But it only needs such a field if either
--        a) it has an SRT
--        b) it's a constructor with one or more pointer fields
-- In case (b), the constructor's fields themselves play the role
-- of the SRT.
staticClosureNeedsLink has_srt CmmInfoTable{ cit_rep = smrep }
  | isConRep smrep         = not (isStaticNoCafCon smrep)
  | otherwise              = has_srt -- needsSRT (cit_srt info_tbl)

So has_srt is False for this object and idCafInfo of this closure says it’s not caffy.

Just to make sure this is the problem, I modify the code to treat all top-level closures as caffy. Indeed that fixes the bug.

CAFness of bindings are decided in hasCafRefs in TidyPgm. Two conditions for CAFness are

First one obviously doesn’t hold becasue rctu is a function. I don’t know why the second condition doesn’t hold yet.

TidyPgm operates on Core so let’s look at Core of rctu:

-- RHS size: {terms: 80, types: 1,207, coercions: 0, joins: 0/0}
lvl3_rctu
  :: forall s.
     Packed.Bytes.Parser.Maybe# (Packed.Bytes.Parser.Leftovers# s)
     -> GHC.Prim.State# s
     -> (# GHC.Prim.State# s, Packed.Bytes.Parser.Result# s Word #)
[GblId, Arity=2, Caf=NoCafRefs, Str=<S,1*U><S,U>, Unf=OtherCon []]
lvl3_rctu
  = \ (@ s_a6Ca)
      (leftovers0_a691
         :: Packed.Bytes.Parser.Maybe#
              (Packed.Bytes.Parser.Leftovers# s_a6Ca))
      (s0_a692 :: GHC.Prim.State# s_a6Ca) ->
      case leftovers0_a691 of {
        (#_|#) ds_d6Km ->
          (# s0_a692,
             (# GHC.Prim.(#_|#)
                  @ ('GHC.Types.TupleRep '[])
                  @ ('GHC.Types.TupleRep
                       '['GHC.Types.TupleRep
                           '['GHC.Types.UnliftedRep, 'GHC.Types.IntRep, 'GHC.Types.IntRep],
                         'GHC.Types.LiftedRep])
                  @ (# #)
                  @ (Packed.Bytes.Parser.Leftovers# s_a6Ca)
                  GHC.Prim.(##),
                GHC.Prim.(#_|#)
                  @ ('GHC.Types.TupleRep '[])
                  @ 'GHC.Types.LiftedRep
                  @ (# #)
                  @ Word
                  GHC.Prim.(##) #) #);
        (#|_#) ds_d6Kn ->
          case ds_d6Kn of { (# bytes0_scqD, stream0_scqE #) ->
          case bytes0_scqD of { (# arr0_scqH, off0_scqI, len0_scqJ #) ->
          case GHC.Prim.># len0_scqJ 0# of {
            __DEFAULT ->
              case Packed.Bytes.Parser.nextNonEmpty @ s_a6Ca stream0_scqE s0_a692
              of
              { (# ipv_s6NZ, ipv1_s6O0 #) ->
              case ipv1_s6O0 of {
                (#_|#) ds3_d6JW ->
                  (# ipv_s6NZ,
                     (# GHC.Prim.(#_|#)
                          @ ('GHC.Types.TupleRep '[])
                          @ ('GHC.Types.TupleRep
                               '['GHC.Types.TupleRep
                                   '['GHC.Types.UnliftedRep, 'GHC.Types.IntRep, 'GHC.Types.IntRep],
                                 'GHC.Types.LiftedRep])
                          @ (# #)
                          @ (Packed.Bytes.Parser.Leftovers# s_a6Ca)
                          GHC.Prim.(##),
                        GHC.Prim.(#_|#)
                          @ ('GHC.Types.TupleRep '[])
                          @ 'GHC.Types.LiftedRep
                          @ (# #)
                          @ Word
                          GHC.Prim.(##) #) #);
                (#|_#) ds3_d6JX ->
                  case ds3_d6JX of { (# bytes1_scqM, stream1_scqN #) ->
                  case bytes1_scqM of { (# arr1_scqQ, off1_scqR, ds4_scqS #) ->
                  (# ipv_s6NZ,
                     (# GHC.Prim.(#|_#)
                          @ ('GHC.Types.TupleRep '[])
                          @ ('GHC.Types.TupleRep
                               '['GHC.Types.TupleRep
                                   '['GHC.Types.UnliftedRep, 'GHC.Types.IntRep, 'GHC.Types.IntRep],
                                 'GHC.Types.LiftedRep])
                          @ (# #)
                          @ (Packed.Bytes.Parser.Leftovers# s_a6Ca)
                          (# (# arr1_scqQ, GHC.Prim.+# off1_scqR 1#,
                                GHC.Prim.-# ds4_scqS 1# #),
                             stream1_scqN #),
                        GHC.Prim.(#|_#)
                          @ ('GHC.Types.TupleRep '[])
                          @ 'GHC.Types.LiftedRep
                          @ (# #)
                          @ Word
                          a1_rcbR #) #)
                  }
                  }
              }
              };
            1# ->
              (# s0_a692,
                 (# GHC.Prim.(#|_#)
                      @ ('GHC.Types.TupleRep '[])
                      @ ('GHC.Types.TupleRep
                           '['GHC.Types.TupleRep
                               '['GHC.Types.UnliftedRep, 'GHC.Types.IntRep, 'GHC.Types.IntRep],
                             'GHC.Types.LiftedRep])
                      @ (# #)
                      @ (Packed.Bytes.Parser.Leftovers# s_a6Ca)
                      (# (# arr0_scqH, GHC.Prim.+# off0_scqI 1#,
                            GHC.Prim.-# len0_scqJ 1# #),
                         stream0_scqE #),
                    GHC.Prim.(#|_#)
                      @ ('GHC.Types.TupleRep '[])
                      @ 'GHC.Types.LiftedRep
                      @ (# #)
                      @ Word
                      a1_rcbR #) #)
          }
          }
          }
      }

This is huge and I’m not going to check this by hand. However at this point I realize that the condition for evacuating a FUN_STATIC is

case FUN_STATIC:
    if (info->srt_bitmap != 0) {
        evacuate_static_object(FUN_STATIC_LINK((StgClosure *)q), q);
    }
    return;

So this is not marked as CAF, but its SRT bitmap is not empty. Perhaps the bug is in the SRT bitmap.

I also try this in gdb:

>>> set get_itbl(&rctu_closure)->srt_bitmap = 0

This also fixes it. Now let’s figure out why SRT bitmap of rctu (which is supposed to be a non-CAF) is 1.

I look again at the definition of rctu and it seems like a1_rcbR is a free variable.

a1_rcbR :: GHC.Types.Word
[GblId, Caf=NoCafRefs, Str=m, Unf=OtherCon []] =
    CCS_DONT_CARE GHC.Types.W#! [5##];

However because this is not a CAF it shouldn’t make rctu CAF.

I stare at the code aimlessly for a while until I realize something interesting. This is the STG for rctu after unarisation:

lvl3_rctu
  :: forall s.
     Packed.Bytes.Parser.Maybe# (Packed.Bytes.Parser.Leftovers# s)
     -> GHC.Prim.State# s
     -> (# GHC.Prim.State# s,
           Packed.Bytes.Parser.Result# s GHC.Types.Word #)
[GblId,
 Arity=2,
 Caf=NoCafRefs,
 Str=<S,1*U><S,U>,
 Unf=OtherCon []] =
    [] \r [us_gcwc us_gcwd us_gcwe us_gcwf us_gcwg void_0E]
        case us_gcwc of tag_gcwh {
          __DEFAULT ->
              (#,,,,,,#) [1#
                          Control.Exception.Base.absentError
                          Control.Exception.Base.absentError
                          0##
                          0##
                          1#
                          Control.Exception.Base.absentError];
          2# ->
              case ># [us_gcwg 0#] of {
                __DEFAULT ->
                    case Packed.Bytes.Parser.nextNonEmpty us_gcwe GHC.Prim.void# of {
                      (#,,,,#) us_gcwi us_gcwj us_gcwk us_gcwl us_gcwm ->
                          case us_gcwi of tag_gcwn {
                            __DEFAULT ->
                                (#,,,,,,#) [1#
                                            Control.Exception.Base.absentError
                                            Control.Exception.Base.absentError
                                            0##
                                            0##
                                            1#
                                            Control.Exception.Base.absentError];
                            2# ->
                                case -# [us_gcwm 1#] of sat_scuk {
                                  __DEFAULT ->
                                      case +# [us_gcwl 1#] of sat_scuj {
                                        __DEFAULT ->
                                            (#,,,,,,#) [2#
                                                        us_gcwj
                                                        us_gcwk
                                                        sat_scuj
                                                        sat_scuk
                                                        2#
                                                        a1_rcbR];
                                      };
                                };
                          };
                    };
                1# ->
                    case -# [us_gcwg 1#] of sat_scur {
                      __DEFAULT ->
                          case +# [us_gcwf 1#] of sat_scuq {
                            __DEFAULT ->
                                (#,,,,,,#) [2# us_gcwd us_gcwe sat_scuq sat_scur 2# a1_rcbR];
                          };
                    };
              };
        };

This has two free variables: a1_rcbR and Control.Exception.Base.absentError. a1_rcbR is clearly not a CAF, but maybe absentError is? Because absentError references are generated in unarise, which happens much later than TidyPgm (right before generating Cmm), TidyPgm can’t take those into account.

Id for absentError is generated using vanillaIdInfo which sets the CAF-ness field as MayHaveCafRefs. So clearly absentError is a CAFFY.

So in summary, we introduce references to CAFs in UnariseStg which happens later than TidyPgm. The code generator is somehow handles this correctly by generating a SRT for this function and setting the info table bitmap correctly. However object layout generator (in cgTopRhsClosure) only uses the information generated in TidyPgm, so we get this mismatch of SRT information of this function.

Some ways to fix this:

The whole thing took 4:47 hours.