October 16, 2017 - Tagged as: en, haskell.
Here’s a problem:
Implement a scheduler that runs these tasks in parallel, utilizing available resources as much as possible.
The code I’ll show here piggybacks on the GHC RTS for scheduling. But for that we first have to implement our resources and tasks in a way that exposes the necessary information to GHC’s scheduler. The idea is simple and fun to implement, but I can’t recommend using it in production :-) Scheduling is a hard problem, with many variations, and I’ve only recently started reading about it. This solution is more fun than anything else.
The idea is simple; we implement resources as MVars and tasks as threads. Threads (tasks) take the MVars before performing the operation. Because threads are scheduled by the GHC RTS, GHC handles scheduling of our tasks. Because of the fairness properties of MVars, our threads are scheduled “fairly”, e.g. all tasks eventually finish even when we have infinitely many tasks.
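To make the trick concrete before diving in, here’s a minimal, self-contained sketch of an MVar used as a resource lock (plain IO, no logging; the names here are mine, not from the code below):

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
import Control.Monad (forM_, replicateM_)

main :: IO ()
main = do
  resource <- newMVar ()           -- full = free, empty = taken
  counter  <- newMVar (0 :: Int)   -- shared state the resource guards
  done     <- newEmptyMVar
  forM_ [1 .. 10 :: Int] $ \_ -> forkIO $ do
    -- withMVar empties the MVar (blocking if another thread holds it),
    -- runs the action, and refills it afterwards, even on exceptions.
    withMVar resource $ \() -> modifyMVar_ counter (pure . (+ 1))
    putMVar done ()
  replicateM_ 10 (takeMVar done)   -- wait for every task to signal
  readMVar counter >>= print       -- prints 10
```

Blocked takeMVar calls are served in FIFO order, which is the fairness property the scheduler below relies on.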
A resource is an abstract object with a lock and unique identifier:
data Resource = Resource
  { _resourceId   :: Unique
  , _resourceName :: T.Text
  , _resourceLock :: MVar ()
  }
instance Show Resource where
  show = T.unpack . _resourceName
_resourceName is just a string to be used for tracing program execution.
A Unique is an integer that can be used in at most one resource:

newtype Unique = Unique Int
  deriving (Eq, Ord)
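The post doesn’t show how Uniques are minted; one possible sketch (the supply function and its name are mine) is a counter backed by an IORef:

```haskell
import Control.Monad (replicateM)
import Data.IORef

newtype Unique = Unique Int
  deriving (Eq, Ord, Show)

-- A fresh-name supply: an IORef counter bumped atomically per request,
-- so no two calls can ever hand out the same identifier.
newUniqueSupply :: IO (IO Unique)
newUniqueSupply = do
  ref <- newIORef 0
  pure (atomicModifyIORef' ref (\n -> (n + 1, Unique n)))

main :: IO ()
main = do
  fresh <- newUniqueSupply
  us    <- replicateM 3 fresh
  print us  -- [Unique 0,Unique 1,Unique 2]
```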
Using Unique we can define a total order for Resource:

instance Eq Resource where
  (==) = (==) `on` _resourceId

instance Ord Resource where
  compare = comparing _resourceId
A task that requires exclusive access to a subset of all resources can be implemented using withResources:

withResources :: (MonadLogger m, MonadBaseControl IO m) => S.Set Resource -> m () -> m ()
withResources locks a = acquire_locks (S.toList locks)
  where
    acquire_locks ls = case ls of
      []      -> a
      l : ls' -> do
        logDebug ("taking lock " <> (_resourceName l))
        withMVar (_resourceLock l) $ \() -> acquire_locks ls'
Note that when all tasks are implemented using this function a deadlock won’t occur: resources are ordered, and S.toList generates a sorted list, which in turn causes acquire_locks to take locks in order, effectively implementing Dijkstra’s resource hierarchy solution to the dining philosophers problem.
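A toy demonstration of why the ordering matters (my own simplified sketch, using plain Int ids in place of Unique):

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
import Control.Monad (replicateM_)
import Data.List (sortOn)

-- Acquire locks lowest-id-first, regardless of the order requested.
withLocks :: [(Int, MVar ())] -> IO () -> IO ()
withLocks ls act = go (sortOn fst ls)
  where
    go []              = act
    go ((_, m) : rest) = withMVar m (\() -> go rest)

main :: IO ()
main = do
  lock1 <- newMVar ()
  lock2 <- newMVar ()
  done  <- newEmptyMVar
  -- Task B asks for the locks in the opposite order, but both tasks
  -- acquire them as [1, 2], so the classic lock-order deadlock
  -- (A holds 1 and waits for 2, B holds 2 and waits for 1) can't occur.
  _ <- forkIO $ withLocks [(1, lock1), (2, lock2)] (putStrLn "task A done") >> putMVar done ()
  _ <- forkIO $ withLocks [(2, lock2), (1, lock1)] (putStrLn "task B done") >> putMVar done ()
  replicateM_ 2 (takeMVar done)
```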
Here are three task generators:
newtype Task = Task
  { runTask :: forall m . (MonadLogger m, MonadBaseControl IO m) => m () }

mkFastTask :: Int -> S.Set Resource -> Task
mkFastTask i res =
    Task $ withResources res $ do
      logDebug ("Performing " <> T.pack (show i))
      threadDelay (500 :: Milliseconds)
      logDebug ("Fast task done (" <> T.pack (show i) <> ")")

mkSlowTask :: Int -> S.Set Resource -> Task
mkSlowTask i res =
    Task $ withResources res $ do
      logDebug ("Performing " <> T.pack (show i))
      threadDelay (3 :: Seconds)
      logDebug ("Slow task done (" <> T.pack (show i) <> ")")

mkCrashingTask :: Int -> S.Set Resource -> Task
mkCrashingTask i res =
    Task $ withResources res $ do
      logDebug ("Performing " <> T.pack (show i))
      error "task failed"
Integer arguments are just for tracing task execution in program output. mkFastTask generates a task that takes 500 milliseconds to run. mkSlowTask generates a task that takes 3 seconds. mkCrashingTask makes a task that throws an exception, demonstrating that we release resources properly on exceptions.
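That exception-safety claim rests on withMVar being bracket-like: it puts the value back even when the action throws. A small check (a sketch of mine, not the post’s code):

```haskell
import Control.Concurrent.MVar
import Control.Exception (SomeException, try)

main :: IO ()
main = do
  lock <- newMVar ()
  -- A "task" that crashes while holding the lock.
  r <- try (withMVar lock (\() -> error "task failed"))
         :: IO (Either SomeException ())
  case r of
    Left _  -> putStrLn "task crashed"
    Right _ -> putStrLn "task finished"
  -- withMVar restored the lock on the way out; if the lock had
  -- leaked, this second acquisition would block forever.
  withMVar lock (\() -> putStrLn "lock reacquired")
```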
Finally, the scheduler just spawns tasks using forkIO or async:

schedule :: (MonadLogger m, MonadBaseControl IO m, Forall (Pure m)) => [Task] -> m ()
schedule tasks = do
  thrs <- forM tasks $ \(Task task) ->
    async (task `catch` (\(e :: SomeException) -> logDebug "Task failed"))
  forM_ thrs wait
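For readers without the MonadLogger/MonadBaseControl stack, here is a stripped-down, base-only version of the same scheduler (using forkFinally instead of the async package; a sketch, not the post’s exact code):

```haskell
import Control.Concurrent (forkFinally)
import Control.Concurrent.MVar
import Control.Monad (forM, forM_)

-- Spawn each task on its own GHC thread; a crash in one task is
-- delivered to its finaliser and doesn't affect the others.
scheduleIO :: [IO ()] -> IO ()
scheduleIO tasks = do
  dones <- forM tasks $ \task -> do
    done <- newEmptyMVar
    _ <- forkFinally task (\_ -> putMVar done ())
    pure done
  forM_ dones takeMVar    -- wait for all tasks, crashed or not

main :: IO ()
main = scheduleIO
  [ putStrLn "task 1"
  , error "task 2 crashed"   -- the failure stays isolated
  , putStrLn "task 3"
  ]
```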
Here’s an example run:
taking lock resource5
Performing 0
taking lock resource0
Performing 1
taking lock resource2
taking lock resource6
taking lock resource7
Performing 2
Task failed
taking lock resource6
Performing 3
taking lock resource8
Performing 4
taking lock resource1
taking lock resource2
Performing 5
Task failed
taking lock resource2
taking lock resource3
taking lock resource8
taking lock resource0
taking lock resource3
taking lock resource4
Performing 9
Fast task done (3)
Fast task done (9)
Fast task done (0)
Slow task done (1)
taking lock resource4
taking lock resource8
Slow task done (4)
Performing 6
Fast task done (6)
taking lock resource7
Performing 8
Task failed
Performing 7
Slow task done (7)
The whole code that randomly generates resources and tasks and then runs them is here. It uses quite a lot of dependencies because it was extracted from a larger program, and I’m too lazy to make it smaller and simpler. I provided a stack.yaml so hopefully it’s still not too hard to run.