October 9, 2024 - Tagged as: en, plt, haskell.
OOP is certainly not my favorite paradigm, but I think mainstream statically-typed OOP does a few things right that are very important for programming with many people, over long periods of time.
In this post I want to explain what I think is the most important one of these things that the mainstream statically-typed OOP languages do well.
I will then compare the OOP code with Haskell, to try to make the point that OOP is not as bad at everything as some functional programmers seem to think.
In this post I use the word “OOP” to mean programming in a statically-typed language with:
- Subtyping: if a type B implements the public interface of type A, values of type B can be passed as A.
Examples of OO languages according to this definition: C++, Java, C#, Dart.
This set of features allows a simple and convenient way of developing composable libraries, and extending the libraries with new functionality in a backwards compatible way.
It’s probably best explained with an example. Suppose we have a simple logger library:
class Logger {
  // Private constructor: initializes state, returns an instance of `Logger`.
  Logger._();

  // Public factory: can return `Logger` or any of the subtypes.
  factory Logger() => Logger._();

  void log(String message, Severity severity) { /* ... */ }
}

enum Severity {
  Info,
  Error,
  Fatal,
}
and another library that does some database stuff:
class DatabaseHandle {
  /* ... */
}
and an application that uses both:
class MyApp {
  final Logger _logger;
  final DatabaseHandle _dbHandle;

  MyApp()
      : _logger = Logger(),
        _dbHandle = DatabaseHandle(...);
}
As is usually the case, things that make network connections, change shared state etc. need to be mocked, faked, or stubbed to be able to test applications. We may also want to extend the libraries with new functionality. With the features that we have, we don’t have to anticipate this and design the types for it up front.
In the first iteration we might just add a concrete class that is a copy of the current class, and make the current class abstract:
// The class is now abstract.
abstract class Logger {
  // Public factory now returns an instance of a concrete subtype.
  factory Logger() => _SimpleLogger();

  Logger._();

  // `log` is now abstract.
  void log(String message, Severity severity);
}

class _SimpleLogger extends Logger {
  factory _SimpleLogger() => _SimpleLogger._();

  _SimpleLogger._() : super._() {/* ... */}

  @override
  void log(String message, Severity severity) {/* ... */}
}
This change is backwards compatible and requires no changes in user code.
Now we might add more implementations, e.g. for ignoring log messages:
abstract class Logger {
  factory Logger() => _SimpleLogger();

  // New.
  factory Logger.ignoring() => _IgnoringLogger();

  Logger._();

  void log(String message, Severity severity);
}

class _IgnoringLogger extends Logger {
  factory _IgnoringLogger() => _IgnoringLogger._();

  _IgnoringLogger._() : super._() {}

  @override
  void log(String message, Severity severity) {}
}
Similarly we can add a logger that logs to a file, to a DB, etc.
We can do the same for the database handle class, but for mocking, faking, or stubbing, in tests.
To be able to use these new subtypes in our app, we implement a factory, or add a constructor to allow passing a logger and a db handle:
class MyApp {
  final Logger _logger;
  final DatabaseHandle _dbHandle;

  MyApp()
      : _logger = Logger(),
        _dbHandle = DatabaseHandle();

  MyApp.withLoggerAndDb(this._logger, this._dbHandle);
}
Note that we did not have to change any types, or add type parameters. Any methods of MyApp that use the _logger and _dbHandle fields do not have to know about the changes.
Now suppose one of the DatabaseHandle implementations also starts using the logger library:
abstract class DatabaseHandle {
  factory DatabaseHandle.withLogger(Logger logger) =>
      _LoggingDatabaseHandle._(logger);

  factory DatabaseHandle() => _LoggingDatabaseHandle._(Logger.ignoring());

  DatabaseHandle._();

  /* ... */
}

class _LoggingDatabaseHandle extends DatabaseHandle {
  final Logger _logger;

  _LoggingDatabaseHandle._(this._logger) : super._();

  /* ... */
}
In our app, we might test by disabling logging in the db library, but start logging db operations in production:
class MyApp {
  // New
  MyApp.testingSetup()
      : _logger = Logger(),
        _dbHandle = DatabaseHandle.withLogger(Logger.ignoring());

  // Updated to start using the logging feature of the DB library.
  MyApp()
      : _logger = Logger(),
        _dbHandle = DatabaseHandle.withLogger(Logger.toFile(...));

  /* ... */
}
As an example that adds more state to the types, we can add a logger implementation that only logs messages above a certain severity:
class _LogAboveSeverity extends _SimpleLogger {
  // Only logs messages with this severity or more severe.
  final Severity _severity;

  _LogAboveSeverity(this._severity) : super._();

  @override
  void log(String message, Severity severity) { /* ... */ }
}
We can add another factory to the Logger abstract class that returns this type, or we can even implement this in another library:
// Implemented in another library, not in `Logger`'s library.
class LogAboveSeverity implements Logger {
  // Only logs messages with this severity or more severe.
  final Severity _severity;
  final Logger _logger;

  LogAboveSeverity(this._severity) : _logger = Logger();

  LogAboveSeverity.withLogger(this._severity, this._logger);

  @override
  void log(String message, Severity severity) { /* ... */ }
}
As a final example to demonstrate adding more operations (rather than more state), we can have a logger that logs to a file, with a flush operation:
class FileLogger implements Logger {
  final File _file;

  FileLogger(this._file);

  @override
  void log(String message, Severity severity) {/* ... */}

  void flush() {/* ... */}
}
In summary:
- Crucially, we didn’t have to change any types while making these changes, and the new code is still as type safe as before.
- The logger and database libraries evolved in a completely backwards compatible way.
- Since none of the types used in our application changed, MyApp methods didn’t have to change at all.
- When we decided to take advantage of the new functionality, we updated only how we construct the logger and db handle instances in our app. The rest of the app didn’t change.
Now let’s consider how something like this could be done in Haskell.
Immediately at the start, we have a few choices on how to represent it.
Option 1: An ADT, with callback fields to be able to add different types of loggers later:
data Logger = MkLogger
  { _log :: String -> Severity -> IO ()
  }

simpleLogger :: IO Logger

data Severity = Info | Error | Fatal
  deriving (Eq, Ord)

log :: Logger -> String -> Severity -> IO ()
In this representation, extra state like the minimum severity level in our _LogAboveSeverity is not added to the type, but captured by the closures:
logAboveSeverity :: Severity -> IO Logger
logAboveSeverity minSeverity = pure MkLogger
  { _log = \message severity -> if severity >= minSeverity then ... else pure ()
  }
If we need to update some of the state shared by the closures, the state needs to be stored in some kind of reference type like IORef.
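To make the closure-captured state concrete, here is a minimal, self-contained sketch of the record-of-callbacks representation. The memoryLogger helper and the idea of wrapping an inner logger are assumptions made for illustration, not part of the library sketched above; the IORef is only there so the shared state can be inspected afterwards.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import Prelude hiding (log)

data Severity = Info | Error | Fatal
  deriving (Eq, Ord, Show)

data Logger = MkLogger
  { _log :: String -> Severity -> IO ()
  }

log :: Logger -> String -> Severity -> IO ()
log = _log

-- Hypothetical in-memory logger: the closure captures the IORef and
-- appends one formatted line per message.
memoryLogger :: IORef [String] -> Logger
memoryLogger ref = MkLogger
  { _log = \message severity ->
      modifyIORef' ref (++ [show severity ++ ": " ++ message])
  }

-- The threshold lives in the closure, not in the Logger type.
-- Unlike the version above, this variant wraps an existing logger.
logAboveSeverity :: Severity -> Logger -> Logger
logAboveSeverity minSeverity inner = MkLogger
  { _log = \message severity ->
      if severity >= minSeverity then log inner message severity else pure ()
  }

main :: IO ()
main = do
  ref <- newIORef []
  let logger = logAboveSeverity Error (memoryLogger ref)
  log logger "starting up" Info  -- below the threshold, dropped
  log logger "disk full" Fatal   -- at or above the threshold, kept
  readIORef ref >>= mapM_ putStrLn
```

Both loggers have the same type, Logger, so code that takes a Logger cannot tell them apart.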
Similar to the OOP code, the FileLogger needs to be a separate type:
data FileLogger = MkFileLogger
  { _logger :: Logger -- callbacks capture the file descriptor/buffer and write to it
  , _flush  :: IO ()  -- similarly captures the file descriptor/buffer, flushes it
  }

logFileLogger :: FileLogger -> String -> Severity -> IO ()
logFileLogger = log . _logger
However, unlike our OOP example, existing code that uses the Logger type and log function cannot work with this new type. There needs to be some refactoring, and how the user code will need to be refactored depends on how we want to expose this new type to the users.
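A small sketch of what that refactoring looks like at a call site, under the assumption that the library exposes the _logger field. The bufferedLogger helper is hypothetical and uses two IORefs in place of a real file, so the example stays self-contained.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef, writeIORef)
import Prelude hiding (log)

data Severity = Info | Error | Fatal
  deriving (Eq, Ord, Show)

data Logger = MkLogger
  { _log :: String -> Severity -> IO ()
  }

log :: Logger -> String -> Severity -> IO ()
log = _log

data FileLogger = MkFileLogger
  { _logger :: Logger
  , _flush  :: IO ()
  }

-- Hypothetical stand-in for a file: `pending` holds unflushed lines,
-- `written` holds the lines that have been flushed.
bufferedLogger :: IORef [String] -> IORef [String] -> FileLogger
bufferedLogger pending written = MkFileLogger
  { _logger = MkLogger
      { _log = \message _ -> modifyIORef' pending (++ [message]) }
  , _flush = do
      buffered <- readIORef pending
      modifyIORef' written (++ buffered)
      writeIORef pending []
  }

-- Existing user code, written against Logger.
doStuffWithLogging :: Logger -> IO ()
doStuffWithLogging logger = log logger "test" Info

main :: IO ()
main = do
  pending <- newIORef []
  written <- newIORef []
  let fileLogger = bufferedLogger pending written
  -- `doStuffWithLogging fileLogger` is a type error: FileLogger is not a Logger.
  -- The call site has to project the Logger out explicitly.
  doStuffWithLogging (_logger fileLogger)
  _flush fileLogger
  readIORef written >>= print
```

Once the Logger has been projected out, there is no way to get back to the FileLogger (and its flush) from it.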
Option 2: A typeclass that we can implement for our concrete logger types:
class Logger a where
  log :: a -> String -> Severity -> IO ()

data SimpleLogger = MkSimpleLogger { ... }

simpleLogger :: IO SimpleLogger
simpleLogger = ...

instance Logger SimpleLogger where
  log = ...
To allow backwards-compatible changes in the logger library, we need to hide the concrete logger type:
module Logger
  ( Logger
  , simpleLogger -- I can export this without exporting its return type
  ) where

...
With this module, we have to either add a type parameter to the functions and other types that use Logger, or use existentials.
Adding a type parameter is not a backwards compatible change, and in general it can cause a snowball effect of propagating the type parameter to the direct users, and then their users, and so on, creating a massive change and difficult-to-use types.
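The snowball effect can be sketched in a few lines. The Session type, the greet function, and the StdoutLogger instance below are hypothetical names introduced only to show how the parameter spreads.

```haskell
import Prelude hiding (log)

data Severity = Info | Error | Fatal
  deriving (Eq, Ord, Show)

class Logger a where
  log :: a -> String -> Severity -> IO ()

-- Hypothetical concrete logger that prints to stdout.
data StdoutLogger = MkStdoutLogger

instance Logger StdoutLogger where
  log _ message severity = putStrLn (show severity ++ ": " ++ message)

-- The application type gains a parameter for its logger...
data MyApp logger = MkMyApp
  { _logger :: logger
  }

-- ...so does every type that contains a MyApp...
data Session logger = MkSession
  { _app  :: MyApp logger
  , _user :: String
  }

-- ...and every function over those types needs the parameter and the constraint.
greet :: Logger logger => Session logger -> IO ()
greet session =
  log (_logger (_app session)) ("hello, " ++ _user session) Info

main :: IO ()
main = greet (MkSession (MkMyApp MkStdoutLogger) "alice")
```

Every signature that mentioned MyApp or Session before the change has to be edited, including signatures in downstream code.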
The problem with existentials is that they are limited in how you can use them, and are somewhat strange in some areas. In our application we can do this:
data MyApp = forall a . Logger a => MkMyApp
  { _logger :: a
  }
But we can’t have a local variable with this existential type:
createMyApp :: IO MyApp
createMyApp = do
  -- I can't add a type annotation to myLogger without the concrete type
  myLogger <- simpleLogger -- simpleLogger :: IO SimpleLogger
  return MkMyApp { _logger = myLogger }
I also cannot have an existential type in a function argument:
-- The type signature is accepted by the compiler, but the value cannot be used.
doStuffWithLogging :: (forall a . Logger a => a) -> IO ()
doStuffWithLogging logger = log logger "test" Info -- some obscure type error
Instead we have to “pack” the logger value with its typeclass dictionary in a new type:
data LoggerBox = forall a . Logger a => LoggerBox a
doStuffWithLogging :: LoggerBox -> IO ()
doStuffWithLogging (LoggerBox logger) = log logger "test" Info
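For reference, a complete version of the boxed approach that compiles on its own might look like the sketch below. The MemoryLogger and IgnoringLogger instances are assumptions made for the example; the ExistentialQuantification extension is what allows the LoggerBox declaration.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import Prelude hiding (log)

data Severity = Info | Error | Fatal
  deriving (Eq, Ord, Show)

class Logger a where
  log :: a -> String -> Severity -> IO ()

-- Hypothetical logger that appends formatted lines to a mutable list.
newtype MemoryLogger = MkMemoryLogger (IORef [String])

instance Logger MemoryLogger where
  log (MkMemoryLogger ref) message severity =
    modifyIORef' ref (++ [show severity ++ ": " ++ message])

-- Hypothetical logger that drops every message.
data IgnoringLogger = MkIgnoringLogger

instance Logger IgnoringLogger where
  log _ _ _ = pure ()

-- The box packs a value together with its Logger dictionary.
data LoggerBox = forall a . Logger a => LoggerBox a

doStuffWithLogging :: LoggerBox -> IO ()
doStuffWithLogging (LoggerBox logger) = log logger "test" Info

main :: IO ()
main = do
  ref <- newIORef []
  -- Both boxed values have the same type, so they can share a list.
  let loggers = [LoggerBox (MkMemoryLogger ref), LoggerBox MkIgnoringLogger]
  mapM_ doStuffWithLogging loggers
  readIORef ref >>= print
```

The price is that every producer has to wrap its logger in LoggerBox and every consumer has to pattern match on it.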
Other problems and limitations of this approach:
- We have to write forall a . Logger a => ... a ... instead of just Logger.
- [...] FileLogger, but [...] Logger value to FileLogger, without knowing the concrete type of the FileLogger.
The effect monad approach is a variation of option (2) without existentials. Instead of
class Logger a where
  log :: a -> String -> Severity -> IO ()
We add the ability to log in a monad type parameter:
class MonadLogger m where
  log :: String -> Severity -> m ()
Then provide a “monad transformer” for each of the logger implementations:
newtype SimpleLoggerT m a = SimpleLoggerT { runSimpleLoggerT :: m a }

instance MonadIO m => MonadLogger (SimpleLoggerT m) where
  log msg sev = SimpleLoggerT { runSimpleLoggerT = liftIO (logStdout msg sev) }

newtype FileLoggerT m a = FileLoggerT { runFileLoggerT :: Handle -> m a }

instance MonadIO m => MonadLogger (FileLoggerT m) where
  log msg sev = FileLoggerT { runFileLoggerT = \handle -> liftIO (logFile handle msg sev) }
The database library does the same, and the app combines these together:
newtype MyAppMonad a = ...
instance MonadLogger MyAppMonad where ...
instance MonadDb MyAppMonad where ...
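As a rough illustration of how user code stays unchanged when the handler is swapped, here is a self-contained sketch that uses only base. It simplifies the setup above: instead of monad transformers it gives IO an instance directly and adds a hypothetical pure CollectLogs monad for tests. The Monad superclass on MonadLogger and the createUser function are also assumptions made for the example.

```haskell
import Prelude hiding (log)

data Severity = Info | Error | Fatal
  deriving (Eq, Ord, Show)

-- Assumption: a Monad superclass, so user code needs a single constraint.
class Monad m => MonadLogger m where
  log :: String -> Severity -> m ()

-- Hypothetical production handler: log straight to stdout.
instance MonadLogger IO where
  log message severity = putStrLn (show severity ++ ": " ++ message)

-- Hypothetical test handler: collects log lines without doing any IO.
newtype CollectLogs a = CollectLogs { runCollectLogs :: ([String], a) }

instance Functor CollectLogs where
  fmap f (CollectLogs (logs, a)) = CollectLogs (logs, f a)

instance Applicative CollectLogs where
  pure a = CollectLogs ([], a)
  CollectLogs (l1, f) <*> CollectLogs (l2, a) = CollectLogs (l1 ++ l2, f a)

instance Monad CollectLogs where
  CollectLogs (l1, a) >>= k =
    let CollectLogs (l2, b) = k a
    in CollectLogs (l1 ++ l2, b)

instance MonadLogger CollectLogs where
  log message severity = CollectLogs ([show severity ++ ": " ++ message], ())

-- User code mentions only the capability, never a concrete logger.
createUser :: MonadLogger m => String -> m Int
createUser name = do
  log ("creating user " ++ name) Info
  pure (length name)

main :: IO ()
main = do
  -- Same function, two handlers, selected by the type at the use site.
  n <- createUser "alice"
  print n
  print (runCollectLogs (createUser "alice"))
```

The type of createUser does not change when a new handler is added, which is the property the OOP version gets from subtyping.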
Because we have one type parameter that encapsulates all side effects (instead of one for logging, one for database operations), this avoids the issues with snowballed type parameters in the use sites.
The database library can also add a logger dependency without breaking the user code.
I think this is the best we can get in Haskell, and it’s quite similar to our OOP solution in terms of the changes needed in user code.
However for this to work the entire ecosystem of libraries needs to do things this way. If the database library decides to use the ADT approach, we will need an “adapter”, e.g. a monad typeclass for the DB operations, with a concrete monad transformer type to call the DB library functions.
This is also the main problem with the composable effects libraries.
(There are also issues with how this kind of code performs at runtime, but that’s probably a topic for another blog post.)
Haskellers have been developing various ways of modelling side effects (such as DB operations, logging) as “effects” and various ways of composing them.
A simple and widespread way of doing this is via the effect monads, as we’ve seen in the previous section.
However these systems have a few drawbacks, compared to our OOP solution:
Different effect libraries generally don’t work together. For example, mtl and eff functions won’t work together without some kind of adapter turning one into the other.
Even if the entire Haskell ecosystem decides to use one particular effect system, things like using two different handlers for different parts of the program, such as the example of using a different logger in the db library and the main app, require type juggling. In some effect libraries this is not even possible.
Finally, note that the OOP code shown in this post is very basic and straightforward code that even a beginner in OOP can write. Any new person who joins the project, or any one-time contributor who just wants to fix a bug and move on, will be able to work on either one of the libraries or the application code. It’s difficult to say the same of the composable effects libraries in Haskell.
Mainstream statically-typed OOP allows straightforward backwards compatible evolution of types, while keeping them easy to compose. I consider this to be one of the killer features of mainstream statically-typed OOP, and I believe it is an essential feature for programming with many people, over long periods of time.
Just like OOP, Haskell has design patterns, such as the effect monad pattern we’ve shown above. Some of these design patterns solve the problem nicely, but they need an entire ecosystem to follow the same pattern to be useful.
I think it would be beneficial for the functional programming community to stop dismissing OOP’s successes in the industry as an accident of history and try to understand what OOP does well.
Thanks to Chris Penner and Matthías Páll Gissurarson for reviewing a draft of this blog post.