(TL;DR -- I don't often use single-case discriminated unions. Jump down to the flowchart to see what I commonly use instead.)
In F#, single-case discriminated unions crop up fairly often. However, in many cases, there are other alternatives one might consider. In this post I want to sketch some of these alternatives, providing an account of the possibilities and -- more importantly -- the trade-offs involved. In order to keep this post to a reasonable length, I will consider three very common scenarios for which one might use single-case discriminated unions. Specifically, they are:
- As Marker Values
- As Tagged Primitives
- As Value Objects
!!! DISCLAIMER !!!
The advice in this post is meant to be a set of guidelines. A starting point. A way of consciously evaluating decisions (instead of just blindly copy-pasting some code). Further, after nearly 15 years of doing F# professionally, I've developed a lot of 'instinctual decision making'. This blog is very much an attempt at unpacking some of that, and codifying it in a way others might leverage. In short, it represents how I approach things. This might not necessarily “work” for someone else. And it certainly won't cover all possible scenarios one might encounter. Still, I hope you find it useful.
Just take everything herein with a grain of salt, OK? 😉
Marker Values
For the purposes of this discussion, let's define a “marker value” as: a type
only ever manifested in a single instance. Python's None
is one example.
The standard library in Pony also uses marker values to great effect. Though
unlike Python, where None
is a built-in construct, Pony leverages its flexible
“primitive” construct. In F#, you might see a single-case discriminated
union used for this purpose. In fact, one might argue any non-data-carrying
variant of any discriminated union is an example of a marker value (but I
don't want to wade too far into the weeds here).
type AmbientAuth = AmbientAuth
In any case, the main things one wants out of a marker value are:
- It should not carry data
- It should not support null-ness
- It should be an actual type
- It should have easy ergonomics (at the call site)
It should have only one instanceAll occurrences should be semantically equivalent
So, what other alternatives are there in F#?
Plot Twist!!! SCDUs are the perfect type for this in F# (but do consider
decorating them with the StructAttribute
). Really?! Well, yes... and no.
There is one caveat. If you intend to have your F# “marker value” be consumed
by other .NET languages (e.g. C#, VB), you may want to consider a struct instead.
type AmbientAuth = struct (* MARKER *) end
Why something different for other languages? Because F# discriminated unions, regardless of the number of variants, generate some extra code (as part of the machinery encoding them onto the CLR). F# neatly hides this detail from you. However, it's plainly visible in, for example, C# consumers. So, using a struct produces a more language-neutral solution. But it will mean slightly more cumbersome invocation for F# consumers.
let auth = AmbientAuth
(* instantiation: ↑↑↑ Union vs Struct ↓↓↓ *)
let auth = AmbientAuth()
(*
!!! LOOK OUT !!!
let auth = AmbientAuth ⮜ partially applied constructor -- NOT a value!
forgetting the parens ⮝⮝ is a subtle source of confusion
*)
So, to recap Marker Values present the following options and trade-offs:
Single-case Union | CLR Struct |
---|---|
Pro: Hits a syntactic 'sweet spot' | Pro: Language neutral |
Con: Funky when used outside F# | Con: Instantiation cruft in F# |
Tagged Primitives
Perhaps the most common usage of single-case unions -- or at least the one that
leaps foremost into people's minds -- is that of a “Tagged Primitive”. In other
words, a simple wrapper of some other basic type, usually meant to provide more
context. Again, we can find similar constructs in other languages. Most notably,
a newtype
in Haskell. Some fairly common examples of such a thing in F#
might be:
type ClientId = ClientId of Guid // ⮜ this should probably be a Value Object!
type Pixels = Pixels of uint64
// NOTE other operators omitted for brevity
static member ( + ) (Pixels l, Pixels r) = Pixels(l + r)
In fact, if you squint a bit, those even (sort of) look like a newtype
.
Alas, F# is not Haskell. Let me repeat that one more time for the folks
in the cheap seats:
F# IS NOT HASKELL.
It really isn't. And it certainly doesn't have newtype
. The above examples
behave rather differently (than newtype
) in terms of what the compiler emits.
Requests to add such a feature have, to date, languished in discussion. So,
instead, we have the previous example, or a few other alternatives. But first,
let's have some criteria for why we'd want (need?) a newtype
-like construct.
- It should provide contextual information about the role of some code.
- The 'new' type should be type-checked distinctly from the 'wrapped' type.
- There are no behavioral restrictions imposed on the underlying type.
This seem like very useful goals. However, item 2 means we can not use type abbreviations. And item 3 requires a significant amount of 'boilerplate' code to do correctly (as we want to have a lot of behavior pass through to the underlying primitive). Also, let's pause to appreciate: if we want the opposite of item 3, then we are almost certainly talking about a “Value Object” (discussed later in this post). So what's left (besides single-case unions)? Depending on the type being wrapped, and the target usage, there are a few alternatives. Let's consider each in turn.
Units of Measure
Units of Measure meet all our given criteria. And syntactically they are very light-weight, as well. Further, they enforce the fundamental rules of mathematics at compile time (e.g. quantities of dissimilar units can be multiplied together -- but not combined via addition).
[<Measure>]
type pixels
let offset = 240UL<pixels>
However, they carry some heavy restrictions. They are limited to only numeric
primitives (int
, float
, et cetera). They are erased at compile time (so
no reflective meta-programming support -- and no visibility in other .NET
languages). Additionally, since many operations in .NET are not “units aware”,
it's not uncommon to have to explicitly temporarily discard the units for
certain operations (only to re-apply them later). This unwrap-compute-rewrap
dance has come to upset many an F# developer.
Generic Tags
It turns out, with just a little bit of hackery, we can actually get something like Units of Measure -- but for non-numeric types. We call this a Generic Tag. I won't go into the specific mechanics of it, though. There are a few different ways to achieve it. And all of them are definitely advanced (not to mention a bit... circumspect). However, there's a library which hides the details for many (most?) common non-numeric primitives. So I will should you an example.
open System
#r "nuget: FSharp.UMX"
open FSharp.UMX
[<Measure>]
type ClientId
let current : Guid<ClientId> = UMX.tag (Guid.NewGuid())
As with everything, there are some more trade-offs here. In fact, other than
lifting the restriction to numerics, generic tags have all the same caveats as
units of measure. Further, their are at least some “units aware” functions
(abs
, min
, et cetera) in the core F# library. However, those are (again) limited to numeric types.
For generic tags, you may well find yourself stripping off the tag frequently-enough that its usage becomes questionable.
It's worth exploring -- but “individual milage may vary”.
Records
Since both Units of Measure and Generic Tags are erased at compile time, there really is no way to expose them to other .NET Languages. And, as mentioned in the section on Marker Values, even a Single-Case Union will surface some extra cruft when consumed from, e.g., Visual Basic. So, in cases where a (CLR) language-neutral implementation is required. We fallback to records, the bread-and-butter of F# data types.
type ClientId = { Value : Guid } // ⮜ this should probably be a Value Object!
type Pixels = { Value : uint64 }
// NOTE other operators omitted for brevity
static member ( + ) (Pixels l, Pixels r) = Pixels(l + r)
The drawback here is, clearly, the same as with single-case unions: more boilerplate, as we'd like all our behavior to pass through to the underlying primitive -- but need to code it ourselves.
Before moving on, the trade-offs for various Tagged Primitives can be summarized as follows:
Single-case Union | Units of Measure | Generic Tag | Record |
---|---|---|---|
Pro: Feel like a Haskeller 🙄 | Pro: Low-syntax, No-overhead | Pro: Low-syntax, No-overhead | Pro: Language neutral |
Con: Requires boilerplate | Con: Numeric types only | Con: Dodgy use of type system | Con: Requires boilerplate |
Con: Funky when used outside F# | Con: Erased at compile-time | Con: Erased at compile-time | |
Con: Standard library isn't “tag aware” |
Value Objects
Earlier, I mentioned how a “primitive with customized behavior” is its own kind
of construct (and the closely related notion: maybe your InvoiceId
s
shouldn't be divisible by any number?). But this is a very well-explored
area of software development. It's called a “Value Object”. It's an essential
part of Domain-Driven Design. And was first defined by Eric Evans in his
seminal work on the subject. But let me re-iterate, here, the main aspects
of a value object:
- It lacks identity
- It has no lifecycle
- It is self-consistent
Unpacking these a bit...
- “lacks identity” means there's no well-know identifier for this thing (e.g.
row id = 12345
). In practical terms, it means a thing with structural equality semantics. These are plentiful in F# (yay!). - “no lifecycle” means it's not data which evolves over time. Again, concretely, this means it's immutable. Cool. Lots of immutable constructs in F#.
- “self-consistent” means if you've got an instance of it, that instance can be reliably assumed to contain valid (for a given domain) data. Oh, hmm... So this one doesn't come for free. In F#, we use carefully controlled construction functions to realize self-consistent values.
Fortunately, there are several ways to encode a value object in F#. The most common are as a discriminated union or as a record. Rarely, it will be beneficial to model the internal state of the value object as one of a set of mutually exclusive alternatives. In that (very infrequent) case, there's clearly only one tool for the job -- a multi-case union. However, in this post, we're focused on single-case unions. And they can be used to address the most common scenario of a value object simply wrapping one (or a few) field(s).
type Paint =
private | Paint' of volume : float * pigment : Color
member me.Volume =
let (Paint' (volume=value)) = me in value
member me.Pigment =
let (Paint' (pigment=value)) = me in value
// NOTE other members omitted
static member Of(volume : float, color : string) : Result<Paint, PaintError> =
// NOTE validation omitted
However, it's equally valid to use a record for this. In fact, the code is nearly identical. The only differences being how you define and access the underlying data.
type Paint =
private { volume : float; pigment : Color }
member me.Volume = me.volume
member me.Pigment = me.pigment
// NOTE other members omitted
static member Of(volume : float, color : string) : Result<Paint, PaintError> =
// NOTE validation omitted
In either case, the 'recipe' for a Value Object is roughly as follows:
- Decide between a discriminated union or a record
- Mark constructor and field(s)
private
- Add public construction (either by static methods or companion module)
- Add any public read-only properties or operators (as needed)
- Add any extra operations (either as methods or in a companion module)
But why pick a union? When should you prefer a record? Honestly, this comes down to personal preference and sense of style. Me? Personally, I prefer the record. Performing a decomposition just to access the internal state feels... gratuitous. But to each his own. I certainly wouldn't balk at code that chose to go with the union approach.
For completeness sake, the following table lists the trade-offs between unions and records for encoding a Value Object (though it's a very silly difference):
Single-case Union | Record |
---|---|
Con: Awkward access to underlying data | Pro: Direct access underlying data |
Conclusion
Hopefully, by now, I've at least got you thinking about the various alternative approaches one might use when trying to 'level up' from simple primitives in F#. To further help, and for easier reference, I've also created the following flow chart (“How Paul Pick's His Primitives”, so to speak).
Have fun and happy coding!