Paul Blasucci's Weblog


A Mixed-Paradigm Recipe for Exposing Native Code

Published: Tuesday, 15 December 2015, at 17:10 UTC +01:00

(Note: this post assumes some familiarity with either .NET or Mono... it's also going to help if you've worked with C#, VB, or F# before.)

F# is frequently called a "functional first" programming language. Don Syme, creator of the language, has explained it thus:

Functional-first programming uses functional programming as the initial paradigm for most purposes, but employs other techniques such as objects and state as necessary.

However, the simplicity of this statement belies the tremendous power and flexibility of the language. This is seldom more apparent than when trying to wrap unmanaged libraries in F# code. In fact, we may combine two different approaches -- one common to OO languages and the other popularized by pure functional programming -- into a sort of recipe for wrapping native functionality in F#. Specifically, we'll bring together deterministic resource management[1][2] with the notion of abstract data types[3][4]. As a case study for exploring this, we'll look at the fszmq project.

What is fszmq?

fszmq is an MPLv2-licensed CLR (e.g. Mono, .NET) binding to the ZeroMQ distributed computing library. ZeroMQ, meanwhile, provides a complete library of building blocks for developing high-performance, message-passing systems.

fszmq is primarily concerned with Sockets, which pass stateless Messages to one another. These messages are composed of 1 or more frames, each of 0 or more bytes. fszmq makes no demands on the actual representation of message data. Sockets exchange messages in well-defined patterns, which provide proven semantics on which to build distributed systems. Additionally, sockets provide inbound and outbound in-memory message queues (inaccessible to application code). This makes centralization optional rather than mandatory. Sockets also provide a uniform abstraction over various transport protocols, the most popular of which are In-Process (i.e. threads), IPC, TCP, and PGM. Finally, a Context groups together a collection of sockets into a logically distinct "node". There is typically one context instance per OS-level process.

A simple example of a server, which receives updates from a client, and then replies with an acknowledgement might look as follows:

// create, configure Context, Socket instances
use context = new Context ()
use server  = router context
Socket.bind server "tcp://eth0:5555"

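// 'hook' is assumed to be a cancellation token defined elsewhere
while not hook.IsCancellationRequested do
  let msg    = Socket.recvAll server
  let sender = Array.get msg 0
  // actual work would go here
  // reply with two frames: the sender's identity and a one-byte acknowledgement
  [| sender; [| 0x00uy |] |] |> Socket.sendAll server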

For more information on getting started with fszmq and ZeroMQ, please visit the respective project sites.

And now, back to the main feature...

F# code is subject to garbage collection, just like any other CLR language. This poses particular issues when working with unmanaged resources, which -- by definition -- are outside the purview of a garbage collector. However, we can take two explicit steps to help manage this. First, we define a type whose (ideally non-public) constructor initializes a handle to unmanaged memory:

type Socket internal (context,socketType) =
  let mutable disposed  = false // used for clean-up
  let mutable handle    = C.zmq_socket (context,socketType)
  //NOTE: in fszmq, unmanaged function calls are prefixed with 'C.'
  do if handle = 0n then ZMQ.error ()

Then, we both override object finalization (inherited from System.Object) and we implement the IDisposable interface, which allows us to control when clean-up happens:

  override __.Finalize () =
    if not disposed then
      disposed <- true // ensure clean-up only happens once
      let okay = C.zmq_close handle
      handle <- 0n
      assert (okay = 0)

  interface IDisposable with

    member self.Dispose () =
      self.Finalize ()
      GC.SuppressFinalize self // ensure clean-up only happens once

With our creation and destruction in place, we've made a (useless, but quite safe) managed type, which serves as an opaque proxy to the unmanaged code with which we'd like to work. However, as we've defined no public properties or methods, there's no way to interact with instances of this type.
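
The same create-and-destroy shape can be tried without ZeroMQ installed. What follows is a minimal sketch, not fszmq code: the `NativeBuffer` type and its internal `Handle` member are hypothetical, and `Marshal.AllocHGlobal` stands in for the call that would hand back a native handle.

```fsharp
open System
open System.Runtime.InteropServices

/// Hypothetical proxy over a block of unmanaged memory (not part of fszmq).
type NativeBuffer internal (size : int) =
  let mutable disposed = false // used for clean-up
  // AllocHGlobal returns memory the garbage collector knows nothing about
  let mutable handle = Marshal.AllocHGlobal size

  // assumption: internal, so only code in this assembly can see the raw pointer
  member internal __.Handle = handle

  override __.Finalize () =
    if not disposed then
      disposed <- true // ensure clean-up only happens once
      Marshal.FreeHGlobal handle
      handle <- IntPtr.Zero

  interface IDisposable with
    member self.Dispose () =
      self.Finalize ()
      GC.SuppressFinalize self // the finalizer has no work left to do

// 'use' calls Dispose when 'buffer' goes out of scope
let demo () =
  use buffer = new NativeBuffer (64)
  buffer.Handle <> IntPtr.Zero
```

Calling `Dispose` a second time is harmless here, because the `disposed` flag guards the release.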

And now abstract data types enter the scene.

Ignoring the bits which pertain to unmanaged memory, our opaque proxy sounds an awful lot like this passage about abstract data types:

[An ADT] is defined as an opaque type along with a corresponding set of operations... [we have] functions that work on the type, but we are not allowed to see "inside" the type itself.

This would exactly describe our situation... if only we had some functions which could manipulate our proxy. Let's make some!

For the sake of navigability, we group the functions into a module with the same name as the type they manipulate. The implementations themselves mostly invoke unmanaged functions, passing along the internal state of our opaque proxy (the Handle member used below, whose definition is elided from the type shown earlier).

module Socket =

  let trySend (socket:Socket) (frame:byte[]) flags =
    match C.zmq_send(socket.Handle,frame,unativeint frame.Length,flags) with
    | Message.Okay -> true
    | Message.Busy -> false
    | Message.Fail -> ZMQ.error()

  let send socket frame =
    Message.waitForOkay (fun () -> trySend socket frame ZMQ.WAIT)

  let sendMore socket frame : Socket =
    Message.waitForOkay (fun () -> trySend socket frame (ZMQ.WAIT ||| ZMQ.SNDMORE))
    socket

  //NOTE: additional functions elided, though they follow the same pattern

And that's primarily all there is to this little "recipe". We can see from the following simple example how our opaque proxy instances are a sort of token which provides scope as it is passed through various function calls.

// create our opaque Socket instance
use client = dealer context
//NOTE: the 'use' keyword ensures '.Dispose()' is called automatically

// configure opaque proxy
Socket.connect client "tcp://eth0:5555"

// ... elsewhere ...
// send a message
date.Stamp () |> Socket.send client

// recv (and log) a message
client
|> Socket.tryPollInput 500<ms> // timeout
|> Option.iter logAcknowledgement
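
The opaque-type-plus-module arrangement can also be explored with purely managed state. This is a minimal sketch under stated assumptions: `Mailbox`, its internal `Frames` member, and the capacity of 8 are all hypothetical and only mimic the shape of the fszmq functions above. The explicit `ModuleSuffix` attribute is what lets a module share its type's name on older F# compilers; newer ones apply it implicitly.

```fsharp
open System.Collections.Generic

/// Hypothetical opaque type: no public members, only internal state.
type Mailbox internal () =
  let frames = Queue<byte[]> ()
  member internal __.Frames = frames

[<CompilationRepresentation(CompilationRepresentationFlags.ModuleSuffix)>]
module Mailbox =

  let create () = Mailbox ()

  // returns false instead of failing when the (hypothetical) capacity is reached
  let trySend (mailbox : Mailbox) (frame : byte[]) =
    if mailbox.Frames.Count < 8 then
      mailbox.Frames.Enqueue frame
      true
    else
      false

  let send mailbox frame =
    if not (trySend mailbox frame) then failwith "mailbox is full"

  // returns the mailbox so that calls can be chained
  let sendMore mailbox frame : Mailbox =
    send mailbox frame
    mailbox

  let tryRecv (mailbox : Mailbox) =
    if mailbox.Frames.Count > 0 then Some (mailbox.Frames.Dequeue ()) else None

// the instance acts as a token threaded through the module's functions
let inbox = Mailbox.create ()
Mailbox.send (Mailbox.sendMore inbox [| 1uy |]) [| 2uy |]
let first = Mailbox.tryRecv inbox
```

Callers outside the assembly can hold a `Mailbox` and pass it around, but the only way to act on it is through the functions in the module.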

Now, we could stop here. However, this clean and useful F# code will feel a bit clumsy when used from C#. Specifically, in C# one tends to invoke methods on objects, and public methods are conventionally named in PascalCase. Fortunately -- as an added bonus -- we can accommodate C# with only minor decoration to our earlier code. We'll first add an ExtensionAttribute to our module. This tells compilers (C#'s, for example) to look for extension methods inside this module.

[<Extension>]
module Socket =

And then we add two attributes to each public function. The ExtensionAttribute allows our function to appear as a method on the opaque proxy (when used from C#). Meanwhile, the CompiledNameAttribute ensures that C# developers will be presented with the naming pattern they expect. Calling the code from F# remains unaltered.

  [<Extension;CompiledName("SendMore")>]
  let sendMore socket frame : Socket =
    Message.waitForOkay (fun () -> trySend socket frame (ZMQ.WAIT ||| ZMQ.SNDMORE))
    socket

  //NOTE: additional functions elided, though they follow the same pattern

Now C# developers will find it quite straightforward to use the code... and we've maintained all the benefits of both deterministic resource management and abstract data types.

// create our opaque Socket instance
//NOTE: the 'using' keyword ensures '.Dispose()' is called automatically
using(var client = context.Dealer())
{
  // configure opaque proxy
  client.Connect("tcp://eth0:5555");

  // ... elsewhere ...
  // send a message
  client.Send(date.Stamp());

  // recv (and log) a message
  var msg = new byte[0];
  if(client.TryGetInput(500,out msg)) logger.Log(msg);
}
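
The two attributes can be tried in isolation as well. The following is a minimal sketch rather than fszmq code: the `Tally` type, its internal members, and every function name are hypothetical. It shows the F# side only; the comments describe the names a C# caller would be expected to see.

```fsharp
open System.Runtime.CompilerServices

/// Hypothetical opaque type with internal state only.
type Tally internal () =
  let mutable count = 0
  member internal __.Value = count
  member internal __.Add n = count <- count + n

[<Extension>]
[<CompilationRepresentation(CompilationRepresentationFlags.ModuleSuffix)>]
module Tally =

  // not an extension method: there is no instance to hang it on yet
  [<CompiledName("Create")>]
  let create () = Tally ()

  // compiled as 'Increment'; the first parameter becomes the extended instance
  [<Extension; CompiledName("Increment")>]
  let increment (tally : Tally) : Tally =
    tally.Add 1
    tally

  // compiled as 'Current'
  [<Extension; CompiledName("Current")>]
  let current (tally : Tally) = tally.Value

// F# callers keep using the camelCase names and pipelines
let total = Tally.create () |> Tally.increment |> Tally.increment |> Tally.current
```

From C#, the same sequence would be expected to read as `tally.Increment().Increment().Current()`, given a `using` directive for whatever namespace holds the module.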

By combining useful techniques from a few different "styles" of programming, and exploiting the rich, multi-paradigm capabilities of F#, we are able to provide simple, robust wrappers over native code.

TL;DR...

A Mixed-Paradigm Recipe for Exposing Native Code

1. Make a managed type with no public members, which proxies an unmanaged object
   - Initialize native objects in the type's constructor
   - Clean up native objects in the type's finalizer
   - Expose the finalizer via the IDisposable interface
2. Use the abstract data type pattern to provide an API
   - Define functions which have "privileged" access to the native objects inside the opaque type from step #1
   - Group said functions into a module named after the opaque type from step #1

Bonus: make the ADT friendly for C# consumers

- Use ExtensionAttribute to make ADT functions seem like method calls
- Use CompiledNameAttribute to respect established naming conventions

(This post is part of the 2015 F# Advent.)