Recently I blogged about how we’ve been using ocaml-ctypes to generate type-safe bindings to some C libraries that we want to use within our OCaml software stack.
Over the last few months we’ve added bindings for a few things including:
libpci: for accessing PCI devices;
flock(2): for applying or removing an advisory lock on an open file; and
libsanlock: for shared locks on distributed storage.
The last of these did, however, cause a small headache. Like many C libraries, libsanlock makes use of flexible array members.
What are flexible array members?
The term flexible member may conjure up any number of images, but here we’re talking about a feature introduced in the C99 standard.
As of C99, it is possible to define a struct whose final member is an array without specifying its size. These can then be used to store an arbitrary amount of data as part of this struct when the required amount may only be determined at runtime. GCC also supports the same idea as an extension, by declaring an array of zero length.
There is an example of such a struct in the Sanlock library:
struct sanlk_resource {
    char lockspace_name[SANLK_NAME_LEN]; /* terminating \0 not required */
    char name[SANLK_NAME_LEN]; /* terminating \0 not required */
    /* ... */
    uint32_t num_disks;
    /* followed by num_disks sanlk_disk structs */
    struct sanlk_disk disks[0];
};
As is often the case, there is an accompanying member that specifies the size of the array which you’ll need when handling one of these structs.
The sizeof operator on this struct is also required to obtain the offset to this field and when allocating memory for a struct of this kind. For example, to allocate a sanlk_resource with n disks:
struct sanlk_resource *res = malloc(sizeof *res + n * sizeof (struct sanlk_disk));
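The one-liner above can be wrapped into a small helper that also guards the size computation. This is only a sketch: struct disk, struct resource and resource_alloc are hypothetical stand-ins so the example compiles without the sanlock headers, and the field layout is an assumption rather than the real sanlk_disk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for sanlk_disk and sanlk_resource. */
struct disk {
    char path[32];
    uint64_t offset;
};

struct resource {
    uint32_t num_disks;
    struct disk disks[];   /* C99 flexible array member */
};

/* Allocate a zero-filled resource with room for n disks.
   Returns NULL on overflow or allocation failure; release with free(). */
struct resource *resource_alloc(uint32_t n)
{
    /* The array starts no later than sizeof(struct resource), so
       sizeof plus n elements is always enough room. */
    assert(offsetof(struct resource, disks) <= sizeof(struct resource));

    /* Guard against size_t overflow in the size computation. */
    if (n > (SIZE_MAX - sizeof(struct resource)) / sizeof(struct disk))
        return NULL;

    struct resource *res =
        calloc(1, sizeof *res + (size_t)n * sizeof(struct disk));
    if (res != NULL)
        res->num_disks = n;   /* keep the count next to the array */
    return res;
}
```

Storing the count at allocation time keeps num_disks and the real size of the array in step, which is what any later bounds check has to rely on.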
Bending Ctypes around flexible array members
When wrapping C structs in OCaml types using ocaml-ctypes, you define a new structure parameterised by a new OCaml type. This creates an “unsealed” structure which enables us to handle values of this type (perhaps as returned by a C library), but we cannot create values of this type until it is “sealed”. We first need to declare all of the fields in the struct so that the size and offsets of the struct and its members can be handled safely by the OCaml code.
The Ctypes library provides the primitive and POSIX types that you’ll need to build up your wrapper for a C struct.
As a concrete example, consider the following C struct:
struct line {
    int length;
    char contents[];
};
We would create an OCaml representation for this as follows:
module Line = struct
  open Ctypes
  type internal
  let internal : internal structure typ = structure "line"
  let length = field internal "length" int
  let contents = field internal "contents" (array 0 char)
  let () = seal internal
end
The problem comes when wrapping the flexible array member. The constructor for the Ctypes array typ is parameterised by the size to allow for bounds-checked modification and access. This means that you can only create an OCaml type representation of arrays that you know the size of in advance.
What value should we choose? Well, this is going to inform how much memory is allocated for the OCaml values of this type, and if we convert a C value with more elements in the flexible array member than we have declared then they are truncated.
Ctypes helpfully provides an allocate function which takes a value of type typ and returns an uninitialized value of the right size and shape. It also provides us with functions to dereference pointers (perhaps returned from the C library) to these values. Unfortunately neither of these will just do the right thing any more. Working out how large these values should be at compile-time is like asking how long is a piece of string?
The only sensible workaround is to try and copy the design pattern used in C in OCaml. We have declared it to be of zero size and we will have to handle the allocation and array bounds checking by hand.
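To make "bounds checking by hand" concrete, this is roughly what the C-side pattern being copied looks like. It is a sketch only: line_get and line_set are hypothetical helpers, and struct line here is a struct with a length and a flexible contents array as in the example above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct line {
    int length;        /* number of valid elements in contents */
    char contents[];   /* flexible array member */
};

/* Read element i into *out. Returns false if i is out of range.
   Passing NULL is treated as a programming error. */
bool line_get(const struct line *l, int i, char *out)
{
    assert(l != NULL && out != NULL);
    if (i < 0 || i >= l->length)
        return false;
    *out = l->contents[i];
    return true;
}

/* Write c at index i. Returns false if i is out of range. */
bool line_set(struct line *l, int i, char c)
{
    assert(l != NULL);
    if (i < 0 || i >= l->length)
        return false;
    l->contents[i] = c;
    return true;
}
```

The compiler knows nothing about how many elements follow the struct, so the length field is the only source of truth; the OCaml wrapper has to take on the same responsibility.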
Handling values returned from C
We can no longer rely on the native Ctypes support to convert this type for us, since the array would always have length zero (despite being backed by a larger array allocated by the C code). Now we must cast our array to and from a pointer to have it appear like an array of the correct size:
module Line = struct
  ...
  type t = {
    length : int;
    contents : char list;
  }
  let of_internal_ptr (p : internal ptr) : t =
    (* The real element count lives in the struct, not in the array typ. *)
    let arr_len = getf !@p length in
    let contents_list =
      let arr_start = getf !@p contents |> CArray.start in
      CArray.from_ptr arr_start arr_len |> CArray.to_list in
    { length = arr_len; contents = contents_list; }
end
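For context, the C side that such a pointer might come from could look like the sketch below. line_from_string is a hypothetical library function, not part of any binding discussed here; it only illustrates a struct whose real array length is known solely through its length field.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

struct line {
    int length;
    char contents[];   /* not NUL-terminated; length gives the byte count */
};

/* Hypothetical library function: copy s into a freshly allocated line.
   Returns NULL if s is too long or allocation fails; release with free(). */
struct line *line_from_string(const char *s)
{
    assert(s != NULL);
    size_t n = strlen(s);
    if (n > INT_MAX)          /* length is stored in an int */
        return NULL;

    struct line *l = malloc(sizeof *l + n);
    if (l == NULL)
        return NULL;

    l->length = (int)n;
    memcpy(l->contents, s, n);
    return l;
}
```

Memory obtained this way is not managed by Ctypes. Since of_internal_ptr copies the contents into an OCaml list, the C buffer could be released by the C library once the conversion is done.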
Creating values to pass to C
Normally we would simply call allocate to create a pointer to a fresh value of our given type and then set each of the fields before passing this value to a C function. However, we need to now make sure we create a value of the correct size.
A slight quirk is that you cannot use the allocate function to allocate arbitrary memory but Ctypes does provide an allocate_n variant which allows the use of an “abstract” typ:
val allocate_n : ?finalise:('a ptr -> unit) -> 'a typ -> count:int -> 'a ptr
(** allocate_n t ~count:n allocates a fresh array with element type t and
length n, and returns its address. The argument ?finalise, if present, will be
called just before the memory is freed. The array will be automatically freed
after no references to the pointer remain within the calling OCaml program. The
memory is allocated with libc's calloc and is guaranteed to be zero-filled. *)
So if we have a value of our OCaml type we can construct a pointer which we can use with our C bindings by coercing from one of these abstract values via the void type and then using the unchecked setters for the array field:
module Line = struct
  ...
  let to_internal_ptr t =
    (* Fixed part of the struct plus room for the array elements. *)
    let size = sizeof internal + t.length * sizeof char in
    let s =
      allocate_n (abstract ~name:"" ~size ~alignment:1) ~count:1
      |> to_voidp |> from_voidp internal |> (!@) in
    setf s length t.length;
    (* The declared array has length 0, so use the unchecked setter. *)
    let arr = getf s contents in
    List.iteri (CArray.unsafe_set arr) t.contents;
    addr s
end
Using a view to make conversion implicit
Finally, we can also create a view which creates a new Ctypes typ from another with some implicit conversion rules. This allows us to use this new value when binding C functions.
Suppose we had a C function that operated on a struct line, for example:
int length(struct line *l);
Then we could add a binding for this function by declaring a view using our to_internal_ptr and of_internal_ptr functions. This allows us to call this function with values of our OCaml type, Line.t:
module Line = struct
  ...
  let t = view ~read:of_internal_ptr ~write:to_internal_ptr (ptr internal)
end
let length = Foreign.foreign "length" Ctypes.(Line.t @-> returning int)
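The post does not show the C function itself, so the following is only an assumed implementation that a binding like this could be linked against. Returning -1 for a NULL pointer is a choice made for this sketch, not something the original specifies.

```c
#include <assert.h>
#include <stddef.h>

struct line {
    int length;
    char contents[];   /* flexible array member */
};

/* Assumed implementation: report the stored element count,
   or -1 if no struct was supplied. */
int length(struct line *l)
{
    if (l == NULL)
        return -1;
    return l->length;
}
```

With the view in place, each call from OCaml would build a fresh struct through to_internal_ptr, hand its address to this function, and get the int result back.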
Conclusion
I’ve spent a bit of time using Ctypes now and, even though I hadn’t been exposed to too much of the pain before it, I’m not sure I’d want to be without it. You do have to jump through some hoops, but I don’t resent them because I feel they add value. The extra effort of the required boilerplate is worth it for the confidence it brings. Sure, if you stray from straightforward C patterns to more idiomatic use (like this example) then you have to stray a bit from the ideal model in Ctypes too, but hey, the rules were made to be broken! Also, Ctypes is very actively maintained and the small usability niggle addressed in this post has been noted.