One of the annoying things about OCaml arrays is that every cell has to be initialized when the array is first created.
I’m trying to implement an array in OCaml that distinguishes size
(the current number of elements in the array) from capacity
(the maximum size that the array can grow to).
At any given point in the array’s lifetime, the “inhabited cells” must form a prefix of the array itself. In other words, you can’t do something like:
1 2 UNINITIALIZED 3 4 UNINITIALIZED
The array also clears cell immediately once they’ve been popped so the application doesn’t leak memory / hold references to stuff that “used to be in the array” but isn’t anymore.
The array uses the last element as the size. The reason for this is three-fold: 1) creating an array only requires one call to Array.make
2) you don’t need to do any arithmetic on the indices to access the elements in the underlying array and 3) it only requires chasing one pointer, not two, to get to the elements.
This code uses Obj.magic
, but I try to confine it to places where it can be easily audited.
I’m not using Sys.opaque_identity
from the Sys Module anywhere because I’m not actually using uninitialized memory anywhere … I’m just using Obj.magic
to subvert the type system and let me store integers in strategic places. I am curious what work is needed to flambda
-proof this code though.
Anywhere, here’s the interface file uninit.mli
:
type 'a t val make : int -> 'a t val size : 'a t -> int val cap : 'a t -> int val get_exn : 'a t -> int -> 'a val get_opt : 'a t -> int -> 'a option val set_exn : 'a t -> int -> 'a -> unit val append_exn : 'a t -> 'a -> unit val pop_exn : 'a t -> 'a
And the source file uninit.ml
:
type 'a t = 'a array (* set every value to the integer 0, regardless of the * type of the array. * note that the last zero is special, it is the current size * of the array. *) let make : int -> 'a t = fun cap -> Array.make (cap + 1) (Obj.magic 0) let cap : 'a t -> int = fun arr -> (-1) + Array.length arr (* get the index of the size. This is the same as the capacity *) let size_idx : 'a t -> int = cap let size : 'a t -> int = fun arr -> (Obj.magic arr.(size_idx arr)) (* NOTE: incr_size is not checked and not safe *) let incr_size : 'a t -> unit = fun arr -> let new_size : int = 1 + Obj.magic arr.(size_idx arr) in arr.(size_idx arr) <- (Obj.magic new_size) (* NOTE: decr_size is not checked and not safe *) let decr_size : 'a t -> unit = fun arr -> let new_size : int = -1 + Obj.magic arr.(size_idx arr) in arr.(size_idx arr) <- (Obj.magic new_size) let get_exn : 'a t -> int -> 'a = fun arr idx -> if idx < size arr then arr.(idx) else raise (Invalid_argument "out of bounds") let get_opt : 'a t -> int -> 'a option = fun arr idx -> if idx < size arr then Some arr.(idx) else None let set_exn : 'a t -> int -> 'a -> unit = fun arr idx v -> if idx < size arr then arr.(idx) <- v else raise (Invalid_argument "out of bounds") let append_exn : 'a t -> 'a -> unit = fun arr v -> let s = size arr in let c = cap arr in if s = c then failwith "cannot extend array! size = capacity" else (* put the new item one after the last item, increment * the size by 1 *) (arr.(s) <- v; incr_size arr) let pop_exn : 'a t -> 'a = fun arr -> let s = size arr in if size arr = 0 then failwith "cannot pop from empty array" else let item = arr.(-1 + s) in (* shrink the array, re-zero the discarded element *) (decr_size arr; arr.(-1 + s) <- (Obj.magic 0); item)