ICFP 2021
Sun 22 - Sat 28 August 2021
Sat 28 Aug 2021 02:20 - 02:40 at OCaml - Session 5 Chair(s): Ashish Agarwal

Safe Protocol Updates via Propositional Logic

Introduction

If values of a given type are stored on disk, or are sent between different executables, then changing that type or its serialization can result in versioning issues.

Often such issues are resolved by either making the deserializer more permissive or the serializer more generous and then reasoning ``manually'' about the situation.

For example, if a record type is stored on disk, one might reason that making the deserializer more permissive by adding a new field with a default value is safe, as old serializations will still be readable.

However, as type transformations become more complex, it can quickly become too difficult to make and reason about such changes. There are a few sources of difficulty:

  1. Libraries for generating serializations from a type may not offer flexible ways for making permissive deserializers and generous serializers.
  2. Given a pair of a (possibly permissive) deserializer and a (possibly generous) serializer, determining whether or not they’re compatible can be tricky.
  3. Since non-trivial systems may have multiple executables with multiple channels of communication between them, there must be some method for determining which serializer/deserializer pairs must be checked
  4. Once you know which serializer/deserializer pairs should be checked, you must also take into account the order of deployment of the various executables.

The Trace_types library attempts to resolve these problems; I’ll outline how below.

What Trace_types does

Generating flexible serializations

At the heart of Trace_types is a combinator library for building up deserializers and serializers. For example, here’s part of the interface of Trace_types.Deserializer:

module Deserializer : sig
   type 'a t
  
  val atom : (module Binable_and_sexpable with type t = 'a) -> 'a t
  val return : 'a -> 'a t
  val map : 'a t -> f:('a -> 'b) -> 'b t
  val both : 'a t -> 'b t -> ('a * 'b) t
  val either : 'a t -> 'b t -> [`First of 'a | `Second of 'b] t
  val nest : 'a t -> key:string -> 'a t
end	

Here, atom takes a preexisting serialization for a type that Trace_types will treat as a black box, and the rest of the functions build up a 'a Deserializer from that.

Trace_types has a ppx that generates values of type t Deserializer.t and t Serializer.TG from a typet`’s definition. For example, if you write:

type t = 
  | Foo of int
  | Bar of string
[@@deriving trace_types]	

then the ppx will generate a deserializer that’s essentially:

let deserializer = 
  let open Deserializer in
  either
    (nest (atom (module Int))    ~key:"Foo")
    (nest (atom (module String)) ~key:"Bar")
  |> map ~f:(function
    | `First x -> Foo x
    | `Second x -> Bar x)	

(For records the ppx will use both instead of either.)

By building up deserializers in this way, Trace_types can support things you would expect it to, e.g., arbitrarily nested types (having types using Trace_types be fields or constructor arguments to other types using Trace_types, etc.) and parameterized types.

Trace_types’s ppx also allows you to add custom serializers and deserializers to pieces of a type definition, and the combinators turn out to be expressive enough that you can write the equivalent of things like [@sexp.option] directly; unlike with the Sexp library, there’s no need for additional ppx support.

Determining compatibility of serializer/deserializer pairs

A crucial aspect of Trace_types’s serializers and deserializers is that you can extract from them both runtime functions to actually perform serialization and deserialization as well as an artifact that is both itself serializable and is amenable to static analysis. This artifact is what allows \trty\ to statically determine whether or not a given serializer/deserializer pair is compatible.

At its heart, this is done by turning each serializer $s$ or deserializer $d$ into boolean propositions $P_s$ and $P_d$ and then using Z3’s SAT solver to check that the implication $P_s\rightarrow P_d$ is a tautology. Variant types turn into disjunctions while record types turn into conjunctions. For example, the variant type above turns into $\textrm{Foo}\vee\textrm{Bar}$.

While that example is simple, as types nest and different customizations get added, the generated boolean propositions can become complicated.

Finding the right serializer/deserializer pairs to check

Although I’ll go over it less in the talk, there must also be machinery to determine the right pairs of serializers and deserializers to check.

Trace_types does this by organizing serializers and deserializers along channels, which are just strings:

module Channel : sig
  type t
  
  val to_string : t -> string
  val of_string : string -> t 
end

and then requiring you to register what ``channel'' a given serializer or deserializer is for with a central registry in order to get the associated runtime function.

The meaning of a channel is that Trace_types will check compatibility of serializers and deserializers registered to the same channel. Because a channel is just a stringable synchronization point, it can be whatever you like: the file at this location...'' orthe binary rpc with the following name…'', etc.

Finally, because the deployment order of executables matters for the correctness of type upgrades, Trace_types actually validates a given executable against a collection of already-deployed executables. It does this both at compile-time and at deploy-time.

Uptake and Future Directions

What kind of uptake has Trace_types had at Jane Street? A year after its release, it is used in 5 production projects. That’s ok, but Jane Street has hundreds of ``production projects'', so Trace_types is definitely still a niche library.

For the main project I’m involved in, Trace_types is used pervasively and has enabled workflows that were not possible before. A dev on another project that uses it have called it a ``big win'' for the increase in flexibility that it gives.

On the other hand, other devs have looked at Trace_types and decided it wasn’t right for their projects.

In this talk, I’ll talk about some of the downsides that can prevent new users from picking Trace_types up (and thus also future directions for improvement).

Some downsides are prosaic (e.g., it would be nice for Trace_types to work seamlessly with more bits of Jane Street’s infrastructure) while others are more fundamental (e.g., how can the complexity of Trace_types be reduced for new users?).

Sat 28 Aug

Displayed time zone: Seoul change

01:30 - 03:00
Session 5OCaml at OCaml
Chair(s): Ashish Agarwal
01:30
30m
Talk
From 2n+1 to n
OCaml
Nandor Licker University of Cambridge, Timothy M. Jones University of Cambridge, UK
File Attached
02:00
20m
Talk
Property-Based Testing for OCaml through Coq
OCaml
Paaras Bhandari University of Maryland, College Park, Leonidas Lampropoulos University of Maryland, College Park
02:20
20m
Talk
Safe Protocol Updates via Propositional Logic
OCaml
Michael O'Connor Jane Street