5. The CAISAR modelling language¶

5.1. Origin: WhyML¶

CAISAR heavily relies on Why3 and uses the WhyML language as a basis for its own interpretation language. A reference of WhyML is available on the original Why3 manual.

However, since Why3 aims to verify a whole range of programs, it cannot specialize on a particular program structure. For further background, the paper [F2013] from one of Why3’s original author details the rationale and tradeoff involved in the WhyML design and implementation. To quote the author, “[the Why3 team] has come up with logic of compromise”.

As CAISAR focuses on artificial intelligence systems, it can make some additional assumptions on the nature of the inputs, program and users:

users will have a basic background on linear algebra, and expect CAISAR to reflect this
inputs will mostly be multidimensional vectors (machine learning community coined the term “tensor” as well) of floating point values, strings, ints or chars
the program control flow will mostly be composed of a lot of real or floating-point arithmetic operations: there is no loop with runtime invariant, nor conditional

With those constraints in mind, CAISAR provides several extensions to WhyML, that we detail here. They can be used directly in any WhyML file provided to CAISAR.

Some of those extensions will be “interpreted”. During the translation from “pure” WhyML terms to actual inputs to provers, symbols will be replaced with other symbols, or directly computed by CAISAR.

5.2. Built-ins¶

The built-ins are available under stdlib/caisar/. To access the symbols they define, the corresponding theory needs to be imported in the scope of the current one. For instance, to import the symbols defined by the theory Vector, prepend use caisar.types.Vector at the top of the file.

5.2.1. Vector¶

5.2.1.1. Types¶

type vector 'a
type index = int

5.2.1.2. Functions¶

(** Returns the i-th element of v. **)
function ([]) (v: vector 'a) (i: index) : 'a

function length (v: vector 'a) : int

(** Returns a vector with the i-th element equals to v1[i] - v2[i]. **)
function (-) (v1: vector 'a) (v2: vector 'a) : vector 'a

function mapi (v: vector 'a) (f: index -> 'a -> 'b) : vector 'b
function map (v: vector 'a) (f: 'a -> 'b) : vector 'b
function map2 (v1: vector 'a) (v2: vector 'b) (f: 'a -> 'b -> 'c) :
  vector 'c
function foreach (v: vector 'a) (f: 'a -> 'b) : vector 'b =
  map v f
function foreach2
  (v1: vector 'a) (v2: vector 'b) (f: 'a -> 'b -> 'c) : vector 'c =
    map2 v1 v2 f

5.2.1.3. Predicates¶

predicate forall_ (v: vector 'a) (f: 'a -> bool) =
  forall i: index. valid_index v i -> f v[i]
predicate forall2
(v1: vector 'a) (v2: vector 'b) (f: 'a -> 'b -> bool) =
  length(v1) = length(v2) ->
    forall i: index. valid_index v1 i -> f v1[i] v2[i]

5.2.2. Model¶

5.2.2.1. Types¶

(** Type of a machine learning model. CAISAR is able to interpret a symbol of
type model to get its control flow graph. **)
type model

5.2.2.2. Functions¶

(** Returns a symbol for the machine learning model with filename m. **)
function read_model (m: string) : model

(** Returns a vector that represents the application of a model
on an input vector. **)
function (@@) (m: model) (v: vector 'a) : vector 'a

5.2.3. Dataset¶

5.2.3.1. Types¶

(** Dataset type. dataset 'a 'b is a dataset with
  inputs of type 'a vector and output of type 'b vector. **)
type a
type b
type dataset = vector (a, b)

5.2.3.2. Functions¶

(** Returns a symbol for the dataset with filename f. **)
function read_dataset (f: string) : dataset
function foreach (d: dataset) (f: a -> b -> 'c) : vector 'c =
    Vector.foreach d (fun e -> let a, b = e in f a b)

5.2.3.3. Predicates¶

predicate forall_ (d: dataset) (f: a -> b -> bool) =
    Vector.forall_ d (fun e -> let a, b = e in f a b)

5.2.4. The Neural Intermediate Representation (NIR)¶

TODO: write a section on the NIR.

NIR is CAISAR’s internal representation of ML models and properties. NIR structures are similar to ONNX ones (or Neural Networks);

[F2013]

Jean-Christophe Filiâtre, One Logic to Use Them All, CADE 24 - the 24th International Conference on Automated Deduction