5. The CAISAR modelling language

5.1. Origin: WhyML

CAISAR heavily relies on Why3 and uses the WhyML language as a basis for its own interpretation language. A reference of WhyML is available on the original Why3 manual.

However, since Why3 aims to verify a whole range of programs, it cannot specialize on a particular program structure. For further background, the paper [F2013] from one of Why3’s original author details the rationale and tradeoff involved in the WhyML design and implementation. To quote the author, “[the Why3 team] has come up with logic of compromise”.

As CAISAR focuses on artificial intelligence systems, it can make some additional assumptions on the nature of the inputs, program and users:

  • users will have a basic background on linear algebra, and expect CAISAR to reflect this

  • inputs will mostly be multidimensional vectors (machine learning community coined the term “tensor” as well) of floating point values, strings, ints or chars

  • the program control flow will mostly be composed of a lot of real or floating-point arithmetic operations: there is no loop with runtime invariant, nor conditional

With those constraints in mind, CAISAR provides several extensions to WhyML, that we detail here. They can be used directly in any WhyML file provided to CAISAR.

Some of those extensions will be “interpreted”. During the translation from “pure” WhyML terms to actual inputs to provers, symbols will be replaced with other symbols, or directly computed by CAISAR.

5.2. Built-ins

The built-ins are available under stdlib/interpretation.mlw. To access the symbols they define, the corresponding theory needs to be imported in the scope of the current theory. For instance, to import the symbols defined by the theory Vector, prepend use caisar.Vector at the top of the file.

5.2.1. Vector Types

type vector 'a
type index = int Functions

(** Returns the i-th element of v. **)
function ([]) (v: vector 'a) (i: index) : 'a

function length (v: vector 'a) : int

(** Returns a vector with the i-th element equals to v1[i] - v2[i]. **)
function (-) (v1: vector 'a) (v2: vector 'a) : vector 'a

function mapi (v: vector 'a) (f: index -> 'a -> 'b) : vector 'b
function map (v: vector 'a) (f: 'a -> 'b) : vector 'b
function map2 (v1: vector 'a) (v2: vector 'b) (f: 'a -> 'b -> 'c) :
  vector 'c
function foreach (v: vector 'a) (f: 'a -> 'b) : vector 'b =
  map v f
function foreach2
  (v1: vector 'a) (v2: vector 'b) (f: 'a -> 'b -> 'c) : vector 'c =
    map2 v1 v2 f Predicates

predicate forall_ (v: vector 'a) (f: 'a -> bool) =
  forall i: index. valid_index v i -> f v[i]
predicate forall2
(v1: vector 'a) (v2: vector 'b) (f: 'a -> 'b -> bool) =
  length(v1) = length(v2) ->
    forall i: index. valid_index v1 i -> f v1[i] v2[i]

5.2.2. NeuralNetwork Types

(** Type describing a neural network.
CAISAR is able to interpret a symbol of type nn
to get its control flow graph. **)
type nn
type format = ONNX | NNet Functions

(** Returns a symbol of type nn from the neural network located at n. **)
function read_neural_network (n: string) (f: format) : nn

(** Returns a vector that represents the application of neural network nn
on input vector v.**)
function (@@) (n: nn) (v: vector 'a) : vector 'a

5.2.3. Dataset Types

(** Dataset type. dataset 'a 'b is a dataset with
  inputs of type 'a vector and output of type 'b vector. **)
type dataset 'a 'b = vector ('a, 'b)
type format = CSV Functions

(** Returns a symbol of type nn from the neural network located at n. **)
function read_dataset (f: string) (k: format) : dataset 'a 'b
function foreach
  (d: dataset 'a 'b) (f: 'a -> 'b -> 'c) : vector 'c =
    Vector.foreach d (fun e -> let a, b = e in f a b) Predicates

predicate forall_
  (d: dataset 'a 'b) (f: 'a -> 'b -> bool) =
    Vector.forall_ d (fun e -> let a, b = e in f a b)

Jean-Christophe Filiâtre, One Logic to Use Them All, CADE 24 - the 24th International Conference on Automated Deduction