HRAM0 | Model and Toolchain for Research in Memory Safety

Classification of Programs

Last Updated:

Families of Programs

1. An input space is a set $\mathcal{X} \subseteq \mathbb{Z}^\ast$.
2. An output space $\mathcal{Y}$ is a tuple $(p_\mathcal{Y}, Y, f_\mathcal{Y})$ where $p_\mathcal{Y} : \mathcal{X} \times \mathcal{P}(\mathbb{Z} \times \mathbb{Z}) \to \{0, 1\}$ is a characteristic predicate such that the output space concretely corresponds to the subset $\{M \in \mathcal{P}(\mathbb{Z} \times \mathbb{Z}) : \exists x \in \mathcal{X} \; p_\mathcal{Y}(x, M) = 1\} \subseteq \mathbb{Z} \times \mathbb{Z})$. The set $Y$ is a non-empty set of canonical representations containing a null element $\mathfrak{0}$, and $f_\mathcal{Y} : \mathcal{P}(\mathbb{Z} \times \mathbb{Z}) \to Y$ is a filter function satisfying $p_\mathcal{Y}(x, M) = 0 \iff f(M) = \mathfrak{0} \; \; \forall x \in \mathcal{X}, M \in \mathcal{P}(\mathbb{Z} \times \mathbb{Z})$.
3. A family of programs $\mathcal{F}$ is a subset of admissible programs $\mathcal{F} \subseteq \mathbb{P}$ equipped with a pair $(\mathcal{X}, \mathcal{Y})$, an input space and output space respectively, such that for all programs $P \in \mathcal{F}$, it holds that for all inputs $x \in \mathcal{X}$ in the input space, $P$ is safely terminating i.e. $\mathsf{exec}(P, x) \notin \{\infty, \bot\}$ and $P$ produces the expected output for input $x$ i.e. $p_\mathcal{Y}(x, \mathsf{exec}(P, x)) = 1$.

Equivalence between two programs $P_1, P_2 \in \mathcal{F}$, denoted $P_1 \equiv P_2$ is understood to mean $P_1 \equiv_{f_\mathcal{Y}} P_2$ where the input space in the definition of $\equiv_{f_\mathcal{Y}}$ is restricted to $\mathcal{X}$.

4. An indexed family of programs $\mathcal{F} = \{\mathcal{F}^{(\kappa)} := (\mathcal{X}^{(\kappa)}, \mathcal{Y}^{(\kappa)})\}_{\kappa \in \mathbb{N}}$ is a collection of subfamilies such that the problem size in each subfamily is $\kappa$ i.e. it holds that $\mathcal{X}^{(\kappa)} \subseteq \mathbb{Z}^{n(\kappa)}$ where $n(\kappa)$ is an integer function of $\kappa$ and captures the fact that the actual input size may be larger than the problem size but nevertheless linked to it.

Safely-Terminating Programs

Consider the set $\mathbb{T}$ of terminating programs; that is we define $\mathbb{T} := \{P \in \mathbb{P} : \mathsf{exec}(P, \mathbf{x}) \neq \infty \; \forall \mathbf{x} \in \mathbb{Z}^\ast\}$. We can partition $\mathbb{T}$ into two classes; the class of safe programs $\mathbb{S} \subset \mathbb{T}$ defined as $\mathbb{S} := \{P \in \mathbb{T} : \mathsf{exec}(P, \mathbf{x}) \neq \bot \; \forall \mathbf{x} \in \mathbb{Z}^\ast\}$; and the class of unsafe programs $\bar{\mathbb{S}} := \mathbb{T} \setminus \mathbb{S}$. Note that it is sufficient for a program to be deemed unsafe, and hence a member of $\bar{\mathbb{S}}$, should it enter the error state for only a single input. An immediate consequence of a strengthening transformation $T : \mathbb{P} \to \mathbb{P}$ is that $T(P) \in \mathbb{S} \; \forall P \in \mathbb{T}$.

Naturally, only subsets of $\mathbb{T}$ that are defined with respect to finite input spaces are recognizable (recursively enumerable). For example, for any positive integers $\kappa$ and $m$, the subset $\mathbb{T}_m^{(\kappa)} := \{P \in \mathbb{P} : \mathsf{exec}(P, \mathbf{x}) \neq \infty \; \forall \mathbf{x} \in \mathbb{Z}_m^n\} \subset \mathbb{T}$ is recognizable. Natural analogues of $\mathbb{T}_m^{(\kappa)}$ can be defined for the class of safe programs ($\mathbb{S}_m^{(\kappa)}$) and the class of unsafe programs ($\bar{\mathbb{S}}_m^{(\kappa)}$).

Example Family of Programs: Sorting

To help the reader, it is instructive to use a running example that we can make reference to throughout these pages. Furthermore, such a running example affords the opportunity to make easy comparison between different approaches. The example we use is familiar: sorting. In particular, we describe the family of sorting programs; first, from a general perspective, and then with consideration of a special case i.e. a concrete program that implements a specific sorting algorithm.

Consider $\mathcal{F}_{\mathsf{sort}} := \{\mathcal{F}_{\mathsf{sort}}^{(\kappa)}\}_{\kappa \in \mathbb{N}}$ where $\mathcal{F}_{\mathsf{sort}}^{(\kappa)} := (\mathcal{X}_{\mathsf{sort}}^{(\kappa)} := \mathbb{Z}^{n(\kappa)}, \mathcal{Y}_{\mathsf{sort}}^{(\kappa)} := (p_{\mathsf{sort}}, Y_{\mathsf{sort}}, f_{\mathsf{sort}}))$. The input space for each input size $\kappa$ is the set of integer arrays of length $\kappa$. Therefore, we have $n(\kappa) = \kappa$. The expected output is also an array of length $\kappa$. We can further restrict the output arrays to be sorted, but we will not do this at the moment since we wish to consider more than merely correct programs. We define $Y_\mathsf{sort} := \mathbb{Z}^{n(\kappa)}$, the set of integer arrays of length $\kappa$. The expected form of the output is to take the first $\kappa$ integers of memory, starting from address 0, as the output array. Therefore, the predicate $p_{\mathsf{sort}}(\mathbf{x}, M) := \bigwedge_{i = 0}^{\kappa - 1} M(i)$ checks that all $\kappa$ memory locations between 0 and $\kappa - 1$ (inclusive) are defined. Furthermore the filter function $f_{\mathsf{sort}}$ is defined thus: $$f_{\mathsf{sort}}(M) = \begin{cases} (M(0), \ldots, M(\kappa - 1)) & \text{ if } \exists \mathbf{x} \in \mathcal{X}_{\mathsf{sort}}^{(\kappa)} \; p_{\mathsf{sort}}(\mathbf{x}, M) = 1 \\ \mathfrak{0} & \text{ otherwise} \end{cases}$$ A subfamily of $\mathcal{F}_{\mathsf{sort}}$ is $\mathcal{F}_{\mathsf{sort}}^{\mathsf{legal}}$ which is defined as the collection of programs that output a permutation of the input array. It suffices to define ${p}_{\mathsf{sort}}^{\mathsf{legal}}(\mathbf{x}, M) := p_{\mathsf{sort}}(\mathbf{x}, M) \; \land \; \mathsf{perm}(\mathbf{x}, [M(0), \ldots, M(\kappa - 1)])$ where $\mathsf{perm}$ is a predicate that asserts whether the sequence of length $\kappa$ in the second argument is a permutation of the sequence of length $\kappa$ in the first argument. Note that we are implicitly fixing some $\kappa$ in our treatment here to simplify notation. We can of course subdivide further and consider the subfamily of $\mathcal{F}_{\mathsf{sort}}^{\mathsf{correct}} \subset \mathcal{F}_{\mathsf{sort}}^{\mathsf{legal}}$ corresponding to the collection of correct programs i.e. those that output sorted arrays. In this case, we define ${p}_{\mathsf{sort}}^{\mathsf{correct}}(\mathbf{x}, M) := {p}_{\mathsf{sort}}^{\mathsf{legal}}(\mathbf{x}, M) \; \land \; M(0) \leq M(1) \leq \cdots \leq M(\kappa - 1)$.

Now we present a member of this program family, namely selection sort. An implementation of this sorting algorithm is as follows. Note that this code is in the form of assembly used by our assembler.

BEGIN CODE

###############################################################################
# This program does some setup (initializes r2 := -1 for example) and calls
# the selsort function (below) to sort the input array, then halts.
###############################################################################
main:
    put -1, r2 # setup r2 (this is one of our common conventions)
    put &x, r4 # copy address of input to r4 in order to sort input
    call selsort
    hlt

###############################################################################
# Selection Sort (function dirties r5-r13 (i.e.not saved/restored from stack))
#        
# arguments: r4 - address of array of integers to sort of length n (reg n)
# sorts array pointed to by r4 in-place
###############################################################################
selsort:
        put 0, r5 # r5 is outer index
        add r2, n, r12 # r12 = n - 1
        sub r12, r5, r9
        brn r9, outerloop
        brn r2, end_outerloop

outerloop:
        add r5, r4, r6
        lod r6, r7 # r7 is min value
        put 0, r8
        add r8, r7, r13
        add r8, r5, r8
        sub r2, r5, r9 # r9 = r5 + 1 = inner index

        sub n, r9, r10
        brn r10, innerloop
        brn r2, end_innerloop

innerloop:
        add r9, r4, r10
        lod r10, r10
        sub r7, r10, r11
        brn r11, newmin
        brn r2, nextiter
newmin:
        put 0, r11
        add r11, r9, r8
        add r11, r10, r7
nextiter:
        sub r2, r9, r9
        sub n, r9, r10
        brn r10, innerloop

end_innerloop:
        add r8, r4, r10
        sto r13, r10
        sto r7, r6

        sub r2, r5, r5
        sub r12, r5, r9
        brn r9, outerloop

end_outerloop:
        ret
END CODE

Example Family of Programs: Square Matrix Multiplication

Consider $\mathcal{F}_{\mathsf{matmul}} := \{\mathcal{F}_{\mathsf{matmul}}^{(\kappa)}\}_{\kappa \in \mathbb{N}}$ where $\mathcal{F}_{\mathsf{matmul}}^{(\kappa)} := (\mathcal{X}_{\mathsf{matmul}}^{(\kappa)} := \mathbb{Z}^{n(\kappa)}, \mathcal{Y}_{\mathsf{matmul}}^{(\kappa)} := (p_{\mathsf{matmul}}, Y_{\mathsf{matmul}}, f_{\mathsf{matmul}}))$. The input space for each input size $\kappa$ is a pair of integer matrices whose rows and columns are of dimension $\kappa$. Therefore, we have $n(\kappa) = 2.\kappa^2$. This particular example illustrates that the size of the input $n = n(\kappa)$ may be larger than the "problem size" $\kappa$ albeit the former grows in accordance with the latter; in this case, in direct proportion with its square.

The choice of mapping from $\mathbb{Z}^{\kappa \times \kappa} \times \mathbb{Z}^{\kappa \times \kappa}$ to $\mathbb{Z}^{2\kappa^2}$ is arbitrary. Here we choose row-major order. An element of the input space is a multiplicand matrix in row-major order followed by the multiplier matrix in row-major order. An element of the set of cannonical representations $Y_\mathsf{matmul}$ is a $\kappa \times \kappa$ integer matrix. The predicate $p_{\mathsf{matmul}}$ is defined as $p_{\mathsf{matmul}}(\mathbf{x}, M) := \bigwedge_{i = 0}^{\kappa^2 - 1} M(i)$ checks that all $\kappa^2$ memory locations between 0 and $\kappa^2 - 1$ (inclusive) are defined. Furthermore the filter function $f_{\mathsf{matmul}}$ is defined thus: $$f_{\mathsf{matmul}}(M) = \begin{cases} \begin{bmatrix} M(0) & M(1) & \cdots & M(\kappa - 1)) \\ \vdots & \vdots & \vdots & \vdots \\ M((\kappa - 1)\kappa) & M((\kappa - 1)\kappa + 1) & \cdots & M((\kappa - 1)\kappa + \kappa - 1) \end{bmatrix} & \text{ if } \exists \mathbf{x} \in \mathcal{X}_{\mathsf{matmul}}^{(\kappa)} \; p_{\mathsf{matmul}}(\mathbf{x}, M) = 1 \\ \mathfrak{0} & \text{ otherwise} \end{cases}$$

We define $\mathcal{F}_{\mathsf{matmul}}^{\mathsf{legal}} := \mathcal{F}_{\mathsf{matmul}}$ which is defined as the collection of programs that output a square matrix iff the input is a pair of square matrices.

We can of course subdivide further and consider the subfamily of $\mathcal{F}_{\mathsf{matmul}}^{\mathsf{correct}} \subset \mathcal{F}_{\mathsf{matmul}}^{\mathsf{legal}}$ corresponding to the collection of correct programs i.e. those programs that output a matrix that is the product of the pair of input matrices. In this case, we define ${p}_{\mathsf{matmul}}^{\mathsf{correct}}(\mathbf{x}, M) := {p}_{\mathsf{matmul}}^{\mathsf{legal}}(\mathbf{x}, M) \; \land \; {f}_{\mathsf{matmul}}(M) = \mathbf{A}\mathbf{B}$ where $\mathbf{x} = (\mathbf{A}, \mathbf{B})$.