Ordering

2022-07-13

Say we are given

\begin{aligned} a &\le m\\ b &\le m\\ x &\ge m\\ y &\ge m \end{aligned}

and want to know: is $x \ge a$? How about $x \ge y$?

A human can of course look at these inequalities and answer "yes" and "not provable from the given information", but our goal is to develop an algorithm that can do this for arbitrary sets of inequalities and arbitrary queries. Our compiler will need this for some checks.

So, how can we do this? For starters, manipulating equations tends to be easiest if we place them into matrices. Second, equalities are much easier to work with than inequalities. Thus, we introduce slack variables $s_{*}\ge 0$ to turn each of these inequalities into an equality. Now, we have

\begin{aligned} \begin{bmatrix} -1 & 0 & 1 & 0 & 0 & -1 & 0 & 0 & 0\\ 0 & -1 & 1 & 0 & 0 & 0 & -1 & 0 & 0\\ 0 & 0 & -1 & 1 & 0 & 0 & 0 & -1 & 0\\ 0 & 0 & -1 & 0 & 1 & 0 & 0 & 0 & -1\\ \end{bmatrix} \begin{bmatrix} a\\b\\m\\x\\y\\s_0\\s_1\\s_2\\s_3 \end{bmatrix} &= \begin{bmatrix}0\\0\\0\\0 \end{bmatrix} \end{aligned}
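To make this concrete, here is a minimal sketch (in Python with NumPy, purely for illustration; it is not the compiler's implementation) of the constraint matrix above, using the column order $a, b, m, x, y, s_0, s_1, s_2, s_3$:

```python
import numpy as np

# Each row is one inequality rewritten as an equality with a slack
# variable s_* >= 0; e.g. a <= m becomes -a + m - s0 = 0.
# Column order: a, b, m, x, y, s0, s1, s2, s3.
C = np.array([
    [-1,  0,  1,  0,  0, -1,  0,  0,  0],  # a <= m
    [ 0, -1,  1,  0,  0,  0, -1,  0,  0],  # b <= m
    [ 0,  0, -1,  1,  0,  0,  0, -1,  0],  # x >= m
    [ 0,  0, -1,  0,  1,  0,  0,  0, -1],  # y >= m
])
```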

Now, say we're given an arbitrary query with respect to our 5 variables (and possibly constant offsets). E.g., we may want to know whether $a \le x$ or $b + 1 \le y$ are known to be true.

Let us use $a \le x$, or equivalently, $x - a \ge 0$, as an example. We can represent this as the column vector $\begin{bmatrix}-1& 0&0&1&0\end{bmatrix}^T$ (over the variables $a, b, m, x, y$).

So, to be able to answer queries with respect to $x$ and $a$, the first thing we'd need to be able to do is isolate them in an expression. That means we'll need to take some linear combination of the constraints to get the vector $\textbf{q}=\begin{bmatrix}-1& 0&0&1&0\end{bmatrix}^T$.

But first, we should put our constraints into Hermite Normal Form and then drop all the zeroed constraints, or employ some other similar technique, in order to make sure that all remaining constraints are linearly independent. Guaranteeing they're linearly independent may simplify work later on. In the example above, we already have full row rank (4), so we'll skip this step in this blog post.
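For the example, we can verify this claim directly (again a NumPy sketch for illustration only); in general, this is where an HNF-based reduction would drop dependent rows:

```python
import numpy as np

C = np.array([[-1, 0, 1, 0, 0, -1, 0, 0, 0],   # a <= m
              [ 0,-1, 1, 0, 0,  0,-1, 0, 0],   # b <= m
              [ 0, 0,-1, 1, 0,  0, 0,-1, 0],   # x >= m
              [ 0, 0,-1, 0, 1,  0, 0, 0,-1]])  # y >= m

# All four constraints are linearly independent: full row rank, nothing to drop.
assert np.linalg.matrix_rank(C) == 4

# The query x - a >= 0, expressed over the original variables (a, b, m, x, y).
q = np.array([-1, 0, 0, 1, 0])
```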

Now, to isolate our query, we have the problem

\begin{aligned} \begin{bmatrix} -1 & 0 & 0 & 0\\ 0 & -1 & 0 & 0\\ 1 & 1 & -1 & -1\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\\ -1 & 0 & 0 & 0\\ 0 & -1 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 0 & 0 & -1\\ \end{bmatrix} \textbf{x}&= \begin{bmatrix}\textbf{q}\\\textbf{z}\end{bmatrix} \end{aligned}

So we're trying to find a linear combination $\textbf{x}$ of constraints to isolate our query $\textbf{q}$, but then what shall $\textbf{z}$ be?

$\textbf{z}$ holds the coefficients our four slack variables receive in that combination.

Now, to answer queries of the $\ge$ variety, we want all slack variables to appear with coefficients $\le 0$. That is, if we end up with $w - s_{*} = d$, then $w \ge d$.

We can focus only on $\ge$, as the other queries of interest can be reframed as $\ge$, e.g. by adjusting constant offsets or flipping signs.

So, our current idea for proving (or failing to prove) a query is to find the linear combination of existing constraints that produces our query, and then check that every slack variable's coefficient in that combination is $\le 0$.
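For the running example this criterion can be checked by hand. Adding the first and third constraints gives

\begin{aligned} \left(-a + m - s_0\right) + \left(-m + x - s_2\right) &= 0\\ x - a &= s_0 + s_2 \ge 0, \end{aligned}

so $x \ge a$ is proven: both slack variables appear with coefficient $-1 \le 0$. By contrast, the only linear combination whose variable part equals $x - y$ is the third constraint minus the fourth,

\begin{aligned} \left(-m + x - s_2\right) - \left(-m + y - s_3\right) &= 0\\ x - y &= s_2 - s_3, \end{aligned}

and since $s_3$ enters with coefficient $+1 > 0$, the query $x \ge y$ cannot be proven from the given constraints.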

To assist us in this quest, we append identity augment columns over the slack rows and set $\textbf{z}=\textbf{0}$; the new entries of $\textbf{x}$ will then record (the negation of) the slack coefficients:

\begin{aligned} \begin{bmatrix} -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & -1 & 0 & 0 & 0 & 0 & 0 & 0\\ 1 & 1 & -1 & -1 & 0 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0\\ -1 & 0 & 0 & 0 & 1 & 0 & 0 & 0\\ 0 & -1 & 0 & 0 & 0 & 1 & 0 & 0\\ 0 & 0 & -1 & 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & -1 & 0 & 0 & 0 & 1\\ \end{bmatrix} \textbf{x}&= \begin{bmatrix}\textbf{q}\\0\\0\\0\\0\end{bmatrix} \end{aligned}

Now, the linear combination of columns $\textbf{x}$ must produce a value of $0$ in each slack row (rows $V$ through $V+E-1$). We can split this into the part contributed by the original constraint columns and the part contributed by the augment columns. Letting $V$ be the total number of variables and $E$ the total number of equations, using 0-based indexing, and referring to the above expression more succinctly as $\textbf{Ax}=\textbf{q}$ (with $\textbf{q}$ now standing for the full right-hand side):

\begin{aligned} 0 &= \left(\textbf{Ax}\right)_{e+V}\\ &= \sum_{c=0}^{E-1} A_{e+V,c} x_c + \sum_{c=0}^{E-1} A_{e+V,c+E} x_{c+E}\\ &= \sum_{c=0}^{E-1} A_{e+V,c} x_c + x_{e+E}\\ x_{e+E} &= -\sum_{c=0}^{E-1} A_{e+V,c} x_c\\ \end{aligned}

Note that $\sum_{c=0}^{E-1} A_{e+V,c} x_c$ is precisely the quantity we require to be $\le 0$ for our query to be true; it is the coefficient slack variable $s_e$ receives in the linear combination of constraints that produces our query. So, to check if this is satisfied, all we must do is check that $x_{e+E}\ge0$.

Okay, but how do we solve for $\textbf{x}$? If $\textbf{A}$ has full column rank, this is trivial: we either have a unique solution, or no solution. If we have no solution, we obviously fail. If we have a unique solution, it is trivial to check whether $x_e\ge0$ for $e = E,\ldots,2E-1$.
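In our running example, $\textbf{A}$ (the $9\times 8$ augmented matrix above) does have full column rank, so this case suffices. A minimal NumPy sketch of the whole check, purely for illustration:

```python
import numpy as np

C = np.array([[-1, 0, 1, 0, 0, -1, 0, 0, 0],   # a <= m
              [ 0,-1, 1, 0, 0,  0,-1, 0, 0],   # b <= m
              [ 0, 0,-1, 1, 0,  0, 0,-1, 0],   # x >= m
              [ 0, 0,-1, 0, 1,  0, 0, 0,-1]])  # y >= m
E = C.shape[0]                  # number of equations
V = C.shape[1] - E              # number of original variables

# Transposed constraints with identity augment columns on the slack rows,
# i.e. the 9x8 matrix A from above.
A = np.hstack([C.T, np.vstack([np.zeros((V, E)), np.eye(E)])])

def provable(q):
    """Try to prove q . (a, b, m, x, y) >= 0 from the constraints."""
    rhs = np.concatenate([q, np.zeros(E)])
    x, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    if not np.allclose(A @ x, rhs):
        return False                        # query is not a combination of constraints
    return bool(np.all(x[E:] >= -1e-9))     # x_{e+E} >= 0 for every slack row

print(provable(np.array([-1, 0, 0, 1, 0])))   # x - a >= 0  -> True
print(provable(np.array([ 0, 0, 0, 1, -1])))  # x - y >= 0  -> False
```

Here `lstsq` is just a convenient stand-in for an exact solve; with full column rank, a zero-residual solution is the unique one.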

But what if $\textbf{A}$ does not have full column rank? Again, we have the problem $\textbf{Ax}=\textbf{q}$. Our first step will be to ensure that it has full row rank, which we can do by multiplying by some matrix $\textbf{U}$ if necessary, e.g., calculated from Hermite Normal Form, discarding zeroed rows. Letting $\textbf{UA}=\textbf{H}$, and $\textbf{Uq}=\textbf{b}$, we now have $\textbf{Hx}=\textbf{b}$.

Let $M$ and $N$ be the number of rows and columns of $\textbf{H}$, respectively. Note that $M$ is also the rank of $\textbf{H}$, as we have guaranteed that it has full row rank. Because the column rank is deficient, we know $N > M$.

We can construct a non-singular (but not necessarily unimodular) $N\times N$ integer matrix $\textbf{V}$, where the first $M$ columns diagonalize $\textbf{H}$, while the remaining $N-M$ columns span the nullspace of $\textbf{H}$. Calling these two blocks $\textbf{V}_1$ and $\textbf{V}_2$, we have $\textbf{V} = \begin{bmatrix}\textbf{V}_1&\textbf{V}_2\end{bmatrix}$.

With this, we have

\begin{aligned} \textbf{b} &= \textbf{HVV}^{-1}\textbf{x}\\ &= \textbf{H} \begin{bmatrix}\textbf{V}_1&\textbf{V}_2\end{bmatrix} \textbf{V}^{-1}\textbf{x}\\ &= \begin{bmatrix}\textbf{D}&\textbf{0}_{M\times \left(N-M\right)}\end{bmatrix} \textbf{V}^{-1}\textbf{x}\\ \end{aligned}

where $\textbf{D}$ is an $M\times M$ diagonal matrix.

Now, let

\begin{aligned} \textbf{y}_1 &= \textbf{D}^{-1}\textbf{b}\\ \textbf{V}^{-1}\textbf{x} &= \begin{bmatrix}\textbf{y}_1\\\textbf{y}_2\end{bmatrix}\\ \textbf{b} &= \begin{bmatrix}\textbf{D}&\textbf{0}_{M\times \left(N-M\right)}\end{bmatrix} \begin{bmatrix}\textbf{D}^{-1}\textbf{b}\\\textbf{y}_2\end{bmatrix}\\ &= \textbf{b} + \textbf{0}\\ &= \textbf{b} \end{aligned}

Therefore, letting

\textbf{V}^{-1}\textbf{x} = \begin{bmatrix}\textbf{D}^{-1}\textbf{b}\\\textbf{y}_2\end{bmatrix}

is a valid solution for all values of $\textbf{y}_2$. Now, solving for $\textbf{x}$, as it is $\textbf{x}$ that we need in order to prove our queries, we have

\begin{aligned} \textbf{V}^{-1}\textbf{x} &= \begin{bmatrix}\textbf{D}^{-1}\textbf{b}\\\textbf{y}_2\end{bmatrix}\\ \textbf{x} &= \begin{bmatrix}\textbf{V}_1&\textbf{V}_2\end{bmatrix} \begin{bmatrix}\textbf{D}^{-1}\textbf{b}\\\textbf{y}_2\end{bmatrix}\\ \textbf{x} &= \textbf{V}_1\textbf{D}^{-1}\textbf{b} + \textbf{V}_2\textbf{y}_2\\ \end{aligned}

As the columns of $\textbf{V}_2$ span the nullspace of $\textbf{H}$, varying $\textbf{y}_2$ moves $\textbf{x}$ through the nullspace of $\textbf{H}$.
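Our running example never hits this path (its $\textbf{A}$ has full column rank), so here is a small made-up $\textbf{H}$ to illustrate the construction, using SymPy and rational arithmetic for brevity. The post wants an integer $\textbf{V}$, but over the rationals we can simply pick $\textbf{V}_1$ so that $\textbf{D}$ is the identity:

```python
import sympy as sp

# A made-up full-row-rank H (M = 2 rows, N = 4 columns), standing in for the
# reduced augmented constraint matrix.
H = sp.Matrix([[1, 2, 0, 1],
               [0, 1, 1, 1]])
M, N = H.shape

# V2: columns spanning the nullspace of H.
V2 = sp.Matrix.hstack(*H.nullspace())      # N x (N - M)

# V1: any columns making H*V1 diagonal; this rational choice gives D = I.
V1 = H.T * (H * H.T).inv()                 # N x M
D = H * V1
assert D == sp.eye(M)

V = sp.Matrix.hstack(V1, V2)
assert V.det() != 0                        # V is non-singular

# The solution family x = V1 * D^{-1} * b + V2 * y2 solves H*x = b for any y2.
b = sp.Matrix([3, 5])
y2 = sp.Matrix([1, -2])                    # an arbitrary choice of y2
x = V1 * D.inv() * b + V2 * y2
assert H * x == b
```

The choice $\textbf{V}_1 = \textbf{H}^T\left(\textbf{H}\textbf{H}^T\right)^{-1}$ is just for the sketch; an integer construction would instead scale columns (or come from an HNF/Smith-style decomposition), giving a genuinely diagonal, not necessarily identity, $\textbf{D}$.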

Now, recognizing that we need the last $E$ entries of $\textbf{x}$ to be $\ge 0$ but don't care about the first $E$ elements, we left-multiply by

\textbf{J} = \begin{bmatrix}\textbf{0}_{E\times E}&\textbf{I}_{E\times E}\end{bmatrix}

to discard the elements of $\textbf{x}$ we aren't interested in. For simplicity, let

\begin{aligned} \textbf{x}^* &= \textbf{Jx}\\ \textbf{c} &= \textbf{JV}_1\textbf{D}^{-1}\textbf{b}\\ \textbf{W} &= -\textbf{JV}_2\\ \textbf{x}^* &= \textbf{c} - \textbf{Wy}_2 \end{aligned}

Now, as what we need is $\textbf{x}^*\ge \textbf{0}$,

\begin{aligned} \textbf{0} &\le \textbf{c} - \textbf{Wy}_2\\ \textbf{Wy}_2 &\le \textbf{c}\\ \end{aligned}

So, this reduces the problem to one of simply checking if $\textbf{Wy}_2 \le \textbf{c}$ is satisfiable, where $\textbf{W}$ is known from the initial constraints, and $\textbf{c}$ is computed from the query (and most of the computation can be done once on initialization; for a particular query, all we need is a matrix-vector multiply). Checking feasibility is the first step to solving any linear program, but we can write specialized software for polyhedra that handles the unbounded nature of $\textbf{y}_2$ without needing to duplicate the elements of $\textbf{W}$ in a tableau.
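As a sanity check, an off-the-shelf LP solver can already answer this feasibility question. Here is a sketch with SciPy's `linprog`, using made-up stand-ins for $\textbf{W}$ and $\textbf{c}$; the specialized polyhedra code alluded to above would avoid the generic tableau machinery:

```python
import numpy as np
from scipy.optimize import linprog

def feasible(W, c):
    """Is there any y2 (free sign, unbounded) with W @ y2 <= c?"""
    n = W.shape[1]
    # Zero objective: we only care about feasibility, not optimality.
    res = linprog(np.zeros(n), A_ub=W, b_ub=c,
                  bounds=[(None, None)] * n, method="highs")
    return res.status == 0   # 0: solved (feasible), 2: infeasible

# Made-up stand-ins for W and c.
W = np.array([[ 1.0, 0.0],
              [-1.0, 1.0]])
print(feasible(W, np.array([1.0, 2.0])))                            # True, e.g. y2 = (0, 0)
print(feasible(np.array([[1.0], [-1.0]]), np.array([-1.0, -1.0])))  # False: y <= -1 and y >= 1
```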