Say we are given
abxy≤m≤m≥m≥m and want to know, is x≥a? How about x≥y?
A human can of course look at these equations and say "yes" and "not provable from the given information", but our goal is to develop an algorithm that can do this for arbitrary sets of inequalities, and arbitrary queries. Our compiler will need this for some checks.
So, how can we do this? For starters, manipulating equations tends to be easiest if we place them into matrices. Secondarily, equalities are much easier to work with than inequalities. Thus, we introduce slack variables s∗≥0 to turn each of these into an equality. Now, we have
⎣⎢⎢⎢⎡−10000−10011−1−100100001−10000−10000−10000−1⎦⎥⎥⎥⎤⎣⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎡abmxys0s1s2s3⎦⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎤=⎣⎢⎢⎢⎡0000⎦⎥⎥⎥⎤ Now, say, we're given an arbitrary query with respect to our 5 variables (and possible constants). E.g., we may want to know whether a≤x or b+1≤y are known true
.
Let us use a≤x, or equivalently, x−a≥0 as an example. We could repsent this as the column vector [−10010]T.
So, to be able to answer queries with respect to x and a, the first thing we'd need to be able to do isolate them in an expression. That means, we'll need to take some linear combination of the constraints to get the vector q=[−10010]T.
But first, we should put our constraints into Hermite Normal Form and then drop all the zeroed constraints, or employ some other similar technique, in order to make sure that all remaining constraints are linearly independent. Guaranteeing they're linearly independent may simplify work later on. In the example above, we already have full row rank (4), so we'll skip this step in this blog post.
Now, to isolate our query, we have the problem
⎣⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎡−10100−10000−11000−10000−11000−1000−101000−1⎦⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎤x=[qz] So we're trying to find a linear combination x of constraints to isolate our query q, but then what shall z be?
z corresponds to our four slack variables.
Now, to answer queries of the ≥ variety, we want all slack variables to be ≤0. That is if we have w−s∗=d, then w≥d.
We can focus on only ≥, as the other queries of interest can be reframed as ≥, e.g. by adjusting constant offsets or flipping signs.
So, our current idea for proving (or failing to prove) a query is to find the linear combination of existing constraints that produces our query, and then check if all slack variables are negative.
To assist us in this quest, we add augment colums and set z=0:
⎣⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎡−10100−10000−11000−10000−11000−1000−101000−1000001000000000100000000010000000001⎦⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎤x=⎣⎢⎢⎢⎢⎢⎡q0000⎦⎥⎥⎥⎥⎥⎤ Now, the linear combination of columns x must produce a value of 0 for each slack variable. We can divide this into both the part contributing to our query, and our augment. Letting V be the total number of variables, and E the total number of equations, using 0-based indexing, and referring to the above expression more succinctly as Ax=q:
0xe=se=c=0∑E−1Ae+V,cxc+c=0∑E−1Ae+V,c+Exc=xe+c=0∑E−1Ae+V,cxc=−c=0∑E−1Ae+V,cxc Note that ∑c=0E−1Ae+V,cxc is precisely the quantity we require to be ≤0 for our query to be true; it is the value of the slack variable in the linear combination of constraints that produces our query. So, to check if this is satisfied, all we must do is check that xe≥0.
Okay, but how do we solve for x? If A has full column rank, this is trivial: we either have a solution, or no solutions. If we have no solutions, we obviously fail. If we have one solution, it is trivial to check whether all xe≥0,e=E,…,2E−1.
But what if A does not have full column rank? Again, we have the problem Ax=q. Our first step will be to ensure that it has full row rank, which we can do by multiplying by some matrix U if necessary, e.g., calculated from Hermite Normal Form, discarding zeroed rows. Letting UA=H, and Uq=b, we now have Hx=b.
Letting M and N be the number of rows and columns of H, respectively. Note that M is also the rank of H, as we have guaranteed that it has full row rank. Because the column rank is deficient, we know N>M.
We can construct a non-singular (but not necessarilly unimodulary) N×N integer matrix V, where the first M columns diagonalize H, while the remaining N−M columns are the nullspace of H. Calling these two blocks V1 and V2, we have V=[V1V2].
With this, we have
b=HVV−1x=H[V1V2]V−1x=[D0N×(N−M)]V−1x Where D is diagonal.
Now, let
y1V−1xb=D−1b=[y1y2]=[D0N×(N−M)][D−1by2]=b+0=b Therefore, letting
V−1x=[D−1by2] is a valid solution for all values of y2. Now, solving for x, as it is x that we need to prove our queries, we have
V−1xxx=[D−1by2]=[V1V2][D−1by2]=V1D−1b+V2y2 As V2 is the nullspace of H, y2 can moves through the nullspace of H.
Now, recognizing that we need the last E entries of x to be ≥0 but don't care about the first V elements, we left-multiply by
J=[0E×EIE×E] to discard the elements of x we aren't interested in. For simplicity, let
x∗cWx∗=Jx=JV1D−1b=−JV2=c−Wy2 Now, as what we need is x∗≥0
0Wy2≤c−Wy2≤c So, this reduces the problem to one of simply checking if Wy2≤c is satisfiable, where W is known from the initial constraints, and c is computed from the query (and most of the computation can be done once on initialization; for a particular query, all we need is a matrix-vector multiply). Checking feasibility is the first step to solving any linear program, but we can write specialized software for polyhedra that handles the unbounded nature of y2 without needing to duplicate the elements of W in a tableau.