Matrix Rounding

The Basic Problem

We consider variations of the following matrix rounding problem. Given a matrix $ X\in{\mathbb{R}}^{m\times n}$ , compute an integer matrix $ Y\in{\mathbb{Z}}^{m\times n}$ such that the error in all rows and columns as well as in the whole matrix is small, i.e.,

$\displaystyle \forall j \in [1..n]$ $\displaystyle :$ $\displaystyle \bigg\vert\sum_{i = 1}^m (x_{ij} - y_{ij})\bigg\vert < e_1,$  
$\displaystyle \forall i \in [1..m]$ $\displaystyle :$ $\displaystyle \bigg\vert\sum_{j = 1}^n (x_{ij} - y_{ij})\bigg\vert < e_2,$  
    $\displaystyle \bigg\vert\sum_{i = 1}^m\sum_{j = 1}^n (x_{ij} - y_{ij})\bigg\vert < e_3,$  
$\displaystyle \forall \ensuremath{i\in\ensuremath{[1..m]}}, \ensuremath{j\in\ensuremath{[1..n]}}$ $\displaystyle :$ $\displaystyle \vert x_{ij} - y_{ij}\vert < e_4,$  


for some small positive constants $ e_1,\ldots,e_4$ . In statistics, this problem with errors $ e_1=e_2=e_3=e_4=1$ is known as Controlled Rounding and was first studied by Bacharach [1] and later by Causey, Cox and Ernst [5]. Here, one wants to round the entries of a table of frequency counts to a given base such that the totals of the table are not changed. This allows to increase the readability of tables (by rounding to multiples of $ 10$ ), but the main use is in confidentiality protection [12], where one wants to perturb the table such that no knowledge about individuals can be gained from small counts. In discrete mathematics, Baranyai [2] used such roundings for colouring and partitioning complete uniform hypergraphs.

Extension I: Linear-Time One-Pass Quasi-Rounding

Another application of matrix rounding is the flexible transfer line scheduling problem [11,4]. Here one has a single machine that can produce $ m$ different products, but only one at a time. It is known how many products $ d_i, \ensuremath{i\in\ensuremath{[1..m]}}$ of each type should be produced after $ n=\sum_{i=1}^m d_i$ steps. The goal is to design a production schedule such that the given number of goods is produced after $ n$ steps. The crucial point is that this schedule should be balanced, i.e., after $ \ensuremath{j\in\ensuremath{[1..n]}}$ steps, approximately $ (\tfrac{d_i}{n})j$ units of product $ i$ should be produced.

We can encode the $ m$ demands as a column vector of $ m$ values $ (\tfrac{d_1}{n},\ldots,\tfrac{d_m}{n})^T$ . If $ X$ is the $ m \times n$ -matrix given by repeating the above vector $ n$ times, a good schedule is given by an integral matrix $ Y$ having small rounding error in all columns and in all initial row intervals.

In [7] we give a linear-time one-pass algorithm to compute a quasi-rounding $ Y\in{\mathbb{Z}}^{m\times n}$ for a given matrix $ X$ with integral column sums. This quasi-rounding fulfils

$\displaystyle \forall b \in [1..n], i \in [1..m]$ $\displaystyle :$ $\displaystyle \bigg\vert\sum_{j = 1}^b (x_{ij} - y_{ij})\bigg\vert < 1,$  
$\displaystyle \forall j \in [1..n]$ $\displaystyle :$ $\displaystyle \bigg\vert\sum_{i = 1}^b (x_{ij} - y_{ij})\bigg\vert < 1,$  
$\displaystyle \forall \ensuremath{i\in\ensuremath{[1..m]}}, \ensuremath{j\in\ensuremath{[1..n]}}$ $\displaystyle :$ $\displaystyle \vert x_{ij} - y_{ij}\vert < 2.$  


Using the triangle inequality, one immediately gets that the error in a contiguous part of a row is less than two. This may also be desirable for controlled rounding, if the column labels of a table are ordered. In this case we can calculate the value for a range of entries while controlling the error.

Extension II: Rounding Row and Column Intervals

For controlled rounding, a quasi-rounding as above does normally not suffice. Also, if both the rows and the columns of a table are ordered, one would like to control the error in initial column intervals, too. In other words, given a matrix $ X$ , we seek for a matrix $ Y\in{\mathbb{Z}}^{m\times n}$ fulfilling

$\displaystyle \forall b \in [1..n], \; i \in [1..m]$ $\displaystyle :$ $\displaystyle \bigg\vert\sum_{j = 1}^b (x_{ij} - y_{ij})\bigg\vert < 1,$  
$\displaystyle \forall b \in [1..m], \; j \in [1..n]$ $\displaystyle :$ $\displaystyle \bigg\vert\sum_{i = 1}^b (x_{ij} - y_{ij})\bigg\vert < 1,$  
    $\displaystyle \bigg\vert\sum_{i = 1}^m\sum_{j = 1}^n (x_{ij} - y_{ij})\bigg\vert < 1,$  
$\displaystyle \forall \ensuremath{i\in\ensuremath{[1..m]}}, \ensuremath{j\in\ensuremath{[1..n]}}$ $\displaystyle :$ $\displaystyle \vert x_{ij} - y_{ij}\vert < 1.$  


In [8] we give a $ O(m n \log m n)$ time algorithm to compute such roundings. This algorithm is based on the observation that half-integral matrices (i.e., entries $ 0,\tfrac{1}{2}$ ) can easily be rounded. Adapting a method introduced by Beck and Spencer [3], it is possible to round matrices of binary numbers recursively by computing roundings of half-integral matrices.

Extension III: Randomised Rounding

A rounding computed by a randomised algorithm is called unbiased [6,9] or randomised [10] rounding, if each entry is rounded up with probability equal to its fractional value. Such roundings have the property that the sum over a set of entries in $ Y$ has expected value equal to the sum over the corresponding entries in $ X$ , hence the term ``unbiased''.

The algorithm from [8] can easily be modified to compute unbiased roundings by modifying the half-integral rounding algorithm accordingly. This algorithm will then compute an unbiased controlled rounding of a matrix containing $ \ell$ -bit numbers in time $ O(m n \ell)$ .

Bibliography


1
M. Bacharach.
Matrix rounding problems.
Management Science (Series A), 12:732-742, 1966.
2
Zs. Baranyai.
On the factorization of the complete uniform hypergraph.
In Infinite and finite sets (Colloq., Keszthely, 1973; dedicated to P. Erdös on his 60th birthday), Vol. I, pages 91-108. Colloq. Math. Soc. János Bolyai, Vol. 10. North-Holland, Amsterdam, 1975.
3
J. Beck and J. Spencer.
Well distributed 2-colorings of integers relative to long arithmetic progressions.
Acta Arithm., 43:287-298, 1984.
4
N. Brauner and Y. Crama.
The maximum deviation just-in-time scheduling problem.
Discrete Appl. Math., 134:25-50, 2004.
5
B. D. Causey, L. H. Cox, and L. R. Ernst.
Applications of transportation theory to statistical problems.
Journal of the American Statistical Association, 80(392):903-909, Dec 1985.
6
L. H. Cox.
A constructive procedure for unbiased controlled rounding.
Journal of the American Statistical Association, 82:520-524, 1987.
7
B. Doerr, T. Friedrich, C. Klein, and R. Osbild.
Rounding of sequences and matrices, with applications.
In Third Workshop on Approximation and Online Algorithms (WAOA), volume 3879 of Lecture Notes in Computer Science, pages 96-109. Springer, 2006.
8
B. Doerr, T. Friedrich, C. Klein, and R. Osbild.
Unbiased matrix rounding.
In 10th Scandinavian Workshop on Algorithm Theory (SWAT), volume 4059 of Lecture Notes in Computer Science, pages 102-112. Springer, 2006.
9
I. P. Fellegi.
Controlled random rounding.
Survey Methodology, 1:123-133, 1975.
10
P. Raghavan.
Probabilistic construction of deterministic algorithms: Approximating packing integer programs.
J. Comput. Syst. Sci., 37:130-143, 1988.
11
G. Steiner and S. Yeomans.
Level schedules for mixed-model, just-in-time processes.
Management Science, 39:728-735, 1993.
12
L. Willenborg and T. de Waal.
Elements of Statistical Disclosure Control, volume 155 of Lecture Notes in Statistics.
Springer, 2001.