Mathematics: Basic matrix operations.


This tutorial is an introduction to matrices, which in mathematics are defined as rectangular arrays (or tables) of numbers, symbols, or expressions, arranged in rows and columns, and which are used to represent a mathematical object or a property of such an object. The tutorial shows how to perform basic arithmetic, row, and some other operations with matrices. It does not describe operations specific to square matrices, such as calculating the determinant, the eigenvalues, or the adjoint matrix. I plan to publish another matrix tutorial some day that deals with these computations. If you are interested in an application that does arithmetic and row operations, you may want to download my Basic matrix operations PC application (for matrices with up to 4 rows/columns), freely available in the Free Pascal GUI Applications section of this website.

A matrix is a rectangular array of numbers (or other mathematical objects) for which operations such as addition and multiplication are defined. The numbers, symbols, or expressions in the matrix are called its entries or elements. A matrix whose entries are real numbers is called a real matrix. The horizontal lines of entries in a matrix are called rows, the vertical lines of entries are called columns. The matrix size is defined by the number of rows and columns it contains. A matrix with m rows and n columns is called an m×n matrix (or m-by-n matrix); m and n are called the dimensions of the matrix. Matrices are written one row of elements below the other, the whole being enclosed in parentheses or box brackets. Example of a 2×3 integer matrix A:

$$A = \begin{pmatrix} -4 & 0 & 9 \\ 3 & -8 & 1 \end{pmatrix}$$

Matrices with a single row are called row vectors, matrices with a single column are called column vectors. A matrix with the same number of rows and columns is called a square matrix. A matrix with an infinite number of rows or columns (or both) is called an infinite matrix.

Matrices are usually symbolized using upper-case letters (A in the example above), while the corresponding lower-case letters with two subscript indices (e.g. $a_{11}$, $a_{12}$, ...) represent the entries. The entry in the i-th row and j-th column of a matrix A is sometimes referred to as the (i,j) entry, or (i,j)th entry, of the matrix, and is most commonly denoted $a_{ij}$ or $a_{i,j}$. An alternative notation references the matrix elements much as most computer programming languages do: A[i,j]. For example, the (1,3) entry of the example matrix A above is 9, which you can write as $a_{13} = 9$. Note that in mathematical notation index counting starts with 1 and not with 0, as is usual in programming.
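The difference between 1-based mathematical indices and 0-based program indices can be illustrated with a minimal Python sketch. It assumes the matrix is stored as a list of rows; the helper name `entry` is hypothetical and only serves this illustration.

```python
# The 2x3 example matrix A, stored as a list of rows.
A = [[-4, 0, 9],
     [3, -8, 1]]

def entry(matrix, i, j):
    """Return the (i, j) entry using 1-based mathematical indices."""
    return matrix[i - 1][j - 1]

print(entry(A, 1, 3))  # the (1,3) entry -> 9
print(A[0][2])         # the same entry with 0-based indices -> 9
```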

Identity matrices.

The identity matrix of size n, denoted $I_n$, is the n×n square matrix with ones on the main diagonal and zeros elsewhere. As an example, the 3×3 identity matrix:

$$I_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
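As a small sketch of this definition in Python (the function name `identity` and the list-of-rows representation are assumptions made for the illustration):

```python
def identity(n):
    """Build the n x n identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

I3 = identity(3)
print(I3)  # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```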

Properties:

Arithmetic operations.

Scalar multiplication.

This operation may be done with any matrix. The product kA of the number k and the m×n matrix A is computed by multiplying every entry of A by k.

$$(kA)_{i,j} = k \cdot A_{i,j}$$

where 1 ≤ i ≤ m and 1 ≤ j ≤ n.

Example:

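$$-1 \cdot \begin{pmatrix} 8 & 6 & 1 \\ 2 & -3 & -4 \end{pmatrix} = \begin{pmatrix} -8 & -6 & -1 \\ -2 & 3 & 4 \end{pmatrix}$$

Sample calculation (with C = kA): $c_{12} = k \cdot a_{12} = (-1) \cdot 6 = -6$.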
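The example can be reproduced with a short Python sketch. The function name `scalar_multiply` is hypothetical, and matrices are assumed to be stored as lists of rows.

```python
def scalar_multiply(k, matrix):
    """Return kA: every entry of the matrix multiplied by the number k."""
    return [[k * x for x in row] for row in matrix]

A = [[8, 6, 1],
     [2, -3, -4]]
C = scalar_multiply(-1, A)
print(C)  # [[-8, -6, -1], [-2, 3, 4]]
```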

Matrix addition.

Matrix addition is only defined for two matrices of the same size. The sum A + B of two m-by-n matrices A and B is calculated entrywise:

$$(A + B)_{i,j} = A_{i,j} + B_{i,j}$$

where 1 ≤ i ≤ m and 1 ≤ j ≤ n.

Example:

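$$\begin{pmatrix} 5 & 0 & -2 \\ -7 & -8 & 6 \end{pmatrix} + \begin{pmatrix} -1 & 8 & 2 \\ -5 & 10 & 9 \end{pmatrix} = \begin{pmatrix} 4 & 8 & 0 \\ -12 & 2 & 15 \end{pmatrix}$$

Sample calculation (with C = A + B): $c_{21} = a_{21} + b_{21} = (-7) + (-5) = -12$.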

Like addition of numbers, matrix addition is commutative: A + B = B + A.
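A minimal Python sketch of entrywise addition, including a check of the commutativity claim on the example matrices (the function name `add` is an assumption of this illustration):

```python
def add(a, b):
    """Return the entrywise sum A + B of two matrices of the same size."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same size")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

A = [[5, 0, -2], [-7, -8, 6]]
B = [[-1, 8, 2], [-5, 10, 9]]
print(add(A, B))               # [[4, 8, 0], [-12, 2, 15]]
print(add(A, B) == add(B, A))  # True
```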

Matrix subtraction.

Subtracting the matrix B from the matrix A is the same as adding the matrix -B to the matrix A. -B is nothing other than the product of the scalar -1 and the matrix B, i.e. all entries of B simply change sign, and you then do an entrywise addition with these opposite values. This amounts to an entrywise subtraction of the elements of B from the elements of A.

$$(A - B)_{i,j} = A_{i,j} - B_{i,j}$$

where 1 ≤ i ≤ m and 1 ≤ j ≤ n.

Example:

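$$\begin{pmatrix} 5 & 0 & -2 \\ -7 & -8 & 6 \end{pmatrix} - \begin{pmatrix} -1 & 8 & 2 \\ -5 & 10 & 9 \end{pmatrix} = \begin{pmatrix} 6 & -8 & -4 \\ -2 & -18 & -3 \end{pmatrix}$$

Sample calculation (with C = A - B): $c_{21} = a_{21} - b_{21} = -7 - (-5) = -7 + 5 = -2$.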
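The following Python sketch computes A - B entrywise and confirms, for the example matrices, that the result equals A + (-1)·B. The function name `subtract` is hypothetical.

```python
def subtract(a, b):
    """Return the entrywise difference A - B of two matrices of the same size."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same size")
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

A = [[5, 0, -2], [-7, -8, 6]]
B = [[-1, 8, 2], [-5, 10, 9]]
C = subtract(A, B)
print(C)  # [[6, -8, -4], [-2, -18, -3]]

# Same result by adding the scalar multiple (-1) * B to A.
via_negation = [[x + (-1) * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]
print(C == via_negation)  # True
```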

Matrix multiplication.

Multiplication of two matrices is only defined if the number of columns of the left matrix is the same as the number of rows of the right matrix. If A is an m-by-n matrix and B is an n-by-p matrix, the matrix product AB is the m-by-p matrix whose entries are given by the dot product of the corresponding row of A and the corresponding column of B.

$$(AB)_{i,j} = \sum_{r=1}^{n} a_{i,r} b_{r,j} = a_{i,1} b_{1,j} + a_{i,2} b_{2,j} + \dots + a_{i,n} b_{n,j}$$

where 1 ≤ i ≤ m and 1 ≤ j ≤ p and the sum is taken for r = 1 .. n.

Example:

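$$\begin{pmatrix} 3 & 9 & -7 \\ 4 & 7 & 1 \end{pmatrix} \begin{pmatrix} 8 & 6 \\ 1 & 2 \\ -3 & -4 \end{pmatrix} = \begin{pmatrix} 54 & 64 \\ 36 & 34 \end{pmatrix}$$

Sample calculation (with C = AB):

$c_{21} = a_{21} b_{11} + a_{22} b_{21} + a_{23} b_{31} = 4 \cdot 8 + 7 \cdot 1 + 1 \cdot (-3) = 32 + 7 - 3 = 36$.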

Properties:

Note that for the matrices A and B of the previous example, the product C = BA is a 3×3 matrix, so AB and BA do not even have the same size. Calculating the (2,1) entry of C = BA gives:

$c_{21} = b_{21} a_{11} + b_{22} a_{21} = 1 \cdot 3 + 2 \cdot 4 = 3 + 8 = 11$.
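Both products of the example can be checked with a small Python sketch of the row-by-column rule. The function name `multiply` is an assumption of this illustration, and matrices are stored as lists of rows.

```python
def multiply(a, b):
    """Return the matrix product AB (columns of A must match rows of B)."""
    n = len(b)
    if any(len(row) != n for row in a):
        raise ValueError("columns of the left matrix must equal rows of the right matrix")
    p = len(b[0])
    return [[sum(a[i][r] * b[r][j] for r in range(n)) for j in range(p)]
            for i in range(len(a))]

A = [[3, 9, -7],
     [4, 7, 1]]
B = [[8, 6],
     [1, 2],
     [-3, -4]]

AB = multiply(A, B)  # 2x2 result
BA = multiply(B, A)  # 3x3 result
print(AB)        # [[54, 64], [36, 34]]
print(BA[1][0])  # the (2,1) entry of BA -> 11
```

The two results differ in size, which shows that matrix multiplication is in general not commutative.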

Inverse of a matrix.

An n-by-n square matrix A is called invertible or non-singular if there exists a matrix B such that $AB = BA = I_n$, where $I_n$ is the n×n identity matrix. If B exists, it is unique and is called the inverse matrix of A, denoted $A^{-1}$. A singular matrix is a square matrix that is not invertible (the inverse does not exist); this is the case exactly when the determinant of A is 0. In other words, an n×n square matrix A is invertible, with $AA^{-1} = A^{-1}A = I_n$, if and only if $|A| \neq 0$.

The inverse of a square matrix A may be calculated by dividing the adjugate matrix (or adjoint) of A, denoted adj(A), by its determinant |A|.

$$A^{-1} = \frac{\operatorname{adj}(A)}{|A|}$$

with |A| ≠ 0.

As I said at the beginning of this tutorial, this text does not describe the calculation of adjoints or determinants. If you have taken some math classes, you probably know what a determinant is; determinants are commonly used with Cramer's rule to solve linear equation systems. If you are interested in determinants, you may want to have a look at my 2-by-2 and 3-by-3 linear equation systems tutorial, where you find several examples of how to calculate a 2×2 or 3×3 determinant. The adjoint of a matrix, or adjugate matrix, adj(A) is the transpose of the cofactor matrix of A. The cofactor matrix is formed with the cofactors of the elements of the matrix A. The cofactor of the element in row i and column j is equal to the product of the minor of that element and -1 raised to the power i + j. For details concerning the cofactor matrix, the adjoint and matrix inversion, you may want to visit the cofactor matrix page at CueMath, which contains detailed information and calculation examples for 3×3 matrices.

Another method to calculate the inverse of a matrix (other than using the adjoint) is to use row reduction. And finally, there are scientific calculators that support operations with matrices.

Note: For a 2×2 matrix, calculations are rather simple and you can remember the following formulas:

 
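$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \qquad \operatorname{adj}(A) = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} \qquad |A| = ad - bc$$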

Example:

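$$\begin{pmatrix} -2 & 1 \\ -1 & 2 \end{pmatrix}^{-1} = \begin{pmatrix} -2/3 & 1/3 \\ -1/3 & 2/3 \end{pmatrix}$$

Sample calculation (with $C = A^{-1}$):

$c_{22} = a_{11} / |A| = (-2) / [(-2) \cdot 2 - 1 \cdot (-1)] = (-2) / (-3) = 2/3$.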
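The 2×2 formulas can be sketched in Python as follows. Exact fractions from the standard `fractions` module are used so that the result matches the example; the function name `inverse_2x2` is hypothetical, and the sketch only covers the 2×2 case.

```python
from fractions import Fraction

def inverse_2x2(m):
    """Return the inverse of a 2x2 matrix as adj(A) / |A|, using exact fractions."""
    (a, b), (c, d) = m
    det = a * d - b * c          # |A| = ad - bc
    if det == 0:
        raise ValueError("matrix is singular (determinant is 0)")
    # adj(A) = [[d, -b], [-c, a]], each entry divided by the determinant
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

A = [[-2, 1],
     [-1, 2]]
C = inverse_2x2(A)
print(C[1][1])  # 2/3
```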

Matrix division.

Dividing a matrix A by a matrix B is the same as multiplying the matrix A by the matrix $B^{-1}$, where $B^{-1}$ is the inverse of B, as described in the previous section. You can find some division examples at the matrix division page of the AtoZMath website.

$$A / B = A \cdot B^{-1}$$

with B being invertible.
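A minimal Python sketch of this definition for 2×2 matrices, with exact fractions. The function names and the matrix A below are made up for the illustration; B is the matrix from the inverse example.

```python
from fractions import Fraction

def multiply(a, b):
    """Return the matrix product AB."""
    n = len(b)
    return [[sum(a[i][r] * b[r][j] for r in range(n)) for j in range(len(b[0]))]
            for i in range(len(a))]

def inverse_2x2(m):
    """Return the inverse of a 2x2 matrix as adj(M) / |M|."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular (determinant is 0)")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

def divide(a, b):
    """Return A / B, defined here as A multiplied by the inverse of B."""
    return multiply(a, inverse_2x2(b))

A = [[1, 2], [3, 4]]    # illustrative matrix, not from the tutorial
B = [[-2, 1], [-1, 2]]
Q = divide(A, B)
# Multiplying the quotient by B again gives back A.
print(multiply(Q, B) == A)  # True
```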

Power of a matrix.

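The nth power of a square matrix, where n is a strictly positive integer (n > 0), may be calculated by multiplying n copies of the matrix together. For the square and cube of a matrix, we get $A^2 = A \cdot A$ and $A^3 = A \cdot A \cdot A$, respectively. This operation is only defined for square matrices, because the product A ∙ A requires the number of columns of A to equal its number of rows.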

$$A^n = A \cdot A \cdot \ldots \cdot A \quad (n \text{ times})$$

We have seen that for an invertible n×n matrix, $AA^{-1} = A^{-1}A = I_n$. Considering the exponent properties of numbers, which stay true for matrices, we can rewrite this equality: $A^{1}A^{-1} = A^{1+(-1)} = A^{0} = I_n$. The definition of the 0th power of a matrix is extended to matrices that are not invertible, and for all n×n square matrices we have $A^{0} = I_n$.

One question arising here is whether this holds true if A is the null matrix. In fact, for numbers, $0^0$ is often left undefined. I’m not sure. It seems that calculations that involve the 0th power of the null matrix are possible and correct, so the definition may probably be used even for the null matrix.

The exponent rules may also be used to calculate the power of a matrix with negative exponent:

$$A^{-n} = (A^{-1})^n$$
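The power with a non-negative exponent can be sketched in Python by repeated multiplication, starting from the identity matrix so that the 0th power comes out as the identity. The function names are hypothetical and the 2×2 test matrix is made up for the illustration; negative exponents would additionally need the inverse and are left out of this sketch.

```python
def identity(n):
    """Build the n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def multiply(a, b):
    """Return the matrix product AB."""
    n = len(b)
    return [[sum(a[i][r] * b[r][j] for r in range(n)) for j in range(len(b[0]))]
            for i in range(len(a))]

def power(matrix, n):
    """Return the nth power of a square matrix for an integer n >= 0."""
    size = len(matrix)
    if any(len(row) != size for row in matrix):
        raise ValueError("only square matrices can be raised to a power")
    if n < 0:
        raise ValueError("negative exponents need the inverse; not handled here")
    result = identity(size)       # 0th power: the identity matrix
    for _ in range(n):
        result = multiply(result, matrix)
    return result

A = [[1, 1], [0, 1]]              # illustrative matrix
print(power(A, 3))                # [[1, 3], [0, 1]]
print(power([[0, 0], [0, 0]], 0)) # [[1, 0], [0, 1]]
```

With this definition, the 0th power of the null matrix is simply the identity matrix, because the loop never multiplies by the matrix at all.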

Row operations.

Row operations change the elements of a row of a matrix as a whole: two rows are interchanged, or every element of one row is changed by the same operation. They are used in several ways, including solving systems of linear equations and finding matrix inverses.

Row switching.

Row switching is interchanging two rows of a matrix. Switching row i and row j of an m-by-n matrix is interchanging all n elements of row i and row j:

$$R_i \leftrightarrow R_j$$

where 1 ≤ i ≤ m and 1 ≤ j ≤ m.

Example (C corresponding to A after row switching has been done): Switching rows 1 and 3 of a 3×2 matrix.

$$A = \begin{pmatrix} 5 & -7 \\ 0 & -8 \\ -2 & 6 \end{pmatrix} \qquad C = \begin{pmatrix} -2 & 6 \\ 0 & -8 \\ 5 & -7 \end{pmatrix}$$
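Row switching can be sketched in Python as follows. The function name `switch_rows` is hypothetical; it takes 1-based row numbers, as in the mathematical notation, and returns a new matrix instead of changing the original.

```python
def switch_rows(matrix, i, j):
    """Return a copy of the matrix with rows i and j (1-based) interchanged."""
    result = [row[:] for row in matrix]
    result[i - 1], result[j - 1] = result[j - 1], result[i - 1]
    return result

A = [[5, -7],
     [0, -8],
     [-2, 6]]
C = switch_rows(A, 1, 3)
print(C)  # [[-2, 6], [0, -8], [5, -7]]
```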

Row multiplication.

Row multiplication is multiplying all entries of a row by a non-zero constant. Row multiplication is also called scaling of a row. Multiplying row i of an m-by-n matrix by a constant k (with row i' corresponding to row i, after the row multiplication has been done) may be written as:

$$R_i' = k \cdot R_i$$

where 1 ≤ i ≤ m and i' = i; with k ≠ 0.

Example (C corresponding to A after row multiplication has been done): Multiplying row 2 of a 3×2 matrix by -2.

$$A = \begin{pmatrix} 5 & -7 \\ 0 & -8 \\ -2 & 6 \end{pmatrix} \qquad C = \begin{pmatrix} 5 & -7 \\ 0 & 16 \\ -2 & 6 \end{pmatrix}$$
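A minimal Python sketch of row multiplication, with a hypothetical function name `scale_row` and 1-based row numbers:

```python
def scale_row(matrix, i, k):
    """Return a copy of the matrix with row i (1-based) multiplied by k != 0."""
    if k == 0:
        raise ValueError("the constant k must be non-zero")
    result = [row[:] for row in matrix]
    result[i - 1] = [k * x for x in result[i - 1]]
    return result

A = [[5, -7],
     [0, -8],
     [-2, 6]]
C = scale_row(A, 2, -2)
print(C)  # [[5, -7], [0, 16], [-2, 6]]
```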

Row addition.

Row addition is replacing all entries of a row by the sum of that row and a multiple of another row. Adding row i to k times row j of an m-by-n matrix (with row i' corresponding to row i, after the row addition has been done) may be written as:

$$R_i' = R_i + k \cdot R_j$$

where 1 ≤ i ≤ m and 1 ≤ j ≤ m and i' = i; with i ≠ j and k ≠ 0.

Example (C corresponding to A after row addition has been done): Adding 2 times row 3 to row 1 of a 3×2 matrix:

$$A = \begin{pmatrix} 5 & -7 \\ 0 & -8 \\ -2 & 6 \end{pmatrix} \qquad C = \begin{pmatrix} 1 & 5 \\ 0 & -8 \\ -2 & 6 \end{pmatrix}$$
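Row addition can be sketched in Python in the same way. The function name `add_row_multiple` is hypothetical; it replaces row i by the sum of row i and k times row j, with 1-based row numbers.

```python
def add_row_multiple(matrix, i, j, k):
    """Return a copy of the matrix with row i replaced by row i + k * row j."""
    if i == j:
        raise ValueError("rows i and j must be different")
    result = [row[:] for row in matrix]
    result[i - 1] = [x + k * y for x, y in zip(matrix[i - 1], matrix[j - 1])]
    return result

A = [[5, -7],
     [0, -8],
     [-2, 6]]
C = add_row_multiple(A, 1, 3, 2)
print(C)  # [[1, 5], [0, -8], [-2, 6]]
```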

Row subtraction.

If k < 0, adding k times the elements of a row is the same as subtracting -k times these elements. Thus, row subtraction may be defined as:

$$R_i' = R_i - k \cdot R_j$$

where 1 ≤ i ≤ m and 1 ≤ j ≤ m and i' = i; with i ≠ j and k ≠ 0.

Other operations.

Matrix transposition.

The transpose of an m-by-n matrix A is the n-by-m matrix $A^T$ (also denoted $A^{tr}$ or ${}^tA$) formed by turning rows into columns and vice versa.

$$(A^T)_{i,j} = A_{j,i}$$

where 1 ≤ i ≤ n and 1 ≤ j ≤ m.

Example:

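$$\begin{pmatrix} 5 & 0 & -2 \\ -7 & -8 & 6 \end{pmatrix}^T = \begin{pmatrix} 5 & -7 \\ 0 & -8 \\ -2 & 6 \end{pmatrix}$$

Sample calculation (with $C = A^T$): $a_{13} = -2 \Rightarrow c_{31} = -2$.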
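In Python, turning rows into columns can be sketched with the built-in `zip` function (the function name `transpose` is an assumption of this illustration):

```python
def transpose(matrix):
    """Return the transpose: row i of the result is column i of the input."""
    return [list(column) for column in zip(*matrix)]

A = [[5, 0, -2],
     [-7, -8, 6]]
C = transpose(A)
print(C)        # [[5, -7], [0, -8], [-2, 6]]
print(C[2][0])  # the (3,1) entry of the transpose -> -2
```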

Properties:

Submatrices.

A submatrix of a matrix is obtained by deleting any collection of rows and/or columns. The minors and cofactors of a matrix may be calculated by computing the determinant of certain submatrices.
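Deleting rows and columns can be sketched in Python as follows. The function name `submatrix`, its parameters and the 3×3 matrix M are made up for the illustration; row and column numbers are 1-based.

```python
def submatrix(matrix, delete_rows=(), delete_cols=()):
    """Return the submatrix left after deleting the given rows and columns (1-based)."""
    return [[x for c, x in enumerate(row, start=1) if c not in delete_cols]
            for r, row in enumerate(matrix, start=1) if r not in delete_rows]

M = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]   # illustrative matrix
print(submatrix(M, delete_rows=[2], delete_cols=[3]))  # [[1, 2], [7, 8]]
```

Deleting one row and one column of a square matrix, as here, gives the kind of submatrix whose determinant is a minor.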

A principal submatrix is a square submatrix obtained by removing certain rows and columns. The definition varies from author to author. According to some authors, a principal submatrix is a submatrix in which the set of row indices that remain is the same as the set of column indices that remain. Other authors define a principal submatrix as one in which the first k rows and columns, for some number k, are the ones that remain. This type of submatrix has also been called a leading principal submatrix.

[The tutorial text is mostly based on Matrix (mathematics) and other Wikipedia articles]

