Linear Algebra and Data Structures Study Notes

Expression trees, proper binary trees, diagonal matrices, determinants, singular value decomposition, Haar transform, and Walsh transform notes.

Overview

These notes collect the main ideas behind a few common data structures and linear algebra topics. The emphasis is on how to reason through each result, not just memorize formulas.

Data structures

Expression trees and proper binary trees are useful for representing syntax, recursion, and structural proofs.

Linear algebra

Diagonal matrices, determinants, the SVD, and the Haar and Walsh transforms all connect matrix operations to geometric or signal-processing interpretations.

Expression Trees

What an expression tree represents

An arithmetic expression tree stores operands as leaves and operators as internal nodes. With binary operators such as +, -, *, and /, each internal node has exactly two children.

2 + a + 4 * (c + 1) - b / 2

Using conventional precedence, this parses as:

((2 + a) + (4 * (c + 1))) - (b / 2)

The root is -. Its left subtree represents 2 + a + 4 * (c + 1), and its right subtree represents b / 2.

Printing with in-order traversal

In-order traversal prints the left subtree, then the operator, then the right subtree. Parentheses are still needed when omitting them would change the expression's meaning.

4 * (c + 1) cannot be printed as 4 * c + 1, because multiplication has higher precedence than addition.

When parentheses are required

Use these precedence values:

precedence('+') = precedence('-') = 1
precedence('*') = precedence('/') = 2

A child expression needs parentheses when either condition holds:

  • the child operator has lower precedence than the parent operator
  • the precedences are equal, and the child is the right child of - or /

For example:

a - (b + c) != a - b + c
a / (b * c) != a / b * c

Pseudocode

Algorithm printExpression(T)
Input  expression tree T
Output arithmetic expression represented by T

return printSubtree(root(T), null, false)

Algorithm printSubtree(v, parentOp, isRightChild)
if isExternal(v) then
    return element(v)

op <- element(v)
leftExpr  <- printSubtree(left(v), op, false)
rightExpr <- printSubtree(right(v), op, true)
expr <- leftExpr + " " + op + " " + rightExpr

if needParentheses(op, parentOp, isRightChild) then
    return "(" + expr + ")"
else
    return expr

Algorithm needParentheses(childOp, parentOp, isRightChild)
if parentOp = null then
    return false

if precedence(childOp) < precedence(parentOp) then
    return true

if precedence(childOp) > precedence(parentOp) then
    return false

if isRightChild and (parentOp = '-' or parentOp = '/') then
    return true

return false

The traversal visits each node once, so it makes O(n) node visits; the running time is O(n) when each string concatenation is counted as constant work. The recursion stack uses O(h) space, where h is the tree height.

Proper Binary Trees

In a proper binary tree, every internal node has exactly two children. Let:

n = number of nodes
e = number of external nodes
i = number of internal nodes
h = height

Useful identities and inequalities

e = i + 1
n = 2e - 1
h <= i
h <= (n - 1) / 2
e <= 2^h
h >= log2(e)
h >= log2(n + 1) - 1

Proof idea

  • Every node except the root has one incoming edge, so there are n - 1 edges.
  • Every internal node has two children, so there are 2i edges.
  • Therefore n - 1 = 2i.
  • Since n = e + i, it follows that e = i + 1.
  • A perfect binary tree maximizes the number of external nodes for a fixed height, so e <= 2^h.
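As a quick sanity check of these identities, the sketch below counts nodes in two small proper binary trees. The encoding (None for an external node, a pair for an internal node) and the function names are assumptions chosen for illustration.

```python
def count(tree):
    """Return (external, internal, height) for a proper binary tree.

    Assumed encoding: None is an external node, and a pair (left, right)
    is an internal node with exactly two children.
    """
    if tree is None:
        return 1, 0, 0
    left_e, left_i, left_h = count(tree[0])
    right_e, right_i, right_h = count(tree[1])
    return left_e + right_e, left_i + right_i + 1, 1 + max(left_h, right_h)


def check_identities(tree):
    """Check e = i + 1, n = 2e - 1, h <= i, and e <= 2^h for one tree."""
    e, i, h = count(tree)
    n = e + i
    return e == i + 1 and n == 2 * e - 1 and h <= i and e <= 2 ** h


chain = (((None, None), None), None)      # tallest shape: h = i = 3
perfect = ((None, None), (None, None))    # fullest shape: e = 2^h = 4
print(count(chain), count(perfect))       # prints: (4, 3, 3) (4, 3, 2)
print(check_identities(chain), check_identities(perfect))
```

The chain meets the bound h <= i with equality, and the perfect tree meets e <= 2^h with equality.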

Matrix Basics

Diagonal matrices

For a diagonal matrix A = diag(a1, a2, ..., an), operations act on each diagonal entry independently.

Inverse

If every a_i != 0, then:

A^(-1) = diag(1/a1, 1/a2, ..., 1/an)

Inverse square root

If every a_i > 0, then:

A^(-1/2) = diag(1/sqrt(a1), ..., 1/sqrt(an))

For example:

A = diag(4, 9, 16)
A^(-1/2) = diag(1/2, 1/3, 1/4)
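Because a diagonal matrix is fully described by its diagonal, both operations reduce to entrywise arithmetic on a list. The following is a minimal sketch under that assumption; the function names are hypothetical.

```python
import math


def diag_inverse(entries):
    """Diagonal of the inverse of diag(entries); every entry must be nonzero."""
    if any(a == 0 for a in entries):
        raise ValueError("a diagonal matrix with a zero entry is singular")
    return [1 / a for a in entries]


def diag_inverse_sqrt(entries):
    """Diagonal of A^(-1/2) for A = diag(entries); every entry must be positive."""
    if any(a <= 0 for a in entries):
        raise ValueError("the inverse square root needs positive entries")
    return [1 / math.sqrt(a) for a in entries]


a = [4, 9, 16]
r = diag_inverse_sqrt(a)
print(r)                                   # diagonal of A^(-1/2)
print([x * x * y for x, y in zip(r, a)])   # diagonal of A^(-1/2) A^(-1/2) A
```

The second printed list should be all ones up to rounding, which confirms that A^(-1/2) applied twice undoes A.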

3x3 determinants

For the matrix:

| a  b  c |
| d  e  f |
| g  h  i |

Expansion along the first row gives:

det = a(ei - fh) - b(di - fg) + c(dh - eg)

The first-row sign pattern is + - +.
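The first-row expansion translates directly into code. This is a small sketch; the function name det3 and the test matrices are chosen here for illustration.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


print(det3([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))  # prints: -3
print(det3([[2, 0, 0], [0, 3, 0], [0, 0, 4]]))   # prints: 24
```

The second call shows the diagonal case: the determinant is the product of the diagonal entries.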

Singular Value Decomposition

Meaning

Singular Value Decomposition writes a matrix as:

A = U Σ V^T

One useful interpretation is:

V^T: rotate or change input coordinates
Σ:   scale each special direction
U:   rotate or change output coordinates

The core relationship is:

A v_i = σ_i u_i

Here v_i is an input direction, u_i is an output direction, and σ_i is the singular value.

Where the formula comes from

A v_i = UΣV^T v_i
V^T v_i = e_i          (the columns of V are orthonormal)
A v_i = UΣe_i = U(σ_i e_i) = σ_i u_i

Why A^T A and AA^T appear

A^T A v_i = σ_i^2 v_i
AA^T u_i = σ_i^2 u_i

So A^T A gives the right singular vectors v_i and σ_i^2, while AA^T gives the left singular vectors u_i and σ_i^2.

σ_i = sqrt(λ_i), where λ_i is the matching eigenvalue of A^T A
u_i = A v_i / σ_i      (for σ_i > 0)
v_i = A^T u_i / σ_i    (for σ_i > 0)

Normalized vectors do not imply σ_i = 1. Normalized means ||u_i|| = 1 and ||v_i|| = 1; the singular value still records how much A stretches the direction.
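As a worked check of these relations, the sketch below uses an illustrative 2 x 2 matrix that does not appear in the notes. The eigenpairs of A^T A are entered by hand, and the code then applies σ_i = sqrt(eigenvalue) and u_i = A v_i / σ_i and rebuilds A from the pieces.

```python
import math

# Illustrative matrix chosen for this sketch; it is not from the notes.
A = [[3.0, 0.0], [4.0, 5.0]]


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


# A^T A = [[25, 20], [20, 25]] has eigenvalues 45 and 5, with unit
# eigenvectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2). These were worked by hand.
s = 1 / math.sqrt(2)
eigenpairs = [(45.0, [s, s]), (5.0, [s, -s])]

triples = []
for eigenvalue, v in eigenpairs:
    sigma = math.sqrt(eigenvalue)              # sigma_i = sqrt(eigenvalue)
    u = [x / sigma for x in matvec(A, v)]      # u_i = A v_i / sigma_i
    triples.append((sigma, u, v))

# Rebuild A as the sum over i of sigma_i * u_i * v_i^T.
rebuilt = [
    [sum(sg * ui[r] * vi[c] for sg, ui, vi in triples) for c in range(2)]
    for r in range(2)
]

for sigma, u, v in triples:
    print(round(sigma, 4), [round(x, 4) for x in u], [round(x, 4) for x in v])
print([[round(x, 10) for x in row] for row in rebuilt])
```

Both v_i and u_i come out with unit length, yet the singular values are 3 sqrt(5) and sqrt(5), which illustrates that normalization says nothing about the size of σ_i.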

When AA^T = A^T A

If AA^T = A^T A, the left and right singular vectors come from the same eigenspace, which can make hand calculation easier. But repeated singular values can make the basis non-unique, so the correct pairing must still satisfy A v_i = σ_i u_i.

Haar Transform

What W represents

For N = 4, the Haar transform matrix W_4 is built from Haar basis functions sampled at four positions. Each column is one Haar basis vector.

W_4 = 1/2 * | 1   1   sqrt(2)    0      |
            | 1   1  -sqrt(2)    0      |
            | 1  -1      0      sqrt(2) |
            | 1  -1      0     -sqrt(2) |

  • Column 1: overall average
  • Column 2: first half versus second half
  • Column 3: difference inside the first half
  • Column 4: difference inside the second half

The outside factor 1/2 normalizes the columns to unit length.
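The sketch below checks that the columns of W_4 are orthonormal and uses them to analyze and rebuild a short signal. It assumes the forward transform is W^T x (one dot product per basis column), which follows from the columns being the basis vectors; some courses define the transform with the transposed matrix instead. The helper names and the example signal are made up for illustration.

```python
import math

r2 = math.sqrt(2)
# W_4 as written in the notes; each column is one Haar basis vector.
W4 = [[0.5 * x for x in row] for row in [
    [1,  1,  r2,  0],
    [1,  1, -r2,  0],
    [1, -1,  0,  r2],
    [1, -1,  0, -r2],
]]


def column(m, j):
    """Return column j of a matrix stored as a list of rows."""
    return [row[j] for row in m]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def analyze(w, signal):
    """Coefficients W^T x: one dot product per basis column."""
    return [dot(column(w, j), signal) for j in range(len(w))]


def synthesize(w, coeffs):
    """Reconstruction W c: a weighted sum of the basis columns."""
    return [dot(row, coeffs) for row in w]


x = [4.0, 2.0, 5.0, 5.0]       # arbitrary example signal
c = analyze(W4, x)
print(c)
print(synthesize(W4, c))       # recovers x up to rounding
```

The last coefficient is zero because the two samples in the second half are equal, so there is no difference inside that half to record.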

Relationship between phi_k(x), k, and W

phi_k(x) defines the k-th Haar basis function. The matrix W is the discrete matrix produced by evaluating those basis functions at sample points.

W's k-th column = values of phi_k(x) at the N sample points

The indexing rule:

k = 2^p + q

means p is the scale or level, and q is the position of the interval at that level.
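The indexing rule can be inverted to recover p and q from k. The sketch below assumes one common convention that the notes do not state explicitly: k is counted from 0, k = 0 is the constant function, and 0 <= q < 2^p for k >= 1. Under a different convention (for example, q counted from 1) the numbers shift accordingly.

```python
def split_index(k):
    """Split k >= 1 into (p, q) with k = 2**p + q and 0 <= q < 2**p.

    Assumed convention: k is counted from 0, and k = 0 is the constant
    function, which has no (p, q) pair.
    """
    if k < 1:
        raise ValueError("k = 0 is the constant function")
    p = k.bit_length() - 1      # largest p with 2**p <= k
    return p, k - 2 ** p


for k in range(1, 8):
    p, q = split_index(k)
    # Under this convention phi_k is nonzero on [q / 2**p, (q + 1) / 2**p).
    print(k, p, q, (q / 2 ** p, (q + 1) / 2 ** p))
```

Each increase in p halves the length of the interval, and q slides that interval across [0, 1).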

Why N is usually a power of 2

The standard Haar transform repeatedly splits the data in half, so it is naturally defined for N = 2^m, such as 2, 4, 8, and 16. Under this standard rule, there is no canonical 3 x 3 Haar matrix. For three samples, a common practical approach is to pad to four samples and use W_4.

The W_8 pattern

W_8 follows the same rule as W_4, but it adds one more level of subdivision. The columns move from coarse to fine detail: overall average, first four versus last four, internal differences inside each half, and finally pairwise differences.

W_8 = 1/sqrt(8) *

| 1   1    sqrt(2)     0       2    0    0    0  |
| 1   1    sqrt(2)     0      -2    0    0    0  |
| 1   1   -sqrt(2)     0       0    2    0    0  |
| 1   1   -sqrt(2)     0       0   -2    0    0  |
| 1  -1       0     sqrt(2)    0    0    2    0  |
| 1  -1       0     sqrt(2)    0    0   -2    0  |
| 1  -1       0    -sqrt(2)    0    0    0    2  |
| 1  -1       0    -sqrt(2)    0    0    0   -2  |

  • Column 1: overall average
  • Column 2: first 4 samples versus last 4 samples
  • Columns 3-4: differences inside each group of 4
  • Columns 5-8: adjacent pair differences
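The coarse-to-fine pattern can be generated for any N = 2^m. The following is a sketch of one way to do it, not a reference implementation: each basis column has entries +amplitude on the first half of its support and -amplitude on the second half, with the amplitude chosen so the column has unit length. The function name haar_matrix is hypothetical.

```python
import math


def haar_matrix(n):
    """Orthonormal Haar matrix with basis vectors as columns; n is a power of 2."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of 2")
    columns = [[1 / math.sqrt(n)] * n]        # overall average
    width = n                                  # support length at this level
    while width >= 2:
        amplitude = 1 / math.sqrt(width)       # gives each column unit length
        half = width // 2
        for start in range(0, n, width):
            col = [0.0] * n
            for t in range(half):
                col[start + t] = amplitude
                col[start + half + t] = -amplitude
            columns.append(col)
        width //= 2
    # Transpose the list of columns into a list of rows.
    return [list(row) for row in zip(*columns)]


W8 = haar_matrix(8)
# Scale by sqrt(8) to compare with the matrix written in the notes.
for row in W8:
    print([round(x * math.sqrt(8), 3) for x in row])
```

For N = 8 the amplitudes are 1/sqrt(8), 1/2, and 1/sqrt(2), which equal 1, sqrt(2), and 2 after factoring out 1/sqrt(8).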

Walsh Transform

The Walsh transform is usually represented by a normalized Hadamard matrix. Unlike the Haar matrix, it contains only +1 and -1 entries before normalization. Its basis patterns are global sign patterns rather than local interval differences.

W_4 in course ordering

Walsh matrices can use different row orderings. In one common course ordering, the 4 x 4 Walsh matrix is:

W_4 = 1/2 *

| 1  -1  -1   1 |
| 1  -1   1  -1 |
| 1   1   1   1 |
| 1   1  -1  -1 |

W_8 in the same ordering

Extending that same row-ordering pattern to eight samples gives:

W_8 = 1/sqrt(8) *

| 1  -1  -1   1  -1   1   1  -1 |
| 1  -1  -1   1   1  -1  -1   1 |
| 1  -1   1  -1  -1   1  -1   1 |
| 1  -1   1  -1   1  -1   1  -1 |
| 1   1   1   1   1   1   1   1 |
| 1   1   1   1  -1  -1  -1  -1 |
| 1   1  -1  -1   1   1  -1  -1 |
| 1   1  -1  -1  -1  -1   1   1 |

The normalization factor depends on the convention. If the course writes the unnormalized matrix, omit 1/2 for W_4 and 1/sqrt(8) for W_8. If the transform is orthonormal, include them.
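Whatever ordering a course uses, the unnormalized rows must be mutually orthogonal sign patterns. The sketch below checks that for the two matrices above and counts the sign changes along each row, which shows how the rows are ordered. The function names are hypothetical.

```python
W4_ROWS = [
    [1, -1, -1,  1],
    [1, -1,  1, -1],
    [1,  1,  1,  1],
    [1,  1, -1, -1],
]

W8_ROWS = [
    [1, -1, -1,  1, -1,  1,  1, -1],
    [1, -1, -1,  1,  1, -1, -1,  1],
    [1, -1,  1, -1, -1,  1, -1,  1],
    [1, -1,  1, -1,  1, -1,  1, -1],
    [1,  1,  1,  1,  1,  1,  1,  1],
    [1,  1,  1,  1, -1, -1, -1, -1],
    [1,  1, -1, -1,  1,  1, -1, -1],
    [1,  1, -1, -1, -1, -1,  1,  1],
]


def is_hadamard(rows):
    """True if the matrix is square, has only +1/-1 entries, and has orthogonal rows."""
    n = len(rows)
    if any(len(r) != n or any(x not in (1, -1) for x in r) for r in rows):
        return False
    return all(
        sum(a * b for a, b in zip(rows[i], rows[j])) == (n if i == j else 0)
        for i in range(n) for j in range(n)
    )


def sign_changes(row):
    """Number of sign changes along a row."""
    return sum(1 for a, b in zip(row, row[1:]) if a != b)


print(is_hadamard(W4_ROWS), is_hadamard(W8_ROWS))   # prints: True True
print([sign_changes(r) for r in W4_ROWS])           # prints: [2, 3, 0, 1]
print([sign_changes(r) for r in W8_ROWS])
```

Each row dotted with itself gives N, so dividing by sqrt(N) (1/2 for W_4, 1/sqrt(8) for W_8) makes the rows unit length.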

Haar versus Walsh

  • Haar: local, coarse-to-fine differences; many zero entries.
  • Walsh: global sign patterns; only +1 and -1.

Quick Formula Sheet

Expression tree print: O(n) time, O(h) recursion stack
Proper binary tree: e = i + 1, n = 2e - 1
Diagonal inverse: diag(a_i)^(-1) = diag(1/a_i)
3x3 determinant: a(ei - fh) - b(di - fg) + c(dh - eg)
SVD: A = UΣV^T, A v_i = σ_i u_i
Haar size: standard Haar matrices use N = 2^m
Walsh transform: normalized Hadamard matrix with +1 and -1