Overview
These notes collect the main ideas behind a few common data structures and linear algebra topics. The emphasis is on how to reason through each result, not just on memorizing formulas.
Data structures
Expression trees and proper binary trees are useful for representing syntax, recursion, and structural proofs.
Linear algebra
Diagonal matrices, determinants, SVD, and the Haar and Walsh transforms all connect matrix operations to geometric or signal-processing interpretations.
Expression Trees
What an expression tree represents
An arithmetic expression tree stores operands as leaves and operators as internal nodes. With binary operators such as +, -, *, and /, each internal node has exactly two children.
Consider the expression:
2 + a + 4 * (c + 1) - b / 2
Using conventional precedence and left-to-right associativity, this parses as:
((2 + a) + (4 * (c + 1))) - (b / 2)
The root is -. Its left subtree represents
2 + a + 4 * (c + 1), and its right subtree represents
b / 2.
Printing with in-order traversal
In-order traversal prints the left subtree, then the operator, then the right subtree. Parentheses are still needed when omitting them would change the expression's meaning.
4 * (c + 1) cannot be printed as
4 * c + 1, because multiplication has higher
precedence than addition.
When parentheses are required
Use these precedence values:
precedence('+') = precedence('-') = 1
precedence('*') = precedence('/') = 2
A child expression needs parentheses when:
- the child precedence is lower than the parent precedence
- the precedence is the same, and the child is the right child of - or /
a - (b + c) != a - b + c
a / (b * c) != a / b * c
Pseudocode
Algorithm printExpression(T)
    Input expression tree T
    Output arithmetic expression represented by T
    return printSubtree(root(T), null, false)
Algorithm printSubtree(v, parentOp, isRightChild)
    if isExternal(v) then
        return element(v)
    op <- element(v)
    leftExpr <- printSubtree(left(v), op, false)
    rightExpr <- printSubtree(right(v), op, true)
    expr <- leftExpr + " " + op + " " + rightExpr
    if needParentheses(op, parentOp, isRightChild) then
        return "(" + expr + ")"
    else
        return expr
Algorithm needParentheses(childOp, parentOp, isRightChild)
    if parentOp = null then
        return false
    if precedence(childOp) < precedence(parentOp) then
        return true
    if precedence(childOp) > precedence(parentOp) then
        return false
    if isRightChild and (parentOp = '-' or parentOp = '/') then
        return true
    return false
The traversal visits each node once, so it runs in O(n) time if each
string concatenation is counted as constant-time work. The recursion
stack uses O(h) space, where h is the tree height.
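The pseudocode above can be sketched in Python as follows. This is a minimal illustration rather than a definitive implementation: the Node class, the PRECEDENCE table, and the function names are assumptions introduced for this example, and a node is treated as external when it has no children.

```python
from dataclasses import dataclass
from typing import Optional

# Precedence values taken from the rules above.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


@dataclass
class Node:
    """Hypothetical tree node: a leaf has no children, an operator has two."""
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def need_parentheses(child_op, parent_op, is_right_child):
    """Decide whether the subexpression rooted at child_op must be wrapped."""
    if parent_op is None:
        return False
    if PRECEDENCE[child_op] != PRECEDENCE[parent_op]:
        return PRECEDENCE[child_op] < PRECEDENCE[parent_op]
    # Same precedence: only the right operand of - or / needs wrapping.
    return is_right_child and parent_op in ("-", "/")


def print_subtree(v, parent_op=None, is_right_child=False):
    """In-order traversal that adds only the parentheses that are required."""
    if v.left is None and v.right is None:
        return v.value
    op = v.value
    left_expr = print_subtree(v.left, op, False)
    right_expr = print_subtree(v.right, op, True)
    expr = f"{left_expr} {op} {right_expr}"
    if need_parentheses(op, parent_op, is_right_child):
        return f"({expr})"
    return expr


# Tree for ((2 + a) + (4 * (c + 1))) - (b / 2)
tree = Node(
    "-",
    Node("+",
         Node("+", Node("2"), Node("a")),
         Node("*", Node("4"), Node("+", Node("c"), Node("1")))),
    Node("/", Node("b"), Node("2")),
)

print(print_subtree(tree))  # 2 + a + 4 * (c + 1) - b / 2
```

Only the (c + 1) subtree keeps its parentheses, because it is a lower-precedence child of *.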
Proper Binary Trees
In a proper binary tree, every internal node has exactly two children. Let:
n = number of nodes
e = number of external nodes
i = number of internal nodes
h = height
Useful identities and inequalities
e = i + 1
n = 2e - 1
h <= i
h <= (n - 1) / 2
e <= 2^h
h >= log2(e)
h >= log2(n + 1) - 1
Proof idea
- Every node except the root has one incoming edge, so there are n - 1 edges.
- Every internal node has two children, so there are 2i edges.
- Therefore n - 1 = 2i.
- Since n = e + i, it follows that e = i + 1.
- A perfect binary tree maximizes the number of external nodes for a fixed height, so e <= 2^h.
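As a sanity check on the identities and inequalities above, the following sketch grows random proper binary trees and tests each relation with integer arithmetic. The growth procedure (repeatedly giving a random leaf two children) and all function names are assumptions made for this example; the logarithmic bounds are checked in their equivalent forms e <= 2^h and n + 1 <= 2^(h + 1).

```python
import random


def grow_proper_tree(internal_count, rng):
    """Grow a proper binary tree by giving a random leaf two children, repeatedly.

    Returns (is_internal, depth), two lists indexed by node id.
    """
    is_internal = [False]  # node 0 is the root, initially a leaf
    depth = [0]
    leaves = [0]
    for _ in range(internal_count):
        v = leaves.pop(rng.randrange(len(leaves)))
        is_internal[v] = True
        for _child in range(2):
            leaves.append(len(depth))  # id of the new child
            is_internal.append(False)
            depth.append(depth[v] + 1)
    return is_internal, depth


def tree_stats(is_internal, depth):
    """Return (n, e, i, h) for the tree."""
    n = len(is_internal)
    i = sum(is_internal)
    e = n - i
    h = max(depth)
    return n, e, i, h


def identities_hold(n, e, i, h):
    """Check every relation listed above, using integers only."""
    return (
        e == i + 1
        and n == 2 * e - 1
        and h <= i
        and 2 * h <= n - 1
        and e <= 2 ** h
        and n + 1 <= 2 ** (h + 1)
    )


rng = random.Random(0)
results = [
    identities_hold(*tree_stats(*grow_proper_tree(k, rng)))
    for k in range(200)
]
print(all(results))
```

A passing check is not a proof, but it is a quick way to catch a misremembered inequality.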
Matrix Basics
Diagonal matrices
For a diagonal matrix A = diag(a1, a2, ..., an),
operations act on each diagonal entry independently.
Inverse
If every a_i != 0, then:
A^(-1) = diag(1/a1, 1/a2, ..., 1/an)
Inverse square root
If every a_i > 0, then:
A^(-1/2) = diag(1/sqrt(a1), ..., 1/sqrt(an))
A = diag(4, 9, 16)
A^(-1/2) = diag(1/2, 1/3, 1/4)
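The entrywise rules above can be sketched in a few lines of Python. A diagonal matrix is represented here only by the list of its diagonal entries; that representation and the function names are assumptions made for this example.

```python
import math


def diag_inverse(entries):
    """Diagonal of A^(-1) for A = diag(entries); every entry must be nonzero."""
    if any(a == 0 for a in entries):
        raise ValueError("a diagonal matrix with a zero entry is not invertible")
    return [1 / a for a in entries]


def diag_inverse_sqrt(entries):
    """Diagonal of A^(-1/2) for A = diag(entries); every entry must be positive."""
    if any(a <= 0 for a in entries):
        raise ValueError("the inverse square root needs positive diagonal entries")
    return [1 / math.sqrt(a) for a in entries]


a = [4, 9, 16]
inv_sqrt = diag_inverse_sqrt(a)

# Squaring A^(-1/2) entry by entry should recover A^(-1).
squares = [x * x for x in inv_sqrt]
print(inv_sqrt)
print(all(math.isclose(s, t) for s, t in zip(squares, diag_inverse(a))))
```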
3x3 determinants
For the matrix:
| a b c |
| d e f |
| g h i |
Expansion along the first row gives:
det = a(ei - fh) - b(di - fg) + c(dh - eg)
The first-row sign pattern is + - +.
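The first-row expansion translates directly into code. The sketch below assumes the matrix is given as three rows of three numbers; the function name det3 is chosen for this example.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


print(det3([[2, 0, 0], [0, 3, 0], [0, 0, 4]]))  # 24
print(det3([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))  # 0
```

The second matrix has determinant 0 because its rows are linearly dependent (row 1 + row 3 = 2 * row 2).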
Singular Value Decomposition
Meaning
Singular Value Decomposition writes a matrix as:
A = U Σ V^T
One useful interpretation is:
V^T: rotate or change input coordinates
Σ: scale each special direction
U: rotate or change output coordinates
The core relationship is:
A v_i = σ_i u_i
Here v_i is an input direction, u_i is an
output direction, and σ_i is the singular value.
Where the formula comes from
A v_i = UΣV^T v_i
V^T v_i = e_i
A v_i = UΣe_i = U(σ_i e_i) = σ_i u_i
Why A^T A and AA^T appear
A^T A v_i = σ_i^2 v_i
AA^T u_i = σ_i^2 u_i
So A^T A gives the right singular vectors
v_i and σ_i^2, while AA^T
gives the left singular vectors u_i and
σ_i^2.
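The two eigenvalue relations follow from substituting A = U Σ V^T and using the orthogonality of U and V (U^T U = I and V^T V = I). A short derivation, written in LaTeX and using the same e_i as in the previous section:

```latex
\begin{aligned}
A^T A &= (U \Sigma V^T)^T (U \Sigma V^T)
       = V \Sigma^T U^T U \Sigma V^T
       = V (\Sigma^T \Sigma) V^T, \\
A^T A \, v_i &= V (\Sigma^T \Sigma) V^T v_i
       = V (\Sigma^T \Sigma) e_i
       = \sigma_i^2 \, V e_i
       = \sigma_i^2 \, v_i, \\
A A^T &= (U \Sigma V^T)(U \Sigma V^T)^T
       = U \Sigma V^T V \Sigma^T U^T
       = U (\Sigma \Sigma^T) U^T, \\
A A^T \, u_i &= U (\Sigma \Sigma^T) U^T u_i
       = \sigma_i^2 \, u_i .
\end{aligned}
```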
σ_i = sqrt(eigenvalue of A^T A)
u_i = A v_i / σ_i      (for σ_i > 0)
v_i = A^T u_i / σ_i    (for σ_i > 0)
Normalized vectors do not imply σ_i = 1.
Normalized means ||u_i|| = 1 and
||v_i|| = 1; the singular value still records how
much A stretches the direction.
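A small numeric check makes these relations concrete. The 2 x 2 matrix below is an example chosen for this sketch, and the eigenpairs of A^T A = [[5, 3], [3, 5]] were worked out by hand rather than computed by a library; the helper names are likewise assumptions.

```python
import math

# Example matrix chosen for this sketch.
A = [[2.0, 2.0],
     [-1.0, 1.0]]


def transpose(M):
    return [list(col) for col in zip(*M)]


def matvec(M, x):
    return [sum(m * t for m, t in zip(row, x)) for row in M]


def norm(x):
    return math.sqrt(sum(t * t for t in x))


s = 1 / math.sqrt(2)
# Hand-computed eigenpairs of A^T A: (eigenvalue, unit eigenvector v_i).
eigenpairs = [(8.0, [s, s]), (2.0, [s, -s])]

triples = []
for eigenvalue, v in eigenpairs:
    sigma = math.sqrt(eigenvalue)                            # sigma_i
    u = [t / sigma for t in matvec(A, v)]                    # u_i = A v_i / sigma_i
    v_back = [t / sigma for t in matvec(transpose(A), u)]    # A^T u_i / sigma_i
    triples.append((sigma, u, v, v_back))
    print(f"sigma={sigma:.4f}  |u|={norm(u):.4f}  |v|={norm(v):.4f}")
```

Both u_i and v_i have unit length, while the singular values are 2*sqrt(2) and sqrt(2), which illustrates that normalization says nothing about the size of σ_i.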
When AA^T = A^T A
If AA^T = A^T A, the left and right singular vectors are
eigenvectors of the same matrix, which can make hand calculation
easier. But u_i and v_i need not be equal, and repeated singular
values make the basis non-unique, so the chosen pairing must still
satisfy A v_i = σ_i u_i.
Haar Transform
What W represents
For N = 4, the Haar transform matrix
W_4 is built from Haar basis functions sampled at four
positions. Each column is one Haar basis vector.
W_4 = 1/2 *
|  1   1   sqrt(2)         0  |
|  1   1  -sqrt(2)         0  |
|  1  -1         0   sqrt(2)  |
|  1  -1         0  -sqrt(2)  |
- Column 1: overall average
- Column 2: first half versus second half
- Column 3: difference inside the first half
- Column 4: difference inside the second half
The outside factor 1/2 normalizes the columns to unit length.
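The unit-length claim, and the fact that the columns are mutually orthogonal, can be checked directly. The sketch below also transforms a sample signal; it assumes the analysis convention c = W^T x and synthesis x = W c, which these notes do not state explicitly, and the signal values are made up for illustration.

```python
import math

r = math.sqrt(2)
# W_4 as written above, including the 1/2 factor.
W4 = [[0.5 * x for x in row] for row in [
    [1,  1,  r,  0],
    [1,  1, -r,  0],
    [1, -1,  0,  r],
    [1, -1,  0, -r],
]]


def column(M, k):
    return [row[k] for row in M]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


# Gram matrix of the columns: the identity if the columns are orthonormal.
gram = [[dot(column(W4, i), column(W4, j)) for j in range(4)] for i in range(4)]

# Assumed convention: analysis c = W^T x, synthesis x = W c.
x = [4.0, 2.0, 5.0, 5.0]
coeffs = [dot(column(W4, k), x) for k in range(4)]
recon = [dot(row, coeffs) for row in W4]

print([round(c, 4) for c in coeffs])
print([round(v, 4) for v in recon])
```

The last coefficient is zero because the two samples in the second half are equal, which is the local-difference behaviour the column descriptions refer to.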
Relationship between phi_k(x), k, and W
phi_k(x) defines the k-th Haar basis
function. The matrix W is the discrete matrix produced
by evaluating those basis functions at sample points.
W's k-th column = values of phi_k(x) at the N sample points
The indexing rule:
k = 2^p + q
means p is the scale or level, and q is
the position of the interval at that level.
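The indexing rule can be turned into a small generator for W. The sketch below assumes a 0-based column index k (so k = 0 is the constant column, which these notes call Column 1), with 0 <= q < 2^p, and amplitude 2^(p/2) / sqrt(N) on the interval selected by p and q. Other texts shift k or q by one, so treat the exact index ranges as an assumption; the function name haar_matrix is also chosen for this example.

```python
import math


def haar_matrix(n):
    """Orthonormal Haar matrix whose columns are the sampled basis vectors.

    Assumes n is a power of 2 and a 0-based column index k = 2^p + q.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of 2")
    scale = 1 / math.sqrt(n)
    W = [[0.0] * n for _ in range(n)]
    for row in range(n):
        W[row][0] = scale              # k = 0: overall average
    for k in range(1, n):
        p = k.bit_length() - 1         # level: largest p with 2^p <= k
        q = k - (1 << p)               # position within the level
        width = n >> p                 # samples covered by this interval
        start = q * width
        half = width // 2
        amp = scale * 2 ** (p / 2)
        for row in range(start, start + half):
            W[row][k] = amp            # first half of the interval
        for row in range(start + half, start + width):
            W[row][k] = -amp           # second half of the interval
    return W


for row in haar_matrix(4):
    print([round(v, 4) for v in row])
```

For n = 4 this reproduces the W_4 shown earlier: the factor 1/2 times the entries 1, -1, sqrt(2), and -sqrt(2).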
Why N is usually a power of 2
The standard Haar transform repeatedly splits the data in half, so
it is naturally defined for N = 2^m, such as
2, 4, 8, and
16. Under this standard rule, there is no canonical
3 x 3 Haar matrix. For three samples, a common practical
approach is to pad to four samples and use W_4.
The W_8 pattern
W_8 follows the same rule as W_4, but it
adds one more level of subdivision. The columns move from coarse to
fine detail: overall average, first four versus last four, internal
differences inside each half, and finally pairwise differences.
W_8 = 1/sqrt(8) *
| 1 1 sqrt(2) 0 2 0 0 0 |
| 1 1 sqrt(2) 0 -2 0 0 0 |
| 1 1 -sqrt(2) 0 0 2 0 0 |
| 1 1 -sqrt(2) 0 0 -2 0 0 |
| 1 -1 0 sqrt(2) 0 0 2 0 |
| 1 -1 0 sqrt(2) 0 0 -2 0 |
| 1 -1 0 -sqrt(2) 0 0 0 2 |
| 1 -1 0 -sqrt(2) 0 0 0 -2 |
- Column 1: overall average
- Column 2: first 4 samples versus last 4 samples
- Columns 3-4: differences inside each group of 4
- Columns 5-8: adjacent pair differences
Walsh Transform
The Walsh transform is usually represented by a normalized Hadamard
matrix. Unlike the Haar matrix, it contains only +1 and
-1 entries. Its basis patterns are global sign patterns
rather than local interval differences.
W_4 in course ordering
Walsh matrices can use different row orderings. In one common
course ordering, the 4 x 4 Walsh matrix is:
W_4 = 1/2 *
| 1 -1 -1 1 |
| 1 -1 1 -1 |
| 1 1 1 1 |
| 1 1 -1 -1 |
W_8 in the same ordering
Extending that same row-ordering pattern to eight samples gives:
W_8 = 1/sqrt(8) *
| 1 -1 -1 1 -1 1 1 -1 |
| 1 -1 -1 1 1 -1 -1 1 |
| 1 -1 1 -1 -1 1 -1 1 |
| 1 -1 1 -1 1 -1 1 -1 |
| 1 1 1 1 1 1 1 1 |
| 1 1 1 1 -1 -1 -1 -1 |
| 1 1 -1 -1 1 1 -1 -1 |
| 1 1 -1 -1 -1 -1 1 1 |
The normalization factor depends on the convention. If the course
writes the unnormalized matrix, omit 1/2 for
W_4 and 1/sqrt(8) for
W_8. If the transform is orthonormal, include them.
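One way to cross-check a Walsh matrix in any ordering is to compare it with a Hadamard matrix built by the Sylvester (Kronecker doubling) construction. That construction is a standard technique brought in for this sketch, not necessarily how the course derives its ordering. The check below confirms that the 4 x 4 matrix above has the same rows as the Sylvester matrix, only in a different order, and that its unnormalized rows are mutually orthogonal with squared length 4.

```python
def hadamard(n):
    """Sylvester construction: H_2n = [[H, H], [H, -H]], starting from [[1]]."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of 2")
    H = [[1]]
    while len(H) < n:
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H


def rows_orthogonal(M):
    """True if distinct rows have dot product 0 and each row has squared length n."""
    n = len(M)
    return all(
        sum(a * b for a, b in zip(M[r], M[s])) == (n if r == s else 0)
        for r in range(n)
        for s in range(n)
    )


# Unnormalized W_4 in the ordering shown above.
COURSE_W4 = [
    [1, -1, -1, 1],
    [1, -1, 1, -1],
    [1, 1, 1, 1],
    [1, 1, -1, -1],
]

same_rows = sorted(COURSE_W4) == sorted(hadamard(4))
print(same_rows, rows_orthogonal(COURSE_W4))
```

Because each unnormalized row has squared length n, multiplying by 1/sqrt(n) (1/2 for n = 4, 1/sqrt(8) for n = 8) makes the matrix orthonormal.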
Haar versus Walsh
- Haar: local, coarse-to-fine differences; many zero entries.
- Walsh: global sign patterns; only +1 and -1.
Quick Formula Sheet
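Expression tree printing: O(n) time, O(h) recursion stack.
Proper binary trees: e = i + 1, n = 2e - 1
Diagonal inverse: diag(a_i)^(-1) = diag(1/a_i)
3x3 determinant: a(ei-fh)-b(di-fg)+c(dh-eg)
SVD: A = UΣV^T, A v_i = σ_i u_i
Haar transform size: N = 2^m.
Walsh matrix entries: +1 and -1.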