(matrix_calculus)=

# Matrix-Vector Calculus

Solverz provides automatic symbolic differentiation for mixed matrix-vector expressions.
When an equation contains matrix-vector products (`Mat_Mul`), Solverz automatically computes
analytical Jacobian blocks using tensor calculus with Einstein index notation.

This module is based on the approach described in:

> Laue, Mitterreiter, Giesen. *A Simple and Efficient Tensor Calculus.*
> Proceedings of the AAAI Conference on Artificial Intelligence, 2020.

See also: [matrixcalculus.org](https://www.matrixcalculus.org/)

## Supported Operations

| Operation | Symbolic Form | Example | Derivative |
|-----------|---------------|---------|------------|
| Matrix-vector multiply | `Mat_Mul(A, x)` | $Ax$ | $A$ |
| Element-wise multiply | `x * y` | $x \odot y$ | $\operatorname{diag}(y)$ |
| Addition | `x + y` | $x + y$ | $I$ |
| Absolute value | `Abs(x)` | $\|x\|$ | $\operatorname{diag}(\operatorname{sign}(x))$ |
| Exponential | `exp(x)` | $e^x$ | $\operatorname{diag}(e^x)$ |
| Sine / Cosine | `sin(x)` / `cos(x)` | $\sin(x)$ | $\operatorname{diag}(\cos(x))$ |
| Logarithm | `ln(x)` | $\ln(x)$ | $\operatorname{diag}(x^{-1})$ |
| Power | `x**n` | $x^n$ | $\operatorname{diag}(n x^{n-1})$ |
| Transpose | `transpose(A)` | $A^T$ | $A^T$ |
| Diagonal | `Diag(x)` | $\operatorname{diag}(x)$ | $\operatorname{diag}(\cdot)$ |

```{note}
`Mat_Mul` is the recommended interface for matrix-vector products. The legacy `MatVecMul` function
(which decomposes sparse matrices into CSC components for Numba) is deprecated. `Mat_Mul` uses
`scipy.sparse` directly, which is both faster and supports full matrix calculus.
```

All element-wise functions (exp, sin, cos, ln, Abs) can be composed with matrix operations:

```python
from Solverz import Var, Param, Eqn, Model
from Solverz.sym_algebra.functions import Mat_Mul, exp, sin

m = Model()
m.x = Var('x', [1, 2, 3])
m.A = Param('A', [[1, 0, 0], [0, 2, 0], [0, 0, 3]], dim=2)
m.f = Eqn('f', exp(Mat_Mul(m.A, m.x)))  # exp(A@x)
```

Solverz will automatically compute $\frac{\partial}{\partial x} e^{Ax} = \operatorname{diag}(e^{Ax}) \cdot A$.

## How It Works

### 1. Einstein Index Notation

Every tensor is labelled with indices:

- **Scalar**: no index (e.g., $\alpha$)
- **Vector** $x$: single index $x_i$
- **Matrix** $A$: two indices $A_{ij}$

Repeated indices in a product imply summation (contraction). For example,
$A_{ij} x_j$ means $\sum_j A_{ij} x_j = (Ax)_i$.

### 2. Computation DAG

The expression is parsed into a directed acyclic graph (DAG):

- **Leaf nodes**: variables ($x$, $y$) and parameters ($A$, $B$, $G$)
- **Internal nodes**: operations ($+$, $\times$, $@$, $f(\cdot)$)
- **Root node**: the output expression

For example, $e \odot (Ge - Bf)$ becomes:

```
        *[i]          ← element-wise multiply
       /    \
     e[i]   -[i]      ← subtraction
           /    \
       @[i]    @[i]    ← matrix-vector products
       / \     / \
     G[ij] e[j] B[ij] f[j]
```

### 3. Forward-Mode Automatic Differentiation

Derivatives propagate from **leaves to root** (pushforward):

1. **Leaf variables**: $\frac{\partial x_j}{\partial x_j} = 1$ (identity), $\frac{\partial x_j}{\partial y_k} = 0$

2. **Multiplication** (product rule in index notation):

```{math}
\tilde{C}_{s_3 s_4} = B_{s_2} \cdot \tilde{A}_{s_1 s_4} + A_{s_1} \cdot \tilde{B}_{s_2 s_4}
```

   where $s_1, s_2, s_3$ are the operand/result indices and $s_4$ is the derivative direction.

3. **Element-wise unary function** $f$:

```{math}
\frac{\partial f(A)}{\partial x} = f'(A) \odot \frac{\partial A}{\partial x}
```

4. **Addition**: $\frac{\partial (A + B)}{\partial x} = \frac{\partial A}{\partial x} + \frac{\partial B}{\partial x}$

### 4. TMul2Mul: Index Resolution

After differentiation, tensor index expressions are resolved back to matrix operations.
The key rules are:

| Index Pattern | Matrix Operation | Meaning |
|---------------|-----------------|---------|
| $(i, i, i)$ | $x \odot y$ | element-wise |
| $(i, j, ij)$ | $x \cdot y^T$ | outer product |
| $(ij, j, i)$ | $A y$ | matrix-vector product |
| $(ij, jk, ik)$ | $AB$ | matrix multiplication |
| $(i, ij, ij)$ | $\operatorname{diag}(x) A$ | diagonal scaling |
| $(ij, jj, ij)$ | $A \operatorname{diag}(y)$ | right diagonal scaling |
| $(i, ii, ii)$ | $\operatorname{diag}(x \odot y)$ | diagonal from element-wise |

## Application Examples

### Example 1: Power Flow Equations

The active power balance equation in power systems:

```{math}
P = e \odot (Ge - Bf) + f \odot (Be + Gf)
```

```python
from Solverz.sym_algebra.symbols import iVar, Para
from Solverz.sym_algebra.functions import Mat_Mul
from Solverz.sym_algebra.matrix_calculus import TensorExpr

e = iVar('e')
f = iVar('f')
G = Para('G', dim=2)
B = Para('B', dim=2)

P = e * (Mat_Mul(G, e) - Mat_Mul(B, f)) + f * (Mat_Mul(B, e) + Mat_Mul(G, f))
TP = TensorExpr(P)

print(TP.diff(e))  # diag(-B@f + G@e) + diag(e)@G + diag(f)@B
print(TP.diff(f))  # diag(B@e + G@f) + diag(e)@(-B) + diag(f)@G
```

### Example 2: Nonlinear Matrix Equations

```python
from Solverz.sym_algebra.functions import exp

A = Para('A', dim=2)
x = iVar('x')

expr = exp(Mat_Mul(A, x))
TE = TensorExpr(expr)
print(TE.diff(x))  # diag(exp(A@x))@A
```

### Example 3: Heat Network Equations

```python
mL = Para('mL', dim=2)
K = Para('K')
m = iVar('m')
from Solverz.sym_algebra.functions import Abs

expr = Mat_Mul(mL, K * Abs(m) * m)
TE = TensorExpr(expr)
print(TE.diff(m))  # mL@(diag(K*Abs(m)) + diag(K*m*Sign(m)))
```

## API Reference

```{eval-rst}
.. autofunction:: Solverz.sym_algebra.matrix_calculus.MixedEquationDiff
.. autoclass:: Solverz.sym_algebra.matrix_calculus.TensorExpr
   :members: diff, visualize
```