Symbolic math in R with caracas

R
math
caracas
Author

Paolo Bosetti

Published

2025-Jan-08

Playing with caracas

I want to disclose some nice feature of caracas, a computer algebra system for R, which is actually a NumPy wrapper, although it provides an arguably nicer and more elegant interface.

A playful caracal

Linear regression

Let me consider the case of a linear regression with two predictors and two levels each:

\[ \hat y_{ij} = b_1 + b_2 x_{1,i} + b_3x_{2,j} + b_4x_{1,i}x_{2,j} \]

With linear algebra, we can write:

\[ \begin{align} X b &= y \\ X^TXb &= X^T y \\ (X^TX)^{-1}X^TX b &= (X^TX)^{-1}X^T y \end{align} \]

i.e. the vector \(\hat b\) that minimizes \(||y - Xb||^2\) is:

\[ \hat b = (X^TX)^{-1}X^T y \] The combination of predictor values, assuming that we are using coded units (i.e. each variable ranges from -1 to 1), can be obtained by expand.grid and model.matrix:

(x_mat <- expand.grid(A = c(-1, 1), B = c(-1, 1)) %>%
  model.matrix( ~ A * B, data = .))
  (Intercept)  A  B A:B
1           1 -1 -1   1
2           1  1 -1  -1
3           1 -1  1  -1
4           1  1  1   1
attr(,"assign")
[1] 0 1 2 3

In symbols, using caracas:

y  <- matrix_sym(2, 2, "y") %>% c()
X <- x_mat %>% as_sym()
b <- vector_sym(ncol(X), "b")
mu <- X %*% b

\[ y=\left[\begin{matrix}y_{11}\\y_{21}\\y_{12}\\y_{22}\end{matrix}\right];~~~X=\left[\begin{matrix}1 & -1 & -1 & 1\\1 & 1 & -1 & -1\\1 & -1 & 1 & -1\\1 & 1 & 1 & 1\end{matrix}\right];~~~b=\left[\begin{matrix}b_{1}\\b_{2}\\b_{3}\\b_{4}\end{matrix}\right];~~~\mu = X b =\left[\begin{matrix}b_{1} - b_{2} - b_{3} + b_{4}\\b_{1} + b_{2} - b_{3} - b_{4}\\b_{1} - b_{2} + b_{3} - b_{4}\\b_{1} + b_{2} + b_{3} + b_{4}\end{matrix}\right] \]

Note that the above equations are automatically generated by calling, e.g., tex(X) to get the \(\LaTeX\) expression of the symbolic variable X.

To find \(\hat b\) we solve the equation \((X^TX) b = X^T y\):

b_hat <- solve(t(X) %*% X, t(X) %*% y) %>% simplify()

to get:

\[ \hat b = \left[\begin{matrix}b_{1}\\b_{2}\\b_{3}\\b_{4}\end{matrix}\right] = \left[\begin{matrix}\frac{y_{11}}{4} + \frac{y_{12}}{4} + \frac{y_{21}}{4} + \frac{y_{22}}{4}\\- \frac{y_{11}}{4} - \frac{y_{12}}{4} + \frac{y_{21}}{4} + \frac{y_{22}}{4}\\- \frac{y_{11}}{4} + \frac{y_{12}}{4} - \frac{y_{21}}{4} + \frac{y_{22}}{4}\\\frac{y_{11}}{4} - \frac{y_{12}}{4} - \frac{y_{21}}{4} + \frac{y_{22}}{4}\end{matrix}\right] \]

One possible way to evaluate b_hat at a given set of values \(y=\{1, 2, 3, 7\}\) is to convert it to a function and call it with the arguments 1, 2, 3, and 7:

b <- b_hat %>% as_func()
b(1, 2, 3, 7)
     [,1]
[1,] 3.25
[2,] 1.75
[3,] 1.25
[4,] 0.75

This is just a stub, To be continued…