Implementation of Lasso Regression
This pull request introduces an implementation of the Lasso regression model as part of the `RustQuant_ml` crate. Lasso regression extends linear regression by adding an L1 regularisation term, which encourages sparsity in the coefficient vector, resulting in some coefficients being driven to zero. This in turn enables feature selection and reduces overfitting.
This implementation is designed to closely align with Scikit-Learn’s `linear_model.Lasso` model. The Lasso implementation from Scikit-Learn using the same data as the unit tests in this PR is available here.

Coordinate Descent Algorithm
The Loss function for Lasso regression is given by:
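(Here $\lambda$ denotes the L1 penalty weight, a notational choice for this derivation; with Scikit-Learn's $\tfrac{1}{2n}$ scaling of the squared-error term, its `alpha` corresponds to $\lambda / n$.)

$$
L(\beta) \;=\; \frac{1}{2}\,\lVert y - X\beta \rVert_2^2 \;+\; \lambda \lVert \beta \rVert_1 .
$$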
We iterate through each column individually, isolating the algorithm to column $j$.
Define the partial residual for column $j$ as:
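$$
r^{(j)} \;=\; y \;-\; \sum_{k \neq j} X_k \beta_k \;=\; y - X\beta + X_j \beta_j ,
$$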
where $X_j$ is the $j^{\text{th}}$ column of $X$, and $\beta_j$ is the coefficient corresponding to the $j^{\text{th}}$ feature.
This effectively computes the residuals without the $j^{\text{th}}$ column, allowing us to assess the contribution of feature $j$ to the model.
The objective function becomes:
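$$
L(\beta_j) \;=\; \frac{1}{2}\,\bigl\lVert r^{(j)} - X_j \beta_j \bigr\rVert_2^2 \;+\; \lambda \lvert \beta_j \rvert \;+\; \lambda \sum_{k \neq j} \lvert \beta_k \rvert ,
$$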
which can be expanded as:
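$$
L(\beta_j) \;=\; \frac{1}{2}\Bigl( (r^{(j)})^T r^{(j)} \;-\; \beta_j^T X_j^T r^{(j)} \;-\; (r^{(j)})^T X_j \beta_j \;+\; \beta_j^T X_j^T X_j\, \beta_j \Bigr) \;+\; \lambda \lvert \beta_j \rvert \;+\; \lambda \sum_{k \neq j} \lvert \beta_k \rvert .
$$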
Noting that $\beta_j \in \mathbb{R}$ (and thus $\beta_j^T = \beta_j$), we define:
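$$
\rho_j \;=\; X_j^T r^{(j)}, \qquad z_j \;=\; X_j^T X_j
$$

(the symbols $\rho_j$ and $z_j$ are simply shorthand introduced for this derivation).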
We can now minimise with respect to $\beta_j$:
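$$
\hat{\beta}_j \;=\; \arg\min_{\beta_j \in \mathbb{R}} \;\Bigl\{ \tfrac{1}{2}\, z_j \beta_j^2 \;-\; \rho_j \beta_j \;+\; \lambda \lvert \beta_j \rvert \Bigr\},
$$

where terms that do not depend on $\beta_j$ have been dropped.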
Differentiating with respect to $\beta_j$ and setting the derivative to zero, we obtain:
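$$
z_j \hat{\beta}_j \;-\; \rho_j \;+\; \lambda\, \text{sgn}(\hat{\beta}_j) \;=\; 0
\quad\Longrightarrow\quad
\hat{\beta}_j \;=\; \frac{\rho_j - \lambda\, \text{sgn}(\hat{\beta}_j)}{z_j},
$$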
where
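$$
\text{sgn}(x) \;=\;
\begin{cases}
1, & x > 0,\\
-1, & x < 0,
\end{cases}
$$

for $x \neq 0$.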
We now consider three cases:
Case 1: $\hat{\beta}_j > 0$
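$$
\text{sgn}(\hat{\beta}_j) = 1
\quad\Longrightarrow\quad
\hat{\beta}_j \;=\; \frac{\rho_j - \lambda}{z_j},
$$

which is positive only when $\rho_j > \lambda$.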
Case 2: $\hat{\beta}_j < 0$
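$$
\text{sgn}(\hat{\beta}_j) = -1
\quad\Longrightarrow\quad
\hat{\beta}_j \;=\; \frac{\rho_j + \lambda}{z_j},
$$

which is negative only when $\rho_j < -\lambda$.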
Case 3: $\hat{\beta}_j = 0$
Since $\lvert x \rvert$ is not differentiable at $x = 0$, we replace $\text{sgn}(x)$ there with the subgradient of $\lvert x \rvert$:
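$$
\partial \lvert x \rvert \;=\;
\begin{cases}
\{\text{sgn}(x)\}, & x \neq 0,\\
[-1,\;1], & x = 0.
\end{cases}
$$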
Thus, for $\hat{\beta}_j = 0$ to hold, we require:
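$$
0 \;\in\; z_j \cdot 0 \;-\; \rho_j \;+\; \lambda\,[-1,\;1]
\quad\Longleftrightarrow\quad
\lvert \rho_j \rvert \;\le\; \lambda .
$$

Combining the three cases gives the soft-thresholding update used in coordinate descent:

$$
\hat{\beta}_j \;=\; \frac{\operatorname{sgn}(\rho_j)\,\max\!\bigl(\lvert \rho_j \rvert - \lambda,\; 0\bigr)}{z_j}.
$$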
Since we iterate through each column while holding other coefficients fixed, the update for each coefficient must be repeated until convergence or until a maximum number of iterations is reached.
Define the change in the coefficient for feature $h$ at iteration $k$:
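$$
\Delta \beta_h^{(k)} \;=\; \beta_h^{(k)} \;-\; \beta_h^{(k-1)}.
$$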
For the $i^{\text{th}}$ observation, define the residual:
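$$
r_i \;=\; y_i \;-\; \sum_{j} X_{ij}\, \beta_j .
$$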
The residuals are updated after each coefficient adjustment as:
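$$
r_i \;\leftarrow\; r_i \;-\; X_{ih}\, \Delta \beta_h^{(k)} .
$$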
This ensures that at each iteration, the residuals reflect the most recent coefficient updates:
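$$
r_i \;-\; X_{ih}\, \Delta \beta_h^{(k)}
\;=\; y_i \;-\; \sum_{j \neq h} X_{ij}\, \beta_j \;-\; X_{ih}\, \beta_h^{(k-1)} \;-\; X_{ih}\bigl(\beta_h^{(k)} - \beta_h^{(k-1)}\bigr)
\;=\; y_i \;-\; \sum_{j \neq h} X_{ij}\, \beta_j \;-\; X_{ih}\, \beta_h^{(k)},
$$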
as required.
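For reference, a minimal sketch of this coordinate-descent loop in Rust is shown below. It is illustrative only and not the code added in this PR: the function name `lasso_coordinate_descent`, the plain `Vec`-based types, the synthetic data, and the convergence check on the largest coefficient change per sweep are all assumptions made for the example.

```rust
/// Illustrative coordinate-descent sketch for Lasso regression.
/// NOTE: not the implementation added in this PR; names, types, and the
/// convergence check are assumptions made for this example.
///
/// `x` is an n x p design matrix stored as rows, `y` has length n, and
/// `lambda` is the L1 weight in the unscaled objective used above.
fn lasso_coordinate_descent(
    x: &[Vec<f64>],
    y: &[f64],
    lambda: f64,
    max_iter: usize,
    tol: f64,
) -> Vec<f64> {
    let n = x.len();
    let p = x[0].len();
    let mut beta = vec![0.0_f64; p];

    // r_i = y_i - sum_j X_ij * beta_j; with beta = 0 this is just y.
    let mut residual: Vec<f64> = y.to_vec();

    // Precompute z_j = X_j^T X_j for each column.
    let z: Vec<f64> = (0..p)
        .map(|j| (0..n).map(|i| x[i][j] * x[i][j]).sum::<f64>())
        .collect();

    for _ in 0..max_iter {
        let mut max_delta = 0.0_f64;

        for j in 0..p {
            // rho_j = X_j^T r^(j) = X_j^T r + z_j * beta_j, using the current residual.
            let rho = (0..n).map(|i| x[i][j] * residual[i]).sum::<f64>() + z[j] * beta[j];

            // Soft-thresholding update: beta_j = S(rho_j, lambda) / z_j.
            let new_beta = if z[j] > 0.0 {
                rho.signum() * (rho.abs() - lambda).max(0.0) / z[j]
            } else {
                0.0
            };

            // Keep residuals in sync: r_i <- r_i - X_ih * (new_beta - old_beta).
            let delta = new_beta - beta[j];
            if delta != 0.0 {
                for i in 0..n {
                    residual[i] -= x[i][j] * delta;
                }
                beta[j] = new_beta;
            }
            max_delta = max_delta.max(delta.abs());
        }

        // Assumed convergence criterion: stop once the largest coefficient
        // change in a full sweep falls below `tol`.
        if max_delta < tol {
            break;
        }
    }

    beta
}

fn main() {
    // Tiny synthetic example: y depends (almost) only on the first feature.
    let x = vec![
        vec![1.0, 0.5],
        vec![2.0, -0.3],
        vec![3.0, 0.1],
        vec![4.0, -0.2],
    ];
    let y = vec![2.0, 4.1, 5.9, 8.2];
    let beta = lasso_coordinate_descent(&x, &y, 1.0, 1_000, 1e-8);
    println!("{beta:?}"); // expect a nonzero first coefficient, second driven to zero
}
```

Updating the residual vector in place after each coefficient change, as derived above, avoids recomputing $y - X\beta$ from scratch for every feature.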