
SGD batching improvements, LBFGS and a Gaussian Process#435

Open
Craigacp wants to merge 29 commits into oracle:main from Craigacp:sgd-batching

Conversation

@Craigacp
Member

Description

This PR started with a series of changes to improve training speed on batches in SGD training. It adds several improvements to the linear algebra package and refactors the SGD objective functions to operate on single examples or batches. These changes are prerequisites for the Gaussian process and LinearTrainer/L-BFGS implementations, which operate directly on the full dataset as a single batch. The GaussianProcessTrainer is an exact implementation which can be used for small regression problems, using a supplied kernel function to compute training data point similarity. The LinearTrainer is an implementation of linear/logistic regression which uses a second order gradient descent method (L-BFGS) to find the global minimum of the objective (unlike SGD, which may not find the global minimum without fine tuning of the learning rate), and it also has built-in L2 regularization to find a small parameter vector.

Future work is to add support for optimizing the kernel hyperparameters in the Gaussian process, and to add link functions to the GP so it can be used for classification as well, but this PR is plenty big enough as it is.
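To make the exact GP regression concrete: it fits by solving α = (K + σ²I)⁻¹y, where K is the kernel matrix over the training points, and predicts the posterior mean at x* as Σᵢ αᵢ k(x*, xᵢ). The sketch below is a minimal standalone Java illustration of that computation with an RBF kernel; it does not use Tribuo's actual GaussianProcessTrainer API, and all class and method names here (`GpSketch`, `fit`, `predict`) are hypothetical.

```java
// Standalone sketch of exact GP regression (not Tribuo's API).
public class GpSketch {

    // RBF kernel k(a, b) = exp(-(a - b)^2 / (2 * l^2)).
    static double rbf(double a, double b, double lengthScale) {
        double d = a - b;
        return Math.exp(-d * d / (2.0 * lengthScale * lengthScale));
    }

    // Solve A x = b by Gaussian elimination with partial pivoting.
    static double[] solve(double[][] a, double[] b) {
        int n = b.length;
        double[][] m = new double[n][n + 1];
        for (int i = 0; i < n; i++) {
            System.arraycopy(a[i], 0, m[i], 0, n);
            m[i][n] = b[i];
        }
        for (int col = 0; col < n; col++) {
            int pivot = col;
            for (int r = col + 1; r < n; r++) {
                if (Math.abs(m[r][col]) > Math.abs(m[pivot][col])) pivot = r;
            }
            double[] tmp = m[col]; m[col] = m[pivot]; m[pivot] = tmp;
            for (int r = col + 1; r < n; r++) {
                double f = m[r][col] / m[col][col];
                for (int c = col; c <= n; c++) m[r][c] -= f * m[col][c];
            }
        }
        double[] x = new double[n];
        for (int i = n - 1; i >= 0; i--) {
            double s = m[i][n];
            for (int c = i + 1; c < n; c++) s -= m[i][c] * x[c];
            x[i] = s / m[i][i];
        }
        return x;
    }

    // Fit: alpha = (K + noise * I)^-1 y, where K_ij = k(x_i, x_j).
    static double[] fit(double[] xs, double[] ys, double lengthScale, double noise) {
        int n = xs.length;
        double[][] k = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                k[i][j] = rbf(xs[i], xs[j], lengthScale);
            }
            k[i][i] += noise;
        }
        return solve(k, ys);
    }

    // Posterior mean at xStar: sum_i alpha_i * k(xStar, x_i).
    static double predict(double[] xs, double[] alpha, double lengthScale, double xStar) {
        double mean = 0.0;
        for (int i = 0; i < xs.length; i++) {
            mean += alpha[i] * rbf(xStar, xs[i], lengthScale);
        }
        return mean;
    }

    public static void main(String[] args) {
        // Three training points from y = x^2; with tiny noise the GP interpolates them.
        double[] xs = {0.0, 1.0, 2.0};
        double[] ys = {0.0, 1.0, 4.0};
        double[] alpha = fit(xs, ys, 1.0, 1e-6);
        System.out.println("posterior mean at x = 1.0: " + predict(xs, alpha, 1.0, 1.0));
    }
}
```

Note the O(n³) linear solve, which is why the PR describes this trainer as suitable for small regression problems; a production implementation would use a Cholesky factorization of the symmetric positive definite matrix instead of general elimination.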

Motivation

Improves the speed of minibatch SGD training where the batch size is greater than 1 (on dense data), adds an alternative linear model which finds the optimum via second order methods, and adds a flexible kernel-based regression suitable for small datasets.

Paper reference

  • L-BFGS following Nocedal & Wright 2006, "Numerical Optimization" (2nd Edition).
  • Gaussian Process regression following Rasmussen & Williams 2006, "Gaussian Processes for Machine Learning".
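The core of the L-BFGS approach cited above is the two-loop recursion (Nocedal & Wright, Algorithm 7.4), which builds a quasi-Newton search direction from a short history of step/gradient-difference pairs. Below is a minimal standalone sketch applying it to the kind of L2-regularized linear regression objective the LinearTrainer targets; it is not Tribuo's implementation, and all names (`LbfgsSketch`, `minimize`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Standalone L-BFGS sketch on f(w) = 0.5/n * ||Xw - y||^2 + 0.5 * lambda * ||w||^2.
public class LbfgsSketch {

    static double dot(double[] a, double[] b) {
        double s = 0.0;
        for (int i = 0; i < a.length; i++) s += a[i] * b[i];
        return s;
    }

    static double loss(double[][] x, double[] y, double lambda, double[] w) {
        double s = 0.0;
        for (int i = 0; i < x.length; i++) {
            double r = dot(x[i], w) - y[i];
            s += r * r;
        }
        return 0.5 * s / x.length + 0.5 * lambda * dot(w, w);
    }

    static double[] gradient(double[][] x, double[] y, double lambda, double[] w) {
        int n = x.length, d = w.length;
        double[] g = new double[d];
        for (int i = 0; i < n; i++) {
            double r = dot(x[i], w) - y[i];
            for (int j = 0; j < d; j++) g[j] += r * x[i][j] / n;
        }
        for (int j = 0; j < d; j++) g[j] += lambda * w[j];
        return g;
    }

    static double[] minimize(double[][] x, double[] y, double lambda, int memory, int maxIter) {
        int d = x[0].length;
        double[] w = new double[d];
        double[] g = gradient(x, y, lambda, w);
        List<double[]> sHist = new ArrayList<>(), yHist = new ArrayList<>();
        for (int iter = 0; iter < maxIter && Math.sqrt(dot(g, g)) > 1e-10; iter++) {
            // Two-loop recursion: implicitly apply the inverse Hessian approximation to g.
            double[] q = g.clone();
            int m = sHist.size();
            double[] alpha = new double[m];
            for (int i = m - 1; i >= 0; i--) {
                double[] si = sHist.get(i), yi = yHist.get(i);
                double rho = 1.0 / dot(yi, si);
                alpha[i] = rho * dot(si, q);
                for (int j = 0; j < d; j++) q[j] -= alpha[i] * yi[j];
            }
            if (m > 0) {
                // Scale by gamma = s.y / y.y from the most recent pair.
                double[] sL = sHist.get(m - 1), yL = yHist.get(m - 1);
                double gamma = dot(sL, yL) / dot(yL, yL);
                for (int j = 0; j < d; j++) q[j] *= gamma;
            }
            for (int i = 0; i < m; i++) {
                double[] si = sHist.get(i), yi = yHist.get(i);
                double rho = 1.0 / dot(yi, si);
                double beta = rho * dot(yi, q);
                for (int j = 0; j < d; j++) q[j] += (alpha[i] - beta) * si[j];
            }
            double[] dir = new double[d];
            for (int j = 0; j < d; j++) dir[j] = -q[j];
            // Backtracking Armijo line search starting from the unit quasi-Newton step.
            double t = 1.0, f0 = loss(x, y, lambda, w), slope = dot(g, dir);
            double[] wNew = new double[d];
            while (true) {
                for (int j = 0; j < d; j++) wNew[j] = w[j] + t * dir[j];
                if (loss(x, y, lambda, wNew) <= f0 + 1e-4 * t * slope || t < 1e-12) break;
                t *= 0.5;
            }
            double[] gNew = gradient(x, y, lambda, wNew);
            double[] s = new double[d], yv = new double[d];
            for (int j = 0; j < d; j++) {
                s[j] = wNew[j] - w[j];
                yv[j] = gNew[j] - g[j];
            }
            if (dot(s, yv) > 1e-12) { // keep only curvature-positive pairs
                sHist.add(s); yHist.add(yv);
                if (sHist.size() > memory) { sHist.remove(0); yHist.remove(0); }
            }
            w = wNew; g = gNew;
        }
        return w;
    }

    public static void main(String[] args) {
        // Tiny synthetic regression generated from w = [2, -1] with no noise.
        double[][] x = {{1, 0}, {0, 1}, {1, 1}, {2, 1}};
        double[] y = {2, -1, 1, 3};
        double[] w = minimize(x, y, 1e-4, 5, 100);
        System.out.println("w = [" + w[0] + ", " + w[1] + "]");
    }
}
```

Because the objective is strongly convex (the L2 term guarantees this for lambda > 0), this full-batch second order method converges to the unique global minimum, which is the contrast with plain SGD drawn in the description above.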

Craigacp added 28 commits March 18, 2026 20:27
…ugh into the trainers and parameters. This commit calls a lot of methods which need to be created in the la package.
…gradient methods on SGDObjective return a record not a pair, and adding a few methods to DenseMatrix and DenseVector in support of L-BFGS.
@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Mar 19, 2026

Labels

OCA Verified All contributors have signed the Oracle Contributor Agreement.


1 participant