CompleteProblem: implement L-BFGS optimization
Previously, the optimization of the loss with respect to the model was performed using gradient descent. This commit adds an L-BFGS optimization method to the CompleteProblem class for efficiency. Because it operates on the complete rather than the linearized problem, the L-BFGS method also serves as a reference benchmark for other optimization methods.
The function in question is CompleteProblem::optimizeLBFGS(). The outer quasi-Newton loop is described as follows:
- The standard L-BFGS two-loop recursion applies the approximate inverse Hessian to the gradient: the right-hand side product is evaluated first, followed by the core product that approximates the truncated terms, followed by the left-hand side product.
- A simple line search along the resulting descent direction is used to find a suitable step length.
- The model is updated.
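The steps above can be sketched as follows. This is a minimal, self-contained illustration of the generic L-BFGS two-loop recursion with a backtracking (Armijo) line search, not the actual CompleteProblem implementation; all names, the memory size, and the quadratic test objective are assumptions for demonstration.

```cpp
#include <cmath>
#include <cstddef>
#include <deque>
#include <functional>
#include <vector>

using Vec = std::vector<double>;

static double dot(const Vec& a, const Vec& b) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

static void axpy(double c, const Vec& x, Vec& y) {  // y += c * x
    for (std::size_t i = 0; i < x.size(); ++i) y[i] += c * x[i];
}

// Two-loop recursion: apply the L-BFGS inverse-Hessian approximation to g.
static Vec applyInverseHessian(const Vec& g,
                               const std::deque<Vec>& S,    // step differences s_k
                               const std::deque<Vec>& Y) {  // gradient differences y_k
    Vec q = g;
    std::vector<double> alpha(S.size());
    // Right-hand side product: backward pass over the stored pairs.
    for (int i = static_cast<int>(S.size()) - 1; i >= 0; --i) {
        double rho = 1.0 / dot(Y[i], S[i]);
        alpha[i] = rho * dot(S[i], q);
        axpy(-alpha[i], Y[i], q);
    }
    // Core product: scaled-identity initial Hessian, gamma = s'y / y'y.
    if (!S.empty()) {
        double gamma = dot(S.back(), Y.back()) / dot(Y.back(), Y.back());
        for (double& v : q) v *= gamma;
    }
    // Left-hand side product: forward pass.
    for (std::size_t i = 0; i < S.size(); ++i) {
        double rho = 1.0 / dot(Y[i], S[i]);
        double beta = rho * dot(Y[i], q);
        axpy(alpha[i] - beta, S[i], q);
    }
    return q;  // the search direction is -q
}

// Outer quasi-Newton loop with a simple backtracking line search.
Vec lbfgsMinimize(const std::function<double(const Vec&)>& f,
                  const std::function<Vec(const Vec&)>& grad,
                  Vec x, int maxIter = 100, std::size_t memory = 10) {
    std::deque<Vec> S, Y;
    Vec g = grad(x);
    for (int it = 0; it < maxIter && std::sqrt(dot(g, g)) > 1e-8; ++it) {
        Vec d = applyInverseHessian(g, S, Y);
        for (double& v : d) v = -v;              // descent direction
        double fx = f(x), slope = dot(g, d), t = 1.0;
        Vec xNew(x.size());
        for (int ls = 0; ls < 50; ++ls, t *= 0.5) {
            for (std::size_t i = 0; i < x.size(); ++i) xNew[i] = x[i] + t * d[i];
            if (f(xNew) <= fx + 1e-4 * t * slope) break;  // Armijo condition
        }
        Vec gNew = grad(xNew);
        Vec s(x.size()), y(x.size());            // new curvature pair
        for (std::size_t i = 0; i < x.size(); ++i) {
            s[i] = xNew[i] - x[i];
            y[i] = gNew[i] - g[i];
        }
        if (dot(s, y) > 1e-12) {                 // keep the pair only if curvature is positive
            S.push_back(s); Y.push_back(y);
            if (S.size() > memory) { S.pop_front(); Y.pop_front(); }
        }
        x = xNew;                                // the model is updated
        g = gNew;
    }
    return x;
}
```

Discarding pairs with non-positive curvature keeps the inverse-Hessian approximation positive definite, so the recursion always produces a descent direction.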