Cleaning Sample Covariance Matrices using Cross-Validation / Potters / Paris
Cleaning Sample Covariance Matrices using Cross-Validation: Empirical, Numerical and Theoretical Study
Internship proposed by Marc Potters, Capital Fund Management
Background: The problem of finding the best estimator of the true covariance matrix given a sample covariance matrix (in the absence of prior information about the true eigenvectors) has been solved by Ledoit and Péché. They showed that the sample eigenvalues should be replaced by a function of themselves: this function is called the non-linear shrinkage function. In the high-dimensional limit (size of the matrix going to infinity) this function can be computed exactly from the sample matrix. Computing this function at finite N is delicate, as one needs to evaluate a complex function at the branch cut, where it is most singular. Another approach is to compute the function by cross-validation: measure the eigenvectors on one data set and compute the variance along them on an independent set. The power of the cross-validation method is that it is model-independent: not only does it require little knowledge of random matrix theory, it is also robust to unknown additive noise and/or time-correlation in the data. Leave-one-out cross-validation should work well in principle, but in practice its output contains some high-frequency noise.
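The cross-validation idea described above can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the method used at CFM: the function name cv_shrinkage, the choice of contiguous folds, and the assumption that the data matrix X (T observations by N variables) has already been de-meaned are all hypothetical. Eigenvectors are measured on each training set, the variance of the held-out observations along them is accumulated, and the k-th averaged value replaces the k-th sample eigenvalue (ranks are matched through the ascending order returned by numpy.linalg.eigh).

```python
import numpy as np

def cv_shrinkage(X, n_folds=10):
    """Hypothetical K-fold estimate of the non-linear shrinkage function.

    X is a (T, N) array of observations, assumed to have zero mean.
    Returns the sample eigenvalues, their cross-validated replacements,
    and the cleaned covariance matrix.
    """
    T, N = X.shape
    folds = np.array_split(np.arange(T), n_folds)  # contiguous test blocks
    xi = np.zeros(N)
    for test_idx in folds:
        train_mask = np.ones(T, dtype=bool)
        train_mask[test_idx] = False
        X_train, X_test = X[train_mask], X[test_idx]
        # Eigenvectors of the in-sample covariance (columns, ascending order).
        E_train = X_train.T @ X_train / X_train.shape[0]
        _, U = np.linalg.eigh(E_train)
        # Squared projections of the held-out observations on those eigenvectors.
        xi += ((X_test @ U) ** 2).sum(axis=0)
    xi /= T  # every observation is held out exactly once

    # Keep the eigenvectors of the full sample, replace its eigenvalues by xi.
    E = X.T @ X / T
    lam, V = np.linalg.eigh(E)
    cleaned = (V * xi) @ V.T
    return lam, xi, cleaned
```

Because each set of eigenvectors is orthonormal, the cross-validated values sum to the trace of the sample covariance matrix, so the total variance is preserved by construction. The values are not guaranteed to be monotonic in the rank, which is one place where noise in the estimated function shows up.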
Project: The internship is about understanding leave-one-out cross-validation as a way to compute the non-linear shrinkage function for covariance matrix estimation. The intern will implement an algorithm and compare it with cross-validation on larger blocks and with a discrete implementation of the Ledoit-Péché formula. There are two implementation issues: first, to find a way to reduce or eliminate the high-frequency noise; second, to find a computationally efficient way to diagonalize many matrices that are just rank-1 updates of the same matrix. The problem of the high-frequency noise will also be tackled analytically. Numerical and analytical advances are to be tested on simulated data (where the theory is known) and on real financial data. The financial problem is to manage a book of stocks coming from option hedging. The goal is to find an algorithm that finds the trades that minimize transaction costs while keeping the risk of the book below a certain threshold. The issue is to find a risk model (covariance matrix) that is robust out-of-sample.
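To make the second implementation issue concrete, here is a brute-force leave-one-out baseline in NumPy. It is a sketch under assumptions, not the efficient algorithm the project is looking for: the function name loo_shrinkage_naive is hypothetical and the data matrix X (T observations by N variables) is assumed to be de-meaned. Each left-out observation x_t changes the unnormalized covariance S = X^T X only by the rank-1 term x_t x_t^T, yet this version rediagonalizes from scratch every time, at a cost of order T N^3.

```python
import numpy as np

def loo_shrinkage_naive(X):
    """Hypothetical brute-force leave-one-out shrinkage estimate.

    X is a (T, N) array of observations, assumed to have zero mean.
    Returns one value per eigenvalue rank, in ascending order.
    """
    T, N = X.shape
    S = X.T @ X  # unnormalized covariance of the full sample
    xi = np.zeros(N)
    for t in range(T):
        x = X[t]
        # Rank-1 downdate: covariance of the sample without observation t.
        E_minus = (S - np.outer(x, x)) / (T - 1)
        _, U = np.linalg.eigh(E_minus)  # full rediagonalization each time
        # Squared projection of the left-out observation on each eigenvector.
        xi += (U.T @ x) ** 2
    return xi / T
```

A baseline of this kind is mainly useful as a reference against which a faster scheme that exploits the rank-1 structure, or any noise-reduction step, can be checked on small simulated samples.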
Reference: The background material will be covered in my class in the M2 System Complex (winter 2019). See also Bun, Bouchaud and Potters, Physics Reports 666 (2017).
Location and dates: The internship will take place in the offices of CFM: 23 rue de l’université Paris 7e. Dates and duration are flexible and can be adapted to the requirements of your master’s program.
Possibility of thesis after the internship