Research
2025
- Job Market Paper: Bridging Dense and Sparse Models in High-Dimensional Quantile Regression. Yaping Wang. 2025. Slides coming soon.
This paper introduces a high-dimensional quantile regression model that bridges the dense and sparse modeling perspectives by allowing conditional quantiles to depend densely on latent factors capturing pervasive comovements and sparsely on idiosyncratic components reflecting heterogeneous, localized shocks. The resulting framework combines the interpretability and variable selection advantages of sparse models with the stability and dimension reduction of factor models. Theoretically, we establish convergence rates for the proposed estimator under weak temporal dependence and allow for both strong and weak factors. Simulation studies demonstrate favorable finite-sample performance and highlight a trade-off under weak factors, where the need to retain idiosyncratic components increases as the precision of their estimation deteriorates. In an empirical application to forecasting housing starts using a large macro-financial panel, the estimator achieves lower check loss than sparse quantile regression and factor-only specifications, with the largest gains in the lower tail.
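For readers unfamiliar with the ingredients named above, the following is a minimal sketch of the check (pinball) loss and of a PCA-based split of a predictor panel into estimated factors and idiosyncratic residuals. It is an illustration under stated assumptions, not the paper's estimator: the function names `check_loss` and `split_factors_idiosyncratic` are hypothetical, the number of factors `k` is taken as given, and no penalized quantile regression is fitted.

```python
import numpy as np

def check_loss(y, y_pred, tau):
    """Average check (pinball) loss at quantile level tau in (0, 1)."""
    u = np.asarray(y, dtype=float) - np.asarray(y_pred, dtype=float)
    # Positive residuals are weighted by tau, negative ones by (1 - tau).
    return float(np.mean(u * (tau - (u < 0).astype(float))))

def split_factors_idiosyncratic(X, k):
    """Split a T x N panel into k PCA factor estimates and residuals.

    Assumption: factors are estimated by principal components; the
    paper may use a different estimator.
    """
    Xc = X - X.mean(axis=0)                      # demean each predictor
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    factors = U[:, :k] * s[:k]                   # T x k component scores
    common = factors @ Vt[:k]                    # rank-k common component
    idiosyncratic = Xc - common                  # T x N residual panel
    return factors, idiosyncratic
```

In a setup like the one described, the factors would enter the quantile regression unpenalized and the idiosyncratic columns under a sparsity penalty; that estimation step is deliberately left out here because the abstract does not specify it.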
- Working Paper: Performance of Empirical Risk Minimization For Principal Component Regression. Yaping Wang, Christian Brownlees, and Guðmundur Stefán Guðmundsson. 2025. R&R at Econometric Theory.
This paper establishes bounds on the predictive performance of empirical risk minimization for principal component regression. Our analysis is nonparametric, in the sense that the relation between the prediction target and the predictors is not specified. In particular, we do not rely on the assumption that the prediction target is generated by a factor model. In our analysis we consider the cases in which the largest eigenvalues of the covariance matrix of the predictors grow linearly in the number of predictors (strong signal regime) or sublinearly (weak signal regime). The main result of this paper shows that empirical risk minimization for principal component regression is consistent for prediction and, under appropriate conditions, it achieves near-optimal performance in both the strong and weak signal regimes.
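As a point of reference, principal component regression under squared loss can be sketched as follows: extract the first k principal components of the predictors, then regress the target on the resulting scores by least squares. This is a minimal sketch with hypothetical function names (`pcr_fit`, `pcr_predict`) and an assumed squared loss; it is not taken from the paper.

```python
import numpy as np

def pcr_fit(X, y, k):
    """Fit principal component regression with k components."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    # Right singular vectors of the demeaned predictors give the loadings.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:k].T                                  # N x k loadings
    Z = Xc @ V                                    # T x k component scores
    # Least squares of the demeaned target on the scores.
    beta, *_ = np.linalg.lstsq(Z, y - y_mean, rcond=None)
    return x_mean, y_mean, V, beta

def pcr_predict(model, X_new):
    """Predict the target for new predictor rows."""
    x_mean, y_mean, V, beta = model
    return y_mean + (X_new - x_mean) @ V @ beta
```

Note that the components are computed from the predictors alone; the target enters only in the second-stage least squares fit.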
- Working Paper: Cross-Validating the Number of Factors for Prediction. Yaping Wang. 2025.
This paper studies how to determine the number of factors k by cross-validation in high-dimensional predictive models. We consider a nonparametric setting in which the relationship between the target variable and high-dimensional predictors is unspecified, and treat the number of factors as a tuning parameter for prediction: factors are estimated fold-wise from predictors only, k is selected to minimize the validation loss in predicting the target, and factors are re-estimated on the full sample once k is chosen. We show that fold-wise cross-validation achieves near-oracle out-of-sample predictive performance under both strong and weak factor regimes. Extensions to weakly dependent data are derived using blocked cross-validation, providing valid performance guarantees for factor-augmented predictions with time series. Our empirical application shows that cross-validated factor selection yields smaller out-of-sample prediction errors than information-criterion-based choices, particularly when the factors are weak or their directions are misaligned with the target.
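The fold-wise procedure described above can be sketched roughly as follows. This is an illustration under assumptions that are not from the paper: squared validation loss, PCA-based factor estimates, and contiguous folds as a simple stand-in for a blocked scheme (no gaps between training and validation blocks). The names `fit_pcr`, `predict_pcr`, and `cv_select_k` are hypothetical.

```python
import numpy as np

def fit_pcr(X, y, k):
    """PCA on predictors only, then least squares of y on k scores."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    _, _, Vt = np.linalg.svd(X - x_mean, full_matrices=False)
    V = Vt[:k].T
    beta, *_ = np.linalg.lstsq((X - x_mean) @ V, y - y_mean, rcond=None)
    return x_mean, y_mean, V, beta

def predict_pcr(model, X_new):
    x_mean, y_mean, V, beta = model
    return y_mean + (X_new - x_mean) @ V @ beta

def cv_select_k(X, y, k_grid, n_folds=5):
    """Select k by fold-wise CV, then refit on the full sample."""
    # Contiguous folds keep neighbouring observations together.
    folds = np.array_split(np.arange(len(y)), n_folds)
    cv_loss = {}
    for k in k_grid:
        sq_err = 0.0
        for val_idx in folds:
            train = np.ones(len(y), dtype=bool)
            train[val_idx] = False
            # Factors are re-estimated on the training part of each fold.
            model = fit_pcr(X[train], y[train], k)
            resid = y[val_idx] - predict_pcr(model, X[val_idx])
            sq_err += float(np.sum(resid ** 2))
        cv_loss[k] = sq_err / len(y)
    k_hat = min(cv_loss, key=cv_loss.get)
    # Final step: re-estimate factors and coefficients on all data.
    return k_hat, fit_pcr(X, y, k_hat), cv_loss
```

A call such as `k_hat, model, cv_loss = cv_select_k(X, y, k_grid=[1, 2, 3, 4])` returns the selected k, the full-sample fit, and the validation loss for each candidate. Every k in `k_grid` should be no larger than the smaller of the training-fold length and the number of predictors.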