Hi! I am currently a Ph.D. student at Wayne State University, working with Prof. Boris Mordukhovich from the Department of Mathematics and Dr. Pham Duy Khanh at Ho Chi Minh City University of Education. My recent research interests lie at the intersection of optimization theory and machine learning, including but not limited to:
Variational analysis
Mathematical optimization
Machine learning and statistical learning theory
Feel free to drop me an email at tranbadat@wayne.edu
The paper investigates the fundamental convergence properties of Sharpness-Aware Minimization (SAM), a recently proposed gradient-based optimization method (Foret et al., 2021) that significantly improves the generalization of deep neural networks. The established convergence properties include the stationarity of accumulation points, the convergence of the sequence of gradients to the origin, of the sequence of function values to the optimal value, and of the sequence of iterates to the optimal solution. The universality of the provided convergence analysis, based on inexact gradient descent frameworks (Khanh et al., 2023b), allows its extension to normalized versions of SAM such as VaSSO (Li & Giannakis, 2023) and RSAM (Liu et al., 2022), and to unnormalized versions of SAM such as USAM (Andriushchenko & Flammarion, 2022). Numerical experiments are conducted on classification tasks using deep learning models to confirm the practical aspects of our analysis.
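For context, here is a minimal NumPy sketch of the SAM update rule from Foret et al. (2021), which the analysis covers; the function names (`sam_step`, `loss_grad`) and the quadratic test loss are illustrative assumptions, not the paper's experimental setup.

```python
# A minimal sketch of the SAM update (Foret et al., 2021) on a toy quadratic loss.
# All names here are illustrative; this is not the experimental code of the paper.
import numpy as np

def sam_step(w, loss_grad, lr=0.1, rho=0.05, eps=1e-12):
    """One SAM iteration: normalized ascent to a worst-case point, then a descent step."""
    g = loss_grad(w)
    # Approximate worst-case perturbation inside a ball of radius rho.
    w_adv = w + rho * g / (np.linalg.norm(g) + eps)
    # The gradient taken at the perturbed point drives the actual update.
    return w - lr * loss_grad(w_adv)

# Toy usage on a strongly convex quadratic.
A = np.array([[3.0, 0.5], [0.5, 1.0]])
loss_grad = lambda w: A @ w
w = np.array([1.0, -2.0])
for _ in range(300):
    w = sam_step(w, loss_grad)
print(w)  # settles near the origin; with a constant rho the iterates stop within about rho of it
```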
Journal
Globally convergent coderivative-based generalized Newton methods in nonsmooth optimization
P. D. Khanh, B. S. Mordukhovich, V. T. Phat, and D. B. Tran
This paper proposes and justifies two globally convergent Newton-type methods for solving unconstrained and constrained problems of nonsmooth optimization by using tools of variational analysis and generalized differentiation. Both methods are coderivative-based and employ generalized Hessians (coderivatives of subgradient mappings) associated with objective functions, which are either of class C^{1,1} or are represented in the form of convex composite optimization, where one of the terms may be extended-real-valued. The proposed globally convergent algorithms are of two types. The first one extends the damped Newton method and requires positive-definiteness of the generalized Hessians for its well-posedness and efficient performance, while the other algorithm is of the regularized Newton type and is well-defined when the generalized Hessians are merely positive-semidefinite. The obtained convergence rates for both methods are at least linear and become superlinear under the semismooth* property of subgradient mappings. Problems of convex composite optimization are investigated with and without the strong convexity assumption on the smooth parts of the objective functions by implementing the machinery of forward-backward envelopes. Numerical experiments are conducted for Lasso problems and for box-constrained quadratic programs, providing performance comparisons of the new algorithms with other first-order and second-order methods that are highly recognized in nonsmooth optimization.
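As a rough illustration of the damped scheme, the following sketch specializes to a twice continuously differentiable objective, where the generalized Hessian reduces to the classical Hessian; the coderivative machinery for the nonsmooth and composite settings is not reproduced here, and all names are illustrative.

```python
# A hedged sketch of a damped Newton iteration in the smooth (C^2) special case.
# The generalized-Hessian/coderivative constructions of the paper are not reproduced.
import numpy as np

def damped_newton(f, grad, hess, x, sigma=1e-4, beta=0.5, tol=1e-8, max_iter=100):
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= tol:
            break
        # Newton direction; well-posedness here relies on hess(x) being positive-definite,
        # mirroring the positive-definiteness assumption on the generalized Hessian.
        d = np.linalg.solve(hess(x), -g)
        # Backtracking (Armijo) linesearch supplies the damping behind global convergence.
        t = 1.0
        while f(x + t * d) > f(x) + sigma * t * g @ d:
            t *= beta
        x = x + t * d
    return x

# Toy usage on a strongly convex quadratic; the exact minimizer is A^{-1} b.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
f = lambda x: 0.5 * x @ A @ x - b @ x
grad = lambda x: A @ x - b
hess = lambda x: A
print(damped_newton(f, grad, hess, np.zeros(2)))  # matches np.linalg.solve(A, b)
```

Roughly speaking, the regularized variant instead solves a regularized Newton system (in the smooth special case, something like (H + mu I) d = -g), which stays solvable when the Hessian is only positive-semidefinite.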
Journal
Inexact reduced gradient methods in nonconvex optimization
P. D. Khanh, B. S. Mordukhovich, and D. B. Tran
Journal of Optimization Theory and Applications, 2023
This paper proposes and develops new linesearch methods with inexact gradient information for finding stationary points of nonconvex continuously differentiable functions on finite-dimensional spaces. Some abstract convergence results for a broad class of linesearch methods are established. A general scheme for inexact reduced gradient (IRG) methods is proposed, where the errors in the gradient approximation automatically adapt to the magnitudes of the exact gradients. The sequences of iterates are shown to have stationary accumulation points when different stepsize selections are employed. Convergence results with constructive convergence rates for the developed IRG methods are established under the Kurdyka-Łojasiewicz property. The obtained results for the IRG methods are confirmed by encouraging numerical experiments, which demonstrate the advantages of automatically controlled errors in IRG methods over other frequently used error selections.
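To illustrate the adaptive-error idea on a smooth toy problem, the sketch below perturbs the exact gradient by an error kept below a fixed fraction of its norm and uses a backtracking stepsize; the paper's precise error rule and stepsize selections are not reproduced, and all names are illustrative.

```python
# A hedged sketch of an inexact gradient step whose error scales with the gradient
# magnitude, in the spirit of the IRG framework; not the paper's exact scheme.
import numpy as np

def irg_backtracking(f, grad, x, mu=0.3, sigma=1e-4, beta=0.5, tol=1e-8,
                     max_iter=5000, seed=0):
    rng = np.random.default_rng(seed)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= tol:
            break
        # Inexact gradient: the error is kept below a fixed fraction mu of the exact
        # gradient norm, so it shrinks automatically as the iterates approach stationarity.
        noise = rng.standard_normal(x.shape)
        noise *= mu * np.linalg.norm(g) / (np.linalg.norm(noise) + 1e-16)
        d = g + noise
        # Backtracking (Armijo-type) stepsize along the inexact direction.
        t = 1.0
        while f(x - t * d) > f(x) - sigma * t * d @ d:
            t *= beta
        x = x - t * d
    return x

# Toy usage on a smooth nonconvex function with minimizers at +/- 1 in each coordinate.
f = lambda w: np.sum((w**2 - 1) ** 2)
grad = lambda w: 4 * w * (w**2 - 1)
print(irg_backtracking(f, grad, np.array([2.0, -0.5])))  # approaches [1, -1]
```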