Benign Overfitting for Two-layer ReLU Networks Y Kou, Z Chen, Y Chen, Q Gu arXiv preprint arXiv:2303.04145, 2023 | 11 | 2023 |
Benign Overfitting in Two-layer ReLU Convolutional Neural Networks Y Kou, Z Chen, Y Chen, Q Gu International Conference on Machine Learning (ICML), 2023 | 6 | 2023 |
Why Does Sharpness-Aware Minimization Generalize Better Than SGD? Z Chen, J Zhang, Y Kou, X Chen, CJ Hsieh, Q Gu Advances in Neural Information Processing Systems 36, 2024 | 5 | 2024 |
How Does Semi-supervised Learning with Pseudo-labelers Work? A Case Study Y Kou, Z Chen, Y Cao, Q Gu The Eleventh International Conference on Learning Representations, 2023 | 5 | 2023 |
Certified Adversarial Robustness Under the Bounded Support Set Y Kou, Q Zheng, Y Wang International Conference on Machine Learning, 11559-11597, 2022 | 3 | 2022 |
Implicit Bias of Gradient Descent for Two-layer ReLU and Leaky ReLU Networks on Nearly-orthogonal Data Y Kou, Z Chen, Q Gu Advances in Neural Information Processing Systems 36, 2024 | 2 | 2024 |
Fast Sampling via De-randomization for Discrete Diffusion Models Z Chen, H Yuan, Y Li, Y Kou, J Zhang, Q Gu arXiv preprint arXiv:2312.09193, 2023 | 2 | 2023 |
Matching the Statistical Query Lower Bound for k-sparse Parity Problems with Stochastic Gradient Descent Y Kou, Z Chen, Q Gu, SM Kakade arXiv preprint arXiv:2404.12376, 2024 | | 2024 |
Guided Discrete Diffusion for Electronic Health Record Generation Z Chen, J Han, Y Li, Y Kou, E Halperin, RE Tillman, Q Gu arXiv preprint arXiv:2404.12314, 2024 | | 2024 |
On the Power of Multitask Representation Learning with Gradient Descent Q Li, Z Chen, Y Deng, Y Kou, Y Cao, Q Gu | | 2023 |