Ching-Hsiang Chu
Research Scientist, Meta/Facebook
Verified email at fb.com
Title
Cited by
Year
Scalable distributed DNN training using TensorFlow and CUDA-aware MPI: Characterization, designs, and performance evaluation
AA Awan, J Bédorf, CH Chu, H Subramoni, DK Panda
2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid …, 2019
Cited by: 45, Year: 2019
Optimized broadcast for deep learning workloads on dense-GPU InfiniBand clusters: MPI or NCCL?
AA Awan, CH Chu, H Subramoni, DK Panda
Proceedings of the 25th European MPI Users' Group Meeting, 1-9, 2018
Cited by: 42, Year: 2018
NV-Group: link-efficient reduction for distributed deep learning on modern dense GPU systems
CH Chu, P Kousha, AA Awan, KS Khorassani, H Subramoni, DK Panda
Proceedings of the 34th ACM International Conference on Supercomputing, 1-12, 2020
Cited by: 30, Year: 2020
The MVAPICH project: Transforming research into high-performance MPI library for HPC community
DK Panda, H Subramoni, CH Chu, M Bayatpour
Journal of Computational Science 52, 101208, 2021
Cited by: 26, Year: 2021
OC-DNN: Exploiting advanced unified memory capabilities in CUDA 9 and Volta GPUs for out-of-core DNN training
AA Awan, CH Chu, H Subramoni, X Lu, DK Panda
2018 IEEE 25th International Conference on High Performance Computing (HiPC …, 2018
Cited by: 25, Year: 2018
Improving SCTP performance by jitter-based congestion control over wired-wireless networks
JM Chen, CH Chu, EHK Wu, MF Tsai, JR Wang
EURASIP Journal on Wireless Communications and Networking 2011, 1-13, 2011
Cited by: 25, Year: 2011
CUDA kernel based collective reduction operations on large-scale GPU clusters
CH Chu, K Hamidouche, A Venkatesh, AA Awan, DK Panda
2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid …, 2016
Cited by: 24, Year: 2016
Efficient and scalable multi-source streaming broadcast on GPU clusters for deep learning
CH Chu, X Lu, AA Awan, H Subramoni, J Hashmi, B Elton, DK Panda
2017 46th International Conference on Parallel Processing (ICPP), 161-170, 2017
Cited by: 20, Year: 2017
High-performance, distributed training of large-scale deep learning recommendation models
D Mudigere, Y Hao, J Huang, A Tulloch, S Sridharan, X Liu, M Ozdal, ...
arXiv preprint arXiv:2104.05158, 2021
Cited by: 19, Year: 2021
Performance evaluation of MPI libraries on GPU-enabled OpenPOWER architectures: Early experiences
KS Khorassani, CH Chu, H Subramoni, DK Panda
International Conference on High Performance Computing, 361-378, 2019
Cited by: 18, Year: 2019
Exploiting GPUDirect RDMA in designing high performance OpenSHMEM for NVIDIA GPU clusters
K Hamidouche, A Venkatesh, AA Awan, H Subramoni, CH Chu, ...
2015 IEEE International Conference on Cluster Computing (CLUSTER), 78-87, 2015
Cited by: 18, Year: 2015
IVC: Imperceptible video communication
R Carvalho, CH Chu, LJ Chen
Proc. of HotMobile (poster), 2014
Cited by: 18, Year: 2014
Distributed topology control for energy-efficient and reliable wireless communications
MT Sun, CH Chu, EHK Wu, CS Hsiao, AAK Jeng
IEEE Systems Journal 12 (3), 2152-2161, 2017
Cited by: 17, Year: 2017
A collision-aware backoff mechanism for IEEE 802.11 WLANs
YC Chan, MC Liao, CH Chu
2009 IEEE International Conference on Intelligent Computing and Intelligent …, 2009
Cited by: 17, Year: 2009
Designing a profiling and visualization tool for scalable and in-depth analysis of high-performance GPU clusters
P Kousha, B Ramesh, KK Suresh, CH Chu, A Jain, N Sarkauskas, ...
2019 IEEE 26th International Conference on High Performance Computing, Data …, 2019
Cited by: 16, Year: 2019
Communication profiling and characterization of deep-learning workloads on clusters with high-performance interconnects
AA Awan, A Jain, CH Chu, H Subramoni, DK Panda
IEEE Micro 40 (1), 35-43, 2019
Cited by: 16, Year: 2019
Characterizing CUDA unified memory (UM)-aware MPI designs on modern GPU architectures
KV Manian, AA Awan, A Ruhela, CH Chu, H Subramoni, DK Panda
Proceedings of the 12th Workshop on General Purpose Processing Using GPUs, 43-52, 2019
Cited by: 16, Year: 2019
Mahmoud Khorashadi, Pallab Bhattacharya, Petr Lapukhov, Maxim Naumov, Lin Qiao, Mikhail Smelyanskiy, Bill Jia, and Vijay Rao. 2021. High-performance, Distributed Training of …
D Mudigere, Y Hao, J Huang, A Tulloch, S Sridharan, X Liu, M Ozdal, ...
arXiv preprint arXiv:2104.05158, 2021
Cited by: 14, Year: 2021
M. Khorashadi, P
D Mudigere, Y Hao, J Huang, A Tulloch, S Sridharan, X Liu, M Ozdal, ...
Bhattacharya, P. Lapukhov, M. Naumov, L. Qiao, M. Smelyanskiy, B. Jia, and V …, 2021
Cited by: 12, Year: 2021
Software-hardware co-design for fast and scalable training of deep learning recommendation models
D Mudigere, Y Hao, J Huang, Z Jia, A Tulloch, S Sridharan, X Liu, ...
Proceedings of the 49th Annual International Symposium on Computer …, 2022
Cited by: 10, Year: 2022