Ching-Hsiang Chu

Cited by

	All	Since 2019
Citations	822	704
h-index	18	16
i10-index	28	22

Citations per year

Year	Citations
2010	3
2011	5
2012	10
2013	12
2014	4
2015	9
2016	12
2017	19
2018	41
2019	59
2020	105
2021	139
2022	160
2023	196
2024	45

Public access

18 articles available
5 articles not available

Based on funding mandates

Co-authors

Dhabaleswar K. Panda - Professor of Computer Science, The Ohio State University - Verified email at cse.ohio-state.edu
Hari Subramoni - The Ohio State University - Verified email at cse.ohio-state.edu
Ammar Ahmad Awan - Microsoft - Verified email at osu.edu
Khaled Hamidouche - AMD Research - Verified email at amd.com
Kawthar Shafie Khorassani - The Ohio State University - Verified email at osu.edu
Akshay Venkatesh - NVIDIA; Ohio State University - Verified email at nvidia.com
Eric Hsiao-kuang Wu - National Central University - Verified email at csie.ncu.edu.tw
Xiaoyi Lu - Assistant Professor, University of California, Merced - Verified email at ucmerced.edu
Jahanzeb Hashmi - Senior Architect, NVIDIA - Verified email at nvidia.com
Pouya Kousha - Research Assistant, The Ohio State University - Verified email at osu.edu
(Altamont) Bracy Hamilton Elton - Penguin Computing - Verified email at bracyelton.com
Mohammadreza Bayatpour (Mamzi) - NVIDIA, The Ohio State University - Verified email at nvidia.com
Arpan Jain - The Ohio State University - Verified email at osu.edu
Karthik Vadambacheri Manian - St. Jude Children's Research Hospital - Verified email at stjude.org
Min-Te Sun - Professor of Computer Science and Information Engineering, National Central University - Verified email at csie.ncu.edu.tw
Srinivas Sridharan, PhD - Distinguished Engineer, NVIDIA - Verified email at nvidia.com
Qinghua Zhou - The Ohio State University - Verified email at osu.edu
Mustafa Ozdal - Meta - Verified email at meta.com
Dheevatsa Mudigere - Distinguished Engineer, NVIDIA - Verified email at nvidia.com
Liang Luo - University of Washington - Verified email at cs.washington.edu

Ching-Hsiang Chu

Research Scientist, Meta/Facebook

Verified email at meta.com

High-performance Computing, GPU, GPU Communication, Wireless Networks


Title	Cited by	Year
Software-hardware co-design for fast and scalable training of deep learning recommendation models - D Mudigere, Y Hao, J Huang, Z Jia, A Tulloch, S Sridharan, X Liu, ... - Proceedings of the 49th Annual International Symposium on Computer …, 2022	74	2022
The MVAPICH project: Transforming research into high-performance MPI library for HPC community - DK Panda, H Subramoni, CH Chu, M Bayatpour - Journal of Computational Science 52, 101208, 2021	61	2021
Scalable distributed DNN training using TensorFlow and CUDA-aware MPI: Characterization, designs, and performance evaluation - AA Awan, J Bédorf, CH Chu, H Subramoni, DK Panda - 2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid …, 2019	56	2019
Optimized broadcast for deep learning workloads on dense-GPU InfiniBand clusters: MPI or NCCL? - AA Awan, CH Chu, H Subramoni, DK Panda - Proceedings of the 25th European MPI Users' Group Meeting, 1-9, 2018	51	2018
NV-Group: Link-efficient reduction for distributed deep learning on modern dense GPU systems - CH Chu, P Kousha, AA Awan, KS Khorassani, H Subramoni, DK Panda - Proceedings of the 34th ACM International Conference on Supercomputing, 1-12, 2020	39	2020
M. khorashadi, P D Mudigere, Y Hao, J Huang, Z Jia, A Tulloch, S Sridharan, X Liu, ... Bhattacharya, P. Lapukhov, M. Naumov, L. Qiao, M. Smelyanskiy, B. Jia, and V …, 2021	38	2021
OC-DNN: Exploiting advanced unified memory capabilities in CUDA 9 and Volta GPUs for out-of-core DNN training - AA Awan, CH Chu, H Subramoni, X Lu, DK Panda - 2018 IEEE 25th International Conference on High Performance Computing (HiPC …, 2018	36	2018
High-performance, distributed training of large-scale deep learning recommendation models - D Mudigere, Y Hao, J Huang, A Tulloch, S Sridharan, X Liu, M Ozdal, ... - arXiv preprint arXiv:2104.05158, 2021	31	2021
Improving SCTP performance by jitter-based congestion control over wired-wireless networks - JM Chen, CH Chu, EHK Wu, MF Tsai, JR Wang - EURASIP Journal on Wireless Communications and Networking 2011, 1-13, 2011	27	2011
CUDA kernel based collective reduction operations on large-scale GPU clusters - CH Chu, K Hamidouche, A Venkatesh, AA Awan, DK Panda - 2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid …, 2016	26	2016
Performance evaluation of MPI libraries on GPU-enabled OpenPOWER architectures: Early experiences - KS Khorassani, CH Chu, H Subramoni, DK Panda - High Performance Computing: ISC High Performance 2019 International …, 2019	25	2019
Designing high-performance MPI libraries with on-the-fly compression for modern GPU clusters - Q Zhou, C Chu, NS Kumar, P Kousha, SM Ghazimirsaeed, H Subramoni, ... - 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2021	24	2021
Efficient and scalable multi-source streaming broadcast on GPU clusters for deep learning - CH Chu, X Lu, AA Awan, H Subramoni, J Hashmi, B Elton, DK Panda - 2017 46th International Conference on Parallel Processing (ICPP), 161-170, 2017	24	2017
Exploiting GPUDirect RDMA in designing high performance OpenSHMEM for NVIDIA GPU clusters - K Hamidouche, A Venkatesh, AA Awan, H Subramoni, CH Chu, ... - 2015 IEEE International Conference on Cluster Computing, 78-87, 2015	24	2015
Communication profiling and characterization of deep-learning workloads on clusters with high-performance interconnects - AA Awan, A Jain, CH Chu, H Subramoni, DK Panda - IEEE Micro 40 (1), 35-43, 2019	21	2019
Characterizing CUDA unified memory (UM)-aware MPI designs on modern GPU architectures - KV Manian, AA Ammar, A Ruhela, CH Chu, H Subramoni, DK Panda - Proceedings of the 12th Workshop on General Purpose Processing Using GPUs, 43-52, 2019	20	2019
Designing a profiling and visualization tool for scalable and in-depth analysis of high-performance GPU clusters - P Kousha, B Ramesh, KK Suresh, CH Chu, A Jain, N Sarkauskas, ... - 2019 IEEE 26th International Conference on High Performance Computing, Data …, 2019	19	2019
IVC: Imperceptible video communication - R Carvalho, CH Chu, LJ Chen - Proc. of HotMobile (poster), 2014	18	2014
Distributed topology control for energy-efficient and reliable wireless communications - MT Sun, CH Chu, EHK Wu, CS Hsiao, AAK Jeng - IEEE Systems Journal 12 (3), 2152-2161, 2017	17	2017
Optimized large-message broadcast for deep learning workloads: MPI, MPI+NCCL, or NCCL2? - AA Awan, KV Manian, CH Chu, H Subramoni, DK Panda - Parallel Computing 85, 141-152, 2019	16	2019
