


Youhe Jiang
This is just a disambiguation page, and is not intended to be the bibliography of an actual person. Any publication listed on this page has not been assigned to an actual author yet. If you know the true author of one of the publications listed below, you are welcome to contact us.
2020 – today
- 2025
[c6]Youhe Jiang, Ran Yan, Binhang Yuan:
HexGen-2: Disaggregated Generative Inference of LLMs in Heterogeneous Environment. ICLR 2025
[c5]Youhe Jiang, Fangcheng Fu, Xiaozhe Yao, Guoliang He, Xupeng Miao, Ana Klimovic, Bin Cui, Binhang Yuan, Eiko Yoneki:
Demystifying Cost-Efficiency in LLM Serving over Heterogeneous GPUs. ICML 2025
[c4]Youhe Jiang, Fangcheng Fu, Xiaozhe Yao, Taiyi Wang, Bin Cui, Ana Klimovic, Eiko Yoneki:
ThunderServe: High-performance and Cost-efficient LLM Serving in Cloud Environments. MLSys 2025
[i17]Youhe Jiang, Fangcheng Fu, Xiaozhe Yao, Guoliang He, Xupeng Miao, Ana Klimovic, Bin Cui, Binhang Yuan, Eiko Yoneki:
Demystifying Cost-Efficiency in LLM Serving over Heterogeneous GPUs. CoRR abs/2502.00722 (2025)
[i16]Youhe Jiang, Ran Yan, Binhang Yuan:
HexGen-2: Disaggregated Generative Inference of LLMs in Heterogeneous Environment. CoRR abs/2502.07903 (2025)
[i15]Youhe Jiang, Fangcheng Fu, Xiaozhe Yao, Taiyi Wang, Bin Cui, Ana Klimovic, Eiko Yoneki:
ThunderServe: High-performance and Cost-efficient LLM Serving in Cloud Environments. CoRR abs/2502.09334 (2025)
[i14]You Peng, Youhe Jiang, Chen Wang, Binhang Yuan:
HEXGEN-TEXT2SQL: Optimizing LLM Inference Request Scheduling for Agentic Text-to-SQL Workflow. CoRR abs/2505.05286 (2025)
[i13]Yuhang Wang, Youhe Jiang, Bin Cui, Fangcheng Fu:
Thinking Short and Right Over Thinking Long: Serving LLM Reasoning Efficiently and Accurately. CoRR abs/2505.13326 (2025)
[i12]Youhe Jiang, Fangcheng Fu, Wanru Zhao, Stephan Rabanser, Nicholas D. Lane, Binhang Yuan:
Cascadia: A Cascade Serving System for Large Language Models. CoRR abs/2506.04203 (2025)
[i11]Li Zhang, Youhe Jiang, Guoliang He, Xin Chen, Han Lv, Qian Yao, Fangcheng Fu, Kai Chen:
Efficient Mixed-Precision Large Language Model Inference with TurboMind. CoRR abs/2508.15601 (2025)
[i10]Ran Yan, Youhe Jiang, Binhang Yuan:
Flash Sparse Attention: An Alternative Efficient Implementation of Native Sparse Attention Kernel. CoRR abs/2508.18224 (2025)
[i9]Guoliang He, Youhe Jiang, Wencong Xiao, Kaihua Jiang, Shuguang Wang, Jun Wang, Zixian Du, Zhuo Jiang, Xinlei Zhang, Binhang Yuan, Eiko Yoneki:
Efficient Pre-Training of LLMs via Topology-Aware Communication Alignment on More Than 9600 GPUs. CoRR abs/2509.15940 (2025)
[i8]Chris Tong, Youhe Jiang, Gufeng Chen, Tianyi Zhao, Sibian Lu, Wenjie Qu, Eric Yang, Lynn Ai, Binhang Yuan:
Parallax: Efficient LLM Inference Service over Decentralized Environment. CoRR abs/2509.26182 (2025)
[i7]Ran Yan, Youhe Jiang, Tianyuan Wu, Jiaxuan Gao, Zhiyu Mei, Wei Fu, Haohui Mai, Wei Wang, Yi Wu, Binhang Yuan:
AReaL-Hex: Accommodating Asynchronous RL Training over Heterogeneous GPUs. CoRR abs/2511.00796 (2025)
- 2024
[j3]Yujie Wang, Youhe Jiang, Xupeng Miao, Fangcheng Fu, Shenhan Zhu, Xiaonan Nie, Yaofeng Tu, Bin Cui:
Improving Automatic Parallel Training via Balanced Memory Workload Optimization. IEEE Trans. Knowl. Data Eng. 36(8): 3906-3920 (2024)
[c3]Youhe Jiang, Ran Yan, Xiaozhe Yao, Yang Zhou, Beidi Chen, Binhang Yuan:
HexGen: Generative Inference of Large Language Model over Heterogeneous Environment. ICML 2024: 21946-21961
[c2]Xiaoyu You, Youhe Jiang, Jianwei Xu, Mi Zhang, Min Yang:
GNNFingers: A Fingerprinting Framework for Verifying Ownerships of Graph Neural Networks. WWW 2024: 652-663
[i6]Ran Yan, Youhe Jiang, Wangcheng Tao, Xiaonan Nie, Bin Cui, Binhang Yuan:
FlashFlex: Accommodating Large Language Model Training over Heterogeneous Environment. CoRR abs/2409.01143 (2024)
[i5]Dian Xiong, Li Chen, Youhe Jiang, Dan Li, Shuai Wang, Songtao Wang:
Revisiting the Time Cost Model of AllReduce. CoRR abs/2409.04202 (2024)
- 2023
[c1]Youhe Jiang, Fangcheng Fu, Xupeng Miao, Xiaonan Nie, Bin Cui:
OSDP: Optimal Sharded Data Parallel for Distributed Deep Learning. IJCAI 2023: 2142-2150
[i4]Yujie Wang, Youhe Jiang, Xupeng Miao, Fangcheng Fu, Xiaonan Nie, Bin Cui:
Improving Automatic Parallel Training via Balanced Memory Workload Optimization. CoRR abs/2307.02031 (2023)
[i3]Youhe Jiang, Ran Yan, Xiaozhe Yao, Beidi Chen, Binhang Yuan:
HexGen: Generative Inference of Foundation Model over Heterogeneous Decentralized Environment. CoRR abs/2311.11514 (2023)
- 2022
[j2]Xupeng Miao, Yujie Wang, Youhe Jiang, Chunan Shi, Xiaonan Nie, Hailin Zhang, Bin Cui:
Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism. Proc. VLDB Endow. 16(3): 470-479 (2022)
[i2]Youhe Jiang, Xupeng Miao, Xiaonan Nie, Bin Cui:
OSDP: Optimal Sharded Data Parallel for Distributed Deep Learning. CoRR abs/2209.13258 (2022)
[i1]Xupeng Miao, Yujie Wang, Youhe Jiang, Chunan Shi, Xiaonan Nie, Hailin Zhang, Bin Cui:
Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism. CoRR abs/2211.13878 (2022)
- 2020
[j1]Youhe Jiang, Huaxi Gu, Yunfeng Lu, Xiaoshan Yu:
2D-HRA: Two-Dimensional Hierarchical Ring-Based All-Reduce Algorithm in Large-Scale Distributed Machine Learning. IEEE Access 8: 183488-183494 (2020)
last updated on 2026-02-10 22:58 CET by the dblp team
all metadata released as open data under CC0 1.0 license







