{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,12]],"date-time":"2026-03-12T07:15:06Z","timestamp":1773299706593,"version":"3.50.1"},"reference-count":53,"publisher":"Association for Computing Machinery (ACM)","issue":"11","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2025,7]]},"abstract":"<jats:p>Graph processing underpins a vast array of data-centric applications, serving as a crucial component in fields such as social network analysis, recommendation systems, bio-informatics, and search engines. As graph data grows in scale and complexity, high-performance graph processing is increasingly essential. Many graph processing tasks depend on efficient data structures to manage the sparsity typical of real-world graphs, where most vertices have limited connectivity. This sparsity poses challenges for memory and computational efficiency in large-scale graph processing, and conventional sparse formats like Compressed Sparse Row (CSR) often struggle with memory and computation inefficiencies when handling massive graphs. To address these challenges, we introduce GraphCSR, a degree-equalized CSR format specifically tailored to enhance the spatio-temporal efficiency of distributed graph processing across various tasks. GraphCSR aggregates low-degree vertices into synthetic high-degree ones and applies group-wise compression to reduce storage overhead by recording only the starting index for each aggregated group. This reduces memory usage and supports batch-memory access to improve performance. Our extensive evaluations in various graph processing algorithms and datasets demonstrate that GraphCSR not only reduces the memory footprint required for large-scale graphs, but also improves performance across multiple types of graph processing tasks, outperforming popular sparse storage formats. Furthermore, when deployed on a production-scale supercomputer with 79,024 nodes, GraphCSR achieved a graph processing throughput that exceeded the top-ranked system on the Graph500 benchmark.<\/jats:p>","DOI":"10.14778\/3749646.3749691","type":"journal-article","created":{"date-parts":[[2025,9,4]],"date-time":"2025-09-04T17:55:06Z","timestamp":1757008506000},"page":"4255-4268","update-policy":"https:\/\/linproxy.fan.workers.dev:443\/https\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":10,"title":["GraphCSR: A Degree-Equalized CSR Format for Large-Scale Graph Processing"],"prefix":"10.14778","volume":"18","author":[{"given":"Xinbiao","family":"Gan","sequence":"first","affiliation":[{"name":"University of Defense Technology, Changsha, China"}]},{"given":"Tiejun","family":"Li","sequence":"additional","affiliation":[{"name":"University of Defense Technology, Changsha, China"}]},{"given":"Chunye","family":"Gong","sequence":"additional","affiliation":[{"name":"National Supercomputer Center in Tianjin, China"}]},{"given":"Dongsheng","family":"Li","sequence":"additional","affiliation":[{"name":"University of Defense Technology, Changsha, China"}]},{"given":"Dezun","family":"Dong","sequence":"additional","affiliation":[{"name":"University of Defense Technology, Changsha, China"}]},{"given":"Jie","family":"Liu","sequence":"additional","affiliation":[{"name":"University of Defense Technology, Changsha, China"}]},{"given":"Kai","family":"Lu","sequence":"additional","affiliation":[{"name":"University of Defense Technology, Changsha, China"}]}],"member":"320","published-online":{"date-parts":[[2025,9,4]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"2021. https:\/\/linproxy.fan.workers.dev:443\/https\/graph500.org\/. (2021)."},{"key":"e_1_2_1_2_1","unstructured":"2022. https:\/\/linproxy.fan.workers.dev:443\/https\/law.di.unimi.it\/webdata\/twitter-2010\/. (2022)."},{"key":"e_1_2_1_3_1","unstructured":"2022. https:\/\/linproxy.fan.workers.dev:443\/https\/lemurproject.org\/clueweb12\/. (2022)."},{"key":"e_1_2_1_4_1","volume-title":"Parallel Shortest Path Graph Computations of United States Road Network Data on Apache Spark. Social Informatics and Telecommunications Engineering","author":"Arfat Yasir","year":"2018","unstructured":"Yasir Arfat, Rashid Mehmood, and Aiiad Albeshri. 2018. Parallel Shortest Path Graph Computations of United States Road Network Data on Apache Spark. Social Informatics and Telecommunications Engineering (2018), 323\u2013336."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/3159652.3162007"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1583991.1584053"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2008.4536313"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/3503221.3508403"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2014.52"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/3190508.3190545"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1002\/cpe.4800"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00521-019-04121-z"},{"key":"e_1_2_1_13_1","doi-asserted-by":"crossref","unstructured":"F. Chierichetti R. Kumar S. Lattanzi M. Mitzenmacher A. Panconesi and P. Raghavan. 2009. On compressing social networks. SIGKDD (2009) 219\u2013228.","DOI":"10.1145\/1557019.1557049"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2021.3075457"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.procs.2012.04.007"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/3192366.3192404"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.14778\/3476311.3476369"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3190657"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/3689341"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3696410.3714833"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/3673038.3673129"},{"key":"e_1_2_1_22_1","volume-title":"TianheStar: Orchestrating SSSP Applications on Tianhe Supercomputer. In 2024 IEEE 24th International Symposium on Cluster, Cloud and Internet Computing (CCGrid). IEEE, 534\u2013542","author":"Gan Xinbiao","year":"2024","unstructured":"Xinbiao Gan, Qian Tang, Feng Xiong, Shijie Li, Bo Yang, and Tiejun Li. 2024. TianheStar: Orchestrating SSSP Applications on Tianhe Supercomputer. In 2024 IEEE 24th International Symposium on Cluster, Cloud and Internet Computing (CCGrid). IEEE, 534\u2013542."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/3627535.3638498"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2021.3100785"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE53745.2022.00199"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2014.68"},{"key":"e_1_2_1_27_1","volume-title":"The Graph 500 List. https:\/\/linproxy.fan.workers.dev:443\/https\/graph500.org\/ Last accessed","year":"2022","unstructured":"https:\/\/linproxy.fan.workers.dev:443\/http\/graph500.org\/. 2021. The Graph 500 List. https:\/\/linproxy.fan.workers.dev:443\/https\/graph500.org\/ Last accessed 03 March 2022."},{"key":"e_1_2_1_28_1","series-title":"Journal of Physics: Conference Series","volume-title":"Implementation of multiple precision sparse matrix-vector multiplication on CUDA using ELLPACK format","author":"Isupov Konstantin","year":"2013","unstructured":"Konstantin Isupov, Ivan Babeshko, and Alexander Krutikov. 2021. Implementation of multiple precision sparse matrix-vector multiplication on CUDA using ELLPACK format. In Journal of Physics: Conference Series, Vol. 1828. IOP Publishing, 012013."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/BigData.2014.7004270"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/3352460.3358286"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/2465351.2465369"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/1772690.1772751"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2018.00022"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11227-019-02835-4"},{"key":"e_1_2_1_35_1","doi-asserted-by":"crossref","unstructured":"Zhe Li Chengkun Wu Yishui Li Runduo Liu Kai Lu Ruibo Wang Jie Liu Chunye Gong Canqun Yang Xin Wang et al. 2023. Free energy perturbation-based large-scale virtual screening for effective drug discovery against COVID-19. The international journal of high performance computing applications 37 1 (2023) 45\u201357.","DOI":"10.1177\/10943420221117797"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/35021BIGCOMP.2015.7072830"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2017.53"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2018.00059"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/2751205.2751209"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/ASAP.2015.7245713"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2009.5161108"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/2807591.2807626"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/CLUSTER49012.2020.00053"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/BigData.2016.7840705"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1007\/s41019-016-0024-y"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/2983323.2983703"},{"key":"e_1_2_1_47_1","volume-title":"Nebula Graph: An open source distributed graph database. arXiv preprint arXiv:2206.07278","author":"Wu Min","year":"2022","unstructured":"Min Wu, Xinglu Yi, Hui Yu, Yu Liu, and Yujue Wang. 2022. Nebula Graph: An open source distributed graph database. arXiv preprint arXiv:2206.07278 (2022)."},{"key":"e_1_2_1_48_1","first-page":"458","article-title":"Bidirectional-Bitmap Based CSR for Reducing Large-Scale Graph Space","volume":"58","author":"Xinbiao Gan","year":"2021","unstructured":"Gan Xinbiao, Tan Wen, and Liu Jie. 2021. Bidirectional-Bitmap Based CSR for Reducing Large-Scale Graph Space. Journal of Computer Research and Development 58, 3 (2021), 458.","journal-title":"Journal of Computer Research and Development"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCSim.2015.7237065"},{"key":"e_1_2_1_50_1","unstructured":"R. Zafarani and H. Liu. 2009. Social Computing Data Repository at ASU [https:\/\/linproxy.fan.workers.dev:443\/http\/socialcomputing.asu.edu]. Informatics and Decision Systems Engineering (2009)."},{"key":"e_1_2_1_51_1","volume-title":"Gemini: A Computation-Centric Distributed Graph Processing System. In 12th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2016","author":"Zhu Xiaowei","year":"2016","unstructured":"Xiaowei Zhu, Wenguang Chen, Weimin Zheng, and Xiaosong Ma. 2016. Gemini: A Computation-Centric Distributed Graph Processing System. In 12th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2016, Savannah, GA, USA, November 2\u20134, 2016, Kimberly Keeton and Timothy Roscoe (Eds.). USENIX Association, 301\u2013316. https:\/\/linproxy.fan.workers.dev:443\/https\/www.usenix.org\/conference\/osdi16\/technical-sessions\/presentation\/zhu"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.14778\/1687627.1687727"},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.14778\/2002974.2002976"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/linproxy.fan.workers.dev:443\/https\/dl.acm.org\/doi\/pdf\/10.14778\/3749646.3749691","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,5]],"date-time":"2025-09-05T03:29:50Z","timestamp":1757042990000},"score":1,"resource":{"primary":{"URL":"https:\/\/linproxy.fan.workers.dev:443\/https\/dl.acm.org\/doi\/10.14778\/3749646.3749691"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7]]},"references-count":53,"journal-issue":{"issue":"11","published-print":{"date-parts":[[2025,7]]}},"alternative-id":["10.14778\/3749646.3749691"],"URL":"https:\/\/linproxy.fan.workers.dev:443\/https\/doi.org\/10.14778\/3749646.3749691","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2025,7]]},"assertion":[{"value":"2025-09-04","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}