{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,16]],"date-time":"2025-10-16T22:10:50Z","timestamp":1760652650307,"version":"build-2065373602"},"reference-count":26,"publisher":"Wiley","issue":"7","license":[{"start":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T00:00:00Z","timestamp":1760140800000},"content-version":"vor","delay-in-days":10,"URL":"https:\/\/linproxy.fan.workers.dev:443\/http\/onlinelibrary.wiley.com\/termsAndConditions#vor"}],"funder":[{"DOI":"10.13039\/501100012166","name":"National Key Research and Development Program of China","doi-asserted-by":"publisher","award":["2022YFB3303400"],"award-info":[{"award-number":["2022YFB3303400"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["Computer Graphics Forum"],"published-print":{"date-parts":[[2025,10]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Generating 3D objects with complex topologies from monocular images remains a challenge in computer graphics, due to the difficulty of modeling varying 3D shapes with disentangled, steerable geometry and visual attributes. NeRF\u2010based methods suffer from slow volumetric rendering and limited structural controllability. Recent advances in 3D Gaussian Splatting provide a more efficient alternative, but its generative modeling with separate control over structure and appearance remains underexplored. In this paper, we propose <jats:bold>G\u2010SplatGAN<\/jats:bold>, a novel 3D\u2010aware generation framework that combines the rendering efficiency of 3D Gaussian Splatting with disentangled latent modeling. Starting from a shared Gaussian template, our method uses dual modulation branches to modulate geometry and appearance from independent latent codes, enabling precise shape manipulation and controllable generation. 
We adopt a progressive adversarial training scheme with multi\u2010scale and patch\u2010based discriminators to capture both global structure and local detail. Our model requires no 3D supervision and is trained on monocular images with known camera poses, reducing data reliance while supporting real image inversion through a geometry\u2010aware encoder. Experiments show that G\u2010SplatGAN achieves superior performance in rendering speed, controllability and image fidelity, offering a compelling solution for controllable 3D generation using Gaussian representations.<\/jats:p>","DOI":"10.1111\/cgf.70256","type":"journal-article","created":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T12:34:49Z","timestamp":1760186089000},"update-policy":"https:\/\/linproxy.fan.workers.dev:443\/https\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["G\u2010SplatGAN: Disentangled 3D Gaussian Generation for Complex Shapes via Multi\u2010Scale Patch Discriminators"],"prefix":"10.1111","volume":"44","author":[{"ORCID":"https:\/\/linproxy.fan.workers.dev:443\/https\/orcid.org\/0009-0003-2979-7367","authenticated-orcid":false,"given":"Jiaqi","family":"Li","sequence":"first","affiliation":[{"name":"University of Science and Technology of China  Hefei China"}]},{"ORCID":"https:\/\/linproxy.fan.workers.dev:443\/https\/orcid.org\/0009-0004-2100-7962","authenticated-orcid":false,"given":"Haochuan","family":"Dang","sequence":"additional","affiliation":[{"name":"University of Science and Technology of China  Hefei China"}]},{"ORCID":"https:\/\/linproxy.fan.workers.dev:443\/https\/orcid.org\/0009-0005-3272-0445","authenticated-orcid":false,"given":"Zhi","family":"Zhou","sequence":"additional","affiliation":[{"name":"University of Science and Technology of China  Hefei 
China"}]},{"ORCID":"https:\/\/linproxy.fan.workers.dev:443\/https\/orcid.org\/0009-0005-7778-9614","authenticated-orcid":false,"given":"Junke","family":"Zhu","sequence":"additional","affiliation":[{"name":"University of Science and Technology of China  Hefei China"}]},{"ORCID":"https:\/\/linproxy.fan.workers.dev:443\/https\/orcid.org\/0000-0003-1475-8894","authenticated-orcid":false,"given":"Zhangjin","family":"Huang","sequence":"additional","affiliation":[{"name":"University of Science and Technology of China  Hefei China"}]}],"member":"311","published-online":{"date-parts":[[2025,10,11]]},"reference":[{"volume-title":"6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30 - May 3, 2018, Conference Track Proceedings","year":"2018","author":"Binkowski Mikolaj","key":"e_1_2_9_2_2"},{"key":"e_1_2_9_3_2","doi-asserted-by":"publisher","DOI":"10.1145\/3596711.3596730"},{"key":"e_1_2_9_4_2","unstructured":"Angel XChang ThomasFunkhouser Leonidas JGuibas PatHanrahan ZimoHuang ZhenLi SilvioSavarese ManolisSavva ShuranSong HaoSu JianxiongXiao LiYi andFisherYu. Shapenet: An information-rich 3d model repository.arXiv preprint arXiv:1512.03012 2015. 7 8"},{"key":"e_1_2_9_5_2","doi-asserted-by":"crossref","unstructured":"EricChan MarcoMonteiro PetrKellnhofer JiajunWu andGordonWetzstein. pi-gan: Periodic implicit generative adversarial networks for 3d-aware image synthesis. InProc. CVPR 2021. 2 4 5 8","DOI":"10.1109\/CVPR46437.2021.00574"},{"key":"e_1_2_9_6_2","doi-asserted-by":"crossref","unstructured":"ZilongChen FengWang YikaiWang andHuapingLiu.Text-to-3d using gaussian splatting 2024. 2 11","DOI":"10.1109\/CVPR52733.2024.02022"},{"key":"e_1_2_9_7_2","doi-asserted-by":"crossref","unstructured":"XuChen YufengZheng Michael J.Black OtmarHilliges andAndreasGeiger. Snarf: Differentiable forward skinning for animating non-rigid neural implicit shapes. 
InProceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV) pages11594\u201311604 October2021. 3","DOI":"10.1109\/ICCV48922.2021.01139"},{"key":"e_1_2_9_8_2","doi-asserted-by":"crossref","unstructured":"ThibaultGroueix MatthewFisher Vladimir G.Kim BryanRussell andMathieuAubry. AtlasNet: A Papier-M\u00e2ch\u00e9 Approach to Learning 3D Surface Generation. InProceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR) 2018. 2 3","DOI":"10.1109\/CVPR.2018.00030"},{"key":"e_1_2_9_9_2","first-page":"6629","volume-title":"Proceedings of the 31st International Conference on Neural Information Processing Systems","author":"Heusel Martin","year":"2017"},{"key":"e_1_2_9_10_2","unstructured":"LiangxiaoHu HongwenZhang YuxiangZhang BoyaoZhou BoningLiu ShengpingZhang andLiqiangNie. Gaussianavatar: Towards realistic human avatar modeling from a single video via animatable 3d gaussians. InIEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024. 2 6"},{"key":"e_1_2_9_11_2","unstructured":"PhillipIsola Jun-YanZhu TinghuiZhou andAlexei AEfros. Image-to-image translation with conditional adversarial networks.CVPR 2017. 6 10"},{"key":"e_1_2_9_12_2","doi-asserted-by":"publisher","DOI":"10.1145\/3592433"},{"key":"e_1_2_9_13_2","article-title":"Overlock: An overview-first-look-closely-next convnet with context-mixing dynamic kernels","volume":"2502","author":"Lou Meng","year":"2025","journal-title":"CoRR"},{"key":"e_1_2_9_14_2","doi-asserted-by":"crossref","unstructured":"LarsMescheder MichaelOechsle MichaelNiemeyer SebastianNowozin andAndreasGeiger. Occupancy networks: Learning 3d reconstruction in function space. In2019 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR) pages4455\u20134465 2019. 3","DOI":"10.1109\/CVPR.2019.00459"},{"key":"e_1_2_9_15_2","doi-asserted-by":"publisher","DOI":"10.1145\/3503250"},{"key":"e_1_2_9_16_2","doi-asserted-by":"crossref","unstructured":"MichaelNiemeyerandAndreasGeiger. 
Giraffe: Representing scenes as compositional generative neural feature fields. InProc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR) 2021. 2","DOI":"10.1109\/CVPR46437.2021.01129"},{"key":"e_1_2_9_17_2","doi-asserted-by":"crossref","unstructured":"AlbertPumarola EnricCorona GerardPons-Moll andFrancescMoreno-Noguer. D-nerf: Neural radiance fields for dynamic scenes. In2021 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR) pages10313\u201310322 2021. 3","DOI":"10.1109\/CVPR46437.2021.01018"},{"key":"e_1_2_9_18_2","unstructured":"Jeong JoonPark PeterFlorence JulianStraub RichardNew-combe andStevenLovegrove. Deepsdf: Learning continuous signed distance functions for shape representation. InProceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR) June2019. 3"},{"key":"e_1_2_9_19_2","doi-asserted-by":"publisher","DOI":"10.1145\/3272127.3275066"},{"key":"e_1_2_9_20_2","unstructured":"KatjaSchwarz YiyiLiao MichaelNiemeyer andAndreasGeiger. Graf: Generative radiance fields for 3d-aware image synthesis. InAdvances in Neural Information Processing Systems (NeurIPS) 2020. 2"},{"key":"e_1_2_9_21_2","first-page":"7462","article-title":"Implicit neural representations with periodic activation functions","volume":"33","author":"Sitzmann Vincent","year":"2020","journal-title":"Advances in neural information processing systems,"},{"key":"e_1_2_9_22_2","doi-asserted-by":"crossref","unstructured":"ZiyuWang YuDeng JiaolongYang JingyiYu andXinTong. Generative Deformable Radiance Fields for Disentangled Image Synthesis of Topology-Varying Objects.Computer Graphics Forum 2022. 2 3 7 8 9","DOI":"10.1111\/cgf.14689"},{"volume-title":"Advances in Neural Information Processing Systems","year":"2016","author":"Wu Jiajun","key":"e_1_2_9_23_2"},{"key":"e_1_2_9_24_2","unstructured":"TaoranYi JieminFang JunjieWang GuanjunWu LingxiXie XiaopengZhang WenyuLiu QiTian andXinggangWang. 
Gaussiandreamer: Fast generation from text to 3d gaussians by bridging 2d and 3d diffusion models. InCVPR 2024. 2 6 11"},{"key":"e_1_2_9_25_2","doi-asserted-by":"crossref","unstructured":"RichardZhang PhillipIsola AlexeiA.Efros Eli Shechtman and Oliver Wang. The unreasonable effectiveness of deep features as a perceptual metric. InProceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) June2018. 3","DOI":"10.1109\/CVPR.2018.00068"},{"key":"e_1_2_9_26_2","unstructured":"KaiZhang GernotRiegler NoahSnavely andVladlenKoltun.Nerf++: Analyzing and improving neural radiance fields 2020. 2"},{"key":"e_1_2_9_27_2","unstructured":"Jun-YanZhu ZhoutongZhang ChengkaiZhang JiajunWu AntonioTorralba JoshuaB.Tenenbaum and William T. Freeman. Visual object networks: Image generation with disentangled 3D representations. InAdvances in Neural Information Processing Systems (NeurIPS) 2018. 2"}],"container-title":["Computer Graphics Forum"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/linproxy.fan.workers.dev:443\/https\/onlinelibrary.wiley.com\/doi\/pdf\/10.1111\/cgf.70256","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,16]],"date-time":"2025-10-16T21:36:12Z","timestamp":1760650572000},"score":1,"resource":{"primary":{"URL":"https:\/\/linproxy.fan.workers.dev:443\/https\/onlinelibrary.wiley.com\/doi\/10.1111\/cgf.70256"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10]]},"references-count":26,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2025,10]]}},"alternative-id":["10.1111\/cgf.70256"],"URL":"https:\/\/linproxy.fan.workers.dev:443\/https\/doi.org\/10.1111\/cgf.70256","archive":["Portico"],"relation":{},"ISSN":["0167-7055","1467-8659"],"issn-type":[{"type":"print","value":"0167-7055"},{"type":"electronic","value":"1467-8659"}],"subject":[],"published":{"date-parts":[[2025,10]]},"assertion":[{"value":"2025-10-11","order
":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}],"article-number":"e70256"}}