HanDiffuser: Text-to-Image Generation with Realistic Hand Appearances

2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024)

Abstract
Text-to-image generative models can generate high-quality humans, but realism is lost when generating hands. Common artifacts include irregular hand poses, shapes, incorrect numbers of fingers, and physically implausible finger orientations. To generate images with realistic hands, we propose a novel diffusion-based architecture called HanDiffuser that achieves realism by injecting hand embeddings in the generative process. HanDiffuser consists of two components: a Text-to-Hand-Params diffusion model to generate SMPL-Body and MANO-Hand parameters from input text prompts, and a Text-Guided Hand-Params-to-Image diffusion model to synthesize images by conditioning on the prompts and hand parameters generated by the previous component. We incorporate multiple aspects of hand representation, including 3D shapes and joint-level finger positions, orientations and articulations, for robust learning and reliable performance during inference. We conduct extensive quantitative and qualitative experiments and perform user studies to demonstrate the efficacy of our method in generating images with high-quality hands.
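
The two-component design described in the abstract can be pictured as a simple data flow: text goes into a first stage that produces body and hand parameters, and a second stage consumes both the text and those parameters to produce an image. The sketch below only illustrates that flow. Every name in it (`HandBodyParams`, `text_to_hand_params`, `hand_params_to_image`, `generate`) is hypothetical, the stage bodies are placeholders rather than diffusion models, and the vector sizes are the commonly used axis-angle sizes for SMPL and MANO poses, assumed here for illustration rather than taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed sizes (not from the abstract): 24 SMPL joints x 3 and
# 16 MANO joints x 3 in axis-angle form.
SMPL_POSE_DIM = 72
MANO_POSE_DIM = 48


@dataclass
class HandBodyParams:
    """Hypothetical container for the output of the first stage."""
    smpl_body: List[float]
    mano_hand: List[float]


def text_to_hand_params(prompt: str) -> HandBodyParams:
    """Stage 1 placeholder: a real model would sample these via diffusion."""
    return HandBodyParams(
        smpl_body=[0.0] * SMPL_POSE_DIM,
        mano_hand=[0.0] * MANO_POSE_DIM,
    )


def hand_params_to_image(
    prompt: str, params: HandBodyParams, height: int = 4, width: int = 4
) -> List[List[float]]:
    """Stage 2 placeholder: conditions on BOTH the prompt and the parameters.

    Returns a blank height x width grid standing in for an image.
    """
    return [[0.0] * width for _ in range(height)]


def generate(prompt: str) -> Tuple[HandBodyParams, List[List[float]]]:
    """Chain the two stages: text -> hand/body params -> image."""
    params = text_to_hand_params(prompt)
    image = hand_params_to_image(prompt, params)
    return params, image


if __name__ == "__main__":
    params, image = generate("a person waving at the camera")
    print(len(params.smpl_body), len(params.mano_hand), len(image), len(image[0]))
```

The point of the sketch is the interface between the stages: the second stage receives the prompt again alongside the generated parameters, matching the abstract's description of conditioning on both.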
Keywords
Hands, Diffusion, Humans