Haodong Duan
38 Papers
2 Citations
Haodong Duan is an academic researcher. The author has contributed to research in the topics Computer science and Pattern recognition (psychology). The author has an h-index of 1 and has co-authored 1 publication.
Papers
MMBench: Is Your Multi-modal Model an All-around Player?
Yuan Liu, Haodong Duan, Yuanhan Zhang, Bo Li, Songyang Zhang, Wangbo Zhao, Yike Yuan, Jiaqi Wang, Conghui He, Ziwei Liu, Kai Chen, Dahua Lin
- 12 Jul 2023
TL;DR: MMBench is a multi-modality benchmark for large vision-language models, designed to evaluate them with a large number of questions covering a range of abilities.
InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition
Pan Zhang, Xiaoyi Wang, Yuhang Cao, Chao Xu, Linke Ouyang, Zhiyuan Zhao, Shuangrui Ding, Songyang Zhang, Haodong Duan, H. Yan, Xin Zhang, Wei Li, Jingwen Li, Kai Chen, Conghui He, Xingcheng Zhang, Yu Qiao, Da Lin, Jiaqi Wang
TL;DR: This work proposes InternLM-XComposer, a vision-language large model that enables advanced image-text comprehension and composition and achieves competitive text-image composition scores compared to public solutions, including GPT-4V and GPT-3.5.
PYSKL: Towards Good Practices for Skeleton Action Recognition
Haodong Duan, Jiaqi Wang, Kai Chen, Dahua Lin
- 19 May 2022
TL;DR: PYSKL implements six different algorithms under a unified framework, with both the latest and original good practices, to ease the comparison of efficacy and efficiency. It also provides an original GCN-based skeleton action recognition model named ST-GCN++, which achieves competitive recognition performance without any complicated attention schemes.
InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model
Xiao-wen Dong, Pan Zhang, Yuhang Zang, Yuhang Cao, Bin Wang, Linke Ouyang, Xilin Wei, Songyang Zhang, Haodong Duan, Maosong Cao, Wenwei Zhang, Yining Li, Hang Yan, Yang Gao, Xinyue Zhang, Wei Li, Jingwen Li, Kai Chen, Conghui He, Xingcheng Zhang, Yu Qiao, Dahua Lin, Jiaqi Wang
TL;DR: Experimental results demonstrate the superiority of InternLM-XComposer2 based on InternLM2-7B in producing high-quality long-text multi-modal content and its exceptional vision-language understanding performance across various benchmarks, where it not only significantly outperforms existing multimodal models but also matches or even surpasses GPT-4V and Gemini Pro in certain assessments.
111
InternLM2 Technical Report
Zheng Cai, Maosong Cao, Haojiong Chen, Kai Chen, Keyu Chen, Xin Chen, Xun Chen, Zehui Chen, Zhi Chen, Pei Chu, Xiao-wen Dong, Haodong Duan, Qi Fan, Zhaoye Fei, Yang Gao, Jiaye Ge, Chenya Gu, Yuzhe Gu, Tao Gui, Aijia Guo, Qipeng Guo, Conghui He, Yingfan Hu, Ting Huang, Tao Jiang, Penglong Jiao, Zhen Jin, Zhikai Lei, Jiaxing Li, Jingwen Li, Linyang Li, Shuaibin Li, Wei Li, Yining Li, Hongwei Liu, Jiangning Liu, Jiawei Hong, Kaiwen Liu, Kui-Jie Liu, Xiaoran Liu, Chen Lv, Haijun Lv, Kai Lv, Li Ma, Runyuan Ma, Zerun Ma, Wenchang Ning, Linke Ouyang, Jiantao Qiu, Yuan Qu, Fukai Shang, Yunfan Shao, Demin Song, Zifan Song, Zhihao Sui, Peng-hao Sun, Yu Sun, Huanze Tang, Bin Wang, Guoteng Wang, Jiaqi Wang, Jiayu Wang, Rui Wang, Yudong Wang, Ziyi Wang, Xing Wei, Qizhen Weng, Fan Wu, Yingtong Xiong, Chao Xu, Rui Ze Xu, Hang Yan, Yirong Yan, Xiaogui Yang, Haochen Ye, Huaiyuan Ying, Jia Yu, Jing Yu, Yuhang Zang, Chuyu Zhang, Li Zhang, Pan Zhang, Peng Zhang, Ruijie Zhang, Shuo Zhang, Songyang Zhang, Wenjian Zhang, Wenwei Zhang, Xingcheng Zhang, Xinyue Zhang, Hui Zhao, Qian Zhao, Xiaomeng Zhao, Fen-Fang Zhou, Zaida Zhou, Jingming Zhuo, Yi-Ling Zou, Xipeng Qiu, Yu Qiao, Dahua Lin
TL;DR: InternLM2 is an open-source LLM that outperforms its predecessors in comprehensive evaluations across various tasks, including long-context modeling and open-ended subjective evaluations. It utilizes innovative pre-training and optimization techniques to capture long-term dependencies and achieve remarkable performance on the "Needle-in-a-Haystack" test.
73