Licheng Yu

My name is Licheng Yu (虞立成). I am now a Research Scientist at Facebook AI. I completed my PhD in Computer Science at the University of North Carolina at Chapel Hill in May 2019. My advisor was Tamara L. Berg, and I also worked closely with Mohit Bansal during my PhD studies. My research interests lie in computer vision and natural language processing.

I received Master's degrees from both Georgia Tech and Shanghai Jiao Tong University in 2014. I received my Bachelor's degree from Shanghai Jiao Tong University.

Email: licheng [at]
Office: 201 S. Columbia St., Rm-257, UNC-Chapel Hill, NC 27599-3175
More info: [Resume], [Google Scholar], [LinkedIn], [GitHub].


Work Experience

2020.03 - present:

Research Scientist





Research Assistant


Research Intern


Research Intern


Research Intern


Research Assistant

Projects & Publications

HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training
Linjie Li*, Yen-Chun Chen*, Yu Cheng, Zhe Gan, Licheng Yu, Jingjing Liu
(*The first two authors contributed equally.)
Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models
ECCV 2020
Jize Cao, Zhe Gan, Yu Cheng, Licheng Yu, Yen-Chun Chen, Jingjing Liu
TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval
ECCV 2020
Jie Lei, Licheng Yu, Tamara L. Berg, Mohit Bansal
UNITER: Learning UNiversal Image-Text Representations
ECCV 2020
Yen-Chun Chen*, Linjie Li*, Licheng Yu*, Ahmed El Kholy, Faisal Ahmed, Zhe Gan, Yu Cheng, Jingjing Liu
(*The first three authors contributed equally.)
Achieves state-of-the-art results on 13 Vision+Language datasets/tasks, and ranks first on the VCR and NLVR2 leaderboards.
TVQA+: Spatio-Temporal Grounding for Video Question Answering
ACL 2020
Jie Lei, Licheng Yu, Tamara L. Berg, Mohit Bansal
BachGAN: High-Resolution Image Synthesis from Salient Object Layout
CVPR 2020
Yandong Li, Yu Cheng, Zhe Gan, Licheng Yu, Liqiang Wang, Jingjing Liu
[Paper] [Code]
VIOLIN: A Large-Scale Dataset for Video-and-Language Inference
CVPR 2020
Jingzhou Liu, Wenhu Chen, Yu Cheng, Zhe Gan, Licheng Yu, Yiming Yang, Jingjing Liu
Multi-Target Embodied Question Answering
CVPR 2019
Licheng Yu, Xinlei Chen, Georgia Gkioxari, Mohit Bansal, Tamara L. Berg, Dhruv Batra
[Paper] [Video] [Code]
Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout
NAACL 2019
Hao Tan, Licheng Yu, Mohit Bansal
[Paper] [Code]
TVQA: Localized Compositional Video Question Answering
EMNLP 2018
Jie Lei, Licheng Yu, Mohit Bansal, Tamara L. Berg
[Paper] [Project] [Explore] (Oral)
MAttNet: Modular Attention Network for Referring Expression Comprehension
CVPR 2018
Licheng Yu, Zhe Lin, Xiaohui Shen, Jimei Yang, Xin Lu, Mohit Bansal, Tamara L. Berg
From Image to Language and Back Again
Journal of Natural Language Engineering (JNLE), 2018
Anya Belz, Tamara L. Berg, Licheng Yu
Physics-Inspired Garment Recovery from a Single-View Image
ACM Transactions on Graphics, 2018
Shan Yang, Tanya Ambert, Zherong Pan, Ke Wang, Licheng Yu, Tamara L. Berg, Ming C. Lin
A Unified Framework for Manifold Landmarking
IEEE Transactions on Signal Processing, 2018
Hongteng Xu, Licheng Yu, Mark Davenport, Hongyuan Zha
Hierarchically-Attentive RNN for Album Summarization and Storytelling
EMNLP 2017
Licheng Yu, Mohit Bansal, Tamara L. Berg
A Joint Speaker-Listener-Reinforcer Model for Referring Expressions
CVPR 2017
Licheng Yu, Hao Tan, Mohit Bansal, Tamara L. Berg
[Paper] [Code] [Project] [Talk] (Spotlight 8%)
Modeling Context in Referring Expressions
ECCV 2016
Licheng Yu, Patrick Poirson, Shan Yang, Alexander C. Berg, Tamara L. Berg
[Paper] [Dataset] [Talk] (Spotlight 4.7%)
Visual Madlibs: Fill-in-the-blank Image Description and Question Answering
ICCV 2015
Licheng Yu, Eunbyung Park, Alexander C. Berg, Tamara L. Berg
Dictionary Learning with Mutually Reinforcing Group-Graph Structures
AAAI 2015
Licheng Yu*, Hongteng Xu*, Hongyuan Zha, Yi Xu
(* denotes equal contribution)
Vector Sparse Representation of Color Image Using Quaternion Matrix Analysis
IEEE Transactions on Image Processing, TIP 2015
Yi Xu, Licheng Yu, Hongteng Xu, Truong Nguyen, Hao Zhang
Quaternion-based Sparse Representation of Color Image
IEEE International Conference on Multimedia and Expo, ICME 2013
Licheng Yu, Yi Xu, Hongteng Xu, Hao Zhang
Single Image Super-resolution via Phase Congruency Analysis
IEEE Visual Communications and Image Processing, VCIP 2013
Licheng Yu, Yi Xu, Bo Zhang
[Paper] (Oral)
Self-Example Based Super-resolution with Fractal-based Gradient Enhancement
IEEE International Conference on Multimedia and Expo, ICME workshop 2013
Licheng Yu, Yi Xu, Hongteng Xu
Robust Single Image Super-resolution based on Gradient Enhancement
APSIPA Annual Summit and Conference, APSIPA 2012
Licheng Yu, Yi Xu, Hongteng Xu, Xiaokang Yang


Self-supervised Learning for Vision-and-Language
Recent Advances in Vision-and-Language Research
CVPR 2020 Tutorial
Licheng Yu, Linjie Li, Yen-Chun Chen
Revisiting Grid Features for VQA
Duy-Kien Nguyen, Huaizu Jiang, Vedanuj Goswami, Licheng Yu, Xinlei Chen
Winner of VQA 2020 Challenge
Gobang Android App (AI mode + 2-player mode)
Licheng Yu
Skill Measurement via Egocentric Vision in Wetlab
Licheng Yu, Yin Li, James Rehg

PhD Thesis: "Question Answering, Grounding, and Generation for Vision and Language" [PDF][Talk]