Jie Lei

Old Town, San Diego, May 2019 (Courtesy of Qin)

I am a final-year PhD student at UNC Chapel Hill, working on vision and language. My advisors are Tamara L. Berg and Mohit Bansal.

I earned my bachelor's degree in computer science from Yingcai Honors College, University of Electronic Science and Technology of China (UESTC), in 2017. I worked at Nanyang Technological University with Sinno Jialin Pan and at the University of Manitoba with Yang Wang.

Google Scholar / GitHub / Twitter / CV

Email: jielei [at] cs.unc.edu
Office: SN 260, 201 S. Columbia St. Chapel Hill, NC 27599-3175


News

Publications & Preprints

QVHighlights: Detecting Moments and Highlights in Videos via Natural Language Queries
Jie Lei, Tamara L. Berg, Mohit Bansal
VIMPAC: Video Pre-Training via Masked Token Prediction and Contrastive Learning
Hao Tan*, Jie Lei*, Thomas Wolf, Mohit Bansal
arXiv 2021 [PDF] [Code]
VALUE: A Multi-Task Benchmark for Video-and-Language Understanding Evaluation
Linjie Li*, Jie Lei*, Zhe Gan, Licheng Yu, Yen-Chun Chen, Rohit Pillai, Yu Cheng, Luowei Zhou, Xin Eric Wang, William Yang Wang, Tamara L. Berg, Mohit Bansal, Jingjing Liu, Lijuan Wang, Zicheng Liu
NeurIPS 2021 - Datasets and Benchmarks Track [PDF] [Code] [Leaderboard & Challenge]
Adversarial VQA: A New Benchmark for Evaluating the Robustness of VQA Models
Linjie Li, Jie Lei, Zhe Gan, Jingjing Liu
ICCV 2021 Oral [PDF] [Dataset]
mTVR: Multilingual Moment Retrieval in Videos
Jie Lei, Tamara L. Berg, Mohit Bansal
ACL 2021 [PDF] [Code]
Unifying Vision-and-Language Tasks via Text Generation
Jaemin Cho, Jie Lei, Hao Tan, Mohit Bansal
ICML 2021 [PDF] [Code]
Improved Pre-Training from Noisy Instructional Videos via Dense Captions and Entropy Minimization
Zineng Tang*, Jie Lei*, Mohit Bansal
NAACL 2021 [PDF] [Code]
Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling
Jie Lei*, Linjie Li*, Luowei Zhou, Zhe Gan, Tamara L. Berg, Mohit Bansal, Jingjing Liu
CVPR 2021 Oral, Best Student Paper Honorable Mention [PDF] [Code]
What is More Likely to Happen Next? Video-and-Language Future Event Prediction
Jie Lei, Licheng Yu, Tamara L. Berg, Mohit Bansal
EMNLP 2020 [PDF] [VLEP Dataset]
TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval
Jie Lei, Licheng Yu, Tamara L. Berg, Mohit Bansal
MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning
Jie Lei, Liwei Wang, Yelong Shen, Dong Yu, Tamara L. Berg, Mohit Bansal
TVQA+: Spatio-Temporal Grounding for Video Question Answering
Jie Lei, Licheng Yu, Tamara L. Berg, Mohit Bansal
TVQA: Localized, Compositional Video Question Answering
Jie Lei, Licheng Yu, Mohit Bansal, Tamara L. Berg
EMNLP 2018 Oral [PDF] [Slides] [Dataset] [Code]
Weakly Supervised Image Classification with Coarse and Fine Labels
Jie Lei, Zhenyu Guo, Yang Wang
CRV 2017 [PDF] [Code]

Projects

AnimeGAN: Create Anime Face using Generative Adversarial Networks
Jie Lei
A simple GAN model that automatically generates anime girl faces.

Miscs