I am an Applied Scientist at Amazon AWS AI working on natural language processing and machine learning.
My research is driven by the goal of bringing the world’s knowledge to the user’s assistance, which manifests itself in two main directions:
- How to effectively organize and use knowledge. This involves tasks like question answering (where I have co-led the development of some benchmarks for complex reasoning: HotpotQA and BeerQA), information extraction, syntactic analysis for many languages (check out Stanza, my go-to procrastination project), etc.
- How to effectively communicate knowledge. This mainly concerns interactive NLP systems such as conversational systems, where I am interested in theory-of-mind reasoning under information asymmetry, offline-to-online transfer, multi-modal interactions, etc.
In these tasks, I am also excited to explore data-efficient models and training techniques, model explainability, and self-supervised learning techniques that enable us to address these problems.
Before joining Amazon, I worked for JD.com AI Research as a research scientist. I obtained my Ph.D. in Computer Science at Stanford University, advised by Prof. Chris Manning, where I was a member of the NLP group. I also obtained two Master’s degrees at Stanford (CS & Statistics), and my Bachelor’s degree at Tsinghua University.
- [arXiv] SpanDrop: Simple and Effective Counterfactual Learning for Long Sequences. arXiv preprint arXiv:2022.02169, 2022.
- [ACL] Improving Time Sensitivity for Question Answering over Temporal Knowledge Graphs. In Association for Computational Linguistics (ACL), 2022.
- [EMNLP] Answering Open-Domain Questions of Varying Reasoning Steps from Text. In Empirical Methods in Natural Language Processing (EMNLP), 2021.
- [NAACL] Graph Ensemble Learning over Multiple Dependency Trees for Aspect-level Sentiment Classification. In 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2021.
- [ACL (Demo)] Stanza: A Python Natural Language Processing Toolkit for Many Human Languages. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2020.
- [Findings] Stay Hungry, Stay Focused: Generating Informative and Specific Questions in Information-Seeking Conversations. In Findings of the Association for Computational Linguistics: EMNLP 2020, 2020.
- [EMNLP] HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2018.