Peng Qi


(Pinyin: Qí Péng; IPA: /tɕʰǐ pʰə̌ŋ/)

I am an Applied Scientist at Amazon AWS AI working on natural language processing and machine learning.

My research is driven by the goal of bringing the world’s knowledge to the user’s assistance, which manifests in two main directions:

  • How to effectively organize and use knowledge. This involves tasks like question answering (where I have co-led the development of benchmarks for complex reasoning: HotpotQA and BeerQA), information extraction, syntactic analysis for many languages (check out Stanza, my go-to procrastination project), etc.
  • How to effectively communicate knowledge. This mainly concerns interactive NLP systems such as conversational systems, where I am interested in theory-of-mind reasoning under information asymmetry (e.g., how to ask good questions and how to provide good answers beyond the literal answer), offline-to-online transfer, multi-modal interactions, etc.

Across these tasks, I am also excited to explore data-efficient models, training techniques, model explainability, and self-supervised learning methods that enable us to address these problems.

Before joining Amazon, I was a research scientist at AI Research. I obtained my Ph.D. in Computer Science at Stanford University, advised by Prof. Chris Manning, where I was a member of the NLP group. I also obtained two Master’s degrees at Stanford (CS & Statistics) and my Bachelor’s at Tsinghua University.

[CV (slightly outdated)]

selected publications

(*=equal contribution)

  1. EMNLP Findings
    Tokenization Consistency Matters for Generative Models on Extractive NLP Tasks
    Kaiser Sun, Peng Qi, Yuhao Zhang, Lan Liu, William Yang Wang, and Zhiheng Huang
    In Findings of the Association for Computational Linguistics: EMNLP 2023, 2023.
  2. ACL Findings
    RobustQA: Benchmarking the Robustness of Domain Adaptation for Open-Domain Question Answering
    Rujun Han, Peng Qi, Yuhao Zhang, Lan Liu, Juliette Burger, William Yang Wang, Zhiheng Huang, Bing Xiang, and Dan Roth
    In Findings of the Association for Computational Linguistics: ACL 2023, 2023.
  3. ACL Findings
    PragmatiCQA: A Dataset for Pragmatic Question Answering in Conversations
    Peng Qi*, Nina Du*, Christopher D. Manning, and Jing Huang
    In Findings of the Association for Computational Linguistics: ACL 2023, 2023.
  4. arXiv
    SpanDrop: Simple and Effective Counterfactual Learning for Long Sequences
    Peng Qi*, Guangtao Wang*, and Jing Huang
    arXiv preprint arXiv:2022.02169, 2022.
  5. EMNLP
    Answering Open-Domain Questions of Varying Reasoning Steps from Text
    Peng Qi*, Haejun Lee*, Oghenetegiri "TG" Sido*, and Christopher D. Manning
    In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.
  6. ACL (Demo)
    Stanza: A Python Natural Language Processing Toolkit for Many Human Languages
    Peng Qi*, Yuhao Zhang*, Yuhui Zhang, Jason Bolton, and Christopher D. Manning
    In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2020.
  7. EMNLP
    HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering
    Zhilin Yang*, Peng Qi*, Saizheng Zhang*, Yoshua Bengio, William W. Cohen, Ruslan Salakhutdinov, and Christopher D. Manning
    In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2018.