David Wan

I’m a second-year PhD student at the University of North Carolina at Chapel Hill, where I am advised by Mohit Bansal. Before this, I graduated with bachelor’s and master’s degrees from Columbia University, where I was advised by Kathleen McKeown. My research interest is in natural language processing.

Publications

  • Evaluating and Improving Factuality in Multimodal Abstractive Summarization
    David Wan and Mohit Bansal
    Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022)
    [paper] [code]

  • Extractive is not Faithful: An Investigation of Broad Unfaithfulness Problems in Extractive Summarization
    Shiyue Zhang*, David Wan* and Mohit Bansal
    Preprint, 2022
    [paper] [code]

  • Constrained Regeneration for Cross-Lingual Query-Focused Extractive Summarization
    Elsbeth Turcan, David Wan, Faisal Ladhak, Petra Galuscakova, Sukanta Sen, Svetlana Tchistiakova, Weijia Xu, Marine Carpuat, Kenneth Heafield, Douglas Oard and Kathleen McKeown
    Proceedings of the 29th International Conference on Computational Linguistics (COLING 2022)
    [paper] [bib]

  • FactPEGASUS: Factuality-Aware Pre-training and Fine-tuning for Abstractive Summarization
    David Wan and Mohit Bansal
    Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2022)
    [paper] [bib] [code]

  • Segmenting Subtitles for Correcting ASR Segmentation Errors
    David Wan, Chris Kedzie, Faisal Ladhak, Elsbeth Turcan, Petra Galuscakova, Elena Zotkina, Zhengping Jiang, Peter Bell and Kathleen McKeown
    Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume (EACL 2021)
    [paper] [bib]

  • Incorporating Terminology Constraints in Automatic Post-Editing
    David Wan, Chris Kedzie, Faisal Ladhak, Marine Carpuat and Kathleen McKeown
    Proceedings of the Fifth Conference on Machine Translation (WMT 2020)
    [paper] [bib] [code]

  • Subtitles to Segmentation: Improving Low-Resource Speech-to-Text Translation Pipelines
    David Wan, Zhengping Jiang, Chris Kedzie, Elsbeth Turcan, Peter Bell and Kathy McKeown
    Proceedings of the workshop on Cross-Language Search and Summarization of Text and Speech (CLSSTS 2020)
    [paper] [bib]