Publications

For an up-to-date list of my publications, please visit my Google Scholar page.


Refereed Publications

GenerationPrograms: Fine-grained Attribution with Executable Programs
David Wan, Eran Hirsch, Elias Stengel-Eskin, Ido Dagan, and Mohit Bansal
Second Conference on Language Modeling (COLM 2025)
[Paper] [Code]

CLaMR: Multimodal Late-Interaction Retrieval
David Wan, Han Wang, Elias Stengel-Eskin, Jaemin Cho, and Mohit Bansal
arXiv Preprint
[Paper] [Code]

MAMM-Refine: A Recipe for Improving Faithfulness in Generation with Multi-Agent Collaboration
David Wan, Justin Chih-Yao Chen, Elias Stengel-Eskin, and Mohit Bansal
Proceedings of the 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL 2025)
[Paper] [Code]

QAPyramid: Fine-grained Evaluation of Content Selection for Text Summarization
Shiyue Zhang*, David Wan*, Arie Cattan, Ayal Klein, Ido Dagan, and Mohit Bansal
Second Conference on Language Modeling (COLM 2025)
[Paper] [Code]

On Positional Bias of Faithfulness for Long-form Summarization
David Wan, Jesse Vig, Mohit Bansal, and Shafiq Joty
Proceedings of the 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL 2025)
[Paper] [Code]

Localizing Factual Inconsistencies in Attributable Text Generation
Arie Cattan, Paul Roit, Shiyue Zhang, David Wan, Roee Aharoni, Idan Szpektor, Mohit Bansal, and Ido Dagan
arXiv Preprint
[Paper] [Code]

ACUEval: Fine-grained Hallucination Evaluation and Correction for Abstractive Summarization
David Wan, Koustuv Sinha, Srinivasan Iyer, Asli Celikyilmaz, Mohit Bansal, and Ramakanth Pasunuru
Findings of the Association for Computational Linguistics: ACL 2024 (ACL 2024 Findings)
[Paper] [Code]

Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training
David Wan, Jaemin Cho, Elias Stengel-Eskin, and Mohit Bansal
Proceedings of the European Conference on Computer Vision (ECCV 2024)
[Paper] [Code] [Website]

HistAlign: Improving Context Dependency in Language Generation by Aligning with History
David Wan, Shiyue Zhang, and Mohit Bansal
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023)
[Paper] [Code]

Extractive is not Faithful: An Investigation of Broad Unfaithfulness Problems in Extractive Summarization
Shiyue Zhang*, David Wan*, and Mohit Bansal
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023)
[Paper] [Code]

Faithfulness-Aware Decoding Strategies for Abstractive Summarization
David Wan, Mengwen Liu, Kathleen McKeown, Markus Dreyer, and Mohit Bansal
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023)
[Paper] [Code]

Evaluating and Improving Factuality in Multimodal Abstractive Summarization
David Wan and Mohit Bansal
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022)
[Paper] [Code]

Constrained Regeneration for Cross-Lingual Query-Focused Extractive Summarization
Elsbeth Turcan, David Wan, Faisal Ladhak, Petra Galuscakova, Sukanta Sen, Svetlana Tchistiakova, Weijia Xu, Marine Carpuat, Kenneth Heafield, Douglas Oard, and Kathleen McKeown
Proceedings of the 29th International Conference on Computational Linguistics (COLING 2022)
[Paper]

FactPEGASUS: Factuality-Aware Pre-training and Fine-tuning for Abstractive Summarization
David Wan and Mohit Bansal
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2022)
[Paper] [Code]

Segmenting Subtitles for Correcting ASR Segmentation Errors
David Wan, Chris Kedzie, Faisal Ladhak, Elsbeth Turcan, Petra Galuscakova, Elena Zotkina, Zhengping Jiang, Peter Bell, and Kathleen McKeown
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume (EACL 2021)
[Paper]

Incorporating Terminology Constraints in Automatic Post-Editing
David Wan, Chris Kedzie, Faisal Ladhak, Marine Carpuat, and Kathleen McKeown
Proceedings of the Fifth Conference on Machine Translation (WMT 2020)
[Paper] [Code]

Subtitles to Segmentation: Improving Low-Resource Speech-to-Text Translation Pipelines
David Wan, Zhengping Jiang, Chris Kedzie, Elsbeth Turcan, Peter Bell, and Kathleen McKeown
Proceedings of the Workshop on Cross-Language Search and Summarization of Text and Speech (CLSSTS 2020)
[Paper]