Forensic Linguistics
Every individual uses language in ways that are unique to them, much like fingerprints. This distinctive use of grammar and other language features is known as a person's idiolect. It can be used to identify the author of a document, which is an important aspect of forensic linguistics. In this project, we investigate authorship attribution for various kinds of documents, including highly formal technical writing (Feng, Banerjee, and Choi; 2012b) and even collaborative multi-author documents (Zuo, Zhao, and Banerjee; 2019).
This is one of Banerjee's areas of interest within NLP. The project, however, is not his current focus. It sees sporadic progress when there are students interested in pursuing this topic.
Research Group
Ritwik Banerjee, Research Assistant Professor of Computer Science, Stony Brook University
Chaoyuan Zuo, Ph.D. ↦ Faculty at the School of Journalism & Communication, Nankai University (China)
Yu Zhao, M.S.
Song Feng, Sr. Applied Scientist, Amazon Web Services
Yejin Choi, Professor of Computer Science, University of Washington
Publications
- [Feng, Banerjee, and Choi; 2012b] Song Feng, Ritwik Banerjee, and Yejin Choi. Characterizing Stylistic Elements in Syntactic Structure. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pp. 1522–1533. Association for Computational Linguistics, 2012. [ PDF ]
- [Zuo, Zhao, and Banerjee; 2019] Chaoyuan Zuo, Yu Zhao, and Ritwik Banerjee. Style Change Detection with Feed-forward Neural Networks. In Working Notes of CLEF 2019 – Conference and Labs of the Evaluation Forum, Vol. 2380. CEUR Workshop Proceedings (CEUR-WS.org), 2019. [ PDF ]