I am a Ph.D. candidate advised by Benjamin Van Durme. I research how the understanding of gender in natural language processing (NLP) impacts society. I also work on human annotation of natural language text and help out with NLP software infrastructure. I read in NLP ethics, gender politics, and the construction of identity in language and social media, and more generally I’m interested in human-computer interaction and humanistic studies of technology. Previously, my graduate research focused on topic modeling. I have a master’s degree in computer science from Johns Hopkins University and a bachelor’s in mathematics from Harvey Mudd College. Between college and grad school, I did research and software engineering at Pacific Northwest National Laboratory. My pronouns are she/her (or they/them).
Selected publications I’ve contributed to:
On Measuring Social Biases in Sentence Encoders
Chandler May, Alex Wang, Shikha Bordia, Samuel R. Bowman, and Rachel Rudinger
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics, 2019
Streaming Word Embeddings with the Space-Saving Algorithm
Chandler May, Kevin Duh, Benjamin Van Durme, and Ashwin Lall
Social Bias in Elicited Natural Language Inferences
Rachel Rudinger, Chandler May, and Benjamin Van Durme
Proceedings of the First ACL Workshop on Ethics in Natural Language Processing, 2017
An Analysis of Lemmatization on Topic Models of Morphologically Rich Language
Chandler May, Ryan Cotterell, and Benjamin Van Durme
Topic Identification and Discovery on Text and Speech
Chandler May, Francis Ferraro, Alan McCree, Jonathan Wintrode, Daniel Garcia-Romero, and Benjamin Van Durme
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015
Particle Filter Rejuvenation and Latent Dirichlet Allocation
Chandler May, Alex Clemmer, and Benjamin Van Durme
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014
Selected software I’ve contributed to (that isn’t directly associated with a research project):
The word2vec C code for learning word representations, with added comments.
A Python interface to the Concrete communication protocol for annotated text.
A protocol for cross-language, cross-platform services. Used by Concrete.
A Python (Django) web interface for crowd work. Largely compatible with Mechanical Turk formats.