William Merrill
I am a Ph.D. student at the Center for Data Science (CDS) at NYU, where I am advised by Tal Linzen. I also work closely with Ashish Sabharwal at AI2, where I was previously a predoctoral young investigator (PYI). My Ph.D. is supported by an NSF Graduate Research Fellowship, a Two Sigma Ph.D. Fellowship, and AI2.
My research uses formal methods to better understand the capabilities and limitations of language models. I've worked on characterizing the computational power of transformers for representing linguistic structure and solving reasoning problems. I've also worked on understanding the aspects of semantics that can be learned from co-occurrence patterns in a large text corpus. Overall, I am interested in building out theoretical foundations for the alchemy of large language models. Why have LMs been successful? What are their limitations? How can we more systematically build and deploy them?
Contact: willm[æt]nyu.edu, or here for anonymous feedback.
I'm on the academic job market! Feel free to reach out if you have a relevant faculty or postdoc opening.
Outside of research, I like exploring New York City by foot, train, and boat. I like cooking new things and trying hole-in-the-wall restaurants. I also play basketball, ping pong, and Age of Empires II.
Latest posts
- Apr 15, 2022 | Project: Improved Adversarial Robustness via Abstract Interpretation
- Apr 16, 2020 | A Formal Hierarchy of RNN Architectures
- Sep 6, 2019 | Theory of Saturated Neural Networks
Publications
2024
- EMNLP
- COLM
- ICLR
2023
- DLT
- ME-FoMo: A Tale of Two Circuits: Grokking as Competition of Sparse and Dense Subnetworks. In ICLR Workshop on Mathematical and Empirical Understanding of Foundation Models, May 2023.
- TACL
- NeurIPS
- TACL
2022
- CoNLL
- TACL
- ACL
2021
- EMNLP
- TACL: Provable Limitations of Acquiring Meaning from Ungrounded Form: What Will Future Language Models Understand? TACL, Sep 2021.
- EMNLP: Effects of Parameter Norm Growth During Transformer Training: Inductive Bias from Gradient Descent. In EMNLP, Nov 2021.
2020
- ACL
- COVID19
- arXiv
2019
- DeLeFoL: Sequential Neural Networks as Automata. In ACL Workshop on Deep Learning and Formal Languages, Aug 2019.
- BlackboxNLP: Finding Hierarchical Structure in Neural Stacks Using Unsupervised Parsing. In ACL Workshop BlackboxNLP, Aug 2019.
- LChange: Detecting Syntactic Change Using a Neural Part-of-Speech Tagger. In ACL Workshop on Computational Approaches to Historical Language Change, Aug 2019.
2018
- BlackboxNLP
- NAACL
- TULCon: A Semantics of Subordinate Clauses Using Delayed Evaluation. Toronto Undergraduate Linguistics Conference, Nov 2018.