William Merrill

Will is an incoming Assistant Professor at the Toyota Technological Institute at Chicago (TTIC) and currently a Young Investigator at the Allen Institute for AI. He received his PhD from New York University, working with Tal Linzen and Ashish Sabharwal, supported by an NSF Graduate Research Fellowship and a Two Sigma PhD Fellowship. A major focus of Will's research has been developing theory on the computational power and limitations of transformers, with an eye towards guiding the analysis and design of new architectures and inference methods. More generally, he is interested in theoretical computer science, computational linguistics, and the science of deep learning.
Contact: willm[æt]{nyu.edu,allenai.org,ttic.edu}
or here for anonymous feedback
Potential PhD students: I will be recruiting PhD students to start in 2026. If you would like to work with me, please apply to TTIC and mention my name in your application! See my application FAQs.
Latest posts
- Oct 1, 2025: My Dissertation is Now Online!
- Apr 15, 2022: Project: Improved Adversarial Robustness via Abstract Interpretation
- Apr 16, 2020: A Formal Hierarchy of RNN Architectures
Publications
2025
- arXiv
- Critical Batch Size Revisited: A Simple Empirical Approach to Large-Batch Language Model Training. arXiv, Jun 2025.
- arXiv
- RWKV-7 "Goose" with Expressive Dynamic State Evolution. In COLM, Oct 2025.
- COLM
- Between Circuits and Chomsky: Pre-pretraining on Formal Languages Imparts Linguistic Biases. In ACL, Jul 2025. Outstanding Paper.
2024
- EMNLP
- COLM
- ICML
- ACL
- ACL
- TACL
- ACL
- ICML
- ICLR
2023
- DLT
- A Tale of Two Circuits: Grokking as Competition of Sparse and Dense Subnetworks. In ICLR Workshop on Mathematical and Empirical Understanding of Foundation Models (ME-FoMo), May 2023.
- TACL
- NeurIPS
- TACL
2022
- CoNLL
- arXiv
- TACL
- ACL
2021
- EMNLP
- Provable Limitations of Acquiring Meaning from Ungrounded Form: What Will Future Language Models Understand? TACL, Sep 2021.
- Effects of Parameter Norm Growth During Transformer Training: Inductive Bias from Gradient Descent. In EMNLP, Nov 2021.
2020
- ACL
- COVID19
- arXiv
2019
- Sequential Neural Networks as Automata. In ACL Workshop on Deep Learning and Formal Languages (DeLeFoL), Aug 2019.
- Finding Hierarchical Structure in Neural Stacks Using Unsupervised Parsing. In ACL Workshop BlackboxNLP, Aug 2019.
- Detecting Syntactic Change Using a Neural Part-of-Speech Tagger. In ACL Workshop on Computational Approaches to Historical Language Change (LChange), Aug 2019.
2018
- BlackboxNLP
- NAACL
- A Semantics of Subordinate Clauses Using Delayed Evaluation. In Toronto Undergraduate Linguistics Conference (TULCon), Mar 2018.