William Merrill
I am a Ph.D. student at the Center for Data Science (CDS) at NYU, where I am advised by Tal Linzen and supported by an NSF Graduate Research Fellowship and by AI2.
My research develops theory to better understand what language models can do, as well as what they can't. I've worked on characterizing the computational power of transformers for representing linguistic structure and solving reasoning problems. I've also analyzed the aspects of semantics that can be learned from co-occurrence patterns as a way to understand the potential of self-supervised learning.
Contact: willm[æt]nyu.edu, or send anonymous feedback here.
Outside of research, I like exploring New York City by foot, train, and boat. I like cooking new things and trying hole-in-the-wall restaurants. I also play basketball, ping pong, and Age of Empires II.
Latest posts
- Project: Improved Adversarial Robustness via Abstract Interpretation (Apr 15, 2022)
- A Formal Hierarchy of RNN Architectures (Apr 16, 2020)
- Theory of Saturated Neural Networks (Sep 6, 2019)
Publications
2023
- Formal Languages and the NLP Black Box. In Developments in Language Theory (DLT), Jun 2023.
- A Tale of Two Circuits: Grokking as Competition of Sparse and Dense Subnetworks. In ICLR Workshop on Mathematical and Empirical Understanding of Foundation Models (ME-FoMo), Jun 2023.
- Transparency Helps Reveal When Language Models Learn Meaning. TACL, Jun 2023.
- How Language Model Hallucinations Can Snowball. Jun 2023.
- A Logic for Expressing Log-Precision Transformers. In NeurIPS, Dec 2023.
- The Parallelism Tradeoff: Limitations of Log-Precision Transformers. TACL, Jun 2023.
2022
- Entailment Semantics Can Be Extracted from an Ideal Language Model. In CoNLL, Dec 2022.
- Extracting Finite Automata from RNNs Using State Merging. Jan 2022.
- Saturated Transformers are Constant-Depth Threshold Circuits. TACL, Aug 2022.
- ReCLIP: A Strong Zero-Shot Baseline for Referring Expression Comprehension. In ACL, May 2022.
2021
- Competency Problems: On Finding and Removing Artifacts in Language Data. In EMNLP, Nov 2021.
- Provable Limitations of Acquiring Meaning from Ungrounded Form: What Will Future Language Models Understand? TACL, Sep 2021.
- Effects of Parameter Norm Growth During Transformer Training: Inductive Bias from Gradient Descent. In EMNLP, Nov 2021.
2020
- A Formal Hierarchy of RNN Architectures. In ACL, Jul 2020.
- CORD-19: The COVID-19 Open Research Dataset. In ACL Workshop on NLP for COVID-19, Jul 2020.
- On the Linguistic Capacity of Real-Time Counter Automata. Sep 2020.
2019
- Sequential Neural Networks as Automata. In ACL Workshop on Deep Learning and Formal Languages (DeLeFoL), Aug 2019.
- Finding Hierarchical Structure in Neural Stacks Using Unsupervised Parsing. In ACL Workshop BlackboxNLP, Aug 2019.
- Detecting Syntactic Change Using a Neural Part-of-Speech Tagger. In ACL Workshop on Computational Approaches to Historical Language Change (LChange), Aug 2019.
2018
- Context-Free Transductions with Neural Stacks. In EMNLP Workshop BlackboxNLP, Nov 2018.
- End-to-End Graph-Based TAG Parsing with Neural Networks. In NAACL, Nov 2018.
- A Semantics of Subordinate Clauses Using Delayed Evaluation. Toronto Undergraduate Linguistics Conference (TULCon), Nov 2018.