I'm a PhD student in the Department of Linguistics at New York University. Some things I'm currently thinking about:
- How do (large) language models generalize under distribution shift?
- How is language model training similar to or different from human language acquisition?
- How can we interpret language and language models through the lens of algebra?
Some things I've thought about recently:
- Do transformer models need to be very deep to achieve good performance on compositional tasks?
- Are there problems which transformers and linear state-space models cannot solve?
- How can the Yiddish verbal prefix צע- be analyzed as an ingressive marker?
- Can you train a language model's tokenizer at the same time as the model itself?
You can find a list of publications on my website.
Feel free to email me ([email protected]) with questions about research, or reach out on X (@jowenpetty).
If you're at NYU and want to talk in person, email me at my NYU email.