Organizations

@clay-lab

jopetty/README.md

Hi there 👋

I'm a PhD student in the Department of Linguistics at New York University. Some things I'm currently thinking about:

  • How do (large) language models generalize under distribution shift?
  • How is language model training similar to or different from human language acquisition?
  • How can we interpret language and language models through the lens of algebra?

Some things I've thought about recently:

  • Do transformer models need to be very deep to achieve good performance on compositional tasks?
  • Are there problems which transformers and linear state-space models cannot solve?
  • How can the Yiddish verbal prefix צע- be analyzed as an ingressive marker?
  • Can you train a language model's tokenizer at the same time as you train the language model?

You can find a list of publications on my website. Feel free to email me ([email protected]) with questions about research, or reach out on X (@jowenpetty). If you're at NYU and want to talk in person, email me at my NYU address.

Pinned

  1. ml-template (Public template)

    A template for developing a portable ML experimentation framework

    Python

  2. dotfiles (Public)

    Ruby

  3. lingtex (Public)

    A collection of LaTeX files for linguists

    TeX

  4. growing-tokens (Public)

    Train a tokenizer while training a language model

    Python

  5. coursework-latex (Public)

    A LaTeX3 meta-class for typesetting lecture notes, problem sets, and other academic documents.

    TeX 12 7

  6. CPSC-223 (Public)

    Homework assignments, lecture notes, and papers for CPSC 223b (Data Structures and Programming Techniques) at Yale University.

    C 3