I'm currently interested in AI safety from a variety of perspectives: evaluating language models for hidden objectives and biases; pluralistic alignment of language models; and emergent misalignment.
Here is some more information about me.
What kind of research am I interested in?: I am an NLP researcher who applies ML to sociotechnical problems using human-centered approaches. A few of my major previous projects are:
- Leveraging Pretrained Language Models for Detecting Homophobia and Transphobia in Social Media Comments - Link - ACL 2022 workshop paper
- On the Challenges of Building Hate Speech Datasets - Link - preprint
- LGBQTweet - A community-sourced dataset for detecting hate against sexual and gender identity minorities
- Studying the evolution of attitudes toward gender identity minorities over time across partisan leanings in popular political podcasts - Code
What am I currently working on?:
- Evaluating the cultural competence of LLMs - Project for LING 575 (Societal Impacts of Language Technology)
- Stanford CS 336: Language Modeling from Scratch
Regarding the ideas I am interested in exploring, I have a few broad overarching themes in mind (a non-exhaustive list):
- I am interested in assessing the harms of emergent misalignment through a human-centered lens, its effects on user-facing applications such as content moderation, and ways to personalize LMs to different social norms and values
Why am I a great hire?: I am a generalist software engineer with 4+ years of experience across American Express, PayPal, and Standard Chartered. Some notable highlights and projects:
- At Amex, I created a language translation feature (HuggingFace, Sanic, Flask) using the Marian MT framework for translating chats between English and Spanish
- At Standard Chartered, I led a team of 4 in developing an API management framework (TypeScript, React, Flask) and creating 1.3K+ configurable APIs to fetch, add, and update data in Oracle SQL databases and Dremio data lakes