Vitthal Bhandari

I am a graduate student in Computational Linguistics at the University of Washington (UW). Before coming to Seattle, I spent more than 4 years in the banking industry as a generalist software engineer, working at American Express, Standard Chartered Bank, and PayPal. I completed my Bachelor's in Computer Science and Engineering at BITS Pilani, where I also did a minor in Data Science and worked with Prof. Poonam Goyal and Prof. Sundaresan Raman.

Email  /  CV  /  Scholar  /  Twitter  /  Github  /  LinkedIn  /  Blog


Research

My interests lie in efficient language modeling, adaptive agentic memory, and software engineering.

  1. Currently, I am working on efficient speech modeling for 21 extremely low-resource endangered languages - see Code
  2. I implemented Language Modeling from Scratch - see Blog & Code on tokenization
  3. I am an SWE with over four years of experience in Python and TypeScript, gained at American Express, Standard Chartered, and PayPal
  4. At Amex, I created a language translation feature (HuggingFace, Sanic) using Marian MT for translating chats between English and Spanish
  5. At Standard Chartered, I led a team of 4 in developing an API management framework (TypeScript, React, Flask) and creating configurable APIs (1.3K+ APIs)
  6. Previous research experience includes leveraging pretrained models for detecting homophobia and transphobia in YouTube comments - ACL workshop paper
  7. A qualitative study on the challenges of building hate speech datasets - Preprint
  8. A project studying how attitudes toward Trans people have evolved over time across partisan leanings in popular political podcasts - Code
On the Challenges of Building Datasets for Hate Speech Detection
Vitthal Bhandari
Preprint

This paper identifies systemic challenges in hate speech dataset creation and presents a framework that standardizes the dataset creation pipeline across seven critical checkpoints.

arXiv
Leveraging Pretrained Language Models for Detecting Homophobia and Transphobia in Social Media Comments
Vitthal Bhandari and Poonam Goyal
ACL 2022 Workshop on Language Technology for Equality, Diversity and Inclusion

I contributed to a shared task focused on identifying homophobic and transphobic content in YouTube comments by implementing basic classifiers using multilingual pre-trained language models to analyze English, Tamil, and code-mixed datasets.

Paper | Code
Reviewing the collaborative role of Image processing in retinal imaging
Rehana Khan, Vitthal Bhandari, Sundaresan Raman, Abhishek Vyas, Akshay Raman, Maitreyee Roy and Rajiv Raman
Teleophthalmology and Digital Health: A Practical Guide to Applications, Springer Nature

Paper

Coursework

LING 575: Speech Technology for Endangered Languages
LING 572: Advanced Statistical Methods for Natural Language Processing
LING 571: Deep Processing Techniques for Natural Language Processing
LING 570: Shallow Processing Techniques for Natural Language Processing
LING 575: Societal Impacts of Language Technology
Stanford CS 336: Language Modeling from Scratch
Harvard CS 2881: AI Safety
Stanford CS 234: Reinforcement Learning

Credit for this template goes to the source code.