Agam Goyal
agamg2 [at] illinois [dot] edu

I am a first-year Computer Science Ph.D. student at the University of Illinois Urbana-Champaign, co-advised by Prof. Hari Sundaram and Prof. Eshwar Chandrasekharan. I also collaborate closely with Prof. Koustuv Saha. Previously, I was an undergraduate student at the University of Wisconsin-Madison, where I majored in Computer Science, Mathematics, and Data Science and was advised by Prof. Hanbaek Lyu and Prof. Junjie Hu.
My research focuses on aligning GenAI technologies with diverse human values and exploring questions around their safe integration in social contexts. More broadly, I also study the application of NLP techniques to the computational understanding, modeling, and governance of online communities. For more details on my research interests, see this page. My work is graciously supported in part by the Cohere For AI Research Grant Program and the OpenAI Researcher Access Program.
Currently, I am a Ph.D. Research Intern at Adobe Research, mentored by Dr. Apoorv Saxena and Dr. Koyel Mukherjee, where I am developing continual learning techniques for LLMs with applications to agentic RAG settings.
If you are an undergraduate student interested in gaining research experience, please reach out to me at agamg2@illinois.edu. A strong background in ML/NLP and experience with PyTorch are highly recommended.
News
Date | Update |
---|---|
Jun 20, 2025 | Our work Breaking Bad Tokens: Detoxification of LLMs Using Sparse Autoencoders has been accepted to the Actionable Interpretability Workshop @ ICML! |
Apr 25, 2025 | Gave a talk on Detoxification of LLMs using SAEs at the AImpact Center @ UIUC. [Slides] |
Jan 22, 2025 | Our work on Small Language Models for Content Moderation has been accepted to NAACL 2025 (Main) as an Oral talk! |
Oct 17, 2024 | Two new preprints on Uncovering the Internet’s Hidden Values and Small Language Models for Content Moderation are now on arXiv! |
Jun 20, 2024 | Check out our new preprint on aligning LLM agents using belief networks, Beyond Demographics: Aligning Role-playing LLM-based Agents Using Human Belief Networks, on arXiv. |