My work aims to understand the inner workings of language models in order to improve their evaluation, reliability, and efficiency. I am currently a PhD student at the Institute for Machine Learning at ETH Zurich and have experience in industry (Google, Bloomberg, IBM, Microsoft Research, German Foreign Office) and research (UCL, University of Oxford, Tsinghua University, TU Berlin).
Research Fields
Machine Learning, Natural Language Processing, Computational Social Science
Ongoing Projects
Inference-time Steering
Mechanistic Interpretability of Language Models
Latent (Ordinal) Preference Ranking