About Me
I am a PhD student in Electronics and Computer Engineering at the University of Cagliari. My research focuses on adversarial machine learning and the security of large language models, with particular attention to jailbreak robustness, mechanistic interpretability, and practical attack and defense evaluation. I am interested in understanding how and why models fail, and in designing methods that improve their reliability in real-world settings.
News
ELLIS Institute Tübingen
Starting next month, I will join the ELLIS Institute Tübingen to work on AI Safety and Alignment under the supervision of Maksym Andriushchenko.
Slides available: “From Evasion to Jailbreak”
The slides are available for the tutorial “From Evasion to Jailbreak: Adversarial Machine Learning in the age of LLMs”, held with Fabio Brau at TAIC - ITASEC2026. View the talk page.
AAAI 2026 in Singapore 🇸🇬
Giorgio Piras and I are presenting “SOM Directions are Better than One: Multi-Directional Refusal Suppression in Language Models”.
January 22, 12 pm, Hall 2, Poster #62. If you are around, come by for a chat about refusal in LLMs.
NeurIPS 2025 in San Diego 🇺🇸
Fabio Brau and I are presenting “TransferBench: Benchmarking Ensemble-based Black-box Transfer Attacks” at NeurIPS 2025 and EurIPS 2025.
December 4, 2025, 11:00 AM-2:00 PM PST, Poster Stand #3914.
