Alessandro Potenza
Computer Engineering student at Politecnico di Milano, building high-performance AI systems, GPU kernels, and open-source tools.
Featured Projects
agentrial
The pytest for AI agents. Run your agent 100 times, get confidence intervals instead of anecdotes. Published open-source framework with Wilson confidence intervals, step-level failure attribution via Fisher exact test, and real cost tracking across 45+ models.
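As a rough illustration of the "confidence intervals instead of anecdotes" idea, here is a minimal sketch of a Wilson score interval for an agent's pass rate. The function name `wilson_interval` and its signature are hypothetical, chosen for this example; they are not taken from agentrial's API.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    p_hat = successes / trials
    z2 = z * z
    denom = 1 + z2 / trials
    center = (p_hat + z2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / trials + z2 / (4 * trials * trials)) / denom
    # Clamp to [0, 1] to absorb floating-point noise at the extremes.
    return max(0.0, center - half_width), min(1.0, center + half_width)


if __name__ == "__main__":
    # Hypothetical run: the agent passed 85 of 100 trials.
    low, high = wilson_interval(85, 100)
    print(f"pass rate 0.85, 95% interval about [{low:.2f}, {high:.2f}]")
```

For 85 passes in 100 runs the interval is roughly 0.77 to 0.91, which says more about reliability than a single successful demo does.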
Flash-Reasoning
Tree-Aware KV-Cache Attention for Reasoning LLMs. Custom fused GQA Triton kernels exploit physical prefix sharing to cut redundant HBM reads, pushing effective attention throughput past the nominal HBM bandwidth limit and enabling efficient inference for System 2 reasoning models like DeepSeek-R1.
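To make the prefix-sharing idea concrete, the sketch below counts how many KV-cache entries a set of reasoning branches needs when each branch stores its own copy versus when common prefixes are stored once in a tree. It is a toy model in plain Python, not the project's Triton kernels; `count_kv_entries` and the token lists are hypothetical.

```python
def count_kv_entries(branches: list[list[int]]) -> tuple[int, int]:
    """Return (entries without sharing, entries with prefix sharing)."""
    # Without sharing, every branch keeps a private copy of every token.
    unshared = sum(len(branch) for branch in branches)

    # With sharing, a token is stored once per distinct prefix (a trie node).
    trie: dict = {}
    shared = 0
    for branch in branches:
        node = trie
        for token in branch:
            if token not in node:
                node[token] = {}
                shared += 1
            node = node[token]
    return unshared, shared


if __name__ == "__main__":
    # Three hypothetical reasoning branches that diverge after a common prefix.
    branches = [[1, 2, 3, 4], [1, 2, 3, 5], [1, 2, 6]]
    unshared, shared = count_kv_entries(branches)
    print(f"unshared={unshared}, shared={shared}")
```

The more branches share a long prefix, the larger the gap between the two counts, which is the redundancy a tree-aware attention kernel can avoid reading again.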
Flash-SAE
High-Performance Triton Kernels for Sparse Autoencoders. 13.6x speedup and 97% memory reduction via sparse kernel fusion. Drop-in PyTorch replacement with full autograd support for Mechanistic Interpretability research.
Verify-CBL
Neuro-Symbolic Formal Verification Engine combining Z3 SMT solver with LLM-powered code translation. Mathematically proves behavioral equivalence between legacy and modernized code, detecting 'penny drift' that testing misses.
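A small example of what "penny drift" can look like, assuming the term refers to cent-level differences between fixed-point decimal arithmetic in legacy code and binary floating point in a rewrite. This sketch uses only Python's standard library and a single concrete input; it does not use Z3 and is not the project's verification method. Both function names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def legacy_round_cents(amount: str) -> Decimal:
    """Fixed-point decimal rounding, half up, as a legacy system might do."""
    return Decimal(amount).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def modern_round_cents(amount: str) -> Decimal:
    """Naive rewrite that goes through a binary float before rounding."""
    return Decimal(str(round(float(amount), 2)))


if __name__ == "__main__":
    for amount in ("2.50", "2.675"):
        legacy = legacy_round_cents(amount)
        modern = modern_round_cents(amount)
        print(amount, legacy, modern, "drift" if legacy != modern else "ok")
```

The input 2.675 has no exact binary representation, so the float path rounds down to 2.67 while the decimal path rounds up to 2.68. A test suite that never hits such an input would pass, which is the gap an equivalence proof is meant to close.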
All Projects
GPU Performance Analysis: Triton vs. CUDA
Research paper quantifying the performance gap between high-level GPU programming models (Triton) and hand-optimized CUDA kernels, with a focus on irregular workloads like finite state machines.
SplatSLAM
Real-time 3D mapping and SLAM from monocular RGB video using 3D Gaussian Splatting. Photo-realistic dense reconstruction without depth sensors, built as a Nerfstudio extension.
ConceptHub
AI-powered full-stack learning platform integrating Google's Gemini API. Automates the generation of book summaries and conceptual mind maps from text, with user authentication and persistent storage.
Music Genre Classification
End-to-end, reproducible pipeline achieving state-of-the-art 83.5% accuracy on GTZAN with a U-Net-inspired model. Leak-free methodology with track-level splits, cross-validation, and transfer learning.
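A minimal sketch of a track-level split, assuming each training example is a segment cut from a longer track. The function `track_level_split` and the `(track_id, segment_index)` layout are hypothetical and only illustrate the leak-free idea; they are not the project's pipeline.

```python
import random


def track_level_split(segments, test_fraction=0.2, seed=0):
    """Split (track_id, segment_index) pairs so that no track spans both sets."""
    tracks = sorted({track_id for track_id, _ in segments})
    rng = random.Random(seed)
    rng.shuffle(tracks)
    n_test = max(1, round(len(tracks) * test_fraction))
    test_tracks = set(tracks[:n_test])
    train = [s for s in segments if s[0] not in test_tracks]
    test = [s for s in segments if s[0] in test_tracks]
    return train, test


if __name__ == "__main__":
    # Hypothetical data: 10 tracks, 3 segments each.
    segments = [(f"track{t}", i) for t in range(10) for i in range(3)]
    train, test = track_level_split(segments)
    overlap = {t for t, _ in train} & {t for t, _ in test}
    print(len(train), len(test), "overlap:", overlap)
```

Splitting by segment instead would let pieces of the same recording land in both sets, which tends to inflate reported accuracy.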
Chessboard.js
Modern, dependency-free JavaScript library and NPM package for building interactive chess experiences. Rich API for programmatic control, drag-and-drop, animations, and legal move enforcement.
Achievements
Merit-Based Scholarship
Dec 2025 · Politecnico di Milano
Awarded for outstanding academic performance (GPA and credits) during the M.Sc. in Computer Science and Engineering.
Global Finalist - Tech4Good
2024 · Huawei Technologies
Selected for the elite 'Seeds For The Future' program. Led a team to design an AI computer vision prototype, winning the National Competition and advancing to the Global Finals in China.
3rd Place - AI Challenge
2025 · Polimi (AIRLab)
Ranked 3rd out of 193 teams (top 1.6%) in the Artificial Neural Networks Challenge. Engineered a custom Vision Transformer ensemble for complex medical image classification.