Compact Proofs of Model Performance via Mechanistic Interpretability

Louis Jaburi · Independent researcher

December 2024

Proposes constructing rigorous, compact proofs about neural-network behavior using mechanistic interpretability. Discusses challenges and scaling directions for formal verification.