Alexey Dontsov

therem

·

somvy

AI & ML interests

None yet

Recent Activity

liked a model 14 days ago

NousResearch/Meta-Llama-3.1-8B-Instruct

liked a model 14 days ago

unsloth/Meta-Llama-3.1-8B-Instruct

upvoted a paper about 1 month ago

Calibri: Enhancing Diffusion Transformers via Parameter-Efficient Calibration

Organizations

authored a paper 2 months ago

Sanity Checks for Sparse Autoencoders: Do SAEs Beat Random Baselines?

Paper • 2602.14111 • Published Feb 15 • 56

submitted a paper to Daily Papers 2 months ago

Sanity Checks for Sparse Autoencoders: Do SAEs Beat Random Baselines?

Paper • 2602.14111 • Published Feb 15 • 56

authored 2 papers 7 months ago

OrtSAE: Orthogonal Sparse Autoencoders Uncover Atomic Features

Paper • 2509.22033 • Published Sep 26, 2025 • 19

The Rogue Scalpel: Activation Steering Compromises LLM Safety

Paper • 2509.22067 • Published Sep 26, 2025 • 28

authored a paper about 1 year ago

I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders

Paper • 2503.18878 • Published Mar 24, 2025 • 121