Submitted by
Jian Chen
AI & ML interests
Efficient AI
Recent Activity
Papers
DFlash: Block Diffusion for Flash Speculative Decoding
ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference
Efficient AI
DFlash: Block Diffusion for Flash Speculative Decoding
ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference