- SageAttention2 Technical Report: Accurate 4 Bit Attention for Plug-and-play Inference Acceleration
  Paper • 2411.10958 • Published • 58
- SpargeAttn: Accurate Sparse Attention Accelerating Any Model Inference
  Paper • 2502.18137 • Published • 60
- SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-Bit Training
  Paper • 2505.11594 • Published • 77
- SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration
  Paper • 2410.02367 • Published • 50
Jintao Zhang
jt-zhang
AI & ML interests
Efficient ML
Recent Activity
updated a model 1 day ago
TurboDiffusion/TurboWan2.1-T2V-14B-480P
updated a model 1 day ago
TurboDiffusion/TurboWan2.1-T2V-14B-720P
updated a model 1 day ago
TurboDiffusion/TurboWan2.1-T2V-1.3B-480P