Zheng Han (traphix)
AI & ML interests: None yet
Recent Activity
- New activity 3 days ago on nm-testing/MiniMax-M2.5-W4A16: "oneshot vs model_free_ptq? which one has better recovery?"
- New activity 9 days ago on RedHatAI/Qwen3.5-397B-A17B-FP8-dynamic: "W4A16 quant"
- New activity 10 days ago on apolo13x/Qwen3.5-35B-A3B-quantized.w4a16: "Any creation details?"
Organizations: None yet
oneshot vs model_free_ptq? which one has better recovery?
1 comment · #1 opened 3 days ago by traphix
W4A16 quant
👍 2 · 5 comments · #1 opened about 2 months ago by timroethig
Any creation details?
#2 opened 10 days ago by traphix
Creation details?
#8 opened 14 days ago by traphix
Creation details?
#2 opened 15 days ago by traphix
Which framework was used for FP8 quantization? LLM-compressor?
2 comments · #1 opened 15 days ago by traphix
GPTQ quantization
2 comments · #2 opened about 2 months ago by ArtemSultanov
Question about weight_observer?
2 comments · #1 opened 26 days ago by traphix
INT4 w4a16 quantization?
➕ 1 · #1 opened about 1 month ago by traphix
Quantization code for int4 (w4a16)?
#6 opened about 1 month ago by traphix
Tokenizer you are loading with an incorrect regex pattern
1 comment · #2 opened 4 months ago by traphix
Failed to find a kernel that can implement the WNA16 linear layer
#1 opened 4 months ago by traphix
vllm error: Extra inputs are not permitted
#1 opened 4 months ago by traphix
Can A100 run Qwen3-235B-A22B-Instruct-2507-NVFP4?
#1 opened 4 months ago by traphix
Error on 4 x L40s
➕ 2 · 1 comment · #4 opened 7 months ago by traphix
I got ValueError
👀 2 · 10 comments · #3 opened 7 months ago by spow12
How to run this model via vllm?
11 comments · #2 opened 7 months ago by traphix
FP8 please
👀 ➕ 16 · 8 comments · #18 opened 7 months ago by aliquis-pe