Performance Dashboard
Overview
To make the performance of the SpecBundle draft models easier to explore, we built an interactive dashboard for browsing the evaluation results. We evaluate the draft models under different speculative decoding configurations (i.e., steps, topk, num_draft_tokens) on the following benchmarks:
- Conversation
  - MTBench
- General Knowledge
  - GPQA
  - FinanceQA
- Math
  - GSM8K
  - Math500
- Coding
  - HumanEval
  - LiveCodeBench
Check out the Performance Dashboard for more details.
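As a rough illustration of the evaluation matrix described above, the sketch below enumerates every benchmark and configuration pair. The benchmark names come from the list above; the specific values for steps, topk, and num_draft_tokens are hypothetical placeholders chosen for the example, not the settings used to produce the dashboard, and `evaluation_grid` is an illustrative helper rather than part of any SpecBundle tooling.

```python
from itertools import product

# Benchmarks grouped by category, as listed in this overview.
BENCHMARKS = {
    "Conversation": ["MTBench"],
    "General Knowledge": ["GPQA", "FinanceQA"],
    "Math": ["GSM8K", "Math500"],
    "Coding": ["HumanEval", "LiveCodeBench"],
}

# Hypothetical values for illustration only; the dashboard may use others.
STEPS = [3, 5]
TOPK = [1, 4]
NUM_DRAFT_TOKENS = [4, 8]


def evaluation_grid():
    """Yield one dict per (benchmark, speculative decoding configuration) pair."""
    for category, names in BENCHMARKS.items():
        for name in names:
            for steps, topk, num_draft_tokens in product(
                STEPS, TOPK, NUM_DRAFT_TOKENS
            ):
                yield {
                    "category": category,
                    "benchmark": name,
                    "steps": steps,
                    "topk": topk,
                    "num_draft_tokens": num_draft_tokens,
                }


grid = list(evaluation_grid())
print(len(grid))  # 7 benchmarks x 8 configurations = 56
```

Each entry in `grid` corresponds to one run whose results could be shown as a single point in the dashboard.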