	---
	base_model:
	- apple/DiffuCoder-7B-Base
	tags:
	- code
	- text-diffusion-model
	- diffusion large language model
	license: unknown
	---
	### DiffuCoder-7B-Instruct

	The DiffuCoder-7B-Instruct model builds on the DiffuCoder-7B-Base checkpoint with instruction-tuning to better follow code-related prompts.

	- Training recipe: we introduce a new pad token and train the model conditionally, at a fixed sequence length, on [OpenCoder-SFT](https://huggingface.co/datasets/OpenCoder-LLM/opc-sft-stage2) data for 5 epochs.

	- Benchmarks: Demonstrates stronger instruction-following capabilities than the Base model.
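
As a rough illustration of the training recipe above, the sketch below pads a prompt/response pair to a fixed length and masks only response-side positions, so the prompt stays visible as the condition. It is not the released training code: the token ids, the `-100` ignore label, the uniform masking ratio, and the choice to treat pad positions as prediction targets are all assumptions made for this example, and `build_example` is a hypothetical helper.

```python
import random

# All three constants are hypothetical placeholders, not the model's real ids.
PAD_ID = 0      # stands in for the newly introduced pad token
MASK_ID = 1     # stands in for the diffusion mask token
IGNORE = -100   # label value skipped by the loss (a common convention)


def build_example(prompt_ids, response_ids, max_len, mask_ratio, rng):
    """Pad a prompt/response pair to max_len and mask response-side positions only."""
    tokens = (prompt_ids + response_ids)[:max_len]
    tokens = tokens + [PAD_ID] * (max_len - len(tokens))  # fixed-length sequence
    n_prompt = min(len(prompt_ids), max_len)

    inputs, labels = [], []
    for pos, tok in enumerate(tokens):
        if pos >= n_prompt and rng.random() < mask_ratio:
            inputs.append(MASK_ID)   # hidden position: the model must recover tok
            labels.append(tok)       # assumption: pad positions are targets too
        else:
            inputs.append(tok)       # prompt and unmasked tokens stay visible
            labels.append(IGNORE)
    return inputs, labels


if __name__ == "__main__":
    rng = random.Random(0)
    inputs, labels = build_example([5, 6, 7], [8, 9], max_len=8, mask_ratio=0.5, rng=rng)
    print(inputs)
    print(labels)
```

Because every example has the same length, the pad token is what lets the model signal where a short response ends; the prompt positions never receive a loss, which is what makes the training conditional in this sketch.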


	#### More details and usage examples:

	- Paper: [DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation](https://arxiv.org/abs/2506.20639)

	- GitHub: https://github.com/apple/ml-diffucoder

	#### Acknowledgement
	To support this Hugging Face model release, we reuse [Dream](https://huggingface.co/Dream-org/Dream-v0-Base-7B)'s modeling architecture and generation utilities.