Stable Audio Open 1.0 (Mæstræa Mirror)
Text-to-Audio SFX & Ambient Textures — Up to 47s Stereo @ 44.1kHz
Original Model by Stability AI · Stability AI Community License
This is an ungated mirror of the Stable Audio Open 1.0 model weights for use with Mæstræa AI Workstation. Only safetensors-format weights are included (legacy .ckpt files stripped). All credits go to the original authors.
What's in This Repo
| Path | Description | Size |
|---|---|---|
| `model.safetensors` | Main model checkpoint | ~3 GB |
| `transformer/diffusion_pytorch_model.safetensors` | DiT transformer | ~1.5 GB |
| `text_encoder/model.safetensors` | T5 text encoder | ~1.2 GB |
| `vae/diffusion_pytorch_model.safetensors` | VAE decoder | ~150 MB |
| `projection_model/diffusion_pytorch_model.safetensors` | Projection model | ~50 MB |
| `tokenizer/` | T5 tokenizer files | < 10 MB |
| `model_config.json` | Model architecture config | < 1 KB |
| `model_index.json` | Diffusers pipeline index | < 1 KB |
| `scheduler/` | Scheduler config | < 1 KB |
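If only the diffusers-layout files are needed, a download can be limited to those paths. The sketch below assumes the `huggingface_hub` package is installed and relies on its `snapshot_download` function with `allow_patterns`. The helper names `is_selected` and `download_diffusers_layout` and the pattern list are illustrative. They are derived from the table above and are not an official manifest.

```python
from fnmatch import fnmatch

# Glob patterns for the diffusers-layout files listed in the table above.
# Assumption: this list is inferred from that table, not published by the repo.
DIFFUSERS_PATTERNS = [
    "model_index.json",
    "transformer/*",
    "text_encoder/*",
    "vae/*",
    "projection_model/*",
    "tokenizer/*",
    "scheduler/*",
]


def is_selected(path: str) -> bool:
    """Return True if a repo path matches one of the patterns above."""
    return any(fnmatch(path, pattern) for pattern in DIFFUSERS_PATTERNS)


def download_diffusers_layout(
    repo_id: str = "AEmotionStudio/stable-audio-open-models",
) -> str:
    """Download only the diffusers-layout files and return the local folder."""
    # Imported lazily so the pattern helpers work without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, allow_patterns=DIFFUSERS_PATTERNS)
```

Calling `download_diffusers_layout()` would skip the standalone `model.safetensors` checkpoint, which the table lists at about 3 GB. `is_selected` can be used to preview which paths the patterns cover before downloading anything.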
What Stable Audio Open Does
Stable Audio Open generates stereo audio at 44.1kHz from text prompts. It excels at:
- Sound effects — Foley, impacts, transitions
- Ambient textures — Rain, wind, crowds, environments
- Musical textures — Pads, drones, atmospheric sounds
- Audio scenes — Complex layered soundscapes
Each generation produces up to 47 seconds of stereo audio.
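To give a sense of what the 47-second, 44.1 kHz stereo limit means in raw samples, here is a small worked calculation. The 4-bytes-per-sample figure assumes the waveform is held as 32-bit floats, which is an assumption for illustration and not something this README states.

```python
SAMPLE_RATE_HZ = 44_100   # 44.1 kHz, as stated above
CHANNELS = 2              # stereo
MAX_SECONDS = 47          # maximum duration per generation
BYTES_PER_FLOAT32 = 4     # assumption: samples held as 32-bit floats


def frames_for(seconds: float, sample_rate: int = SAMPLE_RATE_HZ) -> int:
    """Number of sample frames (one value per channel) in `seconds` of audio."""
    return int(seconds * sample_rate)


max_frames = frames_for(MAX_SECONDS)          # 47 * 44,100
max_samples = max_frames * CHANNELS           # both channels together
max_bytes = max_samples * BYTES_PER_FLOAT32   # uncompressed size in memory

print(f"{max_frames:,} frames, {max_samples:,} samples, {max_bytes / 1e6:.1f} MB")
# prints: 2,072,700 frames, 4,145,400 samples, 16.6 MB
```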
What It's NOT Good At
- Full songs with vocals
- High-fidelity musical instruments (use Foundation-1 for that)
- Speech synthesis
VRAM Requirements
- Minimum: ~4 GB (FP16)
- Recommended: ~7 GB (FP16, longer durations)
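The figures above can be turned into a quick pre-flight check. This is a minimal sketch: the function name `vram_tier` is hypothetical, the thresholds are the approximate numbers quoted above, and treating "GB" as GiB is an assumption.

```python
GIB = 1024 ** 3
MIN_VRAM_GIB = 4          # "Minimum" figure from the list above (approximate)
RECOMMENDED_VRAM_GIB = 7  # "Recommended" figure from the list above (approximate)


def vram_tier(free_bytes: int) -> str:
    """Classify free GPU memory against the figures quoted above."""
    free_gib = free_bytes / GIB
    if free_gib >= RECOMMENDED_VRAM_GIB:
        return "recommended"
    if free_gib >= MIN_VRAM_GIB:
        return "minimum"
    return "insufficient"
```

With PyTorch and a CUDA device, `torch.cuda.mem_get_info()` returns a `(free, total)` pair in bytes, so `vram_tier(torch.cuda.mem_get_info()[0])` would report the tier for the current GPU.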
Usage with Mæstræa
These models are automatically downloaded by the Mæstræa AI Workstation backend.
Direct Usage (diffusers)
import torch
from diffusers import StableAudioPipeline

# Load the mirrored weights in half precision and move the pipeline to the GPU.
pipe = StableAudioPipeline.from_pretrained(
    "AEmotionStudio/stable-audio-open-models",
    torch_dtype=torch.float16,
).to("cuda")

# Generate 10 seconds of audio; .audios[0] is the first generated waveform.
audio = pipe(
    prompt="Thunderstorm with heavy rain and distant rolling thunder",
    negative_prompt="low quality, distorted",
    audio_end_in_s=10.0,
    num_inference_steps=100,
).audios[0]
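The snippet above stops at the in-memory waveform. One way to write it to disk, using only the Python standard library, is sketched below. The helper `save_wav` is hypothetical and not part of diffusers. It expects one sequence of float samples per channel, clips them to [-1.0, 1.0], and stores them as 16-bit PCM.

```python
import struct
import wave
from typing import Sequence


def save_wav(channels: Sequence[Sequence[float]], sample_rate: int, path: str) -> int:
    """Write float samples as a 16-bit PCM WAV file.

    `channels` holds one sequence of samples per channel (two for stereo).
    Returns the number of frames written.
    """
    if not channels or len({len(ch) for ch in channels}) != 1:
        raise ValueError("expected at least one channel, all of equal length")
    n_frames = len(channels[0])
    # Interleave channels frame by frame, clipping and scaling to int16 range.
    pcm = [
        int(max(-1.0, min(1.0, ch[i])) * 32767)
        for i in range(n_frames)
        for ch in channels
    ]
    with wave.open(path, "wb") as wav:
        wav.setnchannels(len(channels))
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
    return n_frames
```

Assuming `audio` is the (channels, samples) tensor from the example above, a call might look like `save_wav(audio.float().cpu().tolist(), pipe.vae.sampling_rate, "thunderstorm.wav")`. Libraries such as `soundfile` do the same job with less code if an extra dependency is acceptable.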
Using stable-audio-tools
from stable_audio_tools import get_pretrained_model
model, model_config = get_pretrained_model("AEmotionStudio/stable-audio-open-models")
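Loading the model is only the first step. Generation with stable-audio-tools also needs a conditioning list that carries the prompt and timing. The sketch below builds one, assuming the `prompt`, `seconds_start`, and `seconds_total` keys used in the upstream Stable Audio Open example. The helper `make_conditioning` is hypothetical, and clamping to 47 seconds reflects the limit stated in this README.

```python
MAX_SECONDS = 47.0  # upper bound on duration stated in this README


def make_conditioning(
    prompt: str, seconds_total: float, seconds_start: float = 0.0
) -> list[dict]:
    """Build a one-item conditioning list with prompt and timing fields."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if seconds_total <= 0:
        raise ValueError("seconds_total must be positive")
    return [
        {
            "prompt": prompt,
            "seconds_start": seconds_start,
            # Clamp requests that exceed the model's maximum duration.
            "seconds_total": min(seconds_total, MAX_SECONDS),
        }
    ]
```

In the upstream example, a list like this is passed as the `conditioning` argument of `generate_diffusion_cond` from `stable_audio_tools.inference.generation`, together with the loaded `model`. Check the exact signature against the installed version of the library.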
License
Stability AI Community License — see LICENSE.md for full terms.
Key points:
- Free for research and non-commercial use
- Commercial use is permitted for organizations with annual revenue under $1M; above that, a separate license from Stability AI is required
- Model outputs cannot be used to train competing models
Credits
- Model: Stability AI
- Paper: Stable Audio Open
- Training Data: FreeSound + Free Music Archive (see attribution CSVs)
- Mirror by: AEmotionStudio