Stable Audio Open 1.0 (Mæstræa Mirror)

Text-to-Audio SFX & Ambient Textures — Up to 47s Stereo @ 44.1kHz

Original Model by Stability AI · Stability AI Community License

This is an ungated mirror of the Stable Audio Open 1.0 model weights for use with Mæstræa AI Workstation. Only safetensors-format weights are included (legacy .ckpt files stripped). All credits go to the original authors.

What's in This Repo

| Path | Description | Size |
|---|---|---|
| `model.safetensors` | Main model checkpoint | ~3 GB |
| `transformer/diffusion_pytorch_model.safetensors` | DiT transformer | ~1.5 GB |
| `text_encoder/model.safetensors` | T5 text encoder | ~1.2 GB |
| `vae/diffusion_pytorch_model.safetensors` | VAE decoder | ~150 MB |
| `projection_model/diffusion_pytorch_model.safetensors` | Projection model | ~50 MB |
| `tokenizer/` | T5 tokenizer files | < 10 MB |
| `model_config.json` | Model architecture config | < 1 KB |
| `model_index.json` | Diffusers pipeline index | < 1 KB |
| `scheduler/` | Scheduler config | < 1 KB |
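
The files listed above appear to cover two layouts side by side: a single-file checkpoint (`model.safetensors` plus `model_config.json`) and the per-component diffusers folders. That split is an inference from the listing, not something the mirror states. Under that assumption, a download can be limited to one layout with `huggingface_hub.snapshot_download` and its `allow_patterns` argument. The helper names `select_files` and `download_layout` are hypothetical.

```python
from fnmatch import fnmatch

REPO_ID = "AEmotionStudio/stable-audio-open-models"

# Assumed grouping of the files listed in this README (not confirmed by the mirror).
LAYOUTS = {
    "stable-audio-tools": ["model.safetensors", "model_config.json"],
    "diffusers": [
        "model_index.json",
        "transformer/*",
        "text_encoder/*",
        "vae/*",
        "projection_model/*",
        "tokenizer/*",
        "scheduler/*",
    ],
}


def select_files(paths, layout):
    """Return the paths that the given layout's patterns would keep."""
    patterns = LAYOUTS[layout]
    return [p for p in paths if any(fnmatch(p, pat) for pat in patterns)]


def download_layout(layout, repo_id=REPO_ID):
    """Download only one layout and return the local snapshot directory."""
    # Imported lazily so select_files works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, allow_patterns=LAYOUTS[layout])
```

Calling `download_layout("diffusers")` would skip the ~3 GB single-file checkpoint, while `select_files` lets you preview which paths a layout keeps without touching the network.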

What Stable Audio Open Does

Stable Audio Open generates stereo audio at 44.1kHz from text prompts. It excels at:

Sound effects — Foley, impacts, transitions
Ambient textures — Rain, wind, crowds, environments
Musical textures — Pads, drones, atmospheric sounds
Audio scenes — Complex layered soundscapes

Up to 47 seconds of stereo audio per generation.
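
As a quick check on what those figures imply, the sketch below works out the frame count and uncompressed size of a clip from the 44.1 kHz stereo format and the 47-second limit given above. The helper name `clip_stats` and the 16-bit default are assumptions for illustration; the size excludes any file header.

```python
SAMPLE_RATE_HZ = 44_100  # from this README
CHANNELS = 2             # stereo
MAX_SECONDS = 47         # longest clip per generation


def clip_stats(seconds, bytes_per_sample=2):
    """Return (frames per channel, raw PCM bytes) for a clip of the given length."""
    if not 0 < seconds <= MAX_SECONDS:
        raise ValueError(f"duration must be in (0, {MAX_SECONDS}] seconds")
    frames = round(seconds * SAMPLE_RATE_HZ)
    return frames, frames * CHANNELS * bytes_per_sample


frames, size_bytes = clip_stats(47)
print(frames, size_bytes)  # 2072700 8290800
```

A full-length clip is therefore about 2.07 million frames per channel, or roughly 8.3 MB as 16-bit PCM.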

What It's NOT Good At

Full songs with vocals
High-fidelity musical instruments (use Foundation-1 for that)
Speech synthesis

VRAM Requirements

Minimum: ~4 GB (FP16)
Recommended: ~7 GB (FP16, longer durations)
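
A minimal sketch for checking a GPU against the figures above. It assumes the thresholds are decimal gigabytes and compares them with the card's total memory rather than what is currently free, so treat the result as a rough guide. The function names are hypothetical; `torch.cuda.get_device_properties(...).total_memory` is the only PyTorch call involved.

```python
GB = 10 ** 9              # assumption: the README figures are decimal gigabytes
MIN_VRAM_GB = 4           # "Minimum" figure above
RECOMMENDED_VRAM_GB = 7   # "Recommended" figure above


def vram_tier(total_bytes):
    """Classify a total-memory figure against the README's FP16 guidance."""
    gb = total_bytes / GB
    if gb >= RECOMMENDED_VRAM_GB:
        return "recommended"
    if gb >= MIN_VRAM_GB:
        return "minimum"
    return "below minimum"


def detect_vram_tier(device_index=0):
    """Query the CUDA device through PyTorch and classify it."""
    import torch  # imported lazily so vram_tier works without PyTorch

    if not torch.cuda.is_available():
        return "no CUDA device"
    total = torch.cuda.get_device_properties(device_index).total_memory
    return vram_tier(total)
```

On a card in the "minimum" tier, keeping `audio_end_in_s` short is the safer choice, since the recommended figure is tied to longer durations.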

Usage with Mæstræa

These models are automatically downloaded by the Mæstræa AI Workstation backend.

Direct Usage (diffusers)

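import torch
import soundfile as sf  # extra dependency, used only to write the WAV file
from diffusers import StableAudioPipeline

# Load the pipeline in half precision and move it to the GPU.
pipe = StableAudioPipeline.from_pretrained(
    "AEmotionStudio/stable-audio-open-models",
    torch_dtype=torch.float16,
).to("cuda")

# Generate a 10-second clip; audio_end_in_s can be raised to about 47 seconds.
audio = pipe(
    prompt="Thunderstorm with heavy rain and distant rolling thunder",
    negative_prompt="low quality, distorted",
    audio_end_in_s=10.0,
    num_inference_steps=100,
).audios[0]

# audio is a (channels, samples) tensor; soundfile expects (samples, channels).
sf.write("thunderstorm.wav", audio.T.float().cpu().numpy(), pipe.vae.sampling_rate)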

Using stable-audio-tools

from stable_audio_tools import get_pretrained_model
model, model_config = get_pretrained_model("AEmotionStudio/stable-audio-open-models")
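
Loading the model is only the first step. The sketch below continues to generation and follows the usage published for the upstream `stabilityai/stable-audio-open-1.0` model; the sampler settings (`cfg_scale`, `sigma_min`, `sigma_max`, `sampler_type`) are taken from that example and are not tuned for this mirror. It additionally needs `torchaudio` and `einops`. The helper names `build_conditioning` and `generate_clip` are hypothetical.

```python
MAX_SECONDS = 47  # upper bound stated in this README


def build_conditioning(prompt, seconds_total):
    """Build the conditioning list passed to generate_diffusion_cond."""
    if not 0 < seconds_total <= MAX_SECONDS:
        raise ValueError(f"seconds_total must be in (0, {MAX_SECONDS}]")
    return [{"prompt": prompt, "seconds_start": 0, "seconds_total": seconds_total}]


def generate_clip(prompt, seconds_total, out_path="output.wav", steps=100):
    """Generate one clip with stable-audio-tools and save it as 16-bit WAV."""
    # Heavy imports live here so build_conditioning works without a GPU stack.
    import torch
    import torchaudio
    from einops import rearrange
    from stable_audio_tools import get_pretrained_model
    from stable_audio_tools.inference.generation import generate_diffusion_cond

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model, model_config = get_pretrained_model("AEmotionStudio/stable-audio-open-models")
    model = model.to(device)

    output = generate_diffusion_cond(
        model,
        steps=steps,
        cfg_scale=7,
        conditioning=build_conditioning(prompt, seconds_total),
        sample_size=model_config["sample_size"],
        sigma_min=0.3,
        sigma_max=500,
        sampler_type="dpmpp-3m-sde",
        device=device,
    )

    # (batch, channels, samples) -> (channels, samples), then peak-normalize to int16.
    output = rearrange(output, "b d n -> d (b n)").to(torch.float32)
    output = output.div(output.abs().max()).clamp(-1, 1).mul(32767).to(torch.int16).cpu()
    torchaudio.save(out_path, output, model_config["sample_rate"])
    return out_path
```

For example, `generate_clip("Rain on a tin roof", 10)` would write `output.wav` to the current directory.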

License

Stability AI Community License — see LICENSE.md for full terms.

Key points:

Free for research and non-commercial use
Commercial use is permitted for individuals and organizations with under $1M in annual revenue; above that threshold, a separate license from Stability AI is required
Model outputs cannot be used to train competing models

Credits

Model: Stability AI
Paper: Stable Audio Open
Training Data: Freesound + Free Music Archive (see attribution CSVs)
Mirror by: AEmotionStudio
