Stable Audio Open 1.0 (Mæstræa Mirror)

Text-to-Audio SFX & Ambient Textures — Up to 47s Stereo @ 44.1kHz

Original Model by Stability AI · Stability AI Community License

This is an ungated mirror of the Stable Audio Open 1.0 model weights for use with Mæstræa AI Workstation. Only safetensors-format weights are included (legacy .ckpt files stripped). All credits go to the original authors.

What's in This Repo

| Path | Description | Size |
|---|---|---|
| model.safetensors | Main model checkpoint | ~3 GB |
| transformer/diffusion_pytorch_model.safetensors | DiT transformer | ~1.5 GB |
| text_encoder/model.safetensors | T5 text encoder | ~1.2 GB |
| vae/diffusion_pytorch_model.safetensors | VAE decoder | ~150 MB |
| projection_model/diffusion_pytorch_model.safetensors | Projection model | ~50 MB |
| tokenizer/ | T5 tokenizer files | < 10 MB |
| model_config.json | Model architecture config | < 1 KB |
| model_index.json | Diffusers pipeline index | < 1 KB |
| scheduler/ | Scheduler config | < 1 KB |
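
If only the Diffusers-format components are needed, the single-file checkpoint can be skipped at download time. The following is a minimal sketch, assuming the `huggingface_hub` package is installed and that the folder names match the listing above. The helper names and the `local_dir` value are illustrative and not part of this repo.

```python
# Sketch: fetch only the Diffusers-format components, skipping model.safetensors.
# Folder names come from the listing above; helper names are hypothetical.

DIFFUSERS_FOLDERS = [
    "transformer",
    "text_encoder",
    "vae",
    "projection_model",
    "tokenizer",
    "scheduler",
]


def diffusers_patterns():
    """Glob patterns covering the pipeline index plus every component folder."""
    return ["model_index.json"] + [f"{folder}/*" for folder in DIFFUSERS_FOLDERS]


def download_diffusers_components(local_dir="stable-audio-open"):
    # Imported lazily so the pattern helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="AEmotionStudio/stable-audio-open-models",
        allow_patterns=diffusers_patterns(),
        local_dir=local_dir,
    )
```

Calling `download_diffusers_components()` returns the local path of the downloaded snapshot, which can then be passed to `StableAudioPipeline.from_pretrained`.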

What Stable Audio Open Does

Stable Audio Open generates stereo audio at 44.1kHz from text prompts. It excels at:

  • Sound effects — Foley, impacts, transitions
  • Ambient textures — Rain, wind, crowds, environments
  • Musical textures — Pads, drones, atmospheric sounds
  • Audio scenes — Complex layered soundscapes

Each generation produces up to 47 seconds of stereo audio.
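
To get a feel for what those limits mean in practice, the arithmetic below works out the frame count and uncompressed size of a maximum-length clip. It is a rough sketch that assumes 16-bit PCM output (2 bytes per sample); the function name `clip_stats` is illustrative.

```python
SAMPLE_RATE_HZ = 44_100  # 44.1 kHz, as stated above
MAX_SECONDS = 47         # maximum clip length
CHANNELS = 2             # stereo


def clip_stats(seconds=MAX_SECONDS, bytes_per_sample=2):
    """Return (frames per channel, total samples, size in bytes) for a clip."""
    frames = int(seconds * SAMPLE_RATE_HZ)
    total_samples = frames * CHANNELS
    size_bytes = total_samples * bytes_per_sample
    return frames, total_samples, size_bytes


frames, total_samples, size_bytes = clip_stats()
print(frames, total_samples, size_bytes)  # 2072700 4145400 8290800
```

A full-length clip is therefore about 2.07 million frames per channel, or roughly 8.3 MB as 16-bit stereo PCM.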

What It's NOT Good At

  • Full songs with vocals
  • High-fidelity musical instruments (use Foundation-1 for that)
  • Speech synthesis

VRAM Requirements

  • Minimum: ~4 GB (FP16)
  • Recommended: ~7 GB (FP16, longer durations)
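
A quick way to see which tier a machine falls into is to compare free GPU memory against the two figures above. This is a sketch only: the thresholds are the approximate values from this section, the function names are hypothetical, and `torch.cuda.mem_get_info` reports memory free at the moment of the call rather than what generation will actually need.

```python
def vram_tier(free_gb):
    """Map free GPU memory (in GB) to the approximate tiers listed above."""
    if free_gb >= 7:
        return "recommended"
    if free_gb >= 4:
        return "minimum"
    return "insufficient"


def free_vram_gb():
    """Free memory on the current CUDA device in GB, or 0.0 without CUDA."""
    import torch  # imported lazily so vram_tier works without PyTorch

    if not torch.cuda.is_available():
        return 0.0
    free_bytes, _total_bytes = torch.cuda.mem_get_info()
    return free_bytes / 1024**3
```

For example, `vram_tier(free_vram_gb())` gives a rough indication of whether longer durations are likely to fit.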

Usage with Mæstræa

These models are automatically downloaded by the Mæstræa AI Workstation backend.

Direct Usage (diffusers)

from diffusers import StableAudioPipeline
import torch

# Load the pipeline in half precision and move it to the GPU.
pipe = StableAudioPipeline.from_pretrained(
    "AEmotionStudio/stable-audio-open-models",
    torch_dtype=torch.float16,
).to("cuda")

# Generate a 10-second clip; .audios[0] is the first waveform
# as a tensor of shape [channels, samples].
audio = pipe(
    prompt="Thunderstorm with heavy rain and distant rolling thunder",
    negative_prompt="low quality, distorted",
    audio_end_in_s=10.0,
    num_inference_steps=100,
).audios[0]
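
The snippet above stops at the generated tensor. One way to write it to disk is sketched below, assuming the `soundfile` package is installed and that `audio` and `pipe` are the objects from the snippet above. The helper names and the output filename are illustrative.

```python
def to_soundfile_layout(audio):
    """Convert a [channels, samples] tensor to a [samples, channels] float32 array."""
    # soundfile expects samples along the first axis, so transpose first,
    # then upcast from float16 and move to the CPU before converting to NumPy.
    return audio.T.float().cpu().numpy()


def save_wav(audio, path, sample_rate):
    import soundfile as sf  # assumed dependency, not bundled with diffusers

    sf.write(path, to_soundfile_layout(audio), sample_rate)
```

Usage, with the sample rate read from the pipeline's VAE: `save_wav(audio, "thunderstorm.wav", pipe.vae.sampling_rate)`.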

Using stable-audio-tools

from stable_audio_tools import get_pretrained_model
model, model_config = get_pretrained_model("AEmotionStudio/stable-audio-open-models")
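
Loading the model is only the first step. A generation call might look like the sketch below, which follows the pattern of the upstream Stable Audio Open example. The sampler settings (`cfg_scale`, `sigma_min`, `sigma_max`, `sampler_type`) are carried over from that example as assumptions rather than values tuned for this mirror, and `build_conditioning` and `generate` are hypothetical helper names.

```python
MAX_SECONDS = 47  # upper bound stated in this README


def build_conditioning(prompt, seconds_total):
    """Build the text and timing conditioning, clamped to the supported range."""
    seconds_total = max(0.0, min(float(seconds_total), MAX_SECONDS))
    return [{"prompt": prompt, "seconds_start": 0, "seconds_total": seconds_total}]


def generate(prompt, seconds_total=10, steps=100):
    # Heavy imports are kept inside the function so the helper above stays lightweight.
    import torch
    from stable_audio_tools import get_pretrained_model
    from stable_audio_tools.inference.generation import generate_diffusion_cond

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model, model_config = get_pretrained_model("AEmotionStudio/stable-audio-open-models")
    model = model.to(device)

    output = generate_diffusion_cond(
        model,
        steps=steps,
        cfg_scale=7,                  # assumed, from the upstream example
        conditioning=build_conditioning(prompt, seconds_total),
        sample_size=model_config["sample_size"],
        sigma_min=0.3,                # assumed, from the upstream example
        sigma_max=500,                # assumed, from the upstream example
        sampler_type="dpmpp-3m-sde",  # assumed, from the upstream example
        device=device,
    )
    # output has shape [batch, channels, samples]
    return output, model_config["sample_rate"]
```

The returned tensor covers the model's full sample window, so it may need trimming to the requested duration before saving.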

License

Stability AI Community License — see LICENSE.md for full terms.

Key points:

  • Free for research and non-commercial use
  • Commercial use is permitted with annual revenue under $1M; above that, a separate license from Stability AI is required
  • Model outputs cannot be used to train competing models

Credits
