VAE Lyra 🎡 - Illustrious Edition

Multi-modal VAE trained with custom CLIP weights.

CLIP Encoders

Uses CLIP weights from AbstractPhil/clips:

  • CLIP-L: IllustriousV01_clip_l.safetensors
  • CLIP-G: IllustriousV01_clip_g.safetensors

CLIP Skip: 2 (penultimate layer)
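
To apply CLIP skip 2 with Hugging Face transformers, take the penultimate hidden state instead of the final layer output. A minimal sketch, assuming a stock CLIP-L base model (swap in the IllustriousV01 weights from AbstractPhil/clips in practice):

from transformers import CLIPTextModel, CLIPTokenizer

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

tokens = tokenizer("masterpiece, 1girl, blue hair", padding="max_length",
                   max_length=77, return_tensors="pt")
outputs = text_encoder(**tokens, output_hidden_states=True)

# CLIP skip 2: hidden_states[-1] is the final layer, [-2] the penultimate one
clip_l_embeddings = outputs.hidden_states[-2]   # [batch, 77, 768]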

Model Details

  • Fusion Strategy: adaptive_cantor
  • Latent Dimension: 2048
  • Training Steps: 12,125
  • Best Loss: 0.0377
  • Prompt Source: booru

Quick Load (Safetensors)

from safetensors.torch import load_file

# Load just the weights (fast)
state_dict = load_file("weights/lyra_illustrious_best.safetensors")

# Or specific step
state_dict = load_file("weights/lyra_illustrious_step_5000.safetensors")
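
The safetensors files hold weights only, so instantiate the model first (for example via load_lyra_from_hub, shown under Usage below) and then apply the state dict:

model.load_state_dict(state_dict)
model.eval()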

T5 Input Format

T5 receives a different input than CLIP to enable richer semantic understanding:

CLIP sees:  "masterpiece, 1girl, blue hair, school uniform, smile"
T5 sees:    "masterpiece, 1girl, blue hair, school uniform, smile ¶ A cheerful schoolgirl with blue hair smiling warmly"

The pilcrow (¶) separator acts as a mode-switch token.
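
Building the two prompts is plain string work. A minimal sketch (the caption is whatever long-form description accompanies your tags):

tags = "masterpiece, 1girl, blue hair, school uniform, smile"
caption = "A cheerful schoolgirl with blue hair smiling warmly"

clip_prompt = tags
t5_prompt = f"{tags} ¶ {caption}"   # pilcrow switches T5 from tag mode to prose mode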

Learned Parameters

Alpha (Visibility):

  • clip_g: 0.7316
  • clip_l: 0.7316
  • t5_xl_g: 0.7339
  • t5_xl_l: 0.7451

Beta (Capacity):

  • clip_l_t5_xl_l: 0.5709
  • clip_g_t5_xl_g: 0.5763
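
Roughly, alpha weights how visible each modality's encoding is in the shared latent, while beta sets the capacity of each paired CLIP/T5 fusion path. The sketch below is illustrative only; the actual mixing is internal to the adaptive_cantor fusion:

alpha = {"clip_g": 0.7316, "clip_l": 0.7316, "t5_xl_g": 0.7339, "t5_xl_l": 0.7451}
beta = {"clip_l_t5_xl_l": 0.5709, "clip_g_t5_xl_g": 0.5763}

# Hypothetical gating: scale each stream by its visibility, then blend the
# paired streams by capacity (not the model's exact fusion rule).
def fuse(clip_feat, t5_feat, a_clip, a_t5, b):
    return b * (a_clip * clip_feat) + (1 - b) * (a_t5 * t5_feat)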

Usage

from lyra_xl_multimodal import load_lyra_from_hub

model = load_lyra_from_hub("AbstractPhil/vae-lyra-xl-adaptive-cantor-illustrious")
model.eval()

inputs = {
    "clip_l": clip_l_embeddings,     # [batch, 77, 768]
    "clip_g": clip_g_embeddings,     # [batch, 77, 1280]
    "t5_xl_l": t5_xl_embeddings,     # [batch, 512, 2048]
    "t5_xl_g": t5_xl_embeddings      # [batch, 512, 2048]
}

recons, mu, logvar, _ = model(inputs, target_modalities=["clip_l", "clip_g"])
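
The forward call returns reconstructions for the requested target modalities plus the latent mean and log-variance (the usual VAE pair). The four input tensors can be produced with Hugging Face transformers; a sketch, assuming stock base encoders (substitute the IllustriousV01 CLIP weights in practice, and note that any T5 variant with d_model 2048, e.g. google/flan-t5-xl, matches the expected shape):

import torch
from transformers import (CLIPTextModel, CLIPTextModelWithProjection,
                          CLIPTokenizer, T5EncoderModel, T5Tokenizer)

clip_tok = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
clip_l_enc = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")
# CLIP-G (1280-dim) is the OpenCLIP bigG text tower; SDXL's text_encoder_2
# is used here purely as a placeholder source for that architecture.
clip_g_enc = CLIPTextModelWithProjection.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", subfolder="text_encoder_2")

t5_tok = T5Tokenizer.from_pretrained("google/flan-t5-xl")
t5_enc = T5EncoderModel.from_pretrained("google/flan-t5-xl")

clip_tokens = clip_tok(clip_prompt, padding="max_length", max_length=77,
                       truncation=True, return_tensors="pt")
t5_tokens = t5_tok(t5_prompt, padding="max_length", max_length=512,
                   truncation=True, return_tensors="pt")

with torch.no_grad():
    # CLIP skip 2: penultimate hidden states for both CLIP streams
    clip_l_embeddings = clip_l_enc(**clip_tokens,
                                   output_hidden_states=True).hidden_states[-2]
    clip_g_embeddings = clip_g_enc(**clip_tokens,
                                   output_hidden_states=True).hidden_states[-2]
    # The same T5 output feeds both the t5_xl_l and t5_xl_g inputs
    t5_xl_embeddings = t5_enc(**t5_tokens).last_hidden_state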

Files

  • model.pt - Full checkpoint (model + optimizer + scheduler)
  • checkpoint_lyra_illustrious_XXXX.pt - Step checkpoints
  • config.json - Training configuration
  • weights/lyra_illustrious_best.safetensors - Best model weights only
  • weights/lyra_illustrious_step_XXXX.safetensors - Step checkpoints (weights only)