## 7. Results

Reconstruction quality was evaluated on a curated set of test images covering photographs, book covers, and documents. Flux.1 VAE (patch 8, 16 channels) is included as a reference at the same 12x compression ratio as the c64 variant.
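The 12x figure can be checked with a small calculation. This is a minimal sketch that assumes RGB input (3 channels) and defines the ratio as input values per patch divided by latent values per patch. The helper name `compression_ratio` is hypothetical, and the patch size of 16 used for the 64-channel case is an assumption inferred from the stated ratio, not a value given in this section.

```python
def compression_ratio(patch: int, latent_channels: int, image_channels: int = 3) -> float:
    """Input values per spatial patch divided by latent values per patch."""
    return (patch * patch * image_channels) / latent_channels


# Reference configuration stated above: patch 8, 16 latent channels.
reference_ratio = compression_ratio(patch=8, latent_channels=16)

# Assumed configuration for a 64-channel latent: patch 16 gives the same ratio.
assumed_c64_ratio = compression_ratio(patch=16, latent_channels=64)

print(reference_ratio, assumed_c64_ratio)
```

Both calls evaluate to 12.0, since 8 × 8 × 3 / 16 and 16 × 16 × 3 / 64 reduce to the same value.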

### 7.1 Interactive Viewer

**[Open full-resolution comparison viewer](https://huggingface.co/spaces/data-archetype/irdiffae-results)** — side-by-side reconstructions, RGB deltas, and latent PCA with adjustable image size.

### 7.2 Inference Settings

| Setting | Value |
|---------|-------|
| Sampler | ddim |
| Steps | 1 |
| Schedule | linear |
| Seed | 42 |
| PDG | no_path_dropg |
| Batch size (timing) | 4 |

> All models run in bfloat16. Timings measured on an NVIDIA RTX Pro 6000 (Blackwell).

### 7.3 Global Metrics

| Metric | semdisdiffae (1 step) | Flux.2 VAE |
|--------|--------|--------|
| Avg PSNR (dB) | 35.78 | 34.16 |
| Avg encode (ms/image) | 2.5 | 46.1 |
| Avg decode (ms/image) | 5.5 | 91.8 |
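For readers unfamiliar with the metric, PSNR is conventionally defined as 10 · log10(MAX² / MSE). The sketch below uses only the standard library and operates on flat sequences of pixel values. It assumes values in [0, 1] (so `max_value=1.0`); the value range and any colour-space handling used by the actual evaluation are not stated in this section, so treat this as an illustration rather than the evaluation code.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], reconstruction: Sequence[float], max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(reconstruction) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel values.
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstruction)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy example: every value is off by 0.01, so MSE is about 1e-4.
original = [0.00, 0.25, 0.50, 1.00]
decoded = [0.01, 0.26, 0.51, 0.99]
toy_psnr = psnr(original, decoded)
print(f"{toy_psnr:.2f} dB")
```

With a uniform error of 0.01 on a [0, 1] scale, the toy example comes out at roughly 40 dB, which gives a sense of scale for the values in the tables.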

### 7.4 Per-Image PSNR (dB)

| Image | semdisdiffae (1 step) | Flux.2 VAE |
|-------|--------|--------|
| p640x1536:94623 | 35.44 | 33.50 |
| p640x1536:94624 | 31.33 | 30.03 |
| p640x1536:94625 | 35.05 | 33.98 |
| p640x1536:94626 | 33.21 | 31.53 |
| p640x1536:94627 | 32.54 | 30.53 |
| p640x1536:94628 | 29.80 | 28.88 |
| p960x1024:216264 | 46.37 | 45.39 |
| p960x1024:216265 | 29.70 | 27.80 |
| p960x1024:216266 | 47.15 | 46.20 |
| p960x1024:216267 | 40.99 | 39.23 |
| p960x1024:216268 | 38.47 | 36.13 |
| p960x1024:216269 | 32.74 | 30.24 |
| p960x1024:216270 | 36.23 | 34.18 |
| p960x1024:216271 | 44.41 | 42.18 |
| p704x1472:94699 | 43.80 | 41.79 |
| p704x1472:94700 | 32.83 | 32.08 |
| p704x1472:94701 | 39.00 | 37.90 |
| p704x1472:94702 | 34.52 | 32.50 |
| p704x1472:94703 | 32.81 | 31.35 |
| p704x1472:94704 | 33.38 | 31.84 |
| p704x1472:94705 | 39.70 | 37.44 |
| p704x1472:94706 | 35.12 | 33.66 |
| r256_p1344x704:15577 | 31.02 | 29.98 |
| r256_p1344x704:15578 | 32.38 | 30.79 |
| r256_p1344x704:15579 | 33.27 | 31.83 |
| r256_p1344x704:15580 | 37.84 | 36.03 |
| r256_p1344x704:15581 | 38.57 | 36.94 |
| r256_p1344x704:15582 | 33.41 | 32.10 |
| r256_p1344x704:15583 | 36.67 | 34.54 |
| r256_p1344x704:15584 | 33.23 | 31.76 |
| r256_p896x1152:144131 | 35.30 | 33.60 |
| r256_p896x1152:144132 | 36.99 | 35.32 |
| r256_p896x1152:144133 | 39.69 | 37.33 |
| r256_p896x1152:144134 | 36.01 | 34.47 |
| r256_p896x1152:144135 | 31.20 | 29.87 |
| r256_p896x1152:144136 | 37.51 | 35.68 |
| r256_p896x1152:144137 | 33.83 | 32.86 |
| r256_p896x1152:144138 | 27.39 | 25.63 |
| VAE_accuracy_test_image | 36.64 | 35.25 |
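As a consistency check, the average PSNR in the global metrics table appears to be the plain arithmetic mean of the per-image values. The sketch below copies the semdisdiffae column from the table above and recomputes it. The variable names are illustrative, and the assumption that the reported average is an unweighted mean is inferred from the numbers rather than stated in the text.

```python
# semdisdiffae (1 step) per-image PSNR values in dB, copied in table order.
semdisdiffae_psnr = [
    35.44, 31.33, 35.05, 33.21, 32.54, 29.80,
    46.37, 29.70, 47.15, 40.99, 38.47, 32.74, 36.23, 44.41,
    43.80, 32.83, 39.00, 34.52, 32.81, 33.38, 39.70, 35.12,
    31.02, 32.38, 33.27, 37.84, 38.57, 33.41, 36.67, 33.23,
    35.30, 36.99, 39.69, 36.01, 31.20, 37.51, 33.83, 27.39,
    36.64,
]

image_count = len(semdisdiffae_psnr)
mean_psnr = sum(semdisdiffae_psnr) / image_count

print(f"images: {image_count}")
print(f"mean PSNR: {mean_psnr:.2f} dB")
print(f"range: {min(semdisdiffae_psnr):.2f} to {max(semdisdiffae_psnr):.2f} dB")
```

Over the 39 listed images the mean rounds to 35.78 dB, matching the reported average.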