AbstractPhil
AI & ML interests
datasets, research papers, experimentation, vision, classification, text encoders, tokenization, llms, diffusion, distillation, and more.
Recent Activity
posted an update about 2 hours ago
SVD + Scatterpoint2D is the official encoding structure of the geolip system as of the image-encoding tests.
Both the unattuned Scatterpoint2D and the Triton-aligned SVD are a cut above the alternatives by a large margin.
https://github.com/kymatio/kymatio
https://huggingface.co/blog/AbstractPhil/svd-triton-kernel-optimization
https://huggingface.co/AbstractPhil/svd-triton
https://huggingface.co/AbstractPhil/geolip-hypersphere-experiments/tree/main/spectral/notebooks
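For context on what the fused Triton kernel accelerates: the un-fused baseline is a batched thin (reduced) SVD. A minimal NumPy sketch of that baseline is below; the shapes and variable names are illustrative, not taken from the linked repos.

```python
import numpy as np

# Batch of 64 tall matrices (m >= n), the "thin" SVD case.
rng = np.random.default_rng(0)
A = rng.standard_normal((64, 128, 16))

# Thin SVD: full_matrices=False keeps U at (m, n) instead of (m, m),
# which is what makes the batched decomposition cheap.
U, S, Vh = np.linalg.svd(A, full_matrices=False)

# Reconstruction check: A ≈ U @ diag(S) @ Vh for every batch element.
A_rec = U @ (S[..., :, None] * Vh)
print(U.shape, S.shape, Vh.shape)   # (64, 128, 16) (64, 16) (64, 16, 16)
print(np.allclose(A, A_rec))        # True
```

A fused kernel computes all 64 decompositions in one launch instead of looping, which is where the batched speedup comes from.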
Most kymatio tests were run on standard PyTorch models, which yielded higher accuracy than simple convolutional or transformer baselines before overfitting, though not in every instance. The most commonly tested low-sample-count CIFAR-10 and CIFAR-100 runs yielded more accuracy for less compute. Those runs are in the hypersphere-experiments notebooks and are viewable via the Hugging Face TensorBoard metrics.
The accuracy, retention, agreement, disagreement, and sheer capacity of the refined SVD kernel show that full Procrustes alignment is not just crucial to distillation; it is also entirely representable within the student encoders themselves.
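In the orthogonal case, Procrustes alignment between two sets of features reduces to a single SVD. A minimal sketch, with hypothetical variable names and assuming centered features:

```python
import numpy as np

def orthogonal_procrustes(X, Y):
    """Return the orthogonal map R minimizing ||X @ R - Y||_F.

    Classic closed-form solution: SVD of X^T Y, then R = U @ Vh.
    """
    U, _, Vh = np.linalg.svd(X.T @ Y)
    return U @ Vh

rng = np.random.default_rng(1)
X = rng.standard_normal((256, 8))                 # "student" features
Q, _ = np.linalg.qr(rng.standard_normal((8, 8)))  # hidden orthogonal map
Y = X @ Q                                         # "teacher" features

R = orthogonal_procrustes(X, Y)
print(np.allclose(R, Q))       # True: recovered the hidden map
print(np.allclose(X @ R, Y))   # True: student aligned to teacher
```

This is the alignment step a student encoder would have to learn to reproduce internally for the distillation claim above to hold.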
This structure can re-impose representations layer by layer, which is what I tested. The capture system can behave as a global regularizer, a selector, a behavioral adjudication structure, an encoding solidification unit, a systemic trajectory accumulator, an anchored differentiation unit, and, as roughly 30 other tests show, all of the above simultaneously.
The preliminary rapid-iteration kernel shows that these structures can not only represent useful behavior, but that the noise-drift can be directly accounted for: components like GELU, drop path, and dropout let the model learn to ignore the very noise that accumulates.
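As a sketch of the regularization idea (my own illustration, not code from the linked repos): drop path, also known as stochastic depth, zeroes a residual branch per sample during training and rescales the survivors, so the network learns not to depend on any single noisy increment.

```python
import numpy as np

def drop_path(branch, p, rng, training=True):
    """Stochastic depth: drop the residual branch per-sample with prob p."""
    if not training or p == 0.0:
        return branch
    keep = rng.random(branch.shape[0]) >= p       # per-sample keep mask
    return branch * keep[:, None] / (1.0 - p)     # rescale survivors

rng = np.random.default_rng(2)
x = rng.standard_normal((4, 16))                  # residual stream
noise_branch = 0.1 * rng.standard_normal((4, 16))

# Training: some samples skip the noisy branch entirely.
y_train = x + drop_path(noise_branch, p=0.5, rng=rng)
# Inference: the branch passes through unchanged.
y_eval = x + drop_path(noise_branch, p=0.5, rng=rng, training=False)
print(np.allclose(y_eval, x + noise_branch))      # True
```

Dropout and GELU play analogous roles at the unit level; drop path is simply the coarsest version of the same "learn to ignore the noise" pressure.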
Based on these tests and examples, attention is now officially deemed valid here: geometric structure is preserved after attention selection.
This encoding structure is substantially more durable than I had given it credit for.
Surge is coming, exactly as predicted. Late, I admit.
published an article about 3 hours ago
Fused Batched Thin SVD: Engineering a 5000× Speedup with Triton Kernels
updated a model about 4 hours ago
AbstractPhil/svd-triton