Semantic Bundle AI: A Complementary Layer for LLMs — 91.7% Memory Reduction, 38.6% Drift Reduction, Zero Retraining

msaitou · May 27, 2026, 9:48pm

Hi HF community,

I’d like to share two preprints on Semantic Bundle AI, a drop-in complementary layer for existing LLMs that addresses three structural problems: semantic drift, difficulty of targeted edits, and memory overhead.

The problem

When you update a concept in an LLM’s embedding space:

The change spreads to unrelated concepts (semantic drift)
You can’t surgically edit one concept without contaminating others
Storing full embeddings at scale is expensive

The approach

Semantic Bundle AI sits on top of existing LLMs — no architectural changes, no retraining required.

Anchor coordinates: stable reference frames that resist drift
Semantic bundles: structured concept representations with controlled update locality
Sparse reconstruction: compress stored embeddings via bundle-based reconstruction
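
To make the three components concrete, here is a minimal sketch of one possible reading: an embedding is expressed as K coordinates against a fixed set of anchor vectors, and only those K numbers are stored. This is not taken from the preprints. The random orthonormal anchors, the function names, and the toy dimensions are all assumptions for illustration, and how the papers actually choose anchors is not described in this post.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthonormal_anchors(dim, k, seed=0):
    """Build k orthonormal anchor vectors via Gram-Schmidt (hypothetical anchor choice)."""
    rng = random.Random(seed)
    anchors = []
    while len(anchors) < k:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for a in anchors:  # remove components along anchors found so far
            proj = dot(v, a)
            v = [x - proj * y for x, y in zip(v, a)]
        norm = math.sqrt(dot(v, v))
        if norm > 1e-9:
            anchors.append([x / norm for x in v])
    return anchors

def to_anchor_coords(embedding, anchors):
    """Store only K numbers: the projection of the embedding onto each anchor."""
    return [dot(embedding, a) for a in anchors]

def reconstruct(coords, anchors):
    """Rebuild an approximate embedding as a weighted sum of anchors."""
    out = [0.0] * len(anchors[0])
    for c, a in zip(coords, anchors):
        for i, a_i in enumerate(a):
            out[i] += c * a_i
    return out

def cosine(u, v):
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

DIM, K = 32, 8  # toy sizes, not the sizes used in the PoC
anchors = orthonormal_anchors(DIM, K)

# An embedding that lies mostly in the anchor span, plus a small off-span component.
rng = random.Random(1)
embedding = [2.0 * a0 - 1.0 * a3 + 0.05 * rng.gauss(0.0, 1.0)
             for a0, a3 in zip(anchors[0], anchors[3])]

coords = to_anchor_coords(embedding, anchors)
approx = reconstruct(coords, anchors)
print(len(coords), "stored values instead of", DIM)
print("reconstruction cosine similarity:", round(cosine(embedding, approx), 3))
```

Reconstruction quality in this sketch depends entirely on how much of the embedding lies in the span of the anchors, which is why anchor selection would matter far more on real data than it does in this toy setup.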

PoC results (4 experiments)

Metric	Result
Memory reduction (K=64)	91.7% (45.0 KB → 3.8 KB)
Reconstruction similarity	0.963
Cumulative drift reduction	38.6%
Edit contamination rate	32.6% of baseline (at ρ=0.1)
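
As a sanity check on the memory row, the arithmetic below shows one configuration that reproduces the reported sizes. The sentence count (15, the lower end of the dataset range listed under limitations), the 768-dimensional embeddings, and the 4-byte float32 values are assumptions, not figures stated in the post; only K=64 comes from the table.

```python
def storage_bytes(num_vectors, values_per_vector, bytes_per_value=4):
    """Raw storage for a set of vectors, ignoring any container overhead."""
    return num_vectors * values_per_vector * bytes_per_value

NUM_SENTENCES = 15  # assumption: smallest dataset size mentioned in the post
EMBED_DIM = 768     # assumption: a common sentence-embedding width
K = 64              # from the results table

full = storage_bytes(NUM_SENTENCES, EMBED_DIM)
compact = storage_bytes(NUM_SENTENCES, K)
reduction = 1.0 - compact / full

print(f"{full / 1024:.1f} KB -> {compact / 1024:.2f} KB, reduction {reduction:.1%}")
# prints "45.0 KB -> 3.75 KB, reduction 91.7%"
```

Under these assumptions the reduction is simply 1 - K/768. The count covers only the per-sentence coefficients; any shared anchor or basis vectors would add a fixed cost that is not included here.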

Zero retraining. Zero architectural modifications.

Papers & code

Zenodo (Paper 0 + Paper 1): Search results
Code: msaitou-glitch/Semantic-Bundle-AI on GitHub (official repository for the "Meaning Bundle AI" project: a complementary layer to LLMs using stable coordinate systems)

Limitations (honest)

Small-scale controlled datasets (15–110 sentences, single domain)
Stability–ranking tradeoff identified (anchor coordinates improve cluster stability but not ranking consistency)
Not yet validated at production scale
Paper 1 under review at SSRN

Looking for critiques, failure cases, and adjacent work. Happy to discuss.

msaitou · May 27, 2026, 9:50pm

PoC 01 — Anchor Coordinate Stability

Anchor coordinates reduce intra-cluster variance to 5–8% of raw embedding variance (ratio: 0.052–0.081). Synonym-group clusters tighten sharply while noise-group separation is maintained.
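
For readers who want to check a ratio like this on their own data, here is a minimal sketch of the measurement. It assumes intra-cluster variance means the mean squared distance to the cluster centroid, which the post does not spell out, and the two toy clusters are invented for illustration.

```python
def intra_cluster_variance(points):
    """Mean squared Euclidean distance from each point to the cluster centroid."""
    n = len(points)
    dim = len(points[0])
    centroid = [sum(p[i] for p in points) / n for i in range(dim)]
    return sum(
        sum((p[i] - centroid[i]) ** 2 for i in range(dim)) for p in points
    ) / n

def variance_ratio(raw_cluster, anchored_cluster):
    """Ratio below 1.0 means the cluster is tighter in the anchored representation."""
    return intra_cluster_variance(anchored_cluster) / intra_cluster_variance(raw_cluster)

# Toy synonym cluster in two representations (made-up numbers).
raw = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
anchored = [[0.75, 0.75], [1.25, 0.75], [0.75, 1.25], [1.25, 1.25]]

print(variance_ratio(raw, anchored))  # prints 0.0625
```

If the raw and anchored representations have different dimensionality or scale, the two variances are not directly comparable without some normalization, so a reported ratio is only meaningful alongside the normalization used.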

msaitou · May 27, 2026, 9:51pm

PoC 02 — Longitudinal Consistency

Bundle updates (ρ=0.1) reduce cumulative drift by 38.6% over 10 sequential updates while maintaining a consistency score of 0.931 with the initial bundle.
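
The sketch below illustrates how an update rate like ρ can limit drift over sequential updates. It assumes the bundle update is an exponential moving average, that cumulative drift is the summed step-to-step distance, and that the consistency score is the cosine similarity to the initial bundle. None of these definitions are confirmed by the post, and the observations are toy values.

```python
import math

def ema_update(bundle, observation, rho):
    """Assumed update rule: move the bundle a fraction rho toward the new observation."""
    return [(1.0 - rho) * b + rho * x for b, x in zip(bundle, observation)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def run_updates(initial, observations, rho):
    """Return (cumulative drift, consistency with the initial bundle)."""
    bundle = list(initial)
    drift = 0.0
    for obs in observations:
        updated = ema_update(bundle, obs, rho)
        drift += math.dist(bundle, updated)  # distance moved in this step
        bundle = updated
    return drift, cosine(initial, bundle)

initial = [1.0, 0.0]
# Ten observations that alternate on either side of the initial direction.
observations = [[0.8, 0.6], [0.8, -0.6]] * 5

drift_overwrite, cons_overwrite = run_updates(initial, observations, rho=1.0)
drift_damped, cons_damped = run_updates(initial, observations, rho=0.1)

print("rho=1.0:", round(drift_overwrite, 3), round(cons_overwrite, 3))
print("rho=0.1:", round(drift_damped, 3), round(cons_damped, 3))
```

In this toy setup ρ=1.0 overwrites the bundle on every update, while ρ=0.1 moves it only a tenth of the way each time, so the damped run accumulates far less movement and stays closer to the initial bundle.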

msaitou · May 27, 2026, 9:53pm

PoC 03 — Edit Locality

At ρ=0.1, targeted edits to one concept (Apple/Vision Pro) contaminate unrelated bundles at only 32.6% of the baseline rate. Recommended operating range: ρ < 0.15.
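
One way to measure a contamination figure like this is to compare how far unrelated bundles move after an edit under the method versus under a baseline. The sketch below assumes that definition (mean displacement of non-target bundles, expressed as a ratio to the baseline), which the post does not confirm. The bundle names and vectors are hypothetical.

```python
import math

def mean_offtarget_shift(before, after, target):
    """Average distance moved by every bundle except the one being edited."""
    shifts = [
        math.dist(before[name], after[name]) for name in before if name != target
    ]
    return sum(shifts) / len(shifts)

def contamination_ratio(before, after_method, after_baseline, target):
    """Off-target movement under the method as a fraction of the baseline's."""
    return (
        mean_offtarget_shift(before, after_method, target)
        / mean_offtarget_shift(before, after_baseline, target)
    )

# Hypothetical bundle snapshots before and after editing one concept.
before = {"apple_vision_pro": [1.0, 0.0], "weather": [0.0, 1.0], "cooking": [0.6, 0.8]}
after_baseline = {"apple_vision_pro": [0.5, 0.5], "weather": [0.3, 1.0], "cooking": [0.9, 0.8]}
after_method = {"apple_vision_pro": [0.5, 0.5], "weather": [0.1, 1.0], "cooking": [0.7, 0.8]}

ratio = contamination_ratio(before, after_method, after_baseline, "apple_vision_pro")
print(round(ratio, 3))  # about 0.333 for these made-up numbers
```

A ratio of 0.326 under this definition would mean unrelated bundles move roughly a third as far as they do in the baseline after the same edit.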
