Semantic Bundle AI: A Complementary Layer for LLMs — 91.7% Memory Reduction, 38.6% Drift Reduction, Zero Retraining

Hi HF community,

I’d like to share two preprints on Semantic Bundle AI, a drop-in complementary layer for existing LLMs that addresses three structural problems: semantic drift, difficulty of targeted edits, and memory overhead.

The problem

When you update a concept in an LLM’s embedding space:

  • The change leaks into unrelated concepts (semantic drift)
  • You can’t surgically edit one concept without contaminating others
  • Storing full embeddings at scale is expensive

The approach

Semantic Bundle AI sits on top of existing LLMs — no architectural changes, no retraining required.

  • Anchor coordinates: stable reference frames that resist drift
  • Semantic bundles: structured concept representations with controlled update locality
  • Sparse reconstruction: compress stored embeddings via bundle-based reconstruction
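To make the sparse-reconstruction idea concrete, here is a minimal sketch. It uses plain top-K magnitude sparsification in the embedding's own coordinate basis as a generic stand-in; the bundle-based reconstruction from the papers is not specified in this post, so the function names (`top_k_sparsify`, `reconstruct`) and the toy vector are assumptions for illustration only.

```python
import math

def top_k_sparsify(vec, k):
    """Keep the k largest-magnitude components as {index: value}."""
    top = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)[:k]
    return {i: vec[i] for i in top}

def reconstruct(sparse, dim):
    """Rebuild a dense vector; dropped components become zero."""
    dense = [0.0] * dim
    for i, value in sparse.items():
        dense[i] = value
    return dense

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy embedding (hypothetical values, not from the papers).
embedding = [3.0, 0.1, -4.0, 0.2, 0.0, -0.1]
sparse = top_k_sparsify(embedding, k=2)
approx = reconstruct(sparse, len(embedding))
similarity = cosine(embedding, approx)
print(sorted(sparse), round(similarity, 3))
```

The cosine between the original and the rebuilt vector plays the role of a reconstruction-similarity score. How close that stays to 1.0 depends on how concentrated the signal is in the retained components.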

PoC results (4 experiments)

| Metric | Result |
| --- | --- |
| Memory reduction (K=64) | 91.7% (45.0 KB → 3.8 KB) |
| Reconstruction similarity | 0.963 |
| Cumulative drift reduction | 38.6% |
| Edit contamination rate | 32.6% of baseline (at ρ=0.1) |
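As a sanity check on the memory figures, the arithmetic below shows one setup that reproduces them. The setup is an inference and not stated in this post: 15 embeddings of dimension 768 stored as float32, each replaced by K=64 float32 coefficients, with no per-embedding index or basis overhead counted.

```python
# Hypothetical setup inferred from the reported sizes (an assumption).
N_EMBEDDINGS = 15      # smallest dataset size mentioned under Limitations
DIM = 768              # assumed embedding dimension
K = 64                 # coefficients kept per embedding
BYTES_PER_FLOAT32 = 4

full_bytes = N_EMBEDDINGS * DIM * BYTES_PER_FLOAT32
sparse_bytes = N_EMBEDDINGS * K * BYTES_PER_FLOAT32
reduction = 1 - sparse_bytes / full_bytes  # equals 1 - K / DIM

print(f"{full_bytes / 1024:.1f} KB -> {sparse_bytes / 1024:.2f} KB, "
      f"reduction {reduction:.1%}")
# prints: 45.0 KB -> 3.75 KB, reduction 91.7%
```

Under these assumptions the reduction is simply 1 - K/DIM, so it is independent of the number of embeddings. Any shared basis or index storage would lower the effective figure.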

Zero retraining. Zero architectural modifications.

Papers & code

Zenodo (Paper 0 + Paper 1): Search results
Code: GitHub, msaitou-glitch/Semantic-Bundle-AI (official repository for the "Meaning Bundle AI" project: Complementary Layer to LLMs using Stable Coordinate Systems)

Limitations (honest)

  • Small-scale controlled datasets (15–110 sentences, single domain)
  • Stability–ranking tradeoff identified (anchor coordinates improve cluster stability but not ranking consistency)
  • Not yet validated at production scale
  • Paper 1 under review at SSRN

Looking for critiques, failure cases, and adjacent work. Happy to discuss.

PoC 01 — Anchor Coordinate Stability

Anchor coordinates reduce intra-cluster variance to 5–8% of the raw-embedding variance (ratio: 0.052–0.081). Synonym-group clusters tighten by more than an order of magnitude in variance, while separation from the noise group is maintained.
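For readers who want to compute a comparable ratio on their own data, here is a minimal sketch. It assumes intra-cluster variance means the mean squared Euclidean distance to the cluster centroid, and that the ratio divides the anchored value by the raw value. The exact normalization used in the paper is not given in this post, and the toy points are made up.

```python
def intra_cluster_variance(points):
    """Mean squared Euclidean distance of points to their centroid."""
    dim = len(points[0])
    centroid = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    total = sum(
        sum((p[d] - centroid[d]) ** 2 for d in range(dim)) for p in points
    )
    return total / len(points)

def variance_ratio(raw_cluster, anchored_cluster):
    """Anchored intra-cluster variance as a fraction of the raw one."""
    return (intra_cluster_variance(anchored_cluster)
            / intra_cluster_variance(raw_cluster))

# Toy synonym cluster in raw space and in a hypothetical anchored space.
raw = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
anchored = [[0.9, 0.9], [1.1, 0.9], [0.9, 1.1], [1.1, 1.1]]
ratio = variance_ratio(raw, anchored)
print(f"variance ratio: {ratio:.3f}")
```

A ratio is only meaningful if both spaces are on comparable scales, so in practice it helps to normalize each representation first, for example to unit length.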

PoC 02 — Longitudinal Consistency

Bundle updates at ρ=0.1 reduce cumulative drift by 38.6% over 10 sequential updates while maintaining a consistency score of 0.931 with the initial bundle.
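The sketch below illustrates one way such a measurement could be set up. It assumes ρ is an interpolation rate in an update of the form (1 - ρ) * bundle + ρ * observation, that the baseline simply overwrites the bundle (ρ=1.0), and that drift is the summed cosine distance between successive bundle states. All three are assumptions for illustration, and the synthetic data will not reproduce the 38.6% figure.

```python
import math
import random

def cosine_distance(a, b):
    """1 - cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def update(bundle, observation, rho):
    """Blend the bundle toward a new observation at rate rho."""
    return [(1 - rho) * b + rho * o for b, o in zip(bundle, observation)]

def cumulative_drift(initial, observations, rho):
    """Sum of cosine distances between successive bundle states."""
    state, total = initial, 0.0
    for obs in observations:
        new_state = update(state, obs, rho)
        total += cosine_distance(state, new_state)
        state = new_state
    return total, state

# Synthetic data: noisy observations around one concept center.
rng = random.Random(0)
DIM = 32
center = [1.0] * DIM

def noisy_observation():
    return [c + rng.gauss(0.0, 0.5) for c in center]

initial = noisy_observation()
observations = [noisy_observation() for _ in range(10)]

baseline_drift, _ = cumulative_drift(initial, observations, rho=1.0)
damped_drift, final = cumulative_drift(initial, observations, rho=0.1)
drift_reduction = 1 - damped_drift / baseline_drift
consistency = 1 - cosine_distance(initial, final)
print(f"drift reduction: {drift_reduction:.1%}, consistency: {consistency:.3f}")
```

Smaller ρ lowers drift but also slows how quickly the bundle absorbs genuinely new information, which is the trade-off the choice of ρ has to balance.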

PoC 03 — Edit Locality

At ρ=0.1, targeted edits to one concept (Apple/Vision Pro) contaminate unrelated bundles at only 32.6% of the baseline rate. Recommended operating range: ρ < 0.15.
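One way to quantify contamination relative to a baseline is sketched below. It assumes contamination is the mean Euclidean displacement of every non-target bundle after an edit, and that the reported rate is the ratio of that value to the same quantity under a baseline edit. The bundle names and vectors are hypothetical, and the paper's actual metric may differ.

```python
import math

def displacement(before, after):
    """Euclidean distance a bundle vector moved during an edit."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(after, before)))

def contamination(before, after, target):
    """Mean displacement of all bundles except the edited target."""
    others = [name for name in before if name != target]
    return sum(displacement(before[n], after[n]) for n in others) / len(others)

def relative_contamination(before, after_method, after_baseline, target):
    """Contamination of the method as a fraction of the baseline's."""
    return (contamination(before, after_method, target)
            / contamination(before, after_baseline, target))

# Toy bundles (hypothetical); "apple" is the edited concept.
before = {"apple": [1.0, 0.0], "weather": [0.0, 1.0], "cooking": [0.5, 0.5]}
after_baseline = {"apple": [2.0, 0.0], "weather": [0.3, 1.0], "cooking": [0.8, 0.5]}
after_method = {"apple": [2.0, 0.0], "weather": [0.1, 1.0], "cooking": [0.6, 0.5]}

rate = relative_contamination(before, after_method, after_baseline, "apple")
print(f"contamination vs. baseline: {rate:.1%}")
```

Reporting the ratio rather than raw displacement makes results comparable across embedding scales, but it also makes the choice of baseline critical to interpreting the number.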