Standard RAG pipelines break when enterprises try to use them for long-term, multi-session LLM agent deployments. This is a critical limitation as demand for persistent AI assistants grows.

xMemory, a new technique developed by researchers at King’s College London and The Alan Turing Institute, solves this by organizing conversations into a searchable hierarchy of semantic themes.

Experiments show that xMemory improves answer quality and long-range reasoning across various LLMs while cutting inference costs. According to the researchers, it drops token usage from over 9,000 to roughly 4,700 tokens per query compared to existing systems on some tasks.

For real-world enterprise applications like personalized AI assistants and multi-session decision-support tools, this means organizations can deploy more reliable, context-aware agents capable of maintaining coherent long-term memory without blowing up computational expenses.

RAG wasn’t built for this

In many enterprise LLM applications, a critical expectation is that these systems will maintain coherence and personalization across long, multi-session interactions. To support this long-term reasoning, one common approach is to use standard RAG: store past dialogues and events, retrieve a fixed number of top matches based on embedding similarity, and concatenate them into a context window to generate answers.

However, traditional RAG is built for large databases where the retrieved documents are highly diverse. The main challenge there is filtering out entirely irrelevant information.
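The standard pipeline just described — store past dialogue, retrieve the top matches by embedding similarity, concatenate them into the prompt — can be sketched in a few lines. This is a minimal illustration, not production code: the embeddings here are hand-made vectors standing in for a real embedding model.

```python
import math

# Minimal sketch of the standard RAG loop described above.
# A real system would call an embedding model; the
# retrieve-and-concatenate shape is the same either way.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve_top_k(query_vec, memory, k=3):
    """memory: list of (text, embedding) pairs from past dialogue."""
    ranked = sorted(memory, key=lambda m: cosine(query_vec, m[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def build_prompt(query, snippets):
    # Concatenate retrieved history into the context window.
    return "\n".join(snippets) + "\nUser: " + query
```

With a large, diverse corpus this works well; with a bounded, highly correlated conversation stream, the top-k list returned here tends to fill up with near-duplicates — the failure mode the researchers target.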
An AI agent’s memory, by contrast, is a bounded and continuous stream of conversation, meaning the stored data chunks are highly correlated and frequently contain near-duplicates.

To understand why simply increasing the context window doesn’t work, consider how standard RAG handles a concept like citrus fruit. Imagine a user has had many conversations saying things like “I love oranges” and “I like mandarins,” and, separately, other conversations about what counts as a citrus fruit. Traditional RAG may treat all of these as semantically close and keep retrieving similar “citrus-like” snippets.

“If retrieval collapses onto whichever cluster is densest in embedding space, the agent may get many highly similar passages about preference, while missing the category facts needed to answer the actual query,” Lin Gui, co-author of the paper, told VentureBeat.

Why the fix most teams reach for makes things worse

A common fix for engineering teams is to apply post-retrieval pruning or compression to filter out the noise. These methods assume that the retrieved passages are highly diverse and that irrelevant noise patterns can be cleanly separated from useful facts.

This approach falls short in conversational agent memory because human dialogue is “temporally entangled,” the researchers write. Conversational memory relies heavily on co-references, ellipsis, and strict timeline dependencies. Because of this interconnectedness, traditional pruning tools often accidentally delete important bits of a conversation, leaving the AI without the vital context needed to reason accurately.

To overcome these limitations, the researchers propose a shift in how agent memory is built and searched, which they describe as “decoupling to aggregation.” Instead of matching user queries directly against raw, overlapping chat logs, the system organizes the conversation into a hierarchical structure. First it decouples the conversation stream into distinct, standalone semantic components.
These individual facts are then aggregated into a higher-level structural hierarchy of themes. When the AI needs to recall information, it searches top-down through the hierarchy, going from themes to semantics and finally to raw snippets.

This approach avoids redundancy. If two dialogue snippets have similar embeddings, the system is unlikely to retrieve them together if they have been assigned to different semantic components.

For this architecture to succeed, it must balance two vital structural properties. The semantic components must be sufficiently differentiated to prevent the AI from retrieving redundant data. At the same time, the higher-level aggregations must remain semantically faithful to the original context to ensure the model can craft accurate answers.

A four-level hierarchy that shrinks the context window

The researchers developed xMemory, a framework that combines structured memory management with an adaptive, top-down search strategy.

xMemory continuously organizes the raw stream of conversation into a structured, four-level hierarchy. At the base are the raw messages, which are first summarized into contiguous blocks called “episodes.” From these episodes, the system distills reusable facts as semantics that disentangle the core, long-term knowledge from repetitive chat logs. Finally, related semantics are grouped together into high-level themes to make them easily searchable.

xMemory uses a special objective function to constantly optimize how it groups these items. This prevents categories from becoming too bloated, which slows down search, or too fragmented, which weakens the model’s ability to aggregate evidence and answer questions.

When it receives a prompt, xMemory performs a top-down retrieval across this hierarchy. It starts at the theme and semantic levels, selecting a diverse, compact set of relevant facts.
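The four-level structure and its top-down walk can be sketched with a few hypothetical types. These class names and the `top_down_search` helper are our own illustration, not the paper's code, and the LLM-backed summarization and distillation steps are stubbed out as plain strings.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a four-level memory hierarchy:
# raw messages -> episodes -> semantics -> themes.

@dataclass
class Episode:
    summary: str      # LLM summary of a contiguous message block
    messages: list    # the raw messages it covers

@dataclass
class Semantic:
    fact: str         # reusable, standalone long-term fact
    episodes: list    # provenance links for later drill-down

@dataclass
class Theme:
    label: str
    semantics: list = field(default_factory=list)

def top_down_search(themes, relevant, max_facts=5):
    """Walk themes -> semantics, collecting a compact skeleton of facts."""
    facts = []
    for theme in themes:
        if not relevant(theme.label):
            continue      # prune whole branches early
        for sem in theme.semantics:
            if relevant(sem.fact) and len(facts) < max_facts:
                facts.append(sem)
    return facts
```

Because near-duplicate snippets end up under distinct semantic components, a branch-pruning walk like this tends to return a diverse, compact set of facts rather than a cluster of look-alikes, with provenance links left in place for drilling down to raw evidence.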
This is crucial for real-world applications where user queries often require gathering descriptions across multiple topics or chaining connected facts together for complex, multi-hop reasoning.

Once it has this high-level skeleton of facts, the system controls redundancy through what the researchers call “Uncertainty Gating.” It only drills down to pull the finer, raw evidence at the episode or message level if that specific detail measurably decreases the model’s uncertainty. It stops expanding when it detects that adding more detail no longer helps answer the question.

“Semantic similarity is a candidate-generation signal; uncertainty is a decision signal,” Gui said. “Similarity tells you what is nearby. Uncertainty tells you what is actually worth paying for in the prompt budget.”

What are the alternatives?

Existing agent memory systems generally fall into two structural categories: flat designs and structured designs. Both suffer from fundamental limitations.

Flat approaches such as MemGPT log raw dialogue or minimally processed traces. This captures the conversation but accumulates massive redundancy and increases retrieval costs as the history grows longer.

Structured systems such as A-MEM and MemoryOS try to solve this by organizing memories into hierarchies or graphs. However, they still rely on raw or minimally processed text as their primary retrieval unit, often pulling in extensive, bloated contexts. These systems also depend heavily on LLM-generated memory records that have strict schema constraints. If the AI deviates slightly in its formatting, it can cause memory failure.

xMemory addresses these limitations through its optimized memory construction scheme, hierarchical retrieval, and dynamic restructuring of its memory as it grows larger.

When to use xMemory

For enterprise architects, knowing when to adopt this architecture over standard RAG is critical.
According to Gui, “xMemory is most compelling where the system needs to stay coherent across weeks or months of interaction.”

Customer support agents, for instance, benefit greatly from this approach because they must remember stable user preferences, past incidents, and account-specific context without repeatedly pulling up near-duplicate support tickets. Personalized coaching is another ideal use case, requiring the AI to separate enduring user traits from episodic, day-to-day details.

Conversely, if an enterprise is building an AI to chat with a repository of files, such as policy manuals or technical documentation, “a simpler RAG stack is still the better engineering choice,” Gui said. In those static, document-centric scenarios, the corpus is diverse enough that standard nearest-neighbor retrieval works perfectly well without the operational overhead of hierarchical memory.

The write tax is worth it

xMemory cuts the latency bottleneck associated with the LLM’s final answer generation. In standard RAG systems, the LLM is forced to read and process a bloated context window full of redundant dialogue. Because xMemory’s precise, top-down retrieval builds a much smaller, highly targeted context window, the reader LLM spends far less compute time analyzing the prompt and generating the final output.

In their experiments on long-context tasks, both open and closed models equipped with xMemory outperformed other baselines, using considerably fewer tokens while increasing task accuracy.

However, this efficient retrieval comes with an upfront cost. For an enterprise deployment, the catch with xMemory is that it trades a massive read tax for an upfront write tax.
While it ultimately makes answering user queries faster and cheaper, maintaining its sophisticated architecture requires substantial background processing. Unlike standard RAG pipelines, which cheaply dump raw text embeddings into a database, xMemory must execute multiple auxiliary LLM calls to detect conversation boundaries, summarize episodes, extract long-term semantic facts, and synthesize overarching themes.

Furthermore, xMemory’s restructuring process adds computational overhead, as the AI must curate, link, and update its own internal filing system. To manage this operational complexity in production, teams can run the heavy restructuring asynchronously or in micro-batches rather than synchronously blocking the user’s query.

For developers eager to prototype, the xMemory code is publicly available on GitHub under an MIT license, making it viable for commercial use. For teams implementing the idea in existing orchestration tools like LangChain, Gui advises focusing on the core innovation first: “The most important thing to build first is not a fancier retriever prompt. It is the memory decomposition layer. If you get only one thing right first, make it the indexing and decomposition logic.”

Retrieval isn’t the last bottleneck

While xMemory offers a powerful solution to today’s context-window limitations, it clears the path for the next generation of challenges in agentic workflows. As AI agents collaborate over longer horizons, simply finding the right information won’t be enough.

“Retrieval is a bottleneck, but once retrieval improves, these systems quickly run into lifecycle management and memory governance as the next bottlenecks,” Gui said. Navigating how data should decay, handling user privacy, and maintaining shared memory across multiple agents is exactly “where I expect a lot of the next wave of work to happen,” he said.
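For teams weighing the write tax discussed above, the asynchronous micro-batch pattern can be sketched as a background worker that drains a queue of new messages. This is our own illustration of the deployment pattern, not code from the paper; the expensive LLM-backed restructuring step is stubbed as a plain callback.

```python
import queue
import threading

# Sketch of taking the "write tax" off the hot path: user turns are
# appended instantly, while the expensive summarization/restructuring
# runs in a background worker over micro-batches.

class AsyncMemoryWriter:
    def __init__(self, restructure, batch_size=8):
        self.restructure = restructure   # expensive LLM-backed step
        self.batch_size = batch_size
        self.q = queue.Queue()
        threading.Thread(target=self._worker, daemon=True).start()

    def append(self, message):
        self.q.put(message)              # cheap; never blocks a user query

    def _worker(self):
        batch = []
        while True:
            batch.append(self.q.get())
            if len(batch) >= self.batch_size:
                self.restructure(batch)  # summarize/cluster off the hot path
                for _ in batch:
                    self.q.task_done()
                batch = []
```

A real deployment would also flush partial batches on a timer and persist the queue, but the core design choice is the same: the user-facing path only enqueues, and the hierarchy is rebuilt in the background.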