Amazon OpenSearch Service has offered vector database capabilities since 2019, enabling efficient vector similarity search through specialized k-nearest neighbor (k-NN) indexes. This functionality has supported a variety of use cases such as semantic search, Retrieval Augmented Generation (RAG) with large language models (LLMs), and rich media search. With the explosion of AI capabilities and the growing adoption of generative AI applications, customers are looking for vector databases with rich feature sets.
OpenSearch Service also offers a multi-tiered storage solution in the form of the UltraWarm and Cold tiers. UltraWarm provides cost-effective storage for less active data with query capabilities, though at higher latency compared to hot storage. The Cold tier offers even lower-cost archival storage for detached indexes that can be reattached when needed. Moving data to UltraWarm makes it immutable, which aligns well with use cases where data updates are infrequent, such as log analytics.
Until now, the UltraWarm and Cold storage tiers couldn't store k-NN indexes. As customers adopt OpenSearch Service for vector use cases, we've observed that they face high costs because memory and storage become bottlenecks for their workloads.
To provide similar cost-saving economics for larger datasets, we now support k-NN indexes in both the UltraWarm and Cold tiers. This helps you save costs, especially for workloads where:
- A significant portion of your vector data is accessed less frequently (for example, historical product catalogs, archived content embeddings, or older document repositories)
- You need isolation between frequently and infrequently accessed workloads, minimizing the need to scale hot tier instances to prevent interference from indexes that can be moved to the warm tier
In this post, we discuss this new capability and its use cases, and provide a cost-benefit analysis for several scenarios.
New capability: k-NN indexes in UltraWarm and Cold tiers
You can now enable the UltraWarm and Cold tiers for your k-NN indexes starting with OpenSearch Service version 2.17. This feature is available for both new domains and existing domains upgraded to version 2.17. k-NN indexes created on OpenSearch Service version 2.x or later are eligible for migration to the warm and cold tiers, regardless of the engine used (FAISS, NMSLIB, or Lucene).
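The following is a minimal sketch of creating a k-NN index with the Lucene engine using the opensearch-py client; the domain endpoint, credentials, and index name are placeholders to replace with your own.

```python
from opensearchpy import OpenSearch

# Placeholder endpoint and credentials for an OpenSearch Service 2.17+ domain
client = OpenSearch(
    hosts=[{"host": "my-domain.us-east-1.es.amazonaws.com", "port": 443}],
    http_auth=("user", "password"),
    use_ssl=True,
)

# A small k-NN index using the Lucene engine; indexes like this one are
# eligible for migration to the UltraWarm and Cold tiers
client.indices.create(
    index="product-embeddings-2023",
    body={
        "settings": {"index.knn": True},
        "mappings": {
            "properties": {
                "embedding": {
                    "type": "knn_vector",
                    "dimension": 768,
                    "method": {
                        "name": "hnsw",
                        "engine": "lucene",
                        "space_type": "l2",
                    },
                }
            }
        },
    },
)
```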
Use cases
This multi-tiered approach to k-NN vector search benefits the following use cases:
- Long-term semantic search – Maintain searchability over years of historical text data for legal, research, or compliance purposes
- Evolving AI models – Store embeddings from multiple versions of AI models, allowing comparisons and backward compatibility without the cost of keeping all the data in hot storage
- Large-scale image and video similarity – Build extensive libraries of visual content that can be searched efficiently, even as the dataset grows beyond the practical limits of hot storage
- Ecommerce product recommendations – Store and search through vast product catalogs, moving less popular or seasonal items to cheaper tiers while maintaining search capabilities
Let's explore real-world scenarios to illustrate the potential cost benefits of using k-NN indexes with the UltraWarm and Cold storage tiers. We use us-east-1 as the representative AWS Region for these scenarios.
Scenario 1: Balancing hot and warm storage for mixed workloads
Let's say you have 100 million vectors of 768 dimensions (around 330 GB of raw vectors) spread across 20 Lucene engine indexes of 5 million vectors each (approximately 16.5 GB per index), of which 50% of the data (about 10 indexes, or 165 GB) is queried infrequently.
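As a rough check on those numbers, the raw footprint follows from the vector count, the dimension, and 4 bytes per float32 value; the sketch below yields about 307 GB, which is in line with the roughly 330 GB figure once index structures and metadata overhead are included.

```python
# Approximate raw storage for float32 vectors (excludes index structures and metadata)
num_vectors = 100_000_000
dimensions = 768
bytes_per_float = 4

raw_bytes = num_vectors * dimensions * bytes_per_float
print(f"Raw vectors: ~{raw_bytes / 10**9:.0f} GB")     # ~307 GB across the domain
print(f"Per index: ~{raw_bytes / 20 / 10**9:.1f} GB")  # ~15.4 GB per 5M-vector index
```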
Domain setup without UltraWarm support
In this approach, you prioritize maximum performance by keeping all of the data in hot storage, providing the fastest possible query responses for the vectors. You deploy a cluster with 6x r6gd.4xlarge.search instances.
The monthly cost for this setup comes to $7,550 per month, with a data instance cost of $6,700.
Although this provides top-tier performance for the queries, it might be over-provisioned given the mixed access patterns of your data.
Cost-saving strategy: UltraWarm domain setup
In this approach, you align your storage strategy with the observed access patterns, optimizing for both performance and cost. The hot tier continues to provide optimal performance for frequently accessed data, while less critical data moves to UltraWarm storage.
UltraWarm queries experience higher latency compared to hot storage, but this trade-off is often acceptable for less frequently accessed data. Additionally, because UltraWarm data becomes immutable, this strategy works best for stable datasets that don't require updates.
You keep the frequently accessed 50% of the data (approximately 165 GB) in hot storage, allowing you to reduce your hot tier to 3x r6gd.4xlarge.search instances. For the less frequently accessed 50% of the data (approximately 165 GB), you introduce 2x ultrawarm1.medium.search instances as UltraWarm nodes. This tier offers a cost-effective solution for data that doesn't require the absolute fastest access times.
By tiering your data based on access patterns, you significantly reduce your hot tier footprint while introducing a small warm tier for less critical data. This strategy allows you to maintain high performance for frequent queries while optimizing costs for the overall system.
The hot tier continues to provide optimal performance for the majority of queries, which target frequently accessed data. For the warm tier, you see an increase in latency for queries on less frequently accessed data, but this is mitigated by effective caching on the UltraWarm nodes. Overall, the system maintains high availability and fault tolerance.
This balanced approach reduces your monthly cost to $5,350, with $3,350 for the hot tier and $350 for the warm tier, lowering the monthly costs by approximately 29% overall.
Scenario 2: Managing a growing vector database with access-based patterns
Imagine your system processes and indexes vast amounts of content (text, images, and videos), generating vector embeddings with the Lucene engine for advanced content recommendation and similarity search. As your content library grows, you observe clear access patterns: newer or popular content is queried frequently, while older or less popular content sees reduced activity but still needs to be searchable.
To use tiered storage in OpenSearch Service effectively, consider organizing your data into separate indexes based on expected query patterns. This index-level organization is important because data migration between tiers happens at the index level, allowing you to move specific indexes to cost-effective storage tiers as their access patterns change.
Your current dataset consists of 150 GB of vector data, growing by 50 GB monthly as new content is added. The data access patterns show:
- About 30% of your content receives 70% of the queries, typically newer or popular items
- Another 30% sees moderate query volume
- The remaining 40% is accessed infrequently but must remain searchable for completeness and occasional deep analysis
Given these characteristics, let's compare a single-tiered and a multi-tiered approach to managing this growing dataset efficiently.
Single-tiered configuration
In a single-tiered configuration, as the dataset expands, the vector data grows to around 400 GB over 6 months, all stored in the hot (default) tier. With r6gd.8xlarge.search instances, the data instance count would be around 3 nodes.
The overall monthly cost for the domain under a single-tiered setup would be around $8,050, with a data instance cost of around $6,700.
Multi-tiered configuration
To optimize performance and cost, you implement a multi-tiered storage strategy using Index State Management (ISM) policies to automate the movement of indexes between tiers as access patterns evolve (see the sketch following this list):
- Hot tier – Stores frequently accessed indexes for the fastest access
- Warm tier – Houses moderately accessed indexes with higher latency
- Cold tier – Archives rarely accessed indexes for cost-effective long-term retention
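As a rough illustration, an ISM policy along the following lines keeps a new index hot for 30 days, migrates it to the warm tier, and then moves it to the cold tier after 90 days; the policy name, index pattern, timestamp field, and timings are assumptions to adapt to your own access patterns.

```python
from opensearchpy import OpenSearch

client = OpenSearch(
    hosts=[{"host": "my-domain.us-east-1.es.amazonaws.com", "port": 443}],
    http_auth=("user", "password"),
    use_ssl=True,
)

# Hypothetical lifecycle: hot for 30 days, warm until day 90, then cold
policy = {
    "policy": {
        "description": "Tier vector indexes as their access frequency drops",
        "default_state": "hot",
        "states": [
            {
                "name": "hot",
                "actions": [],
                "transitions": [
                    {"state_name": "warm", "conditions": {"min_index_age": "30d"}}
                ],
            },
            {
                "name": "warm",
                # warm_migration moves the index to UltraWarm nodes
                "actions": [{"warm_migration": {}}],
                "transitions": [
                    {"state_name": "cold", "conditions": {"min_index_age": "90d"}}
                ],
            },
            {
                "name": "cold",
                # cold_migration detaches the index into cold storage
                "actions": [{"cold_migration": {"timestamp_field": "created_at"}}],
                "transitions": [],
            },
        ],
        # Automatically attach the policy to new indexes matching this placeholder pattern
        "ism_template": [{"index_patterns": ["content-embeddings-*"], "priority": 100}],
    }
}

client.transport.perform_request(
    "PUT", "/_plugins/_ism/policies/tiered-vector-embeddings", body=policy
)
```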
For the data distribution, you start with a total of 150 GB and monthly growth of 50 GB. The following is the projected data distribution when the data reaches 400 GB at around the 6-month mark:
- Hot tier – Approximately 100 GB (most frequently queried content) on 1x r6gd.8xlarge.search
- Warm tier – Approximately 100 GB (moderately accessed content) on 2x ultrawarm1.medium.search
- Cold tier – Approximately 200 GB (rarely accessed content)
Under the multi-tiered setup, the cost for the vector data domain totals $3,880, including $2,330 for the data nodes, $350 for the UltraWarm nodes, and $5.00 in cold storage costs.
You see compute savings because the hot tier instance footprint is reduced by around 66%. Your overall cost savings are around 50% year-over-year with the multi-tiered domain.
Scenario 3: Large-scale disk-based vector search with UltraWarm
Let's consider a system managing 1 billion vectors of 768 dimensions distributed across 100 indexes of 10 million vectors each. The system predominantly uses disk-based vector search with 32x FAISS quantization for cost optimization, and about 70% of queries target 30% of the data, making it an ideal candidate for tiered storage.
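Disk-based vector search is configured per field through the on_disk mode and a compression level. The following sketch shows what one of the 100 indexes might look like with 32x compression; the index name is a placeholder.

```python
from opensearchpy import OpenSearch

client = OpenSearch(
    hosts=[{"host": "my-domain.us-east-1.es.amazonaws.com", "port": 443}],
    http_auth=("user", "password"),
    use_ssl=True,
)

# on_disk mode with 32x compression keeps the memory footprint small by
# searching quantized vectors from disk (backed by the FAISS engine)
client.indices.create(
    index="embeddings-part-001",
    body={
        "settings": {"index.knn": True},
        "mappings": {
            "properties": {
                "embedding": {
                    "type": "knn_vector",
                    "dimension": 768,
                    "space_type": "l2",
                    "mode": "on_disk",
                    "compression_level": "32x",
                }
            }
        },
    },
)
```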
Domain setup without UltraWarm support
In this approach, using disk-based vector search to handle the large-scale data, you deploy a cluster with 4x r6gd.4xlarge.search instances. This setup provides adequate storage capacity while optimizing memory usage through disk-based search.
The monthly cost for this setup comes to $6,500 per month, with a data instance cost of $4,470.
Cost-saving strategy: UltraWarm domain setup
In this approach, you align your storage strategy with the observed query patterns, similar to Scenario 1.
You keep the frequently accessed 30% of the data in hot storage, using 1x r6gd.4xlarge.search instances. For the less frequently accessed 70% of the data, you use 2x ultrawarm1.medium.search instances.
You use disk-based vector search in both storage tiers to optimize memory usage. This balanced approach reduces your monthly cost to $3,270, with $1,120 for the hot tier and $400 for the warm tier, lowering the monthly costs by approximately 50% overall.
Get started with UltraWarm and Cold storage
To take advantage of k-NN indexes in the UltraWarm and Cold tiers, make sure your domain is running OpenSearch Service version 2.17 or later. For instructions to migrate k-NN indexes across storage tiers, refer to UltraWarm storage for Amazon OpenSearch Service.
Consider the following best practices for multi-tiered vector search:
- Analyze your query patterns to optimize data placement across tiers
- Use Index State Management (ISM) to manage the data lifecycle across tiers transparently
- Monitor cache hit rates using the k-NN stats API and adjust tiering and node sizing as needed (a quick example follows this list)
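For example, the k-NN stats API exposes graph memory usage and cache hit and miss counters that can help you judge whether warm queries are being served from cache; the client setup below is a placeholder.

```python
from opensearchpy import OpenSearch

client = OpenSearch(
    hosts=[{"host": "my-domain.us-east-1.es.amazonaws.com", "port": 443}],
    http_auth=("user", "password"),
    use_ssl=True,
)

# Fetch k-NN plugin statistics; watch graph memory usage and cache hit/miss
# counts when deciding how to size hot and warm nodes
stats = client.transport.perform_request("GET", "/_plugins/_knn/stats")
print(stats)
```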
Summary
The introduction of k-NN vector search capabilities in the UltraWarm and Cold tiers for OpenSearch Service marks a significant step forward in providing cost-effective, scalable solutions for vector search workloads. This feature allows you to balance performance and cost by keeping frequently accessed data in hot storage for the lowest latency, while moving less active data to UltraWarm for cost savings. Although UltraWarm storage introduces some performance trade-offs and makes data immutable, these characteristics often align well with real-world access patterns, where older data sees fewer queries and updates.
We encourage you to evaluate your current vector search workloads and consider how this multi-tier approach could benefit your use cases. As AI and machine learning continue to evolve, we remain committed to enhancing our services to meet your growing needs.
Stay tuned for future updates as we continue to innovate and expand the capabilities of vector search in OpenSearch Service.
About the Authors
Kunal Kotwani is a software engineer at Amazon Web Services, focusing on OpenSearch core and vector search technologies. His primary contributions include developing storage optimization features for both local and remote storage systems that help customers run their search workloads more cost-effectively.
Navneet Verma is a senior software engineer at AWS OpenSearch. His primary interests include machine learning, search engines, and improving search relevancy. Outside of work, he enjoys playing badminton.
Sorabh Hamirwasia is a senior software engineer at AWS working on the OpenSearch Project. His primary interests include building cost-optimized and performant distributed systems.