How Is AWS Building Service Value Around Data Stored in Amazon S3?
New context, annotation, and vector capabilities expand how information in S3 can be discovered, contextualized, retrieved, and consumed across analytics and AI workloads.
06/24/26
Key Highlights
- AWS introduced AWS Context, a new service designed to provide AI agents and applications with governed access to organizational knowledge.
- AWS Glue Data Catalog adds Business Context and Semantic Search capabilities to improve natural language discovery of enterprise data assets.
- Amazon S3 annotations enables organizations to attach structured business context directly to S3 objects.
- Amazon S3 Vectors now supports up to 10,000 similarity search results per query and reduces query costs for large vector indexes.
- The announcements expand the services available around information stored in Amazon S3 and support emerging analytics and AI workloads.
The News
AWS introduced several new data and AI capabilities focused on improving how organizations discover, contextualize, and retrieve enterprise information. The announcements include AWS Context, Business Context and Semantic Search for AWS Glue Data Catalog, Amazon S3 annotations, and enhancements to Amazon S3 Vectors. The announcements were released during the AWS Summit New York timeframe and build on AWS's broader investments in data services for analytics and AI workloads. For more details, check out the What’s New with AWS page.
Analyst Take
Amazon S3 has become the de facto standard for cloud object storage and one of the most widely adopted storage interfaces in the industry. What began as a cloud-based object storage service has evolved substantially over the past two decades as organizations adopted Amazon S3 for data lakes, analytics, backup repositories, cloud-native applications, AI workloads, and many other use cases.
AWS reports that Amazon S3 now stores more than 500 trillion objects, manages hundreds of exabytes of data, and serves more than 200 million requests per second globally.
In our view, AWS is increasing the value customers derive from data stored in S3 by layering additional services around it, enabling more workloads to consume the same information assets across analytics, search, and AI environments. S3 reduces the need for organizations to move data into separate repositories, and AWS continues to expand the services available around information already stored on S3.
This strategy aligns with how enterprise workloads are evolving. Organizations continue to migrate data lakes, large-scale file repositories, HPC environments, and engineering workloads to the cloud. At the same time, AI introduces new requirements for retrieval, memory, context management, and agent interaction. These workloads share a common dependency on information access, creating demand for services that can discover, contextualize, retrieve, and act on data without requiring additional copies or specialized storage systems.
Regardless of how those capabilities evolve, the underlying information must reside somewhere. AWS’s recent S3 investments suggest the company intends to continue expanding the services available around information stored in S3, multiplying the value of that data.
The new AWS Context and S3 annotations capabilities, together with the S3 Vectors enhancements, provide additional evidence of a direction that has been emerging across Amazon S3 for several years. AWS Context focuses on connecting organizational knowledge, S3 annotations adds business meaning to stored information, and S3 Vectors improves retrieval from large content repositories. Collectively, they help organizations understand content, locate relevant information, and improve access across analytics and AI workloads.
A robust portfolio of analytics tools, backup applications, AI frameworks, data services, and storage vendors has developed around S3. As new services, workload categories, data types, and partners are added, the utility of information stored in S3 continues to expand.
What Was Announced
AWS introduced several capabilities designed to improve how organizations discover, contextualize, and retrieve information across analytics and AI environments.
The most significant addition is AWS Context, which provides applications and AI agents with governed access to organizational knowledge. AWS describes this as a shared knowledge layer that can connect information from data lakes, databases, data warehouses, business applications, and other enterprise systems. The company also announced Business Context and Semantic Search capabilities for AWS Glue Data Catalog, allowing users and applications to discover information using natural language without requiring knowledge of specific table names, schemas, or underlying data structures.
AWS introduced Amazon S3 annotations, allowing organizations to attach structured business context directly to S3 objects. Annotations support JSON, XML, and YAML formats and can store up to 1 GB of information per object. The metadata remains associated with the object throughout copy and replication operations and can be surfaced through S3 Metadata, where it is stored in managed Apache Iceberg tables that can be queried through Amazon Athena and other Iceberg-compatible analytics tools.
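To make the idea of structured business context concrete, the following is a minimal sketch of what a JSON annotation document might look like before it is attached to an object. It does not call any AWS API. The field names, the `build_annotation` helper, and the reading of the 1 GB limit as 10^9 bytes are all assumptions made for illustration.

```python
import json

# Assumption: the 1 GB per-object limit is read here as 10**9 bytes;
# the announcement does not specify the exact byte count.
MAX_ANNOTATION_BYTES = 10**9


def build_annotation(fields):
    """Serialize business context to JSON and check it against the assumed limit.

    Hypothetical helper for illustration only; not part of any AWS SDK.
    """
    payload = json.dumps(fields, sort_keys=True).encode("utf-8")
    if len(payload) > MAX_ANNOTATION_BYTES:
        raise ValueError("annotation exceeds the assumed per-object limit")
    return payload


# Hypothetical business context for a single object.
annotation = build_annotation({
    "owner": "finance-analytics",
    "classification": "internal",
    "retention_years": 7,
})
print(annotation.decode("utf-8"))
```

In practice, the same structure could equally be expressed in XML or YAML, the other two formats the announcement lists.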
The company announced two enhancements to Amazon S3 Vectors. It now supports up to 10,000 similarity search results per query, a 100x increase over the previous limit. AWS states that the enhancement is intended to support multi-stage retrieval workflows that perform additional processing such as reranking, aggregation, or deduplication before producing a final result set. AWS also reduced query charges for vector indexes containing more than 10 million vectors, lowering costs for retrieval, semantic search, and retrieval-augmented generation (RAG) workloads operating at scale.
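The multi-stage retrieval pattern described above can be sketched in a few lines. This is an illustrative outline under stated assumptions, not AWS's implementation: the first stage is represented by an in-memory list standing in for a wide similarity search result set, and the `Candidate` type, the `doc_id` field, and the boost-based reranker are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    key: str         # vector key returned by the similarity search
    doc_id: str      # source document of the chunk (hypothetical metadata field)
    distance: float  # smaller means more similar


def dedupe_by_document(candidates):
    """Keep only the closest chunk for each source document."""
    best = {}
    for c in candidates:
        current = best.get(c.doc_id)
        if current is None or c.distance < current.distance:
            best[c.doc_id] = c
    return list(best.values())


def rerank(candidates, boost):
    """Re-score candidates by subtracting a per-document boost from the distance."""
    return sorted(candidates, key=lambda c: c.distance - boost.get(c.doc_id, 0.0))


def multi_stage_retrieve(candidates, boost, final_k):
    """Wide candidate set -> deduplicate -> rerank -> keep the top final_k."""
    return rerank(dedupe_by_document(candidates), boost)[:final_k]


# Stand-in for a wide first-stage result set (a real query could return up to 10,000).
first_stage = [
    Candidate("chunk-1", "doc-a", 0.10),
    Candidate("chunk-2", "doc-a", 0.12),
    Candidate("chunk-3", "doc-b", 0.20),
    Candidate("chunk-4", "doc-c", 0.25),
    Candidate("chunk-5", "doc-b", 0.18),
]
final = multi_stage_retrieve(first_stage, boost={"doc-c": 0.2}, final_k=2)
print([c.key for c in final])  # ['chunk-4', 'chunk-1']
```

The point of a larger first-stage limit is visible even at this scale: deduplication and reranking can only promote a result such as `chunk-4` if the initial query was wide enough to return it.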
Looking Ahead
AWS continues to expand the services available around information stored in Amazon S3, creating opportunities for customers, partners, and developers to build additional value on top of the same underlying data.
We expect AWS to continue expanding the ways applications, analytics, and AI systems can operate directly against data stored in S3. AWS’s plans to integrate OpenSearch with S3 Vectors provide an early example of how existing services can leverage information already managed within S3. Similar approaches could extend the utility of data stored in S3 while reducing data movement and architectural complexity.
AWS continues to create opportunities for partners as it adds new services and customers adopt new workload types. Partners that align with those changes will be well positioned to participate in the next phase of analytics, AI, and data services innovation. Emerging AI workloads may accelerate this trend. Agents, retrieval systems, and multimodal applications all depend on access to information and context. Services that can operate directly against existing data assets are well positioned to help reduce data movement while improving resiliency and control over enterprise information.
Analytics, backup applications, data services, AI frameworks, and cloud-native applications already consume information stored in S3. As capabilities such as context management, semantic discovery, metadata services, and vector retrieval become more broadly available, organizations will have additional ways to leverage information without moving it into dedicated, specialized storage repositories.
Don Gentile | Analyst-in-Residence -- Storage & Data Resiliency
Don Gentile brings three decades of experience turning complex enterprise technologies into clear, differentiated narratives that drive competitive relevance and market leadership. He has helped shape iconic infrastructure platforms including IBM z16 and z17 mainframes, HPE ProLiant servers, and HPE GreenLake — guiding strategies that connect technology innovation with customer needs and fast-moving market dynamics.
His current focus spans flash storage, storage area networking, hyperconverged infrastructure (HCI), software-defined storage (SDS), hybrid cloud storage, Ceph/open source, cyber resiliency, and emerging models for integrating AI workloads across storage and compute. By applying deep knowledge of infrastructure technologies with proven skills in positioning, content strategy, and thought leadership, Don helps vendors sharpen their story, differentiate their offerings, and achieve stronger competitive standing across business, media, and technical audiences.