Sources

1577 sources collected

Ava Czechowska, Principal Cloud Engineer at ClearPoint, explains why, even in 2025, Amazon Web Services (AWS) S3 buckets are still a high-risk area. … Despite significant advancements in cloud security, Amazon S3 buckets continue to be a common initial access vector for adversaries. Even in 2025, with all the new features and best practices, misconfigurations, forgotten buckets, and an evolving threat landscape mean S3 remains a critical challenge. … The Datadog 2024 State of Cloud Security Report states that, as of its analysis, 1.48% of AWS S3 buckets were "effectively public," similar to the 1.5% figure from 2023. While the report notes increasing adoption of public access blocks, this persistent percentage indicates that misconfigurations are still a factor. The same report highlights the risks posed by long-lived cloud credentials: such credentials never expire and frequently leak in source code, container images, build logs, and application artifacts. The report cites earlier research showing that **long-lived credentials are the most common cause of publicly documented cloud security breaches**. The Fortinet 2025 Global Threat Landscape Report notes that "cloud environments remain a top target, with adversaries exploiting persistent weaknesses, such as open storage buckets, over-permissioned identities, and misconfigured services," and that "open storage buckets and over-permissioned identities continue to be leading vectors of attack." Fortinet's 2025 State of Cloud Security Report ranks configuration and misconfiguration management as the third most important operational challenge in cloud security, noting that it has already led to numerous high-profile breaches. …

#### Why do S3 misconfigurations still happen?
As CrowdStrike's "Insider’s Playbook: Defending Against Cloud Threats" explains, a cloud misconfiguration is "a poorly chosen, incorrect or absent security setting that exposes the cloud environment to risk." The playbook highlights that because cloud architectures are so complex, real-time detection of such misconfigurations is difficult. Other points mentioned in the playbook are:

**Speed over Security:** The rapid pace of modern development often encourages engineers to "quickly push projects to production." This velocity can inadvertently sideline security considerations, leading to overlooked configurations.

**Shadow Cloud Environments:** The ease of spinning up cloud resources can lead to "shadow cloud environments" – resources deployed without proper oversight or security controls, creating blind spots for security teams.

**Siloed Security Tools:** "Today’s cloud security tools are very bespoke, forcing organisations to build their cloud security programs on siloed point products." This fragmented approach makes it difficult to get a holistic view of security posture and can lead to gaps. …

**Automated Security Defaults:** As of April 2023, newly created S3 buckets automatically enable S3 Block Public Access and disable access control lists (ACLs). This means public access is blocked by default, and access is controlled primarily through more robust IAM policies.

**Default Encryption for New Objects:** Since January 2023, Amazon S3 automatically applies server-side encryption with Amazon S3 managed keys (SSE-S3) to every new object uploaded, unless a different encryption option is specified. …

**S3 Object Lock:** This feature supports a write-once, read-many (WORM) model, preventing objects from being overwritten or deleted for a specified period or indefinitely. This is vital for compliance and ransomware protection.
**Amazon S3 Metadata (Preview, July 2025):** This new feature provides comprehensive visibility into all objects in S3 buckets through live inventory and journal tables. This allows for SQL-based analysis of both existing and new objects with automatic updates, greatly aiding security audits and compliance. …

**Embracing Automation:** Leveraging AWS's new default security settings and automating security checks.

**Strengthening Identity and Access Management:** Implementing least-privilege principles and regularly reviewing IAM policies.

**Gaining Centralised Visibility:** Overcoming siloed tools by adopting solutions that provide a unified view of your cloud security posture.

**Fostering Collaboration:** Breaking down barriers between DevOps and security teams to embed security throughout the development lifecycle.

**Continuous Monitoring and Threat Detection:** Investing in robust threat detection capabilities to identify and respond to incidents swiftly, addressing the "Insufficient Threat Detection" challenge head-on.
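The default-on Block Public Access settings described above can also be audited across existing buckets. A minimal sketch: the four flag names and the dict shape come from the real S3 `GetPublicAccessBlock` API (boto3's `get_public_access_block`), but the `audit_public_access_block` helper itself is hypothetical:

```python
def audit_public_access_block(config: dict) -> list[str]:
    """Return the names of any public-access protections that are disabled.

    `config` is the PublicAccessBlockConfiguration dict as returned by
    boto3's s3.get_public_access_block(); on a locked-down bucket all
    four flags should be True.
    """
    required = ("BlockPublicAcls", "IgnorePublicAcls",
                "BlockPublicPolicy", "RestrictPublicBuckets")
    # A flag that is absent is treated the same as a flag set to False.
    return [flag for flag in required if not config.get(flag, False)]
```

Running this over every bucket returned by `list_buckets` would surface pre-2023 buckets that never picked up the newer defaults.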

8/20/2025 · Updated 9/25/2025

Yes, **S3** — the Swiss Army knife of AWS. It sounds harmless, even elegant: a “bucket” where you store your files. But the truth is, **most developers (up to 85%) are not using S3 the right way**, especially at scale or in production-grade applications. From skyrocketing costs, broken performance, and misconfigured security to compliance nightmares — the misuse of S3 creates hidden dangers that quietly eat away at your infrastructure.

## Misunderstood Simplicity: Why S3 Gets Abused So Often

The beauty and danger of S3 is its simplicity.

> “Just create a bucket and upload your stuff. Done.”

That’s the mentality. ... But here's the catch: **S3 is an enterprise-grade service pretending to be beginner-friendly.** …

## Misuse #1: Buckets Left Public “for Testing”

### The Problem:

One of the most common S3 misuse patterns is leaving a bucket public for quick testing or uploads… and forgetting about it. Developers do this to get things done fast: …

## Misuse #2: No Lifecycle Policies = Skyrocketing Storage Bills

### The Problem:

S3 is “cheap” per GB, but it adds up — especially if you’re uploading logs, backups, videos, or user-generated content. Without **lifecycle policies**, your S3 bucket becomes a black hole of never-deleted data. …

## Misuse #5: Serving Static Sites Without CloudFront

### The Problem:

Using S3 static website hosting **without CloudFront** is a recipe for:

- High latency in non-US regions
- No caching
- No DDoS protection
- Lack of SSL (on custom domains)

S3’s website hosting is good for a PoC — **not production.** …

## Misuse #6: Using S3 Like a Database

### The Problem:

Storing structured data (like JSON) in S3 and scanning it manually is a common trap. Yes, **S3 is “infinite”**, but it’s not a database. Searching, updating, or querying structured data directly from S3 is slow and expensive. …

## Misuse #7: No Monitoring or Alerting

### The Problem:

Many teams treat S3 as a “set and forget” system — until something breaks. No access logs. No monitoring. No alerts. Then a 100GB file gets uploaded, or a script runs wild and deletes a critical folder.
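The lifecycle-policy gap called out in Misuse #2 is cheap to close. A hedged sketch: the dict shape (`Rules`, `Filter`, `Transitions`, `Expiration`, `StorageClass`) follows the real S3 lifecycle API as accepted by boto3's `put_bucket_lifecycle_configuration`, while the `tier_and_expire_rule` helper and its parameter names are illustrative:

```python
def tier_and_expire_rule(prefix: str, to_glacier_after: int, delete_after: int) -> dict:
    """Build a lifecycle configuration that moves objects under `prefix`
    to Glacier after `to_glacier_after` days and deletes them after
    `delete_after` days. The returned dict matches the shape boto3's
    put_bucket_lifecycle_configuration(LifecycleConfiguration=...) expects.
    """
    assert delete_after > to_glacier_after, "expiry must come after the transition"
    return {
        "Rules": [{
            "ID": f"tier-and-expire-{prefix.rstrip('/') or 'all'}",
            "Status": "Enabled",
            "Filter": {"Prefix": prefix},
            "Transitions": [{"Days": to_glacier_after, "StorageClass": "GLACIER"}],
            "Expiration": {"Days": delete_after},
        }]
    }
```

Applied to a `logs/` prefix with, say, 30 and 365 days, this is the kind of rule that stops log buckets becoming the "black hole of never-deleted data" the post describes.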

6/17/2025 · Updated 3/22/2026

## Performance matters

Over the years, as S3 has evolved from a system primarily used for archival data over relatively slow internet links into something far more capable, customers naturally wanted to do more and more with their data. This created a fascinating flywheel where improvements in performance drove demand for even more performance, and any limitations became yet another source of friction that distracted developers from their core work. … If anything, I expect this trend to accelerate as customers pull the experience of using S3 closer to their applications and ask us to support increasingly interactive workloads. It’s another example of how removing limitations – in this case, performance constraints – lets developers focus on building rather than working around sharp edges.

## The tension between simplicity and velocity

The pursuit of simplicity has taken us in all sorts of interesting directions over the past two decades. There are all the examples that I mentioned above, from scaling bucket limits to enhancing performance, as well as countless other improvements, especially around features like cross-region replication, object lock, and versioning, that all provide very deliberate guardrails for data protection and durability. … But on the other hand, racing to release something with painful gaps can frustrate early customers and, worse, it can put you in a spot where you have backloaded work that is more expensive to simplify later. This tension between simplicity and velocity has been the source of some of the most heated product discussions that I’ve seen in S3, and it’s something I feel the team handles quite deliberately. … The other complexity is that because these tables are actually made up of many – frequently thousands – of objects, and are accessed with very application-specific patterns, many existing S3 features, like Intelligent-Tiering and cross-region replication, don’t work exactly as expected on them.

3/14/2025 · Updated 3/17/2026

, and the dozens of others listed in the S3 API reference—is far more complex to implement.[16] This is not an oversight by competitors but a reflection of the difficulty of building the complex backend state management and asynchronous job processing required for these features. The consequence is that while an application’s data access code might be portable, the entire suite of operational and cost-management tooling built around S3’s management APIs is not. This creates a hidden form of vendor lock-in at the operational level, even if the application code itself is adaptable. …

- **Unsupported Advanced Features:** The official compatibility documentation explicitly states that many of S3’s advanced management and application-integration features are not supported.[9] The list includes Bucket Website Hosting, Analytics, Inventory, Logging, Replication, and Tagging. The absence of these features clearly defines the service’s intended scope.
- **Encryption:** The service supports Server-Side Encryption with Customer-Provided Keys (SSE-C). However, there is no mention of support for the more commonly used SSE-S3 (provider-managed keys) or SSE-KMS (integration with a key management service), which limits the options for encryption key management.[9] …

is the most critical deficiency for long-term data management. Without this feature, users cannot create rules to automatically transition data to different storage tiers or to expire and delete old objects. All data archival and cleanup tasks must be managed by an external, client-side process, such as a cron job running a script. This adds a layer of operational complexity and responsibility that S3 handles natively.[11]

10/25/2025 · Updated 3/19/2026

{ts:967} what they tell us is that it's becoming harder and harder to find the objects that they need in order to execute on a task. {ts:976} We have many customers who have built metadata stores next to their unstructured data stores, but these are difficult to build {ts:983} and they result in a dependency

12/5/2024 · Updated 2/27/2026

S3 is highly scalable, so in principle, with a big enough pipe or enough instances, you can get arbitrarily high throughput. A good example is S3DistCp, which uses many workers and instances. But almost always you’re hit with one of two bottlenecks:

1. The size of the pipe between the source (typically a server on premises or an Amazon EC2 instance) and S3.
2. The level of concurrency used for requests when uploading or downloading (including multipart uploads).

… Thirdly, and critically if you are dealing with lots of items, **concurrency matters**. Each S3 operation is an API request with significant latency — tens to hundreds of milliseconds, which adds up to pretty much forever if you have millions of objects and try to work with them one at a time. So what determines your overall throughput when moving many objects is the concurrency level of the transfer: how many worker threads (connections) on one instance, and how many instances are used. …

### How to use nested S3 folder organization and common problems

Newcomers to S3 are always surprised to learn that latency on S3 operations depends on key names, because prefix similarities become a bottleneck at more than about 100 requests per second. If you need high volumes of operations, it is essential to consider naming schemes with more variability at the beginning of the key names, such as alphanumeric or hex hash codes in the first 6 to 8 characters, to avoid internal hot spots within S3’s infrastructure. …

### Why you should avoid using AWS S3 locations in your code

This is pretty simple, but it comes up a lot. Don’t hard-code S3 locations in your code. Doing so ties your code to deployment details, which is almost guaranteed to hurt you later. You might want to deploy multiple production or staging environments. Or you might want to migrate all of one kind of data to a new location, or audit which pieces of code access certain data.
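The concurrency point above can be sketched in a few lines. This is an illustrative helper, not any particular tool's implementation: `transfer_one` stands in for a closure around a real per-object call such as boto3's `download_file` or `upload_file`, and `max_workers` is the knob the passage describes:

```python
from concurrent.futures import ThreadPoolExecutor


def copy_keys(keys, transfer_one, max_workers=32):
    """Fan a per-object transfer function out over a thread pool.

    Because each S3 request carries tens to hundreds of milliseconds of
    latency, total throughput scales roughly with max_workers until the
    network pipe (the other bottleneck) saturates.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with keys.
        return list(pool.map(transfer_one, keys))
```

With a serial loop, a million objects at ~50 ms each is roughly 14 hours; the same work spread over 64 threads on one instance is on the order of minutes, which is the whole point of the passage.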

11/7/2022 · Updated 4/4/2026

What It Really Is, How It Works, and Why It Became the Bedrock of Modern Data

9/2/2025 · Updated 3/25/2026

### I'm squarely in the trough of disillusionment with S3.

There’s no denying that S3 is a feat of engineering. Building and Operating a Pretty Big Storage System is a top-tier technology flex. But S3’s feature set is falling behind its competitors. Notably, S3 has no compare-and-swap (CAS) operation—something every single other competitor has. It also lacks multi-region buckets and object appends. Even S3 Express is proving to be lackluster. These missing features haven’t mattered much for data lakes and offline use cases. But new infrastructure is using object storage as its primary persistence layer—something I’m excited about. Here, S3’s feature gaps are a bigger problem. … S3 is the only object store that doesn’t support preconditions. Every other object store—Google Cloud Storage (GCS), Azure Blob Store (ABS), Cloudflare Ridiculously Reliable (R2) storage, Tigris, MinIO—has this feature. Instead, developers are forced to use a separate transactional store such as DynamoDB to enforce transactional operations. Building a two-phase write between DynamoDB and S3 is not technically difficult, but it’s annoying and leads to ugly abstractions.

## S3 Express One Zone Isn’t S3

I was really excited about S3 Express One Zone (S3E1Z) when it first came out. The more time I spend with it, the less impressed I am. The first wart that surfaced was a new directory bucket type, which Amazon introduced for Express. But the gaps don’t stop there. S3E1Z is missing a *ton* of standard S3 features, including object version support, bucket tags, object locks, object tags, and MD5 checksum ETags. The complete list is pretty staggering. S3E1Z buckets can’t be treated like normal S3 buckets. As with CAS operations, developers must design around these deficiencies. And because S3E1Z is not multi-zonal, developers are left to build quorum writes to multiple availability zones for higher availability. Factor in S3E1Z’s high storage cost of $0.16/GB—twice the cost of Elastic Block Store’s (EBS) general purpose SSD (gp3)—and S3E1Z looks more like an expensive EBS with a half-implemented S3 API. …

## Embracing Reality

The dream is that developers are offered an object store with all of these features: low latency, preconditions, dual-region/multi-region, and so on. But we live in reality, where engineers face a choice: abandon S3 or build around these gaps. Turbopuffer is my favorite example of a company that’s gone all-in on abandoning S3.
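The DynamoDB-plus-S3 two-phase write the author mentions can be sketched abstractly. This is a hypothetical outline with injected callables standing in for the real clients; in practice `table_put` would be a DynamoDB `put_item` with a `ConditionExpression` of `attribute_not_exists(name)`, so the pointer write is the atomic commit point:

```python
import uuid


def commit_object(s3_put, table_put, logical_name: str, body: bytes) -> str:
    """Two-phase write: stage bytes in S3 under a unique key, then publish
    the pointer with a conditional put into a transactional store.

    s3_put(key, body) and table_put(name, key) stand in for boto3 S3 and
    DynamoDB calls; table_put must raise if `logical_name` already has a
    committed pointer, which is what makes the second phase atomic.
    """
    staged_key = f"{logical_name}/{uuid.uuid4().hex}"
    s3_put(staged_key, body)             # phase 1: durable but invisible
    table_put(logical_name, staged_key)  # phase 2: pointer write is the commit
    return staged_key
```

If phase 2 loses the race, the staged S3 object is simply orphaned garbage to clean up later, which is exactly the kind of "ugly abstraction" the author complains a native precondition would avoid.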

5/22/2024 · Updated 3/29/2026

We replicate objects with RTC guarantees, but when objects arrive, we move them in the background as large batch operations. For tables, doing so can cause problems, because tables have internal data dependencies throughout their snapshots. If you replicate the top-level metadata but don't replicate the metadata nodes it points to, queries will actually fail, because the engine you are using (Spark, PySpark, or whatever) cannot find the files or objects that the other objects point to.

12/3/2025 · Updated 3/20/2026

The cost forecast for the service is not accurate; there is room for improvement. The overhead charges sometimes feel inflated, and there is occasional sync delay. Overall, a really good experience. We offload media from WordPress /wp-content and store it in S3 for delivery via CloudFront. … The triggering of S3 events is not guaranteed. In exceptional cases an event may not fire, and you may have to write another Lambda for recovery. This is problematic. By considering which services your cloud deployment needs, you can work out the required storage specifications and judge whether S3 is useful; I am sure you will get the performance you expect. … It is only as good as the user/admin who knows how to make the most of this software. There is a deep learning curve involved, without which one could be flying blind and exposing otherwise sensitive data to the public. The search functionality is practically absent, so there is no easy way of choosing or searching for files, and the date columns on the UI seldom work.

5/22/2025 · Updated 5/26/2025

AWS should provide more attractive and comprehensive documentation, especially for integrating Amazon S3 with other AWS services like Elemental MediaConvert. Currently, the documentation can be difficult to understand, requiring additional resources such as AI, YouTube videos, or other aids. … No direct connector is available to connect Amazon S3 with other tools like MicroStrategy, so I have to connect through a third party. The solution's cost should be improved. The solution's performance needs to increase for small objects: Amazon S3 is excellent for large projects, but its performance can be comparatively slower for smaller objects. … It is difficult to retrieve files in an emergency because the public cloud has data limits for immediate downloads, so they often take twelve hours. We need better, immediate control of our content. The cloud and on-premises solutions should sync all data. Infrastructure services could be improved to provide free education related to the solution's features. For example, developers who do not have experience building games could learn how to do so with free support. … The pricing and licensing are pretty complex and need to be simplified. They have no development environment, which means that even during development we are paying. The initial setup can be a bit complex. It would be helpful if there were more templates available. … The only pain point is managing lifecycle policies, because a newbie cannot easily create one. This is because there are multiple, overlapping options: if I want to store data for one month, there is the option to move it to Glacier, and Glacier has additional policies for one year or four years. The exact option you want will not always be there, so you need to work through the UI process.

9/8/2019 · Updated 6/13/2025

### Transcript

{ts:0} I know I know clickbait title I promise you though this is worth it the chaos {ts:4} that is S3 has been frustrating me for years I have tweets going back almost four years ago where I was begging for a {ts:10} better S3 alternative for modern full stack devs I kept getting skill issue S3 is so easy why are you having all these {ts:16} problems thankfully over the last few years especially last few weeks seems like people are finally noticing how {ts:21} painful it can be to work with S3 I have a video that I haven't had a chance to put out just yet where I showcase some {ts:26} security issues with S3 but that's not what we're here to talk about today I will quickly show them so you know it {ts:30} we're talking about S3 specifically how most people suck at securing it and some of the things that can result if you {ts:36} don't set it up correctly tldr S3 pre-signed posts or other ways of uploading files can easily be abused {ts:42} with cross-site scripting or unwanted paths for uploads yep keep an eye out for that video today we're not talking {ts:48} about the security issues around how pre-signed URLs work today we're talking about a chart and some numbers that were {ts:53} very scary this is a picture a chart a chaotic terrifying chart of two empty S3 buckets that suddenly got just an insane {ts:61} amount of empty post requests that is 50 million requests to each bucket almost 100 million total and this cost … {ts:140} against AWS as it turns out one of the most popular open source tools had a default configuration to store their {ts:145} backups in S3 and as a placeholder for a bucket name they used the same name I had used for my {ts:153} bucket that is actually hilarious oh my God this meant that every deployment of this tool with default configuration … {ts:194} with S3 you're charged for requests even if you're not authorized to be doing the thing that you're being billed {ts:199} for which is crazy this is again a private bucket that no one has access to with almost no files in it this is {ts:204} confirmed in their exchange with AWS support as they wrote back S3 charges for unauthorized requests as well that's … {ts:408} source tool they quickly fixed the default config although they can't fix the existing deployments yep there's a {ts:413} problem when you are installing something on your device and not using a server for anything you cannot update
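One practical takeaway from this incident is to never use a guessable bucket name. A minimal sketch (the `unique_bucket_name` helper is hypothetical): appending a random suffix keeps your bucket out of the blast radius of other people's placeholder configs, so their stray requests cannot land on it and bill you:

```python
import secrets


def unique_bucket_name(app: str) -> str:
    """Return a bucket name with a random 16-hex-character suffix.

    Bucket names are a global namespace, so a name like "backups" or a
    tool's documented placeholder will collect other people's traffic;
    a random suffix makes the name practically unguessable.
    """
    return f"{app}-{secrets.token_hex(8)}"
```

The generated name still has to respect S3's naming rules (3-63 chars, lowercase, digits, hyphens), so keep the `app` prefix short and lowercase.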

5/11/2024 · Updated 7/19/2025