Sources

453 sources collected

Our previous blog article, “The Part of PostgreSQL We Hate the Most,” discussed the problems caused by everyone’s favorite street-strength DBMS’s multi-version concurrency control (MVCC) implementation: version copying, table bloat, index maintenance, and vacuum management. This article explores ways to optimize PostgreSQL for each problem. Although PostgreSQL’s MVCC implementation is the __worst__ among widely used databases like Oracle and MySQL, it remains our favorite DBMS, and we still love it! By sharing our insights, we hope to help users unlock the full potential of this powerful database system.

...

## Problem #1: Version Copying

When a query modifies a tuple, PostgreSQL creates a new version by copying all of its columns, regardless of whether the query updates one column or all of them. This copying can result in significant data duplication and increased storage demands, particularly for tables with many columns and large row sizes.

Optimization: Unfortunately, there is no workaround for this issue short of a rewrite of PostgreSQL’s internals, which would be disruptive. It’s not like replacing a character on a sitcom that nobody notices.

...

## Problem #2: Table Bloat

PostgreSQL stores expired versions (dead tuples) and live tuples on the same pages. Although PostgreSQL’s autovacuum worker eventually removes these dead tuples, write-heavy workloads can cause them to accumulate faster than the vacuum can keep up. Additionally, the autovacuum only removes dead tuples so their space can be reused (e.g., to store new versions); it does not reclaim unused storage space. During query execution, PostgreSQL loads dead tuples into memory (since the DBMS intermixes them on pages with live tuples), increasing disk I/O and hurting performance because the DBMS retrieves useless data. If you are running Amazon’s PostgreSQL Aurora, this will increase the DBMS’s IOPS and cause you to give more money to Jeff Bezos!
Optimization: We recommend monitoring PostgreSQL’s table bloat and periodically reclaiming unused space. The built-in pgstattuple module accurately calculates the free space in a database, but it requires full table scans, which are not practical for large tables in production environments.

```
$ psql -c "CREATE EXTENSION pgstattuple" -d $DB_NAME
$ psql -c "SELECT * FROM pgstattuple('$TABLE_NAME')" -d $DB_NAME
```

…

## Problem #3: Secondary Index Maintenance

When an application executes an `UPDATE` query on a table, PostgreSQL must also update all of that table’s indexes to add entries for the new version. These index updates increase the DBMS’s memory pressure and disk I/O, especially for tables with numerous indexes (one OtterTune customer has **90** indexes on a single table!). As the number of indexes on a table increases, so does the overhead incurred when updating a tuple. PostgreSQL avoids updating indexes for Heap-Only Tuple (HOT) updates, where the DBMS stores the new version on the same page as the previous version. But as we mentioned in our last article, OtterTune customers’ PostgreSQL databases use the HOT optimization for only 46% of update operations.

… `DROP INDEX` command.

## Problem #4: Vacuum Management

PostgreSQL’s performance heavily depends on the effectiveness of its autovacuum in cleaning up obsolete data and pruning version chains in its MVCC scheme. However, configuring the autovacuum to operate correctly and remove this data in a timely manner is challenging due to its complexity. The default global autovacuum settings are inappropriate for large tables (millions to billions of tuples), as it may take too long before a vacuum is triggered. Additionally, if each autovacuum invocation takes too long to complete or gets blocked by long-running transactions, dead tuples accumulate and the DBMS suffers from stale statistics. Delaying the autovacuum for too long causes queries to get gradually slower over time, requiring manual intervention to address the problem.

Optimization: Although having to vacuum tables in PostgreSQL is a pain, the good news is that it is manageable. But as we now discuss, there are many steps involved and a lot of information you need to track.
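A lightweight way to watch for the bloat and vacuum problems described above, without pgstattuple’s full table scans, is to query the cumulative statistics PostgreSQL already maintains. The sketch below reads the standard `pg_stat_user_tables` view to report dead-tuple counts and the HOT-update ratio, then tightens per-table autovacuum storage parameters; the table name `orders` and the specific threshold values are hypothetical illustrations, not universal recommendations.

```sql
-- Tables with the most dead tuples, plus the share of updates that used HOT
SELECT relname,
       n_live_tup,
       n_dead_tup,
       round(100.0 * n_tup_hot_upd / NULLIF(n_tup_upd, 0), 1) AS hot_update_pct,
       last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;

-- Make autovacuum trigger earlier on a large, write-heavy table
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor = 0.01,  -- vacuum once ~1% of rows are dead
    autovacuum_vacuum_threshold    = 1000
);
```

Unlike pgstattuple, these counters are estimates maintained by the statistics subsystem, so they are cheap to read but approximate.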

6/1/2023 · Updated 9/18/2025

Maybe the functionality that the user wants doesn't exist. Maybe they've implemented a particular architecture because they're working around constraints in their own infrastructure that they can't actually do anything about. Maybe the most appropriate architecture for their use case isn't well documented or explained. Maybe the user doesn't understand something because there aren't enough training resources, or we've not made it clear to users where they can find the training resources they need. … Lots of people are setting really, really high values of max_connections. Although it's a lot less of an issue than it used to be, it's still causing problems. I'm hypothesising, but I suspect that it's an education issue, especially with people coming from other database management systems; that we still need to explain to users how Postgres works, and what the implications are if they set max_connections too high or have too many concurrent connections.
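A quick sanity check for the max_connections concern above is to compare the configured limit with what the server is actually doing. A minimal sketch against the standard `pg_settings`-backed `SHOW` command and the `pg_stat_activity` view:

```sql
-- Configured connection limit
SHOW max_connections;

-- How many connections are actually open, and what state they are in
SELECT state, count(*)
FROM pg_stat_activity
GROUP BY state
ORDER BY count(*) DESC;
```

If the open-connection count sits far below the limit, max_connections is likely set higher than needed; an external connection pooler such as PgBouncer is the usual alternative to raising it.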

Updated 8/26/2025

www.instaclustr.com

Top Postgresql Best...

### Replication and high availability

Replication in PostgreSQL involves maintaining a real-time copy of a database on another server. It ensures high availability by allowing transition to a replica if the primary server fails. Managing replication can be challenging due to the need for intricate configuration and careful monitoring to prevent data loss or corruption. Ensuring consistent data across replicas while accommodating write and read loads requires careful setup and ongoing oversight. Achieving high availability requires automated failover mechanisms and monitoring. Balancing load between primary and standby servers is crucial for PostgreSQL performance and availability. Selecting the appropriate strategy (streaming replication or logical replication) based on system needs is essential in maintaining operational readiness.

...

### Performance optimization

Optimizing performance in PostgreSQL involves a combination of hardware, configuration, and query tuning. Effective indexing, appropriate data types, and well-designed database schemas are important for improving speed. Developers must analyze query performance through tools like EXPLAIN to identify bottlenecks. Efficient use of resources such as CPU and memory is also crucial in supporting high performance, particularly under heavy loads. Developers must monitor query execution alongside regular maintenance tasks like ANALYZE and VACUUM to maintain efficiency over time.

### Security management

Security in PostgreSQL involves protecting data against unauthorized access and ensuring compliance with relevant standards. Configuring authentication mechanisms and permissions is vital for restricting access to authorized users only. Implementing SSL encryption for data transit and applying security patches promptly are additional layers of defense. Security measures must be strong enough to protect against common threats like SQL injection and brute-force attacks. PostgreSQL provides built-in tools for monitoring suspicious activities, such as logging failed login attempts. Users must know how to properly implement these tools and ensure they regularly audit permissions and access logs.

### Data backup and recovery

PostgreSQL backup strategies include full, incremental, and continuous methods, offering different recovery points and durations. Full backups ensure recovery but require significant resources, while incremental methods reduce time and storage by only capturing changes since the last backup. Continuous archiving offers real-time recovery capabilities but can be complex to manage effectively. Choosing the right mix of strategies is important for balancing speed, cost, and data safety. The recovery plan must address both data consistency and downtime minimization. Users must test recovery procedures to prepare the system for real-world scenarios. PostgreSQL tools like pg_basebackup and pg_dump can be used for data backup management.
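The EXPLAIN-based tuning workflow mentioned above can start as simply as the following sketch; the `orders` table and its `customer_id` column are hypothetical examples.

```sql
-- Compare the planner's estimates with actual execution and buffer usage
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM orders
WHERE customer_id = 42;

-- If the plan shows a sequential scan on a large table,
-- an index on the filter column is a common first fix
CREATE INDEX idx_orders_customer_id ON orders (customer_id);
```

Note that `EXPLAIN ANALYZE` actually executes the query, so wrap data-modifying statements in a transaction you roll back.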

11/11/2025 · Updated 3/29/2026

wiki.postgresql.org

Usability Challenges

## Core server management and configuration

- Too much tuning
  - Memory management: too complex, there are few useful guidelines, most things could be automated
  - Vacuum: should be automatic -- yay, autovacuum
  - Background writer configuration: Who needs that?
  - Write-ahead log configuration: too complicated, should be automatic
  - Free-space map: The server knows full well how much FSM it needs; see also memory management.
- Manageability is lacking
  - User accounts: still no good way to manage pg_hba.conf from SQL
  - Statistics: too much data, but most people don't know what to make of it
  - Configuration files: too long, too many options that most people don't need
  - Plugins: Using external modules is complicated, sometimes risky, hard to manage.
  - Logging: Logging configurability is great, but the default configuration is less than useful for new users.
  - Tracing: Everything notwithstanding, it is still really hard at times to know what is happening, such as in nested PL/pgSQL calls, in cascaded foreign key actions, and other nested and cascaded contexts.
- Clients: …

No out-of-band monitoring is supported. If pg_ctl launched the postmaster but the postmaster can't start properly functioning backends, the only diagnostics are free-form text logs. This stinks for people trying to manage and automate PostgreSQL installs. An out-of-band monitoring tool is needed that can report things like the port(s) Pg is listening on, any errors produced when trying to start backends, memory status, running queries (without having to start a new backend just to query pg_stat_activity), lock status, etc.

…

## Backups, `pg_dump`, `pg_dumpall` and `pg_restore`

- The default encodings/locales selected on Windows and Linux (UTF8) systems are incompatible with each other, so running `pg_dump -Fc -f dbname.backup dbname` on Linux then `pg_restore -C --dbname postgres dbname.backup` on Windows (or vice versa) …
- `template0` can't be connected to, there's no DB they can always connect to by default. This leads to weird command lines like `pg_restore --create --dbname postgres mydb.backup` to *restore to a newly created database, probably but not necessarily called* `mydb`. If the user omits …, not to the … `-Fc` mode. This means that *by default PostgreSQL database dumps cannot be restored correctly unless the user dumps additional information separately!* `pg_dump` should include global objects like roles that are referred to by the database being dumped, so that backups are complete and correct by default.
- `pg_dumpall` doesn't support the custom format. You can't make an archive containing all databases on a cluster, or have it spit out one dump file per database plus a globals file. This must be done manually using scripting, and that's rather less than user-friendly. Backups need to be easy to get right by default!

…

## PgAdmin-III (First point of contact for most newbies)

- PgAdmin-III usability may be somewhat lacking
- Using the "Restore" dialog with PgAdmin-III and pointing it at a .sql dump produces an unhelpful error message. It should offer to run the SQL dump against the target database, at least when faced with a … *have to be edited by hand before they can be restored*.
- PgAdmin-III uses the unhelpful `.backup` suffix for backups it creates with `pg_dump -Fc` behind the scenes. Backup of *what?* There's nothing in `pg_restore` that says files should have a .backup extension, nor does it encourage them to be created as such, so users who want to restore a backup created from the command line via PgAdmin-III often have to rename the file or change the filter before they can even see it in the file list to restore. … `pg_restore`'s `-C` option. That's really counter-intuitive; you should just be able to select the server you want to restore to, or use the restore item in the menu and be prompted for the target server.
- You don't get a choice of the database name to use for the newly created database with PgAdmin-III's Restore, Create Database option; it silently uses the db name in the backup file (and doesn't give you any indication of what it is). This is a …

## Replication

- Built-in replication can't replicate only some databases of a cluster; you have to replicate both my-critically-important-10MB-database and my-totally-unimportant-50GB-database with the same settings, same priority, etc. This is a usability challenge because it means people have to create and manage multiple clusters to control replication groups, and multiple clusters are hard to manage and configure.

See also Usability reviews
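On the pg_hba.conf complaint above: editing the file from SQL is still not possible, but PostgreSQL 10 and later at least expose its parsed contents read-only through a system view, which helps catch syntax mistakes before a reload. A minimal sketch:

```sql
-- Inspect the parsed contents of pg_hba.conf (PostgreSQL 10+)
SELECT line_number, type, database, user_name, address, auth_method, error
FROM pg_hba_file_rules;

-- Apply file edits without restarting the server
SELECT pg_reload_conf();
```

Rows with a non-null `error` column are lines the server could not parse, so checking this view before reloading avoids locking yourself out with a broken rule.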

5/1/2012 · Updated 2/1/2025

Why Uber Engineering Switched from Postgres to MySQL

We encountered many Postgres limitations: inefficient architecture for writes, inefficient data replication, issues with table corruption, poor replica MVCC support, and difficulty upgrading to newer releases.

… Postgres does not have true replica MVCC support. The fact that replicas apply WAL updates results in them having a copy of on-disk data identical to the master at any given point in time. This design poses a problem for Uber.

Oxide Podcast: The challenges of operating PostgreSQL at scale during their time at Joyent, and how autovacuum caused an outage, starting at about 20 minutes into the podcast. (This podcast was the inspiration for this blog post.)

“We found a lot of behavior around synchronous replication that was either undocumented, or poorly documented and not widely understood, which contributed to a feeling that this thing (PostgreSQL) was really hard to operationalize. Even if you know about these things, they are very hard to work around and fix.”
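The complaints above concern physical, WAL-based replication, where a routine operational task is measuring how far each replica lags behind the primary. A sketch using the standard `pg_stat_replication` view (column names as of PostgreSQL 10+):

```sql
-- On the primary: per-replica replication state and lag
SELECT client_addr,
       state,
       sync_state,            -- 'sync' standbys are waited on at commit
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes,
       replay_lag             -- time-based lag estimate
FROM pg_stat_replication;
```

With synchronous replication, the `sync_state` column shows which standby a commit waits for, one of the behaviors the quote above describes as hard to operationalize.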

12/4/2024 · Updated 3/30/2025

# 5 Common Pain Points in PostgreSQL

#postgres #database

Before building Flashboard (our PostgreSQL admin app), our development team had struggled with a lot of stuff that you might've encountered in your PostgreSQL journey. I'm not just talking about tracking foreign keys or building extra-complex SQL queries, but also finding a way to make data editable for non-technical folks without needing to build entirely new software to support the *main thing* — after all, the *main thing* is where the core of the work should be, right, Project Owners?

…

## 1. Time-Consuming Query Creation

Building SQL queries is fun — until it isn't. It's like RegEx — you get that big, broad smile when it works, but after the 10th time you just hit ChatGPT and ask it to build it for you. It becomes tiresome, error-prone, and, mainly, time-consuming. The *f-in-fun* becomes *f-in-frustration* when you have to redo it all because you got a new client.

## 2. Navigating Foreign Keys

A good PostgreSQL database probably has a lot of foreign keys — and that's a green flag in architecture design... or at least that's what the architects say. When the developers need to navigate the schema, jumping from one related table to the other, things get confusing — you gain database-performance in exchange for developer-speed.

## 3. Handling JSON Data

I know the feeling of working with JSON in PostgreSQL. Saving data this way saves time and space, but it's waaaay too difficult to query it. You need special operators and functions, and when you’re working with nested structures, it’s easy to mess something up.

## 4. Creating Secure Interfaces for Non-Tech Teams

You might need to give access to the database to non-technical users — sometimes there's no time to build a feature in the *main app* or there's no proper admin panel setup for them. This is hard work — it’s not just about making sure they don’t break anything, but also ensuring sensitive data stays locked down. Building a secure admin interface means juggling encryption, access control, and usability — before you notice it, you are building secondary software as hard to maintain as the primary one.

## 5. Scaling Custom Admin Solutions

Continuing from the last topic: what starts as a simple admin panel for a small team can quickly balloon into a mess as your user base or data grows. Suddenly, you’re dealing with performance issues, broken queries, and features that no longer work as expected. Scaling a custom solution is a beast of a task, and unless you planned for it from day one, you’ll probably need to rebuild a good chunk of it down the road.
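On the JSON pain point above: the “special operators” are mostly `->`, `->>`, and `@>`. A small sketch, assuming a hypothetical `events` table with a `jsonb` column named `payload`:

```sql
-- -> returns jsonb, ->> returns text; chain them to reach nested keys
SELECT payload->'user'->>'email' AS email
FROM events
WHERE payload @> '{"type": "signup"}';   -- containment: payload includes this object

-- A GIN index makes @> containment queries fast
CREATE INDEX idx_events_payload ON events USING GIN (payload);
```

Prefer `jsonb` over `json` for anything you intend to query: the `json` type stores raw text and supports far fewer operators and index types.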

9/10/2024 · Updated 10/17/2025

You need to have safe and effective rollback. {ts:251} That's also useful no matter what your deployment strategy is. However, with continuous deployment, it becomes a constraint and not a choice. You are signing yourself up to have to be good at testing, and to be good at mean time to resolution, in a way that you are not in other modes of deployment.

12/5/2025 · Updated 12/8/2025

If I need to look up the docs for every single API, or even just **feel** like I might have to look them up, it detracts from my actual work: solving business problems. Inconsistent APIs cost more time to implement, and they increase the mental load on developers. But they might also reduce the quality of the applications using them. Worst case scenario, this can lead to a bug in production, with serious time and effort required to fix it. And that cost adds up.

…

What I found was that there’s no consistency between different AWS services, between related AWS services, and even **within** a single service’s API. Let me give you an example of each.

…

When we include Kinesis Video Streams (KVS), it gets even worse. This service also has the `StreamARN` and `StreamName` fields, but you can get them through `ListStreams` **and** `DescribeStream`. And for KVS, the creation timestamp is stored under yet another name: `CreationTime`.

### Inconsistency within an AWS service

Inconsistency within a service is the worst type of inconsistency because, ostensibly, a single service is maintained by a single team. It’s a real problem for developers. You can make a mental note that when talking to a different API, you might need to use a different syntax. But when various calls within the same service are inconsistent, you can never really rely on muscle memory, ever. One of the more painful examples of this can be found working with REST APIs in API Gateway. This is a complex service, so it can be forgiven for having a complex service API. It has dozens of resources, including API Keys, Methods, Resources, Usage Plans, and Documentation Parts. Each of these resource types can be retrieved in singular form with a request like `GetAuthorizer` and in plural form with a request like `GetAuthorizers`. (Side note: That’s a whole new variation on the `Describe` and `List` standard.)

…

This `items` versus `item` example can lead to bigger issues. If you’ve seen 20 examples of an `items` response, you’d be forgiven for assuming the 21st also returns `items`. An automated check or test deployment might surface this bug, but it might also miss it. Fixing it after it’s live can cost significant time as well as your reputation.

…

### Amazon CloudFront

The worst service API I’ve encountered so far is CloudFront. This API is inconsistent compared to other services, is inconsistent within its own domain, and returns responses in such a broken way that it should be considered a bug. First, the method naming and response format. CloudFront has `Get`, `List`, and `Describe` calls. The `List` operations all return a dictionary with an `Items` key, which is a list:

…

This is bad. This will break any parser because an empty string is not valid JSON. It also means that all of these calls have two different types of output: a dictionary or an empty string. As a developer, it means you have to inspect the result length before passing it on to the parser. You can’t even write a simple try-catch block because then you wouldn’t be able to differentiate between a failing API call and zero results. This is not only bad design, but once again, it’s inconsistent with all the other services that at least offer reliable output.

…

Despite the problems that inconsistent APIs create for AWS customers, the burden will continue to be on us developers for the foreseeable future.

9/24/2021 · Updated 3/25/2026

{ts:549} But if your requirements are too specific, not supported, and unlikely to be supported {ts:555} within the time frame that you need, maybe you need to compose your own service: not for everything, but for this {ts:561} particular thing. And to sort of illustrate that, I'm going to borrow and adapt some slides that Adrian

8/18/2022 · Updated 3/10/2026

"To create resilient systems, one must remain LLM-agnostic," said one speaker at Klyo. "The 'best' model for any task is constantly evolving, so the ability to switch to the most effective model without significant re-architecture will be essential for sustained success." … In conversations I had, some attendees expressed dissatisfaction with the announcements, suggesting they weren't as groundbreaking as previous keynotes. They felt many updates were reiterations of existing offerings or added minimal value to current products. However, the counterargument was that AWS is essentially feature-complete, and any new additions should provide more incentives for customers to remain loyal.

12/15/2025 · Updated 12/15/2025

# 5 common pain points with AWS Lambda

## Starting without a framework

One of the biggest mistakes made when creating a serverless architecture using AWS Lambda is writing functions without a framework. There are multiple frameworks available to deploy and wire up Lambda functions to the events which will be triggering them. These frameworks handle the deployment of individual Lambda functions, setup of API Gateway to pass through HTTP events, or a CloudWatch Scheduled Event to trigger a Lambda on a cron job.

…

## Large bundle sizes

When starting out with AWS Lambda, large bundle sizes aren’t typically a problem; however, when your architecture grows and you start to amass a large number of Lambda functions, max bundle size starts to become an issue. There is a default deployment size limit of 50MB. To get around this, you can utilise webpack to bundle each entrypoint/Lambda into its own file. Another added benefit of individual bundling is faster deployment and spin-up of individual functions, meaning you can update and deploy just a single function instead of the entire architecture.

…

## API Gateway rejects your response

It’s possible for API Gateway to return a 502 Bad Gateway error, which states that the response the Gateway received from your Lambda is malformed. The biggest cause for this that I encountered when getting started with AWS Lambda is the requirement to return a string as the body of the response. It’s a good idea to create helper functions to format your responses and add any extra appropriate headers like CORS.

5/3/2018 · Updated 8/30/2025

{ts:488} Really, face it: some of these manual tasks are not even things we actually enjoy doing as developers. Modernization, {ts:495} transformation: I'm sure some of you wake up every day and say, "Wow, I want to modernize, I want to transform." But sometimes it {ts:501} just sounds kind of like you're stepping through some mud and leaving some footprints and fossils.

2/28/2025 · Updated 7/9/2025