[Enhancement] Improve configuration management to align with PostgreSQL best practices #1102

@HasanH47

Summary

pg_auto_failover currently places its include directives at the beginning of postgresql.conf. This conflicts with the layout shown in the PostgreSQL documentation and creates a risk of silent misconfiguration.

Current Behavior

When initializing a PostgreSQL instance with pg_autoctl, the include directives are placed at the top of postgresql.conf:

# postgresql.conf (generated by pg_autoctl)
include 'postgresql-auto-failover-standby.conf'
include 'postgresql-auto-failover.conf'

# ... 600+ lines of PostgreSQL default settings ...
shared_buffers = 128MB
work_mem = 4MB
# ... etc

This is documented in docs/operations.rst:

"The include directive is placed on the top of the postgresql.conf file in a way that you may override any setting by editing it later in the file."

Problem Statement

While the intent to allow user overrides is appreciated, this approach creates several issues:

1. Conflicts with PostgreSQL Best Practice

PostgreSQL official documentation explicitly states:

"only the last setting encountered for a particular parameter while the server is reading configuration files will be used"

The documentation demonstrates include directives at the end of configuration files, not the beginning:

# PostgreSQL documentation example
# ... default settings ...

# Includes at the end
include 'shared.conf'
include 'memory.conf'

References:

2. Risk of Silent Misconfiguration

Because the included files are read first, any setting that a user uncomments later in postgresql.conf silently overrides the value managed by pg_autoctl:

# Default PostgreSQL setting:
shared_buffers = 128MB
# ↑ This takes precedence and silently overrides pg_autoctl's tuned value (e.g., 4GB)
# No warning or validation occurs
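The precedence rule can be illustrated with a small simulation. This is only a sketch, not how PostgreSQL or pg_autoctl parse files: `effective_settings` is a hypothetical helper, the file contents are made up for illustration, only the `name = value` form is handled, and comments are stripped naively (a `#` inside a quoted value would be cut).

```python
import re

# Matches: include 'file' / include_if_exists 'file'
INCLUDE_RE = re.compile(r"^(include_if_exists|include)\s+'([^']+)'$")


def effective_settings(files, name, result=None):
    """Map each parameter to (value, source file); later assignments replace earlier ones."""
    if result is None:
        result = {}
    for raw in files[name].splitlines():
        line = raw.split("#", 1)[0].strip()  # naive comment stripping
        if not line:
            continue
        match = INCLUDE_RE.match(line)
        if match:
            directive, target = match.groups()
            if target in files:
                # An included file is read at the position of the directive.
                effective_settings(files, target, result)
            elif directive == "include":
                raise FileNotFoundError(target)
            continue
        key, sep, value = line.partition("=")
        if sep:
            result[key.strip().lower()] = (value.strip(), name)
    return result


# Illustrative contents only; the real managed file differs.
MANAGED = "shared_buffers = 4GB\nwal_level = replica\n"

includes_first = {
    "postgresql.conf": "include 'postgresql-auto-failover.conf'\nshared_buffers = 128MB\n",
    "postgresql-auto-failover.conf": MANAGED,
}
includes_last = {
    "postgresql.conf": "shared_buffers = 128MB\ninclude 'postgresql-auto-failover.conf'\n",
    "postgresql-auto-failover.conf": MANAGED,
}

print(effective_settings(includes_first, "postgresql.conf")["shared_buffers"])
print(effective_settings(includes_last, "postgresql.conf")["shared_buffers"])
```

With the includes first, the reported value is 128MB from postgresql.conf; with the includes last, it is 4GB from the managed file.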

Critical HA settings that could be accidentally overridden:

  • wal_level - Required for replication
  • synchronous_commit - Impacts HA guarantees
  • max_wal_senders - Required for standbys
  • shared_buffers - Not HA-critical, but overriding it defeats the automatic tuning from pg_autoctl do pgsetup tune

3. Unclear Documentation

Current documentation doesn't specify:

  • Which settings are safe to override (e.g., work_mem, effective_cache_size)
  • Which settings are risky (e.g., shared_buffers - defeats auto-tuning)
  • Which settings are critical for HA (e.g., wal_level, synchronous_commit)

Proposed Solutions

Option A: Move Includes to End + Support Custom Config File (Recommended)

Implementation:

# postgresql.conf (generated by pg_autoctl)
# ... all default PostgreSQL settings ...

# ========================================
# MANAGED BY PG_AUTO_FAILOVER
# Do not edit below this line.
# For custom settings, create postgresql-custom.conf
# ========================================
include 'postgresql-auto-failover-standby.conf'
include 'postgresql-auto-failover.conf'

# User custom configuration
include_if_exists 'postgresql-custom.conf'

Code changes required:

  • Modify pg_include_config() in src/bin/pg_autoctl/pgctl.c to append instead of prepend
  • Generate postgresql-custom.conf.example with safe override examples
  • Update documentation with customization guidelines
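To make the intended file layout concrete, here is a rough Python sketch of the "append instead of prepend" behavior. It is not the C change itself: `move_includes_to_end`, `MARKER` and `INCLUDE_LINES` are hypothetical names, and the marker comment is shortened compared to the banner proposed above.

```python
MARKER = "# MANAGED BY PG_AUTO_FAILOVER"  # hypothetical, shortened banner
INCLUDE_LINES = [
    "include 'postgresql-auto-failover-standby.conf'",
    "include 'postgresql-auto-failover.conf'",
    "include_if_exists 'postgresql-custom.conf'",
]


def move_includes_to_end(conf_text, include_lines=INCLUDE_LINES):
    """Remove the managed lines wherever they are and append them at the end."""
    managed = set(include_lines) | {MARKER}
    kept = [line for line in conf_text.splitlines() if line.strip() not in managed]
    # Trim blank lines left behind at either end of the file.
    while kept and not kept[0].strip():
        kept.pop(0)
    while kept and not kept[-1].strip():
        kept.pop()
    return "\n".join(kept + ["", MARKER] + list(include_lines)) + "\n"


original = (
    "include 'postgresql-auto-failover-standby.conf'\n"
    "include 'postgresql-auto-failover.conf'\n"
    "\n"
    "shared_buffers = 128MB\n"
)
moved = move_includes_to_end(original)
print(moved)
```

Removing the managed lines before re-appending them makes the operation idempotent, so running it again on an already converted file leaves the file unchanged.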

Benefits:

  • ✅ Aligns with PostgreSQL best practices
  • ✅ Auto-failover settings always take effect
  • ✅ Clear separation: defaults → auto-failover managed → user custom
  • ✅ Mostly backward compatible with existing installations

Migration path for existing users:

  • Document that uncommenting lines in postgresql.conf will no longer override settings from the included pg_auto_failover files
  • Provide script to extract custom settings into postgresql-custom.conf
  • Add migration guide in release notes
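The extraction script mentioned above could look roughly like the following. This is a sketch under assumptions: `extract_active_settings` and `render_custom_conf` are hypothetical names, every uncommented assignment is treated as a candidate (including values written by initdb), only the `name = value` form is recognized, and comment stripping is naive. The output would need human review before use.

```python
def extract_active_settings(conf_text):
    """Return (name, value) pairs for every uncommented assignment, in file order."""
    found = []
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # naive: breaks on '#' inside quoted values
        if not line:
            continue
        if line.startswith(("include ", "include_if_exists ", "include_dir ")):
            continue  # include directives are not settings
        name, sep, value = line.partition("=")
        if sep:
            found.append((name.strip(), value.strip()))
    return found


def render_custom_conf(settings):
    """Render candidate contents for postgresql-custom.conf, keeping the last value per name."""
    last = {}
    for name, value in settings:
        last[name] = value  # the last assignment wins, as in PostgreSQL
    lines = ["# Candidate overrides extracted from postgresql.conf; review before use."]
    lines += [f"{name} = {value}" for name, value in last.items()]
    return "\n".join(lines) + "\n"


sample = (
    "include 'postgresql-auto-failover.conf'\n"
    "#work_mem = 4MB\n"
    "work_mem = 64MB  # tuned by hand\n"
    "shared_buffers = 128MB\n"
    "work_mem = 32MB\n"
)
custom_conf = render_custom_conf(extract_active_settings(sample))
print(custom_conf)
```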

Option B: Add Configuration Validation (Complementary)

Add a validation command to detect overrides:

$ pg_autoctl config validate

WARNING: shared_buffers in postgresql.conf (128MB) overrides auto-failover tuning (4GB)
WARNING: Uncommenting default values may impact HA configuration
ERROR: wal_level must not be changed when using pg_auto_failover

Run 'pg_autoctl config check' for details on safe customization.
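The core of such a check might be sketched as below. This is illustrative Python, not a proposed implementation for pg_autoctl: `find_overrides` is a hypothetical function, the `CRITICAL` and `RISKY` sets simply reuse the settings named in this issue, and both inputs are assumed to be already parsed into dictionaries.

```python
# Categories taken from the settings named in this issue; not an official list.
CRITICAL = {"wal_level", "synchronous_commit", "max_wal_senders"}
RISKY = {"shared_buffers"}


def find_overrides(main_settings, managed_settings):
    """Report settings active in postgresql.conf that replace a managed value.

    Assumes the current layout, where the includes are read first and
    postgresql.conf therefore wins on any conflict.
    """
    findings = []
    for name, value in main_settings.items():
        managed = managed_settings.get(name)
        if managed is None or managed == value:
            continue  # not managed, or no actual difference
        if name in CRITICAL:
            level = "ERROR"
        elif name in RISKY:
            level = "WARNING"
        else:
            level = "INFO"
        findings.append(
            (level, f"{name} in postgresql.conf ({value}) overrides managed value ({managed})")
        )
    return findings


main_settings = {"shared_buffers": "128MB", "wal_level": "minimal", "work_mem": "64MB"}
managed_settings = {"shared_buffers": "4GB", "wal_level": "replica"}

for level, message in find_overrides(main_settings, managed_settings):
    print(f"{level}: {message}")
```

Settings that the managed files do not define, such as work_mem in this example, are not reported.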

Option C: Documentation Improvements (Minimum)

If code changes aren't feasible immediately, enhance documentation with:

  1. Clear categorization of settings:

    • Safe to override (performance tuning)
    • Risky to override (defeats auto-tuning)
    • Critical for HA (must not change)
  2. Customization workflow examples:

    • How to properly customize settings
    • Config sync strategy for multi-node setups
    • Best practices for heterogeneous node specs
  3. Warning about precedence:

    • Explicit statement that uncommenting a setting in postgresql.conf overrides the value from the included files
    • Link to PostgreSQL documentation on config precedence

Questions for Maintainers

  1. What was the original reasoning for placing includes at the top?

    • Are there specific use cases or constraints this pattern addresses?
    • Would moving includes to the end break any known workflows?
  2. Which option would you be most comfortable accepting?

    • Option A (structural change with backward compatibility path)?
    • Option B (validation tooling to help users)?
    • Option C (documentation improvements as first step)?
  3. Are there backward compatibility concerns we should address?

    • Should we provide automated migration tooling?
    • Would a feature flag to maintain old behavior be helpful?

Additional Context

Implementation location: The behavior is in src/bin/pg_autoctl/pgctl.c:

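// Current implementation: the include line and its comment are written to the
// new buffer first, followed by the existing file contents, so the include
// ends up prepended.
appendPQExpBufferStr(newConfContents, configIncludeLine);
appendPQExpBufferStr(newConfContents, configIncludeComment);
appendPQExpBufferStr(newConfContents, currentConfContents);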

I'm willing to contribute a PR implementing the chosen solution if this enhancement is accepted.

Environment

  • pg_auto_failover version: 2.2.2
  • PostgreSQL version: 17.7
  • OS: Debian (Docker)

Related Issues
