How to generate a Aurora Postgresql cluster with all auto explain enabled
PostgreSQL has query execution plans configured as extension, meaning they do not come out of the box we need to configure it, For on-prem or owning server you can check this link which tells how to configure it. Problem is there are so many steps. And it is confusing for AWS Aurora
I wrote a small bash script Here just to make this work automatic. Here is powershell version
- SubnetGroupName -> AWS subnet name for connecting which has all the configuration ready
- $psqlPath = “C:\tools\postgresql-16.6-2\pgsql\bin\psql.exe” -> user should have psql in the machine and change the path accordingly
- Before starting it, user should have aws secret and keys defined in their env variables, explained here
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
#!/bin/bash | |
set -ex | |
# This script creates an Aurora PostgreSQL DB cluster and instance, applies a custom parameter group to enable auto explain for all queries/execution plans, and creates a database and user. | |
# It uses the AWS CLI to interact with Amazon RDS, and the `psql` command-line tool to execute SQL commands. | |
# Assumes that `psql` is installed and in the system PATH, and that AWS credentials are set as environment variables. | |
# Usage: ./PrepareAuroraPostgreSql.sh <SubnetGroupName> | |
if [ "$#" -ne 1 ]; then | |
echo "Usage: $0 <SubnetGroupName>" | |
exit 1 | |
fi | |
SUBNET_GROUP_NAME=$1 | |
PSQL_PATH="psql" # Adjust the path to your psql executable | |
if ! command -v psql &> /dev/null; then | |
echo "Error: psql not found in PATH." | |
exit 1 | |
fi | |
function get_timestamp { | |
date +"%Y-%m-%d %H:%M:%S" | |
} | |
# Set AWS credentials and region | |
AWS_ACCESS_KEY=$AWS_ACCESS_KEY_ID | |
AWS_SECRET_KEY=$AWS_SECRET_ACCESS_KEY | |
AWS_SESSION_TOKEN=$AWS_SESSION_TOKEN | |
REGION="eu-west-2" | |
# Create Aurora PostgreSQL DB cluster | |
CLUSTER_IDENTIFIER="testdelete-cluster" | |
DB_INSTANCE_IDENTIFIER="ozkantest-instance" | |
DB_NAME="testdelete_db" | |
MASTER_USERNAME="ozkantest_user" | |
MASTER_USER_PASSWORD="ozkantest_password" | |
aws rds create-db-cluster --db-cluster-identifier $CLUSTER_IDENTIFIER --engine "aurora-postgresql" --master-username $MASTER_USERNAME --master-user-password $MASTER_USER_PASSWORD --database-name $DB_NAME --db-subnet-group-name $SUBNET_GROUP_NAME | |
# Wait for the cluster to be available | |
CLUSTER_STATUS="" | |
while [ "$CLUSTER_STATUS" != "available" ]; do | |
sleep 30 # Adjust sleep time as needed | |
CLUSTER_STATUS=$(aws rds describe-db-clusters --db-cluster-identifier $CLUSTER_IDENTIFIER --query 'DBClusters[0].Status' --output text) | |
echo "$(get_timestamp) - Current cluster status: $CLUSTER_STATUS" | |
done | |
# Create a DB instance in the cluster | |
aws rds create-db-instance --db-instance-identifier $DB_INSTANCE_IDENTIFIER --db-cluster-identifier $CLUSTER_IDENTIFIER --engine "aurora-postgresql" --db-instance-class "db.t3.medium" --publicly-accessible --db-subnet-group-name $SUBNET_GROUP_NAME | |
# Wait for the instance to be available | |
DB_INSTANCE_STATUS="" | |
while [ "$DB_INSTANCE_STATUS" != "available" ]; do | |
sleep 30 # Adjust sleep time as needed | |
DB_INSTANCE_STATUS=$(aws rds describe-db-instances --db-instance-identifier $DB_INSTANCE_IDENTIFIER --query 'DBInstances[0].DBInstanceStatus' --output text) | |
echo "$(get_timestamp) - Current DB instance status: $DB_INSTANCE_STATUS" | |
done | |
# Retrieve the DB Instance Endpoint | |
ENDPOINT=$(aws rds describe-db-instances --db-instance-identifier $DB_INSTANCE_IDENTIFIER --query 'DBInstances[0].Endpoint.Address' --output text) | |
echo "DB Endpoint: $ENDPOINT" | |
# Define the SQL commands without ALTER SYSTEM | |
SQL_COMMANDS=$(cat <<EOF | |
CREATE USER redgatemonitor WITH PASSWORD '$MASTER_USER_PASSWORD'; | |
GRANT pg_monitor TO redgatemonitor; | |
GRANT ALL PRIVILEGES ON DATABASE redgatemonitor TO redgatemonitor; | |
CREATE EXTENSION IF NOT EXISTS pg_stat_statements; | |
CREATE EXTENSION IF NOT EXISTS log_fdw; | |
CREATE SERVER sqlmonitor_file_server FOREIGN DATA WRAPPER log_fdw; | |
GRANT EXECUTE ON FUNCTION list_postgres_log_files() TO redgatemonitor; | |
GRANT EXECUTE ON FUNCTION create_foreign_table_for_log_file(text, text, text) TO redgatemonitor; | |
GRANT USAGE ON FOREIGN SERVER sqlmonitor_file_server TO redgatemonitor; | |
EOF | |
) | |
# Set the password environment variable for psql | |
export PGPASSWORD=$MASTER_USER_PASSWORD | |
# Define the SQL command for creating the database | |
SQL_CREATE_DATABASE="CREATE DATABASE redgatemonitor;" | |
# Execute the SQL command using psql | |
$PSQL_PATH -h $ENDPOINT -U $MASTER_USERNAME -d $DB_NAME -c "$SQL_CREATE_DATABASE" | |
# Execute the remaining SQL commands | |
$PSQL_PATH -h $ENDPOINT -U $MASTER_USERNAME -d redgatemonitor -c "$SQL_COMMANDS" | |
echo "SQL commands executed." | |
CUSTOM_PARAMETER_GROUP_NAME="auto-explain-everything-aurora-postgresql15" | |
PARAMETER_GROUP_EXISTS=$(aws rds describe-db-cluster-parameter-groups --db-cluster-parameter-group-name $CUSTOM_PARAMETER_GROUP_NAME --query 'DBClusterParameterGroups[0].DBClusterParameterGroupName' --output text) | |
if [ -z "$PARAMETER_GROUP_EXISTS" ]; then | |
aws rds create-db-cluster-parameter-group \ | |
--db-cluster-parameter-group-name $CUSTOM_PARAMETER_GROUP_NAME \ | |
--db-parameter-group-family aurora-postgresql15 \ | |
--description "Enable auto explain for everything query/execution plans group for Aurora PostgreSQL" | |
fi | |
# Modify the parameters in the custom parameter group in multiple requests | |
PARAMETERS1=$(cat <<EOF | |
[ | |
{"ParameterName":"shared_preload_libraries","ParameterValue":"pg_stat_statements,auto_explain","ApplyMethod":"pending-reboot"}, | |
{"ParameterName":"log_destination","ParameterValue":"csvlog","ApplyMethod":"pending-reboot"}, | |
{"ParameterName":"auto_explain.log_format","ParameterValue":"json","ApplyMethod":"immediate"} | |
] | |
EOF | |
) | |
PARAMETERS2=$(cat <<EOF | |
[ | |
{"ParameterName":"auto_explain.log_min_duration","ParameterValue":"0","ApplyMethod":"immediate"}, | |
{"ParameterName":"auto_explain.sample_rate","ParameterValue":"1.0","ApplyMethod":"immediate"}, | |
{"ParameterName":"auto_explain.log_verbose","ParameterValue":"true","ApplyMethod":"immediate"}, | |
{"ParameterName":"auto_explain.log_nested_statements","ParameterValue":"true","ApplyMethod":"immediate"}, | |
{"ParameterName":"auto_explain.log_analyze","ParameterValue":"true","ApplyMethod":"immediate"}, | |
{"ParameterName":"auto_explain.log_buffers","ParameterValue":"true","ApplyMethod":"immediate"}, | |
{"ParameterName":"auto_explain.log_timing","ParameterValue":"true","ApplyMethod":"immediate"}, | |
{"ParameterName":"auto_explain.log_triggers","ParameterValue":"true","ApplyMethod":"immediate"}, | |
{"ParameterName":"track_io_timing","ParameterValue":"true","ApplyMethod":"immediate"}, | |
{"ParameterName":"pg_stat_statements.track","ParameterValue":"top","ApplyMethod":"immediate"} | |
] | |
EOF | |
) | |
aws rds modify-db-cluster-parameter-group \ | |
--db-cluster-parameter-group-name $CUSTOM_PARAMETER_GROUP_NAME \ | |
--parameters "$PARAMETERS1" | |
aws rds modify-db-cluster-parameter-group \ | |
--db-cluster-parameter-group-name $CUSTOM_PARAMETER_GROUP_NAME \ | |
--parameters "$PARAMETERS2" | |
# Apply the custom parameter group to the DB cluster | |
aws rds modify-db-cluster \ | |
--db-cluster-identifier $CLUSTER_IDENTIFIER \ | |
--db-cluster-parameter-group-name $CUSTOM_PARAMETER_GROUP_NAME \ | |
--apply-immediately | |
# Reboot the DB instance to apply pending-reboot parameters | |
aws rds reboot-db-instance --db-instance-identifier $DB_INSTANCE_IDENTIFIER | |
# Retrieve the DB Instance Endpoint | |
ENDPOINT=$(aws rds describe-db-instances --db-instance-identifier $DB_INSTANCE_IDENTIFIER --query 'DBInstances[0].Endpoint.Address' --output text) | |
# Check if the endpoint is retrieved correctly | |
if [ -z "$ENDPOINT" ]; then | |
echo "Error: DB instance endpoint not found." | |
exit 1 | |
fi | |
echo "$(get_timestamp) - Custom parameter group applied successfully." |
After successful execution you should see the test cluster like below, and you can check logs to see how execution plans are created.
And in case you wonder how explain and query plan works in detail you can check the code here study some C 🤓