PostgreSQL Replication Express Setup

The system I work on we deploy almost solely on the Amazon AWS platform. Even I try to design the architecture in a way not to be locked-in too much into Amazon, I make use of the Amazon tools and products as much as possible (EC2, VPC, S3, SNS). PostgreSQL is our reference DB and the only DB product in production environments, still we run dedicated instances with PostgreSQL. I am quite delighted about AWS recent offering staring RDS with PostgreSQL. While is is still in BETA and I did not started yet with a conclusive test and migration plan, I need to maintain our existing instances.

There are plenty of books and tutorials about setting up PostgreSQL replication with on-board tools, without going into the details I share the express setup in this tutorial based on Streaming Replication which is part of PostgreSQL since version 9.0. I highly recommend to review the parameters and settings from the below tutorial as your project might have different requirements.

References

Tutorial

Remarks

  • The tutorial is based on PostgreSQL 9.2 running on Ubuntu Server
  • Paths and settings are all the PostgreSQL defaults.
  • This is async setup, the master will not wait for feedback from the salve and continue to work even the slave is not available

Prerequisite

  • 2 Server running the same PostgreSQL version (9.0+)
  • Backup your data or use a sandbox environment.
  • In the tutorial I refer to
    MASTER (ip: 0.0.0.1) and
    SLAVE (ip: 0.0.0.2)

Configuration

Master

  • Create a replicator user
    sudo -u postgres psql -c "CREATE USER replicator REPLICATION LOGIN ENCRYPTED PASSWORD 'mypassword';"
    
  • Add the slave ip to /etc/postgresql/9.2/main/pg_hba.conf
    host    replication     all      0.0.0.2/32         trust
    
  • Modify parameters in /etc/postgresql/9.2/main/postgresql.conf
    wal_level = hot_standby
    max_wal_senders = 3
    checkpoint_segments = 3
    wal_keep_segments = 3
    

    Review these parameters and set them up according to your requirements

  • Start the PostgreSQL instance
    service postgresql start
    

Slave

  • Modify parameters in /etc/postgresql/9.2/main/postgresql.conf
    wal_level = hot_standby
    max_wal_senders = 3
    checkpoint_segments = 3
    wal_keep_segments = 3
    hot_standby = on
    
  • Stop the PostgreSQL instance
    service postgresql stop
    
  • Clean up the old data directory
    sudo -u postgres rm -rf /var/lib/postgresql/9.2/main
    
  • Copy the database from the master with pg_basebackup
    sudo -u postgres pg_basebackup -h 0.0.0.1 -D /var/lib/postgresql/9.2/main -U replicator -v -P -x
    

    You can see the backup progress and it should result in something likes

    root@:/var/lib/postgresql/9.2/main# transaction log start point: 41/7D000020
    31524952/31524952 kB (100%), 2/2 tablespaces (/var/lib/postgresql/9.2/main/PG_9.)
    transaction log end point: 41/7D0002A8
    pg_basebackup: base backup completed
    
  • Create a recovery configuration file /var/lib/postgresql/9.2/main/recovery.conf
    standby_mode = 'on'
    primary_conninfo = 'host=0.0.0.1 port=5432 user=replicator password=mypassword sslmode=require'
    trigger_file = '/tmp/postgresql.trigger'
    
  • Start the PostgreSQL instance
    service postgresql start
    

    Check the pglogs.

Test the replication

  • Open any table with pgadmin on the master and apply a change, it should be reflected in the slave within short time.
  • Try to change data on the slave, it will fail due to the hot-standby mode

Monitor the replication

  • The master instance will not alert you when the replication is down. You can check by yourself or create a little cronjob to do it for you with this sql statement.
    sudo -u postgres psql -x -c "select * from pg_stat_replication;"
    

    You get the status back (if it is running, otherwise the statement will return ‘no rows’).

    Check replication

    Check replication

Evaluating JVM Heap Dumps

Despite running JVM supposedly being rock-solid and stable I see enough JVM who cant take the load, most likely to non-optimal settings, throwing heap dumps.
A while ago I struggled to open Thread Dumps I can open heap dumps with VisualVM (Version 1.3.7 at the time of writing)

VisualVM

VisualVM

Heap Dump File

Heap Dump File

Beware, with significant heap dump files (in my sample 700MB) it can take quite a while to open or review the results. Open thread dump list took a good 10 minutes.

Long Processing Times

Long Processing Times

Hide Glassfish Server Information

Despite moving on to JBoss progressively I still share my findings, often they apply to other products too.

For a public facing server you want to reveal as little background information as possible. In Glassfish you can hide create custom 404 and 500 error pages (previous post), but you should also hide the server info that comes with the server header, easily revealed by a tool ike the Firefox plugin httpfox.

Response Header

Response Header

There are 2 crucial settings you must change:

  • JVM setting for product name: -Dproduct.name=”My App Server”

    JVM Settings

    JVM Settings

  • Remove the “XPowered By” flag

    XPowered By

    XPowered By

As result you will have a pretty generic response header

Response Header

Response Header

Oracle abandons Commercial Support for Glassfish !

I could not believe it when I read this in my daily serving of IT news. But it is the truth, Oracle will stop the (commercial) support for Glassfish in 2016 !
I remember a)  the full blown announcements during the take-over from SUN they would continue with Glassfish and b) people questioning it, the later should be right now. Ultimately, even Oracle saying they would continue with the community edition, means Glassfish will be dead, the core team will shrink and Glassfish becomes a hobby in order to maintain the reference EE implementation (to control the other players). They killed OpenOffice, OpenSolaris, Hudson CI (pretty much). I spent a lot of time with Glassfish and right now rolling it out to customers who actually expect commercial support, this back-stabs my efforts in the last few years being a Glassfish fan.

Dear Oracle, I will not continue with WebLogic  Application Server (another closed product), but swing over to JBoss (for which I did some compatibility tests for our applications anyway the last 2 years)

So Long, and Thanks for All the (Glass)Fish !

Update:

Increase size and type of AWS EBS volume

I was offline for quite a while because shifting from one continent to another. But now regular posts should be rolling in again.

I am running a couple of instances in pre-production requirement mode and changed from a standard EBS volume to a IOPS volume for the DB instance or the volume with the DB files. I could not identify a reasonable increase of performance, maybe a misconception that IOPS volumes will boost performance, rather provide a defined and consistent random access I/O throughput. I must admit I did not use a value higher than 1000.

Billing IOPS

Billing IOPS

Some recommended reading:

I decided to return to a standard ESB volume for my database as its performance did not benefit from the IOPS type (the DB is not overly busy too).
You cant change type and size of an EBS volume on the fly.

Here the steps to achieve the same: Continue reading

Glassfish Tuning and Thread Dumps

Once you get into a production like stage with more data volume, more users, etc. you will find yourself pretty fast in the situation where Glassfish gets some hickups or slowness. An expected situation if you did not change the default parameters  out of the box which are chosen to make Glassfish to run even on a small box.

We ran into some kind of concurrency problems with JDBC pools and thread where Glassfish appeared to be hanging. One approach is to create thread dumps for the JVM.
With Glassfish we have a few options:

  1. jstack
    Run jps which returns you the list of applications running a JVM, choose the PID and execute
    jstack <PID>
    or jstack -F <PID> > td.log
  2. asadmin
    Go the Glassfish_HOME/bin folder and execute
    ./asadmin --user admin generate-jvm-report --type=thread > threaddump.txt
  3. kill -3 <PID>
    Supposed to create a dump in the default log folder of Glassfish. Doesnt work for me.

With the (or better more than 1) file at hand you can evaluate them by hand or use some of the tools around. I am still struggling to make the tool analyzing my dumps. They simply open the files like a editor.

tda tool

tda tool

Some References

Enforce password for Ubuntu user on EC2 instances

Using linux (Ubuntu) instances on Amazon EC2 is a quite safe thing to do, at least measured by the security provided by the platform (security groups, ACL, physical security,..). I recommend reading their security site here. At the end of the day the server is only as secure as you configure it, if you choose to open all ports running services with their default configurations and password settings, Amazon can’t help you.

When connecting to a Ubuntu server with ssh you need to provide the keyfile (somekeyfile.pem) that you can download when creating the key pair.

Key file

Key file

This 2048 bit key is required to login as regular ubuntu user. What I dislike is the fact that this user can sudo all, so once someone manage to get into you user account, he has root access too. I recommend to set a password for the ubuntu user and change the sudoers configuration.

Change the password for user ubuntu

Open the sudoers include file

sudo vi /etc/suderos.d/90-cloudimg-ubuntu or sudo vi /etc/sudoers

change last line from

ubuntu  ALL=(ALL) NOPASSWD:ALL

to

ubuntu ALL=(ALL) ALL