Backing Up The Database
First, we need to connect to our database server instance via SSH using the ubuntu user.
We will need to install some dependencies to get our backup script to run:
sudo apt-get install build-essential
sudo apt-get install ruby1.8-dev
sudo apt-get install rubygems
sudo gem update --system
You will need to tweak RubyGems so that the update works correctly, as described here.
Now you can install Gemcutter, which is the new ultra cool repository for gems:
sudo gem install gemcutter
sudo gem tumble
Finally we are ready to install the Amazon EC2 rubygem:
sudo gem install amazon-ec2
Now we can create our backup script. Save this code into the ~/ directory under the name backup_database.rb. You will need to substitute your Amazon ACCESS_KEY_ID and SECRET_ACCESS_KEY values, as well as entering the correct EBS volume ID for the DATABASE_VOLUME constant:
require 'rubygems'
require 'AWS'

ACCESS_KEY_ID = 'YOUR_ACCESS_KEY'
SECRET_ACCESS_KEY = 'YOUR_SECRET_ACCESS_KEY'
DATABASE_VOLUME = 'vol-XXXXXXXX'

puts "Starting database snapshot..."
ec2 = AWS::EC2::Base.new(:access_key_id => ACCESS_KEY_ID, :secret_access_key => SECRET_ACCESS_KEY)
ec2.create_snapshot(:volume_id => DATABASE_VOLUME)
puts "Database snapshot completed."
Due to the finicky way that Ruby runs as part of a cron job, we are best off creating a shell script that then runs the Ruby backup script. Save this code into the ~/ directory under the name backup_db.sh:
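A minimal version of that wrapper might look like the following (the PATH value and log file location are assumptions; adjust them to match where your ruby binary and gems are installed):

```shell
#!/bin/bash
# Give cron a full PATH before invoking Ruby, and log output for debugging.
export PATH=/usr/local/bin:/usr/bin:/bin:$PATH
ruby /home/ubuntu/backup_database.rb >> /home/ubuntu/backup_db.log 2>&1
```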
Don't forget to make the backup shell script executable:
chmod +x /home/ubuntu/backup_db.sh
Now we just need to configure this script to run as part of a cron job, so that the backups take place automatically. The crontab command brings up the list of configured cron tasks for the current user:
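To open that list for editing, run:

```shell
crontab -e
```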
This example crontab entry runs the backup daily at midnight, but you may want it to run more frequently:
0 0 * * * /home/ubuntu/backup_db.sh
At this point, you should have a fully functional automated backup system. After midnight, verify that the script has run as expected by checking that a new snapshot has been created, using Elasticfox or however you administer your EC2 instances.
Creating The AMI
Creating the AMI to back up the entire database instance is pretty easy. First, you need to upload the PEM files. Remember you are authenticating as the "ubuntu" user:
scp -i id_rsa-gsg-keypair pk-YOUR.pem cert-YOUR.pem ubuntu@domU-12-34-31-00-00-05.usma1.compute.amazonaws.com:
Use your SSH connection into the database server instance to copy the PEM files to the /mnt directory:
sudo cp /home/ubuntu/*.pem /mnt
Now create the bundle. Make sure you use your Amazon account number (without dashes) as the value for the -u parameter. This can take quite a while, so do not get impatient:
sudo ec2-bundle-vol -d /mnt -k /mnt/pk-YOUR.pem -c /mnt/cert-YOUR.pem -r i386 -u YOURUSERACCOUNTNUMBER
You can now upload the bundle to your Amazon S3 account, in preparation for making it available as an AMI. Use something versioned for the -b parameter, which is the name of the bundle:
sudo ec2-upload-bundle -b my-database-server-1.0-ami -m /mnt/image.manifest.xml -a YOUR_ACCESS_KEY -s YOUR_SECRET_ACCESS_KEY
The final step is to go back to your local machine and register the newly created bundle so it can be used to start new instances:
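Assuming the bundle name used above, the registration command would look something like this:

```shell
ec2-register my-database-server-1.0-ami/image.manifest.xml
```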
You can now launch a brand new database server instance based on this AMI, and it will be a clone of your existing database server. This is the procedure you would follow if you ever need to restore your database server instance from backups.
Restoring Your Database Server From The Backups
In the case that something goes terribly, terribly wrong, you can get back to normal as follows:
- Start up a new EBS volume from your most recent snapshot backup.
- Start up a new server instance from your database AMI.
- Configure the new server instance to use the volume created from the backup data.
- Switch your Elastic IP to point to the new server, or update the references in your application to point to it.
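Using the ec2-api-tools, those steps can be sketched roughly as follows (all of the IDs, the availability zone, the device name, and the IP address are placeholders you must substitute):

```shell
# Launch a replacement instance from the registered AMI
ec2-run-instances ami-XXXXXXXX -k gsg-keypair -z us-east-1a

# Create a fresh volume from the latest snapshot, in the same availability zone
ec2-create-volume --snapshot snap-XXXXXXXX -z us-east-1a

# Attach it to the new instance where the old database volume was mounted
ec2-attach-volume vol-XXXXXXXX -i i-XXXXXXXX -d /dev/sdh

# Repoint the Elastic IP at the new instance
ec2-associate-address -i i-XXXXXXXX 75.101.xxx.xxx
```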
This concludes part 2 of the great PostgreSQL config post for the EC2 cloud. Hopefully it gives you a nice simple way to take the basic PostgreSQL instance you got up and running on Ubuntu/EC2 using the directions in the part 1 post, and add the confidence that backed-up data and a completely reproducible configuration provide.
Check out Eric Hammond's article on creating consistent EBS snapshots where multiple volumes are involved. It's written with MySQL in mind, but it can be modified to work with PostgreSQL.
Thanks for pointing out that article. I depended on several of Eric's other articles when writing my previous posts on PostgreSQL/Ubuntu/EC2, as I mentioned in those posts. His work has been seminal in everyone's use of these tools.
That said, the purpose of my post was to provide the simplest way to achieve a backup of PostgreSQL. ec2-consistent-snapshot looks very interesting for MySQL, but I was not up to the work right at the moment for the modifications needed for PostgreSQL configurations.
Overheard on Twitter (whose search feature is currently not working for older tweets, so I am paraphrasing):
"The killer feature of EC2 is @esh"
Oh yes! He has saved my assets, not just in general, but personally.
Beauty. Nicely done, while Hammond's is great for the deep dive this was perfect for the short and sweet.
Thanks so much for this and the earlier lead article - finally got my existing VMware .vmdk Ubuntu image successfully converted over to an EC2 AMI after much trial and error, and moving my PostgreSQL/PostGIS database into the cloud (EBS) is the next step. Your documentation was the first I found that really put things together well from the beginner's perspective.
Am running into issues with the ec2-bundle-vol command.
Unfortunately, getting that command to install itself is a huge task. It needs ec2-ami-tools (I didn't realize they weren't part of ec2-api-tools), ruby, openssl, multiverse and whatnot. After resolving all those dependencies, it gives me this error:
ERROR: You need to be root to run /home/ubuntu/ec2-ami-tools-1.3-34544//lib/ec2/amitools/bundlevol.rb
Now am onto resolving above error. :)
Thanks Ron for such a useful article
I'm not sure taking EBS snapshots is the best mechanism for backing up a production database. From the developer forums: "To get a clean snapshot, unmounting the volume (or using dm suspend as described elsewhere) is necessary, as you need to tell the OS to flush all pending writes to the device." This is especially true if the database is hot (or you don't want to incur daily downtime). Also, you are likely backing up a lot of empty disk, as the snapshot is of the whole volume but your data is just a portion of it. A much simpler solution I've used is to pg_dump a tar, compress it, and use s3cmd to persist it to S3.
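For reference, that dump-and-ship pipeline might look something like this (the database name and bucket name are placeholders):

```shell
# Dump the database in tar format, compress it, and push it to S3
pg_dump -Ft mydb -f /tmp/mydb.tar
gzip -f /tmp/mydb.tar
s3cmd put /tmp/mydb.tar.gz s3://my-backup-bucket/mydb-$(date +%Y%m%d).tar.gz
```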
Post a Comment