Configuring AWS Elastic Beanstalk Environment

In my previous post, we looked at deploying a Django project to AWS Elastic Beanstalk (EB). Now, as promised, we will continue with this topic and look at configuring the Django database settings and related options. More specifically, we will configure our EB environment and Django project to use a Postgres database.

We are starting with a Django project ready to be deployed to AWS EB, i.e., exactly where we stopped in my previous post. If you followed the steps discussed there, you most likely finished your tests by issuing the eb terminate command, which removed the EB environment you created but left an app in place that should look approximately like this in the AWS EB console (an existing Elastic Beanstalk application with no environments):

AWS Elastic Beanstalk – existing application without environments

You should still have the configured Django project folder stored locally on your Ubuntu machine (if not, just go through the steps described in my previous article). With the configured project folder in place, we can easily create an EB environment again using the eb create command, like we did before:

Using eb create command to create Django environment

Once the environment has been created, we can quickly confirm that everything works by issuing the eb status command and making sure that Ready and Green are reported for the environment’s Status and Health respectively, and then opening our app in the web browser with the eb open command. Provided that these basic checks pass, we can start looking at the database configuration changes to make in this environment.

At this stage, our Elastic Beanstalk environment uses SQLite as a database: our local db.sqlite3 database file gets uploaded to the EC2 instance belonging to the Beanstalk environment. That can be very convenient, as you upload the database with all its data and you don’t even need to run migrations on the Beanstalk instance. You just run them locally, and your database gets uploaded as a part of the deployment process with all the data included. But this configuration has two downsides:

SQLite is not something you can use for large-scale multi-user applications (and as Django is used to build web applications, those are multi-user by their nature)
With such a configuration, you always overwrite the database on the Beanstalk instance with the contents of your local data file. That may not be acceptable for tests that require data added by application users after deployment, and it is definitely not acceptable for production applications, which have to store such things as real users’ data and transaction history (though, depending on your app architecture, you can use separate databases for that)

Having said that, we are going to see how we can switch over to a more serious database engine fit for real-world applications.

Configuring database engine using AWS EB Console

Let’s have a look at changing the database engine configuration to Postgres from the EB environment properties. For that we need to navigate to AWS EB console. We can do that through its standard URL or by means of issuing the eb console command, which will open the console in the browser window for you. Use of the eb console command is much more convenient as it immediately takes you to the right AWS console section which looks as shown below:

AWS Console – Elastic Beanstalk App Environment

Once you’ve opened the console, you need to click on Configuration and then on the Modify button in the Database tile (assuming you are using the grid view):

AWS Elastic Beanstalk console – Accessing environment’s Database settings

That will open the database configuration settings page which allows us to configure the Beanstalk environment database:

Elastic Beanstalk – Database Settings

Here we can select the desired database settings to create a database associated with the Beanstalk environment. Let’s configure our environment to use the Postgres database. For that, we select Postgres as the engine in the first drop-down. The second drop-down allows us to select the engine’s version. The AWS Elastic Beanstalk Python platform supports Django 2.1 (source), and Django 2.1, in turn, defines the supported versions of PostgreSQL as PostgreSQL 9.4 and higher. On a lower level, Django’s ability to work with PostgreSQL depends on the psycopg2 module, and current versions of it support the most recent versions of PostgreSQL (at the moment, that is version 12). Unless you have requirements dictating the use of a specific Postgres version, you are safe to select the latest available version, which is 11.5 at the time of this writing.

Instance class can be left on db.t2.micro (which will be the cheapest option) and you can keep the storage at its minimum 5 GB value.

You will also need to fill out the Username and Password fields, taking into account that the username is limited to 16 alphanumeric characters and the password does not allow the use of forward slash (/), quotes (“) or at (@) symbols. As usual, be sure to use a complex password.

If you don’t want to keep a database snapshot after the environment termination, you can select Delete in the retention drop-down. Availability can be kept on the “Low (one AZ)” level for our purposes here. After making these adjustments, just click on Apply as shown below:

Database setting – Postgres database options configured

This concludes the configuration of PostgreSQL from the Elastic Beanstalk environment side, but we still need to configure our Django project to use this newly configured database. That includes two things: modifying the database connection configuration and adding the Python-PostgreSQL Database Adapter to the requirements.txt file.

Configuring Django project to use EB Postgres database

Let’s start with modifying the settings.py file located inside the project app folder (the subfolder that corresponds to the initial project app, has the same name as our project, and acts as the main web application for it). We can edit the settings.py file using the nano editor as shown below:

Opening settings.py file using nano

Inside of the settings.py file, we need to locate DATABASES settings, which, by default, defines the use of SQLite and looks as follows:

settings.py file – default DATABASES configuration

In order to specify the use of PostgreSQL database that we added in our EB environment, we need to change DATABASES settings to the following:

settings.py file – DATABASES configuration for EB environment Postgres database

In addition to these changes, to ensure that we can continue running our Django project both locally and on AWS Elastic Beanstalk, we can wrap our DATABASES configuration in an if/else condition as shown below. Based on the presence of the ‘RDS_DB_NAME’ variable, it selects the Beanstalk database settings when the project is running on the Beanstalk instance and sqlite3 when Django is started locally:

settings.py file – DATABASES configuration for EB appended with fallback condition to use SQLite when run locally
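In text form, such a conditional configuration might look roughly like the sketch below. It assumes the standard RDS_* environment variables that Elastic Beanstalk sets for an attached database (RDS_DB_NAME, RDS_USERNAME, RDS_PASSWORD, RDS_HOSTNAME, RDS_PORT) and the django.db.backends.postgresql engine path. The build_databases helper is a hypothetical name used here only to keep the logic in one place; a plain if/else block directly in settings.py works just as well.

```python
import os


def build_databases(environ, base_dir):
    """Return a Django DATABASES dict: Postgres on EB, SQLite locally."""
    if "RDS_DB_NAME" in environ:
        # Running on Elastic Beanstalk: the attached database exposes
        # its connection details through RDS_* environment variables.
        return {
            "default": {
                "ENGINE": "django.db.backends.postgresql",
                "NAME": environ["RDS_DB_NAME"],
                "USER": environ["RDS_USERNAME"],
                "PASSWORD": environ["RDS_PASSWORD"],
                "HOST": environ["RDS_HOSTNAME"],
                "PORT": environ["RDS_PORT"],
            }
        }
    # Running locally: fall back to the default SQLite database file.
    return {
        "default": {
            "ENGINE": "django.db.backends.sqlite3",
            "NAME": os.path.join(base_dir, "db.sqlite3"),
        }
    }


# Assumed usage inside settings.py (BASE_DIR is defined there by Django):
# DATABASES = build_databases(os.environ, BASE_DIR)
```

Passing the environment in as an argument, instead of reading os.environ inside the function, makes it easy to check both branches without touching your real environment.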

Now we need to add the Python-PostgreSQL Database Adapter to the requirements.txt file located in the root of the Django project folder, as shown below:

Adding Python-PostgreSQL Database Adapter into Django project’s requirements.txt

We can now deploy our project using the eb deploy command, and once the deployment succeeds, the Elastic Beanstalk environment will be using the Postgres database associated with it. If you want to, you can connect to the Postgres AWS RDS database via its endpoint name. But to do that, we first need to enable external access to this database instance over port 5432 as shown below:

Configuring external access for PostgreSQL database

As you can see from the screenshot, we can drill down into the database instance settings from the EB environment Database tile by clicking on the endpoint name. There, under Connectivity and security, we need to click on the associated VPC security group name and edit the inbound rules, adding a new rule that allows access from our local IP to this instance over the default PostgreSQL port. The existing rule allows access to this instance from the EC2 instance associated with the EB environment and was created automatically when we added the database in our EB environment settings.

Once we’ve allowed external access, we can use pgAdmin on our Ubuntu workstation to connect to this Postgres database, specifying the connection properties as shown below (using our EB database endpoint name as a hostname along with credentials we set for it earlier):

Connecting to EB instance database using pgAdmin

The problem we have at this stage is that the database looks a bit empty. Moreover, if you attempt to navigate to the Django admin page, it won’t work. Why? Because unlike with the SQLite database configuration, which was used by your instance before (and it was uploaded to the EB instance as a file containing both applied migrations and data), we now have a brand new database and need to take care of migrations and adding data into it.

Adding migration commands for EB environment into Django project

To run migrations, we now need to execute the migrate command on the EB instance. For that, we need to create a db-migrate.config file inside the hidden .ebextensions directory of our Django project and add the following content to it:

db-migrate.config – adding database migration command

This file will allow you to run migrations on your Elastic Beanstalk environment. The leader_only parameter indicates that, in case of deployment to multiple instances, the command will run only once during deployment, on the first instance, because the database is a shared resource in this scenario (we are speaking here about the backend EC2 instances that provide scaling and capacity for the EB environment).
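For reference, a minimal sketch of such a file is shown below, written as a small Python helper that creates it. The file contents assume that the migration is run with python manage.py migrate, which may need adjusting for your platform version. The write_db_migrate_config helper and the DB_MIGRATE_CONFIG name are hypothetical and only serve this example; you can just as well create the file by hand in nano.

```python
from pathlib import Path

# Assumed contents of .ebextensions/db-migrate.config: a single container
# command that applies migrations, limited to one instance via leader_only.
DB_MIGRATE_CONFIG = """\
container_commands:
  01_migrate:
    command: "python manage.py migrate"
    leader_only: true
"""


def write_db_migrate_config(project_dir):
    """Create .ebextensions/db-migrate.config under project_dir; return its path."""
    target = Path(project_dir) / ".ebextensions" / "db-migrate.config"
    # The .ebextensions directory is hidden and may not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(DB_MIGRATE_CONFIG)
    return target


# Assumed usage from the root of the Django project folder:
# write_db_migrate_config(".")
```

Note that the file itself is YAML, so the two-space indentation inside DB_MIGRATE_CONFIG matters.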

The commands mentioned above will apply the migrations, but we also need to have an admin user added to the database. For that, we will need to execute the createsuperuser command, and that will be a bit of a typing exercise in which you will also need to escape double-quote characters, so here is the sample command in text form so that you can just paste and edit it:

Command:

"echo \"from django.contrib.auth.models import User; User.objects.create_superuser('admin_user_name', 'admin_user@domain.com', 'Password')\" | python manage.py shell"


Here is what your db-migrate.config may look like after adding the createsuperuser command to it:

db-migrate.config – adding createsuperuser command

Now we can run the eb deploy command and, once it completes, we should see that the Django admin app is working and we can log in to it, which is a clear indication that the migrations were applied and the admin user was created in our new EB environment Postgres database. Additionally, we can now connect to the Postgres RDS instance using pgAdmin to see that our ebdb database now has some tables created:

Accessing EB instance Postgres database via pgAdmin

With these settings in place, we are now using the EB environment with the PostgreSQL RDS database instance linked to it.

Be sure to edit your db-migrate.config and remove the three lines responsible for the execution of the createsuperuser command. It has to be executed only once, and if it is not removed after the first run, it will cause deployment failures with the “duplicate key value violates unique constraint” error. Here is a sample screenshot of this error:

eb deploy failure due to attempt to execute createsuperuser command again

So, once again, if your deployment fails with the error shown above, you need to remove the following lines from the db-migrate.config file located in the .ebextensions folder:

Lines which have to be removed to prevent execution of the createsuperuser command

From this moment on, you can keep working with your Django project, incrementally adding your changes and applying them using the eb deploy command. The important things to understand about such a configuration are the following:

We are using the Postgres RDS database associated with the EB environment, and it gets deleted when you issue the eb terminate command. Data retention depends on the settings you configured while adding the database for the EB environment and boils down to two available options: immediate deletion of the database instance (all data is lost) or creation of a database snapshot before database deletion (i.e., the snapshot will be saved and you will be able to restore it once you create the environment again)
Data no longer travels from your local development environment to the EB environment every time you issue the eb deploy command, as we now target different databases. That raises questions of data synchronization in case you need to do local tests with data and so on.

To address the question of data sync and retention for our development environment, we can further adjust our EB environment configuration for use of an independent AWS RDS database instance. Such a configuration will allow us to decouple the database from the EB instance startup and termination as well as use the same database both on our local development machine and in EB environment.

Configuring EB Django project to use dedicated AWS RDS database (independent from EB environment)

Let’s go through the steps involved in setting up a dedicated AWS RDS Postgres database for use with our Django project. Before we do that, we need to terminate our EB environment using the eb terminate command and create it again with eb create. That will allow us to start with an environment created from scratch, without a database instance attached to it (remember that the EB environment database gets deleted during environment termination, and you would need to create and attach it again). Once we do that, we can configure the EB environment and our Django project to use the external, dedicated RDS Postgres database.

First of all, we need to create the RDS Postgres database instance. For that, we just open the AWS RDS console and click on the Create Database button as shown below:

AWS RDS Console – Create database

Note that the database will be created within the region you selected earlier, and you may change it before creating your database if necessary. On the next page, leaving the database creation method set to “Standard Create,” we need to select PostgreSQL as the engine type, select the latest available engine version (11.5-R1), and choose a template, which can be set to Free tier for our test purposes here:

AWS RDS Console – Selecting database engine settings and template

Next, we need to fill in the database settings, providing the DB instance identifier, which has to be unique within your AWS account per AWS region. We can use the EB environment name as part of the database instance ID to simplify the management and identification of resources; based on that, let’s use “django-test-env-db” as the DB instance identifier. Next, we can change the master username to something non-default, for example “testenvuser”, and type in a secure password twice (you can also use the auto generate password feature). Most importantly, make sure you expand the Additional connectivity configuration settings and set the Publicly accessible option to Yes as shown below:

Creating RDS Database – Enabling public access

The rest of the settings can be left on their defaults at this stage, and we can just click on the Create database button at the bottom of this page to initiate the database instance creation process. I found it a bit confusing that throughout the RDS console, the terms “database” and “database instance” are both used to refer to database (server) instances, despite the fact that you only create and manage the database (server) instances there and not the databases within them. For example, you click on the Create database button to create a database (server) instance, while we still need to create a Django project database within this instance later on.

Once the RDS database instance has been created, we need to allow access to it from the EB environment and from our local development machine. For that, we need to edit the inbound rules of our database instance’s VPC security group. To do this, we drill down into the database settings and click on the VPC security group name under Security in the database Connectivity & security settings as shown below:

RDS Console – Accessing database VPC security group settings

That will bring up the security group settings page, where we need to switch to the Inbound tab and click on the Edit button as shown below:

RDS Console – VPC Security Group Settings

As I mentioned before, we need to add two inbound rules: one to access the database from the EB environment and another to access it from our local development machine. Let’s create the rules shown in the screenshot below and click on the Save button:

RDS Console – Inbound rules for PostgreSQL database

To allow inbound PostgreSQL connections from our EB environment (the second rule in the screenshot above), we need to paste its security group ID into the source field. You can find it in the environment configuration under Instances > Security groups as shown below:

EB environment configuration – Confirming instance’s security group ID

With the Postgres RDS database instance created and inbound access rules configured, we now need to configure our EB environment variables to use this database. Of course, we can specify plain text passwords along with other database connection properties in our settings.py file, but that won’t be good from the security standpoint. To avoid this, we will be using the same EB environment variables which we used before (RDS_DB_NAME, RDS_PASSWORD and so on) – the only difference now is that with the independent RDS database, we need to fill them out ourselves (when we used database linked to EB environment those were created for us automatically). To set these variables, we need to open the Software settings of our EB environment (EB Console > Application > Environment > Configuration > Software > Modify) as shown below:

EB Console – Accessing environment’s software settings

Once you click on Modify under the software settings, the Modify software page opens, where we need to define all our database connection variables as shown below (for the values, we use the settings configured for our recently created RDS Postgres database) and click Apply:

EB Console – Modify Software – Adding database related Environment Properties

Pay attention to the fact that at the stage when we created the RDS Postgres database instance, we specified the instance name but a database still has to be created. As you can see from the screenshot above, I set the RDS_DB_NAME variable to be “djangotestenvdb” – but it still needs to be created. For that, we can connect to the RDS Postgres instance using pgAdmin and create a new database named “djangotestenvdb” – thus, we will also confirm that the access to the instance is working from our development workstation.

With the database created within our Postgres RDS instance, we can finally adjust our Django project settings. As we want to avoid storing the database password within our project settings.py file, we will define the same variables we defined for our EB environment on our local Ubuntu machine by means of placing them into the .bashrc file. For that, we just need to open this file using the nano ~/.bashrc command and define the required variables at the very end of that file:

Defining RDS variables in Ubuntu .bashrc file

After modifying this file, we need to reload it so that the changes take effect in our shell, for example by running source ~/.bashrc or by opening a new terminal session (rebooting the Ubuntu machine also works, but is not required). After that, we can confirm that the variables were added correctly by running the printenv command and checking that its output contains the newly added variables.
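If you prefer to check this from Python rather than scanning the printenv output, a small script along these lines can report which variables are still missing. The list of variable names assumes the standard set used by Elastic Beanstalk (RDS_DB_NAME, RDS_USERNAME, RDS_PASSWORD, RDS_HOSTNAME, RDS_PORT), and missing_rds_vars is a hypothetical helper name:

```python
import os

# Assumed set of variables, matching the names Elastic Beanstalk uses.
REQUIRED_VARS = (
    "RDS_DB_NAME",
    "RDS_USERNAME",
    "RDS_PASSWORD",
    "RDS_HOSTNAME",
    "RDS_PORT",
)


def missing_rds_vars(environ):
    """Return the names of required RDS_* variables that are absent or empty."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_rds_vars(os.environ)
    if missing:
        print("Missing variables:", ", ".join(missing))
    else:
        # Deliberately does not print the values, to keep the password hidden.
        print("All RDS variables are set.")
```

Run it from the same terminal session in which you intend to start Django, since the variables are only visible to shells that have loaded the updated .bashrc.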

All these changes allow us to have a “unified” DATABASES configuration in the settings.py file, which will be valid both for our Ubuntu development machine and for our EB environment. Essentially, we just need to remove the if/else condition we added when we were combining the EB environment database with the local SQLite database, and adjust the ENGINE parameter to specify the use of the psycopg2 package, which acts as a PostgreSQL database adapter for Python. Here is what our DATABASES configuration should look like:

“Unified” DATABASES config
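In text form, the unified configuration might look roughly like the following sketch. It assumes the same RDS_* variable names on both the development machine and the EB environment, and it uses the django.db.backends.postgresql engine path, which relies on psycopg2 under the hood (Django 2.1 also accepts the older django.db.backends.postgresql_psycopg2 path, so use whichever your project already has). The rds_databases helper is a hypothetical name for illustration only.

```python
import os


def rds_databases(environ):
    """Build DATABASES from RDS_* variables only; no SQLite fallback.

    A missing variable raises KeyError, so a misconfigured machine fails
    at startup instead of silently using the wrong database.
    """
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "NAME": environ["RDS_DB_NAME"],
            "USER": environ["RDS_USERNAME"],
            "PASSWORD": environ["RDS_PASSWORD"],
            "HOST": environ["RDS_HOSTNAME"],
            "PORT": environ["RDS_PORT"],
        }
    }


# Assumed usage inside settings.py:
# DATABASES = rds_databases(os.environ)
```

Since nothing sensitive is hard-coded here, the same settings.py can be committed and deployed as-is.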

As a final touch, we need to install the psycopg2 package within our local virtual environment by issuing the pip3 install psycopg2-binary command and remove the db-migrate.config file that we created earlier under the .ebextensions project folder. As we now use the same database on the development machine, we can apply migrations locally.

Provided that you performed all the steps mentioned above correctly, you should be able to run all the migration commands locally and deploy your Django project using the eb deploy command, so that it keeps using the same RDS Postgres database, which may make your development work a little more convenient. As you can see, we did this without storing the sensitive database password within our project code, and we are no longer deleting our Django app database upon EB environment termination. The problem with this type of configuration is that the eb terminate command will fail, because the RDS database instance’s security group references the security group ID of the EC2 instance associated with the EB environment. So you will need to delete the database inbound rule referencing this security group ID before terminating the environment (and if you started the terminate operation before doing this, you will need to delete the corresponding inbound rule and run the terminate command from the EB command line again). You should also remember that after recreating the EB environment (a terminate-create cycle), you will need to add the database-related environment variables again. That’s a bit of a nuisance, but it can still be faster and more convenient than recreating your database and restoring snapshots every time you terminate your EB environment.

I hope this post provided enough information on the database configuration settings for Django projects run in AWS EB environments and made this topic at least a little bit clearer for you. As usual, you can post your questions or feedback in the comments section below. There are still some topics related to deploying and running Django projects on AWS EB which I’ll cover in my next post (e.g. static files, DNS and SSL configuration settings) – stay tuned.

Deploying Django Project to AWS Elastic Beanstalk, Part 2: Database Settings Configuration
