Managing the cookbooks
The cookbooks that we will be using are all available at the following URL: https://github.com/johnewart/simplewebpy_app. Because a number of cookbooks used in this example are under active development, the ones required for the examples have been frozen (as of the writing of this book) to ensure compatibility with the examples; in this case, it is better to have them slightly out of date than broken.
However, when you write your own cookbooks and deploy your own software beyond this example, you will find that there are a large number of cookbooks that can be found through the Chef community site—http://supermarket.getchef.com/—or by searching the Web for cookbooks; many of these will be hosted on GitHub, BitBucket, or similar source code-hosting sites.
Downloading cookbooks
Here, we will simply download the cookbook collection as a whole from the following URL:
http://github.com/johnewart/chef_essential_files
To install the collection, we can do the following from the chef_essential_files/cookbooks directory:
knife cookbook install -o . *
This will install all of the cookbooks that are provided, which are all that is required for the examples in this chapter to be successful. Let's take a look at our custom cookbook, the pythonwebapp cookbook, as all of the others are off-the-shelf cookbooks that are designed to provide general supporting functionality.
Looking at the database recipe
We will do a few things here, so let's look at our database recipe. In order for our web application to be useful, it needs a database to connect to. Typically, this involves installing the database server software, constructing a database, and granting access to that database to a specified user (or users). Our application is no different, so we will leverage the database cookbook in order to accomplish this.
First, in our recipe, we need to include the PostgreSQL-specific resources from the database cookbook, which we will do using the following code:
include_recipe "database::postgresql"
You will need to know which database you will be creating, which user you will be granting access to, and the password that will be used to identify that user:
    dbname = node[:webapp][:dbname]
    dbuser = node[:webapp][:dbuser]
    dbpass = node[:webapp][:dbpass]
In order to create a database and user as well as grant access, you will need to establish a connection to the database server with a user that has permission to do so. You will see that this user has also been granted access in your role's pg_hba settings so that PostgreSQL knows to allow the postgres user to connect to the database locally, as shown in the following code:
    postgresql_connection_info = {
      :host => 'localhost',
      :port => node['postgresql']['config']['port'],
      :username => 'postgres',
      :password => node['postgresql']['password']['postgres']
    }
Using this connection information, you can construct a database and a user (if they don't already exist), and then grant that user full access to our new database:
    # Construct an actual database on the server
    postgresql_database 'webapp' do
      connection postgresql_connection_info
      action :create
    end

    # Create a database user resource using our connection
    postgresql_database_user dbuser do
      connection postgresql_connection_info
      password dbpass
      action :create
    end

    # Grant all privileges on all tables in 'webapp'
    postgresql_database_user dbuser do
      connection postgresql_connection_info
      database_name dbname
      privileges [:all]
      action :grant
    end
This high-level language allows us to easily manipulate the database without the need to know any database-specific SQL or commands. If you want to convert your application to use MySQL, for example, provisioning a new MySQL database would largely be as easy as converting the word postgresql to mysql in our recipe, and the database-specific adapter in the database cookbook will be responsible for the implementation details.
Looking at your application deployment cookbook
Once our database has been provisioned, we can look at how to install our web application. The pythonwebapp::webapp recipe contains all the information needed to do this. The way that you define a recipe for deploying an application will vary wildly among applications, as each application is unique. However, this particular example was designed to be reasonably representative of most web applications and should give you a good starting point to understand the basics of deploying a web application with Chef.
Modern web applications typically follow the same pattern: provision a user, install an interpreter or other engine (such as Python, Ruby, or Java), create directories if needed, check out the source code, run any data migrations (if needed) to update your database, and then make sure that your service is up and running; this one is no different. The more complicated your application, the more infrastructure you may need to model, such as job queue engines, asynchronous workers, and other libraries.
If you look at the web application cookbook located at cookbooks/pythonwebapp, you will see that it contains two recipes, a template, and a pip requirements definition. The recipes deploy the web application itself and manage the creation of the PostgreSQL database and user on the database host.
Most of the interesting work is in the application recipe, cookbooks/pythonwebapp/recipes/webapp.rb, so let's start by taking a look at that. All applications are going to have slightly different deployment logic, but modern web applications usually follow a pattern that looks like the following:
- Install any system-wide packages required
- Construct the directories needed for the software
- If this is Python or Ruby, possibly install a new virtualenv tool or RVM gemset
- Install the libraries needed to run the application
- Check out the application's source code
- Build and configure the application as needed
- Create or update the database schema
- Stop the web application services
- Start the web server or process manager that monitors the application
This example application is no different, so let's look at the steps needed to deploy this web.py application. First, declare any application configuration data needed with the following code:
    app_root = node[:webapp][:install_path]
    python_interpreter = node[:webapp][:python]
    virtualenv = "#{app_root}/python"
    virtual_python = "#{virtualenv}/bin/python"
    src_dir = "#{app_root}/src/"

    # Grab the first database host
    dbhost = search(:node, "role:postgresql_server")[0]['ipaddress']

    environment_hash = {
      "HOME" => "#{app_root}",
      "PYTHONPATH" => "#{app_root}/src"
    }
In this snippet, we use node attributes to tell our recipe where to install the application; in this case, the default is /opt/webapp, but this can be overridden for flexibility. Additionally, we set the path to the Python interpreter we want to use for our Python virtualenv. However, you can just as easily specify a Ruby or Java path if your application used one of those languages. We also define the path to the source code and a database host address. The database host address is determined by searching the Chef data for all nodes with the postgresql_server role, taking the first one, and using its IP. This allows us to replace the database server and not have to worry about updating our configuration data, which we'll see in a bit.
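One thing worth keeping in mind with this lookup: if the search finds no nodes, indexing the empty result with [0] yields nil, and calling ['ipaddress'] on nil raises an error. The following plain Ruby sketch shows one way to guard against that. The helper name first_db_host, the fallback address, and the sample node data are all hypothetical and only stand in for what a Chef search would return:

```ruby
# Hypothetical helper: return the IP address of the first node found by a
# search, or a fallback value when the search result is empty.
def first_db_host(nodes, fallback = "127.0.0.1")
  first = nodes.first
  first ? first["ipaddress"] : fallback
end

# Stand-in for the array of node objects that a Chef search would return.
found = [
  { "name" => "db1", "ipaddress" => "10.0.0.5" },
  { "name" => "db2", "ipaddress" => "10.0.0.6" }
]

puts first_db_host(found)  # the first match is used
puts first_db_host([])     # nothing found, so the fallback is used
```

Whether a fallback is appropriate, or whether the run should fail loudly instead, depends on how your environment is laid out.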
Preparing the directories
In order to deploy our application, and for it to run, we need to have a location to put our data. In this application, we have defined a need for a configuration directory, a log directory, and a place to store the source code. In our recipe, we will create these directories and assign ownership to our deployment user and group. Note that you do not need to create the application root directory if it already exists, and you do not need to set special ownership or permissions on the root directory. Because we are leveraging the recursive property of the directory resource, the root application directory will be implicitly created; however, we are constructing it here for the sake of completeness.
It is critical that our directories have the correct ownership and permissions; without this, the application will be unable to interact with those directories to store log data or read and write any configuration data. The following code constructs these directories for us and sets the ownership and permissions:
    directory "#{app_root}" do
      owner node[:webapp][:user]
      group node[:webapp][:group]
      mode "0755"
      action :create
      recursive true
    end

    # Create directories
    ["src", "logs", "conf"].each do |child_dir|
      directory "#{app_root}/#{child_dir}" do
        owner node[:webapp][:user]
        group node[:webapp][:group]
        mode "0755"
        action :create
        recursive true
      end
    end
One thing to note here is that we are using a loop to construct our directories. This is a good way to manage multiple resources of the same type that have the same set of configuration parameters. Here we are saying that we have three subdirectories, src, logs, and conf, and that we want to construct a directory resource inside our application's root directory for each of them with proper ownership and permissions. The recursive flag is similar to the -p option on mkdir, which tells it to create any directories that are missing in between the root and the directory being created.
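To see the mkdir -p behavior in isolation, outside of Chef, here is a small plain Ruby sketch that uses the standard library's FileUtils.mkdir_p inside a temporary directory. The directory layout mirrors the recipe, but the temporary location is only an assumption made so that the example is safe to run anywhere:

```ruby
require "fileutils"
require "tmpdir"

created = []

Dir.mktmpdir do |tmp|
  # The application root does not exist yet; mkdir_p creates it on demand.
  app_root = File.join(tmp, "opt", "webapp")

  %w[src logs conf].each do |child_dir|
    path = File.join(app_root, child_dir)
    # Like `mkdir -p`: creates any missing parent directories as well.
    FileUtils.mkdir_p(path, mode: 0o755)
    created << child_dir if File.directory?(path)
  end
end

puts created.inspect
```

Unlike the directory resource, this sketch does not manage ownership; it only illustrates how the missing intermediate directories are filled in.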
Constructing your Python virtual environment
This may be new to non-Python developers but should be fairly straightforward. A virtual environment operates in a similar way to RVM or rbenv for Ruby, or a self-contained JAR file for Java, in that it isolates the Python interpreter and installed libraries to a specific location on the system. In our case, we will use the following code to achieve this:
    python_virtualenv "#{virtualenv}" do
      owner node[:webapp][:user]
      group node[:webapp][:group]
      action :create
      interpreter "#{python_interpreter}"
    end
This python_virtualenv resource comes from the python cookbook and will construct a virtual environment in the location named by the resource (in our case, the directory stored in virtualenv, which, as we saw previously, is defined as a python directory inside our application root) using the specified interpreter and ownership properties.
A virtual environment will be created, which contains a minimal installation of the Python interpreter as well as any Python libraries that are installed into the virtual environment. Think of it as your application's own installation of Python that is unaffected by, and subsequently does not affect, any other Python installation on the system.
This is a very useful technique to install and manage Python applications, and the same concept can be extended to Rails applications using a similar technology from the Ruby world such as RVM or rbenv, as mentioned earlier.
Checking out the source code
One interesting thing in this recipe, which has been included for future reference, is the usage of a cookbook, ssh_known_hosts, that grabs a host's SSH key and adds it to the system's list of known SSH keys. This is extremely useful to deploy software via GitHub or BitBucket, where you are using SSH to pull down the source code, especially as their host keys might change:
    # Need to install SSH key for github.com
    ssh_known_hosts_entry 'github.com'
Note that this is also somewhat insecure, as you are blindly accepting the host's fingerprint. If you are concerned about security, you can provide the known fingerprints manually using the :key attribute. Supplying the fingerprint is done through the following code:
    ssh_known_hosts_entry 'github.com' do
      key 'github.com ssh-rsa AAAAB3NzaC1yc....'
    end
If there are a large number of host fingerprints that you need to manage, or if they change frequently, you can use a data bag to store them. If you are interested, look at the README for the ssh_known_hosts cookbook for more examples.
Once the SSH keys are registered, you can now clone the source from a git+ssh URL such as GitHub's authenticated SSH endpoint.
In this example, we are using a publicly available HTTPS source code repository; if you were to replace this with your own SSH-enabled repository, you would need to change the repository attribute and also make sure to store your deployment key on the endhost:
    # Clone our application source
    git "#{src_dir}" do
      repository "https://github.com/johnewart/simplewebpy_app.git"
      action :sync
      branch 'master'
      user node[:webapp][:user]
    end
By using the git resource, the repository will be cloned into the designated source directory on the endhost. Here, we will also be pulling data from the master branch and performing this action as our webapp user.
Installing any extra dependencies
There are two ways to model dependencies for your application: in your cookbook and recipe, or through an external mechanism such as Bundler, pip, or another dependency resolution and download tool, depending on the language of your choice. As with everything, there are both inherent drawbacks and benefits to each of these methods.
By modeling your dependencies in Chef, you have a consistent model that you can look to in a centralized location. This means that if your application needs a new Ruby gem or Python library, someone must update a cookbook or Chef configuration with that information in order for the deployment to be successful. This can limit your ability to continuously deploy based solely on the contents of a source code repository. In effect, this requires you to model the following in Chef:
- Dependent libraries
- Library versions
- Possibly, the dependencies of any declared dependencies (which can spiral quickly)
However, modeling it this way does ensure that Chef has an accurate picture of all the information associated with your application. This solution does offer some other benefits:
- Dependencies are precisely modeled in Chef and can be queried by other tools
- Any system-specific packages that are needed for your interpreted libraries are going to be modeled by Chef anyway, so it's all in one place (examples can include native XML or database libraries)
- Developers can't arbitrarily change dependencies and accidentally break deployments because the underlying libraries have not been installed in production
Let's look at some things to think about when using tools external to Chef for this task.
Using an external tool such as Bundler or pip has some advantages, including flexibility and ease of use by developers who may not be involved in infrastructure configuration. It also introduces the possibility of misconfigured dependencies and underlying libraries. The primary advantage of this mechanism is that it provides a simpler dependency management model for developers: simply add a requirement to the Gemfile, requirements.txt, or other metadata file, and Chef will automatically install it during the next run. On the other hand, you now have to look in two different places to determine what is being installed on endhosts. This also means that you are configuring dependencies in multiple places, increasing the possibility of making a wrong configuration change in one of them.
It's important to take away that there is not always only one tool for the job, and depending on how your organization or team operates, you may choose to mix and match how you model the application-level dependencies. For the sake of demonstrating them both to you, the application cookbook models the dependencies in the recipe as well as through a requirements.txt file using pip. Additionally, you may find that initially your team uses one way and then moves to another as your requirements stop changing so frequently, or you are able to combine them to your advantage.
Using Python's requirements file
Our webapp cookbook has a custom pip_requirements definition that provides an easy way to install any requirements stored inside a requirements.txt file into a specified virtual environment using the copy of pip provided by that virtual environment. In the following code, you will see how we can achieve this:
    pip_requirements "webapp" do
      action :run
      pip "#{virtualenv}/bin/pip"
      user node[:webapp][:user]
      group node[:webapp][:group]
      requirements_file "#{src_dir}/requirements.txt"
    end
In this example, we are telling pip to run as our application's user and group and to install the dependencies in our requirements.txt into the virtual environment specified by virtualenv. Again, a similar resource can be created (if one does not already exist) to execute Bundler for Ruby, CPAN for Perl, or PEAR to manage PHP dependencies.
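The internals of the pip_requirements definition are not shown here, but a definition like this presumably boils down to running the virtual environment's pip with install -r against the requirements file. The plain Ruby sketch below only assembles such a command line so you can see its shape; the paths assume the default /opt/webapp install location, and nothing is actually executed:

```ruby
# Assumed values; in the recipe these are derived from node attributes.
virtualenv = "/opt/webapp/python"
src_dir    = "/opt/webapp/src/"

# File.join avoids doubled slashes when a path already ends in "/".
pip_binary        = File.join(virtualenv, "bin", "pip")
requirements_file = File.join(src_dir, "requirements.txt")

# The command that a requirements-style install typically amounts to.
command = [pip_binary, "install", "-r", requirements_file]

puts command.join(" ")
```

Because the command uses the pip binary inside the virtual environment, any libraries it installs land in that environment rather than in the system-wide Python installation.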
Configuring your application
Now that you have prepared your system for your application, you need to configure it. In order for the application to talk to our database, you must provide the required database connection information that we have stored in Chef. Here, we will use a template that is stored in templates/default/config.py.erb, and inject it with our database configuration. The resource for this looks like the following code:
    template "#{src_dir}/config.py" do
      source "config.py.erb"
      owner node[:webapp][:user]
      group node[:webapp][:group]
      mode "0600"
      variables({
        :dbname => node[:webapp][:dbname],
        :dbuser => node[:webapp][:dbuser],
        :dbpass => node[:webapp][:dbpass],
        :dbhost => dbhost
      })
    end
Here, we load our database information into our template, store the result in our application's install directory (where we checked out the source, for simplicity), and set some sane file permissions. Were this a Rails application, we could use a similar template to generate database.yml and a matching settings.yml; for a Dropwizard application, it could be a service.yml file; for PHP, an INI file; or any other type of configuration data that is needed. In our case, we are simply populating the following Python code so that we have a database connection object:
    import web

    db_params = {
        'dbn': 'postgres',
        'db': '<%= @dbname %>',
        'user': '<%= @dbuser %>',
        'pw': '<%= @dbpass %>',
        'host': '<%= @dbhost %>'
    }

    DB = web.database(**db_params)
    cache = False
The previous example uses the web.py database module to construct a new database connection using the hash, which can then be imported and used in the other portions of the application. Again, this is a good starting example for our web.py application that can be used as a model for whatever framework or application server you are using in your systems.
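If the way template variables turn into @-prefixed names is unfamiliar, the following plain Ruby sketch imitates it with the standard ERB library. The TemplateContext class is a hypothetical stand-in for the context Chef provides during rendering, and the values passed in are placeholders rather than real credentials:

```ruby
require "erb"

# Hypothetical stand-in for a template rendering context: each entry in
# `variables` becomes an instance variable that the ERB source can read.
class TemplateContext
  def initialize(variables)
    variables.each { |name, value| instance_variable_set("@#{name}", value) }
  end

  def render(source)
    ERB.new(source).result(binding)
  end
end

# A one-line excerpt in the style of config.py.erb.
source = "db_params = { 'db': '<%= @dbname %>', 'user': '<%= @dbuser %>', 'host': '<%= @dbhost %>' }"

rendered = TemplateContext.new(
  dbname: "webapp",
  dbuser: "webapp_user",
  dbhost: "10.0.0.5"
).render(source)

puts rendered
```

Each ERB tag is replaced by the value of the matching instance variable, which is how the settings stored in Chef end up as literal strings in the generated Python file.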
Keeping your application running
All applications need to be started and kept running in some manner. If you are using Rails with mod_passenger, then the Apache daemon will be the primary entry point for your application, and this software will need to be installed and configured. In our example, we will be using the supervisord service from http://supervisord.org, which is written in Python and serves as a very configurable, lightweight, and reliable process manager. You can configure an entry in the supervisord system configuration using a supervisor_service resource that is provided by the supervisor cookbook installed earlier:
    supervisor_service "webapp" do
      action [:enable, :restart]
      autostart true
      user node[:webapp][:user]
      command "#{virtual_python} #{src_dir}/server.py #{node[:webapp][:port]}"
      stdout_logfile "#{app_root}/logs/webapp-stdout.log"
      stdout_logfile_backups 5
      stdout_logfile_maxbytes "50MB"
      stdout_capture_maxbytes "0"
      stderr_logfile "#{app_root}/logs/webapp-stderr.log"
      stderr_logfile_backups 5
      stderr_logfile_maxbytes "50MB"
      stderr_capture_maxbytes "0"
      directory src_dir
      environment environment_hash
    end
The previous example will generate a configuration file for supervisord with the settings specified in our resource block. Unless you change the location, the configuration files will be located in /etc/supervisor.d/[service_name].conf. In our case, the service is named webapp, and its configuration file will be /etc/supervisor.d/webapp.conf.
Here, we are telling supervisord that we want to enable our service and then restart it (which will start it if it's not currently running), where we want to log the process's output, how we want to rotate those log files, where to start our process, what environment variables to use, and most importantly what command to execute.
Now that we've looked at our recipes, let's go ahead and set up our roles, provision some systems, and deploy our application!