Cheap WordPress Update – Thoughts: Dev Environments With ZFS and Containers

Development environments are used to develop and test changes before pushing code further through the release pipeline. Giving each developer their own environment allows changes to be made within the codebase without affecting other developers or environments.
Traditionally, these environments lived on the developer’s own machine, with every library and program needed to run the code installed locally. We’ll examine development environments from the perspective of a web developer, so the required programs include a web server and a database server.
The Problem
There are two main issues with local development environments: isolation and resources.
Isolation
Before virtual machines (VMs) were common, many developers had a single web server and database server installed on their computer. Multiple sites (and multiple copies of the same site) were separated at the directory level with each copy of the site taking up the same amount of space as its parent.
Having all sites served by the same web and database servers presents a few problems:

Server level configuration is applied to every site

Software versions are limited at server level

Each site could potentially access other sites’ data

Configuration becomes cluttered over time

Once virtualisation became available on desktop hardware, software such as VirtualBox made it possible to create VMs, giving the developer an entire server for each website or each client. The development environment could now be tailored to match production and be completely isolated from other development environments.
Resources
Before VMs, each site took up only as much disk space as its code and database. As only one web server and one database server ran at a time, only a single instance of each consumed memory and CPU.
A VM is an entire computer, from the virtual hardware assigned to it, through the Linux operating system, to applications such as the web server. This means that each VM needs more resources to run than a single web server with multiple copies of the code. Each VM is allocated its own memory, CPU and disks, taking up more resources on the developer’s local machine.
Solving the problem
To solve the problems of isolation and resources, we’ll need to use different technologies to address the individual concerns of each.
To address the issues of isolation and indirectly address resource usage, container technology can be used. Amazon Web Services (AWS) describe containers as being “a method of operating system virtualization that allow you to run an application and its dependencies in resource-isolated processes”. Using containers grants several benefits:

Isolation – each container will be isolated from other containers

Better resource use – unlike VMs, we don’t need to assign memory and disk space for the operating system

Faster – even with VM snapshots, creating new containers is very quick

Even though containers can reduce the disk space usage of development environments compared to VMs, it’s possible to further save space by using a filesystem that supports snapshotting and copy-on-write (COW) resource management. Copy-on-write allows data to be shared until a change is made to that data, at which point, a copy is made. This allows multiple copies of the same code & database to exist without taking up the expected amount of disk space, which is beneficial if a developer wants to use a container to work on several features on the same site at the same time.
Tech used
To demonstrate container technology, Docker will be used. Although there are other container engines available for use, Docker is the currently ubiquitous choice.
ZFS will be used for snapshotting and for copy-on-write.
Although ZFS is available for macOS (https://openzfsonosx.org/wiki/Downloads), at the time of writing there is no stable release for 10.12, so Ubuntu Xenial (16.04) will be used as the OS instead.
ZFS
ZFS is a filesystem originally designed by Sun Microsystems in the early 2000s to solve several problems that filesystems of the time had. It was designed to scale easily, to provide native snapshots for easy backup and restoration of data, and to detect corruption via checksumming and, if necessary, self-heal. By controlling both of the traditionally separate facets of data management, the physical management (such as hard disks) and the file management (a filesystem such as NTFS), ZFS has knowledge of and complete control over everything that makes use of it.
ZFS filesystems are built on top of virtual storage pools called zpools. A zpool is made up of virtual devices that are made up of block devices, for example a disk.
ZFS snapshots
When ZFS writes new data, the blocks containing the old data can be retained, allowing a snapshot version of the file system to be maintained. ZFS snapshots are created very quickly, since all the data composing the snapshot is already stored. They are also space efficient, since any unchanged data is shared among the file system and its snapshots.
Creating filesystems from snapshots
Writeable snapshots (“clones”) can also be created, resulting in two independent file systems that share a set of blocks. As changes are made to any of the clone file systems, new data blocks are created to reflect those changes, but any unchanged blocks continue to be shared, no matter how many clones exist. This is an implementation of the copy-on-write principle and is what we’ll be taking advantage of to create new environments.
ZFS example
Installation
Install ZFS

apt install zfsutils-linux
Before creating a ZFS filesystem, we need to create a zpool for it:

zpool create blogpost /dev/sdb
/dev/sdb is a 2GB disk attached to the machine. The zpool will take up the entire available space of the disk.
zpool list shows:

NAME       SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH
blogpost  1.98G    64K  1.98G         -     0%     0%  1.00x  ONLINE
And mount shows:

/blogpost on /blogpost type zfs (rw,relatime,xattr,noacl)
You can change the mountpoint at creation time by passing -m and a path.
To create the filesystem:

zfs create blogpost/master
zfs list will show the filesystem we created and the root filesystem:

NAME              USED  AVAIL  REFER  MOUNTPOINT
blogpost          250K  1.92G    19K  /blogpost
blogpost/master    19K  1.92G    19K  /blogpost/master
Snapshots
To demonstrate the copy-on-write/snapshot features, create some test data:

dd if=/dev/zero of=/blogpost/master/file-1mb.txt count=1024 bs=1024
dd if=/dev/zero of=/blogpost/master/file-2mb.txt count=2048 bs=1024
And now create the snapshot:

zfs snapshot blogpost/master@testsnapshot
zfs list -t snapshot will show snapshots:

NAME                          USED  AVAIL  REFER  MOUNTPOINT
blogpost/master@testsnapshot     0      -  3.02M  -
Clone the snapshot by specifying the snapshot and the destination filesystem. In this case, we’re taking the snapshot of master called ‘testsnapshot’ and creating a filesystem called development from it:

zfs clone blogpost/master@testsnapshot blogpost/development
Listing the files in /blogpost/development will show the files:

drwxr-xr-x 2 root root       4 Jan 10 17:51 ./
drwxr-xr-x 4 root root       4 Jan 10 17:57 ../
-rw-r--r-- 1 root root 1048576 Jan 10 17:51 file-1mb.txt
-rw-r--r-- 1 root root 2097152 Jan 10 17:51 file-2mb.txt
Even though you can see the files and their sizes, only 1k of space is taken up instead of 3MB. The 3MB of data is being referenced rather than existing.
zfs list:

NAME                   USED  AVAIL  REFER  MOUNTPOINT
blogpost/development     1K  1.92G  3.02M  /blogpost/development
blogpost/master       3.02M  1.92G  3.02M  /blogpost/master
After making a change to the 1mb file in /blogpost/development, zfs list now looks like:

NAME                   USED  AVAIL  REFER  MOUNTPOINT
blogpost/development  1.14M  1.92G  3.15M  /blogpost/development
blogpost/master       3.02M  1.92G  3.02M  /blogpost/master
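The USED and REFER columns can be turned into a rough "how much is still shared" figure. The following is a minimal sketch rather than part of the original walkthrough: the function name shared_report is hypothetical, the sample row is made-up data approximating the listing above, and the calculation (REFER minus USED) is only an approximation because USED also counts a dataset's own snapshots and children.

```shell
#!/bin/sh
# Reads tab-separated "name used refer" rows with exact byte counts, as
# printed by:  zfs list -H -p -o name,used,refer
# and prints an estimate of how many bytes each dataset still shares.
# Assumption: shared ~= REFER - USED (ignores child datasets and snapshots).
shared_report() {
    awk -F '\t' '{
        shared = $3 - $2
        if (shared < 0) shared = 0
        # %.0f instead of %d so large byte counts are not truncated by mawk
        printf "%s shared=%.0f unique=%.0f\n", $1, shared, $2
    }'
}

# Made-up sample row (roughly 1.14M used, 3.15M referenced) so the sketch
# runs without a pool. On a real system:
#   zfs list -H -p -o name,used,refer | shared_report
printf 'blogpost/development\t1195376\t3303014\n' | shared_report
```

For the sample row this reports about 2MB as still shared, which matches the untouched 2MB file in the clone.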
Docker
Docker Example
Installation
(Steps below are for Ubuntu 16.04. See https://docs.docker.com/engine/installation/ for other operating systems)

sudo apt install apt-transport-https ca-certificates
sudo apt-key adv --keyserver hkp://ha.pool.sks-keyservers.net:80 --recv-keys 58118E89F3A912897C070ADBF76221572C52609D
echo "deb https://apt.dockerproject.org/repo ubuntu-xenial main" | sudo tee /etc/apt/sources.list.d/docker.list
sudo apt update
sudo apt-cache policy docker-engine
sudo apt install linux-image-extra-$(uname -r) linux-image-extra-virtual
sudo apt install docker-engine
Usage

docker run -d -p 80:80 tutum/hello-world
The command above automatically downloads an image called “hello-world” from the Tutum repository on Docker Hub. Once downloaded, the container is started in detached mode (it runs in the background) with port 80 published so that we can connect to it. Once it is running, the page can be opened in the browser.

Now that the image is downloaded (and already built as it’s stored in the Docker Hub), starting another container is as simple as changing the port and running the command again:

docker run -d -p 8080:80 tutum/hello-world
This will instantly start another container. We can use ‘docker ps’ to see the running containers:
 

CONTAINER ID        IMAGE               COMMAND                  STATUS              PORTS
21def91c5b50        tutum/hello-world   "/bin/sh -c 'php-fpm "   Up 28 seconds       0.0.0.0:8080->80/tcp
fe4b2176f151        tutum/hello-world   "/bin/sh -c 'php-fpm "   Up 39 minutes       0.0.0.0:80->80/tcp
So, using two docker commands, we have two separate containers running Nginx and PHP, each using only the resources of its own processes rather than those of an entire operating system, as a VM would:

CONTAINER           CPU %               MEM USAGE / LIMIT
21def91c5b50        0.02%               3.977 MiB / 488.4 MiB
fe4b2176f151        0.01%               4.516 MiB / 488.4 MiB
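Since each extra container only needs a different host port, starting several can be wrapped in a small loop. This is a sketch under assumptions: the function name start_many and the RUN switch are hypothetical, and by default it only prints the docker run commands so that it can be tried without Docker installed.

```shell
#!/bin/sh
# Prints (or, with RUN=1, executes) one "docker run" per requested container,
# publishing container port 80 on consecutive host ports.
start_many() {
    first_port=$1   # host port for the first container
    count=$2        # number of containers to start
    i=0
    while [ "$i" -lt "$count" ]; do
        port=$((first_port + i))
        if [ "${RUN:-0}" = "1" ]; then
            docker run -d -p "${port}:80" tutum/hello-world
        else
            echo "docker run -d -p ${port}:80 tutum/hello-world"
        fi
        i=$((i + 1))
    done
}

# Dry run: show the commands for three containers on ports 8081 to 8083
start_many 8081 3
```

Setting RUN=1 in the environment before running the script executes the commands instead of printing them.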
Putting it all together
Using the above examples, it should be possible to see how containers and a copy-on-write filesystem can be used together in order to create new development environments quickly and cheaply.
The proof of concept techniques used above can be expanded into working with a minimal PHP developer’s setup – Nginx, PHP & MySQL. Docker-compose will be used in order to simplify the defining and running of multiple containers.
Note:
The PHP container uses a custom image which extends the official php-7 image and adds mysql pdo extensions. To create this image, create a file called Dockerfile containing:

FROM php:7-fpm

RUN docker-php-ext-install mysqli pdo pdo_mysql
Running the following will build the image.

docker build -t blogpost-web . 
Docker-compose reads a docker-compose.yml file to determine what to do. The file is written in YAML and describes the image, ports, volumes and links of each container. Create a file in /blogpost/master called docker-compose.yml and use the following:

web:
   image: nginx:latest
   ports:
       - "80:80"
   volumes:
       - ./code:/code
       - ./site.conf:/etc/nginx/conf.d/site.conf
   links:
       - php

php:
   image: blogpost-web
   volumes:
       - ./code:/code
   links:
       - db

db:
   image: mysql
   volumes:
       - ./database:/var/lib/mysql/
   ports:
       - "3306:3306"
   environment:
       MYSQL_ROOT_PASSWORD: master
       MYSQL_USER: master
       MYSQL_PASSWORD: master
       MYSQL_DATABASE: master
docker-compose up will download the images (if needed) and start the containers.
Test Data
In order to test the functionality, a small file is placed in the code directory that connects to the DB and prints out the contents of the messages table. An example is:

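<?php
// Connect to the MySQL container; the hostname "db" comes from the compose link
$db = new PDO('mysql:host=db;dbname=master;charset=utf8mb4', 'master', 'master');
echo("I am masterMessages:");
// Print every row of the messages table
foreach($db->query('SELECT * FROM messages') as $row) {
   echo $row['message'];
}
The disk space usage now looks like: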
 

NAME                       USED  AVAIL  REFER  MOUNTPOINT
blogpost/master/code        20K  1.72G    20K  /blogpost/master/code
blogpost/master/database   210M  1.72G   210M  /blogpost/master/database
So the database is currently taking up 210MB. Traditionally, if we wanted to duplicate the database for another branch/feature, we’d copy the directory, thereby increasing the disk space usage to 420MB.
To snapshot the master branch, use the following commands:

zfs snapshot blogpost/master/database@masterdatabase
zfs snapshot blogpost/master/code@mastercode
Then create the development filesystem and clone the master snapshots into it:

zfs create blogpost/development
zfs clone blogpost/master/code@mastercode blogpost/development/code
zfs clone blogpost/master/database@masterdatabase blogpost/development/database
zfs list now shows the master & development filesystems:

NAME                            USED  AVAIL  REFER
blogpost/master/code             20K  1.72G    20K
blogpost/master/database        210M  1.72G   210M
blogpost/development/code         1K  1.72G    20K
blogpost/development/database     1K  1.72G   210M
The code and database for the development branch are taking up only 2KB of disk space instead of 210MB! Quite a saving when used with large databases and codebases.
Running docker-compose up in the development directory brings up the three containers. Edit the test PHP file there and replace ‘master’ with ‘development’, so that we can prove which codebase is being used.
When visiting the master site, you should see the ‘master’ text, and when visiting the development site, you should see ‘development’ instead.

zfs list shows:
 

NAME                            USED  AVAIL  REFER
blogpost/master/code             29K  1.70G    20K
blogpost/master/database        210M  1.70G   210M
blogpost/development/code      9.50K  1.70G    20K
blogpost/development/database  12.6M  1.70G   210M
So although some extra space has been used, nearly 200MB is still saved even though an entirely new development environment now exists.
Get a shell on the development DB container by running: docker exec -it development_db_1 /bin/bash
And insert some data:

mysql -umaster -pmaster master -e 'insert into messages (message) values ("This is the development branch");'
Refreshing the page on your development web server should now show the new message, and zfs list shows:

NAME                            USED  AVAIL  REFER
blogpost/master/code             29K  1.70G    20K
blogpost/master/database        210M  1.70G   210M
blogpost/development/code      9.50K  1.70G    20K
blogpost/development/database  12.8M  1.70G   210M
So even with a change in the database, the majority of the data is referenced from the snapshotted data with only 12.8MB taking up space on the disk.
Conclusion
The examples above have shown that it is possible to create quick and resource-efficient development environments locally. We have shown how containers can be used to isolate environments, which provides a layer of security and cleanliness whilst still saving space and time compared to traditional VMs. We have also shown how a copy-on-write filesystem such as ZFS can be used to quickly and easily clone environments and reduce disk space usage, allowing a developer to have more environments than a traditional setup might allow.
The commands used can easily be scripted to make the process interactive and even faster.
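As a starting point for such a script, the snapshot and clone steps can be collected into one function. This is a minimal sketch under assumptions, not a finished tool: the names new_env, run, POOL and RUN are hypothetical, the pool layout (code and database filesystems under each environment) is taken from the examples above, and naming each snapshot after the new environment is a choice made here rather than something from the walkthrough. By default it only prints the zfs commands.

```shell
#!/bin/sh
# Sketch: derive a new environment from an existing one using ZFS clones.
# Prints the commands by default; set RUN=1 to execute them (needs root and ZFS).
POOL=${POOL:-blogpost}

# Either execute a command or just print it, depending on RUN
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

new_env() {
    src=$1   # existing environment, e.g. master
    dst=$2   # environment to create, e.g. development
    run zfs create "$POOL/$dst"
    for part in code database; do
        # Assumption: the snapshot is named after the new environment
        run zfs snapshot "$POOL/$src/$part@$dst"
        run zfs clone "$POOL/$src/$part@$dst" "$POOL/$dst/$part"
    done
}

# Dry run: show what would be executed for a hypothetical feature branch
new_env master feature-x
```

Running docker-compose up in the new environment's directory would still be a separate step, as in the walkthrough.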

