AWS EFS — The Container Friendly File System
In April 2015 AWS announced a new cloud service: Elastic File System — a Shared File Storage for Amazon EC2. Fast forward to June 28th 2016, more than a year later, and AWS announced that EFS is finally available and production-ready in 3 regions (us-east-1, us-west-2 and eu-west-1).
Let’s give it a try and run a Postgres container on EFS…
We’ll see that EFS is quite easy to get up and running, and that it works as advertised. A Postgres data directory is synchronized to every instance in the cluster. This unlocks a much needed challenge with containers. We can now run, kill and reschedule a Postgres container anywhere in the cluster and resume serving the persistent data.
We’ll also see that while possible, Postgres on EFS probably isn’t something we need or want to use outside of development or testing, but this experiment gives us confidence to add EFS to our infrastructure toolbox.
EFS and NFS Primer
EFS is an implementation of the Network File System (NFS) protocol.
NFS dates back to 1984, but EFS implements version 4.1 which was proposed as a standard in 2010.
With NFS, every computer uses an NFS client to connect to an NFS server, and synchronize file metadata and data over the network. The goal is to synchronize low level file modifications— locks and writes — across servers so that all clients have an eventually consistent filesystem to read from.
The NFS client/server protocol is designed to handle tough failures around network communication and mandatory and advisory file locking. Version 4.1 made big improvements around using multiple servers to separate the metadata paths from the data paths.
Because NFS looks like a standard filesystem it is trivial to use an EFS volume with Docker containers.
EFS is free to provision and offers 5 GB of storage for free in the AWS free tier. Usage beyond that costs $0.30 per GB per month. See the launch announcement for more details.
1990 called and wants its technology back
Update Convox To Use EFS
An EFS volume is relatively easy to provision with CloudFormation and to mount into an EC2 instance with UserData.
Here I am demonstrating the simple, open-source Convox project and tools to write set up a new environment (or to update an existing environment) to use EFS and inspect the resulting instances and containers. See this pull request for the specifics of the CloudFormation and UserData changes.
As long as we install Convox into one of the 3 regions where EFS is supported, Convox instances now automatically have EFS mounted in /volumes.
# install convox in us-east-1, us-west-2 or eu-west-1 $ convox install --region=us-west-2 # update to the EFS release $ convox rack update 20160629185452-efs # check out the new /volumes path $ convox instances ssh i-492ab0cf $ mount | grep /volumes us-east-1c.fs-f223e6bb.efs.us-east-1.amazonaws.com:/ on /volumes type nfs4 (rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.0.2.139,local_lock=none,addr=10.0.2.95) # modify something in the /volumes path $ sudo su $ echo hi > /volumes/hi # see the changes on other instances $ convox instances ssh i-cb979a57 cat /volumes/hi hi $ convox instances ssh i-f08a3660 cat /volumes/hi hi
Sure enough we have a filesystem shared between three instances!
Update an App Database To Use an EFS Volume
Let’s run a database on this filesystem.
Convox takes an application with a Dockerfile and docker-compose.yml file and automates building images and pushing them to the EC2 Container Registry (ECR) and deploying the images to AWS via the EC2 Container Service (ECS).
We can take the convox-examples/rails app and add a persistent volume to the Postgres container by adding 2 new lines to the docker-compose.yml file that say we want a data directory in production:
volumes: - /var/lib/postgresql/data
Now I can deploy the app:
$ cd rails $ cat docker-compose.yml web: build: . labels: - convox.port.443.protocol=tls - convox.port.443.proxy=true links: - database ports: - 80:4000 - 443:4001 database: image: convox/postgres ports: - 5432 volumes: - /var/lib/postgresql/data $ convox deploy Deploying rails ... RUNNING: docker pull convox/postgres RUNNING: docker tag rails/database 132866487567.dkr.ecr.us-east-1.amazonaws.com/convox-rails-eefmdtclkf:database.BHEUKPGVHOB RUNNING: docker push 132866487567.dkr.ecr.us-east-1.amazonaws.com/convox-rails-eefmdtclkf:database.BHEUKPGVHOB database.BHEUKPGVHOB: digest: sha256:be8596b239b2cf9c139b93013898507ba23173a5238df1926e0ab93e465b342c size: 11181 Promoting RJVOZPCGYHE... UPDATING
And watch the logs:
$ convox logs agent:0.69/i-e0117466 Starting database process 4c6cde98595c database/4c6cde98595c The files belonging to this database system will be owned by user "postgres". database/4c6cde98595c database/4c6cde98595c This user must also own the server process. database/4c6cde98595c database/4c6cde98595c The database cluster will be initialized with locale "en_US.utf8". database/4c6cde98595c The default database encoding has accordingly been set to "UTF8". database/4c6cde98595c The default text search configuration will be set to "english". database/4c6cde98595c database/4c6cde98595c Data page checksums are disabled. database/4c6cde98595c fixing permissions on existing directory /var/lib/postgresql/data ... ok database/4c6cde98595c creating subdirectories ... ok database/4c6cde98595c selecting default max_connections ... 100 database/4c6cde98595c selecting default shared_buffers ... 128MB database/4c6cde98595c selecting dynamic shared memory implementation ... posix database/4c6cde98595c creating configuration files ... ok database/4c6cde98595c creating template1 database in /var/lib/postgresql/data/base/1 ... ok database/4c6cde98595c initializing pg_authid ... ok database/4c6cde98595c initializing dependencies ... ok database/4c6cde98595c creating system views ... ok database/4c6cde98595c loading system objects' descriptions ... ok database/4c6cde98595c sh: locale: not found database/4c6cde98595c creating collations ... ok database/4c6cde98595c No usable system locales were found. database/4c6cde98595c Use the option "--debug" to see details. database/4c6cde98595c creating conversions ... ok database/4c6cde98595c creating dictionaries ... ok database/4c6cde98595c setting privileges on built-in objects ... ok database/4c6cde98595c creating information schema ... ok database/4c6cde98595c loading PL/pgSQL server-side language ... ok database/4c6cde98595c vacuuming database template1 ... ok database/4c6cde98595c copying template1 to template0 ... ok database/4c6cde98595c copying template1 to postgres ... ok database/4c6cde98595c syncing data to disk ... ok database/4c6cde98595c database/4c6cde98595c WARNING: enabling "trust" authentication for local connections database/4c6cde98595c You can change this by editing pg_hba.conf or using the option -A, or database/4c6cde98595c --auth-local and --auth-host, the next time you run initdb. database/4c6cde98595c database/4c6cde98595c database/4c6cde98595c Success. database/4c6cde98595c database/4c6cde98595c database/4c6cde98595c PostgreSQL stand-alone backend 9.4.6 database/4c6cde98595c backend> statement: CREATE DATABASE app; database/4c6cde98595c database/4c6cde98595c backend> database/4c6cde98595c database/4c6cde98595c PostgreSQL stand-alone backend 9.4.6 database/4c6cde98595c backend> statement: ALTER USER postgres WITH SUPERUSER PASSWORD 'password'; database/4c6cde98595c database/4c6cde98595c backend> database/4c6cde98595c LOG: database system was shut down at 2016-07-05 21:53:36 UTC database/4c6cde98595c LOG: MultiXact member wraparound protections are now enabled database/4c6cde98595c LOG: database system is ready to accept connections database/4c6cde98595c LOG: autovacuum launcher startedTest Persistence
Postgres boots up though this looks just like how it worked before with ephemeral Docker volumes.
But… Pick a different instance and look again at the shared filesystem. It also sees the Postgres data directory!
$ convox instances ssh i-f08a3660 sudo ls /volumes/var/lib/postgresql/data base global pg_clog pg_dynshmem pg_hba.conf pg_ident.conf pg_logical pg_multixact pg_notify pg_replslot pg_serial pg_snapshots pg_stat pg_stat_tmp pg_subtrans pg_tblspc pg_twophase PG_VERSION pg_xlog postgresql.auto.conf postgresql.conf postmaster.opts postmaster.pid
The Postgres container is only accessible from inside the VPC so lets proxy in, create a table and insert a record:
$ convox apps info Name rails Status running Release RJVOZPCGYHE Processes database web Endpoints internal-rails-database-T5LLUKS-i-1848066794.us-east-1.elb.amazonaws.com:5432 (database) rails-web-WQDNFES-853899250.us-east-1.elb.amazonaws.com:80 (web) rails-web-WQDNFES-853899250.us-east-1.elb.amazonaws.com:443 (web) $ convox proxy internal-rails-database-T5LLUKS-i-1848066794.us-east-1.elb.amazonaws.com:5432 proxying 127.0.0.1:5432 to internal-rails-database-T5LLUKS-i-1848066794.us-east-1.elb.amazonaws.com:5432 $ psql -h 127.0.0.1 -U postgres -d app Password for user postgres: psql (9.4.5, server 9.4.6) app=# CREATE TABLE users ( app(# name varchar(40) app(# ); CREATE TABLE app=# INSERT INTO users VALUES('foo'); INSERT 0 1
Now for the moment of truth, let’s kill the Postgres container while watching the logs:
$ convox ps stop 4c6cde98595c Stopping 4c6cde98595c... OK $ convox logs agent:0.69/i-e0117466 Stopped database process 4c6cde98595c via SIGKILL database/4c6cde98595c LOG: received smart shutdown request database/4c6cde98595c LOG: autovacuum launcher shutting down database/4c6cde98595c LOG: incomplete startup packet database/4c6cde98595c LOG: incomplete startup packet database/4c6cde98595c LOG: incomplete startup packet database/4c6cde98595c LOG: incomplete startup packet database/4c6cde98595c LOG: incomplete startup packet database/4c6cde98595c LOG: incomplete startup packet database/4c6cde98595c LOG: incomplete startup packet database/4c6cde98595c LOG: incomplete startup packet database/4c6cde98595c LOG: incomplete startup packet database/4c6cde98595c LOG: incomplete startup packet database/4c6cde98595c LOG: incomplete startup packet database/4c6cde98595c LOG: incomplete startup packet agent:0.69/i-e0117466 Stopped database process 4c6cde98595c via SIGKILL agent:0.69/i-e0117466 Dead database process 4c6cde98595c agent:0.69/i-e0117466 Stopped database process 4c6cde98595c via SIGTERM agent:0.69/i-492ab0cf Starting database process 533a014918b5 database/533a014918b5 LOG: incomplete startup packet database/533a014918b5 LOG: incomplete startup packet database/533a014918b5 LOG: database system was not properly shut down; automatic recovery in progress database/533a014918b5 LOG: record with zero length at 0/16CB998 database/533a014918b5 LOG: redo is not required database/533a014918b5 LOG: MultiXact member wraparound protections are now enabled database/533a014918b5 LOG: database system is ready to accept connections database/533a014918b5 LOG: autovacuum launcher started
The old container stopped and the new container started on a different instance and it sees and recovers the data!
$ psql -h 127.0.0.1 -U postgres -d app Password for user postgres: psql (9.4.5, server 9.4.6) app=# SELECT * FROM users; name ------ foo (1 row)
Provision an EFS volume with a bit of CloudFormation
Mount EFS to every instance via UserData and standard Linux NFS utilities
Mount sub-directories of the EFS volume into Docker containers (ECS services) via volume mounts
Persist data across container stops and re-starts, independent of instances
All of this is working with a couple of hours of integrating EFS into an existing AWS / CloudFormation / VPC / ECS setup. Thanks Convox!.
I did notice some side-effects…
Rolling deploys can result in two processes trying to lock the database
Deleting an app leaves data around
And I didn’t push towards extreme usage scenarios…
Always Bet On AWS — A new standard has been set in cloud storage. Not only does EFS work, it promises all the things we want from the cloud. It’s cheap at $0.30/GB-month, you don’t provision storage up front, you pay for what you use, and it scales to petabytes.
No doubt there will be questions about the NFS protocol and horror stories of the dark days of EBS that cast a shadow over EFS. But you’d be silly to think AWS won’t operate this with the extremely high quality-of-service that is the reason they are taking over all of the world’s computing.
They will keep the system running, recover as fast as possible when it does have problems, and will answer support tickets when it’s not working as advertised.
Amazon continues the trend of turning cloud computing into utility services that get better and cheaper over time all while taking on massive infrastructure complexity so we don’t have to.
Filesystems Are Back — The experiment with Postgres gives me confidence that EFS this will fit into my architecture in some places. Now one of the tenets of 12 Factor is up for review:
Execute the app as one or more stateless processes
Twelve-factor processes are stateless and share-nothing. Any data that needs to persist must be stored in a stateful backing service, typically a database.
Shared state and file persistence is back!
All of a sudden Wordpress got a lot easier to run on AWS. What other software and use cases are possible again? How about:
Docker image caching
User file uploads and processing
Integration Over Invention — The 12 Factor tenet came from avoiding the hard problems of stateful containers.
I have been working on container runtimes for more than 6 years between Heroku and Convox and have explicitly avoided putting engineering resources towards this problem until now.
We have been tempted with solutions like S3FS, GlusterFS, or Flocker. We have built contraptions around EBS volumes, snapshots and rsync. But these systems always bring on a lot of additional complexity which means more operational risk and more engineering time.
Tremendous engineering has gone into these systems and people have been extremely successful with the various solutions. But most of us have correctly put our own energy into architecting our applications to not need to solve tough problems with state, delegating everything to a Postgres service and S3.
Finally the tides have turned. We can integrate a service with a few lines of CloudFormation and build around shared state vs. invent, install, debug and maintain complex distributed systems software.
Specialized Utility Services Still Win —Even if EFS works for a Postgres container an RDS Postgres database starts at $12/mo. This includes more assurances about catastrophic failure like Postgres data replication and backups that would be be risky to ignore when running on any storage service.
So I still see no reason to take on the operational properties of a containerized data volume for a database outside of development or test purposes.
Likewise S3 isn’t going anywhere. It’s hard to beat the simplicity and maturity of a simple blob storage service in our application architecture.
What do you think?
Is EFS a new tool in your infrastructure toolbox?
What new or old use cases does this open up for our apps in the cloud?
What use cases will you still avoid using EFS for at all costs?
What becomes easier, cheaper and and more reliable now that AWS has taken on this tough challenge for us?