Functionality
Harpoon comes with a fair amount of functionality.
Failed build intervention
Harpoon can commit the image from a failed build and run /bin/bash against it. This behaviour is known as intervention.
Intervention images are cleaned up once you exit them. Intervention is disabled if harpoon is run with either --non-interactive or --no-intervention.
The yaml configuration
Harpoon reads everything from a yaml configuration. By default this is a harpoon.yml file in the current directory, but may be changed with the --harpoon-config option or HARPOON_CONFIG environment variable.
It will also read from ~/.harpoonrc.yml, and anything in that file is overridden by the configuration file you’ve specified.
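The override order described above can be pictured as a recursive dictionary merge. The sketch below is only an illustration of that idea, not Harpoon's actual merging code; `merge_options` and both example dictionaries are hypothetical.

```python
def merge_options(base, override):
    """Return a new dict where values in `override` win over values in `base`."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections instead of replacing them wholesale
            merged[key] = merge_options(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical contents of the home-directory file and the project file
home_config = {
    "image_index": "registry.example.com/",
    "context": {"use_gitignore": True},
}
project_config = {
    "image_index": "gcr.io/somewhere/",
    "images": {"myapp": {"commands": ["FROM ubuntu"]}},
}

config = merge_options(home_config, project_config)
```

Here `config["image_index"]` comes from the project file, while the `context` section from the home-directory file survives because the project file does not mention it.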
This yaml file looks like the following:
---
images:
  <image_name>:
    <image_options>
So when harpoon reads this yaml, it gets a dictionary of image names to image options under the images key.
An example may look like the following:
---
images:
  myapp:
    commands:
      - FROM ubuntu
      - RUN apt-get update && apt-get -y install caca-utils
      - CMD cacafire
And then we can do things like:
# Run the default command in the image
$ harpoon run myapp
# Make the image and start an interactive bash shell in it
$ harpoon ssh myapp
And harpoon will make sure things are cleaned up and no longer on your system when you quit the process.
The only required option for an image is commands, which is a list of instructions like those you would write in a Dockerfile.
Cache from
You can specify which images are used for the --cache-from option when building a docker image.
Note
docker will not pull down these images for you. If you specify an image as cache_from, it will only be used if it has already been pulled down.
This option can be a boolean, where true means the image’s own name is used as the cache, or it can be a string or a list of strings naming other images.
For example:
---
image_index: gcr.io/somewhere/
image_name_prefix: example
images:
  blah:
    cache_from: true
    commands:
      [...]
  meh:
    cache_from: "{images.stuff}"
    commands:
      [...]
  stuff:
    cache_from: true
    commands:
      - [FROM, "{images.meh}"]
      [...]
  other:
    cache_from: gcr.io/mygreatcompany/some_image
    commands:
      [...]
In the example above, building the blah or stuff images will use any existing gcr.io/somewhere/example-blah or gcr.io/somewhere/example-stuff images respectively. Building the meh image will use any existing gcr.io/somewhere/example-stuff image for cache. And building the other image will use gcr.io/mygreatcompany/some_image for cache.
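The three accepted shapes of cache_from (boolean, string, list of strings) can be normalised into a single list of image names. This is a minimal sketch of that normalisation, not Harpoon's own code; `resolve_cache_from` is a hypothetical helper and it assumes any "{images.x}" references have already been formatted into image names.

```python
def resolve_cache_from(cache_from, own_image_name):
    """Normalise a cache_from value into a list of image names."""
    if cache_from is True:
        # true means "use this image's own name as the cache"
        return [own_image_name]
    if not cache_from:
        # false or unset: no cache images
        return []
    if isinstance(cache_from, str):
        return [cache_from]
    return list(cache_from)


blah_cache = resolve_cache_from(True, "gcr.io/somewhere/example-blah")
other_cache = resolve_cache_from(
    "gcr.io/mygreatcompany/some_image", "gcr.io/somewhere/example-other"
)
```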
We recommend doing something like the following:
$ harpoon pull_all_external --ignore-missing
$ harpoon pull my_image --tag latest --ignore-missing
$ harpoon push my_image --tag $(git rev-parse HEAD)
$ harpoon tag my_image latest --tag $(git rev-parse HEAD)
$ harpoon untag my_image --tag $(git rev-parse HEAD)
This makes sure that when we build the image we are using the latest parent images, and that the latest tag of the image is pulled down if it exists. It then creates and pushes a new image with the latest git revision as the tag, and pushes up a latest tag based off the image we just made. Finally, we clean up the tag we created and pushed up so that it doesn’t hang around taking up space.
Controlling the context
Docker uses a client-server architecture, where the server is essentially a web server that speaks HTTP.
When you build an image with a docker client (for example the official docker cli tool), the client must first send a context to the server. This context is then used to locate files that are added to the image via ADD commands.
Harpoon has options available for specifying what goes into the context uploaded to the docker server. For now, it’s a little limited, but it’s certainly better than no control.
These options may be specified either at the root of the configuration or within the options for the image itself. Any option in the image options overrides the root option.
- use_gitignore
Ignore anything gitignore would when creating the context.
- exclude
A list of globs that are used to exclude files from the context.
Note: Only works when use_gitignore has been specified.
- enabled
Don’t include any context from the local system if this is set to false.
- parent_dir
The parent directory to get the context from. This defaults to the folder the harpoon.yml was found in.
For example, let’s say you have the following file structure:
project/
  app/
  ui-stuff/
  large_folder/
  docker/
    harpoon.yml
If for some reason large_folder is committed into git but contains a lot of large assets that don’t need to be in the docker image, then the harpoon.yml may look something like:
---
context:
  use_gitignore: true
folders:
  - project_dir: "{config_root}/.."
images:
  myapp:
    context:
      parent_dir: "{folders.project_dir}"
      exclude:
        - large_folder/**
        - docker/**
    commands:
      - FROM ubuntu
      - ADD app /project/app
      - ADD ui-stuff /project/ui-stuff
      - RUN setup_commands
This also means it’s very easy to have multiple Dockerfiles adding content from the same folder.
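To illustrate what the exclude globs do to the context, here is a rough sketch that filters a list of relative paths with Python's fnmatch module. Harpoon's real matching rules (and how they interact with use_gitignore) may differ; `filter_context` and the file names are hypothetical.

```python
from fnmatch import fnmatchcase


def filter_context(paths, exclude):
    """Keep only the paths that match none of the exclude globs."""
    return [
        path
        for path in paths
        if not any(fnmatchcase(path, pattern) for pattern in exclude)
    ]


# Hypothetical files under the project directory
paths = [
    "app/main.py",
    "ui-stuff/index.html",
    "large_folder/assets/video.bin",
    "docker/harpoon.yml",
]
kept = filter_context(paths, ["large_folder/**", "docker/**"])
```

With the globs from the example configuration, only the files under app and ui-stuff remain in `kept`.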
Inter-Document linking
Many option values in the harpoon.yml file are formatted, so that you can reference a value from elsewhere in the document.
For example, let’s say you want to link one image into another:
---
images:
  db:
    commands:
      - <commands here>
  app:
    links:
      - ["{images.db}", "dbhost"]
    commands:
      - <commands here>
The formatting works such that looking for “{name}” will look for name in the options. In this case it looks for options["images"]["db"]["container_name"].
Note that images have some generated values:
- image_name
The name of the image that is created. This is produced by concatenating the image_index and image_name_prefix options it finds with the name of the image.
So for:
---
image_index: some-registry.somewhere.com/user/
image_name_prefix: my-project
images:
  blah:
    [..]
images.blah.image_name will be “some-registry.somewhere.com/user/my-project-blah”
- container_name
This is a concatenation of the image_name and a uuid1 identifier.
This means if we fail to clean up, future invocations won’t complain about conflicting container names.
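The two generated values can be approximated as follows. This is a sketch based only on the example above; the "-" separators are inferred from that example, and both function names are hypothetical.

```python
import uuid


def make_image_name(name, image_index="", image_name_prefix=""):
    """Join image_index, image_name_prefix and the image's own name."""
    prefix = f"{image_name_prefix}-" if image_name_prefix else ""
    return f"{image_index}{prefix}{name}"


def make_container_name(image_name):
    """Append a uuid1 so that leftover containers never clash by name."""
    return f"{image_name}-{uuid.uuid1()}"


blah_image = make_image_name(
    "blah",
    image_index="some-registry.somewhere.com/user/",
    image_name_prefix="my-project",
)
blah_container = make_container_name(blah_image)
```

`blah_image` matches the name given above, and every call to `make_container_name` yields a different suffix.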
Environment variables
There are two special format options for environment variables. One for when the variable should be resolved in the container and one for when it should be resolved by harpoon.
The one resolved by harpoon is “:from_env”. This will complain if the variable you want is not in the environment given to harpoon. A good use case for this is modifying the image_index based on an environment variable:
---
images:
  blah:
    image_index: "{IMAGE_INDEX:from_env}"
    commands:
      ...
This will complain if there is no IMAGE_INDEX environment variable. If the variable does exist, the placeholder is replaced with its value.
Then there is “:env” that you can use to transform something into a bash variable. So “{BLAH:env}” transforms into “${BLAH}”.
For example:
---
images:
  blah:
    commands:
      ...
    tasks:
      something:
        options:
          bash: "echo {THINGS:env} > /tmp"
          env:
            - THINGS
Then this will run the container with the docker-cli equivalent of “--env THINGS” and run the command “/bin/bash -c ‘echo ${THINGS} > /tmp’”.
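The difference between the two format options can be sketched with a small substitution function. This is an illustration only, not Harpoon's formatter; `format_env_placeholders` is a hypothetical name and it handles nothing but the ":env" and ":from_env" forms.

```python
import os
import re

PLACEHOLDER = re.compile(r"\{(\w+):(env|from_env)\}")


def format_env_placeholders(value, environ=None):
    """Expand {NAME:env} and {NAME:from_env} placeholders in a string."""
    environ = os.environ if environ is None else environ

    def replace(match):
        name, kind = match.group(1), match.group(2)
        if kind == "env":
            # Left as a bash variable, resolved later inside the container
            return "${" + name + "}"
        if name not in environ:
            # from_env is resolved immediately and must be present
            raise KeyError(f"{name} is not set in the environment")
        return environ[name]

    return PLACEHOLDER.sub(replace, value)


command = format_env_placeholders("echo {THINGS:env} > /tmp", environ={})
index = format_env_placeholders(
    "{IMAGE_INDEX:from_env}", environ={"IMAGE_INDEX": "gcr.io/somewhere/"}
)
```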
You can also specify environment variables via the --env switch.
Also, you can specify “env”, “images.<image>.env” or “images.<image>.tasks.<task>.env” as a list of environment variables you want in your image.
The syntax for the variables is:
- VARIABLE
Will complain if this variable isn’t in your current environment and will expose this environment variable to the container
- VARIABLE=VALUE
Will set this variable to VALUE regardless of whether it’s in your environment or not
- VARIABLE:DEFAULT
Will set this variable to DEFAULT if it’s not in your current environment, otherwise it will use the value in your environment
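The three forms above can be read as a small resolution rule. The sketch below is one possible interpretation, not Harpoon's parser; `resolve_env_spec` is a hypothetical helper, and treating whichever of "=" or ":" comes first as the separator is an assumption.

```python
def resolve_env_spec(spec, environ):
    """Return (name, value) for one env entry, given the current environment."""
    eq, colon = spec.find("="), spec.find(":")
    if eq != -1 and (colon == -1 or eq < colon):
        # VARIABLE=VALUE: always use the given value
        name, value = spec.split("=", 1)
        return name, value
    if colon != -1:
        # VARIABLE:DEFAULT: prefer the environment, fall back to the default
        name, default = spec.split(":", 1)
        return name, environ.get(name, default)
    if spec not in environ:
        # VARIABLE: must exist in the current environment
        raise KeyError(f"{spec} is not set in the environment")
    return spec, environ[spec]


current = {"THINGS": "from-host"}
passed_through = resolve_env_spec("THINGS", current)
forced = resolve_env_spec("THINGS=fixed", current)
defaulted = resolve_env_spec("OTHER:fallback", current)
```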
Dockerfile commands
When you specify your image, you provide the commands that go into the Dockerfile as a list of instructions:
---
images:
  myimage:
    commands:
      - <instruction>
      - <instruction>
      - <instruction>
Where instruction may be:
- <string>
A string is just added into the Dockerfile as is.
- [<string>, <string>]
The first string is used as is, the second string is formatted and the two results are joined together to form the command.
So let’s say you have:
---
image_name_prefix: amazing-project
images:
  base:
    commands:
      <commands here>
  app:
    commands:
      - [FROM, "{images.base}"]
Then the first instruction for the app Dockerfile will be a FROM command that uses the base image.
- [<string>, <string>, <string>]
You can use three strings to specify a FROM {image} as something. This is so you can make a staged build from an image you’ve defined in your configuration.
For example:
---
image_name_prefix: amazing-project
images:
  base:
    commands:
      <commands here>
  app:
    commands:
      - [FROM, "{images.base}", "as base"]
      - RUN some_command.sh
      - FROM centos:7
      - COPY --from=base /somefile /destination
      - RUN cat /destination
- [<string>, [<string>, <string>, …]]
A list of a string and a list will use the first string as the command unmodified. It will then format each string in the list and use each one as a separate value.
So let’s say you have:
---
image_name_prefix: amazing-project
passwords:
  db: sup3rs3cr3t
images:
  app:
    commands:
      - FROM ubuntu
      - [ENV, ["DBPASSWORD {passwords.db}", "random_variable 3"]]
Then the resulting Dockerfile for the app image will look like:
FROM ubuntu
ENV DBPASSWORD sup3rs3cr3t
ENV random_variable 3
- [<string>, <dictionary>]
This has special meaning depending on the first string.
[ADD, {content:<content>, dest:<dest>}]
This will add a file to the context with the content specified and make sure that gets to the destination specified.
So say you have:
---
images:
  app:
    commands:
      - FROM ubuntu
      - - ADD
        - dest: /tmp/blah
          content: |
            blah and stuff
This will add a file to the context named with some uuid value, for example “DDC895F6-6F65-43C1-BDAA-00C4B3F9BB7B”, and then the Dockerfile will look like:
FROM ubuntu
ADD DDC895F6-6F65-43C1-BDAA-00C4B3F9BB7B /tmp/blah
[ADD, {content: {image: <image>, path: <path>}, dest: <dest>}]
This will add the files found in <image> at <path> to <dest>. It uses a tar file to add these files to the context.
For example:
---
images:
  one:
    commands:
      - FROM busybox
      - RUN mkdir /tmp/blah
      - RUN echo 'lol' > /tmp/blah/one
      - RUN echo 'hehehe' > /tmp/blah/two
      - RUN mkdir /tmp/blah/another
      - RUN echo 'hahahha' > /tmp/blah/another/three
      - RUN echo 'hello' > /tmp/other
  two:
    commands:
      - FROM busybox
      - - ADD
        - dest: /tmp/copied
          content:
            image: "{images.one}"
            path: /tmp/blah
      - - ADD
        - dest: /tmp/copied/other
          content:
            image: "{images.one}"
            path: /tmp/other
      - CMD find /tmp/copied -type f -exec echo {} \; -exec cat {} \;
    tasks:
      cat:
        description: Cat out the copied file from the one image!
Using this definition, we can now run harpoon cat and it will print out the files we stole from the one image!
[ADD, {context:<context>, dest:<dest>}]
This works like content, except that context takes the same options as the context options on the image and creates a tar archive that is untarred at the destination.
[ADD, {prefix: <prefix>, get:[<string>, <string>]}]
This is a shortcut for adding many files with the same destination prefix.
For example:
---
images:
  app:
    commands:
      - FROM ubuntu
      - - ADD
        - prefix: /app
          get:
            - app
            - lib
            - spec
Which translates to:
FROM ubuntu
ADD app /app/app
ADD lib /app/lib
ADD spec /app/spec
[COPY, {"from": <image>, "path": <string>, "to": <string>}]
This allows us to pull from an image. <image> may be a string naming some external image, or it may be a formatted string referring to an image you’ve defined in this configuration. path is the path in the image you want to copy from, and to is the path you want to copy to.
For example:
---
images:
  one:
    commands:
      - FROM centos:7
      - RUN echo 'blah' > /tmp/blah
  two:
    commands:
      - FROM centos:7
      - - COPY
        - from: "{images.one}"
          path: /tmp/blah
          to: /tmp/copied
      - RUN cat /tmp/copied
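As a rough summary of the simpler instruction forms above, the sketch below turns one entry of commands into Dockerfile lines. It is not Harpoon's implementation: `expand_instruction` and `fmt` are hypothetical, `fmt` stands in for Harpoon's formatter with hard-coded values, and the dictionary forms that need a context (content, context, COPY) are left out.

```python
def expand_instruction(instruction, fmt=lambda text: text):
    """Turn one entry of `commands` into a list of Dockerfile lines."""
    if isinstance(instruction, str):
        return [instruction]
    command, *rest = instruction
    if len(rest) == 1 and isinstance(rest[0], str):
        return [f"{command} {fmt(rest[0])}"]
    if len(rest) == 2 and all(isinstance(part, str) for part in rest):
        # e.g. [FROM, "{images.base}", "as base"]
        return [f"{command} {fmt(rest[0])} {rest[1]}"]
    if len(rest) == 1 and isinstance(rest[0], list):
        # One Dockerfile line per formatted value
        return [f"{command} {fmt(value)}" for value in rest[0]]
    if len(rest) == 1 and isinstance(rest[0], dict) and "prefix" in rest[0]:
        options = rest[0]
        return [f"{command} {name} {options['prefix']}/{name}" for name in options["get"]]
    raise ValueError(f"Unsupported instruction: {instruction!r}")


# Assumed formatted values, standing in for a real lookup in the configuration
VALUES = {"{images.base}": "amazing-project-base", "{passwords.db}": "sup3rs3cr3t"}


def fmt(text):
    for placeholder, value in VALUES.items():
        text = text.replace(placeholder, value)
    return text


env_lines = expand_instruction(["ENV", ["DBPASSWORD {passwords.db}", "random_variable 3"]], fmt)
add_lines = expand_instruction(["ADD", {"prefix": "/app", "get": ["app", "lib", "spec"]}])
```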
Staged builds
Docker lets you specify staged builds, where you create an image with a name and then copy contents from that image into a new image below it. Harpoon lets you use these commands normally, but if you want to use images from your configuration you may format them into the commands.
For example:
---
image_name_prefix: amazing-project
images:
  base:
    commands:
      <commands here>
  app:
    commands:
      - [FROM, "{images.base}", "as base"]
      - RUN some_command.sh
      - FROM centos:7
      - COPY --from=base /somefile /destination
      - RUN cat /destination
Or if you don’t need to run extra commands in your stage:
---
image_name_prefix: amazing-project
images:
  base:
    commands:
      <commands here>
  app:
    commands:
      - FROM centos:7
      - - COPY
        - from: "{images.base}"
          path: /somefile
          to: /destination
      - RUN cat /destination
Dependant containers
When you reference an image_name created by the harpoon config, then harpoon will ensure that image is created before it’s used.
Also, if you specify a container_name created by the harpoon config, harpoon will ensure that container is running before it is used.
For example, say you have this folder structure:
project/
app/
app/
db/
lib/
spec/
config/
Gemfile
Gemfile.lock
Rakefile
docker/
harpoon.yml
Then your harpoon.yml may look like:
---
folders:
  api_dir: "{config_dir}/.."
images:
  bundled:
    context:
      parent_dir: "{folders.api_dir}"
    commands:
      - FROM some_image_with_ruby_installed
      - RUN apt-get -y install libmysqlclient-dev ruby-dev
      - RUN mkdir /api
      - ADD Gemfile /api/
      - ADD Gemfile.lock /api/
      - WORKDIR /api
      - RUN bundle config --delete path && bundle config --delete without && bundle install
  mysql:
    context:
      parent_dir: "{folders.api_dir}"
    commands:
      - [FROM, "{images.bundled}"]
      - VOLUME shared
      <install mysql>
      ## Expose the database
      - EXPOSE 3306
      - [ADD, {prefix: "/app", get: ["db", "lib", "config", "app", "Rakefile"]}]
      ## Run the migrations
      - RUN (mysqld &) && rake db:create db:migrate
      - CMD cp /app/db/schema.rb /shared && mysqld
  unit_tests:
    context:
      parent_dir: "{folders.api_dir}"
    links:
      - ["{images.mysql}", "dbhost"]
    volumes:
      share_with:
        - "{images.mysql}"
    commands:
      - [FROM, "{images.bundled}"]
      - ADD . /app/
      - CMD cp /shared/schema.rb /app/db && rake
Harpoon will ensure that the bundled image is created before both the mysql and unit_tests images are created, and that when we run the unit_tests container it first creates the mysql container.
Harpoon will also ensure all these containers are cleaned up afterwards. Images stay around because we want to use the awesome caching powers of Docker.
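The ordering described here is a topological sort over the references between images. The sketch below shows the idea with Python's standard graphlib module (Python 3.9+). It is not how Harpoon is necessarily implemented, and the `depends_on` mapping is written out by hand from the example configuration.

```python
from graphlib import TopologicalSorter

# image -> images that must be built (or running) before it
depends_on = {
    "bundled": set(),
    "mysql": {"bundled"},                 # FROM {images.bundled}
    "unit_tests": {"bundled", "mysql"},   # FROM {images.bundled}, links to mysql
}

# static_order() yields every image after all of its dependencies
build_order = list(TopologicalSorter(depends_on).static_order())
```

For this configuration the only valid order is bundled, then mysql, then unit_tests.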
Linking containers and volumes
You have the following options available:
- links
A list of strings representing the container name to link into the container, or a list of lists of strings of [container_name, link_name], where container_name may be of the form {images.<image_name>} (i.e. a reference to an image specified in the configuration).
Harpoon will spawn docker networks such that each container has its own network with the specified linked containers in it.
These networks are cleaned up when all the containers specified in them have been stopped.
For example:
---
images:
  db:
    commands:
      ...
  app:
    links:
      - ["{images.db}", "dbhost"]
    commands:
      ...
Will make sure that when you start the app container, it will run the db image in a detached state and there will be an entry in the /etc/hosts of the app container that points dbhost to this db container.
- volumes.share_with
This behaves like links in that you specify strings similar to what you would do for the docker cli (https://docs.docker.com/userguide/dockervolumes/#creating-and-mounting-a-data-volume-container)
So something like:
---
images:
  db:
    commands:
      - FROM ubuntu
      - VOLUME /shared
  app:
    volumes:
      share_with:
        - "{images.db}"
    commands:
      ...
Then the app container will share the volumes from the db container.
- volumes.mount
This is also specified as a string similar to what you do for the docker cli (https://docs.docker.com/userguide/dockervolumes/#data-volumes)
For example:
---
folders:
  app_dir: "{config_root}/../app"
images:
  app:
    volumes:
      mount:
        - "{folders.app_dir}/coverage:/project/app/coverage:rw"
Will mount the coverage directory from the host into /project/app/coverage in the container.
Sometimes you need a dependency container to not run in a detached state. To run a dependency in an attached container, you may specify dependency_options:
---
images:
  runner:
    commands:
      ...
      - CMD activator run
  uitest:
    links:
      - ["{images.runner}", "running"]
    dependency_options:
      runner:
        # Typesafe activator run stops in a detached container
        attached: True
    commands:
      ...
      - CMD ./do_a_uitest.sh running:9000
Waiting for dependency containers
Harpoon will let you specify wait_condition options to say what conditions must be satisfied before a container is considered ready to be used as a dependency.
For example:
---
images:
  first:
    commands:
      - FROM ubuntu:14.04
      - CMD sleep 4 && touch /tmp/wait
    wait_condition:
      file_exists:
        - /tmp/wait
  second:
    links:
      - "{images.first}"
    commands:
      - FROM ubuntu:14.04
      - CMD date
When we do something like harpoon run second, it will create images for both of them and then create a container for the first image. It waits for the condition to be met (in this case, for /tmp/wait to exist in the container) and then starts the second container and links it with the first.
There are several different conditions you may specify:
- greps
A dictionary of <file to grep>: <string to grep for>
- command
A list of commands that must be met
- port_open
A list of ports that must be waiting for traffic (tested with nc -z 127.0.0.1 <port>)
- file_value
A dictionary of <file>: <expected content>
- curl_result
A dictionary of <url>: <expected result>
- file_exists
A list of files to look for
You also have these two options:
- timeout
Fail waiting for the container after this amount of time
- wait_between_attempts
Wait at least this long between attempting to resolve all the conditions
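The interplay of the conditions with timeout and wait_between_attempts amounts to a polling loop. The sketch below shows that loop in isolation; it is not Harpoon's code, `wait_for` is a hypothetical helper, and each condition is modelled as a plain callable that returns True once it is satisfied.

```python
import time


def wait_for(conditions, timeout, wait_between_attempts):
    """Poll until every condition returns True, or raise after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        if all(condition() for condition in conditions):
            return
        if time.monotonic() >= deadline:
            raise TimeoutError("conditions were not met in time")
        time.sleep(wait_between_attempts)


# A stand-in condition that becomes true on the third attempt
attempts = []


def file_exists():
    attempts.append(1)
    return len(attempts) >= 3


wait_for([file_exists], timeout=5, wait_between_attempts=0.01)
```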
You may also specify wait_condition for dependencies on the container that uses those dependencies:
---
images:
  first:
    commands:
      - FROM ubuntu:14.04
      - CMD sleep 4 && touch /tmp/wait
  second:
    dependency_options:
      first:
        wait_condition:
          file_exists:
            - /tmp/wait
    links:
      - "{images.first}"
    commands:
      - FROM ubuntu:14.04
      - CMD date
Wait conditions specified this way will override any wait_condition set by the dependency itself.
Port bound detection
One of the more annoying errors happens when a container wants to bind to a port that is already in use: harpoon would just complain that the container exited with a nonzero exit code before it even started.
Since version 0.5.8.2, Harpoon will try to work out if the required ports are already bound and complain if they are:
---
images:
  my_image:
    context: false
    commands:
      - FROM ubuntu:14.04
      - CMD python3 -m http.server 9000
    tasks:
      runner:
        description: Run our python server in the docker container
        options:
          ports:
            - "9000:9000"
$ python3 -m http.server 9000 &
$ harpoon runner
11:06:37 INFO harpoon.executor Connected to docker daemon driver=aufs kernel=4.1.17-boot2docker
11:06:37 INFO option_merge.collector Adding configuration from /Users/stephen.moore/.harpoonrc.yml
11:06:37 INFO option_merge.collector Adding configuration from /Users/stephen.moore/deleteme/harpoon.yml
11:06:37 INFO harpoon.collector Converting harpoon
11:06:37 INFO harpoon.collector Converting images.my_image
11:06:37 INFO harpoon.ship.builder Making image for 'my_image' (my_image) - FROM ubuntu:14.04
11:06:37 INFO harpoon.ship.builders.mixin Building 'my_image' in '/Users/stephen.moore/deleteme' with 10.2 kB of context
Step 1 : FROM ubuntu:14.04
---> 06ab2de020f4
Step 2 : CMD python3 -m http.server 9000
---> Running in 32200c32359a
---> 15052fde2407
Removing intermediate container 32200c32359a
Successfully built 15052fde2407
11:06:38 INFO harpoon.ship.runner Creating container from my_image image=my_image container_name=my_image-4826b066-1582-11e6-a2d8-20c9d088bcc7 tty=True
11:06:38 INFO harpoon.ship.runner Using ports ports=[9000]
11:06:38 INFO harpoon.ship.runner Port bindings: [9000]
11:06:38 INFO harpoon.executor Connected to docker daemon driver=aufs kernel=4.1.17-boot2docker
11:06:38 INFO harpoon.ship.runner Removing container my_image-4826b066-1582-11e6-a2d8-20c9d088bcc7:ec5867550aeff206fd4d64258053e123fe092f96148725634e66f977a6513609
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Something went wrong! -- AlreadyBoundPorts
"Ports are already bound by something else" ports=[9000]
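One simple way to detect an already bound port is to try binding it yourself. The sketch below shows that approach as an illustration only; how Harpoon actually performs the check is not stated here, and `already_bound` is a hypothetical helper.

```python
import socket


def already_bound(ports, host="127.0.0.1"):
    """Return the ports that cannot be bound on `host` right now."""
    bound = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                # Usually "address already in use"; a permission error on a
                # privileged port is reported the same way in this sketch
                bound.append(port)
    return bound


# Occupy an arbitrary free port, then check it
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
taken_port = listener.getsockname()[1]

conflicts = already_bound([taken_port])
listener.close()
```

While the listener is open, `conflicts` contains `taken_port`, which mirrors the AlreadyBoundPorts error in the output above.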
Authentication
Harpoon supports authentication for registries via plain credentials, KMS encrypted credentials or via a “slip” in an S3 bucket.
Note
kms and slip authentication require you to install boto3 in your environment. Since version 0.14.4, this is not installed by default.
It also supports the Google Container Registry if you have run gcloud auth configure-docker (https://cloud.google.com/sdk/gcloud/reference/auth/configure-docker)
authentication:
  registry.my-amazing-company.com.au:
    reading:
      use: plain
      username: bob
      password: super_s3cr3t
    writing:
      use: kms
      role: arn:aws:iam::1234544232:role/kms-reader
      region: ap-southeast-2
      username: bob
      password: CiB1pqppldpSEDooCLKBYvCRHy/qWPs9+yJ0eUJ0MKRHsxKLAQEBAgB4daaqaZXaUhA6KAiygWLwkR8v6lj7PfsidHlCdDCkR7MAAABiMGAGCSqGSIb3DQEHBqBTMFECAQAwTAYJKoZIhvcNAQcBMB4GCWCGSAFlAwQBLjARBAzo+RPkrpz3+4riJkQCARCAH7NXjqqu0OSmYtiNXK7SrUw3mzWa8NYy5KfC4RKGFTQ=
  registry.my-other-amazing-company.com.au:
    reading:
      use: s3_slip
      role: arn:aws:iam::124879330703:role/s3_reader
      location: s3://my-amazing-slips/the-slip.txt
Plain authentication is what it says: the credentials are plain text and used as is. KMS encrypted means that the password is a base64 encoded encrypted string that is decrypted with KMS after assuming the specified role.
S3 slips are a special construct where there is a file in S3 containing a string of “username:password”. Harpoon will assume the specified role, use that to get the slip and extract the username and password from it.
S3 slips are nice in that they can be rotated and the client doesn’t need to know that it’s been rotated (so long as it gets the new creds each time it interacts with the registry).
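Once the slip has been fetched, extracting the credentials is a matter of splitting the string. The sketch below covers only that last step and leaves out assuming the role and reading from S3; `parse_slip` is a hypothetical helper, and splitting on the first colon (so that passwords may contain colons) is an assumption.

```python
def parse_slip(contents):
    """Split the contents of a slip into (username, password)."""
    username, separator, password = contents.strip().partition(":")
    if not separator:
        raise ValueError("a slip must look like 'username:password'")
    return username, password


credentials = parse_slip("bob:super_s3cr3t\n")
```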
Container Manager
See Container Manager.