LXD external provider for GARM
The LXD external provider allows GARM to create runners using LXD containers and virtual machines. This code was separated from GARM itself so that its development and licensing can be handled separately from GARM.
Build
Clone the repo:
git clone https://github.com/cloudbase/garm-provider-lxd
Build the binary:
cd garm-provider-lxd
go build .
Copy the binary to the same system where garm is running, and point to it in the config.
Configure
The config file for this external provider is a simple TOML file used to configure the credentials needed to connect to your LXD server, along with some additional information about your environment.
A sample config file can be found in the testdata folder.
LXD remotes
By default, this provider does not load any image remotes. You get to choose which remotes you add (if any). An image remote is a repository of images that LXD uses to create new instances, either virtual machines or containers. In the absence of any remote, the provider will attempt to find the image you configure for a pool of runners on the LXD server it is connecting to. If the image is present, it will be used. Otherwise, instance creation will fail and you will need to configure a remote.
The sample config file in this repository has the usual default LXD remotes:
- https://cloud-images.ubuntu.com/releases (ubuntu) - Official Ubuntu images
- https://cloud-images.ubuntu.com/daily (ubuntu_daily) - Official Ubuntu images, daily build
- https://images.linuxcontainers.org (images) - Community maintained images for various operating systems
When creating a new pool, you'll be able to specify which image you want to use. The images are referenced by remote_name:image_tag. For example, if you want to launch a runner on Ubuntu 20.04, the image name would be ubuntu:20.04. For a daily image it would be ubuntu_daily:20.04. And for one of the unofficial images it would be images:centos/8-Stream/cloud. Note that for unofficial images you need to use the tags that have /cloud in the name. These images come pre-installed with cloud-init, which we need to set up the runners automatically.
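The remote_name:image_tag convention described above can be illustrated with a small Go sketch. The helper names splitImageRef and hasCloudVariant are hypothetical and are not part of the provider. Treating a reference without a colon as an image with no remote is an assumption made only for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// splitImageRef splits a "remote_name:image_tag" reference into its parts.
// Assumption for this sketch: a reference without a colon has no remote
// and is returned with an empty remote name.
func splitImageRef(ref string) (remote, tag string) {
	if i := strings.Index(ref, ":"); i >= 0 {
		return ref[:i], ref[i+1:]
	}
	return "", ref
}

// hasCloudVariant reports whether the tag contains "/cloud", which is the
// marker the text above asks for on images from the unofficial remote.
func hasCloudVariant(tag string) bool {
	return strings.Contains(tag, "/cloud")
}

func main() {
	refs := []string{"ubuntu:20.04", "ubuntu_daily:20.04", "images:centos/8-Stream/cloud"}
	for _, ref := range refs {
		remote, tag := splitImageRef(ref)
		fmt.Printf("remote=%q tag=%q cloud=%t\n", remote, tag, hasCloudVariant(tag))
	}
}
```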
You can also create your own image remote, where you can host your own custom images. If you want to build your own images, have a look at distrobuilder.
Image remotes in the provider config are a map of strings to remote settings. The name of the remote is the last part of the section header. For example, the section [image_remotes.ubuntu_daily] defines the image remote named ubuntu_daily. Use this name to reference images inside that remote.
You can also use locally uploaded images. Check out the performance considerations page for details on how to customize local images and use them with GARM.
LXD Security considerations
GARM does not apply ACLs of any kind to the instances it creates. That task remains the responsibility of the user. Here is a guide for creating ACLs in LXD. You can of course use iptables or nftables to create any rules you wish. I recommend you create a separate isolated LXD bridge for runners, and secure it using ACLs/iptables/nftables.
You must make sure that the code that runs as part of the workflows is trusted. If that cannot be done, you must make sure that any malicious code that is pulled in by the actions and run as part of a workload is as contained as possible. There is a nice article about securing your workflow runs here.
Tweaking the provider
GARM supports sending opaque JSON-encoded configs to the IaaS providers it hooks into. This allows the providers to implement some very provider-specific functionality that doesn't necessarily translate well to other providers. Features that may exist on Azure may not exist on AWS or OpenStack, and vice versa.
To this end, this provider supports the following extra specs schema:
{
"$schema": "http://cloudbase.it/garm-provider-lxd/schemas/extra_specs#",
"type": "object",
"description": "Schema defining supported extra specs for the Garm LXD Provider",
"properties": {
"extra_packages": {
"type": "array",
"description": "A list of packages that cloud-init should install on the instance.",
"items": {
"type": "string"
}
},
"disable_updates": {
"type": "boolean",
"description": "Whether to disable updates when cloud-init comes online."
},
"enable_boot_debug": {
"type": "boolean",
"description": "Allows providers to set the -x flag in the runner install script."
},
"runner_install_template": {
"type": "string",
"description": "This option can be used to override the default runner install template. If used, the caller is responsible for the correctness of the template as well as the suitability of the template for the target OS. Use the extra_context extra spec if your template has variables in it that need to be expanded."
},
"extra_context": {
"type": "object",
"description": "Extra context that will be passed to the runner_install_template.",
"additionalProperties": {
"type": "string"
}
},
"pre_install_scripts": {
"type": "object",
"description": "A map of pre-install scripts that will be run before the runner install script. These will run as root and can be used to prep a generic image before we attempt to install the runner. The key of the map is the name of the script as it will be written to disk. The value is a byte array with the contents of the script."
}
},
"additionalProperties": false
}
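To make the schema above more concrete, here is a minimal Go sketch of a struct that mirrors its properties and rejects unknown keys, similar in spirit to "additionalProperties": false. The extraSpecs type and the parseExtraSpecs helper are illustrative names, not the provider's actual types. The sketch follows the schema as written, so runner_install_template is modelled as a plain string. The pre_install_scripts values are modelled as []byte, which Go's encoding/json reads from base64-encoded strings.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// extraSpecs mirrors the properties listed in the schema. It is a stand-in
// written for this example, not the type used by the provider.
type extraSpecs struct {
	ExtraPackages         []string          `json:"extra_packages,omitempty"`
	DisableUpdates        bool              `json:"disable_updates,omitempty"`
	EnableBootDebug       bool              `json:"enable_boot_debug,omitempty"`
	RunnerInstallTemplate string            `json:"runner_install_template,omitempty"`
	ExtraContext          map[string]string `json:"extra_context,omitempty"`
	// encoding/json decodes base64 strings into []byte values.
	PreInstallScripts map[string][]byte `json:"pre_install_scripts,omitempty"`
}

// parseExtraSpecs decodes raw JSON and fails on keys that are not declared
// in extraSpecs.
func parseExtraSpecs(raw []byte) (extraSpecs, error) {
	var specs extraSpecs
	dec := json.NewDecoder(bytes.NewReader(raw))
	dec.DisallowUnknownFields()
	if err := dec.Decode(&specs); err != nil {
		return extraSpecs{}, fmt.Errorf("invalid extra specs: %w", err)
	}
	return specs, nil
}

func main() {
	specs, err := parseExtraSpecs([]byte(`{"disable_updates": true, "extra_packages": ["openssh-server", "jq"]}`))
	fmt.Println(specs.DisableUpdates, specs.ExtraPackages, err)

	// A key that is not part of the schema is rejected.
	_, err = parseExtraSpecs([]byte(`{"no_such_key": 1}`))
	fmt.Println("unknown key rejected:", err != nil)
}
```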
An example extra specs json would look like this:
{
"disable_updates": true,
"extra_packages": ["openssh-server", "jq"],
"enable_boot_debug": false,
"extra_context": {
"GolangDownloadURL": "https://go.dev/dl/go1.22.4.linux-amd64.tar.gz"
},
"pre_install_scripts": {
"01-script": "IyEvYmluL2Jhc2gKCgplY2hvICJIZWxsbyBmcm9tICQwIiA+PiAvMDEtc2NyaXB0LnR4dAo=",
"02-script": "IyEvYmluL2Jhc2gKCgplY2hvICJIZWxsbyBmcm9tICQwIiA+PiAvMDItc2NyaXB0LnR4dAo="
},
"runner_install_template": "#!/bin/bash

set -e
set -o pipefail

{{- if .EnableBootDebug }}
set -x
{{- end }}

CALLBACK_URL="{{ .CallbackURL }}"
METADATA_URL="{{ .MetadataURL }}"
BEARER_TOKEN="{{ .CallbackToken }}"

if [ -z "$METADATA_URL" ];then
	echo "no token is available and METADATA_URL is not set"
	exit 1
fi

function call() {
	PAYLOAD="$1"
	[[ $CALLBACK_URL =~ ^(.*)/status(/)?$ ]] || CALLBACK_URL="${CALLBACK_URL}/status"
	curl --retry 5 --retry-delay 5 --retry-connrefused --fail -s -X POST -d "${PAYLOAD}" -H 'Accept: application/json' -H "Authorization: Bearer ${BEARER_TOKEN}" "${CALLBACK_URL}" || echo "failed to call home: exit code ($?)"
}

function systemInfo() {
	if [ -f "/etc/os-release" ];then
		. /etc/os-release
	fi
	OS_NAME=${NAME:-""}
	OS_VERSION=${VERSION_ID:-""}
	AGENT_ID=${1:-null}
	# strip status from the callback url
	[[ $CALLBACK_URL =~ ^(.*)/status(/)?$ ]] && CALLBACK_URL="${BASH_REMATCH[1]}" || true
	SYSINFO_URL="${CALLBACK_URL}/system-info/"
	PAYLOAD="{\"os_name\": \"$OS_NAME\", \"os_version\": \"$OS_VERSION\", \"agent_id\": $AGENT_ID}"
	curl --retry 5 --retry-delay 5 --retry-connrefused --fail -s -X POST -d "${PAYLOAD}" -H 'Accept: application/json' -H "Authorization: Bearer ${BEARER_TOKEN}" "${SYSINFO_URL}" || true
}

function sendStatus() {
	MSG="$1"
	call "{\"status\": \"installing\", \"message\": \"$MSG\"}"
}

function success() {
	MSG="$1"
	ID=${2:-null}
	call "{\"status\": \"idle\", \"message\": \"$MSG\", \"agent_id\": $ID}"
}

function fail() {
	MSG="$1"
	call "{\"status\": \"failed\", \"message\": \"$MSG\"}"
	exit 1
}

# This will echo the version number in the filename. Given a file name like: actions-runner-osx-x64-2.299.1.tar.gz
# this will output: 2.299.1
function getRunnerVersion() {
	FILENAME="{{ .FileName }}"
	[[ $FILENAME =~ ([0-9]+\.[0-9]+\.[0-9]+) ]]
	echo $BASH_REMATCH
}

function getCachedToolsPath() {
	CACHED_RUNNER="/opt/cache/actions-runner/latest"
	if [ -d "$CACHED_RUNNER" ];then
		echo "$CACHED_RUNNER"
		return 0
	fi

	VERSION=$(getRunnerVersion)
	if [ -z "$VERSION" ]; then
		return 0
	fi

	CACHED_RUNNER="/opt/cache/actions-runner/$VERSION"
	if [ -d "$CACHED_RUNNER" ];then
		echo "$CACHED_RUNNER"
		return 0
	fi
	return 0
}

function downloadAndExtractRunner() {
	sendStatus "downloading tools from {{ .DownloadURL }}"
	if [ ! -z "{{ .TempDownloadToken }}" ]; then
	TEMP_TOKEN="Authorization: Bearer {{ .TempDownloadToken }}"
	fi
	curl --retry 5 --retry-delay 5 --retry-connrefused --fail -L -H "${TEMP_TOKEN}" -o "/home/{{ .RunnerUsername }}/{{ .FileName }}" "{{ .DownloadURL }}" || fail "failed to download tools"
	mkdir -p /home/{{ .RunnerUsername }}/actions-runner || fail "failed to create actions-runner folder"
	sendStatus "extracting runner"
	tar xf "/home/{{ .RunnerUsername }}/{{ .FileName }}" -C /home/{{ .RunnerUsername }}/actions-runner/ || fail "failed to extract runner"
	# chown {{ .RunnerUsername }}:{{ .RunnerGroup }} -R /home/{{ .RunnerUsername }}/actions-runner/ || fail "failed to change owner"
}

CACHED_RUNNER=$(getCachedToolsPath)
if [ -z "$CACHED_RUNNER" ];then
	downloadAndExtractRunner
	sendStatus "installing dependencies"
	cd /home/{{ .RunnerUsername }}/actions-runner
	sudo ./bin/installdependencies.sh || fail "failed to install dependencies"
else
	sendStatus "using cached runner found in $CACHED_RUNNER"
	sudo cp -a "$CACHED_RUNNER"  "/home/{{ .RunnerUsername }}/actions-runner"
	sudo chown {{ .RunnerUsername }}:{{ .RunnerGroup }} -R "/home/{{ .RunnerUsername }}/actions-runner" || fail "failed to change owner"
	cd /home/{{ .RunnerUsername }}/actions-runner
fi


sendStatus "configuring runner"
{{- if .UseJITConfig }}
function getRunnerFile() {
	curl --retry 5 --retry-delay 5 \
		--retry-connrefused --fail -s \
		-X GET -H 'Accept: application/json' \
		-H "Authorization: Bearer ${BEARER_TOKEN}" \
		"${METADATA_URL}/$1" -o "$2"
}

sendStatus "downloading JIT credentials"
getRunnerFile "credentials/runner" "/home/{{ .RunnerUsername }}/actions-runner/.runner" || fail "failed to get runner file"
getRunnerFile "credentials/credentials" "/home/{{ .RunnerUsername }}/actions-runner/.credentials" || fail "failed to get credentials file"
getRunnerFile "credentials/credentials_rsaparams" "/home/{{ .RunnerUsername }}/actions-runner/.credentials_rsaparams" || fail "failed to get credentials_rsaparams file"
getRunnerFile "system/service-name" "/home/{{ .RunnerUsername }}/actions-runner/.service" || fail "failed to get service name file"
sed -i 's/$/\.service/' /home/{{ .RunnerUsername }}/actions-runner/.service

SVC_NAME=$(cat /home/{{ .RunnerUsername }}/actions-runner/.service)

sendStatus "generating systemd unit file"
getRunnerFile "systemd/unit-file?runAsUser={{ .RunnerUsername }}" "$SVC_NAME" || fail "failed to get service file"
sudo mv $SVC_NAME /etc/systemd/system/ || fail "failed to move service file"
sudo chown root:root /etc/systemd/system/$SVC_NAME || fail "failed to change owner"
if [ -e "/sys/fs/selinux" ];then
	sudo chcon -h system_u:object_r:systemd_unit_file_t:s0 /etc/systemd/system/$SVC_NAME || fail "failed to change selinux context"
fi

sendStatus "enabling runner service"
cp /home/{{ .RunnerUsername }}/actions-runner/bin/runsvc.sh /home/{{ .RunnerUsername }}/actions-runner/ || fail "failed to copy runsvc.sh"
sudo chown {{ .RunnerUsername }}:{{ .RunnerGroup }} -R /home/{{ .RunnerUsername }} || fail "failed to change owner"
sudo systemctl daemon-reload || fail "failed to reload systemd"
sudo systemctl enable $SVC_NAME
{{- else}}

GITHUB_TOKEN=$(curl --retry 5 --retry-delay 5 --retry-connrefused --fail -s -X GET -H 'Accept: application/json' -H "Authorization: Bearer ${BEARER_TOKEN}" "${METADATA_URL}/runner-registration-token/")

set +e
attempt=1
while true; do
	ERROUT=$(mktemp)
	{{- if .GitHubRunnerGroup }}
	./config.sh --unattended --url "{{ .RepoURL }}" --token "$GITHUB_TOKEN" --runnergroup {{.GitHubRunnerGroup}} --name "{{ .RunnerName }}" --labels "{{ .RunnerLabels }}" --no-default-labels --ephemeral 2>$ERROUT
	{{- else}}
	./config.sh --unattended --url "{{ .RepoURL }}" --token "$GITHUB_TOKEN" --name "{{ .RunnerName }}" --labels "{{ .RunnerLabels }}" --no-default-labels --ephemeral 2>$ERROUT
	{{- end}}
	if [ $? -eq 0 ]; then
		rm $ERROUT || true
		sendStatus "runner successfully configured after $attempt attempt(s)"
		break
	fi
	LAST_ERR=$(cat $ERROUT)
	echo "$LAST_ERR"

	# if the runner is already configured, remove it and try again. In the past configuring a runner
	# managed to register it but timed out later, resulting in an error.
	./config.sh remove --token "$GITHUB_TOKEN" || true

	if [ $attempt -gt 5 ];then
		rm $ERROUT || true
		fail "failed to configure runner: $LAST_ERR"
	fi

	sendStatus "failed to configure runner (attempt $attempt): $LAST_ERR (retrying in 5 seconds)"
	attempt=$((attempt+1))
	rm $ERROUT || true
	sleep 5
done
set -e

sendStatus "installing runner service"
sudo ./svc.sh install {{ .RunnerUsername }} || fail "failed to install service"
{{- end}}

if [ -e "/sys/fs/selinux" ];then
	sudo chcon -R -h user_u:object_r:bin_t:s0 /home/runner/ || fail "failed to change selinux context"
fi

AGENT_ID=""
{{- if .UseJITConfig }}
sudo systemctl start $SVC_NAME || fail "failed to start service"
{{- else}}
sendStatus "starting service"
sudo ./svc.sh start || fail "failed to start service"

set +e
AGENT_ID=$(grep "agentId" /home/{{ .RunnerUsername }}/actions-runner/.runner |  tr -d -c 0-9)
if [ $? -ne 0 ];then
	fail "failed to get agent ID"
fi
set -e
{{- end}}
systemInfo $AGENT_ID

success "runner successfully installed" $AGENT_ID
{{- if .ExtraContext.GolangDownloadURL }}
curl -LO {{ .ExtraContext.GolangDownloadURL }}
rm -rf /usr/local/go && sudo tar -C /usr/local -xzf go1.22.4.linux-amd64.tar.gz
export PATH=$PATH:/usr/local/go/bin
{{- end }}"
}
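A multi-line script such as the template above has to be escaped before it can be embedded in a JSON string, and the pre_install_scripts values in the example are base64-encoded. One way to avoid doing this by hand is to let a JSON encoder produce the document. The following Go sketch assumes a hypothetical helper named buildExtraSpecs and uses short placeholder scripts rather than the real template.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildExtraSpecs is a hypothetical helper that assembles an extra specs
// document. json.Marshal escapes newlines and quotes in the template and
// base64-encodes the []byte script contents.
func buildExtraSpecs(installTemplate string, extraContext map[string]string, scripts map[string][]byte) (string, error) {
	specs := map[string]interface{}{
		"runner_install_template": installTemplate,
		"extra_context":           extraContext,
		"pre_install_scripts":     scripts,
	}
	out, err := json.Marshal(specs)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	// Placeholder template and script, used only for illustration.
	tpl := "#!/bin/bash\nset -e\necho \"hello\"\n"
	scripts := map[string][]byte{
		"01-script": []byte("#!/bin/bash\necho hi\n"),
	}
	ctx := map[string]string{
		"GolangDownloadURL": "https://go.dev/dl/go1.22.4.linux-amd64.tar.gz",
	}

	out, err := buildExtraSpecs(tpl, ctx, scripts)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

The resulting string is a single-line JSON document that can be passed wherever extra specs are accepted.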
NOTE: The extra_context spec adds a map of key/value pairs that may be expected in the runner_install_template.
The runner_install_template allows us to completely override the script that installs and starts the runner. In the example above, I have added a copy of the current template from garm-provider-common, with the addition of:
{{- if .ExtraContext.GolangDownloadURL }}
curl -LO {{ .ExtraContext.GolangDownloadURL }}
rm -rf /usr/local/go && sudo tar -C /usr/local -xzf go1.22.4.linux-amd64.tar.gz
export PATH=$PATH:/usr/local/go/bin
{{- end }}
NOTE: runner_install_template is a Go template which is used to install the runner. The example above shows how you can extend the existing template with a step that downloads, extracts and installs Go on the runner.
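To see how the extra_context values reach the template, the Go-install fragment can be rendered on its own with Go's text/template package. This is only a sketch: installContext is a stand-in that models just the ExtraContext field, and the real data passed to the template has many more fields (CallbackURL, RunnerUsername and so on).

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// installContext is a stand-in for the data handed to the install template.
// Only ExtraContext is modelled here.
type installContext struct {
	ExtraContext map[string]string
}

// goInstallSnippet is the fragment added to the template in the example.
const goInstallSnippet = `{{- if .ExtraContext.GolangDownloadURL }}
curl -LO {{ .ExtraContext.GolangDownloadURL }}
rm -rf /usr/local/go && sudo tar -C /usr/local -xzf go1.22.4.linux-amd64.tar.gz
export PATH=$PATH:/usr/local/go/bin
{{- end }}`

// render parses tpl as a Go template and executes it against ctx.
func render(tpl string, ctx installContext) (string, error) {
	t, err := template.New("snippet").Parse(tpl)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, ctx); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	withURL := installContext{ExtraContext: map[string]string{
		"GolangDownloadURL": "https://go.dev/dl/go1.22.4.linux-amd64.tar.gz",
	}}
	out, err := render(goInstallSnippet, withURL)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)

	// Without the key, the whole block is skipped and nothing is emitted.
	out, err = render(goInstallSnippet, installContext{ExtraContext: map[string]string{}})
	if err != nil {
		panic(err)
	}
	fmt.Printf("without key: %q\n", out)
}
```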
To set extra specs on an existing pool, simply run:
garm-cli pool update --extra-specs='{"disable_updates": true}' <POOL_ID>
You can also set a spec when creating a new pool, using the same flag.
Workers in that pool will be created taking into account the specs you set on the pool.
Aside from the above schema, this provider also supports the generic schema implemented by garm-provider-common.