Amazon EKS Serverless Drainer
Amazon EKS node drainer with AWS Lambda.
amazon-eks-serverless-drainer is an Amazon EKS node drainer built on AWS Lambda. If you provision spot instances or a spot fleet in your Amazon EKS nodegroup, you can listen for the spot termination signal from CloudWatch Events 120 seconds prior to the final termination. With this Lambda function configured as the CloudWatch Events target, amazon-eks-serverless-drainer drains the terminating node: all pods without a matching toleration are evicted and rescheduled onto other nodes, so your workload sees minimal impact from the spot instance termination.
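For reference, the spot interruption warning is delivered as a CloudWatch event. Below is a minimal sketch of the kind of rule this stack configures; the rule name is illustrative, and the SAM template in this project creates the actual rule and target for you:

# sketch only: a CloudWatch Events rule matching spot interruption warnings
aws events put-rule \
  --name eks-drainer-spot-interruption \
  --event-pattern '{"source":["aws.ec2"],"detail-type":["EC2 Spot Instance Interruption Warning"]}'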
Implementations
Previously this project had a native Golang implementation built with client-go. However, after AWS announced Lambda layers and Lambda custom runtimes, and thanks to the aws-samples/aws-lambda-layer-kubectl project, it became very easy to implement the drainer with a few lines of bash script in Lambda (tweet), shrinking the code size from 11MB to just 2.4KB. So we will stick with the bash implementation in this branch. We believe this eliminates complexity and will help people develop similar projects in the future.
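To make the idea concrete, a bash handler in this style could look roughly like the sketch below. This is a simplified illustration, not the project's actual code; it assumes the aws-lambda-layer-kubectl layer provides kubectl, the AWS CLI and jq in the runtime, and that the incoming event carries an EC2InstanceId in its detail:

# sketch only: drain the Kubernetes node backing the terminating EC2 instance
handler () {
  EVENT_DATA="$1"
  # extract the terminating instance ID from the event payload
  INSTANCE_ID=$(echo "$EVENT_DATA" | jq -r '.detail.EC2InstanceId')
  # resolve the instance's private DNS name, which EKS uses as the node name
  NODE_NAME=$(aws ec2 describe-instances --instance-ids "$INSTANCE_ID" \
    --query 'Reservations[0].Instances[0].PrivateDnsName' --output text)
  # evict all pods without a matching toleration; DaemonSet pods are skipped
  kubectl drain "$NODE_NAME" --ignore-daemonsets --delete-local-data --force
}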
Option 1: Deploy from SAR (Serverless Application Repository)
The simplest way to build this stack is to create it from SAR:
Edit Makefile and then run
$ make sam-package-from-sar sam-deploy
Or just click one of the buttons below to deploy:
Launch Stack buttons are available for: us-east-1, us-east-2, us-west-1, us-west-2, ap-northeast-1, ap-northeast-2, ap-southeast-1, ap-southeast-2, eu-central-1, eu-west-1, eu-west-2, eu-west-3, eu-north-1 and sa-east-1.
This provisions the whole amazon-eks-serverless-drainer stack from SAR, including the aws-lambda-layer-kubectl Lambda layer, out of the box. The benefit is that you don't have to build the layer yourself.
Option 2: Building from scratch
If you want to build everything from scratch, including the aws-lambda-layer-kubectl Lambda layer, follow the steps below.
Prepare your Layer
Follow the instructions to build and publish your aws-lambda-layer-kubectl Lambda layer.
Copy the layer ARN (e.g. arn:aws:lambda:ap-northeast-1:${AWS::AccountId}:layer:layer-eks-kubectl-layer-stack:2).
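If you have already published the layer and just need its ARN, you can look it up with the AWS CLI; the layer name below is taken from the example ARN above and may differ in your account:

# sketch: fetch the latest version ARN of the published layer
aws lambda list-layer-versions \
  --layer-name layer-eks-kubectl-layer-stack \
  --query 'LayerVersions[0].LayerVersionArn' --output text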
Edit the sam.yaml
Set the value of Layers to the layer ARN from the previous step:
Layers:
  - !Sub "arn:aws:lambda:ap-northeast-1:${AWS::AccountId}:layer:layer-eks-kubectl-layer-stack:2"
update Makefile
Edit Makefile and update the S3BUCKET variable:
# modify this to a private S3 bucket you have read/write access to
S3BUCKET ?= pahud-temp-ap-northeast-1
# set the AWS region you are deploying to
LAMBDA_REGION ?= ap-northeast-1
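As a quick sanity check before packaging, you may want to confirm the bucket exists and lives in the same region as LAMBDA_REGION, since SAM uploads the packaged artifact there; the bucket name below is the placeholder from above:

# sketch: verify the bucket's region matches LAMBDA_REGION
aws s3api get-bucket-location --bucket pahud-temp-ap-northeast-1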
package and deploy with SAM
$ make func-prep sam-package sam-deploy
(SAM will deploy a CloudFormation stack for you in ${LAMBDA_REGION} and register a CloudWatch Events rule as the Lambda event source.)
Uploading to 032ea7f22f8fedab0d016ed22f2bdea4 11594869 / 11594869.0 (100.00%)
Successfully packaged artifacts and wrote output template to file packaged.yaml.
Execute the following command to deploy the packaged template
aws cloudformation deploy --template-file /home/samcli/workdir/packaged.yaml --stack-name <YOUR STACK NAME>
Waiting for changeset to be created..
Waiting for stack create/update to complete
Successfully created/updated stack - eks-lambda-drainer
# print the cloudformation stack outputs
aws --region ap-northeast-1 cloudformation describe-stacks --stack-name "eks-lambda-drainer" --query 'Stacks[0].Outputs'
[
  {
    "Description": "Lambda function Arn",
    "OutputKey": "Func",
    "OutputValue": "arn:aws:lambda:ap-northeast-1:xxxxxxxx:function:eks-lambda-drainer-Func-1P5RHJ50KEVND"
  },
  {
    "Description": "Lambda function IAM role Arn",
    "OutputKey": "FuncIamRole",
    "OutputValue": "arn:aws:iam::xxxxxxxx:role/eks-lambda-drainer-FuncRole-TCZVVLEG1HKD"
  }
]
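The FuncIamRole output is the role you will add to the aws-auth ConfigMap in the next step. You can capture it directly from the stack outputs, for example:

# sketch: extract the Lambda role ARN for the aws-auth step below
ROLE_ARN=$(aws --region ap-northeast-1 cloudformation describe-stacks \
  --stack-name eks-lambda-drainer \
  --query "Stacks[0].Outputs[?OutputKey=='FuncIamRole'].OutputValue" \
  --output text)
echo "$ROLE_ARN"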
Add Lambda Role into ConfigMap
eks-lambda-drainer will run with the provided Lambda role, or with exactly the role ARN you specified in the parameter. Make sure you have added this role to the aws-auth ConfigMap. Read the Amazon EKS documentation on how to add an IAM role to the aws-auth ConfigMap.
Edit the aws-auth ConfigMap with:
kubectl edit -n kube-system configmap/aws-auth
Then insert rolearn, username and groups into mapRoles, making sure the groups contain system:masters.
For example:
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    - rolearn: arn:aws:iam::xxxxxxxx:role/eksdemo-NG-1RPL723W45VT5-NodeInstanceRole-1D4S7IF32IDU1
      username: system:node:{{EC2PrivateDNSName}}
      groups:
        - system:bootstrappers
        - system:nodes
    - rolearn: arn:aws:iam::xxxxxxxx:role/eks-lambda-drainer-FuncRole-TCZVVLEG1HKD
      username: EKSForLambda
      groups:
        - system:masters
The first rolearn is your Amazon EKS NodeInstanceRole, and the second rolearn is your Lambda role.
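As an alternative to hand-editing the ConfigMap, eksctl can append the same mapping. A sketch using the example cluster name and role ARN from this walkthrough:

# sketch: map the Lambda role into aws-auth via eksctl
eksctl create iamidentitymapping \
  --cluster eksdemo \
  --arn arn:aws:iam::xxxxxxxx:role/eks-lambda-drainer-FuncRole-TCZVVLEG1HKD \
  --username EKSForLambda \
  --group system:masters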
Validation
You may decrease the desired capacity of the autoscaling group for your Amazon EKS nodegroup. Behind the scenes, on instance termination from the autoscaling group, the node first enters the Terminating:Wait state; after a pre-defined grace period (default: 10 seconds), eks-lambda-drainer is invoked through the CloudWatch Event, performs kubectl drain on the node, and immediately sends CompleteLifecycleAction back to the hook. The autoscaling group then moves on to the Terminating:Proceed phase to execute the final termination. The pods on the terminating node are rescheduled to other node(s) before termination completes, so your service sees almost zero impact.
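A minimal way to exercise this flow is shown below; the autoscaling group name is illustrative, so substitute your nodegroup's actual ASG and a desired capacity one lower than the current value:

# sketch: scale the nodegroup down by one to trigger a termination
aws autoscaling set-desired-capacity \
  --auto-scaling-group-name eksdemo-nodegroup-asg \
  --desired-capacity 1
# watch the node get cordoned and its pods rescheduled
kubectl get pods -o wide -w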
In Actions
Live tail the logs:
$ make sam-logs-tail
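This Makefile target presumably wraps the SAM CLI's log tailing; a direct equivalent would look something like the following, assuming the function's logical ID is Func, as the stack output key above suggests:

# sketch: tail the drainer function's CloudWatch logs via the SAM CLI
sam logs --stack-name eks-lambda-drainer -n Func --tail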
kubectl drain or kubectl taint
By default, eks-lambda-drainer will kubectl drain the node; however, if you set the Lambda environment variable drain_type=taint, it will kubectl taint the node instead (details).
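To flip an already-deployed function into taint mode, you could update its environment directly; the function name below is taken from the sample stack outputs earlier and will differ in your deployment:

# sketch: switch the drainer to taint mode on a deployed function
aws lambda update-function-configuration \
  --function-name eks-lambda-drainer-Func-1P5RHJ50KEVND \
  --environment 'Variables={drain_type=taint}'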
cluster name auto discovery
You don't have to specify the Amazon EKS cluster name. By default, eks-lambda-drainer determines it from the EC2 tags of the terminating node:
kubernetes.io/cluster/{cluster_name} = owned
For example, kubernetes.io/cluster/eksdemo = owned yields cluster_name=eksdemo.
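You can confirm which cluster a node's instance belongs to by inspecting its tags yourself; the instance ID below is illustrative:

# sketch: show the cluster ownership tag on a node's EC2 instance
aws ec2 describe-tags \
  --filters "Name=resource-id,Values=i-0123456789abcdef0" \
            "Name=key,Values=kubernetes.io/cluster/eksdemo"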
clean up
$ make sam-destroy
(This destroys the CloudFormation stack and all the resources in it.)
License Summary
This sample code is made available under the MIT-0 license. See the LICENSE file.