Templates for AWS Fault Injection Simulator (FIS)

These templates let you perform fault injection experiments on resources (applications, network, and infrastructure) in the AWS Cloud.

What is AWS FIS anyway?

AWS Fault Injection Simulator (AWS FIS) is a managed service that enables you to perform fault injection experiments on your AWS workloads. Fault injection is based on the principles of chaos engineering. These experiments stress an application by creating disruptive events so that you can observe how your application responds. You can then use this information to improve the performance and resiliency of your applications so that they behave as expected.

To use AWS FIS, you set up and run experiments that help you create the real-world conditions needed to uncover application issues that can be difficult to find otherwise. AWS FIS provides templates that generate disruptions, and the controls and guardrails that you need to run experiments in production, such as automatically rolling back or stopping the experiment if specific conditions are met.

What is included in this package?

This CDK package deploys several stacks: (1) the parent stack, FISPa; (2) a stack for the IAM roles, FisRole; (3) a stack for the stop-condition, StopCond (a CloudWatch alarm); (4) a stack for each FIS experiment group (EC2API, AsgExp, EksExp, NaclExp, Ec2InstExp); and (5) a stack dedicated to uploading SSM documents, FisSsmDocs.

You can choose which experiment groups to deploy by commenting out the respective stacks here.

1 - The IAM roles required to run the experiments:

2 - A set of AWS FIS experiments to get you started:

EC2 Instance faults

EC2 Control Plane faults

Auto Scaling Group faults

Network Access Control List faults

EKS faults

Security Group faults

IAM Access faults

Lambda faults

Configuring experiments:

These sample FIS experiments use default values for some parameters, such as vpc_id, asg_name, and eks_cluster_name. Before deploying, modify these values in cdk.json to match your own AWS environment.

  "context": {
    "vpc_id": "vpc-01316e63b948d889d",
    "asg_name": "Test-FIS-ASG",
    "eks_cluster_name": "test-cluster-chaos",
    "security_group_id": "sg-022eb488dbd1655b3",
    "target_role_name": "target-role",
    "s3-bucket-to-deny": "mybucket/*",
    "ssm_parameter_name": "chaoslambda.config"
  }
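Because a leftover sample value only surfaces once an experiment targets the wrong (or a nonexistent) resource, it can help to sanity-check the context block before deploying. The sketch below is a hypothetical helper, not part of this package: `validateFisContext` and `REQUIRED_KEYS` are names introduced here for illustration, and the prefix checks assume the usual `vpc-` and `sg-` ID formats.

```typescript
// Hypothetical helper (not part of this package): checks the "context" block
// from cdk.json. Key names are copied from the sample configuration above.
type FisContext = Record<string, string>;

const REQUIRED_KEYS: string[] = [
  "vpc_id",
  "asg_name",
  "eks_cluster_name",
  "security_group_id",
  "target_role_name",
  "s3-bucket-to-deny",
  "ssm_parameter_name",
];

// Returns a list of human-readable problems; an empty list means no problem was found.
function validateFisContext(context: FisContext): string[] {
  const problems: string[] = [];
  for (const key of REQUIRED_KEYS) {
    const value = context[key];
    if (value === undefined || value.trim() === "") {
      problems.push(`missing value for "${key}"`);
    }
  }
  // Assumption: VPC and security group IDs use the usual "vpc-" / "sg-" prefixes.
  const vpcId = context["vpc_id"];
  if (vpcId !== undefined && vpcId !== "" && !/^vpc-/.test(vpcId)) {
    problems.push('"vpc_id" should start with "vpc-"');
  }
  const securityGroupId = context["security_group_id"];
  if (securityGroupId !== undefined && securityGroupId !== "" && !/^sg-/.test(securityGroupId)) {
    problems.push('"security_group_id" should start with "sg-"');
  }
  return problems;
}
```

Such a check only confirms that the values are present and plausibly shaped; it does not verify that the resources exist in your account.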

You can also specify your own tags for filtering EC2 instances. The currently used ones are defined as:

resourceTags: {
        'FIS-Ready': 'true'
      }
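As a rough illustration of where these tags end up, the sketch below builds an object shaped like a target entry of an FIS experiment template (`resourceType`, `resourceTags`, `selectionMode`). The `FisTarget` interface, the `ec2TargetByTags` helper, and the `Environment` tag are hypothetical and are not taken from this package.

```typescript
// Local type mirroring the target fields of an FIS experiment template.
interface FisTarget {
  resourceType: string;
  resourceTags: Record<string, string>;
  selectionMode: string;
}

// Hypothetical helper: describes a target that selects EC2 instances by tag.
function ec2TargetByTags(
  resourceTags: Record<string, string>,
  selectionMode: string = "ALL" // FIS also accepts forms such as "COUNT(1)" or "PERCENT(50)"
): FisTarget {
  if (Object.keys(resourceTags).length === 0) {
    throw new Error("at least one tag is required to select instances");
  }
  return {
    resourceType: "aws:ec2:instance",
    resourceTags: resourceTags,
    selectionMode: selectionMode,
  };
}

// Example: the default tag plus an illustrative "Environment" tag, limited to one instance.
const stagingTarget = ec2TargetByTags(
  { "FIS-Ready": "true", "Environment": "staging" },
  "COUNT(1)"
);
```

Narrowing the selection with an extra tag or a `COUNT(n)` selection mode is one way to limit how many instances an experiment touches while you are still getting familiar with it.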

3 - An example stop-condition using CloudWatch alarm

All templates use the same CloudWatch alarm to get you started with the stop-condition. You can use this alarm to get familiar with canceling experiments. For example, you can trigger that alarm for 1 minute using the following command:

aws cloudwatch set-alarm-state --alarm-name "NetworkInAbnormal" --state-value "ALARM" --state-reason "testing FIS"

Once you are familiar with the stop-condition, replace this CloudWatch alarm with alarms specific to your application and architecture.
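When swapping in your own alarm, the stop condition in an FIS experiment template refers to the alarm by its ARN, with `aws:cloudwatch:alarm` as the source. The following is a minimal sketch of that shape; the `StopCondition` interface and `alarmStopCondition` helper are hypothetical, the region and account ID are placeholders, and the standard `aws` partition is assumed.

```typescript
// Local type mirroring a stop condition entry in an FIS experiment template.
interface StopCondition {
  source: string;
  value?: string;
}

// Hypothetical helper: builds a stop condition that points at a CloudWatch alarm.
function alarmStopCondition(region: string, accountId: string, alarmName: string): StopCondition {
  // Assumes the standard "aws" partition; other partitions use a different ARN prefix.
  const alarmArn = `arn:aws:cloudwatch:${region}:${accountId}:alarm:${alarmName}`;
  return { source: "aws:cloudwatch:alarm", value: alarmArn };
}

// Placeholder region and account ID; the alarm name matches the command above.
const sampleStopCondition = alarmStopCondition("us-east-1", "123456789012", "NetworkInAbnormal");
```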

4 - A stack dedicated to uploading SSM docs (Automation or Run-Command)

Deploy this package via CDK:

You first need to install the AWS CDK as described here - typically using:

npm install -g aws-cdk@2.x

You must then configure your workstation with your credentials and an AWS region, if you have not already done so. If you have the AWS CLI installed, the easiest way to satisfy this requirement is to issue the following command:

aws configure

Finally, you can deploy these FIS experiments using the CDK as follows:

npm install
cdk bootstrap
cdk deploy --all

During the creation of the different stacks, some will generate a security warning as follows:

(NOTE: There may be security-related changes not in this list. See https://github.com/aws/aws-cdk/issues/1299)

Do you wish to deploy these changes (y/n)?

Select y (yes).

Other useful CDK commands:

The cdk.json file tells the CDK Toolkit how to execute your app.