Charmander-Experiment: Maxusage
Our MaxUsage experiment analyzes the actual memory usage of the running simulators and uses that result to overwrite the memory allocation for subsequent run requests for the same simulators.
Prerequisite
A local Charmander cluster has to be up and running. Related documentation is available at https://github.com/att-innovate/charmander.
Verify that you are in your local Charmander directory and reset the Charmander cluster.
./bin/reset_cluster
Verify that no task is running using the Mesos console at http://172.31.1.11:5050
Build and deploy the simulators and the analyzer
The maxusage analyzer is implemented in Scala using Spark-Streaming and Spark-SQL. The code is part of this project and can be found at MaxUsage.scala.
One could argue that using Spark to determine the max memory usage of a simulator is overkill .. but hey, we had to try out streaming and Spark-SQL .. and we are certain we can scale it up if needed.
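To give a flavor of the approach, here is a minimal, self-contained sketch of a max-usage aggregation with Spark-Streaming and Spark-SQL. This is not the code in MaxUsage.scala; the socket source, the port, the record format (taskName,memoryBytes), and all names are illustrative assumptions.

import org.apache.spark.SparkConf
import org.apache.spark.sql.SQLContext
import org.apache.spark.streaming.{Seconds, StreamingContext}

object MaxUsageSketch {
  // Case class at top level so Spark-SQL can derive the schema via reflection
  case class Metric(taskName: String, memoryUsage: Long)

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("maxusage-sketch").setMaster("local[2]")
    val ssc  = new StreamingContext(conf, Seconds(15))

    // Hypothetical source: lines of "taskName,memoryBytes" arriving on a socket
    val lines = ssc.socketTextStream("localhost", 9999)
    val metrics = lines.map { line =>
      val Array(name, mem) = line.split(",")
      Metric(name, mem.toLong)
    }

    metrics.foreachRDD { rdd =>
      val sqlContext = new SQLContext(rdd.sparkContext)
      import sqlContext.implicits._
      rdd.toDF().registerTempTable("metrics")
      // Highest observed memory usage per task, expressed as Spark-SQL
      val maxPerTask = sqlContext.sql(
        "SELECT taskName, MAX(memoryUsage) AS maxUsage FROM metrics GROUP BY taskName")
      maxPerTask.collect().foreach(println)
    }

    ssc.start()
    ssc.awaitTermination()
  }
}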
Let's build it first. Change to the experiments folder and check out the code into a folder called maxusage
cd experiments
git clone https://github.com/att-innovate/charmander-experiment-maxusage.git maxusage
Change your working directory back to the root of Charmander and start the build process
cd ..
./experiments/maxusage/bin/build
This command builds maxusage and creates and deploys Docker images for the analyzer and the different load simulators. This process will take some time the first time you run it.
Start cAdvisor and Analytics-Stack
./bin/start_cadvisor
./bin/start_analytics
Start the different simulators
./experiments/maxusage/bin/start_lookbusy200mb
./experiments/maxusage/bin/start_lookbusy80mb
./experiments/maxusage/bin/start_stress60mb
Start maxusage
./experiments/maxusage/bin/start_maxusage
Verify the experiment setup in Redis
Redis-UI can be found at: http://172.31.2.11:31610
The information in Redis gets updated by the scheduler every 15s. Give it some time to synchronize and refresh the page until "task-intelligence" shows up.
Redis shows all three slaves/nodes, all the currently running tasks, all the "metered" tasks, and the "intelligence" collected by maxusage in the task-intelligence section. The mem value represents the highest memory use of a metered task; for lookbusy it should be roughly 210MB (209928192 bytes).
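If you prefer to check programmatically instead of through the UI, a sketch along these lines would work using the Jedis client; note that the Redis port and the key layout shown here are assumptions, not Charmander's actual schema.

import redis.clients.jedis.Jedis
import scala.collection.JavaConverters._

object TaskIntelligenceCheck {
  def main(args: Array[String]): Unit = {
    // Port is an assumption; adjust it to your cluster
    val redis = new Jedis("172.31.2.11", 6379)
    try {
      // Hypothetical hash holding one max-usage entry per metered task
      val intelligence = redis.hgetAll("task-intelligence").asScala
      intelligence.foreach { case (task, mem) =>
        println(s"$task: max observed memory $mem bytes")
      }
    } finally {
      redis.close()
    }
  }
}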
Verify idle memory
Open the Mesos console at http://172.31.1.11:5050 and look for the Resources idle number at the bottom left. It should be something like 682MB.
Redeploy simulators
./bin/reshuffle
This command will kill and restart our running simulators. The Mesos console can be used to see the progress of the reshuffling.
Verify idle memory
The memory allocation for the simulators gets adjusted based on our "task intelligence" (max usage plus a 10% safety margin). That decrease in allocated memory should increase the amount of idle memory for the cluster.
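As a quick sanity check of that rule, here is the arithmetic as a sketch (the object and function names are ours, not Charmander's):

object Reallocate {
  // New allocation = observed max usage plus a 10% safety margin
  def newAllocation(maxUsageBytes: Long): Long =
    (maxUsageBytes * 1.1).toLong

  def main(args: Array[String]): Unit = {
    val observed = 209928192L // the lookbusy value seen in Redis above
    println(newAllocation(observed)) // 230921011 bytes, roughly 220MB
  }
}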
Open the Mesos console at http://172.31.1.11:5050 and look for the Resources idle number at the bottom left. It should now be roughly 790MB.
Timeseries in InfluxDB
In case you are curious about the raw time series stored in InfluxDB: the InfluxDB UI is available at http://172.31.2.11:31400
To log in, use root for both username and password, 172.31.2.11 as the hostname, and 31410 as the port.
After logging in, click on "Explore Data" for charmander and execute the following queries:
select memory_usage from machine where hostname='slave1' limit 200
This returns 200 data points and displays them as a histogram
To get the memory usage for the stress60mb simulator, try
select memory_usage from stats where container_name =~ /stress*/ limit 500
This returns 500 data points for the stress simulator.
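You can also pull the same series over HTTP instead of through the UI. The sketch below assumes the 0.8-era InfluxDB HTTP API is listening on the API port from the login step above; adjust the URL if your cluster exposes it elsewhere.

import java.net.{URL, URLEncoder}
import scala.io.Source

object InfluxQuery {
  def main(args: Array[String]): Unit = {
    // Same query as above, URL-encoded for the 0.8-style series endpoint
    val query = URLEncoder.encode(
      "select memory_usage from stats where container_name =~ /stress*/ limit 500", "UTF-8")
    val url = new URL(
      s"http://172.31.2.11:31410/db/charmander/series?u=root&p=root&q=$query")
    val json = Source.fromInputStream(url.openStream()).mkString
    println(json) // raw JSON time series
  }
}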
That's it, let's clean up
./bin/reset_cluster
..and head back to the Charmander Homepage