web-traffic-generator

A quick and dirty HTTP/S "organic" traffic generator.

About

Just a simple (poorly written) Python script that aimlessly "browses" the internet by starting at pre-defined ROOT_URLS and randomly "clicking" links on pages until the pre-defined MAX_DEPTH is met.
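The random walk described above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: `extract_links`, `browse`, and the `fetch` parameter are hypothetical names, the link extraction is a naive regex, and the loop is iterative for brevity. Only `ROOT_URLS` and `MAX_DEPTH` come from this README, and their values here are placeholders.

```python
import random
import re
from urllib.parse import urljoin

# Placeholder values; only the names ROOT_URLS and MAX_DEPTH appear in this README.
ROOT_URLS = ["https://example.com"]
MAX_DEPTH = 3

# Naive href matcher (an assumption for illustration, not a full HTML parser).
HREF_RE = re.compile(r'href=["\'](.*?)["\']', re.IGNORECASE)


def extract_links(base_url, html):
    """Return absolute http(s) links found in href attributes."""
    links = [urljoin(base_url, href) for href in HREF_RE.findall(html)]
    return [link for link in links if link.startswith(("http://", "https://"))]


def browse(url, depth, fetch, rng=random):
    """Follow one random link per page until depth runs out or a page has no links.

    fetch is any callable that maps a URL to its HTML text.
    Returns the list of URLs visited, in order.
    """
    visited = []
    while depth > 0:
        visited.append(url)
        links = extract_links(url, fetch(url))
        if not links:
            break  # dead end: nothing left to "click"
        url = rng.choice(links)
        depth -= 1
    return visited
```

With requests installed, something like `browse(random.choice(ROOT_URLS), MAX_DEPTH, lambda u: requests.get(u, timeout=5).text)` would drive it against live pages. Passing `fetch` in as a parameter keeps the walk testable without network access.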

I created this as a noise generator for an Incident Response / Network Defense simulation. The catch is that my simulation environment uses multiple IDS/IPS/NGFW devices that will not pass and log simple tcpreplay runs of canned traffic. I needed the traffic to be as organic as possible, essentially mimicking real users browsing the web.

Tested on Ubuntu 14.04 & 16.04 minimal, but should work on any system with Python installed.

How it works

About as simple as it gets...

First, specify a few settings at the top of the script...

The interval between HTTP GET requests is chosen at random between the following two variables...
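As a rough sketch of what such settings and the random interval might look like: `ROOT_URLS` and `MAX_DEPTH` are named in this README, while `MIN_DEPTH`, `MIN_WAIT`, `MAX_WAIT`, `pick_wait`, and `pick_depth` are hypothetical names assumed for illustration. The values mirror the sample debug output further down (3 to 10 links deep, 5 to 10 seconds between requests).

```python
import random

# Named in this README (values are placeholders):
ROOT_URLS = ["https://arstechnica.com"]
MAX_DEPTH = 10

# Hypothetical names, assumed for illustration only:
MIN_DEPTH = 3
MIN_WAIT = 5   # seconds
MAX_WAIT = 10  # seconds


def pick_wait(rng=random):
    """Return a whole number of seconds between MIN_WAIT and MAX_WAIT, inclusive."""
    return rng.randint(MIN_WAIT, MAX_WAIT)


def pick_depth(rng=random):
    """Return how many links deep to go, between MIN_DEPTH and MAX_DEPTH, inclusive."""
    return rng.randint(MIN_DEPTH, MAX_DEPTH)
```

A caller would then pause with `time.sleep(pick_wait())` between requests, so the gaps vary the way a human reader's would rather than ticking at a fixed rate.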

Dependencies

The only thing you need and might not have is requests. Grab it with:

sudo pip install requests

Usage

Create your config file first:

cp config.py.template config.py

Run the generator:

python gen.py

Troubleshooting and debugging

To get more deets on what is happening under the hood, change the Debug variable in config.py from False to True. This provides the following output...

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Traffic generator started
Diving between 3 and 10 links deep into 489 different root URLs,
Waiting between 5 and 10 seconds between requests.
This script will run indefinitely. Ctrl+C to stop.
Randomly selecting one of 489 URLs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Recursively browsing [https://arstechnica.com] ~~~ [depth = 7]
  Requesting page...
  Page size: 77.6KB
  Data meter: 77.6KB
  Good requests: 1
  Bad reqeusts: 0
  Scraping page for links
  Found 171 valid links
  Pausing for 7 seconds...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Recursively browsing [https://arstechnica.com/author/jon-brodkin/] ~~~ [depth = 6]
  Requesting page...
  Page size: 75.7KB
  Data meter: 153.3KB
  Good requests: 2
  Bad reqeusts: 0
  Scraping page for links
  Found 168 valid links
  Pausing for 9 seconds...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Recursively browsing [https://arstechnica.com/information-technology/2020/01/directv-races-to-decommission-broken-boeing-satellite-before-it-explodes/] ~~~ [depth = 5]
  Requesting page...
  Page size: 43.8KB
  Data meter: 197.1KB
  Good requests: 3
  Bad reqeusts: 0
  Scraping page for links
  Found 32 valid links
  Pausing for 8 seconds...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Recursively browsing [https://www.facebook.com/sharer.php?u=https%3A%2F%2Farstechnica.com%2F%3Fpost_type%3Dpost%26p%3D1647915] ~~~ [depth = 4]
  Requesting page...
  Page size: 64.2KB
  Data meter: 261.2KB
  Good requests: 4
  Bad reqeusts: 0
  Scraping page for links
  Found 0 valid links
  Stopping and blacklisting: no links

The last URL attempted is a good example of a dead end: the page loaded, but no valid links were found on it. We simply add it to our config.blacklist array in memory and continue browsing. This prevents a known bad URL from returning to the queue.
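The blacklisting step could be sketched like this. It is an assumption-laden illustration rather than the script's actual code: `handle_page` and its parameters are hypothetical, and a plain list stands in for config.blacklist.

```python
def handle_page(url, links, blacklist, debug=False):
    """Blacklist url if it yielded no links; return the links still worth visiting.

    blacklist is a plain list kept in memory (a stand-in for config.blacklist).
    """
    if not links:
        if url not in blacklist:
            blacklist.append(url)
        if debug:
            print("  Stopping and blacklisting: no links")
        return []
    # Drop anything already known to be a dead end.
    return [link for link in links if link not in blacklist]
```

Because the list lives only in memory, it starts empty on every run. Persisting it to disk would be a separate design choice that this README does not describe.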