Tiros
Tiros is an IPFS website measurement tool. It is intended to run on AWS ECS in multiple regions.
Measurement Methodology
We are running Tiros as a scheduled AWS ECS task in seven different AWS regions. These regions are:
eu-central-1
ap-south-1
ap-southeast-2
sa-east-1
us-east-2
us-west-1
af-south-1
Each ECS task consists of three containers:
- scheduler (this repository)
- chrome - via browserless/chrome
- ipfs - an IPFS implementation like ipfs/kubo or ipfs/helia-http-gateway
If run with kubo, we run it with LIBP2P_RCMGR=0, which disables the libp2p Network Resource Manager.
The scheduler gets configured with a list of websites that will then be probed. A typical website config looks like this: ipfs.io,docs.libp2p.io,ipld.io.
The scheduler probes each website via the IPFS implementation by requesting http://localhost:8080/ipns/<website> and via HTTP by requesting https://<website>. Port 8080 is the default kubo HTTP gateway port. The scheduler uses go-rod to communicate with the browserless/chrome instance.
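As a rough illustration of the two probe targets described above, the following Go sketch splits a comma-separated website list and builds the gateway and HTTPS URLs. The names `parseWebsites` and `probeURL` and the two type constants are hypothetical and not taken from this repository; the sketch also assumes the gateway is reachable on localhost.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical measurement type labels, used only in this sketch.
const (
	typeIPFS = "IPFS"
	typeHTTP = "HTTP"
)

// parseWebsites splits a comma-separated config value such as
// "ipfs.io,docs.libp2p.io,ipld.io" and drops empty entries.
func parseWebsites(config string) []string {
	var websites []string
	for _, w := range strings.Split(config, ",") {
		if w = strings.TrimSpace(w); w != "" {
			websites = append(websites, w)
		}
	}
	return websites
}

// probeURL returns the URL to request for a website: the local gateway's
// /ipns/ path for IPFS measurements, plain HTTPS otherwise.
func probeURL(website, mType string, gatewayPort int) string {
	if mType == typeIPFS {
		return fmt.Sprintf("http://localhost:%d/ipns/%s", gatewayPort, website)
	}
	return "https://" + website
}

func main() {
	for _, w := range parseWebsites("ipfs.io,docs.libp2p.io,ipld.io") {
		fmt.Println(probeURL(w, typeIPFS, 8080))
		fmt.Println(probeURL(w, typeHTTP, 8080))
	}
}
```

Keeping the gateway port a parameter means the same helper would also work for an IPFS implementation that listens on a different port.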
The following excerpt is a gist of what's happening when requesting a website:
browser := rod.New().Context(ctx).ControlURL("ws://localhost:3000") // default CDP chrome port
browser.Connect()
defer browser.Close()

var metricsStr string
rod.Try(func() {
	browser = browser.Context(c.Context).MustIncognito() // first defense to prevent hitting the cache
	browser.MustSetCookies()                             // second defense to prevent hitting the cache (empty args clears cookies)

	page := browser.MustPage() // Get a handle of a new page in our incognito browser

	page.MustEvalOnNewDocument(jsOnNewDocument) // third defense to prevent hitting the cache - clears the cache by running `localStorage.clear()`

	// disable caching in general
	proto.NetworkSetCacheDisabled{CacheDisabled: true}.Call(page) // fourth defense to prevent hitting the cache

	// finally navigate to url and fail out of rod.Try by panicking
	page.Timeout(websiteRequestTimeout).Navigate(url)
	page.Timeout(websiteRequestTimeout).WaitLoad()
	page.Timeout(websiteRequestTimeout).WaitIdle(time.Minute)

	page.MustEval(wrapInFn(jsTTIPolyfill))   // add TTI polyfill
	page.MustEval(wrapInFn(jsWebVitalsIIFE)) // add web-vitals

	// finally actually measure the stuff
	metricsStr = page.MustEval(jsMeasurement).Str()

	page.MustClose()
})
// parse metricsStr
jsOnNewDocument contains JavaScript that gets executed on a new page before anything else happens. We subscribe to performance events, which is necessary for the TTI polyfill, and we clear the local storage. This is the code (link to source):
// From https://github.com/GoogleChromeLabs/tti-polyfill#usage
!function(){if('PerformanceLongTaskTiming' in window){var g=window.__tti={e:[]};
g.o=new PerformanceObserver(function(l){g.e=g.e.concat(l.getEntries())});
g.o.observe({entryTypes:['longtask']})}}();
localStorage.clear();
Then, after the website has loaded, we add a TTI polyfill and web-vitals to the page.
We got the tti-polyfill from GoogleChromeLabs/tti-polyfill (archived in favor of the First Input Delay metric).
We got the web-vitals JavaScript from GoogleChrome/web-vitals by building it ourselves with npm run build and then copying the web-vitals.iife.js file (iife = immediately invoked function expression).
Then we execute the following JavaScript on that page (link to source):
async () => {
  const onTTI = async (callback) => {
    const tti = await window.ttiPolyfill.getFirstConsistentlyInteractive({})
    // https://developer.chrome.com/docs/lighthouse/performance/interactive/#how-lighthouse-determines-your-tti-score
    let rating = "good";
    if (tti > 7300) {
      rating = "poor";
    } else if (tti > 3800) {
      rating = "needs-improvement";
    }
    callback({
      name: "TTI",
      value: tti,
      rating: rating,
      delta: tti,
      entries: [],
    });
  };

  const {onCLS, onFCP, onLCP, onTTFB} = window.webVitals;

  const wrapMetric = (metricFn) =>
    new Promise((resolve, reject) => {
      const timeout = setTimeout(() => resolve(null), 10000);
      metricFn(
        (metric) => {
          clearTimeout(timeout);
          resolve(metric);
        },
        {reportAllChanges: true}
      );
    });

  const data = await Promise.all([
    wrapMetric(onCLS),
    wrapMetric(onFCP),
    wrapMetric(onLCP),
    wrapMetric(onTTFB),
    wrapMetric(onTTI),
  ]);

  return JSON.stringify(data);
}
This function will return a JSON array of the following format:
[
  {
    "name": "CLS",
    "value": 1.3750143983783765e-05,
    "rating": "good",
    ...
  },
  {
    "name": "FCP",
    "value": 872,
    "rating": "good",
    ...
  },
  {
    "name": "LCP",
    "value": 872,
    "rating": "good",
    ...
  },
  {
    "name": "TTFB",
    "value": 717,
    "rating": "good",
    ...
  },
  {
    "name": "TTI",
    "value": 999,
    "rating": "good",
    ...
  }
]
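On the Go side, the returned string then has to be decoded. The following is a minimal sketch of how that could look, not the repository's actual parsing code. The `Metric` type and `parseMetrics` function are hypothetical names, and only the fields visible in the example output are modeled. Because the measurement script resolves a metric to `null` when it does not report within 10 seconds, the sketch skips null entries.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Metric mirrors the fields visible in the example output.
// Fields elided there ("...") are ignored by this sketch.
type Metric struct {
	Name   string  `json:"name"`
	Value  float64 `json:"value"`
	Rating string  `json:"rating"`
}

// parseMetrics decodes the JSON array and skips null entries, which the
// measurement script emits when a metric does not report before its timeout.
func parseMetrics(metricsStr string) ([]Metric, error) {
	var raw []*Metric
	if err := json.Unmarshal([]byte(metricsStr), &raw); err != nil {
		return nil, fmt.Errorf("unmarshal metrics: %w", err)
	}
	metrics := make([]Metric, 0, len(raw))
	for _, m := range raw {
		if m != nil {
			metrics = append(metrics, *m)
		}
	}
	return metrics, nil
}

func main() {
	metricsStr := `[{"name":"FCP","value":872,"rating":"good"},null,{"name":"TTFB","value":717,"rating":"good"}]`
	metrics, err := parseMetrics(metricsStr)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, m := range metrics {
		fmt.Printf("%s=%v (%s)\n", m.Name, m.Value, m.Rating)
	}
}
```

Decoding into a slice of pointers is what makes the null entries distinguishable from metrics whose value happens to be zero.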
If the website request went through the IPFS gateway, we run one round of garbage collection by calling the /api/v0/repo/gc endpoint. This ensures that the next request to that website won't be served from the local kubo node's cache.
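A garbage collection call of this kind could be sketched in Go as follows. This is an illustration under assumptions, not the repository's `GarbageCollect` implementation: the helper names `gcURL` and `garbageCollect` are hypothetical, and the sketch assumes the Kubo RPC API is reachable on localhost at port 5001, the default API port shown in the `tiros run` help output.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"time"
)

// gcURL builds the repo/gc endpoint URL of a Kubo RPC API.
func gcURL(host string, apiPort int) string {
	return fmt.Sprintf("http://%s:%d/api/v0/repo/gc", host, apiPort)
}

// garbageCollect triggers one garbage collection round.
// Kubo's RPC API only accepts POST requests.
func garbageCollect(ctx context.Context, client *http.Client, url string) error {
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, nil)
	if err != nil {
		return fmt.Errorf("build request: %w", err)
	}
	resp, err := client.Do(req)
	if err != nil {
		return fmt.Errorf("call repo/gc: %w", err)
	}
	defer resp.Body.Close()
	// Read the response to the end before treating the call as finished.
	if _, err := io.Copy(io.Discard, resp.Body); err != nil {
		return fmt.Errorf("read response: %w", err)
	}
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("repo/gc returned status %d", resp.StatusCode)
	}
	return nil
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	if err := garbageCollect(ctx, http.DefaultClient, gcURL("localhost", 5001)); err != nil {
		fmt.Println("garbage collection failed:", err)
	}
}
```

Passing a context with a timeout keeps a stalled node from blocking the measurement loop indefinitely.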
To also measure a "warmed up" kubo node, we configured "settle times". A settle time is the time to wait before a round of website requests starts. The first settle time elapses before the first website requests are made. After the scheduler has looped through all websites, a second settle time of 10min elapses before all websites are requested again. Each round also has a "times" counter, which is currently set to 5 in our deployment. This means that we request each website 5 times per settle time. The loop looks like this:
for _, settle := range c.IntSlice("settle-times") {
	time.Sleep(time.Duration(settle) * time.Second)
	for i := 0; i < c.Int("times"); i++ {
		for _, mType := range []string{models.MeasurementTypeIPFS, models.MeasurementTypeHTTP} {
			for _, website := range websites {
				pr, _ := t.Probe(c, websiteURL(c, website, mType))
				t.Save(c, pr, website, mType, i)
				if mType == models.MeasurementTypeIPFS {
					t.GarbageCollect(c.Context)
				}
			}
		}
	}
}
So in total, each run performs len(settle-times) * times * len([http, ipfs]) * len(websites) website requests. In our case that is 2 * 5 * 2 * 14 = 280 requests. A run takes around 1h because some websites time out and the second settle time is configured to be 10m.
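The request count above can be reproduced with a few lines of Go. The function name `totalRequests` is hypothetical, the 14 websites are represented by an empty placeholder slice, and the first settle time of 10 seconds is an assumption taken from the CLI default rather than from the deployment.

```go
package main

import "fmt"

// totalRequests returns the number of website requests in one run:
// one round per settle time, repeated `times` times, for every
// measurement type and every website.
func totalRequests(settleTimes []int, times int, measurementTypes, websites []string) int {
	return len(settleTimes) * times * len(measurementTypes) * len(websites)
}

func main() {
	settleTimes := []int{10, 600}      // seconds; the first value is an assumption
	types := []string{"IPFS", "HTTP"}  // both measurement types
	websites := make([]string, 14)     // placeholder for the 14 configured websites

	fmt.Println(totalRequests(settleTimes, 5, types, websites)) // prints 280
}
```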
Measurement Metrics
I read up on how to measure website performance and came across this list:
https://developer.mozilla.org/en-US/docs/Learn/Performance/Perceived_performance
To quote the website:
Performance metrics
There is no single metric or test that can be run on a site to evaluate how a user "feels". However, there are a number of metrics that can be "helpful indicators":
- First paint: The time to start of first paint operation. Note that this change may not be visible; it can be a simple background color update or something even less noticeable.
- First Contentful Paint (FCP): The time until first significant rendering (e.g. of text, foreground or background image, canvas or SVG, etc.). Note that this content is not necessarily useful or meaningful.
- First Meaningful Paint (FMP): The time at which useful content is rendered to the screen.
- Largest Contentful Paint (LCP): The render time of the largest content element visible in the viewport.
- Speed index: Measures the average time for pixels on the visible screen to be painted.
- Time to interactive: Time until the UI is available for user interaction (i.e. the last long task of the load process finishes).
I think the relevant metrics on this list for us are First Contentful Paint, Largest Contentful Paint, and Time to interactive. First Meaningful Paint is deprecated (you can see that if you follow the link) and they recommend: "[...] consider using the LargestContentfulPaint API instead."
First paint would include changes that "may not be visible", so I'm not particularly fond of this metric.
Speed index seems to be very much website-specific. By that I mean that the network wouldn't play a role in this metric; we would measure the performance of the website itself. I would argue that this is not something we want.
Besides the above metrics, we should still measure timeToFirstByte. According to https://web.dev/ttfb/ the metric would be the time difference between startTime and responseStart.
In the navigation timing diagram shown in that article you can also see the two timestamps domContentLoadedEventStart and domContentLoadedEventEnd. So I would think that the domContentLoaded metric would just be the difference between the two. However, this seems to only account for the processing time of the HTML (+ deferred JS scripts).
We could instead define domContentLoaded as the time difference between startTime and domContentLoadedEventEnd.
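The two definitions can be written down as simple differences of navigation timing timestamps. The following Go sketch is only an illustration: the `NavigationTiming` type is hypothetical, the field names follow the browser's timing attributes, and the timestamps in `main` are made up.

```go
package main

import "fmt"

// NavigationTiming holds the subset of navigation timing timestamps
// (in milliseconds) that the two definitions rely on.
type NavigationTiming struct {
	StartTime                float64
	ResponseStart            float64
	DOMContentLoadedEventEnd float64
}

// TTFB is the time between the start of the navigation and the
// arrival of the first response byte.
func (n NavigationTiming) TTFB() float64 {
	return n.ResponseStart - n.StartTime
}

// DOMContentLoaded uses the proposed definition: the time between the
// start of the navigation and the end of the DOMContentLoaded event.
func (n NavigationTiming) DOMContentLoaded() float64 {
	return n.DOMContentLoadedEventEnd - n.StartTime
}

func main() {
	// Made-up timestamps for illustration only.
	nt := NavigationTiming{StartTime: 0, ResponseStart: 717, DOMContentLoadedEventEnd: 1250}
	fmt.Println(nt.TTFB(), nt.DOMContentLoaded())
}
```

Measuring both values from startTime keeps them comparable, since each then includes the network time that the IPFS and HTTP paths are expected to differ in.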
Run
You need to provide many configuration parameters to tiros. See this help page:
NAME:
tiros run
USAGE:
tiros run [command options] [arguments...]
OPTIONS:
--websites value [ --websites value ] Websites to test against. Example: 'ipfs.io' or 'filecoin.io [$TIROS_RUN_WEBSITES]
--region value In which region does this tiros task run in [$TIROS_RUN_REGION]
--settle-times value [ --settle-times value ] a list of times to settle in seconds (default: 10, 1200) [$TIROS_RUN_SETTLE_TIMES]
--times value number of times to test each URL (default: 3) [$TIROS_RUN_TIMES]
--dry-run Whether to skip DB interactions (default: false) [$TIROS_RUN_DRY_RUN]
--db-host value On which host address can this clustertest reach the database [$TIROS_RUN_DATABASE_HOST]
--db-port value On which port can this clustertest reach the database (default: 0) [$TIROS_RUN_DATABASE_PORT]
--db-name value The name of the database to use [$TIROS_RUN_DATABASE_NAME]
--db-password value The password for the database to use [$TIROS_RUN_DATABASE_PASSWORD]
--db-user value The user with which to access the database to use [$TIROS_RUN_DATABASE_USER]
--db-sslmode value The sslmode to use when connecting the the database [$TIROS_RUN_DATABASE_SSL_MODE]
--kubo-api-port value port to reach the Kubo API (default: 5001) [$TIROS_RUN_KUBO_API_PORT]
--kubo-gateway-port value port to reach the Kubo Gateway (default: 8080) [$TIROS_RUN_KUBO_GATEWAY_PORT]
--chrome-cdp-port value port to reach the Chrome DevTools Protocol port (default: 3000) [$TIROS_RUN_CHROME_CDP_PORT]
--cpu value CPU resources for this measurement run (default: 2) [$TIROS_RUN_CPU]
--memory value Memory resources for this measurement run (default: 4096) [$TIROS_RUN_MEMORY]
--help, -h show help
Development
To test the tool locally, you need to start a database, kubo node, and headless chrome. You can do all of this by running:
docker compose up -d
Then you need to point tiros to your local deployment. You can do this by sourcing the included .env.local file:
source .env.local
Finally, run tiros via:
go build -o tiros .
./tiros run
# OR
go run . run
After the run has finished, you can check the local database for the measurement data. Run:
docker exec -it tiros-db-1 psql -U tiros_test -d tiros_test
to connect to the local database. If prompted for a password, enter password or whatever is set in the .env.local file for the TIROS_RUN_DATABASE_PASSWORD environment variable.
Example output:
$ docker exec -it tiros-db-1 psql -U tiros_test -d tiros_test
psql (14.6 (Debian 14.6-1.pgdg110+1))
Type "help" for help.
tiros_test=# select * from runs;
id | region | websites | version | times | cpu | memory | updated_at | created_at | finished_at | ipfs_impl
----+--------+---------------+----------------+-------+-----+--------+-------------------------------+-------------------------------+-------------------------------+-----------
1 | local | {filecoin.io} | 0.19.0-1963219 | 1 | 2 | 4096 | 2024-03-26 09:26:07.948483+00 | 2024-03-26 09:25:30.600963+00 | 2024-03-26 09:26:07.948482+00 | KUBO
2 | local | {filecoin.io} | 0.19.0-1963219 | 1 | 2 | 4096 | 2024-03-26 09:32:05.247122+00 | 2024-03-26 09:31:28.844582+00 | 2024-03-26 09:32:05.247122+00 | KUBO
(2 rows)
Migrations
To create a new migration run:
migrate create -ext sql -dir migrations -seq create_measurements_table
To create the database models run:
make models
Alternative IPFS Implementation
An alternative IPFS implementation needs to support a couple of things:
- The /api/v0/repo/gc endpoint
- The /api/v0/version endpoint
- Expose a rudimentary IPFS Gateway that at least supports resolving IPNS links
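To make the required surface concrete, here is a hedged Go sketch of a stub that exposes these three routes. It is not a working IPFS implementation. All handler bodies are placeholders, the names `newStubMux`, `postOnly`, and `statusFor` are hypothetical, and the sketch assumes the two API endpoints are called with POST, as Kubo's RPC API requires.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// postOnly rejects anything but POST, mirroring how Kubo's RPC API behaves.
func postOnly(next http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		next(w, r)
	}
}

// newStubMux wires up the three required routes with placeholder bodies.
func newStubMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/api/v0/repo/gc", postOnly(func(w http.ResponseWriter, r *http.Request) {
		// A real implementation would drop its locally cached blocks here.
		w.WriteHeader(http.StatusOK)
	}))
	mux.HandleFunc("/api/v0/version", postOnly(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"Version":"0.0.0-stub"}`)
	}))
	mux.HandleFunc("/ipns/", func(w http.ResponseWriter, r *http.Request) {
		// A real gateway would resolve the IPNS name and serve its content.
		fmt.Fprintf(w, "placeholder for %s", r.URL.Path)
	})
	return mux
}

// statusFor sends an in-memory request to the handler and returns the status code.
func statusFor(h http.Handler, method, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(method, path, nil))
	return rec.Code
}

func main() {
	mux := newStubMux()
	fmt.Println(statusFor(mux, http.MethodPost, "/api/v0/repo/gc"))
	fmt.Println(statusFor(mux, http.MethodPost, "/api/v0/version"))
	fmt.Println(statusFor(mux, http.MethodGet, "/ipns/ipfs.io"))
}
```

The in-memory recorder only demonstrates the routing; serving the mux with `http.ListenAndServe` on the API and gateway ports would be the next step for a real experiment.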
Maintainers
Contributing
Feel free to dive in! Open an issue or submit PRs.
License
MIT © Dennis Trautwein