Benchmark5

Hello, today we measure the performance of the hinsightd web server against previous versions of itself and other common web servers. We measure the number of requests each server can process at several concurrency levels, to see how it handles a high load and how fast it can process requests under stress.

For the moment the tests only measure static file serving, because that's the easiest to measure reliably and it's also the focus of the server's development at this time. Stay tuned for reverse proxy and FastCGI benchmarks.

The three versions of the hinsightd server are all event-based, with the 0.10 branch also being multi-threaded. The rest of the servers are mostly default installations, with only minor changes from the default config to suit the test.

index.html

index.html concurrency graph

The target is a small 102-byte file. This tests how fast a server can open a connection, parse headers, and close the connection, so it measures how fast a server processes requests that don't involve any other bottleneck. While this is an ideal case, it is worth mentioning that hinsightd keeps up with and even beats the fastest servers tested.

| test | hinsightd/0.9.20 | hinsightd/0.10 unpatched | hinsightd/0.10.2 | nginx/1.25.3 | lighttpd/1.4.73 | Apache/2.4.58 | caddy/2.7.6 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ab -c 50 | 92944.58 | 123365.41 | 124046.39 | 106615.49 | 92842.75 | 46338.77 | 38366.36 |
| ab -c 100 | 94345.85 | 120250.12 | 121654.5 | 105412.96 | 90651.97 | 45797.61 | 38107.58 |
| ab -c 250 | 89390.27 | 105693.72 | 106805.66 | 99929.05 | 87581.01 | 45026.18 | 36292.24 |
| ab -c 500 | 76104.66 | 88635.2 | 89766.61 | 88397.79 | 82109.9 | 42196.05 | 34878.33 |
| ab -c 750 | 69969.7 | 81203.76 | 81240.71 | 80171.25 | 74577.33 | 39665.23 | 33404.04 |
| ab -c 1000 | 65129.18 | 74870.47 | 75520.14 | 74467.93 | 73565.65 | 38595.88 | 30630.78 |

jquery.js

jquery.js concurrency graph

This target is a medium-sized 89.7kB file served with sendfile support disabled, so this is primarily a disk bandwidth test. The low hinsightd results compared to the other servers are still a work in progress.

| test | hinsightd/0.9.20 | hinsightd/0.10 unpatched | hinsightd/0.10.2 | nginx/1.25.3 | lighttpd/1.4.73 | Apache/2.4.58 | caddy/2.7.6 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ab -c 50 | 3210.75 | 26010.78 | 23656.27 | 34926.93 | 34674.42 | 30386.36 | 29233.18 |
| ab -c 100 | 12762.0 | 25267.33 | 23560.4 | 31371.56 | 32054.57 | 26297.66 | 28916.61 |
| ab -c 250 | 11118.68 | 25201.93 | 24729.22 | 30492.17 | 32994.04 | 24009.72 | 27589.63 |
| ab -c 500 | 9625.7 | 23997.79 | 24807.19 | 29544.02 | 32963.92 | 22820.63 | 25792.07 |
| ab -c 750 | 9319.79 | 21840.07 | 24299.03 | 28783.02 | 29456.56 | 21854.96 | 24772.4 |
| ab -c 1000 | 9253.43 | 20960.05 | 22934.04 | 27815.02 | 28395.05 | 21859.79 | 23612.47 |

compressed jquery via gzip

jquery.js concurrency graph

Finally, a worthy test of the multithreaded capabilities of the new 0.10 branch. In the previous post we tried to benchmark SSL connections, but that didn't give good results on a single computer, so this is a replacement benchmark where we correctly test the server, not the benchmark program.

Here we test dynamic gzip requests: the server should grab a medium-sized file off the disk, apply the same CPU-intensive algorithm and then respond. Because applying the gzip algorithm takes several times longer than any other processing, we're basically testing disk bandwidth with a CPU bottleneck.
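To get a feel for why compression dominates this test, here is a minimal Python sketch that gzips a payload in memory and times it. It is not how any of the tested servers implement compression: the function names, the compression level and the number of rounds are all assumptions made for illustration.

```python
import gzip
import time


def gzip_response(body: bytes, level: int = 6) -> bytes:
    """Compress a response body the way a dynamic gzip filter might.

    The level of 6 is an assumption, not taken from any tested server.
    """
    return gzip.compress(body, compresslevel=level)


def time_compress(body: bytes, rounds: int = 20) -> float:
    """Return the average number of seconds spent compressing `body` once."""
    start = time.perf_counter()
    for _ in range(rounds):
        gzip_response(body)
    return (time.perf_counter() - start) / rounds


if __name__ == "__main__":
    # Stand-in payload of roughly the same size class as the jquery file.
    payload = b"function(){return 1;}" * 4000
    print(f"{len(payload)} bytes -> {len(gzip_response(payload))} bytes")
    print(f"average compress time: {time_compress(payload) * 1e6:.0f} us")
```

Since every request pays this CPU cost, a server that can spread the work over several cores should scale with the core count, while a single-threaded server stays pinned to one.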

As expected, most servers are showing horizontal lines. The weak results of hinsightd 0.9.20 are not worrying: we're running on a 4-core machine and, as noted, hinsightd/0.9 is single-threaded only, but it's outputting straight horizontal lines, which is good.

We had to enable multiple workers for lighttpd despite it being advised against. Otherwise its line wouldn't wiggle as much, but it would sit right next to the single-threaded hinsightd v0.9 line. Single-threaded servers are really not suited to this kind of test.

And no, hinsightd/0.10 unpatched isn't 10 times faster than the other servers; that's just a bug, left in for reference.

| test | hinsightd/0.9.20 | hinsightd/0.10 unpatched | hinsightd/0.10.2 | nginx/1.25.3 | lighttpd/1.4.73 | Apache/2.4.58 | caddy/2.7.6 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ab -c 50 | 270.44 | 1106.35 | 727.16 | 973.17 | 822.39 | 880.28 | 795.4 |
| ab -c 100 | 269.26 | 1098.19 | 702.17 | 975.11 | 727.86 | 878.18 | 806.48 |
| ab -c 250 | 267.51 | 1153.42 | 722.49 | 953.28 | 552.99 | 870.89 | 802.6 |
| ab -c 500 | 262.42 | 1290.8 | 728.32 | 965.61 | 621.38 | 860.48 | 812.53 |
| ab -c 750 | 258.7 | 1622.57 | 737.46 | 956.14 | 578.91 | 855.3 | 811.1 |
| ab -c 1000 | 254.25 | 2265.3 | 726.06 | 930.04 | 372.22 | 844.06 | 811.74 |

Conclusions

Performance is very tricky to measure, because you have to double-check and doubt all of your results or you end up knowing less than you knew before you started. If you're interested in more, you can check out parts I, II, III or IV of the benchmarks series.

Notes:

  1. "hin no patch" (hinsightd/0.10 unpatched) means commit d0ec626012a2c9bb1edbb4b5c1349cc9da582632, a version prior to fixing the bug mentioned in the post.
  2. tests were run on a 4-core machine with no hyperthreading. hinsightd 0.10, nginx and lighttpd have 4 workers/threads enabled; the rest are left at the default.
  3. tests are run several times for each concurrency level and only the highest result is taken into account.
  4. nginx config, lighttpd config, caddy config. The apache2 config is too long to post, but it uses the worker MPM with ThreadsPerChild 64; previously we used the event MPM with the default settings.
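
The best-of-several-runs procedure from note 3 can be sketched roughly as below. This is not the script used for these results: the helper names, the number of runs and the `-n 10000` request count are assumptions (the request count is borrowed from the ab commands shown in the older posts). It relies only on the "Requests per second" line that ab prints in its summary.

```python
import re
import subprocess

# ab prints a summary line such as:
#   Requests per second:    92944.58 [#/sec] (mean)
RPS_RE = re.compile(r"^Requests per second:\s+([\d.]+)", re.MULTILINE)


def parse_rps(ab_output: str) -> float:
    """Extract the mean requests/sec figure from ab's text output."""
    match = RPS_RE.search(ab_output)
    if match is None:
        raise ValueError("no 'Requests per second' line found")
    return float(match.group(1))


def best_of(url: str, concurrency: int, runs: int = 5, requests: int = 10000) -> float:
    """Run ab `runs` times and keep only the highest requests/sec value.

    `runs` and `requests` are illustrative defaults, not the post's settings.
    """
    results = []
    for _ in range(runs):
        completed = subprocess.run(
            ["ab", "-c", str(concurrency), "-n", str(requests), url],
            capture_output=True,
            text=True,
            check=True,
        )
        results.append(parse_rps(completed.stdout))
    return max(results)
```

Taking the maximum rather than the mean filters out runs disturbed by background activity, at the cost of hiding run-to-run variance.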
Related topics: benchmark, hinsightd

Benchmark4

posted Last year

Hello, this is another set of benchmarks of the hinsightd webserver. Again, hinsightd v0.10 is a multithreaded server and the current branch under active development, while hinsightd v0.9 is a single-threaded server with more features but also more bugs.

What we test here is how fast the server responds to increases in concurrent connections. While this is not a perfect metric for servers, a webserver that can respond to more requests in a given time period is not only more likely to resist a DDoS attack but you can also add more processing per request before the delays become noticeable. Please consult a full list of features before deciding which webserver to use for your own site.

index.html

index.html concurrency graph

Testing a small 102-byte file. This tests how fast a server can open a connection, parse headers, and close the connection; nothing that impressive. But it should be noted that the multithreaded branch is the fastest server tested.

| test | hinsightd/0.9.19 | hinsightd/0.10.1 | nginx/1.25.3 | lighttpd/1.4.72 | Apache/2.4.58 | caddy/2.7.5 |
| --- | --- | --- | --- | --- | --- | --- |
| index.html -c 50 | 93850.01 | 127296.1 | 117020.65 | 56887.03 | 40069.56 | 37654.71 |
| index.html -c 100 | 95571.23 | 121204.78 | 114876.51 | 56396.17 | 41065.91 | 36727.71 |
| index.html -c 250 | 90076.29 | 106621.17 | 106724.72 | 53473.93 | 38698.64 | 35235.71 |
| index.html -c 500 | 78003.12 | 91496.33 | 91003.4 | 50046.54 | 35879.2 | 33626.33 |
| index.html -c 750 | 71083.31 | 83508.7 | 82136.87 | 50651.63 | 35209.66 | 32187.05 |
| index.html -c 1000 | 66276.52 | 77152.16 | 76078.6 | 48633.87 | 33906.01 | 31376.68 |

jquery.js

jquery.js concurrency graph

For a more realistic test we use the current jquery library, an 89.7kB minified js file. Unfortunately my server doesn't support sendfile, so to make the test 'fair' I've disabled it across the board. I actually believe that without sendfile this is a more realistic benchmark, because in the web development world of today you rarely have unencrypted or uncompressed connections. This should test how fast a server is able to get data from the disk and push it to the http client with minimal changes.
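
For readers unfamiliar with the distinction, serving a file without sendfile means the data passes through a userspace buffer on its way to the socket. The Python sketch below shows that read/write loop in its simplest form; the function name and chunk size are assumptions for illustration, and it works on any pair of binary file-like objects rather than on a real server socket.

```python
import io


def copy_without_sendfile(src, dst, chunk_size: int = 64 * 1024) -> int:
    """Copy `src` to `dst` through a userspace buffer, chunk by chunk.

    This is the path a server takes when sendfile is disabled: each chunk
    is read into process memory and then written out again. With sendfile
    (os.sendfile in Python) the kernel moves the bytes directly instead.
    Returns the number of bytes copied.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # empty read means end of file
            break
        dst.write(chunk)
        total += len(chunk)
    return total


if __name__ == "__main__":
    source = io.BytesIO(b"x" * 89_700)  # stand-in for the jquery file
    sink = io.BytesIO()                 # stand-in for the client socket
    print(copy_without_sendfile(source, sink), "bytes copied")
```

The extra copy is also what makes it possible to compress or encrypt the data on the way out, which is why disabling sendfile is a reasonable proxy for real traffic.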

In previous benchmarks (and in the 0.9 branch) the lack of the TCP_CORK flag really slowed everything down. Adding support for it vastly improved performance, but it's still below other servers. Further research and optimization are needed.
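
As an illustration of what corking does (not hinsightd's actual code), the Python sketch below sets the Linux-only TCP_CORK socket option before sending the header and the body, then clears it so the kernel flushes everything queued. The `send_corked` and `demo` names and the loopback setup are assumptions made for the example; on platforms without TCP_CORK the helper simply sends without corking.

```python
import socket


def send_corked(conn: socket.socket, header: bytes, body: bytes) -> None:
    """Send header and body while the socket is corked (Linux only)."""
    cork = getattr(socket, "TCP_CORK", None)  # only defined on Linux
    if cork is not None:
        # While corked, the kernel holds back partial frames.
        conn.setsockopt(socket.IPPROTO_TCP, cork, 1)
    try:
        conn.sendall(header)
        conn.sendall(body)
    finally:
        if cork is not None:
            # Uncorking flushes whatever is still queued.
            conn.setsockopt(socket.IPPROTO_TCP, cork, 0)


def demo() -> bytes:
    """Send one corked response over loopback and return what the client read."""
    listener = socket.create_server(("127.0.0.1", 0))  # port 0: pick a free port
    port = listener.getsockname()[1]
    client = socket.create_connection(("127.0.0.1", port))
    server_side, _ = listener.accept()
    header = b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\n"
    send_corked(server_side, header, b"hello")
    server_side.close()
    received = b""
    while chunk := client.recv(4096):
        received += chunk
    client.close()
    listener.close()
    return received


if __name__ == "__main__":
    print(demo())
```

Without the cork, a small header written on its own can leave as a separate packet ahead of the file data, which costs more the more requests are in flight.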

| test | hinsightd/0.9.19 | hinsightd/0.10.1 | nginx/1.25.3 | lighttpd/1.4.72 | Apache/2.4.58 | caddy/2.7.5 |
| --- | --- | --- | --- | --- | --- | --- |
| jquery.js -c 50 | 2103.32 | 28276.54 | 36029.15 | 18746.64 | 26397.06 | 28822.85 |
| jquery.js -c 100 | 3288.06 | 27483.85 | 32395.48 | 18618.75 | 27594.5 | 28424.11 |
| jquery.js -c 250 | 5974.85 | 25550.88 | 31045.68 | 18116.5 | 24566.16 | 27583.62 |
| jquery.js -c 500 | 10033.39 | 23537.39 | 30233.58 | 17595.81 | 23997.39 | 26233.43 |
| jquery.js -c 750 | 10200.92 | 21585.31 | 29622.17 | 17745.63 | 23632.95 | 25614.69 |
| jquery.js -c 1000 | 9777.69 | 21598.88 | 28680.33 | 17556.98 | 23047.9 | 24684.84 |

index.html via ssl

While I would like to benchmark SSL connections, previous tests proved that this is impossible without a secondary machine. Testing SSL connections is very CPU intensive, so without another machine it just tests how fast the benchmark software can create connections on a CPU-bottlenecked system, which is not very useful data.

If you find these tests useful, a more comprehensive set of tests and benchmarks is included with the source code in the external/tests directory.

Notes:

  1. tests were run on a 4-core machine with no hyperthreading. hinsightd 0.10.1 and nginx have 4 workers/threads enabled; the rest are left at the default.
  2. tests are run 6 times for each concurrency level and only the highest result is taken into account.
  3. both hinsightd branches are using the current HEAD.
  4. nginx config
  5. lighttpd config
  6. the apache2 config is too complicated to post, but it's the distribution default.
  7. caddy config
posted Last year 📝 by tiotags

Benchmark3

posted Last year

This post is about concurrency tests for the hinsightd server. Hinsightd v0.9 is a purely single threaded/single process webserver while the new v0.10 rewrite is a multithreaded webserver using mostly the same code as hinsightd v0.9. Currently the rewrite is still a work in progress so it doesn't have all the features of the previous version.

Edit: a flaw was detected in the testing methodology so the results are fairly useless.

index.html

index.html concurrency graph

Testing against a small file. This basically tests how fast a server can open a connection, parse headers, and close the connection. I have to say I'm surprised by the results: in my tests my development branch is always faster than nginx, but for these tests I had to disable access logging in nginx, and I did not expect such a large difference. In any case, time for optimizations.

jquery.js

jquery.js concurrency graph

This tests against a medium-sized file. My own server seems to have some persistent bandwidth issues that I haven't pinpointed yet. The apache2 line is so bad because apache keeps throwing errors; usually this gets fixed by increasing the number of iterations, but for a medium-sized file that takes too long for me to test.

index.html via ssl

ssl index.html concurrency graph

This test is the one that surprises me the most. I wanted to implement threads in the server because it should logically increase SSL throughput, but this test shows exactly the opposite. I have to investigate further; I don't understand why threading would decrease performance for SSL processing unless OpenSSL is single-threaded, but as far as I know this is not the case.

Edit: This benchmark is basically useless because I tested on a single machine, so both the benchmark software and the server were fighting for the same resources. This is true for the tests above too, but if the CPU is not bottlenecked the effect is slightly smaller. I will retest some day when I get access to another Linux machine.

| test | hinsightd/0.10.1 | hinsightd/0.9.19 | nginx/1.25.2 | lighttpd/1.4.71 | Apache/2.4.57 | caddy/2.7.4 |
| --- | --- | --- | --- | --- | --- | --- |
| index.html -c 50 | 88235.55 | 89009.15 | 111221.1 | 57542.03 | 41061.69 | 37059.68 |
| index.html -c 100 | 113072.29 | 90904.96 | 109440.32 | 57628.58 | 41711.33 | 36194.77 |
| index.html -c 250 | 93982.31 | 86229.94 | 103420.1 | 54405.19 | 38868.61 | 35264.54 |
| index.html -c 500 | 82594.8 | 75005.81 | 90039.8 | 50410.34 | 38918.99 | 33320.12 |
| index.html -c 750 | 77411.36 | 68771.53 | 81939.0 | 50894.73 | 38114.99 | 32053.03 |
| index.html -c 1000 | 72974.06 | 64592.39 | 75759.87 | 49168.32 | 37654.15 | 31623.95 |
| jquery.js -c 50 | 2534.11 | 2205.48 | 30930.45 | 17948.91 | 26170.89 | 28317.06 |
| jquery.js -c 100 | 10683.82 | 3621.08 | 28285.42 | 17870.61 | 25567.6 | 28060.88 |
| jquery.js -c 250 | 13657.51 | 6018.14 | 27123.35 | 17457.67 | 24235.43 | 27269.8 |
| jquery.js -c 500 | 13657.47 | 8494.7 | 26590.51 | 17128.92 | 23576.01 | 25962.63 |
| jquery.js -c 750 | 13221.6 | 8376.68 | 25870.81 | 17267.22 | 6619.22 | 25533.33 |
| jquery.js -c 1000 | 12419.48 | 8177.4 | 24944.75 | 16978.42 | 648.32 | 25040.0 |
| ssl_index.html -c 50 | 29414.62 | 33393.0 | 33263.59 | 1725.03 | 14528.04 | 16864.54 |
| ssl_index.html -c 100 | 21528.99 | 26042.68 | 23523.49 | 1726.26 | 5700.76 | 12044.7 |
| ssl_index.html -c 250 | 11090.64 | 14406.29 | 12238.76 | 1395.36 | 5073.35 | 5912.12 |
| ssl_index.html -c 500 | 6121.92 | 9036.61 | 6503.25 | 1055.53 | 3271.95 | 2937.05 |
| ssl_index.html -c 750 | 4135.2 | 6644.76 | 4363.63 | 1274.92 | 3496.11 | 1950.87 |
| ssl_index.html -c 1000 | 3081.8 | 5260.24 | 3370.99 | 987.59 | 5125.96 | 1460.06 |

Notes:

  1. tests were run on a 4-core machine with no hyperthreading, with 4 threads enabled in hinsightd 0.10.1 and nginx.
  2. tests are run 5 times for each concurrency level and only the highest result is taken into account.
  3. nginx config
  4. lighttpd config
  5. the apache2 config is complicated and has many files so I will not post it here; it's also mostly the distribution default.
  6. caddy config
  7. due to testing on a single machine the SSL benchmark is basically useless.
posted Last year 📝 by tiotags

Benchmark2

New benchmarks are here. After several optimization patches and some unexplained performance improvements, here we are.

Optimization takes a lot of time and depends on trial and error plus modern magic. Old code is rarely in good shape and frequently breaks at the slightest change, insufficient documentation also hampers optimization efforts, and that dreaded feeling of "what does this do again?" is encountered way too often. But the biggest problem by far is motivation: it's very hard to justify spending 90% of the time to increase performance by 10%. Despite all of this, sometimes magic prevails and the stars align to improve performance. Though I have to be honest, I wish I knew why moving 3 lines of code to another function increases performance by 15k req/s.

So on to the tests. Spoiler alert: we managed to beat Lighttpd.

For 250 concurrency: ab -k -c 250 -n 10000 http://localhost:<port>/

| server and version | failed requests | requests/sec (mean) |
| --- | --- | --- |
| hinsightd/0.9.17 | 0 | 63035.01 |
| nginx/1.23.2 | 0 | 26673.64 |
| lighttpd/1.4.67 | 0 | 53693.29 |
| Apache/2.4.54 | 414 | 37474.10 |
| Caddy/2.6.2 | 0 | 35412.02 |

For 500 concurrency: ab -k -c 500 -n 10000 http://localhost:<port>/

| server and version | failed requests | requests/sec (mean) |
| --- | --- | --- |
| hinsightd/0.9.17 | 0 | 54984.63 |
| nginx/1.23.2 | 0 | 26172.73 |
| lighttpd/1.4.67 | 215 | 1613.59 |
| Apache/2.4.54 | 1221 | 34305.55 |
| Caddy/2.6.2 | 0 | 33995.57 |

Notes:

Please note that I am a server developer, not a server administrator, and might not know all the ways to increase the performance of other servers, while I know how to get every last bit of performance out of my own. So please take these kinds of tests with a grain of salt.

Due to rapid io_uring development and possible variance, it needs to be noted that we are using Linux kernel version 6.0.9.

I wish I had the time to investigate why Lighttpd starts going very slowly when overloaded.

posted 2 years, 44 days ago 📝 by tiotags

Benchmark1

Performance usually comes with tradeoffs. These can take many forms, from skipping security checks to only optimizing a small part of the program that is being benchmarked. If you let performance guide a design other aspects of a program can suffer. Or in other words it doesn't matter how fast a broken program runs.

Despite optimization not being a priority, some benchmarks are so interesting that they are worth sharing. The condensing patch referenced below condenses multiple network write requests into a single one.
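
The idea behind condensing writes can be sketched in a few lines of Python. This is only an illustration of the general technique, not the actual patch: `CondensingWriter` is a hypothetical helper, and `raw_write` stands in for whatever call really hits the network (for example a socket's `sendall`).

```python
class CondensingWriter:
    """Hypothetical helper: queue small writes and emit them as one write."""

    def __init__(self, raw_write):
        # raw_write is any callable taking bytes, e.g. sock.sendall.
        self._raw_write = raw_write
        self._pending = []

    def write(self, data: bytes) -> None:
        """Queue data instead of issuing a network write right away."""
        self._pending.append(data)

    def flush(self) -> None:
        """Join everything queued so far and issue a single underlying write."""
        if self._pending:
            self._raw_write(b"".join(self._pending))
            self._pending.clear()


if __name__ == "__main__":
    calls = []  # records each underlying write
    writer = CondensingWriter(calls.append)
    writer.write(b"HTTP/1.1 200 OK\r\n")
    writer.write(b"Content-Length: 5\r\n\r\n")
    writer.write(b"hello")
    writer.flush()
    print(len(calls), "underlying write(s)")
```

Three logical writes become one call to the underlying writer, trading a small copy in userspace for fewer system calls per request.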

For 500 concurrency: ab -k -c 500 -n 10000 http://localhost:<port>/

| server and version | requests/sec (mean) |
| --- | --- |
| before patches | 22790.46 |
| -O2 alone | 23167.02 |
| After the condensing patch | 26393.86 |
| After patch + -O2 | 30257.64 |
| nginx/1.21.6 | 26683.67 |
| lighttpd/1.4.64 | fails at 500 concurrency |
| Apache/2.4.54 | fails at 500 concurrency |

Bonus, for 100 concurrency, due to the other servers giving errors: ab -k -c 100 -n 10000 http://localhost:<port>/

| server and version | requests/sec (mean) |
| --- | --- |
| after patch + -O2 | 32136.88 |
| nginx/1.21.6 | 26788.39 |
| lighttpd/1.4.64 | 56971.62 |
| Apache/2.4.54 | fails at 100 concurrency |

Switching from -Os to -O3 has a small performance impact, but reducing write requests adds around 4k req/s (about 20% faster). Looking at the other servers' defaults, I have to say I'm confused: I thought nginx would be faster, though it does handle concurrency better than either apache or lighttpd. Also, it seems lighttpd is really fast at static requests.

While testing consumes time and I can't say I'm a big fan of optimization, free performance with minimal tradeoffs is best performance.

Now on to figuring out why file transfers are so slow.

