diff --git a/blog/site-update-rss-bandwidth-2021-01-14.markdown b/blog/site-update-rss-bandwidth-2021-01-14.markdown
new file mode 100644
index 0000000..ce68c48
--- /dev/null
+++ b/blog/site-update-rss-bandwidth-2021-01-14.markdown
@@ -0,0 +1,69 @@
+---
+title: "Site Update: RSS Bandwidth Fixes"
+date: 2021-01-14
+tags:
+ - devops
+ - optimization
+---
+
+# Site Update: RSS Bandwidth Fixes
+
+Well, so I think I found out where my Kubernetes cluster cost came from. For
+context, this blog gets a lot of traffic. Since the last deploy, my blog has
+served its RSS feed over 19,000 times. I had some pretty naive code powering
+the RSS feed. It basically looked something like this:
+
+- Write RSS feed content-type and beginning of feed
+- For every post I have ever made, include its metadata and content
+- Write end of RSS feed
+
+This code was _fantastically simple_ to develop, but it was very expensive
+in terms of bandwidth. When you add all this up, my RSS feed used to be more
+than a _one megabyte_ response. It was also only getting larger as I posted more
+content.
+
+This is unsustainable, so I have taken multiple actions to try and fix this from
+several angles.
+

+Rationale: this is my
+most commonly hit and largest endpoint. I want to try and cut down its size.
+
+current feed (everything): 1356706 bytes
+20 posts: 177931 bytes
+10 posts: 53004 bytes
+5 posts: 29318 bytes pic.twitter.com/snjnn8RFh8
+
+— Cadey A. Ratio (@theprincessxena) January 15, 2021
+
+[Yes, that graph is showing in _gigabytes_. We're so lucky that bandwidth is
+free on Hetzner.](conversation://Mara/hacker)
+
+First, I finally set up the site to run behind Cloudflare. The Cloudflare
+settings are set very permissively, so your RSS feed reading bots or whatever
+should NOT be affected by this change. If you run into any side effects as a
+result of this change, [contact me](/contact) and I can fix it.
+
+Second, I now set cache control headers on every response. By default the
+"static" pages are cached for a day and the "dynamic" pages are cached for 5
+minutes. This should allow new posts to show up as quickly as they did before.
+
+Third, I set up
+[ETags](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/ETag) for the
+feeds. Each of my feeds will send an ETag in a response header. Please use this
+tag in future requests to ensure that you don't ask for content you already
+have. From what I recall most RSS readers should already support this, but
+I'll monitor the situation as reality demands.
+
+Lastly, I adjusted the
+[ttl](https://cyber.harvard.edu/rss/rss.html#ltttlgtSubelementOfLtchannelgt) of
+the RSS feed so that compliant feed readers should only check once per day. I've
+seen some feed readers request the feed up to every 5 minutes, which is
+excessive. Hopefully this setting will gently nudge them into behaving.
+
+As a nice side effect I should have slightly lower RAM usage on the blog server
+too! Right now it's sitting at about 58 and a half MB of RAM, but with fewer
+copies of my posts sitting in memory this should fall by a significant amount.
+
+If you have any feedback about this, please [contact me](/contact) or mention me
+on Twitter. I read my email frequently and am notified about Twitter mentions
+very quickly.
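
For anyone maintaining a feed reader, the ETag and cache-control behaviour the post describes can be sketched roughly as follows. This is not the blog's actual code. The names `make_etag` and `respond`, the hash-based tag scheme, and the 300 second `max_age` default (taken from the "dynamic" figure in the post) are all assumptions made for illustration, and the `If-None-Match` parsing is deliberately simplified.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the body (hypothetical scheme, not the blog's)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(
    body: bytes,
    if_none_match: Optional[str],
    max_age: int = 300,  # assumption: feeds are treated as "dynamic" pages
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a feed request.

    If the client sends back an ETag it already holds and the feed has not
    changed, answer 304 with an empty payload instead of the full feed.
    """
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": f"public, max-age={max_age}"}

    if if_none_match is not None:
        for candidate in if_none_match.split(","):
            candidate = candidate.strip()
            # If-None-Match uses weak comparison, so ignore a leading W/ marker.
            if candidate.startswith("W/"):
                candidate = candidate[2:]
            if candidate == etag:
                return 304, headers, b""

    return 200, headers, body


if __name__ == "__main__":
    feed = b"<rss><channel><title>demo</title></channel></rss>"
    status, headers, payload = respond(feed, None)
    print(status, len(payload))   # first fetch: full body
    status, _, payload = respond(feed, headers["ETag"])
    print(status, len(payload))   # revalidation: empty body
```

A reader that stores the `ETag` from its first response and sends it back in `If-None-Match` only pays for the full feed when the content has actually changed.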