Shove

Say hello to shove, my brand new HTTP file server and content manager!

Shove is an S3-backed HTTP file server that handles live reloads, partial updates, and even HTTP Basic Auth, and it has now replaced Caddy as the HTTP server that serves this very blog[1]! This blog post will be part explanation, part advertisement and part me being excited about my latest project!

Current Functionality

Currently, there are three main commands inside shove - protect, upload and serve - which I’ll briefly explain in that order. I’ll then go over some of the interesting things I did with each.

Protect

shove protect deals with shove’s HTTP Basic Auth - internally it’s modelled as a set of users (each of which has a username, a password and a uuid) and a list of realms (each of which has a pattern to protect and a list of uuids that can access it).

You run shove protect from wherever you’re uploading from (to get easy access to environment variables), add the users and add the realms. The default behaviour for paths which don’t match any patterns is just to allow anyone and everyone access to those files - for example, my blog currently has no access control in place.
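
For a rough idea of the shape, here’s a sketch of that model (the field names are my guesses rather than shove’s actual definitions):

use uuid::Uuid;

// A sketch of the model described above - not shove's real definitions.
pub struct User {
    pub uuid: Uuid,
    pub username: String,
    pub password_hash: String,
}

pub struct Realm {
    // which paths this realm protects - simplified to a plain string here;
    // the real thing supports exact matches, regexes and so on
    pub pattern: String,
    // the uuids of the users allowed through
    pub allowed_users: Vec<Uuid>,
}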

Upload

shove upload uploads the given directory to the given S3 bucket - as far as the user is concerned, that’s all it does.

The complexity comes from the fact that it only uploads new or changed files and deletes old ones - to achieve this, there’s a data file in the root directory that stores a hash of each file. shove upload first reads in all the local files, then fetches that data file, checks the hashes and only uploads the files that have changed.

That’s why shove upload can’t just upload into a ‘current directory’ - it has to have somewhere in the bucket to put all the files where there’s a guarantee that the data file (and the auth file) won’t clash.
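
Under the hood that diff is just a comparison of two path-to-hash maps, roughly like this sketch (plan_sync and the exact types are illustrative, not shove’s real code):

use std::collections::HashMap;
use std::path::PathBuf;

// Given the hashes stored in the bucket and the hashes of the local files,
// work out what actually needs to move.
pub fn plan_sync(
    remote: &HashMap<PathBuf, String>,
    local: &HashMap<PathBuf, String>,
) -> (Vec<PathBuf>, Vec<PathBuf>) {
    // New files, or files whose hash no longer matches, get uploaded.
    let to_upload: Vec<PathBuf> = local
        .iter()
        .filter(|(path, hash)| remote.get(*path) != Some(*hash))
        .map(|(path, _)| path.clone())
        .collect();

    // Files that exist remotely but not locally any more get deleted.
    let to_delete: Vec<PathBuf> = remote
        .keys()
        .filter(|path| !local.contains_key(*path))
        .cloned()
        .collect();

    (to_upload, to_delete)
}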

Serve

By far the most complicated part, however, is shove serve - this is the general lifecycle on startup:

  1. Read in the data file from S3
  2. Find all the files
  3. Read them in from S3 and put them into a cache

Then, when we need to serve a file, it does this (sketched in code below the list):

  1. Check for access control - if that fails, send a 401.
  2. Check if the file is in the cache - if so, send it.
  3. If it isn’t, grab it from S3, put it in the cache and serve it.
  4. If we couldn’t find it in S3, serve a 404 page.
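
In code, that request path looks roughly like the following (a simplified sketch with stand-in names and types - the real thing is an async axum handler talking to S3):

use std::collections::HashMap;

// What a request can end in.
pub enum Outcome {
    Found(Vec<u8>), // 200 with the file body
    Unauthorised,   // 401
    NotFound,       // 404
}

pub struct Server {
    cache: HashMap<String, Vec<u8>>,
}

impl Server {
    fn allowed(&self, _path: &str, _authorization: Option<&str>) -> bool {
        // ... check the realms and users, as described in the Protect section ...
        true
    }

    fn fetch_from_s3(&self, _path: &str) -> Option<Vec<u8>> {
        // ... GET the object from the bucket ...
        None
    }

    pub fn serve(&mut self, path: &str, authorization: Option<&str>) -> Outcome {
        // 1. Access control first - bail with a 401 if it fails.
        if !self.allowed(path, authorization) {
            return Outcome::Unauthorised;
        }
        // 2. Cache hit - just send it.
        if let Some(body) = self.cache.get(path) {
            return Outcome::Found(body.clone());
        }
        // 3. Cache miss - try S3, cache the result, then send it.
        if let Some(body) = self.fetch_from_s3(path) {
            self.cache.insert(path.to_owned(), body.clone());
            return Outcome::Found(body);
        }
        // 4. Not in S3 either - 404.
        Outcome::NotFound
    }
}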

That’s all simple[2]. The complicated part is the live reloading, which goes a little something like this:

  1. Check if we need to reload - Tigris has a webhook, or we can just re-fetch the upload data every 60s and see if that file has changed.
  2. Work out which files have changed, and add their new versions to our cache.
  3. Work out which files have gone missing, and evict them from our cache.
  4. Send a notice to all the websockets connected to reload their pages.

Fun Tricks & Stories

Protect

HTTP Basic Auth was certainly interesting to get working - I wanted to have some form of authentication, and got 90% of the way through implementing SCRAM-SHA-256[3] before I realised that it’s meant for server-to-server authentication and no web browser lets you just put in a password for that.

I then redirected that work into my HTTP Basic Auth implementation. I’d originally planned to copy Caddy (which this project is designed to replace for me) more closely, but I kinda ended up going ham on the access control. I’d always been annoyed that I had to SSH in, use caddy hash-password, update the Caddyfile and restart Caddy to change the auth[4], but shove protect just updates an encrypted file in the bucket that gets regularly checked for updates!

On that front, the auth benefits from the live reloading as well - shove serve never changes the auth, so we know that if it’s been changed then we need to update. Technically, if multiple people were to start changing the protections and finish their work in odd orders, there could be a race condition where work gets lost, but I’m not concerned about that at the moment[5].

Upload

upload was (comparatively) simple - we just read in the provided directory, calculate the hashes, read in the upload data from S3, and deal with the S3 bits. The only vaguely interesting part is that I’ve used things like FuturesUnordered to get a load of uploads going at once.
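
That looks roughly like this sketch (upload_object is a hypothetical stand-in for the real S3 put call):

use futures::stream::{FuturesUnordered, StreamExt};

// Hypothetical placeholder for the real S3 upload.
async fn upload_object(path: String, bytes: Vec<u8>) -> Result<(), std::io::Error> {
    // ... PUT `bytes` into the bucket under `path` ...
    let _ = (path, bytes);
    Ok(())
}

// Kick off every changed file at once and drain the results as they finish.
async fn upload_changed(files: Vec<(String, Vec<u8>)>) -> Result<(), std::io::Error> {
    let mut in_flight: FuturesUnordered<_> = files
        .into_iter()
        .map(|(path, bytes)| upload_object(path, bytes))
        .collect();

    // FuturesUnordered yields results in completion order, so slow uploads
    // don't hold up the fast ones.
    while let Some(result) = in_flight.next().await {
        result?;
    }
    Ok(())
}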

Serve

serve is by far the most complicated, because it handles so damn much. This section will be more of an explanation of how stuff works rather than just tricks and stories. With axum/hyper/tower projects, I always find that a fun way to gauge the complexity is to have a look at the State that the services use, and this is mine:

#[derive(Clone)]
pub struct State {
    bucket: Box<Bucket>,
    pub tigris_token: Option<Arc<str>>,
    pages: Pages,
    live_reloader: LiveReloader,
    auth: AuthChecker,
}

Yeah, that’s not a lot, but it’s still considerable, especially since three of those fields are custom structs with plenty of functionality behind them. They also happen to be relatively neat lines to draw for explanations and stories about serve, so I’ll give a general explanation and then dive into those three parts (the live reloader, the auth and the pages).

UploadData

The UploadData is the struct that holds the hashes of all of the files in S3, and it’s where this whole project started. Whenever a reload is triggered[6], we check whether any of the upload data has changed and, if so, update it. Once I got hashing working, this was relatively simple.

It doesn’t store any of the actual data, just a map which links a path to a hash, and the name of the root directory we’re serving from.
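
In other words, something like this (a sketch of the shape rather than the real definition):

use std::collections::HashMap;
use std::path::PathBuf;

// Roughly the shape described above; the real struct may well differ.
pub struct UploadData {
    // the root directory in the bucket that everything is served out of
    pub root: String,
    // the path of each uploaded file -> the hash of its contents
    pub hashes: HashMap<PathBuf, String>,
}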

LiveReloader

To use the live reloader, sites must include a small snippet in their JavaScript that opens a websocket connection to the server and reloads the page when it receives a reload message. Technically, they can open a websocket connection to any path, as the HTTP server only checks whether a request is a websocket upgrade and not what path it’s for. This works here, for now, because I know there won’t be any other reason to open websockets[7].

But how does that get handled in the server? Whenever we receive an upgrade request[8], we send back an upgrade response (kindly handled by soketto), and then create a Sender/Receiver pair for messages. Then, whenever we need to reload the pages (handled by a channel[9]), we just send a reload message to all of the senders!
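
Conceptually, the fan-out is as simple as this sketch (using plain tokio channels as stand-ins for the real soketto senders):

use std::sync::Arc;
use tokio::sync::{mpsc, Mutex};

// Simplified stand-in for the real LiveReloader: it just keeps a list of
// per-socket senders and fans a reload notice out to all of them.
#[derive(Clone, Default)]
pub struct LiveReloader {
    senders: Arc<Mutex<Vec<mpsc::UnboundedSender<&'static str>>>>,
}

impl LiveReloader {
    // Called once a websocket upgrade completes: register a new sender and
    // hand back the receiving end for that socket's task.
    pub async fn register(&self) -> mpsc::UnboundedReceiver<&'static str> {
        let (tx, rx) = mpsc::unbounded_channel();
        self.senders.lock().await.push(tx);
        rx
    }

    // Called when the upload data changes: tell every open socket to reload,
    // dropping any senders whose task has already gone away.
    pub async fn reload_all(&self) {
        self.senders
            .lock()
            .await
            .retain(|tx| tx.send("reload").is_ok());
    }
}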

If you’ve had a peek over the source code though, it might look like livereload.rs is doing a fair bit more than that, and it is, because we have to deal with closed sockets. That gets dealt with by another thread that pings all of the sockets every 60 seconds to make sure they’re still alive and stops keeping track of the dead ones. It could be argued that this is excessive, but it’s only server-side load and I’m happy with it. If you’ve got a better solution, feel free to submit a PR! The code is also slightly complicated because it uses a FuturesUnordered to deal with dead senders as their pings finish, rather than needing to wait for all of them.
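
Sticking with the same stand-in channel types as above, that keep-alive pass looks something like this sketch:

use futures::stream::{FuturesUnordered, StreamExt};
use tokio::sync::mpsc;

// Ping every registered socket concurrently and keep only the ones that
// answered. In the real server this is a websocket ping with a timeout;
// here a failed send stands in for a dead connection.
pub async fn prune_dead(
    senders: Vec<mpsc::UnboundedSender<&'static str>>,
) -> Vec<mpsc::UnboundedSender<&'static str>> {
    let mut pings: FuturesUnordered<_> = senders
        .into_iter()
        .map(|tx| async move {
            let alive = tx.send("ping").is_ok();
            (tx, alive)
        })
        .collect();

    let mut survivors = Vec::new();
    // Results arrive as each ping finishes rather than in submission order,
    // so one slow socket can't hold up the rest.
    while let Some((tx, alive)) = pings.next().await {
        if alive {
            survivors.push(tx);
        }
    }
    survivors
}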

There’s also some code there for closing sockets when the server finishes.

AuthChecker

Shove also has ways of dealing with HTTP Basic Auth. That’s when you open a webpage and the browser asks for a username and password, rather than a form on the page itself. On the plus side, it’s relatively easy to implement on the server side, but it also involves sending the entered password over the wire in plaintext, which means it must only be used on HTTPS deployments. In addition, you have to implement things like ratelimiting yourself, and it’s a faff.

How do I deal with it? I’ll firstly explain how the auth works, and then how it applies itself. So, when we load a page we firstly get a list of every user that has access to that page, and the hash of their password[10]. That function deliberately returns an Option<HashMap> so that we can encode three states: the page needs no auth at all, the page needs auth but nobody is permitted, or the page needs auth and these are the users who can get in.

Once we’ve got the list of users and confirmed that the page needs auth, we check the user’s IP against a ratelimiter. At any point past here, if a check fails then we return a response that tells the browser that the user is unauthenticated and needs to provide the correct password. If they pass the ratelimiter, we then check the Authorization header in their request - if it isn’t there, we fail them and their browser prompts them for a username and password. We then try to find their user in the list we retrieved earlier. If we couldn’t find it then we fail them, but not before running a hash on a fake password - this ensures that they can’t use the round-trip time to work out which users are valid and which are invalid. If we could find their password hash, then we hash what they provided and compare the two. If they match, then we serve the page; if not, we fail them.
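
Stripped right down, that check looks something like this sketch (the names, SHA-256 and the plain equality check are stand-ins - shove presumably uses a proper password hash, a constant-time comparison and axum’s header types):

use base64::Engine as _;
use sha2::{Digest, Sha256};
use std::collections::HashMap;

// `allowed` maps each permitted username to its password hash, as returned
// by the lookup described above.
pub fn check_basic_auth(
    authorization: Option<&str>,
    allowed: &HashMap<String, [u8; 32]>,
) -> bool {
    // No header at all: fail, which makes the browser prompt for credentials.
    let Some(value) = authorization else { return false };
    let Some(encoded) = value.strip_prefix("Basic ") else { return false };
    let Ok(decoded) = base64::engine::general_purpose::STANDARD.decode(encoded) else {
        return false;
    };
    let Ok(decoded) = String::from_utf8(decoded) else { return false };
    let Some((user, password)) = decoded.split_once(':') else { return false };

    match allowed.get(user) {
        Some(expected) => Sha256::digest(password.as_bytes()).as_slice() == expected.as_slice(),
        None => {
            // Unknown user: hash a dummy password anyway so the response time
            // doesn't reveal which usernames exist.
            let _ = Sha256::digest(b"definitely-not-the-password");
            false
        }
    }
}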

But how do we know when to apply the auth? As I mentioned earlier, I am not necessarily a fan of the way Caddy does things and I wanted to have some fun here, so there’s a whole system of Users and Realms. Users are just uuid-name-hash combinations, and Realms can match on paths (there’s exact match, regex, etc); we can then link together users and realms for authentication. For me, this system just makes sense.
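
A realm’s pattern might be modelled a little like this (again, just a sketch of the idea rather than shove’s actual types):

use regex::Regex;

// Illustrative realm pattern: either an exact path or a regex.
pub enum Pattern {
    Exact(String),
    Regex(Regex),
}

impl Pattern {
    pub fn matches(&self, path: &str) -> bool {
        match self {
            Pattern::Exact(exact) => exact == path,
            Pattern::Regex(re) => re.is_match(path),
        }
    }
}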

Conclusion

This project was a blast to work on - just large enough to have some fun complicated parts, but not so large that it becomes painful to work on. It’s also legitimately useful for me, which is a nice side effect. This was also my first project deployed with fly.io, which has been incredible to work with - speedy support, great documentation, reasonable prices, and a CLI that works very, very well.

  1. I honestly don’t understand why more people don’t dogfood their stuff - it’s been so incredible for finding bugs and new features.

  2. Yes, I spent far too long getting wavy text working - even with the help of a sick CodePen, it was certainly interesting learning more about Hugo & Zola and how to get a version working that was repeatable on dynamic pages.

  3. From the RFC alone, I might add. Definitely because I wanted the challenge, not because there weren’t any Rust crates I could find to do it for me ;). And yes, I am aware of how horrific an idea rolling my own cryptography is.

  4. And, for reference, I just updated the files through rclone.

  5. But if you are, feel free to open a PR! My first thought would be some kind of lockfile in the bucket which is the first thing created and the last thing destroyed in the protect flow.

  6. Either by a webhook which lives on /reload or on a timer every 60s.

  7. If I ever need some kind of control/management plane, I’ll probably just do it with a raw TCP socket.

  8. A special kind of request that signals to a server: Heyo! I’d love to open a new websocket connection, can I?

  9. That channel gets set off by either a 60s timer that checks for a different content hash in S3, or a Tigris webhook that lives on /reload.

  10. For persistence, this data is stored in an encrypted file in S3.