Dishing up Krill

Photo by Farhad Ibrahimzade on Unsplash

RPKI Certification Authorities (CAs) publish their RPKI related content using a so-called Publication Server, which in turn makes the 'publication point' for this and all other CAs it serves available in an RPKI Repository. These repositories are served to Relying Party (RP) software for RPKI validation using rsync and the RPKI Repository Delta Protocol (RRDP).

If you are running Krill, or another RPKI CA implementation, under an RIR or NIR that also offers a Publication Server, then you are in luck: we strongly advise that you make use of their services!

On the other hand, if you need to host your own Publication Server infrastructure, you may find that keeping your repository available 24/7 is a challenge. In this post we want to talk about those challenges, and discuss how Krillsync, a tool that we developed for this purpose, can help.

Oh, and although the name of this post suggests otherwise, almost all of the following applies to all RPKI Repositories, and Krillsync (since version 0.2) can be used with any RRDP-capable Publication Server implementation.

RRDP vs Rsync

If you are less familiar with RPKI, then you might wonder why there are two different protocols in use for accessing RPKI Repositories. When the RPKI standards were first defined, rsync was the only access protocol in use.

However, there were concerns that scaling up rsync is hard. The rsync protocol requires the server and client to enter into a dialogue to determine which content the client is missing. The server thus needs to spend CPU and memory to minimise the amount of data that is transferred. In principle there is nothing wrong with that, except that it requires a significant investment of resources from the server for each connection. This makes individual rsyncd servers vulnerable to resource exhaustion, and they can be trivially DoS'ed. Furthermore, because there are many, many clients (over 50k ASNs today, if every operator were to validate), rsync servers would need to be scaled up significantly - even to deal with the load resulting from friendly fire.

RPKI Repository operators were concerned with this, and RRDP was invented as an alternative protocol - most importantly to help with server-side scaling. In a nutshell, the protocol relies on the fact that the state of an RPKI repository at a given moment in time is immutable, and as such a snapshot of that state and a delta from one revision to the next can be stored as immutable data. All that changes is a notification file that tells RPs what the current session and serial state of the repository is, and where the appropriate deltas and snapshot may be retrieved.
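For illustration, a notification file (as defined in RFC 8182) looks roughly like this; the session id, serial and URIs below are made-up examples, and the SHA-256 hashes are elided:

```xml
<notification xmlns="http://www.ripe.net/rpki/rrdp"
              version="1"
              session_id="9df4b597-af9e-4dca-bdda-719cce2c4e28"
              serial="1432">
  <!-- The full, immutable repository state at serial 1432 -->
  <snapshot uri="https://rrdp.example.com/rrdp/1432/snapshot.xml"
            hash="..."/>
  <!-- Immutable changes from one serial to the next -->
  <delta serial="1432"
         uri="https://rrdp.example.com/rrdp/1432/delta.xml"
         hash="..."/>
  <delta serial="1431"
         uri="https://rrdp.example.com/rrdp/1431/delta.xml"
         hash="..."/>
</notification>
```

An RP that is already at serial 1431 only needs to fetch the delta for serial 1432; a new RP, or one that has fallen too far behind, fetches the snapshot instead.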

The first objective of this approach was to eliminate the need for the server to take part in delta negotiations with a client. The deltas are pre-calculated by the server and the load of determining what is applicable is shifted to the client side. While this may cause individual clients slightly more work, this ensures that the server CPU and memory can no longer be a scaling bottleneck.

Furthermore, because the snapshot and delta files are immutable, they can easily be served using caching HTTP infrastructure - be it in the form of local HTTP servers serving them as static files, or by using a fully fledged CDN. Because there are many commercial HTTP CDNs available this also allows Repository operators to outsource this concern. This can lower the cost. But, more importantly, because it's the CDN's bread and butter to deal with availability and DoS attacks, this can also result in much lower operational risks.

Scaling up RRDP

There are a number of ways to scale up serving RRDP content after it has been generated by the Publication Server. In case you are using Krill as your Publication Server, a very minimalistic setup is to have a local dedicated HTTPS server, such as NGINX, serve the RRDP files directly.

For some setups this would be good enough, but of course one can do much better than this. We can see at least three ways to improve this scaling:

  • Use multiple dedicated caching HTTPS servers
  • Serve static files from multiple dedicated HTTPS servers
  • Use a CDN

Each option has its own concerns, and we leave it to you to decide what works best for you. Let's talk about each of them briefly.

Caching HTTP servers

Possibly the easiest way to achieve high availability is to use caching HTTPS servers. Instead of relying on local content that is synchronised using Krillsync (or some other way), you can configure your server as a caching proxy in front of your real Publication Server.

There are some important things to keep in mind if you choose to go this route:

  • Make sure that you serve stale content, so that you are protected against outages of the origin server.
  • You can configure your server to cache files locally, but if you do, make sure that the notification.xml file is not cached for longer than a minute or so (but do serve stale content if no fresh copy is available).

To facilitate this kind of setup, the Krill Publication Server adds Cache-Control (max-age) headers to responses, so that a proxy knows how long content may be used. For the notification.xml file it uses 60 seconds; for all other RRDP files it uses 86400 seconds - or 24 hours, as those of you familiar with DNS will recognise.

So, based on this useful guide, one could use a fairly simple NGINX caching server configuration like this (assuming HTTPS is configured using Certbot):

# Use nginx as a cache for the back-end server.
proxy_cache_path /var/lib/nginx-cache 
   levels=1:2 
   keys_zone=my_cache:1m
   max_size=1g
   use_temp_path=off;

server {

  server_name rrdp.krill.cloud;

  location / {
     proxy_cache my_cache;
     proxy_cache_use_stale 
        error timeout http_500 http_502 http_503 http_504;

     proxy_pass https://ps.private.krill.cloud/rrdp/;
  }
}

Static Files from local HTTP servers

Another way to scale up this service is to use a number of dedicated HTTPS servers behind a load balancer, each serving local copies of the RRDP Repository files, making them independent of any outages of the Publication Server.

Enter Krillsync

As mentioned at the beginning of this post, we developed Krillsync specifically to help synchronise RRDP content (and rsync content, as we will see below). Even though the name suggests that this tool is specific to Krill, it is in fact generic and can be used with any RRDP-capable RPKI Publication Server.

We build krill-sync packages for the amd64/x86_64 architecture running a recent Debian or Ubuntu distribution, as well as Red Hat Enterprise Linux/CentOS 7 or 8. They can be found in our package repository.

After installing, you can use the krill-sync command to retrieve files from your source server and store them locally. You will need to provide the public RRDP URI that will be used to access the notification file from the server you are synchronising to, and the source address of your (hidden) Publication Server. For the latter you can also use a path on disk, which can be useful in case you run krill-sync on the same server.

Example command:

krill-sync https://rrdp.example.com/rrdp/notification.xml \
    --source_uri_base https://hidden.example.com/rrdp/

By default the content will be saved to /var/lib/krill-sync/rrdp/. You can then serve this content using an HTTPS server. We like to use NGINX with Let's Encrypt and Certbot, but you can use any other option of your preference.
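As an illustration, a minimal NGINX server block for serving the synchronised files as static content could look like the following sketch. The hostname is an example, and we assume TLS is configured separately (e.g. with Certbot); the Cache-Control headers mirror those used by the Krill Publication Server:

```nginx
server {
    server_name rrdp.example.com;

    # Files written by krill-sync, served directly from disk.
    # /rrdp/notification.xml maps to /var/lib/krill-sync/rrdp/notification.xml
    root /var/lib/krill-sync;

    # Short lifetime for the mutable notification file.
    location = /rrdp/notification.xml {
        add_header Cache-Control "max-age=60";
    }

    # Snapshot and delta files are immutable and can be cached for long.
    location /rrdp/ {
        add_header Cache-Control "max-age=86400";
    }
}
```

The exact-match location for notification.xml takes precedence over the prefix location, so only that one file gets the short lifetime.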

Make things Sticky?

There is a caveat to the setup that we have described here: in a multi-node setup, problems can arise if a Relying Party tool retrieves a new RRDP notification file from one node, and then tries to retrieve a snapshot or delta file from another node before that node has been synchronised.

RPs can work around this issue by using HTTP Keep-Alive. That is probably a wise thing to do anyway if they need to fetch multiple files over HTTPS, but it should not be required.

There are a number of ways that the server can resolve this. The preferred option would be to use support for sticky (source IP based) load balancing if it is available.

Another option is to run krill-sync at the same time (from cron) on each node, measure how long synchronisation runs take, and then instruct krill-sync to postpone writing the notification file, allowing for some variance in this time. That way, new notification files will only be picked up after all nodes have written their local copies of the new snapshot and delta files.

The following option in krill-sync can be used for this:
--rrdp-notify-delay <seconds>
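For example, assuming synchronisation runs across your nodes never take more than a minute, a crontab entry on each node might look like this (hostnames and timings are illustrative):

```shell
# Synchronise every 10 minutes; postpone writing the notification file
# by 60 seconds so that all nodes have the new snapshot and delta files
# in place before any RP is told about them.
*/10 * * * *  krill-sync https://rrdp.example.com/rrdp/notification.xml --source_uri_base https://hidden.example.com/rrdp/ --rrdp-notify-delay 60
```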

This setup is slightly more involved than using caching servers, because of the rrdp-notify-delay. After all, for a caching server a cache miss simply results in the server fetching the data from your source. On the other hand, using a local copy ensures that all content is available, whereas a cache miss could result in a 404 if your source repository is unavailable.

Using a CDN

If you really want to scale up, then using a commercial CDN may be for you. You could of course configure the CDN directly in front of your back-end Repository Server. But perhaps a mixed setup, where multiple HTTP servers serving static content as described above act as the back-end to the CDN, would be most resilient. The latter approach also allows you to fall back to your local infrastructure in case of an ongoing issue with the CDN (e.g. by changing DNS).

A lot of this depends on the CDN that you would use, so we will not attempt to be specific here. That said, the caching concerns described earlier, in case a local caching server is used, apply here as well. And the Cache-Control headers set by the Krill Publication Server (60 seconds for notification.xml and 24 hours for the other files) should help here too.

Consistent Content with Rsync

When a client connects to an rsync server, both parties exchange information about their local copies of files, so that they can work out what the difference is and decide what needs to be resynchronised. This can be problematic if the server-side RPKI repository content is modified during a transfer, as it could lead to inconsistencies in the resulting set of files that an RP uses for RPKI validation.

Publication Servers can avoid this by writing the entire new rsync content under a new, empty base directory instead, and then using a symlink to map the module configured for rsyncd to this new base directory. This way, rsyncd resolves the link to the real new path when a new client connects, while clients with ongoing transfers are still served the previous (also consistent) content. Because replacing the symlink is an atomic operation, there is also no chance of a race condition if a client connects during such a switch.
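The mechanics can be sketched in a few lines of shell. The paths and file names below are purely illustrative, and `mv -T` is GNU coreutils:

```shell
# Start from a clean illustrative directory.
rm -rf /tmp/repo-demo && mkdir -p /tmp/repo-demo && cd /tmp/repo-demo

# Revision 1: write the content under its own base directory and point
# the symlink that the rsyncd module path resolves through at it.
mkdir rev-1 && echo "objects for revision 1" > rev-1/example.txt
ln -s rev-1 current

# Revision 2: write the complete new content under a fresh directory...
mkdir rev-2 && echo "objects for revision 2" > rev-2/example.txt

# ...then swap the symlink atomically. A plain 'ln -sfn' may unlink and
# re-create the link; instead, create a new link and rename it over the
# old one, which is a single atomic rename(2):
ln -s rev-2 current.tmp && mv -T current.tmp current

readlink current    # now resolves to rev-2; ongoing transfers keep rev-1
```

Old revision directories can then be removed once no transfers reference them anymore.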

Krillsync for rsyncd

As described above, managing consistent repository versions, symbolic links, and the clean-up of old versions is not entirely trivial. Furthermore, what if one wants to use multiple rsyncd installations for a more redundant setup? Naively rsyncing the source to public-facing rsyncd nodes would result in local inconsistencies again, even if the source Publication Server used the symbolic link approach described above.

One way to solve this, is by using Krillsync to synchronise your Publication Server content to your public facing rsyncd server(s). Krillsync performs the following steps when it synchronises the rsync content:

  • retrieve the latest consistent RRDP snapshot
  • write the new rsync repository to a new directory for this session and revision
  • rename the symbolic link
  • remove old rsync repository directories if they are unused for more than 10 minutes (configurable)

An example rsyncd configuration (/etc/rsyncd.conf) could be:

uid = nobody
gid = nogroup
max connections = 50

[repo]
path = /var/lib/krill-sync/rsync/current/
comment = RPKI repository
read only = yes

And then a krill-sync command, similar to the following example, can be added to crontab to do the actual synchronisation:

krill-sync https://rrdp.example.com/rrdp/notification.xml \
    --source_uri_base https://hidden.example.com/rrdp/

Multiple rsyncd nodes

The rsync protocol runs over TCP and uses a single connection per session, so we can be sure that connected (RP) clients will get their content from the one rsyncd server that they happen to connect to. This means that we do not need to worry if multiple rsyncd nodes are updated in parallel: the RP will still get a consistent view from the server they are connected to, and they will get updates the next time they connect.

So, in principle we can run multiple instances behind a load balancer without the need for them to be in perfect sync with each other, or you can even go wild and create your own anycast cloud of rsyncd instances that all use krill-sync to fetch the repository content from your source.

Of course, the latter would essentially amount to building one's own rsync CDN, something very few organisations have operational experience with. So, even though this can be done, we are happy that RP software prefers RRDP because it allows leveraging existing CDNs and experience in scaling up HTTPS.

In Conclusion

We hope that this post has helped you understand some of the challenges of scaling up an RPKI Repository, appreciate the concerns that led Repository operators to develop RRDP as an alternative to rsync, and see how a tool like Krillsync can help in the daily operation of both RPKI Repository access methods.