Ben Schofield
All PostsCategories

Paper Reviews

  • Jan 23, 2025

    It's Time to Replace TCP in the Datacenter

    Summary of It’s Time to Replace TCP in the Datacenter

    This position paper from John Ousterhout sets out everything that’s wrong with TCP and exactly how we should fix it.

  • Jan 6, 2025

    Kafka: a Distributed Messaging System for Log Processing

    Summary of Kafka: a Distributed Messaging System for Log Processing

    Apache Kafka is a system for streaming logs with the aim of providing high throughput over strict guarantees around delivery. This was an interesting paper to read ~13 years on, as Kafka has become more and more ubiquitous in system design.

  • Oct 15, 2024

    Anvil: Verifying Liveness of Cluster Management Controllers

    Summary of Anvil: Verifying Liveness of Cluster Management Controllers

    Wouldn’t it be nice to write some software and confidently say that you know it’s right? That, as long as some assumptions about the world hold, it’s going to do exactly what you want it to, no matter what strange permutations or combinations of failures happen. In broad strokes that’s the promise of formal verification and proofs in software.

  • Jul 4, 2024

    MRTOM: Mostly Reliable Totally Ordered Multicast

    Summary of MRTOM: Mostly Reliable Totally Ordered Multicast

    This paper is about building a network primitive to speed up consensus protocols. Like other papers in this area, MRTOM builds on the fact that network ordered protocols can have much higher throughput than standard consensus protocols. MRTOM takes this one step further by offloading not just packet ordering, but also the fast path of consensus protocols to programmable switches.

  • May 1, 2024

    How Hard is Asynchronous Weight Reassignment?

    Summary of How Hard is Asynchronous Weight Reassignment?

    Majority quorum systems are useful in providing a simple mechanism for consensus. To accept a value, you need a majority of servers to agree to accepting it. Weighted majority quorum services (WQMS) take this approach and recognise that some servers are going to have better performance than others, so they should get more voting power.

  • Apr 28, 2024

    Hydra: Serialization-Free Network Ordering for Strongly Consistent Distributed Applications

    Summary of Hydra: Serialization-Free Network Ordering for Strongly Consistent Distributed Applications

    Replicated systems typically pretty much always have some overhead in comparison to unreplicated systems, at least if you want strong consistency for your data. We need to do extra work in order to make sure that we get the same result across all nodes. The fastest systems minimise or avoid that coordination, but where we can’t avoid it, we need an algorithm to manage that consensus.

  • Mar 26, 2024

    XFaaS: Hyperscale and Low Cost Serverless Functions at Meta

    Summary of XFaaS: Hyperscale and Low Cost Serverless Functions at Meta

    This paper from Meta presents their home-built Function-as-a-Service system called XFaaS. According to the paper, XFaaS handles trillions of function invocations per day, across hundreds of thousands of worker servers. The paper sets out the system architecture, and talks about how they’re able to achieve some quite impressive hardware utilization at scale.

  • Mar 11, 2024

    Keeping CALM: when distributed consistency is easy

    Summary of Keeping CALM: when distributed consistency is easy

    This paper from Hellerstein and Alvaro centers around what we can do without coordinating. The core insight of the paper is that some distributed programs can run without coordination, as long as the output of running the program on a subset of the inputs doesn’t change once you get more information.

  • Benjscho
  • benjscho

subscribe via RSS

Software developer