Louis Pennachio Reach out

Creating a join code system

Context

As part of the quiz platform I'm currently building, I want to make it simple for players to group up in lobbies. In this article, we will focus on join codes, which are short intelligible codes designed to be communicated and typed easily. We'll talk about how to generate the join codes and how to scale the solution to a large number of lobbies.

The flow starts with a single user creating a lobby. From there, they need a way to bring other players in. There are multiple ways to achieve this, summarized in the table below:

Approach Description Technical consideration Implementation cost
In-app invite The user invites other players via a player search feature or some kind of friend list.

Requires a user interface for searching and inviting players.

Requires a notification system to send and receive invites.

Heavy
Deep link The user shares a lobby deep link to other players via social messaging.

Requires platform-specific configuration for Android, iOS, and Web.

Requires actual implementation of the deep link (navigate to the lobby screen and trigger the join action).

Medium
QR code The user shows a QR code holding a deep link to join the lobby. Same as a deep link, with the added complexity of generating a QR code from the deep link. Medium
Join code The user shares a join code that other players need to manually enter to join the lobby. Requires generating a join code for each lobby. Low

Join codes look like the lowest-cost approach, so I started there. Here is what join codes look like in GeoGuessr. From the inviter's point of view, on Android:

Screenshot of the GeoGuessr Android app showing the lobby screen.

The lobby screen, featuring the "join by code" button.

Screenshot of the GeoGuessr Android app showing the join code instructions.

The join code instructions.

From the joiner's point of view, on Web:

Screenshot of the GeoGuessr Web app showing the join code screen.

The join code screen, where the join code is typed in.

We can consider the following pros and cons for join codes:

Pros Cons
  • Self-contained implementation.
  • Less prone to bugs than deep links (in the case of a deferred deep link, for instance).
  • Easy to understand for non-technical users (similar to a phone number).
  • Can be communicated verbally (practical when playing in-person or while in a vocal channel).
  • Require more steps from the joiner's side: navigate to the lobby screen, and type the join code (or paste it).

With random generation

The first idea that comes to mind to implement join codes: generate a random code for each new lobby. But codes are short, so two lobbies could end up with the same one. Is that a real problem?

Each lobby has one join code, and each user can own at most one active lobby. So the number of active lobbies is bounded by the number of users. When a lobby retires (all users leave), it's fine if its code later gets assigned to a new lobby, since the old one is no longer joinable. Only active-vs-active collisions matter.

So, how big is the join code space? With each digit from [0-9a-z] (36 characters), we get 36⁵ = 60,466,176 codes. Plenty, right? Spoiler: it isn't.

This is an instance of the Birthday problem, which is the counterintuitive fact that in a room of just 23 people, the probability that two share a birthday already exceeds 50%.

I've asked Claude Code to generalize this problem to our case and approximate collisions:

Screenshot of the Claude Code approximation of the join code collision. Y-axis shows the probability of collision. X-axis shows the number of active lobbies.

Claude Code approximation of the join code collision.

Our 60,466,176 combinations now look weaker, with a 50% chance collision at only 9,157 active lobbies, and a 1% chance collision at 1,103 active lobbies.

I've also run the simulation locally:

@Test
fun `Experiment join code collision`() {
    val rounds = 1000

    val numberOfCollisionsToNumberOfRounds = sortedMapOf<Int, Int>()

    val charset = "abcdefghijklmnopqrstuvwxyz0123456789"
    val codeLength = 5
    val attempts = 9_150

    for (round in 0 until rounds) {
        val seen = HashSet<String>()
        var collisions = 0

        repeat(attempts) {
            val code = buildString(codeLength) {
                repeat(codeLength) {
                    append(charset[Random.nextInt(charset.length)])
                }
            }

            if (!seen.add(code)) {
                collisions++
            }
        }

        numberOfCollisionsToNumberOfRounds[collisions] = numberOfCollisionsToNumberOfRounds[collisions]?.let { it + 1 } ?: 1
    }

    println("Number of rounds: $rounds")
    println("Attempts per round: $attempts")
    println(
        "Number of collisions to number of rounds:\n" +
                numberOfCollisionsToNumberOfRounds.entries.joinToString("\n") { (k, v) -> "  $k collision(s): $v round(s)" },
    )
}

Which outputs:

Number of rounds: 1000
Attempts per round: 9150
Number of collisions to number of rounds:
  0 collision(s): 488 round(s)
  1 collision(s): 361 round(s)
  2 collision(s): 111 round(s)
  3 collision(s): 34 round(s)
  4 collision(s): 4 round(s)
  5 collision(s): 2 round(s)

This aligns with the theory, with 48.8% of rounds without collisions, and 51.2% of rounds with at least one collision.

In other words, we reach a 50% chance of collision after using only ~0.015% of the join code space (9,150 out of 60,466,176). This tells us that generating join codes randomly among the join code space does not scale.

Now, how can we make better use of the join code space? The obvious alternative is to start at join code 00000 and increment by one for each new lobby. However, that would expose us to enumeration attacks, meaning a malicious user could create a lobby, check the join code, and try to join other lobbies by subtracting from it. Let's not do that.

So, could we traverse our join code space in a way that cannot be easily predicted? Yes, by applying a permutation on our space. The idea is to generate each join code from a monotonically increasing counter, then map it through a permutation that yields another unique value in the same space.

Using a Linear congruential generator

The first permutation I found is the Linear Congruential Generator (LCG). It makes use of two carefully chosen parameters (the multiplier and the increment) to generate the join code.

Here is a popular example that uses the Knuth and H. W. Lewis parameters from the Parameters in common use:

const val multiplier = 1664525L
const val increment = 1013904223L
const val modulus = 1L shl 32 // 2^32

fun generate(counter: Long): Long {
    return (counter * multiplier + increment).mod(modulus)
}

Implementation of a LCG with Knuth and H. W. Lewis parameters.

However, we cannot directly reuse it for our case, since the modulus does not match our space size. To do so, we would have to craft the parameters, given modulus=60,466,176.

One way to do it is by following Hull-Dobell theorem. Let's see what that would involve:

  • m and c are coprime
  • a−1 is divisible by all prime factors of m
  • a−1 is divisible by 4 if m is divisible by 4

Unfortunately, this involves guesswork and offers no guarantee the generator will be free of observable patterns.

From the Wikipedia page:

LCGs should be evaluated very carefully for suitability in non-cryptographic applications where high-quality randomness is critical.

On top of that, the parameters can still be reverse-engineered from a handful of outputs using lattice reduction (as explained in this writeup on reverse-engineering LCGs). Because of that, LCG is not a fit.

Using Format-preserving encryption (FF3-1)

Next, we have Format-preserving encryption algorithms. These produce an output with the same format as the input. Since we use Kotlin/Java for our backend, we'll look into the FF3-1 algorithm provided by Bouncy Castle library.

To use FF3-1, we need:

  • A secret key of 16, 24, or 32 bytes (we will use 32 bytes in our implementation).
  • A 7-byte tweak, allowing different permutations under the same key. In our case, we use a fixed value.
  • A space size of at least 1,000,000.
libs.versions.toml
[versions]
bouncycastle = "1.84"

[libraries]
bouncycastle = { module = "org.bouncycastle:bcprov-jdk18on", version.ref = "bouncycastle" }
build.gradle.kts
dependencies {
    implementation(libs.bouncycastle)
}
JoinCodeGenerator.kt
private const val ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"

private const val RADIX = ALPHABET.length

// FF3-1 algorithm requires a 7 bytes tweak
private val TWEAK = Base64.decode(Env.JOIN_CODE_TWEAK)

private val engine = FPEFF3_1Engine().also {
    it.init(
        true,
        FPEParameters(
            KeyParameter(
                Base64.decode(Env.JOIN_CODE_KEY),
            ),
            RADIX,
            TWEAK,
        ),
    )
}

// FF3-1 engine is not thread-safe
private val mutex = Mutex()

fun generate(counter: Long, length: Int): String {
    if (length <= 0) {
        throw IllegalArgumentException("Length must be greater than 0.")
    }

    // Convert counter to base-36 digits
    val input = ByteArray(length)
    var n = counter
    for (i in length - 1 downTo 0) {
        input[i] = (n % RADIX).toByte()
        n /= RADIX
    }

    // Encrypt in-place within the space
    val output = ByteArray(length)

    mutex.withLock {
        engine.processBlock(
            /* inBuf = */ input,
            /* inOff = */ 0,
            /* length = */ length,
            /* outBuf = */ output,
            /* outOff = */ 0,
        )
    }

    return output.joinToString("") { ALPHABET[it.toInt()].toString() }.uppercase()
}

Note that:

  • It requires securing the key (and optionally the tweak), that's why I inject them as environment variables.
  • FPEFF3_1Engine is not thread-safe, so I use a Mutex when generating a join code.

With the associated test:

JoinCodeGeneratorTest.kt
class JoinCodeGeneratorTest {

    @BeforeTest
    fun setup() {
        mockkObject(Env)
        every { Env.JOIN_CODE_KEY } returns "45ZfQk4arxyViupI0wB6TNZcMglR5RTiUMLhCon/Jgg="
        every { Env.JOIN_CODE_TWEAK } returns "o9C+wXHL/Q=="
    }

    @Test
    fun `Generate join code with length of zero should throw`() {
        shouldThrow<IllegalArgumentException> {
            generate(
                counter = 0,
                length = 0,
            )
        }
    }

    @Test
    fun `Generate join code with negative counter should throw`() {
        shouldThrow<IllegalArgumentException> {
            generate(
                counter = -1,
                length = 5,
            )
        }
    }

    @Test
    fun `Generate join codes with length of five should generate join codes`() {
        generate(
            counter = 0,
            length = 5,
        ) shouldBe "C7YWR"

        generate(
            counter = 1,
            length = 5,
        ) shouldBe "NZIX8"

        generate(
            counter = 2,
            length = 5,
        ) shouldBe "K4T54"

        generate(
            counter = 45_435_423,
            length = 5,
        ) shouldBe "4YA1K"
    }
}

This approach finally meets our needs:

  • No collisions.
  • No reverse engineering is possible (unless the key and tweak are leaked).
  • Fast to use.
  • Allows extending the length of join codes.

One last concern, what to do if we consume all possible join codes? One approach is to simply extend the join code length by one digit, which will allow us to generate 36⁶=2,176,782,336 join codes.

If we look at Lichess, one of the largest chess platforms, it runs about 3 million games per day. For a quiz platform with a lobby system, where each user owns at most one lobby at a time, and each lobby hosts multiple games, six-digit join codes give us a comfortable horizon. A five-digit code remains possible with a rotation policy, but that adds non-trivial complexity: handling ongoing games, notifying affected users, and so on. So I think extra digit is the cleaner trade.

To achieve this, all we need is a singleton document to store the current counter value and the max value, updated atomically on each generation.

@Serializable
data class JoinCodeInfo(
    @SerialName("_id")
    val id: String = "singleton",
    val counter: Long,
    val codeLength: Int,
    val maxCounter: Long,
    val updatedAt: Instant,
    val schemaVersion: Int = 1,
)

Summary

Given our join code specification, we saw that generating join codes randomly would not scale with the number of active lobbies due to collisions. Sequential generation avoids collisions but is easily enumerable, so we explored two permutation-based approaches: LCG and FF3-1. LCG turned out to be hard to configure for our space size, offered weak randomness, and could be reverse-engineered from a handful of outputs. FF3-1 addressed all three concerns: security, distribution, and efficiency. It became our final choice.

Thanks for reading!

Louis