I was motivated to write this by the Writing Gaggle. Check out the other posts from this month! Especially if you’re looking for something shorter to read; I went over-long this time.

Discord Bots 101 - Peaceful Shores

In this post, we’re going to explore one easy route to making a Discord bot in 2023, with Rust.

Or, well. It should be easy.

It’s going to be a little idiosyncratic because this style seems to be relatively unexplored in the Rust Discord bot ecosystem, since everyone is already used to the old way of doing it. We’ll cover what that is in a future post.

What’s in a Bot?

The first thing we have to do is understand what, really, we’re trying to make here. A Discord bot is a program that listens for events from a chat it cares about, and issues responses to those particular events.

For our purposes, the events we care about are Interactions, and our responses will use Followup Messages.

What Kind of Bot?

We’re going to make a dice rolling bot, because that sounds easy. Our goal is simple:

Receive slash commands to /roll a dice expression
Issue followup messages with the result of rolling those dice expressions

This is the kind of bot I made when I got started, so now you’re going to do it too.

Making a Discord Application

The first thing you need to do is open the Discord Developer Portal: https://discord.com/developers/applications

Here, you’ll need to create a new “Discord Application”, which shall become your bot.

Click the “New Application” button, located in the top right:

New Application button

And then fill in the name of your bot:

Application creation form

And, of course, click the “Create” button. Presuming nothing went wrong, like choosing an invalid name for your application, you should be brought to the General page for your new application. The relevant navigation options are on the left. They vanish if the window isn’t wide enough, replaced by a little sandwich icon (3 stacked horizontal bars) in the top left which brings them back when clicked.

You’ll be using the Discord Developer Portal a lot, so keep this page open.

Now for the code.

Dependencies

Let’s start by adding some dependencies we know we’ll need. In particular, we’re going to want twilight-model, which gives us the data representations we’ll need for consuming and producing messages with the Discord API.

We’ll also want twilight-util, which provides an assortment of nice utilities, one of which is pretty much demanded by twilight-model in a later section.

The next thing we need is some library to help us serve a web API. We’ll be using axum, because I like it. This will also require us to add tokio as a dependency, as that is the async runtime used by axum. If you don’t know what that means, don’t worry too much about it. We’ll be brushing over those details anyway.

We’ll also need ed25519-dalek for verifying that interactions are from Discord. We wouldn’t want unauthorized randoms posting to our API! Also for this purpose, we’ll use the hex crate to parse the signatures, which Discord gives us as hexadecimal text.

We’ll also need something to do a couple of manual HTTPS requests to Discord’s API, during command registration. This will happen once at the start of program execution, so it doesn’t really matter what we use. I’m choosing ureq for this purpose.

In the process of talking to Discord’s API, we’ll need to base64 encode one thing, which we’ll do using the base64 crate.

And finally, we’ll use bracket-random for rolling dice expressions.

By now, our Cargo.toml should have something like this in it:

[dependencies]
twilight-model = "0.15.1"
twilight-util = { version = "0.15.1", features = ["builder"] }
axum = "0.6.6"
tokio = { version = "1.25.0", features = ["full"] }
ed25519-dalek = "2.0.0-pre.0"
hex = "0.4.3"
bracket-random = "0.8.7"
ureq = { version = "2.6.2", features = ["json"] }
base64 = "0.21.0"

Receiving Events

So, why are we serving a web API anyway?

Turns out, Discord lets you give it a URL to submit Interactions to in the form of POST requests. Set up a public-facing web server with a POST endpoint, and it’ll send you JSON events.

Now, I put all my web servers behind NGINX, so my config is going to look something like this:

server {
       server_name api.catmonad.xyz;
       location /blog/discord/interactions {
                proxy_pass http://127.0.0.1:4635/;
       }

       listen 443 ssl;
       ssl_certificate /etc/letsencrypt/live/api.catmonad.xyz/fullchain.pem;
       ssl_certificate_key /etc/letsencrypt/live/api.catmonad.xyz/privkey.pem;
       include /etc/letsencrypt/options-ssl-nginx.conf;
       ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem;
}

What you do will vary depending on how exactly you want to do hosting. It’d be perfectly reasonable to do TLS (using rustls, of course) in your Rust web server and expose it directly to the Internet, but we’re not covering that today.

With TLS termination out of the way, it’s time to try receiving some events!

If we go ahead and put https://api.catmonad.xyz/blog/discord/interactions in that “Interactions Endpoint URL” box on our Application page, then click that “Save Changes” button that appears, Discord will… oh.

The specified interactions endpoint url could not be verified.

Yeah, we actually need to have a working server up that Discord can message before it’ll accept our choice of Interactions URL. We’ll have to go a bit further before we can play around.

To give you more of an idea what’s happening here: This is what the test request Discord sends to my Interactions endpoint looks like, trimmed down a bit and formatted.

Headers:

POST / HTTP/1.0
Host: 127.0.0.1:4635
Connection: close
Content-Length: 507
Accept-Encoding: gzip
x-signature-timestamp: <snip>
x-signature-ed25519: <snip>
content-type: application/json
user-agent: Discord-Interactions/1.0 (+https://discord.com)

Body:

{
  "application_id": "1075527780680351764",
  "id": "1075555207599104162",
  "token": "<snip>",
  "type": 1,
  "user": {
    "avatar": "eb168256edfe1f25c38e1f173b58df6e",
    "avatar_decoration": null,
    "discriminator": "4158",
    "display_name": null,
    "id": "97171464851042304",
    "public_flags": 4194304,
    "username": "Monadic Cat"
  },
  "version": 1
}

I cut out the x-signature-ed25519 header and the token field in the body because they were really large, but the important thing to know is that they’re big blobs of text we’re just going to hand over to opaque APIs that know what to do with them, lol.

A Web Server, Briefly.

This isn’t meant to be a tutorial on how to make web servers with Axum, so let’s speedrun this part.

Imports:

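use axum::http::{HeaderMap, StatusCode};
use axum::routing::post;
use axum::{Json, Router};
use std::net::SocketAddr;
use twilight_model::application::interaction::Interaction;
// Needed by the handler's return type, defined below.
use twilight_model::http::interaction::InteractionResponse;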

(I rely on my IDE to quickly add appropriate imports. Some people might use a few glob imports here instead.)

Declare the only route, initialize the server:

#[tokio::main]
async fn main() {
    let port = 4635;
    let app = Router::new().route("/", post(post_interaction));

    let addr = SocketAddr::from(([127, 0, 0, 1], port));

    axum::Server::bind(&addr)
        .serve(app.into_make_service())
        .await
        .unwrap();
}

Then define the handler as something that gives us what we need:

type InteractionResult = Result<
    (StatusCode, Json<InteractionResponse>),
    (StatusCode, String)
>;
async fn post_interaction(
    headers: HeaderMap,
    Json(interaction): Json<Interaction>,
) -> InteractionResult {
    dbg!(&headers);
    dbg!(&interaction);
    Err((StatusCode::NOT_FOUND, "I dunno".to_string()))
}

One thing I want to call out here: Make sure the success response for your web server sets a Content-Type header of application/json, or Discord may consider the interaction failed regardless of the actual content of the body. (This doesn’t appear to be the case for Ping interactions in particular, but is the case for ApplicationCommand ones.)

With that out of the way, let’s see what it looks like when Discord sends this a test Interaction. I don’t want to deal with automating deployment right now, or copying binaries over repeatedly, so I’m just going to SSH port forward from my local machine to my server.

ssh -R 4635:localhost:4635 my_server

I’ve gone ahead and trimmed down parts of this output like before. Additionally, Discord appears to retry the request once when your server returns a failure. I’ve removed that from this output as well.

[src/main.rs:28] &headers = {
    "host": "127.0.0.1:4635",
    "connection": "close",
    "content-length": "507",
    "accept-encoding": "gzip",
    "x-signature-timestamp": "<snip>",
    "x-signature-ed25519": "<snip>",
    "content-type": "application/json",
    "user-agent": "Discord-Interactions/1.0 (+https://discord.com)",
}
[src/main.rs:29] &interaction = Interaction {
    app_permissions: None,
    application_id: Id<ApplicationMarker>(1075527780680351764),
    channel_id: None,
    data: None,
    guild_id: None,
    guild_locale: None,
    id: Id<InteractionMarker>(1076556840638349363),
    kind: Ping,
    locale: None,
    member: None,
    message: None,
    token: "<snip>",
    user: Some(<snip>),
}

Validating Requests

Discord wants us to validate that every request comes with a valid signature for its content, so let’s do that. It’d be nice if we could do this without throwing away the convenience of the Json extractor, but, to do the validation correctly, we need the exact bytes used in the body and we shouldn’t assume that re-serializing it will produce the same thing byte-for-byte.

Let’s add serde_json to our Cargo.toml so we can do the deserialization manually:

serde_json = "1.0.93"

And in Rust, we’ve replaced the interaction argument with body: String:

async fn post_interaction(headers: HeaderMap, body: String) -> InteractionResult {
    let Ok(interaction): Result<Interaction, _> = serde_json::from_str(&body) else {
        return Err((StatusCode::BAD_REQUEST, "request contained invalid json".to_string()))
    };
    dbg!(&headers);
    dbg!(&interaction);
    Err((StatusCode::NOT_FOUND, "I dunno".to_string()))
}

Which gives us access to the request body as it is actually given to us, in body.

We could probably make this whole validation bit into our own custom extractor or middleware that handles the whole process and gives the validated and then deserialized JSON automatically, but let’s not worry about that just yet.

Since the request signature is given in hexadecimal, let’s parse it and see what it looks like:

    let Some(sig) = headers.get("x-signature-ed25519") else {
        return Err((StatusCode::BAD_REQUEST,
                "request did not include signature header".to_string()))
    };
    let Ok(sig) = hex::decode(sig) else {
        return Err((StatusCode::BAD_REQUEST,
                "request signature is invalid hex".to_string()))
    };
    println!("Sig bytes: {:?}", sig);

Output, this time including the second request:

Sig bytes: [76, 180, 84, 83, 224, 226, 119, 71, 49, 231, 234, 185, 234, 87, 182, 207, 74, 134, 99, 133, 168, 10, 140, 29, 4, 110, 223, 85, 215, 27, 109, 121, 68, 240, 23, 193, 87, 4, 138, 14, 12, 116, 0, 110, 168, 242, 50, 227, 91, 39, 47, 182, 41, 67, 106, 47, 161, 131, 140, 56, 248, 131, 151, 2]
Sig bytes: [198, 66, 20, 24, 74, 33, 193, 3, 31, 200, 23, 218, 84, 54, 148, 155, 192, 53, 183, 132, 217, 42, 119, 227, 247, 99, 62, 130, 20, 252, 212, 129, 250, 96, 237, 10, 157, 19, 45, 83, 169, 34, 200, 53, 194, 206, 236, 229, 155, 52, 5, 54, 141, 51, 196, 67, 14, 109, 81, 150, 82, 76, 28, 12]

We’re definitely making progress!
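
If you’re curious what hex::decode is doing for us there, here’s a minimal, standard-library-only sketch of the same idea. The name decode_hex is mine, purely for illustration; the real crate handles this for you and reports richer errors.

```rust
/// Decode a hexadecimal string into bytes, or `None` if it is malformed.
/// A hand-rolled stand-in for `hex::decode`, for illustration only.
fn decode_hex(input: &str) -> Option<Vec<u8>> {
    let bytes = input.as_bytes();
    // Two hex digits make one byte, so an odd length can't be valid.
    if bytes.len() % 2 != 0 {
        return None;
    }
    bytes
        .chunks(2)
        .map(|pair| {
            // `to_digit(16)` accepts 0-9, a-f and A-F, and rejects the rest.
            let hi = (pair[0] as char).to_digit(16)?;
            let lo = (pair[1] as char).to_digit(16)?;
            Some((hi * 16 + lo) as u8)
        })
        .collect()
}
```

For example, the text 4cb4 decodes to the bytes 76 and 180, which is how the first signature printed above starts. A 64-byte Ed25519 signature arrives as 128 hex characters.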

Next, we’ll use ed25519_dalek to parse the signature.

    let Ok(sig) = Signature::from_slice(&sig) else {
        return Err((StatusCode::BAD_REQUEST,
                "request signature is malformed".to_string()))
    };
    println!("Sig: {:?}", sig);

Output:

Sig: ed25519::Signature { R: [76, 180, 84, 83, 224, 226, 119, 71, 49, 231, 234, 185, 234, 87, 182, 207, 74, 134, 99, 133, 168, 10, 140, 29, 4, 110, 223, 85, 215, 27, 109, 121], s: [68, 240, 23, 193, 87, 4, 138, 14, 12, 116, 0, 110, 168, 242, 50, 227, 91, 39, 47, 182, 41, 67, 106, 47, 161, 131, 140, 56, 248, 131, 151, 2] }
Sig: ed25519::Signature { R: [198, 66, 20, 24, 74, 33, 193, 3, 31, 200, 23, 218, 84, 54, 148, 155, 192, 53, 183, 132, 217, 42, 119, 227, 247, 99, 62, 130, 20, 252, 212, 129], s: [250, 96, 237, 10, 157, 19, 45, 83, 169, 34, 200, 53, 194, 206, 236, 229, 155, 52, 5, 54, 141, 51, 196, 67, 14, 109, 81, 150, 82, 76, 28, 12] }

And then validate it. Discord wants us to concatenate the X-Signature-Timestamp header with the body for this, so:

    let Some(signed_buf) = headers.get("x-signature-timestamp") else {
        return Err((StatusCode::BAD_REQUEST,
                "request did not include signature timestamp header".to_string()))
    };
    let mut signed_buf = signed_buf.as_bytes().to_owned();
    signed_buf.extend_from_slice(body.as_bytes());

    let pub_key = discord_pub_key();

    dbg!(pub_key.verify_strict(&signed_buf, &sig));

Output:

[src/main.rs:64] pub_key.verify_strict(&signed_buf, &sig) = Err(
    signature::Error { source: Some(Verification equation was not satisfied) },
)
[src/main.rs:64] pub_key.verify_strict(&signed_buf, &sig) = Ok(
    (),
)

It works! Great. Why did one of those requests fail verification?

Turns out, Discord’s not doing a retry: these two requests are entirely different. One of them is intentionally failing, as Discord is trying to ensure that we’re actually validating the signatures on these requests, by forcing us to respond with a failure to its test request which has an invalid signature.

So, okay, let’s do that.

    let Ok(()) = pub_key.verify_strict(&signed_buf, &sig) else {
        return Err((
            StatusCode::UNAUTHORIZED,
            "interaction failed signature verification".to_string(),
        ));
    };
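
To recap what actually gets verified, here’s a small sketch that builds the signed byte string on its own, using only the standard library. The function name signed_message and the timestamp value in the usage note are made up for illustration.

```rust
/// Build the byte string the signature covers: the value of the
/// `X-Signature-Timestamp` header followed immediately by the raw
/// request body, with no separator in between.
fn signed_message(timestamp: &str, raw_body: &str) -> Vec<u8> {
    let mut buf = Vec::with_capacity(timestamp.len() + raw_body.len());
    buf.extend_from_slice(timestamp.as_bytes());
    buf.extend_from_slice(raw_body.as_bytes());
    buf
}
```

This also shows why we kept the body as a String instead of re-serializing the parsed JSON: signed_message("1676500000", r#"{"type":1}"#) and signed_message("1676500000", r#"{ "type": 1 }"#) describe the same JSON value, but they are different byte strings, so a signature that is valid for one is not valid for the other.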

Hang On, Where Did That Public Key Come From?

In that snippet above, where we actually did the signature verification, we called this discord_pub_key() function with no explanation. Where did that come from?

It’s something you copy and paste from the Discord Developer portal, on your application’s page: Discord developer panel

In particular, it is this section that you need to care about: Public Key spot in the Discord developer panel

With that information, we can write discord_pub_key() like this:

fn discord_pub_key_bytes() -> Vec<u8> {
    hex::decode("<paste me from the developer portal for your app>").unwrap()
}
fn discord_pub_key() -> VerifyingKey {
    // You might wonder, why is Monad using `.unwrap()` here when they so
    // meticulously avoided it until now?
    // It's because the public key is something we confirm out of band,
    // and if it's malformed we shouldn't even be running.
    // Additionally, I consider unwrapping during startup tasks which come
    // before the main loop perfectly acceptable for a program
    // which will only be seen by devs. This unwrapping won't meaningfully
    // affect the fault tolerance of the program. I'll stop explaining this
    // before it becomes an essay on its own.
    let pub_key_bytes: [u8; 32] = discord_pub_key_bytes().try_into().unwrap();
    VerifyingKey::from_bytes(&pub_key_bytes).unwrap()
}

Responding To Interactions

Okay, that was a lot more prep work than I’d like. But, we’re ready to actually inspect Interactions and send responses back to Discord.

async fn post_interaction(headers: HeaderMap, body: String) -> InteractionResult {
    // ... validate signature, introduce `interaction: Interaction` to work with ...
    // TODO: actually process and respond to the Interaction
}

Pings

The first thing we need to get out of the way is handling ping messages from Discord. These are something Discord uses simply to test that your server is up and running. All you gotta do is reply with a pong 😃

    match interaction.kind {
        InteractionType::Ping => {
            let pong = InteractionResponse {
                kind: InteractionResponseType::Pong,
                data: None,
            };
            Ok((StatusCode::OK, Json(pong)))
        }
        _ => Err((StatusCode::NOT_FOUND,
              "requested interaction not found".to_string())),
    }
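
If you’d like to see what this amounts to on the wire, here’s a rough sketch with the twilight types stripped away. In Discord’s API a Ping interaction has type 1, and a Pong response is a JSON object whose type is also 1. The function respond and its plain-tuple error are stand-ins of mine, not anything from twilight or axum.

```rust
/// Interaction type Discord uses for pings.
const PING: u8 = 1;
/// Interaction response type for a pong.
const PONG: u8 = 1;

/// Return the JSON body to answer with, or an HTTP status and message
/// for interaction types we don't handle yet.
fn respond(interaction_type: u8) -> Result<String, (u16, &'static str)> {
    match interaction_type {
        PING => Ok(format!(r#"{{"type":{}}}"#, PONG)),
        _ => Err((404, "requested interaction not found")),
    }
}
```

So respond(1) produces the body {"type":1}, and everything else gets the same not-found treatment as in the real handler.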

It is at this point that Discord will finally accept our server is functioning, and let us save its URL as the Interactions Endpoint URL for our app.

Discord saved our Interaction Endpoint URL

Now all we need to do is handle ApplicationCommand Interactions and we’ll be all set!

Application Commands

So, we add an arm to our match expression to handle InteractionType::ApplicationCommand, and then extract specifically the shape of data we care to respond to, and again just return NOT_FOUND for everything we don’t know or care about.

    fn not_found() -> InteractionResult {
        Err((
            StatusCode::NOT_FOUND,
            "requested interaction not found".to_string(),
        ))
    }

<snip>

        InteractionType::ApplicationCommand => {
            let Some(InteractionData::ApplicationCommand(data)) = interaction.data else {
                return not_found()
            };
            // TODO: It'd be better if we checked the command ID here,
            //       but we haven't actually registered any commands yet.
            match &*data.name {
                "roll" => match &*data.options {
                    [CommandDataOption {
                        name,
                        value: CommandOptionValue::String(expr),
                    }] if name == "expression" => {
                        dbg!(expr);
                        let roll = InteractionResponse {
                            kind: InteractionResponseType::ChannelMessageWithSource,
                            data: Some(InteractionResponseData {
                                content: Some("TODO: actually roll dice".to_string()),
                                ..Default::default()
                            })
                        };
                        Ok((StatusCode::OK, Json(roll)))
                    },
                    _ => not_found(),
                },
                _ => not_found(),
            }
        }

This could be done in a number of ways, including splitting out the arms on the match expression using data.name into their own functions, or performing a HashMap lookup to get an appropriate callback, or plenty of other strategies. We’re going with this one purely because it is what I wrote first, and we’re only implementing one command here.
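
For comparison, here’s roughly what the HashMap lookup strategy could look like, sketched with only the standard library. Everything here is hypothetical: the Handler signature is simplified to take a single string argument, and the echo command exists only to give the table a second entry.

```rust
use std::collections::HashMap;

/// A command handler takes the raw option string and returns the reply text.
type Handler = fn(&str) -> String;

fn roll(expr: &str) -> String {
    format!("TODO: actually roll {expr}")
}

fn echo(text: &str) -> String {
    text.to_string()
}

/// Build the name-to-handler table once, at startup.
fn command_table() -> HashMap<&'static str, Handler> {
    let mut table: HashMap<&'static str, Handler> = HashMap::new();
    table.insert("roll", roll);
    table.insert("echo", echo);
    table
}

/// Look up and run a handler; `None` plays the role of `not_found()`.
fn dispatch(table: &HashMap<&'static str, Handler>, name: &str, arg: &str) -> Option<String> {
    table.get(name).map(|handler| handler(arg))
}
```

The trade-off is that adding a command becomes a one-line insert, at the cost of forcing every handler into the same signature. With one command, the nested match is less ceremony.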

Now that that’s done, there’s just one last step to be able to run this command we just made from Discord.

Registering Commands

To register a command, you need to POST an “Application Command Object”, as described in Discord’s API documentation, to Discord.

twilight-model, of course, ships the representation we’ll be using here: Command

Those docs recommend we use a builder provided in twilight-util to construct this object, to avoid hilarious verbosity, and so we shall:

fn register_command() {
    let cmd = CommandBuilder::new("roll", "Roll a dice expression", CommandType::ChatInput)
        .option(StringBuilder::new("expression", "dice expression to be rolled").required(true))
        .build();
    // TODO: POST this to the Discord API
}

Next we need to actually do a POST request, which requires us to set an Authorization header with a token we haven’t gotten yet. For now, let’s stub out a function which gets that info and writes the header in the correct format, leaving the body as a TODO:

fn discord_auth_header() -> String {
    todo!("fetch the needed info and format the header")
}

and use it to perform our command registration:

    let auth = discord_auth_header();
    ureq::post(&add_command_api())
        .set("Authorization", &auth)
        .send_json(cmd)
        .unwrap();

Once we fill in discord_auth_header, this will work.
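
The snippet above also calls add_command_api(), which isn’t shown. Here’s one way it might look, assuming you want a global command: Discord’s endpoint for that is /applications/{application_id}/commands. Reading the ID from the CLIENT_ID environment variable (set up in the next section) is my assumption, based on the client ID and the application ID being the same value in the developer portal; add_command_url is a helper name I made up.

```rust
/// Build the URL for registering a global application command.
fn add_command_url(application_id: &str) -> String {
    format!("https://discord.com/api/v10/applications/{application_id}/commands")
}

/// One possible `add_command_api()`: take the application ID from the
/// `CLIENT_ID` environment variable (an assumption, see above).
fn add_command_api() -> String {
    // Unwrapping here follows the same startup-only reasoning as the
    // public key: if this is missing, we shouldn't be running at all.
    let application_id = std::env::var("CLIENT_ID").unwrap();
    add_command_url(&application_id)
}
```

Splitting the URL formatting from the environment lookup keeps the part that can be checked without any configuration separate from the part that can’t.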

Client ID and Secret

For the next part, we’re going to need some extra information from the Discord developer panel for your application:

Your Discord application client ID
Your Discord application client secret

Now, because one of these is something you must keep secret, we’re going to cover how you should keep it.

In particular, we’re going to pull these from environment variables, and you need to ensure that those environment variables are defined in a script you do not commit to version control.

I use a file named .env which I list in my .gitignore to ensure I don’t add it to my repository:

CLIENT_ID="<your client ID>"
CLIENT_SECRET="<your client secret>"

Now, some people use a crate called dotenvy which handles loading files like this into the environment. You could consider using this, but as I work in a shell which supports loading them by running . .env, I will not be.

My command line for running the bot now looks like: . .env && cargo run

Now, to get this information from the Discord developer panel, go to the OAuth2 page, and find these parts:

Discord OAuth2 client information

You’ll need to click “Reset Secret” to get a new client secret, and then you can just click the “Copy” button which appears there and paste it into your .env file. That secret will be hidden again next time you visit the page.

Client Credentials Grant

Next, to get the token we need for authorization when registering our Slash Command, we use the client credentials flow.

This boils down to sending some form URL encoded data to Discord and getting JSON back.

Nothing shocking here. So let’s do it, home brewing some models that I couldn’t find in twilight-model:

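// Needs `serde` (with its "derive" feature) in Cargo.toml as well.
use base64::{alphabet, engine, Engine};
use serde::Deserialize;

#[derive(Deserialize)]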
struct ClientCredentialsResponse {
    access_token: String,
    token_type: String,
    expires_in: u64,
    scope: String,
}

fn authorization() -> String {
    let engine = engine::GeneralPurpose::new(&alphabet::STANDARD, engine::general_purpose::PAD);
    let auth = format!(
        "{}:{}",
        std::env::var("CLIENT_ID").unwrap(),
        std::env::var("CLIENT_SECRET").unwrap()
    );
    engine.encode(auth)
}

fn client_credentials_grant() -> ClientCredentialsResponse {
    ureq::post("https://discord.com/api/v10/oauth2/token")
        .set("Authorization", &format!("Basic {}", authorization()))
        .send_form(&[
            ("grant_type", "client_credentials"),
            ("scope", "applications.commands.update"),
        ]).unwrap().into_json().unwrap()
}

There is a deprecated way of writing HTTP Basic Authentication parameters in the request URL, but we will not be using it here.
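
To demystify the header we’re sending instead, here’s a standard-library-only sketch of HTTP Basic authentication: base64-encode the text id:secret and prefix it with the word Basic. The hand-rolled base64_encode is for illustration only (the base64 crate is what the real code uses), and basic_auth_header is a name I made up.

```rust
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Standard, padded base64, hand-rolled for illustration.
fn base64_encode(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        // Pack up to three bytes into one 24-bit number.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        // Emit four 6-bit groups; missing input bytes become `=` padding.
        out.push(ALPHABET[(n >> 18) as usize & 63] as char);
        out.push(ALPHABET[(n >> 12) as usize & 63] as char);
        out.push(if chunk.len() > 1 { ALPHABET[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { ALPHABET[n as usize & 63] as char } else { '=' });
    }
    out
}

/// Format the `Authorization` header value for HTTP Basic authentication.
fn basic_auth_header(client_id: &str, client_secret: &str) -> String {
    let credentials = format!("{client_id}:{client_secret}");
    format!("Basic {}", base64_encode(credentials.as_bytes()))
}
```

With the classic example credentials Aladdin and open sesame, this yields Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==. Keep in mind that base64 is an encoding, not encryption, which is why this only ever goes over HTTPS.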

Finally, we can define discord_auth_header like this:

fn discord_auth_header() -> String {
    let grant = client_credentials_grant();
    format!("Bearer {}", grant.access_token)
}

If our Discord application had a bot user, we could just use the bot token instead of going through this whole rigmarole, but we’re avoiding creating a bot user for this post. Discord says we don’t need one!

Rolling Dice. Finally.

I don’t know about you, but, after all that, I’m exhausted. Fortunately, the stuff that’s left really is easy.

Let’s just write a function which takes a dice expression as a &str, parses it, evaluates it, and returns a formatted result string. We’ll call it inside the ApplicationCommand handler we wrote earlier.

fn roll_expression(expr: &str) -> String {
    let Ok(expr) = parse_dice_string(expr) else {
        return "invalid dice expression".to_string()
    };

    let mut rng = bracket_random::prelude::RandomNumberGenerator::new();

    // Note: The `rng.roll_dice(expr)` function, while convenient,
    // doesn't handle the case of overflow, and we're handling untrusted
    // inputs here. I believe `bracket-lib` is meant for things which
    // actually run on the user's system, so don't hold this too much against them.
    let mut total: i32 = 0;
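    // A die needs at least one side; with zero sides, `1..1` would be
    // an empty range to sample from.
    if expr.die_type < 1 {
        return "This die has too few sides!".to_string()
    }
    let Some(upper_bound) = expr.die_type.checked_add(1) else {
        return "This die has too many sides!".to_string()
    };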
    for _ in 0..expr.n_dice {
        let Some(res) = total.checked_add(rng.range(1, upper_bound)) else {
            return "Encountered overflow while rolling dice!".to_string()
        };
        total = res;
    }
    if let Some(res) = total.checked_add(expr.bonus) {
        total = res;
    } else {
        return "Encountered overflow while rolling dice!".to_string();
    }

    format!("Your roll: {total}")
}
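
If you want to check the overflow handling without depending on luck, one option is to pull the arithmetic out into its own function and pass the die roll in as a closure. This is a sketch of that idea using only the standard library; checked_total is a hypothetical helper, not part of bracket-random.

```rust
/// Sum `n_dice` rolls plus `bonus`, returning `None` on overflow.
/// `roll_one` is called once per die and stands in for the RNG.
fn checked_total(n_dice: i32, bonus: i32, mut roll_one: impl FnMut() -> i32) -> Option<i32> {
    let mut total: i32 = 0;
    for _ in 0..n_dice {
        // `?` bails out with `None` as soon as an addition overflows.
        total = total.checked_add(roll_one())?;
    }
    total.checked_add(bonus)
}
```

In the bot, the closure would wrap the call to rng.range; in a test, it can return a fixed value, so checked_total(3, 2, || 4) is always Some(14) and checked_total(2, 0, || i32::MAX) is always None.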

And with that, our basic dice Discord bot works exactly as expected.

Successful /roll invocation

Addendum: Improving The Ecosystem

Work is ongoing in Twilight to make approaching this style of Discord bot easier in the future. I went ahead and asked if I could contribute a utility for doing signature verification, and have been asked in turn to file an issue proposing an API for the purpose for twilight-util. I intend to do so after a bit of rest.

I don’t know what the situation is in the Serenity side of the Rust Discord bot ecosystem, beyond “not supported right now”. I haven’t asked people over there about it yet.

Update: March 24th, 2023

I have since written an Axum middleware which handles signature verification, and filed that issue. I plan to work on it in the near future. I haven’t received much feedback on which direction to take it, so I’ll probably try all the approaches I listed there.

Addendum: Corners Cut

In this post, I cut corners on:

error handling: we should at least initialize a tracing subscriber and report failures using it
configuration: we should use a more principled method of passing configuration values like Discord pub key, client id, secrets, etc. to the program
code organization: plenty of things here could be their own functions, modules, or be approached in ways which would make them easier to extend when we want to do things like add new commands
automatic testing: we should probably have a test suite at all, lol

Following along with this post will definitely get you started making Discord bots, but there’s a lot more to making a project good, clean, and sustainable than I was able to cover here.