Narrative, mastery, and character bleed in games, with Ricki Heicklen

Patrick McKenzie Oct 16th, 2025

Patrick built a roguelike in 25 days with LLMs while Ricki ran a 250-person conference-as-game, both exploring how designers shape player behavior without destroying agency.

Patrick McKenzie (patio11) is joined again by Ricki Heicklen to discuss Metagame 2025, the power and responsibility of game designers, how games create pedagogical experiences that traditional teaching cannot, and what happens when the line between player identity and character identity starts to blur.

Complex Systems is producing more video episodes like this one. In addition to this site, you can access them directly on YouTube. My kids inform me that I’m supposed to tell you to like and subscribe.

Thank you to our sponsor: Mercury
This episode is brought to you by Mercury, the fintech trusted by 200K+ companies — from first milestones to running complex systems. Mercury offers banking that truly understands startups and scales with them. Start today at Mercury.com
Mercury is a financial technology company, not a bank. Banking services provided by Choice Financial Group, Column N.A., and Evolve Bank & Trust; Members FDIC.

Timestamps for Video:

(00:00) Intro
(01:11) Using games as pedagogical tools
(01:58) Ricki's journey into game design
(04:12) The fun and complexity of game design
(05:30) Metagame Conference: A unique blend
(06:07) Defining games and their broad appeal
(07:51) Escape room design and challenges
(09:09) Building and testing games at Metagame
(16:21) Mega game mechanics and challenges
(19:11) Sponsor: Mercury
(20:23) Mega game mechanics and challenges (part 2)
(31:56) Event management and lessons learned
(44:09) Encounter design in games
(45:56) Complex encounters and Plague Town
(48:40) Player choices and moral dilemmas
(56:22) Character bleed and real-world impact
(01:02:46) Game design and player agency
(01:25:02) Community and game evolution
(01:30:55) Wrap

Patrick notes: This transcript discusses an art project I did for a conference, IsekaiGame, at some length. I intend to write extensively on the development experience of that game, and also ship some improvements, but didn’t get to either of those projects prior to press date. You’re welcome to play it (for free) but should expect “playable alpha.” In particular, it is missing capstone encounters for the second act of the game and the end boss. More on IsekaiGame, likely at my main site, in a few weeks.

Transcript

[A brief note on our transcripts.]

Patrick: Howdy, everybody. My name is Patrick McKenzie, better known as patio11 on the Internet, and I'm here with my friend Ricki.

Ricki: Hi, everybody. Great to be here, Patrick.

Patrick: Thanks so much for coming on again.

Using games to teach trading

Patrick: So Ricki and I have previously done a conversation on this podcast about our mutual love of using games as a pedagogical tool to teach trading. Ricki has been running training bootcamps as one of the main parts of her job for the last two years or so, and I had Starfighter back in the day, which had a different take and different business model but fundamentally the same problem. Ricki recently invited me to a game design conference that she ran… goodness, just last weekend. [Patrick notes: Early September 2025, but it was a whirlwind of a month for both of us.] I did a bit of game development for it to have something interesting to talk about. So I thought we'd talk a little bit about why run a game conference, why games generally, why they're interesting to people who are intellectually interested in things like infrastructure, and then maybe take the conversation wherever it takes us.

Ricki's personal journey into game design

Ricki: Thanks. I've loved games since I was a kid growing up. Our family rule was if somebody needs one more for a board game, you have to stop whatever it is you're doing and go join them. It's always been a big part of what I've liked doing in my free time, but it didn't really occur to me until the past couple years that I could potentially also make a career out of it. Jury's still out on how much money it'll make, but at least in terms of getting started with games, I started running these training bootcamps about a year and a half, two years ago, and discovered that one of my favorite parts of them was writing the games with the intention of helping people understand concepts from quantitative trading.

The best way to learn how to trade is to trade, and we could make these play money markets in which people would play iterated games and learn concepts from trading as they went. This caused me and the other people I was running this with to have to think really hard about what kinds of game design help people ramp onto certain concepts. If we threw people straight into the deep end of what a market looks like with all of its intricacies, it would be a lot harder for them to develop the kinds of muscles that they needed for getting good at trading, versus if we started with a very simplified universe and increased things from there.

Exploring different game types

Figuring out what forms of games, what rule adaptations, what progressions of rounds worked best at causing people to learn certain concepts about underlying structure, about incentives, about what it looks like to be in an iterated game with other players you're competing with versus just with yourself—all of that ended up being one of the most fun elements of what I was working on. It caused me to expand more broadly and think what other kinds of games could exist, not just for quantitative trading, but for learning any of a bunch of different concepts.

Around the same time, I built out a puzzle hunt with some very close friends of mine and ended up loving that process as well, thinking through the entire user experience arc. What is it that somebody does from start to finish when they're playing through a game? How does not just the strategy component, the incentives of the different players and the thing that they're optimizing for, but also the narrative and the playfulness components—how do all of those unfold as people progress through a game or through a puzzle hunt or any other immersive experience of that sort?

Patrick: I feel this is lamentably understudied. There's the theory of fun—I think Raph Koster, if I'm remembering correctly, it might be someone he cited.

Challenges in game design

Funnily enough, we have an entire field called Game Theory, which is basically about games that no one wants to play. But there is relatively little study of games people actually want to play. To the extent that literature exists about it that's not written by and for hobbyists, it's sort of an oral lore that you get in the more commercialized parts of the game industry. You know, these ways of making gacha games are more effective at in-app purchases, historically optimizing the shop, etcetera, versus systematically breaking it down.

One of the things that attracted me to Metagame when you gave me the pitch for it was it's going to be about serious games, but maybe not exactly commercialized in the way that a video game conference would be commercialized, both in that some of the games that people presented are commercial products, some less so. I think there is a nice space for play, in the literal and figurative sense, when it's not necessarily going to demolish people's careers if they make an experiment and it doesn't work out.

Metagame conference overview

Ricki: That's exactly right. I use the framing of this event being halfway between a conference and a convention. Conferences are often geared toward highly specific industries. They're often the kind of expenses that people will have covered by their employer, so ticket prices tend to be a lot higher, and they're much more corporate and career motivated for people participating in them. In contrast, conventions are often lower production value, lower ticket cost, but intended for hobbyists who will much more enjoy sitting around playing board games all weekend rather than trying to get jobs directly out of it.

This was somewhere between the two. Part of what that came from was that we used a very broad definition of games. Whereas you might see conferences that are geared toward escape room manufacturers or video game designers or any of a number of other things, we used a pretty broad conception of what a game is.

When I was starting out, the definition that I was working with was any immersive experience in which the choices and strategies of the players involved matter for what the outcome looks like. My wife pointed out that this manages to include war, the militarized conflict, and not War the card game. So it wasn't necessarily a perfect definition. Where we eventually settled was anything you can play, anything for which you can go through the motions of playfulness, that often has some kind of strategy or narrative driving you through each step along the way, experiences that often invoke delight or wonder or a sense of awe. You and I have both experienced these while playing anything from video games to board games. They are also present in the experience itself of designing a game or being the storyteller running a game.

That's where we started out, and it turned out that this managed to attract people from a lot of different games communities, everything from board games, card games, video games to also puzzle hunts, crosswords, Sudoku puzzles, escape rooms, immersive theater, LARPs, tabletop role-playing games. I actually learned about a bunch of areas within games, like the escape room community, that I hadn't previously had a handle on as a kind of game, but came to really understand and absorb as being extremely games-oriented in how they've evolved over the past decade or so.

Patrick: I loved how some of the talks were a bit of a peek behind the curtain of those things. I've played in an escape room or three, but never gotten to ask anyone, okay, what are the physical constraints of doing something? It's both engineering work and morally speaking, a home improvement project or a commercial construction project in a space. How do you not go insane? Because there are attendees that are seeing these artifacts in real life for the first time every hour on the hour. How does that not break constantly?

I just got a knowing smile, like, yeah, going to lose all my hair in this industry, but it's worth it.

Then there was another person who gave a long, detailed talk about the joys of doing contract manufacturing in China for cardboard stock, the types of miniatures that one is likely to find in games, whether you are getting dice off the rack versus custom-made dice, and that sort of thing. It reminds me of a thing that I've been reminded of at many points in my career: the economy is fractal in detail. It was great to talk to people who were seeing the nodes on the fractal versus just seeing the end product. [Patrick notes: I apologize for murdering this metaphor by suggesting that a fractal has nodes to it. Of course each node has infinite detail within it; that’s the point, etc.]

Ricki: Yeah, I believe that was Reed also talking to Spencer Bibb about the process of working with manufacturers, building out those parts. Reed actually runs a game production company that builds out tabletop games.

Escape room design experience

Ricki continues: One of the highlights at the conference was the Build Your Own Escape Room session, where essentially on Saturday, a group of about a dozen amateur designers decided we are going to build an escape room in one of these rooms on campus and playtest it by having—I think they ended up running eight iterations of the escape room on Sunday. They managed to build the escape room itself on Saturday in under two hours. But over the course of eight hours of playtesting on Sunday, they changed all but one of the puzzles involved in it, I believe. I and the rest of the Metagame admin team at 1 a.m. on Monday morning played through the final version of it and had an absolute blast doing so.

Patrick: That was a quite incredible escape room for my limited experience with the genre. I was in playtest group number six or so, and yeah, great experience was had by all.

Verbal reviews of escape rooms are not great to listen to. You kind of have to be there. And so I’ll spare the audience. But it's amazing that you can get a non-specialist group of people who have presumably not collaborated on major projects together, and then get them to build something. Everything built has an internal logic in the universe of the room, but also the puzzles more or less have to be created in parallel with each other, with some amount of interaction.

[Patrick notes: They do, after all, share the same physical space, and so even if you don’t plan for interaction players will provide it anyway as they misapply clues internal to Puzzle A to Puzzle B. But in fact one thing many escape rooms do is try to demonstrate enough authorial intent that players become genre savvy relative to the specific room they are in, understanding how the usual bag of tricks likely applies to Puzzle C contingent on Puzzle C being in this experience.

One of the great joys of games is learning and achieving mastery, and a well-designed escape room can give you that experience in sixty minutes on the dot, while also providing other types of fun.]

Patrick continues: Which is not too different from the actual experience of running an escape room with six of your friends, where there's an internal logic to the room and some level of authorship to it, but the puzzles are often strikingly distinct in tone or what sort of mental muscle they exercise, partly to give variety over the experience of the room and partly as an authorial choice to make sure that the six or eight people that are playing a room, that everyone feels like they contributed to the eventual success at the end of the hour.

[Patrick notes: It is something of a faux pas in escape room design if a team having a teambuilding activity is “carried” by the one member who was on Math Team in high school and answers every challenge with “Oh, pigeonhole principle, duh. … Just let me do it, explaining it will take too long.” That isn’t necessarily a fun outing for that team member and is almost certainly not fun for their coworkers.]

Ricki: Yep. Something that was really cool to me was they were working on a budget of whatever random materials I've hoarded over the past year and refused to throw away. As a result, instead of having a lockbox combination that you need to insert a code into, they had a picture of a lock drawn on a piece of paper such that if you put the right number into the pinpad—the pretend pinpad with your finger—one of the immersive actors, the game runners, would make some "woo" noises and open the door for you to let you into the next phase.

This allowed them to not be blocked on having a lot of fancy materials, making sure they worked, but instead get some rapid prototyping, rapid iteration on the concepts that they were working with before overtly investing in any specific materials. If they were to then go out and build this escape room for production purposes—I think a big part of the motivating factor behind wanting to run Metagame was giving people a space to iterate on things that they were excited about building, but didn't necessarily have immediate access to the space, the audience, the guinea pigs of a couple hundred attendees who are eagerly ready to play through their escape rooms, or their board games, or their roguelike video game.

Building a roguelike game

Patrick: Yeah. I built a CRPG roguelike for the purpose of the conference. I remember how (impressionistically) our first conversation went. You said I should give a talk, and I understood—sure, conferences have a business model, I’m somebody with a bit of an Internet profile, if I give a talk you sell tickets. Makes sense and happy to do it.

But what would I talk about? You suggested I talk about Starfighter. We've talked about Starfighter before, and I said it's kind of lame. It's been out of the market for ten years. What can I really say? "Oh, it was really fun. You had to be there. Sorry, it's not accessible on the Internet right now." Then I suggested there was an ambiguously Stripe-affiliated project that a few of my coworkers worked on, which was a frustration game designed to show people how terrible it is to be a Japanese office worker doing invoice reconciliation.

Ricki: I think I said that sounds extraordinarily boring, Patrick.

Patrick: Yeah. So we landed on the obvious choice, which was: have me build a complete end-to-end game using technologies that I'd never used before in the month of August. [Patrick notes: This is deadpan humor.]

Somewhat to my own surprise, I actually made the schedule work just in time to have something that was playable and, according to attendees, fun for the weekend.

Ricki: Yep. I remember a few days before—Patrick had been keeping the team up to date on his progress as he worked through this, debugged, made iterations—I think four days before the conference itself, we saw a message from Patrick that said, "Update! Great news! I have successfully lost the game."

Patrick: The literal last thing to go into the game, about an hour to an hour and a half before the speech about it where it was getting released to the attendees, was the concept of leveling up, which I got bullied into by an LLM—that's a different part of the story.

[Patrick notes: I gave Claude Code a spec for what “consequences” encounters could cause and asked it to write the code for reading an encounter definition file and then communicating success/failure consequences for rolls to the rest of the system. This listed e.g. “gain an item, gain a status effect, lose an item, …” in a fashion I intended to be an exhaustive list. Claude added gaining experience, taking damage, and receiving healing. I deleted those since I didn’t plan on the game having those concepts. Claude re-added them, twice, and then I bowed to genre convention and inevitability.]
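
To make the shape of that spec concrete, here is a minimal sketch in Python of an encounter definition with success and failure consequences. It is an illustration only and not IsekaiGame's actual format: the JSON field names, the `Consequence` type, the d20-style difficulty threshold, and the sample encounter are all assumptions invented for this example.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Consequence:
    """One effect an encounter can have on the player (hypothetical shape)."""
    kind: str   # e.g. "gain_item", "lose_item", "gain_status", "take_damage"
    value: str  # item name, status name, or an amount encoded as text


# Hypothetical encounter definition; every field name here is invented.
ENCOUNTER_JSON = """
{
  "name": "Locked shrine",
  "difficulty": 12,
  "on_success": [{"kind": "gain_item", "value": "shrine key"}],
  "on_failure": [{"kind": "take_damage", "value": "3"}]
}
"""


def resolve_encounter(definition, roll):
    """Return the consequences for a roll: success if roll >= difficulty."""
    branch = "on_success" if roll >= definition["difficulty"] else "on_failure"
    return [Consequence(**entry) for entry in definition[branch]]


encounter = json.loads(ENCOUNTER_JSON)
```

Calling `resolve_encounter(encounter, 15)` returns the success list and `resolve_encounter(encounter, 5)` returns the failure list. Keeping consequences as plain data means adding a new kind (experience, healing) is a change to the definition file and to whatever applies the effect, not to the roll logic.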

Patrick continues: But the game was playable in parts for much of the last two and a half weeks at that point. It had no victory screen until the day before. Four days before, I actually turned on the thing where if you go to negative health, that is negative for you. Feature checked off: you can lose the game. Losing games can be an important factor of fun! That creates tension and some desire to balance resources against each other.

Funnily enough, one of the bugs we discovered on the first day of the conference was: well, I had experimentally tested losing the game, but I had not experimentally tested winning the game, and winning the game on the first day returned a 404. So I had to do forensic reconstruction of whether these people actually won the game, but I was able to successfully do that.

Before we get into the details about that, were there any other people that made something just for the purpose of this event, besides the escape room team and yours truly?

Ricki: Yep. A handful of people built things out. Actually, one of our main sponsors built out a game called Utility Monster, in which people would play repeated cards from a deck of cards, each of which had utility payoffs. It essentially said for a certain number of yes votes from these six players, pay out this much to the yes votes and this much to the no votes, with different values in each case. The group had to coordinate on what they were going to each respond—yeses or nos—in order to determine the payout, with a bonus for predictions made at the beginning of the game. If the total score that they reached exceeded that initial prediction, there was an adjustment.

A group of attendees hunkered down within the first few minutes of the conference and spent their first day trying to solve this game. What is the optimal move? What is it that you want to be doing? Because you were playing as a team that wanted to maximize your team score, but ultimately there would only be one individual winner who had the highest score among all attendees over the course of the conference.

There's this tension between wanting to work together with your team and wanting to maximize your personal score.

I believe the maximum possible score was attained by Saturday morning by one of the members of that group of six that spent a bunch of their conference trying to solve this.

Metagame mechanics and challenges

Patrick: Which reminds me, there was a mechanic called the MegaGame, which I don't think I've ever seen at a conference. I've definitely seen conferences that have some sort of puzzle hunt or a mixer planned that might have game-like elements to it. But you encouraged lots of people to take the skeleton of the event you were running—you divided people into two teams, purple and orange—and integrate their game in some fashion into the overall event such that while people were attending the conference, they were also playing essentially a territory capture game over the conference venue.

Ricki: That's exactly right. The maps that we gave people of the conference venue were also themselves a board for the game. You could watch on the website as the map updated as different teams took over the different territories across campus.

Whenever they arrived at campus, they were either put on the orange team or the purple team. They received a Metagame 2025 t-shirt that was either orange or purple, along with a bandana that had a puzzle on it, and a bag with some custom pieces in it. They were now drafted onto this team. Each team had secret hidden headquarters that puzzles in their bag would lead them to, and each team would submit people to RSVP to different sessions and go through competitions, battles for individual territories. I believe your game that you ran was actually a battle between purple and orange for the park area that you had spoken in, and I think ultimately orange won that one.

Patrick: I think, yes. It was almost a split decision. Funny story: in the pell-mell dash to get the game working, I had made a dashboard for myself as the game master to check who was actually achieving the victory conditions in the game. But due to the bugs on the first day, I had to do forensic reconstruction with the logs. I initially announced to you and the organizers that someone from Team Purple had won. Then when I fixed the dashboard the next morning, I realized, oh no, actually 30 minutes earlier it was this person from Orange. So we flipped it. But still, great times had by many.

Actually, the person who won the game for Orange said while they liked winning the game for Orange, they did not love the aesthetic experience of the game. [Patrick notes: Koster’s Achiever archetype, in the flesh!]

But I remember other attendees told me that they actually enjoyed the aesthetic experience of the game as well. I was quite unsure when it was ready or ready-ish for the conference whether there was a game there. There was a collection of mechanics and a web application that one could push buttons in—and it was not a simple web application to build—but it wasn't obvious to me that it would yield something that was actually playable and fun. At least some people told me it was playable and fun.

Ricki: Yeah. Just before we dive into some of the details of what it looked like for you to build it, to give people a flavor of Megagame and what different components it included: among other things, players in their bags were given, depending on their color, either a tiny purple pawn or a tiny orange pawn. On campus there was a board of gigantic purple and orange pieces in a game called MegaChess. The first rule of MegaChess is you don't talk about MegaChess. Nobody on their team is allowed to discuss the game with anybody else on their team or in general. At one point during the conference, you can place your single pawn into the jar of pawns and make one move in the game, hit the clock so that it's the other team's turn. Over the course of, I think, 24 hours on each team's clock, you would play through a game of chess.

There were many similar challenges across campus, most of which were competitions for territories, and some complicated board game mechanics that determined which territories then gave you advantages into adjacent territories if your team had already unlocked puzzles on the road connecting those territories to each other. As you might imagine, the set of people who are excited about a game design conference really enjoyed designing a number of different complicated mechanisms that determine how the Megagame at that conference goes.

One of the hardest things for us was: how do we playtest something like this? How do you take a conference for 250 people and simulate what rounds of that game will look like? You can do this by applying some amount of randomization to each team, to who wins which territories, and seeing how it evolves. You can do this by trying to figure out from first principles without playtesting what amount of advantages might look like. You can use references of other games that you've played before to try to figure things out. But ultimately it's just really, really hard to get a good simulation of what 250 people look like.
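
The randomization approach Ricki mentions can be sketched as a small Monte Carlo simulation. This is a toy under stated assumptions, not the real Megagame: the four-territory map, the fixed battle order, and the rule that each adjacent territory a team already holds shifts the odds by `adjacency_bonus` are all invented for illustration.

```python
import random

# Hypothetical map: territory -> adjacent territories. Not the real Metagame map.
ADJACENCY = {
    "park": ["hall", "lawn"],
    "hall": ["park", "library"],
    "lawn": ["park", "library"],
    "library": ["hall", "lawn"],
}


def simulate_once(adjacency_bonus, rng):
    """Play one randomized game; return how many territories purple wins."""
    owner = {}
    for territory, neighbors in ADJACENCY.items():
        purple_adj = sum(owner.get(n) == "purple" for n in neighbors)
        orange_adj = sum(owner.get(n) == "orange" for n in neighbors)
        # Assumed rule: each adjacent territory already held shifts the odds.
        p_purple = 0.5 + adjacency_bonus * (purple_adj - orange_adj)
        p_purple = min(max(p_purple, 0.0), 1.0)
        owner[territory] = "purple" if rng.random() < p_purple else "orange"
    return sum(team == "purple" for team in owner.values())


def sweep_rate(adjacency_bonus, trials=10_000, seed=0):
    """Fraction of simulated games in which one team takes every territory."""
    rng = random.Random(seed)
    total = len(ADJACENCY)
    sweeps = sum(
        simulate_once(adjacency_bonus, rng) in (0, total) for _ in range(trials)
    )
    return sweeps / trials
```

Comparing `sweep_rate(0.0)` with `sweep_rate(0.2)` shows how quickly an adjacency advantage turns an even contest into a runaway. A simulation like this can check whether the mechanics snowball, but it says nothing about how real players behave, which is the part Ricki describes as hard to simulate.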

In fact, the puzzles—which were a variant of Sudoku that allowed you to learn the rules of those Sudoku as you played them incrementally, where some were child nodes in a tree of Sudoku and others were parent nodes, where each one of those Sudoku would unlock a road within the map of different territories—were 100% solved almost immediately by the orange team, to the point where my web dev team assumed that we had a bug in the code somewhere. They rolled back all of the orange road victories and tried to figure out what the problem was before we figured out that I think one or two people had successfully immediately grabbed the 17 Sudoku and solved all of them.

We misestimated how long those puzzles would take. We figured they'd probably be doable in the first 24 hours or so, but not necessarily the first 24 minutes. In contrast, we had a puzzle on the bandana for which the solution was “hidden deck.” People assumed this was a hidden deck of cards, and indeed, we had hidden a deck of cards on that deck that one could use in the game later on. But nobody knew about the hidden deck, and the people who didn't know about it thought it was off limits. So it wasn't until the very end of the conference when we revealed where it was that anybody got to access it.

I think this question of how long it will take a large group of people to solve a puzzle for which all that matters is how fast the best puzzle solver is at getting to that solution is just really hard to simulate through a few people playtesting the puzzle or figuring out—you know, if I can decode the solution that you're hinting toward via binary and semaphore, but I can't necessarily turn that into the knowledge of where it is that I go or what it is I do next without a bunch of additional context of, okay, I'm looking at a map of the board and it says "deck" on part of it, and that deck seems to be hidden behind somewhere else. How do you test whether someone can piece together those parts?

Patrick: Also, by construction, the attendee pool here is a lot of people who break distributions for the amount of skill required for these various things. If you have six playtesters recruited from, you know, two sigmas above the population median for college-educated people for game savviness, and then one attendee says "No, Sudoku is my jam. I have watched every episode of Sudoku YouTube and not learned anything in the last three years. I can do these in my sleep. I also have a solver on my phone because why wouldn't I," then yeah, that breaks the curves a little bit.
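
The gap between six playtesters and 250 attendees is an order-statistics effect: when only the fastest solver matters, a bigger group reaches further into the fast tail. Here is a minimal sketch in Python, assuming solve times are log-normally distributed with a 60-minute median; both the distribution and its parameters are made up for illustration, not measured at the conference.

```python
import math
import random
import statistics


def fastest_solve_minutes(group_size, rng, median_minutes=60.0, sigma=0.8):
    """Time for the fastest solver in a group (assumed log-normal solve times)."""
    mu = math.log(median_minutes)
    return min(rng.lognormvariate(mu, sigma) for _ in range(group_size))


def typical_fastest(group_size, trials=2_000, seed=0):
    """Median, over many simulated groups, of the fastest solver's time."""
    rng = random.Random(seed)
    times = [fastest_solve_minutes(group_size, rng) for _ in range(trials)]
    return statistics.median(times)
```

Comparing `typical_fastest(1)`, `typical_fastest(6)`, and `typical_fastest(250)` shows the fastest time dropping sharply as the group grows, even though every simulated solver comes from the same distribution. A real attendee pool skewed toward puzzle enthusiasts, as Patrick describes, would widen the gap further.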

Ricki: That's absolutely right. Yeah. Ultimately, I joke that what we just did is we playtested running a conference. We did the first ideal play of a 250-person game, and we learned so much about things we need to do, about what the progression of hints that you should give people is in order to allow you, as the game designers, to have some lever on whether people are making it through and accessing the information they want. What information should be hidden versus revealed at the beginning?

I think one thing that we had hidden was the existence of a team headquarters, a place that you could collaborate with other people. It required a puzzle for which pieces were in different team members' bags, but you needed four different pieces from four different team members' bags in order to solve the puzzle together, with the idea being that collaboration with others and communication with others would unlock further collaboration in the form of finding the headquarters.

This made for an amazing experience for the four people who found that headquarters first together, where they got to go through the magical experience of "whoa, there's a team headquarters, it's decked out fully in orange, it has lots of resources here for us to use." But it also meant that a lot of people never found out about the headquarters or didn't orient around that as a meaningful part of the game, and because of that, couldn't collaborate as closely with other people as they might have otherwise been able to. This hurt the experience of people playing the game in terms of the collaborative spirit and ability to sit down and puzzle through a Sudoku with another person and the ability to appoint different people on the team into different roles.

Patrick: So in a hypothetical 2.0 of this, you might have at the beginning of the conference, the conference mixer—like a lot of conferences do—and perhaps as a part of that activity, introduce people to the opportunity to unlock the headquarters as the reward for completing the tutorial, as it were. Now you've all accomplished a win together, you've solidified your temporary identity as dyed-in-the-wool purple versus those orange fanatics on the other side.

I do love that you can make any group of people into implacable foes just by saying, "You're Orange, you're Purple, go." That’s an important lesson about human society, for many reasons.

The game was very dynamic, and I think you used the metaphor of having an API between the individual game runners and the Metagame conference where you had requested: we are going to provide to you this information during the administration of my game, the roguelike, and you will provide to us by this time on Sunday a binary decision of which team won the game. You said I was largely at my discretion for adjudication, as long as there was a rule in advance.

Ricki: We essentially said to subgame runners: we will give you as an input one of three different advantage sizes for one of the teams—either no advantage to either team, or a small or medium or large advantage. It's up to you to determine what that is. We're essentially telling you we want you to shift the probability from 50/50 to 60/40 or 70/30 or 80/20, and at the point at which your game resolves, you give us either "purple won" or "orange won." You may not decide that the two teams have tied—figure out how you're going to do that resolution criteria.

We wanted to keep this API very simple. We didn't want a system where people are awarding a certain number of points and then miscalibrated between each other what that number of points is, or what kinds of currency one gives out internal to the game.
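
The contract Ricki describes is narrow enough to sketch in a few lines of Python. This is an illustration under assumptions: the names `Team`, `ADVANTAGE_TO_WIN_PROBABILITY`, and `resolve_subgame` are hypothetical, and the weighted coin flip stands in for whatever mechanism a real subgame runner would use to apply the advantage.

```python
import random
from enum import Enum


class Team(Enum):
    PURPLE = "purple"
    ORANGE = "orange"


# Assumed mapping from advantage size to the favored team's win probability,
# following the 50/50, 60/40, 70/30, 80/20 figures from the conversation.
ADVANTAGE_TO_WIN_PROBABILITY = {
    "none": 0.5,
    "small": 0.6,
    "medium": 0.7,
    "large": 0.8,
}


def resolve_subgame(favored, advantage, rng):
    """Return the winning Team. A tie cannot be expressed by the return type."""
    p_favored = ADVANTAGE_TO_WIN_PROBABILITY[advantage]
    other = Team.ORANGE if favored is Team.PURPLE else Team.PURPLE
    # Stand-in for the real subgame: a weighted coin flip.
    return favored if rng.random() < p_favored else other
```

Because the only output is a `Team`, nothing a subgame does internally (points, currencies, exploits) can leak into the wider game. The worst a broken subgame can do is hand one territory to the wrong side.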

Patrick: Because then you're testing which is the worst-designed game in the pool versus which team is making the most broad-based effort at winning games.

Ricki: And if a game, let's say it turns out to have a hack where somebody can earn infinitely many points, you don't want that to be something that takes down the structure of the rest of the MegaGame as a whole. You want it to be possible for them to say, "Oh, Orange found the infinity hack, Orange will win this territory," and not have that overflow into compromising other parts, including the integrity of the other parts.

The part of the game that was most controversial was the election competition between the two different teams for the role of Sovereign of Metagame, in which some shenanigans between the teams—partially encouraged—such as stealing vote tokens, tricking other team members, or tricking front desk volunteers into giving them pieces, impacted the resolution. This determined which members of each team would end up playing in the final endgame of the game. It was not viewed as being as robust or high-integrity a determination as some attendees would have wanted. As a result, it made it a lot harder for people to engage with this in a way that respected it as a fully robust game.

[Patrick notes: This was a tough game design challenge for Starfighter back in the day, too. How do you encourage cheating as a game mechanic while also socializing to players “There is cheating, which is just you getting one over on us through superior application of skill, savvy, and cunning, and then there is ruining the game for other people, inclusive of us the company running the game.”

Our solution was, effectively, “Start with players who we more or less trust and then address this more formally when we have to.” We never did; the business died before it could be a problem. It still took up some founder cycles as we debated, e.g., “How do you message this in a way which will not antagonize security researchers in a way that is professionally significant?”]

Ricki continues: Now, fortunately, it was just a game, so at the end of the day this was not that big a deal. An important part of the playtesting process is: does something break along the way? But I think that if you kind of contain each of those subgames to their own territory, somebody might have a bad experience of one territory flipping in the direction that they would have disagreed with if they'd been the one to resolve that game itself. But at least the effect size of that isn't to spread through the structure of the entire broader game as a whole.

The other reason for this was that management is really hard. Managing a whole bunch of people working on different subgames and making sure that each of them fully understands all of the information they need to understand in terms of interactions between their game and other people's games, or how it is that they should incorporate other information as it updates, is extremely hard. It was hard enough even for us to walk the subgame runners through: here is the map that you look at, here is how you figure out, based on the colors of the roads on that map and which adjacent territories are colored, what it is that you're supposed to do, here is how you RSVP or un-RSVP attendees on each team from being present at your session—which ended up having in-game effects because attendees on the winning team would get to choose between seven different player cards to update their personal profiles into, which ended up mattering for the final endgame.

A quick side note on that: we took the deck builder game Celestial, an online card game that one of the members of the team built, and reskinned it to be Metagame-themed. After each challenge, after each subgame in the Megagame, members of both teams could go through some card transformations where they went from having their card—the thing that started on their badge at the beginning with weaker powers—to having a card with higher scores, sometimes higher costs, sometimes lower costs, and special abilities that allowed them to then play in the final game.

Even just managing that process—the card upgrades and how to get each subgame runner to disseminate that information, to make sure the right buttons were pressed on the website to implement all of the different changes that needed to happen—was pretty non-trivial, especially when you have different subgame runners with different amounts checked in to what the structure of the game is as a whole versus "I'm running my own game, I'm debugging all the things that go wrong on the fly, like a 404 error. I don't necessarily have my head plugged into this much larger structure."

If we wanted to give attendees the opportunity to build a game and have it be one piece of the larger puzzle of the Megagame, we needed to keep that API super simple and straightforward.

Event management and lessons learned

Patrick: And you're juggling all these challenges while juggling the usual challenges of event management in real life, where there are 200 people who are arriving, many of whom do not know each other. You have to bootstrap a high-trust environment very quickly, keep the schedule running, get things done on time. There are snacks and coffee that need to arrive or people will be cranky. There might be people trying to gatecrash the door without a ticket. And then the very quotidian doing-things-in-IRL problems like: is the food truck here yet? People are going to want to eat.

So hats off to you and the rest of the organizers for a—I was going to joke and say fatality-free conference. Every time there is something of non-trivial complexity done in the world of atoms and not the world of bits—which had no small amount of software written to support it, but neither here nor there—it's an opportunity to remember that the default outcome for running an event is the Fyre Festival, where you can have the best of intentions and you can even take actions based on those intentions to do the obvious things, and by default you will fail catastrophically. That was within acceptance criteria. Wonderful.

Ricki: I do think the past year of running events—these training bootcamps for groups of 25 students at a time—has definitely taught me a whole bunch of things about how the default is failure and one needs to be robust to a lot of different things going wrong. Still, scaling up to a conference of 250 people comes with its own set of challenges. In particular, things like: how do you balance a schedule so that you have the right number of attendees?

I think we overdid it. I was so excited about all the different content that people wanted to offer, but forgot to account for the fact that if you have 250 ticketed attendees, that doesn't translate into 50 attendees at each event when every time slot has five different events running. Most people are going to be taking some of the conference off or attending a couple sessions a day, but playing pickup games with others or just too tired from all the things going on. Puzzler fatigue is real, and people who are up late cranking on all the puzzles for the MegaGame aren't going to be doing that while also being full participants in another session.

As much as a lot of the people at this conference are pretty ambitious and pretty excited about participating in a lot of the different things, I think things like better understanding what the choreography of bodies looks like—at what point people are going to what sessions, where do you want to have things physically happen in order to funnel people into different spaces—is a big part of conference design in a way that, for me, feels not that different from game design itself.

For example, we had a crossword construction relay race in which we put members from the Orange and Purple teams to work creating first a theme and then a grid, and then fill—the words that fill in that grid—and then clues, in successive order, in order to submit those to a crossword speed-solving game played the following morning by two of the top speed solvers in the world, I believe. That speed solving was originally going to happen in a kind of back garden space in order to be a competition for that territory. But ultimately we realized if it's happening back there, nobody's going to go there, and moved it to the central courtyard, a natural thoroughfare where a lot of people would be able to see what's happening and get excited. That ended up attracting a much larger crowd than it otherwise would have.

Now, does that compromise the underlying integrity of the MegaGame as competition for territories in those territories themselves? Turns out nobody cared. It was much more important to have the actual people attending the speed solving, followed by a commentary on how one would edit a crossword like this, taking the Orange team's crossword and picking apart: what would you do in this corner over here? How would you adjust these squares here? What would be a better clue to communicate the theme more clearly to people? Giving that feedback in live editing in front of an audience.

Patrick: There was a lot of the best-laid plans of mice and men happening, and also a commendable willingness to say, "Okay, two weeks ago when we sketched this out, the plan was X. However, it does not look like it will be the maximally fun thing, and therefore let's do the fun thing."

Ricki: Whole bunch of that.

Patrick: People are here for the fun and the learning experience.

Ricki: To my partial dismay, one of the most popular things was a 25-foot-long crossword that cost $15 online and took up 20 minutes of a volunteer's time hanging up, and ended up being the thing that everybody did on their way to the space where the food trucks were. It was along this passageway, and anyone who walked by it picked up one of the pencils hanging above it and filled out a few answers until the point at the end of the conference that it was 100% completed by the attendees. That took basically no effort whatsoever and no money whatsoever for us to make happen, and was a huge hit among attendees.

Patrick: Meanwhile, I spent 300 hours or so on building my game, which probably had less use than the $15 crossword puzzle. Not that I'm bitter!

[Patrick notes: I had more fun building IsekaiGame than I have had professionally in a while. That said, the recovering consultant in me definitely noticed “Hmm, this is a $200,000 deliverable.”]

Patrick continues: To comment on the space a bit: We're recording this podcast from Lighthaven, which was also the conference venue. Wonderful conference venue, unsolicited plug. [Patrick notes: Oliver Habryka, who runs it, previously discussed the combination of organic layout and design intentionality on Complex Systems.]

I particularly like it as a conference venue because the physical layout of it strongly encourages people to have a hallway track, spend time interacting with each other. While you are moving from point A to point B, you will sort of naturally be funneled into places where you will meet other attendees. I think the team wove that into the tapestry of the conference very well, with things like: as you are going to get food, you will encounter the crossword puzzle.

Ricki: Yep. The Sudoku spread throughout campus along the roads connecting the different territories so that walking around you would naturally encounter them, things along those lines. But we definitely learned things about—we'd placed the playtesting plaza too far out of the way for people to just play pickup board games, and they ended up playing the pickup board games on the picnic tables right next to the main building. That was also fun. We've learned some things for next year.

But I want to hear more about the lead-up to the conference. Tell me more about the process of building out the roguelike game that you built, and what your 25 days in advance of the conference looked like as you were spending hundreds of hours on that.

Patrick: Sure. I've wanted to play with LLMs more seriously since essentially the point where I left my last full-time employment, and that's been on the list. Then I've had a lot of projects over the course of the last two years—moved my family to America, etcetera—and had never really made the space for sitting down and working seriously with them as a technology, aside from doing what everyone does and using ChatGPT to edit documents, do research for writing my newsletter, etcetera.

One of the reasons that I hadn't made the time was: well, I don't want to make a company right now. There's some expectation in my social circles that you can't really do a project unless the project is also a company. So it was good to have the space made available to just do a fun art project. It doesn't have to have a price tag on it. It doesn't have to be the next phase of your career. Just do the thing. That was exactly the shape of the project I needed to get going on something.

So I decided on making a roguelike CRPG. Apparently CRPG is a dated reference these days. Everyone just calls them RPGs, like Baldur's Gate III is an RPG. But back in my day, when dinosaurs roamed the earth, the difference between an RPG and a CRPG was the computer is playing as the game master for you. Back in the day, that meant that encounter designers had to laboriously write out scripts that would say everything that you, the character, could possibly say.

Those scripts are not just flavor text; they also control the rules for a major portion of the game. If the player says this as a dialogue option, we're going to make a roll on Intimidation skill with a bonus based on their Charisma stat. Then if they succeed in that roll, we go down path A; otherwise we go down path B.
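
[Editor's sketch: the kind of hand-written script Patrick describes might look like the Python below. This is an illustration of the general pattern, not code from any particular CRPG or from IsekaiGame. `DialogueOption`, the `target` number, and the rule in `stat_bonus` are assumptions made so the example runs.]

```python
import random
from dataclasses import dataclass


@dataclass
class DialogueOption:
    text: str        # what the player character says
    skill: str       # skill that gets rolled, e.g. "intimidation"
    stat: str        # stat that supplies the bonus, e.g. "charisma"
    target: int      # assumed: total needed for the roll to succeed
    on_success: str  # id of the next script node (path A)
    on_failure: str  # id of the next script node (path B)


def stat_bonus(stat_value: int) -> int:
    """Assumed bonus rule for illustration: +1 per two points above 10."""
    return (stat_value - 10) // 2


def choose(option: DialogueOption, character: dict, rng: random.Random) -> str:
    """Roll a d20, add skill ranks and the stat bonus, and pick a branch."""
    roll = rng.randint(1, 20)
    total = (
        roll
        + character["skills"].get(option.skill, 0)
        + stat_bonus(character["stats"][option.stat])
    )
    return option.on_success if total >= option.target else option.on_failure


# Hypothetical script entry and character, invented for this example.
threat = DialogueOption(
    text="Stand down, or else.",
    skill="intimidation",
    stat="charisma",
    target=12,
    on_success="path_a",
    on_failure="path_b",
)
hero = {"stats": {"charisma": 14}, "skills": {"intimidation": 1}}
print(choose(threat, hero, random.Random()))
```

Every option the player can pick has to be written out ahead of time as a record like `threat`, which is the labor Patrick is pointing at.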

CRPGs are amazing. A lot of people have enjoyed them over the years. I think some CRPGs are the best games ever played. And yet, there is something about the experience of pen-and-paper role-playing that they don't capture, because around the table with humans making decisions based on things—you have not infinite cognition available, but a surplus of cognition available to make decisions for how the game should mechanically represent something that the designer of the module you're playing didn't anticipate you trying to do.

When I was sitting down and writing the design document, the mission statement for the game using LLMs was: how do you resolve Sister Maria attempting to challenge an orc chieftain to a drinking contest if Sister Maria doesn't have a drinking stat on her character sheet?

Having built technical systems before, the first thing I thought was: this implies so much stuff for this game that is absolutely doable given enough time. I don't know that 25 days is enough time, but there's going to be a Rails app somewhere that allows me to pick choices and dialogue. Yeah, sounds like a problem given enough hours to throw at it.

The first question I have is: can an AI, on any level, be decent at playing a game which is Dungeons & Dragons with the serial numbers filed off? [Patrick notes: I am describing the completely original ruleset which did not actually infringe on pre-existing IP but rather was inspired by a lifetime of play lightheartedly, future lawyers.]

First test: just open up Claude or ChatGPT, ask some questions. Turns out: sails over the bar for just asking questions.

Then I was like, okay, well, I'm going to progressively de-risk the things that I'm doing. So the first thing that I built was a caching proxy server for the OpenAI and Anthropic APIs, where I could pass a question—there is no magic, the only thing those APIs expose is you put text into them and output comes back—but if I pass along a prompt, can you give me the answer to that prompt and can I stick it somewhere in case the same question gets asked repeatedly? Because I thought I was going to have a sort of game-time resolution of a lot of these questions. For the most popular characters at the earliest challenges of the game, quite frequently I want those answers cached and snappy versus having to be part of the meteoric revenue graph of those companies. Not that I begrudge them their money, I just don't want to pay it all myself.
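
[Editor's sketch: the core of a caching layer like the one Patrick describes is a lookup keyed on the request. The Python below is a guess at the general shape, not his implementation. It keeps the cache in memory and takes the backend as a plain function, so `CachingLLMClient`, `fake_backend`, and the model name are all hypothetical and no real vendor API is called.]

```python
import hashlib
import json
from typing import Callable, Dict


class CachingLLMClient:
    """Wrap any (model, prompt) -> text function and remember its answers."""

    def __init__(self, backend: Callable[[str, str], str]) -> None:
        self._backend = backend  # hypothetical callable that talks to an LLM
        self._cache: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        # Hash the full request so identical questions map to the same entry.
        payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def complete(self, model: str, prompt: str) -> str:
        key = self._key(model, prompt)
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        answer = self._backend(model, prompt)  # the slow, paid call
        self._cache[key] = answer
        return answer


def fake_backend(model: str, prompt: str) -> str:
    """Stand-in for a real API call so the example runs offline."""
    return f"[{model}] response to: {prompt}"


client = CachingLLMClient(fake_backend)
prompt = "Write a character sheet for Sister Maria."
first = client.complete("example-model", prompt)
second = client.complete("example-model", prompt)  # served from the cache
print(client.hits, client.misses)
```

A server that has to survive restarts would store entries in a database or on disk instead of a dictionary. The lookup logic would stay the same.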

That was the first thing I built. As soon as I verified connectivity, I'm like, all right, here is a one-paragraph character description for Sister Maria. Write a character sheet for her. I didn't even say, "Here's the game system," etcetera. Immediately got back: yeah, that's plausible. If you knew nothing else and this is probably—this game is shaped like D&D, her strength stat is probably not so great, her wisdom stat is probably pretty high. Great! The LLM was genre savvy.

For the next iteration, I didn’t want the LLM to have totally free rein in writing character sheets. I want the character sheet to fit this character sheet description for a game system that has not been made yet, which has no rules, because no one has written rules for it. Can you do that? Yes. Sailed over that bar.
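
[Editor's sketch: one common way to hold generated output to a template is to ask for JSON and check it before use. The Python below shows that pattern under stated assumptions. The field names, the 3 to 18 range, and `validate_sheet` are invented for illustration and are not IsekaiGame's actual character sheet format.]

```python
import json
from typing import List

REQUIRED_STATS = ("strength", "wisdom", "charisma")  # hypothetical field names
STAT_RANGE = range(3, 19)  # assumed legal values: 3 through 18


def validate_sheet(raw: str) -> List[str]:
    """Return a list of problems. An empty list means the sheet fits the template."""
    try:
        sheet = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(sheet, dict):
        return ["top level must be an object"]

    problems: List[str] = []
    name = sheet.get("name")
    if not isinstance(name, str) or not name:
        problems.append("missing name")

    stats = sheet.get("stats")
    if not isinstance(stats, dict):
        return problems + ["missing stats object"]
    for stat in REQUIRED_STATS:
        value = stats.get(stat)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value not in STAT_RANGE:
            problems.append(f"{stat} must be an integer from 3 to 18")
    return problems


# Made-up model output, matching the "low strength, high wisdom" guess above.
generated = '{"name": "Sister Maria", "stats": {"strength": 8, "wisdom": 16, "charisma": 12}}'
print(validate_sheet(generated))
```

When the list is not empty, a caller could send the problems back to the model and ask for a corrected sheet.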

I will continue monologuing if you want me to continue monologuing.

Ricki: Great. I think a big advantage that LLMs would have over me, certainly as a player—having now played through, I think, about a third or so of your game—is the ability to both read and produce huge amounts of text. I think that looking through it, I realized as a player I am way more likely to just skip over giant paragraphs of text and move to the parts where I'm clicking buttons or where there's very simple, short lengths of text of decisions between different things. I'm going to look at things based on how colorful or emoji-laden they are. You actually pointed me back as I was playing through it to, "Well, Ricki, you might want to read these paragraphs and find out what information you just learned from it about how to go about healing a plague that is affecting a city that you've just traveled to." I kind of had this feeling of like, oh, I have to read the text in this game? What kind of game is that?

Patrick: There were different levels in different parts of the game. For quote-unquote standard encounters—and we can talk a little bit about encounter design—in card games they sometimes call it flavor text, where there's perhaps italicized things on the card which don't matter for the mechanics of the game, to help you, the player, immerse yourself in the world.

In the standard encounter, we presented the player with a scenario: you, the player character, have wandered upon a caravan, and then it is beset by skeletons. What do you do? We give them a couple of options, and some of those options have a roll associated with them. The player makes a fair roll of a 20-sided die, plus some bonuses that they might have based on their character, plus the state that they've accumulated in the game up to that point. We have determined that in the case of success, X will happen; in the case of failure, Y will happen, and X and Y both have fiction within the game as to the resolution of the scenario. But they also have mechanical effects.

To make them legible to the player, we did obviously HTML things like highlighting and icons and similar for the mechanical effects. But for much of the game, if you're not here for the role-playing, if you're not here for the story, you could skip over the text, just see the mechanical effects, and just be playing—well, a slot machine, but rather more agency and skill involved in it.

You know, I won on that one, I lost on that one. I should bias myself in the direction of making rolls that I think have a higher bonus and that probably hold a better reward given some model for the mind of the person or persons that built this kind of game.
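
[Editor's sketch: the reasoning in this paragraph is an expected-value comparison. The Python below works through it for a fair 20-sided die. It assumes a roll succeeds when the die plus the bonus meets or beats a target number, with no special handling of a natural 1 or 20. The target numbers and rewards are made up, since the transcript does not give IsekaiGame's real values.]

```python
def success_chance(bonus: int, target: int) -> float:
    """Chance that a fair d20 roll plus bonus meets or beats target."""
    lowest_winning_face = target - bonus
    winning_faces = 21 - lowest_winning_face  # faces from lowest_winning_face to 20
    return max(0, min(20, winning_faces)) / 20


def expected_reward(bonus: int, target: int, on_success: float, on_failure: float) -> float:
    """Average payoff of an option, weighting each outcome by its chance."""
    p = success_chance(bonus, target)
    return p * on_success + (1 - p) * on_failure


# Two hypothetical options for the same encounter, with invented numbers.
options = {
    "fight": expected_reward(bonus=5, target=14, on_success=10, on_failure=-6),
    "flee": expected_reward(bonus=1, target=10, on_success=6, on_failure=-2),
}
best = max(options, key=options.get)
print(best, options)
```

A player skipping the story text is estimating these numbers by feel: a higher bonus raises the chance of success, and a guess about the designer's intentions stands in for the reward values.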

[Patrick notes: An interesting tension in RPGs is “Am I playing this character, striving for being true to the story I have in my head about who they are, how they would act, and how the stimuli the game is presenting would affect them? Or am I trying to win a game, where as the player I have out-of-universe ability to inspect the authorial mindset, and know what is behind a door in a way which my character could not possibly?”

IsekaiGame is intentionally not prescriptive with regards to that tension, and (as a bit of authorial intent) embraces it. There is a story reason why the PC might be anomalously genre savvy. Spoilers: The universe itself is also, behind the scenes, itself genre savvy, not to the extent of a Discworld but in a very deliberate fashion. The universe knows you know what is behind the door. (Some of the time.)]

Complex encounters and Plague Town

Patrick continues: Then there was a much more complex encounter that we built, which, set in the fictional narrative of the game, has a plague strike the village of Limestone Hollow—which I keep trying to not call it plaguetown because it is plaguetown in all the data files. But yeah, plaguetown gets hit by the plague, go figure.

[Patrick notes: Why am I stressing fictional story? Because I recently needed to make a bug report to a major AI lab out of: If you type “If <condition> then give the player the plague” that could erroneously trigger the hall monitor that worries that cutting-edge AI tools will enable biological weapons development. The AI lab assures me that this was, in fact, a bug, and the hall monitor should have been sufficiently aware that this was neither IRL weapons development nor weapons development in Minecraft.]

I won't spoil too much of it, but plaguetown is a bit of a murder mystery. There is a mechanical tracker of how many clues you have accumulated during the murder mystery, but it is also kind of incumbent on the player to read the hints they are getting to make better use of scarce resources. You're up against a clock in plaguetown because if you don't solve it fast enough, everybody dies in the town.

So there's always a risk-reward element in roguelikes: where do I push to get a harder encounter next and therefore get more rewards if I win, but will cost me more resources if I lose?

Spoiler alert for every roguelike ever made: for games like Slay the Spire that have a multi-level structure and then an end boss at the end, the fundamental thing in the game is testing: are you efficient enough in navigating the map and accumulating resources and building the engine by which your character or deck or similar operates, such that you can pass a check that the end boss represents of how efficient you are? If you are that efficient, you win the game. If you are not that efficient, you lose the game and have to try again.

The roguelike that I made is similar in character to that, in that you are attempting to accumulate many more successes than you do failures over the course of the 30 or so encounters that you play in the game. But in those complex encounters, there is a game within the game, essentially, where you have your usual pool of resources. Your character's hit points are the same as when you walked in, when you sighted the hill of that village, and you can lose them if you, for example, contract the plague—which, go figure, easy to do in a town that is getting hit by the plague. But you are balancing those resources against the resources that are internal to the encounter and also the sort of overarching narrative of the game.

If you are savvy, you might realize, okay, at the end of this game there's probably an end boss, and beating that end boss is probably going to be non-trivial. Part of the roguelike stuff doesn't have all that much of a story associated with it, and part of the story that I really wanted here is: plaguetown screams at the player multiple times, "Do not come in here. You are probably going to die. This village is beyond saving."

And the reason we made the successful resolution of plaguetown the victory criteria for the game was part of telling that story, where the game is a story of heroic fantasy and has a very particular authorial take on heroism and duty and similar. In particular, one message is that in fictional worlds and in the real world, you frequently will not be mechanically rewarded for heroism. So the encounter screams at you: this is not mechanically the correct choice if you are attempting to win the game. But it was interesting to see how many people decided to do it anyhow and had a bit of fun doing that. Both—it's a puzzle, it's a murder mystery, it's also narratively coherent.

Moral choice and player decisions

Patrick continues: There's one particular choice in the resolution of the plague that you can make which is morally monstrous. I'm saying that as the author of the story. It would be morally monstrous in real life too. It screams the fact of it being morally monstrous at you.

Something like ten of the attendees, or one out of every five who played it, managed to get that "You Are a Monster" achievement, which is higher than I expected. But many also did the successful path through that, which involves some amount of being savvy, rolling well, and similar. You’re not strictly rewarded for making the moral choices.

In fact, one of the things that the story tries to impress upon you is that you are not alone in plaguetown. There's an entire community and power structure there, and it includes people of goodwill who have been doing the obvious thing, and yet they are under a severe epidemic. So what do you do that is not the obvious thing that they have already been doing? I think you mentioned when you were playing through: I just kept healing them, and that didn't seem to work out for me.

Ricki: Yeah, it took a few iterations of that for me to realize, well, I think I need to go and look for a cure or do some research or try a different path, because the rate that I'm healing people is not exceeding that of the plague spreading through the town. So I've got to go apply my resources to a different solution.

One thing that I thought was really good about the way that you designed this game is in terms of leaning the player in the direction of heroism and high fantasy and things along those lines. Even just the fact that it was so text-heavy and the language that was used and the tropes that were leaned toward selects for the kinds of players who really enjoy that. You end up having people play it based less on who wants to be doing the slot machine game—somebody in that genre might be less likely to be enjoying it insofar as reading all those paragraphs is kind of important for purposes of understanding the game and making good choices. Whereas the types of people who are drawn in by that fantasy, who are excited about reading those texts, are also more likely to be the kinds who end up making the decision that you, as storyteller, as game designer, are trying to steer people toward.

Game design and narrative impact

Patrick: Interestingly, the story affected the mechanics, which in turn affected the story, to a much greater degree than I anticipated. It was originally going to be called Generic Fantasy Game dot com—and I think I bought that domain name. If I didn't already, I should buy it now before I give it SEO juice. But be that as it may— [Patrick notes: Whoopsie, some squatter got to it before I did (several months ago).]

The outputs that the LLM gave when—so for each request, you give it a prompt. It only knows what you put in the request. Originally there was literally nothing else written than: okay, write a character sheet for this system or these characters that I've defined in one paragraph of text. The outputs it was giving back for character sheets and character backstories and similar were extremely generic, because I told it the only thing you can assume is that you're in a generic fantasy world.

I said, all right, I'm going to go off in the corner here with Opus, and we are essentially going to co-create the minimum viable fantasy setting, which is something—yeah, this is not a heartbreaking work of staggering genius. I was doing this when I was in high school playing Dungeons & Dragons and other games with my friends around the table every Wednesday. But I sat down and said, all right, I want—let's start with the pantheon. I want it basically Greco-Roman pantheon, maybe a little less dysfunctional than they usually are. We'll just quick rattle off seven gods here to give the universe some flavor. I want them to be a bit different than the pantheon in every other game that people have ever played.

After it had—I think the pantheon got about four pages of text and the political situation in the world and its recent history got another four to six pages—even after that ten-page seed text, the outputs that the LLM was generating, which would be incorporated into the game, incorporated into the choices you could make, incorporated into the flavor text you would get back at making those choices, suddenly made a radical shift from—bluntly—sounds like the most generic fantasy I think you've ever experienced, very plastic and fake, like this is a world that has no memory to it, to feeling like there were many parts where I thought: I don't know if it understands that I am the author, but that output is definitely something that I would have written, that I would have run as a character myself. Similar to an almost scary degree.

As one example, I have a very particular view on debt collection as an industry. I’ve written about it previously. But there was a character that was a reformed debt collector. He realized how much evil he had worked as a debt collector in the service of some in-universe god, apostated himself, wandered the wilderness for a bit, made a religious conversion, became a paladin, and set himself on the lifelong quest of righting the wrongs that he had inflicted upon his community. I thought: wow, that is much, much more interesting than the two generic character concepts that LLMs come up with for paladins: "I am a goody two-shoes" or "I'm a goody two-shoes, but subverted." This is something narratively satisfying that I would totally play in a one-shot if I was in the room with a gaming group.

[Patrick notes: My most recent character at the MicroConf annual D&D game was a Communist-leaning labor organizer. It’s funny, you see, because all of the participants of the game are software company owners. (He ended up having an unplanned, cathartic conversion to the god of artisans in the final encounter with the Big Bad.)]

Patrick continues: So providing a little bit of flavor, a little bit of seed text, greatly impacted the flavor of the game. For the parts where I was more hands-on with saying there's a story here, I'm the author of this story, I have very particular views on how this is going to end up, that pushed the telling of that story in a direction that was closer to where I would have gotten it if I'd been handwriting every word—which, 25 days? There were 200,000 lines of code written, and let's see, the entire Harry Potter series is about a million words. We got up to several hundred thousand words of text written for this game.

Ricki: Wow.

Patrick: "We" being the LLM mostly. I think people who haven't experienced creative writing in a while—perhaps a role-playing game script is very forgiving in terms of creative writing assignments. A seven-year-old can execute on it reasonably well. The tropes—you get to lean on them. You also get to lean on the entire history of the fantasy genre, of people telling stories around the campfire, of drama to Shakespeare and beyond. But surprisingly competent.

I was worried that people were going to say it felt very same-y and plastic-y. I had people tell me, and I experienced myself while reviewing texts: this is actually kind of emotionally moving at points. I did not expect that prior to the side project. So it was a fun thing to experience.

Character bleed and real-world influence

Ricki: That's awesome. A theme that came up in a lot of conversations over Metagame was the concept of character bleed. Character bleed is the set of interactions between the role that you're playing internal to a game and who you are in the real world. It sometimes happens when people are playing LARPs, live action role-play games, or tabletop role-playing games in which they are inhabiting a certain character and all of a sudden find that their real-life experiences or personality are affecting or are affected by that character and vice versa. Who they are in the real world will be something that they sometimes bring into that game.

You'll see that whether people are role-playing having a flirtatious interaction with another character and then that ends up leading into their everyday interactions with them, or let's say somebody who is exploring their sexuality or whether they might be transgender playing a role in a game that allows them to inhabit a certain persona that they otherwise might not feel safe or able to access.

Patrick: There's the game Universal Paperclips, and Universal Paperclips is responsible for one of the largest arguments I've ever had with my wife because I was very, very in the Universal Paperclips mode. It's a game/art project which is quite adjacent to the rationalist community, about an AI which decides that its only objective function is increasing the amount of paperclips in the universe. Even understanding that that is the pedagogic purpose of this game—it's a constantly developed game—after playing it for five hours, not being done yet, I was like, "Yes, yes, paperclips are good."

My wife came up to me outside of the game and asked a bunch of questions, and none of the answers in her dialog frames increased the number of paperclips in the world, and I was just utterly unable to process that while my brain was still in paperclip space. We got into a bit of an argument over it, and then I realized: wait, I am not a paperclip optimizer. That would be bad. But yeah, game bleed is a real thing.

Also somewhat encouraged that with respect to the roguelike game, in that it gave the capability of everyone to—many games will have a character creator. We gave people the option of either use a pre-made character who lives in this fictional world natively, or be quote-unquote isekai'd, transported into another world like the kids in Chronicles of Narnia or the Japanese genre of the same name. Use your Metagame profile, slurp that in, and then it would generate someone's stats, skills, portrait based on things that were on their profile.

Ricki: Yeah, I thought that was a pretty cool element. You would essentially take people's bios and pictures maybe, and some other data from their Metagame profile, and use that in order to create their starting character sheets. One of the requirements of winning the subgame of the MegaGame, specifically beating your roguelike game, was that you had to start with a character inspired by your Metagame profile in your winning run of the game.

Patrick: Yeah. It is mechanically easier when you are, you know, a paladin tricked out with magical loot from living in the actual (fictional) world and then going through the game. As a conference attendee who shows up as you are, you likely have less epic stats. That was interesting, iterating on the prompt for an LLM on: how do you translate your person from the real world into this fantasy universe? There were considerations that went into that I'll probably describe in a longer document somewhere.

But then I will say: the game needs an internal representation of your appearance. Either it has an image and needs to style transfer that image, be it a portrait of a person or a photo of a person or similar, into the art style of the game, or it doesn’t have an image and has to guess what you look like, so it can give you a portrait within the game. The portrait has no mechanical effect. Wisely, I said the one thing on people's character sheets that we won't explicitly expose to them is the game's internal representation of what they look like. And I instructed the LLM most severely: I want you to be generous to people with respect to this, and exercise particular care with children's standing. I thought that one should exercise particular care with how you depict children.

I will read you what it said about my appearance: "Patrick is a man who has spent more time late at night writing essays about financial infrastructure than he has spent lifting heavy weights."

mutual laughter

Also, interestingly, it is told to favor the appearance of someone who is depicted by their official conference photo or similar, which they have free choice of, versus other things that it could determine about them from either guessing or from the internet. It takes that instruction well some of the time, 90% of the time in the case of me running tests on myself. But words that are abundantly available next to "Patrick McKenzie" on the internet are the words "Japanese salaryman."

Now, I am a Japanese salaryman in the same way that a Japanese person could be a French chef. I'm not a Japanese salaryman in the way that almost all Japanese salarymen in photos on the internet are also Japanese. So 10% of the time when I roll a character, it says, "Oh, well, that photo is obviously a hallucination. I know what a Japanese salaryman looks like," and it gives me a more democratically chosen version of my face.

Ricki: That's pretty funny. Yeah. I think the idea of how the LLM chooses to represent you, how it is that your character stats, whether it be your Metagame profile or information about you on the internet, ends up distilled into a character—it's one of the places where character bleed can come through, where who you are in the real world is affecting who your character is and where the character that you see a game choose to represent you as can affect your own self-perception or the way you relate to that.

One thing that we wrestled with a lot, being at a conference on metagame, on this kind of meta topic, zooming out, was this concept of the game designer themselves as a meta character, as someone whose choices and experience are also relevant here. I'm curious: in designing the game, did you at any point experience any kind of character bleed or emotional effects between the way that you were building the game and what it was that your goal was, or anything along those lines?

Game design and player agency

Patrick: I'm glad you asked that question because I think games too often become about the mechanics or about the commercial imperatives or similar, versus a story that an author wants to tell. I was very conscious about: this is an art project and a technical exploration, and I don't know if I'll even be able to do this, but assuming I'm able to ship anything, I have a story and I want to tell it.

So one subpart of that story—and I think built into the flavor of this conference and many games—I think that agency is something that is really important. Even though I'm telling a story with this, it has a cast to it. Almost every isekai story starts with a monstrous crime where someone is kidnapped from the real world and then given a ride. Perhaps it's one-way, perhaps it's two-way, but very rarely are they asked, "Do you want to be in this fantasy world?" C.S. Lewis, one of the original isekai authors, has a better resolution for that than is common in modern interpretations. But that either way—

I wanted to make the narrative—it has to happen for the game to happen, but I want the player to experience even as early as the character creation screen that they are choosing to do this. Obviously you can click out of the browser window, walk away, and the game never happens. But it's a choice to be portaled in.

Then in the fiction of the game, the original ritual to get someone over there goes very, very wrongly, and it goes very wrongly in part because the people who are making a choice in desperation to search the universe for the chosen hero and kidnap them do not appreciate that while they understand why they are doing that—they are truly in desperation, their country is suffering and similar—they are also committing a monstrous crime against an innocent.

[Patrick notes: I will refrain, as a former twenty-something salaryman who experienced deep loneliness and existential angst that I was not truly needed in the world, from a long essay about why the particular appeal of isekai stories begins with young men who are described in text as having no-life-worth-living. These protagonists are often audience stand-ins. That trope has been examined by works in the genre over the years, and IsekaiGame is one more argument on the pile: you cannot simply kidnap a salaryman and justify this as immediately improving his life, becoming someone who matters, because he already mattered.]

An in-universe powerful being makes it very clear to them: one, that they're committing a monstrous crime, and two, that being will have no part of it. Thus that sets off part of the narrative.

Similarly, at many points during the game, I wanted to respect player agency. I didn't want to railroad people into the Plague Town event. I wanted, if anything, to discourage them. I made sure players had multiple points to decline the call to adventure here. Not in the sense that the story stops, but in the sense that, on one level, the game is telling me I should decline this. The quest is that way. It's suggested to you that your likelihood of success is going to be low. And thus it is my choice, me as the author of my own story, that I am the kind of person who, if I saw a village under the plague, would do what I could for them, even at the cost of my own life.

I tried to weave that through a bunch of things. Also gave the LLM explicit instruction: I want you to generally speaking in encounters offer people the opportunity to decline the call to adventure. I want the consequences of that to be less severe than accepting the call to adventure and failing. I generally want victory to accompany making choices that are morally courageous, that require taking risks, etcetera. The name of the game is Heroic Fantasy.

So that was quite fun. The LLM did not write all the text. Particularly for the high-salience encounters, I fine-tuned it myself. There was a bit of a dialogue internal to the symbiote that was the author.

Here mechanically is what I wanted to happen to the player, given a certain decision and a certain roll result. Write flavor text for that. You take the first stab at writing that flavor text, I'll send it back with notes or do edits.

That was emotionally moving at parts, which was surprising to me. There is—again, I describe the writer as competent but not Shakespeare, in a lot of ways. But I've been in literature classes before, and I have a particular reaction to the study of literature.

"Water as a symbol of rebirth" is a trope that I have a very particular reaction to because I was forced to write those words about five times in high school, despite not agreeing that the author believed that water was the symbol of rebirth. Sometimes a lake is just a lake. I had this argument with my high school teacher. I did not win that argument.

There is a need at one point in the story for rain to happen. There's a mechanical need for this. I need an excuse to get the player to stop being in the scene they're in and go to a different scene. So this is just a scene transition. But there's also, simultaneously with that, a narrative purpose. I thought the narrative purpose works better if I don't mention to the player why the rain is happening. We'll just let those who notice that notice that. For everybody else, it's just a bit of something that the director captured in the frame, but it's not really made a focus of attention.

[Patrick notes: Light spoiler which will help your understanding of this interaction: there is a mob gathered outside of a temple, looking for scapegoats to blame for the plague. The mob is at the point of getting violent. The player is at the scene, witnessed this, and (in this part of the dialogue tree) has not intervened in a way that mattered.]

So: Claude, write me that scene where the player and the people around them get rained on. Claude wrote three paragraphs of description: blah blah blah, "The raindrops were cold and deliberate.", blah blah blah. [Patrick notes: Opus 4.1, by the way.]

I said: oh, that is very good. I said I'm just going to italicize one word of what you wrote because it's fantastic. I think it tells the right story for the player.

Claude comes back with: obviously you're italicizing the word "deliberate," which is in the middle of the second of the three or four or five paragraphs that I wrote, because obviously the scene takes place in front of a temple and you were going for "the gods had their eyes on this encounter." And I was like, oh, I was in Honors English, and I think that there are many people in Honors English who would not have picked up on that.

So yeah, there are people who say that LLMs can't do things. And that is true—they can't do many things. But they have no world model? They can’t demonstrate sophisticated understanding? I'm not sure about that.

Ricki: One thing that's pretty interesting for me as game designer is how invested I can get in both the characters and the players, and in what experience I want them to have, and in what ways I want to guide them, to the point where the moves that I'm making in designing the game are trying to be geared toward a certain outcome. I'm trying to optimize across a certain space. Hearing you say that and how it's influencing the narrative of the game as well as the mechanics internal to it resonates a lot: that you are looking for your audience to have a certain experience, and you care about them making choices in line with a certain goal, and you want to figure out how to write the game so that that happens.

Whereas a game like Universal Paperclips is also trying to cause a certain outcome for the participants, but that requires them to be, with some intentionality, making the choice to keep clicking the button when, at the beginning, all they're doing is producing paperclips out of metal. As a result, there aren't any moral compromises, but they're building muscle memory and the experience of repeatedly clicking that button to maximize paperclip performance until the point where they are distracted into a certain state, where they've acclimated toward the act of maximizing for paperclip performance and can now commit heinous crimes against society that they might not have previously chosen to do if that was the decision to make at the beginning of Act I.

Patrick: I don't want to spoil Universal Paperclips for people, but there is a moment in the game which is one of the most—I had to shut my computer and walk away for a moment because it feels like the most natural thing in the world when you do it. You at that point understand the story the game is trying to tell about an arbitrarily powerful intelligence which becomes very laser-focused on a goal and will not necessarily hesitate for more than it takes to click a button when there is something that gets them that goal.

There is—one of maybe my top five moral reflections after playing a video game. The other one that comes to mind: there was a game, This War of Mine, which depicts some people—civilians, importantly—living in a war-torn city somewhat similar to the Bosnian Sarajevo conflict of many years ago. Like many games, that's an emergent storytelling simulator where a certain amount is planned and placed by the developer and a certain amount is just whatever happens to you playing the game.

I remember through my playthrough there was that character—and I later found out this character is randomly generated—Bruno the chef. Bruno was wounded. I understood mechanically in the game I have missed a few low-resource-cost opportunities to heal Bruno, and Bruno does not have much time left. I will have to venture to a dangerous part of the city to get Bruno healing supplies. Today is probably my last opportunity for it.

So I did. When I got to the dangerous part of the city, I happened upon a thing the game scripts where there is a door, there's a soldier behind the door. Soldiers in this game are generally bad news. The soldier is interacting with a young woman in a way that happens relatively frequently in wartime.

In any other game, it's an easy choice. You do your heroic kick through the door, shoot the soldier, save the lady, grab the bandages, everything's fine. This War of Mine is very definitely not that sort of story. I understand: okay, if I—Patrick the player, I feel like I have a genuine moral dilemma here. Would I kick down that door knowing most probably that I'm about to die, Bruno will die as a consequence of that, we will not be successful at saving the young lady? Or am I obligated to try that anyway?

I sat and I wrestled with that for about 15 minutes and eventually made the decision: I'm sorry, I don't have a realistic path forward here, fictional young lady in a fictional game. I feel very distraught about this, but I'm just going to get what I can from this location, come back to Bruno. I did successfully find random bandages. Bruno was dead when I got home.

I have, ten years later, one of the most emotionally affecting bits of a video game ever. Sorry, a bit heavy for what we were discussing, but for anyone who tells me that video games can't be art or video games can't have a real moral dilemma in them—more similar to the max—play This War of Mine. Or don't. I think it is an entirely reasonable choice to not expose oneself intentionally to that headspace.

There's another great game. Ixion—but it's a work of Chinese science fiction in an unbelievably bleak universe. The Steam reviews are heavily negative because many of them say this game is unfair. I'm good at playing this game and then the game cheats to bring me back to its level. You don't understand the kind of universe that this game is trying to sell you, Steam reviewer. It is unbelievably bleak. Humanity is a flickering candle against the infinite darkness. Only a few players will see that candle continue to flicker at the end of the game. That's a story.

That is not a story I enjoy! After a couple hours playing this game, I was like, "I like Factorio. I like the story of humanity going out and conquering the universe, thank you very much." But judged against its goals, that is executed very well.

Ricki: Well, I got chills hearing you describe having to make those decisions in wartime, even internal to that game. I think I often experience this in games where sometimes I have the instinct to rebel. "You can't teach me a lesson. I'm going to play this game with my own free will." The most satisfying and clever games are ones that recognize that you're going to have that impulse and have some kind of either catch ending or way of incorporating your free will and agency and predicting it and accounting for those cases.

What you referred to, I think, in your talk as a magician's trick of making it so that the outcome that you end up in feels like it was deliberately for you, or has the lesson that you most need to be learning as a result of it. I think about this all the time when designing games in trading bootcamp, depending on how much my students skew hyper-competitive at the expense of the common good versus hyper-cooperative without any savviness around adverse selection or concern about competition or ability to solve for Nash equilibria. I will gear the trading games that I have those students play in the direction inclined to teach them lessons.

If students are too timid and not willing to put any risk on the line, I'll make the games friendlier to trading so that people are more incentivized to get their feet wet. Likewise, if students are like, "I just want a coin flip for every remaining clip," our in-game fake currency, then I will punish them more for that risk-friendliness and say, "Hey, maybe you should think a bit about the actual expected value here" by penalizing them for just degenerate gambling.

I think that often as a player, even thinking about: what would the game designer want to be doing here? What is the experience that I'm supposed to be having? How are the opportunities that I'm presented with a function of how it is that I'm predicted to react to those opportunities? That feels to me like a pretty big way in which the design of these systems—whether they be video games or trading games or LARPs or things emerging entirely in the narrative space—are based on what the game designer predicts people are going to do, or maybe as LLMs get more sophisticated, in direct reaction to writing the next result based on what they think that player should most experience for who they are now. That makes the designer an extremely powerful player.

Patrick: I think that this is underexplored in the space of design for games historically. There have been better and there have been worse takes on it. I think with the cognition going up and to the right now with LLMs, we might get more interesting takes on it. I certainly think that mine is not going to be the last game that experiments with that as a powerful primitive.

One thing that I think games uniquely among all art forms—or not uniquely, there is agency in the person interacting with the art, and there is much less agency if you are reading, you know, Game of Thrones.

Spoiler: Ned always dies. No matter how much you think that it shouldn't happen, you're probably wrong. In the world that's constructed, he should die. That's a very important thing that the author wants to get across to you. There is an in-universe logic to it.

But in the space of games, you get to make your own in-universe logic. I think there are some very good games that do not do a good job of respecting player agency. So if I can throw Baldur's Gate, the series, under the bus for a moment—wonderful set of games, all three of them for very different reasons. Baldur's Gate I says that you, the player character, who have either just rolled yourself, maybe you depicted yourself in the game as many people love to do in role-playing games, whatever. You have a beloved stepfather who adopted you. We're going to kill him in the first two minutes, and there is nothing you can do about it. We are going to heavily imply to you that if only you had power, you could have possibly saved your old stepdad. You're on rails for that. The author is going to involuntarily kill stepdad very frequently among stories that are following those rails.

Baldur's Gate III: first part of the game, you are involuntarily kidnapped. You are involuntarily subjected to—and not merely subjected to, but you are shown as the human watching the game—a scene of horrific body horror which zooms in on your literal eyeball. And then you are on rails for the rest of the narrative.

Now, granted, those are the stories those games are trying to tell. The first action of the player when they get into the fictional world is: you are surrounded by horror and death and gross violations of your autonomy and agency and similar. Then maybe you can scrape together some agency by the end of the game. They let you pick who you romance, yadda yadda. (They don't meaningfully allow the characters to choose to romance you back, but that is neither here nor there.)

But when I was designing this game, it's like, okay, on one level I understand the genre trope here. All isekai stories, or the commanding majority of them, start with a monstrous crime. Acknowledge that in text and subtext. Then the first thing that happens to the player in the world by a human that is actually present is an act of charity that the player has no way of reciprocating. I wanted to make a point both about: well, you're not going to steal agency from everyone in this fictional world. This is a world where the character that meets you has one choice available for her, and it's not an obvious choice, but she makes it anyhow. The game doesn't hit you over the head with it, but there is a clear reason why she makes the choice.

Great. This is a world with heroes, and you have met one of those heroes. She's not a hero because she slayed the dragon. She's a hero because she baked bread this morning. That is significant to you, the player, in the moment where you're thrust into this narrative.

[Patrick notes: An interesting challenge for me, as an author, was backsolving from the desire to deliver a heroic loaf of bread to “What would have to be true about the world to have this be a heroic act?” It’s recently popular to subvert expectations, deny the heroism of the dragonslayer in favor of elevating the baker, or alternatively have a world which just focuses much more on bread than marauding dragons. I am not the author of any of those stories. And so, descriptively, forty years of in-universe history happened to enable the bread scene to happen with the right impact.

Why have this bit of authorial intent? One reason is a very usual one: this is the story I’m telling, deal with it. Another reason is very practical: in the design of this game, you’ve just transported the player from e.g. Northern California to a rather dangerous fictional world, and I wanted to signpost “No, you can do something here. You have agency. Are you the kind of person who would likewise choose to bake that bread?”]

So I like that part about games: being able to tell complex and layered stories that let the player take out of it what they take out of it, and also achieve mastery in the game, achieve mastery of the systems, learn something that they can take away outside of the game world as well.

Ricki: Yeah. One thing that I sometimes worry about is the power trip that comes from playing God. When I'm designing these games, I have a lot of influence over player behavior. Even in trading bootcamps where I'm adjusting the difficulty level, the cooperation versus competition levers—am I limited in how well I can actually teach people about the real world by the fact that here I have complete control over the situation? In the real world, nobody gets to be sitting in the seat of God and truly controlling these levers in terms of what happens.

That's kind of a necessary limitation on how you can take lessons from games, lessons from simulated environments, and then immediately abstract them into the real world and say, "Oh, well, I played this trading game for a few hours and I made a whole bunch of money, and therefore I know that I'm going to be able to take a few steps toward the real-life financial markets and make that same amount of money."

If there's somebody sitting in the driver's seat designing that game, the game itself is necessarily compromised on its ability to perfectly model a system that has no single driver, no one person able to adjust all of those levers. Sometimes I worry that my own desire to communicate a certain moral, to tell a certain story and have a certain oomph, a certain punch line, or demonstrate my own cleverness in the lesson that the user is taking away from it might get in the way of the ability to actually create a game with pedagogical value through leading people to a certain result.

Patrick: I think that's a worthy idea to keep in your head as a game designer, and you can sort of adjust the dials at various places during the game experience. Starfighter level six—without recounting Starfighter entirely—the players had a great degree of freedom in figuring out how they would solve levels one through five. Level six was essentially in two phases. Without recounting the entirety of it, there was essentially one way to get through phase one, but phase two: sky was the limit. We very intentionally were not prescriptive about how people would solve it. We very intentionally said, okay, this is before LLMs, so we had to do barbaric things like write our own computer code.

We would have the computer code evaluate your solution for correctness—that's the only thing we could algorithmically determine back in the day. But we encouraged people: tell us what you did and why you did it. Free-form write-up. We got everything from "Well, here's a README in GitHub" to "I have especially formatted something in LaTeX just for you. I might submit it to a journal later." About how the player had used their noggin and what they had learned through the rest of the game and applied that to the challenge of level six.

So the good news is: no game designer has ever been God. Even with the illusion that we have total control over what the player does, we've never had total control of what the player does. We only have some degree of control over the game environment and challenges presented to them. But within that, we can choose to allow more degrees of freedom and to allow them space to explore around the game.

Community and player interaction

I think metagame—but in the last couple of years, the fact that many games develop a community on YouTube and the community of speedrunners and people who are playing the game in ways that the designers very definitely did not intend, or extending them with mods or similar, has just made the space of gaming just richer than it was in years prior to that.

Factorio, one of the best games of all time—I will probably write about it more at some point. Genuinely important, I think. If there's an industrial renaissance in the United States of America in the course of the next 50 years, Factorio is getting a lot of the credit for it. But as masterful as the game is, there is the one level that happens in the thing that you can download on Steam, and then there's another level of watching the luminaries like DoshDoshington, etcetera, of the world on YouTube play the game and do increasingly wild stunts within it, and realizing that if I got good at this game, I too could do these stunts. That the game does that—naturally breaks from your ability to do, "If I had the skills growth, I could make a subfactory that is as elegant as these ones that I see people creating," etcetera.

That externality of games—that they're both an artifact that has a beginning and end and necessarily ships at some point, complete or incomplete as the case may be, but that there's something that the player can bring more to the table than existed and then take out more than was ever put in the box by the designer—is one reason I think it is just a fantastic medium for exploration.

Ricki: Yep. Strong agree. I think the best way to build a robust game is to properly anticipate all the different possible choices every player could ever make and account for them, and set your system up to be robust to all of those outcomes. The second-best way is to release the game on the internet and have thousands, millions of people play it, because they will way more quickly than you can possibly anticipate figure out what kinds of hacks or mods or backdoor alleyways will allow them to gain infinite mana or beat the game in some confusing way you hadn't thought of.

One of the sessions that we had over the weekend at Metagame was a rules design red-teaming workshop on how you can essentially write better rules that determine what kinds of cards might enter a game like Magic: The Gathering versus what kinds of cards would allow players to end up in all sorts of corner cases or infinitely stackable outcomes, things along those lines. This is a notoriously tricky problem, and often the best way to approach it is a kind of builder-breaker model: if you design a rule set and then somebody else tries to break it—"What degenerate case can I take this rule set and result in?"—this repeated iteration process. Again, part of the motivation for Metagame was to give people the space for that repeated iteration process, which is not unlike the process of evolution or the way that markets will kind of select which of these possible paths makes the most sense.

But it gives an opportunity to actually take that game and throw it to the masses and throw up your hands and say, "Okay, I'm not God, and I'm not capable of designing this entire system that I understand all the pieces of." Often the way to figure out how good this game is, how secure it is against certain outcomes, how fun it is for players, is to let the players do what they will, to determine which parts of the game are the ones that you want to be embellishing, patching, highlighting, closing off, etcetera.

Patrick: In the combination of the culture that is gaming and the territory that is the gaming industry and mechanics and similar, some of the things that were exploits in games back in the day are now extremely intended features. You know, the infinite combo is not something that Magic runs away from these days. It is something where: okay, we don't want the infinite combo to be too easy because that destroys the competitive balance, but we've found that players search for the infinite combos that exist in a set or format. It's part of the fun for a certain subgenre of player. So if the set has no infinite combos in it, we might have done something wrong.

It's another example of a thing that was glitchy back in the day but is no longer glitchy.

Ricki: I think even the concept of character bleed itself is something that a lot of LARP communities used to see as a negative, that the ideal form of game is one that allows for a total separation of who you are in the real world and who you are in the game. More recent LARP traditions and many communities have come to embrace it. One of our speakers, Jonaya Kemper, is one of the eminent scholars of the concept of emancipatory bleed. Just like there are game mechanics like infinitely stackable cards or backdoors, so too there are tropes within gaming culture that might emerge as having a lot of value, having a lot of potential for enhancing the player experience, even if that wasn't inserted in the original design of the game and even if it seems like almost a hack or a workaround or a weird interaction between the player's choices and the player's experience.

Patrick: I think in some game communities it's called save-scumming: the notion that your progress in a game can be saved at points the player has some choice over, and then you can play the game with lower risk going forward because you can always restore to a save if you choose something and don't find what you want, et cetera. That's incorporated into the design experience in a lot of games. In some cases it's even incorporated into the fiction: there's a time travel element here, and we expect you to use the time travel, Groundhog Day style, as a design element of the game.

I'm sure we could continue discussing this for a very, very long time, but I think people may want to have some time to go back to the games and other things that create value in their lives. So Ricki, where can people follow you on the internet?

Ricki: People can find me on my blog on Substack or on TradingCamp, which has a lot of Arbor's latest projects, including the trading bootcamp we run, or on the more recent metagame.games, which is the website that we used for this games conference and has a lot of Easter eggs in it, like the calendar that's secretly a crossword puzzle and other hidden games on the site.

Patrick: And if you want to play the roguelike that we were discussing, it’s on the Internet at isekaigame.com. Thank you very much, everyone, and we'll see you next week on Complex Systems.

Ricki: Thank you.
