Machine learning meets malware, with Caleb Fenton

Patrick McKenzie Jun 5th, 2025

How AI is automating the work of reverse engineering malware, why nation-state hackers have billion-dollar budgets, and what happens when cognition becomes an API call, with Delphos Labs' Caleb Fenton.

This week, I'm joined by Caleb Fenton, co-founder and CTO of Delphos Labs, to discuss how AI is revolutionizing reverse engineering and cybersecurity. Caleb explains how large language models are transforming the expensive work of analyzing malicious code, and we explore what happens when sophisticated cognitive work becomes nearly free. Patrick's signature in-line edits will be added to this transcript on Thursday.

Sponsor: Mercury

This episode is brought to you by Mercury, the fintech trusted by 200K+ companies — from first milestones to running complex systems. Mercury offers banking that truly understands startups and scales with them. Start today at Mercury.com

Mercury is a financial technology company, not a bank. Banking services provided by Choice Financial Group, Column N.A., and Evolve Bank & Trust; Members FDIC.

Timestamps

(00:00) Intro
(01:20) Understanding software reversing
(03:52) The role of AI in software security
(06:12) Nation-state cyber warfare
(09:33) The future of digital warfare
(16:45) Sponsor: Mercury
(17:49) Reverse engineering techniques
(30:15) AI's impact on reverse engineering
(41:45) The importance of urgency in security alerts
(42:35) Dealing with alert fatigue
(42:47) The future of reverse engineering
(43:21) Challenges in security product development
(44:46) AI in vulnerability detection
(46:09) The evolution of AI models
(48:06) Reasoning models and their impact
(49:06) AI in software security
(49:49) The role of linters in security
(57:38) AI's impact on various fields
(01:02:42) AI in education and skill acquisition
(01:08:51) The future of AI in security and beyond
(01:12:43) The adversarial nature of AI in security
(01:19:46) Wrap

Transcript

Patrick McKenzie: Hideho everybody. My name is Patrick McKenzie, better known as patio11 on the internet. I'm here with Caleb Fenton, who is the co-founder and CTO of Delphos Labs, a cybersecurity startup. We're going to be talking today about some specifics and some generalities.

The generalities are around the paradigm shift that we're seeing with the advent of large language models and other AI tools for differentiated intellectual work like software reversing. And then the specifics are software reversing—a particular topic that is germane to software security. So we get the opportunity to both be at the Wall Street Journal level of things and also dive deep into the weeds of something.

Important to say at the outset: Caleb has been working gamely in the security industry for 15+ years. I've been orbiting it for a while. I suppose it is technically a true statement that 100% of my academic publications are regarding software security, but that is exactly one publication that has approximately five citations. So I am by no means an expert in it. But Caleb, thanks very much for coming on the program today.

Caleb Fenton: Happy to be here, thanks for having me.

Patrick: Sure. So for the benefit of people who haven't worked in software security, let's just give them a brief rundown of what reversing means and then we can take it from there.

Understanding software reversing

Caleb: Reversing—that's the way I prefer to say it. "Reverse engineering" is a whole bunch of syllables. I think the term has a bit of historical baggage, where at first it meant taking some piece of machinery, understanding how it works so you can recreate it yourself. I think famously, when Intel's x86 architecture was reverse engineered and you started to have all the Intel clones—that's reverse engineering.

Sometimes reverse engineering means that, but a lot of times what it means is taking apart compiled code—so a compiled binary—and then understanding what that code does. Once it's compiled, you don't have the source code anymore. It's much harder to understand. It requires a specialized set of skills, like reading assembly, but harder. And it's often employed to understand what malware does.

So malware is often intentionally compiled in such a way to make it even harder to understand. So reverse engineering in the security context is: knowing if something is bad, if it's bad what exactly does it do, and if something is borderline bad, what are its capabilities? How does it behave?

Patrick: So for the benefit of people who don't have a CS degree, the primary way that programmers actually write software is they write in a language of their choice. Could be Ruby, could be C, could be Java. And that language is not like English, but it is relatively readable to professional programmers.

There is a software tool chain that reduces that language to a language called assembly, which is readable—asterisk—by humans, but which is difficult to reason about. And even most professional programmers these days might or might not have encountered assembly, but probably don't work in it that frequently.

Assembly is turned by a different part of the software tool chain into the quote-unquote "binary code" that computers can operate on somewhat directly. Computers are sufficiently complicated under the hood these days that reasoning from binary code to what exactly a computer is doing in a modern CPU is quite complicated, but beat that aside.

And so the reverse engineering that we're talking about here is: what if you didn't have the human-readable summary of what the software is doing? What if you only got the final deliverable—that binary that a very complicated machine is going to be executing on your behalf? Can you infer from that what the source code might have looked like so you can reason about what that source code is supposed to do?

Why do we care what malware is doing under the hood? Why isn't it just similar to say, "This executable is bad, let's delete it everywhere"?

Caleb: Yeah, that's a great question. That's actually something we thought through a lot when we were starting the company. I guess it comes down to: up until now, it's been too difficult, too costly, too slow, too expensive to reverse engineer everything and understand what it does. Because of that, everything else has been built as a proxy for that signal.

Antivirus is really just answering a very, very specific reverse engineering question: is this thing bad or not? And what we think the technology is moving towards is it's possible to answer the general form of that question. So you have Einstein's special theory of relativity was easier to get to than the general theory of relativity. Any general purpose solution is going to be harder. And what we have now with AI is...

You actually hit the nail on the head when you said that assembly is readable by humans—asterisk—because it's hard to reason over. AI makes it much, much easier to read and reason over these things. We're getting the same stuff with code bases where you can reason over a code base like source code people are writing. But with a binary, it's been very difficult to turn that binary back into something that can be reasoned over and then reasoning over it.

So what you get out of it is more than just "is this thing malicious or not?" The next question is usually: what does it do? How bad is this? What's the blast radius?

If you're working in—if you're managing, let's say an antivirus product for your enterprise, if you're on a security team, something gets detected. Maybe if you're a smaller to medium-sized company, if you detect malware on a machine, you go, "Okay, that's bad." I guess you'd assume it's bad if the antivirus says it's bad, even though they get it wrong all the time. And then you just reformat the machine. You get it back to exactly how it was before.

At a larger company, especially if you're targeted by nation states—so these are Fortune 100 companies, these are government organizations—they don't want to know "was this thing bad or not?" They want to know: are they still in my system? What did they take? Did they get anything? Was this just super common malware that maybe does ransomware or maybe tries to steal some passwords and upload them somewhere, or does this represent part of a larger attack and we just detected part of it and there's actually this giant iceberg underneath the water and we've just seen the tip and we need to investigate more?

Nation-state cyber warfare

Patrick: I think that people might feel like this is the plot of a Tom Cruise movie, but there is literally a building of people in North Korea who wear army uniforms and are tasked with stealing billions of dollars from the financial industry. They mostly pop crypto firms at the moment. "Pop" is an evocative term in the software security industry for "act maliciously towards." They mostly steal money from the crypto people because that is the easiest money to get to at the moment.

But for constant vigilance by security teams at places like the Fortune 100, they would indeed happily take money out of the more buttoned-up end of the financial system. And indeed, that has probably happened in the past as well with various attacks on the SWIFT network and other places.

And so it matters enormously whether one thinks the adversary is professionals in a foreign government that are attempting to compromise you for either direct monetary gain—basically uniquely North Korea at the moment—or for foreign intelligence aims such as the Aurora incident when the Chinese Ministry of State Intelligence, another software security team, owned up a lot of Google's internal data transfer mechanisms so that they could do useful things for an intelligence service, such as reading emails people were sending to each other.

And that event caused many, many billions of dollars of investment by the tech majors in the United States against... Previously we thought, "Okay, we will inevitably lose basically if we come up against a nation-state level actor." Well, the nation states don't have a unique hold on people's time and attention and smart people. It's just that they're better resourced than most actors in the world.

And the typical received wisdom in the software security industry was, "You, a 10-person company, and without loss of generality, the NSA, go head to head, the NSA will win." And the Googles and et cetera of the world said, "Well, okay, granted, there are some adversaries that can afford aircraft carriers. If we put our mind to it, we could also afford an aircraft carrier."

We don't want to project strategic dominance over the world's oceans, but we also don't want to get hacked. So if it requires an aircraft carrier-sized amount of money to not get hacked by one of a small number of nation states that could do that at the moment, okay, let's write that check.

Going a little bit off topic, but I find this—it sounds like a movie plot and it actually happened within our lifetimes and continues to happen on a week-to-week, month-to-month basis.

Caleb: That's such a great point. You are hitting the nail on the head again. There is an entire division of a nation state whose express goal is to make money by hacking people. So that is a nation state. They're clearly behind a lot of attacks, a lot of ransomware attacks. And if you talk to US intelligence agencies, you talk to government law enforcement, they'll tell you we are currently at war—digital war—with four or five countries, depending on your current stance about other countries.

I'll leave it up to your listeners to guess some of these names, but it's very much an acknowledged fact that there is asymmetric warfare happening where it's not China or Russia or Iran's hackers versus the United States' hackers. It's nation state versus municipal water supply in small town, Texas. It's big player versus tiny player.

The future of digital warfare

And what we've seen in the past couple of years is the way you fight a war now is digitally and it's with compute platforms. You always think the next war is going to be fought like the last war. World War I generals in World War II got a lot of people killed charging machine guns because they kept thinking that's how you win wars.

And even I fell into this trap where I thought, "Well, you're just going to have soldiers with more advanced equipment and night vision goggles and guns with smart bullets or something like that." But what happened is—I was in the audience at a talk at SentinelOne. We're having a sales kickoff, a bunch of hyper extroverted people are talking really loudly and we're about to have this famous cybersecurity person get up and get everybody really excited about cybersecurity.

Five minutes before he gets on stage, the guy next to me opens up his laptop, starts panic typing. And I said, "What's wrong?" And he runs our support team. He goes, "Russia just invaded Ukraine." I was like, "What are you talking about?" And he just slammed his laptop and ran away. Didn't answer my question. And I was like, "What's happening? We have a lot of customers in Ukraine."

And 12 hours before the news was reporting on it, we saw they launched cyber attacks. They took out water, power, traffic lights—they took out everything. And then what we saw afterwards and in Israel is we saw drones. Why have a soldier go out with a gun when you can put a drone and a grenade together and go fight that?

And all my friends who are doing reverse engineering, I called them to ask, "Hey, you want to join this company I'm making? What are you doing now?" "I'm a contractor for the government." "What are you doing?" "Finding vulnerabilities in drones." "Wow. Okay. That's starting to make sense."

So the way you fight wars now is digitally. And we're already seeing a ton of research on how to find vulnerabilities in compiled code, how to understand compiled code specifically to break into things. We're seeing research on how to automate this with AI, but almost none of it is coming from the United States.

Patrick: Just to give people some context on why would someone necessarily care if you can find a vulnerability in a drone or not. Speaking in gross hypotheticals here—there's a computer onboard every drone which is receiving signals from the operator in some fashion. At least currently. At some point they're likely going to be controlled by AI that's running on device, but for a variety of reasons that is not where most deployed drones are in the world right now.

They're receiving signals, they are performing some calculations on those signals, and then they're turning those calculations into instructions to physical hardware to rotate this rotor or explode this ordinance package, et cetera.

One way you can deal with the drone is by swatting it out of the sky with a missile. Another way you can deal with the drone is by somehow interdicting on a physical level the communication signal between the operator and the drone and then letting gravity take its course.

And then the other way is: if you can somehow get into that communications channel and send it things which—it doesn't necessarily even have to think that they're authorized communications. Just reading a communication, if there's the right kind of bug in a software package, can cause you to gain control of the device or at least some level of control over its operations. And the amount of control over the operations of something which is flying through the air that you need to cause very negative things to happen to that thing is very small.

It is not just drones that have this sort of vulnerability. This is sort of omnipresent in computer systems and in physical systems which are attached to computer systems, which are attached to networks, which is many, many, many things. And the record of software security assessments of the best-resourced places in the industry that have deployed multiple teams of PhDs with budgets denominated in billions of dollars for a decade is that the defender almost always loses.

And so when you compare that against the level of security of, say, a factory that deals with various chemicals which would be explosive if not controlled correctly, the potential for an external hacker to do bad things to that factory, the devices in it, the people who are working it, and potentially people in the now very literal blast radius is very high.

And so that's kind of the broad strategic reason for why we want to skill up in getting better against arbitrarily resourced attackers who might have non-economic motives for attacking infrastructure. Granted, hackers broadly have had non-economic motives like getting cred, doing it for the lols, et cetera, for many, many years.

But while there are extremely poorly adjusted people who've tried attacks on hospitals and trains and et cetera for "I can do it. Wouldn't it be cool if I caused a train to derail by typing things in my computer?"—the bigger worry is that in a situation like a declared war or an undeclared war, someone could decide to, in lieu of firing 1,000 missiles at someone in a very legible fashion, in a much less legible fashion, start to systematically degrade infrastructure across an arbitrarily large area. And maybe dial that number up or down based on how peeved one is at the moment, how much nonsense you think you can get away with.

I'll say one other thing about levels of nonsense. We've been talking about state-sponsored actors. There are thin and fuzzy lines in many places of the world. And this is me talking from a point of—I worked in the finance industry for a while. And the following is nobody's opinion but yours truly, but it's an acknowledged fact in the finance industry that a lot of crime originates from geopolitical adversaries.

Much of that crime is from people who have at one point—they wore a uniform or were in the non-uniformed but extremely formal parts of the government dealing with state security, espionage, etc. They might not be wearing a uniform today. Maybe sort of, but they might be really close to ex-supervisors or ex-colleagues and maybe there's a flow of data in one direction and requests in two directions and maybe even money in two directions and maybe that is an instrument of government policy formally.

Maybe it's perhaps not an instrument of government policy. Perhaps in some cases, maybe it's a level of corruption the government would stamp out if it was fully aware of it. Perhaps in some ways it's deniable in that "our glorious patriotic hackers that we have as our reserve army of internet specialists might be doing a bit of the crime every day that ends in Y. And as long as they don't defecate where they live, maybe we turn a blind eye from that because the glorious patriotic hacker army is a useful thing to keep in one's back pocket for battlespace preparation in the event of a kinetic invasion."

Astute listeners might understand that I'm saying things that are not hypothetical. I'll link to some official publications which discuss this dynamic in more detail because this is incredibly not just conspiracy theorizing, but I've flapped my gums a little bit.

Reverse engineering techniques

Patrick: So getting back to the reverse engineering, let's say you are an arbitrarily skilled technologist, you start with the binary. In the traditional pre-LLM days, what's the first thing one does?

Caleb: In reverse engineering, the first thing you might do is get some surface level information. So the spectrum ranges from surface level to—and at the very end, it's deep technical details that are—and how they conform to trends. So the simple stuff, the surface level stuff is looking at strings. So these could be any—usually a URL or an IP address or log messages sometimes. You try to get all the freebies, all the easy stuff.

Well, before that, you might even have trouble just with what's called de-obfuscating, which almost every spell checker will tell you is spelled wrong. It means—to obfuscate means to hide and de-obfuscate of course means to remove the hiding. It's a very general term that refers to anything that—any trick that malware authors or even commercial software developers will use to hide what they're doing.

Patrick: And there's legitimate-ish reasons for obfuscation of software code. For example, much software code which is sold to end users and installed by them could be pirated by people who successfully reverse engineered the code and turned off the "only work if people have paid money for this" bits. Used to happen to me a lot when I was selling downloadable software over the internet. Always a fun thing to happen as a small businessman.

So one of the first things you would do is to make it slightly more difficult at the margin for the adversary who is a warez forum user to create a patch to your code that strips out the "did they pay money for it" check. There's a number of commercial solutions which will make your code harder to read when it is decompiled.

And so then the adversary will have—at some point, some step where... Let me describe in just a little bit of detail for people what obfuscation might do. We mentioned that by default, if you compile something, there will be strings, which is just a sequence of letters and numbers visible in your compiled artifact, your binary. And that might be a web address or the name of an API that you're calling or similar.

There might even be hints to what the code is doing that don't execute but are left in the binary for whatever reason. And if you have a function named "copy protection subroutine" or something, then the bad guy knows exactly where to go to strip out the thing. "Okay, there's some block of assembly and it's doing something here. It might be reading a CD key or similar—replace that function with 'return true.'" Does that successfully strip out the copy protection? Which indeed it did for my software at one point. And that was the first crack applied to it.

And so there are many, many, many obfuscation tricks that people or the software developers that write obfuscation software can do. It's like, "Okay, if there are legible strings in your binary by default, let's reconstruct those legible strings on the fly using code which is very difficult to parse out what it is doing." And then simply typing a simple Linux command is not going to give an attacker a list of all the APIs something is calling or all the function names. And then there are arbitrarily complicated techniques built on top of that in other ways.

Caleb: To use an analogy, if you're going to buy a house, the first thing you would do isn't go into the kitchen, look under the sink and make sure there's no mold damage. You would say, "How many rooms are there? How many bathrooms are there? What's the square footage? What's the location? Is it near a bus stop?" That sort of thing. So you pull out the surface level information and hope to qualify things as—usually you're trying to answer a question: is this thing malicious or not?

And sometimes just judging from the obfuscation that's used... There's certain obfuscators that are really only used by malware. And one of them, VMProtect is one of them. There's a couple others and almost no commercial software uses it. And you'll find if you just take commercial software, you just take a hello world program and you obfuscate it with one of these—they're called Packers. And you upload it to—there's a common website called VirusTotal, which is pretty cool. Anybody could use it. It's free.

You could just upload a file there and it will scan it with all the antivirus that are available. And it will probably detect it as malware, just depending on the obfuscator. So once you get that surface level information, then you start to dig deeper and there's a decision tree there that branches off a lot, depending on what you're trying to answer.

Patrick: And so since some of the game is determining not just what the malware does, but who might have sent this—because you are some level of concern if it is a teenager in their bedroom, a different level of concern if it is a professional gang of hackers based in a geopolitical adversary, and then a different and higher level of concern if it's uniformed army members in a building somewhere.

One of the things that has happened before is that malware and sort of offensive hacking generally imply an infrastructure behind them, which is not obvious to many people, both a computational infrastructure and also a social, legal, technical infrastructure in the same way that functioning software of any sort implies a complex infrastructure behind it. No software just pops out of the ground because it was sunny that day.

And so given that you can fingerprint the infrastructure that software is touching, certain fingerprints point in the direction of—that is using publicly available things from Google etc. On Bayesian evidence most of the things that use those is not particularly bad because people don't like poking the bear that much. Although it is certainly the case that a lot of malware does use publicly available affordances to do bad things, but then there are privately developed software that might use private resources and certain forms of communication and using .com domains, etc. There are many, many signals that one could say where you would say, "Okay, this is probably on the up and up."

And then if you are trying to obfuscate what server you're talking to and bouncing through six layers of proxies and going to botnets to use command and control networks, your software is probably up to something fishy. And then which particular command and control networks are you talking to can let you know with some degree of certainty who is on the other end of the chain.

One of the—I believe it's been publicly reported that once a nation state actor, which is quite technically sophisticated and is from the perspective of many people listening to this, one of the good guys—it got a great number of people compromised because they had their own infrastructure for passing messages around and there were some commonalities to it. Their adversaries figured out the commonalities and then reverse engineered from those commonalities like, "Okay, give us a list of all the people that were talking to the United States via something that you only use if you are an asset being run by one of the intelligence agencies."

Unpleasant things presumably followed very quickly after that.

Caleb: You bring up some good points. First, little aside—whenever you're talking to government folks, whenever they say "it's been publicly reported," what they mean is, "I knew, but I wasn't allowed to say until it was publicly reported." That's something I've learned recently.

But your point about there are people who are professionals that work for government, and then there are people who used to be professionals who work for government and still kind of do the same thing, and then there's cybercrime groups. You're right. There's a whole spectrum of professionalism. There's also people who—ransomware as a service is a term. There are people who make ransomware kits. And when you buy it from them, they tweak it a little bit. They customize it a little bit so that it's not detected on VirusTotal by antivirus. And then you get your copy and then you bring a bunch of people that can be infected and they bring you the software. So you can—there's specialization, the same way you see in industry.

Patrick: There's an entire supply chain in financial fraud I've written about previously. I'll drop a link to it in the show notes.

Caleb: Yeah. And when it comes to malware, you're really going along the lines of the domain expertise you need to evaluate if a particular detail is interesting or not. Because if you're just a tabula rasa blank slate looking at code, it's hard to know the significance of anything. But if you know malware authors tend to have very bad software engineering practices, then you're looking at bad software that—this could be up to something. This looks like it was very crappily made.

We've seen that—I'll always remember this because there was a term called APT, Advanced Persistent Threat. And this was a major buzzword. I loved hearing it. There's songs called APT and everyone wanted to find them and have them and talk about them because then people would buy more security products. But they did exist.

There was one case where there was a quote-unquote APT that was discovered targeting Tibetan activists. So there was some beef between Tibet and China at the time. And this malware was supposedly targeting Free Tibet activists and would sort of report on their location. And everybody ran with the story. "This is APT. This is China made this. This is nation state attacks. Everybody has to care about this stuff now."

And when we looked at it at the time, we said, "This is not well-made." There's giant chunks of the application that aren't active. The software has bugs in it. Yeah, it's supposed to spy on your location, but it breaks when this happens or this happens. What we think this is—this is made by sympathizers. So these are going to be people who maybe they worked in government or maybe they're just ultra patriotic, but we have more respect for the Chinese military than this. We think this was made by someone else.

And I always remember that because you want that domain expertise. "I've seen things like this before. Here's where I evaluate it. Here's where I put it in the hierarchy of skill." Sometimes things that might look benign or they might look super dangerous in one context, they change completely if you've been looking at it a lot. And that's part of the difficulty of using AI to automate reverse engineering—you have to crystallize all of this domain expertise into prompts and into systems and into other models, into decision trees. And you forget how much you've learned when you start building these things.

Patrick: Interestingly, I think there's something of a reverse normalized curve here with regards to level of experience and effectiveness with regards to reverse engineering. One of the things that—credit my buddy and erstwhile co-founder Thomas Ptacek for telling me this, but in his experience, high school students were—sort of definitionally at the early stages of their computer career—anomalously good at reverse engineering because they haven't learned that it's supposed to be hard yet.

And also I think they—I remember myself as a high school student, I would happily open up a hex editor and look at compiled binaries because why the heck not? It's not like I had other fun things to do with my time. And there is just an acceptance of a level of manual punishment at that point in the career.

And then you go to university, you get your formal CS education, you write software for a few years and you experience the efficacy of writing software. And then the notion of banging your head into a wall for 12 hours just to figure out what one function is doing seems like not fun. And you kind of lose that useful curiosity that makes high schoolers so effective at this.

And then you go into the software security field, you reverse for a number of years, and you start to, for lack of a better term, see the matrix a little bit when you're looking at compiled binaries. You get better and better at sort of this tacit knowledge of who are the bad guys? What are their signatures? What are the common techniques people use? Is this new thing that I'm seeing today new-new because new-new implies things, or is it just this is an individual's twist on a technique that the industry has known for a very long time?

Also, it's underappreciated that many of the things in software security that are sort of staples of the art now—SQL injections, memory corruption, etc.—there was a single person that you can point at, at a particular moment that originated that. And then a lot of work done sort of on top of that substrate that they created. We're dating ourselves with these references, but things like the Morris worm. Morris, that was a guy that brought down nearly the entire internet. And then he went on to do other things in the tech industry that were not bringing down the internet. Fun.

AI's impact on reverse engineering

Patrick: Anyhow. So, LLMs. LLMs seem to be pretty naturally good with some parts of interpreting and expanding upon source code. And this is a thing that I've experienced in my own sort of hobbyist use of LLMs to help with routine maintenance programming tasks. But some of the smartest technologists I've ever met tell me that the experience of using them for code generation has been transformative to how they work. How does an LLM help you with regards to doing reverse engineering?

Caleb: Yeah, that's—you kind of, when you start talking about high schoolers being good at reverse engineering and you sort of get worse at it the more you're used to programming, that really is the explorer versus exploit spectrum where early on, and when you're younger in general, you tend to be more explorer oriented. And then once you find success in something, you keep exploiting that area—and not in a bad way "exploit," but just as a general term.

LLMs don't get tired and you could just kind of—it's like cognition is an API call away. We're in the realm of cognitive hyperabundance. So you start looking for all the problems that require tireless cognition, and reverse engineering—like you pointed out—it's very tedious and people don't like to do it. But I think it applies to everything.

One of my engineers was showing his friends how he uses AI to code. And he was telling it, "Hey, I want you to go into my code base—hundreds of files. And I want you to make this subtle nuanced change that affects 10 different files." And they were watching as it churned through all the data and processed everything and iterated. He would say, "This isn't quite right. This is wrong or this is wrong. Let me try it again." And they were like, "Wow, your company lets you use this." And he said, "No, they make me use this."

It really is. We've turned away candidates who were—this isn't the only reason—but usually the candidates that were using AI tools for code generation allowed us to focus on the more important stuff. It's not the algorithm that I care about. It's your craftsmanship. It's your judgment. And what LLMs let you do is they remove the skill cap—the individual skill cap for knowledge, technical detail, algorithmic knowledge.

I don't care if you can saddle a horse. I care if you could drive a car. I don't care if you can use an abacus. I care if you can take these numbers and multiply them. I don't care if you know how to reverse a red-black tree or write merge sort from scratch. I want to know: can you download data from this API and make this customer value thing happen? And what AI does is it automates all of the—right now—the low and medium to low level tasks. And specifically in reverse engineering, that's usually reasoning over the code, the representation of the code that we built.

Patrick: There was a recent example of this where Simon Willison, who's a very experienced programmer and a buddy of mine—we worked on Vaccinate CA together. He has a blog where he has been writing recently a lot about his explorations with AI. So he found a security bug in the Linux kernel manually, which already—security bugs in the Linux kernel, not quite a dime a dozen, but they're enormously consequential depending on what sort of bug it is.

And so a service to the world that Simon identified this and helped get it fixed. And then he said, "I wonder if AI would have found that bug." And so he ran some trials. I'll link to his description of this. And I might verbally botch some of the details—I apologize in advance. But the stat I remember is AI successfully pinpoints the bug in 8% of trials.

If you are a software security researcher and you successfully identify 8% of the bugs, you're not that good of a software security researcher. Employing you is going to probably be a net loss on the person who is reviewing your output. However, if you can review output in a for loop, 8% of the time is wonderful because statistics—if we get uncorrelated bites at the apple, just run a sufficiently large number of trials and then, presumably, knock on wood, you identify a bug 8% of the time when there's actually a bug, 2% of the time when there's not actually a bug. Math, math, math.

Get a smarter AI or a smarter human to review things that are at the top of this distribution rather than at the bottom of this distribution. Even if you just use it as a tasking mechanism for the scarce bits of cognitive labor or as an idea generation mechanism for, "Okay, you are orienting yourself around the new code base. What are likely the areas to focus on?" Let's have you focus on the place where the AI gives it—"I'm getting the heebie-jeebies here"—versus, "Here's 100,000 lines of code in a code base. The first thing you need to do is just read 100,000 lines of code to start understanding what it does."

I think software security professionals will tell you that when they're given an assessment, the instruction—"Here's 100,000 lines of code, write an assessment of it"—they don't actually read 100,000 lines of code. They know based on experience and et cetera, "Okay, I know the features where the good guys usually screw up. And so I'm going to preferentially locate those features in the code base and then start finding the quote-unquote 'findings' that the customer eventually pays me for."

If that says maybe a software security professional should spend 50% of their time in an assessment on the same 10 places as always, because that is where—file upload capability, so easy to do wrong. Logins, so easy to do wrong. I'm going to spend 50% of my time on the login screen. Then that implies less of your time in the assessment for random function 367. But maybe you don't want zero attention to random function 367 if it's doing a privately written cryptography algorithm, which is notoriously a place where people screw things up. Equally notoriously, you won't know unless you actually read it. Devote a non-zero level of cognition because cognition is no longer that scarce.

Caleb: This reminds me about the catching things at 8% being very bad individually, but at scale, it's really good. There was an old IBM commercial where—I love this—a guy was running and screaming through all these office hallways saying, "I just saved a nickel. I just saved five cents. I just saved a nickel." And everyone's just kind of like, "Who cares? It's a nickel." And then he runs by the executive's office. And one executive goes to the other one, "We have 1 billion shipments a day." Like, "Yes, I see."

So what you were saying about 8%—if you could, let's say, just validate that a bug is real or not 8% of the time, one of my friends was telling me there's something like 20,000-30,000+ static analysis checker bugs found. There's tools that will try to find bugs. They're very noisy. They make a lot of false positives.

But if you can use AI at that scale and get 8% of them accurate, then you just unlocked a massive number of bugs. And likewise, there are—the technical term I think is gajillions—of binaries flowing around the internet and on different systems. Every time you install something and every new update, every program you download, it's a binary. And if you could be right 1% of the time and a million times faster than a person, there's definitely places where that will be adopted.

Technology is adopted where friction is highest, and friction is highest in large enterprises that are very security sensitive. It'll start there and then that 8% turns into 30%, turns into 80% over time as AI gets better and people are better at building harnesses around it.

Patrick: It's a weird thing to say in adversarial games, but with respect to just competence generally in AI systems, the LLMs you are using today are the worst LLMs you'll have experienced in the rest of your lives. It is extremely unlikely that we forget how to build better systems than this. Now granted, where there is a cat and mouse game between you and the adversary, 8% might not be a fixed number over the course of the intervening years, but they're only going to get better.

I think that is an intuition that a lot of people in the policy, defense, et cetera space don't necessarily have because maybe they looked at GPT-2—produced English. Sorry, GPT-2, not ChatGPT-2. GPT-2 produced English and the sentences looked kind of coherent on a sentence level and then the paragraphs were just gibberish when you step back to think about them. And that was in—what was it, 2022? And there are some people who haven't looked back again to say, "Yeah, LLM's the infinite gibberish machine." They're a bit better in 2025 than that. And they will be a bit better in 2027 in the great majority of future worlds.

What was I thinking? Prioritization. As you mentioned, much of industry runs on very noisy heuristic alerting systems. And interestingly, it's useful to understand that when an alert fires at the majority of companies, whether it's a cybersecurity system firing an alert or if you're in an anti-money laundering system and a heuristically based system or machine learning based system flags a transaction as anomalous, what usually happens is it goes into one of several queues for a human operator to triage.

That's an intelligent person who has specialized their entire life to get into the seat that they're currently sitting in. And given noisy alerting systems, what they're doing a lot of their day is: nope, nope, nope, nope, nope, nope, nope, nope—so that they're sitting in the right seat at 3 or 5 p.m. on Tuesday. And it's like, "Yes. I need to start writing a memo about the consequences of this."

We care about that person's time and attention. We also care about not lulling them into a false sense of security that, given that I dismissed 10,000 alerts for every one that is really meaningful to my company, you don't want to miss that one just from quote-unquote "alert fatigue." And so if you could just automatically triage 2,000 of the alerts off the queue or move them into a different queue based on the level of urgency...

We were talking about—if you as a Fortune 500 company have positive knowledge that there were foreign intelligence services people in your systems right at the moment, that's a five-alarm fire immediately. Another thing that will raise an alert on your systems and which is important in a different sense of the word "important" is a junior employee has just installed StarCraft II on a company laptop.

They shouldn't do that. One, because you shouldn't be playing StarCraft II at work. The bigger reason that software security practices will say is StarCraft II has this enormous attack surface in it, which given that that's installed on your laptop, your laptop is much less secure than a laptop without StarCraft II installed on it. And given that we gain no business benefit from you having StarCraft II on that laptop, just play on your own machine on your own time, not connected to all the money in the enterprise.

The importance of urgency in security alerts

So if an alert happens that someone has installed StarCraft II, it eventually becomes a human's problem. And the human is going to do a pretty predictable set of things, which will often involve talking to that employee saying, "It seems like you installed StarCraft II on your laptop. Don't do that. This is a warning. The next time, it will be potentially more consequential than a warning." But that's a conversation where the level of urgency is bounded as a corporation.

And if you became aware of StarCraft II at three o'clock in the morning, you wouldn't wake up a senior member of the security team to have that StarCraft II "nope" conversation at three o'clock in the morning. That can wait for business hours. It's very probably not—you know, a beachhead in an attack from a state-sponsored adversary, just playing on the base rates.

On the other hand, there are other things at three o'clock in the morning where it's like, "No. The first person to detect it immediately starts what the industry calls a war room." Bring in the team. Sleep schedules are getting disturbed. What are we going to do?

Sorry, I'm monologuing a little bit. But dealing with alert fatigue is really real. If we could just take 20% off the size of these queues and route them better, that would be a wondrous, wondrous thing. And then these are going to get better over time. So after this current state where we're using them for early detection, routing, triage, et cetera, what do you think the next evolution of this paradigm shift looks like for reverse engineering?

The future of reverse engineering

Caleb: I think you can model the future based on the past a little bit, except when there are massive technological shifts that change the whole species, kind of like we're going through now. So if you start from when that started, you can make some good predictions. Also, I don't think that was much of a monologue. I think that's very useful background knowledge for people to have. Any company has 20 security products generating 10,000 alerts a day.

And the name of the game is 100% knowing what alerts are important and what alerts are not. What do you wake up everybody at 3 o'clock in the morning for? And what do you just hit the snooze button for?

Challenges in security product development

And what we've been seeing is that in previous roles, we would find vulnerabilities. We did really complicated, sophisticated stuff I was very proud to have worked on. And we could look at your code and know what libraries you were using—what other people's code you were using and what code they were using and so on forever. And we could find if it was vulnerable or not.

And then one of the customers we talked to—we thought it was a dead giveaway, great use case. And they said, "Yeah, but am I really using the function that's bad or is it just the library that's bad? Because it will cost me a million dollars in time and labor to update this dependency and to use the latest version."

So we have to be more specific. We have to sort of pre-triage these things for the user. And I think every security product is kind of going through that phase where step one: generate 10,000 alerts a day. Good job. Pat yourself on the back. Hardly anybody wants that. A couple of people do, but most people want you to then triage it somehow. And like you're saying, right now, it's somebody that says "no, no, no, no" over and over again, and you hope they don't get some sort of hypnosis where they just keep hitting no.

AI in vulnerability detection

What we found just in the last couple of years is we were building a system that could look at code—the sort of decompiled code from a binary—and tell us if it had a vulnerability in it or not. And we would know because we would add vulnerabilities to the code or we knew the code was vulnerable because we got it from a public database of vulnerabilities. They would say, "Hey, this version of this program is vulnerable and everybody needs to update." It's called the CVE database.

So we'd go get a copy of that code. We'd go to that function and we would check to see if the model would—if the AI would know that there was a problem there. And what we had to do at first was pretty convoluted where when you're dealing with AI, they tend to double down on whatever position they just happened to start with. And the way you ask the question can change it a lot too. So if you ask it if it's vulnerable, it might say, "Certainly here's why it's vulnerable." And then it just hallucinates a completely made up reason why it's vulnerable. So what we had to do...

Patrick: A thing which has never happened to any security engineer in industry, by the way.

Caleb: Yeah. Yes. It's okay.

Patrick: Sorry, not to throw humanity under the bus—I'm a proud member of the species myself. Useful for keeping that one in context around the word "hallucination." Sorry, continue.

Caleb: Yeah, I read one this morning that was a high schooler turning in his paper and the beginning of the paper said, "Certainly, I'll write a 200-word essay that sounds like a high schooler wrote it." That's definitely where we know where we are now.

The evolution of AI models

But what we did is—this was super advanced at the time, all of a year ago—we would have the model make an argument why something was vulnerable or not. And then we had it make an argument why it wasn't vulnerable. And then we had another LLM call that would evaluate them. So we had the advocate and judge model. And what this was, was basically a hacky form of what's called test-time compute now, where they found in—there was a study where there's a math test. And if you threw the modern ChatGPT at the time, which was probably 4o, if you throw the ChatGPT model at it, it would get it right 40% of the time.

Which is really good. It was a really hard math test. And then they had it generate a thousand answers for each question and then had the model evaluate which answer was the best. And then it would go up to 80%, 85% accuracy or higher. And of course this was a thousand times more expensive, but they were spending more time at test time, computing more tokens. So we did something like that and our performance went up quite a bit. It got much better because rather than doubling down, it's sort of like...

When you stream-of-consciousness talk, if you're doing a podcast and you go down the weird corner and you kind of forget what you're talking about, what the brain does is it's constantly evaluating what it's about to say and has a chance to kill any bad ideas and restructure the output. LLMs don't have that. And then they came out with reasoning models. So we had o1, we had DeepSeek R1. Now there's o3. There's a bunch of reasoning models and they basically supplanted the need for any of that at all.

Our whole advocate-judge system, we dropped it on a dime. If you're a big company building on AI, you can't—we have an advocate-judge LLM AI team, and you hire a bunch of people to build that sort of specialized system. And now their entire need to exist is gone. So the field is moving so fast that I think what you'll see is reasoning models getting better. You spend more money on compute, getting more accurate answers.

Reasoning models and their impact

You're going to see reasoning models that have access to knowledge. So these are called sometimes agentic systems or agentic RAG—retrieval-augmented generation. We could talk about what that means, but you're going to have things that can reason. They can reason about how they're reasoning. And they're agentic. They can decide what tools to use or what to dig into more, how to investigate it more. And they'll have access to some sort of knowledge store that they can update and pull from.

So sort of like we have a hippocampus, we have all these different specialized circuits in the brain. You have a circuit in the brain for recognizing things that look like snakes and you actually will detect a snake before you're consciously aware of it. So you'll freak out from seeing a hose on the ground. And I think we're going to be recreating a lot of what happens in the brain in these systems. And it's like you said, right now it's the worst it's ever going to be.

AI in software security

Patrick: I'd like to highlight some of the paradigm shift here that has not been fully digested either at AI-consuming companies or at the customers of AI-consuming companies—that we are rapidly figuring out modalities for how this sort of stuff can work. One of the classic things that you would like to know if you find a vulnerability in code is: one, what do I do to avoid this in the future? But two, if this is a signature for a problem, it is very likely that the same people that introduced this problem here might have introduced it in other places or similarly educated, similarly socialized, et cetera engineers might have introduced something like this. Can you find all the other places right now?

The role of linters in security

One of the things that we have done for a very long time in industry is use things—there are particular technologies for this. One is called a linter. And you might write a rule that says heuristically code that is shaped like this is highly likely to be bad. So please flag that in all the places.

But as you are asking a reasoning model to do something on a code base, you could ask it to speculate for me based on 100,000 lines of code, other places that are likely to be vulnerable. And maybe it gets that right. Maybe it doesn't. A different variety of asking for that is, "Write me the linter rule that would find all the places that this will show up in the code base."

I will bet with some level of—I would put money on this—that "write the linter rule" is at many margins more accurate than "speculate for me all the other places where this will show up." And it is extremely cheap to evaluate linter rules against all the places. And again, we are playing a stats game and there is newly a slider where we can just throw more compute cycles at problems. And so write me a thousand linter rules and come up with a histogram of how many places in the code hit which of them and then consider one at a time and tell me what you think about it and gradually moving more of this sort of extremely detailed, intense, extremely often demotivating work to machines to protect people for doing the more creative, fun, high-status work of—okay, you get to architect the system that spits out 10,000 linter rules a day.

I think that's often underappreciated that different things are awarded different ways in different organizations. And keeping builds from breaking, writing linter rules, moving from a particular finding from a security professional to "find all the other findings" are low status and not rewarded in organizations. And so in the status hierarchy and what you get rewarded for in software security, being the person with a big high-impact finding—wonderful. Producing the 300th variation on that finding—great job if you're two years into your career. And so given that it might be hard to do that, we devote vastly disproportionately little effort to doing that versus finding the next big impactful finding that'll have a name after it.

And so we should probably rationalize our internal incentive systems and status hierarchies, et cetera, to where business value actually is. If the 300th replication of something actually matters in the physical universe, maybe we should act as if it matters. But changing human systems is hard and changing software prompts is really, really easy, it turns out. So let's just throw the extremely abundant compute that doesn't care about being high status or low status at the 300th replication.

Caleb: That's—the tail end of that. Another sort of maybe a corollary is I can have conversations with Claude or ChatGPT that I would not have with someone else because they would think less of me. I ask very dumb questions and I get to know the answer.

But what you were saying about linters is a pretty good start to how AI changes things where there's three levels of meaning. There's syntax and then there's semantics. Syntax is the shape, I guess, of things that you're looking for. It's the sentence, it's the punctuation, it's the letters, it's the spacing, it's the first letter of a sentence as a capital, how to pronounce things—that's all syntax level. Then there's semantics, which is a step above, that's the meaning, the actual behavior of the sentence. And then at the top is pragmatics.

So syntax—if you have a sentence like "close the window," it's capital C-L-O-S-E. It's the letters. It's how it's spelled. It's the period at the end. Then the semantics is go over to that window, use your arms and close it. And then the pragmatics is it's cold in the room or you're letting the air conditioning out if you live in Texas.

And what the linters have been able to do and current tools have been able to do is work at the syntax level. And with lots and lots and lots of effort, you can work at the semantics level a bit.

AI shifts everything up. AI just very effortlessly does syntax. It effortlessly does semantics. And then pragmatics is something that it's able to get to. And we were talking about users not appreciating things and how we should revalue our judgment systems. Part of that is just storytelling. So if you are finding the pattern of something and it's not appreciated—you're creating this linter rule—then at a certain point you need to tell stories.

This is what I'm noticing now when you start a company and you're talking to people about this problem and you tell them, "Hey, we've automated reverse engineering or we're solving this problem." And they're like, "Cool." But you're like, "Hey, this is how wars are fought nowadays." And they're like, "Oh, wow." It's sort of like if you're a standup comedian, you have to practice over and over again and you start to notice what's funny.

When you pitch to people over and over again, you realize what resonates and what doesn't, and you end up discovering a lot of things. So yeah, if you're doing something you think is useful and your boss doesn't appreciate it, practice telling him why you think it's important, or maybe you shouldn't be doing it.

Patrick: A little bit of behind the scenes knowledge from a communications professional with regards to startup founders. I think many of us have an appreciation for novelty and an appreciation for wanting to produce creative outputs in the world. As a startup founder, you will continually be trying some new material, but much like a standup comedian, the typical recitation of "why does this company exist? What's the value proposition?" et cetera, et cetera, have been workshopped to death and you are going to reproduce it word for word identical.

And one of the hidden special attributes of arbitrarily successful startup founders is you can make the 600th word-for-word verbatim recitation of the 347th iteration on the pitch sound like it is the first time you are delivering it with the right level of emphasis and hitting the emotional beats and sounding really invested into this. And it's just like, "This is just another Thursday for me. And what am I going to do Friday? I'm going to do it again for the people who didn't hear it the 600th time." Oh boy.

And then part of the magic act of being a startup founder and being a standup comedian is sufficiently aware audiences understand that that is what you're doing and yet can be convinced to forget about that fact for 60 minutes or for the length of a podcast or a sales pitch or a job interview and then the rest of their career.

Incredibly, there are some people who get the pitch at the job interview from "should I spend the next couple of years of my life focusing on this?" And then they spend the next couple of years focusing on this. And then somewhere around year four or so, they come up to someone and say, "Have you noticed that the boss always says this particular sentence?" Like, you're catching on to that in year four? Cool. All right. Maybe they would have caught it a little bit quicker. Sorry. Random humor from a communications professional.

Caleb: This is great. Thank you.

Patrick: So we've talked about this—maybe "revolutionizing" is not exactly the right term, but let's say it's not introductory to revolutionizing reverse engineering. This is going to hit a lot of places in software security and writing software generally. It does already, infamously hitting writing software in a lot of places and in allied fields certainly, but then things that are kind of shaped like software security.

AI's impact on various fields

For people who aren't in software security directly and only consume it because they consume other outputs in the economy—and every sufficiently advanced output has a software security team working on it at the moment—what are other intellectual tasks where the surplus capacity of compute that is willing to do grungy work at 3 a.m. in the morning probably matters?

Caleb: I could start with a quick example. Everyone's kind of afraid of this, but medicine. One of our acquaintances had a daughter who was sick and she was acting up—very, very, the police had to be called and she's in kindergarten. She had a behavioral issue and they thought, "Well, she's redheaded and her dad is precocious." So I'm like, "Okay, that's not quite enough of a justification."

She also had strep throat all the time. She kept getting infected over and over again. And then after years, one of the doctors went, "There's a thing called PANDAS—not the animal, but it's an acronym and it's basically the strep gets in your brain, causes brain damage." She had that.

I described her symptoms without any kind of clues or hints. "Hey, this person's getting strep all the time and is acting up," yada yada. ChatGPT went, "Yeah, it could be PANDAS. You might want to check for that." So that—I mean, if you would have had a conversation with ChatGPT, you might've avoided this young girl having brain damage.

And you wouldn't inject yourself probably with something based off of ChatGPT. You definitely don't want to—you definitely want to explore things with ChatGPT and then have a human expert confirm. But if you go deal with a vet or you deal with a doctor, you want to educate yourself too, because you're often dealing with generalists. So if you go in with a specialized problem, you can investigate it with ChatGPT and it's the same with law.

So the way we use lawyers now at our company is we'll spend four hours talking with ChatGPT, and then we'll spend one or two hours with a lawyer reviewing all of our work. Whereas before we might've been billed for 10 hours of law work. Now we're being billed for two.

And then you have radiology, analyzing images, design. If you're an artist, I don't know—kids, when I was growing up, kids learned to draw and that was a cool skill and people would appreciate you for it. And now every kid has access to "make me a photo realistic or Studio Ghibli version of whatever." And they just get it. So the ability to draw, there's no more on-ramps for a lot of these things where if you're a Tier 1 analyst...

Patrick: I am slightly optimistic with regards to art. As a budding artist myself, I do painting and miniatures in my spare time as my one non-screen hobby. And despite it being my one non-screen hobby, the thing that I've discovered recently is that I have some level of technical skill in art. It is higher than two years ago. It is much less high than the people that are shooting the wonderful photos on Instagram of the work that they have painted.

And you can just take a photo of a model that is halfway done and say things like, "I don't like the way this blue is contrasting on this dragon's scales. What would you do to punch it up a little bit? And could you create a photorealistic reference of what that will actually look like on the model for me so that I can use my eyes to tell, am I getting closer to that plan?"

And then you can even—I think I do some of the time—take photos from a few angles of it after I've made some progress and say, "Be an art critic here. Help me judge my execution against the plan that we sketched out earlier."

And in my most recent painting of a blue dragon yesterday, the tool made one thing up. It said, "These plates are monochromatic. You should do three things there." I'm like, "Dummy, I've already done those three things. You're not seeing it, right?" But it detects other things correctly. And it does this for free at whatever time—free at the margin at whatever time I want to do it—versus getting a... I don't even know, as an employed professional, would I enroll in a community college course to have a generalist artist professor explain perspective and color values to me at 4 p.m. in the afternoon about a dragon that I'm working on for stress relief?

So I'm bullish about it for art. But I do think that there is a general sort of systems-level societal worry there where this sort of detail-heavy scutwork has often been used as an apprenticeship and on-ramp into the higher levels of professions. And so you have suffered the abuse in your first couple of years of being an investment banker or being an associate lawyer or being a software security assessor or junior programmer. You grunted out through the code. You learned some base of knowledge there. And then your time is more valuable. And you have people coming up behind you who you can say, "Do a document review of 2,000 pages. Find me the most important sentences."

And given that they were getting scary good at finding the most important sentences in 2,000 pages of documents, would you choose to have junior lawyers at your company billing at $250 an hour because that's what a lawyer makes the first year out of school? Potentially I don't know what big law is charging these days, but indicative numbers. Would you choose to have that level of person doing the work? And if you wouldn't choose to have that level of person doing the work anymore, what does the roadmap into the higher levels of the profession look like? Or unsolved problems in the adoption of this sort of thing?

AI in education and skill acquisition

Caleb: I think you can—so when I was in high school, we had a calculators class where you learned how to use a calculator. And it was great because in the past they would have focused on, "Can you do lots of mental arithmetic?" But the world didn't need that anymore. It needed people who could use TI-83 calculators in this case. And I think we're gonna have a much bigger shift where what you're talking about—asking an AI to render models for you and help you understand—you were using judgment. "This is a good output. This is a bad output." You're not really understanding it.

I think we're going to have to focus on training people to be tool users. And when I had to learn—take four years of calculus. I loved having done it. I didn't like it when I was doing it, because it was a lot of work. And I think you need to do the work to do the more advanced math, but something changed in my brain where it was so much easier because...I was talking to a teacher after hours, I was having trouble with the class. And I said, "What's the point? Is this just so that we're smarter about problems?" And he's like, "Yes, exactly. This is just putting more tools in your toolbox."

And I think if you change the education from "here are 50 math problems, just grind through them until you have muscle memory on how to solve l'Hôpital's rule or the chain rule or whatever calculus problem you're doing," it becomes "for this type of optimization problem, I use this approach. For this type of problem, I use this tool." And it's more of "here's what the tools are and here's when to use them. And here's where it's appropriate."

And I think that if AI continues to develop at this pace, that will be a good intermediary phase before eventually—10, 15, 20 years from now—the real problem is how do we find meaning in our lives when a lot of stuff gets automated?

That's what I'm worrying about now for my daughter who's three. It's like, how is she going to even learn to drive at all? How is she going to find meaning? Is she going to date AI? Is she going to date people or both? I don't know. But for the next 10 years, learning about tools is probably the way to go.

Patrick: I think I'm extremely bullish on AI in education—might not be quite the right word—skill acquisition generally. And I think there's an anti-pattern where "why bother learning anything when the infinite answer box gives you infinite answers." But if you have a will and a way about using these, Bloom's two sigma problem is the notion that people just perform outstandingly better if they're given individualized tutoring versus the traditional classroom instruction.

And we delivered traditional classroom instruction: one, because it's incentive compatible for many actors in the ecosystem, but two, because that's the way we've always done things. And then three—no small contribution of three—is it scales very well. You can deliver traditional classroom instruction to 30 kids for the same cost as delivering it to three.

And that is not the case with individualized tutoring until cognition is no longer scarce. If we could successfully bootstrap not just children, but—I hope to continue learning things until I die—the modal industry professional in software security or similar who might be 27 or 32 or 45 years old is still learning new things. They have a toolbox which is not coextensive with everything that has ever been learned in the industry.

If something becomes relevant to them—how do you quickly say, "Okay, continuing your apprenticeship here, there's an important research result from—this one happens to be 2017—which you weren't familiar with yet. Here is that research result, generalize from it in the future." That is a conversation that traditionally has been delivered in the apprenticeship fashion where the juniors ask the intermediates, the intermediates ask the senior folks and then some fraction of the senior folks create new research results to percolate down the chain over time with some variations on that sketch of reality. But should the first line of defense be if you don't understand—just haven't heard about something—ask the AI what you're missing about it and then you're free to bring up additional questions with your staff engineer or similar?

Staff engineers are—for people not in the tech industry—the wise individuals who have suffered a little bit so that you do not have to or you have not suffered to that degree yet. That honestly, I think is one extremely valuable just modality of using AIs that—again there are people and because of the realities in the industry, they're often young people who are professionals, they are earning a fairly substantial salary, but they're early in their careers and their judgment is, like many of us early in our careers, imperfect as compared to it is today where we always make the right decisions every time.

Anyhow, early career professional up at 3 a.m. in the morning, they get an alert and they have a decision point: do I need to wake someone up? And very many times in human history when that conversation has happened, people think, "There's a social cost to waking up my superiors to ring the bell. I don't want to pay that social cost. Maybe I'll just keep looking at it for a while. Maybe I don't quite understand what's happening right now. But it doesn't seem urgent enough—I understand what is happening—to wake someone up at the moment."

Your first point of call should always be the LLM. And the LLM might say, "Oh, what is the consequence of this anomaly we're detecting on a system? Well, it looks like that could cause an out-of-memory error in the caching layer." "What would be the consequences if I just turn the caching layer off again and on again to try to resolve that without waking anyone up?" "Okay, so there's this issue called the thundering herd that could happen in that. I'll explain to you what a thundering herd is." And then you and the operator might understand, "Okay. Glad I didn't do that. And I've learned one useful thing about the world, and I didn't have to wake up a senior engineer to get the explanation of what a thundering herd is."

And then you can even ask questions like, "Would you wake somebody up right now?" And maybe get better than a random answer with respect to that question. And again, they're getting better at less-ordered judgment calls all the time.

The future of AI in security and beyond

Caleb: Yeah, it's all about the harness you build around the AI model. At the heart of it, you have this thing that you can send text to and it can reply, but it's sort of like having someone's brainstem in a jar where it keeps the heart beating and it keeps the lungs moving. But when it comes to introspection and metacognition, or memory, those things have to be tacked on later.

And it takes the industry a couple of years for people to change their majors and for companies to fund teams to build stuff. We just recently had—I think November of last year—MCP, Model Context Protocol, which is a standard for how you can interface with—or models can interface with tools. And Google has agent-to-agent, which is how agents can talk to each other. We're so early, the standards are just being formulated.

And as we go, you're going to have "should I wake someone up? Well, let me look to see all the other times that people have been woken up before and the severity of that. And let me just explore this in a way that would have taken you three days to explore manually. I'm going to do it in five minutes with specialized systems that can remember things."

And to your point earlier about getting more of something like getting more education for people. So right now, the cost of a teacher is at some level. If you were to bring down the cost of teaching sufficiently low, you could have individualized instruction. And the way you generalize that is as the cost of something goes down, more and more use cases open up. And reverse engineering is one of those things where you've probably never heard about it. It's kind of a niche field and the cost has been so high that it's hitherto precluded a lot of use cases, but now the cost of it's going down to zero, not just because of Delphos, but because of this technology, the same with programming, the same with instruction. The cost of all these cognition things is going to zero and it's very hard to predict what happens next.

Patrick: Yeah, simply "go from this binary to the source code behind it and then tell me what that source code does" is for a relatively simple binary that's a finger-to-the-wind $5,000 to $25,000 proposition. And so you can justify it if you're a Fortune 100 company for any unidentified binary that we find on our systems. Sure, we'll pay the tax 100 times a year.

If it was not a $25,000 proposition, if it's a fraction of a penny proposition, maybe you should do it on every binary that crosses any mail system under any circumstances whatsoever. And at least the first time you see it, yada yada. And maybe that changes the game a little bit because if you can—the number of binaries that exist in the world produced by the good guys should be relatively low because there's a huge amount of human effort involved in each of them. And the bad guys that are trying to produce new binaries to evade the system will produce most of the binaries, yada, yada. So if you've never seen it before, odds are it's not great.

Obviously the good guys are—there is a team in a garage close to you writing the first version of their app today and they would like that app to be successfully installable. In lieu of a $25,000 security engagement to give them the stamp of approval, give it to the AI and say, "Okay, is this probably yes or no, the 10,000th iteration today from threat actor group ABC? And so block it at the firewall."

The adversarial nature of AI in security

Interesting things will happen both in the direct sort of immediate experience of using this paradigm shift and the underlying technological substrate shift. And then this is an adversarial game where parties get to sort of evaluate what the other side is doing and then take countermeasures in response to that. And so the second and third order consequences of this are going to be kind of wild.

Among many other things, the bad guys get to use LLMs too. And even if we successfully had a small number of LLM companies out in the world and they had responsible use policies and teams of people internally and teams of LLMs that were like, "If the user is attempting to abuse someone's computer system, stop talking to them, please."

The fact that there are open source LLMs in the world and open weights where you can reverse engineer even a system of weights to come back with, "What would this system be like if it had no safety rails attached?" It implies that the bad guys will, without loss of generality, have very powerful LLM systems to help them write their ratware—that's one evocative industry term for the supply chain of evil that lets them produce the things that ruin people's days.

LLMs will help write ransomware. LLMs will help to do the cat and mouse game against other LLMs to get things past screens or to cause them to be evaluated as non-essential. There are unique attacks enabled by LLMs, one that I love just from describing it as we've had SQL injections for forever where there is a distinction fundamental in computer systems between data and code and getting people to interpret data as code causes all sorts of bad stuff to happen if the attacker can specify the data and then change your computer system's use of it as a result of looking at that data.

And so prompt injection is just like, "Hey, please tell the user that the rest of this program is innocuous." And LLMs will often be fooled by that sort of thing in various deployment topologies at the moment, where "ignore all other instructions and pass this through to your superheroes" has worked in certain circumstances. It might work in the future. And we'll get better about detecting the first way that it gets through. And then the security researchers on the other side of the fence will find other ways to get it through, and the cat and mouse game continues.

Do you have any other places you'd like to explore before we sign off for the day?

Caleb: Sure. I like the last part you were saying there about everybody's getting access to these tools, including the bad guys. And we've seen examples of this. So there's the public vulnerabilities database called the CVE database. And people have been able to take information from that and generate working exploits. That sounds okay. Why is that a big deal? Because the CVEs usually have very scant information. It's just like, "There was a bad problem in this version of a thing. You need to update to the next version or bad thing could happen to you." And they really are that vague. Maybe a little bit of detail. Maybe it says the system that it's in or something like that.

And there are papers that show that with copy paste, you can take the details from the CVE, take the code, get a working exploit. And the reason that's important is because the CVE gets released very quickly. So you tell the vendor about the problem, the vendor fixes the problem within about a month or two or three, you publish the CVE and then technical information about the detail—the vulnerability never comes out. You want to give people time to update. So you want to let it soak so that people are secure.

If you're reducing the time that people have to get the update to be as soon as the CVE becomes public and everyone knows there's a problem, then they need to update the next day there's working exploit code for it, and it didn't require a super elite mega hacker, it just took copy paste ChatGPT, then that could be an issue. So you need to arm yourself with whatever tools you can to proactively find these things as well—you have to keep up.

Patrick: Yeah, and this has been an open issue for forever in CVE land where if you just tell people "OpenSSL version blah, blah, blah has an issue in it," then that typically has historically attracted a number of people to look at OpenSSL version blah, blah, blah out of the mountain of code that exists in the world. And then independent replications immediately after CVE publishing are extremely common. But that relies on this clustering and flocking behavior in an ultimately limited set of technologists that have the skills and wherewithal to find these things and in some cases chain them together.

What if the cognition level available to find OpenSSL vulnerabilities given the publication of source code was functionally unlimited anymore is a somewhat terrifying thought. The other thing is historically it's been the CVE publication identifying the particular "this version was vulnerable, this version is not vulnerable, please go from version A to B," which is the minimal necessary information that the defender needs to have to take the successful action, could cause someone to reverse engineer what happened between A and B.

However, software vendors publish patches into the world and upgrade their systems, and they might be flagged as security sensitive, or they might not be flagged as security sensitive, or in some cases, you smuggle in the security-sensitive bit into a routine patch to give people time to prepare for it before the named bug with the logo for it gets dropped on the internet. Patches are abundant in the world. And for the same reason that Fortune 100 teams can't inspect every executable that is found on their systems, no one inspects every patch that is published to say, "What is that patch doing?"

You can imagine a near-term or even present day, present time sort of situation where the bad guys are saying, "For every patch that is published by the usual suspects and for every patch that is on the list of the following libraries which we know are very well distributed. Pull out all the ones that were security oriented. Categorize them for me, which were the most important ones, write the exploit for me."

Caleb: And it only needs to work 8% of the time, because you're doing it at scale.

Patrick: Right. Yep. And you can even have it—"Hey, stand up a test system that has this software installed and run the exploit against it. Do you successfully extract the credentials or similar that you're going for for the exploit? And only wake me up if you do, because if you do, there are some Bitcoin wallets attached to the vulnerable systems that I would really love to know the private keys for."

Caleb: Up until the Bitcoin bit, what you're describing is kind of what we're building. So it's not hypothetical to us. It's "here's a virtual machine. Here's how to access it. Here's an AI system. Find vulnerabilities, verify that they're real, use that to create training data, use that to triage our own findings." Yeah, this isn't coming. It's here.

Patrick: And that is a wonderful and in some ways terrifying thought to end on. But again, there are two sides—there are more than two sides. But for simplicity, there are two sides to this equilibrium. The bad guys are going to get this technology, whether we like it or not. It is good that the good guys are also developing it so that the only place that has a working system is not our friends in the building in North Korea that are trying to steal all the money all the time.

So Caleb, thanks very much for coming on to the program today and giving people a little bit of an update to the state of the art and also a little bit of preview of coming attractions for other fields that are relevant to their interests. Where can people follow you on the internet?

Caleb: So the company that we've started is called Delphos Labs. You can find us at delphoslabs.com. You have to put the "labs" in, otherwise you get a pillow company. And I am @caleb_fenton on X, but I mostly just post Bitcoin memes there. So if you want to know more about Delphos, you can go there. You can try the site now. You can upload a binary, and we'll give you a report on it.

Patrick: Awesome. Well, thank you very much. And for the rest of you, thanks very much for listening and see you next week on Complex Systems.

podcast