Cybersecurity

How does Recorded Future find and track cyber threats? Its data science chief explains.

February 19, 2025 | By Ben Fox Rubin
Matt Kodama, who heads data science at the cybersecurity company Recorded Future, likens his job to working in the back of the house of a restaurant. His team is the one that shops for groceries, designs recipes, cooks and plates the food. 

The components that this data science team works with are pieces of data it curates from all over the web and then analyzes to tease out useful information for customers. That information is fed into Recorded Future’s software platform, which major financial institutions, governments and many others use to track potential threats to their organizations, both online and in the real world, in real time.

The Mastercard Newsroom recently visited the headquarters of Recorded Future, which became a part of Mastercard in December, and got a chance to sit down with Kodama and hear about his work. 

The following Q&A was edited for length and clarity. 

What data do you use? 

It runs the full spectrum, starting from many kinds of data that are publicly available. At the other end of the spectrum, we have figured out some clever and sophisticated ways to access data from harder-to-reach places that are often used by cyber threat actors. 

Can you share a specific example? 

For threat actors, one of the biggest games in town right now is stealing information off of individual people's computers. Now threat actors have got your login strings, your passwords, your cookies, all the information about what your computer looks like. With this data, that threat actor can make a fake machine that looks like your computer, and can make the browser traffic from that fake machine look like it's coming from your town. 

If you're a threat actor working in the enterprise ransomware game, that is amazing information for getting onto the network that you are targeting. The number of files from infected computers like this that are being offered for sale is insane. It's a big business.

It’s a crazy, crazy world. It’s like a flea market. 

What many customers care about is if they can find out as fast as possible that a high-risk login, such as to their VPN, has been exposed. They can take very specific security action. If there's a login session, kill it. Whatever password is currently on that login, reset it. Because it's basically a race: how quickly will this information get used by threat actors versus how quickly can the defenders find out and remediate it.

Do you have another example? 

All the computers that are trying to send messages over the internet need to have these tables that say, “If you’re trying to talk to this domain online, send a message to this IP address.” It’s like the phone book for the internet. 

One really basic but actually pretty effective analytic is just taking all of this information and saying, “What’s here today that wasn’t here yesterday?” There are constantly new businesses that are being created, and then, of course, a lot of them fail. So it’s very, very normal for new domains to pop up and then not harm anybody and then go away. So I can’t just tell everybody, “Hey, this is a new domain, block it.” We can instead go through every one of those new domains and try to connect to it. And if the security certificate it sends me back is a new certificate and looks weird, those are risky ones. 
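
As a rough illustration of the "what's here today that wasn't here yesterday" idea, the comparison can be sketched as a set difference followed by a certificate check. This is a minimal sketch, not Recorded Future's method: the function names, the seven-day threshold and the pre-collected certificate issue dates are all assumptions made for the example, and it models only the "new certificate" part of the signal, not the "looks weird" part. A real pipeline would connect to each domain to obtain its certificate.

```python
from datetime import date


def new_domains(yesterday: set[str], today: set[str]) -> set[str]:
    """Domains present in today's snapshot but absent from yesterday's."""
    return today - yesterday


def has_fresh_certificate(issued: date, observed: date, max_age_days: int = 7) -> bool:
    """Hypothetical heuristic: the certificate was issued within the last few days."""
    age_days = (observed - issued).days
    return 0 <= age_days <= max_age_days


def shortlist(yesterday: set[str], today: set[str],
              cert_issue_dates: dict[str, date], observed: date) -> list[str]:
    """New domains whose certificate is also new (assumed risk signal)."""
    risky = []
    for domain in new_domains(yesterday, today):
        issued = cert_issue_dates.get(domain)  # None if no certificate was seen
        if issued is not None and has_fresh_certificate(issued, observed):
            risky.append(domain)
    return sorted(risky)


# Illustrative data only; every domain and date here is made up.
yesterday = {"example.com", "shop.example"}
today = {"example.com", "shop.example", "new-bakery.example", "login-update.example"}
certs = {
    "new-bakery.example": date(2024, 6, 1),     # new domain, older certificate
    "login-update.example": date(2025, 2, 17),  # new domain, day-old certificate
}
flagged = shortlist(yesterday, today, certs, observed=date(2025, 2, 18))
```

In this sketch, a new domain alone is not enough to be flagged; it also needs a recently issued certificate, which mirrors the point that most new domains are harmless.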

At the end of this whole story, what a customer is trying to do is say, “Could you please give me a very short list of domain names that I should put into my [Domain Name System] filter and make sure that none of my employees browse to that domain?” If they can block it, that's the gold standard. 

So in this way you’re able to block a phishing site before a phishing email even goes out to potential victims? 

Ideally, yes. The bad guys have got to set up their infrastructure before they can use it. The idea is to detect when infrastructure is being set up extremely quickly and then correctly figure out which new infrastructure is operated by threat actors as opposed to all the normal and benign stuff.

How do customers use this information? 

The problem in the world is that there's so much bad guy activity. If I could magically give a company a feed of all of the very, very likely to be high-risk domain names that they should be blocking — they don't have security controls that scale to that number of domains. It's just too much bad stuff. Customers are very, very hungry for any insights we can give them — not just this domain name is risky, but who are they going after and what indications suggest they're going after people like me. Because then I would prioritize using that piece of information for my security versus, frankly, probably 90% of similar threats that look almost exactly like that, but they're going after somebody else. 

Customers have a very hard optimization problem, like the limited capacity of their security controls. What are they going to focus on? There's too much. And so they're hungry for us to give them insights that will help them with that optimization problem.  
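
One simple way to picture that optimization problem is ranking under a fixed budget: if a security control can hold only so many entries, keep the ones that are both risky and aimed at organizations like yours. The sketch below is an assumption-laden illustration, not a description of any product. The `risk` and `relevance` scores, their 0-to-1 scale and the multiply-and-sort rule are all hypothetical.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    domain: str
    risk: float       # hypothetical 0..1 score: how likely the domain is malicious
    relevance: float  # hypothetical 0..1 score: how likely it targets this customer


def prioritize(candidates: list[Candidate], capacity: int) -> list[str]:
    """Return the top `capacity` domains ranked by risk times relevance."""
    ranked = sorted(candidates, key=lambda c: c.risk * c.relevance, reverse=True)
    return [c.domain for c in ranked[:capacity]]


# Illustrative data only.
candidates = [
    Candidate("a.example", risk=0.90, relevance=0.10),  # risky, but aimed elsewhere
    Candidate("b.example", risk=0.80, relevance=0.90),
    Candidate("c.example", risk=0.95, relevance=0.50),
]
blocklist = prioritize(candidates, capacity=2)
```

Here the highest-risk domain with low relevance is left off the list, which reflects the point that many similar-looking threats are going after somebody else.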

Ben Fox Rubin, vice president, editorial content, Mastercard