Shawn Wang joins us to talk about his work in AI, why prompt engineering is not what you need to focus on, how the scope of AI is bigger than any one of us, how to deal with the consistency of AI, and how to make use of AI in your product or app.
Time Jump Links
- 00:25 I'm hearing AI voices
- 01:21 Introducing Swyx
- 03:08 Welcome to the AI episode
- 06:02 APIs and AIs
- 07:25 The art of AI prompting
- 12:56 CSIR prompt
- 13:58 Working with the inconsistency of AI
- 16:50 How can I train AI on personal data?
- 20:58 Ask.Lukew.com
- 25:16 We're all building on top of bigger models, frameworks, etc.
- 29:02 You need to have your hands on 10 and 2
- 32:26 I would use AI but my company won't let me
- 35:22 How much data is going back to OpenAI?
- 36:17 What modality should a web developer use?
- 40:17 Can I still be a web developer and AI engineer?
- 45:54 The dark side of AI
Episode Sponsors 🧡
MANTRA: Just Build Websites!
Dave Rupert: Bleep-blurp. Hey there, Shop-o-maniacs. I'm Davatron5000, and with me is Christopher Coyier.
Chris Coyier: Is that your bot voice?
Dave: Yeah. I'm outsourcing this to AI now, Chris.
Dave: I just figured this was a really... I don't know. It feels like the show to do it. You know?
Chris: Yeah. Yeah. Yeah. Why is there even a connection between voices and AI? There totally is, though, isn't it? I don't know. I don't know.
Especially in understanding voices and producing them, I guess. I can't get enough of those like Joe Biden and Trump playing video games together. If AI has anything to do with that, I'll take it.
Dave: It might be the peak. We'll see. We'll see. [Laughter]
Chris: This is as good as it gets. We've already hit it, people. But of course, we're here to talk about one of the biggest things to happen in tech in the last - I don't know - many years. You can't read the news without hearing about it. That of course being superconducting at room temperature.
Chris: That's what we're going to talk about.
Dave: Everybody's favorite.
Chris: No. No! Of course, we have a guest on. We have Swyx, otherwise known as Shawn Wang.
Shawn Wang: Hey, hey.
Chris: What's up? Thanks for joining us, man. We love having you.
Shawn: Hey, Chris, Dave. Good to be back.
Chris: Friend of the show, I'd say. Yeah. Good to be back, indeed. We always have you on because you're a thinker. You know what I'm saying? You always have something to say. Sometimes it's spicy. Sometimes it's... More often than not, I just find it well-considered. You're a good writer. And these days--
Let's back up just a little bit further than that, though, because you have this interesting story that I think frames it, which is that you started in something totally unrelated to tech entirely. Finance or something, right? That's your early story.
Chris: Then peace'd out of that for tech for various reasons. Then had this whole thing in tech. You bounced around companies that we've all heard of: Amazon and Netlify. I don't even remember your whole stack of companies, but they're all pretty cool.
Shawn: Yeah, the last time we met was at Cascadia.js, and I actually gave a talk that kind of charted my history and journey through tech and why you should pay attention to each of those things. I'm not super proud that I only stayed at Amazon for a year, but I learned a lot in my time. I did it ultimately for personal growth rather than the beauty of my CV, I guess.
Chris: Ah... no. I think it's fine. Especially, I think it syncs up with your, like, "I like to think about the industry and what's happening."
Chris: "And I like to paint a picture of what's happening there in a way from these kind of 10,000-foot views that I think a lot of people don't often see." And so, if that requires you to bounce around a little bit to get that perspective, I think the world benefits from that. So, there you go.
But recently, another hop. I guess it's tech-adjacent this time, but somehow feels entirely different to me, at least, and that's to this AI stuff. We weren't joking around. We are going to do an AI show. Welcome to the AI show of ShopTalk Show.
Shawn: AI show.
Chris: Yeah. You feel a little all-in on it to me. Does it feel that way to you? Yeah.
Shawn: Yeah, so what I did here was the same thing as my career transition from finance to Web development, which was to do it on the side for six months, feel around if there's potential here, if there's fit between me and the problem domain, and once I have decided to do it, actually cut ties and go all in. And so, I'm about a year into the journey.
Chris: Yeah, fantastic. Well, that makes you the perfect person to talk to yet again for this type of thing. I'm hoping, in this show -- we'll just see where it goes -- to kind of connect AI because the people listening to this show (of course, being mostly Web designers and developers) are going to be interested in knowing, "Okay, well, why should I care then?"
I think that answer is clear in some real obvious ways, like, "You should use GitHub Copilot. That's AI-powered and it helps you code." But I don't know that anybody is confused about that at the moment. But there's just so much more to it than that, right? We're going to get into that.
Let's mention at the top that Shawn "Swyx" here has a podcast of his own. Latent Space, it's called.
Shawn: Yep. Thank you.
Chris: Yeah. Let's do that. Tell us about that real quick.
Shawn: Latent Space is actually the newsletter that I first started writing a year ago. Then around about February of this year... So, it's only five months old. February of this year, we started a podcast around it, just interviewing people, because I feel like you can only be so original just writing by yourself. Once you actually have people to interview, you actually mine new tokens, as they say, and get more original insights.
The podcast has been popping off. We count among our listeners Marc Andreessen, Satya Nadella, and Andrej Karpathy. At one point, we were a top ten U.S. tech podcast, which is fun.
You know how these things go. It's only a ranking for a day, but--
Chris: Right. Right. Right. I think we hit the top 100 in the Netherlands once.
Shawn: Yeah. [Laughter] No, I think you guys are pretty high up.
Shawn: Because you've been doing this a hot minute. You know? But yeah, so Latent Space is just my, I guess, exploration and journal about how software engineers should think about AI.
I recently condensed this into, I think, the title of the AI engineer. I think that's where, even if you don't have a Ph.D., even if you don't have years of experience in machine learning engineering or anything like that, if you know how to wrangle an API, you should actually start taking a serious look because this is giving you new capabilities that you haven't considered before.
Chris: Yeah, I can see that. As Web developers, we can wrangle APIs. Dave, over there, you can wrangle a hot API, I'd say, you know.
Dave: Ooh... I'm posting. I'm getting.
Dave: Batching. Yeah.
Chris: Maybe even caching. I don't know.
Chris: Doing stuff with APIs. That, of course, must be hotly interesting to Web developers because they're like, "Wait. I can just send a paragraph or something to some URL endpoint and then get back better stuff?" Surely, that's how some apps have taken advantage of this and integrated it into their products, right?
At CodePen, if we were to just be like, "Make a Pen for me that's a rainbow that animates from left to right," or something, there's some chance that there's an API out there that could maybe help us get that feature done. Yeah?
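What Chris describes maps closely onto the chat-style HTTP APIs most model providers expose: you POST a JSON body containing your prompt and get generated text back. Here's a minimal sketch of assembling such a request (the model name, system message, and endpoint are assumptions modeled on OpenAI's chat completions API, not anything CodePen has actually built; the network call itself is left as a comment):

```python
import json

def build_chat_request(prompt, model="gpt-3.5-turbo"):
    """Assemble an OpenAI-style chat completion payload for one user prompt."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You write small HTML/CSS/JS demos."},
            {"role": "user", "content": prompt},
        ],
    }

payload = build_chat_request(
    "Make a Pen for me that's a rainbow that animates from left to right."
)
body = json.dumps(payload)

# The real call needs an API key and would look roughly like:
# requests.post("https://api.openai.com/v1/chat/completions",
#               headers={"Authorization": f"Bearer {API_KEY}"},
#               data=body)
```

The response comes back as JSON too, with the generated text under `choices[0].message.content`.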
Shawn: Yeah, I think that's where it starts. Then you quickly realize that there is an art to prompting and there's an art to code generation, and there's an art to chaining these things together to get the desired result. It starts easy and then has, I guess, a slope of complexity as you ramp up in your seriousness.
Dave: See, that's what gets me, the art of prompting. I would love to kind of get your TLDR. How do you do a good prompt?
I type stuff in all the time, and it's like, "Meh." Like, make me whatever. I did one for image generation for a D&D character. I was kind of like a cyborg lumberjack. It came back with some downright awful stuff. [Laughter] Just hideous pictures.
I eventually kind of got one that was serviceable but not great. But I'm just kind of curious.
Chris: Now you're blaming yourself because you're like, "Well, I didn't type in all the right stuff."
Dave: Well, yeah. It feels like I'm praying to a god, like, "Please, please, please, if I'm really good, give me back a cool one." You know? How do you--? How do you get better at prompt engineering? I'll just add the constraint: on a dime. [Laughter] As opposed to if you have infinity dollars.
Shawn: It actually is. I mean most of these resources are completely free. It's more about the investment in time.
I will say prompt engineering is very much last year's story. There are people walking around with the title of prompt engineer, but I think the scope of the AI engineer encapsulates a lot more writing of code than just prompting. I just want to set that out front in case people think that we're only going to talk about prompt engineering for an hour.
Dave: No. No. Let's keep it time-boxed here for the next five or so.
Shawn: Instead of talking to god, actually, you're more like talking to an intelligent 14-year-old, which is more relatable to parents. [Laughter]
Dave: Yeah, okay.
Shawn: It follows instructions. Sometimes it disobeys them. But you know. It actually has some kind of intelligence that you can use to do some task.
I'll say the two main ways I would recommend to get better at prompting: one is, learn by example. One of the early successes of this current wave of AI was lexica.art, which is a prompt search for images. A lot of the ways that people learned how to do text-to-image generation is literally just looking at other people's examples, copy their prompts until you get it.
Surprise, surprise, I guess, it's very similar to CSS [laughter] and how we copy over components and tokens and stuff.
Shawn: Once you have that sort of unstructured learning down, then you start to have a philosophy around what works in prompts. That is actually the domain of active research. Sometimes you can read papers around that.
There's one by, I think, Kojima et al., that actually did some automated prompt research around hundreds and thousands of patterns that they found. You find that certain magic keywords actually do improve the performance of prompts a lot.
The famous one is "Let's think step-by-step." We can talk about why that works and back up the reason why, but it's not immediately obvious.
I think the best way to think about prompting is adopting some kind of framework about what you want to do, who you are, who it is for. Let's say you're generating text for an audience. Describe your audience. Describe your role (who you are as an author), and then maybe the style of text that you want to generate. Assuming that we're only talking about text.
There's also code. There's also audio and all sorts of stuff. But the bulk of the work is going to be text.
Having these basically Mad Libs of, like, okay, the task you want done, the role you are, the audience you're playing to, the style of the text you want generated, and then kind of just filling in there or having a dropdown menu, like a list.
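Shawn's "Mad Libs" framing — fixed slots for the task, your role, the audience, and the style — is easy to operationalize as a plain template function. A sketch (the scaffold wording here is my own invention, not a standard):

```python
def build_prompt(task, role, audience, style):
    """Fill a fixed role/audience/style scaffold around the task --
    the 'Mad Libs' approach to prompting."""
    return (
        f"You are {role}.\n"
        f"Your audience is {audience}.\n"
        f"Write in a {style} style.\n"
        f"Task: {task}\n"
        "Let's think step-by-step."
    )

prompt = build_prompt(
    task="explain CSS grid in one paragraph",
    role="a senior front-end developer",
    audience="designers new to code",
    style="friendly, jargon-free",
)
```

Each slot could just as easily be driven by a dropdown in a UI, which is exactly what the off-the-shelf prompt-library tools do.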
There are many of these out there. I would say Typedream is one of the more interesting examples of a third-party implementation of prompt libraries and things you can grab and combine off the shelf. That's how you get better, just by studying the craft as other people have mapped out.
Shawn: But it's very unstructured. For the first time, there's no strictly typed API with a JSON schema definition where you can look up the docs. There are no docs. It's just kind of trained on the text corpus of 300 billion tokens. You just kind of have to vaguely pattern match to whatever you think is in there.
Dave: It's more like an instrument. You're just going to have to learn how it plays, sort of.
Shawn: Ooh... I love that. You're really good at the analogies. [Laughter]
Dave: I build my career....
Shawn: It's true.
Shawn: Yeah, so you guys play music, and sometimes patterns emerge that are not inherent in the instrument directly, right? I used to play the violin and harmonics is not something you can predict. You just find it in the instrument.
Things emerge in the LLMs themselves. So, what's really interesting and why there's a role for engineers (over researchers) is that even the researchers who created the models themselves don't fully know the full capabilities. I talked about this in my posts on why you're not told to pivot into AI.
There's a role for you here because the sheer scope of this general intelligence is actually broader than any individual one of us can comprehend.
Dave: Yeah. Interesting. On that prompt thing, I watched a video. Somebody gave an acronym CSIR. Context, like what you are asking for. Specific information like I have ten widgets - or whatever. Intent or goal, like kind of what you want out of it. Then response format as needed. I think that one was kind of like, could you list it in steps or could you list it as a paragraph or whatever?
Anyway, I thought that was helpful, but I guess my issue, and this, I think, kind of gets into the broader thing of building your own AIs and small AIs, you said before the show, which I'm kind of curious about. It feels like there's a consistency problem, kind of like that 14-year-old teenager thing. I tell you to do one thing and then it goes off and does it. Then I say, "Do the same thing again," and it goes and does a different thing.
It feels to me like AI isn't even eventually consistent. It's just kind of inconsistent. And so, in a world where AI is just kind of a flip of the coin, why would you build a business on it or a feature in your product? Isn't it kind of just rolling the dice with your users?
Shawn: I would say the consistency has pockets of reliability and pockets of unreliability. There's a little bit of an art to it, like finding pockets where you're like, "Okay, I can build a business on top of this." And people have. Jasper is a $1.5 billion company built purely on top of GPT-3.
I would say, yeah, part of prompting, part of chaining prompts and doing all that good stuff, is taking the vast expanse of what it can possibly do and actually wrangling it into proper software. This becomes a new kind of software that we're writing.
A lot of people are calling it software 3.0. Software 1.0 would be deterministic software that we write, you know, if statements, else statements, and everything is deterministic, handcrafted by the programmer. Software 2.0 would be traditional machine learning models where you have a custom data set and you train on the tasks, specifically like fraud detection, spambot detection. Then software 3.0 is where you have foundation models that have been just generally trained on a whole bunch of things and then taken off the shelf and prompted to do specific tasks.
I think what's new here and what we are uncomfortable with, the source of your discomfort is that, fundamentally here, we are dealing with a nondeterministic black box that is creative and generative instead of deterministic and limited. I sense that discomfort and I'm okay with it because it's just a new tool in our toolkit now that we didn't use to have. Learning to wield it is just like learning anything else, to be honest.
I think that just the fundamental new thing is that it is nondeterministic and we're just going to have to deal with that.
Dave: One thing I've heard in talking to people at Google and stuff like that is your AI is kind of only as good as the data you seed it with or build it or generate it with - I guess. What's the right term? Train it with, yes?
Shawn: Train it. Yeah, train it.
Dave: Okay. Train it. Chris Coyier, CodePen, is sitting on a mountain of Pen data.
Dave: I'm not saying you should just harvest it.
Chris: Don't scrape it.
Dave: Yeah, don't.
Chris: Buy it. It's for sale.
Shawn: It is? It is for sale? Where is it for sale?
Chris: No, I'm just kidding.
Shawn: I'm kidding.
Chris: But it's for sale. [Laughter]
Shawn: If you have this secret CodePen API announcement to have, this would be a perfect time to plug it.
Chris: No, I don't. No big news in that regard. But of course, we're trying to make it as best as it can be.
Dave: How does Chris or Dave Rupert train? Even if I just wanted to use my Pens, how would I train an AI on my Pens? I know it's "possible," quote-unquote, but it's like, "Yeah, going to the moon is also possible." Can you draw the line for me on how it kind of works?
Shawn: Yeah. I think it's just a question of where you are in history. We are in the very early stages where everything is kind of... We're inventing the playbook as we go along. Skip along a few decades and it'll be one click on some platform or other.
I would say that you don't train these large models on just your data. They typically get their insights and abilities from training on everybody's data. There's a massive incentive to centralize and for the largest players to keep getting bigger just because they already have the largest pools of data on which to train from.
Then if you want to customize it to your code style, to patterns that you specifically have coded before and want to repeat, then you can fine-tune it on the davatron5000 account on CodePen.
Dave: Yeah. What is that called? Is that applying context or providing context?
Shawn: Context is a very specific thing in AI or LLMs. Context just means sticking relevant examples of your prior work into the prompt before you add your real prompt. It's kind of like a hack on top of the real prompt.
You've heard about things like the context window, where GPT-3 or 4 have 4,000 or 8,000 tokens in the context window. That's the amount of space you're allowed to input into the LLM to produce the code or the language that you want. But fine-tuning is just additional training on top of the foundation model.
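That "sticking relevant examples into the prompt" move is really just string assembly against a budget. A toy sketch (word counts stand in for a real tokenizer, and the header wording is illustrative only):

```python
def stuff_context(examples, question, budget=4000):
    """Prepend prior work to the real prompt, keeping the newest examples
    and stopping once a crude word-count 'token' budget is exhausted."""
    kept = []
    used = len(question.split())
    for ex in reversed(examples):        # walk newest-first
        cost = len(ex.split())
        if used + cost > budget:
            break                        # context window is full
        kept.append(ex)
        used += cost
    examples_block = "\n---\n".join(reversed(kept))
    return f"Here are examples of my prior work:\n{examples_block}\n\nNow: {question}"

stuffed = stuff_context(
    ["a b c", "d e f g", "h i"],   # oldest to newest prior work
    "do x",
    budget=7,                       # tiny budget: only the newest example fits
)
```

Real systems count tokens with the model's own tokenizer (e.g. tiktoken for OpenAI models) rather than words, but the shape of the problem is the same.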
Dave: Oh, okay. I can start with an LLM (large language model), and then fine-tune it through my own thing. I assume there are tools out there. I hear LangChain in the mix and whatever different. [Laughter]
Shawn: Yeah. LangChain doesn't do fine-tuning. LangChain is more of prompt chaining. It's basically--
It's very funny. The way that this industry is developing is like very early Web development. The best way to describe LangChain is jQuery for AI engineers. It's a collection of tools.
Dave: This is so interesting. I start with, like, "Give me this," and then it says, "Cool. Yeah, but give me this." It's kind of automating those five prompt steps, basically, to get a good result.
Shawn: Part of the steps could be document loaders, right? For example, if you have... We'll talk about this in a bit, but one of the disciplines in AI engineering is retrieval augmented generation, which is generating text based on examples that you retrieve from your codebase.
It could be from your codebase. It could be from your corpus (like a textbook or your docs, whatever it is). Any time people do, like, chat with your docs or chat with your codebase, anything like that, that is retrieval augmented generation.
LangChain also does help with that. But also, there are plenty of people doing alternative frameworks and rolling their own. It's so early that there is no clear winner.
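Rolling your own retrieval, as Shawn suggests, is mostly embed, rank, and stuff. A toy end-to-end sketch, with a bag-of-words counter standing in for a real embedding model (the documents and prompt wording are made up for illustration):

```python
import math
from collections import Counter

def embed(text):
    """Toy 'embedding': a bag-of-words count vector. A real pipeline would
    call a learned embedding model and store vectors in a vector database."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rag_prompt(docs, question, k=1):
    """Retrieval augmented generation, the retrieval half: rank docs by
    similarity to the question and stuff the top k into the prompt."""
    q = embed(question)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    context = "\n".join(ranked[:k])
    return f"Answer using ONLY this context:\n{context}\n\nQuestion: {question}"

docs = [
    "Mobile-first design starts from the smallest screen.",
    "Touch targets should be at least 44 pixels across.",
]
final_prompt = rag_prompt(docs, "How big should touch targets be?")
```

The LLM then answers from `final_prompt`; grounding it in retrieved text is what reduces (but, as Shawn notes, does not eliminate) hallucination.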
Chris: Oh, gosh. That fills in a misunderstanding gap I had, I think. Somehow, in my brain, if I'm coding and I'm using the LLM thing to help me code, my expectation is that it's trained kind of just on my stuff, or just on other code from GitHub or something. It's not part of this massive LLM and then scoped down later. But it is? [Laughter] I don't know.
Shawn: Absolutely, it is.
Chris: It is?
Chris: Okay. It's scoped later. Here's... Okay. Hopefully, this isn't too much of a wildcard. But you know Luke Wroblewski, lukew.com? He did a bunch of startups. He's at Google, these days, I think. He's prolific in his writing and presentations and stuff.
lukew.com, but you can go to ask.lukew.com, and he's like, "I'm going to make my stuff use an AI of some kind." And if you go to ask.lukew.com, you can just ask it stuff. According to his little writings about this, he's just playing around with it.
It just returns stuff from his own presentations and his own videos, PDFs, blog posts, and stuff like that. We're saying that this wasn't actually just trained on Luke's stuff. It was trained on all kinds of stuff and then is somehow just scoped to Luke's stuff in the last mile. I don't know if I'm explaining that correctly.
Shawn: Yeah, that's probably what happened. That's just because most of us will never produce nearly enough words to train these kinds of models. The reason they're large language models is they take a huge corpus and the number of parameters in them are huge.
Just to give you an idea of the order of magnitude, it's about 300 billion words to train GPT-3, and the latest Llama 2 model from Meta was 2 trillion words. That actually teaches it all sorts of things, from world knowledge about what entities and objects are, to being able to complete sentences. [Laughter] Everything and anything in between.
Then you don't fine-tune it. You do retrieval augmented generation, the thing I just told you about. It's specifically trained to pay attention to the last few items, the context window, where you can insert in, let's say, your blog posts or whatever. It can start generating text only scoped to that.
But it will hallucinate things that maybe don't exist in the source text. It's just harder to find, and that's one of the difficult engineering challenges.
Chris: Oh, wow! Even though, on Luke's website here, I could ask it, "How do you make cotton candy?" or something, and it might just tell me, even though Luke has never written about that in his life.
Shawn: Yeah. But there are tricks around that. It's around sourcing of your information. Basically, you cite your work, right? [Laughter]
Actually, if you were an elementary or middle school teacher, you might see a lot of parallels to, "Don't make this up. Show your work." A lot of us are doing the same thing with the AI APIs that we're writing.
I would also say, I guess, that you don't have to do this at the beginning. I do think there's maybe multiple stages of adoption to go here.
Chris: Ooh... Tell me about that.
Shawn: Oh, I don't have an actual framework or anything to pitch. Definitely, people are out there putting together long Twitter threads about this stuff.
Just in terms of prompting, the art of retrieval augmented generation is a fairly well-mapped out discipline by now. You can kind of go down a standard set of tutorials.
I'll talk about this later, but I am working on a tutorial for that kind of stuff to where it's synthesizing this thing.
Just to leave breadcrumbs for people looking to do more, look into vector databases like Chroma or LanceDB. Look into retrieval augmented generation tools like LangChain, but also you can handle it yourself. In fact, you should handle it yourself to understand it.
Then look into hypothetical document embeddings (HyDE), which help you match the queries that you embed to the documents that you actually want to retrieve. There are also some patterns beyond that, but those are the basic starting points.
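HyDE boils down to one move: instead of embedding the short query, first have the model draft a hypothetical answer and embed that, because a fleshed-out passage (even a partly wrong one) usually lands closer to the real documents than a terse question does. A sketch with a canned function standing in for the LLM call:

```python
def draft_hypothetical_answer(query, generate):
    """HyDE step one: ask the LLM (injected here as `generate`) to write a
    plausible passage answering the query. That passage -- not the raw
    query -- is what gets embedded and matched against the document store."""
    return generate(f"Write a short passage that answers: {query}")

def fake_llm(prompt):
    # Offline stand-in for a real generation call.
    return "Touch targets should be at least 44 pixels across."

search_text = draft_hypothetical_answer("How big should touch targets be?", fake_llm)
# embed(search_text) would now drive the similarity search instead of the query.
```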
Chris: Yeah. There are so many words to know. Yeah, that's part of what makes it intimidating to me. I know the big ones. I know what Llama is, I guess. And I know what... Everybody knows OpenAI and their models. That makes sense.
But then you're like... But the startup space is huge, and all kinds of things are trying to help you. But this is again like shattering some of my understanding of some of that.
You're saying in order to have this be useful, it's got to be a massive model. Again, we're talking about text mostly, but it probably applies to anything.
If you're a startup in this space, you're probably built on the back of bigger models. It's not like you're bringing some new model to the party. You're bringing some DX or something to the party.
Shawn: Well, I just think it's how a lot of startups function anyway. This is no different to me than I built on top of AWS.
Chris: Yeah, right.
Shawn: It's no different to me than I built on top of some framework I don't own. But someone else put in the hundreds of millions of dollars into building this framework or setting up this service, and I can use it to write the last mile, to make a product that's useful for my end customers who don't actually care what I implement under the hood. They just care what I can do for them.
There's just a set of actual creative generative new use cases that I can now create. I think that's what people care about.
Dave: Another analogy incoming. [Laughter] Warning signs...
I don't know how to write a global mapping system, but I can embed a Google Map on my webpage. I can set some markers and say, "Make a map that's kind of in these bounds," right?
Shawn: Oh, absolutely. I have this little menu bar app, and I use it for coding probably like 20 times a day.
Shawn: Any time I want to look... Yeah, I do. It's Copilot++. Copilot is really good for auto-complete, really good if you write a comment of the function you want to implement. It can just go ahead and implement the function.
But for just general things, I have this project called Smol Developer that is a coding agent that goes from prompt to full app. You can just literally say, "I want a snake game," "I want a tic-tac-toe app," "I want a VS Code extension," "I want a Chrome extension" - of that size and complexity. It can produce the whole thing, including, for example...
Maybe a hot take or admission here. I've never learned CSS animations. I've tried. I've pulled up the CSS-Tricks articles and I've really tried, but I can't memorize it. I just don't have the practice. I don't have the patience to do it.
But now I can just punch it into an LLM and go, like, "Give me a candy stripe loading indicator ten pixels and rounded," and it just does the thing. It's great.
Chris: Right. Okay, so that's an example where Copilot wasn't the right... There's actually a different way to interact with these models that somehow works better for you. Copilot isn't the last story in how to interact with these models.
You even brought up a point that I don't know that everyone 100% understands. I think the easiest... Once you've installed and activated Copilot, you can't miss it. As you type, it's suggesting crap. Then you hit tab or whatever.
But there's this other way to use it that's more prompt-like where you literally use the code comments above the thing. Then it sees what you're doing and it might suggest more code than just the extension of the line, like, "The following function will take a URL and look for the word "candy" in it and return true or false," or something. If you explain what you want, you're going to kind of get that. But that is still pretty scoped to, like, just one file, for example.
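In the example Chris gives, the comment is the prompt, and Copilot would typically complete something like the following (my guess at the generated body, written in Python; Copilot itself works in whatever language the file uses):

```python
# The following function will take a URL and look for the word "candy"
# in it and return true or false.
def url_contains_candy(url):
    return "candy" in url.lower()
```

Trivial, but representative of the scope: a few lines, confined to one function.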
Shawn: Not even the file. It's typically a single function or a few lines of code.
Chris: If you're talking, "Code me a tic-tac-toe app," like you explained, that's a larger scope. You're saying there are tools already that can help with that?
Shawn: Yeah, I made one. [Laughter]
Chris: Yeah. [Laughter]
Shawn: I'm not plugging my own. I'm just saying it's possible. I did it in a weekend. It's not that hard.
The more you explore the capabilities of these tools, then you'll know when to pull them in. I think, at least for people who are skeptical, cynical, or maybe just apprehensive, just having a sampler is really helpful. Like, "Here are the capabilities that other people have explored," and you should know them whenever you want to pull them out because they are very helpful. [Laughter]
Just because a lot of this is low-level boilerplate, right? I think what the opportunity offers is that we get to elevate ourselves from writing every single line of code towards reviewing the code that is generated and prompting it to tell it what we want and just kind of reviewing it (because it's going to make errors). You should not check it in without review.
Chris: Yeah. Right.
Shawn: Even though people do. GitHub estimates about 40% of lines of code checked into GitHub are checked in unreviewed or unchanged from GitHub Copilot.
Chris: [Loud gasp]
Dave: I'm guilty. [Laughter]
Dave: Sorry. My entire test suite. [Laughter]
Chris: But how does it really know? My eyes maybe looked over it really quick and I'm just so familiar with the kind of code that it generated that how would it know that I actually reviewed it? I don't know. That seems funny.
Shawn: Yeah, exactly. There are studies that say it does generate insecure code. It's kind of like the Tesla autopilot. They want you to keep your hands on 10 and 2 and never stray and not go to sleep because, even though this is kind of self-driving, you still have to monitor it. We don't have the equivalent of that in code generation or AI yet, where we can say, "No, we really require human review." So, there's a certain amount of self-discipline that needs to be exercised, or at least good testing.
I think, ultimately, this helps you generate boilerplate much better but it doesn't replace the human processes of actual checking and designing with the end user in mind.
Chris: Yeah. You keep saying it's early days. It totally is, right? I don't think there's any doubt that we haven't seen where this is totally heading yet. But it does already seem like a big deal.
When I talk with general people, I talked with a guy who owned a restaurant. He was so pleased with himself to use... He's a nice guy. I'm not trying to be like, "You're dumb." He used it to iterate on menus for catering. I thought that was really clever, actually. Then was skilled enough with it to adjust the prompts and be like, "Yeah, you did a pretty good job, but I don't have any mangos, so adjust it," and stuff like that. Really basic stuff.
Then I was like, "Man, that's great. Good for you staying on top of the world. You should see what it's doing for us coders, though, man. We really benefit a lot. I benefit all day every day."
I don't know if it makes me twice as fast, but it's getting there. So, I'm thinking about that, about how I probably am better and faster. That's something. I just want to throw that out there for everybody listening to this. If you've never even tried it, which I know there are some of you out there, it really is just good. [Laughter]
Chris: It helps you. Then I was in my local Slack. We have a Bend.js Slack group, so just my little town. A conversation just started the other day. It was just generic. It was like, "Who is using AI code helper tools?" I bet it was 50/50.
There were some people that were just like, "Nah. Just haven't yet." You know? Which is a good reminder of how early days this is and how some people are just uninterested or something. They're not even Luddites. It's just early days. You know?
I probably misused that word. But then the most common response was, "I can't because work won't let me."
Chris: Have you seen that one yet?
Shawn: Yeah. Yeah, yeah. For real use cases, it's probably a top complaint, and there are others.
There are professors who require usage. There are other companies who actually buy ChatGPT subscriptions for every single employee. There's a very wide range of opinions, and that's completely fine. You have to decide. As with any new technology, you have to decide how comfortable you are with it and what the safe boundaries are.
I will say there are companies that are emerging that will give you the privacy guarantees that you want. It's obviously very important for the Europeans and anyone working in healthcare, defense, anything like that.
Yeah. Don't count it out because there are options out there. If you use it as an excuse not to explore, that's where I'm a little bit against it because it's not like people aren't running into these. You're not alone in running into these issues. People have actually worked on solving them. They're out there, and they are looking for you.
Yeah, "Work won't let me," I think is a barrier. But also, if you really want it enough, there are people out there who will give you an LLM inside of a Docker container that you can run inside of your own environment that is fully controlled by you. So, Sourcegraph is doing that. Codium, another podcast guest I've had on, is doing that. There are a bunch of people.
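A quick sketch of what "run it in your own environment" looks like in practice: many self-hosted LLM runtimes expose an OpenAI-compatible HTTP API, so switching from the hosted service to a fully controlled deployment can be as small as changing the base URL. This is an illustrative sketch, not a recipe from the episode; the `localhost:8080` endpoint and model names are assumptions.

```python
import json

def build_chat_request(base_url, model, user_message):
    """Build an OpenAI-style chat completion request.

    Pointing base_url at a self-hosted, OpenAI-compatible server
    (instead of api.openai.com) keeps prompts inside your own network,
    while the request shape stays the same.
    """
    url = f"{base_url.rstrip('/')}/v1/chat/completions"
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return url, json.dumps(payload)

# Same request shape, two very different privacy stories:
hosted_url, _ = build_chat_request("https://api.openai.com", "gpt-4", "Review this diff")
local_url, body = build_chat_request("http://localhost:8080", "local-model", "Review this diff")
```

Because the wire format is identical, trying a locked-down deployment later doesn't mean rewriting your application code.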
Chris: Hmm.... Enterprise-friendly AI. Hell, yeah. Of course, they are.
Shawn: Exactly. Oh, yeah.
Chris: Yeah, they want to charge you $200 a head or whatever they do.
Dave: Well, and that seems like the way to do it. Just sort of a clean room, like, "Hey! Don't go tell the world about this one."
Shawn: Well, yeah, so then the question is... Obviously, this gets into a whole privacy debate, which we have no expertise over. But how unique is your code? You know? [Laughter]
Dave: Right. Right. Well, just getting into customer data. I talked to a friend who works at Salesforce. Knowing your customers isn't cool or exciting, but knowing which customers are about to peace out is maybe interesting, and maybe somewhere AI can help. But I don't want to just send a list of my customers to openai.com. You know?
Shawn: For sure.
Dave: A clean room for that would be kind of cool.
Chris: Yeah. You've even got to adjust your understanding of what's even happening, right? What data is going back to them?
Dave: True. Yeah.
Chris: I don't even know. Some? [Laughter]
Shawn: Yeah. Everything in the prompts.
Chris: Yeah. But you don't see the prompts. You know?
Shawn: Exactly. Yeah. OpenAI has promised that they do not train on the data that you send them from that API. You just have to take them at their word. But also, I would just say I think they don't care. [Laughter]
Chris: Probably, but you almost wish that they would. Right? In my mind, not all the time, but as I'm coding, in my mind I'm like, "Ooh, this thing is getting smarter as I code because it's learning my codebase better." But it's not really. [Laughter]
Shawn: No, it's a static model, and they guard the training data sets very closely. Yeah, I mean there are many solutions out there. Keep an eye open. OpenAI is not the only story in town. If you have privacy concerns, that's one issue. But there are many other problems that people deal with in applying AI, and it's not just code generation, by the way.
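Since, as Shawn says, everything in the prompt goes to the provider, one pragmatic middle ground (short of self-hosting) is scrubbing obvious identifiers before the API call. This is a tiny illustrative sketch, not a complete PII solution, and the patterns are assumptions for the example:

```python
import re

# Everything in a prompt leaves your machine, so scrub the obvious
# identifiers first. Real products use much more thorough redaction.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
DIGITS = re.compile(r"\b\d{6,}\b")  # long numbers: accounts, phones, etc.

def redact(text):
    text = EMAIL.sub("[EMAIL]", text)
    return DIGITS.sub("[NUMBER]", text)

prompt = redact("Customer jane@example.com (account 123456789) is about to churn.")
```

The model can still reason about "a customer with an account" without ever seeing who that customer is.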
There are many modalities, is the word that people use to describe this. A modality would be text generation, or audio generation, or code generation. Each of these modalities offers different and unique challenges, and also opportunities.
Dave: That was kind of like where I wanted to go next. As a Web developer who wants to leverage some AI, what modalities or what should I be looking to do? What's possible, and what do you think people aren't doing? I have some scenarios if you're interested, but what's--?
Dave: Can I just send two pieces of JSON and say, "Hey, what's good?" You know? [Laughter]
Dave: Or, like, "What's bad about this?" Or "What happened?" And it can generate stuff. Is that something that it can do?
Shawn: Yeah. Comparing JSON, charting a codebase is very, very fun. Take any codebase, whatever, anything that's open-source and throw it in there. Ask it to draw a diagram. It's really good at that.
Chris: Wow! Really?!
Shawn: Yeah. [Laughter] I do it all the time. It's fantastic.
Chris: Because it understands imports? AIs know ESM? [Laughter]
Shawn: It just understands the structure of code. That's inherently easy to graph. You have to just kind of hint it.
Shawn: Color this box or give it some kind of legend, but it does all the things. But the way, Code Interpreter is a mode of ChatGPT that you can pay for, $20 a month, that gives you a sandbox to execute the code in and actually do charting as well. If you're doing anything like visual, anything data analysis, you can upload a CSV and ask it to analyze the CSV. Anything like that, use Code Interpreter, which is a special mode inside of ChatGPT that if you're not close to the story, you maybe haven't tried out. It is the best thing since sliced bread. I will vouch for it.
Chris: It's a way of flagging it and saying, "What I'm about to send you is code, so you should know that"?
Shawn: No. No, no, no. What I'm about to send you is data, and please write code to analyze this data. Sometimes that data is in the form of code.
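The "chart a codebase" trick Shawn describes boils down to feeding the model some structure (a file listing or import graph) and asking for a diagram with hints like a legend or colored boxes. Here's a minimal sketch; the Mermaid output format and the exact prompt wording are my assumptions, not a quoted recipe from the episode:

```python
# Build a prompt that asks a model to diagram a codebase. Mermaid is a
# convenient target because the diagram itself is plain text.
def diagram_prompt(files):
    listing = "\n".join(f"- {path}" for path in sorted(files))
    return (
        "Here is the file structure of a codebase:\n"
        f"{listing}\n\n"
        "Draw a Mermaid diagram of how these modules relate. "
        "Add a legend and color the entry point."
    )

prompt = diagram_prompt(["src/index.js", "src/api.js", "src/ui/table.js"])
```

The hinting Shawn mentions ("color this box", "give it a legend") lives right in the instruction text, so iterating on the diagram is just iterating on the prompt.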
But coming back to the modality question, I'll just map out the rough sequence. First, learn how to interact with the GPT-3 API and prompt it. Second, learn about prompt tooling, the retrieval augmented generation stuff we already talked about. Third, learn about code generation. Fourth, image generation. Fifth, speech to text, so that's kind of audio.
Then you can also go the other way, text to speech, and that's really fun. I've had a phone call with an AI, and you can go back and forth. There's this amazing company, vocode.dev. And the latencies are so good that you can have a conversation with these things. It's amazing.
They do speech-to-text and text-to-speech (and then, obviously, text-to-text in between) to generate all these conversations. It's amazing.
Then number six is fine-tuning and running open-source models (because everyone cares about that). Then lastly is dabbling in agents. That's the seven-day sequence I've kind of mapped out there.
Chris: Is that a time to plug your course? Don't you have something? Is that what that course takes the shape of?
Shawn: Yeah. Yeah. I just think this is how I would introduce it to my friends, like, "Hey. If you don't have that much time, here are the main things you need to know exist because these are all extremely good, well proven out, available to everybody, and you are just choosing to ignore them."
Yeah, that's the email course I'm working on at Latent Space. To me, it's a free course. I'm not going to make money off of it. I just think everyone asks me for, like, "Where do I start?" and this is my one answer. [Laughter]
Chris: And is that for somebody who is going to--? So, you're advocating for this term "AI Engineer." I think you're going to win that battle, probably. It just rolls off the tongue pretty nicely.
Chris: But also, does it mean I have to be that or can I be both? Can I just still be a Web developer and also just take all these and just be both or does it pay to specialize or whatever?
Shawn: No. No, absolutely not. Obviously, everything is a spectrum, and you can assume multiple hats in your life. I just think someone needs to lay out the syllabus or map out the fields as much as it's established. This is my high-confidence map of things that I know to be valuable and doable and you should know.
There are a lot more experimental things that I'm not sure about that I'm not going to put in there. But the stuff that I make mandatory here is stuff that you should know. I think everyone needs that approach in terms of learning anything, like, what is a core set of skills that everybody must know, even if you're not dabbling in it day-to-day professionally?
Shawn: Yeah. Yeah. And I'll mention this intersection... You're asking me about this intersection of the AI engineer and software engineer. I've actually started pulling in some of our front-end friends. I think you might be familiar with Amelia Wattenberger.
Dave: Mm-hmm. Yeah.
Shawn: She's been spending the past couple of years at GitHub on Copilot. She's been doing some really interesting demos on the side.
I actually organized the first AI UX meetup in SF, and I think there's a huge amount of potential for front-end engineers in particular to embrace AI because the capabilities are there. There's actually not that much of a heavy back-end component. It's really about making it accessible to the end user and creating new human interfaces to interface with the models.
Chris: Right! She and Maggie Appleton had posts that I loved because we talked all this time about prompts. I even sensed some pushback from you, like, "Ah, prompts is last year," kind of thing.
Chris: I also don't love them. It's not that I couldn't learn it, but it seems so early that I have to craft a prompt. Oh, man. I don't know. That's weird. I'm more interested in tools where, even if a prompt is what goes to the API behind the scenes, I don't have to write it. That's how Copilot works for me. The prompt is just whatever is happening around where my cursor is.
Amelia had this post of, like, a writing tool that there was no prompting. There's just writing. Then wherever your cursor was, it was almost like analyzing the text where your cursor was, and it was saying, "Oh, this piece of writing, I'm going to click the happy face. Could you make it happier somehow?" or "Could you elaborate a little bit?"
I'm assuming, behind the scenes, it takes that chunk of text, the paragraph that's around where the cursor is, and then you click the happy face, and what's happening behind the scenes is it's adding additional prompt information that's like, "Take that piece of text and make it sound happier." Whatever kind of crap you have to write into the prompt. Then something comes back from the API that you surface, like, "How about this?" Then you're like, "Sure. Sounds good. I accept."
That's awesome! That's a great user experience around editing a text document. Hers was, I think, hypothetical. I wouldn't doubt that she's built it or somebody has something like it.
Shawn: She's demo'd it.
Shawn: She's demo'd it. Yeah, so both Maggie and Amelia, both of them spoke at the meetup. Maggie was a core organizer. Yeah, it all came out of this AI UX meetup that we organized. The full video is on YouTube.
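Chris's guess at what's behind the happy-face button can be sketched in a few lines: grab the paragraph around the cursor and wrap it in a hidden instruction before sending it to the model. The instruction text here is a stand-in, not Amelia's actual prompt:

```python
def paragraph_at(doc, cursor):
    """Return the blank-line-delimited paragraph containing cursor."""
    start = doc.rfind("\n\n", 0, cursor)
    start = 0 if start == -1 else start + 2
    end = doc.find("\n\n", cursor)
    end = len(doc) if end == -1 else end
    return doc[start:end]

def happier_prompt(doc, cursor):
    # The user never sees this prompt; they only clicked a happy face.
    chunk = paragraph_at(doc, cursor)
    return f"Rewrite this to sound happier, keeping its meaning:\n\n{chunk}"

doc = "Intro paragraph.\n\nThe launch slipped again and morale is low.\n\nOutro."
prompt = happier_prompt(doc, doc.index("morale"))
```

The point of the UX is exactly what Chris describes: the prompt engineering happens behind the scenes, scoped to the text under the cursor.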
ChatGPT itself, OpenAI has been repeatedly calling it basically a UX innovation, like the models were there. Someone needed to come along and shape it into an app. Suddenly, that blows up because now people can actually use it.
I strongly encourage designers and UX people, UI people, to embrace this because there's so much opportunity for you to unlock the last mile of value and show the creative potential of the machine. Right now, we do a lot of forms. A lot of work is forms and data tables and displays. Now we can actually kind of break out of the paradigm and create new things. Including, by the way, ephemeral UI, UI that is generated on the fly based on the use case and context because it can write code. Then you can run it. So, why not generate tables on the fly?
Chris: That blows my mind a little bit. You're doing some rote task and somehow it understands what you're doing and is like, "Ooh, I'm going to actually write a quick little program to help you do that task." [Laughter] Then you can do it and then just throw it away because you're done doing that.
Shawn: Right. Yeah, so the ability to generate throwaway code is actually much better or the activation energy is much lower. That's fantastic if you know what you're doing. It's very easy to burn yourself if you don't know what you're doing and you're just kind of--
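The ephemeral, throwaway-code idea in miniature: the model returns a small program, you execute it for one task, then discard it. In this sketch the "model output" is canned so it runs standalone; a real app would get that string from an API call, and the sandboxing caveat Shawn raises is real:

```python
# Simulated model output: a one-off table renderer. In production this
# string would come back from an LLM, and you would sandbox it properly
# rather than exec-ing it directly.
generated = """
def render_table(rows):
    header = " | ".join(rows[0])
    lines = [header, "-" * len(header)]
    lines += [" | ".join(map(str, r)) for r in rows[1:]]
    return "\\n".join(lines)
"""

namespace = {}
exec(generated, namespace)  # only exec code you trust or have sandboxed
table = namespace["render_table"]([("name", "city"), ("Dave", "Austin")])
```

Once the task is done, the function is thrown away, which is the low-activation-energy upside, and also why "it's easy to burn yourself" if you run generated code without understanding it.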
Chris: Is that why you're saying this isn't replacing anybody's jobs? You still need this high level of understanding of what's going on. Yeah.
Shawn: The phrase I often go with is that everyone just got a junior developer. It's up to you to supervise them and put them to work.
Shawn: If you don't know how to put them to work, you're not going to benefit.
Chris: The dark side being let's say it does make you twice as productive. We know about capitalism and how it works.
Chris: Your boss isn't going to be like, "Well, take Thursday and Friday off then. You're crushing it." They're going to say, "Well, we expect you to be that productive all the time then." I mean, to me that does have a ring of darkness that I don't love, but what are you going to do?
Shawn: Yeah. Yeah, well, then this maybe broadens out the general philosophy of software. There are always more tickets than we can ever handle. Right? [Laughter] It helps us handle more tickets.
By the way, there are many startups working on the GitHub issue-to-PR flow: create an issue in Linear or GitHub and it auto-generates a PR.
Chris: Oh, my God. It's like, "I've got an idea how to fix that particular bug."
Dave: I've also heard the opposite, like Stack Overflow... some maintainers are just burdened by all these AI garbage PRs.
Shawn: SEO content as well. People are going to be spammed by all sorts of garbage, so it creates new problems too.
Dave: Creating spam, yeah.
Dave: I, unfortunately, have a hard stop, so we have to put an end to this conversation. I super apologize because I'm enjoying it.
Chris: That's alright, Dave.
Dave: Shawn, thanks for coming on. There's a lot of hype around AI and it's really easy to be a hater, I guess. [Laughter] Is it crypto 2.0? But I think you've done a really great job demystifying some of the sort of stuff about it, so I think that's really cool.
Shawn: Thanks so much.
Dave: We'll have to chat some more.
Chris: Yeah. This went way too fast. Yeah.
Dave: I've got 100 more questions here, so we'll have to do it again. But for those who aren't following you and giving you money, how can they do that?
Shawn: I'm still on Twitter for now, or X, as they call it. I don't need your money. I guess if you are interested in learning more about AI engineering, the site is latent.space. I'm very into the custom domains these days. The conference I'm doing is AI.Engineer, which is in October. We're gathering all the AI engineers. Come join in.
Dave: Very cool. All right, well, thank you for coming on the show. Thank you, dear listener, for downloading this in your podcatcher of choice. Be sure to star, heart, favorite it up. That's how people find out about the show.
If you built something cool with AI, let us know. That would be cool to know.
Then I guess we're on X and Mastodon or whatever. But anyway, join us in the D-d-d-d-discord. Of course, that's where all the cool stuff happens at patreon.com/shoptalkshow. Chris, do you got anything else you'd like to say?
Chris: Whoa! ShopTalkShow.com.