Be Water, Ubiquity over Ownership
In this episode of Hallway Chat, Nabeel and Fraser unpack the emerging power of tool use, reasoning traces, and code execution in modern models. They reflect on how startups can stay ahead in a market that feels like it changes every 90 days, and what product choices separate the winners from the also-rans.
The conversation spans Conductor’s clever planning/execution design, Notion’s everywhere-at-once API strategy, and the subtle ways models persuade us—sometimes too effectively. Along the way, they explore how founders should handle VCs circling during competitive raises, why misinformation concerns may be shifting, and why Fraser still turns to Gemini for certain jobs despite its quirks.
It’s a candid look at the red-ocean dynamics of AI, the importance of owning your narrative, and how the future of work may mean managing a team of agents as naturally as we once managed people.
- (00:00) - Cold Open: Not Owning the UI
- (00:28) - Welcome Back + Era Check
- (01:26) - Tool Use Crosses the Rubicon
- (02:24) - Code + Browse + Reasoning = The New Triumvirate
- (04:47) - Steerability & Plan Mode UX
- (05:46) - Why Code-Writing Hasn't Infiltrated Apps (Yet)
- (07:55) - Be Water: Notion's Everywhere Strategy
- (08:55) - Managing Many Agents Is... Tiring
- (15:41) - Model Divergence & The Need for a Picker
- (20:19) - Red-Ocean Reality & Controlling the Narrative
Cold Open: Not Owning the UI
[00:00:00]
Fraser Kelton: I don't know. My, my big issue was like some, somebody on Twitter, we don't need a name, was like, this is a you problem, not a model problem. And it, it's clearly a, a model, like it's clearly a product problem.
Is, yeah, it's a product problem. Like I, I, I think that, yeah, you have a broad, you have the broadest, most horizontal product since Google, and it's like, imagine if Google came out with every new generation of search and they're like, nope, here, here's how you have to adapt your search techniques to make it work well.
It just, it's not. It's not right.
Welcome Back + Era Check
Nabeel Hyatt: Hello everybody. Welcome to Hallway Chat. I'm Nabeel.
Fraser Kelton: I'm Fraser. Welcome back.
Nabeel Hyatt: I was trying to reflect: every three to six months has been almost a new era. We get these new capabilities from the models.
Those start to filter out into the world, and it almost feels like different startups start doing better or worse or start releasing new form factors. People come up with something, reasoning or deep research, or coding gets good enough, and that shifts the market again, [00:01:00] which I keep having conversations with CEOs about, how incredibly exhausting that is to try and reorientate yourself to.
But my thought was, if we think about GPT-5 and if we think about Sonnet 3.7, are there things that we have seen in the changing of the surface area of these models over the past three months that would indicate what is gonna work or not in the future? Who are the winners and losers of this inning of the game?
Tool Use Crosses the Rubicon
Nabeel Hyatt: The one that came to mind for me is that, you know, starting with Claude, and GPT-5 is also, they're very good at tool use. We just got models that naturally crossed some Rubicon threshold at tool use.
And so you're seeing a lot of broad, I think Andrew Mason from Descript called it open world, stuff like Descript and the new Airtable and where it looks like Notion is rearchitecting right now.
The reason that you get Airtable suddenly being open world [00:02:00] and interesting is because obviously Airtable has lots of different internal tools, this sort function and that, and now finally AI is good enough at figuring out what part of the tool to use when you type into the chatbot. I'll stop there.
Like if I'm, if I'm prompting you at this era and saying tool use has suddenly crossed some threshold, what does that bring to mind?
Code + Browse + Reasoning = The New Triumvirate
Fraser Kelton: The place where my, where my head went was that the, the tools which seem profound and new here in the past couple of quarters are the ability to write and execute code as well as the ability to browse the web in reasoning traces. And so I think in the Airtable example, it's going and doing tasks within Airtable with some of the data within Airtable,
with getting web information to fill out some of the Airtables, et cetera, et cetera, et cetera, which feels novel and new. And I think that all of the [00:03:00] glimmers of delight and freshness and newness over the past couple of months have come from the intersection of that, where it's, oh my goodness.
It, it clearly just wrote code to be able to figure this out. And some of the times it went and it got information from the web in order to include variables into the code that it wrote. So for, for the company that you mentioned,
yeah. Like it can, it can go and get the information that's required and it can synthesize it, and reason on top of the information that it's gotten, and then go out and grind again on that.
I've been researching and reading about these advances in a certain field, and I used a new alpha product to gather that research, and it was a, a table.
It was a table that probably would've taken me five hours. I'll show it to you now. Yeah. It would've been five hours of my time to have built. This is it. I didn't do, I didn't touch anything here. I gave it. I gave it a prompt, it came back and asked a couple of clarifying [00:04:00] questions and made some assumptions that I could correct.
Wow. And again, there's no way that LLMs a year ago would've done this.
And this is clearly a function of tool use.
And, you know, built in with reasoning, they are doing one experiment in terms of how to add the scaffolding along the way to make sure that the tool use that it runs off and does is directionally useful, because I think that that's the challenge that we've seen in these over the past couple of quarters.
Right. They are doing longer horizon tasks and they are doing more sophisticated tasks, but they've historically been doing it with, with crass assumptions. They used to save you five hours of work, but it would make a bad assumption on step two, and then, if it's five hours and you start by deviating by a degree, you end up in a very different place.
Steerability & Plan Mode UX
Fraser Kelton: A lot of the experimentation right now is what's the right user experience to ensure that the user is able to steer it in the right direction? Steer it as in make sure that the assumptions that it's setting off to do are [00:05:00] correct.
Nabeel Hyatt: Yeah. You're saying one of the normative behaviors that's starting right now is we got plan mode, right? This idea that maybe I should ask you for directions first, and maybe I should make a task list and then confirm the project plan before I go run away and do a whole bunch of work.
You've brought up something that I hadn't really thought about, but it's very true. And maybe the answer is we have multiple things still bubbling up that are affecting startups. It's tool use, it's certainly browsing the web.
And code has certainly not worked its way through all this. The fact that it can write good code is affecting the code domain, sure. But code is how all software is written.
Why Code-Writing Hasn't Infiltrated Apps (Yet)
Nabeel Hyatt: Anything with the chatbot, sometimes the answer is go write code. And yet you almost never see that right now, for sure, unless they're willing to pop up Cursor. And so you hear these, oh, I keep my journal in Cursor kind of [00:06:00] stories.
That's because it's better written as code. Right. But the AI journaling company that starts up tomorrow doesn't think about architecting their system that way, because they're still stuck in a paradigm of a couple of years ago. Right.
Fraser Kelton: This comes back to what you said is the world is moving so fast still, right?
Is that I, I think we're seeing a lot of companies who are delivering products off of the wave that arrived 18 months ago. Right. Which is like three cycles. Yeah, three cycles late in terms of what's coming. I'll tell you, I had an amazingly profound and great experience within Claude along these lines. So I asked it a fairly complex,
for me, financial question that's not worth getting into. And it, it wrote code to calculate it. And then the, the surprising thing for me, which I hadn't seen before, was that it then checked its work.
Mm-hmm. And
it wrote code, it calculated it, it checked its work and it said that there was an error.
And so then the next step that it automatically did [00:07:00] within the, the Claude experience was it said correcting for the mistake. And it then wrote new code to fix it. I thought, oh wow, that's this is, this is now another step from where we were even two months ago.
Nabeel Hyatt: Yeah, you're not seeing the next education AI consumer app to teach your kids stuff deciding to articulate itself in code writing. They're still very much in chatbot land. To pick a random, yeah, a random category.
The company I think has got a good handle on this chapter, or at least feels like it does, is Notion, in that they can iterate on the UI layer and the UX layer, but also you have to assume that for other people, building their own interfaces to this thing is trivially easy 'cause you can write code. And so it's very interesting to me that Notion is both fighting for you to want to use their website, which makes sense.
Be Water: Notion's Everywhere Strategy
Nabeel Hyatt: Right. You wanna come there and use their website. And now you'll notice if you open up Claude and you [00:08:00] open up ChatGPT, there's a little Notion button there as well, right? So unlike Slack, which has decided to take this walled garden, I'm, I'm so afraid approach, and I want this to be the only UX you ever interact with,
Notion is taking the approach of, listen. Right. Ultimately, we're just gonna be fluid. We're just gonna be water. Like we need you to be everywhere. And so we're gonna try and be everywhere you're gonna be because we don't know what the future's gonna be. And so MCP, API, they actually just updated their API for the first time to have, you know, new functionality.
Let's be everywhere. That feels like the better plan now. I don't think you can presuppose that you are going to invent the perfect UX for all users, especially if you have something that's truly agentic and emergent, so let them make their own stuff.
Fraser Kelton: I think so. I, I still feel like we, we, we are under appreciating the degree to which jobs are going to change. this is so crazy. It's so amazing. I love it.
Managing Many Agents Is... Tiring
Nabeel Hyatt: I had a funny thought the other day. For the first time I was trying to get Conductor [00:09:00] really humming. I'm gonna really focus, right? And so I had six Claudes running. In one project I'm working on, I've got six repositories built and they're all humming, they're asking for help, call it every 60 seconds.
I am context switching. I was so tired. I needed a nap after five minutes of doing this. There were five deeply disparate projects. Oh, now I'm doing this UI change.
Ooh, now I'm doing this refactor. Ooh, now I'm doing this database migrate. I was like, oh, for all of this conversation about how we are ruining our children as the TikTok generation, maybe actually what we're doing is preparing their brains for being able to parallel
Fraser Kelton: a whole bunch of agents at the same time, actually.
So that's like the really optimistic thing. We, maybe, we have ruined any sense of, of being able to focus and we'll just adapt all of our tools to that. Yeah. To meet us where we already are. Yeah. One thing I wouldn't mind talking about today, I, I actually have a VC question of the week for you.
Mm-hmm. And we can get to that. But the other thing is, [00:10:00] GPT-5 launches, and it's neither here nor there to discuss, but I think the one thing that was worth discussing was somebody on Twitter. Somebody was like, oh, there's a lot of, if you actually now look at the people who know how to be model whisperers, there's like really awesome ways to interact with this model.
You just, it's a skill issue. And you're just like, no, that's a product issue.
Mm-hmm.
And I, I, I thought that's, that's absolutely correct. And if you don't think of it that way, it's clearly wrong. It's, I don't even think it's a point of opinion.
Nabeel Hyatt: Look, I think when it was GPT-1, 2, even the beginning of ChatGPT, when there's a new malformed, interesting thing and we don't know what it is, then I get
that you should try and find its edges, and it's wonderful. When Midjourney came out, I thought understanding the nuances of that model was an interesting thing to do.
It's GPT-5. And, and that's five with an arbitrary naming that's worth throwing out, 4.5, 4.0, [00:11:00] whatever. We, we are deep into product management and org chart land of producing GPT-5, which was very clearly focused on the market for
ease of use, right? They took away the thinking button. This is a productized version of the model for consumers. That's right. And so for you to come out with it, and then for OpenAI themselves, a few weeks later, to release a guide that says things like, I'll just read the quote here for a sec:
"While powerful, prompting with GPT-5 can differ from other models. Here are tips to get the most out of it via the API or your coding tools." And then it goes on to list a bunch of things, half of which are design patterns of use of GPT or any AI model over the last two years.
Avoid overly firm language. At this point you should just understand how your user uses your product. In mobile, in the beginning of mobile, at some point there was [00:12:00] suddenly a hamburger menu.
There are three lines, and that meant if you click the three lines, a little dropdown comes down and there's menus there. Now you can decide as a company to not use the hamburger menu if you don't want, but understand people are gonna be confused if you don't.
That's it. And so, yeah, it's a bad product choice unless there was some specific reason that they felt like they needed to change everyone's behavior again. What, what is the reason you would wanna change everyone's behavior again?
Fraser Kelton: There, there's none. I mean, the, the reason that you would do it is because you think you can accelerate growth, you can tap into a larger market, but none of those, none of those are problems for them at this point.
Nabeel Hyatt: I guess the only, the cynical version of it would be you would accept that everyone will likely change their behavior to train to your new design patterns, which would make them suck at using the other models. That's maybe giving too much credit. No,
Fraser Kelton: that would just be a bad idea as well.
My, my big issue was some, somebody [00:13:00] on Twitter, we don't need a name, was like, this is a you problem, not a model problem. And I think that, yeah, you have a broad, you have the broadest, most horizontal product since Google, and it's like, imagine if Google came out with every new generation of search and they're like, nope, here, here's how you have to adapt your search techniques to make it work well.
It just, it's not, it's not right.
Nabeel Hyatt: The thing that this topic of giving time to adapt reminds me of is, I, I, I always felt like
the worst generation to have to go through the Facebook, Twitter stuff was millennials, because it hit them, yeah, at a very, very vulnerable time in their lives. And then the other one was, you know, older folks who were hopping on at that time period who just didn't understand how these things were working and the way that they were manipulating you and so on and so forth.
They were the most affected by early social media. But when I look at young folks today, it's different. Like, they grew up with it. Like as a whole [00:14:00] generation, they're just inoculated. Their antibodies have been built up to what this media ecosystem is doing. I think it's very similar to the way my father used to talk about the first generation of people who had a television in their home versus the second or the third. And I think there's a little bit of that going on here.
Like we are the first contact with this new crazy alien, and so we're gonna bear the brunt of having to navigate. But I feel good long-term that humans habituate quickly and adapt quickly.
I guess the positive, or my hope here, is that most people, ideally, are using the product regularly enough that they'll start to pick up those patterns. But of course, if you're talking about billions of people on the planet, some of them are gonna be more casual users. They're just not gonna understand why it did the thing when they dropped in their medical records one time and it's the first time they use ChatGPT that year.
This is where I really liked reasoning and I really wish they didn't bury reasoning. [00:15:00] Most of the time reasoning is a fun parlor trick that came outta DeepSeek, where it's, ooh, it looks like it's thinking, and there's a whole bunch of words, and it makes it feel more busy and smarter somehow, even if it isn't. But every once in a while when I'm unsure about the answer, in reasoning you could pop open and see what were the words that got it to believe what it just said to you, and then find the logical fallacies or why it got stuck or so on and so forth.
I do this all the time with, with Gemini because it's so verbose in its reasoning, and you can catch it in its logical fallacies all the time, and that's super helpful.
Fraser Kelton: I have a couple of thoughts there, but the most interesting one is, is you tuck in that you're using Gemini now. That, that feels like big news. Why, why Gemini?
Model Divergence & The Need for a Picker
Nabeel Hyatt: Oh, look, I bounce around all these models all the time. I see the Gemini model is very good still at long form summarization. So if I have a, a big, huge, long thing and I want it to very quickly and very cheaply summarize that stuff and not lose [00:16:00] bits, it's just really good at that.
It's very good at very execution oriented data tasks where it's writing really fast and you're trying to be in a fast loop to do work like analysis of a PDF or a spreadsheet. It's a low IQ workhorse, low, low IQ only compared to frontier models.
The common conception is that these models are gonna consolidate over time. And I really think these models are actually gonna diverge over time. They're gonna feel, um, they'll be half good at everything.
Yeah. But you can already feel it. Like Claude is gonna continue to be very, very, very good at, at coding, for good reasons. Everyone will try to compete with that, but it's proven to be a hard thing to hit exactly the same way. And the tree has been planted for OpenAI on ChatGPT. It's hard not to feed the tree, right?
Like that, that thing of a consumer, chatty, searchy interface is their root, and so I imagine that they will feed that tree and will try to be in front on that. I think [00:17:00] there's multiple 10 billion, a hundred billion dollar companies inside of each of these models, but I happen to think that they'll be more divergent, which will mean, for most consumers, for enterprises,
that's great. You pick the model for the job. For consumers or casual users, I think the real problem is no one's gonna pay 20 bucks a month for six models for doing the six things. Interesting.
We should, we gotta find a way to solve that.
Fraser Kelton: Yeah. Big time experiments in flight and, and hopefully lots of learning from the experiments. Yeah. Listen, I, I love that Conductor thing that you started with, which is Opus for planning, Sonnet for execution. And there's the, there's just gonna be more and more and more of that.
Nabeel Hyatt: Yep. You said you had a VC question.
Fraser Kelton: I do, and I think there's two different ways to take it, but it's the same situation. You, you're an investor. I'm an investor in a company, we're an investor in a company. That company gets contacted by other investors outside of fundraising. Mm-hmm. And my spidey sense, for reasons that I won't [00:18:00] get into here, suggests that they're actually contacting them because their competitor's fundraising and they want to check the diligence box before they write the check.
Of course, yes. And so I have two questions here. One is. What does an entrepreneur do in that situation? They get an unsolicited inbound. This is a moment in time where all the headlines are also saying that people are getting, preempted with unsolicited term sheets. And so it, it's plausibly something that could, could materialize, but perhaps they're gonna leak all their IP to a group who's about to go sit on the board of a competitor.
And then my other question related to this is, I remember as a founder how terrible it felt when that happened to you and you're like, oh, listen, I think that you're just meeting me because you want to go invest in the competitor. And I do my absolute best to be a great citizen of the startup community.
But when you're doing diligence into a market, you have to meet with [00:19:00] competitors and figure out who it is that you want to invest into, and how do you navigate that When you, when you are meeting with competitors, you may end up investing in the group that you met second after spending a considerable amount of time with one group.
What's the respectable way to navigate that?
Nabeel Hyatt: I think there are two stamps of founders with two very different playbooks, and I tend to think that it should depend on who the startup is. Think of it as blue ocean or red ocean. If you are doing something really net new and you have loose adjacent startups that are bouncing around, but it's not war,
you're, you're doing a really interesting new thing. Maybe your competitors have been around for 40 years, but there's nobody else really venture sized that's competing with you. Then I generally think you stay relatively heads down and you don't take inbound. You just do some coffee meetings to keep relationships, because you probably have a technology or a belief that you need to eventually soft sell to the market to have them understand.
And [00:20:00] so you do a little bit of soft outbound stuff to do long tail, almost like biz dev oriented selling to a handful of VCs. Figure out who you like and who you want, but it's pretty low. I, I know this is like the, the, the kind of zeitgeisty thing to talk about right now, is either the people that are raising all of the time or the kind of like,
Red-Ocean Reality & Controlling the Narrative
Nabeel Hyatt: I'm so happy to never answer another VC email again and look at my inbox, all these people inbound and screw all of them, and I'm just gonna build and so on and so forth. It depends how many competitors are telling your story for you. My analogy is remembering back to the, like, Uber, Lyft, DoorDash, Postmates phase of the world. And in a world where you have two or three competitors that look like you and smell like you, and let's just be honest, like, you are in a red ocean war, there's blood everywhere in the water,
it is to your detriment to let other people tell your story all week long. So if you don't pick up that phone and you don't talk to that [00:21:00] VC, by the way, they're just part of the ecosystem.
It's not just that you're not talking to that VC. That VC talks to potential VPs of engineering, potential VPs of sales, they're at dinner parties with everybody in the industry. Potential customers, all of these people are part of the same broad community. And so you're just ceding narrative to the enemy.
And if you believe you can cede narrative to the enemy for a year and just wake up one day and walk out and be, yep, we grew too, give me money, that's a naive view of the world. So at Postmates, we got counter programmed like crazy by DoorDash. They would run around every three months and tell every VC that Postmates was about to run outta money.
They would do it all the time. And I'm not saying be dishonest, but they, they would foment doubt and fear and so on and so forth. They explain why they're awesome and why they're gonna win in the end. And by the way, Postmates did some, did some version of the same thing. Because if you're not out there having a conversation about why you should [00:22:00] exist, it, it's not an introvert's game.
If you're gonna be in a red ocean competitive market, you can't just be in a hole and do the work. You're just gonna lose control of your narrative. And I know that's maybe not what founders want to hear, but I think it's the truth. So my, my answer to that founder is, of course it's about a competitor.
It's probably some VC doing competitive due diligence. I remember specifically getting a call from a friend of mine who was a VC who was about to invest in a directly competitive company with a Spark company and wanted to head check, like, hey, I'm about to write this $50 million check into this company, and I kind of like it.
It's a hundred million dollar round. If your guy's raising, so he literally did the thing. If your guy's raising, I wanna know if we could take a look. I haven't seen the data room. It sounds like you guys aren't raising right now, you know, this kind of thing. Over the course of 30 minutes, 45 minutes,
like, I managed to introduce doubt. Like, great, you should put that check together. But like, have you thought about this or that? We dug in on it. If I decide [00:23:00] that I can't talk to that person because they're leaning into the enemy, I don't get that opportunity to make our case.
Fraser Kelton: That makes sense. And what about advice for me? I go and I spend a good amount of time with a founder. Great relationship. I then have calls with his competitor and I, I, yeah, hypothetically I invest in that competitor. Yeah. What, what's the, what's the way to handle the, the relationship with the first one?
Nabeel Hyatt: The nature of things might be that the personality match, fund size, round, just like so many different dynamics, lead you to not invest in one company and maybe invest in another company later. I've very, very rarely had any founder over the years have a problem with that, as long as it's,
look, there's the, there's the vengeful human who just is angry at everybody no matter what they do. Yeah. Like, that personality type was gonna be angry at you anyway, right? Yeah. But I actually think most people are relatively reasonable about this, which is to say that nine times outta 10, if a founder is calling out an investor [00:24:00] for feeling like they were fishing for information and then invested directly in the competitor,
the founder was probably right and that VC was probably acting unethically, which happens. So, which I know you're not doing, you try to go forward with authenticity. We try to work with these founders 'cause we care about them and, and yeah, of course there's lots of humans we interact with every week.
I got on the phone with a founder two or three weeks ago, and within two, the first two sentences, I actually did feel like I had to tell him,
just so you know, I'm interested in this category. Just so you know, I spoke to one of your competitors a week ago. I want them to know that. I want them to know that I had that conversation, for sure, that they may end up going and doing that investment, and I don't want them to get the wrong idea.
Right.
Fraser Kelton: Yep. So, so much of life, you realize it's just treating people with respect and being open and honest, right? I can, I can absolutely imagine that conversation. Hey, listen, I like you a lot, did the diligence, but went with this decision for these reasons. Everything will [00:25:00] remain confidential that I learned, et cetera, et cetera, et cetera.
Yep. Well,
Nabeel Hyatt: maybe let's wrap up.
Fraser Kelton: I
Nabeel Hyatt: think it's a good place to stop. Good chat. I kind of needed that. Thank you. Take care, man.