From AI to Action: Building Intelligent Agents | Part 2
In Between the Tech – Part 2
AI Powered Engineering
Guest: Nirman Dave, CEO of Zams
Host:
How will engineers utilize AI within their designs? In this episode exclusively for subscribers, Nirman Dave, CEO at Zams, provides his take on the evolving relationship between humans and machines.
What should the content writers and the data scientists of today be doing to position themselves? And then, what about the next generation of people?
Nirman Dave:
I think the fundamental idea is: how do you define what the end goal, or what success, is? Because now we have these tools that are very, very fast. So you don't have to be so deep in the hands-on work, but you have to be much more mature in thinking about the future: how do I think about what my end goal is? So typically, when you think about it, let's take a marketing role, where someone sets the end goal and says, we want X visitors on the website, or we want X conversions on our website. And you have a bunch of content writers that would go out and write content. Then you have a bunch of other people that would go out and build other sources of content, and then you measure those numbers up.
But fundamentally, as that function changes, the question to ask is: if I'm a writer today, what is it that I should be writing? What's my end goal? And then you prompt the agent to do that. Now you can have the agent suggest and recommend to you what to write, but ultimately, you knowing your audience and why that audience cares is fundamentally more important. So it's very interesting, because what these agents enable you to do is improve the speed of execution, and that naturally forces you to focus on the fundamentals. Why am I doing this? Why is this important for our audience? And so on, so forth. And especially in any sort of role, even if you think about the data science role, where you can fundamentally prompt these agents to execute certain tasks, build certain types of models to give you certain types of outcomes, your question is, should I do it this way? And would the audience care? For example, let's say you have data from 50 different countries, and in each country, you have data about which type of user is converting to a paid plan. Now, we want to build a prediction about which new user is going to convert to a paid plan. As a data scientist, the choices you have to make are: should I build one model that encompasses all of the data? Should I build one model for each separate country? If I do that, should I build one model for each separate demographic in that country? That's the fundamental process you are defining as a data scientist. And then you can have the agent go ahead and execute that. Maybe the agent can give you a recommendation on what to do as well. But the idea is you go to the fundamentals and say, what does this business really care about? What do I really care about in terms of the outcome, and where do I want to direct these agents? So there's a big transformation from being executors to being directors.
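The global-model-versus-per-country-model decision can be sketched in a few lines. This is a hypothetical illustration, not Zams code: the data, field names, and the "model" (a simple conversion-rate baseline) are all stand-ins for whatever a real pipeline would use, but the structural choice the speaker describes is the same.

```python
# Hypothetical sketch: one global model vs. one model per country
# for predicting paid-plan conversion. The "model" here is just a
# conversion-rate baseline so the segmentation choice stays visible.
from collections import defaultdict

# Toy training data: (country, converted_to_paid)
rows = [
    ("US", 1), ("US", 0), ("US", 1), ("US", 1),
    ("IN", 0), ("IN", 0), ("IN", 1), ("IN", 0),
]

def fit_global(data):
    """One model over all countries: the overall conversion rate."""
    return sum(y for _, y in data) / len(data)

def fit_per_country(data):
    """One model per country: the conversion rate within each country."""
    counts = defaultdict(lambda: [0, 0])  # country -> [conversions, total]
    for country, y in data:
        counts[country][0] += y
        counts[country][1] += 1
    return {c: conv / total for c, (conv, total) in counts.items()}

global_model = fit_global(rows)        # one number for everyone: 0.5
country_model = fit_per_country(rows)  # {"US": 0.75, "IN": 0.25}

# The director-level question is which segmentation serves the business;
# the agent can then execute (or recommend) either one.
print(f"global: {global_model:.2f}")
print(f"per-country: {country_model}")
```

The per-country split reveals very different conversion behavior that the global average hides, which is exactly the kind of trade-off the data scientist, as director, decides on before handing execution to the agent.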
So, every role in my opinion, will be a director role in the future because you're going to be sitting and thinking, what is fundamentally important to my audience as opposed to sitting there and saying, how do I just do my thing and then get out? So, I think that would be a big fundamental shift in the future.
Host:
So then, how will people in their new roles know if the tools, or the agents, are taking them in the right direction?
Nirman Dave:
That's where the humans come in, because you are directing. You're saying, this is the right direction, when the agent doesn't know it. You can say, no, no, no, you're not going in the right direction; go here. And so I think that's where the fundamental nature of these jobs is going to land: the agents will go ahead and execute, and you want to make sure you're giving them direction. The agent has the velocity vector; you really just give it the directional vector of where it needs to go.
Host:
How can engineers determine if they’re using a quality platform?
Nirman Dave:
So, every agent has a different way to identify success, and the key challenge as of today, April 2025, is that we don't know the best way to evaluate an agent. There are a lot of different ways. There are metrics like faithfulness that evaluate the accuracy of the agent. Is it making accurate decisions? Is it thinking about the right kind of assumptions? Then there are metrics like contextual relevancy: is it giving the right kind of responses? For example, if it looks at an invoice and it goes to the vendor and writes an essay about rainbows and butterflies, it's contextually irrelevant, because you want to go and negotiate. So, is it contextually relevant? Is it getting the right kind of responses? Is it factually accurate? There are different metrics to do this, and there is no one metric that fits all for AI agents. And so, for platforms like ours, what we do at Zams ... Zams is a platform to build these AI agents for enterprise scale and really nuanced operations.
And so, because of that, we have really fundamentally thought through every single type of AI agent that you could be building, to define some templates that say: hey, look, if you have this sort of AI agent, this is the best way to evaluate it, and these are the metrics you want to include. That's what we call the Zams Precision Index. And as the industry evolves, there are lots of research scientists in the world who are thinking about this problem more fundamentally, and as they come up with different metrics, we keep incorporating them, so you, as an audience, really have that perspective. But when it comes to evaluating these models, the core is, again, that you, as a director, need to define: what is success for my agent? Do I want my agent to be factually very correct? Then maybe the faithfulness score is the way to go. Do I want my agent to be contextually super relevant? Then contextual relevancy is the way to go. Do I want my agent to have fewer incorrect items? Maybe I should focus on contextual precision and recall as my metrics.
So there are all of these different metrics, but you define what you want your agent's success to be. And we've seen that be very different in different industries. We work with a manufacturing company that has built an agent that knows all the information about all of its equipment and gives their factory workers the ability to quickly answer questions and troubleshoot when a piece of equipment fails. For them, accuracy is significantly important, so they really optimize the faithfulness score. On the other end, when we talk about the invoice-processing agents, for them contextual relevancy is very important. They don't need to know to the cent exactly how much they want to save, but they really need to know what the motivation is for the supplier to give a bigger discount. So they really take that angle. And so, these metrics really are defined based on the type of direction you want to take.
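The contextual precision and recall mentioned above can be sketched with their plain set-based definitions. This is a hedged illustration: evaluation frameworks (including Zams' own Precision Index) each define these metrics their own way, and the document IDs below are hypothetical.

```python
# Hedged sketch of contextual precision / recall for the context an
# agent retrieved. Set-based textbook definitions; real evaluation
# frameworks vary in the exact formulas. Document IDs are made up.

def contextual_precision(retrieved, relevant):
    """Fraction of retrieved context items that were actually relevant."""
    if not retrieved:
        return 0.0
    return len(set(retrieved) & set(relevant)) / len(retrieved)

def contextual_recall(retrieved, relevant):
    """Fraction of the relevant items the agent actually retrieved."""
    if not relevant:
        return 0.0
    return len(set(retrieved) & set(relevant)) / len(relevant)

# An invoice-negotiation agent pulled four context chunks; one of them
# (the rainbows-and-butterflies essay) was irrelevant to negotiating.
retrieved = ["invoice_123", "vendor_terms", "rainbow_essay", "past_discounts"]
relevant = ["invoice_123", "vendor_terms", "past_discounts", "contract_2024"]

print(contextual_precision(retrieved, relevant))  # 0.75
print(contextual_recall(retrieved, relevant))     # 0.75
```

A director optimizing for "fewer incorrect items," as in the interview, would watch precision; one worried about missed context would watch recall.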
Host:
With the rise of automated programs, what are the advantages for engineers, and what are the security risks?
Nirman Dave:
So, ultimately, any sort of language model is a reinforcement learning tool. Reinforcement learning is a type of algorithm where you don't tell the algorithm what to do; you tell it what the outcome is, and it'll go figure it out. And so, if you take a reinforcement learning model and you say, hey, the outcome is to always give me the best color combination to wear today in the office: tell me what color T-shirt I should wear, what color pants I should wear. That's it. That's the outcome. And every day you do that, it becomes so good at learning your preferences, learning what looks good on you, that it always gives you the best color combinations possible. Why? Because the algorithm is designed to become really good at that one task. And what we're seeing today in the world is people taking these language models and putting them to use on specific tasks, and they become really good at them.
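The color-combination example maps onto a classic bandit-style reinforcement learner: it never gets told which outfit to pick, only a daily outcome signal, and it learns preferences from that. The sketch below is a toy epsilon-greedy learner with made-up options and a simulated preference; it stands in for the outcome-driven loop the speaker describes, not for how production language models are actually trained.

```python
# Toy outcome-driven learner for the color-combination example.
# It only observes a daily reward ("you liked it" or not) and updates
# its estimates; options and the simulated preference are hypothetical.
import random

options = ["navy+khaki", "black+grey", "white+blue"]

def recommend(values, epsilon=0.1, rng=random):
    """Mostly exploit the best-known combo, occasionally explore."""
    if rng.random() < epsilon:
        return rng.choice(options)
    return max(options, key=lambda o: values[o])

def train(true_pref, days=500, seed=0):
    """Simulate daily feedback and learn a value per combination."""
    rng = random.Random(seed)
    values = {o: 0.0 for o in options}
    counts = {o: 0 for o in options}
    for _ in range(days):
        choice = recommend(values, rng=rng)
        # Simulated feedback: liked with the user's (hidden) probability.
        reward = 1.0 if rng.random() < true_pref[choice] else 0.0
        counts[choice] += 1
        # Incremental running mean of observed rewards for this combo.
        values[choice] += (reward - values[choice]) / counts[choice]
    return values

# Suppose the user secretly likes navy+khaki best.
learned = train({"navy+khaki": 0.9, "black+grey": 0.4, "white+blue": 0.2})
print(max(learned, key=learned.get))  # should converge on "navy+khaki"
```

The point of the toy is the shape of the loop: outcome in, preference estimate updated, no step-by-step instructions anywhere.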
For example, there's a company called Cursor. What they did was take a language model and put it on a specific task, which is to write code for developers, and it's really good at that. Similarly, there's another company that takes a language model and puts it to use for legal documents; it only evaluates and redlines legal documents, and it's really good at that task. And one company puts a language model to work only on recruitment and hiring, and it's really good at that. So, there's this ability to take these language models, put them to a specific task, and make them really good. So, when you talk about these automated developer tools, a tool that does this really well takes a language model, puts it to use, and says, I want to build the best developer code. So anytime developers are working on something and get stuck, they say, hey, how do I write code that will automatically go through my library and do X, Y, Z tasks, and the system generates that code for them.
So, that's what we are seeing with a lot of no-code-style tools or automated tools: these tools are taking up these language models and becoming very refined at that. I think that has really exponential advantages. What we've seen at Zams alone is our engineers building out features and products that would typically take months in less than a week, because they don't have to sit down and say, oh, I ran into this error, how do I go and fix it? Previously, you would run into an error, go to Stack Overflow or some sort of website, and say, hey, this is the error that I have. You would see how other people in the world have solved that error, and you'd go ahead and attempt each one of those solutions. Now, because you have a language model that does that really well, the model will itself find the best combination, find the best outcome, and give it to you, so you don't have to waste your time figuring out a lot of that nitty-gritty stuff.
So, I think speed is a strong advantage, and I strongly believe that in this world, speed is a competitive advantage. It's a lot of the reason that a group of startups today can exponentially grow their revenue. What we've seen with Zams is that speed is our competitive advantage, as opposed to some of the larger companies in the space that take a while to get there. So I think, fundamentally, what these automated programs and tools do is give you the speed advantage. And now we're seeing it with some of the larger companies as well. The companies that we work with, let's say the Fortune 200 companies we work with, use Zams in a way that speed now becomes their competitive advantage too. They can build things out significantly faster in terms of automations, in terms of writing code, things like that.
Security risks, not so much. I mean, if you'd asked me three months ago, are there security risks with language models and things like that, I would've said, yeah, sure, there are some security risks, because your data is flowing to, let's say, OpenAI's servers and things like that. But now the world has changed a lot. We have a lot of open-source models, like Facebook's Llama model, which is completely open source. It can stay on your servers, your data doesn't have to go anywhere, but this model does the same kind of job and gives the same kind of output that other language models would. So the security risk isn't that significant. Now, the biggest limitation is creativity, and how you really want to mold this language model into something that's relevant for you.
Host:
How do you think engineers can use AI based tools effectively when they're enhancing their designs?
Nirman Dave:
Historically, as an engineer, you would always have to tell the system what comes first and what comes next. That's the logical-thinking way. First, go through my inbox, find the invoices, then evaluate the invoices. If there's a negotiation to be done with a vendor, draft an email, send it to the vendor, and so on. It's always an iterative process where you're saying, do this, do this, do this, do this. Now there's a fundamental shift in the way you execute as a programmer. Let's take, for example, an AI agent. You're no longer saying, do this, do this, do this. You're saying, look, hey, AI agent, you have a few actions in your toolbox that you can use. Action number one, send emails. Action number two, read invoices. Action number three, draft negotiations. So you have these actions that you've given to the agent, and now the agent's job is to decide when to run which action.
So it would say, hey, my outcome is to get at least $20,000 of savings across all my suppliers and their costs. As an agent, you go figure out what you have to do. Here's access to my Gmail; go figure it out. And the agent knows that it has three actions: one is to read invoices, the second is to send emails, and the third is to draft a negotiation. That's what it has in its toolbox. So now the agent's job is to figure out which action to use when, and how to sequence them. So it would say, oh, you know what? It seems like I have access to this Gmail. Let me read the invoices, let me draft this thing, let me send it, and let me calculate the total amount of savings I've made. So as a programmer, the way you interact with agents is very different.
Instead of saying, do this, do this, do this, now we're saying, hey, this is everything that you have in your toolbox. Now, as a programmer, you can say, hey, I'm actually curious what happens if I give the agent a marketing campaign in its toolbox: hey, agent, you now have the ability to run marketing campaigns as well. And you let the agent figure it out, and maybe the agent comes up with a situation where it says, you know what? Instead of getting a negotiation from a vendor, I can offer them a marketing campaign that we can co-run together, and that can give us a little bit more savings. So, it will naturally include that as part of its actions and send it out. So, this is one of the things that you can do with AI agents, and that's how you typically start to think: not to say, do this, do this, do this, but to say, here's everything that you have and everything that you need; go find the best outcome.
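The toolbox style of agent programming described in this exchange can be sketched as an action registry plus a decision loop. In a real agent, a language model would choose the next action from the goal; here a hard-coded stand-in policy makes the choice so the control flow stays visible. All function names, the inbox contents, and the vendor are hypothetical.

```python
# Sketch of the "toolbox" agent pattern: register actions, then let a
# policy decide which one to run next until the goal state is reached.
# A real agent would use an LLM as the policy; this stand-in is rule-based.

def read_invoices(state):
    """Pretend to scan the inbox; real code would call an email API."""
    state["invoices"] = [{"vendor": "Acme", "amount": 12_000}]
    return "read_invoices"

def draft_negotiation(state):
    inv = state["invoices"][0]
    state["draft"] = f"Hi {inv['vendor']}, could we discuss a discount?"
    return "draft_negotiation"

def send_email(state):
    state["sent"] = state["draft"]
    return "send_email"

TOOLBOX = {
    "read_invoices": read_invoices,
    "draft_negotiation": draft_negotiation,
    "send_email": send_email,
}

def choose_action(state):
    """Stand-in policy: an LLM would make this choice from the goal."""
    if "invoices" not in state:
        return "read_invoices"
    if "draft" not in state:
        return "draft_negotiation"
    if "sent" not in state:
        return "send_email"
    return None  # goal reached, stop

def run_agent(goal):
    state, trace = {"goal": goal}, []
    while (action := choose_action(state)) is not None:
        trace.append(TOOLBOX[action](state))
    return state, trace

state, trace = run_agent("save $20,000 across suppliers")
print(trace)  # ['read_invoices', 'draft_negotiation', 'send_email']
```

The programmer's job here is exactly the shift the interview describes: you define the toolbox and the goal, and the policy, not your code, sequences the actions. Adding a `run_marketing_campaign` action would just mean registering one more entry in `TOOLBOX`.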
And so, I think the job of a developer fundamentally changes in how they execute. And then it also changes how they write code. Instead of solving each problem myself, I can just have the agent write the boring code that I don't have to repeatedly write again. One example: as a developer, integrating with older CRM tools, like Salesforce, is a bit of an intense task, because there are so many complications. You have to go through so many different things and figure out how to integrate your app with them. But with an AI agent, or any of the automatic code writers, it's just a matter of throwing in a prompt, copying the outcome that comes through, and moving on. And so I think it fundamentally changes how developers think about and execute what to do. That's a very interesting shift, because for years and years in college, every developer, everyone that studied engineering, was taught: here's how you talk to machines; do this, do this, do this. But now the way you talk to machines is very different. So that fundamental change is very, very interesting.
Host:
Instead of being sequential, it is much more of a matrix. Is that accurate?
Nirman Dave:
Correct. It's much more broad. And again, if you think about it, what's fundamentally important now is no longer giving sequential directions; it's knowing what outcome you want and really orchestrating that. So, I think almost every job is going from an execution role to a director role, where instead of just doing IC-style work, you're being more of a director, saying, this is the outcome, and you let the agent figure it out.
Host:
Do you think we'll get better designs that way? Will we get better iPhones?
Nirman Dave:
We could. We need directors who can really think about what better means, and if they can define that very well, I think we can get there. I think things will eventually become super optimal and better in a lot of different ways. Hey, we need better phones, or we need better this, we need better that. But the question is, how do we define better? And is the definition of better different from company to company, product to product, user to user? I think that's the kind of world that would be very interesting to live in, where maybe your definition of a better phone is very different from my definition of a better phone, but we both get the experience that we each define as better for ourselves, because the AI agents really know how to take these tools and take the right kind of actions. So I think there could be a world like that, and that would be very exciting.
Host:
Your startups have had very similar themes of empowerment, and technology accessibility and ease for users. Do you have a philosophy on the relationship between humans and machines or humans and tech?
Nirman Dave:
You know what? The funny thing is, I never saw that relationship until an article on TechCrunch came out with a couple of sentences saying that one of my previous startups had a very similar philosophy of taking something complex, giving it to kids, and making it super easy, similar to what we're doing at Zams, where we're taking something as complex as AI, giving it to enterprises, and making it super easy. I didn't really draw those parallels until I read that article, and I thought, that's interesting, because it wasn't something very intentional, where I went about saying, let's do this for every product. But as I think about it a little more, I've come to realize that there are two really strong differentiators for any company, and in my personal opinion, they are speed of execution and user interface and design.
And I think those are fundamentally so important to building really great products. When you think about user interface and design, great design naturally makes things accessible. It's not that you're trying to go and make things accessible, but if you really understand the craft and the nuances of how these agents work and what they're built for, you will naturally craft designs that make things simpler for the user to understand and eventually more accessible. And so my philosophy has been making speed a competitive advantage. We move really fast as a company. We make our decisions very, very fast. We're not wasting our time in a lot of meetings; we're executing day in and day out, and we're also fundamentally asking, does this make sense? There are traditional ways of doing a lot of different things.
Does it really make sense that we need to do that? We're questioning every single toggle, every single button on our platform and saying, do we really need this? Does the user really care about this? Can this just be automated, or should we give the user a lot more control here? That's how we really think about it. An example of this is the metrics we talked about. When it comes to evaluating AI agents, there are different measures of accuracy that you can use to evaluate the agent. One is the faithfulness score, for factual accuracy; one is contextual recall and precision, and so on, as we just discussed. Now, the question we asked ourselves is, does our user really care about knowing all of these details, or is that a side thing for them?
So today, if you go to the Zams platform, you can click a little button on the side that shows you those metrics if you really want to know them. But it's never front and center. So we asked, what is supposed to be front and center? What does the user really care about? Ultimately, our users really care about one thing: is this performing well? I just want to know if it's performing well, and if I want to know the details of how well, and the nuances of it, I can click a button and see it. That really led to the invention of the Zams Precision Index, which is our method of defining whether a model is performing well or not. It encompasses all these other metrics that are out there to create something more global and universal, so everyone can quickly take a look and go, oh, my agent is doing well; I don't need to worry about it. So we really think that great design can become a strong competitive advantage, and that's how we think about the products we build as well. And that leads to accessibility.
Host:
If you're interested in more content on AI-based engineering tools, be sure to visit Mouser's Empowering Innovation Together page to explore a variety of articles, videos, and more. Stay tuned as we delve into the newest technologies, including our upcoming feature on Additive Manufacturing. Visit mouser.com/empowering-innovation.