
Give AI Tools, Get Better Results

Image Source: Andrey Popov/Stock.adobe.com

By Becks Simpson for Mouser Electronics

Published August 18, 2023

Large language models (LLMs) like ChatGPT and GPT-4 have become indispensable tools for boosting productivity and gaining a better understanding of various topics. From education to software development to content writing, LLMs have emerged in numerous domains with an almost magical ability to distill and generate information for human consumption. However, despite their impressive capabilities, LLMs sometimes struggle to provide accurate answers or perform specific tasks that require precise knowledge. For example, complex mathematics problems or inquiries about obscure subjects often elicit incorrect or inadequate responses. These limitations arise primarily because LLMs are trained, often on outdated data, to predict the next most statistically likely word in a sentence instead of reasoning to find the correct answer. To overcome these challenges and enhance the accuracy of LLMs, researchers and developers are creating tools, and updating how models interact with them, to build artificial intelligence (AI)-powered agents that can engage with the world and access richer information and expertise.

Using AI without Help

LLMs seem almost miraculous in their ability to answer questions on a wide range of subjects. In fact, they’re so effective that people are increasingly integrating them into their daily lives to boost productivity and improve understanding of subjects, treating these interactions like a condensed, more succinct version of a traditional search engine. LLMs are venturing into education, where students use them to better explain concepts, and into software development, where programmers use them to write and understand snippets of code. Many content and writing professionals are also using them for such tasks as summarizing, document-writing, and brainstorming. However, the underlying AI struggles with several of these applications, typically because very precise knowledge is required to answer a question or complete a particular task. Some common examples include LLMs that fail to get the right answers for mathematics problems or that produce wildly incorrect results for requests about very niche subjects like the history of one’s tiny hometown or some lesser-known celebrity.

As mentioned, these errors occur because the LLMs are trained on vast swaths of the internet with the objective of producing the next most statistically likely word in a chain (Figure 1). Basically, they have memorized this information but in an extremely rough fashion—the AI’s knowledge is more of a rough approximation of many topics. The less well-represented a piece of information is among all data on which the model was trained, the less likely the model will be to reproduce that information correctly. For example, these LLMs learn a vague representation of addition or subtraction after having seen many textual data points like 1+1 = 2, and usually they can produce the correct answer with their statistically probable responses. However, given a more complex input like 649 × 152, which they likely have not seen before, the answer is often wrong. Responses also fail when the training data are older than the answer requires or when the information needed comes from a source hidden from public view (e.g., a website’s database). Examples of poor responses might be hallucinating the prices of hotels or flights or giving answers that were correct a year ago but not now. To circumvent these limitations and supercharge the AI to give more accurate responses across a wide range of applications, tools are being crafted that allow LLMs to interact with the world around them for access to richer information and more bespoke expertise.

Figure 1: LLMs by themselves are trained to predict the next most statistically likely word—a task in which they succeed due to vast amounts of training data. (Source: Author)

Identifying Tools

In the world of LLMs, tools are external applications that a model interacts with in a specific way to derive or validate particular answers. Initial examples of tools included calculators, code runners, and search engines, but the list is growing daily with the addition of new tools like plugins to application programming interfaces (APIs), connections to databases and vector stores, and other machine learning programs like image-to-text extractors. The failed use cases mentioned previously (such as models being unable to calculate equations properly or giving wrong answers to trivia questions) can be fixed by using a tool like a calculator or a search engine, respectively. To generate good responses, the model leverages accessible tools instead of relying entirely on what it has learned. These tools are used whenever the model needs to calculate something, run some code to ensure accuracy, answer questions from the user’s data, or respond to a request that a user would typically make through a website.

Although these tools seem like simple things that a human easily understands and knows how to interact with, regardless of the interface, existing tools (e.g., calculators or website APIs) must be wrapped in code so that the AI model can interact with them. In particular, a tool needs to be software-bound such that its inputs and outputs can be handled through code (Figure 2). For example, a website’s API could be used instead of its user interface, which removes the need to click buttons or toggle checkboxes and ensures that purely text input is sent to the tool. The other important piece of a tool’s anatomy is the description of when to use it, which helps the model understand what the tool is for, along with specific instructions on when not to use it. The latter may help if the tool is being overused.
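The two parts of a tool's anatomy described above—a code-callable interface and a usage description—can be sketched as a plain Python function. This is an illustrative stand-in, not the API of any particular library; the docstring plays the role of the description shown to the model, including when not to use the tool:

```python
import ast
import operator

def calculator(expression: str) -> str:
    """Use this tool to evaluate arithmetic expressions exactly.

    Input: a single arithmetic expression as text, e.g. "649 * 152".
    Output: the numeric result as text.
    Do NOT use this tool for questions that contain no numbers.
    """
    # The tool is software-bound: text in, text out, no buttons or UI.
    ops = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv}

    def evaluate(node):
        # Walk the parsed expression tree instead of calling eval()
        # on raw model output, which would be unsafe.
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.BinOp):
            return ops[type(node.op)](evaluate(node.left), evaluate(node.right))
        raise ValueError("unsupported expression")

    return str(evaluate(ast.parse(expression, mode="eval").body))
```

Because both the input and the output are plain text, the model can call the tool by emitting a string and read the result back into its context, exactly as it would any other token sequence.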

Figure 2: Tools for AI look like software functions with code-based inputs, outputs, and usage descriptions. (Source: Author)

Teaching AI to Use Tools

Rewriting or wrapping existing human-facing tools to be suited to AI usage is insufficient, however. The other piece of the puzzle is updating the prompt flow that LLMs use to decide when to interact with tools. This process turns the model into an agent—one that understands how to produce a correct answer by combining its input, prompt instructions, and accessible tools. A typical interaction with LLMs involves writing a specific instruction for the model to execute, often with examples of what the response should look like given certain types of inputs. The process of developing these instructions is called prompt engineering. In the case of teaching AI to use tools, prompt engineering is extended to include the chain of decisions the model should make to complete its task. Using a software library like LangChain—available in Python and JavaScript—users can combine an LLM (e.g., GPT-4), a set of tools (e.g., calculators, code compilers), and the agent structure to build AI applications that can use tools. More importantly, such libraries add memory and response chaining, which enable much more powerful capabilities.
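The extended prompt engineering described above can be sketched in plain Python. This is a simplified stand-in for the kind of assembly a library like LangChain performs, not its actual API; the tool names, descriptions, and prompt format here are all invented for illustration:

```python
# Illustrative stand-in for the LLM + tools + agent-prompt assembly
# that agent libraries provide. All names here are hypothetical.

TOOLS = {
    "calculator": "Evaluates arithmetic expressions. Input: '2 + 2'.",
    "database_lookup": "Looks up company figures by name. Input: 'revenue 2022'.",
}

def build_agent_prompt(question: str) -> str:
    """Compose the instruction that turns a plain LLM into a tool-using agent."""
    tool_lines = "\n".join(f"- {name}: {desc}" for name, desc in TOOLS.items())
    return (
        "Answer the question, using the tools below when needed.\n"
        f"Tools:\n{tool_lines}\n"
        "Use this format:\n"
        "Thought: what to do next\n"
        "Action: <tool name>\n"
        "Action Input: <tool input>\n"
        "Observation: <tool output>\n"
        "(repeat Thought/Action/Observation as needed)\n"
        "Final Answer: <answer>\n\n"
        f"Question: {question}"
    )
```

The key idea is that the tool descriptions and the decision chain (thought, action, observation) are injected into the prompt itself, so the model's next-word prediction is steered toward emitting tool calls rather than guessing answers outright.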

For example, this kind of chained, logic-enabled, multitool-using agent is required to answer questions such as “What was our revenue last year divided by the number of sales made?” First, a calculator is required for the division, and a lookup program is required to interface with the company’s database. Second, because this is a multistep problem, the prompting of the agent must involve observations of what steps it needs to take, like looking up the revenue from last year and how many sales were made. From each of these observations, the agent’s internal prompting will determine an action—for example, “use the lookup tool to search for the answer.” This process is then repeated as many times as required to obtain the final answer. In this case, the information about last year’s revenue and number of sales will trigger the observation that the agent should find the answer by dividing the two. This in turn triggers the action of inputting both into the calculator tool. Since no further steps are needed, the agent will recognize that it has the final answer and provide it to the user (Figure 3). Chaining this series of observations, actions, and responses together means that LLM-powered agents can complete dramatically more complex tasks than the LLM alone.
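The observation–action loop for the revenue question above can be simulated end to end with stub tools and scripted decisions. Everything here—the database figures, tool names, and the fixed sequence of steps—is invented for illustration; a real agent would choose each action from the LLM's output rather than from hard-coded logic:

```python
# Simulated agent run for "What was our revenue last year divided by
# the number of sales made?" Data and tool names are made up.

COMPANY_DB = {"revenue": 1_200_000.0, "sales": 400}  # stand-in for a real database

def lookup(key: str) -> float:
    """Tool: fetch a figure from the company database."""
    return COMPANY_DB[key]

def divide(a: float, b: float) -> float:
    """Tool: calculator restricted to division for this example."""
    return a / b

def run_agent(question: str) -> float:
    """Scripted observation -> action loop for this one question."""
    # Observation 1: the question needs last year's revenue.
    revenue = lookup("revenue")        # Action: use the lookup tool
    # Observation 2: it also needs the number of sales made.
    sales = lookup("sales")            # Action: use the lookup tool again
    # Observation 3: both figures are known, so divide them.
    answer = divide(revenue, sales)    # Action: use the calculator tool
    # No further steps are needed, so this is the final answer.
    return answer
```

With the stand-in figures above, `run_agent(...)` returns 3000.0—an answer the base model could not have produced, since the revenue and sales numbers live behind the database, not in its training data.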

Figure 3: Agents with a specific prompt flow allow LLMs to interact with tools. (Source: Author)

Conclusion

While LLMs like ChatGPT and GPT-4 have revolutionized the processes of retrieving and generating information, they have certain limitations for tasks that require precise knowledge. However, researchers and developers have found a solution by incorporating tools into the LLM ecosystem. These tools, ranging from calculators and search engines to APIs and database connections, allow LLMs to interact with external applications and access richer information. By leveraging these tools, LLMs can overcome their limitations and provide more accurate responses across a wide range of applications.

Teaching LLMs to use tools effectively involves updating their prompt flow and turning them into intelligent agents. This prompts the models to make informed decisions about when to interact with tools, combining their input, prompt instructions, and access to tools to produce correct answers. In the future, as LLMs continue to evolve and their tool capabilities expand, we expect even more impressive applications and advancements. By harnessing the potential of LLM-powered agents with access to a diverse array of tools, we can unlock new possibilities for productivity, problem-solving, and knowledge discovery. The synergy between LLMs and tools is paving the way for a more intelligent and efficient interaction between humans and AI.

About the Author

Becks is a Machine Learning Lead at AlleyCorp Nord, where developers, product designers, and ML specialists work alongside clients to bring their AI product dreams to life. She has worked across the spectrum of deep learning and machine learning, from investigating novel deep learning methods and applying research directly to solving real-world problems, to architecting pipelines and platforms for training and deploying AI models in the wild, to advising startups on their AI and data strategies.
