Artificial Intelligence (AI), and particularly Large Language Models (LLMs), have significantly transformed the search engine as we’ve known it. This presents businesses with an opportunity to enhance their search functionalities for both internal and external users. With Generative AI and LLMs, new avenues for improving operational efficiency and user satisfaction are emerging every day. Let’s take a look at some of the many ways businesses can benefit from these new models to streamline operations and deliver faster and more accurate results. We’ll begin by looking at some real-world examples and then we’ll dive into more details of how these improved search capabilities can enhance your business.

Real-World Examples

Here are a few examples where we’ve used LLMs to improve search:

Streamlining Internal Documentation Access

Our team addressed the all-too-common challenge of scattered internal documentation by creating a tool that supports fuzzy search across documents. Users pose questions in natural language and receive natural language answers, even when the answer is not stated directly in the documentation. The tool uses AI to interpret each query, scan documents spread across different repositories, and deliver relevant answers. This lets internal users quickly find answers to questions such as “What is our internal IP address?” or “How can I modify my direct deposit account?” It has significantly reduced the time team members spend searching for information, so they can stay focused on their core tasks.

Enhancing Front-End Product Search

We also developed a front-end search tool that revolutionized the way users search for products. While traditional search systems are bound by the constraints of keywords, fields, and specific taxonomies, this AI-powered tool embraces the concept of fuzzy searching. Rather than matching exact terms and requiring specific field values as inputs, it interprets and understands user queries, pinpointing the desired types or categories. Even if certain categories aren’t predefined in the database, users still receive relevant product suggestions. For instance, if a user searches for “low carb options,” even though “low carb” might not be a category saved in the database, the LLM can still match and suggest “keto” products, understanding the user’s intent. This advanced approach greatly enhances the user experience, making product discovery more intuitive.
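One common way to get this kind of intent matching is to compare embedding vectors rather than keywords. The sketch below is a toy illustration, not the tool described above: the three-dimensional vectors are hand-made assumptions standing in for the output of a real embedding model, and the names CATEGORY_VECTORS and closest_category are hypothetical.

```python
import math

# Hypothetical, hand-made "embeddings". A real system would obtain vectors
# from an embedding model instead of hard-coding them.
# Illustrative dimensions: [low-carbohydrate, sweet, plant-based]
CATEGORY_VECTORS = {
    "keto": [0.9, 0.1, 0.2],
    "desserts": [0.1, 0.9, 0.3],
    "vegan": [0.3, 0.3, 0.9],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def closest_category(query_vector, category_vectors):
    """Pick the category whose vector is most similar to the query vector."""
    return max(
        category_vectors,
        key=lambda name: cosine_similarity(query_vector, category_vectors[name]),
    )


# Assumed vector for the query "low carb options"; "low carb" is not a
# stored category, but its vector sits closest to "keto".
low_carb_query = [0.95, 0.05, 0.25]
best = closest_category(low_carb_query, CATEGORY_VECTORS)
print(best)
```

Because the comparison happens in vector space, the query never has to share a word with the category it matches.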

Content Recommendations in Streaming Services

While many streaming platforms suggest content based on viewing history, LLMs have the power to significantly elevate this experience. If a user searches for “movies similar to Inception with a twist ending,” the LLM can delve into its vast knowledge of movie themes, plots, and reviews, and generate a curated list of recommendations that might not be explicitly tagged in the platform’s database. This nuanced approach can cater to the most specific of viewer moods and preferences, ensuring a seamless entertainment experience.

Recipe Suggestions for Culinary Sites

For culinary platforms or recipe websites, user searches can be incredibly varied, given the diverse range of ingredients and dietary preferences. With an LLM-driven search tool, when a user inputs the specific ingredients they have on hand or mentions a particular dietary restriction, the platform can craft unique recipes that aren’t just pulled from a database but are generated in real time. For example, a user entered “carrots, chicken, and bok-choy.” Because those ingredients didn’t match an existing saved recipe, the LLM generated:

Quick Carrot, Bok-Choy, and Chicken Sauté


  • 1 medium-sized carrot, julienned
  • 1 bok-choy, roughly chopped
  • 1 chicken breast, diced
  • 2 tablespoons soy sauce
  • 1 tablespoon vegetable oil
  • 1 clove of garlic, minced
  • Salt and pepper to taste


  1. In a skillet, heat the vegetable oil over medium heat. Add the diced chicken, seasoning with salt and pepper. Cook until browned.
  2. Add the minced garlic to the skillet, sautéing briefly until fragrant.
  3. Introduce the julienned carrot and chopped bok-choy. Stir-fry for 3-4 minutes, or until the bok-choy is slightly wilted.
  4. Pour in the soy sauce, stirring well to coat the chicken and veggies.
  5. Serve hot with your choice of side, or enjoy as is for a light meal.


So let’s summarize what we’ve already found to be different about searching with LLMs:

What’s Different About Search with LLMs

Search methods powered by LLMs stand out as transformative tools, offering businesses an edge in their information retrieval and customer engagement processes. One of the most compelling features of LLM-driven search is its ability to perform “fuzzy” searches as opposed to the rigid keyword match approach of traditional systems. In layman’s terms, while conventional search mechanisms demand exact phrases or keywords to return relevant results, LLMs can understand and interpret the intent and context behind a query. This means users can pose questions or enter queries in a more natural, conversational manner without being limited to specific keywords. The results are not only more aligned with user intent but also often more comprehensive.

Moreover, LLMs come equipped with an extensive knowledge base derived from the vast amounts of data they’ve been trained on. This expansive knowledge base, which grows with each new generation of models, allows them to provide insights, answers, and context that may not even exist in a business’s specific dataset or repository. For businesses, this means tapping into a broader informational spectrum without the need for manual data entry or updates. When integrated into business search tools, LLMs can drastically reduce the gap between user queries and the most relevant, context-rich results. In essence, an LLM-powered search doesn’t just fetch data; it understands, interprets, and often augments it, providing businesses with a dynamic tool that continually adds value.

As we dive deeper into the capabilities of LLMs in search, it’s essential to have a clear understanding of the kind of data these models deal with. Broadly speaking, data can be categorized as either structured or unstructured.

Structured Data refers to information organized in a defined manner, making it easier to search and analyze. This data is typically arranged in rows and columns, akin to what you’d find in databases, spreadsheets, or CSV files. Such a format is convenient for traditional search methods, where specific fields can be queried directly.

Unstructured Data, as the name suggests, lacks a clear structure. This category encompasses a vast array of content, from emails and text documents to social media posts. Searching through this data isn’t as straightforward due to the absence of predefined fields or categories, which is exactly where many conventional search systems fall short.

Given their training on diverse datasets, LLMs excel in parsing and understanding unstructured data, providing contextually relevant results. Where traditional search methods might stumble, LLMs can traverse this complex landscape, delivering insights from unstructured data sources as efficiently as, if not more efficiently than, they do from structured ones. This proficiency ensures that businesses can unlock the full potential of all their data, irrespective of its format.

The other really interesting aspect of search with LLMs is the range of possibilities it enables on the generative side. Beyond merely retrieving relevant information or documents based on a query, LLMs can generate entirely new content or responses tailored to a user’s specific request. These capabilities transform the search experience from a passive retrieval process into an active, dynamic interaction. For instance, if a user seeks advice on a particular topic, instead of just presenting pre-existing articles or references, an LLM can synthesize its vast knowledge base to produce a coherent, contextually apt, and unique response on the spot. This generative ability can be especially valuable for businesses aiming to offer real-time solutions, personalized advice, or innovative content suggestions.

LLM Search Techniques

The following are some terms you are likely to hear as you dive into search solutions using LLMs:

RAG (Retrieval-Augmented Generation): RAG queries your proprietary data and supplies the retrieved documents as part of the prompt, combining them with the LLM’s own knowledge. In other words, documents likely to contain the answer are fetched first; a generative model then crafts a detailed answer from the retrieved data. This is table stakes for the kinds of solutions we describe above.

Initial Query Refinement: Every search starts with an initial query from the user, and the quality and specificity of that query can significantly affect the success of the search. After the initial query, there is often a need to refine it based on user feedback or additional context, so that the search is narrowed down or better directed toward the desired information.

Multiple Passes: This involves making several passes over the data to iteratively refine the search results. Each pass may use the information gleaned from the previous ones to improve the accuracy and relevance of the results.

FLARE (Forward-Looking Active REtrieval augmented generation): FLARE has the model predict the upcoming sentence in order to anticipate future content. When that prediction contains low-confidence tokens, it serves as a query to retrieve pertinent documents, and the sentence is then regenerated with their help. It’s a proactive approach: retrieval happens throughout generation, whenever the model needs it, rather than only once up front.
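The retrieve-then-generate flow at the heart of RAG can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production design: the documents are made-up samples, retrieval uses simple word overlap where a real system would typically use a vector index, and the function names retrieve and build_prompt are hypothetical. The final call to an LLM is deliberately left out, since it depends on the provider you choose.

```python
import re

# Made-up sample documents standing in for a proprietary knowledge base.
DOCUMENTS = [
    "Direct deposit changes are made in the payroll portal under Banking.",
    "The office VPN requires two-factor authentication.",
    "Expense reports are due on the fifth business day of each month.",
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, documents, k=2):
    """Return up to k documents that share at least one word with the query,
    ordered by how many words they share (a toy stand-in for vector search)."""
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(doc)), doc) for doc in documents]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked[:k] if score > 0]


def build_prompt(query, passages):
    """Place the retrieved passages in the prompt ahead of the question."""
    context = "\n".join(f"- {passage}" for passage in passages)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}\n"
        "Answer:"
    )


question = "How can I modify my direct deposit account?"
passages = retrieve(question, DOCUMENTS, k=1)
prompt = build_prompt(question, passages)
print(prompt)  # this string is what would be sent to the LLM
```

Query refinement and multiple passes fit the same shape: each pass would call retrieve again with a rewritten query before the final prompt is built.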

Our expertise lies in navigating through these various techniques in order to determine the best solution across a variety of situations.

Ensuring Accuracy: How to Test Results

Upon deploying AI-driven search tools, validating their accuracy is paramount. Here’s a suggested approach:

Ground Truth Creation: Design a test dataset with established answers or recognized documents to serve as a reference point.

Precision and Recall Metrics: Define a process to assess precision (the share of retrieved documents that are relevant) and recall (the share of relevant documents that were retrieved). Strive for a balance between the two, since improving one often comes at the expense of the other.

User Feedback: Enable users to grade the relevance of search outcomes, offering invaluable real-world insights.

A/B Testing: Run the AI-enhanced search solution concurrently with a conventional one, juxtaposing their real-time performances.

Ongoing Monitoring: Continuously gauge the system’s efficacy, making ongoing, iterative adjustments based on evolving data sets or user preferences.
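As a concrete illustration of the ground truth and precision/recall steps above, here is a minimal sketch. The document IDs are made up for the example, and the F1 score is included only as one common way to summarize the balance between the two metrics; your own evaluation process may weigh them differently.

```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall for one query.

    retrieved: document IDs returned by the search system
    relevant:  document IDs the ground-truth dataset marks as correct
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = retrieved_set & relevant_set
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical ground truth and search output for a single test query.
ground_truth = {"doc-1", "doc-4", "doc-7"}
search_results = ["doc-1", "doc-2", "doc-4", "doc-9"]

precision, recall = precision_recall(search_results, ground_truth)
f1 = f1_score(precision, recall)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Here two of the four retrieved documents are relevant (precision 0.5), and two of the three relevant documents were found (recall of about 0.67). Averaging these numbers over every query in the test set gives a baseline to track as you iterate.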

How to Proceed

Our experience has shown us that these initiatives begin with a company (a) reviewing existing search solutions to see what users like and don’t like about them, (b) looking at proprietary data that they would like to be able to search, or (c) finding places where users would like answers that could be fueled by a combination of an LLM and proprietary data. These all become potential targets that you can evaluate for feasibility and level of effort as well as short- or long-term value. The next step is determining your business’s priorities. We often work with clients both prior to and during these discussions.

With the priorities determined (e.g., high value/low effort), the next step is a Proof of Concept (PoC). This is an early version of the solution that can be expanded over time. Implementations tend to grow more expensive as you improve them by iterating on prompts, using more advanced techniques, and adding testing. However, you can often get a 70% solution fairly quickly through a Proof of Concept, and it’s often already better than the existing search. A PoC is a pragmatic step to evaluate feasibility and results. It gives stakeholders an opportunity to assess the benefits and challenges before making additional investment in iterative improvements.

If you do decide to improve the Proof of Concept, you will typically iterate using an agile process. We recommend following the steps outlined in the “Ensuring Accuracy” section to validate as you iterate, so you can closely monitor the progress being made. Through this iterative process, you’ll be able to address quality issues and optimize results.

Wrapping Up

The era of simplistic keyword search is rapidly giving way to the advanced capabilities of AI and LLM-driven search techniques. The transition often starts with a parallel implementation of the two, with Google and Bing as prime examples. With the ability to run fuzzy searches across structured and unstructured data and to generate different kinds of responses, the results speak for themselves. Even better, initial implementations are often affordable and relatively fast.

As with everything involving LLMs and Generative AI, this is an area that is advancing rapidly. RAG is well established, but methodologies like FLARE and query refinement offer exciting opportunities to make LLM-driven search increasingly powerful. Mechanisms for ensuring accuracy are also improving quickly.

Our team is excited to be working at the forefront of this revolution. We love to talk with people who are considering adopting these approaches. Feel free to reach out to discuss them with us.