
Sunday, May 12, 2024

The Path to Enterprise Application Reform: New Value and Challenges Brought by LLM and GenAI

As enterprise application consultants and developers of HaxiTAG LLM Studio, we are initiating a series of discussions on reforming enterprise application software systems with LLMs (Large Language Models) and GenAI (Generative Artificial Intelligence). We will explore which application software and systems should be reformed with LLM and GenAI, and the new value that LLM- and GenAI-driven reformation will bring to enterprises. We will also discuss how legacy IT systems can embrace new technological iterations and upgrades to better serve production experience, value creation, and return on investment, thereby enhancing the delivery of innovative value.

This piece of the series focuses on entry points and use cases for enhancing IT development efficiency with LLM and GenAI.


Exploring Enterprise Application Software Systems Reformation with LLM and GenAI: Unveiling Entry Points and Use Cases for Efficiency Enhancement

First and foremost, we need to recognize that search is one of the hardest technical problems in computer science. Only a handful of products, such as Google, Amazon, and Instagram, do it well. In the past, most products did not need great search - it wasn't core to the user experience. But with the explosion of LLMs and retrieval systems to support them, every LLM company suddenly needs to have world-class search embedded within their product just to make it work.

Meanwhile, retrieval is a critical component of LLM systems, and it's not going away. Retrieval-augmented generation (RAG) systems deliver relevant information to an LLM to help it respond to a query. This grounds LLM generation in real and relevant information. Imagine an LLM is answering a question on a history test. Without RAG, the LLM would have to recall the information from things it's learned in the past. With RAG, it's like an open-book test: along with the question, the model is provided with the paragraph from the textbook that contains the answer. It's clear why the latter is much easier.
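
To make the open-book analogy concrete, here is a minimal sketch of a RAG flow in Python: retrieve a few supporting passages, then ground the model's answer by placing them in the prompt. The `retrieve` helper, the toy word-overlap scoring, and the prompt template are illustrative assumptions, not any particular product's API.

```python
# Minimal RAG sketch: retrieve supporting passages, then ground the
# model's answer in them by placing them in the prompt.
# retrieve() uses toy word-overlap scoring; a real system would use a
# proper retrieval backend, and the final prompt would go to an LLM.

def retrieve(query: str, corpus: list[str], top_k: int = 3) -> list[str]:
    """Toy retriever: rank passages by word overlap with the query."""
    q_terms = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda p: len(q_terms & set(p.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def build_prompt(query: str, passages: list[str]) -> str:
    """Assemble the 'open-book' prompt: the question plus supporting context."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

if __name__ == "__main__":
    corpus = [
        "The Treaty of Westphalia was signed in 1648.",
        "Photosynthesis converts light energy into chemical energy.",
        "The French Revolution began in 1789.",
    ]
    question = "When did the French Revolution begin?"
    prompt = build_prompt(question, retrieve(question, corpus, top_k=1))
    print(prompt)  # this grounded prompt is what gets sent to the LLM
```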

However, finding the right paragraph in the textbook may not be easy. Now imagine trying to find one code snippet in a massive codebase, or the relevant item in a stack of thousands of invoices. Retrieval systems are designed to tackle this challenge.

New LLMs have longer context windows, which allow them to process larger inputs at once. So why go to the trouble of finding one paragraph in the textbook when you could just load in the entire book? Even so, for most applications we think retrieval won't go away, even with >1M-token context windows:

- Businesses often have multiple versions of similar documents, and delivering them all to an LLM could present conflicting information.

- Most interesting use cases will require some type of role- and context-based access control for security reasons (see the filtering sketch after this list).

- Even if computation becomes more efficient, there's no need to incur the cost and latency to process thousands or millions of tokens that you don't need.
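
To illustrate that access-control point, here is a minimal sketch of role- and context-based filtering applied to candidate chunks before they ever reach the LLM. The `Chunk` fields, role names, and `filter_for_user` helper are hypothetical; real systems typically enforce this in the document store or search index itself.

```python
# Hedged sketch of role- and context-based filtering before retrieval.
# The Chunk fields and role model below are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    allowed_roles: frozenset[str]  # roles permitted to see this chunk
    department: str                # extra context tag for scoping

def filter_for_user(chunks: list[Chunk], role: str, department: str) -> list[Chunk]:
    """Drop any chunk the requesting user is not entitled to see."""
    return [c for c in chunks if role in c.allowed_roles and c.department == department]

chunks = [
    Chunk("Q3 salary bands", frozenset({"hr_admin"}), "hr"),
    Chunk("Q3 product roadmap", frozenset({"engineer", "pm"}), "engineering"),
]
# An engineer querying the engineering corpus only ever retrieves from
# chunks that pass this filter; restricted documents never reach the LLM.
visible = filter_for_user(chunks, role="engineer", department="engineering")
print([c.text for c in visible])
```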

As LLM prototyping exploded, people quickly turned to semantic similarity search, an approach that has been used for decades. First, separate the data into chunks (e.g., each paragraph in a Word document). Next, run each chunk through a text embedding model that outputs a vector encoding the semantic meaning of the data. During retrieval, embed the query and retrieve the chunks with the nearest vector representations. These chunks contain the data that (in theory) has the meaning most similar to the query.
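
As a rough sketch of that chunk-embed-retrieve loop, the snippet below stands in a toy bag-of-words `embed` function for a real text embedding model so the example runs on its own; in practice you would call an embedding model and store the vectors in a vector index. All function names here are illustrative.

```python
# Sketch of the chunk -> embed -> nearest-neighbor retrieval loop.
# embed() is a stand-in for a real text embedding model; it hashes
# tokens into a fixed-size bag-of-words vector so the example runs.
import numpy as np

def chunk(document: str) -> list[str]:
    """Split a document into paragraph-level chunks."""
    return [p.strip() for p in document.split("\n\n") if p.strip()]

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Toy embedding: hash each token into one of `dim` buckets, then normalize."""
    v = np.zeros(dim)
    for token in text.lower().split():
        v[hash(token) % dim] += 1.0
    norm = np.linalg.norm(v)
    return v / norm if norm else v

def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the chunks whose vectors are nearest to the query (cosine similarity)."""
    chunk_vecs = np.stack([embed(c) for c in chunks])
    scores = chunk_vecs @ embed(query)  # dot product of unit vectors = cosine similarity
    best = np.argsort(scores)[::-1][:top_k]
    return [chunks[i] for i in best]

doc = (
    "Invoices are due within 30 days.\n\n"
    "Refunds are processed within 5 business days.\n\n"
    "Support is available on weekdays only."
)
print(retrieve("how long do refunds take?", chunk(doc), top_k=1))
```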

Semantic similarity is simple to build, but results in pretty mediocre search. Some key limitations of this approach are:

- It can miss useful content that is semantically different from the query. Users don't always clearly articulate what they mean, and a query might not include context that would be relevant to their request (e.g., a customer describing a product vaguely, or not mentioning a recent purchase).

- The approach is sensitive to the embedding model. A general text embedding model might not perform well in your domain.

- It is very sensitive to how you process input data. The system will behave differently depending on how you parse, transform, and chunk your input data, and dealing with different types of data (e.g. tables) is complex.

- Even with optimizations, text embeddings are expensive to compute. This hinders the ability to iterate on ingestion and embedding pipelines, and to serve applications that need near real-time data.

To the first point, this approach only searches based on the semantic meaning of the query. If you look at any of the companies that do search well, semantic similarity is only one piece of the puzzle. The goal of search is to return the best results, not the most similar ones. YouTube combines the meaning of your search query with vectorized predictions of what videos you're most likely to watch, based on global popularity and your viewing history. Amazon makes sure to prioritize previous purchases in search results, which it knows you were probably looking to re-order.

The future of retrieval systems will be complex and will likely end up looking like today's production search or recommender systems. The problems are not that different: from a large set of candidate items, select a small subset that is most likely to achieve some goal. Today, most retrieval systems look like a combination of semantic similarity search and rule-based approaches. But in the future, they may incorporate a variety of signals, such as user behavior, global popularity, and more.
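
As a rough sketch of what such multi-signal ranking could look like, the snippet below blends a semantic-similarity score with global popularity and user-history signals through fixed weights. The weights, field names, and example scores are assumptions chosen for illustration, not a production ranking formula; in practice such weights are usually learned rather than hand-set.

```python
# Sketch of multi-signal ranking: blend semantic similarity with global
# popularity and the user's own history, as production search and
# recommender systems do. Weights and signal values are illustrative.
from dataclasses import dataclass

@dataclass
class Candidate:
    text: str
    semantic_score: float  # e.g. cosine similarity to the query embedding
    popularity: float      # global click/purchase rate, normalized to [0, 1]
    user_affinity: float   # how often this user interacted with the item, [0, 1]

def rank(candidates: list[Candidate], top_k: int = 3) -> list[Candidate]:
    """Score each candidate as a weighted blend of signals and keep the best."""
    def score(c: Candidate) -> float:
        return 0.6 * c.semantic_score + 0.25 * c.popularity + 0.15 * c.user_affinity
    return sorted(candidates, key=score, reverse=True)[:top_k]

candidates = [
    Candidate("printer ink, black, 2-pack", semantic_score=0.71, popularity=0.40, user_affinity=0.90),
    Candidate("printer ink, color, 1-pack", semantic_score=0.74, popularity=0.35, user_affinity=0.00),
    Candidate("laser printer, refurbished", semantic_score=0.55, popularity=0.80, user_affinity=0.00),
]
# The previously purchased item wins even though another candidate is
# slightly more similar to the query: the "re-order" behavior described above.
for c in rank(candidates, top_k=2):
    print(c.text)
```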

In conclusion, the reformation of enterprise application software systems based on LLM and GenAI will be a complex process that requires significant investment and exploration from enterprises. However, this reformation also has the potential to bring great value to enterprises and even change the way they operate and compete. Therefore, we encourage enterprises to be bold in exploring and investing in this area, and to work closely with industry experts and consultants to ensure the success of their reformation journey.

Key Point Q&A:

  • What is the significance of search in the context of LLMs and enterprise application software systems?
    Search plays a critical role in LLMs and enterprise application software systems, as it is considered one of the hardest technical problems in computer science. With the emergence of LLMs and retrieval systems, every LLM company now requires world-class search embedded within their product for it to function effectively.

  • How do retrieval-augmented generation (RAG) systems contribute to LLM functionality?
    RAG systems deliver relevant information to an LLM to assist in responding to a query, grounding LLM generation in real and relevant information. This approach makes it easier for LLMs to answer questions by providing context similar to an open-book test scenario.

  • What are the limitations of semantic similarity search in LLM systems?
    Semantic similarity search, while simple to build, results in mediocre search performance due to several limitations: it may miss useful content that is semantically different from the query; it is sensitive to the embedding model and to how input data is processed; and text embeddings are expensive to compute, hindering the ability to serve applications that need near real-time data.