Retrieval Is All You Need


FullStackRetrieval.com

My new home for retrieval-related content

Welcome to the 135 people who have joined us since last week! If you aren’t subscribed, join 3,189 smart AI folks. View this post online.

The bottleneck is not GPT-4

GPT-4 is actually pretty smart: ask it to make a decision, and you’ll likely get the right answer.


Give it a few examples of previous successful decisions and you’ll bump your accuracy even further.

This led me to a realization: the limit of our ability to do more with Large Language Models (LLMs) isn’t necessarily the model itself, but rather the information we give it.

Give GPT-4 the right information and you’ll provide value to users.

So it’s not a reasoning challenge, it’s a retrieval challenge.

Retrieval is the process of gathering, storing, and serving your applications the data they need. The concept of serving applications data is as old as computing itself, but in the unstructured world of LLM application building, a new retrieval mindset helps us make magic.

All my optimization strategies came back to retrieval.

I constantly think about:

  • What’s the best way to collect my data?
  • How should I split it into manageable pieces?
  • Where should I store it?
  • How should I index it?
  • What’s the best way to actually retrieve the data?
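The questions above can be made concrete. As a toy illustration of the "split it into manageable pieces" step, here's a minimal fixed-size chunker with overlap. The sizes are arbitrary placeholders, not recommendations, and the function is my own sketch rather than any library's API:

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping, fixed-size character chunks."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    chunks = []
    start = 0
    step = chunk_size - overlap  # how far the window slides each iteration
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks

# Toy document: a repeated sentence standing in for real source data.
doc = ("Retrieval is the process of gathering, storing, and serving "
       "your applications the data they need. ") * 5
chunks = chunk_text(doc, chunk_size=120, overlap=30)
```

The overlap keeps a sentence that straddles a chunk boundary intact in at least one chunk; in practice you'd usually split on semantic boundaries (sentences, sections) rather than raw character counts.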

Retrieval is the umbrella that covers most of your application building. It’s the iceberg under your front end.

It’s also extremely hyped (second only to agents and maybe evals). Yet it’s still massively underrated.

As an educator in the space I have a keen eye on how information is presented - with retrieval, I haven’t seen the take on it that I want to see. So I'm taking a stab at fixing this.

My retrieval content is finally getting a home: FullStackRetrieval.com

[Video: FullStackRetrieval.com Trailer]

(As a subscriber you're getting beta access; you may see some rough edges. I'll launch on Twitter later.)

The first thing new members will get is one advanced retrieval method a day, sent over 5 days.

This will include:

  • Philosophies - How should you think about chunking, top k, and retrieval methods?
  • Interviews - Let's hear how production retrieval is happening
  • Tutorials - Advanced RAG, building retrieval into your GPTs/OpenGPTs
  • Videos & code samples - Walkthroughs

When I try to understand a topic, I need to start with the big picture, then understand how its sub-components relate to each other.

Instead of thinking about retrieval in the abstract, here is my “least-wrong” tactical view.


Let’s break retrieval down into its parts:

  1. Query - The initial piece of data that guides the retrieval process. This can be a user question, chat history, an image, audio, a prompt, a table, or other data.
  2. Query Transformation - The process of modifying or reformatting the original query to make it more suitable for retrieval. Transforming your query is optional.
  3. Raw Data Source - The original home of your information: the unprocessed, unstructured data from which you'll extract. This could be websites, images, other applications, you name it. There may be multiple data sources.
  4. Document Loaders - Tools or functions that extract data from a raw source.
  5. Documents - Individual units of data or information that have been extracted and are ready for indexing. This might be individual pieces of texts, single customer records, etc.
  6. Index - A data structure that organizes the information in your documents so that retrieval is faster, more efficient, or better performing.
  7. Knowledge Base - A structured repository of indexed documents from which the retrieval process extracts. This is often the combination of a vector store and a document store.
  8. Retrieval Method - The technique or algorithm used to search for and extract the most relevant documents from the knowledge base in response to a query.
  9. Relevant Docs - The subset of documents that the retrieval method determines to be most useful in addressing the query.
  10. Document Transform - The process of further refining or reformatting the relevant documents to make them more suitable for the language model. This could include summarization, compression (removing information) or other transformations.
  11. Context - The combined content derived from the transformed documents that provides the necessary background or information for the language model to generate its response.
  12. Large Language Model (LLM) - The model that will generate a response based on the context and prompt it's given.
  13. Prompting Method - The technique or method used to present the context to the language model. This also includes chaining different prompts together.
  14. Response - The final answer or output generated by the language model based on the context and prompting method.
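The fourteen parts above can be sketched end to end. Everything below is a toy stand-in — a keyword-overlap "index", a stubbed "LLM", and function names of my own invention — meant only to show how the stages hand off to each other, not how any production system works:

```python
def transform_query(query: str) -> str:
    # 2. Query transformation: here, just normalize the text.
    return query.lower().strip()

def load_documents(raw_sources: list[str]) -> list[str]:
    # 3-5. Document loaders turning raw sources into documents.
    return [s.strip() for s in raw_sources if s.strip()]

def build_index(documents: list[str]) -> list[tuple[set[str], str]]:
    # 6-7. Index + knowledge base: bag-of-words per document.
    return [(set(doc.lower().split()), doc) for doc in documents]

def retrieve(index: list[tuple[set[str], str]], query: str, k: int = 2) -> list[str]:
    # 8-9. Retrieval method: rank by word overlap, return top-k docs.
    q_words = set(query.split())
    ranked = sorted(index, key=lambda item: len(item[0] & q_words), reverse=True)
    return [doc for _, doc in ranked[:k]]

def transform_docs(docs: list[str], max_chars: int = 200) -> list[str]:
    # 10. Document transform: naive truncation as "compression".
    return [d[:max_chars] for d in docs]

def build_prompt(context: list[str], query: str) -> str:
    # 11, 13. Context assembly + prompting method.
    return "Context:\n" + "\n".join(context) + f"\n\nQuestion: {query}"

def fake_llm(prompt: str) -> str:
    # 12, 14. Stand-in for the model call.
    return "Answer based on: " + prompt.splitlines()[-1]

sources = ["Paris is the capital of France.",
           "The Nile is a river in Africa.",
           "GPT-4 is a large language model."]
index = build_index(load_documents(sources))
query = transform_query("  What is the capital of France?  ")
docs = retrieve(index, query, k=1)
prompt = build_prompt(transform_docs(docs), query)
response = fake_llm(prompt)
```

In a real system you'd swap the bag-of-words index for embeddings plus a vector store, and the stub for an actual model call — but the handoffs between stages stay the same.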

The point isn’t to cover 100% of retrieval permutations, but rather to provide a starting point for discussion.

I tried bullet-proofing this view on Twitter, but I’d love your feedback. What do you think?

All of the content is free for now. I may paywall some of it in the future, but as friends, email me and you’ll get the family discount.

I’d love to hear a few things from you:

  • What’re you working on? How can I help?
  • What’re your thoughts on new OpenAI GPTs, has one provided value to you? If so, which?

In case you missed it

  • This analysis on GPT-4-Preview's context length got more traction than I was expecting
  • Awesome post from Charles Frye that extends the "LLM OS" metaphor made famous by Andrej Karpathy


Greg Kamradt

Twitter / LinkedIn / Youtube / Work With Me
