Jeroen Herczeg
A Retrieval-Augmented Generation (RAG) system combines information retrieval with Large Language Models (LLMs) to improve the quality and relevance of generated text. This allows LLMs to access up-to-date or private information and provide factual answers with verifiable sources.
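To make the idea concrete, here is a minimal sketch of the retrieve-then-prompt loop, not the implementation used in the book. Everything in it is an assumption made for illustration: the three-document corpus, the `retrieve` and `build_prompt` helpers, and the use of word-count vectors as a crude stand-in for real embeddings. The call to an actual LLM is left out, so the sketch stops at the prompt that would be sent to it.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus standing in for private or up-to-date documents.
DOCUMENTS = {
    "doc-1": "The office is closed on public holidays.",
    "doc-2": "Support tickets are answered within two business days.",
    "doc-3": "The cafeteria serves lunch from noon until two.",
}


def bag_of_words(text):
    """Lowercased word counts; a crude stand-in for an embedding vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, k=1):
    """Return the ids of the k documents most similar to the question."""
    query = bag_of_words(question)
    ranked = sorted(
        DOCUMENTS,
        key=lambda doc_id: cosine(query, bag_of_words(DOCUMENTS[doc_id])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, k=1):
    """Combine the retrieved passages and the question into one prompt."""
    context = "\n".join(
        f"[{doc_id}] {DOCUMENTS[doc_id]}" for doc_id in retrieve(question, k)
    )
    return (
        "Answer using only the sources below and cite their ids.\n\n"
        f"{context}\n\nQuestion: {question}"
    )


print(build_prompt("How fast are support tickets answered?"))
```

Because each retrieved passage carries its id into the prompt, the model can be asked to cite it, which is where the verifiable sources come from.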
When you first learn about RAG, it can look like a simple way to improve the accuracy of an LLM. Once you start implementing it, though, you find that it is considerably more involved and demands a solid grasp of both retrieval and generation techniques.
In this book, we will start by examining how large language models work, as well as their limitations and challenges. Next, we'll take a close look at the RAG architecture and how it can improve the performance of a language model. Finally, we'll discuss the most frequent difficulties encountered when developing a RAG application.
Throughout the book, you will learn through live interactive examples that will help solidify your understanding. By the end, you will have the confidence to deploy powerful RAG applications that solve real-world problems.
“Hands-On Retrieval-Augmented Generation” consists of 240 tightly edited pages designed to teach you what you need to know about Retrieval-Augmented Generation, with no filler.
What is Retrieval-Augmented Generation?
Hallucinations and Knowledge Gaps
Knowledge Cutoff
Limited Contextual Understanding
Observability
Question-Answering System
Conversational Agent
Real-time Event Commentary
Content Generation
Create an OpenAI Account
Create an Assistant
Adding the Data
Potential Drawbacks
Model Architecture
Weights and Biases
Tokenization
Training
Inference
Settings
Sliding Window
Attention Mechanism
Memory
Anatomy of a Prompt
Zero-shot Prompting
Few-shot Prompting
Chain-of-Thought Prompting
Transfer Learning
Domain-Specific Fine-Tuning
Document Formats
SERP and REST APIs
Web Scraping
Databases
PDFs, Images, and Multimedia
Text Splitting
Converting Unstructured to Structured Data
Dealing with Noisy Data
Keyword Search vs Semantic Search
Definition of a Vector
Norms, Distances, and Similarities
Vector Operations
What are Embeddings?
Types of Embeddings
Training Embeddings
Pre-trained Embeddings
Cosine Similarity
Euclidean Distance
Dot Product
Introduction to ANN Search
ANN Algorithms
Trade-offs in ANN Search
Introduction to Vector Databases
Vector Search Engines
Vector Indexing and Querying
Introduction
Implementation
Introduction
Implementation
Introduction
Implementation
Introduction
Re-ranking Strategies
Introduction
Hybrid Search Strategies
Introduction
Multi-Modal Search Strategies
Contextual Embeddings
Dynamic Embeddings
Hosted Services
On-premises Deployment
Model Versioning
CI/CD Pipelines
Evaluation
Monitoring
Horizontal and Vertical Scaling
Performance Tuning
Cost Analysis
Monitoring
Data Privacy
Model Security
Compliance
Explore interactive examples deployed on Hugging Face Spaces to accelerate your understanding as you progress through the book.
Enter your email address and I’ll send you the first chapter from the book for free.