
Re-Ranking Algorithms

Advanced Re-Ranking Techniques in Modern Retrieval Systems

Re-ranking is the second-stage optimization layer in multi-stage retrieval pipelines (e.g., BM25 → ANN → Re-ranker → LLM).
Its purpose is to improve ranking metrics such as precision@k, nDCG, and MRR, and to reduce hallucination risk in RAG systems.


1. MMR (Maximal Marginal Relevance)

Concept

Maximal Marginal Relevance (MMR) balances:

- Relevance of each candidate document to the query
- Redundancy with the documents already selected

It prevents near-duplicate documents from dominating top-k results.

Formula

[ MMR = \arg\max_{D_i \in R \setminus S} \left[ \lambda \cdot Sim(D_i, Q) - (1-\lambda) \cdot \max_{D_j \in S} Sim(D_i, D_j) \right] ]

Where:

- Q: the query
- R: the set of retrieved candidate documents
- S: the set of documents already selected
- Sim(·, ·): a similarity function (e.g., cosine similarity over embeddings)
- λ ∈ [0, 1]: the relevance/diversity trade-off (λ = 1 is pure relevance, λ = 0 is pure diversity)

Characteristics

- Purely heuristic: no training required
- Greedy: documents are selected one at a time
- Cheap: only similarity computations, no model inference

Limitations

- Quality depends entirely on the underlying similarity function
- λ must be tuned per application
- Greedy selection can be suboptimal for the result set as a whole
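
The greedy loop implied by the formula can be sketched in plain Python. This is a minimal sketch that assumes similarities are precomputed: `query_sims[i] = Sim(D_i, Q)` and `doc_sims[i][j] = Sim(D_i, D_j)`; in practice these would come from an embedding model.

```python
def mmr_rerank(query_sims, doc_sims, lam=0.7, k=5):
    """Greedy MMR selection over candidate indices 0..n-1."""
    candidates = list(range(len(query_sims)))
    selected = []
    while candidates and len(selected) < k:
        def mmr_score(i):
            # Penalty = max similarity to anything already selected.
            redundancy = max((doc_sims[i][j] for j in selected), default=0.0)
            return lam * query_sims[i] - (1 - lam) * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With λ = 0.5, a near-duplicate of an already-selected document is pushed below a less relevant but novel one, which is exactly the deduplication behavior described above.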


2. Cross-Encoder Re-Ranking (Neural Rerankers)

Concept

Unlike bi-encoders (separate embeddings), cross-encoders process: [CLS] Query [SEP] Document in a single transformer forward pass.

Architecture

- Query and document are concatenated ([CLS] Query [SEP] Document) and encoded jointly, so every query token attends to every document token
- A classification head on the [CLS] representation outputs a relevance score

Strengths

- Highest precision among standard rerankers, thanks to full query-document token interaction
- Strong off-the-shelf model availability

Weaknesses

- One full forward pass per query-document pair: document representations cannot be pre-computed
- Too slow to score an entire corpus; practical only on a shortlist (e.g., top-100)

Common Models

- cross-encoder/ms-marco-MiniLM-L-6-v2 (sentence-transformers)
- BAAI/bge-reranker-base / BAAI/bge-reranker-large
- monoBERT / monoT5

Typical Pipeline

Stage 1: Retrieve top-100 via BM25/ANN
Stage 2: Re-rank top-100 via cross-encoder
Stage 3: Select top-5 for LLM context
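
The three stages above can be sketched as a single function. The `token_overlap` scorer below is a toy stand-in for both stages; in a real pipeline the cheap scorer would be BM25/ANN and the expensive scorer a cross-encoder (e.g., `CrossEncoder(...).predict` from sentence-transformers).

```python
def token_overlap(query, doc):
    """Toy lexical score: number of shared tokens. Stand-in for a real scorer."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def rerank_pipeline(query, corpus, retrieve_k=100, final_k=5,
                    cheap_score=token_overlap, expensive_score=token_overlap):
    # Stage 1: cheap score over the whole corpus, keep a shortlist.
    shortlist = sorted(corpus, key=lambda d: cheap_score(query, d),
                       reverse=True)[:retrieve_k]
    # Stage 2: expensive score over the shortlist only.
    reranked = sorted(shortlist, key=lambda d: expensive_score(query, d),
                      reverse=True)
    # Stage 3: top-k for the LLM context window.
    return reranked[:final_k]
```

The key property is cost asymmetry: the expensive scorer only ever sees `retrieve_k` documents, regardless of corpus size.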


3. Neural Diversity Rerankers (Beyond MMR)

Concept

Learned models that optimize both:

- Relevance to the query
- Diversity across the selected result set

Instead of applying a fixed heuristic as MMR does, these models learn diversity from data.

Methods

Determinantal Point Processes (DPP)

Encourages diverse subset selection via determinant maximization.
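
A greedy DPP MAP selection can be sketched in pure Python. The kernel is assumed to carry relevance on the diagonal and inter-document similarity off-diagonal, so adding an item similar to one already selected shrinks the determinant:

```python
def det(m):
    """Determinant by cofactor expansion (fine for the small submatrices here)."""
    if len(m) == 1:
        return m[0][0]
    return sum(((-1) ** j) * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def dpp_greedy(kernel, k):
    """Greedily add the item that maximizes det of the selected submatrix."""
    selected, candidates = [], list(range(len(kernel)))
    for _ in range(min(k, len(kernel))):
        def gain(i):
            idx = selected + [i]
            return det([[kernel[a][b] for b in idx] for a in idx])
        best = max(candidates, key=gain)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With a high off-diagonal similarity between two items, the second pick jumps to a dissimilar item even if its diagonal (relevance) is lower, mirroring MMR's behavior but derived from determinant maximization.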

xQuAD

Explicitly models query subtopics.

Neural Subtopic Modeling

Transformer-based diversity scoring.

Strengths

- Learn diversity from data rather than relying on a fixed heuristic
- Can model query subtopics explicitly (xQuAD-style)

Limitations

- Require training data with diversity or subtopic labels
- More expensive than MMR at inference time


4. Transformer-Based Generative Retrieval

(GenRE / SEAL / SPLADE / ColBERTv2)

These blur the boundary between retrieval and generation.


4.1 GenRE (Generative Retrieval)

Concept

Model directly generates document IDs given a query. Query → Transformer → DocID tokens

Pros

- No separate vector index: the model parameters effectively are the index
- Retrieval is end-to-end differentiable

Cons

- Adding or updating documents requires retraining or special index-update schemes
- Scaling to very large corpora remains an open research problem


4.2 SEAL

SEAL encodes documents as token sequences (substrings that actually occur in the corpus) and retrieves by generating such substrings, constraining decoding with an FM-index over the corpus.


4.3 SPLADE

Sparse lexical expansion via transformer.

Advantages:

- Sparse outputs are compatible with standard inverted indexes
- Term weights are interpretable
- Strong zero-shot / out-of-domain performance
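
Scoring with SPLADE-style representations is just a sparse dot product over term-weight maps. The weights below are hand-set toys; a real model would produce them from the transformer's expansion head (note "card" is expanded even though it never appears in the query):

```python
def splade_score(query_weights, doc_weights):
    """Sparse dot product of term -> weight maps (toy stand-in weights)."""
    return sum(w * doc_weights.get(term, 0.0)
               for term, w in query_weights.items())

# Hypothetical expanded query: the model added "graphics" as a related term.
query = {"gpu": 1.2, "graphics": 0.4}
doc_a = {"gpu": 0.9, "card": 0.3}      # lexical match on "gpu"
doc_b = {"cpu": 1.1, "memory": 0.6}    # no overlapping terms
```

Because the representation is sparse, this same computation maps directly onto an inverted index: each nonzero term is a posting list, exactly like BM25.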


4.4 ColBERTv2

Late-interaction architecture:

- Query and document are encoded into per-token embeddings independently
- Relevance = sum over query tokens of the maximum similarity (MaxSim) to any document token

Advantages:

- Document token embeddings can be pre-computed, compressed, and indexed
- Near cross-encoder precision at a fraction of the query-time cost
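
The MaxSim operator is simple enough to sketch directly; here token embeddings are plain lists standing in for the model's output vectors:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def maxsim_score(query_vecs, doc_vecs):
    """ColBERT-style late interaction: for each query token embedding,
    take the max similarity to any document token, then sum over query tokens."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)
```

Because `doc_vecs` depends only on the document, it can be computed offline; only the cheap max/sum interaction happens at query time, which is the whole point of the architecture.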


5. Learned Global Reranking Frameworks

Concept

Instead of scoring each document independently, these frameworks model the candidate list as a whole, so a document's score depends on the other candidates.

Techniques

Listwise Learning-to-Rank

Loss functions defined over the entire ranked list (e.g., ListNet, ListMLE).

Transformer Listwise Models

All candidates are encoded jointly, so attention can capture inter-document interactions such as redundancy.

Benefits

- Directly optimizes list-level metrics such as nDCG
- Captures redundancy and complementarity between candidates

Challenges

- Requires list-level (or graded) relevance judgments for training
- Inference cost grows with the length of the candidate list
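
As a concrete example of a listwise objective, the ListNet top-1 loss compares the softmax distribution induced by predicted scores against the one induced by ground-truth scores; a minimal sketch:

```python
import math

def softmax(xs):
    m = max(xs)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def listnet_loss(predicted_scores, true_scores):
    """ListNet top-1 loss: cross-entropy between the two softmax
    distributions over the whole candidate list."""
    p_true = softmax(true_scores)
    p_pred = softmax(predicted_scores)
    return -sum(t * math.log(p) for t, p in zip(p_true, p_pred))
```

Unlike a pointwise loss, swapping the scores of two documents changes the loss even if each individual score stays "close", which is what makes the objective list-aware.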


6. LLM-Centric Reranking

(In-Context / Prompt-Based Rerankers)

Concept

Use a Large Language Model to rank candidate documents via prompting.

Example:

Given query Q and documents D1…D5, rank them by relevance.

Methods

Score-based Prompting

LLM outputs relevance score.

Pairwise Comparison

LLM compares D1 vs D2 iteratively.
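
A pairwise scheme reduces to sorting with the LLM as the comparator. In this sketch, `llm_prefers(query, a, b)` is a hypothetical callable standing in for a real LLM call that answers "is a more relevant than b?"; a bubble-style pass keeps each prompt down to two documents:

```python
def pairwise_rerank(query, docs, llm_prefers):
    """Order docs by repeated adjacent pairwise comparisons (bubble sort),
    where each comparison is one call to the LLM judge."""
    docs = list(docs)
    n = len(docs)
    for i in range(n):
        for j in range(n - 1 - i):
            # Swap if the judge prefers the later document of the pair.
            if llm_prefers(query, docs[j + 1], docs[j]):
                docs[j], docs[j + 1] = docs[j + 1], docs[j]
    return docs
```

The O(n²) comparison count is why pairwise LLM reranking is normally applied to short candidate lists only, and why score-based (single-prompt) variants exist.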

Chain-of-Thought Ranking

LLM explains ranking before output.

Strengths

- Zero-shot: no task-specific training data required
- Can apply world knowledge and multi-step reasoning to judge relevance

Weaknesses

- High latency and per-query cost
- Output format can be unstable, and rankings may vary across runs
- Context-window limits cap the number of candidates per prompt


Comparative Summary

| Method           | Learning-Based | Scalable | Diversity-Aware | Precision | Cost      |
|------------------|----------------|----------|-----------------|-----------|-----------|
| MMR              | ❌             | ✅       | ✅              | Medium    | Low       |
| Cross-Encoder    | ✅             | ❌       | ❌              | High      | High      |
| Neural Diversity | ✅             | ⚠️       | ✅              | High      | Medium    |
| GenRE / SEAL     | ✅             | ❌       | ❌              | High      | High      |
| SPLADE           | ✅             | ✅       | ❌              | High      | Medium    |
| ColBERTv2        | ✅             | ✅       | ❌              | High      | Medium    |
| Global Listwise  | ✅             | ⚠️       | ✅              | Very High | High      |
| LLM Reranking    | ❌ (usually)   | ⚠️       | ✅              | Very High | Very High |

When to Use What (RAG Engineering View)

- Large corpora, latency-sensitive: SPLADE or ColBERTv2 as the scalable stage
- Precision-critical top-k for LLM context: cross-encoder over a shortlist (~100 candidates)
- Redundant corpora (news, forums, product catalogs): add MMR or DPP on top of the relevance reranker
- Highest quality at low query volume, budget permitting: LLM-based reranking


Final Engineering Principle

Retrieval quality impacts hallucination rate more than model size.

Re-ranking is not an optimization detail —
it is a core reliability mechanism in modern LLM systems.