Strategic Chunk Retrieval — Chunk-Level Memory with Selective Curation

DynamicCheatsheet_StrategicChunkRetrieval replaces the monolithic cheatsheet with a structured, chunk-level memory store and retrieves items based on their full strategy content rather than only their source question.

How It Works

  1. Chunk-Level Memory Store — Each entry is a single <memory_item>: a self-contained strategy, code snippet, or insight. Every item carries metadata including the embedding of the strategy text and a usage counter initialized to 1.

  2. Content-Based Retrieval with Usage Bonus — Given a new input, every memory item is scored as a weighted sum of cosine similarity (weight α=0.85) and a logarithmic usage bonus (weight 1−α). The logarithmic normalization prevents high-count items from dominating while still surfacing battle-tested strategies. Both top-k selection and a probability-threshold variant are supported.

  3. Selective Curation — The curator receives only the retrieved chunks, not the full store, and produces an updated set of <memory_item> blocks. Retrieved items are updated and re-embedded; non-retrieved items remain untouched. This localization prevents information loss in the broader store and keeps the curator prompt short and focused.
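As an illustration, the blended scoring in step 2 could be sketched as follows. This is not the project's implementation: the exact log normalization, field names, and dict-based item layout are assumptions made for the example.

```python
import math

def cosine(a, b):
    """Plain cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def score_items(query_emb, items, alpha=0.85):
    """Score each item as alpha * cosine + (1 - alpha) * log-normalized usage.

    The log normalization (log(1 + count) / log(1 + max_count)) is an
    assumed form; it keeps the bonus in [0, 1] so frequently used items
    get a boost without overwhelming content similarity.
    """
    max_count = max(it["count"] for it in items)
    scored = []
    for it in items:
        sim = cosine(query_emb, it["embedding"])
        usage = (math.log(1 + it["count"]) / math.log(1 + max_count)
                 if max_count > 1 else 0.0)
        scored.append((alpha * sim + (1 - alpha) * usage, it))
    return sorted(scored, key=lambda t: t[0], reverse=True)

def retrieve_top_k(query_emb, items, k=3, alpha=0.85):
    """Return the k highest-scoring memory items."""
    return [it for _, it in score_items(query_emb, items, alpha)[:k]]
```

With α=0.85, content similarity dominates; the usage term only breaks ties among comparably relevant strategies.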

Advantages Over DC-Cumulative

  • Bounded context: The generator and curator see only relevant chunks, not an ever-growing flat document.
  • Targeted updates: Only retrieved chunks are rewritten; the rest of the store remains stable (proven by the Non-Retrieved Item Preservation proposition in our report).
  • Usage-aware prioritization: Frequently applied strategies accumulate higher counts, surfacing the most reliable patterns.
  • Reduced curation loss: Per-strategy information loss scales with retrieval count rather than total store size.
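The targeted-update property above can be sketched as a merge step that touches only retrieved items. Function and field names here are hypothetical; `embed` stands in for whatever embedding call the system uses.

```python
def apply_curation(store, retrieved_ids, curated_items, embed):
    """Merge curator output back into the memory store.

    Retrieved items are replaced by their curated versions and
    re-embedded; non-retrieved items pass through untouched, which is
    what keeps the rest of the store stable.
    """
    curated_by_id = {it["id"]: it for it in curated_items}
    new_store = []
    for it in store:
        if it["id"] in retrieved_ids:
            upd = curated_by_id.get(it["id"])
            if upd is None:
                continue  # curator chose to drop this item
            upd = dict(upd)
            upd["embedding"] = embed(upd["text"])
            upd["count"] = it["count"] + 1  # bump usage on retrieval
            new_store.append(upd)
        else:
            new_store.append(it)  # preserved verbatim
    # items the curator newly introduced start with count 1
    existing = {it["id"] for it in store}
    for it in curated_items:
        if it["id"] not in existing:
            it = dict(it)
            it["embedding"] = embed(it["text"])
            it["count"] = 1
            new_store.append(it)
    return new_store
```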

Key Results

  • 28.2% on AIME 2020–2024 (vs 24.8% DC-RS)
  • 53.0% on IneqMath (vs 48.0% Default, vs 47.0% DC-Cu/DC-RS)
  • 100% on MathEquationBalancer

Limitation

On DataSIR (75.0%), DC-SCR drops below the Default (87.0%) because strategy-only retrieval cannot capture structural problem similarity — this limitation motivated the dual-embedding retrieval in Dynamic Ledger.

Example Command

python3 run_benchmark.py \
  --task IneqMath_all \
  --approach_name DynamicCheatsheet_StrategicChunkRetrieval \
  --model_name openai/gpt-4o \
  --cheatsheet_prompt_path prompts/curator_prompt_for_strategic_chunk_retrieval.txt \
  --retrieve_top_k 3 \
  --max_n_samples 600