Memory-augmented QA

A neural QA system for compositional questions, built on attention over key–value facts: attend, retrieve, answer.



  • Problem: Answer compositional questions by retrieving the right fact(s) from a structured knowledge store.
  • Approach: Key–Value Memory Network with batched dot-product attention (attend → retrieve → answer).
  • Outcome: Single-hop reasoning baseline with controlled dataset construction and analysis of common failure modes.

This project develops a memory-augmented question answering (QA) system that stores facts as explicit key–value pairs and answers questions by attending over that memory, retrieving the relevant facts, and classifying the answer.

The key components and techniques used in this project include:

  • Model: implemented a Key–Value Memory Network (KVMemNet) that stores facts as (key, value) pairs and retrieves values via attention over keys (a model sketch follows the system diagram below).
  • Attention: batched dot-product attention with weighted value aggregation; supports single-hop, retrieval-based reasoning.
  • Data design: built a controlled dataset using templated questions, a reduced vocabulary, and distractor sampling to tune difficulty and scale (see the generator sketch after this list).
  • Training: optimized with cross-entropy loss; monitored accuracy and loss, with overfitting mitigations (e.g., early stopping, regularization); a training-loop sketch follows the model code below.
  • Retrieval quality: improved key formatting (natural-language keys) and balanced distractors to force fine-grained discrimination.
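
As a concrete illustration of the data design, here is a minimal sketch of a templated generator with near-miss distractor sampling. The fact triples, question template, and `make_example` helper are hypothetical stand-ins for illustration, not the project's actual schema.

```python
import random

# Toy fact triples (subject, relation, answer); illustrative only.
FACTS = [
    ("Paris", "capital of", "France"),
    ("Berlin", "capital of", "Germany"),
    ("Madrid", "capital of", "Spain"),
    ("Rome", "capital of", "Italy"),
    ("Everest", "located in", "Nepal"),
    ("K2", "located in", "Pakistan"),
]

QUESTION_TEMPLATE = "what is {subject} the {relation}?"

def make_example(target, facts, n_distractors=3, rng=random):
    """Build one QA example: a templated question, the answer, and a
    small memory mixing the gold fact with sampled distractors."""
    subject, relation, answer = target
    question = QUESTION_TEMPLATE.format(subject=subject, relation=relation)
    # Prefer "near-miss" distractors that share the relation, so the
    # model must discriminate on the subject, not just the topic.
    near_miss = [f for f in facts if f[1] == relation and f != target]
    others = [f for f in facts if f[1] != relation]
    rng.shuffle(near_miss)
    distractors = near_miss[:n_distractors]
    if len(distractors) < n_distractors:
        distractors += rng.sample(
            others, min(n_distractors - len(distractors), len(others)))
    # Natural-language keys; values are the answer strings.
    memory = [(f"{s} {r}", o) for s, r, o in [target] + distractors]
    rng.shuffle(memory)
    return question, memory, answer

question, memory, answer = make_example(FACTS[0], FACTS, rng=random.Random(0))
# question: "what is Paris the capital of?"
# memory:   (key, value) pairs with the gold fact among distractors
# answer:   "France"
```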

System Sketch


            Question → Encode → Attention over Keys → Aggregate Values → Classify Answer
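
To make the pipeline concrete, here is a minimal single-hop KVMemNet sketch in PyTorch. The mean bag-of-words `EmbeddingBag` encoder and the dimension choices are my assumptions; the project's actual encoder may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class KVMemNet(nn.Module):
    """Single-hop key-value memory network:
    encode -> attend over keys -> aggregate values -> classify."""

    def __init__(self, vocab_size: int, embed_dim: int, num_answers: int):
        super().__init__()
        # Mean bag-of-words encoder shared by questions, keys, and values.
        self.embed = nn.EmbeddingBag(vocab_size, embed_dim, mode="mean")
        self.out = nn.Linear(embed_dim, num_answers)

    def encode(self, token_ids: torch.Tensor) -> torch.Tensor:
        # Collapse the token dimension: (..., L) -> (..., D).
        flat = token_ids.reshape(-1, token_ids.shape[-1])
        return self.embed(flat).reshape(*token_ids.shape[:-1], -1)

    def forward(self, question, keys, values):
        # question: (B, Lq); keys, values: (B, M, Lk) integer token ids.
        q = self.encode(question)                   # (B, D)
        k = self.encode(keys)                       # (B, M, D)
        v = self.encode(values)                     # (B, M, D)
        scores = torch.einsum("bd,bmd->bm", q, k)   # batched dot-product
        attn = F.softmax(scores, dim=-1)            # attention over keys
        read = torch.einsum("bm,bmd->bd", attn, v)  # aggregate values
        return self.out(q + read), attn             # answer logits + weights
```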
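
And a corresponding training-loop sketch with the cross-entropy objective and a simple patience-based early stop. The loader format and hyperparameters here are placeholders, not the project's recorded settings.

```python
import torch
import torch.nn as nn

def train(model, train_loader, val_loader, epochs=50, lr=1e-3, patience=5):
    """Cross-entropy training with patience-based early stopping."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    best_val, stale = float("inf"), 0
    for epoch in range(epochs):
        model.train()
        for question, keys, values, answer in train_loader:
            logits, _ = model(question, keys, values)
            loss = loss_fn(logits, answer)
            opt.zero_grad()
            loss.backward()
            opt.step()

        # Validation loss drives early stopping.
        model.eval()
        total, n = 0.0, 0
        with torch.no_grad():
            for question, keys, values, answer in val_loader:
                logits, _ = model(question, keys, values)
                total += loss_fn(logits, answer).item() * answer.size(0)
                n += answer.size(0)
        val_loss = total / max(n, 1)

        if val_loss < best_val:
            best_val, stale = val_loss, 0
        else:
            stale += 1
            if stale >= patience:
                break  # stop once validation stops improving
    return model
```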
            

Engineering Notes

  • Why key formatting matters: small changes to how facts are represented as keys can significantly change attention sharpness and retrieval accuracy.
  • Distractors control difficulty: sampling “near-miss” distractors is more effective than random negatives for training robust retrieval.
  • Failure mode observed: attention can spread across multiple plausible keys, producing confident but wrong answers, especially with ambiguous templates (see the diagnostic sketch below).
  • What I'd add next: multi-hop reasoning (2–3 hops) and an evaluation split that isolates compositional generalization from memorization.
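
One way to instrument the diffuse-attention failure mode is to track the entropy of the attention weights the model already returns. This is a minimal sketch; the 0.8 threshold is an arbitrary placeholder, not a value from the project.

```python
import torch

def attention_entropy(attn: torch.Tensor) -> torch.Tensor:
    """Shannon entropy of each attention distribution; attn is (B, M)."""
    return -(attn * (attn + 1e-9).log()).sum(dim=-1)

def flag_diffuse(attn: torch.Tensor, frac: float = 0.8) -> torch.Tensor:
    """Flag examples whose attention entropy is close to the uniform
    maximum log(M), i.e. attention spread across many plausible keys."""
    max_entropy = torch.log(torch.tensor(float(attn.shape[-1])))
    return attention_entropy(attn) > frac * max_entropy
```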