Knowledge Graphs and GraphRAG for Agents

Guide · updated 2026-06-21 · Markdown variant

Graph-structured retrieval: when and how to use knowledge graphs over vector RAG for multi-hop, relational, and global corpus queries.

The problem vector RAG does not solve

Standard vector RAG retrieves chunks by embedding similarity. That works well for single-hop, fact-lookup questions ("What does X say about Y?") but fails on three distinct problem classes:

Multi-hop / relational — "How is entity A connected to entity C through B?" requires traversing a path that pure cosine similarity cannot reconstruct.
Global summarization — "What are the main themes across this whole corpus?" requires aggregating over the entire document set, not retrieving a few chunks.
Cross-document entity linking — the same person, organization, or concept appears under different surface forms across documents; a graph merges these into one node.

Knowledge graphs model a corpus as entities (nodes) and relationships (typed edges), enabling structured traversal at query time.
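
To make the multi-hop case concrete, here is a minimal sketch of a typed-edge graph and a breadth-first path search over it. The entity names, relation labels, and the helpers `build_adjacency` and `find_path` are illustrative assumptions, not part of any library mentioned in this guide.

```python
from collections import deque

# Toy graph as (source, relation, target) triples. All names are made up.
EDGES = [
    ("Alice", "WORKS_AT", "Acme"),
    ("Acme", "SUBSIDIARY_OF", "Globex"),
    ("Bob", "WORKS_AT", "Globex"),
]

def build_adjacency(edges):
    """Index edges by node; add reversed edges so traversal can go both ways."""
    adj = {}
    for src, rel, dst in edges:
        adj.setdefault(src, []).append((rel, dst))
        adj.setdefault(dst, []).append((f"~{rel}", src))  # "~" marks a reversed edge
    return adj

def find_path(adj, start, goal):
    """Breadth-first search; returns the shortest path as (src, relation, dst) hops, or None."""
    queue = deque([start])
    parent = {start: None}  # node -> (previous node, relation used to reach it)
    while queue:
        node = queue.popleft()
        if node == goal:
            hops = []
            while parent[node] is not None:
                prev, rel = parent[node]
                hops.append((prev, rel, node))
                node = prev
            return list(reversed(hops))
        for rel, nxt in adj.get(node, []):
            if nxt not in parent:
                parent[nxt] = (node, rel)
                queue.append(nxt)
    return None

adj = build_adjacency(EDGES)
path = find_path(adj, "Alice", "Bob")
for src, rel, dst in path:
    print(f"{src} -[{rel}]-> {dst}")
```

The returned hops form the connecting chain explicitly, which is the piece a similarity ranking over independent chunks does not provide.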

Cross-link: /resources/rag-retrieval-for-agents for the vector-RAG baseline.

Core concepts

Graph construction — an LLM reads each document chunk and extracts named entities plus the relationships between them (e.g., Person –WORKS_AT→ Company). This is expensive: each chunk costs one or more LLM calls. The output is a property graph stored in a graph database.
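
A rough sketch of the merge step that follows extraction is shown below. It assumes the LLM was prompted to return JSON with `entities` and `relations` keys; that schema, the sample payload, and the `merge_extraction` helper are hypothetical and not taken from any of the tools discussed here. The LLM call itself is omitted.

```python
import json

# Hypothetical JSON that an extraction prompt might return for one chunk.
SAMPLE_LLM_OUTPUT = """
{
  "entities": [
    {"name": "Ada Lovelace", "type": "Person"},
    {"name": "Analytical Engine", "type": "Machine"}
  ],
  "relations": [
    {"source": "Ada Lovelace", "type": "WROTE_ABOUT", "target": "Analytical Engine"}
  ]
}
"""

def merge_extraction(graph, raw_json, chunk_id):
    """Merge one chunk's extraction into the graph, de-duplicating nodes by lowercased name."""
    data = json.loads(raw_json)
    for ent in data["entities"]:
        key = ent["name"].strip().lower()
        node = graph["nodes"].setdefault(
            key, {"name": ent["name"], "type": ent["type"], "chunks": []}
        )
        if chunk_id not in node["chunks"]:
            node["chunks"].append(chunk_id)  # keep provenance back to source chunks
    for rel in data["relations"]:
        edge = (rel["source"].strip().lower(), rel["type"], rel["target"].strip().lower())
        graph["edges"].add(edge)  # a set, so repeated facts collapse into one edge
    return graph

graph = {"nodes": {}, "edges": set()}
merge_extraction(graph, SAMPLE_LLM_OUTPUT, chunk_id="doc1#0")
merge_extraction(graph, SAMPLE_LLM_OUTPUT, chunk_id="doc2#3")  # same facts seen in another chunk
print(len(graph["nodes"]), "nodes,", len(graph["edges"]), "edges")
```

Lowercasing names is only the crudest form of entity resolution; real pipelines usually need alias handling beyond this.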

Community detection — clustering algorithms (e.g., Leiden) group densely connected entities into communities. The Microsoft GraphRAG system (arxiv:2404.16130) then pre-generates a hierarchical summary for each community using an LLM. These summaries are the backbone of global search.
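
The shape of this step can be sketched with the standard library alone. Note the substitution: the code below groups nodes by connected components, a much cruder stand-in for Leiden, only to show how communities feed per-community summary prompts. The function names and prompt wording are assumptions.

```python
def connected_components(edges):
    """Group nodes into connected components (a crude stand-in for Leiden communities)."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen, communities = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, members = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            members.append(node)
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        communities.append(sorted(members))
    return communities

def summary_prompt(members, edges):
    """Build the text an LLM would be asked to summarize for one community."""
    inside = [f"{a} -- {b}" for a, b in edges if a in members and b in members]
    return "Summarize the following entities and relationships:\n" + "\n".join(inside)

EDGES = [("Alice", "Acme"), ("Acme", "Globex"), ("Paris", "France")]
communities = connected_components(EDGES)
prompts = [summary_prompt(set(c), EDGES) for c in communities]
print(communities)
```

Unlike this sketch, Leiden splits a single connected graph into dense sub-clusters and yields a hierarchy, so each level can be summarized at a different granularity.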

Local search — uses entities as the query entry point. The query is embedded to find nearest-neighbor entities, then graph traversal expands outward through relationships and community context to build the prompt. Best for targeted, entity-specific questions.
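
A toy version of that flow, under stated assumptions: the three-dimensional vectors are hand-written stand-ins for real embeddings, the graph is a small dictionary, and `local_search_context` is a hypothetical helper rather than the GraphRAG implementation.

```python
import math

# Toy entity embeddings; in practice these come from an embedding model.
ENTITY_VECTORS = {
    "Acme": [0.9, 0.1, 0.0],
    "Globex": [0.7, 0.3, 0.1],
    "Paris": [0.0, 0.2, 0.9],
}
# Outgoing edges per entity as (relation, target) pairs.
NEIGHBORS = {
    "Acme": [("SUBSIDIARY_OF", "Globex"), ("EMPLOYS", "Alice")],
    "Globex": [("HEADQUARTERED_IN", "Paris")],
    "Paris": [],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def local_search_context(query_vector, top_k=1, hops=1):
    """Pick the top_k most similar entities, then collect facts within `hops` edges of them."""
    ranked = sorted(
        ENTITY_VECTORS,
        key=lambda e: cosine(query_vector, ENTITY_VECTORS[e]),
        reverse=True,
    )
    seeds = ranked[:top_k]
    frontier, visited, facts = list(seeds), set(seeds), []
    for _ in range(hops):
        next_frontier = []
        for entity in frontier:
            for rel, target in NEIGHBORS.get(entity, []):
                facts.append(f"{entity} {rel} {target}")
                if target not in visited:
                    visited.add(target)
                    next_frontier.append(target)
        frontier = next_frontier
    return seeds, facts

seeds, facts = local_search_context([1.0, 0.0, 0.0], top_k=1, hops=2)
print(seeds, facts)
```

The collected facts would then be placed in the prompt alongside any community context; `hops` controls how far the context spreads from the entry entities.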

Global search — instead of traversing, it broadcasts the query across all pre-computed community summaries, collects partial answers (MAP step), and aggregates them into a final response (REDUCE step). Best for whole-corpus sensemaking and thematic questions.
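
The map and reduce steps can be sketched as follows. The two callables stand in for LLM calls, and the keyword-overlap score and `min_score` cutoff are assumptions made for illustration; they are not the scoring used by any particular implementation.

```python
def global_search(query, community_summaries, map_fn, reduce_fn, min_score=1):
    """Map the query over every community summary, then reduce the useful partial answers."""
    partials = []
    for community_id, summary in community_summaries.items():
        answer, score = map_fn(query, summary)  # MAP: one call per community
        if score >= min_score:
            partials.append((score, community_id, answer))
    partials.sort(reverse=True)  # highest-scoring partial answers first
    return reduce_fn(query, [answer for _, _, answer in partials])  # REDUCE: one final call

# Stub functions standing in for LLM calls.
def keyword_map(query, summary):
    words = {w.strip("?.,").lower() for w in query.split()}
    score = sum(1 for w in summary.lower().split() if w.strip(".,") in words)
    return summary, score

def join_reduce(query, answers):
    return " | ".join(answers)

SUMMARIES = {
    "c0": "Supply chain risk dominates the logistics reports.",
    "c1": "Hiring plans focus on engineering roles.",
    "c2": "Risk controls in finance were audited.",
}
result = global_search("What risk themes appear?", SUMMARIES, keyword_map, join_reduce)
print(result)
```

Because the map step touches every summary, query cost grows with the number of communities, which is one reason to query a coarser level of the hierarchy when it suffices.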

Approaches and variants

Microsoft GraphRAG (microsoft/graphrag) — the reference implementation of the arxiv:2404.16130 paper ("From Local to Global: A Graph RAG Approach to Query-Focused Summarization"). Produces a full Leiden-community hierarchy with pre-generated summaries. Also supports a DRIFT search mode that combines global and local search. Documented at https://microsoft.github.io/graphrag/ and https://github.com/microsoft/graphrag. Note: the repository states that it is not an officially supported Microsoft offering and warns that indexing can be expensive, so start with a small dataset.

LightRAG (HKUDS/LightRAG, arXiv:2410.05779, EMNLP 2025) — a lighter alternative that builds a dual-layer knowledge graph (entities + higher-level concepts) alongside vector embeddings. Supports five query modes: local, global, hybrid, naive (pure vector), and mix (default). Incremental graph updates avoid full re-indexing. GitHub: https://github.com/HKUDS/LightRAG

HippoRAG (OSU-NLP-Group/HippoRAG, arXiv:2405.14831, NeurIPS 2024) — inspired by hippocampal indexing theory. Combines knowledge graphs with Personalized PageRank to model associative memory. The paper reports gains of up to 20% on multi-hop QA over prior retrieval methods, and single-step retrieval that matches or beats iterative retrieval (IRCoT) while being 10-30 times cheaper and 6-13 times faster. HippoRAG 2 (arXiv:2502.14802, ICML 2025) extends the approach toward continual learning. GitHub: https://github.com/OSU-NLP-Group/HippoRAG
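
For readers unfamiliar with Personalized PageRank, the sketch below shows a plain power-iteration version on a toy undirected graph. It illustrates the general algorithm only and is not HippoRAG's code; the graph, the seed choice, and the function name are assumptions.

```python
def personalized_pagerank(adj, seeds, damping=0.85, iterations=50):
    """Power-iteration Personalized PageRank.

    adj: node -> list of neighbor nodes (every neighbor must also be a key).
    seeds: node -> restart weight; at least one weight must be positive.
    """
    nodes = list(adj)
    total = sum(seeds.values())
    restart = {n: seeds.get(n, 0.0) / total for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        nxt = {n: (1 - damping) * restart[n] for n in nodes}
        for n in nodes:
            if adj[n]:
                share = damping * rank[n] / len(adj[n])
                for m in adj[n]:
                    nxt[m] += share
            else:
                # Dangling node: return its mass to the seed distribution
                for m in nodes:
                    nxt[m] += damping * rank[n] * restart[m]
        rank = nxt
    return rank

ADJ = {
    "Alice": ["Acme"],
    "Acme": ["Alice", "Globex"],
    "Globex": ["Acme", "Bob"],
    "Bob": ["Globex"],
    "Paris": ["France"],
    "France": ["Paris"],
}
rank = personalized_pagerank(ADJ, seeds={"Alice": 1.0})
for node, score in sorted(rank.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{node}: {score:.3f}")
```

Seeding the walk at entities found in the query concentrates probability mass on nodes reachable from them, so nearby entities score higher and disconnected ones score zero. That is the associative behavior the approach relies on.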

Hybrid vector + graph — combine a vector store for chunk retrieval with a graph store for relational traversal. LlamaIndex PropertyGraphIndex and Neo4j's neo4j-graphrag-python package both support this pattern natively.

Storage

Property graphs — nodes and edges carry key-value properties. Neo4j (Cypher query language) is the dominant choice; others include Amazon Neptune (supports both property graph and RDF) and Memgraph. Most GraphRAG tooling targets property graphs.

RDF / triple stores — represent facts as subject–predicate–object triples, queried via SPARQL. Stronger semantic interoperability (W3C standards, ontology reasoning) but heavier join overhead at scale. Less common in LLM-era agent pipelines.
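
To illustrate the triple model and why multi-pattern queries turn into joins, here is a small in-memory matcher in the spirit of a SPARQL basic graph pattern. It is a teaching sketch, not a SPARQL engine; the `ex:` identifiers and both function names are invented for the example.

```python
TRIPLES = [
    ("ex:Alice", "ex:worksAt", "ex:Acme"),
    ("ex:Bob", "ex:worksAt", "ex:Acme"),
    ("ex:Acme", "ex:locatedIn", "ex:Paris"),
]

def match(pattern, triples, bindings=None):
    """Yield variable bindings for one (s, p, o) pattern; terms starting with '?' are variables."""
    bindings = bindings or {}
    for triple in triples:
        candidate = dict(bindings)
        ok = True
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # Bind the variable, or check it agrees with an earlier binding
                if candidate.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield candidate

def query(patterns, triples):
    """Join several patterns on their shared variables."""
    results = [{}]
    for pattern in patterns:
        results = [b2 for b in results for b2 in match(pattern, triples, b)]
    return results

rows = query(
    [("?person", "ex:worksAt", "?org"), ("?org", "ex:locatedIn", "?city")],
    TRIPLES,
)
print(rows)
```

Each additional pattern rescans the triples for every partial result. Real triple stores use indexes to avoid that, but the join structure is the same.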

Tooling:

LlamaIndex PropertyGraphIndex — constructs a property graph from documents via LLM extraction, stores in a pluggable graph backend (Neo4j, in-memory, etc.), and exposes multiple retriever types including keyword-entity lookup and vector-based graph node retrieval. Docs: https://developers.llamaindex.ai/python/framework/module_guides/indexing/lpg_index_guide/
LangChain langchain-neo4j (GraphCypherQAChain) — generates Cypher queries from natural language against a Neo4j graph. The LLM is given the graph schema and produces a Cypher query; the chain executes it against the database, and the LLM then answers from the returned rows. Docs: https://python.langchain.com/docs/integrations/graphs/neo4j_cypher/
neo4j-graphrag-python — Neo4j's own Python package for building RAG pipelines over Neo4j, including a Knowledge Graph Builder pipeline that extracts entities from unstructured text. Docs: https://neo4j.com/docs/neo4j-graphrag-python/current/

Tradeoffs

Dimension	Vector RAG	GraphRAG
Build cost	Low (embed chunks)	High (many LLM calls per chunk)
Update / freshness	Re-embed changed chunks	Re-extract affected subgraph
Multi-hop queries	Poor	Strong
Global summarization	Poor	Strong
Operational complexity	Low	High
Best for	Fact lookup, semantic search	Relational, entity-centric, whole-corpus

Use GraphRAG when: your queries span multiple entities and require traversing relationships; you need "what is the overall picture" summaries; or entities appear under many surface forms across documents.

Stick with plain RAG when: questions are single-hop or semantic; the corpus is small or fast-changing; or build cost/latency constraints are tight.

Agentic angle

An agent can treat a knowledge graph as a tool: issue graph queries (Cypher, SPARQL, or a higher-level API) as discrete tool calls, inspect the subgraph returned, and decide whether to traverse further. This fits naturally into MCP or function-calling patterns — each traversal step is a tool call with a verifiable intermediate result.
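
A minimal sketch of that loop appears below. The tool name `get_neighbors`, the JSON call format, and the scripted `traverse_until` policy are assumptions made for illustration; in a real agent the model would decide each next call, and the transport would be MCP or a provider's function-calling API.

```python
import json

# Toy graph: entity -> list of (relation, target). Names are made up.
GRAPH = {
    "Alice": [("WORKS_AT", "Acme")],
    "Acme": [("SUBSIDIARY_OF", "Globex")],
    "Globex": [],
}

def get_neighbors(entity: str) -> dict:
    """Tool: return outgoing edges for one entity as a JSON-serializable dict."""
    edges = [{"relation": r, "target": t} for r, t in GRAPH.get(entity, [])]
    return {"entity": entity, "edges": edges}

TOOLS = {"get_neighbors": get_neighbors}

def handle_tool_call(call_json: str) -> str:
    """Dispatch a tool call of the form {"name": ..., "arguments": {...}}."""
    call = json.loads(call_json)
    return json.dumps(TOOLS[call["name"]](**call["arguments"]))

def traverse_until(start, target, max_steps=5):
    """Scripted stand-in for the agent: one tool call per step, inspect, then decide."""
    trace, frontier, seen = [], [start], {start}
    for _ in range(max_steps):
        if not frontier:
            break
        entity = frontier.pop(0)
        request = json.dumps({"name": "get_neighbors", "arguments": {"entity": entity}})
        result = json.loads(handle_tool_call(request))
        trace.append(result)  # each intermediate result can be logged and checked
        for edge in result["edges"]:
            if edge["target"] == target:
                return True, trace
            if edge["target"] not in seen:
                seen.add(edge["target"])
                frontier.append(edge["target"])
    return False, trace

found, trace = traverse_until("Alice", "Globex")
print(found, len(trace))
```

Capping the loop with `max_steps` keeps the traversal bounded, which matters when the agent controls how far to expand.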

A common production pattern combines vector retrieval (fast chunk lookup) with graph traversal (relational context): the vector index answers "what chunks are relevant?" and the graph answers "how are these entities connected?"
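
One way to picture the division of labor is the sketch below: rank chunks by cosine similarity, then pull the graph edges that link entities mentioned in the top chunks. The two-dimensional vectors, the chunk-to-entity mapping, and `hybrid_retrieve` are toy assumptions, not the API of LlamaIndex or neo4j-graphrag-python.

```python
import math

# Each chunk carries a toy embedding and the entities extracted from it.
CHUNKS = {
    "c1": {"vector": [1.0, 0.0], "entities": ["Alice", "Acme"]},
    "c2": {"vector": [0.8, 0.6], "entities": ["Globex"]},
    "c3": {"vector": [0.0, 1.0], "entities": ["Paris"]},
}
EDGES = [
    ("Alice", "WORKS_AT", "Acme"),
    ("Acme", "SUBSIDIARY_OF", "Globex"),
    ("Globex", "HEADQUARTERED_IN", "Paris"),
]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def hybrid_retrieve(query_vector, k=2):
    """Vector step picks relevant chunks; graph step returns edges among their entities."""
    ranked = sorted(
        CHUNKS,
        key=lambda c: cosine(query_vector, CHUNKS[c]["vector"]),
        reverse=True,
    )[:k]
    entities = {e for c in ranked for e in CHUNKS[c]["entities"]}
    links = [edge for edge in EDGES if edge[0] in entities and edge[2] in entities]
    return ranked, links

chunks, links = hybrid_retrieve([1.0, 0.2], k=2)
print(chunks, links)
```

The prompt then gets both the chunk text and the connecting edges, so the model sees relationships that span chunk boundaries.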

Cross-links: /resources/reliable-tool-calling · /resources/embeddings-vector-search · /resources/agent-memory-context

Verified sources

Microsoft GraphRAG repo: https://github.com/microsoft/graphrag
Microsoft GraphRAG docs: https://microsoft.github.io/graphrag/
GraphRAG paper (arXiv:2404.16130): https://arxiv.org/abs/2404.16130
LightRAG repo (HKUDS, arXiv:2410.05779): https://github.com/HKUDS/LightRAG
HippoRAG repo (OSU-NLP-Group, arXiv:2405.14831): https://github.com/OSU-NLP-Group/HippoRAG
HippoRAG 2 (arXiv:2502.14802): https://arxiv.org/abs/2502.14802
LlamaIndex PropertyGraphIndex guide: https://developers.llamaindex.ai/python/framework/module_guides/indexing/lpg_index_guide/
LangChain Neo4j integration: https://python.langchain.com/docs/integrations/graphs/neo4j_cypher/
Neo4j GraphRAG Python package: https://neo4j.com/docs/neo4j-graphrag-python/current/

#rag #knowledge-graph #graphrag #retrieval #neo4j #agents

Category: Guide