
A comprehensive deep dive into DSPy's philosophy, architecture, core abstractions (signatures, modules, predictors), optimization system, and how to use it to build reliable, data-driven LLM applications.
Traditional Approach:
1. Write a prompt by hand
2. Test it on a few examples
3. Tweak wording when it fails
4. Add more instructions
5. Prompt becomes unwieldy
6. Model changes → prompts break
7. Repeat forever

Problems:
- Prompts are brittle (small changes break them)
- No systematic optimization
- Prompt-model coupling (prompts don't transfer)
- Hard to maintain at scale
- Human intuition is the only guide

DSPy Approach:
1. Define WHAT you want (signatures)
2. Define HOW to compose (modules)
3. Provide training examples
4. Let DSPy optimize prompts/weights
5. Get compiled, optimized program

Benefits:
- Prompts are generated, not hand-written
- Systematic optimization with metrics
- Portable across models
- Modular and maintainable
- Data-driven improvement

THE DSPY INSIGHT
Traditional: You write prompts, hope they work
DSPy: You write programs, DSPy writes prompts
The prompt becomes a COMPILED ARTIFACT, not source code

DSPy Architecture (each layer builds on the one below it):

APPLICATION LAYER: Your DSPy program (modules composed together)
    ↓
ABSTRACTION LAYER: Signatures (schemas), Modules (logic), Predictors (LM calls)
    ↓
OPTIMIZATION LAYER: Teleprompters, Metrics, Assertions
    ↓
INTEGRATION LAYER: LM Adapters, Retrieval Models, Tools

Layer | Purpose | Components |
Application | Your business logic | Custom modules, pipelines |
Abstraction | Define structure and behavior | Signatures, Modules, Predictors |
Optimization | Improve program quality | Teleprompters, Metrics, Assertions |
Integration | Connect to external systems | LM adapters, Retrievers, Tools |

class QuestionAnswering(dspy.Signature):
    """Answer questions based on provided context."""

    context = dspy.InputField(desc="Background information")
    question = dspy.InputField(desc="The question to answer")
    answer = dspy.OutputField(desc="A concise answer")

SIGNATURE STRUCTURE

class TaskName(dspy.Signature):
    """Task description (becomes part of prompt)"""

    # Input fields - data provided to the LM
    input1 = dspy.InputField(desc="description")
    input2 = dspy.InputField()  # desc is optional

    # Output fields - data extracted from LM response
    output1 = dspy.OutputField(desc="description")
    output2 = dspy.OutputField()

# Inline signature format: "input1, input2 -> output1, output2"

# Simple QA
qa = dspy.Predict("question -> answer")

# With context
rag = dspy.Predict("context, question -> answer")

# Multiple outputs
analysis = dspy.Predict("text -> sentiment, confidence, keywords")

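To make the inline format concrete, here is a minimal sketch of how a string such as "text -> sentiment, confidence, keywords" can be split into input and output field names. The helper `parse_inline_signature` is hypothetical and is not DSPy's own parser; it also ignores anything beyond plain comma-separated names, which real DSPy signatures may support.

```python
def parse_inline_signature(spec: str) -> tuple[list[str], list[str]]:
    """Split 'a, b -> c, d' into (['a', 'b'], ['c', 'd']).

    Hypothetical helper for illustration only, not part of DSPy.
    """
    if spec.count("->") != 1:
        raise ValueError(f"expected exactly one '->' in {spec!r}")
    left, right = spec.split("->")
    inputs = [name.strip() for name in left.split(",") if name.strip()]
    outputs = [name.strip() for name in right.split(",") if name.strip()]
    if not inputs or not outputs:
        raise ValueError("a signature needs at least one input and one output")
    return inputs, outputs


inputs, outputs = parse_inline_signature("text -> sentiment, confidence, keywords")
print(inputs)   # ['text']
print(outputs)  # ['sentiment', 'confidence', 'keywords']
```

The point of the sketch is only that everything left of the arrow is provided by the caller and everything right of it is expected back from the LM.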
Signatures serve as:
1. CONTRACTS: Define what the module expects and produces
2. DOCUMENTATION: Self-documenting code via docstrings and field descriptions
3. PROMPT TEMPLATES: DSPy generates prompts from signature structure
4. TYPE HINTS: Enable validation and IDE support
5. OPTIMIZATION TARGETS: Teleprompters know what to optimize based on signatures

Like PyTorch's nn.Module, DSPy modules encapsulate logic and can be composed hierarchically.

class MyModule(dspy.Module):
    def __init__(self):
        super().__init__()
        # Initialize sub-modules and predictors
        self.predictor = dspy.Predict(MySignature)

    def forward(self, **kwargs):
        # Define the logic
        result = self.predictor(**kwargs)
        return result

class RAGPipeline(dspy.Module):
    def __init__(self, num_passages=3):
        super().__init__()
        self.retrieve = dspy.Retrieve(k=num_passages)
        self.generate = dspy.ChainOfThought("context, question -> answer")

    def forward(self, question):
        # Step 1: Retrieve relevant passages
        passages = self.retrieve(question).passages
        context = "\n".join(passages)

        # Step 2: Generate answer with context
        answer = self.generate(context=context, question=question)
        return answer

MODULE COMPOSITION

ComplexPipeline (dspy.Module)
├── QueryExpander (dspy.Module)
│   └── dspy.ChainOfThought("query -> expanded_queries")
├── MultiRetriever (dspy.Module)
│   ├── dspy.Retrieve(k=5)  # Primary retriever
│   └── dspy.Retrieve(k=3)  # Fallback retriever
├── Ranker (dspy.Module)
│   └── dspy.Predict("passages, query -> ranked_passages")
└── Generator (dspy.Module)
    └── dspy.ChainOfThought("context, query -> answer")

dspy.Predict is the basic predictor that directly calls the LM:

class BasicQA(dspy.Module):
    def __init__(self):
        super().__init__()
        # Create a predictor from a signature
        self.qa = dspy.Predict("context, question -> answer")

    def forward(self, context, question):
        # Call the predictor
        result = self.qa(context=context, question=question)
        return result.answer

When you call a predictor:
1. SIGNATURE → PROMPT: Convert signature + inputs into a prompt
2. PROMPT → LM: Send prompt to configured language model
3. LM RESPONSE → PARSING: Extract output fields from response
4. PARSED → dspy.Prediction: Return structured prediction object

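The four steps above can be sketched in plain Python. This is an illustration under stated assumptions, not DSPy's internal implementation: `build_prompt`, `parse_response`, and `fake_lm` are hypothetical names, the "name: value" prompt layout is invented for the example, and the language model is replaced by a stub that returns a fixed string.

```python
import re


def build_prompt(instruction, input_names, output_names, inputs):
    """Step 1: turn an instruction plus input values into a prompt string."""
    lines = [instruction, ""]
    for name in input_names:
        lines.append(f"{name}: {inputs[name]}")
    # Leave each output field open for the model to complete.
    for name in output_names:
        lines.append(f"{name}:")
    return "\n".join(lines)


def fake_lm(prompt):
    """Step 2: hypothetical stand-in for a real language model call."""
    return "answer: 4"


def parse_response(text, output_names):
    """Step 3: pull each 'name: value' output field out of the raw response."""
    parsed = {}
    for name in output_names:
        match = re.search(rf"^{re.escape(name)}:\s*(.*)$", text, flags=re.MULTILINE)
        parsed[name] = match.group(1).strip() if match else None
    return parsed


prompt = build_prompt(
    "Answer the question.", ["question"], ["answer"], {"question": "What is 2+2?"}
)
# Step 4: the parsed fields play the role of the structured prediction.
prediction = parse_response(fake_lm(prompt), ["answer"])
print(prediction)  # {'answer': '4'}
```

A missing field is returned as None here so the caller can decide how to react; a real predictor may handle malformed responses differently.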
# Simple prediction
predict = dspy.Predict("question -> answer")
result = predict(question="What is 2+2?")
print(result.answer)  # "4"

# With chain-of-thought reasoning
cot = dspy.ChainOfThought("question -> answer")
result = cot(question="What is 2+2?")
print(result.reasoning)  # "I need to add 2 and 2..."
print(result.answer)  # "4"

Standard Predict:
  Input: question
  Output: answer

ChainOfThought:
  Input: question
  Output: reasoning, answer (reasoning is automatically added)

The prompt instructs the LM to "think step by step" before answering.

cot_hint = dspy.ChainOfThoughtWithHint("question -> answer")
result = cot_hint(
    question="What is the capital of France?",
    hint="Think about European geography"
)

pot = dspy.ProgramOfThought("question -> answer")
result = pot(question="What is 15% of 80?")
# LM generates: result = 80 * 0.15
# DSPy executes the code
# Returns: 12

# Define available tools
def search(query: str) -> str:
    """Search the web for information."""
    # web_search is a placeholder for your own search function
    return web_search(query)

def calculate(expression: str) -> str:
    """Evaluate a mathematical expression."""
    # eval() is unsafe on untrusted input; use a restricted parser in production
    return str(eval(expression))

# Create ReAct agent
react = dspy.ReAct(
    "question -> answer",
    tools=[search, calculate]
)

result = react(question="What is the population of France times 2?")
# LM reasons: "I need to find France's population, then multiply"
# LM acts: search("population of France")
# LM reasons: "Got 67 million, now multiply by 2"
# LM acts: calculate("67000000 * 2")
# LM answers: "134,000,000"

retrieve = dspy.Retrieve(k=5)  # Get top 5 passages
results = retrieve("What causes rainbows?")
for passage in results.passages:
    print(passage)

mcc = dspy.MultiChainComparison(
    "question -> answer",
    num_chains=3
)
result = mcc(question="Complex reasoning problem...")
# Generates 3 independent reasoning chains
# Compares them to select best answer

BUILT-IN MODULES

Module | Description |
Predict | Direct input → output mapping |
ChainOfThought | Adds reasoning before answer |
ChainOfThoughtWithHint | CoT with optional hints |
ProgramOfThought | Generates & executes code |
ReAct | Reasoning + tool use |
Retrieve | Retrieves relevant passages |
MultiChainComparison | Multiple chains, picks best |

import dspy

# OpenAI
lm = dspy.LM("openai/gpt-4o", api_key="...")

# Anthropic
lm = dspy.LM("anthropic/claude-3-opus-20240229", api_key="...")

# Local models (via Ollama)
lm = dspy.LM("ollama/llama3.1")

# Azure OpenAI
lm = dspy.LM("azure/gpt-4", api_key="...", api_base="...")

# Configure globally
dspy.configure(lm=lm)

LM ADAPTER SYSTEM

Your DSPy Program
    ↓
dspy.LM (unified interface)
    ↓
Provider adapters: OpenAI, Anthropic, Azure, Local, ...
    ↓
Provider APIs (or a local runtime for local models)

lm = dspy.LM(
    "openai/gpt-4o",
    api_key="...",
    temperature=0.7,  # Sampling temperature
    max_tokens=1000,  # Max response length
    top_p=0.9,        # Nucleus sampling
    cache=True,       # Cache responses
    num_retries=3,    # Retry on failure
)

# Configure default LM
dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

# Use a different LM for specific calls
with dspy.context(lm=dspy.LM("openai/gpt-4o")):
    # This uses gpt-4o
    result = expensive_module(input)

# Back to default (gpt-4o-mini)
result = cheap_module(input)

import dspy

# ColBERT v2
rm = dspy.ColBERTv2(url="http://your-colbert-server:8080")

# Qdrant
from dspy.retrieve.qdrant_rm import QdrantRM
rm = QdrantRM("collection_name", qdrant_client)

# ChromaDB
from dspy.retrieve.chromadb_rm import ChromadbRM
rm = ChromadbRM("collection_name", persist_directory)

# Configure globally
dspy.configure(rm=rm)

class RAGModule(dspy.Module):
    def __init__(self, k=3):
        super().__init__()
        self.retrieve = dspy.Retrieve(k=k)
        self.generate = dspy.ChainOfThought("context, question -> answer")

    def forward(self, question):
        # Retrieve relevant passages
        retrieved = self.retrieve(question)
        context = "\n\n".join(retrieved.passages)

        # Generate answer
        return self.generate(context=context, question=question)

RETRIEVAL FLOW

1. Query comes in
   "What causes climate change?"
2. dspy.Retrieve(k=3)
   - Encodes query
   - Searches vector store
   - Returns top-k passages
3. Passages returned
   - "Greenhouse gases trap heat..."
   - "CO2 levels have risen 50%..."
   - "Human activities since 1850..."
4. Context provided to LM
   dspy.ChainOfThought(context=passages, question=query)
5. Grounded answer generated
   "Climate change is caused by greenhouse gases..."

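As a rough illustration of step 2 in the flow above, the sketch below ranks passages and returns the top k. It deliberately swaps the vector search for simple word overlap so that it runs without an embedding model or vector store; `retrieve_top_k` and the sample passages are hypothetical and unrelated to how dspy.Retrieve is implemented.

```python
def retrieve_top_k(query, passages, k=3):
    """Return the k passages sharing the most words with the query.

    Word overlap stands in for the embedding similarity a real retriever uses.
    """
    query_tokens = set(query.lower().split())
    scored = []
    for index, passage in enumerate(passages):
        overlap = len(query_tokens & set(passage.lower().split()))
        scored.append((overlap, index, passage))
    # Highest overlap first; ties keep the original passage order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [passage for _, _, passage in scored[:k]]


passages = [
    "Greenhouse gases trap heat and drive climate change",
    "Rainbows form when light refracts in water droplets",
    "Human activities are among the causes of climate change",
]
top = retrieve_top_k("What causes climate change", passages, k=2)
for passage in top:
    print(passage)
```

The returned passages would then be joined into the context string that the generation step receives.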
OPTIMIZATION TRIANGLE

Training data, metric, and program form the three corners of the triangle.

All three are required for optimization:
- Program: The DSPy modules to optimize
- Metric: How to measure success
- Training data: Examples to learn from

# Simple exact match metric
def exact_match(example, prediction, trace=None):
    return example.answer.lower() == prediction.answer.lower()

# F1 score metric
def f1_metric(example, prediction, trace=None):
    pred_tokens = set(prediction.answer.lower().split())
    gold_tokens = set(example.answer.lower().split())
    if not pred_tokens or not gold_tokens:
        return 0.0
    precision = len(pred_tokens & gold_tokens) / len(pred_tokens)
    recall = len(pred_tokens & gold_tokens) / len(gold_tokens)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Semantic similarity metric
def semantic_similarity(example, prediction, trace=None):
    # Use embeddings to compute similarity
    # (compute_cosine_similarity is a placeholder for your own helper)
    return compute_cosine_similarity(example.answer, prediction.answer)

# Composite metric
def composite_metric(example, prediction, trace=None):
    em = exact_match(example, prediction, trace)
    f1 = f1_metric(example, prediction, trace)
    return 0.3 * em + 0.7 * f1

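A worked example may help show how these metrics behave on a partially correct answer. The sketch below repeats the exact-match, F1, and composite metrics so that it runs on its own, and uses `SimpleNamespace` objects as stand-ins for a dspy.Example and a prediction; the sample answers are made up for illustration.

```python
from types import SimpleNamespace


def exact_match(example, prediction, trace=None):
    return example.answer.lower() == prediction.answer.lower()


def f1_metric(example, prediction, trace=None):
    pred_tokens = set(prediction.answer.lower().split())
    gold_tokens = set(example.answer.lower().split())
    if not pred_tokens or not gold_tokens:
        return 0.0
    common = len(pred_tokens & gold_tokens)
    precision = common / len(pred_tokens)
    recall = common / len(gold_tokens)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def composite_metric(example, prediction, trace=None):
    return 0.3 * exact_match(example, prediction) + 0.7 * f1_metric(example, prediction)


# Stand-ins for a labeled example and a model prediction.
example = SimpleNamespace(answer="William Shakespeare")
prediction = SimpleNamespace(answer="Shakespeare")

print(exact_match(example, prediction))                  # False
print(round(f1_metric(example, prediction), 3))          # 0.667
print(round(composite_metric(example, prediction), 3))   # 0.467
```

Here precision is 1/1 and recall is 1/2, so F1 is 2/3: the prediction earns partial credit even though exact match fails, which is why a weighted composite can be a more forgiving optimization target.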
# Create training examples
trainset = [
    dspy.Example(
        question="What is the capital of France?",
        answer="Paris"
    ).with_inputs("question"),
    dspy.Example(
        question="Who wrote Romeo and Juliet?",
        answer="William Shakespeare"
    ).with_inputs("question"),
    # ... more examples
]
# with_inputs() specifies which fields are inputs (rest are outputs)

flowchart TD
    A[Unoptimized Program] --> B[Optimizer/Teleprompter]
    C[Training Data] --> B
    D[Metric Function] --> B
    B --> E[Compilation Process]
    E --> F{Optimizer Type}
    F -->|BootstrapFewShot| G[Select few-shot examples]
    F -->|MIPROv2| H[Optimize instructions + examples]
    F -->|BootstrapFinetune| I[Generate fine-tuning data]
    F -->|COPRO| J[Coordinate instruction optimization]
    G --> K[Compiled Program]
    H --> K
    I --> K
    J --> K

from dspy.teleprompt import BootstrapFewShot

optimizer = BootstrapFewShot(
    metric=my_metric,
    max_bootstrapped_demos=4,  # Max examples to include
    max_labeled_demos=16,      # Max labeled examples to try
    max_rounds=1,              # Optimization rounds
)

compiled = optimizer.compile(
    student=my_program,
    trainset=training_data,
)

BootstrapFewShot Algorithm:
1. Run program on training examples
2. Collect successful traces (where metric passes)
3. Select diverse, high-quality traces as demonstrations
4. Insert demonstrations into prompt
5. Return compiled program with few-shot examples

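The core of steps 1 and 2 can be sketched in a few lines of plain Python. This is a simplified illustration, not DSPy's implementation: `bootstrap_demos` and `toy_program` are hypothetical, examples are plain dictionaries, and the diversity-based selection in step 3 is reduced to "keep the first passing traces up to a cap".

```python
def bootstrap_demos(program, trainset, metric, max_demos=4):
    """Run the program on each example and keep outputs that pass the metric."""
    demos = []
    for example in trainset:
        predicted = program(example["question"])
        if metric(example, predicted):
            # A passing run becomes a candidate few-shot demonstration.
            demos.append({"question": example["question"], "answer": predicted})
        if len(demos) >= max_demos:
            break
    return demos


def toy_program(question):
    """Hypothetical stand-in for an unoptimized DSPy program."""
    known = {
        "What is the capital of France?": "Paris",
        "Who wrote Romeo and Juliet?": "Christopher Marlowe",  # deliberately wrong
        "What is 2+2?": "4",
    }
    return known.get(question, "unknown")


def exact_match(example, predicted):
    return example["answer"].lower() == predicted.lower()


trainset = [
    {"question": "What is the capital of France?", "answer": "Paris"},
    {"question": "Who wrote Romeo and Juliet?", "answer": "William Shakespeare"},
    {"question": "What is 2+2?", "answer": "4"},
]

demos = bootstrap_demos(toy_program, trainset, exact_match)
print(len(demos))  # 2
```

Only the two runs that pass the metric survive as demonstrations; the wrong Romeo and Juliet answer is filtered out, which is what keeps bad examples from leaking into the compiled prompt.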
from dspy.teleprompt import BootstrapFewShotWithRandomSearch

optimizer = BootstrapFewShotWithRandomSearch(
    metric=my_metric,
    max_bootstrapped_demos=4,
    num_candidate_programs=10,  # Number of combinations to try
    num_threads=4,              # Parallel evaluation
)

from dspy.teleprompt import MIPROv2

optimizer = MIPROv2(
    metric=my_metric,
    num_candidates=10,     # Instruction candidates
    init_temperature=1.0,  # Exploration temperature
    verbose=True,
)

compiled = optimizer.compile(
    student=my_program,
    trainset=training_data,
    valset=validation_data,  # For evaluation
    num_trials=30,           # Optimization budget
)

MIPROv2 Algorithm:
1. INSTRUCTION PROPOSAL: LLM generates candidate instructions based on task
2. EXAMPLE SELECTION: Bootstrap effective few-shot examples
3. JOINT OPTIMIZATION: Bayesian optimization over instruction-example space
4. EVALUATION: Validate on held-out data
5. SELECTION: Return best configuration found

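To give a feel for the joint search in steps 3 to 5, the sketch below scores every pairing of candidate instruction and candidate demo set and keeps the best one. It substitutes an exhaustive grid search for Bayesian optimization, which is only practical for tiny candidate pools; `select_best_configuration` and the toy `evaluate` scoring rule are hypothetical and are not taken from MIPROv2.

```python
import itertools


def select_best_configuration(instructions, demo_sets, evaluate):
    """Try every (instruction, demo set) pair and return the best-scoring one."""
    best = None
    for instruction, demos in itertools.product(instructions, demo_sets):
        score = evaluate(instruction, demos)
        if best is None or score > best[0]:
            best = (score, instruction, demos)
    return best


def evaluate(instruction, demos):
    """Toy validation score: rewards step-by-step wording and more demos."""
    wording_bonus = 0.5 if "step by step" in instruction else 0.2
    return wording_bonus + 0.1 * len(demos)


instructions = ["Answer the question.", "Think step by step, then answer."]
demo_sets = [[], ["demo A"], ["demo A", "demo B"]]

score, instruction, demos = select_best_configuration(instructions, demo_sets, evaluate)
print(instruction)  # Think step by step, then answer.
print(demos)        # ['demo A', 'demo B']
```

In a real run, `evaluate` would execute the program on held-out data with the given instruction and demonstrations, and a smarter search strategy would avoid scoring every combination.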
from dspy.teleprompt import COPRO

optimizer = COPRO(
    metric=my_metric,
    depth=3,    # Optimization depth
    breadth=5,  # Candidates per iteration
)

from dspy.teleprompt import BootstrapFinetune

optimizer = BootstrapFinetune(
    metric=my_metric,
    multitask=True,  # Train on multiple tasks
)

compiled = optimizer.compile(
    student=my_program,
    trainset=training_data,
    target_model="meta-llama/Llama-3-8B",  # Model to fine-tune
)

from dspy.teleprompt import GEPA

optimizer = GEPA(
    metric=my_metric,
    num_generations=20,
    population_size=10,
)

OPTIMIZER COMPARISON

Optimizer | What it Optimizes | Sample Cost |
BootstrapFewShot | Few-shot examples | Low |
BootstrapFewShot+RS | Example combinations | Medium |
COPRO | Instructions | Medium |
MIPROv2 | Instructions + Examples | High |
BootstrapFinetune | Model weights | Very High |
GEPA | All + Pareto diverse | High |

Recommended starting point: BootstrapFewShot
For production: MIPROv2 or GEPA

class ConstrainedQA(dspy.Module):
    def __init__(self):
        super().__init__()
        self.qa = dspy.ChainOfThought("question -> answer")

    def forward(self, question):
        result = self.qa(question=question)

        # Hard constraint: answer must be less than 50 words
        dspy.Assert(
            len(result.answer.split()) < 50,
            "Answer must be concise (under 50 words)"
        )
        return result

class GuidedQA(dspy.Module):
    def __init__(self):
        super().__init__()
        self.qa = dspy.ChainOfThought("question -> answer")

    def forward(self, question):
        result = self.qa(question=question)

        # Soft constraint: prefer answers with citations
        dspy.Suggest(
            "[" in result.answer and "]" in result.answer,
            "Consider including citations in brackets"
        )
        return result

ASSERTION BEHAVIOR

dspy.Assert (Hard Constraint):
- If constraint fails:
    - Add feedback to prompt
    - Retry with constraint context
    - If still fails after retries: raise exception
- If constraint passes: continue normally

dspy.Suggest (Soft Constraint):
- If constraint fails:
    - Log suggestion
    - May retry once with hint
    - Continue even if still fails
- If constraint passes: continue normally

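The retry behavior described above can be sketched as a small loop. This is an illustration of the control flow only, not DSPy's assertion machinery: `call_with_constraint`, `ConstraintError`, and the `generate` callback (which receives the feedback message on retries) are hypothetical names.

```python
class ConstraintError(Exception):
    """Raised when a hard constraint still fails after all retries."""


def call_with_constraint(generate, check, message, max_retries=3, hard=True):
    """Call generate(feedback) until check passes or retries run out."""
    feedback = None
    output = None
    for _ in range(max_retries + 1):
        output = generate(feedback)
        if check(output):
            return output
        # Feed the failure message into the next attempt.
        feedback = message
    if hard:
        raise ConstraintError(message)
    # Soft constraint: give up on enforcing it and return the last output.
    return output


def generate(feedback):
    """Hypothetical generator that only becomes concise after feedback."""
    if feedback is None:
        return "word " * 60
    return "A short answer."


def is_concise(text):
    return len(text.split()) < 50


result = call_with_constraint(
    generate, is_concise, "Answer must be concise (under 50 words)"
)
print(result)  # A short answer.
```

With `hard=True` the loop mirrors the Assert column (raise after exhausting retries); with `hard=False` it mirrors Suggest (continue with the last output even if the check never passes).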
# Configure assertion behavior
dspy.configure(
    assert_max_retries=3,   # Max retries for Assert
    suggest_max_retries=1,  # Max retries for Suggest
    backoff_time=0.5,       # Delay between retries
)

BEFORE COMPILATION:

class QAModule(dspy.Module):
    def __init__(self):
        super().__init__()
        self.qa = dspy.ChainOfThought("question -> answer")

    def forward(self, question):
        return self.qa(question=question)

# Prompt is minimal, no examples, generic instruction

AFTER COMPILATION:

# Same code, but internal state is different:

compiled_qa.qa.demos = [
    # Carefully selected few-shot examples
    Example(question="...", reasoning="...", answer="..."),
    Example(question="...", reasoning="...", answer="..."),
    Example(question="...", reasoning="...", answer="..."),
]

compiled_qa.qa.instructions = """
You are an expert question answering system. Given a
question, think step by step and provide a clear,
accurate answer. Focus on factual accuracy...
"""

# Prompt is now rich with examples and refined instruction

Compilation affects:
1. INSTRUCTIONS
   - Task descriptions in signatures
   - Generated from optimization
2. DEMONSTRATIONS (Few-shot examples)
   - Selected from successful traces
   - Optimized for diversity and quality
3. FIELD PREFIXES
   - How input/output fields are labeled
   - Can be optimized for clarity
4. (Optionally) MODEL WEIGHTS
   - If using BootstrapFinetune

# Save compiled program
compiled.save("my_compiled_program.json")

# Load compiled program
loaded = MyModule()
loaded.load("my_compiled_program.json")

# Or load state into existing program
my_program.load_state(compiled.dump_state())

flowchart TD
    A[User calls module.forward] --> B[Module logic executes]
    B --> C[Predictor called]
    C --> D[Build prompt from signature]
    D --> E{Demos available?}
    E -->|Yes| F[Add few-shot examples]
    E -->|No| G[Skip demos]
    F --> G
    G --> H[Add current input]
    H --> I[Call language model]
    I --> J[Parse LM response]
    J --> K{Assertions?}
    K -->|Assert fails| L[Retry with feedback]
    K -->|Assert passes| M[Return prediction]
    L --> I
    M --> N[Continue module logic]
    N --> O[Return final result]

# Enable tracing
dspy.configure(trace=[])

# Run program
result = my_program(question="What is AI?")

# Inspect trace
# (assumes each trace entry is a (predictor, inputs, outputs) tuple;
# the exact shape may differ between DSPy versions)
for predictor, inputs, outputs in dspy.settings.trace:
    print(f"Module: {predictor}")
    print(f"Input: {inputs}")
    print(f"Output: {outputs}")
    print("---")

# See what prompt was actually sent
lm = dspy.LM("openai/gpt-4o", cache=False)
dspy.configure(lm=lm)

# Run the program first so there is a call to inspect
result = my_program(question="What is AI?")

# Print the full prompt and response of the last call
lm.inspect_history(n=1)

class MultiStagePipeline(dspy.Module):
    def __init__(self):
        super().__init__()
        self.decompose = dspy.ChainOfThought("question -> sub_questions")
        self.answer_sub = dspy.ChainOfThought("sub_question -> sub_answer")
        self.synthesize = dspy.ChainOfThought(
            "question, sub_answers -> final_answer"
        )

    def forward(self, question):
        # Stage 1: Decompose question
        decomposition = self.decompose(question=question)
        sub_questions = decomposition.sub_questions.split("\n")

        # Stage 2: Answer each sub-question
        sub_answers = []
        for sq in sub_questions:
            answer = self.answer_sub(sub_question=sq)
            sub_answers.append(answer.sub_answer)

        # Stage 3: Synthesize final answer
        final = self.synthesize(
            question=question,
            sub_answers="\n".join(sub_answers)
        )
        return final

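One practical detail in the pipeline above is that splitting the decomposition on "\n" passes blank lines and list markers through to the next stage. The sketch below shows one way to clean the sub-question list first; `split_sub_questions` is a hypothetical helper, and the sample text merely imitates what an LM might return.

```python
import re


def split_sub_questions(raw: str) -> list[str]:
    """Split LM output into sub-questions, dropping blanks and list markers."""
    questions = []
    for line in raw.splitlines():
        # Remove leading bullets ("-", "*", "•") or numbering ("1.", "2)").
        cleaned = re.sub(r"^\s*(?:[-*•]|\d+[.)])\s*", "", line).strip()
        if cleaned:
            questions.append(cleaned)
    return questions


raw_output = "1. What is X?\n\n2) Why does Y happen?\n- How is Z measured?"
sub_questions = split_sub_questions(raw_output)
print(sub_questions)
# ['What is X?', 'Why does Y happen?', 'How is Z measured?']
```

Filtering empty entries avoids spending an LM call on a blank sub-question in stage 2.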
class BranchingModule(dspy.Module):
    def __init__(self):
        super().__init__()
        self.classifier = dspy.Predict("question -> category")
        self.factual_qa = dspy.ChainOfThought("question -> answer")
        self.creative_qa = dspy.ChainOfThought("question -> answer")
        self.analytical_qa = dspy.ChainOfThought("question -> answer")

    def forward(self, question):
        # Classify the question type
        classification = self.classifier(question=question)

        # Route to appropriate handler
        if classification.category == "factual":
            return self.factual_qa(question=question)
        elif classification.category == "creative":
            return self.creative_qa(question=question)
        else:
            return self.analytical_qa(question=question)

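The routing above compares the classifier output to exact strings, so a response like "Factual." would fall through to the analytical branch. A minimal sketch of normalizing the label before routing follows; `normalize_category` is a hypothetical helper, and the allowed labels are taken from the branches in the example.

```python
def normalize_category(raw, allowed=("factual", "creative", "analytical"),
                       default="analytical"):
    """Map a free-text LM label onto one of the known categories."""
    # Trim whitespace, then surrounding periods and quotes, then lowercase.
    cleaned = raw.strip().strip(".\"'").lower()
    return cleaned if cleaned in allowed else default


print(normalize_category("  Factual. "))  # factual
print(normalize_category('"Creative"'))   # creative
print(normalize_category("poetry"))       # analytical
```

Unknown labels fall back to the same default branch that the `else` clause already uses, so the routing behavior stays predictable.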
class EnsembleModule(dspy.Module):
    def __init__(self, n_models=3):
        super().__init__()
        self.predictors = [
            dspy.ChainOfThought("question -> answer")
            for _ in range(n_models)
        ]
        self.aggregator = dspy.Predict("answers -> best_answer")

    def forward(self, question):
        # Get answers from all models
        answers = []
        for predictor in self.predictors:
            result = predictor(question=question)
            answers.append(result.answer)

        # Aggregate
        combined = "\n".join(f"- {a}" for a in answers)
        final = self.aggregator(answers=combined)
        return final

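The ensemble above uses another LM call to aggregate. For short, comparable answers, a plain majority vote is a cheaper alternative worth sketching; it is a different technique from the LM-based aggregator, and `majority_vote` is a hypothetical helper.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most common answer, comparing case-insensitively."""
    normalized = [answer.strip().lower() for answer in answers]
    winner, _ = Counter(normalized).most_common(1)[0]
    # Return the first original spelling that matches the winning answer.
    for original, norm in zip(answers, normalized):
        if norm == winner:
            return original


print(majority_vote(["Paris", "paris ", "Lyon"]))  # Paris
```

Voting only works when answers can be compared as strings; for free-form text, the LM-based aggregator in the module above remains the more suitable choice.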
class SelfRefiningModule(dspy.Module):
    def __init__(self, max_iterations=3):
        super().__init__()
        self.max_iterations = max_iterations
        self.generate = dspy.ChainOfThought("question -> answer")
        self.critique = dspy.Predict("question, answer -> critique, needs_improvement")
        self.refine = dspy.ChainOfThought("question, answer, critique -> improved_answer")

    def forward(self, question):
        # Initial answer
        result = self.generate(question=question)
        answer = result.answer

        # Iterative refinement
        for _ in range(self.max_iterations):
            # Critique current answer
            critique = self.critique(question=question, answer=answer)
            if critique.needs_improvement.lower() != "yes":
                break

            # Refine based on critique
            refined = self.refine(
                question=question,
                answer=answer,
                critique=critique.critique
            )
            answer = refined.improved_answer

        return dspy.Prediction(answer=answer)

DSPy vs MANUAL PROMPTING

Aspect | Manual Prompting | DSPy |
Prompt creation | Hand-written | Auto-generated |
Optimization | Trial and error | Systematic algorithms |
Maintainability | Difficult | Modular, structured |
Portability | Model-specific | Model-agnostic |
Reproducibility | Low | High (data-driven) |
Debugging | Print statements | Traces, assertions |
Testing | Ad-hoc | Metric-based |

DSPy vs LANGCHAIN

Aspect | LangChain | DSPy |
Philosophy | Chaining tools | Programming LMs |
Prompts | Templates | Compiled from data |
Optimization | Manual | Automatic |
Abstraction | Chains, agents | Signatures, modules |
Focus | Orchestration | Optimization |
Learning curve | Moderate | Steeper |

They're complementary: LangChain for orchestration, DSPy for optimization. They can be used together.

USE DSPy WHEN:
- You have training data (even small amounts)
- You need reproducible, optimized prompts
- You're building production LM applications
- You want modular, testable code
- Prompt quality matters significantly
- You're willing to invest in the learning curve

USE SIMPLER APPROACHES WHEN:
- One-off scripts or experiments
- No training data available
- Simple, single-prompt tasks
- Rapid prototyping is priority
- Team is unfamiliar with DSPy

DSPY SUMMARY

CORE PHILOSOPHY: "Program, don't prompt"
- Prompts are compiled artifacts, not source code
- Optimization is data-driven, not intuition-driven

KEY ABSTRACTIONS:
- Signatures: Define WHAT (input/output contracts)
- Modules: Define HOW (composable logic units)
- Predictors: Interface to LMs
- Optimizers: Improve programs automatically

OPTIMIZATION REQUIRES:
- Program (modules to optimize)
- Metric (how to measure success)
- Data (examples to learn from)

BUILT-IN MODULES:
- Predict: Direct mapping
- ChainOfThought: Reasoning + answer
- ReAct: Reasoning + actions
- ProgramOfThought: Code generation + execution
- Retrieve: RAG integration

OPTIMIZERS:
- BootstrapFewShot: Select few-shot examples
- MIPROv2: Joint instruction + example optimization
- COPRO: Instruction coordination
- BootstrapFinetune: Create fine-tuning data
- GEPA: Evolutionary + Pareto optimization

BENEFITS:
- Systematic optimization
- Modular, maintainable code
- Model-agnostic programs
- Reproducible results
- Production-ready patterns

Traditional LLM Development:
  You → (write prompt) → Prompt → LLM → Output

DSPy Development:
  You → (write program) → Program
  Data + Metric → Optimizer → (compiles) → Optimized Prompt
  Optimized Prompt → LLM → Output

The key difference: You focus on the PROGRAM and METRICS, DSPy handles the PROMPT.

1. [ ] Install DSPy: pip install dspy
2. [ ] Configure LM: dspy.configure(lm=dspy.LM("..."))
3. [ ] Define signatures for your tasks
4. [ ] Build modules that compose signatures
5. [ ] Collect training examples
6. [ ] Define a metric function
7. [ ] Choose an optimizer (start with BootstrapFewShot)
8. [ ] Compile and evaluate
9. [ ] Iterate on data, metrics, and program structure
10. [ ] Deploy compiled program