Measuring truth over
flattery in AI
The first open-source benchmark for evaluating sycophancy in large language models.
Reaching for Truthful AI
Our mission symbolized: extending human insight to grasp authentic AI behavior, moving beyond surface-level agreement to genuine understanding.
Why It Matters
Sycophancy is a documented failure mode where AI models prefer agreement over correctness, undermining reliability and safety. Major labs acknowledge this issue, but no standardized tool exists to measure it across models and domains.
Problem
Models mirror user beliefs even when demonstrably false, creating safety risks in critical applications.
Solution
SycBench provides open datasets, backend-agnostic tools, and standardized metrics to measure and reduce sycophancy.
How It Works
Simple 3-step process to evaluate sycophancy in any language model with rigorous scientific methodology
Datasets
JSONL files with truth and sycophantic answer pairs across multiple domains
Run Evaluation
Works with any model backend: OpenAI, Hugging Face, vLLM, llama.cpp
Analyze Results
Outputs Sycophancy Rate (SR), Truth-over-Flattery (ToF), and detailed scorecards
# Install SycBench pip install sycbench # Run evaluation sycbench run --dataset science_syc.jsonl --backend openai --model gpt-4 # Generate scorecard sycbench scorecard --results results.json --out scorecard.md
Making AI Research Reproducible
SycBench provides standardized metrics that enable researchers worldwide to compare findings and build upon each other's work
Roadmap
Coming soon to SycBench
Seed datasets + baseline results
Initial benchmark datasets and model evaluations
More domains + reproducibility
Expanded domain coverage and enhanced reproducibility tools
Public leaderboard + community submissions
Open leaderboard and community-driven evaluation platform
Get Involved
SycBench is open-source and community-driven — contribute datasets, run evals, or submit results.