About
PaperQA2, developed by FutureHouse (a 501(c)(3) nonprofit), is the first AI agent to demonstrate superhuman performance across a range of scientific literature search and synthesis tasks. Evaluated on LitQA2, part of the LAB-Bench benchmark, PaperQA2 outperforms PhD and postdoc-level biology researchers in retrieving accurate information from scientific papers. The agent is equipped with tools to search and retrieve papers, extract relevant content, traverse citation networks, and formulate well-grounded answers.
Building on PaperQA2, FutureHouse created WikiCrow, which generates Wikipedia-style scientific summaries that expert reviewers judged more accurate on average than human-curated Wikipedia articles. It has already been used to produce articles covering all 20,000 genes in the human genome, synthesizing over 1 million distinct papers. Another application, ContraCrow, identifies contradictions between published scientific papers, uncovering an average of 2.34 contradicted statements per paper, which makes it useful for hypothesis generation and research prioritization.
PaperQA2 is open-source and available on GitHub, making it accessible to researchers, developers, and institutions looking to perform literature analysis at a scale that is currently infeasible for human researchers. It is well suited to biomedical research, systematic reviews, knowledge base construction, and automated scientific writing.
Key Features
- Superhuman Literature Retrieval: Achieves higher accuracy than PhD and postdoc-level biology researchers on the LitQA2 benchmark for retrieving information from scientific papers.
- Citation Graph Exploration: Traverses citation networks to find relevant connected papers, enabling deep, contextual synthesis of scientific knowledge.
- Wikipedia-Style Summary Generation (WikiCrow): Generates scientific summaries at scale; expert biologists rated them more accurate on average than human-curated Wikipedia articles.
- Contradiction Detection (ContraCrow): Evaluates the claims in a scientific paper to identify contradicting evidence elsewhere in the literature, finding an average of 2.34 contradicted statements per paper.
- Massive-Scale Literature Analysis: Capable of synthesizing information from over a million papers, as demonstrated by generating summaries for all 20,000 genes in the human genome.
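The citation graph exploration listed above can be pictured as a bounded graph walk. The sketch below is only an illustration under stated assumptions: the function `traverse_citations`, the `max_hops` limit, and the toy graph are hypothetical and are not taken from the PaperQA2 codebase, which may traverse citations quite differently.

```python
from collections import deque


def traverse_citations(graph, seed, max_hops=2):
    """Breadth-first walk over a citation graph.

    graph maps a paper id to the ids of the papers it cites (hypothetical format).
    Returns a dict of every paper reached within max_hops, with its hop distance.
    """
    hops = {seed: 0}
    queue = deque([seed])
    while queue:
        paper = queue.popleft()
        # Do not expand papers that already sit at the hop limit.
        if hops[paper] == max_hops:
            continue
        for cited in graph.get(paper, []):
            if cited not in hops:
                hops[cited] = hops[paper] + 1
                queue.append(cited)
    return hops


# Toy data, invented for illustration only.
toy_graph = {
    "paper_A": ["paper_B", "paper_C"],
    "paper_B": ["paper_D"],
    "paper_C": ["paper_D", "paper_E"],
    "paper_D": ["paper_F"],
}

reachable = traverse_citations(toy_graph, "paper_A", max_hops=2)
print(sorted(reachable.items()))
```

Capping the number of hops keeps the candidate set small enough to read and rank, at the cost of missing papers that are only distantly connected to the starting paper.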
Use Cases
- Biomedical researchers using PaperQA2 to rapidly retrieve accurate answers from thousands of scientific papers without manual literature review.
- Academic institutions automating the generation of comprehensive, Wikipedia-style gene or disease summaries using WikiCrow for large-scale knowledge bases.
- Research teams using ContraCrow to identify contradictions between studies, surfacing gaps and opportunities for novel hypothesis generation.
- Developers and data scientists building custom scientific Q&A or summarization pipelines on top of the open-source PaperQA2 framework.
- Systematic review authors leveraging PaperQA2 to accelerate evidence synthesis across hundreds of studies in a fraction of the time.
Pros
- Superhuman Accuracy: Benchmarked to outperform PhD/postdoc-level researchers on scientific retrieval tasks, providing a highly reliable research assistant.
- Open Source & Free: Fully open-source and backed by a nonprofit, making it freely accessible to researchers, developers, and academic institutions.
- Scales to Over a Million Papers: Has synthesized information from over a million distinct papers, enabling analyses that are impractical for human researchers.
- Multi-Purpose Research Applications: Supports diverse use cases including Q&A, summary generation, contradiction detection, and hypothesis generation.
Cons
- Primarily Focused on Biology: Current benchmarks and demonstrations are centered on biology; performance on other scientific domains may vary.
- Requires Technical Setup: Because it is distributed as an open-source project on GitHub, non-technical users may find deployment and configuration challenging without developer support.
- Dependent on Available Literature: Output quality is constrained by the accessibility and coverage of the scientific papers it can retrieve and index.
Frequently Asked Questions
What is PaperQA2?
PaperQA2 is an open-source AI agent developed by FutureHouse that performs scientific literature search, retrieval, and summarization at a superhuman level, outperforming PhD-level researchers on standardized benchmarks.
How does PaperQA2 work?
It uses a suite of tools to find papers, extract information, explore citation graphs, and formulate answers. Its performance is measured on LitQA2, a rigorous scientific retrieval benchmark on which it outperforms expert human researchers.
Is PaperQA2 free and open-source?
Yes. PaperQA2 is fully open-source and available on GitHub. FutureHouse is a registered 501(c)(3) nonprofit, and the tool is freely available to the research community.
What is WikiCrow?
WikiCrow is an agent built on top of PaperQA2 that generates Wikipedia-style summaries of scientific topics. Its summaries have been shown to be more accurate on average than existing Wikipedia articles, as judged by blinded PhD-level reviewers.
What is ContraCrow?
ContraCrow is another PaperQA2-based agent that detects contradictions between published scientific papers. It evaluates claims in a paper against the broader literature, finding an average of 2.34 contradicted statements per paper, which can guide new hypothesis generation.
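To make the "contradicted statements per paper" figure concrete, here is a minimal sketch of how such an average could be computed once each claim has a verdict. The verdict labels, the function `contradicted_per_paper`, and the toy data are all hypothetical; this is not ContraCrow's actual pipeline or data format, and the toy numbers are unrelated to the reported 2.34.

```python
from statistics import mean


def contradicted_per_paper(verdicts):
    """Average number of contradicted claims per paper.

    verdicts maps a paper id to one verdict label per claim (hypothetical format).
    """
    counts = [
        sum(1 for label in labels if label == "contradicted")
        for labels in verdicts.values()
    ]
    return mean(counts)


# Toy verdicts, invented for illustration only.
toy_verdicts = {
    "paper_1": ["contradicted", "supported", "contradicted", "contradicted"],
    "paper_2": ["contradicted"],
    "paper_3": ["supported", "unverified"],
}

average = contradicted_per_paper(toy_verdicts)
print(f"{average:.2f} contradicted statements per paper")
```

Papers with no contradicted claims still count toward the denominator, so the average describes the whole sample and not just the papers where a contradiction was found.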
