THE FUTURE OF EVIDENCE

Build an agentic AI system that automates the entire meta-analysis workflow from research question to publication-ready results.

> THE_CHALLENGE

Meta-analyses represent the gold standard for evidence synthesis in medicine, yet conducting them requires months of manual work by teams of experts. We challenge participants to build an agentic AI system that automates this entire workflow.

Your system should accept a research question as input, such as "What is the effect of metformin on lifespan in animal models?", and autonomously produce a complete, publication-ready meta-analysis that matches the quality of expert human work.

The challenge will have a specific focus on biomedical interventions related to longevity and age-related diseases. This includes pharmaceuticals like metformin and rapamycin, as well as lifestyle interventions such as caloric restriction or the Mediterranean diet.

> META_ANALYSIS_PIPELINE

AUTOMATED WORKFLOW

SEARCH

  • PubMed, Embase, Cochrane
  • bioRxiv, medRxiv
  • Google Scholar
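As an illustration of the search step, the sketch below builds a query URL for PubMed through the public NCBI E-utilities ESearch endpoint. The function name `build_pubmed_query` and the choice to restrict both terms to title and abstract are assumptions made for this example, not part of the challenge specification. The other listed sources would each need their own client.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint (public PubMed search API).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_query(intervention: str, outcome: str, max_results: int = 100) -> str:
    """Return an ESearch URL for records mentioning both terms in title or abstract.

    Hypothetical helper: the query shape is an illustrative assumption.
    """
    term = f"({intervention}[Title/Abstract]) AND ({outcome}[Title/Abstract])"
    params = {"db": "pubmed", "term": term, "retmax": max_results, "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


url = build_pubmed_query("metformin", "lifespan")
print(url)
```

Fetching the URL returns a JSON document whose ID list can then be passed to later pipeline stages. The request itself is left out here so the sketch runs without network access.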

SELECT

  • Inclusion/exclusion criteria
  • Relevance screening
  • Quality assessment

EXTRACT

  • Statistical data from PDFs
  • Tables and figures
  • Effect sizes, confidence intervals
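One possible starting point for extracting effect sizes and confidence intervals from already-parsed text is a pattern match over common reporting styles. The sketch below is a simplified assumption: the pattern, the `EffectEstimate` type, and the list of measure abbreviations are hypothetical. A real system would also need to handle tables, figures, and many more phrasings.

```python
import re
from typing import List, NamedTuple


class EffectEstimate(NamedTuple):
    measure: str
    estimate: float
    ci_low: float
    ci_high: float


# Matches strings such as "HR 0.85 (95% CI 0.72-0.99)"
# or "OR = 1.20 (95% CI: 1.05 to 1.37)".
NUMBER = r"-?\d+(?:\.\d+)?"
EFFECT_PATTERN = re.compile(
    rf"\b(?P<measure>HR|OR|RR|SMD|MD)\s*[=:]?\s*(?P<est>{NUMBER})"
    rf"\s*\(\s*95\s*%\s*CI[:,]?\s*(?P<low>{NUMBER})"
    rf"\s*(?:-|\u2013|to|,)\s*(?P<high>{NUMBER})\s*\)"
)


def extract_effects(text: str) -> List[EffectEstimate]:
    """Return every effect estimate with a 95% CI found in the text."""
    return [
        EffectEstimate(
            m.group("measure"),
            float(m.group("est")),
            float(m.group("low")),
            float(m.group("high")),
        )
        for m in EFFECT_PATTERN.finditer(text)
    ]


sample = (
    "Mortality was lower with treatment, HR 0.85 (95% CI 0.72-0.99), "
    "while adverse events rose, OR = 1.20 (95% CI: 1.05 to 1.37)."
)
found = extract_effects(sample)
```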

ANALYZE

  • Statistical synthesis
  • Heterogeneity assessment
  • Forest plots, funnel plots
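For the synthesis and heterogeneity steps, a common approach is a random-effects model with the DerSimonian-Laird estimator of between-study variance. The sketch below assumes that estimator. The challenge text does not prescribe a method, and the function name and return keys are illustrative. Inputs are per-study effect sizes on an additive scale (for example, log hazard ratios) with their variances.

```python
import math
from typing import Dict, Sequence


def random_effects_pool(effects: Sequence[float], variances: Sequence[float]) -> Dict[str, float]:
    """Pool study effects with the DerSimonian-Laird random-effects estimator."""
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect (inverse-variance) estimate, used to compute Cochran's Q.
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = k - 1

    # Between-study variance (tau^2) and I^2, both truncated at zero.
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c)
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    return {
        "pooled": pooled,
        "se": se,
        "ci_low": pooled - 1.96 * se,
        "ci_high": pooled + 1.96 * se,
        "q": q,
        "tau2": tau2,
        "i2": i2,
    }


result = random_effects_pool([0.0, 1.0], [0.01, 0.01])
```

The per-study weights and confidence limits produced this way are also the values a forest plot would display.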

> EVALUATION_FRAMEWORK

Study Selection Accuracy

25%

Precision and recall against a gold-standard dataset curated by expert reviewers.
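A minimal sketch of how precision and recall could be computed, assuming studies are identified by string IDs and that the gold standard is a set of included IDs. The exact scoring procedure is not specified in this brief, and the F1 value is included only for convenience.

```python
from typing import Dict, Set


def selection_metrics(selected: Set[str], gold: Set[str]) -> Dict[str, float]:
    """Compare the system's included studies with the expert-curated set."""
    true_positives = len(selected & gold)
    precision = true_positives / len(selected) if selected else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical study identifiers, for illustration only.
metrics = selection_metrics({"a", "b", "c", "d"}, {"b", "c", "e"})
```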

Data Extraction Accuracy

25%

Agreement rates with manually extracted effect sizes, confidence intervals, and sample sizes.
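One way to express an agreement rate is the share of reference values that the system reproduces within a tolerance. The sketch below assumes a 1% relative tolerance and a nested-dictionary layout keyed by study ID. Both are hypothetical choices, not the organizers' scoring rule.

```python
import math
from typing import Dict

Extraction = Dict[str, Dict[str, float]]


def agreement_rate(extracted: Extraction, reference: Extraction, rel_tol: float = 0.01) -> float:
    """Fraction of manually extracted values matched by the system within rel_tol."""
    compared = 0
    matched = 0
    for study_id, ref_fields in reference.items():
        sys_fields = extracted.get(study_id, {})
        for field, ref_value in ref_fields.items():
            compared += 1
            value = sys_fields.get(field)
            # A missing value counts as a disagreement.
            if value is not None and math.isclose(value, ref_value, rel_tol=rel_tol):
                matched += 1
    return matched / compared if compared else 0.0


# Hypothetical values, for illustration only.
reference = {
    "s1": {"effect": 0.85, "ci_low": 0.72, "ci_high": 0.99, "n": 120},
    "s2": {"effect": 1.20, "n": 60},
}
extracted = {"s1": {"effect": 0.85, "ci_low": 0.72, "ci_high": 0.90, "n": 120}}
rate = agreement_rate(extracted, reference)
```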

Statistical Validity

25%

Reproduction of published meta-analyses within acceptable margins, with correct handling of heterogeneity.

Time Efficiency

25%

Total computation time versus the documented person-hours for a manual meta-analysis.

ADVANCED CAPABILITIES (BONUS)

Real-time updating as new studies are published
Interactive visualization dashboards for subgroup analyses
Integration with regulatory submission formats (FDA/EMA)

> TECHNICAL_REQUIREMENTS

STUDY DESIGN HANDLING

Handle multiple study designs including randomized controlled trials, cohort studies, and case-control studies, as each requires different statistical approaches and quality assessment criteria.

  • Randomized Controlled Trials
  • Cohort Studies
  • Case-Control Studies
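The design-specific handling could be organized as a lookup from study design to a typical effect measure and quality assessment tool. The pairings below (RoB 2 for randomized trials, the Newcastle-Ottawa Scale for observational designs) are commonly used conventions chosen as assumptions for this sketch. The challenge does not mandate any particular tool or measure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DesignProfile:
    effect_measure: str  # typical summary statistic for this design
    quality_tool: str    # commonly used risk-of-bias or quality instrument


# Illustrative defaults only; real analyses choose these per research question.
DESIGN_PROFILES = {
    "randomized controlled trial": DesignProfile("risk ratio", "Cochrane RoB 2"),
    "cohort study": DesignProfile("hazard ratio", "Newcastle-Ottawa Scale"),
    "case-control study": DesignProfile("odds ratio", "Newcastle-Ottawa Scale"),
}


def profile_for(design: str) -> DesignProfile:
    """Look up the handling profile for a study design, ignoring case and padding."""
    key = design.strip().lower()
    if key not in DESIGN_PROFILES:
        raise KeyError(f"No profile defined for study design: {design!r}")
    return DESIGN_PROFILES[key]


rct = profile_for("Randomized Controlled Trial")
```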

AUTOMATED CLASSIFICATION

Core task: automated classification of all selected articles across multiple domains. For each article, the system should identify:

ARTICLE TYPE

Original research, systematic review, meta-analysis, case report, etc.

DATA TYPE

Blood biochemistry, RNA sequencing, DNA methylation, neurocognitive tests

BIOLOGICAL SPECIES

Homo sapiens, Mus musculus, etc.
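The three classification domains above can be captured in a small record type. The sketch below is one hypothetical schema: the class and field names are assumptions, and the enumerated article types are limited to those listed in this section.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum
from typing import Dict, List


class ArticleType(Enum):
    ORIGINAL_RESEARCH = "original research"
    SYSTEMATIC_REVIEW = "systematic review"
    META_ANALYSIS = "meta-analysis"
    CASE_REPORT = "case report"
    OTHER = "other"


@dataclass
class ArticleClassification:
    article_id: str
    article_type: ArticleType
    data_types: List[str]  # e.g. "RNA sequencing", "DNA methylation"
    species: List[str]     # e.g. "Homo sapiens", "Mus musculus"

    def to_record(self) -> Dict[str, object]:
        """Return a JSON-serializable dictionary for downstream tools."""
        record = asdict(self)
        record["article_type"] = self.article_type.value
        return record


# Hypothetical article, for illustration only.
record = ArticleClassification(
    article_id="example-001",
    article_type=ArticleType.ORIGINAL_RESEARCH,
    data_types=["RNA sequencing"],
    species=["Mus musculus"],
).to_record()
serialized = json.dumps(record, sort_keys=True)
```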

STRUCTURED OUTPUT

Brief Intervention Description

3-4 sentences: active agent, molecular targets, delivery method

Agentic Study Selection

Automated study selection process

Intervention Effects

Mortality & disease risk (aggregated data using coding agents)

Evidence Basis

Methods used to generate data (clinical trials, animal studies)

Data Quality Assessment

Confidence score based on study type and journal quality
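A confidence score that combines study type and journal quality could be a simple weighted sum. Every number in the sketch below is a placeholder assumption: the study-type weights, the 70/30 split, and the 0-to-1 journal quality scale are not defined anywhere in this brief and would need to be calibrated by each team.

```python
# Placeholder weights (assumptions), loosely following an evidence hierarchy.
STUDY_TYPE_WEIGHT = {
    "meta-analysis": 1.0,
    "randomized controlled trial": 0.9,
    "cohort study": 0.7,
    "case-control study": 0.6,
    "animal study": 0.4,
    "case report": 0.2,
}


def confidence_score(study_type: str, journal_quality: float, type_share: float = 0.7) -> float:
    """Blend a study-type weight with a journal quality value, both in [0, 1]."""
    if not 0.0 <= journal_quality <= 1.0:
        raise ValueError("journal_quality must be between 0 and 1")
    base = STUDY_TYPE_WEIGHT.get(study_type.strip().lower(), 0.1)  # unknown types score low
    return round(type_share * base + (1.0 - type_share) * journal_quality, 3)


score = confidence_score("Randomized Controlled Trial", 0.8)
```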

> TEST_CASES_&_VALIDATION

Participants will receive three well-characterized research questions from different medical domains, each with existing high-quality published meta-analyses that serve as ground truth:

TEST CASE 1

Pharmaceutical intervention with clear, standardized outcome measures (a longevity intervention such as metformin)

TEST CASE 2

Behavioral intervention with more heterogeneous outcome reporting

TEST CASE 3

Diagnostic test accuracy question requiring bivariate meta-analysis methods
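Bivariate methods model sensitivity and specificity jointly, usually on the logit scale. The sketch below covers only the input preparation: it converts one study's 2x2 table into logit-transformed sensitivity and specificity with approximate variances. Fitting the bivariate random-effects model itself is out of scope here. The function name and the 0.5 continuity correction for zero cells are assumptions for illustration.

```python
import math
from typing import Dict


def logit_pair(tp: float, fp: float, fn: float, tn: float, correction: float = 0.5) -> Dict[str, float]:
    """Logit sensitivity and specificity with variances from a 2x2 table."""
    # Apply a continuity correction to all cells if any cell is zero.
    if min(tp, fp, fn, tn) == 0:
        tp, fp, fn, tn = (x + correction for x in (tp, fp, fn, tn))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "logit_sens": math.log(tp / fn),    # logit(p) = log(p / (1 - p))
        "logit_spec": math.log(tn / fp),
        "var_logit_sens": 1 / tp + 1 / fn,  # large-sample approximation
        "var_logit_spec": 1 / tn + 1 / fp,
    }


# Hypothetical counts, for illustration only.
study = logit_pair(tp=90, fp=20, fn=10, tn=80)
```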

PROVIDED FOR EACH TEST CASE

  • Institutional database subscriptions
  • Full text of all potentially relevant papers
  • Published meta-analysis protocol (search strategy & inclusion criteria)
  • Manually verified data extractions for validation

> EXPECTED_IMPACT

TRANSFORMING EVIDENCE-BASED MEDICINE

Success in this challenge would fundamentally transform evidence-based medicine. Automated meta-analysis could reduce the typical timeline from months to hours, enabling real-time evidence synthesis as new studies emerge.

IMMEDIATE BENEFITS

  • Critical acceleration during health emergencies
  • Faster gerontology & preventive medicine research
  • Democratized access for smaller research groups

BROADER IMPACT

  • Revolutionize regulatory agency evaluations
  • Transform clinical guidelines development
  • Enable living systematic reviews

> RESOURCES_&_GUIDELINES

SUPPORT AVAILABLE

  • Mentorship from meta-analysis experts
  • Technical support for database API integration
  • Compute credits for model training and inference
  • Weekly office hours for team collaboration

Ready to revolutionize evidence-based medicine? Build the future of systematic reviews.