How to Select a Keeper Study: A Step-by-Step Guide for Systematic and Literature Reviews
A practical walkthrough of the keeper study selection process — how to define eligibility criteria using PICO, apply a two-phase screening protocol, conduct quality appraisal, and document every decision in a way that survives peer scrutiny and journal review.
A keeper study is a source that passes your eligibility criteria and is retained for full inclusion in your systematic review, scoping review, or structured literature review. The selection process — deciding which studies are keepers and which are excluded — is the methodological step most likely to compromise your review if handled inconsistently. Instructors and journal reviewers can accept a narrow topic or a small final sample. What they cannot accept is a selection process that cannot be reconstructed, that applied different standards to different studies, or that confuses topic relevance with methodological eligibility. This guide explains how to structure the keeper study selection process so that your decisions are defensible at every stage.
What This Guide Covers
This guide covers selection methodology — the process, the criteria, the tools, and the documentation. It does not conduct the selection for you, and it does not determine which specific studies belong in your review. Those decisions depend on your research question, your field’s quality standards, and the database results your search produces. What this guide gives you is the framework for making those decisions correctly.
What a Keeper Study Is (and Is Not)
A keeper study is any study that meets all of your pre-defined inclusion criteria and none of your exclusion criteria. The term is used in educational research, social science, and some health science contexts to describe the final retained set of sources after a systematic screening process. In clinical research, the same concept appears under labels like “eligible study,” “included study,” or “retained record.” The terminology varies by field, but the concept is identical: a study you keep is one that answers your research question, meets your methodological standards, and falls within your scope boundaries.
What a keeper study is not is a study you like, a study from an author whose work you recognize, or a study that confirms your hypothesis. Selection bias — the systematic tendency to include studies that support a particular direction — is one of the most serious methodological threats to any review. A correctly constructed keeper selection process is specifically designed to prevent this. Your criteria are set before screening begins, applied uniformly across every record, and documented transparently enough that another researcher could reproduce your decisions.
The keeper study concept applies to systematic reviews, scoping reviews, integrative reviews, and structured narrative reviews. It does not apply to traditional non-systematic literature reviews, where source selection is not required to follow a documented, reproducible protocol. If your assignment or thesis requires a systematic or structured review — even at the student level — the keeper selection process described in this guide applies. If your assignment is a general literature review without a documented search and screening methodology, your institution may use different standards. Confirm with your supervisor which type of review you are conducting before designing your selection criteria.
Before You Screen: Building Your Criteria
The most important rule in keeper study selection is one that most students violate: you must define your eligibility criteria before you run your search and before you see any results. Criteria built after you have seen which studies exist are post-hoc criteria — they are shaped, consciously or not, by what you found rather than by what your research question requires. Post-hoc criteria produce biased selections and cannot be defended in a methods section.
Criteria development is a planning step. It happens at the same time you develop your research question and design your search strategy — not after. For most student reviews, this means your criteria should be committed to paper (or a pre-registration document) before you log into any database. Everything that follows — the screening, the full-text review, the quality appraisal — applies those criteria mechanically to whatever the search returns.
Start with Your Research Question
Your eligibility criteria are a direct translation of your research question into screening rules. Every element of your research question — population, intervention, comparison, outcome — becomes a category of criteria. If you cannot connect a criterion directly to your research question, question whether it belongs.
Review Field-Specific Standards
Different fields apply different methodological standards to what counts as acceptable evidence. Health sciences reviews may require randomized controlled trial designs. Social science reviews may accept mixed-methods or qualitative designs. Know your field’s evidence hierarchy before writing your inclusion criteria for study design.
Set Scope Boundaries
Date range, language, publication type, geographic scope, and setting are all scope decisions that belong in your criteria before screening. These are not arbitrary — each one should be justifiable in your methods section. “English-language only” is acceptable if you explain why; unexplained language restrictions are a methodological weakness.
Using PICO/PICOS to Define Eligibility
PICO is the most widely used framework for converting a research question into eligibility criteria in health sciences and increasingly in social sciences and education. It stands for Population, Intervention, Comparison, and Outcome. An extended version — PICOS — adds Study design as a fifth element. Each letter of the acronym becomes a category of your inclusion and exclusion criteria.
Population
Who are the participants or subjects? Age range, diagnosis, demographic group, setting, or professional role. Your P criterion determines which studies have the right sample — everything else about the study is irrelevant if the population does not match.
Intervention
What exposure, program, treatment, or phenomenon is being examined? Define this precisely — a study of “mindfulness-based stress reduction” is different from one of “mindfulness practices” broadly. Vague intervention criteria produce a heterogeneous sample of keeper studies that cannot be meaningfully synthesized.
Comparison
What is the intervention compared against — a control group, an alternative treatment, a pre-intervention baseline, or nothing? Not all reviews require a comparison; qualitative and descriptive reviews often have no C element. If your research question does not involve comparison, this criterion may not apply.
Outcome
What results or effects are you interested in? Define your outcomes specifically — “academic achievement” is too broad if you need “standardized reading scores at third-grade level.” Studies that measure related but different outcomes may not be keepers unless your criteria explicitly include them.
Adding S: Study Design
PICOS adds a study design component to the framework. This is the criterion that determines which research designs are eligible — RCTs only, RCTs plus quasi-experimental, any quantitative design, mixed methods included, qualitative included. In student reviews, study design is often the most debated criterion because it determines how much evidence you will find. Narrowing to RCTs only is methodologically rigorous but may leave you with very few keeper studies in fields where RCTs are rare. Your study design criterion should be calibrated to your research question and justified in your methods section.
PICO gives you categories for thinking about eligibility — it does not write your criteria for you. Within each PICO element, you still need to make specific decisions: exactly which populations, which intervention variants, which outcomes, which designs. A PICO table with only one-word answers in each cell (“adults / CBT / usual care / anxiety”) is not a usable criterion set. Each cell needs enough specificity that a second researcher applying your criteria to the same study would reach the same inclusion or exclusion decision independently.
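One way to see the difference between a label and a usable criterion is to write the PICOS table as a data structure. This is an illustrative sketch only — the class and the criterion wording are invented, not a standard screening format:

```python
from dataclasses import dataclass

# Illustrative sketch: contrasts a one-word PICOS cell with an
# operational criterion. Class and wording are invented.

@dataclass
class PICOSCriteria:
    population: str
    intervention: str
    comparison: str
    outcome: str
    study_design: str

# A table of one-word cells -- categories, not criteria:
vague = PICOSCriteria("adults", "CBT", "usual care", "anxiety", "RCT")

# The same cells with enough specificity that two screeners applying
# them independently would reach the same decision:
specific = PICOSCriteria(
    population="Adults aged 18+ with a clinician-confirmed anxiety disorder",
    intervention="Individual or group CBT delivered by a licensed therapist, minimum 6 sessions",
    comparison="Treatment as usual or wait-list control",
    outcome="Score on a validated anxiety scale (e.g., GAD-7) at post-treatment",
    study_design="Randomized controlled trial, individually or cluster randomized",
)
```

The vague version cannot resolve a borderline case; the specific version can, which is the entire test of a criterion set.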
Writing Inclusion and Exclusion Criteria
Inclusion criteria define what a study must have to be a keeper. Exclusion criteria define what disqualifies a study that would otherwise pass. Both are required — they are not two ways of saying the same thing. A study can technically meet all inclusion criteria and still be excluded for a reason that is easier to state as an exclusion rule.
| Criterion Category | Inclusion Example | Exclusion Example |
|---|---|---|
| Population | Studies involving adult participants (18+) diagnosed with Type 2 diabetes | Studies conducted exclusively with children or adolescents under 18 |
| Intervention | Studies examining structured dietary interventions lasting at least 8 weeks | Studies examining pharmacological interventions without a dietary component |
| Outcome | Studies reporting HbA1c levels as a primary or secondary outcome measure | Studies reporting only patient-reported outcomes without clinical measures |
| Study Design | Randomized controlled trials and quasi-experimental designs with control groups | Case reports, case series, editorials, opinion pieces, and letters to the editor |
| Publication | Peer-reviewed journal articles published between 2014 and 2024 | Conference abstracts, dissertations, grey literature, and non-peer-reviewed reports |
| Language | Articles published in English | Articles published in languages other than English where no translation is available |
| Setting | Studies conducted in primary care or community health settings | Studies conducted exclusively in inpatient or hospital settings |
Write your criteria in complete sentences, not just keywords. “Adults with Type 2 diabetes” is not a criterion — it is a category label. “Studies must include participants aged 18 or older with a confirmed diagnosis of Type 2 diabetes (ICD-10 code E11 or equivalent)” is a criterion. The specificity matters because when you encounter a borderline case during screening — a study of adults with pre-diabetes who are classified as Type 2 by one measure but not another — your written criterion is what you apply to reach a decision.
1. Population: Studies must include K–12 students in any grade level in public or private school settings. Studies focused exclusively on home-schooled students are excluded.
2. Intervention: Studies must examine a structured, school-based social-emotional learning (SEL) program delivered by classroom teachers or school counselors. Programs must have a defined curriculum with at least four sequential sessions.
3. Outcome: Studies must report at least one of the following outcomes: student academic achievement (grades or standardized test scores), student behavioral outcomes (disciplinary referrals, attendance), or teacher-rated social competence using a validated instrument.
4. Study Design: Quantitative studies with experimental or quasi-experimental designs, including randomized controlled trials, cluster randomized trials, controlled before-and-after studies, and interrupted time series designs.
5. Publication: Peer-reviewed journal articles published between January 2010 and December 2024, in English.
Note: Each criterion is a complete statement that can be applied as a yes/no decision. A second researcher reading only these criteria — without access to the research question — could apply them to any study and reach the same keeper/exclude decision.
The Two-Phase Screening Protocol
Keeper study selection happens in two sequential phases. The first phase — title and abstract screening — applies your criteria to brief summaries to eliminate clearly irrelevant records quickly. The second phase — full-text review — applies your criteria in full to the complete paper for every record that survived Phase 1. Both phases use the same eligibility criteria. The difference is how much information you have available and how much time each decision takes.
This structure exists because systematic searches return large numbers of records — often hundreds to thousands — and full-text review is time-intensive. Phase 1 functions as an efficient filter. Phase 2 is where the detailed, defensible decisions are made. Studies excluded in Phase 1 are excluded because the title and abstract provide enough information to confirm they do not meet at least one inclusion criterion. Studies that survive both phases are your keeper studies.
Phase 1: Title and Abstract Screening
Title and abstract screening is a rapid filtering step. You are reading enough of each record to determine whether it could plausibly meet all of your inclusion criteria. The operative word is “plausibly” — at this phase, you are not required to be certain that a study is a keeper. You are only required to be certain that a study is not a keeper.
1. Export your search results to a screening tool: Import all database results into a reference manager (Zotero, Mendeley, EndNote) or a dedicated screening platform (Rayyan, Covidence, Abstrackr). Deduplicate the records — the same study may appear in multiple databases. Deduplication happens before Phase 1 begins. The number of records after deduplication is your starting total for the PRISMA flow diagram.
2. Apply your criteria to the title first: The title alone excludes a significant proportion of records in most searches. If the title clearly indicates the study is about a different population, a different intervention, or a different discipline entirely, it is excluded at the title stage. Document this as “excluded at title screening” in your tracking tool. Do not spend time reading the abstract of a study whose title already confirms exclusion.
3. Read the abstract for borderline or unclear records: For records where the title is potentially relevant but not conclusive, read the abstract. You are looking for enough information to confirm or rule out each inclusion criterion. If the abstract confirms that the study meets all criteria, mark it for Phase 2. If the abstract confirms that the study violates at least one criterion, exclude it and record the reason. If the abstract is ambiguous — insufficient information to decide — include it in Phase 2. Do not exclude on ambiguity.
4. Record your decisions systematically: Every record receives one of three outcomes: included for Phase 2, excluded with reason, or flagged for discussion (in a dual-screener setup). The reason for exclusion must reference a specific criterion — not “not relevant” but “wrong population: participants were children, inclusion criterion requires adults 18+.” Vague exclusion reasons cannot be defended in a methods section.
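The deduplication in step 1 can be sketched as follows. This is a simplified illustration — dedicated platforms match more fuzzily on authors, pagination, and abstracts — assuming each record carries invented doi, title, and year fields:

```python
import re

# Simplified deduplication sketch: match on DOI where present, otherwise
# on normalized title + year, keeping the first occurrence. Record fields
# (doi, title, year) are illustrative.

def _norm_title(title: str) -> str:
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()

def deduplicate(records: list[dict]) -> list[dict]:
    seen, unique = set(), []
    for rec in records:
        doi = (rec.get("doi") or "").lower()
        key = doi if doi else (_norm_title(rec["title"]), rec["year"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)  # first occurrence wins
    return unique               # len(unique) is the PRISMA post-deduplication total
```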
The most common Phase 1 error is excluding records that should have gone to full-text review. If the abstract does not clearly describe the population, the intervention, the design, or the outcomes — information that many abstracts omit or report incompletely — you cannot exclude based on missing information. Missing information in an abstract is not evidence that the criterion is not met. Only confirmed criterion violations justify exclusion. When information is absent, err toward inclusion and retrieve the full text. False exclusions at Phase 1 are irrecoverable — you will not know you missed a keeper study unless you go back.
Phase 2: Full-Text Review
Every record that survived Phase 1 requires full-text retrieval and review. This phase is where your final keeper study determinations are made. You are reading the complete paper — methods section especially — and applying every inclusion and exclusion criterion with full information available.
Quality Appraisal: Does Quality Affect Keeper Status?
Quality appraisal is the process of evaluating the methodological rigor of each study that passes your eligibility screening. It is a separate step from eligibility screening and typically happens after your eligibility decisions are made. Whether quality appraisal can exclude a study from the final keeper set — or only weight it differently in synthesis — is one of the most contested decisions in systematic review methodology.
Position 1: Quality as Exclusion Criterion
Some review protocols treat a minimum quality threshold as an eligibility criterion — studies below a certain score on the appraisal tool are excluded from the keeper set. This is most common in Cochrane-style clinical reviews where evidence quality is tied directly to clinical decision-making. If your protocol takes this position, the quality threshold must be specified in your criteria before screening begins — not after you see the scores.
Position 2: Quality as Synthesis Weight
The more common approach in social science and education reviews is to include all eligible studies regardless of quality score, then account for quality in the synthesis and discussion. Low-quality studies are retained as keepers but their findings are interpreted with appropriate caution. This approach avoids the circularity of using quality to exclude studies when quality assessment itself involves judgment calls.
For student-level reviews, your supervisor or the assignment rubric will usually specify which approach applies. The key methodological requirement is consistency: you cannot exclude some low-quality studies and retain others of similar quality without an explicit, pre-specified rule. If you use quality as an exclusion filter, document the tool, the threshold, and the score each study received.
Common Quality Appraisal Tools by Study Design
| Study Design | Appraisal Tool | What It Assesses |
|---|---|---|
| Randomized Controlled Trials | Cochrane Risk of Bias Tool (RoB 2) | Randomization process, allocation concealment, blinding, outcome reporting |
| Observational Studies | Newcastle-Ottawa Scale (NOS) | Selection, comparability, and outcome/exposure assessment across cohort and case-control designs |
| Quasi-Experimental | ROBINS-I | Risk of bias in non-randomized studies of interventions — seven bias domains |
| Qualitative Studies | CASP Qualitative Checklist | Research design justification, recruitment, data collection rigor, reflexivity, ethical issues |
| Mixed Methods | MMAT (Mixed Methods Appraisal Tool) | Addresses quantitative, qualitative, and mixed methods components within a single tool |
| Any Design | JBI Critical Appraisal Tools | Design-specific checklists for 12 study types maintained by the Joanna Briggs Institute |
Documenting Every Decision — PRISMA and Flow Diagrams
The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) framework is the international standard for documenting the keeper study selection process. The PRISMA flow diagram is a visual record of how many records entered the selection process at each stage and how many exited at each stage, with reasons. For any review that presents itself as systematic, PRISMA documentation is expected — by journal reviewers, thesis committees, and in many student assignment rubrics.
What the PRISMA Flow Diagram Must Show
- Records identified: The total number of records returned by each database searched, listed by database name. If you also searched grey literature, reference lists, or other sources, those are listed separately.
- Records after deduplication: The number of unique records remaining after duplicate records are removed.
- Records screened (Phase 1): The number of records reviewed at title and abstract stage — this equals the post-deduplication total.
- Records excluded at Phase 1: The number excluded after title/abstract screening, with the primary reason categories listed (e.g., wrong population: n=143; wrong intervention: n=87; not a primary study: n=52).
- Full texts retrieved (Phase 2): The number of records for which full text was sought.
- Full texts not retrieved: The number where full text could not be obtained and why.
- Full texts excluded (Phase 2): The number excluded after full-text review, with specific exclusion reasons and counts for each reason.
- Studies included: Your final keeper study count — the number that met all eligibility criteria and are included in the review.
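Because the flow is simple arithmetic, the counts can be reconciled mechanically before you draw the diagram. A small sketch with invented numbers:

```python
# Arithmetic sketch: every PRISMA count must reconcile with the stage
# before it. The flow numbers below are invented for illustration.

def check_prisma_flow(f: dict) -> list[str]:
    """Return the list of inconsistencies (empty means the flow reconciles)."""
    errors = []
    if f["identified"] - f["duplicates_removed"] != f["screened"]:
        errors.append("identified - duplicates != screened")
    if f["screened"] - f["excluded_phase1"] != f["fulltext_sought"]:
        errors.append("screened - Phase 1 exclusions != full texts sought")
    if f["fulltext_sought"] - f["not_retrieved"] - f["excluded_phase2"] != f["included"]:
        errors.append("full texts - not retrieved - Phase 2 exclusions != included")
    return errors

flow = {"identified": 1420, "duplicates_removed": 310, "screened": 1110,
        "excluded_phase1": 1002, "fulltext_sought": 108,
        "not_retrieved": 4, "excluded_phase2": 89, "included": 15}
# check_prisma_flow(flow) == [] -- the funnel reconciles
```

A non-empty result means a record was lost or double-counted somewhere in your log, which is exactly the error a reviewer will find by adding up your diagram.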
The PRISMA statement website (prisma-statement.org) provides the official 2020 PRISMA checklist, the flow diagram template, and the elaboration and explanation document that describes what belongs in each item. If you are conducting a systematic review at any level — student, thesis, or publication — this is the primary methodological reference for your selection documentation requirements.
The 2020 PRISMA statement covers standard systematic reviews and meta-analyses. Specific review types have their own extensions: PRISMA-ScR for scoping reviews, PRISMA-IPD for individual participant data meta-analyses, PRISMA-Harms for reporting of harms, and PRISMA-Equity for equity-focused reviews. If your review has a specific methodological character — particularly a scoping review — use the appropriate extension rather than the standard PRISMA template. Check with your supervisor or the target journal for which version applies.
Inter-Rater Reliability and Second Screeners
A systematic review conducted by a single screener introduces reviewer bias — the risk that your individual judgments, blind spots, or knowledge gaps influence which studies become keepers. The methodological standard for high-quality reviews is dual independent screening: two reviewers apply the criteria to each record independently, then compare decisions. Where they agree, the decision stands. Where they disagree, the discrepancy is resolved through discussion or by a third reviewer.
Agreement between screeners is measured using Cohen’s kappa (κ), a statistic that accounts for agreement occurring by chance. A kappa of 0.61–0.80 is generally considered substantial agreement; above 0.80 is near-perfect. Most published systematic reviews report kappa values for both Phase 1 and Phase 2 screening to demonstrate that the keeper set is not the idiosyncratic product of a single reviewer’s decisions.
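Cohen's kappa is straightforward to compute by hand. The sketch below uses invented screening decisions for two raters; libraries such as scikit-learn provide an equivalent cohen_kappa_score function:

```python
# Worked sketch of Cohen's kappa on two screeners' Phase 1 decisions.
# The decision lists are invented example data.

def cohens_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    n = len(rater_a)
    labels = set(rater_a) | set(rater_b)
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # chance agreement from each rater's marginal label proportions
    p_chance = sum((rater_a.count(l) / n) * (rater_b.count(l) / n)
                   for l in labels)
    return (p_observed - p_chance) / (1 - p_chance)

a = ["inc", "inc", "exc", "exc", "exc", "exc", "inc", "exc", "exc", "exc"]
b = ["inc", "exc", "exc", "exc", "exc", "exc", "inc", "exc", "exc", "exc"]
# 9/10 raw agreement; kappa is about 0.74 after correcting for chance
```

Note how chance-correction pulls 90% raw agreement down to substantial but not near-perfect kappa — raw percent agreement always overstates reliability when one label dominates.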
For Published Reviews and Thesis-Level Work
Dual independent screening is expected. This requires finding a second reviewer — often a colleague, fellow student, supervisor, or research assistant — who applies your criteria to the same records without seeing your decisions first. After both reviewers complete screening independently, decisions are compared and discrepancies resolved. Report kappa values in your methods section.
For Course-Level Student Reviews
Many undergraduate and some master’s-level assignments do not require a second screener due to the practical constraints of a student project. If your assignment does not mandate dual screening, note this as a limitation in your discussion section. Some courses allow a simplified version — applying criteria to a subset of records with a second reviewer to demonstrate the process — rather than dual screening of the full record set.
Even if your assignment does not require dual screening, piloting your criteria before full screening begins is a practice that improves decision quality. Apply your criteria to ten records before screening the full set. If you find yourself uncertain about how a criterion applies to borderline cases, resolve those ambiguities now — not mid-screen, when changing your interpretation is a consistency violation.
Where Keeper Study Selection Goes Wrong
Criteria Defined After Seeing Search Results
Building your eligibility criteria after you have browsed the database results means your criteria are shaped by what exists rather than what your question requires. This is the most fundamental bias in study selection. Reviewers and committee members who ask “when were your criteria finalized?” are checking for exactly this problem.
Instead
Write your complete criteria set before running your first database search. If you pre-register your review (PROSPERO for health sciences, OSF for others), your criteria are timestamped before data collection. Even without formal pre-registration, document your criteria in a dated file before searching.
Excluding on Ambiguity at Phase 1
Reading an abstract that does not mention the outcome measure and excluding the study because “it probably doesn’t report the right outcomes.” Ambiguity is not a criterion violation. If you cannot confirm exclusion from the abstract, the study goes to Phase 2.
Instead
Train yourself to ask only one question at Phase 1: “Can I confirm, from this abstract alone, that this study definitely violates at least one criterion?” If no — it goes to Phase 2. This conservative approach costs time in Phase 2 but eliminates false exclusions, which are irrecoverable.
Vague Exclusion Reasons in the Log
Recording “excluded — not relevant” or “excluded — poor quality” for Phase 2 rejections without specifying which criterion was violated. A committee or journal reviewer can challenge “not relevant” — they cannot challenge “excluded: study population was adults with Type 1 diabetes; inclusion criterion specifies Type 2 diabetes only.”
Instead
Every exclusion at Phase 2 maps to a specific, named criterion from your inclusion/exclusion list. Your exclusion log column heading should read “Criterion Violated” — not “Reason for Exclusion.” The specific criterion reference is what makes the decision reproducible and defensible.
Applying Criteria Inconsistently Across Studies
Including a study conducted in 2008 (outside your 2010–2024 date range) because it was a landmark study in the field, while excluding other studies from the same period. Exceptions that are not justified by a pre-specified criterion are a consistency violation. A reviewer who notices one 2008 study in your keeper list will check whether there are others from the same period that were excluded.
Instead
Apply your criteria mechanically. If a landmark study falls outside your date range, it does not become a keeper — it can be referenced in your introduction or discussion as context, but it should not appear in your PRISMA keeper count or your data extraction table. If you realize mid-review that your date range was too restrictive, adjust the criterion for all records and document the change, not just for the one study you want.
Conflating Eligibility with Quality at Phase 1
Excluding studies at the abstract stage because they appear to be methodologically weak — small samples, no control group, cross-sectional design — when your eligibility criteria did not specify those as exclusion factors. Quality appraisal is a separate, later step.
Instead
Ask only eligibility questions during Phase 1 and Phase 2. “Is this study design listed in my study design inclusion criterion?” is a Phase 2 eligibility question. “Is this study well-designed?” is a quality appraisal question. Mixing them at the screening stage produces a keeper set shaped by quality preferences rather than defined criteria, which is a form of selection bias.
Missing the Full PRISMA Count
Reporting only the final keeper count without documenting the funnel — how many records were identified, how many were deduplicated, how many excluded at each phase. A statement that “15 studies were included after a systematic search” with no PRISMA diagram is not a reproducible or auditable selection process.
Instead
Track your numbers from the first database search. Record how many records each database returned before deduplication, how many duplicates were removed, how many went through Phase 1, how many were excluded at Phase 1 (with the primary reason categories), how many went to Phase 2, how many were excluded at Phase 2 (with specific reason counts), and your final keeper count. These numbers build your PRISMA flow diagram.
Tools That Support the Screening Process
Keeper study selection is a systematic process — and it is substantially more manageable when conducted in a tool designed for it rather than in a spreadsheet. Several platforms exist specifically for systematic review screening, each with different features, costs, and institutional availability.
Rayyan
Free web-based platform designed for systematic review screening. Supports dual-blind screening with built-in conflict detection. Imports records from most reference managers and databases. The blind mode hides one screener’s decisions from the other until both have reviewed each record. Widely used in student and academic reviews. Available at rayyan.ai.
Covidence
Full-featured systematic review platform including screening, full-text review, data extraction, and quality appraisal. Subscription-based but many universities provide institutional access — check your library. Generates PRISMA flow diagram data automatically as you screen. The most commonly used platform for Cochrane reviews and health sciences systematic reviews.
Excel / Sheets with a Protocol
Acceptable for smaller student reviews when dedicated platforms are unavailable. Create columns for: Record ID, Title, Authors, Year, Phase 1 Decision (Include/Exclude/Unsure), Phase 1 Reason if Excluded, Phase 2 Decision, Phase 2 Criterion Violated. Track your numbers as you screen to build the PRISMA flow at the end. Less efficient than dedicated tools but fully functional.
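A minimal sketch of that layout, with three invented rows, shows how the PRISMA counts fall directly out of the log:

```python
import csv, io

# Sketch of the spreadsheet layout described above, using csv. Column
# names follow the text; the three rows are invented.

LOG = """record_id,title,year,phase1_decision,phase1_reason,phase2_decision,phase2_criterion_violated
R001,SEL and reading scores,2016,Include,,Include,
R002,Adult workplace mindfulness,2019,Exclude,Wrong population,,
R003,Classroom SEL pilot,2021,Include,,Exclude,Study Design
"""

rows = list(csv.DictReader(io.StringIO(LOG)))
screened = len(rows)                                            # Phase 1 total
excluded_p1 = sum(r["phase1_decision"] == "Exclude" for r in rows)
to_phase2 = screened - excluded_p1                              # full texts sought
keepers = sum(r["phase2_decision"] == "Include" for r in rows)  # final keeper count
# screened=3, excluded_p1=1, to_phase2=2, keepers=1
```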
Zotero / EndNote
Reference managers that handle deduplication and organization well, but are not built for screening decisions. Best used in combination with a screening platform — import from the database, deduplicate in Zotero or EndNote, export the deduplicated set to Rayyan or Covidence for actual screening.
ASReview / EPPI-Reviewer
Machine learning-assisted screening tools that use active learning to prioritize the records most likely to be keepers for human review. Useful for very large record sets (1,000+). EPPI-Reviewer is commonly used in education and social policy reviews in the UK. Both require some learning investment before use.
PROSPERO
Not a screening tool — but the international registry where systematic review protocols are pre-registered before data collection begins. Registering on PROSPERO timestamps your criteria and signals to readers that your selection process was planned prospectively. Free to register. Required by many journals and encouraged by most systematic review methodologists.
Connecting the Steps: How Keeper Selection Fits Into Your Full Review
Keeper study selection does not begin when you open your database results and does not end when you color your final study green in Rayyan. It begins when you write your research question and ends when you finalize your PRISMA flow diagram and exclusion log. Every step in between — criteria development, search strategy design, Phase 1 screening, Phase 2 full-text review, quality appraisal, and data extraction — is connected to the integrity of your keeper set.
The most common reason students struggle with keeper selection is treating it as a clerical task rather than a methodological one. Deciding which studies count as evidence for your research question is an intellectual act with consequences for everything you conclude. A keeper set built on vague criteria, inconsistently applied, produces a review whose conclusions cannot be trusted — regardless of how well-written the discussion section is.
Before moving from keeper selection to data extraction and synthesis, audit your process against three questions: Can you reproduce every inclusion and exclusion decision from your written criteria alone? Does your PRISMA diagram account for every record that entered the search? Does your exclusion log for Phase 2 cite a specific criterion for every excluded study? If yes to all three, your keeper set is methodologically sound. If not, the gap is in documentation rather than in your judgment — and documentation gaps are fixable before submission.
For direct support with study selection protocols, PICO criteria development, PRISMA documentation, or any stage of a systematic or structured review, our research paper writing team works specifically with evidence synthesis methodology at the student, thesis, and publication level.