

Original Article

Received 30 January 2012, Revised 5 April 2013, Accepted 30 July 2013. Published online in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/jrsm.1096

A systematic method for search term selection in systematic reviews

Jenna Thompson*†, Jacqueline Davis and Lorraine Mazerolle

The wide variety of readily available electronic media grants anyone the freedom to retrieve published references from almost any area of research around the world. Despite this privilege, keeping up with primary research evidence is almost impossible because of the increase in professional publishing across disciplines. Systematic reviews are a solution to this problem as they aim to synthesize all current information on a particular topic and present a balanced and unbiased summary of the findings. They are fast becoming an important method of research across a number of fields, yet only a small number of guidelines exist on how to define and select terms for a systematic search. This article presents a replicable method for selecting terms in a systematic search using the semantic concept recognition software called LEXIMANCER (Leximancer, University of Queensland, Brisbane, Australia). We use this software to construct a set of terms from a corpus of literature pertaining to transborder interventions for drug control and discuss the applicability of this method to systematic reviews in general. This method aims to contribute a more ‘systematic’ approach for selecting terms in a manner that is entirely replicable for any user. Copyright © 2013 John Wiley & Sons, Ltd.

Keywords: term; systematic review; semantics; systematic search

1. Introduction

One key difference between a systematic review and a traditional narrative review lies in the thoroughly detailed, transparent and replicable search protocol required for a systematic review (Rother, 2007). Reviewers document every stage of the search process in detail, to ensure replicability of the review and an extensive search process. However, there remains at least one part of the systematic search process that is perhaps less transparent: the selection of search terms.

A transparent method for search term or free text term selection is desirable for many reasons. The systematic review may address outcomes that are latent constructs and are referred to by different terms within a single body of literature; for example, psychometric constructions of intelligence may be referred to generally as ‘intelligence’ or specifically as a Wechsler score, IQ or mental age (Smith and Humphreys, 2006). The outcome may be a broadly defined set of outcomes, within which specific outcomes have their own terms; for example, the outcome of ‘brain injuries’ may include the more specific ‘shaken baby syndrome’ (Lefebvre et al., 2011). If the review addresses a question that spans multiple disciplines, then the intervention and outcome terms may vary according to discipline; for example, what the criminological literature calls ‘violent crime’ is often referred to as ‘interpersonal violence’ in the public health literature. Finally, unpublished literature, such as dissertations and research reports, may not use standard terminology. Even in published literature, authors may describe their research in ways that are unfamiliar to the reader. In addition to these specific concerns, a transparent, replicable method of search term selection is desirable to improve the overall methodical nature of the systematic review and make the findings more defensible.

Few guidelines exist on how to define and select terms for a systematic search. The standard guideline for clinical question formulation (participants or population, intervention, comparison and outcomes (PICO)) is a useful framework‡ (Huang et al., 2006), but systematic methods for selecting search terms within each of these elements are rare. For systematic searches of controlled trials in medical research, a standard, highly sensitive search strategy has been developed to retrieve the maximum number of relevant references from medical databases (Robinson and Dickersin, 2002). A technique commonly used to select effective search terms in interdisciplinary research is ‘pearl growing’, in which an iterative process is used until no further material is found. The process involves the searcher locating a document relevant to the topic at hand, reviewing the characteristics of that document and adding its terms to the search in order to retrieve additional documents (Schlosser et al., 2006). ‘Snowball sampling’, in which the references of relevant articles obtained in a first search are examined for further relevant material, is also a commonly used information-seeking strategy (Greenhalgh and Peacock, 2005; Sayers, 2007). These are valuable strategies for discovering further relevant material after a first literature search.

ARC Centre of Excellence in Policing and Security, Institute for Social Science Research, The University of Queensland, Brisbane, Queensland 4072, Australia

*Correspondence to: Jenna Thompson, ARC Centre of Excellence in Policing and Security, Institute for Social Science Research, The University of Queensland, Brisbane, Queensland 4072, Australia.

†E-mail: [email protected]

‡Not all elements of the PICO framework are required in a search. In fact, fewer elements are sometimes better in terms of sensitivity.

Copyright © 2013 John Wiley & Sons, Ltd. Res. Syn. Meth. 2013

This paper proposes a method for obtaining a more complete set of documents in the first search. We acknowledge the role of iterative methods (such as pearl growing and snowball sampling) in obtaining all material relevant to the review and propose our approach as one alternative way of developing a replicable, transparent method for generating suitable and appropriate free-text terms as a solid foundation for systematic search and retrieval projects. In this paper, we demonstrate a method for selecting terms for a systematic search that is transparent, replicable and generalizable across disciplines. We use semantic concept recognition software (LEXIMANCER) to construct a set of terms from a body of literature on transborder interventions for drug control. The LEXIMANCER software examines a body of text to produce a ranked list of important terms (most commonly occurring terms) on the basis of word frequency and co-occurrence usage. These terms then seed a thesaurus builder that ‘learns a set of classifiers from the text by iteratively extending the seed word definition’ (Smith and Humphreys, 2006). A weighted term classifier that results from this process is referred to as a concept. Using this software is beneficial as it reduces the amount of user input and generates a more automated system of term selection. In the following section, we present a general method for using LEXIMANCER to generate terms for a systematic search and then demonstrate an application of this method to a systematic review of transborder interventions for drug control. Finally, we discuss the applicability of this method to systematic reviews in general and some directions for future research.

2. Method of selection

In this section, we provide a detailed explanation of our proposed systematic method for search term selection, showing how it can be easily replicated through the use of the LEXIMANCER software. We demonstrate this method by utilizing a systematic review based on transborder drug control. The review title is The effectiveness of transborder drug interventions in controlling the supply of illegal drugs§ (Mazerolle et al., 2012). Although the review itself is currently in progress, the process of term selection has been applied to the search methodology section and was successful in generating a preliminary list of search terms.

For the purpose of clarification, Table 1 summarizes each stage of our systematic term search method. It provides a general overview of our systematic term selection and highlights which stages are machine generated and which require input from the user. This systematic search method was applied to the term search that formed the basis of our transborder drug control systematic review, which we describe below.

2.1. Stage 1: uploading/generating references

The first stage of the term search method involves uploading relevant and randomly selected references into the LEXIMANCER software. Because of the data mining and semantic recognition capabilities of LEXIMANCER, we are able to analyse a large collection of textual documents and visually display the extracted information. The software does not only analyse the text within the document but also the title, abstract, authors and references. It is this extracted information that acts as the primary result for our method, and hence, it is important to note that obtaining a sufficient text corpus is a crucial step in the process.

Obtaining the initial corpus of text used to generate the terms (hereafter the text corpus) requires a basic search for literature. The phrase basic search refers here to a search of a database that will act as the foundation of literature to be analysed by the software. The corpus from this basic search will later be refined with respect to the user’s interests. In order to generate this corpus, a basic search is performed through a search engine, which helps to obtain a range of relevant references on the topic. The search engine used in our method was the SciVerse Hub online database, a readily available database. This particular database was chosen to act as the primary search engine as it covers a vast array of journals, websites and patent offices, which are listed in Table 2. Because of the extensive scope of this database, it is advised that only the first 20 pages of results ranked by relevance (about 2000 references) are included in the corpus.

§The final review that was accepted by the Campbell Collaboration focused on crop targeting interventions, rather than transborder.


Table 1. Summary of the six-stage method for term selection.

Stage number | Stage | Description | User defined or machine generated
1 | Uploading/generating references | Search SciVerse Hub online database to obtain a body of literature relevant to the user’s interest. | User defined and machine generated
2 | Article omission criteria | Define article omission criteria: a set of rules that narrow the initial corpus of literature to a more refined set. | User defined
3 | Edit concept seeds | Determine list of commonly occurring concepts and remove any that contribute nothing to the systematic search. | User defined and machine generated
4 | Results | Obtain results from LEXIMANCER in the form of lists of concepts. | Machine generated
5 | Preliminary search terms | Form a list of terms from these results and split into relevant categories, for example, ‘location terms’ and ‘outcome terms’. | User generated
6 | Concept cluster cloud | Visually and comprehensively confirm the most frequent and relevant concepts within the body of literature. Include cluster cloud in the review. | Machine generated

Table 2. Journals, websites and patent offices searched by SciVerse Hub database.

Journal/sources | Websites | Patent offices
ScienceDirect | MD Consult | United States Patent Office
Scopus | NDLTD | World Intellectual Property Organization
MEDLINE/PubMed | Digital Archives | Europe Patent Office
Nature Publishing Group | University of Toronto T-Space | United Kingdom Patent Office
PubMed Central | DiVA | Japan Patent Office
Wiley-Blackwell | IISc | -
BioMed Central | The University of Hong Kong | -
SAGE | CogPrints | -
IOP | Wageningen Yield | -
Hindawi Publishing Corp. | MIT OpenCourseWare | -
Royal Society Publishing | Humboldt | -
Scitation | SciTopics | -
Crystallography Journals Online | CURATOR | -
Maney Publishing | PsyDok | -
American Physical Society | HKUST | -
Project Euclid | arXiv.org e-Print | -
SIAM | NASA | -


To begin, the user generates a basic search query in order to obtain the text corpus. In relation to the transborder review, the search was performed using the query string ‘drug AND control’. Note that the query may also be written without Boolean operators (e.g. AND, OR and NOT); this is entirely up to the user. A total of 6 902 214 results were obtained (search date: 23/08/2011) from the database using this query. Because this initial result was far too large and the database returns results ranked by relevance, only the first 2000 documents were considered. It is important to note that because the purpose of this initial search is to obtain a wide-ranging list of references for the initial text corpus, only a generalized search query is required.

2.2. Stage 2: article omission criteria

The next stage in the process involves generating a list of article omission criteria in order to reduce the initial search results to a more confined and relevant corpus of text. Note that the omission criteria are only used in the term search method and should not be confused with the eligibility criteria, which are used and defined in the systematic review itself. The eligibility criteria identify what references are accepted for inclusion in the systematic review itself, while the omission criteria are developed primarily to contribute to the preliminary term search method using the LEXIMANCER software.

From the initial results retrieved from the basic search, this body of text is reduced significantly according to the list of omission criteria. This is done manually by the user, by looking through titles of the results and comparing them with the list of omission criteria. Because of the simple nature of the initial search query, it is expected that even the refined list of references will be quite large. LEXIMANCER is able to process large files of data; however, the processing speed becomes problematic when folders containing over 1000 documents are uploaded. In order to maintain a sufficient processing speed, a random fifty per cent of the refined list of references is used as the text corpus. The process of selecting the random fifty per cent depends entirely upon the user; however, we suggest selecting every second document until the desired quantity is achieved.

For the search ‘drug AND control’, the preliminary omission criteria were as follows:

1. We will exclude references on the basis of language. We aim to generate a preliminary list of search terms that will cover the different aspects of transborder drug control using elements of the English language. Because we lack the resources to search and translate languages other than English, any study in a non-English language will be excluded for now.

2. We will exclude medical studies relating to drug issues that do not include treatment strategies for drug-dependent users. This would include, for example, articles discussing the status of diabetes control and antidiabetic drug therapy. Furthermore, medical studies involving articles based on drug trials, drug testing and drug development will be excluded from the initial search.

From the 2000 results retrieved, 486 were omitted according to the aforementioned criteria. Thus, a total of 1514 results remained. After taking a random fifty per cent, 800 references remained as our text corpus.
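The screening and halving steps of Stage 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Reference` record, the `OMIT_KEYWORDS` list and the keyword match are hypothetical stand-ins for the manual title screening described above, and the toy references are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    """Hypothetical record for one search result."""
    title: str
    language: str


# Hypothetical keywords standing in for omission criterion 2; the paper
# screens titles manually rather than by keyword matching.
OMIT_KEYWORDS = ("drug trial", "drug testing", "drug development", "antidiabetic")


def is_omitted(ref):
    """Return True if a reference meets either omission criterion."""
    if ref.language.lower() != "english":  # criterion 1: language
        return True
    title = ref.title.lower()
    return any(keyword in title for keyword in OMIT_KEYWORDS)  # criterion 2


def build_text_corpus(ranked_refs, max_results=2000):
    """Keep the top-ranked results, apply the omission criteria, then
    take every second retained document, as suggested in the text."""
    considered = ranked_refs[:max_results]
    retained = [ref for ref in considered if not is_omitted(ref)]
    return retained[::2]


refs = [
    Reference("Cross-border drug trafficking control", "English"),
    Reference("Antidiabetic drug therapy and diabetes control", "English"),
    Reference("Control de drogas", "Spanish"),
    Reference("Drug law enforcement", "English"),
    Reference("Illicit drug markets", "English"),
]
corpus = build_text_corpus(refs)
```

Recording the keyword list (or, for manual screening, the decision made for each title) alongside the corpus keeps this stage replicable.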

Once this corpus of text is retrieved, it is uploaded into the LEXIMANCER software and run through all stages of processing to produce a default interactive concept map and a list of commonly occurring terms in the corpus. These terms will act as our search terms. The concept map is a visual representation of concepts and themes that occur in the body of retrieved literature. It is presented in graph format, with nodes (points) representing the concepts and lines connecting paths between co-occurring concepts. Note that there is a difference between a concept and a theme, and the two should not be confused. A concept is not necessarily a word that occurs in the text (although it of course could be) but rather a related word defining elements of the corpus of literature. Themes generated by the LEXIMANCER software encompass groups of concepts that are related in the text and can range from extremely broad to very specific. For example, consider the words ‘zebra’, ‘wings’, ‘feed’, ‘monkey’ and ‘wash’. The words zebra, wings and monkey could be defined by the concept ‘animal’, while feed and wash could relate to the concept of ‘job’. Furthermore, these two concepts animal and job could then be encompassed by the theme of ‘zoo’.
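To make the graph structure of a concept map concrete, the following Python sketch counts how often terms occur and co-occur within documents. It is not LEXIMANCER's algorithm, which learns weighted classifiers from seed words; it is a plain document-level co-occurrence count, and the function name, the toy documents and the vocabulary are assumptions made for illustration.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrence_graph(documents, vocabulary):
    """Count term frequency (nodes) and pairwise co-occurrence within
    a document (edges) for the terms in `vocabulary`."""
    vocab = set(vocabulary)
    node_counts = Counter()
    edge_counts = Counter()
    for text in documents:
        tokens = set(re.findall(r"[a-z]+", text.lower()))
        present = sorted(tokens & vocab)
        node_counts.update(present)
        # Each unordered pair of terms found in the same document is one edge.
        edge_counts.update(combinations(present, 2))
    return node_counts, edge_counts


# Toy documents, invented for the example.
docs = [
    "Drug control and law enforcement.",
    "Drug abuse treatment.",
    "Law enforcement targets drug trafficking.",
]
vocab = {"drug", "control", "law", "enforcement", "abuse", "treatment", "trafficking"}
nodes, edges = cooccurrence_graph(docs, vocab)
```

Here `nodes` plays the role of the points on the map and `edges` the lines between co-occurring terms; grouping nodes into themes is a further step that this sketch does not attempt.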

2.3. Stage 3: edit concept seeds

The concept seeds generate the concept map and serve as an important foundation for visual representation. The process of editing the concept seeds is significant in determining the list of search terms as it enables the user to manually eliminate any unnecessary or irrelevant concepts that may interfere with the analysis. This is achieved through a two-step process. Firstly, a manual search of the concepts is conducted, excluding all concepts that are acronyms (e.g. ARC) or countries. This ensures that a general list of search terms is produced that does not contain concepts specific to any particular country, department or organization, that is, a general overview. Although certain countries and organizations may be relevant to the review, they will still feature as text words and therefore do not need to be represented as concepts in this list. Secondly, all abbreviations of common terms such as e.g., etc. and ml, which contribute nothing to the systematic search, are omitted from the preliminary list of search terms. Removing a concept from this list does not eliminate the text in which the concept itself is contained. Once the identified concepts are excluded from the list, the ‘merging’ function of LEXIMANCER is utilized in order to group concepts of similar meaning together into one single category. The merging and removing functions of the LEXIMANCER software enable the search to be refined to a list of concepts that will ultimately produce an accurate interactive concept map of the given text corpus. It is important that the user keeps accurate records of all removed and merged concepts so that the process remains replicable.

A primary objective of this method is to provide a list of terms that accurately encompass the topic of a systematic review. In order to thoroughly represent the scope of the topic, the next step in this process is to add any user-generated concepts that do not appear in the list of concepts (and hence terms) output by LEXIMANCER. A user-generated concept is any concept that the user considers relevant to the search but that does not appear in the generated list. These concepts can be manually entered into the LEXIMANCER software; we suggest that a note is made of which user-defined concepts are added in order to enhance transparency and replicability. After completing the aforementioned stages, LEXIMANCER is run through to the final process stage and the results obtained.


Table 3 displays the concepts that were manually removed from the initial list obtained in the transborder review.

We then merged some concepts together because they had similar meanings (merged concepts shown in Table 4). Within the LEXIMANCER software, concepts starting with a capital letter are different to those that do not, as they are used in the text corpus at the start of a sentence or as a name. Therefore, we merged the two similar words together.
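The removal and merging steps, together with the record keeping recommended above, can be sketched as follows. This is a hedged illustration rather than a description of LEXIMANCER's own functions: the removed concepts come from Table 3 and the merge groups are a subset of Table 4, but the function name, the concept counts and the choice of the first member of each group as the merged label are assumptions.

```python
from collections import Counter

# Concepts removed in the transborder review (Table 3).
REMOVED = {"et", "al", "DEA", "HIV", "UNODC", "kg", "de"}

# A subset of the merged concepts (Table 4). Using the first member of
# each group as the merged label is an assumption made for this sketch.
MERGE_GROUPS = [("Drug", "drug"), ("Country", "countries"), ("Used", "use", "using")]


def edit_concepts(counts, removed, merge_groups):
    """Apply removals and merges, returning the edited counts and a log
    of every change so that the editing step can be replicated."""
    canonical = {name: group[0] for group in merge_groups for name in group}
    edited = Counter()
    log = []
    for concept, count in counts.items():
        if concept in removed:
            log.append(("removed", concept))
            continue
        target = canonical.get(concept, concept)
        if target != concept:
            log.append(("merged", f"{concept} -> {target}"))
        edited[target] += count
    return edited, log


# Hypothetical concept counts, invented for the example.
counts = Counter({"Drug": 5, "drug": 40, "UNODC": 7, "countries": 12,
                  "Country": 3, "et": 9, "treatment": 6})
edited, log = edit_concepts(counts, REMOVED, MERGE_GROUPS)
```

Saving `log` with the review materials gives later users the list of removed and merged concepts that the method asks for.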

The LEXIMANCER software was then run through all stages, producing the top 10 themes and concepts listed in Table 5.

In addition, the following user-generated concepts were added to the list: transborder and cross-border. These concepts are of great importance to the review and were chosen after reading a report involving drug control (United Nations Office on Drugs and Crime, 2011). This document was chosen because it was created by a large, well-resourced organization whose main focus is drug control. The LEXIMANCER maps in Figures 1 and 2 display these themes and concepts.

Table 3. Removed concepts.

Removed concepts
et
al
DEA
HIV
UNODC
kg
de

Table 4. Merged concepts.

Merged concepts
Area/areas
Country/countries
Drug/drug
Drugs/drugs
Drug/drugs
Group/groups
Included/including
Market/markets
Problem/problems
Program/programs
Studies/study
Substances/substance
Used/use/using

Table 5. Top 10 themes and concepts for ‘drug AND control’ search.

Top 10 themes | Top 10 concepts
Drug | Drug
Enforcement | Treatment
Treatment | Problem
Abuse | Control
Crime | Countries
Groups | Enforcement
Countries | Law
Area | Abuse
Cocaine | Illicit
Trafficking | Market


Figure 1. LEXIMANCER map displaying the themes of search ‘drug AND control’.

Figure 2. Leximancer map displaying the concepts of search ‘drug AND control’.


2.4. Stage 4: results

The software produces initial results as a list of top 10 themes and concepts from the text corpus and maps displaying these themes and concepts. Once these top 10 themes and concepts are identified, the Generate Outputs stage of LEXIMANCER is utilized to produce what is known as an ‘Insight Dashboard’. LEXIMANCER identifies words that co-occur in the lists of themes and concepts (herein named co-occurring themes) as the ‘tags’ and the remaining identified concepts, including the user-generated concepts, as the ‘attributes’. The Insight Dashboard is a detailed report that enables the user to identify the closest concept related to each co-occurring theme through means of a qualitative analysis. Thus, the user can subjectively choose which co-occurring themes they would like to include. Furthermore, a list of compounded concepts is provided that serves as suggestions for possible search phrases.

The first section of the report is a high-level visual chart, displayed in ‘magic-quadrant’ format. This format is represented by a graph split horizontally and vertically into four squares (quadrants). The four quadrants provide a graphical representation of the text corpus and depict how the strength of a concept measures against the relative frequency. Quadrant four (top right) is of particular interest, as it displays concepts that are strong, prominent and more likely to coexist with the co-occurring theme. Results obtained from the dashboard display the overviews of ranked concepts for co-occurring themes and the quadrant report. These results are in ranked bar chart format (Leximancer, 2007) of the most prominent concepts within the particular co-occurring themes. This ranking is defined via a measure, which is proprietary information from LEXIMANCER (Leximancer, 2007), that combines the strength and frequency characteristics of the concepts. This section of the report is valuable to the systematic search as it allows the user to recognize certain combinations of words that can be used to form term search queries.
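As a rough illustration of how a quadrant chart partitions concepts, the sketch below assigns a concept to a quadrant from its strength and relative frequency. Everything specific here is an assumption: the cut points, the percentage scale, the placement of strength on the vertical axis and the example scores are hypothetical, and the proprietary prominence measure is not reproduced.

```python
def quadrant(strength, rel_freq, strength_cut=50.0, freq_cut=50.0):
    """Name the quadrant a concept falls in.

    Assumes strength is plotted vertically and relative frequency
    horizontally; the cut points are hypothetical.
    """
    vertical = "top" if strength >= strength_cut else "bottom"
    horizontal = "right" if rel_freq >= freq_cut else "left"
    return f"{vertical} {horizontal}"


# Hypothetical (strength, relative frequency) scores for one theme.
scores = {
    "trafficking": (80.0, 65.0),
    "alcohol": (20.0, 70.0),
    "money": (75.0, 10.0),
}

# Concepts in the top-right quadrant are both strong and frequent.
prominent = [concept for concept, (s, f) in scores.items()
             if quadrant(s, f) == "top right"]
```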

The Generate Outputs stage of LEXIMANCER was used in the transborder review to produce the Insight Dashboard, by identifying the co-occurring themes as the tags and the remaining concepts as the attributes. The quadrant report that was obtained from the Insight Dashboard is shown in Figure 3.

Figure 3. Insight Dashboard: quadrant report for search ‘drug AND control’.


Figure 4 demonstrates one of five overviews of ranked concepts within themes that are contained within the Insight Dashboard. Starting from the left, the first column represents the identified concept within the themes. The second column demonstrates the relative frequency of the concept within the theme, followed by the strength of that concept in the third column. Finally, the last column represents the prominence of the identified concept, with this result in ranked bar chart format.

2.5. Stage 5: preliminary search terms

The results from the LEXIMANCER output and the Insight Dashboard are used to generate a preliminary list of search terms, by combining the list of top 10 concepts (user-generated included) with the two most highly ranked concepts within each co-occurring theme. The two highest ranked concepts are used in order to maintain a confined and discrete list. One could simply take the single highest ranked concept (it is completely up to the user); however, many of the concepts are repeated across themes. For example, the concept ‘drug’ may be included in the theme ‘narcotics’ and also within the theme of ‘abuse’. Thus, we use the top two to account for repetition of concepts amongst themes. Choosing more or fewer ranked concepts to include will only alter the size of the list; a larger list of terms requires more ranked concepts to be chosen. Once again, it is important for the user to document this stage of the method to ensure that transparency and replicability are maintained.

Note that we did not use the top 10 most highly ranked themes in our list of search terms because the themes identified by LEXIMANCER will change according to the position of the concepts on the map and hence will not always be entirely replicable. The list of concepts output by LEXIMANCER will always be the same, provided that the text corpus remains unchanged. However, because the themes represent groupings of concepts on the map, the themes may change slightly if concepts appear in different positions on different maps. In view of that, we did not include the list of top 10 themes in our list of terms.
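The combination rule for the preliminary list can be sketched as below. The top 10 concepts are taken from Table 5 and the user-generated concepts from the text, but the per-theme rankings are hypothetical placeholders (the dashboard rankings are not reproduced in this article), as are the function and variable names.

```python
def preliminary_terms(top_concepts, user_concepts, ranked_by_theme, per_theme=2):
    """Combine the top concepts, the user-generated concepts and the
    highest ranked concepts of each theme, dropping repeats while
    keeping the order in which terms first appear."""
    candidates = list(top_concepts) + list(user_concepts)
    for ranked in ranked_by_theme.values():
        candidates.extend(ranked[:per_theme])
    seen = set()
    terms = []
    for concept in candidates:
        key = concept.lower()
        if key not in seen:
            seen.add(key)
            terms.append(key)
    return terms


# Top 10 concepts from Table 5 and the two user-generated concepts.
TOP_CONCEPTS = ["Drug", "Treatment", "Problem", "Control", "Countries",
                "Enforcement", "Law", "Abuse", "Illicit", "Market"]
USER_CONCEPTS = ["transborder", "cross-border"]

# Hypothetical rankings within two co-occurring themes.
RANKED_BY_THEME = {
    "cocaine": ["production", "coca", "heroin"],
    "countries": ["local", "drug", "market"],
}

terms = preliminary_terms(TOP_CONCEPTS, USER_CONCEPTS, RANKED_BY_THEME)
```

Raising `per_theme` lengthens the list, which mirrors the trade-off described above; repeated concepts such as ‘drug’ are kept only once.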

The list of terms is then manually split into relevant categories by the user, for example, ‘location terms’ and ‘outcome terms’. These categories are chosen by the user according to the objective of the systematic review and must follow the standard guideline for clinical question formulation (PICO). Therefore, the categories are of the form ‘participants or population terms’, ‘intervention terms’, ‘comparison terms’ and ‘outcomes terms’. It is crucial to remember that LEXIMANCER will not distribute the terms in the respective categories automatically. Therefore, once the categories are determined by the user, the words are manually distributed into a table, producing the preliminary list of search terms.

Figure 4. Insight Dashboard: ranked concepts for themes cocaine and countries.

This stage of the process was implemented in order to produce the final list of search terms for the transborder review. The terms were manually split into three relevant categories based on the PICO framework, namely, intervention (control) terms, outcome terms and population (substance) terms. To these, we added a further category, ‘location’, because the geographic variance of interventions was of particular interest to this research question. These categories were chosen according to PICO and the objective of the systematic review. The following preliminary list of search terms in Table 6 was generated as a result of this process.

Additionally, we obtained several lists of ranked compound concepts for each category from the Insight Dashboard. These lists were used as suggestions for possible search queries in the systematic search. For example, for the theme drug, the top 10 most highly ranked concept pairs were

1. control and international
2. abuse and substance
3. social and services
4. trafficking and intelligence
5. crime and criminal
6. abuse and prevention
7. trafficking and operations
8. international and national
9. enforcement and intelligence
10. abuse and alcohol

Therefore, we can combine search terms with elements of this list to generate a query. For example, combining the term drug with lines 1, 6 and 7 of the list forms the search queries ‘drug AND control AND international’, ‘drug AND abuse AND prevention’ and ‘drug AND trafficking AND operations’, which will be used in the systematic search.
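The combination step just described can be sketched in a few lines of Python. The concept pairs are those listed above for the theme drug. The function names `build_query` and `queries_for` are hypothetical helpers introduced for illustration. The choice of which lines to combine remains a user decision.

```python
# Top 10 ranked concept pairs for the theme "drug", as listed in the text.
DRUG_PAIRS = [
    ("control", "international"),
    ("abuse", "substance"),
    ("social", "services"),
    ("trafficking", "intelligence"),
    ("crime", "criminal"),
    ("abuse", "prevention"),
    ("trafficking", "operations"),
    ("international", "national"),
    ("enforcement", "intelligence"),
    ("abuse", "alcohol"),
]


def build_query(term, pair):
    """Join a search term and a concept pair with the Boolean operator AND."""
    return " AND ".join((term, *pair))


def queries_for(term, pairs, selected_lines):
    """Build one query per selected line; line numbers are 1-based, as in the list."""
    return [build_query(term, pairs[n - 1]) for n in selected_lines]


for query in queries_for("drug", DRUG_PAIRS, [1, 6, 7]):
    print(query)
```

Running the sketch prints the three queries given in the example, in the order of the selected lines.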

The following list (Table 7) shows some possible search phrases that can be used when searching for references for the transborder drug control systematic review. We generated this list through the combination of relevant columns in Table 6. These combinations were manually selected by the user for the purpose of demonstrating possible search strings and were not used in the actual search.

Table 6. Preliminary list of search terms.

Intervention terms   Outcome terms   Location terms   Substance terms
Control              Abuse           Countries        Drug
Enforcement          Production      Local            Illicit
Treatment            Criminal        Transborder      Substance
Law                  Problem         Cross-border     Heroin
Prevention           Market          —                Coca
Operations           Money           —                —
Care                 —               —                —
Services             —               —                —
Cultivation          —               —                —

Table 7. List of possible search queries using generated terms that could be used for the transborder review.

Final outcome: terms and queries

Trans OR cross AND border AND drug AND control AND (outcome term)
Trans OR cross AND border AND market AND substance
Trans OR cross AND border AND drug AND law AND enforcement
Trans OR cross AND border AND (substance term) AND operations
Trans OR cross AND border AND money AND drug AND operations
Trans OR cross AND border AND (substance term) AND cultivation
(Substance term) AND countries AND problem AND control
(Substance term) AND criminal AND treatment AND countries
(Substance term) AND production AND control
Border OR countries AND prevention OR operations AND drug
Border AND control AND (substance term)


Figure 5. Concept cloud for the search ‘drug AND control’.


Note that Table 7 was not created by taking a row from Table 6 and joining the words from this row together by Boolean operators. Instead, the search strings were generated by the user on the basis of what fits well together.¶
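Several strings in Table 7 contain the placeholders ‘(substance term)’ and ‘(outcome term)’. One plausible reading, which is an assumption on our part rather than something stated explicitly, is that each placeholder stands for any term in the corresponding column of Table 6. Under that assumption, the following Python sketch expands a user-written template into concrete search strings. The function name `expand` and the `TERMS` dictionary are hypothetical.

```python
import itertools
import re

# Columns of Table 6 that the placeholders are assumed to refer to.
TERMS = {
    "substance": ["drug", "illicit", "substance", "heroin", "coca"],
    "outcome": ["abuse", "production", "criminal", "problem", "market", "money"],
}

# Matches "(substance term)" or "(outcome term)", ignoring capitalisation.
PLACEHOLDER = re.compile(r"\((substance|outcome) term\)", re.IGNORECASE)


def expand(template):
    """Return every search string obtained by filling the template's placeholders."""
    keys = [match.group(1).lower() for match in PLACEHOLDER.finditer(template)]
    if not keys:
        return [template]  # nothing to expand
    results = []
    for combo in itertools.product(*(TERMS[key] for key in keys)):
        replacements = iter(combo)
        # Placeholders are replaced left to right, one term per placeholder.
        results.append(PLACEHOLDER.sub(lambda _m: next(replacements), template))
    return results


for query in expand("Border AND control AND (substance term)"):
    print(query)
```

The templates themselves are still written by the user. The sketch only automates the substitution, so the manual choice of which terms fit well together is preserved.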

2.6. Stage 6: concept cluster cloud

The final stage in the systematic search for terms is a useful visual output, provided by the LEXIMANCER software. This output enables the user/reader to easily view the most frequent and relevant concepts by using what is known as a ‘concept cloud’. Concept clouds, like the concept map, visually display the entire list of concepts in a heat-mapped format. That is, hot colours (e.g. lighter shades of grey) denote the most relevant concepts, whilst cold colours (e.g. darker shades of grey, black) denote concepts of least relevance. Furthermore, the frequency of the concept is depicted through its size, with the larger concepts corresponding to higher frequencies.
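The two visual encodings described above (size for frequency, shade for relevance) can be illustrated with a short Python sketch. The frequencies and relevance scores used here are invented for illustration, and `cloud_style` is a hypothetical helper. It is not LEXIMANCER's rendering code, whose exact scaling is not documented in this paper.

```python
def cloud_style(concepts, min_pt=10, max_pt=40):
    """Map each concept to a (font size, grey shade) pair.

    concepts: dict of name -> (frequency, relevance), with relevance in [0, 1].
    Size grows linearly with frequency; lighter grey means higher relevance.
    """
    max_freq = max(freq for freq, _ in concepts.values())
    styles = {}
    for name, (freq, relevance) in concepts.items():
        size = min_pt + (max_pt - min_pt) * freq / max_freq
        grey = round(255 * relevance)  # 255 is the lightest shade
        styles[name] = (round(size), f"#{grey:02x}{grey:02x}{grey:02x}")
    return styles


# Hypothetical values, chosen only to show the mapping.
example = {"drug": (100, 1.0), "coca": (50, 0.2)}
print(cloud_style(example))
```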

The concept cloud obtained from the transborder term search is displayed in Figure 5. The benefit of including this cloud in the systematic selection of terms, as well as in the review itself, is that it provides a visual aid for determining the frequency and relevance of each concept. In order to present the reader with a thorough understanding of the relevance of concepts in the search, we suggest providing the cluster cloud in the review. Multiple cluster clouds may even be generated using other relevant sets of data in order to compare results.

3. Discussion and further research

In this article, we present an alternative method for search term selection that researchers may be interested in using in preparation for their systematic reviews (and other research). The term selection method that we present is a transparent and replicable method, which utilizes the text analytic capabilities of the LEXIMANCER software. Although certain stages of the method are based on user input, a strict documentation system is strongly advised so that results can be replicated.

¶The list of references from the search is not provided because the search was refined by the Campbell Collaboration to focus on crop targeting.

We demonstrate this method by using a systematic review based on transborder drug control and create a list of possible search queries using generated terms (Table 7). Although the transborder review was chosen for this paper, the final review that was accepted by the Campbell Collaboration focused on crop targeting interventions only. This limitation means that the overall number of hits and included citations resulting from the aforementioned systematic search on transborder interventions cannot be provided.

The LEXIMANCER software provides outputs in the form of highly dense, complex graphs. A highly dense graph is a graph consisting of a large number of edges and vertices (nodes). Analysing the structure of these graphs not only provides details on certain structural properties of the literature but also sheds some light on the data that are represented by the graph itself. A possibility for future research in this area would be to identify common subgraphs between two of the complex graph outputs from LEXIMANCER. By using graph theoretic tools such as structure analysis and algorithm design, this line of research appears feasible and would provide a useful tool in analysing the results of systematic reviews. Another possible area of future research is including expert-elicited information in the term selection process. A text-driven method such as ours does not replace expert elicitation for search term selection. Indeed, in a systematic method such as this one, the stages at which expert users generate input are clearly defined. However, a more sophisticated iterative method could be used to validate the results of the LEXIMANCER term selection process using expert-elicited data.
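As a rough indication of what such a comparison might look like, the Python sketch below finds the structure shared by two concept graphs. It relies on a simplifying assumption: because concept nodes carry labels, two graphs can be compared by intersecting their edge sets, which is much simpler than the general maximum common subgraph problem. The edges shown are hypothetical and are not taken from a LEXIMANCER export, and the function names are our own.

```python
def edge_set(pairs):
    """Represent an undirected graph as a set of two-element frozensets."""
    return {frozenset(pair) for pair in pairs}


def common_subgraph(edges_a, edges_b):
    """Return (nodes, edges) present in both labelled graphs."""
    shared_edges = edges_a & edges_b
    shared_nodes = set().union(*shared_edges)
    return shared_nodes, shared_edges


# Hypothetical concept graphs from two different corpora.
graph_a = edge_set([("drug", "control"), ("drug", "trafficking"), ("coca", "cultivation")])
graph_b = edge_set([("control", "drug"), ("drug", "treatment"), ("cultivation", "coca")])

nodes, edges = common_subgraph(graph_a, graph_b)
print(sorted(nodes))
print(len(edges))
```

Edge direction and weights are ignored here. A fuller analysis would need to decide how to treat co-occurrence strength, which this sketch does not attempt.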

This paper has demonstrated a method for selecting systematic search terms that is transparent, replicable and generalizable across disciplines. The specific method detailed in this paper is intended as an illustration of the type of process that could be used to systematically generate search terms; it is not intended as a general proof of feasibility, and we have not yet validated this method over many research questions. Future work may test the concurrent validity of this search term selection method by comparing the terms generated by this method to those used in a previously published systematic review. By running key papers through the systematic term selection process, we could generate a list of terms that could be directly compared with those generated by the authors of the review. This validation method could be extended if the search was actually performed on the terms generated by this systematic method, taking care to replicate all other parameters of the search used by the original authors. The located papers would then be compared with those located by the original authors. To further validate this search term selection method, the validation process should be replicated across multiple papers from different academic disciplines. We believe that there is a general need to explicate and document the process of search term selection in systematic reviews, and we have presented here an example method for doing so.
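The term comparison proposed above could be quantified in several ways. The Python sketch below shows one option, using set overlap between the generated terms and the terms reported in a published review. The measures chosen (precision, recall and the Jaccard index) are our suggestion rather than part of the method described in this paper, and the term lists are hypothetical.

```python
def term_overlap(generated, published):
    """Compare two term lists, ignoring case, and report simple overlap measures."""
    gen = {term.lower() for term in generated}
    pub = {term.lower() for term in published}
    shared = gen & pub
    union = gen | pub
    return {
        "shared": sorted(shared),
        # Share of generated terms that the published review also used.
        "precision": len(shared) / len(gen) if gen else 0.0,
        # Share of the published review's terms that were generated.
        "recall": len(shared) / len(pub) if pub else 0.0,
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Hypothetical term lists, for illustration only.
generated_terms = ["drug", "control", "heroin", "market"]
published_terms = ["Drug", "heroin", "trafficking"]
print(term_overlap(generated_terms, published_terms))
```

The same comparison could be applied to the sets of located papers once both searches have been run with otherwise identical parameters.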

References

Greenhalgh T, Peacock R. 2005. Effectiveness and efficiency of search methods in systematic reviews of complex evidence: audit of primary sources. British Medical Journal 331: 1064–1065.

Huang X, Lin J, Demner-Fushman D. 2006. Evaluation of PICO as a knowledge representation for clinical questions, in Proceeding of the 2006 Annual Symposium of the American Medical Informatics Association (AMIA), pp 359–363. Washington DC, November 11–15.

Lefebvre C, Manheimer E, Glanville J. On behalf of the Cochrane Information Retrieval Methods Group. 2011. Chapter 6: Searching for studies. In Higgins J, Green S (eds.), The Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0 (online edition). Available from: www.cochrane-handbook.org. Accessed August 15, 2012.

Leximancer. Manual version 2.23. 2007. Retrieved from: https://www.leximancer.com/wiki/images/archive/7/77/20080826071142!Leximancer_V2_Manual.pdf. Accessed July 18, 2012.

Mazerolle L, Higginson A, Thompson J, Somerville A. 2012. The effectiveness of crop targeting as a drug control strategy (protocol). The Campbell Collaboration Library of Systematic Reviews. Available from http://www.campbellcollaboration.org/library.php.

Robinson KA, Dickersin K. 2002. Development of a highly sensitive search strategy for the retrieval of reports of controlled trials using PubMed. International Journal of Epidemiology 31: 150–153.

Rother ET. 2007. Systematic literature review x narrative review. Acta paulista enfermeria [online] 20: 5–6. ISSN 1982–0194.

Sayers A. 2007. Tips and tricks in performing a systematic review. British Journal of General Practice 57: 759.

Schlosser RW, Wendt O, Bhavnani S, Nail-Chiwetalu B. 2006. Use of information-seeking strategies for developing systematic reviews and engaging in evidence-based practice: the application of traditional and comprehensive pearl growing. A review. International Journal of Language & Communication Disorders 41: 567–582.

Smith AE, Humphreys MS. 2006. Evaluation of unsupervised semantic mapping of natural language with Leximancer concept mapping. Behavior Research Methods 38: 262–279.

United Nations Office on Drugs and Crime. 2011. World Drug Report. United Nations Publication: New York.

Copyright © 2013 John Wiley & Sons, Ltd. Res. Syn. Meth. 2013