2009 Abstracts


Authorship and Contributorship

Influence of Authorship Order and Corresponding Author on Perceptions of Authors' Contributions

Mohit Bhandari,1,2 Jason Busse,2 Abhaya Kulkarni,2 P. J. Devereaux,2 Pamela Leece,2 Sohail Bajammal,1 and Gordon Guyatt2

Objective The majority of medical journals rely on the order of authors and who is listed as corresponding author to convey authors' contributions. How readers interpret authors' roles based on authorship order and corresponding author remains uncertain. We explored how, on the basis of authorship order and designation of corresponding author, academic leaders perceive authors' contributions to research.

Design We conducted a cross-sectional survey of chairpersons in the departments of surgery and medicine across North America (259 United States and 32 Canada). We developed a hypothetical study and authorship bylines with 5 authors, varying the corresponding author as first or last author. Respondents reported their perceptions about the authors' roles in study conception and design, analysis and interpretation of data, and statistical analysis, and their view of the most prestigious authorship position. We used multinomial regression to explore the association of corresponding author, surgical versus medical chair, and country of respondent with respondents' beliefs about author roles.

Results One hundred sixty-five chairpersons (144 United States and 21 Canada; overall response rate: 57%) completed our survey. When the last author was designated as corresponding, perceptions of the first author's role in study concept and design (odds ratio [OR], 0.25; 95% confidence interval [CI], 0.15-0.41, P < .001) and analysis and interpretation of results (OR, 0.22; 95% CI, 0.13-0.38, P < .001) decreased significantly. Overall prestige of the last author position increased significantly when designated as corresponding author (OR, 4.0; 95% CI, 2.4-6.4, P < .001). Respondents varied widely in their inferences about the contributions of the remaining authors irrespective of who was corresponding, with fewer than 40% attributing any particular role to authors 2 to 4. Our findings did not differ significantly by specialty or country of the respondent.
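As a rough consistency check on the odds ratios above, the standard error of log(OR) can be recovered from each reported 95% CI and converted to a z statistic. This is a minimal sketch, assuming the intervals are symmetric on the log scale and were built with the usual 1.96 multiplier (the abstract does not say); the function name `z_from_or_ci` is hypothetical.

```python
import math

def z_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover SE(log OR) from a 95% CI; return (se, z, two-sided P).

    Assumes a Wald-type interval that is symmetric on the log scale.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal P value
    return se, z, p

# Values as reported in the Results paragraph.
reported = {
    "first author, concept and design": (0.25, 0.15, 0.41),
    "first author, analysis and interpretation": (0.22, 0.13, 0.38),
    "last author, prestige": (4.0, 2.4, 6.4),
}

for label, (or_, low, high) in reported.items():
    se, z, p = z_from_or_ci(or_, low, high)
    print(f"{label}: SE(log OR) = {se:.3f}, z = {z:.2f}, P < .001: {p < 0.001}")
```

Under these assumptions, all three estimates give P values below .001, in line with what the abstract reports.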

Conclusions Academic department chairs were influenced substantially by corresponding author designation. We further confirm that without authors' explicit contributions in research papers, many readers will remain uncertain or draw false conclusions about author credit and accountability.

1Division of Orthopedic Surgery, McMaster University/Hamilton General Hospital, 237 Barton St E, 7 North, Suite 727, Hamilton, Ontario L8L 2X2, Canada, e-mail:; 2Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, Ontario, Canada; and 3Division of Neurosurgery, University of Toronto, Toronto, Ontario, Canada

Ghost Writers and Honorary Authorship: A Survey From the Chinese Medical Journal

Xiu-yuan HAO, Shou-chu QIAN, Su-ning YOU, and Mou-yue WANG

Objective To estimate the prevalence of ghost authors and honorary authors among Chinese authors according to the authorship criteria defined by the International Committee of Medical Journal Editors (ICMJE).

Design The first authors who submitted manuscripts to the Chinese Medical Journal during September 1 to December 31, 2008, were surveyed by questionnaire. The questionnaire was designed with 3 questions: (1) Who made the final decision on the authorship of your manuscript submitted to the journal? (2) Is someone without contribution to the research listed as an author? (3) Has an English-language native speaker done something for you in preparation of the manuscript? If yes, is the speaker listed in the byline or acknowledgment section? The questionnaire was sent to the authors via e-mail. Authors working abroad or from overseas were excluded from the survey.

Results Among 268 authors who received the questionnaire, 231 (86%) authors responded and 220 (82%) authors had effective questionnaires. The byline of manuscripts was decided by the corresponding author for 40.9% of authors, by the first author for 33.6%, and by all authors for 25.5%. English-language speakers were involved in the preparation of manuscripts of 21.4% of authors, of whom 7.3% were listed as authors, 3.6% were acknowledged, and 10.4% were neither listed in the byline nor acknowledged (ghost writers). Among respondents, 71.4% reported that all authors in the byline satisfied the authorship criteria, and 28.6% reported authors who were unqualified but listed (honorary authors), of whom 20.0% were heads of departments or institutions, 3.2% were friends of major authors, and 5.5% were others. The prevalence of honorary authorship was almost the same whether the byline was decided by an individual author (28%) or by a group of authors (28.6%; χ² = 0.006, P > .05). The appearance of ghost writers was more common when the byline was decided by an individual author (52.8%) than by a group of authors (41.6%), but the difference was not statistically significant (χ² = 0.11, P > .05).
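A chi-square statistic can be converted to a P value directly. The sketch below assumes 1 degree of freedom (a 2 × 2 comparison of individual versus group byline decisions, which the abstract does not state explicitly); the function name `chi2_p_value_df1` is hypothetical. With 1 degree of freedom the statistic is the square of a standard normal variable, so the upper-tail probability reduces to a complementary error function.

```python
import math

def chi2_p_value_df1(chi2):
    """Upper-tail P value for a chi-square statistic with 1 degree of freedom.

    For df = 1, P(X > chi2) = P(|Z| > sqrt(chi2)) = erfc(sqrt(chi2 / 2)).
    """
    return math.erfc(math.sqrt(chi2 / 2))

# Statistics as reported in the Results paragraph (df = 1 is an assumption).
for label, stat in [("honorary authorship", 0.006), ("ghost writers", 0.11)]:
    print(f"{label}: chi-square = {stat}, P = {chi2_p_value_df1(stat):.2f}")
```

Statistics this small correspond to P values well above .05, which agrees with the description of both comparisons as showing no significant difference.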

Conclusions The rates of ghost writers and honorary authors reported by authors who publish in the Chinese Medical Journal are similar to rates previously reported among US general medical journals for ghost authors and honorary authors. Many Chinese authors may be unaware of the authorship criteria defined by ICMJE.

Chinese Medical Journal, Chinese Medical Association, 42 Dongsi Xidajie, Beijing 100710, China, e-mail:

Prevalence of Honorary and Ghost Authorship in 6 General Medical Journals, 2008

[Updated September 10, 2009]

Joseph Wislar, Annette Flanagin, Phil B. Fontanarosa, and Catherine D. DeAngelis

Objective Given the increased awareness of authorship responsibility and inappropriate authorship and following new policies of some medical journals to publish individual author contributions, this study was conducted to assess the prevalence of honorary and ghost authors in 6 leading general medical journals in 2008 and compare this to the prevalence reported by authors of articles published in 1996.

Design Online survey of corresponding authors of 896 research articles, review articles, and editorial/opinion articles published in 6 general medical journals in 2008 (Annals of Internal Medicine, JAMA, Lancet, Nature Medicine, New England Journal of Medicine, and PLoS Medicine; selected according to Institute for Scientific Information Journal Citation Report ranking in general medicine category). Based on previously reported prevalence rates of 19% of articles with honorary authorship, 9% with ghost authorship, and 2% with both honorary and ghost authorship, the sample size was determined with alpha at 0.05 and 80% power to detect a 10% difference between prevalence rates in 1998 and 2008. We also analyzed rates of honorary and ghost authors according to article types and compared rates for the 4 journals that publish authorship contributions vs the 2 journals that do not.

Results A total of 630 (70%) corresponding authors responded to the survey. Included in the final data set were 230 (37%) research articles, 136 (22%) reviews, and 264 (42%) editorials. Based on 545 usable responses on honorary authorship, 112 (21%) articles had honorary authors (range by journal: 14%-32%). This is a non-significant change from 1996 (19%; P = .387). Based on 630 responses, 49 (8%) articles had ghost authors (range by journal: 2%-11%); this is a significant decline from 1996 (12%; P = .024). The prevalence of honorary authors in 2008 was highest in Nature Medicine (32%) and lowest in New England Journal of Medicine (14%). The prevalence of ghost authors was highest in New England Journal of Medicine (11%) and lowest in Nature Medicine (2%). Honorary authors were reported for 26% of original research reports, 17% of reviews, and 17% of editorials. Ghost authors were more prevalent in research articles (12%) than reviews (6%) and editorials (5%). There were no significant differences in rates of honorary and ghost authorship between journals that require author contribution disclosures and those that do not. Further analyses will explore article and author characteristics.

Conclusion The prevalence of honorary authors has not changed significantly since 1996, but ghost authorship has declined significantly. The prevalence of honorary and ghost authors is still a concern.

JAMA, 515 N State St, Chicago, IL 60654, USA; e-mail:

Ghost Writing: How Some Journals Aided and Abetted the Marketing of a Drug

Lisa Bero and Jenny White

Objectives To describe the processes by which a pharmaceutical company planned journal publications. To examine variation in policies regarding ghostwriting among the targeted scientific journals.

Design We conducted a retrospective review of over 1000 internal industry documents concerning the gabapentin (Neurontin) case from the University of California, San Francisco, Drug Industry Document Archive, dated 1996-1999. We searched PubMed and Google Scholar (for gray literature) for articles planned for publication. We selected a set of articles resulting from grants that Parke-Davis gave to Medical Education Systems (MES) in 1996-1997 to publish 24 scientific articles and letters to the editor on gabapentin in peer-reviewed journals with prestigious "guest authors." We searched Web sites of and contacted all journals to which these articles were submitted for their former and current policies regarding conflict of interest, ghost authorship, adherence to standards of the International Committee of Medical Journal Editors (ICMJE), independent disclosure verification procedures, and reasons for rejecting the manuscript, if applicable. We also contacted ICMJE to determine if these journals followed the Uniform Requirements for Manuscripts Submitted to Biomedical Journals (as updated in 1997).

Results Eleven proposed articles were eventually published in 7 of the 24 journals originally targeted. None of these articles disclosed participation of Parke-Davis or MES in authorship. Only 2 articles disclosed funding by Parke-Davis or MES. Nine articles were apparently published in alternative journals, 4 of them in supplement issues. Currently all of these journals have policies requiring disclosure of conflicts of interest, but in a majority of cases ghost authorship is not addressed. Information on disclosure verification is sparse.

Conclusions Scientific journals have varied in their effectiveness in requiring and verifying full disclosure of conflicts of interest and authorship. Uniform standards must be implemented across all journals, so that companies wishing to conceal this information do not have an outlet to publish in journals with weaker policies. These standards should extend to journal supplements and letters to the editor.

Department of Clinical Pharmacy, Institute for Health Policy Studies, University of California, San Francisco, Suite 420, Box 0613, 3333 California St, San Francisco, CA 94143, USA, e-mail: berol@pharmacy.

Peer Review

The Natural History of Peer Reviewers: The Decay of Quality

Michael Callaham

Objective Commonly used methods of training and screening journal peer reviewers have been shown not to improve the quality of reviews. Does individual reviewer performance change over time, and in what ways?

Design All reviews for Annals of Emergency Medicine from 1994 through 2008 were rated and assessed. We used linear mixed effect models to analyze the rating changes over time, calculating actual and predicted intercept and slope for each reviewer, using the xtmixed routine in Stata (ver 10; StataCorp, College Station, TX) and controlling for editor and manuscript. The hypothesis was that individual reviewer performance changes over time.

Results All completed reviews (14,808) by 1498 reviewers were rated on a validated 5-point scale by 84 editors. Reviewers (academic clinicians and clinical researchers) had served the journal a mean of 71.8 months (minimum of 2 months [159 reviewers, 11% of the total], maximum of 175 months), completing a mean of 13.4 reviews each (range, 1-100) with an average quality score of 3.6. The average score of the pool did not change during the study period, indicating that application of the rating scale did not change. Reviewers with persistent unsatisfactory scores were removed from regular reviewing. Those with only 1 review (429 reviewers, 29% of the total) could not demonstrate a trend, so were excluded from the analysis. Individual reviewers deteriorated over time at a mean rate of –0.0402 review points per year (95% confidence interval [CI], –0.04733 to –0.0330), but this was masked by the increased quality of new recruits (0.022 points/year). A very small proportion of reviewers improved at 0.05 points/year or better (<1% of the total), and a much larger group grew worse at the same rate (32% of the total). Even 47 senior reviewers selected for consistent quality and volume grew worse at the rate of –0.023 points per year.
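The idea of a per-reviewer trend can be illustrated with an ordinary least-squares slope fitted separately to each reviewer's ratings. This is a deliberately simplified sketch and not the mixed-effects model the study describes: it ignores the editor and manuscript adjustments and does no shrinkage toward the pool mean. The data and the names `ols_slope` and `reviewer_slopes` are hypothetical.

```python
from collections import defaultdict

def ols_slope(xs, ys):
    """Least-squares slope of ys on xs (xs must contain at least 2 distinct values)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def reviewer_slopes(ratings):
    """Map reviewer id -> change in score per year.

    ratings: iterable of (reviewer_id, years_since_first_review, score).
    Reviewers without at least 2 distinct time points are skipped, mirroring
    the exclusion of reviewers with only 1 review.
    """
    by_reviewer = defaultdict(list)
    for reviewer_id, years, score in ratings:
        by_reviewer[reviewer_id].append((years, score))
    slopes = {}
    for reviewer_id, points in by_reviewer.items():
        xs = [t for t, _ in points]
        if len(set(xs)) < 2:
            continue  # no trend can be estimated
        slopes[reviewer_id] = ols_slope(xs, [s for _, s in points])
    return slopes

# Hypothetical ratings on a 5-point scale.
example = [
    ("A", 0, 4), ("A", 1, 4), ("A", 2, 3),  # declining reviewer
    ("B", 0, 3), ("B", 2, 4),               # improving reviewer
    ("C", 0, 5),                            # single review, excluded
]
print(reviewer_slopes(example))
```

A mixed-effects model would instead estimate all reviewers jointly, which stabilizes slopes for reviewers with few reviews.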

Conclusions Reviewers who improved over time or deteriorated were identified, which could lead to identifying characteristics of their demographics or experience that could predict future performance. However, the majority of reviewers deteriorated over time, although at a very gradual rate.

Department of Emergency Medicine, Box 0208, University of California, San Francisco, CA 94143-0208, USA, e-mail:

Does a Mentoring Program for New Peer Reviewers Improve Their Review Quality? A Randomized Controlled Trial

Debra Houry,1 Michael Callaham,2 and Steven Green3

Objective Traditional methods of peer reviewer training and selection have been shown to be ineffective. We instituted a mentoring program, pairing senior reviewers (selected for high quality) and new reviewers on the same manuscripts, to see if this intervention would improve quality as measured by editors' validated quality scores of all reviews.

Design Over a 2-year period, all new reviewers were randomly assigned to a control group or an intervention group by blinded technique. The intervention group was invited by e-mail to join our mentoring program and asked to communicate with their assigned senior reviewer mentor (by e-mail or phone) each time they were assigned a manuscript. Mentors were volunteers chosen for consistent timeliness and quality over years. Mentors and mentees (who were paired by topic during the study) were also notified each time either was assigned a manuscript to review; both groups (and editors) were blinded as to the study intervention. The content and amount of communication were left to the mentor and mentee. After 3 reviews, mentees were surveyed regarding their experience. We calculated reviewer-specific and average trend lines for both groups, accommodating for lack of independence of ratings using SAS Proc MIXED models (ver 9.1, SAS Institute, Cary, NC) with random intercepts and slopes correlation structures. We tested for differences in the average longitudinal trends between the 2 groups during the study period.

Results A total of 17 mentees, 15 controls, and 16 mentors completed 194 reviews. Both mentees and control group reviewers received the same number of invitations, but mentees accepted and completed more reviews than control group reviewers (109 vs 84), and mentee mean scores were higher than control group scores when controlling for within reviewer trends and variations in volume and group trend effects (3.81 vs 3.24; difference: –0.56 [95% confidence interval, –1.048 to –0.078], P = .027). Satisfaction was not assessed since it has previously been shown not to predict performance, but participants were surveyed for their suggestions.

Conclusion A simple system of pairing newly recruited peer reviewers with volunteer reviewer mentors chosen for consistent quality over time resulted in slightly more reviews accepted and higher review scores.

1Department of Emergency Medicine, Emory University School of Medicine, Atlanta, GA, USA; 2Department of Emergency Medicine, Box 0208, University of California, San Francisco, CA 94143-0208, USA, e-mail:; 3Loma Linda University Medical Center, Santa Barbara, CA, USA

Surveys of Current Status in Biomedical Science Grant Review: Funding Organizations and Grant Reviewers' Perspectives

Sara Schroter,1 Trish Groves,1 and Liselotte Højgaard2

Objective To describe current status of biomedical grant review and to seek views on developing uniform requirements for the format and peer review of grant proposals.

Design Online survey to convenience sample of 57 international public and private grant-giving organizations. Nine of these sent an online survey via e-mail to a random sample of grant reviewers.

Results Of the grant organizations, 28 (49%) from 19 countries responded. Organizations reported as "frequent/very frequent" (and with deterioration over the past 5 years) these problems: declined review requests (54%), late reports (36%), administrative burden (21%), difficulty finding new reviewers (14%), and reviewers not following guidelines (14%). Half reported providing decisions back to reviewers; 29% gave feedback on usefulness of reviews, and 46% rationed requests to reviewers. Of responding organizations, 57% supported the idea of uniform requirements for grant review, and 61% supported a uniform format of proposals. Of the 371 grant reviewers, 229 (62%) responded from 22 countries. Of these, 48% had reviewed for at least 10 years; 47% said their institutions encouraged grant review, yet only 7% were given protected time, and 75% received no academic recognition for grant review. Reviewers rated as "extremely/very important" the following in deciding to review: 51% supporting external fairness, 48% relevance of topic, 46% professional duty, 43% keeping up to date, and 41% avoiding suppression of innovation. Sixteen percent reported that guidance was very clear; 85% reported not having been trained in grant review, and 63% reported that they would like training. For more than half, lack of recognition and pay were never barriers to reviewing.

Conclusions Funders reported a growing workload that is getting more difficult to peer review. About two-thirds of grant organizations supported development of uniform requirements for the format and peer review of proposals. Just under half of grant reviewers take part for the good of science and professional development, but many lack academic and practical support and clear guidance. Given these findings, we propose that work starts on developing uniform requirements for grant review.

1BMJ Editorial, BMA House, Tavistock Square, London WC1H 9JR, UK, e-mail:; 2Rigshospitalet, University of Copenhagen, Denmark

Data Sharing and Conflicts of Interest

Reproducible Research: Biomedical Researchers' Willingness to Share Information to Enable Others to Reproduce Their Results

Christine Laine,1 Michael Berkwits,1 Cynthia Mulrow,1 Mary Beth Schaeffer,1 Michael Griswold,2 and Steven Goodman2

Objective "Reproducible research" is a model for communicating research that promotes transparency of methods used to collect, analyze, and present data. It allows independent scientists to reproduce results using the original investigators' same procedures and data and requires a level of transparency seldom sought or achieved in biomedical research. While the full reproducible research model involves more than data sharing, some form of sharing is a basic requirement. This report describes the willingness of biomedical researchers to share their study materials with others.

Design Authors of research articles published in Annals of Internal Medicine in 2008 were asked whether and under what conditions they would make available to others their protocols, statistical code, and data. We will ask researchers who stated that they would make materials available to report the number of requests received.

Results Of 72 articles, authors of 13% stated that protocol was available without conditions, 58% with conditions, and 17% not available. Statistical code was available without conditions for 3%, with conditions for 60%, and unavailable for 24%. Data were available without condition for 4%, with conditions for 57%, and unavailable for 38%. Most authors who said materials were available required interested parties to contact them first and many stated specific conditions for sharing these materials. Authors provided no statement about protocol, statistical code, and data availability for 13%, 14%, and 1%, respectively. Table 1 shows reported availability of study materials by study characteristics.

Table 1. Reported Availability of Study Protocol, Statistical Code, and Data by Study Characteristics (N = 72)

Conclusions While the majority of authors stated that they would make study materials available to others, most would do so only if others contacted them and attached requirements to the sharing of this information. Researchers were most willing to fully share their protocols and least willing to share data. Information on a larger sample and frequency of requests for materials will be available for presentation in September 2009.

1Annals of Internal Medicine, 190 Independence Mall W, Philadelphia, PA 19106, USA, e-mail; 2Johns Hopkins University, Baltimore, MD, USA

Investigator Experiences With Financial Conflicts of Interest in Clinical Trials

Paula A. Rochon,1 Melanie Sekeres,2 John Hoey,3 Joel Lexchin,4 Lorraine E. Ferris,2 David Moher,5 Wei Wu,1 Sunila R. Kalkar,1 Marleen Van Laethem,6 Andrea Gruneir,1 Jennifer Gold,7 M. James Maskalyk,2 David L. Streiner,2 and An-Wen Chan8

Objective To determine the extent to which investigators follow best practices to mitigate financial conflicts of interest in conducting clinical trials. We hypothesized that nonindustry-funded trials may engage in best practices more often than those with commercial funding.

Design E-mail survey of 1109 investigators from Canadian trial sites listed in 2 international trial registries in November 2006. We asked investigators about their experiences with trials conducted from 2001 to 2006. The main outcome was the frequency of 11 best practices, as defined by expert consensus and the literature, to mitigate financial conflicts of interest in trial preparation, conduct, and dissemination stratified by funding source (industry vs non-industry), and type of regulation (externally vs self-regulated).

Results A total of 844 investigators responded (76%), and 732 provided information for analysis. Fifty-five percent had been investigators on both industry- and nonindustry-funded trials. Overall, 41 (6%) investigators experienced best practices in all of their trials. The 3 externally regulated best practices (trial registration, institutional review of signed contracts and budgets) were more or equally likely to be followed in the industry relative to the nonindustry funding environment. Self-regulated practices (contracts had no restrictive confidentiality clauses; sponsor did not own the study data; investigator had access to data from all sites; investigator controlled final decisions regarding study design, analysis, interpretation, and manuscript content) were more frequently followed in nonindustry- than industry-funded trials (P < .001). Overall, 269 (37%) investigators reported having personally experienced (n = 85) or witnessed (n = 236) a financial conflict of interest; more than 70% of these situations related to industry-funded trials.

Conclusions Few investigators report always following best practices to mitigate financial conflict of interest in their clinical trials experience. Compliance was higher when best practices were externally regulated and trials were not industry funded.

1Women's College Research Institute at Women's College Hospital, 790 Bay St, Toronto, Ontario M5G 1N8, Canada, e-mail: paula.rochon@; 2University of Toronto, Toronto, Ontario, Canada; 3Queen's University, Kingston, Ontario, Canada; 4York University, Toronto, Ontario, Canada; 5Ottawa Health Research Institute, Ottawa, Ontario, Canada; 6Toronto Rehabilitation Institute, Toronto, Ontario, Canada; 7Baycrest, Kunin-Lunenfeld Applied Research Unit, Toronto, Ontario, Canada; 8Mayo Clinic, Rochester, MN, USA

Acknowledgment of Company Support in Research Publications From Investigator-Sponsored Studies

John M. Ellison, Rosarito P. Jahn, and Kirsten C. Kempe

Objective Many pharmaceutical and medical device companies provide support for independent research through investigator-sponsored study (ISS) programs (also known as investigator-initiated studies). Typically, a company's role in ISS projects is limited to providing funding and/or products, with minimal technical input. The investigator initiates and conducts the study and is responsible for complying with all regulations applicable to both investigators and sponsors. The International Committee of Medical Journal Editors (ICMJE) guidelines state that financial and material support should be acknowledged in published research. Our objective was to examine the frequency of acknowledgment in recent publications arising from a medical device company's ISS program.

Design Publications (2004 through 2008) traceable to ISS agreements were identified by searching company files, institutional Web sites, and the PubMed database. Each publication was examined by 2 or 3 reviewers for eligibility. Publications not addressing the principal goals of the ISS were excluded. Eligible articles were examined for acknowledgment of company support.

Results Our preliminary search identified 55 publications (from 20 peer-reviewed medical journals) arising from 18 ISSs. Several ISSs generated multiple publications. Disclosure of company support was found in 31 of the 55 (56%) publications. Acknowledgment was found in 10 of the 14 publications (71%) from ISSs that received company monetary support versus 21 of 41 (51%) publications from ISSs that received product (a diagnostic device) only. Notably, all of the publications without disclosure came from 6 ISSs; 12 of the 18 (67%) ISSs did acknowledge company support in their publications.

Conclusions Publications from one-third of the ISSs we evaluated did not acknowledge company support. Authors may fail to acknowledge support when the company provides product (a diagnostic device) because it is rarely the intervention under study. Acknowledgments may also be missing due to journal practices or author preferences. All organizations providing support should be acknowledged regardless of the level or type of support provided. Opportunities exist for authors, editors, and industry to improve the transparency of ISS publications.

LifeScan, Inc, a Johnson & Johnson Company, 1000 Gibraltar Dr, Mailstop 3i, Milpitas, CA 95035, USA, e-mail:

Commercial Relationships, Funding, and Full Publication of Randomized Controlled Trials Initially Reported in Conference Abstracts

Isabel Rodriguez-Barraquer, Bonnielin Swenor, Roberta Scherer, and Kay Dickersin

Background In 1998, the Association for Research in Vision and Ophthalmology (ARVO) started requiring disclosure of author commercial relationships in meeting abstracts.

Objective To explore a possible association between commercial conflicts of interest and publication of randomized controlled trials (RCTs) initially presented as ARVO conference abstracts.

Design We identified RCTs presented as ARVO abstracts in years 2001-2003 and extracted data from each abstract, including "author commercial interest" (as defined by ARVO), study funding, and direction of results of primary outcome. Commercial relationships and funding sources were not mutually exclusive. Using PubMed (latest search March 2009) and direct author contact, we identified full reports associated with included abstracts. We will present 2001-2003 data and will also explore the effect of additional author and study characteristics.

Results We identified 151 abstract reports, of which 130 reported results for the primary outcome. Sixty-nine abstract reports (53%) had been published in full. Commercial relationships were reported as commercial institutional research support, 23 (18%); personal financial interest, 32 (25%); and no commercial relationship, 86 (66%). Full publication of abstracts with authors having commercial institutional support was 65% (15/23, 95% confidence interval [CI], 46%-85%), for those disclosing personal financial interests was 63% (20/32, 95% CI, 46%-79%), and for those reporting no commercial relationships was 49% (42/86, 95% CI, 38%-59%). Table 2 shows publication rates according to commercial relationship categories and results for the primary outcome. Abstracts noting commercial relationships had a higher full publication rate when the primary outcome result favored the experimental group. Among those studies with results favoring the control group (n = 11), no study disclosed commercial relationships.

Table 2. Publication Rates According to Commercial Relationship Categories

a Direction of the difference between the experimental and comparison group for the primary outcome or main study results

b Author commercial relationship categories not mutually exclusive

c Number of abstracts disclosing relationship
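The confidence intervals reported for the full-publication rates can be reproduced with a normal-approximation (Wald) interval for a proportion. The abstract does not say which method was used, so the Wald interval is an assumption here, and the function name `wald_ci` is hypothetical.

```python
import math

def wald_ci(successes, total, z=1.96):
    """Return (proportion, lower, upper) for a normal-approximation 95% CI."""
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)

# Counts as reported in the Results paragraph.
groups = {
    "commercial institutional support": (15, 23),
    "personal financial interest": (20, 32),
    "no commercial relationship": (42, 86),
}

for label, (published, n) in groups.items():
    p, low, high = wald_ci(published, n)
    print(f"{label}: {p:.0%} (95% CI, {low:.0%}-{high:.0%})")
```

With these counts the rounded bounds match the reported intervals (46%-85%, 46%-79%, and 38%-59%). For groups as small as 23 or 32 abstracts, a Wilson or exact interval would usually be preferred, but it would give slightly different bounds.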

Conclusions Our preliminary data suggest that studies disclosing commercial relationships show higher publication rates, with results favoring the experimental group, compared with studies reporting no commercial relationship. Further research is needed to explore this association.

Johns Hopkins Bloomberg School of Public Health, Department of Epidemiology, Mailroom W5010, 615 N Wolfe St, Baltimore, MD 21205, USA, e-mail:

Perceptions and Integration of Conflict of Interest Disclosures Among Peer Reviewers

Suzanne Lippert,1 Michael Callaham,2 and Bernard Lo2

Objective We investigate how peer reviewers' understanding of key terms in conflict of interest (COI) statements varies in relation to their demographics and reviewer performance and how peer reviewers believe their understanding influences their assessment of articles.

Design We conducted an online survey with questions in the following categories: COI knowledge, perception of COI disclosures, personal information, and integration of COI statements into article assessment. A random sample of 146 core reviewers and 264 specialist reviewers for Annals of Emergency Medicine were invited to participate. Survey responses were linked to performance data in the journal files and coded for confidentiality. We provide descriptive statistics of survey responses and examine the relationship of reviewers' demographics and performance scores with their knowledge of the activities of serving as a consultant or on a speakers bureau and their reported means of integrating that knowledge in article assessments.

Results Of the 410 invited reviewers, 50% completed the survey. The following percentages of respondents believe it to be likely or very likely that a company expects the presentation of medical content given by a member of a speakers bureau to be consistent with the company's marketing message, < 80%; that a typical consultant to a company would be reluctant to jeopardize their relationship with the company, < 70%; and that sponsorship of research influences an author's judgment, < 90%. Seventy-five percent disagree or strongly disagree that authors have unlimited access to the company data without restrictions on publication. Seventy-four percent report that they would read an article more or much more carefully if the author serves on a speakers bureau, 80% if the author is a consultant, and 85% if the author owns stock. The majority of respondents, however, report that their recommendation for publication is unchanged if the lead author falls into any of these categories.

Conclusion The majority of peer-reviewer respondents believe that financial ties to industry influence authors; nevertheless, most also believe that their recommendation of articles authored by physicians in these conflicted roles remains unchanged.

1ACMC, Emergency Medicine, 5416 Bryant Ave, Oakland, CA 94618, USA, e-mail:; 2Department of Emergency Medicine, University of California, San Francisco, CA, USA

Editorial Training, Decisions, Policies, and Ethics

Background, Training, and Familiarity With Ethical Standards of Editors of Major Medical Journals

Victoria Wong1 and Michael Callaham2

Objective To characterize the current demographics, training, experience, and familiarity with scientific publication ethics of editors of medical journals, expanding on a similar survey of journals in the 1994 Science Citation Index.

Design The 2007 Journal Citation Reports database was used to determine the journals containing the top 50% of total citations within each of 44 medical specialty categories. Category keywords were used to select only journals that a physician might use in making clinical decisions. An electronic survey was sent to the editors in chief of 190 journals (representing 5 million total citations), requesting information about demographics, training, and editorial duties. We included hypothetical scenarios about plagiarism, authorship, conflicts of interest, and peer review to test editorial knowledge.

Results Surveys have been received from 93 editors for a 49% response rate. Respondents to date include editors in 38 of the 44 medical topic categories, whose journals were cited 2.86 million times in the past year. 92% (81) of respondents were male, 81% (58) participated in some degree of patient care, and 79% (71) spent less than 50% of their work time doing editorial duties. Although 86% (78) of respondents were "confident" or "very confident" in their knowledge of scientific publication ethics when they began the survey, this number dropped to 71% (65) by the end. Performance on the editorial scenarios was poor; correct answers were given by 18% (14) to the question on plagiarism, 30% (27) to authorship, 15% (14) to conflicts of interest, and 16% (15) to peer review. Forty-nine percent (44) believed that additional training in scientific publication ethics would significantly enhance their skills as an editor.

Conclusion Despite high confidence levels among editors in chief of medical journals participating in this survey, there is still a need as well as a demand for further education in scientific publication ethics.

1University of California Davis Medical Center, Neurology Department, 4860 Y St, Suite 3700, Sacramento, CA 95817, USA, e-mail: vwongmd@; 2Department of Emergency Medicine, University of California, San Francisco, CA, USA

A Qualitative Study of Editorial Decisions About Publication of Randomized Trials at Major Biomedical Journals

Kirby Lee,1 Elizabeth Boyd,2 and Lisa Bero1

Objective To understand the decision-making process of journal editors when deciding whether to publish randomized controlled trials (RCTs).

Design Qualitative study of editorial meeting discussions of randomized trials submitted for publication at 3 major biomedical journals (BMJ, Lancet, and Annals of Internal Medicine) during 2003-2004. A total of 50 RCTs were submitted to the journals of which 13 were accepted and 37 rejected. Editorial meetings consisted of editors, statisticians, and, often, invited experts or consultants. Attendees varied among the journals and among meetings. We audio-recorded all editorial meetings in which these manuscripts were discussed. Audiotapes were transcribed and transcripts were analyzed using grounded theory. Thirteen RCTs accepted for publication were compared to a random sample of 13 RCTs rejected after peer review, stratified by journal. We identified commonly expressed themes and whether they were favorable toward publication or critical of the manuscript.

Results Ten recurrent themes emerged including methods, writing (the need for more information, clarification, or explanation to interpret methods and results or issues with presentation and style), results, editor discussions for improving the manuscript (eg, revising analyses, reporting data, implications of the findings), discussion of peer review comments, novelty, whether the manuscript would change clinical practice, conflicts of interest, readership, and author characteristics. For accepted manuscripts, approximately 54% of the comments were critical and 46% were favorable. For rejected trials, approximately 75% of the comments were critical and 25% were favorable. The frequency of themes discussed differed between accepted and rejected manuscripts (Table 3).

Conclusions Discussion of the trial reports' methods, results, and writing dominated the editorial meetings. For trials that were accepted, editors also devoted a large proportion of the discussion to suggestions for improving the manuscript and communicating its key messages. The peer review process resulted in constructive suggestions for changing the reporting of accepted trials.

Table 3. Proportion of Themes Discussed During Editorial Meetings for Accepted and Rejected Manuscripts


1University of California, San Francisco, Clinical Pharmacy, 3333 California St, Suite 420, San Francisco, CA 94118, USA, e-mail:; 2University of Arizona, Tucson, AZ, USA

Editorial Policies of Pediatric Journals: A Survey of Instructions for Authors

Joerg J. Meerpohl,1,2 Robert F. Wolff,1,3 Charlotte M. Niemeyer,2 Gerd Antes,1 and Erik von Elm4

Objective The continued discussion about ethics and quality of biomedical publishing has led to recommendations for submitting authors. However, it is unclear to what extent these recommendations have been implemented in specialty journals. We studied whether aspects of good publication practice were implemented in the author instructions of pediatric journals.

Design In the Institute for Scientific Information Journal Citation Report (JCR) 2007, we identified all journals (n = 78) in the subject category "pediatrics" and included those publishing original research articles (n = 69). We accessed the online instructions for authors and extracted information regarding endorsement of the International Committee of Medical Journal Editors (ICMJE) Uniform Requirements for Manuscripts Submitted to Biomedical Journals and of 5 major reporting guidelines including CONSORT and STROBE, disclosure of conflicts of interest (COIs), and trial registration. Two investigators collected data independently.

Results The ICMJE Uniform Requirements were mentioned in author instructions of 38 of the 69 journals (55%). Endorsement of reporting guidelines was low, with CONSORT referred to most frequently (14 journals, 20%). Each of the other 4 reporting guidelines was mentioned in less than 10% of the author instructions. Fifty-four (78%) journals explicitly required authors to disclose COIs, and 16 (23%) either recommended or required trial registration. The odds of endorsing the ICMJE Uniform Requirements increased by a factor of 2.25 (95% confidence interval [CI], 1.17-4.34) per additional impact factor point. Similarly, the odds increased by a factor of 2.32 (95% CI, 0.95-5.70) for requiring disclosure of COIs and by a factor of 3.66 (95% CI, 1.74-7.71) for requiring trial registration.
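
A note on reading these estimates: assuming the reported values are odds ratios from a logistic regression with impact factor entered as a linear term (the abstract does not name the model), the effect compounds multiplicatively across impact factor points. The Python sketch below only illustrates that arithmetic; the function name is hypothetical.

```python
def odds_multiplier(odds_ratio_per_point, points):
    """Factor by which the odds change for a given impact factor difference.

    Assumes a constant odds ratio per point (log-linear in impact factor).
    """
    return odds_ratio_per_point ** points


# Point estimates reported in the abstract.
reported = {
    "ICMJE endorsement": 2.25,
    "COI disclosure": 2.32,
    "trial registration": 3.66,
}

for policy, odds_ratio in reported.items():
    # Compare journals that differ by 2 impact factor points.
    print(f"{policy}: odds x {odds_multiplier(odds_ratio, 2):.2f}")
```

Under this assumption, a journal with an impact factor 2 points higher would have roughly 5 times the odds of endorsing the ICMJE Uniform Requirements, not 4.5 times, because the per-point ratios multiply rather than add.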

Conclusions According to the author instructions of journals serving the pediatric research community, several recommendations for publication practice are not yet fully implemented. The more widespread endorsement of ICMJE Uniform Requirements and major reporting guidelines could improve the transparency and completeness of pediatric research publications. Many pediatric journals do not have editorial policies regarding trial registration. Disclosure of COIs at the time of manuscript submission should be mandatory.

1German Cochrane Center, University Medical Center, Freiburg, Stefan-Meier-Strasse 26, Freiburg, 79104 Germany, e-mail: meerpohl@cochrane.de; 2Division of Pediatric Hematology & Oncology, Department of Pediatrics, University Medical Center Freiburg, Freiburg, Germany; 3Kleijnen Systematic Reviews Ltd, York, UK; 4Swiss Paraplegic Research, PO Box, CH-6207 Nottwil, Switzerland, e-mail:

Was JAMA's Requirement for Independent Statistical Analysis Associated With a Change in the Number of Industry-Funded Studies It Published?

Elizabeth Wager,1,2 Rahul Mhaskar,3 Stephanie Warburton,3 and Benjamin Djulbegovic3

Objective To determine whether the number of industry-funded trials published by JAMA changed after the July 2005 requirement for independent statistical analysis.

Design Retrospective before-and-after study. Two investigators independently coded all RCTs published in JAMA from July 1, 2002, to June 30, 2008, (ie, 3 years before and after the policy). They were not blinded to publication date. RCTs were classified as "Industry" if they had any commercial funding or support. Discrepancies were resolved by discussion or further analysis. RCTs published in Lancet and New England Journal of Medicine (NEJM) during the same period provided the control.

Results The total number of RCTs and the proportion with commercial funding decreased significantly in JAMA after July 2005. In contrast, NEJM published more RCTs, but funding did not change significantly, while Lancet published the same number of RCTs, but the proportion of industry RCTs rose nonsignificantly in the same periods. Alternative categorization of funding sources distinguishing total industry funding (IF) from support (IS) (ie, supplying materials only) or joint industry/noncommercial funding (J) produced a less clear pattern but IF+J studies decreased significantly in JAMA while IS studies (and IF+IS studies) increased significantly in NEJM, and IF studies increased significantly in Lancet (Table 4).

Table 4. Number of Industry-Funded Trials Published by JAMA Before and After the Requirement for Independent Statistical Analysis


a Studies with commercial funding

1 Committee on Publication Ethics (COPE); 2 Sideview, 19 Station Rd, Princes Risborough, HP27 9DE UK, e-mail:; 3 University of South Florida, Tampa, FL, USA

Conclusions JAMA's requirement for independent statistical analysis for industry-funded studies was associated with a change in the pattern of RCTs published. We cannot tell whether the policy affected the number of RCTs submitted, the acceptance rate, or both. The decrease in RCTs and commercial studies was not seen in the control journals.

What Ethical Issues Do Journal Editors Bring to COPE?

Sabine Kleinert1 and Elizabeth Wager2,3

Objective The Committee on Publication Ethics (COPE) has provided a forum for journal editors to discuss troubling cases since 1997. The keywords assigned to cases were checked and recategorized in 2008 to ensure consistency of use. We analyzed the cases discussed at COPE to identify possible topic trends. In particular, we wondered whether the introduction of the COPE flowcharts in 2006 might be associated with a reduction in the number of straightforward cases.

Design Analysis of cases on the COPE Web site from 1997 to 2008 according to their keywords (which are assigned by COPE).

Results The database comprises 354 cases. The number of cases presented to COPE each year ranges from 15 to 42 but with no clear time trend (eg, there were fewer cases in 2008 than 2007). COPE membership increased steadily from 1997 to 2006 and then more dramatically during 2007 and 2008. The only categories of cases that appeared to increase were plagiarism (perhaps due to increased awareness) and unethical editorial decisions (which has increased since COPE published its code of conduct, perhaps because editors are questioning their practices more). We found no evidence that the flowcharts had reduced the number of straightforward cases submitted. See Table 5.

Table 5. COPE Cases by Category, 1997-2008


Conclusion The numbers in each category are small, so they do not warrant statistical analysis and should be interpreted with caution. Apart from plagiarism and unethical editorial decisions, which appear to have increased, we observed no clear patterns in the types of problems presented to COPE since 1997.

1Lancet, London, UK; 2Committee on Publication Ethics (COPE); 3Sideview, 19 Station Rd, Princes Risborough, HP27 9DE UK, e-mail:


Publication Pathways

Predictors of Time to Publication of Manuscripts Rejected by Major Biomedical Journals

Kirby Lee,1 Nicholas Lehman,2 I'Alla Brodie,3 and Lisa Bero1

Objective To evaluate publication rates and predictors of time to publication for manuscripts rejected by major biomedical journals.

Design Cohort study of 1008 manuscripts reporting original research submitted for publication and subsequently rejected at 3 major biomedical journals (BMJ, Lancet, and Annals of Internal Medicine) during 2003-2004. Our main predictor of publication was statistical significance of results (P < .05) reported for the primary outcome. We also abstracted manuscript characteristics including study design, sample size, funding source, and whether the manuscript was outright rejected vs rejected after external peer review. The primary outcome of our study was subsequent publication of the rejected manuscripts. We determined publication status and time from rejection to publication in the medical literature by searching PubMed, Cochrane Library, and the Cumulative Index for Nursing and Allied Health Literature through June 30, 2008 (minimum follow-up time of 4.3 years). Predictors of time to publication were analyzed using multivariable Cox proportional hazards regressions. All analyses were planned a priori.

Results Seventy-six percent (767) of manuscripts were published (median, 1.25 years, range, 0.01-5.32 years). The majority of manuscripts were published in specialty journals (85%, 654/767) with lower impact factors than the original rejecting journal. Manuscripts were more likely to be published if they had larger sample sizes and less likely if they did not disclose a funding source or did not report results using statistical tests for comparisons, particularly after 1.25 years from initial rejection (Table 6).

Conclusions A quarter of manuscripts initially rejected by major biomedical journals remained unpublished. Although some methodological characteristics and disclosing the funding source were associated with publication, articles with statistically significant results were not more likely to be published.

1University of California, San Francisco, Clinical Pharmacy, 3333 California St, Suite 420, San Francisco, CA 94118, USA, e-mail:; 2Colby College, Waterville, ME, USA; 3San Diego State University, San Diego, CA, USA

Table 6. Multivariable Predictors of Time to Publication (N = 902)


a 106 manuscripts did not report a sample size. Statistical significance of results estimated for early and late time periods due to nonproportionality.

b Studies reporting no formal statistical tests, descriptive statistics only, or qualitative research.

Publication of Research Reports After Rejection by the New England Journal of Medicine in 2 Time Periods

Michael Bretthauer, Pam Miller, Edward W. Campion, and Jeff Drazen

Objective In the past 2 decades, the number of peer-reviewed medical journals that publish original research reports has increased. We compared the subsequent publication of original research manuscripts after rejection at a major general medical journal, the New England Journal of Medicine, with an acceptance rate of less than 10%. Data were obtained for both 1995 and 2003.

Design All original research manuscripts rejected after external peer review at the index journal during the calendar years 1995 and 2003 were identified from our electronic databases. All manuscripts were tracked for subsequent publication in a medical or scientific journal by searching for similar titles and authors in PubMed. Searches were performed between November 25, 2008, and January 30, 2009, for the 1995 manuscripts. For the 2003 manuscripts, searches were completed in the first week of February 2009. Descriptive statistics were computed by the use of SPSS 15.0.

Results Of 1423 manuscripts rejected in 1995, we identified 1273 that were subsequently published (89.5%), as compared to 1040 of 1205 (86.3%) for those in 2003. The manuscripts were published in 384 different journals in 1995, as compared to 319 in 2003. The median time from rejection by the index journal to eventual publication in another journal was 437 days (interquartile range [IQR]: 313-652 days) in 1995 vs 398 days (IQR: 286-564 days) in 2003. Table 7 shows the 5 types of journals that published the papers after rejection at the index journal for the 2 time periods.
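
The publication percentages above follow directly from the reported counts. As a minimal sketch (the helper name is hypothetical, and the abstract reports no significance test for the difference), the arithmetic can be reproduced in Python:

```python
def pct(numerator, denominator):
    """Return a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


# Counts reported in the abstract: published / rejected after peer review.
published_1995 = pct(1273, 1423)
published_2003 = pct(1040, 1205)

print(f"1995: {published_1995}% published after rejection")
print(f"2003: {published_2003}% published after rejection")
print(f"Absolute difference: {round(published_1995 - published_2003, 1)} percentage points")
```

This only checks the descriptive figures; it says nothing about whether the difference between the 2 periods is statistically meaningful.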

Table 7. Eventual Publication by Other Journals After Initial Rejection


Conclusions The vast majority of research manuscripts that are rejected by a general medical journal after peer review are published elsewhere. However, there is substantial delay in this process. The time from rejection at the index journal to publication elsewhere declined only slightly from 1995 to 2003.

New England Journal of Medicine, 10 Shattuck St, Boston, MA 02115-6094, USA; e-mail:

An Evaluation of Time to Publication for Randomized Trials Submitted to the BMJ

Sara Schroter,1 Douglas G. Altman,2 and John P. A. Ioannidis3

Objective To evaluate the publication fate of randomized controlled trials (RCTs) submitted to the BMJ and to understand how many and which trials remain unpublished a long time after submission.

Design We evaluated all 660 reports of RCTs submitted to the BMJ between January 1, 1998, and December 31, 2001. We identified the published articles or, if we could not locate any published reports, we contacted the authors.

Results We found that 602 of 660 RCTs (91%) were published: 150 (23%) in the BMJ and 452 (68%) in other journals. All except 6 of the trials published elsewhere were in journals with lower impact factor than BMJ (median, 2.13; interquartile range [IQR], 1.53-3.03). The median time from submission to publication was 1.36 years (IQR, 0.89-2.15; and 1.30, IQR, 0.87-1.91, excluding the 40 unpublished trials). About 25% of the RCTs submitted to the BMJ remained unpublished for 2 or more years after submission. Excluding the BMJ-published papers, a higher impact factor was associated with more rapid publication (HR, 1.08 per point, P = .002). Unpublished trials were significantly less likely to go to external review, have external reviews returned, and reach the BMJ editorial "hanging" committee than published trials but were not significantly different in their country of origin. Of the 18 unpublished trials for which authors replied to our survey, 3 authors were still trying to get their paper published, at least half were said to have significant results, 3 said that negative results were perceived as a main reason for nonpublication, and 8 were discouraged by rejections.

Conclusions Results of all RCTs should be publicly available since they are level I evidence. The large majority of the trial reports submitted to the BMJ get published in time, but about 25% remained unpublished after 2 years. Among those remaining unpublished, negative results are not perceived as a prime reason for failure to publish.

1BMJ Editorial, BMA House, Tavistock Square, London WC1H 9JR, UK, e-mail:; 2Centre for Statistics in Medicine, Wolfson College Annexe, Oxford, UK; 3Department of Hygiene and Epidemiology, University of Ioannina School of Medicine, Ioannina, Greece

Complexity of Peer Review Evaluations: Twelve-Year Prospective Study on 424 Submitted Papers to a Specialized Journal

Francine Kauffmann,1,2 Klaus Rabe,3 Béatrice Faraldo,1,2 Hélène Tharrault,1,2 Jean Maccario,1,2 and Alan R. Leff4

Objective To classify reviewer scores and assess whether scores predict the career of a manuscript, beyond the initial decision of acceptance or rejection.

Design Included manuscripts were edited by 1 associate editor between 1994 and 1999. Subsequent publication through December 2008 was searched for rejected manuscripts. For all published papers, citations were assessed from their publication date through December 2008. Outcomes studied were publication (initial journal [n = 173] or another journal [n = 187]), impact factor for the year of publication, citations (Web of Science). Reviewer scores (1-5) regarding "originality," "scientific importance," "adequacy of methods," "brevity/clarity," and "adequacy of interpretation" were analyzed as predictors. Analyses on citation rates (geometric mean = 26, range, 1-441) conducted on 329 papers were adjusted for duration of follow-up after publication (9.9±1.6 years; P < .001) and impact factor (66 journals, 3.6±1.7; P < .0001).

Results Principal component analysis on reviewer scores separated "science" (methods, interpretation) and "journalism" (originality, importance). Two hundred fifty-one papers were rejected (43 without a reject recommendation of at least 1 reviewer) and 173 accepted (13 with a reject recommendation of 1 reviewer). For acceptance or rejection, tree discriminant analysis was performed as all scores were highly predictive of outcome: "interpretation" was the first criterion, followed by "methods" and "scientific importance." Discriminant analyses showed that every publication after rejection (75%) was driven by "originality" and early (within 2 years after rejection, n = 140 [75%]) vs late publication by "methods." Reviewer "scientific importance" score was the major determinant for all citations (P = .01), in the first 2 years (P < .01), and after (P = .02). In the first 2 years, poor "interpretation" increased citations (P = .03). Results were similar when restricting the analysis to the 173 accepted papers.

Conclusions Peer review evaluation can capture various aspects of submitted papers, which differentially predict acceptance, publication of rejected papers, and citation rates. This is an issue relevant for authors, reviewers, editors, and readers.

1INSERM U780, 16 avenue PV Couturier, Villejuif, 94807 France, e-mail:; 2Université Paris-Sud, IFR69, Villejuif, France; 3Department of Pneumology, Leiden University Medical Center, Leiden, the Netherlands; 4Department of Medicine, University of Chicago, Chicago, IL, USA

Publication Bias

Testing for the Presence of Positive-Outcome Bias in Peer Review: A Randomized Controlled Trial

Gwendolyn B. Emerson,1 Richard A. Brand,2 James D. Heckman,3 Winston J. Warme,1 Fredric M. Wolf,4 and Seth S. Leopold1

Objective To the extent positive-outcome bias exists, it risks undermining evidence-based medicine. We designed a stratified randomized block design trial to test the hypothesis that a significantly greater percentage of peer reviewers for 2 orthopedic journals would recommend publication of a "positive" outcome report compared with a "no-difference" outcome report of an otherwise identical fabricated randomized controlled trial (RCT).

Design We fabricated 2 versions ("positive" and "no difference") of a well-designed, CONSORT-conforming, amply-powered, multi-institutional, blinded RCT including 3308 patients evaluating dosage and timing of perioperative antibiotics. The versions were identical except for the direction of the finding on the principal study endpoint (fewer surgical site infections in the "positive" version; no difference in surgical site infections in the "no-difference" version). Both versions were sent to peer reviewers by Journal of Bone and Joint Surgery (JBJS) and Clinical Orthopaedics and Related Research (CORR). The 209 reviewers were randomly allocated to either "positive" or "no-difference" versions, with randomization stratified by journal. Reviewers were informed that a study was ongoing but were not informed of the research question nor that they were reviewing the "test" manuscript.

Results At JBJS, the "positive" manuscript was significantly more likely to be recommended for publication than the "no-difference" manuscript (98% vs 71%; odds ratio [OR], 20.4; 95% confidence interval [CI], 2.6-161.7). At CORR, the difference was not statistically significant (97% of "positive" manuscripts recommended for publication vs 90% of "no difference" manuscripts; OR, 3.4; 95% CI, 0.6-18.2).
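
The very wide confidence interval at JBJS is what one would expect when almost every reviewer in one arm recommends publication. The Python sketch below computes an odds ratio with a Wald 95% confidence interval from a 2x2 table. The counts are hypothetical and chosen only to resemble the reported percentages; the abstract does not give the underlying counts or state which interval method the authors used.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI for a 2x2 table.

    a, b: recommended / not recommended in the "positive" arm
    c, d: recommended / not recommended in the "no-difference" arm
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio (Woolf's formula).
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, NOT taken from the study: 49 of 50 vs 36 of 50.
or_, lo, hi = odds_ratio_wald_ci(49, 1, 36, 14)
print(f"OR = {or_:.1f}, 95% CI {lo:.1f}-{hi:.1f}")
```

With only 1 reviewer in the "not recommended" cell, the 1/b term dominates the standard error, so the interval spans well over an order of magnitude even though the point estimate is large.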

Conclusions Results suggest positive-outcome bias is variably present during peer review of orthopedic manuscripts submitted for publication, as it was present for 1 journal but not the other. Many of the 209 reviewers have regularly reviewed for both journals; we speculate the difference between journals may relate to historical perceptions about the 2 journals. Journal editors may consider providing reviewers with more explicit guidelines for review of "no-difference" manuscripts.

1Department of Orthopaedics and Sports Medicine, University of Washington, Box 356500, 1959 NE Pacific St, Seattle, WA 98195, USA, e-mail:; 2Clinical Orthopaedics and Related Research, Philadelphia, PA, USA; 3Journal of Bone and Joint Surgery, Needham, MA, USA; 4Department of Medical Education and Biomedical Informatics, University of Washington, Seattle, WA, USA

Composite Outcomes Can Be Misleading

Gloria Cordoba,1 Lisa Schwartz,2 Steven Woloshin,2 Harold Bae,2 and Peter Gøtzsche1

Objective A composite outcome combines several individual outcomes into 1 main outcome. It increases statistical power but can be misleading since the result may be driven by less important components. We compared the composite outcome to the most clinically important outcome.

Design Systematic review of parallel-group randomized clinical trials published in 2008 reporting a binary composite primary outcome. Two independent coders abstracted the data and a third observer, blinded to the results, selected the most important component.

Results Of 43 eligible trials, 36 were included (3 were excluded because no component was clearly most important, 4 because of insufficient or inconsistent data). Twenty-seven trials (75%) were about cardiovascular topics, and 25 trials (69%) were either entirely or partly industry funded. Composite outcomes had a median of 3 components (range, 2-9). Death (or cardiovascular death) was the most important component in 30 trials. There were 15,531 events (14% of patients) for the composite outcome and 4,513 (4%) for the most important component. The point estimate for the risk ratio for the composite outcome was lower as often as it was higher than that for the most important component, and it differed from it by more than 20% in 14 trials (39%). Statistically significant results were reported in 11 trials for only the composite outcome; in 2 trials for only the most important outcome (death or cardiovascular death); and in 1 trial for both but in opposite directions, as the effect was beneficial for the composite of death or nonfatal myocardial infarction and harmful for death.

Conclusions Composite outcomes may exaggerate perceptions of how well interventions work and, conversely, can hide effects on mortality. The pivotal assumption that the effect of the intervention should be similar for each of the components is often not met.

1Nordic Cochrane Centre, Rigshospitalet, Blegdamsvej 9, Dept 7112, DK-2100 Copenhagen, Denmark, e-mail:; 2Dartmouth Institute for Health Policy, Dartmouth Medical School, Hanover, NH, USA

Reporting Biases and Publication Strategy for Off-Label Indications: A Potent Mix

S. Swaroop Vedula,1 Ilyas Rona,2 Palko Goldman,2 Thomas Greene,2 and Kay Dickersin1

Objective To describe reporting practices in clinical trials sponsored by Parke-Davis/Pfizer in the context of their use of a "publication strategy" to market gabapentin for off-label indications.

Design One of us (K.D.) was provided internal company documents by plaintiffs' lawyers as part of legal action against Pfizer. We compared protocols with internal research reports and publications to identify reporting biases. We also examined internal company marketing assessments to identify whether a "publication strategy" or an "indication strategy" (ie, obtaining Food and Drug Administration approval) was recommended. One author (S.V.) extracted data relevant to reporting biases and another (K.D.) verified. K.D. signed an agreement in August 2008 agreeing to be bound by a protective order entered in pending litigation against Pfizer, which limits disclosure of confidential discovered information unless such information is ordered unsealed by the court, or the claim of confidentiality is waived by the claiming party. Through communications with counsel involved in the litigation occurring between August and October 2008, Pfizer agreed to waive any confidentiality claims concerning documents reviewed as part of K.D.'s expert report. As a result, all of the documents reviewed for this study have had their confidentiality claims waived.

Results We examined 20 clinical trials of gabapentin for 4 off-label indications included as part of the legal action: migraine prophylaxis (n = 3), bipolar disorders (n = 3), neuropathic pain (n = 8), and nociceptive pain (n = 6). Marketing assessments for 3 indications recommended a "publication strategy," and the fourth recommended multiple strategies. Each trial was associated with 1 or more reporting biases, including failure to publish efficacy results in full (11/20). In the 12 published trials, we also observed selective publication of primary (7/12) and secondary (11/12) outcomes, analyses in selected populations (5/12), possible ghost authorship (3/12), citation bias (5/12), possible time-lag bias (5/12), and a positive "spin" of the findings (8/12) (ie, a discrepancy between results and conclusions).

Conclusions Combining "publication strategy," used as a marketing tool, with biased reporting of results represents a potent mix that can create a false perception of a drug's efficacy. We propose that existence of a publication strategy should be revealed by the trial sponsor to trial participants, investigators, editors, and peer reviewers at appropriate time points, and publicly at trial registration.

1Johns Hopkins Bloomberg School of Public Health, Mailroom W5010, 615 N Wolfe St, Baltimore, MD 21205, USA, e-mail:; 2Greene LLP, Boston, MA, USA


"Spin" in Reports of Randomized Controlled Trials With Nonstatistically Significant Primary Outcomes

Isabelle Boutron,1,3 Susan Dutton,1 Philippe Ravaud,2 and Douglas G. Altman1

Objective The objectives of this study were to identify the nature and frequency of "spin" (ie, manipulation of the content and rhetoric of reporting to convince the reader of the likely truth of a result) in published reports of randomized controlled trials (RCTs) with nonstatistically significant primary outcome(s).

Design The Cochrane Highly Sensitive Search Strategy was used to identify reports of RCTs with a primary publication indexed in PubMed in December 2006. Articles were included if the study was a 2-arm RCT with a clearly identified primary outcome that was not statistically significant (ie, P ≥ .05). To systematically evaluate "spin" in reports, 2 readers appraised each selected article using a pretested standardized data abstraction form developed in pilot testing performed on another sample. We used the following preliminary classification scheme to investigate the main text and abstract to assess whether authors (1) interpreted nonstatistically significant outcomes as if the trial were an equivalence trial; (2) emphasized nonstatistically significant outcomes showing the benefit of treatment (linguistic "spin" or other method); (3) focused on other statistically significant results such as within-group comparisons, secondary outcomes, subgroup analyses, and modified population analyses; and (4) overinterpreted safety. Any other strategies used to influence readers were also collected. All discrepancies were discussed to achieve consensus and a third reader resolved any disagreements.

Results Of the 1735 PubMed citations retrieved, 616 reports of RCTs were selected based on review of the title and abstract. After screening the full text, 74 reports of 2-arm parallel trials with a clearly identified, nonstatistically significant primary outcome were selected. The full results will be presented during the Congress.

Conclusion These results will provide information on the prevalence and nature of "spin" in a representative sample of published reports of RCTs with nonstatistically significant outcomes.

1Centre for Statistics in Medicine, University of Oxford, Oxford, UK; 2Hôpital Bichat, Paris, France; 3Groupe Hospitalier, Bichat-Claude Bernard, Departement d'Epidemiologie, 46 rue Henri Huchard, Paris, Cedex 18 75877 France, e-mail:

Rhetoric Used in Reporting Research Results

Lisa Bero and Yolanda Cheng

Objective To identify rhetoric used to frame research results reported in drug studies and to determine if statistically significant numerical data support the claims about a drug.

Design In this observational study, we evaluated 35 published randomized controlled trials for language about a drug's effect(s). The 35 articles from 24 journals were publications of trials that had been submitted in New Drug Applications (NDAs) where there were discrepancies in conclusion between the published literature and the NDA trial report that favored the drug. For each publication, we assessed the results and conclusion sections for rhetoric that suggested that the drug was more effective or safer than the comparator. An initial group of keywords (eg, significance, statistical, clinical, more, less, effective, safe/well-tolerated) guided the extraction of rhetoric. Additionally, the statistically significant result (eg, P value, 95% confidence interval) was recorded if it was presented in text, tables, or figures. The rhetoric statements were grouped according to 6 categories of effect: increase/positive, decrease/negative, equivalence/consistency, safety, clinical importance, and qualifying/ambiguity. For each statement, we noted the accompanying use of the term "significance" and a statistical test.

Results The articles were published between 1993 and 2004, with 30 of 35 published after 2000. Seven of the articles had no sponsorship statement. Three of the 24 journals had an impact factor greater than 10 (range, 0-17.6). For the 35 papers assessed, 695 rhetoric statements were extracted. Forty-nine percent (338/695) of the statements of effect made were not accompanied by any mention of a statistically significant result. Fifty-one percent (357/695) of the rhetoric statements included the term "significance," of which 72% (258/357) were supported by a statistical test result. The majority of the rhetoric statements were classified as increase/positive effect statements, with 84% (142/169) of them having an associated statistical test. Rhetoric regarding the safety of a drug was rarely supported by a statistical result (Table 8).

Table 8. Categorization of Rhetoric Statements


Conclusions Rhetoric used to frame research results in drug studies overstates the effectiveness of a drug. A limitation of our study is that we did not obtain original data and conduct our own statistical analysis. The text of result and conclusion sections should align more closely with the numerical results.

Department of Clinical Pharmacy, University of California, San Francisco, 3333 California St, Suite 420, Box 0613, San Francisco, CA 94118, USA, e-mail:

A Propaganda Index for Screening Manuscripts and Articles

Eileen Gambrill

Objective The propaganda index is designed to serve as a complement to methodological filters such as CONSORT in reviewing manuscripts and published literature. Propaganda regarding problems addressed promotes the medicalization of behaviors and feelings and hinders empirical investigation of well-argued alternative views. Predictions were as follows: (1) reading a definition of propaganda will not facilitate propaganda spotting skills, (2) use of a propaganda index will facilitate this task, and (3) methodological quality is not correlated with a measure of propaganda.

Method A propaganda index consisting of 32 items was created based on related literature. Items concerned medicalization, use of vague terms, lack of documentation for claims made, and hiding controversy. Twenty articles describing randomized controlled trials (RCTs) regarding social anxiety were selected via an Internet search. Twenty PhD-level consumers of the literature were asked to read 5 articles with authors' names and journal titles removed and to identify propaganda using a definition of propaganda as "encouraging beliefs and actions with the least thought possible." They next applied the index to each article and returned this information. One week later they were to again rate the same 5 articles using the index.

Results Review of the 5 articles by the author revealed a high rate of propaganda: 78 out of 110 opportunities. This review served as the criterion. Preliminary results for 8 participants showed that they detected between 0 and 17 indicators over all 5 articles with an average of 5 before using the index. Percentage agreement of participants with the criterion ratings on the propaganda index for all 5 articles ranged from 57% to 88% with an average of 74%. Further data regarding predictions will be presented at the conference.
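The percentage agreement reported above can be illustrated with a minimal sketch. It assumes each index item is scored as present or absent and compared item by item with the criterion review; the function name and the sample ratings are hypothetical and are not taken from the study.

```python
from typing import Sequence


def percent_agreement(ratings: Sequence[bool], criterion: Sequence[bool]) -> float:
    """Percentage of index items on which a participant matches the criterion."""
    if len(ratings) != len(criterion) or len(criterion) == 0:
        raise ValueError("ratings and criterion must be non-empty and of equal length")
    agreements = sum(r == c for r, c in zip(ratings, criterion))
    return 100.0 * agreements / len(criterion)


# Hypothetical ratings for 8 index items (True = indicator judged present).
criterion_review = [True, True, False, True, False, True, True, False]
participant = [True, False, False, True, False, True, False, False]
agreement = percent_agreement(participant, criterion_review)
print(f"Agreement with criterion: {agreement:.0f}%")
```

Simple percentage agreement does not correct for agreement expected by chance; whether a chance-corrected statistic was also used is not stated in the abstract.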

Conclusion Preliminary data suggest that, without prompting, many forms of propaganda remain undetected in reports of RCTs and that detection can be increased by use of a propaganda index.

University of California at Berkeley, School of Social Welfare, 120 Haviland Hall #7400, Berkeley, CA 94720-7400, USA, e-mail: gambrill@

Trial Registration

Frequency and Nature of Changes in Primary Outcome Measures

Deborah A. Zarin, Tony Tse, and Rebecca J. Williams

Objective Prespecification of outcome measures forms the basis of most statistical analyses of clinical trials. Trial registration allows for the tracking of changes to outcome measures from study initiation to eventual publication. Changes may occur any time after study initiation, though there is no standard for distinguishing important vs unimportant changes. The objective of this study was to determine the frequency and type of changes in primary outcome measures (POMs) between registry entries and associated publications and between initial and current registry entries.

Design We identified 75 sequential MEDLINE citations with registry entries, generating "registry-publication pairs." Study 1 (50 pairs) compared the registry POM to outcomes in the publication. Study 2 (25 pairs) compared the publication POM to the registered outcomes. Study 3 examined changes over time for all 75 registry POMs. Each outcome measure was coded for "principality" (primary, secondary, or unspecified), domain (eg, depression), specific measure (eg, HAM-D), and time frame. Pairs were considered matches if the domain was the same. If either specific measure or time frame was not consistent, pairs were "substantively different."
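The coding and matching rules above can be sketched as a small data structure and classifier. This is an illustration under stated assumptions, not the authors' instrument: the class, function, and label names are hypothetical, the time frames in the example are invented, and treating an unspecified field as a difference in specificity rather than an inconsistency is an assumption the abstract does not confirm.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OutcomeMeasure:
    principality: str                 # "primary", "secondary", or "unspecified"
    domain: str                       # eg, "depression"
    specific_measure: Optional[str]   # eg, "HAM-D"; None if not stated
    time_frame: Optional[str]         # eg, "8 weeks"; None if not stated


def _conflicts(a: Optional[str], b: Optional[str]) -> bool:
    # Assumption: a missing value is lower specificity, not an inconsistency.
    return a is not None and b is not None and a != b


def classify_pair(registry: OutcomeMeasure, publication: OutcomeMeasure) -> str:
    """Classify a registry-publication pair of outcome measures."""
    if registry.domain != publication.domain:
        return "no match"  # hypothetical label for pairs with different domains
    if _conflicts(registry.specific_measure, publication.specific_measure) or \
            _conflicts(registry.time_frame, publication.time_frame):
        return "substantively different"
    return "consistent"


# Hypothetical pair: same domain and measure, different time frame.
registered = OutcomeMeasure("primary", "depression", "HAM-D", "8 weeks")
published = OutcomeMeasure("primary", "depression", "HAM-D", "12 weeks")
result = classify_pair(registered, published)
print(result)
```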

Results Sixty-two of the 75 registry-publication pairs were consistent, though differences in level of specificity within pairs were observed (Table 9). One publication POM had been registered as a secondary outcome measure. Eight of 50 Study 1 pairs and 3/25 Study 2 pairs had substantive differences. Six of 75 Study 3 POMs changed substantively after initial registration. Forty-eight of 75 of the entries were initially registered more than 3 months after the start date, with some delayed by years.

Table 9. Results of 2 Studies Characterizing Primary Outcome Measures (POMs)


a Two POMs were considered a match if they had the same domain (eg, pain).

Conclusions The taxonomy enabled us to categorize POM pairs. Most POM pairs were consistent based on our criteria. Our ability to detect inconsistencies was limited at times by vague registry entries or substantially delayed initial registrations. This taxonomy could be used to develop consensus criteria for tracking and communicating outcome measure changes.

National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bldg 38A, Room 75705, Bethesda, MD 20892, USA, e-mail:

Trial Registration Can Be a Useful Source of Information for Quality Assessment: A Study of Randomized Trial Records Retrieved From the World Health Organization Search Portal

Ludovic Reveiz,1 An-Wen Chan,2 Karmela Krleža-Jerić,3 Carlos Granados,4 Mariona Pinart,5 Itziar Etxeandia,6 Diego Rada,7 Monserrat Martinez,8 and Andres Felipe Cardona9

Objective We evaluated empirically whether trial registries provide useful information to evaluate the quality of randomized controlled trials (RCTs).

Methods We compared methodological characteristics of a random sample of ongoing RCTs registered in 6 World Health Organization (WHO) primary registries and in 2008. As 90% of trials were provided from, we ensured adequate representation across registries by including a representative sample of each registry. We assessed the reporting of relevant domains from the Cochrane Collaboration's "Risk of Bias" tool and other key methodological aspects. Two reviewers independently assessed each record.

Results A random sample of records of actively recruiting RCTs was retrieved from 7 registries using the WHO International Clinical Trials Registry Platform (ICTRP) search portal. Weighted overall proportions in the ICTRP search portal for adequate reporting of sequence generation, allocation concealment, and blinding (patient-reported outcomes and objective outcomes) were 5.23% (95% confidence interval [CI], 2.55-7.91), 1.37% (95% CI, 0%-2.76%), 8.58% (95% CI, 5.8-11.36), and 8.69% (95% CI, 5.29-12.09), respectively. Most items had insufficient or no information to permit judgment (Table 10). Significant differences in the proportion of adequately reported RCTs were found between registries that had specific methodological fields for describing methods of randomization and allocation concealment compared to registries that did not (random sequence generation, 74% vs 2%, P < .001; allocation concealment, 53% vs 0%, P < .001). Concerning other key methodological aspects, weighted overall proportions of RCTs with adequately reported items were as follows: eligibility criteria (81%), primary outcomes (66%), secondary outcomes (46%), follow-up duration (62%), description of the interventions (53%), and sample size calculation (1%). Final results will be presented at the Congress.
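A weighted overall proportion of the kind reported above can be sketched with the standard stratified estimator, in which each registry's sample proportion is weighted by that registry's share of records in the portal. The abstract does not state which estimator or interval method was used, so the formula, the function name, and the example counts below are all assumptions for illustration.

```python
import math
from typing import List, Tuple


def weighted_proportion(
    strata: List[Tuple[float, int, int]], z: float = 1.96
) -> Tuple[float, float, float]:
    """Stratified proportion with a normal-approximation confidence interval.

    Each stratum is (weight, adequately_reported, sampled); weights must sum to 1.
    Returns (estimate, lower, upper), with the interval truncated to [0, 1].
    """
    if abs(sum(weight for weight, _, _ in strata) - 1.0) > 1e-9:
        raise ValueError("stratum weights must sum to 1")
    estimate = 0.0
    variance = 0.0
    for weight, adequate, sampled in strata:
        p = adequate / sampled
        estimate += weight * p
        variance += weight ** 2 * p * (1 - p) / sampled
    half_width = z * math.sqrt(variance)
    return estimate, max(0.0, estimate - half_width), min(1.0, estimate + half_width)


# Hypothetical strata: one registry holding 90% of records, the rest pooled.
example = [(0.9, 2, 100), (0.1, 30, 60)]
estimate, lower, upper = weighted_proportion(example)
print(f"{100 * estimate:.1f}% (95% CI, {100 * lower:.1f}%-{100 * upper:.1f}%)")
```

Truncating the lower limit at 0 mirrors how an interval such as 0%-2.76% can arise when a small proportion is estimated with a symmetric interval.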

Table 10. Adequate Reporting of Sequence Generation and Allocation Concealment From the WHO ICTRP Search Portal From January 1 to December 12, 2008, by Source Registry


Conclusions Registries with specific methodological fields obtained more relevant quality information than those with general or coded fields. Critical appraisal of RCTs should include a search for information on trial registries as a complement to journal publications. However, the usefulness of the information will vary across registries due to variable content.

1Research Institute, Sanitas University Foundation, Cochrane Collaboration Branch, Av Calle 127 #21–60 cons 221, Bogota, Colombia, e-mail:; 2Mayo Clinic, Rochester, MN, USA; 3Knowledge Synthesis and Exchange Branch, Canadian Institutes of Health Research, Ottawa, Ontario, Canada; 4Research Institute, National University of Colombia; 5Department of Dermatology Research Unit for Evidence-based Dermatology, Hospital Plató, Barcelona, Spain; 6Clinical Epidemiology Unit, Cruces Hospital, & Osteba-Basque Office for HTA, Department of Health-Basque Country, Spain; 7Department of Physiology, University of the Basque Country (UPV/EHU), Spain; 8Institut de Recerca Biomèdica de Lleida (IRBLLEIDA)-Universitat de Lleida, Catalonia, Spain; 9Grupo Oncología Médica, Instituto Catalán de Oncología, Hospital Universitario Germans Trias i Pujol, Badalona, Spain

Registration Completeness and Changes of Registered Data From ClinicalTrials.gov for Clinical Trials Published in ICMJE Journals After September 2005 Deadline for Mandatory Trial Registration

Mirjana Huic,1,2 Matko Marušić,2,3 and Ana Marušić2,3

Objective After September 2005, journals represented by the International Committee of Medical Journal Editors (ICMJE) were to publish clinical trials that are timely and adequately registered in approved registries. We assessed how well ICMJE member journals followed their own registration requirement policy.

Design We identified all reports of clinical trials with a ClinicalTrials.gov registration number published by ICMJE journals from September 13, 2005, to April 24, 2008 (n = 438). For a random subset of 102 reports (Annals of Internal Medicine = 12/36, BMJ = 12/29, Croatian Medical Journal = 5/5, CMAJ = 1/1, JAMA = 16/100, Lancet = 14/88, New England Journal of Medicine = 42/179), we used the Archive section of ClinicalTrials.gov to analyze the completeness of the minimal registration dataset and changes to the registration data elements relevant for the quality of trial reporting.

Results Of 102 trial reports, 73 were registered before and 29 after the September 2005 deadline, and 24 were registered before trial start date. At the time of registration, a number of trials had missing or misplaced data in the 20-item minimal dataset, such as specified key secondary outcomes, primary outcome, or target sample size; this number decreased at the last registration change before publication. The completeness of the registration data improved over time for most journals. Individual studies underwent changes in the registry from initial registration to publication (range, 1-38, median, 4; 95% confidence interval, 3-5 for 87 studies with 1 or more changes). A considerable number of published reports (23-78 of 102) had data items different (information either disclosed or changed) from those declared in the registration dataset (Table 11).

Conclusions ICMJE journals have not always fully adhered to their own registration policy. Registration quality assurance is needed for full policy implementation and transparency of clinical trials reporting in journals.

Table 11. International Committee of Medical Journal Editors Journals With Missing Data or Information Different From Registry Data


a Numbers in parentheses indicate the registrations where specific information from the missing field was provided in another registry field ("Detailed Description").

b 17 trials had a single registration entry before publication.

c Difference from the last registration data or first registration data if there was a single registration. The difference was either disclosure of the data item missing in the registry or change in the specific content of the register record. For example, only the drug code in the "Intervention(s)" registration field was found for 7 RCTs at initial registration and for 6 RCTs at last change before publication; generic names of the drugs were disclosed in published reports.

1Agency for Quality and Accreditation in Health Care, Zagreb, Croatia; 2Croatian Medical Journal, Croatia; 3University of Split School of Medicine, Soltanska, Split, 21000 Croatia, e-mail:

Is Protocol Information Recorded in Useful for Systematic Reviewers Relying on Conference Abstracts?

Roberta Scherer,1 Ann Ervin,1 JaKeisha Taylor,2 and Kay Dickersin1

Objective Trial registration records could provide key protocol information on trials published only as abstracts for systematic reviewers. The Association for Research in Vision and Ophthalmology (ARVO) required trial registration for abstracts describing controlled clinical trials (RCTs) starting in 2007. The objective of this study was to evaluate the proportion of 2007, 2008, and 2009 ARVO abstracts of RCTs that had been registered and compare information provided in "registered" abstracts with that in the registry.

Design We identified abstracts either describing an RCT or providing registration information. For 2007 RCT abstracts with a valid registration number, we abstracted information on study participants, intervention, sample size, and primary outcome, and compared it to similar information abstracted from the registry. We plan to include similar information from 2008 and 2009 abstracts.

Results About two-thirds of identified RCTs claimed registration (108/174 [62%] in 2007; 103/153 [67%] in 2008; and 124/170 [73%] in 2009). Fifty-two percent (56/108) of 2007 abstracts claiming registration had a valid, unique registration number; 3 not registered as an RCT in the registry were excluded from further analyses. There was good agreement between the abstract and the registry record in the description of participant characteristics. Reported sample size was concordant for 6 trials, lower in 20 abstracts, and higher in 11 (37 RCTs had sample size recorded in both sources). Agreement on intervention was 77% (41/53). The primary outcome reported in the abstract matched 1 or more primary outcomes reported in the "primary outcome" field for 26 of 34 trials with an entry in that field.

Conclusions When authors fail to identify, register, and fully publish RCTs, then no protocol information beyond the abstract is available for systematic reviewers. Even if a trial is registered, information in the registry may disagree with that in the abstract. Protocols and amendments should be available at study inception to assist systematic reviewers.

1Johns Hopkins Bloomberg School of Public Health, Mailroom W5010, 615 N Wolfe St, Baltimore, MD 21205, USA, e-mail:; 2Delta State University School of Nursing, Cleveland, MS, USA


Quality of Reporting I

Quality of Survey Reporting in High-Impact-Factor Journals

Carol Bennett, Sara Khangura, Jamie Brehaut, Ian Graham, David Moher, Beth Potter, and Jeremy Grimshaw

Objective Reports of survey research often lack information necessary for transparency and study reproducibility. Our objective was to identify a representative sample of published reports of survey research and evaluate them with respect to a broad range of reporting characteristics.

Design We purposively sampled the top 15 journals (by impact factor) from each of 4 broad health science topic areas (N = 60) where survey research is common: health science, public health, medicine, and medical informatics. We conducted an Ovid MEDLINE search using the terms "survey," "questionnaire," "health surveys," and "data collection" to identify English-language studies published between January 2008 and February 2009. All citations were screened by 2 researchers to identify survey research that used a self-administered questionnaire as the primary data collection tool. Duplicate data abstraction employed a 32-item data collection tool designed to assess elements critical for transparency and reproducibility. These elements were identified through a comprehensive review of the literature identifying peer-reviewed survey reporting recommendations and from experts in survey research.

Results The search returned 1,719 citations resulting in 117 eligible studies. Preliminary results show that 13/117 (11%) described how representative the sample was of the population of interest and 47/117 (40%) discussed the generalizability of the results. With regard to reproducibility, 96/117 (82%) identified the mode of survey administration and 41/117 (35%) made the questionnaire used in the study available. Further analyses will outline the proportion of surveys that adequately report additional elements critical for transparency and reproducibility, such as a description of the survey's development, data analysis, reporting of response rates and methods for calculation.

Conclusions Pilot data indicate that the quality of survey research reporting is suboptimal. The current work will help identify areas where the development of an evidence-based reporting guideline would be expected to have the most impact on improving survey reporting.

Ottawa Health Research Institute, Clinical Epidemiology Program, ASB Box 693, 1053 Carling Ave, Ottawa, Ontario K1Y 4E9, Canada, e-mail:

Citation of Prior Research in Reports of Clinical Trials

Karen A. Robinson1 and Steven N. Goodman2

Objective Clinical trials should not be started or interpreted without consideration of prior trials that addressed the same or similar questions. Our objective was to systematically assess to what extent published reports of clinical trials cited relevant prior trials.

Design We searched Web of Science for 2004 combining terms for "meta-analysis" and "random" in title, abstract, and keywords. We used the meta-analyses to identify cohorts of RCTs addressing the same question. We then assessed, within each cohort, the extent to which trial reports cited the trials that preceded them. We calculated the proportion of prior trials that were cited (Prior Research Citation Index [PRCI]) and the proportion of the total available participant population cited (Sample Size Citation Index [SSCI]).
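The two indices defined above can be sketched as follows. This is a minimal illustration, assuming each prior trial in a cohort is represented by a hypothetical identifier and its sample size; the function name and the example cohort are not from the study.

```python
from typing import Dict, Set, Tuple


def citation_indices(prior_trials: Dict[str, int], cited: Set[str]) -> Tuple[float, float]:
    """Return (PRCI, SSCI) for one trial report.

    prior_trials maps each earlier trial in the same meta-analysis cohort to its
    sample size; cited is the set of trial identifiers the report references.
    """
    if not prior_trials:
        raise ValueError("indices are undefined when there are no prior trials")
    cited_prior = cited.intersection(prior_trials)  # ignore citations outside the cohort
    prci = len(cited_prior) / len(prior_trials)
    total_participants = sum(prior_trials.values())
    ssci = sum(prior_trials[trial] for trial in cited_prior) / total_participants
    return prci, ssci


# Hypothetical cohort: four prior trials, of which the report cites two.
prior = {"trial_a": 100, "trial_b": 50, "trial_c": 300, "trial_d": 50}
prci, ssci = citation_indices(prior, {"trial_a", "trial_c"})
print(f"PRCI = {prci:.2f}, SSCI = {ssci:.2f}")
```

In this example the report cites half of the prior trials but 80% of the prior participants, which is the pattern that makes the SSCI exceed the PRCI when larger trials are cited preferentially.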

Results We identified 227 meta-analyses comprising 1523 trials in 19 disciplines. The median PRCI was 0.34 (lower decile 0.08, upper decile 0.76), meaning that only a third of relevant papers were cited. The median SSCI (0.44, lower decile 0.09, upper decile 0.8) was slightly larger than the PRCI, meaning that trials cited were a bit larger than trials not cited, but on average 56% of prior information was not referenced. Thirteen disciplines had PRCI of less than 40%. Of the 1101 RCTs that had 5 or more prior trials to cite, 511 (46%) cited either 0 or 1 prior trial.

Conclusions In reports of RCTs, for which the identification of prior research should be easier than with any other design, less than 40% of prior RCTs are cited, comprising less than 50% of the participants enrolled in all relevant prior trials. Further research is needed to explore the implications of this finding and the potential explanatory factors. The potential implications include ethically unjustifiable trials, wasted time and resources, incorrect conclusions, and unnecessary risks for trial participants.

1Department of Medicine, Johns Hopkins University, 1830 East Monument St, Room 8069, Baltimore, MD 21287, USA, e-mail:; 2Departments of Oncology, Epidemiology, and Biostatistics, Johns Hopkins Schools of Medicine and Public Health, Baltimore, MD, USA.

Utility of Editorials and Commentaries That Accompany Publication of Research Studies

Diane Civic

Objective Editorials and commentaries that accompany the publication of research articles can enhance readers' understanding of new studies. This analysis evaluates the extent to which these editorials and commentaries provide information on potential biases and implications for practice beyond the material included in the study's discussion section.

Design Assessment of editorials and commentaries published in the same issue as research reports in BMJ, JAMA, Annals of Internal Medicine, Lancet, and the New England Journal of Medicine during 5 randomly selected months in 2008. Up to 3 editorials/commentaries per journal per month were included. Eligibility included linkage to a single randomized controlled trial (RCT), cohort study, or meta-analysis. A piloted structured data sheet was used to abstract information from commentaries and studies. For RCTs, risk of bias was evaluated using the Cochrane Collaboration tool.

Results Fifty-five editorial/commentary research study pairs were included. Descriptions of the studies' main findings were concordant in 46 (84%) pairs. Authors of 32 studies (58%) and 23 commentaries (42%) reported competing interests. Twenty-two studies (40%) received industry funding; only 1 editorial/commentary discussed study sponsorship. Five editorials/commentaries (9%) mentioned strengths, and 12 commentaries (22%) mentioned limitations omitted by study authors. Thirty-three of the 41 RCTs in the sample (80%) met 1 or more Cochrane criteria for potential risk of bias. Thirteen of the 33 studies indicated these biases in their discussion sections, but only 5 editorials/commentaries mentioned them. Seventeen (31%) of the commentary/research study pairs had discordant recommendations regarding whether study findings warranted action (adopting or not adopting a practice).

Conclusions Editorials and commentaries did not routinely address strengths or weaknesses of empirical studies beyond those reported by study authors and rarely mentioned studies' links to industry. In about a third of the sample, editorials and commentaries provided a different perspective than study authors on whether findings were sufficient to recommend action.

Blue Cross/Blue Shield Association Technology Evaluation Center (BCBSA TEC), 225 North Michigan Ave, Chicago, IL 60601, USA, e-mail:

Quality of Reporting II

Acknowledging Limitations in Biomedical Studies: The ALIBI Study

Milo A. Puhan,1,2 Nadine Heller,2 Irena Joleska,2 Lara Siebeling,3 Patrick Muggensturm,2 Martin Umbehr,2 Steve Goodman,1 and Gerben ter Riet3

Objective To determine the proportion of clinical research papers acknowledging limitations, to categorize limitations, and to assess the degree of tempering of conclusions due to uncertainty arising from limitations.

Design Survey of medical research papers (November 2008 to February 2009). We included the first 10 papers describing randomized trials and observational or diagnostic studies published in 2007 in 30 journals (10 general medical and 20 specialty journals, of which half were first- and second-tier journals). Two reviewers independently evaluated the proportion and type of acknowledged limitations and whether the wording of conclusions in the abstract and discussion section was tempered in light of limitations as perceived by the independent reviewers.

Results Seventy-three percent of the 300 papers acknowledged a median of 3 (range, 0-8) limitations in the discussion section, whereas 5.3% acknowledged a limitation in the abstract; 62.1% and 37.9% of acknowledged limitations referred to aspects of internal and external validity, respectively. Measurement errors (149) and selected study populations (115) were mentioned most frequently as limitations of internal and external validity, respectively. In 88.1% of the papers, tempering the conclusions because of limitations was not recognizable. Papers in general medical journals were more likely to acknowledge limitations at all (odds ratio [OR], 2.27; 95% confidence interval [CI], 1.27-4.10) and in the abstract (OR, 3.57; 95% CI, 1.27-10.0), whereas the conclusions were not tempered more frequently (OR, 0.98; 95% CI, 0.43-2.33). First- and second-tier journals did not differ significantly (Table 12).

Table 12. Acknowledgment of Limitations in Medical Journals


Conclusions A limitation of our study is that acknowledged limitations could not be distinguished from true limitations. Limitations are acknowledged frequently in medical papers, but they are rarely reflected in abstracts or conclusions. As a consequence, readers may not fully realize the limitations of the findings reported. Our findings raise the suspicion that often limitations are acknowledged only pro forma and cannot play the crucial role in the scientific discourse they deserve.

1Johns Hopkins Bloomberg School of Public Health, Johns Hopkins University, Epidemiology Dept, Baltimore, MD 21205, USA, e-mail:; 2Horten Centre for Patient-oriented Research, University of Zurich, Switzerland; 3Department of General Practice, University of Amsterdam, Amsterdam, the Netherlands

Reporting of Eligibility Criteria of Randomized Trials: Comparison Between Trial Protocols and Journal Articles

Anette Blümle,1 Joerg J. Meerpohl,1,2 Gerd Antes,1 and Erik von Elm3

Objective The reporting of randomized trials in journal articles strikes a balance between limited article length and completeness of study information. Comparing study protocols to subsequent publications, we aimed to study the frequency and nature of changes in eligibility criteria (EC) of randomized trials.

Design We established a cohort of protocols submitted in 2000 to the Research Ethics Committee of University Freiburg/ Germany and subsequent full publications identified by electronic literature searches and survey of applicants. We identified 52 trial protocols with 78 publications. From protocols and publications we extracted information on EC differing between protocol and publication and classified them into 7 content categories. For each EC we examined whether it was added to the publication or missing, stated as inclusion or exclusion criterion, and whether the difference represented a minor or major change and would be suggestive of a smaller or larger study population.

Results The 78 publications were published in 50 journals, of which 23 (46%) endorsed the CONSORT statement in their author instructions (as of May 2009). For 1 trial, all the EC stated in the protocol matched those reported in the publication. For 51 trials (98%) with 77 subsequent articles we found differences in EC reporting of different types. Of 1230 EC stated in protocols, 522 (42.4%) matched between protocol and publication, and 708 (57.6%) were modified or missing in the publication (Table 13). A total of 572 EC (46.5%) were formulated as inclusion criteria, and 630 EC (51.2%) as exclusion criteria (28 labeled as "patient selection criteria"). The most frequent content categories of prespecified EC were comorbidity, medical treatment, and type/severity of illness. Seventy EC were new in the publications, for a total of 778 discordant EC. Most differences in EC between protocols and publications were deemed major; most of the published EC definitions were suggestive of larger study populations when the EC was prespecified but of smaller study populations when it was new.

Table 13. Characteristics of Eligibility Criteria in Trial Protocols and Publications


a 21 of 522 (4.0%) labeled as "patient selection criteria"

b 7 of 708 (1.0%) labeled as "patient selection criteria"

c 28 of 1230 (2.3%) labeled as "patient selection criteria"

d 21 of 708 (3.0%) unclear

e 2 of 70 (2.9%) unclear

Conclusions Most articles do not mirror the exact definition of the trial's study population as prespecified in the protocol. Because many users of trial information rely on data published in journal articles, the generalizability of trial results may be misinterpreted.

1German Cochrane Centre, Institute of Medical Biometry and Medical Informatics, University Medical Center Freiburg, Germany; 2Division of Pediatric Hematology & Oncology, Department of Pediatrics, University Medical Center Freiburg, Germany; 3Swiss Paraplegic Research, PO Box, CH-6207 Nottwil, Switzerland, e-mail:

Reporting of Continuous Outcome Measures in Randomized Clinical Trials: Is the Whole Story Being Told?

David L. Schriger,1 Dan F. Savage,1 and Douglas G. Altman2

Objective The CONSORT statement defines elements to be included in reports of randomized controlled trials (RCTs) but says little about the depiction of outcome data. In some reports of RCTs a minimal summary of the available data is presented; a 2-arm, 1000-patient trial might report only 2 means and 2 standard deviations. Such austere reduction may lead to misinterpretation of a trial. We designed our study to investigate the extent of such data reduction.

Methods We are evaluating 10 randomly selected RCTs with 1 or more continuous primary outcomes from 2007 to 2009 issues of 20 leading medical journals. Using methods developed for the quantification of density of data in tables and graphs we measure the degree of data reduction in 2 ways. First, we note the format (text, table, or figure) that conveyed the most detailed information about the outcome and the way that information was conveyed (eg, mean alone; mean with standard deviation [SD], standard error of the mean [SEM], or confidence interval [CI]; histogram; or scatterplot). Second, we calculate the "percentage of available data presented" by dividing the number of data points and descriptive statistics presented for the outcome (the numerator) by the number of data points that could have been presented (the denominator) using a series of denominators of increasing stringency. We also calculate the percentage of data presented for the outcome that was best presented in each article.
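The "percentage of available data presented" can be illustrated with a minimal sketch. The abstract does not list the series of denominators used, so the example takes the simplest one, the number of individual outcome values that could have been shown, as an assumption; the function name is hypothetical.

```python
def percent_data_presented(values_presented: int, values_available: int) -> float:
    """Percentage of available data points that a report actually presents."""
    if values_available <= 0:
        raise ValueError("values_available must be positive")
    if not 0 <= values_presented <= values_available:
        raise ValueError("values_presented must lie between 0 and values_available")
    return 100.0 * values_presented / values_available


# The 2-arm, 1000-patient example from the Objective: 2 means and
# 2 standard deviations are 4 reported values, set against an assumed
# denominator of 1000 individual outcome values.
pct = percent_data_presented(4, 1000)
print(f"{pct:.1f}% of available data presented")
```

A more stringent denominator (for example, counting repeated measurements per patient) would lower the percentage further for the same report.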

Results In general, only a small fraction of available data are presented (mean for best outcome, 22%; median, 6%; range, 0.2%-100%). There was considerable heterogeneity by journal: mean range, 2%-72%; median range, 1%-100%. For over half the journals the median percentage of data presented for the best outcome was under 10% and for 13 of 14 journals it was below 25%. The percentage of data presented for the best outcome was higher when presented in a figure (n = 49, mean 43%, median 22%) than in a table (n = 85, mean 10%, median 5%) or as text (n = 6, mean 7%, median 7%).

Discussion Reports of randomized trials present a small fraction of the available data. While the extent to which this leads to misinterpretation of trial results is unknown, scientific discourse would be enhanced by the presentation of all of the data either in the paper or in online supplements.

1University of California Los Angeles School of Medicine, Emergency Medicine Center, 924 Westwood Blvd, Suite 300, Los Angeles, CA 90024-2924, USA, e-mail:; 2Centre for Statistics in Medicine, University of Oxford, Oxford, UK

Quality of Reporting III

CONSORT for Improving the Quality of Reports of Randomized Trials: A Longitudinal Study of PubMed Indexed Articles

Sally Hopewell,1 Susan Dutton,1 Ly-mee Yu,1 An-wen Chan,2 and Douglas G. Altman1

Objectives To examine the reporting characteristics and methodological details of randomized trials indexed in PubMed in 2000 and 2006 and to assess whether quality of reporting has improved following publication of the revised CONSORT Statement in 2001.

Design We examined all primary reports of randomized trials indexed in PubMed in December 2000 (n = 519) and December 2006 (n = 616). We included parallel-group, crossover, cluster, factorial, and split-body design studies; cost-effectiveness and diagnostic studies were excluded. We carried out single data extraction for a number of general and CONSORT specific items. Data were analyzed using STATA (ver 10); calculating the risk ratio (RR) (with 95% confidence intervals [CI]) to represent changes in reporting between 2000 and 2006.
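A risk ratio with a 95% confidence interval, as used above to compare reporting in 2000 and 2006, can be sketched as follows. This is an illustration using the standard log-based (Wald) interval, not the authors' STATA code; the sample sizes come from the abstract, but the counts of articles reporting an item (200 and 300) are hypothetical.

```python
import math


def risk_ratio(events_new, total_new, events_old, total_old, z=1.96):
    """Risk ratio (new vs old) with a log-based Wald confidence interval."""
    rr = (events_new / total_new) / (events_old / total_old)
    # Standard error of ln(RR) for two independent binomial proportions.
    se_log = math.sqrt(
        1 / events_new - 1 / total_new + 1 / events_old - 1 / total_old
    )
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Sample sizes from the abstract; the event counts are hypothetical.
rr, lower, upper = risk_ratio(events_new=300, total_new=616,
                              events_old=200, total_old=519)
print(f"RR {rr:.2f} (95% CI, {lower:.2f}-{upper:.2f})")
```

An interval that excludes 1 corresponds to a statistically significant change in reporting at the 5% level.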

Results The majority of randomized trials were 2-arm (73% in 2000 vs 76% in 2006), parallel-group trials (74% vs 78%), published in specialty journals (93% vs 90%), with a median sample size of 52 (interquartile range [IQR], 24-120) in 2000 and 62 (IQR, 33-152) in 2006. The proportion of drug trials decreased between 2000 and 2006 (76% vs 58%) and surgical trials increased (10% vs 21%). More articles reported details of the primary outcome (RR, 1.18; 95% CI, 1.04-1.33), power calculation (RR, 1.66; 95% CI, 1.40-1.95), random sequence generation (RR, 1.62; 95% CI, 1.32-1.97), and allocation concealment (RR, 1.40; 95% CI, 1.11-1.76) in 2006 (Figure 1). There was no significant difference in reporting of who was blinded (RR, 0.91; 95% CI, 0.75-1.10). In 2006, 28% of reports included a CONSORT flow diagram, and 61% gave the funding source; very few reported details of trial registration (9%) or access to the trial protocol (1%).

Conclusions Without important information about trial conduct it remains difficult to gauge the validity of trial results. Despite some progress in reporting of methodological details in recent years, there remains considerable room for improvement.

1Centre for Statistics in Medicine, University of Oxford, Linton Road, Oxford, OX2 6UB UK, e-mail:; 2Women's College Research Institute, Department of Medicine, University of Toronto, Ontario, Canada

Figure 1. Changes in Reporting of Methodological Items Between 2000 and 2006


Blinding: Trial reports exactly who was blinded (eg, participants, care providers, outcome assessors).

CONSORT Guidelines for Reporting Abstracts of Randomized Trials: A Survey of Its Impact on High-Impact Journals

Sally Hopewell,1,2 Isabelle Boutron,1 and Mike Clarke2

Objective To evaluate abstracts for reports of randomized trials published in 5 high-impact journals to assess the impact of CONSORT for Abstracts guidelines (published January 2008) and their influence on editorial policy.

Design We selected a random sample of 30 primary reports of randomized trials per journal per year from Annals of Internal Medicine, British Medical Journal (BMJ), Lancet, Journal of the American Medical Association (JAMA), and New England Journal of Medicine (NEJM) in 2007, 2008, and 2009, if indexed in PubMed with an electronic abstract. Secondary publications and economic analyses were excluded. Two authors extracted data independently using the CONSORT for Abstracts checklist. Data were analyzed using STATA (ver 10); 2007 and 2008 data are reported here, data for 2009 will be presented at the Congress.

Results A total of 284 abstracts were assessed (median participants per trial, 571 [interquartile range, 251 to 2005]). Most abstracts described the study as randomized in the title (216; 76%) and reported participant eligibility (256; 90%), interventions (218; 77%), objectives (274; 97%), primary outcome (201; 71%), result for each group with effect size (211; 74%), and precision (225; 79%). Allocation concealment (13; 5%), sequence generation (7; 2%), and specific details on who was blinded (12; 4%) were poorly reported as were trial design (65; 23%), funding source (3; 1%), harms (119; 42%), and number of participants randomized (137; 48%) and analyzed (92; 32%) in each group. There were substantial differences in the median proportion of CONSORT items reported across journals perhaps reflecting different editorial policies.

Conclusions Abstracts of randomized trials fail to meet a number of recommendations in the CONSORT for Abstracts guidelines. We hope the endorsement of the guidelines by the International Committee of Medical Journal Editors will herald improvements.

1Centre for Statistics in Medicine, Wolfson College, Linton Road, Oxford, OX2 6UB UK, e-mail:; 2UK Cochrane Centre, National Institute for Health Research, Oxford, UK

Reporting Clinical Trial Subgroup Analyses: A Proposal for Rigorously Assessing Heterogeneity in Treatments Effects

David M. Kent,1 Peter M. Rothwell,2 John P. A. Ioannidis,1,3 Douglas G. Altman,4 and Rodney A. Hayward5

Objective To develop a framework for the analysis and reporting of heterogeneity of treatment effect (HTE).

Design We reviewed the recent evidence on optimal statistical approaches to assessing HTE, and supplemented this with a systematic review of subgrouping practices in therapeutic trials that employ either clinical or FDA-accepted surrogate outcomes in 4 general medical journals (BMJ, JAMA, Lancet, and the New England Journal of Medicine).

Results Our initial review suggests that there is frequently tremendous variation in the baseline risk of the outcome of interest in clinical trial populations. These differences in risk may lead to clinically important HTE, such that the "average" benefit observed in the RCT summary result may be nonrepresentative of the treatment effect for many patients, including typical patients enrolled in the trial. Conventional subgroup analyses, which examine whether specific patient characteristics modify the effects of treatment, are usually unable to detect even large variations in treatment benefit (and harm) across risk groups because they do not account for the fact that patients have multiple characteristics simultaneously affecting outcome risk and potential for benefit. Risk-based subgroups using multivariate risk modeling are much better powered to detect HTE. Our systematic review of recently published clinical trials shows that this is often feasible but rarely done. While 61/93 (65.5%) studies reported subgroup analysis (a median of 4 per trial [range, 0-23; interquartile range, 2-6]), only 6 (6.5%) reported a risk-based subgroup analysis. Potentially applicable externally developed predictive models were available for 65 (69.9%) trials. Risk-based analysis was deemed feasible in all but 15 trials (16.1%).
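The link between baseline risk and heterogeneity of treatment effect can be illustrated with simple arithmetic: if a treatment has the same relative effect in every patient, the absolute benefit still scales with baseline risk. The sketch below assumes a constant relative risk of 0.75 and three hypothetical risk strata; none of these numbers come from the review.

```python
def absolute_risk_reduction(baseline_risk: float, relative_risk: float) -> float:
    """Absolute risk reduction implied by a constant relative risk."""
    return baseline_risk * (1.0 - relative_risk)


RELATIVE_RISK = 0.75  # assumed constant across all patients
baseline_risks = {"low": 0.02, "medium": 0.10, "high": 0.30}  # hypothetical strata

arr_by_group = {
    group: absolute_risk_reduction(risk, RELATIVE_RISK)
    for group, risk in baseline_risks.items()
}

for group, arr in arr_by_group.items():
    nnt = 1.0 / arr  # number needed to treat to prevent one outcome
    print(f"{group:>6}: ARR = {arr:.3f}, NNT = {nnt:.0f}")
```

Under these assumptions the high-risk stratum gains 15 times the absolute benefit of the low-risk stratum, which a single "average" trial result would conceal.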

Conclusions Trials that do not present interpretable absolute and relative treatment effects across risk categories are incompletely disclosing their results. Development of guidelines for subgroup analysis and the rigorous assessment of HTE using multivariable risk-based analysis could substantially improve the reporting of clinical trials.

1Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, 800 Washington St, Box #63, Boston, MA 02111, USA, e-mail:; 2Department of Clinical Neurology, John Radcliffe Hospital, Oxford, UK, 3Department of Hygiene and Epidemiology, University of Ioannina School of Medicine, Ioannina, Greece, 4Centre for Statistics in Medicine, University of Oxford, Oxford, UK, 5Division of General Medicine, Health Management and Policy, School of Public Health, Department of Internal Medicine, University of Michigan, Ann Arbor, MI, USA

Reporting Guidelines for Clinical Research: A Systematic Review

David Moher,1 Iveta Simera,2 Kenneth Schulz,3 Donald Miller,4 Jeremy Grimshaw,1 John Hoey,5 and Douglas G. Altman2

Objective To identify, document, and characterize all existing reporting guidelines developed to improve the quality of reporting of health research. A reporting guideline is a checklist, flow diagram, or explicit text to guide authors in reporting a specific type of research and is developed using explicit methodology, of which a consensus process is a crucial component.

Design A systematic review. We searched MEDLINE (February week 2 2009), EMBASE (2009 week 8), PsycInfo (February week 3 2009), and the Cochrane Methodology Register (2009 Issue 1). Two reviewers independently conducted a broad screen of all retrieved titles and abstracts and subsequently a full-text screen to determine final eligibility for those records passing the broad screen. To be included, the reporting guideline must have been developed using explicit methodology, including a consensus process, and be reported in English or French. All disagreements were resolved through consensus and, as needed, third-party arbitration. One researcher extracted descriptive information about each reporting guideline and the development process, using a recently developed checklist covering 4 phases of the development process. A pilot-tested, standardized data extraction form was used.

Results A total of 2,784 records were identified and 450 are currently being screened for final eligibility (87.0% agreement for broad screen). The majority of records were excluded because they were editorials or described clinical practice guidelines or reporting guidelines developed using a non–consensus-based process. A range of reporting guidelines have been identified related to clinical, laboratory, and economic health research. Some of the identified guidelines build on existing guidance (ie, for a specific clinical area) or are updates of previously published guidelines. A broad range of approaches were followed to develop reporting guidelines, but most included a face-to-face meeting of relevant stakeholders, including content experts, editors, and clinicians. Preliminary analysis indicates that reporting of specific elements of the guideline development process is suboptimal.

Conclusions This review helps to characterize similarities and differences across health research reporting guidelines. The diversity in development approaches and suboptimal reporting of the development process suggest there is a growing need to develop an instrument to help authors, editors, and others appraise the usefulness of any reporting guideline. The development of such a tool will be informed by this systematic review.

1Clinical Epidemiology Methods Centre, Ottawa Health Research Institute, The Ottawa Hospital, General Campus, Critical Care Wing (Eye Institute), 6th Floor, 501 Smyth Rd, Ottawa, Ontario, Canada K1H 8L6, e-mail:; 2Centre for Statistics in Medicine, Oxford University, Oxford, UK; 3Family Health International, Durham, NC, USA; 4University of Ottawa, Ottawa, Ontario, Canada; 5University of Toronto, Toronto, Ontario, Canada

Postpublication Citations, Indexing, Responses, and Online Publishing

Impact Factors of Secondary Journals

Cynthia Lokker, R. Brian Haynes, K. Ann McKibbon, and Nancy Wilczynski

Objective Secondary journals such as Evidence-Based Medicine (EBM), ACP Journal Club (ACPJC), and Evidence-Based Nursing (EBN) review more than 150 clinical journals and summarize articles that pass criteria for scientific merit and clinical relevance to practicing clinicians. Our objective was to calculate 2007 impact factors for the secondary journals in comparison with the article source journals.

Design A retrospective cohort study of articles abstracted in the secondary journals originally published in 2005 and 2006. We collected the number of citations in 2007 to these articles from the Institute for Scientific Information (ISI) Web of Science and calculated the 2007 impact factors for the secondary journals. We compared impact factors of the secondary journals with the published impact factors of the journals represented within the secondary journals. We also compared the mean citations to summarized articles per journal to the published 2007 impact factors.
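The impact factor calculation described above follows the usual definition: citations received in one year to items published in the 2 preceding years, divided by the number of those items. The sketch below shows that arithmetic for a hypothetical secondary journal; the citation counts are invented for illustration and are not study data.

```python
def impact_factor(citations_per_article: list[int]) -> float:
    """Mean citations in the census year to articles from the 2 prior years."""
    if not citations_per_article:
        raise ValueError("at least one article is required")
    return sum(citations_per_article) / len(citations_per_article)


# Hypothetical 2007 citation counts for five articles originally published
# in 2005-2006 and later abstracted in a secondary journal.
citations_2007 = [12, 40, 3, 0, 55]
secondary_if = impact_factor(citations_2007)
print(f"Calculated 2007 impact factor: {secondary_if:.1f}")
```

Because a secondary journal abstracts only selected articles, the denominator here is the number of abstracted articles rather than everything the source journals published.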

Results The 2005 and 2006 articles in the secondary journals were originally published in 84 journals, 82 with impact factors (median, 4.1; range, 0.85-52.9). The calculated impact factors for the secondary journals were 39.5 for ACPJC, 30.2 for EBM, and 9.3 for EBN (Table 14). Highest published impact factors for journals were New England Journal of Medicine (NEJM) (52.9), Lancet (28.6), JAMA (25.5), Archives of General Psychiatry (16.0), Annals of Internal Medicine (15.5), and Journal of Clinical Oncology (15.5). ACPJC and EBM had impact factors higher than all but NEJM. Of 100 journals categorized as "general and internal medicine" by ISI, the median impact factor was 1.3. Twelve journals had impact factors higher than EBN but none were nursing journals. Of the 46 nursing journals, the median impact factor was 0.9.

Table 14. Secondary Journal Impact Factors Based on Articles Originally Published in 2005 and 2006 and Abstracted in Evidence-Based Journals


ACPJC, ACP Journal Club; EBM, Evidence-Based Medicine; EBN, Evidence-Based Nursing.

Conclusions The selection processes of evidence-based secondary journals identify articles at the time of publication that go on to garner more citations on average than other articles in the source publications. Whether this is simply due to selection processes or also due to stimulating citations of featured articles is unknown.

Health Information Research Unit, HSC 3H7, 1200 Main St W, McMaster University, Hamilton, Ontario L8N 3Z5, Canada, e-mail:

Does Journal Indexation Depend on the Origin of Publication? A Retrospective Cohort Study of Anesthesia Journals Indexed in MEDLINE and EMBASE

Martin Tramèr,1 Nadia Elia,1 Jean-Daniel Junod,1 Teresa Dib,2 and Christian Mazza3

Objective To study the association between the origin of publication of anesthesia journals and indexation rates in MEDLINE and EMBASE.

Design Retrospective cohort study of anesthesia journals published to 2005. Journals were systematically searched using International Standard Serial Number, US National Library of Medicine database, and Ulrich's Periodicals Directory. We extracted information on origin of publication (United States [US], non–United States [non-US]), first and last date of publication, and indexation in MEDLINE and/or EMBASE. We computed indexation rates per 1000 journal years (IR) and rate ratios (RR) comparing IRs of US with non-US journals, both with 95% confidence intervals (CI).
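An indexation rate per 1000 journal years and the corresponding rate ratio can be sketched as follows. The counts and journal years are hypothetical, and the log-based Wald interval is one common choice that may differ from the method the authors used.

```python
import math


def rate_per_1000(events: int, journal_years: float) -> float:
    """Indexation events per 1000 journal years of follow-up."""
    return 1000.0 * events / journal_years


def rate_ratio(events_a, years_a, events_b, years_b, z=1.96):
    """Rate ratio (a vs b) with a log-based Wald confidence interval."""
    rr = (events_a / years_a) / (events_b / years_b)
    se_log = math.sqrt(1 / events_a + 1 / events_b)  # Poisson approximation
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Hypothetical data: 10 US journals indexed over 500 journal years,
# 20 non-US journals indexed over 2000 journal years.
ir_us = rate_per_1000(10, 500)
ir_non_us = rate_per_1000(20, 2000)
rr, lower, upper = rate_ratio(10, 500, 20, 2000)
print(f"IR US {ir_us:.1f}, IR non-US {ir_non_us:.1f}")
print(f"RR {rr:.2f} (95% CI, {lower:.2f}-{upper:.2f})")
```

Using journal years as the denominator accounts for journals that were published for different lengths of time before being indexed or ceasing publication.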

Results We retrieved 325 journals, published from 1921 to 2005; 57 journals (25 US, 32 non-US) had been indexed in MEDLINE and 71 (21 US, 50 non-US) in EMBASE. In MEDLINE, IR of US journals was 20.8 (95% confidence interval [CI], 14.1-30.9), and the IR of non-US journals was 8.0 (95% CI, 5.6-11.3); RR 2.61 (95% CI, 1.48-4.55), P < .001. In EMBASE, IR of US journals was 16.9 (95% CI, 11.0-26.0), and the IR of non-US journals was 12.3 (95% CI, 9.3-16.2); RR 1.38 (95% CI, 0.79-2.34), P = .223. The RRs have significantly increased over time in MEDLINE (RR, 1.75 [95% CI, 1.07-2.88], P = .024; likelihood ratio test for interaction between origin and time period, P = .003), but not in EMBASE (RR, 1.27 [95% CI, 0.76-2.13], P = .364; likelihood ratio test for interaction, P = .561). Although IRs of US journals remained similar in both databases, IRs of non-US journals have dramatically decreased in MEDLINE only.

Conclusions MEDLINE and EMBASE both claim that journals are selected for indexation based on quality and are independent of origin. However, there is evidence that US anesthesia journals are significantly likelier to be indexed in MEDLINE than non-US anesthesia journals; this phenomenon is not found in EMBASE. Since both databases share very similar indexing criteria, quality of journals is unlikely to provide an alternative explanation to our findings.

1Division of Anesthesiology, University Hospitals of Geneva, 24 rue Micheli-du-Crest, Geneva 14, 1211 Switzerland, e-mail: martin.tramer@; 2University of Geneva, Switzerland; 3University of Fribourg, Fribourg, Switzerland

Authors' Reply to Substantive Criticism Raised in Online Letters to the Editor

Peter Gøtzsche,1 Tony Delamothe,2 Fiona Godlee,2 and Andreas Lundh1

Objective To study whether substantive criticism raised in online letters to the editor, defined as a problem that could potentially invalidate the research or make it less reliable than what it seemed to be, is addressed by the authors.

Design Cohort study of research papers published in the BMJ between October 2005 and September 2007. Two observers selected all papers where a substantive criticism was raised in the online Rapid Responses section, and 2 editors, blinded to the authors' replies, judged independently whether it was (1) minor, (2) moderate, or (3) major. Thereafter, the editors judged whether the criticism was (1) fully addressed, (2) partly addressed, or (3) not addressed by the authors. The criticism authors made the same judgment.

Results A substantive criticism was raised against 105 of 350 (30%) research papers, and the 2 editors judged it to be major in 34 and 54 cases, respectively. The authors had responded to 47 (45%) of the criticisms. The criticism was of similar severity in cases with and without authors' replies (P = .72). For the 47 criticisms with responses, we did not find a relation between the seriousness of the criticism and the adequateness of the replies, neither in the opinion of the editors (P = .88 and P = .95, n = 47) nor in the opinion of the criticism authors (P = .83, n = 39, 83% response rate). However, compared with the criticism authors, the editors felt more often that the criticism was addressed (mean, 1.4 vs 2.3, P < .001, n = 39).

Conclusions Substantive criticism was common, but authors replied in only half of the cases. Editors judged the replies far more positively than the criticism authors. Resources permitting, editors might encourage authors to reply to substantive criticism and could aim for adequate replies, eg, by using the criticism authors as peer reviewers of the reply.

1The Nordic Cochrane Centre, Rigshospitalet, Dept 3343, Blegdamsvej 9, DK-2100 Copenhagen, Denmark, e-mail:; 2BMJ, London, UK

Reader Response to an Online Clinical-Decision Feature With Polls and Commenting

Patricia Kritek,1,2 Edward W. Campion,2 and Pam Miller2

Objective Scientific, peer-reviewed journals are developing a greater presence online, but there are concerns about both the quality and the utility of the new, associated participatory features.

Design We assessed the use patterns for a free, interactive feature that includes a short case about a clinical controversy, expert commentary, a poll with 3 decision options, and an option to submit comments. All cases related to original research published at the same time. The topics ranged from outpatient management of asthma to end-of-life decision making.

Results An average of 6577 votes were cast (range, 3703-11205). The first case, mild-persistent asthma, had the most users (32822) with 18.5% voting. The voting rate increased over time to 37.5% for the last case. Participants were from 136 countries in the following regions: North America (63%), Europe (16%), Asia/Russia (9%), South America (9%), Australia/Oceania (2%), and Africa (1%). Those voting were physicians (85%), students (8%), other health professionals (5%), or other (2%). The 7 case-decision exercises received an average of 373 reader comments, although the number of comments correlated poorly with the number of votes cast, and only 6% of those who voted submitted a comment. In 7 cases the average number of comments published was 341. The greatest number of published submissions (492) was associated with a clinical scenario that received 6445 votes, while there were only 407 comments published for the feature with the most votes. Editors judged 92% of submitted comments appropriate for publication (range, 89%-95%).

Conclusions This new, participatory, online, clinical-decision feature at a medical journal's Web site evoked international reader responses, especially in the polls and, to a lesser extent, in the submission of comments. The vast majority of reader comments were judged appropriate for online publication although objective assessment of the quality of comments remains challenging.

1Brigham & Women's Hospital, Pulmonary and Critical Care, 75 Francis St, Clinics Building, 3rd Floor, Boston, MA 02115, USA, e-mail: pkritek@; 2New England Journal of Medicine, Boston, MA, USA

Use of the Internet by Print Medical Journals in 2003-2009: A Longitudinal Observational Study

David L. Schriger,1 Rashida Merchant,1 Ariana Chehrazi,1 and Douglas G. Altman2

Objective To examine the extent to which journals are augmenting print articles with online supplements and functions.

Design This is a longitudinal observational cohort study of 138 high-impact medical journals. The number and kinds of online supplements associated with each print article in a sample of 28 medical journals were assessed biennially starting with March 2003 issues using standardized abstraction forms. The use of "rapid response" pages that permit the public to provide post-publication review of papers was tracked for all journals that have this feature.

Results The number of journals providing online-only supplementary material increased from 32% (2003) to 50% (2005) to 61% (2007) to 64% (2009), and the percentage of articles that contained supplementary material increased from 5% to 12% to 21% to 30%, respectively. This pattern was seen in both a random sample of journals (20) and a selective sample of journals (10) chosen because they were thought to have frequent online-only content. The number of video supplements also increased markedly from 2005 to 2007, showing a slight decrease in 2009. In contrast, the number of journals offering online postpublication review decreased from 12% (17/138) to 9% (12/138) from 2005 to 2007, and the percentage of articles with no responses was essentially unchanged at 82% (2005) and 81% (2007) (2009 data pending). See Table 15.

Conclusions The use of online-only articles and online-only supplements by print journals continues to increase. Postpublication critique of online-only articles provided by the journal does not seem to be taking hold.

1University of California Los Angeles School of Medicine, Emergency Medicine Center, 924 Westwood Blvd, Suite 300, Los Angeles, CA 90024-2924, USA, e-mail:; 2Centre for Statistics in Medicine, University of Oxford, Oxford, UK

Table 15. Supplementary Web-Only Material to Print Journal Articles in 28 Journals, 2003-2009


a r indicates journals selected at random; h, journals handpicked for sample; 5 journals in the random sample (Arch Gen Psychiatry, Biol Blood Marrow Transplant, Ear Hear, J Assoc Res Otolaryngol, J Neurotrauma) and 2 in the handpicked sample (Ann Surg and Pediatrics) had no supplementary material in any of the 4 years.

b Counts do not include the 146 articles that were published solely online (87 of which were in Pediatrics).

c The number of articles reported in the final 3 rows is based on the relevant journals from the 21 listed in the table. The entries for the other columns in these rows are calculated as the average of the percentages for each journal in that sample.




A Survey on Publication Standards of Medical Drug and Device Advertisements Published in Core Medical Journals in China

Liang DU,1 Yao-long CHEN,1 Min CHEN,2 and You-ping LI1

Objective Medical drugs and devices are the most widely used products to provide medical service in any country. Strict pre- and postmarketing assessments are needed during their whole lifespan. More than 6,000 manufacturers exist in China due to the huge market and large profits. Most of them produce the same products but publish numerous different advertisements for their limited products. There are more than 1,000 medical journals in China, and most of them will publish advertisements. Presumably, advertisements in core medical journals have far more influence than ones in other media due to the journal's academic reputation, but they often lack standards or evidence to judge the quality of the advertised products. The Standards for the Examination and Publication of Drug and Device Advertisements were implemented in 1995 and updated in 2007 and 2009. The 2 standards stated that only advertisements with both advertisement license and production license numbers could be published. This study aims to learn the current status, publication standards, formats, and contents of medical advertisements published in journals in China and to discuss the possibility of evidence-based evaluation and standards.

Design We reviewed issue 1 of 222 core medical journals published in 2008 and indexed by A Guide to the Core Journals of China (2004 version), the most important database to index the top 20% academic journals, to identify basic journal information and the content of their advertisements. The general and trade name of the drugs and the advertisements and production license number of the drugs and devices were collected. We used EXCEL software for data input and SPSS 13.0 for statistical analyses.

Results Two hundred eighteen journals were hand-searched and evaluated. The other 4 journals were excluded because a print version could not be found. A total of 1201 advertisements were published in 159 (72.9%) of the journals, with an average of 5.5 advertisements (range, 1-37) per journal. Of the advertisements, 910 (75.8%) were related to medical drugs or devices, including 598 (49.8%) drug and 312 (26%) medical device advertisements. Most advertisements were published in clinical and specialty medical journals. A total of 518 (86.6%) drug advertisements had both advertisement license and production license numbers, but only 116 (36.1%) of medical device advertisements stated both the advertisement license and production license numbers. References were found in less than 10% of advertisements.

Conclusions The medical drug advertisements published in core medical journals of China lack sufficient publication standards, and medical device advertisements are even worse. We cannot assess the efficacy, safety, and cost-effectiveness of the advertised products from the currently limited, unclear, and highly commercialized advertisements. It is necessary to improve the publication standard for advertisements so that they provide enough necessary evidence, develop proper format and approach, and enhance their application and management. We are conducting a further survey on the information of advertised products related to clinical trial registration, publication, and evidence grade to clearly indicate their real efficacy, safety, and cost-effectiveness.

1Chinese Cochrane Centre, West China Hospital of Sichuan University, No. 37 Guoxuexiang, Chengdu 610041, China, e-mail: yzmylab@hotmail.com; 2Department of Traditional Chinese Medicine, No. 3 People's Hospital, Chengdu, China

Papers Supporting Advertisements in Medical Journals: A Continuation of the Case-control Study

Vasiliy Vlassov

Objective Print journals depend on advertisements. To describe the patterns of the articles published in support of advertisements, including types of articles, differences in the practice of advertisers, and time trends.

Design Case-control study, extending work from a previous study. From a convenience sample of 2 international and 3 Russian peer-reviewed journals, 3 were selected for the extension of the study from 2000 to 2008 because of availability. Only advertisements for medical products, not education and jobs, were considered. The connection of the article to the advertisement was judged by the presence in the article of a positive comment on the advertised product or the use of the trade name and/or manufacturer.

Results Three journals have a rather constant level of association of published advertisements with the content of a journal (odds ratio, 2.2-40 during different years) and interleaf advertisements with articles. All major advertisers have a similar level of association of their advertisements with the content of the journals. Some advertised products do not show a statistically significant association with articles published in the same issue.

Conclusions Manipulation of the journal content for support of the advertisements is a long-lasting practice. The stable level of association during 8 study years means that this practice is not a "period effect." Association is visible also from juxtaposition of advertisements and related articles. The strength of the association and use of this instrument by all major advertisers means that part of the "scientific" content of the journals is misleading. In journals successfully attracting advertisements, this misleading content may be included in up to 50% of published articles.

Moscow Medical Academy, POB 13, Moscow, 109451, Russia, e-mail:

Authorship and Contributorship

Factors Associated With Multiple Authorship in Peer-Reviewed Papers Regarding Pregnancy Over 3 Decades

Brian Mercer,1 Catherine Spong,2 and James Scott3

Objective To determine if variation in authorship number in peer-reviewed papers regarding pregnancy can be attributed to differences in journal type, funding, or multicenter studies.

Design Using PubMed we identified original studies with abstracts and the key-word "pregnancy" published from 1975 to 2008 in 3 obstetric specialty journals: Obstetrics & Gynecology (OG), American Journal of Obstetrics & Gynecology (AJOG), and British Journal of Obstetrics & Gynecology (BJOG) and 3 general medical journals: New England Journal of Medicine (NEJM), Journal of the American Medical Association (JAMA), and Lancet (LT). Univariable and multivariable comparisons were performed for author number, more than 6 (GT6) and more than 10 (GT10) authors, corporate authorship (none specified), and group authorship (authors plus group), according to publication year, journal type, funding, and multicenter study.

Results Of 12,981 papers, GT6 authorship occurred in 15.7% (OG, 7.1%; AJOG, 17.0%; BJOG, 11.8%; JAMA, 33.9%; NEJM, 39.3%; LT, 31.7%; P < .0001). GT6-authorship was more common in general medical journals (34.3% vs 13.3%; odds ratio [OR], 3.40, 95% confidence interval [CI], 3.02-3.83), multicenter (45.3% vs 13.5%; OR, 5.29 [4.59-6.09]), and funded (18.1% vs 9.0%; OR, 2.24 [1.97-2.55]) studies. From 1975 to 2008, GT6-authorship increased from 0% to 72.7% in general medical and 1.7% to 30.7% in obstetric journals, each P < .0001. The difference in GT6 authorship between journal types increased over time, P < .001. GT10-authorship findings were similar. Group and corporate authorship were more common in general medical journals (see Table 1). Group authorship increased over time, P < .001. All papers with corporate authorship listed (n = 69) or referenced (n = 2) the study investigators or a writing committee. GT6, GT10, corporate and group authorship, and total author number varied significantly between journal types after controlling for year, multicenter studies, and funding, P < .001 each.
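An unadjusted odds ratio can be recovered from two proportions, which gives a quick consistency check on figures such as those above. The sketch below uses the reported proportions of papers with more than 6 authors in general medical versus obstetric journals (34.3% vs 13.3%); the function name is an assumption, and the calculation ignores the confidence interval and any adjustment.

```python
def odds_ratio_from_proportions(p_exposed: float, p_unexposed: float) -> float:
    """Unadjusted odds ratio comparing two proportions (each between 0 and 1)."""
    odds_exposed = p_exposed / (1.0 - p_exposed)
    odds_unexposed = p_unexposed / (1.0 - p_unexposed)
    return odds_exposed / odds_unexposed


# Proportions of papers with more than 6 authors, as reported in the abstract.
or_gt6 = odds_ratio_from_proportions(0.343, 0.133)
print(f"OR for GT6 authorship, general vs obstetric journals: {or_gt6:.2f}")
```

The result agrees with the reported odds ratio of 3.40 to two decimal places; small discrepancies can arise elsewhere because the published percentages are rounded.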

Table 1. Comparison of Authorship in General Medical and Obstetric Journals

a P < .0001

Conclusions The proportion of multiauthored original studies regarding pregnancy is higher in general medical journals than in obstetric journals, and this difference has increased over time. These findings are not accounted for by differences in funding or multicenter studies.

1Case Western Reserve University, Suite G240, Dept Ob/Gyn, MetroHealth Medical Center, 2500 MetroHealth Dr, Cleveland, OH 44109, USA, e-mail:; 2Pregnancy and Perinatology Branch, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, MD, USA; 3Obstetrics & Gynecology, University of Utah Medical Center, Salt Lake City, UT, USA


The Value of Lesser-Impact-Factor Surgical Journals as a Source of Negative and Inconclusive Outcomes Reporting

Ziad Kanaan, Susan Galandiuk, Margaret Abby, Katherine Shannon, Daoud Dajani, and Hiram C. Polk

Background Evidence-based medicine is often used as a template for measuring the quality of medical care. Clinicians put their faith in peer-reviewed articles as quality-assured and reliable knowledge. However, use of the peer-reviewed literature is complicated by the difficulty clinicians face in retrieving quality-assured positive, negative, and inconclusive reports equally.

Objectives (1) Examine the tendency of peer-reviewed surgical journals to publish positive reports or negative and inconclusive outcome articles as a function of the journals' impact factor (IF). (2) Examine the frequency with which surgical journal editors/publishers follow a previously published joint statement regarding funding and/or conflicts of interest (COI).

Design Papers from 15 surgical journals comprising 3 separate journal groups based on 2006 IF rankings were reviewed. All were published in 2007. Manuscripts were classified by 4 independent reviewers as having positive, negative, or inconclusive primary and secondary outcomes and for statements on funding/COI. Positive reports were defined as P < .05, null hypothesis rejected; negative reports defined as P < .05, null hypothesis accepted; and inconclusive reports defined as P > .05. Case reports, reviews, commentaries, and editorials were excluded. Interobserver consistency was assessed and affirmed in 10% of manuscripts.

Results Review of a total of 2457 articles showed an inverse correlation between impact factor and negative and inconclusive reports (Figure 1).

Conclusions This presumed bias away from opposing points of view that are essential in clinical decision making is a major weakness of current patterns of publication. This bias is further complicated by the failure of all but 1 surgical specialty journal to uniformly describe funding sources and/or COI. Lower IF-rated journals may serve a decidedly useful purpose by publishing more negative and inconclusive outcome studies. The practice of focusing disproportionately on the positive outcomes of most studies may result in unbalanced evidence.

Figure 1. Inverse Correlation Between Impact Factors and Negative and Inconclusive Reports

a Includes the only journal surveyed with 100% funding/COI reporting.

Department of Surgery and the Price Institute of Surgical Research, University of Louisville, Louisville, KY 40292, e-mail:

Differences in Editorial Board Reviewer Behavior Based on Gender

Deborah Wing,1 Rebecca Benner,2 Rita Petersen,1 and James Scott2

Objective With increasing female representation on the editorial board of Obstetrics & Gynecology, we evaluated whether there were differences in review behavior based on gender.

Design Retrospective analysis of editorial board members' reviews of original research submissions based on gender using the online management program, Editorial Manager, from January 1, 2002, through December 31, 2008. We evaluated recommendations of the editorial board members for acceptance or rejection using a 4-tier system, agreement with the editor's final decision, turnaround time from review request to submission, and editors' grades of reviews on a 5-point scale. We evaluated performance of editorial board members with advancing tenure, seeking trends in recommendations over time.

Results A total of 6062 manuscript reviews representing 5958 manuscripts were included; 4062 (67%) were assigned to male editorial board members and 2000 (33%) to females. There were a total of 38 editorial board members (25 men, 13 women) with tenure duration from 2 to 4.9 years, and 3 editors (2 men, 1 woman) serving 7 and 6 years, respectively. Women were less likely to accept or accept with minor revisions than were men (P < .003). Median turnaround times were 14 (0-55) days for women and 10 (0-33) days for men (P < .001). The editors' grades assigned to women were more often in the very good to exceptional category than for men (P < .0001). Compared to the editors' final decisions, there was no difference based on gender with approximately 73% decision congruence overall. Men rejected more manuscripts than women with advancing tenure on the editorial board (P < .0001).

Conclusions Thirteen (33%) of editorial board members for this journal are women. There are differences based on gender for the editorial board members' recommendations regarding manuscript triage, turnaround time, and editors' grades assigned. Longitudinal performance with increasing frequency of rejection recommendations with advancing tenure was found for men but not women. Overall, however, these differences do not affect the editors' ultimate decisions regarding publication of manuscripts.

1Division of Maternal-Fetal Medicine, Department of Obstetrics-Gynecology, University of California, Irvine, 101 The City Drive S, Suite 800, Building 56, Orange, CA 92868, USA, e-mail:; 2Obstetrics & Gynecology, Salt Lake City, UT, USA

Eligibility Criteria of Randomized Controlled Trials of Acutely Ill and Hospitalized Patients With Acute Lung Injury or Sepsis

Chris Lazongas,1,2 Andrew Toren,3 Ruxandra L. Pinto,1 Niall D. Ferguson,2,4 and Robert A. Fowler1,2

Objective To evaluate the generalizability of randomized controlled trials (RCTs) of acutely ill and hospitalized patients with acute lung injury or sepsis by examining RCT eligibility criteria and comparing the findings to previously established generalizability among RCTs of general medical conditions.

Design We searched MEDLINE for RCTs (1996-2007) involving ≥50 patients and identified 28 trials in acute lung injury and 29 trials in sepsis. Trial characteristics and eligibility criteria were abstracted. Exclusion criteria were graded as strongly, potentially, or poorly justified according to previously published guidelines.

Results Participants were 52.9 years old (±13.7), male (60.2%), and studied in adult intensive care units (94.7%). There were (mean, standard deviation [SD]) 12.7 ± 6.4 exclusion criteria per trial. Common exclusion criteria included age (86%), pregnancy or lactation (64.9%), and common medical conditions (96.5%). Shock was a reason for exclusion in 24.6% of trials, and weight in 21.1%. Specific medications were common exclusions among sepsis trials (62.1%) but not acute lung injury trials (28.6%); respiratory condition exclusion was common in acute lung injury trials (60.7%) but not sepsis trials (10.3%). Sepsis trials more commonly investigated pharmacologic agents than did those of acute lung injury (89.7% vs 39.3%) and were more commonly industry sponsored (82.8% vs 42.9%). Pharmacotherapy-based RCTs were more likely to include medication-related exclusions (odds ratio 5.87, P = .004). Among all exclusions, 33.1% were judged strongly justified, 33.7% potentially justified, and 33.2% poorly justified. A total of 96.5% of RCTs contained more than 1 poorly justified exclusion. Compared to RCTs in the general medical literature, critical care trials exhibited fewer strongly justified exclusions (33% vs 47%), more potentially justified (33% vs 15%), and more poorly justified exclusions (37% vs 33%, χ2 test for trend, P = .0066).

Conclusions Age, gender, and common medical comorbidities are common reasons for exclusion in RCTs involving the sickest of hospitalized patients. Many exclusion criteria are poorly justified. This may have important consequences for the generalizability of trial results.

1Department of Critical Care Medicine, 2075 Bayview Ave, Room D-478, Sunnybrook Health Sciences Centre, Toronto, Ontario M4N 3M5, Canada, e-mail:; 2Department of Medicine, University of Toronto, Toronto, Ontario, Canada; 3Department of Ophthalmology, University of Ottawa, Ottawa, Ontario, Canada; 4Department of Medicine, Division of Respirology, University Health Network, Toronto, Ontario, Canada.

Multiple Publication of Positive vs Negative Trial Results in Review Articles: Influence on Apparent Weight of the Evidence

Erick Turner

Objective In a previous study of selective publication, we counted trials as unpublished if they were not reported in full (stand-alone) publications. To evaluate a company's public complaints that this method was unfair, we now credit trials as published if they appear only in review publications and ask whether this mitigates the findings of selective publication for that company's drug.

Design Within our previously published data on antidepressant trial outcomes extracted from US Food and Drug Administration (FDA) reviews and matching full journal articles, we focused on duloxetine. We identified review articles citing the full publications using Web of Science with a cutoff date of May 2008, including only those presenting placebo-controlled efficacy outcomes. Within these articles, we determined whether the trial results were presented as positive (statistically significant) or negative (nonsignificant) on the primary outcome. Using Fisher exact test, we compared the proportion of positive vs negative reports according to (1) FDA reviews, (2) stand-alone publications, and (3) all (stand-alone plus review) publications.

Results The FDA reviewed 8 duloxetine trials and judged 4 of them positive and 4 negative. In stand-alone publications, 6 of the 8 trials were published as positive and none as negative (P = .085 vs FDA tally). Within the combination of 6 stand-alone publications plus 21 review publications, outcomes from the 8 trials were reported as positive 103 times and as negative 8 times (P = .003 vs FDA tally).
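The two comparisons above can be reproduced approximately with a two-sided Fisher exact test on 2 × 2 tables of positive and negative reports. The following is a minimal sketch using only the Python standard library. The function name is hypothetical, and the two-sided rule used here (summing every table no more likely than the observed one) is an assumption about how the test was applied.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # A small relative tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Rows: FDA tally (4 positive, 4 negative) vs published reports.
p_stand_alone = fisher_exact_two_sided(4, 4, 6, 0)        # 6 positive, 0 negative
p_all_publications = fisher_exact_two_sided(4, 4, 103, 8)  # 103 positive, 8 negative
```

Under these assumptions the first comparison gives 85/1001, or about .085, and the second gives roughly .003, in line with the values reported above.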

Conclusions Positive trials were fully published but negative trials were not. Results from negative trials were instead bundled with positive trials into review articles. When we counted such trials as published, we found that multiple publication of trial results significantly skewed the apparent weight of the evidence favoring drug efficacy. In the interest of fair balance, it seems reasonable to expect full publication of all trial results, regardless of trial outcome.

Department of Psychiatry and Center for Ethics in Health Care, Oregon Health & Science University, Portland, OR, USA, and Mental Health and Clinical Neurosciences Division P3MHDC, Veterans Affairs Medical Center, 3710 SW US Veterans Hospital Rd, Portland, OR 97239, USA, e-mail:

Reasons for Not Publishing Studies: A Meta-analysis of Data From Empirical Studies

Fujian Song,1 Caroline Hing,2 Sheetal Parekh,3 Lee Hooper,1 Yoon Loke,1 Jon Ryder,1 Alex Sutton,4 and Ian Harvey1

Objective To summarize data on reasons given by investigators for not publishing their studies.

Design As part of a comprehensive updated review of publication bias, we searched MEDLINE and the Cochrane Methodology Register Database (up to August 2008) to identify studies that provide data on reasons given by investigators for not publishing studies. References of retrieved articles were also checked for relevant studies. Percentages of specific reasons from individual studies were transformed to log odds and pooled using random-effects model.
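The pooling step described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code. The abstract does not name the random-effects estimator, so the DerSimonian-Laird estimator is assumed here. The function name and the example counts are hypothetical.

```python
import math


def pool_proportions_random_effects(events, totals):
    """Pool proportions on the log-odds scale (DerSimonian-Laird, assumed).

    Returns (pooled proportion, lower 95% limit, upper 95% limit).
    Every study needs 0 < events < total.
    """
    logits, variances = [], []
    for e, n in zip(events, totals):
        logits.append(math.log(e / (n - e)))
        # Approximate variance of a log odds.
        variances.append(1.0 / e + 1.0 / (n - e))

    # Fixed-effect step, needed for the heterogeneity statistic Q.
    w = [1.0 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, logits)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, logits))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(logits) - 1)) / c) if c > 0 else 0.0

    # Random-effects step: add the between-study variance to each study.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, logits)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    def expit(x):
        # Back-transform from log odds to a proportion.
        return 1.0 / (1.0 + math.exp(-x))

    return expit(pooled), expit(pooled - 1.96 * se), expit(pooled + 1.96 * se)


# Hypothetical counts for illustration only, not data from the review.
hypothetical_events = [30, 12, 45]
hypothetical_totals = [100, 60, 110]
estimate, lower, upper = pool_proportions_random_effects(
    hypothetical_events, hypothetical_totals
)
```

Working on the log-odds scale keeps the back-transformed confidence limits between 0% and 100%, which is one common reason for transforming percentages before pooling.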

Results Twenty-one studies (published between 1992 and 2006) were included: 5 studies of investigators of protocol cohorts, 11 studies of authors of meeting abstracts, and 5 studies of other or miscellaneous authors. There was significant heterogeneity in results across studies. The main reasons for nonpublication were lack of time or low priority (34.5%; 95% confidence interval [CI], 27.4%-42.3%), results not important enough (19.6%; 95% CI, 12.0%-30.4%), and journal rejection (10.2%; 95% CI, 5.5%-18.2%) (Figure 2). Pooled percentages of specific reasons were similar across different types of empirical studies, except that lack of time or low interest was significantly more common in studies of meeting abstracts (43.1%; 95% CI, 35.9%-50.6%) than in studies of protocol cohorts (23.8%; 95% CI, 15.9%-34.0%) or studies of other authors (20.7%; 95% CI, 7.7%-44.9%). In the 5 studies of meeting abstracts, fear of journal rejection was given as a reason for 23.7% (95% CI, 8.9%-49.6%) of unpublished studies.

Figure 2. Reasons for Not Publishing Studies: Pooled Results of Empirical Studies

Conclusions Main reasons given by investigators for not publishing studies include lack of time or low priority and results being considered not important. Some study results remained unpublished because of journal rejection or anticipated journal rejection.

1School of Medicine, Health Policy and Practice, Faculty of Health, University of East Anglia, Norwich NR4 7TJ, UK, e-mail:; 2Watford General Hospital, Watford, Hertfordshire UK; 3School of Allied Health Professions, Faculty of Health, University of East Anglia, Norwich, UK; 4Department of Health Sciences, University of Leicester, Leicester, UK

Citation Analysis

Bibliometric Analysis of Latin-American Presence in Pediatric Publications: Geographical Distribution and Countries' Impact Factor

Paula Otero, Norma Rossato, Pablo Duran, Fernando Ferrero, Hebe Gonzalez Pena, Susana Rodriguez, and Jose Ceriani Cernadas

Objective To evaluate the participation of non–English-speaking Latin American countries in pediatric journals that have impact factors and are included in MEDLINE.

Design All articles that were published in journals with impact factors and included in the MEDLINE subset "pediatrics" between the years 1998 and 2008 were reviewed. Corresponding author or institution was used to determine the country of origin. The mean impact factor by country was calculated using data from the Journal Citation Reports database. The results obtained were adjusted for each country by population size, funds invested in research and development as a percentage of gross domestic product, and absolute number of researchers using the latest data available. The number of published articles was considered as an index of quantity of research productivity. The mean impact factor of the published articles was considered as an index of quality of research productivity.

Results From 78 pediatric journals with impact factors, a total of 73,295 articles were obtained. Of these articles, 1825 (2.5%) were from the 18 non–English-speaking Latin American countries. Only 7 of 18 countries had more than 20 articles published (Brazil, Argentina, Mexico, Chile, Venezuela, Colombia, and Uruguay) (Table 2). Brazil accounted for the highest number of articles, with 1055 (57.8%), followed by Argentina (17.2%) and Mexico (10.2%). The country with the highest impact factor was Chile with 2.07, followed by Uruguay (1.89), Argentina (1.62), and Brazil (1.46). When adjusted by population size, number of researchers, and research funding according to percentage of gross domestic product, Chile ranked in the highest positions for these indicators.

Table 2. Indicators of Quantity and Quality of Research Productivity of Non–English-Speaking Latin American Countries With More Than 20 Articles Published in Pediatric Journals With Impact Factor Included in MEDLINE Between 1998 and 2008 (N = 1825)

R&D indicates research and development

Conclusion The publication rate of non–English-speaking Latin American countries in pediatric journals with impact factors is low; Brazil ranked highest in number of articles, but when other types of analyses were done, other countries emerged as main producers of information in this discipline.

Departamento de Publicaciones, Sociedad Argentina de Pediatría, Coronel Diaz 1921, (1425) Ciudad Autonoma de Buenos Aires, Argentina, e-mail:

Demographics and Intellectual Market Share of Medical Journals

Victoria Wong1 and Michael Callaham2

Objective Researchers and editors often need ways to identify the relative importance of journals within a discipline. Impact factor is controversial and limited in usefulness. We introduce a method of measuring a journal's relative importance within its field.

Design The 2007 Journal Citation Reports database was used to identify the proportion of total citations contributed by each of the medical journals within various medical categories defined by the Institute for Scientific Information (ISI). The relative number of total citations for each journal was compared to other journals within the same medical specialty. Results from the 2000 Journal Citation Reports were compared to those from 2007.

Results The number of journals that dominate the medical literature based on their total number of citations varies markedly among different medical subspecialties. In the field of general medicine, for example, 2 journals make up the top 42% of total citations (a pattern also common in many other disciplines). In comparison, 13 journals make up the top 42% of total citations in the field of surgery. The 2000 Journal Citation Reports data showed similar results. Citations are the "raw data" of impact on the literature but also correlate with the impact factor, a more controversial metric.
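The share calculation behind these counts can be sketched as follows: rank journals by total citations, then count how many of the top-ranked journals are needed to reach a given fraction of the field's citations. The function name, journal names, and citation counts below are hypothetical and serve only to illustrate the arithmetic. They are not Journal Citation Reports data.

```python
def journals_to_reach_share(citations_by_journal, target_share):
    """Return the fewest top-cited journals whose combined share of the
    field's total citations reaches target_share (a fraction from 0 to 1)."""
    total = sum(citations_by_journal.values())
    ranked = sorted(
        citations_by_journal.items(), key=lambda item: item[1], reverse=True
    )
    running = 0
    selected = []
    for name, count in ranked:
        running += count
        selected.append(name)
        if running / total >= target_share:
            break
    return selected


# Hypothetical field with made-up citation totals.
hypothetical_field = {
    "Journal A": 500,
    "Journal B": 340,
    "Journal C": 100,
    "Journal D": 60,
}
dominant = journals_to_reach_share(hypothetical_field, 0.42)
broader = journals_to_reach_share(hypothetical_field, 0.80)
```

In this made-up field a single journal already exceeds a 42% share, whereas 2 journals are needed to pass 80%. Applying the same count to each specialty gives the kind of between-field comparison reported above.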

Conclusions We report a simple method of quantifying a journal's "intellectual market share" within disciplines. This is a useful research tool for selecting journal segments for study or educational efforts (eg, large dominant journals vs small "niche" ones). In addition, editors can measure their journals' relative importance within their field. By tracking a journal's share over time, researchers can measure a journal's citation growth and relative momentum.

1University of California, Davis, Department of Neurology, 4860 Y St, Suite 3700, Sacramento, CA 95817, USA, e-mail:; 2University of California, San Francisco, San Francisco, CA, USA

Conflicts of Interest

Sponsorship of Medical Textbooks by Drug or Device Companies

Andreas Lundh and Peter Gøtzsche

Objective To study whether medical textbooks are sponsored by drug or device companies and, if so, whether the sponsors have tried to influence the books' contents.

Design Cross-sectional study of the medical textbooks written in Danish that are available for the pregraduate clinical courses at the University of Copenhagen and anonymous online survey of editors. For sponsored books, we also contacted the authors.

Results Ten of 71 medical textbooks listed 1 or more drug or device companies as sponsors, and 1 textbook listed none but was nevertheless sponsored, which we discovered by chance. Thus, 11 books (15%) were sponsored. We contacted 11 editors, and for 8 books that had authors who were not editors, we contacted 1 author. Ten of the editors and 5 authors replied. In 2 cases, the editors had no influence on whether the book should be sponsored, as this was decided by the publisher. One of these editors was contacted 5 times by the various sponsors concerning the content of specific chapters, and in the second case the sponsor had the content of a chapter changed regarding its own drug. Two of the authors noted that they did not know that the book was sponsored. We wrote to the editors of the 60 books that did not appear to be sponsored, and 43 replied; 40 declared that there was no hidden sponsorship, while 3 noted that they did not know.

Conclusions Sponsorship of medical textbooks is not uncommon. We regard industry sponsorship of medical textbooks as unacceptable, as it may lead to lack of academic freedom. Medical students may be particularly vulnerable to commercial influences, as they have had little or no training in commercial biases and generally believe what they read in textbooks.

Nordic Cochrane Centre, Rigshospitalet, Dept 3343, Blegdamsvej 9, DK-2100 Copenhagen, Denmark, e-mail:

Conflict of Interest and Disclosure Policies in Psychiatry and Medicine: A Comparative Study of Peer-Reviewed Journals

Gauri Khurana,1 Schuyler Henderson,2 Garry Walters,3 and Andres Martin4

Objective To characterize conflict of interest (COI) and disclosure policies published in peer-reviewed journals and to determine whether there is a qualitative difference between psychiatric and nonpsychiatric journals.

Methods We examined the 20 highest-ranked peer-reviewed journals in psychiatric and nonpsychiatric journals (based on 2007 impact factor). Using qualitative and quantitative approaches (including a screening instrument developed by the authors), we compared the COI and disclosure policies that appeared in print or journal Web sites through May 2009.

Results All journals published COI/disclosure policies that were accessible in print and online. Eight of the psychiatric journals and none of the nonpsychiatric journals required "complete" (vs "relevant") disclosure, but medical journals tended to provide more detailed information about what could constitute potential conflict and asked for broader potentially relevant funding sources. All psychiatric and 16 of the nonpsychiatric journals published COI statements for each submission. Nine psychiatric journals and 10 nonpsychiatric journals had forms for the authors; the remainder required authors to submit their own disclosures. Three psychiatric and 8 nonpsychiatric journals specified disclosures for editors and reviewers.

Conclusions This preliminary study suggests that it is possible to review journals qualitatively and quantitatively to ascertain COI policies. There are variations in what information journals offer and the clarity of their expectations, and there may be field-dependent differences that affect these variations. Although COI has been a topic of substantial debate in medical fields and in larger society, there are challenges to codifying COI policies and creating standardized approaches. These challenges may reflect ongoing debates about what constitutes a COI, what needs to be disclosed, and who is responsible for disclosing COI. Further study into how journals convey COI policies and how these policies subsequently affect disclosure is warranted.

1Yale University School of Medicine, Department of Psychiatry, 300 George St, Suite 901, New Haven, CT 06511, USA, e-mail: gk109@; 2Columbia University, Child and Adolescent Psychiatry, New York, NY, USA; 3Sydney University, Discipline of Psychological Medicine, Sydney, NSW, Australia; 4Yale University School of Medicine, Child Study Center, New Haven, CT, USA

The Impact of Disclosing Financial Ties in Research and Care: A Systematic Review

Adam Licurse,1 Emma Barber,1 Steven Joffe,2 and Cary Gross1

Objective To review original, quantitative studies on the perceptions of patients, research participants, and journal readers about physicians' and investigators' financial ties (FTs) in clinical care and research.

Design Studies were identified by searching MEDLINE (January 1988-February 2009), Scopus, and Web of Knowledge. All English-language studies containing original, quantitative data on attitudes toward physician or investigator disclosure of FTs were included. FTs were defined as any payments made or gifts given by a company to a physician or researcher, including those directly funding research studies. We screened 6381 citations and retrieved 239 potentially eligible full articles. Of these, 18 studies met our inclusion criteria. Data were synthesized qualitatively.

Results Seven studies assessed patient perceptions of physician FTs. Professional gifts to physicians were viewed as unethical by 18% to 47% of patients, while 14% to 64% of patients believed FTs decrease the quality and increase the cost of care. Nine studies examined FTs in clinical research. In 5 studies, a majority of respondents believed FTs were important to disclose. Among those studies assessing willingness to participate in research, respondents were least willing to participate after a disclosure of researcher equity (18%-33% of patients). Three studies examined the impact of FTs on physicians' evaluation of research evidence. Two showed that the perceived quality of journal articles was significantly lower when FTs were disclosed (P < .05), while in another, approximately 70% of physicians believed FTs biased clinical practice guidelines.

Conclusions Patients believe that FTs influence professional behavior, decrease the quality of care, and increase the cost of care. Research participants believe that FTs are important to disclose. For some, a disclosure of FTs affects their willingness to participate in research studies. Limited data suggest that FTs adversely affect physicians' assessments of the quality of journal manuscripts and of practice guidelines.

1Department of Internal Medicine, Yale Medical School, Int Med-Primary Care, PO Box 208025, New Haven, CT 06520-8025, USA, e-mail: cary.; 2Department of Pediatrics, Harvard Medical School, Boston, MA, USA

Screening Investigator Financial Conflict of Interest: A Checklist for Authors, Editors, and Readers

David Moher,1 Paula Rochon,2 John Hoey,3 An-Wen Chan,4 Lorraine Ferris,5 Joel Lexchin,6-8 Marleen Van Laethem,9,10 Sunila Kalkar,2 Melanie Sekeres,11 Wei Wu,2 and Andrea Gruneir2,12

Objective To develop a simple, comprehensive tool to help investigators identify and report the extent and types of financial conflicts of interest (fCOI) present in currently funded research.

Design Between January 2007 and April 2009, we developed the fCOI Checklist using a 3-phase process (premeeting item generation, consensus meeting, and postmeeting consolidation). The checklist items were initially generated by our research team based primarily on published literature of initiatives that targeted specific aspects of fCOI. When required items were not available from these sources, we created the item. We used a modified Delphi process in which team members and invited external panel members revised the checklist. The reviewers used a 5-point adjectival rating scale (1, least important, to 5, most important) and also provided free-text suggestions to improve the items over 2 sequential checklist iterations. Twenty-eight people, including representatives from journals, law, ethics, and research integrity, participated in the consensus meeting. In the postmeeting phase, the revised checklist was piloted for usability, and a Web-based version of the checklist was created. In the final meeting, the revised fCOI Checklist was finalized for investigators conducting clinical trials.

Results The final fCOI checklist is to be completed by an investigator for an individual study. It contains 4 sections (ie, administrative, study, personal financial, and authorship information). These sections are divided into 6 modules containing 14 items and their related subitems. Different modules within the fCOI checklist should be completed at different transition points over the course of the study, and updated information should be appended to the originally completed fCOI checklist. The checklist can be completed in fewer than 20 minutes.

Conclusions We consider this fCOI Checklist to be a living document. We invite comments and suggestions to improve the checklist and suggestions for adaptations.

1Clinical Epidemiology Methods Centre, Ottawa Health Research Institute, The Ottawa Hospital, General Campus, Critical Care Wing (Eye Institute), 6th Floor, 501 Smyth Rd, Ottawa, Ontario K1H 8L6, Canada, e-mail:; 2Women's College Research Institute at Women's College Hospital, Toronto, Ontario, Canada; 3Queen's University, Kingston, Ontario, Canada; 4Mayo Clinic, Rochester, MN, USA; 5Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada; 6School of Health Policy and Management, York University, Toronto, Ontario, Canada; 7Emergency Department, University Health Network, Toronto, Ontario, Canada; 8Department of Family and Community Medicine, University of Toronto, Toronto, Ontario, Canada; 9Toronto Rehabilitation Institute, Toronto, Ontario, Canada; 10Joint Centre for Bioethics, University of Toronto, Toronto, Ontario, Canada; 11Department of Physiology University of Toronto, Toronto, Ontario, Canada; 12Baycrest Kunin-Lunenfeld Applied Research Unit, Toronto, Ontario, Canada

Data Sharing

Medical Journal Editor Perspectives on Sharing Results Data From Patient-Oriented Research: A WAME Survey

Karmela Krleža-Jerić,1 Ida Sim,2 Ana Marušić,3 Ludovic Reveiz,4 and Carlos Granados5

Objective To assess the perceptions, policies, and practices of medical journal editors regarding various levels of public disclosure of patient-oriented research data.

Design We surveyed editors of member journals of the World Association of Medical Editors (WAME). The 30-item online survey inquired about journal characteristics and editors' views on the public posting of study results datasets including issues of timing, ideal formats, and potential dangers. Initially, 461 WAME members were recruited starting in March 2009 via the WAME listserve. Four reminders followed: 2 via listserve and 2 directly using e-mails extracted from journals' Web sites. We supplemented questionnaire data with journal characteristics gathered from the Web of Knowledge, PubMed, and SHERPA/RoMEO. We used descriptive statistics to examine journals' and editors' characteristics and frequencies of reported perceptions, policies, and practices.

Results As of June 9, 2009, the survey is ongoing. From the original 461 journals, we excluded 131 due to absent or incorrect e-mail addresses. We have now received 102 responses from 89 different journals and 29 countries. Most responders were active editors in chief (47%) or active editors (36%). Most journals are in English (89%) and are indexed in MEDLINE (57%), and 48% had full open access. Thirty-four percent of journals that publish studies involving human participants require trials to be registered prior to inception, while 19% require registration even if they are registered retrospectively in a trial registry approved by the International Committee of Medical Journal Editors or the World Health Organization. Only 7/89 (8%) and 2/89 (2%) journals currently require summary level data and participant level data, respectively. Only 19% of journals require authors to specify their data-sharing plan. We extracted journal characteristics from citation databases for 105 randomly selected WAME member journals. Eighty-six percent of these journals have online publication, 44% have full open access; median impact factor is 2.75 among journals that reported such data.

Conclusions This survey is limited by the moderate response rate, which is due to uncertainty about the actual numbers of active members and journals on the listserve and also whether they publish clinical research. Nevertheless, the findings of this survey may be used to develop results reporting policies by WAME and other organizations and to inform the development of international standards for public disclosure of trial results.

1Canadian Institutes of Health Research, 160 Elgin St, Ottawa, Ontario K1A 0W9, Canada, e-mail: karmela.krleza-jeric@cihr-irsc.; 2University of California San Francisco, San Francisco, CA, USA; 3Croatian Medical Journal and Croatian Centre for Global Health, School of Medicine, University of Split, Split, Croatia; 4Cochrane Collaboration Branch, Research Institute, Sanitas University Foundation, Bogota, Colombia; 5Research Institute, National University of Colombia, Bogota, Colombia

Editorial Decision Process

Consistency of Decision Making by Editors: The Relation Between Reviewers' and Editors' Ratings and Future (10 Years) Citation

Tobias Opthof1,2 and Ruben Coronel1

Objective To test the consistency of editorial decisions and to assess the relation between reviewers' and editors' ratings and citations.

Design The editors of Cardiovascular Research performed an analysis of reviewers' and editors' ratings of 169 original manuscripts consecutively submitted between October and December 1997. First, each editor (7) rated these manuscripts, leading to a combined editors' score (range, 0%-100%). Next, reviewers' reports (3) led to a reviewers' priority rating of 0% (3 low), 33% (2 low, 1 high), 67% (2 high, 1 low), or 100% (3 high). All ratings were compared and associated with citations obtained during 10 years. The decisions of 1 specific editor on 21 selected, nonrejected manuscripts were redone in a blinded manner by all editors 2 months later (without consequence for the manuscripts), including the editor who had made the original decision.

Results From 169 manuscripts, 56 (33%) were published in 1998-1999 (53/3). The same editor who had previously decided to accept the manuscript decided to reject it 33% of the time when he considered the manuscript again based on the same materials (67% for the team decision by majority vote). The editor's ratings had a very weak relation with the reviewer's ratings. Neither the reviewer's ratings nor the editor's ratings were significantly correlated with 10 years' citations. Only manuscripts with both an editor's ratings >50% and reviewer's ratings ≥67% were cited more than the other manuscripts (29.5 ± 5.5, n = 30, vs 15.0 ± 1.9, n = 26; P < .025).

Conclusions Individual editors' decisions are far from consistent, team decisions comply poorly with individual editors' decisions, editors' ratings do not predict reviewers' ratings, neither reviewers' ratings nor editors' ratings predict citation, and combined reviewers' and editors' ratings poorly predict future citation.

1Department of Experimental Cardiology, Center for Heart Failure Research, Academic Medical Center, Meibergdreef 9, Room K2-105, 1105 AZ Amsterdam, the Netherlands, e-mail:; 2Department of Medical Physiology, University Medical Center, Utrecht, the Netherlands

Evaluation of Editors' Judgment on Quality of Articles

Heidi Logothetti,1 Sheryl Martin,1 Rebecca Benner,1 James Scott,1 John Queenan,1 and Catherine Spong2

Objective To evaluate whether the best articles as judged by the editors of a peer-reviewed medical specialty journal (Obstetrics & Gynecology) correspond with the best articles as defined by the highest number of citations and online accesses.

Design Every year, the editors of Obstetrics & Gynecology nominate 12 to 15 original research articles in the journal that originate from a US-based institution for an award (the Pitkin Award). Nominees are selected for their scientific merit, importance to the specialty, study design and methodology, presentation of results, soundness of conclusions, and writing style. An independent committee selects the 4 to 5 best papers from this group. In this case-control study, nominated articles from 2002 to 2007 were matched with the next consecutive journal article eligible for the award but not nominated (controls). The number of citations and online accesses was compared between awardees and controls, and between nominees (including awardees) and controls, using Wilcoxon signed-rank tests. Citation data were obtained from the Web of Knowledge Journal Citation Reports.

Results The 25 award-winning articles published between 2002 and 2007 received more citations when compared to the matched controls (median [range], 17 [0-137], compared to 7 [0-52], P = .02). The online accesses were similar between the 2 groups (4675 [1655-10226] compared to 4368 [888-11621], P = .31, respectively). The 119 nominees received more citations (15 [0-170] compared to 11 [0-62], P < .001) and online accesses (4737 [0-15776] compared to 3671 [0-14042], P < .001) than their matched controls.

Conclusions Articles subjectively judged by the editors of Obstetrics & Gynecology to be the best performed well by 2 objective standards, indicating that the editors accurately assessed the quality of manuscripts. Such ability is critical to journal editors, who decide which studies are published and disseminated.

1Obstetrics & Gynecology, 409 12th St SW, Washington, DC 20024-2188, USA, e-mail:; 2Pregnancy and Perinatology Branch, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, MD, USA

Subjectivity and the Science of Peer Review: What Qualitative Research Can Tell Us About Editorial Processes at Major Medical Journals

Wendy Lipworth

Objective Editorial and peer review of manuscripts has recently become a popular and important subject of academic research. While some qualitative research has been conducted, most of this research is quantitative. Many important insights have been derived from quantitative research, but qualitative methods may be better suited to understanding the nuances and complexities of open systems such as manuscript review.

Design Qualitative methods, based on grounded theory, were used to carry out an in-depth and inductive analysis of the manuscript review process. Data sources consisted of (1) in-depth, open-ended interviews with journal editors, peer reviewers, and authors from a range of international journals and (2) written peer review documents and editorial deliberations from the Lancet (chosen on the grounds that it is a "model" journal). This combination of sources was felt to provide a rich and detailed insight into the manuscript review process across medical publishing.

Results Despite efforts to ensure that manuscript review is scientific, the review process is characterized by (1) complex negotiations of epistemic authority; (2) dynamic, shifting, and contextually specific relationships of power and vulnerability; (3) reciprocal moral responsibilities; and (4) judgments that are unavoidably both prejudiced (in the neutral sense of the term) and strongly intuitive.

Conclusions Qualitative research into manuscript review is an important adjunct to quantitative studies. The results of this qualitative study have the potential both to challenge existing editorial assumptions, attitudes, and practices and to inform further quantitative research into the manuscript review process.

Centre for Values, Ethics and the Law in Medicine, Medical Foundation Building (K25), University of Sydney, Sydney NSW 2006, Australia, e-mail:

Ethical Concerns

Informed Consent in Clinical Studies Published in the Chinese Medical Journal: Comparison of 2001-2004 and 2005-2008

Mouyue WANG

Objective To investigate the reporting of informed consent in clinical studies published in the Chinese Medical Journal and to compare it between 2001-2004 and 2005-2008.

Design The Chinese Medical Journal is a top medical journal in China published in English by the Chinese Medical Association. This survey aimed to evaluate the status and changes of reporting informed consent in the journal from 2001-2004 to 2005-2008. All full-text (with abstracts) original articles on clinical studies related to diseases' diagnosis, treatment, and prognosis published in the journal during these periods were included. Animal and in vitro studies as well as non–full text articles were excluded. Abstract, introduction, and methods sections were reviewed twice for each paper.

Results A total of 235 papers published in 2001-2004 were investigated, of which 34 (14.5%) reported having obtained informed consent from participants or their guardians. Of the papers published in 2005-2008, 30.7% (92/300) reported having obtained informed consent, a significant improvement between the 2 periods (Χ2 = 12.13, P < .001). During 2001-2004, informed consent was reported in 43.5% (27/62) of prospective studies and 4.0% (7/173) of retrospective studies, a significant difference (Χ2 = 37.73, P < .001); the corresponding proportions during 2005-2008 were 67.1% (51/76) and 18.3% (41/224), also a significant difference (Χ2 = 29.13, P < .001). Over 2001-2008 as a whole, informed consent was reported in 56.5% (78/138) of prospective studies, significantly more than in retrospective studies (12.1%, 48/397; Χ2 = 60.45, P < .001).

Conclusions The proportion of published clinical studies in the Chinese Medical Journal that reported informed consent was low in both 2001-2004 and 2005-2008, although it improved significantly in 2005-2008 compared with 2001-2004. The proportions in prospective studies were significantly higher than those in retrospective studies.

Chinese Medical Journal, Chinese Medical Association, 42 Dongsi Xidajie, Beijing 100710, China, e-mail:

Ethical Concerns of Nursing Journal Reviewers: An International Survey

Marion E. Broome,1 Molly Dougherty,2 Margaret Kearney,3 Margaret Freda,4 and Judith Baggs5

Objective Editors of scientific literature rely heavily on peer reviewers to evaluate the integrity of research conduct and validity of findings in manuscript submissions. The purpose of this study was to describe the ethical concerns of reviewers for nursing journals.

Design This descriptive cross-sectional study was an anonymous online survey. The findings reported here were part of a larger investigation of experiences of reviewers. Fifty-two editors of nursing journals (6 outside the United States) agreed to invite their review panels to participate. A 69-item forced-choice and open-ended item survey developed by the authors based on the literature was pilot tested with 18 reviewers before being entered into an online survey program. A total of 1675 reviewers responded with useable surveys. Ninety-one percent of the respondents were women, and 74% from the United States; the remaining 26% represented 44 different countries. Six questions elicited responses about ethical issues, such as conflict of interest, protection of human subjects, plagiarism, duplicate publication, and misrepresentation of data. Reviewers indicated whether they had experienced such a concern and notified the editor, how satisfied they were with the outcome, and provided specific examples.

Results Table 3 presents the findings related to the concerns identified and their outcomes. Approximately 20% of reviewers had experience with various ethical dilemmas. Although the majority reported their concerns to the editor, not all did, and not all were satisfied with the outcomes. The most commonly reported concern was perceived inadequate protection of human subjects. The least common was plagiarism, but this concern was the one most often reported to the editor and least often led to a satisfactory outcome. Qualitative responses at the end of the survey indicate that this lack of satisfaction was most commonly related to the feedback provided by the editor on the resolution.

Table 3. Ethical Issues Identified by Reviewers and Satisfaction With Outcome of Notification

Conclusion The findings from this study suggest several areas of which editors should take note, including follow-up with reviewers when they identify ethical concerns about a manuscript.

1Indiana University School of Nursing, 1111 Middle Dr, Indianapolis, IN 46202, USA, e-mail:; 2University of North Carolina at Chapel Hill, School of Nursing, Hillsborough, NC, USA; 3University of Rochester School of Nursing, Rochester, NY, USA; 4Albert Einstein College of Medicine, Montefiore Medical Center, Bronx, NY, USA; 5Oregon Health Sciences University School of Nursing, Portland, OR, USA

Must Research Using Publicly Available Anonymous Databases Undergo IRB Review? Views From Journals and IRBs

Ingrid Nygaard and Sheryl Martin

Objective According to US federal regulations, research involving publicly available data is exempt from institutional review board (IRB) oversight. Interpretation of this regulation seems inconsistent. Our aim is to describe journal and IRB policies related to whether IRB review is required for such studies.

Design We evaluated all original contributions using publicly available databases published between July 1, 2008, and December 31, 2008, in 6 high-impact, patient-oriented research journals read by obstetrician-gynecologists to determine whether methods sections specifically described IRB exemption or approval. We excluded meta-analyses, decision analyses, databases requiring approval for use, and public health surveillance summary data. We accessed Web sites for all IRBs associated with the 126 accredited MD-granting US medical schools to describe local policies. We examined Web sites for top 10 obstetric/gynecologic journals according to impact factor and 5 high-impact general journals read by obstetrician-gynecologists to determine journals' IRB policies.

Results Of 447 original research studies, 19 met inclusion criteria. Authors noted that 11 were reviewed by the IRB and 8 were not. Of those reviewed, 7 were considered exempt. Of IRB Web sites, 93 (74%) stated that only the IRB could determine exemption, and 28 (22%) gave investigators or departmental leaders this authority. We were unable to access 5 Web sites. Four journal Web sites clearly stated that authors must document in all manuscripts formal IRB exemption or approval, and 1 required a separate statement of same. Authors were instructed in 4 journals to document IRB status in manuscripts "if applicable" and in 2 journals that ethical consent was expected. Four journals did not mention IRB in instructions.

Conclusions US federal regulations do not specify which body is responsible for determining research exemption. Institutional review board policies are not uniform and continue to be in flux. Journals should strive for improved consistency and transparency in their requirements of IRB oversight for research using publicly available databases.

University of Utah School of Medicine, Obstetrics & Gynecology, 30 N 1900 E, Salt Lake City, UT 84108, USA, e-mail:

Ethics Committee Review Requirements for Published Abstracts of Human Research Presented at Major Medical Association Meetings

Rachel N. Simmons1 and Robert P. Dellavalle2,3

Objective To examine ethics committee (eg, institutional review board [IRB]) review requirements for human research presented at major medical society meetings and publication of meeting abstracts reporting human research in affiliated medical journals.

Design For this descriptive study, we compiled a list of the 100 medical journals with the highest measures of SCImago Journal Rank (SJR). From this list, we identified journals affiliated with medical societies. The sample represented a variety of specialties, the majority being widely recognized US medical societies. Between January 5, 2009, and February 20, 2009, the Web sites of each medical journal and affiliated society were accessed and the Information for Authors or Abstract Submission Guidelines sections were digitally saved and examined. If information on the Web site was unavailable, the authors contacted the journal or society to clarify ethics committee approval requirements.

Results Ethics committee approval requirements for human research submitted to academic journals exceeded those of abstracts submitted to affiliated medical society meetings (100% [27/27] vs 37% [10/27] required approval). Twelve journals or their supplements (44%) published abstracts of research presented at the medical society meetings; ethics committee approval was not required by a majority of medical societies prior to the publication of meeting abstracts describing human research in affiliated journals (58% [7/12]). None of the journals or societies in the study required documentation of ethics committee approval prior to publication (Table 4).

Conclusions Although all of the medical journals in the study required ethics committee approval for manuscripts describing research on human subjects, a loophole exists regarding abstracts from meetings. Many academic societies did not explicitly require ethics committee approval for human research presentations at society meetings and some of these abstracts were published in affiliated journals. Ethical committee review should be confirmed for all human research presentations at medical meetings and resulting published abstracts.

Table 4. Journals Publishing Meeting Abstracts of Human Research Without Requiring Ethics Committee Review

a Meeting abstracts are printed in a supplement to the journal.

1University of Florida College of Medicine, Gainesville, FL, USA; 2Department of Veterans Affairs Medical Center, Denver, CO, USA; 3Department of Dermatology, University of Colorado Denver, School of Medicine, PO Box 6510, Mail Stop F703, Aurora, CO 80045-0510, USA, e-mail:

Funding/Grant Peer Review

Quality Assurance of Grant Applications: Confirmation of Publication Records

Liza Chan,1,2 Kathryn Graham,1 Pam Valentine,1 and Jacques Magnan1

Objective There is a scarcity of literature on the integrity of publication records submitted in grant applications. This study aimed to determine the prevalence and extent of observed discrepancies in the publication lists of grant applications at the Alberta Heritage Foundation of Medical Research (AHFMR), a provincial health research funding agency.

Design A retrospective review of all applications in the 2007 AHFMR investigator awards competition was conducted. Applicants' self-reported peer-reviewed publications were examined against PubMed data following an in-house protocol. The entire publication list was checked for all applications in the junior award categories. For the senior award applications, which often had extensive publication listings, the most recent 20 publications or publications from the last 5 years (whichever had the smaller number) were checked. Types and number of discrepancies (eg, authorship, titles) were documented, classified, and analyzed.

Results Among the 125 applications, 29 (23%) had no discrepancies identified. A total of 376 discrepancies were detected within the 1928 publications checked. The most common discrepancies detected were (1) different article titles (177/376, 47%), (2) omission of coauthors (92/376, 24%), and (3) changed authorship order (69/376, 18%). There were no nonexistent publications nor false claims of authorship observed.

Conclusions This study reveals the types and prevalence of discrepancies in self-reported publication records among grant applications at AHFMR in 2007. The results of this study have led to the development of a quality assurance protocol within AHFMR that includes the addition of an integrity statement in the application form requiring signatures of the applicant and the sponsoring institutions. Data gathering has started in subsequent competition years for comparison. Funding agencies are challenged to implement quality assurance systems to verify the integrity of information collected and ensure that these systems are both feasible and cost effective.

1Heritage Foundation for Medical Research, 1500, 10104-103 Ave NW, Edmonton, Alberta T5J 4A7, Canada, e-mail:; 2John W. Scott Health Sciences Library, University of Alberta, Edmonton, Alberta, Canada

Instructions for Authors

Completeness of Journal Instructions for Authors

Jane C. Wiggs

Objective Authors, authors' editors, and manuscript preparers rely on journals' Instructions for Authors to be current and complete. Most Instructions for Authors include details of manuscript submission and format. The purpose of this study was to determine whether high-quality (high-impact-factor [IF]) journals have high-quality (comprehensive) Instructions for Authors, with content beyond submission and format guidance.

Design Ten criteria were sought in Instructions for Authors of the highest-IF clinical journals publishing original research according to the 2007 Journal Citation Reports (JCR). The qualifying journal with the highest IF in each clinically oriented category was included. Criteria sought were reference in the Instructions for Authors to (1) the journal's peer review process, (2) clinical trial registration, (3) open access (free or paid), (4) public access (in response to the National Institutes of Health mandate), (5) the sponsor's role in the study, (6) authors' access to data, (7) Web site of the International Committee of Medical Journal Editors (ICMJE) for policy (not style) matters, (8) figure integrity or image manipulation, (9) racial or sex bias in subject selection, and (10) detailed author contributions for publication. The main outcome measures were the mean number and the frequency of criteria met.

Results Included in the study were 46 journals from 51 JCR categories (4 journals had the highest IF in more than 1 category). Impact factors ranged from 2.217 to 52.589. The mean number of criteria met was 3.9 (median, 4; mode, 4; range, 1-9). Criteria were met as follows: the journal's peer-review process (n = 37), public access mandate (n = 31), open access (n = 25), clinical trial registration (n = 22), ICMJE Web site for policy (n = 19), figure integrity or image manipulation (n = 13), detailed author contributions for publication (n = 11), sponsor's role (n = 10), authors' access to data (n = 7), and racial or sex bias (n = 4).
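As a quick arithmetic check on the reported mean, the per-criterion counts above can be summed and divided by the 46 included journals. The following Python sketch uses only the counts given in the Results; the dictionary keys are shorthand labels introduced here for illustration and are not terms from the study.

```python
# Number of journals (out of 46) meeting each criterion, as reported above.
# The short keys are illustrative labels, not names used by the author.
criteria_met = {
    "peer_review_process": 37,
    "public_access_mandate": 31,
    "open_access": 25,
    "clinical_trial_registration": 22,
    "icmje_policy_website": 19,
    "figure_integrity": 13,
    "author_contributions": 11,
    "sponsor_role": 10,
    "author_access_to_data": 7,
    "racial_or_sex_bias": 4,
}
N_JOURNALS = 46


def mean_criteria_per_journal(counts, n_journals):
    """Average number of criteria met per journal."""
    return sum(counts.values()) / n_journals


total_hits = sum(criteria_met.values())
mean_met = mean_criteria_per_journal(criteria_met, N_JOURNALS)
print(f"total criterion hits: {total_hits}")
print(f"mean criteria met per journal: {mean_met:.1f}")
```

The counts sum to 179 across 46 journals, which agrees with the reported mean of 3.9 criteria per journal.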

Conclusion Many journals, irrespective of quality, publish Instructions for Authors that lack important information.

Mayo Clinic, Section of Scientific Publications, 4500 San Pablo Rd, Jacksonville, FL 32224, USA, e-mail:

High-Impact-Factor Journals Offer Limited Guidance to Authors Reporting Survey Research

Sara Khangura, Carol Bennett, Jamie Brehaut, Jeremy Grimshaw, David Moher, and Beth Potter

Objective Reporting guidelines, defined as a checklist, flow diagram, or explicit text to guide authors who are reporting a specific type of research using explicit methodology, have been developed to inform reporting for a variety of study designs. As part of a broader review of the literature to identify whether reporting guidelines exist for survey research, our aim was to examine the extent to which leading medical journals provide guidance for reporting survey research.

Design We examined Instructions to Authors Web pages between January 12 and February 9, 2009, for the top 5 journals (by impact factor) from 33 medical specialties identified through Web of Knowledge. All text containing the search terms survey, questionnaire, response rate, and nonresponder was extracted. Web pages were also hand-searched for reference to reporting guidelines for any study design. Additionally, we used PubMed to verify whether or not the journals publish survey research articles.

Results Of 165 high-impact journals identified, 83% (137/165) publish survey research articles. Ten percent (17/165) of the Instructions to Authors Web pages contained 1 or more of the search terms. Four percent (7/165) contained 1 or more search term(s) only; that is, no guidance or directive(s). Another 4% (7/165) contained 1 brief statement, directive, or reference(s) relevant to reporting survey research. A further 2% (3/165) contained more than 1 directive relevant to reporting survey research. While many (95/165) of the journals reference at least 1 reporting guideline for other types of study designs, none refer to a reporting guideline for survey research.

Conclusions The majority of high-impact-factor journals publish survey research. Most Instructions to Authors for these journals neither offer guidance nor refer to guidelines for reporting survey research.

Ottawa Health Research Institute, Clinical Epidemiology Program, ASB Box 693, 1053 Carling Ave, Ottawa, Ontario K1Y 4E9, Canada, e-mail:

Open Access

Impact of Free Access of Articles on Content and Impact Factor of an Indian Biomedical Journal

Anju Sharma and K. Satyanarayana

Objective To assess the impact of free access of articles on the content and impact factor (IF) of the Indian Journal of Medical Research (IJMR).

Design The study period was divided into 2 periods: pre–open access (2000-2003) and post–open access (2004-2008). The parameters compared were IF, articles submitted, published year, international contribution, and subscription/royalty. During the study, the standard procedures for submission and peer review remained the same.

Results There was an increase in the number of manuscripts submitted post-2004 when the full text of the journal was made available for free online. The increase in submission (base 2000) was 94% to 182% in 2005-2008. Similarly, international contributions rose from none in 2000 to 13% in 2005 and 28% in 2008. The reviewers' base also became strong and international. Up to 2003 the reviewers' base was largely Indian, and in 2008, 58% of reviewers were from countries other than India. The IF of the IJMR gradually increased from 0.4 (2003) to 1.67 in 2008 (Table 5). Subscription/royalty remained more or less the same during the 2 study periods.

Table 5. Impact of Free Access of Content on the Overall Improvement of the Indian Journal of Medical Research

Conclusion Making the full text of the journal available for free online appeared to have contributed significantly to improving the quality, content, and outreach of the IJMR, as the review policies and procedures remained the same.

Division of Publication & Information, Indian Journal of Medical Research, Indian Council of Medical Research, V. Ramalingaswami Bhawan, Ansari Nagar, New Delhi 110 029, India, e-mail:

Peer Review

ET Study (Enhancing Transparency of Biomedical Journals)

Erik Cobo,1,2 Agustí Urrutia,2 Albert Selva-O'Callagham,2 Francesc Cardellach,2 Josep Maria Ribera,2 Jordi Cortés,1 Francesc Miras,1 Celestino Rey-Joly,2 and Miquel Vilardell2

Objective The aim of reporting guidelines (RG) is to enhance the quality and transparency of health research, which may be achieved by including a senior statistician who asks authors to provide information about incomplete or missing RG items. The objective of this study was to investigate the effect of additional review using RG checklists on the quality of manuscripts published in a weekly medical journal with 1.3 impact factor and no specific requirements to follow RG.

Design A masked randomized trial of original research manuscripts conditionally accepted for publication in Medicina Clinica after standard peer review. Half the manuscripts received an extra review performed by a senior statistician. The primary end point was a change from initial to final Goodman quality rating as rated by 3 statisticians who were masked to the randomized group. Allocation concealment: the editorial committee decision was made after peer review was performed and before randomization without information about the additional RG review. Later decisions were aware of the additional RG review only in the treated papers. Random allocation: minimization of differences in initial overall Goodman quality and study type (4 groups: intervention, longitudinal, transversal, and other). The sample size calculation indicated that 50 papers per group allowed 80% power to detect a difference of means equivalent to 55% of the change standard deviation.
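The sample size statement can be illustrated with a standard normal-approximation power calculation for comparing two means. This is a sketch under stated assumptions rather than the authors' calculation: the abstract does not give the significance level or the exact method, so a two-sided 5% test is assumed here, and the function names are introduced for illustration.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def two_sample_power(effect_size, n_per_group, z_alpha=1.959964):
    """Approximate power of a two-sided, two-sample comparison of means.

    effect_size is the difference in means divided by the standard deviation.
    z_alpha is the critical value for a two-sided 5% test (an assumption;
    the abstract does not state the significance level).
    """
    noncentrality = effect_size * math.sqrt(n_per_group / 2.0)
    upper_tail = normal_cdf(noncentrality - z_alpha)
    lower_tail = normal_cdf(-noncentrality - z_alpha)
    return upper_tail + lower_tail


# 50 papers per group, difference equal to 55% of the standard deviation.
power = two_sample_power(effect_size=0.55, n_per_group=50)
print(f"approximate power: {power:.2f}")
```

Under these assumptions the approximation comes out close to the 80% power stated in the design; small differences would be expected if the authors used a t-based or otherwise adjusted method.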

Results From May 2008 to April 2009, 126 consecutive papers included the extra RG review. Of these, 34 were rejected on the basis of the conventional review and 92 were randomized (for 2 papers the final version was not sent within the scheduled time). Among the extra review and conventional groups, 44.9% (22/49) and 19.5% (8/41) of papers, respectively, were improved from baseline in Goodman overall quality score (OR, 3.32; 95% confidence interval [CI], 1.19-10.07), but the main analysis failed to show a significant difference in the means (0.26; 95% CI, −0.08 to 0.61; SD = 1) (Figure 3).

Figure 3. Effect Size by Study Type

Forest plot of the extra review effect size (mean difference between treated and control groups) on the overall quality item (1 to 5) in Goodman scale and on the average of all valid Goodman questions.

Conclusion There is some evidence that the extra review improves paper quality, although its effect size seems to be moderate and smaller than hypothesized.

1Department of Statistics and Operations Research, UPC, Jordi Girona 1-5 (C5), 08034 Barcelona, Spain, e-mail:; 2Editorial Committee, Medicina Clinica, Barcelona, Spain

Diversity and Quality of Reviews for a Generalist Journal

William Phillips,1 Robin Gotler,2 Laura McLellan,2 and Stephen Zyzanski2

Objective General medical journals must evaluate research with a wide variety of topics and methods. Diverse expertise and viewpoints can be valuable, but it is unknown if individuals from other fields can perform quality reviews. We compared editor assessments of reviews done by reviewers in diverse fields with those done by experts in the journal's core field.

Design Retrospective study of all reviews done for Annals of Family Medicine in 2006-2008. Editors contemporaneously made global subjective assessments of each review on a 5-point Likert-type scale: 5 = excellent, 4 = good, 3 = average, 2 = poor, 1 = unacceptable. Reviewers identified themselves in 1 primary field. We grouped fields by how closely they were related to the core audience of the journal, family medicine/general practice (FP/GP). Nonmedical reviewers included fields such as anthropology, business, communications, economics, education, English, ethics, informatics, law, library science, and social sciences. Other reviewer groups comprised the following categories: Other Primary Care Physicians (pediatrics, internal medicine), Physician Specialties, Non-Physician Health Professionals, and Public Health Sciences. Our primary outcome was proportion of reviews rated good or excellent (4 or 5). We tested differences with Χ2 tests for proportions and t test for means.

Results Over 3 years, 767 reviewers returned 981 reviews on 563 manuscripts. For the 959 reviews rated by editors, 70% (672/959) were rated good/excellent (95% confidence interval [CI], 0.67-0.73), mean, 3.91 (95% CI, 3.84-3.97), standard deviation (SD), 1.05, median, 4, range, 1-5. FP/GP reviewers returned 70% good/excellent reviews (95% CI, 0.66-0.73), mean, 3.91. Nonmedical reviewers returned 70% (46/66) good/excellent reviews (95% CI, 0.58-0.80), mean, 3.94 (Χ2 P = .65, t test P = .81). Other groups returned reviews of very similar quality (Table 6).

Table 6. Review Ratings by Reviewer Group

a Modified Wald method.

b Percentage of family medicine and nonmedical reviewers not significantly different by Χ2 statistic (P = .65).

c Means of family medicine and non-medical reviewers not significantly different by 2-tailed unpaired t test (P = .81).
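The modified Wald method named in footnote a can be illustrated with a short calculation. The sketch below assumes that "modified Wald" refers to the Agresti-Coull adjustment, which adds pseudo-observations before applying the usual Wald formula; the function name is introduced here for illustration, and the input is the overall count of good/excellent reviews reported in the Results (672 of 959).

```python
import math


def modified_wald_ci(successes, n, z=1.96):
    """Modified Wald (Agresti-Coull) confidence interval for a proportion.

    Adds z^2/2 pseudo-successes and z^2 pseudo-trials before applying the
    usual Wald formula. z=1.96 corresponds to a 95% interval.
    """
    n_adj = n + z ** 2
    p_adj = (successes + z ** 2 / 2.0) / n_adj
    half_width = z * math.sqrt(p_adj * (1.0 - p_adj) / n_adj)
    return max(0.0, p_adj - half_width), min(1.0, p_adj + half_width)


# Overall proportion of reviews rated good/excellent: 672 of 959.
low, high = modified_wald_ci(672, 959)
print(f"95% CI: {low:.2f}-{high:.2f}")  # prints "95% CI: 0.67-0.73"
```

For this input the interval matches the 0.67-0.73 reported for all rated reviews. The same function can be applied to the other reviewer groups, where the smaller counts produce wider intervals.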

Conclusions Reviewers from a wide variety of fields return high-quality reviews, and these reviews are not different from those by experts in the core professional group. Diversity of reviewers can add special perspectives to a generalist medical journal without compromising quality of reviews.

1Department of Family Medicine, Box 356390, University of Washington, Seattle, WA 98195, USA, e-mail:; 2Department of Family Medicine, Case Western Reserve University, Cleveland, OH, USA

Implementation of a Medical Journal Peer Reviewer Stratification System Based on Quality and Reliability

Steven Green1 and Michael Callaham2

Objective Prior to starting this study 6 years ago, Annals of Emergency Medicine had a large reviewer pool (N = 989) whose members demonstrated substantial variability in quality and reliability. We hypothesized that a tiered, dynamic reviewer stratification system might enable our journal editors to target the bulk of their review invitations to our better reviewers and thus improve our efficiency.

Design In 2003 we instituted a 3-tiered hierarchical classification and stratified our peer reviewers based on predefined criteria for reviewer quality (ie, average review score) and reliability (eg, response to review invitations, on-time reviews). Using our manuscript management software, our editors could then target the bulk of their review invitations to the top performance tier, which constitutes approximately one-fourth of the total. Every 6 months since initiation, a senior editor has analyzed reviewer performance statistics and promoted or demoted individuals within this dynamic classification. Before-and-after measures of global peer-review efficiency were then assessed.

Results We compared 2008 data with data from 2002, the year prior to the system, and found more top-tier reviewer invitations leading to an on-time review (51% vs 37%), shorter review turnaround (median, 10 vs 12 days), fewer late reviews (15% vs 32%), and fewer reviewers not used in a given year (26% vs 59%). Editors have found the system to be simple and easy to use. No serious problems have been identified. We cannot ascertain how much of the observed improvement is due to this reviewer classification system vs other concurrent quality improvement initiatives, including the adoption of our electronic manuscript system a year prior to starting this system.

Conclusion Implementation of a tiered, dynamic system stratifying journal peer reviewers by quality and reliability was readily accomplished by Annals of Emergency Medicine and appears to have improved the efficiency of our peer review.

Loma Linda University, 2160 Veloz Dr, Santa Barbara, CA 93108, USA, e-mail:; University of California San Francisco, San Francisco, CA, USA

Assessment of Reviewers Recommended by Authors vs Editors: Is There Bias?

Monica Helton and William Balistreri

Objectives To test the hypothesis that author-suggested reviewers (ASRs) are more likely than editor-suggested reviewers (ESRs) to (1) accept invitations to review, (2) recommend acceptance of the manuscript on first review, and (3) concur less frequently with the editor's final decision to accept.

Design We retrospectively evaluated the first 300 manuscripts submitted to the Journal of Pediatrics that were assigned consecutive manuscript numbers in 2007; 122 manuscripts did not undergo peer review. For the 188 reviewed manuscripts, we recorded the following: (1) whether the reviewer was suggested by the author or chosen by the editor, (2) the number of ASRs and ESRs who completed reviews, (3) the initial recommendation of the reviewer, and (4) the final decision of the editor. The statistical methods used were the Χ2 and the McNemar test for correlated proportions. Reviewer recommendations for "acceptance" included "accept" and "accept with revisions."

Results Of the reviewers (n = 873) examined, 37.2% of ASRs accepted the invitation to review (167/449) compared with 41.8% of ESRs (177/424) (P = .17). When evaluating reviews, 65.3% of ASRs recommended acceptance (109/167), whereas 54.2% of ESRs recommended acceptance (96/177) (P = .04). Editors agreed with 49.5% (54/109) of the accept recommendations of ASRs (P < .0001) and with 55.2% (53/96) of ESRs (P < .0001).
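The two between-group comparisons above can be checked from the reported counts alone. The sketch below assumes an uncorrected Pearson Χ2 test on a 2 x 2 table (the abstract names the Χ2 test but does not say whether a continuity correction was applied); the function name `pearson_chi2_2x2` is hypothetical. Under that assumption the P values round to the reported .17 and .04.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the 2x2 table [[a, b], [c, d]].

    Assumption: no continuity correction, which the abstract does not specify.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Invitations accepted vs not accepted: ASRs (167 of 449), ESRs (177 of 424).
chi2_invite, p_invite = pearson_chi2_2x2(167, 449 - 167, 177, 424 - 177)

# Acceptance recommended vs not: ASRs (109 of 167), ESRs (96 of 177).
chi2_accept, p_accept = pearson_chi2_2x2(109, 167 - 109, 96, 177 - 96)

print(f"invitation: chi2 = {chi2_invite:.2f}, P = {p_invite:.2f}")
print(f"acceptance: chi2 = {chi2_accept:.2f}, P = {p_accept:.2f}")
```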

Conclusions There is no evidence to suggest that ASRs are more likely than ESRs to accept an invitation to review. However, ASRs are more likely to recommend acceptance of a submitted manuscript. ASRs and ESRs are more likely to recommend acceptance of a manuscript than an editor. Although this could be due to a variety of factors, including a recommendation of acceptance being paired with one or more recommendations of rejection and priority for publication, this emphasizes the peer review motto of "reviewers advise; editors decide."

Journal of Pediatrics, Cincinnati Children's Hospital Medical Center, 3333 Burnet Ave, MLC 3021, Cincinnati, OH 45229, USA, e-mail:

Peer Review in Journals Published by Oxford Journals

Huw Price

Objective Peer review processes vary widely, and the procedures followed by any given journal are often known only to the editorial team directly involved. We surveyed our journals to get a clearer picture of the range and patterns of peer review practices, experiences, and expectations among the journals that we publish and to facilitate further discussion of best practice with, and between, our journal editors.

Table 7. Peer Review Process Survey Results

* The number of journals contacted by discipline was estimated based on Oxford Journals' own classification of all the journals approached. By contrast, returns were assigned to disciplines by the survey respondents themselves.

Design In January 2009 we e-mailed 1 editorial representative from each of 204 journals inviting them to take part in an anonymous online survey. This consisted of 27 multiple-choice questions supplemented by 5 text boxes. The opening question asked respondents to identify which 1 of 6 broad disciplinary categories best described their journals. Subsequent questions addressed various aspects of the peer review process. The survey was intended as a means of quickly gathering approximate answers, rather than necessitating any in-depth research by the respondents.

Results A total of 145 journals (71%) were represented. Some of the more interesting findings are presented in Table 7. All journals employed single-blind or double-blind peer review (some used both). Science and medicine journals showed a markedly higher tendency to use single blinding, to use author suggestions to find reviewers, to rate their reviewers, to share reviewer reports among reviewers, and to notify reviewers of the editor's final decision than did social sciences and humanities journals.

Conclusions These results indicate that a divide between the 2 cultures of humanities and sciences persists and is manifest in patterns of peer review practices. In most, though not all, respects, medical journals are most closely aligned with the sciences, and social science journals align with the humanities.

University of Oxford, Oxford University Press, Great Clarendon St, Oxford OX2 6DP, UK, e-mail:

Quality of Peer Reviews in 3 Nursing Journals From the Perspective of Authors and Editors

Sandra P. Thomas,1 Mona Shattell,2 Peggy Chinn,3 and W. Richard Cowling2

Objective The literature on review quality in the nursing discipline is quite small. The purpose of this study was to examine the quality of peer review from the perspective of authors and editors in 3 scholarly nursing journals: Advances in Nursing Science, Issues in Mental Health Nursing, and Journal of Holistic Nursing. All 3 journals use double-blind peer review and are indexed in PubMed and CINAHL. Quality of peer review, for the purposes of this study, was defined as constructive guidance for authors to further develop their work for publication and for editors to make sound decisions regarding manuscript disposition.

Design A researcher-developed survey instrument was distributed online to all corresponding authors of manuscripts submitted between 2005 and 2007. A total of 319 authors responded (response rate, 69%) under conditions of anonymity. Additionally, one-third of all manuscript reviews completed between 2005 and 2007 (N = 528) were rated by the research team for level of detail, bias, constructive tone, and usefulness to authors in making revisions and for editors in making decisions.

Results A majority (73.8%) of authors agreed that reviews by these journals provided constructive guidance, and 75.6% agreed that reviews provided adequate rationale for editors' decisions. Forty percent of authors reported fewer than 10 submissions to any journal, and Χ2 analysis showed that inexperienced authors perceived review quality less favorably than experienced authors. Critiques of reviews from editorial perspective included insufficient feedback to authors, inconsistency between reviewers' numeric ratings and manuscript disposition recommendations, and occasional evidence of reviewer bias or disrespectful tone.

Conclusions The results do not provide compelling evidence to question the worth of the standard peer review approach. Given the relative inexperience of many nurse authors in manuscript submission, it is incumbent upon reviewers and editors to continue providing clear and complete feedback and guidance.

1University of Tennessee, College of Nursing, 1200 Volunteer Blvd, Knoxville, TN 37996-4180, USA, e-mail:; 2University of North Carolina, Greensboro, NC, USA; 3University of Connecticut School of Nursing, Storrs, CT, USA

Nursing Journal Peer Reviewers' Views on Quality Indicators in Publishing

Molly Dougherty,1 Margaret Freda,2 Margaret Kearney,3 Judith Baggs,4 and Marion Broome5

Objective To analyze how nursing journal peer reviewers (NJPRs) identify contributions to nursing in manuscripts, define their priorities in writing reviews, and use journal impact factors (IFs).

Design A 69-item online survey was completed in 2007 by NJPRs who were invited by 52 editors of nursing journals worldwide. Editors notified their reviewers of the opportunity to participate once. Anonymous responses were unlinked from journal identification and duplicates removed. Descriptive statistics and Χ2 were used to test hypotheses that NJPRs familiar with IFs of nursing journals would differ from those not familiar by nursing credential, US vs other country residence, and variables related to research and clinical involvement.

Results The NJPRs (N = 1675) were from 44 countries with 74% from the United States, and 90% were nurses. The response rate was 44%. They used contribution to knowledge or research evidence (n = 1404, 83.8%), topic of current interest (n = 1153, 68.8%), and newly emerging area (n = 1141, 68.1%) as indicators of a manuscript's contribution to nursing. In writing their reviews, research rigor (n = 889, 53.1%) and clinical relevance (n = 785, 46.9%) were high priorities. Impact factor was familiar to 810 (48.4%), and 467 (27.9%) used IF to choose journals for submission of manuscripts. Those familiar with IFs were significantly more often not nurses, not US residents, involved in research, and reviewed most often for a research journal (Table 8).

Table 8. Sample Characteristics and Familiarity With Impact Factor Χ2 Results (N = 1675)

a Row and column total for Χ2 analyses; nonresponses not included. Note: df = 1; all Χ2, P < .01.

Conclusions When judging a paper's contribution, NJPRs weigh research and clinical interests. Authors in non-US countries receive monetary and other rewards for articles in high-IF journals, but most NJPRs do not use IF and place clinical considerations ahead of IF. Reviewers who are nurses, are US residents, review for clinical journals, or are not involved in research are less familiar with IF.

1University of North Carolina at Chapel Hill, School of Nursing, 3602 Andante Dr, Hillsborough, NC 27278, USA, e-mail:; 2Albert Einstein College of Medicine, Montefiore Medical Center, Bronx, NY, USA; 3University of Rochester School of Nursing, Rochester, NY, USA; 4Oregon Health Sciences University School of Nursing, Portland, OR, USA; 5Indiana University School of Nursing, Indianapolis, IN, USA

Peer Review in Small vs Nonsmall Biomedical Journals

Farrokh Habibzadeh1 and Rob Siebers2

Objective Although some researchers have reported on aspects of the peer review system in "small journals" (SJs), there is no universally accepted definition of an SJ. We hypothesized that the peer review process may be so different in SJs from non-SJs (NSJs) that it can provide a means to differentiate between SJs and NSJs. We therefore conducted this research to study the peer review process in SJs compared to NSJs.

Design On February 25, 2008, members of the World Association of Medical Editors (WAME), Eastern Mediterranean Association of Medical Editors (EMAME), Forum for African Medical Editors (FAME), Council of Science Editors (CSE), European Association of Science Editors (EASE), Asociación Mexicana de Editores de Revistas Biomédicas (AMERBAC), and Asociación de Editores de Revistas Biomédicas Venezolanas (ASEREME) were asked to participate in a cross-sectional questionnaire-based online survey. Based on the size of each of these associations and taking into account the common members, we estimate that almost 1000 editors received the invitation. The editors were asked whether they thought their journal was an SJ and to describe the review system for their journal. The survey ended 4 months later. Categorical variables were examined by Χ2, and the median number of reviewers was compared by Mann-Whitney U test.

Results During the period 115 editors (62 from SJs and 53 from NSJs) from 30 countries representing 56 journals (42 SJs and 14 NSJs) completed the questionnaire. No duplicate responses were found. The average time from submission to decision stated by both groups was 8 to 12 weeks. Both groups believed that the study design and the methodology section are the most important causes of manuscript rejection. Fifty-four (61%) journals used double-blind systems, 31 (35%) single-blind systems, and 3 open review systems. There was no significant difference between SJs and NSJs.

Conclusions SJs are similar to NSJs in terms of most of the peer-review system parameters. However, SJ editors are less likely to use statistical advisors and are more likely to request authors to disclose their conflicts of interest than NSJ editors.

1International Journal of Occupational and Environmental Medicine, PO Box: 71955-575, Shiraz, 71955 Iran, e-mail:; 2Wellington School of Medicine & Health Sciences, Wellington, New Zealand

Effects of Training Reviewers on Quality of Peer Review: A Before-and-After Study


Objective To examine effects of short-term training for the reviewers at a Chinese specialized journal on the quality of peer review.

Design A before-and-after study included 45 reviewers at the Chinese Journal of Tuberculosis and Respiratory Diseases between September 2007 and September 2008. All the reviewers attended a face-to-face training session on peer review for 1 day in February 2008. Three teachers from People's Hospital of Peking University taught the training course. One teacher was a statistician, and the other two teachers were specialists on tuberculosis and respiratory diseases. The training course focused on how to critically appraise research articles and what editors want from reviewers. Three review comments for each reviewer before and after the training were selected randomly and compared. The reviews were evaluated by 1 editorial staff member. The quality of review comments was evaluated by the following criteria: whether the comments were returned on time (4 weeks), were 300 words or more, or indicated 4 or more suggestions such as specific errors, detailed suggestions for improvement, or better references. A standardized review draft list was used in this study.

Results The mean and range of years of experience of the 45 reviewers are 2.8 (1.5-5.1) years. After the short-term training, the time taken to complete a review increased nearly 20%. Instances of review comments being more than 300 words increased by about 40%. Almost all reviewers could identify the specific errors in the manuscripts. More than half of the reviewers could give suggestions to correct the errors. There was also an increase in the number of reviewers who gave better references (Table 9).

Table 9. Differences Before and After Training on Quality of Peer Review

Conclusions Brief training in peer review appears to improve timeliness and quality of review. The long-term effects still need observation.

Publishing House of Chinese Medical Association, Room 424, 42 Dongsi Xidajie, Beijing, 100710 China, e-mail:

Peer Reviewers' Recommendations at the Journal of General Internal Medicine, 2004-2008: Style or Substance?

Richard Kravitz,1 Peter Franks,2 Martha Gerrity,3 Mitchell Feldman,4 Cindy Byrne,5 and William Tierney5,6

Objective To examine factors associated with peer reviewers' initial summary recommendations to reject vs accept or revise submissions to the Journal of General Internal Medicine.

Methods We analyzed 5881 reviews of 2664 manuscripts submitted between 2004 and 2008. These reviews were performed by 2916 reviewers. The dichotomous dependent variable was the reviewer's initial summary recommendation: reject vs accept/revise. Independent variables included review year, number of reviewers per manuscript, total reviews by each reviewer, time taken for review, article type (eg, original article, perspective, clinical vignette), and a deputy editor's rating of review quality (1 [poor] to 6 [excellent]). We used random effects logistic regression analyses to account for nesting of reviews by manuscript and by reviewer and to calculate the intracluster correlation coefficients (ICC) for manuscripts and reviewers.

Results Among 2664 manuscripts sent for peer review, reviewers recommended rejection in 28%. The manuscript-level ICC (between reviewers of each manuscript) was 0.18 (95% confidence interval [CI], 0.13-0.23). With 3 reviews per manuscript, this ICC is equivalent to an alpha reliability coefficient of 0.40 (95% CI, 0.31-0.47). To achieve an alpha of 0.70, 11 (95% CI, 8-16) reviews would be required. The reviewer-level ICC (across manuscripts) was 0.25 (95% CI, 0.20-0.32). Reviewers more often recommended rejection in earlier years (adjusted odds ratio [AOR], 0.91/year; 95% CI, 0.86-0.97) and when submitting more highly rated reviews (AOR, 1.36/quality unit; 95% CI, 1.28-1.44). Other variables were not significant.
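The step from the manuscript-level ICC to the alpha coefficient and to the number of reviews required can be reproduced with the Spearman-Brown formula. The abstract does not name the formula, so its use here is an assumption, and the function names `spearman_brown` and `reviews_needed` are hypothetical; the sketch does, however, return the reported point estimates from the reported ICC of 0.18.

```python
import math


def spearman_brown(icc, k):
    """Reliability of the pooled judgment of k reviewers, given the single-review ICC."""
    return k * icc / (1 + (k - 1) * icc)


def reviews_needed(icc, target):
    """Smallest whole number of reviews whose pooled reliability reaches the target."""
    return math.ceil(target * (1 - icc) / (icc * (1 - target)))


# Manuscript-level ICC reported in the Results.
icc = 0.18
print(round(spearman_brown(icc, 3), 2))  # alpha with 3 reviews per manuscript
print(reviews_needed(icc, 0.70))         # reviews required for alpha = 0.70
```

Feeding the ICC confidence limits (0.13 and 0.23) through the same two functions gives the interval bounds for both quantities.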

Conclusions Between-reviewer agreement on manuscript disposition was modest. Reviewers exhibited greater consistency across manuscripts, suggesting a stable reviewer style (propensity to reject). While the analysis did not include manuscripts rejected by deputy editors without review nor address the consistency of the narrative portion of the reviews, adequate reliability in summary recommendations will require more reviewers per manuscript and/or more standardization in criteria for summary recommendations.

1University of California, Davis, Division of General Internal Medicine, 4150 V St, Suite 2400 PSSB, Sacramento, CA 95817, USA, e-mail:; 2Department of Family and Community Medicine, University of California, Davis, Sacramento, CA, USA; 3Department of Medicine, Oregon Health Sciences University, Portland, OR, USA; 4Division of General Internal Medicine, University of California, San Francisco, CA, USA; 5Regenstrief Institute, Inc, Indianapolis, IN, USA; 6Division of General Internal Medicine, Indiana University Indianapolis, IN, USA

Trends in Using Insecure e-Mail Services in Communication With Journal Editors

Krešimir Šolić,1 Vesna Ilakovac,1 Ana Marušić,2 and Matko Marušić2

Objective Free online e-mail services (eg, Gmail, Yahoo, and Hotmail) are considered to have more security flaws than institutional ones but are widely popular and frequently used. The objective of this study was to analyze the changes in the use of free online e-mail services for correspondence by authors of published papers in a medical journal.

Design Contact information of corresponding authors for all papers published in the Croatian Medical Journal (CMJ) during a 10-year span (1998-2007) were collected from the CMJ electronic archive. Domains of all e-mail addresses were assessed, and contacts were categorized into 4 groups: no e-mail, worldwide available free online e-mail service, free national online e-mail service, and institutional or corporate e-mail address.

Results Of 978 authors, 34 (3.5%) had no e-mail, 563 (57.6%) used institutional or corporate e-mail addresses, 246 (25.2%) used a free national online e-mail service, and 135 (13.8%) used a worldwide available free online e-mail service. The proportion of authors using worldwide available free online e-mail services increased from 7.6% in 1999 to 20.8% in 2007, showing a significant increasing trend (Cochran-Armitage trend test, P = .011). Non-Croatian authors (n = 520, 53%) more often used institutional e-mail addresses than Croatian authors (n = 458, 47%) (Χ2 = 56.1, degrees of freedom = 3, P < .001).

Conclusions There is a significant increasing trend in using worldwide available free online e-mail services in small general medical journals. Authors should be aware that insecure e-mail services may compromise the confidential nature of author-editor communication and should consider institutional e-mail addresses over free ones.

1Department of Biophysics, Medical Statistics and Medical Informatics, School of Medicine, Josip Juraj Strossmayer University, J. Huttlera 4, HR-31000, Osijek, Croatia, e-mail:; 2Croatian Medical Journal and Split University School of Medicine, Split, Croatia

Postpublication Perceptions and Influence

Canadian Physicians' Perceptions About Biomedical Journals

Erica Frank, Carolina Segura, and Ariella Zbar

Objective Those creating major biomedical journals tout their "large audiences" (Journal of the American Medical Association, JAMA), claiming to be "truly the 'must read' for all medical professionals" (New England Journal of Medicine, NEJM), or "accessed by a vast global audience" (British Medical Journal, BMJ). Yet the extent to which these journals are perceived as relevant by rank-and-file physicians is unknown.

Design As part of the Canadian Physicians' Health Study (conducted November 2007-May 2008), we queried a random sample (N = 3013) of Canadian physicians about their agreement with the following statement: "It is clinically important for me to regularly read the major biomedical research journals." We queried on a 5-point Likert scale from strongly agree to strongly disagree. Comparisons were made in SAS, using Χ2 tests.

Results The majority (66%) of Canadian physicians agree or strongly agree that it is clinically important for them to regularly read the major biomedical research journals. This was more true for men than women (68% vs 62%, P < .01), non–family physicians than family physicians (74% vs 55%, P < .0001), for non–Canadian-born vs Canadian physicians (72% vs 63%, P < .0001), for those who attended medical school in another country vs Canada or the US (72% vs 65% vs 61%, P = .001), and for those practicing in the inner city vs urban/suburban vs rural/small town/remote settings (75% vs 66% vs 60%, P < .0001). Especially high agreement was found among those working primarily in research units (85%) or in academic settings (82%).

Conclusions While most academicians and researchers believe that regularly reading major biomedical research journals is important for them, nearly half of Canadian family physicians (who represent about half of Canadian physicians) and rural/small town/remote practitioners do not believe so. Journal editors, publishers, and boards should consider these findings when deciding on content and other publication strategies.

University of British Columbia, School of Population and Public Health, James Mather Building, 5804 Fairview Ave, Vancouver, British Columbia V6T 1Z3, Canada, e-mail:

The Influence of the Impact Factor on Discussions Between Doctors and Their Patients: The Case of Rosiglitazone

Jim Nuovo

Objective To track the influence of an article published in a high-impact journal on discussions between doctors and their patients in response to concerns about potential medication adverse side effects.

Design This was a retrospective analysis of a primary care network's electronic medical record database. From a diabetes registry of 12,246 patients, 369 were identified as taking rosiglitazone prior to the June 14, 2007, publication of an article in the New England Journal of Medicine that suggested an increased risk of myocardial events for patients taking the drug. The entire content of all office visits, telephone messages, and medication lists for each patient was reviewed over a 1-year period subsequent to the article's publication. Doctor/patient discussions regarding concerns about rosiglitazone were cataloged, including the physician's treatment recommendations.

Results There were documented discussions on rosiglitazone's potential adverse effects in 19.2% of this population. All the discussions occurred between June 15 and October 30, 2007. Of this group, 59.2% remained on rosiglitazone. For those advised to continue rosiglitazone, the clinician indicated that he or she wanted more data before determining if the drug was not safe. For those advised to discontinue rosiglitazone, 83.3% were placed on pioglitazone.

Conclusions An article suggesting potential adverse effects of rosiglitazone resulted in a documented discussion in 19.2% of patients on this medication. These findings suggest an awareness of this publication by patients, presumably derived from media reports. However, an awareness of this concern did not result in a substantial change in practice. The majority of patients remained on rosiglitazone. The content of these discussions suggests that most physicians recommended waiting for more published data before considering a change. While many factors influence physicians' prescribing behavior, this study demonstrates how an article in a high-impact journal influences the doctor-patient dialogue.

Department of Family & Community Medicine; 4860 Y St, Suite 2300, Sacramento, CA 95817, USA, e-mail:

Publishing Models

Transformation of the American Journal of Obstetrics & Gynecology From a Traditional Journal to a "New Format"

Thomas J. Garite,1-3 Roberto Romero,4,5 Moon Kim,1,2 Pamela Poppalardo,6 and the Editors of the American Journal of Obstetrics & Gynecology

Objective To describe the transformation and results of a major medical journal in response to diminishing readership and major revenue losses that threatened its existence.

Design The American Journal of Obstetrics & Gynecology (AJOG) is the oldest journal in its specialty with an impact factor ranking it second among general obstetrics/gynecology journals. Subscriptions had dropped from more than 20,000 to 9000 with loss of subscription and advertising revenue. Readership surveys indicated major changes in reading habits of subscribers. In January 2007, AJOG transitioned to a new format and from a subscription-only journal to controlled circulation (free) distribution. The print version was changed to include only a 1500-word summary cowritten by an employee scientific writer and the authors, with the full-length article published online.

Results In the 2 years following implementation of this change, circulation increased to 44,000 while maintaining two-thirds of its paid subscriptions. Publication time was shortened from 9.9 to 6.9 months. Advertising revenue increased by 45%. According to a PERQ/HIC analysis, there was an increase of high readership from 11% to 29% and of average readership by 18% to 59%. Submissions during this time period increased by 20%. The acceptance rate decreased from 29% to 25% with a small (5%) decrease in articles published. The citation rate increased by 20% and is the highest of any journal in women's health. The impact factor improved from 3.0 to 3.5. The journal's Web site activity increased by 70% in both visits per month and page views. A survey of readers and authors revealed that 85% of readers and 71% of authors found the changes very acceptable or acceptable.

Conclusion Major changes in the format and distribution of a 138-year-old journal in response to dramatic changes in reading habits of subscribers resulted in favorable changes in readership, article submissions, and the journal's financial well-being.

1Obstetrics and Gynecology, 33025 Ponderosa Trail, Oak Creek, CO 80467, USA, e-mail:; 2University of California Irvine College of Medicine; 3Pediatrix Medical Group, Sunrise, FL, USA; 4Wayne State University College of Medicine, Detroit, MI, USA; 5Intramural Division, NICHD, NIH, DHHS, Rockville, MD, USA; 6Publishing Director, Elsevier, Inc, New York, NY, USA

A New Journal Mode: Public Peer Review, Open Access on Web 2.0

Xibin SHEN

Objective With the improvement of Web service from Web 1.0 to Web 2.0, some Web 2.0 techniques have enriched information distribution, thereby affecting the academic community. Many academic journals provide new online tools such as interactive Web logs (blogs), bulletin board systems (BBS), and content-sharing sites (Digg) that enable users to be participants rather than information receivers. Users play a principal role in creating, collecting, filtering, organizing, and distributing information materials from different sources.

Design Based on this practice, we designed a new mode for creating a journal based on the concept of Web 2.0, or a journal 2.0, that is publicly peer reviewed and open access. The journal 2.0 is organized, peer reviewed, revised, and published online by users using Web technology including Web 2.0 or other available techniques. In other words, users play a major role in activities involving science publishing. The users upload, manage, vote on, filter, and publish science information on an open platform.

Results Journal 2.0 can feasibly filter out low-quality articles through open peer review. Unlike traditional journals, Journal 2.0 combines open peer review by several specially designated users with public evaluation. Journal 2.0 invites several users to evaluate each manuscript using a form-based point system similar to gymnastics scoring. An original medical article, for instance, can be graded on its relevance and interest (A1), impact (A2), content (A3), originality (A4), and presentation (A5); each item is multiplied by an assigned weight (W). In addition, each user has an academic value (V) that reflects his or her academic level or activities for the journal. The academic score (Ss) from the invited users for each manuscript is the average of each reviewer's weighted item total multiplied by that reviewer's academic value: Ss = ([A1 × W1 + A2 × W2 + … + A5 × W5] × V1 + [A1 × W1 + A2 × W2 + … + A5 × W5] × V2 + … + [A1 × W1 + A2 × W2 + … + A5 × W5] × Vn)/n, where n is the number of invited reviewers and the grades in each bracket are those given by the corresponding reviewer. This score plus the user commentaries will be used for the primary assessment of a given manuscript. Meanwhile, the manuscript is also open to evaluation by the public, whose scores (Sp) mildly adjust the results. In this system, however, the influence of the public scores on the decision can be adjusted by assigning weights to the special (Ws) and public (Wp) scores. For instance, giving the 2 scores equal weights lets the public scores influence the final value as much as the special ones. The final score of a given manuscript is S = Ss × Ws + Sp × Wp. Within the set period, the highest-ranked titles proceed to further steps. An accepted manuscript is returned to the author for revision; the author can invite colleagues or friends on the platform to edit it collaboratively via a wiki. As is done by some traditional journals, published articles will be announced to all users via really simple syndication (RSS) feed, podcast, e-mail alert, etc. In return, each reviewer's academic value will be adjusted according to his or her ratings of manuscripts and other activities (eg, submission, publishing, providing valuable material or information).
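The scoring scheme above can be sketched in a few lines of Python. This is an illustration only, not the author's implementation: the function names, the grade scale, the equal item weights, and all example numbers are assumptions introduced here.

```python
def weighted_item_total(grades, weights):
    """Sum of A_i * W_i over the graded items (A1..A5 in the text)."""
    if len(grades) != len(weights):
        raise ValueError("grades and weights must have the same length")
    return sum(a * w for a, w in zip(grades, weights))


def special_score(reviews, weights):
    """Ss: mean over invited reviewers of (weighted item total * academic value V).

    reviews is a list of (grades, academic_value) pairs, one per reviewer.
    """
    if not reviews:
        raise ValueError("at least one review is required")
    total = sum(weighted_item_total(g, weights) * v for g, v in reviews)
    return total / len(reviews)


def final_score(ss, sp, ws, wp):
    """S = Ss * Ws + Sp * Wp."""
    return ss * ws + sp * wp


# Hypothetical example: 5 items with equal weights, 2 invited reviewers.
item_weights = [0.2, 0.2, 0.2, 0.2, 0.2]          # assumed W1..W5
invited_reviews = [
    ([8, 6, 7, 9, 5], 1.0),                       # reviewer 1 grades, V1
    ([6, 6, 6, 6, 6], 0.5),                       # reviewer 2 grades, V2
]
ss = special_score(invited_reviews, item_weights)
s = final_score(ss, sp=4.0, ws=0.5, wp=0.5)       # assumed public score and weights
print(f"Ss = {ss:.2f}, S = {s:.2f}")
```

Raising Wp relative to Ws in this sketch gives the public score more influence on S, which is the adjustment the abstract describes.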

Conclusion In this open setting, peer review bias will be eliminated because comments are available to everyone, and articles will be more acceptable after being voted on by the majority of users.

Chinese Journal of Internal Medicine, 42# Dongsi Xidajie, Beijing, 100710, China, e-mail:

Publication Pathway

What Happens to Rejected Manuscripts?

Kenneth Noller,1 Sheryl Martin,2 and James Scott2

Objective To determine the frequency of later publication of original research papers rejected by the journal Obstetrics & Gynecology (O&G).

Design The author search function of PubMed was used to identify papers rejected in 2002 by O&G that were later published by any of the authors in another journal. The abstract submitted to O&G was compared to that of similar publications, and a numerical value was assigned using 4 variables: authors' names, title, number of study participants, and abstract text. Each variable was coded as exact, close, or dissimilar based on predetermined criteria.

Results A total of 194 of 318 (61%) rejected manuscripts were later published. Of these, 73 (38%) were unchanged from the original submission based on exact matches for the 4 variables. Eight (4%) of the 194 were published in journals with an impact factor higher than O&G. Sixty-eight (35%) appeared in journals in the top 20 reproductive category based on impact factor. The mean time from O&G rejection to publication in another journal was 15.7 months (range, 4-57 months). Thirteen (7%) papers were published in general medical journals, and 83 (43%) in other general obstetrics-gynecology journals. The remaining articles appeared in other specialty, obstetrics-gynecology subspecialty, epidemiology, nursing, and review journals. Of the 124 papers (39%) that were not published in any journal, 115 were observational studies. There were 5 randomized controlled trials and 4 papers that did not fit the original research category.

Conclusion Many papers originally rejected by Obstetrics & Gynecology are later published in other peer-reviewed journals. These results have important implications for all scientific journals. The values and quality of the original peer review, whether all rejected articles should be published, impact on scientific validity, and effect on the increasing information overload for clinicians are issues that should be addressed.

1Department of Obstetrics and Gynecology, Tufts University School of Medicine, Boston, MA, USA; 2Obstetrics & Gynecology, Department of Obstetrics and Gynecology, University of Utah Medical Center, 423 Wakara Way, Suite 201, Salt Lake City, UT 84108-1242, USA, e-mail:

Submission and Peer Review Frequencies Before Acceptance: An Analysis of Submissions by Mayo Clinic Section of Scientific Publications, 2006

Colleen M. Sauber, LeAnn M. Stee, Sarah M. Jenkins, and Margery J. Lovejoy

Objective To estimate the percentage of manuscripts submitted to peer-reviewed journals that undergo peer review at up to 3 journals before publication.

Design We hypothesized that 65% of manuscripts submitted to medical and scientific journals undergo peer review 1 or 2 times, and at least 80% undergo review up to 3 times before publication in the identified journal. We retrospectively analyzed manuscripts whose submissions were tracked by Mayo Clinic's publications section in 2006. Data on number of revisions, rejections, and acceptance were recorded for a simple random sample of all manuscripts undergoing full-service process and submitted to peer-reviewed journals that year. Frequency and percentage of revisions, rejections, and acceptances were summarized over all submissions and the random sample. Observed percentages and exact binomial 95% confidence intervals (CIs) were compared with hypothesized values. A journal submission response was considered a peer review by journal and outside experts.

Results The sample comprised 104 manuscripts and 217 submissions to 129 journals. Five manuscripts (4.8%; 8 [3.7%] submissions) lacked data beyond final submission. Of the manuscripts, 73 (70.2%) underwent 1 or 2 submissions and thus 1 or 2 peer reviews; 86 (82.7%) underwent up to 3 peer reviews (Table 10). Observed percentages were consistent with hypothesized values. Among clinical articles (n = 80), 67 manuscripts (83.8%) underwent up to 3 submissions; all 13 case reports had 3 or fewer submissions. Among all manuscripts, 73 (70.2%) were ultimately accepted, and 36 (34.6%) were accepted after submission to only 1 journal. Of these 36 manuscripts, 6 were accepted with no revision. Median time from submission to acceptance was 224 days (range, 3-1571 days).
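The comparison of observed percentages with hypothesized values can be illustrated with a short sketch. It assumes that "exact binomial" refers to the Clopper-Pearson interval and solves for the limits by bisection; the function names are introduced here, and the abstract does not state which software or method was used.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_binomial_ci(k, n, alpha=0.05):
    """Clopper-Pearson (exact) confidence interval for k successes in n trials."""

    def solve(f, target):
        # f is increasing in p on [0, 1]; bisect until f(p) is close to target.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if f(mid) < target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower limit: P(X >= k | p) = alpha / 2
    lower = 0.0 if k == 0 else solve(lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2)
    # Upper limit: P(X <= k | p) = alpha / 2, ie, P(X > k | p) = 1 - alpha / 2
    upper = 1.0 if k == n else solve(lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2)
    return lower, upper


# Counts reported in the Results; hypothesized values from the Design.
for label, k, hypothesized in [("1 or 2 reviews", 73, 0.65), ("up to 3 reviews", 86, 0.80)]:
    low, high = exact_binomial_ci(k, 104)
    inside = low <= hypothesized <= high
    print(f"{label}: {k}/104, 95% CI {low:.3f}-{high:.3f}, "
          f"hypothesized {hypothesized:.2f} inside interval: {inside}")
```

An observed percentage is treated here as consistent with its hypothesized value when that value lies inside the interval.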

Table 10. Total Submissions for Each Unique Manuscript Among 104 Randomly Selected Manuscripts Submitted in 2006 and the Clinical Articles and Case Reports of the Sample

Abbreviations: CI, confidence interval; cum %, cumulative percentage.

Conclusions During submission to peer-reviewed journals, approximately 80% of manuscripts undergo up to 3 peer reviews. Peer review influences timing of publication of scientific data and discussion. Focusing submission to an appropriate journal could decrease time to publication.

Mayo Clinic, Section of Scientific Publications, 200 First St SW, Rochester, MN 55905, USA, e-mail:

Quality of the Literature

The Evolution of Thoracic Surgical Literature

Maurice Blitz,1 Andrew Graham,2 Sean P. McFadden,2 Sean C. Grondin,2 and Gary Gelfand2

Objective Modern surgical care is increasingly driven by evidence-based decision making. To assess the quality of evidence available to thoracic surgeons, the levels of evidence of original research and meta-analyses published in the top 3 thoracic surgery subspecialty journals (identified using impact factor) were determined and compared for the years 1998, 2002, and 2006.

Design Searches of the tables of contents of the Journal of Thoracic and Cardiovascular Surgery, the Annals of Thoracic Surgery, and the European Journal of Cardio-Thoracic Surgery were performed. All original research or meta-analyses pertaining to thoracic surgery for the years 1998, 2002, and 2006 were identified. The individual abstracts and their corresponding manuscripts were then evaluated and classified by 2 independent observers using the levels of evidence from the Oxford Centre for Evidence-Based Medicine. For each publication within each specified year the distribution of the levels of evidence of the manuscripts was compared.

Results Six hundred sixty-three manuscripts were reviewed; 39 manuscripts were categorized as level 1 evidence (randomized and controlled trials and systematic reviews of randomized and controlled trials). Four (2.4% of all included publications) were identified in 1998, 12 (4.9%) in 2002, and 21 (8.3%) in 2006. This trend to a higher proportion of the publications determined to be level 1 evidence is significant (P = .009). Initial review resulted in a Cohen's kappa of 0.335, suggesting very poor agreement. When the data were categorized as level 1 or levels 2 to 5 (not level 1), the kappa value of 0.695 (95% confidence interval [CI], 0.507-0.884) and the corresponding 97.2% (standard error +/- 9.6%) agreement were much more acceptable.
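Cohen's kappa, the agreement statistic reported above, corrects the raw proportion of agreement between 2 observers for the agreement expected by chance. The sketch below shows the calculation on hypothetical ratings; the counts are invented for illustration and are not the study's data, and the function name is an assumption.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who classified the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must classify the same non-empty set of items")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n**2
    # Undefined (division by zero) when both raters use a single identical category
    return (observed - expected) / (1 - expected)


# Hypothetical dichotomized ratings of 50 manuscripts (level 1 vs not level 1):
# 20 both "L1", 5 A-only "L1", 10 B-only "L1", 15 both "other".
rater_a = ["L1"] * 25 + ["other"] * 25
rater_b = ["L1"] * 20 + ["other"] * 5 + ["L1"] * 10 + ["other"] * 15
print(f"kappa = {cohens_kappa(rater_a, rater_b):.2f}")
```

Collapsing 5 evidence levels into 2 categories, as the authors did, changes both the observed and the chance-expected agreement, which is why the 2 kappa values differ.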

Conclusions The overall trend toward the publication of manuscripts deemed to be of higher levels of evidence has increased from 1998 to 2006. Unfortunately, the proportion of manuscripts determined to be level 1 evidence is low. More emphasis needs to be placed on the completion of systematic reviews as well as on the design, implementation, and publication of randomized and controlled trials so that thoracic surgical literature can maintain its standing within the evidence-based milieu of the general medical literature.

1St Joseph's Health Centre, 30 The Queensway, SSW 245, Toronto, Ontario M6R-1B5, Canada, e-mail:; 2Division of Thoracic Surgery, Department of Surgery, University of Calgary, Calgary, Alberta, Canada

Pharmacovigilance and Local Literature: An Italian Bibliography

Claudio Oliveri and Daniela Ranzani

Objective This study explores existing scientific and medical Italian literature in order to classify local publications and to create a database usable in postmarketing pharmacovigilance activities, as prescribed by the European Guidelines on Pharmacovigilance for Medicinal Products for Human Use (Volume 9A).

Design Local refers to the type of dissemination of the journals: only nonindexed Italian publications have been considered, assuming a national circulation. The publications were selected in 2 different ways. A keyword-based search was performed in Ulrich's Database. We subsequently examined the official organs of 195 societies, members of the Italian Federation of Medical Societies (Federazione Italiana Società Medico Scientifiche, FISM). This study was conducted in March 2009. The 2 data sets were first analyzed individually and then compared and combined to form a unified database.

Results A total of 344 Italian scientific medical journals were identified: 176 (51%) journals were indexed in international databases [40 (23%) had an impact factor], and 168 (49%) were nonindexed journals, which could be defined as local. It was found that among nonindexed journals, 117 (70%) were published in Italian only. The nonindexed journals cover all major therapeutic areas; the most represented areas included neurology/psychiatry (24 journals), surgery (13), cardiovascular medicine (11), dentistry (11), pediatrics (9), and diagnostic (9). There were 21 nonspecialized journals.

Conclusions A significant portion of the scientific data published within Europe originates within non-English local journals that may lack an impact factor and not be indexed in international databases, causing this data to be inaccessible at a global level. The creation of a national registry is the first step to allow pharmaceutical companies to screen local bibliographical sources according to rules and regulations governing medicinal products in the European Union (Eudralex). The methodology used to compile our database can be extended to other European countries.

Wolters Kluwer Health Italy Ltd, Via Lanino 5, 20144, Milan, Italy, e-mail:

Quality of Online Information

Multilayer Quality Control for Publications and Online Lectures

Faina Linkov,1 Gilbert S. Omenn,2 Ismail Serageldin,3 Vinton Cerf,4 and Ronald LaPorte1

Objective Quality control (QC) of scientific or educational presentations online is a serious concern for all scientific disciplines. Peer review, the gold standard of quality in various scientific disciplines, may not be optimal for the review of online lectures because it is labor intensive and has low throughput. This paper will discuss the Supercourse, a global library of more than 3600 online lectures available at and several alternative quality control approaches that are being developed as part of this global effort.

Design In the Supercourse, we are utilizing both traditional quality control systems (lecture review forms) and novel approaches by developing or adopting multiple quality assessment methods from other fields for PowerPoint online. These systems include an expert editorial board, screening to identify inappropriate lectures, opinion of experts, personal characteristics of authors, Web statistics for utilization of lectures (hits, links, page rankings), keynote speeches (benefiting from the choice of speakers by the relevant society), publications and citations from Google Scholar, and a model similar to the National Institutes of Health style review. The lecture review forms at the end of each module rate the lecture on content, presentation, relevance, and overall rating using a 5-point Likert-like scale, with more than 14000 review forms accumulated in the past 8 years. We calculated the correlation coefficient between expert and nonexpert reviewers for the first 10 lectures that received the maximum number of reviews.

Results The mean overall score for all lectures was +/- 4.54 out of a possible 5. More than 50% of the Supercourse reviewers were "expert reviewers" in that they were medical doctors and university professors. The correlation coefficient between reviews of experts and nonexperts was 0.79 (P < .05) for the first 10 lectures, demonstrating that both experts and nonexperts rate lectures highly. Other quality metrics utilized in the Supercourse also demonstrate that experts rate lectures very highly.
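The expert versus nonexpert comparison can be illustrated with a correlation calculation. The abstract does not say which coefficient was used; the sketch assumes Pearson's r computed on per-lecture mean ratings, and the 10 pairs of ratings below are hypothetical values, not Supercourse data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    # Undefined (division by zero) if either sequence is constant
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical mean overall ratings (5-point scale) for 10 lectures
expert_means = [4.6, 4.4, 4.8, 4.2, 4.5, 4.7, 4.3, 4.9, 4.1, 4.6]
nonexpert_means = [4.5, 4.5, 4.7, 4.3, 4.4, 4.8, 4.2, 4.8, 4.3, 4.5]
print(f"r = {pearson_r(expert_means, nonexpert_means):.2f}")
```

A high coefficient indicates that the 2 groups rank lectures similarly; it does not by itself show that either group's ratings are high, which is a separate question answered by the mean scores.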

Conclusions The multilayer approach to QC of online materials is a novel approach that incorporates existing QC mechanisms, such as peer review, and explores the utilization of alternative methodologies. We are currently working on making all of the metrics and findings from the QC evaluation available to the end user and on evaluating the effectiveness of all quality metrics. Our hope is that future scientific research on peer review as well as on emerging multilayer QC methodologies will help us to determine the best measures of QC.

1University of Pittsburgh Cancer Institute, Division of Cancer Prevention and Population Science, 5150 Centre Ave, Suite 4-C, Room 466, Pittsburgh, PA 15232, USA, e-mail:; 2University of Michigan, Ann Arbor, MI, USA; 3Library of Alexandria, Alexandria, Egypt; 4Google, Mountain View, CA, USA

Citations to Web Sites in Scientific Papers: The Permanence of Native and Archived References

Andrea Thorp1 and David Schriger2

Objective Uniform resource locators (URLs) are being cited within the medical literature with increasing frequency. In contrast to cited printed material, a cited Web page is not permanent. Literature suggests that half of the Internet references are inaccessible after 1 year. One solution to ensure access to the intended information is to archive the Internet reference prior to publication. WebCite is a free archiving service that creates a "snapshot" of the desired Web page and provides a permalink URL for that snapshot.

Design Internet references included in original research articles available in the online publication of Annals of Emergency Medicine from June 2007 to February 2009 were included for analysis. Each original referenced URL was archived using WebCite. To evaluate the permanence of the original and archived Web pages, we attempted to access each at 6-month intervals after online publication.

Results Data collection is ongoing. Interim analysis includes 98 original/archived WebCite pairs checked a mean of 268 (standard deviation [SD], 192) days (median, 310 days) after online publication. Ninety-nine percent (97/98) of the archived WebCite URLs readily accessed the correct Web page. The 1 inaccessible WebCite linked to the correct URL but had no content visible. In contrast, only 62% (61/98) of original URLs were available in original form, including 74% checked within 6 months, 56% checked within 6 to 12 months, and 52% checked within 12 to 24 months of publication. Twelve URLs were completely gone (a "404 error"); the other 25 had changed over time. Four of the WebCite URLs had failed to capture the entire page.

Conclusions Archiving Internet references may offer a more permanent means of retrieving information cited from a Web page. The scientific community should consider establishing a mechanism for archiving Internet references in medical publications.

1Loma Linda University Medical Center, Pediatric Emergency Medicine, 11234 Anderson St, MC-A108, Loma Linda, CA 92373, USA, e-mail:; 2UCLA Medical Center, Emergency Medicine Center, Los Angeles, CA, USA

Understanding Why the US National Library of Medicine Fails to Properly Index the Publication Type of a Number of Randomized Controlled Trials

Susan Wieland1 and Kay Dickersin2

Objective The indexing of randomized controlled trials (RCTs) in databases (ie, as RCT[pt] in MEDLINE) facilitates efficient searching for trials. We examined 591 RCTs added to PubMed in 2005, but not tagged RCT[pt] in MEDLINE, to understand potential explanations for nonindexing.

Design Study reports were initially identified through the Cochrane MEDLINE retagging project, ending in 2006. The project used a sensitive search strategy to search MEDLINE for RCT reports not tagged as RCT[pt] and hand-searching to identify RCTs. In this analysis, 2 independent reviewers read the title and abstract of each indexed RCT not tagged RCT[pt] and classified journal characteristics, type of report (eg, secondary analysis, design, and rationale), and National Library of Medicine topic indexing. Reports could be classified as more than 1 type. Both reviewers completed preliminary classification of the first 100 records, with perfect agreement on RCT status and collaborative development of a classification scheme. One reviewer completed classification of all 591 records, and we will report on reviewer classifications and agreement for all 591 at the Congress.

Results Twenty-two percent (21/97) of confirmed RCTs appeared to be straightforward RCTs. The most common report types otherwise were secondary analyses of RCT data (31/97), the rationale, design, pilot, or baseline data for an RCT (21/97), and observational analyses from RCT data (14/97). Four percent (4/97) were tagged Clinical Trial[pt]. Half the records (45/97) were MeSH tagged "Randomized Controlled Trials," indicating that RCTs were the topic.

Conclusions Preliminary analysis indicates that nearly all RCT records not indexed RCT[pt] in MEDLINE and identified by Cochrane do describe trials. Authors and editors should ensure that the term "randomized controlled trial" is used in their title or abstract, regardless of presence of or type of analysis. Searchers should use text words for randomization and the MeSH "Randomized Controlled Trials" in addition to RCT[pt] when searching MEDLINE, especially for records added after 2005.

1Brown University, Box G, Providence, RI 02912, USA, e-mail:; 2Johns Hopkins Bloomberg School of Public Health, Department of Epidemiology, Baltimore, MD, USA

Quality of Reporting

Quality Review of Clinical Guidelines

Anne Hilde Rosvik, Trond Bjornerud, Espen Movik, and Magne Nylenna

Objective To test the feasibility and the benefit of quality review of clinical guidelines and to compare this process with traditional peer review of scientific articles.

Design Of 352 clinical guidelines included in the Norwegian Electronic Health Library, 112 were suitable for assessment with Appraisal of Guidelines Research and Evaluation (AGREE), an evaluation instrument. Inclusion criteria for assessment were Norwegian origin of guideline, published in the period 2000-2008, comprehensive guidelines (not procedures), national coverage, and direct relation to patient care. AGREE contains 23 items organized in 6 domains, each assessed on a scale from 1 (strongly disagree) to 4 (strongly agree). Domain scores were calculated by summing up all the scores of the individual items in a domain and by standardizing the total as a percentage of the maximum possible score for that domain. Eighteen doctors and nurses were trained and certified as reviewers, and each guideline was assessed by 2 reviewers. Assessments were carried out during 2007 and 2008.
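The domain-score standardization described above can be sketched as follows for a hypothetical domain of 3 items rated by 2 reviewers. The function name and example ratings are assumptions introduced here. The default follows the wording above (total as a percentage of the maximum possible score); because AGREE scoring is often described as also subtracting the minimum possible score before scaling, that variant is available through a flag.

```python
def agree_domain_score(ratings, scale_min=1, scale_max=4, subtract_minimum=False):
    """Standardized domain score as a percentage.

    ratings: one list of item scores per reviewer, eg [[3, 2, 4], [2, 2, 3]].
    subtract_minimum: if True, use (obtained - min) / (max - min) instead of
    obtained / max.
    """
    if not ratings or not ratings[0]:
        raise ValueError("at least one reviewer and one item are required")
    n_items = len(ratings[0])
    if any(len(r) != n_items for r in ratings):
        raise ValueError("every reviewer must rate the same number of items")
    obtained = sum(sum(r) for r in ratings)
    max_possible = scale_max * n_items * len(ratings)
    min_possible = scale_min * n_items * len(ratings)
    if subtract_minimum:
        return 100 * (obtained - min_possible) / (max_possible - min_possible)
    return 100 * obtained / max_possible


# Hypothetical ratings: 2 reviewers, 3 items, each scored 1 to 4
example = [[3, 2, 4], [2, 2, 3]]
print(f"percentage of maximum: {agree_domain_score(example):.1f}%")
print(f"min-max standardized:  {agree_domain_score(example, subtract_minimum=True):.1f}%")
```

The 2 variants can differ noticeably on a 1-to-4 scale, so the choice matters when domain scores are compared against thresholds such as 30% or 60%.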

Results Each reviewer spent from 2 to 5 hours on each guideline. The average cost for evaluating 1 guideline amounted to 6400 NOK (800 Euro) including administration. The average scores (± standard deviation) for the 6 domains are as follows: scope/purpose: 72% (± 16); clarity/presentation: 69% (± 17); stakeholder involvement: 46% (± 19); editorial independence: 36% (± 19); applicability: 34% (± 16); and rigor of development: 33% (± 23). By combining all domains, 12 guidelines received the conclusion "strongly recommend" (most scores above 60%), 91 "recommend with provisos or alterations" (most scores between 30% and 60%), 8 "would not recommend," and 1 "unsure."

Conclusions Most Norwegian guidelines do not fulfill the quality criteria. The final conclusion "recommend with provisos or alterations" given to 4 of 5 guidelines is not very helpful. To get useful information about the guideline quality, clinicians have to look at each domain score. Publication of AGREE results, together with each guideline, has, however, made the guideline quality more transparent and easily accessible for clinicians and will hopefully increase quality over time.

Norwegian Knowledge Centre for the Health Services, The Norwegian Electronic Health Library, PO Box 7004, St Olavsplass, Oslo 0130, Norway, e-mail:

Development of Clinical Guidelines in Norway: Do Patients Participate?

Anne Hilde Rosvik, Trond Bjornerud, Espen Movik, and Magne Nylenna

Objective It is widely accepted that patients' views and perspectives should be an integrated part of clinical guidelines. We aimed at exploring to what extent patients participate in guideline development in Norway.

Design The Norwegian Electronic Health Library includes 352 clinical guidelines freely available online, currently used in clinical practice, in the Norwegian language, published in the period 2000-2009, and developed by accepted and well-known organizations or institutions. A total of 112 guidelines were assessed during 2007 and 2008 with Appraisal of Guidelines Research and Evaluation (AGREE), an instrument for evaluating guidelines. Inclusion criteria for assessment were Norwegian origin of guideline, comprehensive guidelines (not procedures), national coverage, and direct relation to patient care. AGREE covers 23 different items organized in 6 domains. We analyzed the item "The patients' views and preferences have been sought" in the domain "Involvement of stakeholders," which was assessed by 2 appraisers, each rating the issue on a scale from 1 (strongly disagree) to 4 (strongly agree).

Results The patient-participation item had a mean score of 1.8 (confidence interval [CI], 1.6-2.0). In 47 (41%) of the guidelines there had been no patient involvement at all. The patient involvement item scored significantly lower than involvement of stakeholders in general. Guidelines produced by the health authorities scored higher than guidelines produced by professional organizations, mean 2.3 (2.0-2.6) vs 1.5 (1.2-1.8). Guidelines on cancer, mental health, and children had the highest scores (Figure 4). There was no significant difference between older (2000-2005) and newer (2006-2009) guidelines. Twelve guidelines received the top score of 4 from at least 1 of the appraisers. Of these, 10 guideline drafts had been sent to patient organizations for feedback and comments, and in 5 cases patients or patient representatives had participated in the development group.

Figure 4. Mean Scores for Patient Participation in Guideline Development According to Medical Field of Guideline

Conclusions Patients generally do not participate in the development of guidelines despite the intention that they should. There has been no improvement in recent years.

Norwegian Knowledge Centre for the Health Services, The Norwegian Electronic Health Library, PO Box 7004, St Olavsplass, Oslo 0130, Norway, e-mail:

Variability of Standards for Reporting Tumor-Graft Data in Preclinical Cancer Therapeutic Studies

Adam Pascoe,1 Elizabeth Sugar,2 Scott Kern,1 and Nilofer Azad1

Objective To characterize the methodological and statistical parameters for the reporting of tumor-graft experiments for oncologic drug development.

Design Using 2007 impact factors, we identified the most-cited medical (3) and oncology journals (5) with tumor-graft reports. All publications used tumor-graft implantation in murine models. For each experiment, the characteristics of the animal models, tumor grafts, experimental therapy, and statistical analysis were examined.

Results We examined 145 articles describing 255 experiments from October to December 2008. The papers spanned a wide range of disease types, graft locations, and treatments. Missing data for key design variables were found in 51% of experiments. There was no standard for the treatment initiation of each experiment, which commenced based on elapsed time from tumor implantation (63%) or on tumor volume (23%). Eleven percent of the papers did not report a negative control. There was significant variation among outcome measurements: 70% used tumor volume; others used animal survival (20%), postsacrificial examination (21%), and/or biological changes in the tumor (29%). Tumor volume was evaluated via volume, diameter, area, imaging, and palpation without a standard. Volume was derived using 11 specified formulae or an unspecified calculation (22%). Statistical tools used included the t test (47%), ANOVA (21%), and log-rank test (18%); however, 23% of the studies did not report a statistical method for evaluating data. P values were reported in 70% of the experiments and in 8% of the manuscripts' abstracts. Ninety-four percent of studies were reported as positive.

Conclusions Tumor-graft studies are reported without a standard, often without the methodological information necessary to repeat and confirm the experiments. The high percentage of positive trials indicates potential publication bias. Considering the widespread use of such experiments in choosing drugs for oncology clinical development, we feel it is important for scientists and publishers to develop a consensus set of publication guidelines for reporting experimental and statistical procedures, and we present our initial recommendations.

1Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, 1650 Orleans St, Room 4M10, Baltimore, MD 21231, USA, e-mail:; 2Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, USA

Challenges in Peer Reviewing Evidence-Based Clinical Practice Guidelines: Do We Know the Degree to Which Guideline Recommendations Reflect Underlying Evidence in Ophthalmology?

Tianjing Li, Roberta Scherer, Elizabeth Ssemanda, Ann Ervin, and Kay Dickersin

Objective To compare evaluation of the 2005 American Academy of Ophthalmology (AAO) Glaucoma Preferred Practice Patterns (PPPs) guidelines, using the Appraisal of Guidelines Research & Evaluation (AGREE) instrument, with the actual evidence, to understand the challenges of guideline peer review.

Design Two people independently used AGREE to assess the rigor of development of the AAO PPPs. Since AGREE does not provide a mechanism for assessing whether the evidence cited in a guideline is appropriate, we identified evidence that would have been available and assessed the methodological quality of relevant systematic reviews. In this context, we searched the Cochrane Library, PubMed, and EMBASE for systematic reviews, and CENTRAL for randomized controlled trials (RCTs). Two people independently reviewed search results, selected studies, and assessed systematic review quality.

Results The AGREE score for rigor of development was 48%. Items rated low were searching, criteria for selecting evidence, methods used for formulating the recommendation, and external review. The AAO PPPs cited 5 systematic reviews and 28 RCTs, whereas we identified 18 systematic reviews and more than 300 potentially eligible RCTs that would have been available as of the last date of guideline literature search (April 2004). Of the 12 reviews in which we assessed quality (reasons for not assessing quality in 6 reviews: multiple publication, n = 4; non-English, n = 1; full text unavailable, n = 1), 5 had predetermined eligibility criteria, 4 searched more than 1 database, and 6 evaluated the quality of included studies. Four reviews fulfilled all methodological criteria.

Conclusions Although a global appraisal of the rigor of development of guidelines identifies issues in gathering and synthesizing evidence, it does not inform readers of the degree to which these recommendations reflect the underlying evidence. Discussion of the role of peer review in guideline publication is needed, given guidelines' pivotal role in knowledge translation and health care decision making.

Johns Hopkins Bloomberg School of Public Health, Mailroom W5010, 615 N Wolfe St, Baltimore, MD 21205, USA, e-mail:

Reporting Guidelines for Surveys: Limited Guidance and Little Adherence

Carol Bennett, Sara Khangura, Jamie Brehaut, Jeremy Grimshaw, David Moher, and Beth Potter

Objectives To identify articles providing guidance on reporting mail surveys (self-administered questionnaire data) in the health science literature and to compare recommended practice with reported practice.

Design A search strategy was conducted in MEDLINE and PsycINFO electronic databases. Further strategies to identify relevant papers included reviewing reference lists of included studies, using the related articles feature in PubMed for all papers meeting our eligibility criteria, and reviewing relevant textbooks and Web sites. Eligible papers were those written in English and (1) provided guidance on the reporting of survey research or (2) reported evidence of the quality of reporting of survey research. For each source providing guidance on the reporting of survey research, the number of items included and consensus across the guidelines were evaluated. For papers presenting evidence of reporting practice, each aspect of survey reporting addressed was extracted. The data were summarized descriptively.

Results Four papers and 1 Web site that provided guidance on the reporting of survey research in the form of a numbered list or checklist were identified; however, none were validated instruments. One checklist was specific to the reporting of internet surveys. For the remaining sources, 39 different reporting items were identified, but only 3 items (description of the research tool, representativeness of the sample population, and response rates) appeared in all guidelines. Six papers were identified that assessed the quality of reporting of some aspect of survey research. Three papers looked at response rate reporting, 1 study evaluated the reporting of nonresponse analysis, and 2 papers assessed description of or access to the questionnaires used in survey research. Overall, these papers indicated that there was suboptimal reporting of these domains.

Conclusions There is limited guidance and no consensus regarding the optimal reporting of survey research. Those recommended reporting criteria that have been evaluated are poorly reported.

Ottawa Health Research Institute, Clinical Epidemiology Program, ASB 2-013, 1053 Carling Ave, Ottawa, Ontario K1Y 4E9, Canada, e-mail:

Publication of Methods in Focus Group Research: Are There Quality Standards for Qualitative Research?

Chima D. Ndumele,1 Genna Ableman,2 Beverly Russell,3 Edith Gurrola,2 and LeRoi S. Hicks2

Objective To identify described strategies and success rates in recruiting non-Hispanic black and Latino participants in published focus group research.

Design We conducted a systematic review of 264 focus groups in 40 published articles between 1996 and 2009 using the PubMed/MEDLINE and CINAHL databases. Articles were excluded if groups did not target adult minority chronic disease patients, were conducted internationally, or if the primary recruitment was not for focus groups. Patients in studies had a diagnosis of or were at high risk for coronary disease, diabetes, hypertension, or asthma.

Results The majority of studies (36/40) described recruitment strategies in detail; however, fewer studies reported receiving institutional review board approval (65%) or informed consent from participants (63%). The most commonly reported methods of patient recruitment were active in-person (30%), mailed letters (25%), and flyers (25%); 17/40 (43%) studies reported using a combination of methods. The most common recruitment location for studies was health centers (57%) followed by churches (13%). The majority of published studies (80%) did not fully report how successful recruitment efforts were. Only 8 studies reported the total number of potentially eligible persons initially contacted and described the proportion who agreed to participate in the study. Among the remaining studies, 11 (34%) reported the proportion of patients who attended the focus groups compared to those who agreed to participate; these percentages ranged from 32% to 85%. Overall, 50% (6/12) of studies with ≥60% participant retention reported offering participants monetary incentives, while only 33% (4/12) of studies reported using follow-up phone calls between the participants' initial agreement and the focus group.

Conclusions There remains no consensus regarding the relative effectiveness of strategies due to a lack of reporting on the results of study recruitment efforts. Journal editors and reviewers should consider promoting a universal standard for reporting methodology in focus group manuscripts.

1Alpert School of Medicine, Department of Community Health, Brown University, 121 S Main St, 2nd Floor, Providence, RI 02912, USA, e-mail:; 2Harvard Medical School, Boston, MA, USA

Citing the CONSORT Statement and Explanation and Elaboration Paper: What's It All About?

[Updated September 10, 2009]

David Moher,1 Mary Ocampo,1 Kenneth Schulz,2 Sally Hopewell,3 and Douglas G. Altman3

Objective The CONSORT Statement was originally published in 1996. It was updated in 2001, and published simultaneously in Annals of Internal Medicine, JAMA, and Lancet, along with a long explanatory paper. It is likely the first seriously endorsed reporting guideline. As of February 2009, the CONSORT Statement has been cited more than 2000 times and is now cited approximately daily. We set out to examine where and why the 2001 version and its accompanying explanatory paper are cited.

Design We conducted a cross-sectional study. To identify publications citing the 2001 version of the CONSORT Statement, we used the Web of Science. We also identified publications citing the CONSORT explanation and elaboration article and the CONSORT for cluster trials and CONSORT for harms publications. We identified citations from 2007 for convenience purposes. We developed a short 7-item questionnaire to extract citation information from each publication. We focused on which CONSORT Statement was cited, the type of article that cited it, where in the report it was cited, and what reason was given by the authors for citing it. We pilot tested the data extraction form. The information was extracted by one member of the research team with a random sample of 10% extracted, independently, by a second member of the team. Our data will be analyzed descriptively.

Results We will present results on where and why the CONSORT Statement and its explanatory paper were cited and will provide examples of appropriate and inappropriate citations. We will present similar data for the harms and cluster CONSORT extensions.

Conclusions We are undertaking this study as part of a larger effort to more completely understand the impact of reporting guidelines.

1Clinical Epidemiology Methods Centre, Ottawa Health Research Institute, The Ottawa Hospital, General Campus, Critical Care Wing (Eye Institute), 6th Floor, 501 Smyth Road, Ottawa, Ontario K1H 8L6, Canada, e-mail:; 2Family Health International, North Carolina, USA; and 3University Centre for Statistics in Medicine, Oxford University, Oxford, UK

Inconsistent and Incompletely Reported Outcomes in Randomized Trials of Kidney Transplantation

Angela C. Webster,1-3 Lorenn P. Ruster,1 Alistair Merrifield,1 Gail Y. Higgins,2 Daniel S. Owers,1 and Jonathan C. Craig1,2

Objective Comparing treatment interventions is challenging when published trial outcomes are not consistent. The optimum result after kidney transplantation is a live patient with a transplant that functions well. We systematically reviewed kidney function measures reported in contemporary randomized trials of kidney transplantation.

Design Using the Specialised Register of the Cochrane Renal Group we identified trials of immunosuppression in kidney recipients, 2000-2007. Two authors worked independently using standardized tools, with differences resolved by discussion with a third author. We abstracted details of outcomes death, transplant loss, and measures of relative transplant function (creatinine and estimates by formula, eGFR). We examined completeness of published data and identified factors associated with complete reporting using logistic regression.

Results Of 99 trials 94 reported patient death, 91 transplant loss, 74 creatinine, and 56 eGFR. For creatinine, 59 reported mean value at trial endpoint, 16 median, 2 mean-change, and 2 number with value greater than threshold. Fifty reported mean eGFR at trial endpoint, 8 median and 2 mean change calculated using at least 4 different formulae (7 method not stated). For creatinine, 52 reported incomplete data, and eGFR 33, most commonly missing number participants contributing measurements (dependent on continued transplant function and available creatinine value at nominated time point) and/or estimate precision. The odds of complete reporting of data were significantly increased for trials sponsored by the pharmaceutical industry for death (P = .0025), transplant loss (P = .03), and creatinine (P = .02), but journal audience, journal impact factor, year of publication, geographical location of trial, and number of participants did not significantly influence data completeness.

Conclusions Inconsistent definition and incomplete reporting of outcomes in trials impedes meaningful comparison of results, may mislead users of evidence, and promotes selective over systematic use of available evidence. Standardized definitions and enforced reporting standards would counter this.

1University of Sydney, School of Public Health, Edward Ford Building A27, Sydney, NSW 2006, Australia, e-mail:; 2Children's Hospital at Westmead, NSW, Australia; 3Centre for Renal and Transplant Research, Westmead Hospital, NSW, Australia

The SPIRIT Initiative: Defining Standard Protocol Items for Randomized Trials

An-Wen Chan,1 Jennifer Tetzlaff,2 Douglas G. Altman,3 Peter Gøtzsche,4 Asbjørn Hróbjartsson,4 Karmela Krleža-Jerić,5 Andreas Laupacis,6 and David Moher7

Objective The study protocol for a randomized trial serves as the origin of subsequent trial conduct and reporting. With recent international shifts in policy and legislation toward increased public access to information from trial protocols, these documents will become increasingly important for transparency and critical appraisal of trial methods and results. However, the completeness of protocols varies greatly, partly due to variable standards for their content. We aim to develop evidence-based recommendations for core items to address in the protocol of a randomized trial.

Design An evidence-based checklist of key items to include in a trial protocol is being developed through 3 processes: (1) systematic review of existing guidelines for protocol content identified from MEDLINE, EMBASE, Cochrane Library, book chapters, and citation snowballing; (2) systematic review of empiric evidence supporting the importance of addressing specific protocol items (searches of MEDLINE, EMBASE, and the Cochrane Library, as well as citation snowballing were used); (3) Delphi consensus process involving 3 rounds of e-mail survey (n = 96 participants from 17 countries) and a consensus meeting (n = 18 participants) (participants were selected using purposive sampling to ensure representation from key stakeholder groups).

Results We identified 27 guidelines for protocol content, only 1 of which was specific to randomized trials. None were developed using an evidence-based approach. More than 1500 articles have also been systematically reviewed to identify empiric evidence for specific protocol items. Following the iterative Delphi consensus process, the current draft checklist consists of 35 items in 5 categories.

Conclusions The evidence-based SPIRIT recommendations will benefit peer reviewers, researchers, and other key stakeholders by helping to standardize the core content of trial protocols and improve their quality, facilitate the critical appraisal of trials, and ultimately enhance transparency in clinical trials research.

1Mayo Clinic, 200 First St SW, Rochester, MN 55905, USA, e-mail:; 2Chalmers Research Group, Children's Hospital of Eastern Ontario Research Institute, Ottawa, Ontario, Canada; 3Centre for Statistics in Medicine, Wolfson College Annexe, Oxford, UK; 4Nordic Cochrane Centre, Rigshospitalet, Copenhagen, Denmark; 5Canadian Institutes of Health Research, Ottawa, Ontario, Canada; 6St Michael's Hospital, Li Ka Shing Knowledge Institute, Toronto, Ontario, Canada; 7Clinical Epidemiology Methods Centre, Ottawa Health Research Institute, The Ottawa Hospital, Ottawa, Ontario, Canada

Assessment of Randomized Controlled Trials Published in the Chinese Medical Journal From 2007 to 2008

Sun Jing,1 Han Kun,2 Qian Shou-chu,1 and You Su-ning3

Objective This study aimed to assess the reporting quality of randomized controlled trials (RCTs) published in the Chinese Medical Journal.

Design According to the hand-search guidelines of Cochrane Collaboration, we hand-searched the RCTs published in the Chinese Medical Journal from 2007 to 2008; ultimately, 32 RCT reports were enrolled in this study. Based on the importance of the 22 items in the 2001 revised CONSORT checklist, we divided the 22 items into 25 evaluation items and added the other 2, clinical trial registration and ethics, totaling 27 items (Table 11). Each RCT paper was evaluated using a 27-point table for quality assessment, and then we calculated the score for each RCT paper (full mark 27 points). To analyze the specific reasons for papers with low scores, we calculated the total score for each item of the 32 enrolled papers (full mark 32 points), respectively. We divided the 32 points into 3 categories according to the scores of papers: low quality (0-10 points), medium quality (11-20 points), and high quality (21-32 points).
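The two-way scoring described in this Design (a score per paper out of 27, and a total per item out of 32) can be sketched as follows. This is only an illustrative sketch: the abstract does not state how individual items were marked, so binary scoring (1 point if an item is adequately reported, 0 otherwise) is an assumption, and the function names and toy data are hypothetical.

```python
from typing import List

N_ITEMS = 27   # 25 CONSORT-derived items plus trial registration and ethics
N_PAPERS = 32  # RCT reports included in the assessment


def paper_score(item_flags: List[int]) -> int:
    """Total score for one report: 1 point per adequately reported item (assumed binary)."""
    if len(item_flags) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item flags, got {len(item_flags)}")
    return sum(item_flags)


def item_totals(papers: List[List[int]]) -> List[int]:
    """Per-item total across all reports (full mark = number of reports)."""
    return [sum(paper[i] for paper in papers) for i in range(N_ITEMS)]


def item_category(total: int) -> str:
    """Map an item total (0-32) to the low/medium/high bands given in the Design."""
    if not 0 <= total <= N_PAPERS:
        raise ValueError("item total out of range")
    if total <= 10:
        return "low"
    if total <= 20:
        return "medium"
    return "high"


# Hypothetical toy data: every report covers only the first 13 items.
toy_papers = [[1] * 13 + [0] * 14 for _ in range(N_PAPERS)]
scores = [paper_score(p) for p in toy_papers]
categories = [item_category(t) for t in item_totals(toy_papers)]
print(scores[0], categories.count("high"), categories.count("low"))
```

Keeping the per-paper and per-item views as separate functions mirrors the two questions the Design asks: how well each report scores, and which checklist items account for low scores.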

Table 11. Assessment Items and Their Scores

Results Article scores were normally distributed (13.2 ± 2.6 points), below half of the full mark (27 points). Ten items in the evaluation checklist belonged to the high-quality category (37.0%), 5 to the medium category (18.5%), and 12 to the low category (44.5%). None of the RCT reports described the randomization implementation, blinding or masking, and clinical trial registration. Only 2 reports (6.25%) mentioned how the sample size was determined and the method used to implement the random allocation sequence. Nine reports (28.13%) described the method used to generate the random allocation sequence. Of the 32 items, the section of methods had the lowest score for poor description of sample size, randomization, and blinding.

Conclusions The reporting of RCTs published in the Chinese Medical Journal has not met the checklist of the CONSORT statement and needs to be improved. The RCT reporting should be specified in the Instructions for Authors of medical journals and be publicized among medical researchers, medical writers, readers, editors, and peer reviewers in China.

1Chinese Medical Journal, Chinese Medical Association, 42 Dongsi Xidajie, Beijing 100710, China, e-mail:; 2Chinese Journal of Ophthalmology, Chinese Medical Association, 42 Dongsi Xidajie, Beijing 100710, China; 3Chinese Medical Association, 42 Dongsi Xidajie, Beijing 100710, China

Application of Item Response Theory to Manuscript Rating Scales

Brian Budgell1 and Takeo Nakayama2

Objective Content guidelines and rating scales derived from them, for example, CONSORT and the Jadad scale, are sometimes used to generate quantitative measures of manuscript quality. The initial inclusion of items is normally based on informed, expert opinion. However, the validation of scales and individual items can only be achieved following implementation. The objective of this presentation is to demonstrate the application of item response theory to the validation of checklist and rating scale items.

Design A corpus of 62 reports of randomized controlled trials (RCTs) in traditional medicine was analyzed for the prevalence of 24 selected CONSORT checklist items. Sensitivities of the checklist items were calculated as the ratio of prevalences of checklist items in the top and bottom quartiles of reports (based on summary scores). This permitted the post hoc generation of a rating scale with greater internal validity and sensitivity.
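The sensitivity measure described in this Design, the ratio of an item's prevalence in the top quartile of reports to its prevalence in the bottom quartile, can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: items are coded 0/1, the quartile size is taken as the integer quarter of the number of reports, ties in summary score are broken by input order, and the function name is hypothetical.

```python
from typing import List


def discrimination_ratios(reports: List[List[int]]) -> List[float]:
    """Ratio of item prevalence in top- vs bottom-quartile reports.

    Each report is a list of 0/1 flags, one per checklist item.
    Reports are ranked by summary score (sum of flags).
    """
    n_items = len(reports[0])
    ranked = sorted(reports, key=sum, reverse=True)  # stable sort: ties keep input order
    q = max(1, len(ranked) // 4)                     # assumed quartile size
    top, bottom = ranked[:q], ranked[-q:]

    ratios = []
    for i in range(n_items):
        p_top = sum(r[i] for r in top) / q
        p_bottom = sum(r[i] for r in bottom) / q
        if p_bottom > 0:
            ratios.append(p_top / p_bottom)
        elif p_top > 0:
            ratios.append(float("inf"))  # item absent from the bottom quartile
        else:
            ratios.append(float("nan"))  # item absent from both quartiles
    return ratios


# Hypothetical toy corpus: 8 reports, 3 checklist items.
toy_reports = [
    [1, 1, 1], [1, 1, 1], [1, 1, 0], [1, 1, 0],
    [1, 0, 0], [1, 0, 0], [1, 0, 0], [0, 0, 0],
]
toy_ratios = discrimination_ratios(toy_reports)
print(toy_ratios)
```

A ratio near 1 means the item is reported about equally often in strong and weak reports and so discriminates poorly; larger ratios indicate items that separate the two groups.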

Results Prevalences of items in the 62 reports ranged from 0.08 (95% confidence interval [CI], 0.01-0.15) for item 11b (if done, how was the success of blinding evaluated) to 0.97 (95% CI, 0.93-1.00) for item 4a (precise details of the interventions intended for each group). Normalized summary scores for manuscripts ranged from 0.38 to 0.88 (mean, 0.63). Sensitivities (discriminatory indices for top versus bottom quartiles of reports) ranged from 1.07 to 15.00. Weighting checklist items according to prevalence and eliminating items with low sensitivities permitted the post hoc generation of a rating scale with greater internal validity and sensitivity.

Conclusions The validity and sensitivity of manuscript rating scales may be enhanced by the application of item response theory. The implementation of such validated scales would likely improve the editorial and review process and also inform decisions on the inclusion of reports in systematic reviews and meta-analyses.

1Canadian Memorial Chiropractic College, 6100 Leslie St, Toronto, Ontario M2H 3J1, Canada, e-mail:; 2Department of Health Informatics, Kyoto University School of Public Health, Yoshida Konoe, Sakyo, Kyoto, Japan

The Null Hypothesis Significance Test in Health Sciences Research (1995-2006): Statistical Analysis and Interpretation

Luis Silva,1 Patricio Suarez,2 Ana Fernandez,3 Vanesa Alvarez,4 and Tania Iglesias5

Background The null hypothesis significance test (NHST) is the most frequently used statistical method, although its inferential validity has been widely criticized since its introduction. In 1988, the International Committee of Medical Journal Editors (ICMJE) warned against sole reliance on NHST to substantiate study conclusions and suggested supplementary use of confidence intervals (CIs). Little research has examined the use of these statistical methods in the light of the ICMJE recommendation. The objective of this study was to evaluate patterns since 1995 in use of NHST and CIs both in English- and Spanish-language biomedical publications with particular focus on accuracy regarding interpretation of statistical significance and conclusion validity.

Design Original articles from 3 English and 3 Spanish biomedical journals in 3 fields (general medicine, clinical specialties, and epidemiology/public health) were considered for this study. Papers published in 1995-1996, 2000-2001, and 2005-2006 were selected through a systematic sampling method. After excluding the purely descriptive and theoretical articles, quantitative studies were evaluated for their use of NHST with P values and/or CIs for interpretation of statistical significance and relevance in study conclusions.

Results Among 1043 original papers, 874 were selected for review. The exclusive use of P values was less frequent in English-language publications as well as public health journals; overall such use decreased from 41.3% in 1995-1996 to 21.2% in 2005-2006. While the use of CIs increased over time, the fallacy of significance (equating statistical with substantive significance) appeared very often, mainly in journals devoted to clinical specialties (81.3%). Of the papers originally written in English and Spanish, 14.6% and 9.6% mentioned statistical significance in their conclusions, respectively.

Conclusions Although the exclusive use of NHST decreased over time in publications reviewed, this predominant pattern of statistical analysis remains slow to change. The communication of statistical results in the clinical setting, particularly among publications in Spanish, still presents considerable deficiencies.

1National Center for Medical Sciences Information Research, 27th, 110, Havana, 10400, Cuba, e-mail:; 2Cabueñes Hospital, Gijón, Spain; 3Oviedo University, Oviedo, Spain; 4Secondary School, Oviedo, Oviedo, Spain; 5TeleCable of Asturias, Asturias, Spain


Why and How Do Journals Retract Articles?

Elizabeth Wager1 and Peter Williams2

Objective Cases submitted to the Committee On Publication Ethics (COPE) suggest that journals and publishers do not have consistent policies about when and how to retract articles. As the first stage in developing retraction guidelines, we investigated why and how journals retract articles and editors' experiences of the process.

Design Analysis of retractions in PubMed from 1988 to 2008 with English text. We obtained all retractions for 2005-2008 and a random sample from 1988 to 2005 from journals available at University College London. Both authors extracted data and achieved consensus on classification. A purposive sample of editors was interviewed to learn about their experiences of retractions.

Results We analyzed 312 of the 529 retractions included in PubMed from 1988 to 2008 and interviewed 5 editors about 7 cases. Articles were retracted because of data fabrication (5%), data falsification (4%), plagiarism (16%), redundant publication (17%), disputed authorship/data ownership (5%), inaccurate/misleading reporting (4%), honest research errors (28%), non-replicable findings (11%), or other/no stated reason (9%). Some journals also banned authors of plagiarized or redundant publications. Many retractions were issued by all or some authors (63%) but a significant proportion were issued by editors/publishers (29%) or others (8%). During interviews, editors described the considerable difficulties and significant workload in retracting articles when authors are uncooperative. Most retractions (87%) were of full papers reporting primary data but 13% were other article types (eg, literature reviews or letters). The retracted publications covered basic biomedical research (58%), clinical medicine (23%), and other subjects (19%) reflecting the composition of PubMed.

Conclusions Analysis of PubMed retractions combined with experience at COPE and published cases where journals have not retracted fraudulent articles indicates a considerable diversity of approach regarding how and why articles are retracted and sanctions imposed by journals for misconduct. Interviews suggested editors would welcome more guidance.

1Sideview, 19 Station Rd, Princes Risborough, Bucks, HP27 9DE, UK e-mail:; 2University College, London, UK

Round Up the Usual Suspects? Involvement of Medical Writers and the Pharmaceutical Industry in Retracted Publications

Karen L. Woolley,1,2 Mark J. Woolley,2 Rebecca A. Lew,2 Narelle J. Bramich,2 Julie A. Ely,2 Serina Stretton,2 Julie A. Monk,2 and Janelle R. Keys2

Objectives (1) To quantify, for the first time, how involved declared medical writers and the pharmaceutical industry have been in publications retracted for misconduct; (2) to investigate factors associated with misconduct retractions.

Design We used PubMed (limits: English, human, January 1966-February 2008) to identify publications retracted for either misconduct or mistake. Standardized definitions and data collection tools were used (interrater reliability = 100%), and the mistake retractions served as the control group. Data were analyzed by an independent academic statistician.

Results Of the 463 retractions retrieved, 213 (46%) were misconduct retractions. The involvement of declared medical writers or the pharmaceutical industry was very low or nonexistent for misconduct retractions. Compared with mistake retractions, misconduct retractions were significantly associated with absence of declared medical writer or pharmaceutical industry involvement, single authorship, first author having at least 1 other retraction, or an affiliation in a low/middle-income country (Table 12).

Conclusions The involvement of declared medical writers or the pharmaceutical industry in misconduct retractions is very low or nonexistent. Our data challenge popular opinion and justify increased attention on factors that are associated with misconduct retractions.

Table 12. Involvement in Retracted Publications

CI, confidence interval; NC, noncalculable because of zero misconduct retractions.

1 University of Queensland, Brisbane St Lucia, Australia, and University of the Sunshine Coast, Queensland, Australia; 2 ProScribe Medical Communications, 18 Shipyard Circuit, Noosaville Queensland 4566, Australia, e-mail:


A Survey of Past Participants in the Annals of Emergency Medicine Editorial Board Fellowship Program

Teri Reynolds1 and Michael Callaham2

Objective While initiatives to increase the number of physician researchers have been widespread, little has been published on the recruitment and training of physician editors early in their careers. Annals of Emergency Medicine established the Resident Editorial Fellow program in 1998 with 1 to 3 senior emergency medicine residents selected annually in a competitive process. These resident "fellows" serve during their penultimate or final year of residency and participate in editorial meetings, review manuscript development, shadow decision correspondence, complete article reviews, and oversee article selection for the resident section. We profiled the program by surveying past participants on their subsequent positions and the perceived impact of the fellowship.

Design In early 2009 we emailed a 7-question survey to all 14 prior fellows and asked them to submit a curriculum vitae. The survey was self-directed and required free text responses.

Results A total of 14/14 completed the survey. Seven (50%) reported some editorial experience prior to the fellowship, largely on newsletters or nonscientific publications; 12 (86%) reported prior research. Nine (64%) currently serve on the editorial board, appointment to which is based on editorial performance and is not part of the fellowship. Thirteen (93%) currently review for at least 1 scientific journal and 7 (50%) for more than 2. In addition, as reviewers at Annals of Emergency Medicine, prior fellows on average are rated 4 (mean, 4.0; median, 4.2) out of a possible 5, well above the average score for experienced reviewers. Ten (71%) hold full-time academic positions (2 recent fellows are still residents and not counted among these 10). Five (36%) now serve as their department's clinical research director, and 12 (86%) teach students or residents. Twelve (86%) identified mentored peer review, and 8 (57%) attending editorial board meetings, as among the best parts of the program. Eight (57%) felt the fellowship was responsible for their current involvement in editorial work. Four (29%) felt that it increased their commitment to research. Fourteen (100%) reported that the fellowship fully met or exceeded their expectations.

Conclusions Our results describe a young cohort who remain engaged in academic medicine and editorial process (as reviewers and editors) and who are in positions to mentor future editors and researchers. Respondents describe a high level of satisfaction with the program, and many remain involved with the editorial board.

1Highland Hospital Emergency Department, 1411 E 31st St, Oakland, CA 94602, USA, e-mail:; 2University of California, San Francisco, San Francisco, CA, USA

Trial Registration

Association of Trial Registration With the Results and Conclusions of Published Trials of New Oncology Drugs

Nicolas Rasmussen,1 Kirby Lee,2 and Lisa Bero3

Objective To determine whether advance registration reduces bias against statistically insignificant results in the randomized controlled trial literature concerning new drugs.

Design This is a cross-sectional study of published reports of clinical trials evaluating the efficacy of drugs approved for new indications in oncology (where registration was first widely practiced) from 2000 through 2005. Relevant trial reports were identified using PubMed and the Cochrane Library. Evidence of trial registration in the year prior to publication was obtained by a search of public trial databases and corporate registries. Data on blinding, results for primary outcomes, and conclusions were extracted independently by 2 coders. Univariate and multivariate logistic regression identified associations between independent variables and favorable results and conclusions.

Results In univariate analyses, reports of trials unambiguously registered prior to publication (54/137) were more likely to describe statistically significant efficacy results and reach conclusions favoring the test drug (for results, odds ratio [OR], 1.77; 95% confidence interval [CI], 0.87-3.61). Reports of trials sponsored by the test drug maker and with larger sample sizes were significantly more likely to favor the test drug. In multivariate analysis, reports of prior registered trials again were not less likely to favor the test drug (for significant results: OR, 1.50; 95% CI, 0.61-3.68); larger sample sizes and surrogate outcome measures were statistically significant predictors of favorable results, while nonstringent blinding approached statistical significance. Subset analysis similarly showed that prior registered trial reports were more likely to describe statistically significant results favoring new drugs, among 109 industry-sponsored studies only (OR, 1.57; 95% CI, 0.60-4.11) and among 115 main reports only (ie, underlying trials) (OR, 1.30; 95% CI, 0.50-3.34). (Table 13.)

Table 13. Association Between Characteristics of Articles and Statistically Significant Outcome or Conclusions That Favor the Test Drug: Multivariate Logistic Regression (N = 137)

Conclusions Prior registration of trials alone did not decrease the trial literature's bias against statistically non-significant results. Additional mechanisms to ensure full reporting of trial results are necessary.

1National Drug and Alcohol Research Centre, University of New South Wales, Sydney NSW 2052, Australia, e-mail:; 2Department of Clinical Pharmacy, University of California, San Francisco, San Francisco, CA, USA; 3Clinical Pharmacy and Health Policy, University of California, San Francisco, San Francisco, CA, USA

Characterizing Sponsor-Imposed Restrictions on Disclosing Results of Clinical Trials

Tony Tse, Rebecca J. Williams, and Deborah A. Zarin

Objective Concern about undisclosed conflicts of interest and associated withholding of trial data by sponsors is growing. An FDA Amendments Act (FDAAA) provision mandating public disclosure of agreements that restrict the principal investigator's (PI's) ability to disclose results became effective on September 27, 2008. ClinicalTrials.gov includes the following categories, which were based on published results of surveys of trial sponsors and organizations that conduct trials: short embargo (≤60 days), no content control; longer embargo (>60 days and ≤180 days), no content control; and other disclosure restrictions. The objective of this study was to characterize the types of sponsor-PI agreements reported to ClinicalTrials.gov and to propose options for improving the existing categorization scheme.

Design Entries from all 183 results records posted at ClinicalTrials.gov (as of May 15, 2009) were evaluated, including all full-text descriptions of the Other category.

Results Of the 154 studies for which PIs were not employees of the sponsor, 117 (76%) indicated a restriction: 14 (12%) imposed short embargoes, 14 (12%) imposed longer embargoes, and 89 (76%) described Other restrictions. Among the 117 studies reporting restrictions, there were 33 phase 1-2 trials and 82 phase 3-4 trials; the majority (111/117) were sponsored by industry. Within the Other category, the following issues were addressed: (1) results communications for multisite studies (54/89); (2) "fixed" delays after study completion, including for publication of multisite studies (40/89); (3) sponsors' rights to review, edit, and/or approve results communications (66/89); and (4) embargoes (80/89). Each sponsor used consistent text for its Other entries. (Table 14.)
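The proportions reported in these Results can be reproduced from the stated counts. The short sketch below recomputes them; the counts are taken from the text, the helper name `pct` is hypothetical, and the percentages for the issues within the Other category are derived here rather than quoted from the abstract.

```python
def pct(numerator: int, denominator: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


# Counts as stated in the Results.
non_employee_pi_studies = 154
restricted = 117
short_embargo, longer_embargo, other = 14, 14, 89

# The three restriction categories should account for all restricted studies.
assert short_embargo + longer_embargo + other == restricted

print(pct(restricted, non_employee_pi_studies))  # share of studies with any restriction
print(pct(short_embargo, restricted))            # short embargo
print(pct(longer_embargo, restricted))           # longer embargo
print(pct(other, restricted))                    # Other restrictions

# Issues addressed within the Other category (a record may raise several).
other_issues = {
    "multisite results communications": 54,
    "fixed delays": 40,
    "sponsor review/edit/approval": 66,
    "embargoes": 80,
}
other_pcts = {issue: pct(n, other) for issue, n in other_issues.items()}
print(other_pcts)
```

Because a single Other entry can address more than one issue, the four issue counts sum to more than 89 and their percentages are not expected to total 100.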

Table 14. Sponsor-Imposed Restrictions Addressed in Other Category in Results Records Posted at ClinicalTrials.gov (as of May 15, 2009)

Conclusions Of the trials with sponsor-imposed restrictions on results disclosure, 76% were not captured by existing embargo categories at ClinicalTrials.gov. Our analysis suggests that additional categories would more accurately reflect common restrictions for multisite studies, fixed delays, sponsor control of content, and embargoes. Developing improved categories could enhance transparency by providing more consistent, comprehensive descriptions of PI-sponsor agreements.

National Library of Medicine, 8600 Rockville Pike, Bethesda, MD 20894, USA, e-mail: