A Document Maven Looks at the Pfizer Vaccine Paper in the New England Journal of Medicine

By Lambert Strether of Corrente

As it turns out, I have narrow but deep expertise relevant to the discussion of the New England Journal of Medicine (NEJM) paper on the Pfizer vaccine study, “Safety and Efficacy of the BNT162b2 mRNA Covid-19 Vaccine” (“Safety and Efficacy”), so I thought I’d weigh in, albeit a little late, on the protocols and “exclusion criteria” discussion.

In a past life, several careers ago, I consulted to various large firms, among them medical publishers, on complex document structures; my deliverable would be a formal definition of such structures, plus documentation. For example, and without getting into the weeds on syntax, the structure of all posts at Naked Capitalism includes editorial elements like: a headline (one, required), author (at least one, with an optional bio), and then a long string of running text elements (paragraphs, images, tables, lists, and so on) in any order and any amount greater than one. There are also elements that occur within running text, like emphasis (italic), or links (with their URLs). Run your eye up and down this page, and you will see what I’m talking about. (No, this is not an exhaustive list, and there are plenty of decisions to be made about representation; but let’s not go into those weeds.) I loved that career, because it allowed me to combine my skills as a humanities major with exacting technical work.
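For readers who would like to see what a formal definition of this kind might look like, here is a minimal sketch in Python. It is not the syntax I used professionally, and it is not how Naked Capitalism actually stores its posts; the class names (`Post`, `Author`, `Paragraph`, `Image`) and the `validate` method are hypothetical, invented for illustration. I also read "greater than one" loosely, as "one or more", which is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Author:
    name: str
    bio: Optional[str] = None  # the bio is optional


@dataclass
class Paragraph:
    text: str


@dataclass
class Image:
    src: str


# Running-text elements may appear in any order. Only two kinds are
# modelled here; tables, lists, and so on would be added the same way.
Block = Union[Paragraph, Image]


@dataclass
class Post:
    headline: str          # exactly one, required
    authors: List[Author]  # at least one
    body: List[Block]      # assumption: one or more running-text elements

    def validate(self) -> List[str]:
        """Return a list of structural problems (empty if the post is valid)."""
        problems = []
        if not self.headline.strip():
            problems.append("headline is required")
        if len(self.authors) < 1:
            problems.append("at least one author is required")
        if len(self.body) < 1:
            problems.append("at least one running-text element is required")
        return problems
```

A post with a headline, one author, and one paragraph passes; a post with none of the three is reported as having three problems. The point of the formalism is that a machine, not a tired copy editor, does the checking.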

Publishers liked having formal definitions of their document structures because (again leaving out syntax) it was possible to have a digital editorial system that converted a single document to several formats without losing information. The NEJM, in fact, faces that very requirement. If you look at the article in question, you will see that it begins with the article type (“ORIGINAL ARTICLE”), then a headline (“Safety and Efficacy…”), then the authors, then two buckets, one the article, and the other figures/media. The article begins with an Abstract, which begins with Background, Methods, and so forth. The NEJM delivers documents structured in this wise in at least three media: (1) online, (2) PDF, both of which we common readers can read, and (3) in a printed version, which we cannot. (There may also be versions for tablets, library readers, even CD; I don’t know, but you can see the advantage of having a single master document composed according to a formal structure and then converted, as opposed to three or more different documents, all of which must be kept in synch with each other.) Of course, the publisher makes an implicit bargain with the reader that all the versions are identical or, if different, are labeled as such[1]. Or not, as we shall see!
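The bargain can be checked mechanically. The sketch below is an illustration under stated assumptions, not a description of NEJM's editorial system: the element names are taken loosely from the article structure just described, the `render` function is a hypothetical stand-in for a real converter, and one rendition is deliberately broken to show what the check reports.

```python
from typing import Dict, FrozenSet, List, Tuple

Element = Tuple[str, str]  # (element name, content)

# Hypothetical master document: element names in order, contents abbreviated.
MASTER: List[Element] = [
    ("ArticleType", "ORIGINAL ARTICLE"),
    ("Headline", "(headline text)"),
    ("Authors", "(author list)"),
    ("Abstract", "(Background, Methods, and so forth)"),
]


def render(master: List[Element], omit: FrozenSet[str] = frozenset()) -> List[Element]:
    """Stand-in for a real converter: copy the master, dropping omitted elements."""
    return [(name, content) for name, content in master if name not in omit]


def missing_elements(
    master: List[Element], renditions: Dict[str, List[Element]]
) -> Dict[str, List[str]]:
    """For each output format, list the master elements it fails to carry."""
    expected = [name for name, _ in master]
    report: Dict[str, List[str]] = {}
    for fmt, rendition in renditions.items():
        present = {name for name, _ in rendition}
        gaps = [name for name in expected if name not in present]
        if gaps:
            report[fmt] = gaps
    return report


renditions = {
    "online": render(MASTER),
    # Deliberately broken rendition, for illustration only.
    "pdf": render(MASTER, omit=frozenset({"Abstract"})),
}
print(missing_elements(MASTER, renditions))  # {'pdf': ['Abstract']}
```

An empty report means every version carries every element of the master; anything else is a difference the publisher owes it to the reader to label.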

Readers (many of them) like documents that are consistently structured (the formalism is, of course, hidden from them) because structure makes it easy to find what they want. Readers expect to be able to flip to the end of a book to find the index, for example, and so if they encountered a book where the index was in the middle, they would justly conclude either that the publisher’s printer had committed a terrible error, or that the author was some sort of pomo buffoon bent on japery. Readers who value their time, like doctors — I well remember a doctor in my study group, who would exclaim “I could be treating patients!” whenever he felt time was being wasted — appreciate structure even more; if they want to know the methods, they go to the Methods section; if the results, the Results section. If editorial matter does not appear where it is supposed to appear, such readers can and will justly conclude the matter is missing, because that is the bargain the publisher made with them through the pattern and practice of structuring the document consistently, issue after issue after issue.

Having set the scene at tedious length — I really did love that work — we now come to IM Doc, who was — I contend — betrayed as a reader by NEJM’s failure as a publisher to uphold its end of the bargain it made with him when it butchered its document structure and its own editorial standards at a critical juncture[2].

IM Doc, in “An Internal Medicine Doctor and His Peers Read the Pfizer Vaccine Study and See Red Flags [Updated]” wrote:

First, a critical issue for any clinician is “exclusion criteria”…. From my reading of this paper, and the accompanying editorial, one would assume there were no exclusion criteria.

The “exclusion criteria” are to be found through another document element called a Protocol (hat tip Rick in Oregon) about which more in a moment.

They certainly are never mentioned…. And now we know there were exclusion criteria, not because of anything Pfizer, the investigators, or the NEJM did but because of stunning news out of the UK. UPDATE: I will address this at greater length, but an alert reader did find the study protocol, which were not referenced in any way that any of the nine members in my review group could find, nor were they mentioned in the text of paper or editorial, as one would expect for a medication intended for the public at large. I apologize for the oversight, but this information was not easy to find from the article, not mentioned or linked to from the text of the article, the text of the editorial, in the “Figures/Media,” or in a supplemental document.

I need to disentangle this a bit, because I think IM Doc is generously taking on a bit more responsibility than he needed to.

First, we need to consider what is meant by “the article.” As we know, there are at least three versions of “Safety and Efficacy”: The online version, the PDF version, and the printed version. Which one was IM Doc reading? Certainly the PDF version (assuming the printed version had not yet arrived in the mail[3]). We know that from IM Doc’s text, because he tells us so himself: He cites to “page 5, in Table 1,” to “tables on pages 6 and 7,” and “to tables on page 7.” The online version has no pages. We also know that because of the social setting in which “Safety and Efficacy” was read. Reader KLG described this setting (IM Doc’s “journal group”) in Yves’ follow-up piece as follows:

IM Doc [was part of a] like-minded journal club (a common mechanism for all biomedical scientists and many clinicians to keep up with current developments)

And IM Doc describes the meeting:

we had an ad hoc meeting of our Journal Club to discuss the NEJM article

It is extremely difficult for me to imagine that each member of the Journal Club, whether on Zoom or in person, was reading the online version on a device (one person on an Android phone, another on an iPhone, two more on iPads of different sizes, another on a laptop, and so on). It’s far more likely they were reading the PDF, and most likely printed out, because that way, when one reader says “flip to Table 1 on page 5,” all the readers can easily do that simultaneously (rather than swiping, tapping, scrolling, and so on).

So now we come to the first extremely simple reason why IM Doc and his study group, working from the PDF, could not find the Protocol which contained the exclusion criteria.

They could not find the Protocol element in the PDF because it was not there.

NEJM’s document structure treats protocols as separate, external documents[4], and puts a link to them, called Protocol, within an element called Supplementary Material[5], which is placed at or near the end of the article in the online version only. Here is the online version, which contains the Supplementary Material:

[Image: the online version of the article, showing the Supplementary Material element]

And here is the PDF version, which does not:

[Image: the PDF version of the article, with no Supplementary Material element]

Once again, the Supplementary Material, which includes the Protocol with its exclusion criteria, is critical. As IM Doc writes:

First, a critical issue for any clinician is “exclusion criteria”. This refers in general to groups of subjects that were not allowed into the trial prima facie. Common examples would include over 70, patients on chemotherapy and other immunosuppressed patients, children, diabetics, etc.. This issue is important because I do not want to give my patient this vaccine (available apparently next week) to any patient that is in an excluded group. Those patients really ought to wait until more information is available – FOR THEIR OWN SAFETY. And not to mention, exclusion criteria exist because the subjects in them are usually considered more vulnerable to mayhem than average subjects. From my reading of this paper, and the accompanying editorial, one would assume there were no exclusion criteria. They certainly are never mentioned.

What possible excuse is there for leaving the Supplementary Material out of the PDF? Saving a few bytes?[6]

There is a second, slightly less simple reason why IM Doc and his study group, working from the PDF, could not find the Protocol which contained the exclusion criteria. This concerns how NEJM formats references to the Protocol element in the running text. Here is how references in running text to elements in Supplementary Materials should be formatted, according, at least, to the Journal of the American Medical Association’s style guide:

[Image: excerpt from the JAMA style guide on references to supplementary material]

NISO (the National Information Standards Organization) agrees:

[Image: excerpt from NISO guidance on references to supplementary material]

So, one would expect references in running text to the Protocol to appear along the lines of “see the Protocol in Supplementary Materials,” both to flag the “critical part of the evidence” for the reader (that’s one reason we have capitals) and to guide them to its location. That is not what NEJM did. Here is one example:

[Image: a passage from the PDF that mentions “the protocol” in lowercase, with no indication of where to find it]

There’s no formatting in the PDF to indicate what “the protocol” is, and no indication of where it is to be found. All the other mentions of “protocol” are like that. So no wonder — in the trackless wilderness of NEJM’s house style — the Protocol was difficult to find.

* * *


(1) NEJM’s editorial practice removes the Supplementary Material element from the PDF version, and hence critical material like the Protocol element that contains the link to the external protocol document, which contains the “exclusion criteria” sought by IM Doc. It is entirely reasonable for readers to expect both online and PDF versions to be identical with respect to critical material.

(2) NEJM’s editorial practice permits the copy desk to format text references to the Protocol element in lowercase, with no indication of the element in which it is to be found. It is entirely reasonable for readers to expect references to critical, named elements to be formatted according to JAMA and NISO standards, which at a minimum require initial caps.

(3) The net result of NEJM’s editorial practice was that a study group of nine practicing, busy, and stressed physicians was unable to find critical material affecting treatment because (a) the material was not there to be found and (b) there was no signal to show where it was to be found. NEJM should know its readers better.

(4) NEJM should fix its editorial practice. It’s broken.


[1] If matters were otherwise, readers would be put in the impossible and absurd position of having to read all the versions of any given paper and check each for consistency. As it turns out, this is exactly the situation the NEJM has put its readers in. (This reminds me very much of Obama organizers who would say “Go look at the website!” when questioned on details of policy. No.)

[2] How these misadventures befell one of the most important articles NEJM published this year, and perhaps for many years, would be a matter for internal inquiry at NEJM. Since the editor-in-chief of NEJM was one of the authors of the “EDITORIAL” that accompanied “Safety and Efficacy,” such an inquiry would presumably be easy to initiate.

[3] The printed version and the PDF version should be identical in any case, because generally the PDF is what the publisher sends to the printer, albeit with the addition of crop marks, etc.

[4] Treating the protocol as an external document makes sense, because the protocol is often quite long, and will be formatted in the style of whoever submitted the article — in this case, Pfizer — and not in the journal’s house style.

[5] There is another element, confusingly called Supplementary Appendix, also placed at the end of the article. A link to the Supplementary Appendix seems to have made a brief appearance somewhere near the top of the screen for the online version, and then vanished mysteriously, as described by both Yves and reader KLG here. It did not and would not have contained the Protocol element, which goes in the Supplementary Material section.

[6] PDFs can also be made clickable, so not only could the Protocol element have been included within it, it could have linked to the protocol.

