Scientific Journals of the Future
by Steven M. Bachrach, Trinity University
San Antonio, TX
The 20th century has witnessed a great many technological advances,
but perhaps none have had as great an impact upon everyday lifestyles as
changes in our means of communication. We have seen the development of true
mass communication. Telephones are ubiquitous, even in airplanes. Cellular
phones, beepers, and messaging services make us accessible anywhere on the
globe. Offset printing and high-speed copiers allow for the manufacture of
books only days after the events written about have occurred. Film has evolved
from the silents to the talkies to color to Technicolor to video, and virtual
reality entertainment centers are just around the corner. Television is perhaps
the most dominant communications medium and it, too, has progressed from the
vacuum tube-filled console with the round black-and-white screen to the
portable color set with video cassette unit, and flat-panel, high definition
models are in the offing. Delivery of television signals via an analog signal
sent through the air is being supplanted by cable systems and by digital
satellite systems, each offering over 100 channels. Since the beginning of the
'90s, the Internet has fostered a new realm of communications, the globally
connected computer network where millions chat in specially created virtual
environments and send email to children at school overseas.
Yet, while all of these changes have radically altered our society,
the primary means by which scientists communicate with each other has remained
frozen in time, unchanged in well over 100 years. We scientists still produce
the written article, published in the specialized scientific journal, that
appears as ink applied to paper.
Our libraries are jammed from floor to ceiling with printed journal
after printed journal. The trend for growth in scientific publishing is truly
staggering. Using the field of chemistry as an example of what is typical in
all scientific disciplines, Figure 1 shows the number of abstracts indexed by
Chemical Abstracts for the past 90 years. Similarly, the number of scientific
journals has grown, with every new subdiscipline launching its own journal or,
more typically, multiple journals.

This is not to say that scientists are doing science in the same
fashion as their predecessors at the turn of this century. Technology has
completely changed how scientists collect data, but what is pertinent here is
that we process the data and information using state-of-the-art technology.
For example, visualization of data has been revolutionized by the appearance of
relatively low-cost workstations. Large data sets now can be represented using
high-resolution color images and animations. Climatologists simulate weather
over succeeding weeks and watch the storms progress across the face of the
simulated earth. Seismologists visualize the innards of the earth, searching
for undiscovered pockets of oil. Physicians make use of MRI (magnetic resonance
imaging) to visualize internal organs in a non-invasive manner. Physicists
simulate the capture of the moon by the earth, and chemists watch the
progression, in molecular form, from small carbon clusters to buckyballs to
the formation of soot. Molecular biologists use virtual reality to manipulate
enzymes and DNA strands. These animations are necessary pieces of the working
scientists' arsenal, and it is an extremely rare desk in a laboratory that does
not have a PC (personal computer) or workstation glowing with some fancy color
graphic.
And yet, even though color graphics, animations, sound, and
extremely large data sets are now routine and essential components of the
scientific method and process, all of these are omitted when the time comes to
distribute the knowledge among our colleagues. Since we write journal articles
destined to appear in print on paper, we can't include a movie or sound. Color
images are still (for most journals) too expensive to print. Large data sets
consume a significant number of the limited pages available in journals and are
at best relegated to supplements, if not left out entirely.
A New Paradigm
As already mentioned, the Internet has become a major avenue for
mass communications, spurred by a number of developments. First, the Internet
has penetrated most countries around the world. It is now available to many at
home, to most academic centers, and to many businesses around the globe. Second,
the cost of using the Internet is quite low. Last, the power of desktop PCs has
become so formidable that multimedia now can be handled reasonably in many
environments, from the home, to office, to university, to government research
institutions.
At the same time, continued printing of scientific articles on
paper is becoming less effective, and not only because of the limitations of
print in a multimedia world. Libraries are hamstrung by ever-tightening
budgets, so the myriad journals can no longer be acquired. Space is limited for
continued storage of printed materials. The result is that published papers
will reach fewer readers, since access to the printed journals will become more
difficult.
The Internet offers solutions to these problems. With this new
medium come new opportunities. The time is ripe for a dramatic, profound shift
in how scientists should (and will) communicate in the near future.
Instead of publishing articles on paper, publication will occur
electronically, with an article appearing as a collection of files on the
Internet. With the global reach of the Internet, information stored
on any computer is potentially available on any desktop. There is no need to
walk to the library or to order a photocopy via interlibrary loan.
As discussed in more detail elsewhere in this book, the economics
of electronic publication may prove to be significantly less costly than print
publication. This can be especially true if authors take a larger role in the
publication process. Lower costs can lead to greater access to materials for all
scientists.
The growth rate of storage capacity of computers is even faster
than the growth rate of publication (Figure 1). This provides hope that our
increasing production of articles can be stored and managed by using the
distributed storage capabilities of the Internet. In fact, with storage
becoming so cheap and available, page limitations may become a restriction of
the past. There will be no pressure to remove data, spectra, etc. simply to
meet a length limitation established by a publisher.
Regardless of the economics of electronic publishing, there is no
doubt that e-publishing will allow fully enabled scientific discourse. There
are no added costs to producing a color image rather than a black-and-white
one. Videos can be created and saved in a digital file that can then be viewed
on any computer that has the proper software. The same is true for sound.
Perhaps more important and revolutionary is that e-publishing
brings data into the hands of the readers in a way that facilitates interaction
and discovery. The best way to describe this is by providing a concrete example
drawn from chemistry. Chemists are very interested in the structure of a
molecule, and one way of obtaining this is via x-ray crystallography. Once a
crystallographer has determined the structure, she must select a single view of
this molecule for presentation in the printed article. This is a static
("dead") image. A reader of this work may be interested in a view of the back
side of the molecule. Unfortunately, the reader is forced into some very
time-consuming manipulations to obtain the necessary view, either by entering
the structure by hand into her own molecular viewing program or by obtaining
the structure from a database and feeding it into the viewer. With electronic
publishing, what is published is not the "dead" image but the coordinates of
the molecule itself. The reader uses a browser that will automatically direct
the coordinates into the appropriate viewer, and the reader is free to
manipulate the structure to her heart's content. This concept of interactive
manipulation of data is possible in all disciplines, be it a matrix of wind
speeds in a hurricane, the scattering tracks made by particles in a collider,
the three-dimensional image of a brain tumor obtained by MRI,
or the DNA sequences of a dinosaur. This interactivity goes
far beyond what is possible in traditional print and is perhaps the most
compelling reason for the shift to electronic publication of scientific
articles.
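As a minimal sketch of what a viewer can do once coordinates, rather than a
picture, are published, the Python below rotates a small set of atomic
coordinates by 180 degrees about the vertical axis to show the "back side" of
a molecule. The coordinates are illustrative values for a water molecule, not
taken from any published structure, and rotate_about_y is a hypothetical
helper name.

```python
import math

# Illustrative coordinates in angstroms for a water molecule (assumed values,
# not taken from a published crystal structure).
WATER = [
    ("O", (0.000, 0.000, 0.117)),
    ("H", (0.757, 0.000, -0.469)),
    ("H", (-0.757, 0.000, -0.469)),
]


def rotate_about_y(atoms, degrees):
    """Return a new atom list rotated by `degrees` about the y axis."""
    theta = math.radians(degrees)
    c, s = math.cos(theta), math.sin(theta)
    rotated = []
    for element, (x, y, z) in atoms:
        # Rotation about y: x' = x*cos + z*sin, z' = -x*sin + z*cos.
        rotated.append((element, (c * x + s * z, y, -s * x + c * z)))
    return rotated


# Turn the molecule around to look at it from behind.
back_view = rotate_about_y(WATER, 180)
for element, (x, y, z) in back_view:
    print(f"{element} {x:8.3f} {y:8.3f} {z:8.3f}")
```

A real molecular viewer would apply the same kind of transformation
continuously as the reader drags the mouse; the point is only that the reader,
not the author, chooses the view.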
How an article is published on the Internet
Publishing an article on the Internet can be very simple. An author
writes the manuscript (creates a file) and places it on a computer that is
accessible by others. The question then becomes, what format should be used for
storing the article?
As of early 1998, there are two readily available and simple
formats for saving documents. They differ quite dramatically in their intent.
Portable Document Format (PDF) attempts to maintain the look and feel of the
printed document. It provides a means for storing a document in a digital form
that will appear identical on any screen and when printed from any output
device, such as a laser printer or offset press. PDF provides for the complete
typeset and layout control that publishers have been using for decades, meaning
that text and graphics can be incorporated on a single page, font and point
size are dictated and preserved, etc. Publishers can deliver a PDF file with
complete assurance of how the document will appear. A PDF file is created from
a wide variety of original sources, such as a text or graphics program, and
then viewed using (as of this writing) Adobe Acrobat Reader, a freely distributed
program. PDF can incorporate hyperlinks to other documents or other sections of
a single document. However, the user is not really dealing with text, since PDF
files cannot be directly edited; one cannot cut text out of the PDF file and
paste it elsewhere—one is really dealing with an image.
The alternative format is hypertext markup language (HTML), which
is an application of SGML (standard generalized markup language), a very
powerful language for defining document markup. Text is marked up to identify structure within
the document. For example, tags are inserted around text to indicate that this
is a header, or a footnote, or an author's name, or in boldface, etc. A program
then interprets these tags and formats the text appropriately. Currently, HTML
has a limited set of layout features, but as HTML continues to evolve, authors
will be able to produce ever more specialized layouts. Files written in HTML
can be viewed quite easily using a browser such as Netscape Navigator or
Microsoft's Internet Explorer.
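To make the idea of "a program interprets these tags" concrete, here is a
small sketch using Python's standard html.parser module. It is a toy renderer,
not how any real browser works: the choice to show headers in upper case and
boldface between asterisks is an assumption made purely for illustration, and
the sample text is invented.

```python
from html.parser import HTMLParser

# Invented sample document for this sketch.
SAMPLE = "<h1>Results</h1><p>The yield was <b>high</b>.</p>"


class PlainTextRenderer(HTMLParser):
    """Toy renderer: headers in upper case, bold text wrapped in asterisks."""

    def __init__(self):
        super().__init__()
        self.stack = []   # tags that are currently open
        self.pieces = []  # rendered output fragments

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        if tag in ("h1", "p"):
            self.pieces.append("\n")  # block elements end a line

    def handle_data(self, data):
        if "h1" in self.stack:
            data = data.upper()
        if "b" in self.stack:
            data = f"*{data}*"
        self.pieces.append(data)


def render(html):
    """Feed markup to the toy renderer and return the formatted text."""
    parser = PlainTextRenderer()
    parser.feed(html)
    parser.close()
    return "".join(parser.pieces)


print(render(SAMPLE))
```

The same marked-up file could be handed to a different renderer with different
rules, which is what lets the reader, rather than the publisher, control
presentation.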
The markup approach has a number of distinct advantages over the
PDF approach. One is that the reader can dictate some of the presentation
features, such as the font and point size, and the size of the window. But of
greater significance to scientists is that markup languages are inherently
extensible. Currently, HTML deals primarily with how text should be displayed.
For a scientist, some portions of the text have deeper meaning than the words
themselves. The name of a compound is more than just a string of text; it
contains chemical information. The name "benzene" implies a molecular formula
of C6H6, a molecular weight of 78.11, and the structure shown below.
A chemical markup language would allow for this type of
information to be contained within the document itself. A
chemically-intelligent browser could then render a 2-D or 3-D image of the
molecule instead of just the text. Another example is units. While most
scientists use the SI (International System of Units), it is neither universal
nor always convenient. A bond energy might be represented as 100 kcal/mol
within the text. Chemical markup would allow this number to be properly
identified as an energy unit, and a browser could provide instant conversion to
an alternative unit, such as ergs. The beauty here is that data can be
represented not only as text but also as manipulable information.
[Figure: structure of benzene]
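The sketch below illustrates how marked-up chemical text could become
manipulable information. The tag and attribute names (compound, formula,
quantity, unit) are invented for this example and are not the actual CML
vocabulary. The code derives the molecular weight of benzene from its formula
and converts the 100 kcal/mol energy to kJ/mol, chosen here as the alternative
unit because it needs only a single conversion factor.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical markup invented for this sketch; these are not CML tags.
DOC = (
    '<para>Consider <compound formula="C6H6">benzene</compound> and a bond '
    'energy of <quantity unit="kcal/mol">100</quantity>.</para>'
)

ATOMIC_MASS = {"C": 12.011, "H": 1.008}  # standard atomic weights, rounded
KCAL_TO_KJ = 4.184                       # thermochemical calorie


def molecular_weight(formula):
    """Sum atomic masses for a simple formula such as 'C6H6'."""
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[symbol] * int(count or 1)
    return total


def annotate(xml_text):
    """Return derived facts that a markup-aware browser could display."""
    root = ET.fromstring(xml_text)
    facts = []
    for node in root.iter("compound"):
        mw = molecular_weight(node.get("formula"))
        facts.append(f"{node.text}: {node.get('formula')}, MW {mw:.2f}")
    for node in root.iter("quantity"):
        if node.get("unit") == "kcal/mol":
            kj = float(node.text) * KCAL_TO_KJ
            facts.append(f"{node.text} kcal/mol = {kj:.1f} kJ/mol")
    return facts


for fact in annotate(DOC):
    print(fact)
```

The formula parser handles only flat formulas with the two elements listed; a
real chemical browser would need a full periodic table and structure data.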
Initiatives supported by both Netscape and Microsoft are creating
extensions to HTML, under the name XML. The first example of XML actually was
developed in the field of chemistry—CML, chemical markup language. A
proof-of-concept of CML already exists, with a definition of the tags and a
working browser.
Once the article is written and saved in an appropriate format, the
author simply mounts the files (either HTML or PDF), and any associated
graphics or interactive media, on a web server, which makes the document
available to the public.
The web therefore can be viewed in a sense as everyone's personal
vanity press. For most scientists, this perspective is anathema. Scientists
demand quality control and peer review. If the web ends up as a realm where
everyone places articles without peer review, there is fear that too much
garbage will be available and that it will be difficult, if not impossible, to
locate the "good" science, and that a giant step backwards will have been
taken. One of the strong arguments in favor of the journal system is that it
provides a framework for peer review, thereby assuring readers that the
contents have had at least some scrutiny before publication.
There is no reason to believe that the peer review system must
perish in an electronic environment. Furthermore, even with the ability to
easily and readily "self-publish" via the web, it is not a given that the
journals will cease, especially if there is a demand for the value-added
services a journal can provide. The key then is to recognize what the functions
of a journal are and how these may be provided in an electronic medium.
An Aside—The Present State of Electronic Journals in Chemistry
Many electronic chemistry journals are available, but the vast
majority of them take little advantage of the opportunities provided. Most
current electronic journals as of early 1998 are simply electronic delivery of
their print counterpart. In fact, virtually all are offered in both print and
electronic form. Since the publishers use the print version as the
authoritative source, the electronic version is simply a duplicate of the
print. Articles are usually made available as PDF files, though some offer HTML
versions as well.
The key point here is that there is almost no enhancement of the
publication to take advantage of the electronic nature of publication. The
American Chemical Society journals do include hyperlinks to other articles
published by the ACS, so one can quickly obtain an article referenced in the
original article (assuming of course that the reader subscribes to both
journals). Electronic publication of the articles is generally prior to the
print versions, so access to the information is improved. But in general,
interactive tools, 3-D structures, full interactive spectra, color graphics,
and animations are not included in any of these electronic versions. For most
journals, there is no procedure for authors to include electronic enhancements.
Some electronic journals, which are analogues of their print cousins, do allow
for deposition of these electronic enhancements within the supplementary
materials, which are made available through the web. However, they are still
not directly incorporated within the articles.
Only a handful of electronic chemistry journals take advantage of
the enhanced presentation aspects of electronic publication. The earliest was
the Journal of Molecular Modeling, which does include color graphics and
manipulative structures.
The Chemical Educator has had articles that include interactive
Java applets to demonstrate new educational techniques. The recently launched
Internet Journal of Chemistry offers authors extensive
opportunities to publish enhanced articles and begins to incorporate some
aspects of chemical markup within the articles.
A Model for Electronic Journals
We must recognize that in an electronic publishing regime, it is
not necessary for the journal to be the distributor of the information. In the
print world, this is essential; the journal collects the articles, prints them
on paper, binds them, and ships them to subscribers.
On the Internet, it is irrelevant where a document resides and how
it gets delivered to a reader. As long as the network is active and the server
is functional, documents will be delivered. A journal publisher may find it
useful to provide a central repository of documents and to function as a
long-term archive, an issue addressed in other chapters of this book. But
physical transfer of files from an author to a central computing facility is
not necessary.
In fact, a central repository may be inefficient. If the central
system goes down, the entire journal is gone. Mirror sites can be arranged
to provide redundancy, while a distributed collection of documents can take
advantage of local resources, e.g., large storage capacity or specialized
computing needs for certain applications embedded within an article.
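The redundancy argument can be sketched in a few lines of Python: a reader's
software tries each copy of an article in turn and uses the first one that
responds. The host names are placeholders, and fake_fetch is a stand-in for a
real network call so that the sketch runs offline; both are assumptions made
for illustration.

```python
def fetch_from_mirrors(path, mirrors, fetch):
    """Try each mirror in order; return (mirror, content) from the first success."""
    errors = []
    for base in mirrors:
        try:
            return base, fetch(base + path)
        except OSError as exc:
            # Remember why this copy failed and move on to the next one.
            errors.append(f"{base}: {exc}")
    raise OSError("all mirrors failed: " + "; ".join(errors))


def fake_fetch(url):
    """Stand-in for a network request: the primary server is 'down'."""
    if url.startswith("http://main.example.org"):
        raise ConnectionError("server down")
    return f"contents of {url}"


# Placeholder hosts, not real journal servers.
MIRRORS = ["http://main.example.org", "http://mirror.example.edu"]

source, text = fetch_from_mirrors("/vol1/article3.html", MIRRORS, fake_fetch)
print(source)
print(text)
```

With only one server in the list, the same call would fail outright, which is
the single point of failure described above.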
So what benefits should a journal provide in electronic media?
First, peer review and quality control remain the central functions. Journals
will provide the "stamp of approval" indicating that the article has been
judged by experts. Second, the journal provides a collection of these accepted
articles that together provide a continuously evolving picture of the state of
a discipline. A single, isolated scientific article has little context and
little value. A collection of articles, like this book for example, fleshes out
a topic and provides a broader context for evaluating the impact and importance
of the work. Thus, even though the documents may be geographically distributed
over many sites, the journal offers a collective index that maps onto the
Internet. In other words, journals become overlays to the collective content of
the Internet, a road map of approved content.
Thus the journal can provide a table of contents, author and
subject indices, classification schemes, hyperlinks through the citations, and
search engines across the contents. Searches could be conducted within an
article, across a range of articles, or across the entire journal. Search tools
can be quite complex and specific to each discipline. In a sense, the journal
becomes a database, with articles as its content. It is the organizational
structure that the journal provides that is the key here. If the Internet
becomes a complete world of "self-publication" with no journals, then
navigation and location of relevant materials will become horrifically time-
and resource-consuming. But the journal home page can be the stepping-off point
towards locating information.
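A journal acting as an overlay can be pictured as an index whose entries point
to articles stored on many different servers. The sketch below is one possible
shape for such an index, with a simple search across titles, authors, and
keywords. The entries, author names, and URLs are all invented placeholders,
and a real journal would use far richer, discipline-specific search tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    title: str
    authors: tuple
    keywords: tuple
    url: str  # where the accepted article actually resides


# Invented entries; hosts and papers are placeholders, not real articles.
INDEX = [
    Entry("Ring strain in small cycloalkanes", ("A. Author",),
          ("strain", "computation"), "http://chem.example.edu/strain.html"),
    Entry("Interactive spectra on the web", ("B. Writer", "C. Coauthor"),
          ("spectra", "java"), "http://lab.example.org/spectra.html"),
    Entry("Strain energies revisited", ("C. Coauthor",),
          ("strain",), "http://dept.example.net/revisited.html"),
]


def search(index, term):
    """Case-insensitive match against title, authors, and keywords."""
    needle = term.lower()
    return [
        entry for entry in index
        if needle in entry.title.lower()
        or any(needle in author.lower() for author in entry.authors)
        or any(needle in keyword.lower() for keyword in entry.keywords)
    ]


for hit in search(INDEX, "strain"):
    print(hit.title, "->", hit.url)
```

The articles themselves never move; only the index, which records what has
been accepted and where it lives, belongs to the journal.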
Third, the journal can provide a mechanism for subscribers to pose
questions and comments on the article to the author and to the other readers.
These questions and answers can be attached to the article, enabling readers to
move quickly from the text to comments and back. In a sense, a discussion group
or mini-conference can be created for each article, providing a "living
document" environment for every work.
The distributed medium also suggests some novel methods for peer
review. Articles can be deposited in a preprint archive and made available to
the community. This archive can collect comments and reviews from the community
at large instead of the one or two referees typically employed in the print
review process. Allowing community review may provide a broader stamp of
approval and increase access to less traditional work that may have a difficult
time in the typical review process. Perhaps, once an article generates a
critical mass of favorable comments from qualified reviewers or subscribers, it
is accepted by a journal.
It is clear that electronic publication can serve the scientific
community in new ways. The key to publication is information distribution.
Electronic publication facilitates this in many ways, some quite novel.
Electronic distribution is likely to be less expensive than print. Access by
scientists around the world is likely to be much greater and easier. Documents
can be made available in less time. Information content is boosted via the
electronic medium, allowing for publication of audio, video, large data sets,
and interactive tools. The tyranny of page limits becomes obsolete, and while
conciseness will remain next to godliness, the advantage of allowing all
quality work to be accepted surely outweighs the disadvantage of some verbose
contributions. Peer review can be made more inclusive and can empower the
community as a whole.
The future of electronic publication holds out hope for a true
information revolution.