Proposed Change in Intellectual Property Law Threatens Research and Education

July 15, 1998

Inside Washington
Fred W. Weingarten

In May, the House of Representatives voted to pass the "Collections of Information Anti-piracy Act" (H.R. 2652). The bill makes an unprecedented and fundamental change in intellectual property protection, a change that challenges the very foundations of research and education. Yet, so indifferent or unaware were members of the House to the implications of the legislation that they didn't bother to take a recorded vote, but pushed it through by voice vote under suspension of the rules. Only George Brown (D-CA) stood up and spoke against the legislation.

Information and Scholarship
To understand why this legislation, if passed into law, represents such a major change, let's step back a few years. As an engineering student, I had to become familiar with engineering handbooks of various kinds. Readers may remember such books: fat tomes of data, with very brief explanatory text, consisting of tables of mathematical constants, trigonometric and log tables, properties of materials, mathematical formulas, and so on. During the academic year, I used the handbooks in engineering courses; in the summer, I used them on the job to analyze the thermal properties of rocket nozzles.

The publishers of the handbooks, of course, copyrighted their volumes. Were I to have photocopied one of them, in large part or in its entirety, and tried to sell the copies as my own creations, I would have been violating the law. Even as a young graduate student, I recognized that. But I would never have dreamed that the publishers could or would think to assert rights over the data themselves, either over their use or over their being republished (whether in exactly the same form or not) in another work. They made their money by selling the books; those who owned the books, in turn, were free to use the data for whatever purposes they wished, whether for commercial design, education, or an easy way to pick lottery numbers.

In the scholarly tradition, information (facts, data, theories, models, mathematical formulas, and the like) is a common resource for all. In fact, it is the responsibility of the scholar to share with others, through publication and teaching, what he or she has learned. Research advances incrementally, moving beyond previously published information and theories by proving, improving, or disproving old ideas or offering new ones. Engaging in this "marketplace of ideas" is a highly competitive process, but it is the only way that scientific understanding can advance.

At the same time, publishers of engineering handbooks put a lot of work into creating their publications, possibly calculating values, mining data from various publications, arranging the information in readable tables, and so on. Hence their need to sell the books, and their claim to protection from having the fruits of their efforts stolen, reproduced, and sold in competition.

Careful Balancing of Copyright Law
It is a paradox of information flows in our society that information sharing for research and education is often done through the commercial marketplace, although scholarly and commercial values seem at times to conflict. Intellectual property law has always walked a careful balance between assuring open access to information and providing economic incentives for publication. This balance, which has evolved over centuries, has many elements in the law. The important one at issue in the debate over H.R. 2652 is the traditional distinction between "facts" and "expression." Expression is protected; facts are not.

This distinction is central for scholarship and teaching. I am not permitted to plagiarize a journal article, one that introduces, say, a new mathematical theorem. But I certainly am allowed to reproduce the theorem, the notation, or the logical sequence of the proof. I might do so in order to extend the line of reasoning in new directions, to apply the theorem to some new problem, or even to uncover a flaw in the proof. Alternatively, I might copy the theorem in writing a mathematics textbook or a survey article (giving full credit and attribution, of course, to the original author).

One could draw similar examples from any field of research. In spite of differences in the particular forms of information, and the ways in which they are analyzed and presented, all draw from the past to construct the future. It is a matter of moving knowledge forward. This is not just a legitimate social goal, but an expected, almost mandated, one. In fact, it is stated explicitly in the portion of the Constitution that gives Congress the power to make intellectual property law. Article I, Section 8, Clause 8, says that Congress shall have the power "To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries." There it is, in plain 18th-century English. The purpose of intellectual property law is to promote (and surely not to inhibit) innovation.

The Problem
Despite this long historical balance, there is an issue that many in Congress feel needs to be addressed. Information, including scientific information, is becoming an increasingly valuable marketplace commodity, a big business. In many cases, facts can be valuable in and of themselves. The creative organization, formatting, or interpretation of those facts offered by the data provider not only are unimportant, but can even be impediments between researchers and the information they seek to use.

Recently, however, a few major court decisions have ruled that simple, obvious organizational structures (alphabetical or chronological ordering, for instance) are not sufficient to merit copyright protection for the offerings of information providers. Since the facts are already not protected by copyright, this is sufficient to deprive those providers of any intellectual property protection for their products.

In the disputes that gave rise to those court cases, other parties had essentially copied vendors' offerings and marketed them in some form, under other names, thus undercutting the market for the original product. Technology, of course, makes both copying and distributing easier to do, whether the information product is electronic or paper. The information industry argues that it can take significant investment to create these collections of information even if the effort is not creative. They argue that, if the law does not protect so-called "sweat of the brow" investments in compiling facts, there will be no market incentive to make them, and society will be the poorer. Thus, they have approached both Congress and the World Intellectual Property Organization (WIPO) for relief.

And, thus, Congress is moving forward a bill, H.R. 2652, that some in the scientific and library communities, including the American Association for the Advancement of Science and the National Academies of Sciences and Engineering, have found alarming.

The Bill
The bill attempts to protect information providers who have compiled information collections by making it illegal to extract, in whole or in substantial part, information from a database and then to use the extracted material to compete with the original provider. That sounds reasonable enough. It certainly did to those in the House who pushed it through, and to many of those who chose to ignore it.

The problem, as pointed out earlier, is that what the bill makes illegal is uncomfortably close to what scholars and teachers do for a living. It does so by attempting to protect facts themselves from exploitation by another party. If distinctions are not addressed very carefully in the language of the bill, if the bill is not drawn very narrowly, the result quite literally could be to outlaw the scientific method. To see why this might be so, we need to look just briefly at some of the wording in the bill.

1. Collection of Information: Although the bill is referred to popularly as a "database" bill, that word never appears in it. Rather, the bill protects "collections of information," which are defined somewhat circularly as "information that has been collected and has been organized for the purpose of bringing discrete items of information together in one place or through one source so that users may access them."

This broad definition would seem to include just about any document, book, or article ever produced. It certainly includes printed as well as electronic products, and was intended to have such coverage. Both of the recent court decisions referred to earlier involved printed, not electronic, products. So the first important point is that, whatever actions the bill outlaws, they cover the totality of scientific publication.

2. Extraction and Competition: The unpleasantly (but typically) turgid prose and the confusing language of this part of the bill are one reason people differ or are uncertain about the impact of the bill. I have to repeat it here:

"Any person who extracts, or uses in commerce, all or a substantial part, measured either quantitatively or qualitatively, of a collection of information gathered, organized, or maintained by another person through the investment of substantial monetary or other resources, so as to cause harm to the actual or potential market of that other person . . . for a product or service that incorporates that collection of information and is offered or intended to be offered for sale . . . shall be liable to that person."


I'm not going to parse that awful sentence. (I've already simplified it by taking out some of the unimportant qualifiers.) But it seems clear that it could arguably cover a much greater range of activity than just copying someone's database and offering it commercially in competition with the original owner. The bill talks about "potential" and, later on, "intended future products" that would incorporate the data. It is very open-ended language, affording an information provider extensive downstream control over the facts that he or she collects. If, say, I were to extract information from a journal article for a textbook, the publisher of that article could argue that it intended to publish a textbook incorporating data from that article. Since my book threatened that publisher's potential market, wouldn't I, therefore, be in violation of the law?

As often occurs in the analysis of potential legislation, the game being played is as follows: Optimists say the bill means only good things and would never be interpreted so harshly. Pessimists say that one should never be sanguine about the good sense of the courts, especially by the time they would get around to interpreting the bill; this is especially so when large economic stakes ride on an interpretation one way or the other. If one wants to protect continuing educational, research, and public use of facts, one had better make it clear (as the bill has explicitly done in the case of sports scores).

If the pessimists' reading is correct, then Goodbye, scholarship. Goodbye, research. Goodbye, education.

The State of the Game
The bill has passed in the House and was sent to the Senate in May. Some feared that heavy pressure from the sponsors, coupled with the fact of rapid approval in the House, would create conditions for the same kind of fast track in the Senate. Fortunately, that has not happened. The Senate Judiciary Committee has said that it will examine the bill with some care. That will presumably mean hearings and informal discussions with members of the research, education, and library communities. Some science associations have already started the process with staff of the science subcommittee of the Senate Committee on Commerce.

To date, higher education organizations, which have a significant stake in the outcome, have been noticeably silent. In fact, their silence was noted in the House as evidence that there were no problems. As mentioned above, some scientific organizations have been involved in their way, but their actions also have been limited. These groups have been silent for two reasons. In the first place, intellectual property law is not a policy area in which higher education and research have been historically very active. They lack staff expertise in the area and don't have the necessary political contacts, and they are not familiar with the players, or even with the rules of the politicking game in this arena.

Secondly, political fights that involve heavy and expensive daily lobbying (which this one does) are not a good match with the typical style of science policy. That style, to oversimplify, has been to present carefully reasoned reports and communications from prestigious bodies and to expect that, by and large, good will be done by our friends on the Hill.

This situation may reflect broader trends in science policy: (1) the growing impact of non-scientific policy issues on research and (2) the changing styles and demands on the research community in all of its Washington activities. But that is another article.

The Dilemma
Intellectual property law is a delicate dance on the razor's edge of a dilemma. Economists tell us that the more competitive a market, the lower the prices we pay for goods and services and the higher the rates of innovation. To that end, antitrust lawyers seek what I heard a Justice Department official once refer to as a "Jurassic Park" of competition. From that and the consumer's perspective, it is to be desired that A take B's ideas and do one better with them: improve them, sell them cheaper, and so on. It's a cruel process in its rawest form.

By restricting that process, intellectual property law essentially restrains competition. It does so in the name of innovation, offering exclusive rights as a social reward for invention. (Fairness to a creator, although not explicitly a part of intellectual property law in the U.S., is also certainly a political consideration.)

So the rule is: Restrict enough to provide incentives, but not so much as to restrain innovation or to extract excessive monopoly profits from the consumer. It's a tough job and can only be done imperfectly. One tool that policy-makers use is to draw bright lines that help define the boundaries. And one of the sharpest, longest-standing boundaries is the distinction between fact and expression. The research and education community needs to be deeply concerned when politicians approach that line or start rubbing it out.

It remains to be seen whether the willingness of the Senate to listen to the scientific community will result in actual consideration of ways to alleviate its concerns. Powerful forces are pushing this bill forward. Far narrower approaches to the bill have been suggested, and they might offer redress for the particular types of data theft and unfair competition that ostensibly led Congress to consider legislation. On the other hand, it seems quite likely that these unfortunate side-effects for research and education are not accidental or unanticipated consequences. Wouldn't you like to "own" forever all downstream rights to any use of the "fact" of the double-helix structure of DNA?

Fred W. Weingarten is director for government affairs at the Computing Research Association and a senior policy fellow at the American Library Association.
