CAN'T THE INDEX BE WRITTEN BY A COMPUTER?
Indexing ultimately organizes "aboutness" for quick recall. The computer and its software assist, but the human mind alone can speak to the concept of "aboutness". If a term or concept is not specifically articulated on a page, a computer cannot choose it for the index, nor can a search engine find references to it. Neither can the computer reword the entry in a form that aids readers who are unfamiliar with the author's thrust. A paragraph or discussion can be "about" a topic without specifically using those words.
For example, a discussion of passwords, keys, and locks may really be about security issues and encryption. The words "security" and "encryption" may not appear on that page, yet the index should direct the reader to it under those headings. It takes a human mind to draw these conclusions.
Furthermore, a computer cannot:
Thus, without a human being to analyze content and context, automation falls short, whether in a search process or in creating an index: it cannot effectively bring together relevant topics while avoiding the unrelated. Yes, a computer can think, just like a plane can fly or a car can drive.
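The passwords example can be made concrete with a small sketch. The page text and the list of headings below are invented for illustration; the point is only that a literal search cannot find a heading whose words never appear on the page.

```python
# Hypothetical page text: it is "about" security, but never uses that word.
PAGE_TEXT = (
    "Choose a long password and change it often. "
    "Keep your keys in a safe place and check that every lock works."
)

# Headings a human indexer might assign to this page (an assumption made
# for illustration, following the passwords example in the text).
HUMAN_HEADINGS = ["security", "encryption"]


def literal_matches(text, headings):
    """Return the headings that literally occur in the text (case-insensitive)."""
    lowered = text.lower()
    return [h for h in headings if h in lowered]


found = literal_matches(PAGE_TEXT, HUMAN_HEADINGS)
missed = [h for h in HUMAN_HEADINGS if h not in found]

print("found by literal search:", found)
print("missed by literal search:", missed)
```

Both headings end up in the "missed" list, because neither word occurs in the text. Deciding that the page belongs under them is the step that takes a human reader.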
For more on why a computer can't index a book, visit http://www.indexers.org.uk/index.php?id=463
If a computer can be programmed to beat a human at chess, why can't it write an index?
"Given any configuration of chess pieces, there are a finite number of sequences possible. It may be a huge number, but it's finite and (more important) definable. That is, with enough computing power and memory, it's possible for a computer to evaluate every possible sequence of moves to a win or loss, and to make the best move possible. That's not at all true of indexing or any other task that requires true understanding." (David Billick)
There are programs that are sometimes called indexing software, but they may in reality be search engines, or concordance builders, or text mining software. The idea of automatic indexing is different from the computer-assisted indexing that professional back-of-the-book indexers use.
Some systems can be adequate for a specific implementation. Products such as NStein, Inxight, Autonomy, Convera, Applied Semantics, Sonar Bookends, and Entriev automatically extract concepts from texts, in applications as diverse as indexing public records and processing accounts receivable for trucking firms. Their results, however, are not adequate for creating back-of-the-book indexes.
Even "text-mining" software has problems: "How well computers truly make sense of what they are reading is, of course, highly questionable, and most of those who use text-mining software say that it works best when guided by smart people with knowledge of the particular subject." (New York Times, 10/16/2003) These articles are also useful: http://www.intranetjournal.com/features/humanindex-1.shtml
The difference that the quality of the index (its usability, flexibility, and integrity) makes to the reader, and to the number of users who call your Help Line, can be enormous.
What is "automatic indexing"?
The Emperor's New Mind, written in 1989 by Roger Penrose, discusses why artificial intelligence (AI) will, in his opinion, *never* be able to "understand" what information is truly "about."
Martin Tulic: Wellisch's "Glossary of Terminology in Abstracting, Classification, Indexing, and Thesaurus Construction" defines Automatic Indexing as: Any [indexing] method by which the [text] of a [documentary unit] is subjected to algorithmic operations in order to extract [terms] or [phrases] that represent [subject], [topics], or [features] of the documentary unit. Here a word in square brackets is a term defined elsewhere in the glossary. By this definition, Automatic Indexing has indisputably been at the center of information retrieval systems ever since people realized the problems inherent in KWIC, KWOC, and similar simple algorithms.
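The text names KWIC (Key Word In Context) without describing it. The following is a minimal sketch of one common reading of the technique: every word of a title that is not on a stop list becomes an entry, shown with the words around it, and the entries are sorted by keyword. The titles, the stop list, and the function name kwic are assumptions made for illustration, not taken from any particular product.

```python
# Illustrative stop list; real systems use much longer ones.
STOP_WORDS = {"a", "an", "and", "in", "of", "the", "to"}


def kwic(titles, stop_words=STOP_WORDS):
    """Build (keyword, left context, right context) entries, sorted by keyword."""
    entries = []
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            key = word.lower().strip(".,;:!?\"'")
            if not key or key in stop_words:
                continue  # skip stop words and bare punctuation
            left = " ".join(words[:i])
            right = " ".join(words[i + 1:])
            entries.append((key, left, right))
    entries.sort(key=lambda entry: entry[0])
    return entries


# Hypothetical titles, echoing the passwords example earlier in the text.
titles = ["Locks and Keys", "The History of Passwords"]
for key, left, right in kwic(titles):
    print(f"{left:>20}  {key.upper()}  {right}")
```

Every entry is a word taken straight from a title. A heading such as "security" can never appear in the output, since the algorithm only rearranges the words it is given. That is the kind of limitation the passage attributes to simple algorithms.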
Does the indexer really have to read the whole book?
Oh yes. Several times.