Without fail, the first question asked when a system supporting ``intelligent'' information retrieval is proposed is, ``Why can't you just use a database?'' Making ``enemies'' of the database community would be neither productive nor to the point, but proponents of what are traditionally considered ``knowledge-based'' approaches must have ready answers that show the advantages and disadvantages of pursuing one or the other.
The basic answer to this question lies in the fundamental tradeoff in knowledge representation between expressibility and tractability [LB87]. A database system is a form of knowledge representation that is low on the expressive scale and high on the tractability scale, and this is its raison d'etre. One must understand when choosing a database that there is information, indeed knowledge, that simply cannot be expressed in it. By the same token, one must understand when using a more expressive representation that responses to certain queries may be quite slow; indeed, in some systems a query may never return a response at all.
A simple example of the kind of useful knowledge that cannot be represented in a database arises when finding articles in a library. Consider the scenario where a user is looking for seminal papers on intelligent access to digital libraries, and finds the following entry in the database:
ARTICLE-01405::
    TITLE: Knowledge Representation for Intelligent Information Retrieval
    AUTHOR: PERSON-11234 (Chris Welty)
    PUBLISHED-IN: PROCEEDINGS-54382 (Proceedings of the CAIA-94 Workshop on Intelligent Access to Digital Libraries)
The user decides to pick up a copy of this paper in the library, and needs to know where to find it. The next step requires some simple reasoning on the part of the user: since the article is published in the proceedings, finding the proceedings means finding the article. The user then looks up PROCEEDINGS-54382:
PROCEEDINGS-54382::
    TITLE: Proceedings of the CAIA-94 Workshop on Intelligent Access to Digital Libraries
    LOCATION: /dl/data/proceedings/54832
Rather than requiring users to make this inference, a more expressive KR system would allow that piece of knowledge to be represented as a rule, something like:
IF ?x published-in ?y AND ?y location ?z THEN ?x location ?z
If such a rule were represented, the entry for the article would appear with the proper location as a result of the initial query:
ARTICLE-01405::
    TITLE: Knowledge Representation for Intelligent Information Retrieval
    AUTHOR: PERSON-11234 (Chris Welty)
    PUBLISHED-IN: PROCEEDINGS-54382 (Proceedings of the CAIA-94 Workshop on Intelligent Access to Digital Libraries)
    LOCATION: /dl/data/proceedings/54832
This is a very simple example for the purposes of illustration, and most users could easily make the inference themselves, but the point is that the inference cannot be represented in a database. A user must know that an article's location is the same as the location of the thing it is published in. A more expressive KR system allows the modeler to add that knowledge to the system so that the user doesn't have to know it. This is the essence of intelligent assistance.
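The location rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the representation used by any particular KR system: entries are stored as (subject, predicate, object) triples, and the helper names `infer_locations` and `lookup` are hypothetical. The rule is applied repeatedly until no new facts appear, so a plain lookup on the enlarged fact set returns the article's location directly.

```python
# Facts as (subject, predicate, object) triples, mirroring the two entries above.
# The triple encoding is an assumption made for this sketch.
facts = {
    ("ARTICLE-01405", "published-in", "PROCEEDINGS-54382"),
    ("PROCEEDINGS-54382", "location", "/dl/data/proceedings/54832"),
}


def infer_locations(facts):
    """Apply: IF ?x published-in ?y AND ?y location ?z THEN ?x location ?z.

    The rule is re-applied until a pass adds nothing new (a fixed point).
    """
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for (x, p1, y) in list(derived):
            if p1 != "published-in":
                continue
            for (y2, p2, z) in list(derived):
                if p2 == "location" and y2 == y:
                    new_fact = (x, "location", z)
                    if new_fact not in derived:
                        derived.add(new_fact)
                        changed = True
    return derived


def lookup(facts, subject, predicate):
    """Return all objects recorded for (subject, predicate), sorted."""
    return sorted(o for (s, p, o) in facts if s == subject and p == predicate)


closed = infer_locations(facts)
print(lookup(facts, "ARTICLE-01405", "location"))   # stored facts only: []
print(lookup(closed, "ARTICLE-01405", "location"))  # with the rule applied
```

The nested loop makes the cost of inference visible: every pass compares each published-in fact against each location fact, which is the price paid for the added expressiveness.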
This clearly introduces a performance issue, as the expressiveness/tractability tradeoff dictates. The cost of performing inference for every entry processed may be fairly high. Some systems provide the capacity to attach rules only to specific entries; in the above case, for example, we might say that the rule for inferring location should be used only for proceedings and books.
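One way to picture attaching the rule to specific kinds of entries is the sketch below. It is a hypothetical illustration: the dictionary layout, the `type` field, the `RULE_TYPES` set, and the entries JOURNAL-00001 and ARTICLE-99999 are all invented for the example. The location rule fires only when the containing entry is a proceedings or a book, and a counter records how often inference actually runs.

```python
# Entries keyed by identifier. The "type" field and the last two entries are
# hypothetical additions for this sketch; they do not appear in the example above.
entries = {
    "ARTICLE-01405": {"type": "article", "published-in": "PROCEEDINGS-54382"},
    "PROCEEDINGS-54382": {"type": "proceedings",
                          "location": "/dl/data/proceedings/54832"},
    "ARTICLE-99999": {"type": "article", "published-in": "JOURNAL-00001"},
    "JOURNAL-00001": {"type": "journal", "location": "/dl/data/journals/00001"},
}

# The location rule is attached only to these kinds of containing entry.
RULE_TYPES = {"proceedings", "book"}


def location_of(entry_id, entries, stats):
    """Return an entry's location, inferring it through published-in only
    when the containing entry's type has the rule attached."""
    entry = entries[entry_id]
    if "location" in entry:
        return entry["location"]          # stored directly, no inference needed
    container_id = entry.get("published-in")
    if container_id is None:
        return None
    if entries[container_id]["type"] not in RULE_TYPES:
        return None                       # rule not attached: skip inference
    stats["rule_firings"] += 1            # count each application of the rule
    return location_of(container_id, entries, stats)


stats = {"rule_firings": 0}
print(location_of("ARTICLE-01405", entries, stats))  # inferred via the proceedings
print(location_of("ARTICLE-99999", entries, stats))  # journal: rule not applied
print(stats["rule_firings"])
```

Restricting the rule this way trades completeness for speed: the journal article's location is not inferred, but no inference work is spent on entries the modeler has excluded.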
The amount of information available is growing exponentially, and the need for more intelligence in the representation will soon outweigh the need for rapid retrieval: the more information there is in the library, the more knowledge will be needed to find things. If users are expected to have that knowledge in order to find the information they want, many will not be able to find it. The survival of digital library technology will depend on making as much of the information as possible accessible while requiring as little as possible of the user. The tradeoff must be balanced, however: too much expressiveness will have the same negative effect as too little, because of the performance problems it brings.