Cavan McCarthy, Ph.D., Visiting Professor
School of Library and Information Science
The University of Iowa

OPEN BIBLIOGRAPHIC
SOFTWARE PROJECT:
bibliographic databases using Open Source software in Web, desktop and developing country environments

PRELIMINARY HTML PRESENTATION OF A PROPOSED RESEARCH PROJECT

SUMMARY

A central preoccupation of information professionals at this moment is to make legacy databases available via the World Wide Web. Until recently this was a complex task: enterprise level systems in this area were expensive; small scale systems were often limited and sometimes also quite costly, while there was no viable middle path. Open source software offers an escape from this situation: it is free or sold for a nominal price; it is increasingly efficient at distributing database content via the WWW; technical support is available via a growing network of enthusiasts and users. Much open source software runs on Linux operating systems, which are increasingly attracting attention and gaining the confidence of information professionals as stable and secure alternatives to the Windows operating system.

Most database activity takes place in commercial or business environments. Information professionals have more difficulty in identifying and using software for distributing bibliographic data via the web, or for database-enabled library gateways. A central aim of this project will be to create, test and compare systems which operate adequately in bibliographic or informational environments.

Many professionals also use small-scale databases to organize bibliographic citations on their personal computers. Much desktop level Open Source software is available, but little attention is paid to small scale general purpose databases, let alone bibliographic software. An unusual situation has arisen in small scale bibliographic database systems for Microsoft or Apple operating systems (EndNote, ProCite and Reference Manager): these have all been purchased by a single company. There is therefore little chance of a fall in the price of these products, which strengthens the opportunities for the production of open source software for this area. Such a system could benefit greatly from the cooperative development procedures firmly established in the open source field.

Open source software is of especial interest to developing countries, which have little money available for expensive Windows or database solutions, but which urgently need to organize and disseminate their informational resources. Library and information professionals are in an especially difficult situation, lacking financial resources and technical knowledge. They had previously been able to use CDS/ISIS, a UNESCO-supported bibliographic database software, but further development of this system is uncertain, following the recent death of its principal programmer and developer. There is a major demand in developing countries for a viable alternative to CDS/ISIS. Interest is not limited to placing databases on the web, but extends to library automation systems; current plans to develop library automation systems in open source software are of great relevance to developing countries.
 

DATABASES, WWW AND OPEN SOURCE SOFTWARE

The World Wide Web is famous (or notorious) for impacting or even overturning long-established systems in communication, commerce, publishing, education etc. Sometimes whole sectors are replaced; at the least, foundations and existing structures are vigorously shaken and forced to re-evaluate their procedures and objectives. This is exactly what has happened with the traditional database field. Databases involve a dynamic interaction, in which specific retrieval queries, made to the database, generate immediate results containing specific data; results would normally be in ASCII text. On the World Wide Web, servers make available whole pages of pre-established HTML content, which are then requested by and sent to clients, to be displayed as a whole on their local browsers. It has been surprisingly difficult to combine the two types of systems, leading to the frequently cited problem of legacy databases, that is, databases generated by traditional computing methods, which cannot be queried via a web interface (Marcinko, 1998). One of the major challenges facing library and information professionals is that of developing systems which will easily and cheaply place legacy databases onto the web, and also meet the exponentially increasing demand for web-enabled databases for new applications.
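The gap between query-driven databases and page-oriented web servers can be illustrated with a minimal sketch. It assumes Python and its built-in sqlite3 module as a stand-in for whichever Open Source database the project eventually selects; the table name `citations` and the functions `build_demo_db` and `search_as_html` are hypothetical names chosen for this illustration, and the two sample records are taken from the bibliography of this proposal.

```python
import sqlite3
from html import escape


def build_demo_db():
    """Create a small in-memory table standing in for a legacy database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE citations (author TEXT, title TEXT, year INTEGER)")
    conn.executemany(
        "INSERT INTO citations VALUES (?, ?, ?)",
        [
            ("Marcinko, Randall", "The legacies left us by database producers", 1998),
            ("Beiser, Karl", "Database driven web sites: Cold Fusion for web publishing", 1997),
        ],
    )
    return conn


def search_as_html(conn, author_fragment):
    """Run one retrieval query and wrap the matching rows in an HTML page."""
    rows = conn.execute(
        "SELECT author, title, year FROM citations WHERE author LIKE ? ORDER BY year",
        ("%" + author_fragment + "%",),  # parameterized to keep user input out of the SQL text
    ).fetchall()
    items = "".join(
        "<li>{}. {} ({})</li>".format(escape(author), escape(title), year)
        for author, title, year in rows
    )
    return "<html><body><ul>{}</ul></body></html>".format(items)


if __name__ == "__main__":
    print(search_as_html(build_demo_db(), "Beiser"))
```

In a real deployment the page would be returned by a web server in response to a form submission; middleware products such as those discussed in this proposal perform essentially this query-to-HTML step on a larger scale.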
 

SMALL SCALE, WEB-ENABLED DATABASE SYSTEMS
(EXCLUDING OPEN SOURCE SYSTEMS)

Small scale systems in this area are limited and sometimes also costly. Microsoft Access is well-known to PC users, because it was until recently packaged with Microsoft Office. But it is a relatively limited system, and now has to be purchased separately, for several hundred dollars. Filemaker Pro (based on products formerly sold under the trade name Claris) offers cheap Intranet connection for a small system; for wider access it is necessary to purchase advanced editions of the software, which are substantially more expensive. Middle range systems can be supported by a combination of Filemaker Pro and a middleware such as Tango.

Filemaker:
http://www.filemaker.com/

http://www.filemaker.com/products/fm_features.html#datamanagement
 

Tango:
http://www.pervasive.com/index.html

http://www.pervasive.com/products/tango/

http://www.pervasive.com/products/tango/sites.html
 

An important product in the context of this project is ColdFusion, a middleware product from Allaire which permits the distribution of databases via the web. It is frequently used in libraries (Beiser, 1997); web pages which include the suffix .cfm are generated by this system. It runs in both Windows and Linux (but not Macintosh) environments and is available in a series of editions:
 

ColdFusion:

http://www.allaire.com/products/coldFusion/generalInformation/FAQs/45Express.cfm

http://www.allaire.com/handlers/index.cfm?ID=14092

http://www.allaire.com/documents/cf45docs/acrobatdocs/45refcard.pdf

http://commerce.allaire.com/handlers/index.cfm?ID=5854&Method=Full&Title=Allaire%20Price%20List&Cache=False
 
 

LARGE SCALE (ENTERPRISE LEVEL) DATABASE SYSTEMS

Beyond ColdFusion, Allaire also offers a powerful enterprise-level solution, Spectra. This is a top-level but relatively expensive ($15,000) solution for e-commerce and content management; it runs on top of ColdFusion, but only in a Windows environment:
http://www.allaire.com/products/spectra/
 

Microsoft SQL Server is Microsoft's enterprise level database system; it is aimed at corporations and licensing costs easily run to several thousand dollars. IBM's product in this area is DB2, which runs on a variety of platforms (but not on Macintosh); this is powerful but costs close to $18,000.

Perhaps the best-known enterprise-level database system is Oracle, produced by the world's largest database company. Its major advantage is that applications work over all platforms. It is used for major web-based applications, such as Yahoo, Amazon.com or Blockbuster, and is priced accordingly. A major application will rapidly attract licensing fees in the tens of thousands of dollars; the chairman of Oracle is considered to be the second-richest man in the world.

Oracle sites:
http://www.oracle.com/

http://www.oracle.com/appserver/index.html
 
 

DESKTOP LEVEL BIBLIOGRAPHIC SOFTWARE

Small scale bibliographic databases can be set up easily and efficiently on microcomputers using one of the trio of specialized software products, EndNote, ProCite and Reference Manager, now all part of ISI ResearchSoft, a Thomson Corporation company. A further product, Reference Web Poster, makes a database available via the web using a relatively simple presentation. For a demonstration of Reference Web Poster, see:
http://www.isiresearchsoft.com/rwp/rwpdemo.html
 
 

OPEN SOURCE DATABASE SOFTWARE

PostgreSQL is probably the oldest Open Source database system; it was based on Ingres, a system developed in the late seventies at Berkeley; Postgres was developed in the late eighties and added SQL capability in 1995. The incorporation of the widely-used standardized database language, SQL, made the software readily accessible to systems analysts who work in the database area. PostgreSQL functions on Windows NT or Open Source operating systems, but not Macintosh; above all it is available free of charge (but without the customer support offered by commercial systems).
http://www.postgresql.org

MySQL is widely known as an alternative to PostgreSQL; it runs on all platforms, Unix, Mac and Windows, and offers an interface relatively similar to Windows; it runs close to enterprise level. Like PostgreSQL, it is available free, although its development is controlled more closely than that of PostgreSQL.
http://www.mysql.com

Your SQL Database Might Just Be MySQL:
http://www.oreillynet.com/pub/a/network/2000/06/16/magazine/mysql.html

Open Source Databases: As The Tables Turn / Tim Perdue:
http://www.phpbuilder.com/columns/tim20001112.php3

Recently much attention has been paid to solutions which use an additional open-source software, PHP, as an interface to MySQL:

Joe Brockmeier: Introduction to PHP:
Finally, the perfect language for dynamic content and database interaction;
http://www-106.ibm.com/developerworks/library/l-php.html?dwzone=linux

Are PHP and MySQL the Perfect Couple?
http://www.oreillynet.com/pub/a/network/2000/06/16/magazine/php_mysql.html
 

OBJECTIVES

Investigate and facilitate the organization, retrieval and transmission of systematically organized information from databases using Open Source software in a variety of automated and social contexts.

Prepare, test, evaluate and disseminate systems and procedures using Open Source software which:

STATEMENT OF THE PROBLEM

Premise: In order to develop adequately and offer benefits to an increasing number of its members, society needs increasing access to organized, integrated information sources, both traditional and bibliographical, through a variety of interfaces. The need is especially strong in developing countries, which urgently need information systems that adequately support their development process. Current systems are fragmented, expensive and subject to monopoly pricing, which makes them inappropriate for a variety of environments. Open Source software offers a clear solution, but needs careful investigation, testing and evaluation before it can fully answer the problems.
 
 

HYPOTHESES

(Note: the testing of hypotheses is not the central objective of this study, but some general hypotheses have been prepared to guide the activities)

General hypothesis:
(Note: hypotheses are presented in positive form)

Open Source software offers solutions to a variety of informational problems that are significantly better than those offered by traditional software.

Specific hypotheses:

Open Source software is able to make legacy databases, both classic and bibliographic, available on the World Wide Web, more cheaply and more efficiently than traditional software for this area.

Open Source software is able to support personal databases, both classic and bibliographic, in a desktop environment, more cheaply and more efficiently than traditional software for this area.

Open Source software is able to offer database solutions, both classic and bibliographic, to developing countries, more cheaply and more efficiently than traditional software for this area.
 

METHODOLOGY

(The phases are analyzed here in this order, but in fact they are independent and could be undertaken in any order).

Test and evaluate Open Source software for distributing databases via the World Wide Web.

Analyze the specific problems of using Open Source software for the dissemination of bibliographic information via the World Wide Web. Identify, test and disseminate specific Open Source software solutions appropriate for this area.

Test and evaluate Open Source software for personal databases in a desktop environment.

Analyze the specific problems of using Open Source software for bibliographic information in a desktop environment. Identify, test and disseminate specific Open Source software solutions appropriate for this area. Note that current small scale bibliographical systems such as EndNote, ProCite or Reference Manager are relatively expensive and complex, because they include a large number of pre-established templates, for journal articles, books, book chapters, patents etc. They also include a large number of output formats, ANSI, APA, Chicago, Index Medicus, JAMA, Nature, Turabian etc. Under the philosophy of Open Source the central kernel of the software could be set up centrally, while the templates and output formats could be developed by interested participants and shared with others.
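The division of labour suggested above, a small central kernel with output formats contributed by participants, could be sketched roughly as follows. This is a minimal illustration in Python, not a proposed design: the names `OUTPUT_FORMATS`, `output_format`, `render`, `author_date` and `title_first` are hypothetical, and the two formats shown are simplified stand-ins, not faithful renderings of any published citation style. The sample record comes from the bibliography of this proposal.

```python
# Central kernel: a registry that maps a style name to a formatting function.
OUTPUT_FORMATS = {}


def output_format(name):
    """Decorator that registers a contributed output format under a name."""
    def register(func):
        OUTPUT_FORMATS[name] = func
        return func
    return register


def render(record, style):
    """Kernel entry point: look up the requested style and apply it."""
    try:
        formatter = OUTPUT_FORMATS[style]
    except KeyError:
        raise ValueError("unknown output format: " + style) from None
    return formatter(record)


# Contributed formats: each participant adds one small function.
@output_format("author-date")
def author_date(record):
    return "{author} ({year}). {title}. {journal}.".format(**record)


@output_format("title-first")
def title_first(record):
    return "{title} / {author}. {journal}, {year}.".format(**record)


if __name__ == "__main__":
    article = {
        "author": "Marcinko, Randall",
        "title": "The legacies left us by database producers",
        "journal": "Database",
        "year": 1998,
    }
    for style in sorted(OUTPUT_FORMATS):
        print(render(article, style))
```

Under this arrangement the kernel never needs to change when a new style is contributed; templates for record types (journal articles, books, patents etc.) could be shared in the same way.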

Analyze and evaluate Open Source software from the point of view of its suitability for use in developing countries. Recommend specific software solutions, valid for developing country environments. Prepare adequate manuals and training materials, disseminate the solutions and evaluate their impact. Of especial relevance here are attempts to create an Open Source library automation system:
Freeware Library System. Biblio-Tech Review. October 2000. Approx. 3 p.
http://www.biblio-tech.com/html/freeware_library_system.html
 

SCHEDULE:
Two year project:

First semester: background reading, identifying, obtaining and installing relevant Open Source software solutions.

Second semester: preliminary evaluation of Open Source software solutions relevant to project objectives.

Third semester: identification and in-depth analysis of relevant Open Source software solutions. Preparation of manuals and training materials, and dissemination of systems and procedures amongst potential users.

Fourth semester: analysis of reactions from users; creation of final version of system and procedures; preparation of final report.
 

TEAM:
Principal researcher: Cavan McCarthy, Ph.D.
Graduate assistant(s)
Researchers or collaborators
interested in open-source software
and in informational problems in developing countries
(Note: this project offers ample scope for international collaborative projects).
 

BUDGET:
Limited financial support required for:
Graduate assistants
Office and computer support etc.
Occasional purchase of software
(limited cost: Open Source software typically costs from $25 to $70)
 

ELECTRONIC SUPPORT SERVICES FOR THE
OPEN BIBLIOGRAPHIC SOFTWARE PROJECT

Creation and maintenance of a website which will act as a
referral point for the Open Bibliographic Software Project
(no such site currently exists)
Link to operational web databases which offer examples relevant to the objectives of the project
Link to software distributors which offer relevant products
Link to publishers and materials of relevance to the group
(Note that there is some possibility
of financial return from the latter two areas)

As constituent parts of the site: set up and maintain:

Specialized bibliography on open source software,
especially in a bibliographic / developing country context;
(this will of course be distributed via an appropriate open source database system)

Library of documents relevant to open source software activities, especially in a bibliographic / developing country context. Full-text documents will be mounted in appropriate cases; otherwise links will be made to relevant documents.

Receive responses to questionnaires and surveys on open source database software via HTML form or similar electronic system. (Note: data collected in this manner would not be completely reliable, as the electronic respondents would represent only those using Internet at a relatively high level. But it would be very interesting to compare data collected electronically without proper sampling and control with data collected by traditional means; studies of this nature are highly relevant as Internet-based data collection will probably become common in the near future)

Maintain and archive a discussion list (LISTSERV) on open source software activities, in bibliographic / developing country contexts.
 
 

BIBLIOGRAPHY

Beiser, Karl. Publishing text databases on the Web with Inmagic's DB/Text Webserver. Database. 1996 Dec; 19(6):45-50.

Beiser, Karl. Database driven web sites: Cold Fusion for web publishing. Database. 1997 Dec; 20(6):48-52.

Biggs, Deb Renee. Procite in Libraries: applications in bibliographic database management. Westport, CT: Learned Information; 1995. 256 p.

Buxton, Andrew and Hopkinson, Alan. The CDS/ISIS for Windows handbook. 2nd ed. London: Library Association; 1997.

Dyck, Timothy and Taschek, James. Application servers bridge the data gap: 10 high-powered, scalable, and fault-tolerant servers that put your database on the Internet. Internet Computing. 1998 April; 3(4):77-95.

Green, Rebecca. The design of a relational database for large-scale bibliographic retrieval. Information Technology and Libraries. 1996 Dec; 15(4):207-221.

Gourley, Don. Opening doors with Open Source. Computers in Libraries. 2000 Oct; 20(9):40-43.

Harker, Karen R. Order out of chaos: using a web database to manage access to electronic journals: ColdFusion. Library Computing. 1999; 18(1):59-67.

Kiem, Cao Minh and Middleton, Michael. An evaluation of textual storage and retrieval software: CDS/ISIS and InMagic. Program. 1998 July; 32(3):283-302.

Leach, Michael R. Introduction to delivering databases via the World Wide Web: an ASIS Continuing Education Program, Chicago, November 2000. Silver Spring, MD, ASIS, 2000.

Marcinko, Randall. The legacies left us by database producers. Database. 1998 Jun/Jul; 21(3):49-58.

Perez, Ernest. GDIdb: getting your database on the web, with no big deal. Library Computing. 1999; 18(1):29-35.

Sullivan, Peter. Using database-driven web pages for your courses. Multimedia Schools. 6(3):42-44.

Swank, Mark; Kittel, Drew, and Wooding, Kjell. World Wide Web database developer's guide. Indianapolis: Sams; 1996. 782 p.

Talacko, Paul. Understanding Linux databases. Linux Format. 2000 Aug; (4):42-48.

Watterson, Karen. Enterprise databases battle on. Byte. 1998 Jun; 23(6):97-104.
 

ADDITIONAL INTERNET RESOURCES

Free Software Project:
http://www.salon.com/tech/fsp/index.html

Free Software Project Resources:
bibliography and relevant URLs
http://www.salon.com/tech/fsp/resources/index.html

OBAS -- An Online Bibliography Assistant:
http://users.iafrica.com/i/iw/iworks/obas/index.html
 

Updated: 2000 Dec. 29