
About the Zebra-powered Harvest Search System

This system collects data from various sources and makes it searchable through a web browser.

The search system is built from the following components:

  • Harvest is a flexible system that collects data from different sources (http, ftp, nntp, local files), summarizes their contents and makes them searchable via various fulltext search engines. Harvest ships with two indexers: glimpse, which is the default, and swish.
  • Zebra is a fulltext indexer that follows the Z39.50 standard, which appears to be popular among librarians. Zebra can search structured fulltext, offers powerful search capabilities, and supports incremental indexing, which makes large amounts of data easier to manage.

The motivation for creating this system was to replace Harvest's default fulltext indexer, glimpse, with a GPL-licensed indexer.

This is still a work in progress. The data modeling isn't finished yet, and the following issues are unresolved:

  • Find out why 1,1010 doesn't work.
  • Create a query page that exposes the lower-level features of Z39.50 and other tweakable variables without being too complicated.
  • How do I get the file name of a SOIF object? Or, more generally: how do I get the name of the data file that contains the expression I am looking for?
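
As background for the last item: Harvest stores each summary as a SOIF (Summary Object Interchange Format) record, and the header line of such a record carries the URL of the summarized object. The following is only a minimal Python sketch of reading that header, assuming the usual `@TYPE { URL` header and `Name{size}:<tab>value` attribute layout. The sample record, the attribute names in it and the function name `parse_soif` are made up for illustration, and the sketch says nothing about how Zebra exposes this value in a search result.

```python
import re

# Hypothetical sample record; real gatherer output carries more attributes.
SAMPLE = (
    "@FILE { http://www.example.org/docs/readme.txt\n"
    "Title{6}:\tReadme\n"
    "Type{4}:\tText\n"
    "}\n"
)

# Header line: "@TEMPLATE-TYPE { URL"
HEADER = re.compile(r"@(\w+)\s*\{\s*(\S+)\s*\n")
# Attribute prefix: "Name{size}:" followed by a tab, then <size> units of value
ATTR = re.compile(r"([\w-]+)\{(\d+)\}:\t")


def parse_soif(text):
    """Parse one SOIF object into (template_type, url, attributes).

    Assumption: sizes are applied to characters of a decoded string, which
    matches the byte counts in the record only for single-byte text.
    """
    header = HEADER.match(text)
    if header is None:
        raise ValueError("not a SOIF object")
    template_type, url = header.groups()
    pos = header.end()
    attributes = {}
    while True:
        attr = ATTR.match(text, pos)
        if attr is None:
            break  # reached the closing "}" or unrecognized input
        size = int(attr.group(2))
        start = attr.end()
        # The value is taken by length, so it may itself contain newlines.
        attributes[attr.group(1)] = text[start:start + size]
        pos = start + size + 1  # skip the newline that follows the value
    return template_type, url, attributes


template_type, url, attributes = parse_soif(SAMPLE)
print(url)
```

Reading values by their declared size rather than splitting on newlines is what allows multi-line attribute values to survive intact.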
