RWhois Server Operations Guide

Release 1.5.0



This document provides an overview of the basic descriptions and operations necessary to run an RWhois server installation. The RWhois server package consists of the server process itself (rwhoisd), tools to enable and manage the native database (rwhois_indexer, repack), and a number of configuration files (rwhoisd.conf and  rwhoisd.dir, for example).

The Programs

The following offers a brief description of the programs and utilities found in the RWhois server release.


rwhoisd is the RWhois protocol server.

Summary: rwhoisd [-c config file] [-r] [-s] [-Vvq] [-di]


Config File: Specifies the main configuration file to use (defaults to 'rwhoisd' in the current working directory).


Root: Specifies that the server will run as a root server and not generate any 'punt' referrals. 


Security: Turns chrooting ON.


Very Verbose: Logging verbosity is set to 7 (debug). 


Verbose: Logging verbosity is set to 6 (info). 


Quiet: Logging verbosity is set to 2 (alert). 


Daemon mode: Server will put itself into the background and run in a stand-alone mode. 


Inetd mode: Server will run itself as a single shot and accept input and output from stdin and stdout. 

Except for the '-c' option, all of the command line options are also accessible in the main configuration file itself. The command line options override the configuration file settings.


Summary: rwhois_indexer [-c config file] [-C class] [-A auth area] [-ivqn] [-s suffix|file list ]


Config File: Specifies the main configuration file to use (defaults to 'rwhoisd' in the current working directory).


Class: Specifies which class of objects to index. Defaults to all classes


Auth Area: Specifies which authority area to index. Defaults to all authority areas


Initialize: Remove the old (registered) index files first.


Verbose: Logging verbosity is set to 6 (info). 


Quiet: Logging verbosity is set to 2 (alert). 


No Syntax Checks: The indexer will not check for schema compliance during indexing.


Suffix mode

Configuration Files

At start up, the RWhois server reads various configuration files. They are categorized into the general (server) configuration files and authority area (database) configuration files.

General Configuration Files

General configuration files consist of the main configuration file, directive configuration file, extended directive configuration file, directive security files, and the RWhois parent file. In these configuration files, extra white space is ignored and lines beginning with the '#' character are treated as comments.

1. Main Configuration File (rwhois.conf)

The main configuration file is a "<tag: <value" delimited file with the following tags.


The configuration file that contains the description of the authority areas found on this server. This file is required and defaults to 'rwhoisd.auth_area' (in the root-dir). 


The directory in which rwhoisd looks for extra binaries. 


Flag determining whether or not the server should try to chroot. It defaults to NO. 


The number of seconds of idle time before the server automatically disconnects. This is normally 200 seconds. 


An alias for 'root-dir', this parameter is deprecated. 


The file containing the options for all of the built-in directives. This file is required and defaults to 'rwhoisd.dir'. 


The fully qualified domain name (FQDN) of the host on which this server resides. This is primarily for the welcome banner. If this file is omitted, rwhoisd will attempt derive the value. 


The advertised port to which rwhoisd listens. The Internet Assigned Naming Authority (IANA)-assigned port for RWhois is 4321. 


The maximum value to which a server can set the hit ceiling. The "-limit" cannot be higher than this. 


The default setting for the number of hits on a query; it can be changed with the "-limit" directive. 


The file containing the punt (or root) referral information. 


The directory containing the files of all pending registration requests and temporary files for the "-register" directive. It should be writable by the rwhoisd process. 


The base directory from which rwhoisd will run. All relative paths will be relative to this directory. If chrooting is not turned off, this is the directory to which rwhoisd will chroot(). If it is not set, it will default to the current working directory. 


The file that allows hosts/networks by directive using tcp wrappers; it is optional. See hosts_access(5) for details. 


The file that disallows hosts/networks by directive using tcp wrappers; it is optional. See hosts_access(5) for details. 


The email address of the server contact. 


The switch between running rwhoisd as a daemon or as a single-session process under inetd. 


If the server is run as root, this file will setuid and setgid to the values indicated by this user. 


The file containing the definitions of any extended ("X-") directives. This file is optional. 


When running in daemon mode, rwhoisd will record its process id in this file. It attempts to unlink this file when it quits. 


The file that determines whether or not logging should use the syslog mechanism; the default is YES. 


The numerical syslog facility to use, if using syslog. It defaults to 3 (daemon). 


If not logging to syslog, this describes the default file to which to log. 


If logging to file, each log level can be directed to a different file using this file. 


The level at which logging occurs; higher numbers mean more logging. The levels correspond numerically to the syslog levels (0 is emergency, 7 is debug). 


The user id string of the server itself, this is the key into a pgp keyring. 


The path to a file containing the password to the rwhois server's private key. 


The path to the pgp binary itself, this should be an absolute path for security reasons. 


The is the path to the server's pgp keyring directory. 


A flag indicating whether the '*' wildcard will be allowed at all; defaults to TRUE. 


A flag indicating whether the leading wildcard construct will be allowed, thus allowing substring searches to occur; defaults to FALSE 


An integer repesenting the maximum number of children (sessions) allowed at one time. Attempts to connect after the limit has be reached will exit with an rwhois error. A value of zero (the default) indicates no maximum.


A value (either UP or DOWN) indicating which direction CIDR searches will traverse. DOWN, the default, means that a search on a network will return that network and/or any sub-networks.


Do not search for down (more specific) referrals. The default is OFF. It is not recommended that this be turned on.


root-dir:         /home/databases/rwhois/

bin-path:         bin

auth-area-file:   rwhoisd.auth_area

directive-file:   rwhoisd.dir

x-directive-file: rwhoisd.x.dir

max-hits-default: 20

max-hits-ceiling: 2000

register-spool:   register_spool

punt-file:        rwhoisd.root


local-port:       4321

security-allow:   rwhoisd.allow

security-deny:    rwhoisd.deny

deadman-time:     200

server-type:      daemon

userid:           guest

chrooted:         yes


use-syslog:       no

default-log-file: rwhoisd.log

2. Directive Configuration File (rwhois.dir)

The directive configuration file contains entries to enable or disable the RWhois directives.


Soa        yes

Register   no

3. Extended Directive Configuration File (rwhois.x.dir)

The extended directive configuration file is a "<tag: <value" delimited file with the following tags.


The minimum length that will activate the command. 


The binary/script that will be executed. 


The natural language description of the directive; it shows up when using 


The record separator. 


Command:     date

Command-len: 4

Program:     /usr/bin/date


command:     pgp

command-len: 3

description: The PGP keyring gateway directive

program:     Xpgp

4. Directive Security Files (rwhois.allow/rwhois.deny)

The directive security files are (or may be) localized versions of Weitze Venema's TCP Wrapper configuration files. In general, entries in this file take the form of

<directive: <security_pattern

where <directive is a particular directive name without the leading '-'. (i.e. 'xfer', 'register', 'X-pgp'), and the security pattern is a space delimited list of IP addresses or domain names. See hosts_access(5) located in the tcp_wrappers distribution.

Example (rwhois.allow):

xfer:     198.41.0

x-date:   all

Example (rwhois.deny):

register: all

xfer:     all

5. RWhois Punt File (rwhois.root)

The RWhois punt (or parent) file contains a list of RWhois Universal Resource Locators (URLs) that are referrals to a higher point in the RWhois information tree. At this point, the server does not arbitrate between the different punt referrals listed in this file, so all listed destinations should be equivalent.



B. Authority Area Configuration Files

An authority area is an identifier for an RWhois database containing data and its schema. It has a hierarchical structure that helps identify the position of the database in the global RWhois data information tree. An authority area has the structure of either a domain name or an IP address in quad-octet prefix/prefix length format.

The authority area configuration files consist of the authority area file, the Start of Authority (SOA) file, the schema file, and the attribute definition files.

1. Authority Area File (rwhois.auth-area)

The authority area file is a "<tag:<value" delimited file containing information about the authority areas for which the RWhois server is primary or secondary.

A primary (or master) RWhois server is where data is registered for an authority area; it answers authoritatively to queries for data in that authority area. There must be one and only one primary server for a particular authority area. An RWhois server may be primary for multiple authority areas. The authority area model is explained in more detail below.

A secondary (or slave) RWhois server is where data is replicated from a primary server for an authority area. It, like its primary server, answers authoritatively to queries for data in that authority area. There can be multiple secondary servers for a particular authority area, and an RWhois server may be secondary for multiple authority areas.

The authority area file contains the following tags.


Flag that states whether the server is master or slave for the authority area. It is either 'master' or 'slave'. 


The name of the authority area. 


The directory where data is stored for the authority area. Note that the data directory for an authority area in quad-octet prefix/prefix length format should be named "net-<quad-octet prefix-<prefix length". 


The schema file for the authority area. 


The SOA file for the authority area. 


If the server is master for a particular authority area, this specifies the slave server(s) in "<host <port" format for that authority area. 


If the server is slave for a particular authority area, this specifies the master server in "<host <port" format for that authority area. 


Type:         master





Slave: 4321

Slave: 4321


Type:         master


Data-Dir:     net-

Schema-File:  net-

Soa-File:     net-

Slave: 4321

Slave: 4321


Type:         slave





Master: 4321

2. SOA File

The SOA file is a "<tag: <value" delimited file with the following tags.


The serial number of the authority area is stated in the "yyyymmddhhmmss" format. The serial number must be changed whenever the data in the authority area changes; the server does this automatically. 


The time interval between data transfers from the primary server. 


The time interval between partial transfers of data from the primary server since a particular serial number. 


The time interval before retrying to connect to a server that appears to be out-of-service. 


The length of time the data remains in the authority area before it becomes stale. 


The primary server for the authority area. 


The email address to the hostmaster for the authority area. 


Serial-Number:      19961008101010

Refresh-Interval:   3600

Increment-Interval: 1800

Retry-Interval:     60

Time-To-Live:       86400



3. Schema File

The schema file is a "<tag: <value" delimited file with the following tags.


The name of the class. 


An alternate name for the class; this may be repeated. 


The file that describes the attributes of the class. 


The directory containing the data records for the class. 


The text description of the class. 


The record separator.


name:         contact

alias:        user



description:  user class

4. Attribute Definition File

The attribute definition file is a "<tag: <value" delimited file that describes the attributes of a particular class. It has the following tags.


The name of the attribute. This name is used as the official name of the attribute. 


Another, usually shorter, name for the attribute. This may be repeated. 


The acceptance format for the attribute. The value should be 
<syntax id:<pattern. Currently, the only accepted syntax is "re" for regular expressions. 


The textual description of the attribute. 


Flag stating whether or not the attribute is a primary key (TRUE or FALSE). This implies 'is-required' is also TRUE. 


Flag stating whether or not the attribute is required (TRUE or FALSE). 


Flag stating whether or not the attribute is repeatable (TRUE or FALSE). This is mutually exclusive with 'is-multi-line'. 


Flag stating whether or not the attribute is multi-line (TRUE or FALSE). This is mutually exclusive with 'is-repeatable'. 


Flag stating whether or not the attribute is hierarchical (TRUE or FALSE). 


Flag stating whether or not a client has to be authenticated to see this attribute (TRUE or FALSE). 


Flag that determines the indexing method. 


Flag that determines the type of attribute (primary used in the displaying of the data). 


The record separator. 

The valid attribute types are


A text string, this is the normal attribute type. 


A reference to another RWhois object for normalization purposes. 


A URL to some related information. 

Currently the valid index types are


This means the attribute is not indexed. 


Index using all possible indexing schemes. Some indexes may not be created because they do not make sense. Soundex indexing is entirely rolled up into this option. 


Index just the literal text value. 


The index treating the value as an IP number or IP network expressed in CIDR notation (prefix/prefix-len). This index type is used for intelligent searching on IP networks. Values must look like IP address or CIDR network blocks (i.e.,, 


attribute:       name

attribute-alias: nm

description:     full name

is-primary-key:  TRUE

is-required:     TRUE

is-repeatable:   FALSE

is-multi-line:   FALSE

is-hierarchical: FALSE

index:           ALL

type:            TEXT


attribute:       email

attribute-alias: em

format:          re:[a-za-z0-9-._]+@[a-za-z0-9-.]

description:     rfc 822 email address

is-primary-key:  TRUE

is-required:     TRUE

is-repeatable:   FALSE

is-multi-line:   FALSE

is-hierarchical: TRUE

index:           EXACT

type:            TEXT

IV. The Native Database (MKDB)

A. Overview

The RWhois server uses its own database, named MKDB (Mark Kosters' Database). It is a fairly simple database whose purpose is to scale up well to larger databases; currently there are 1.9 million records in the RWhois root. The database is designed to be simple to understand and can be manipulated by hand.

MKDB's foundation is a series of sorted index files containing pointers to entries in data files. To support this, there are (currently) three different kinds of files that MKDB uses: data files, index files, and master file lists.

B. The Files

For rwhoisd, data is segregated by authority area and "class" into separate data directories, where it is then indexed. Data added via the protocol (using the "-register" directive) is automatically indexed. For initial database loads, or by-hand manipulation of the data, a command-line indexer (rwhois_indexer) is provided. For each class, there is a single master file list (typically called "local.db") and any number of index and data files.

C. The Master File List

The master file list is a list of all of the data and index files for a particular class. It exists primarily to define which index and data files are currently relevant to the database and to assign each file an index number. The file list also tracks a number of statistics (number of records, size in bytes) designed to help the search engine.

The format of the master file list is considered to be opaque, as it may change at any time. It is manipulated entirely by the indexing process. The following is a sample of the current format, with an explanation of the different fields. The master file list consists of "<tag: <value" pairs separated into records by the record separator ("---"). The current tags include the following.


The type is either "DATA" or some kind of index file (EXACT or CIDR). 


The actual filename, given in a relative path from "root-dir" given in the main configuration file. 


The file's number. The numbers are used for a space efficient way to indicate a file, typically from within a index entry. Numbers start from zero and increment sequentially. 


This lists the size of the file in bytes. 


For data files, this is the number of actual records. For index files it is the number of lines. 


A flag that is either "ON" or "OFF". If a file is locked, it will be ignored by the database except for the generation of file numbers. New files are first added locked so that they can act as placeholders for the file, which is unlocked when it is ready. 

Example (this is

Type:      DATA


File_No:   0

Size:      581

Num_Recs:  1

Lock:      OFF


Type:      EXACT


File_No:   1

Size:      33

Num_Recs:  2

Lock:      OFF

Note that it is entirely possible for the index and data files to exist outside of the directory structure. The only file in MKDB that needs to be in a predictable place is the master file list itself.

D. The Data Files

Data files have a similar format to all of the configuration files in the rwhois server: they are "<tag:<value", where "<tag" is an attribute name. The different records are separated with the record separator. Lines beginning with '#' are considered comments and ignored. Case and leading and trailing whitespace is also ignored. The data should conform to the class description described in the attribute definitions file, and it should contain (at least) the required attributes contained in the base class.




Name:         Public, John Q.


Type:         I

First-Name:   John

Last-Name:    Public

Phone:        (847)-391-7926

Fax:          (847)-338-0340


Created:      11961022

Updated:      11961023





Name:         Doe, Jane


Type:         I

First-Name:   Jane

Last-Name:    Doe

Phone:        (847)-391-7943

Fax:          (847)-338-0340


Created:      11961025

Updated:      11961025


Attributes can either be TEXT, ID, or SEE-ALSO types. Type ID attributes should contain the ID of the referenced RWhois object. Type SEE-ALSO attributes should be URLs.

When data records are deleted via the "-register" directive, they are not actually removed immediately. First, they are marked for deletion by replacing the first character of every line in the record with an underscore character ('_'). The process of actually removing deleted records from a file completely is known as a "purge" and is covered below.

The number of data files have no substantial impact on the performance of rwhoisd, although an extreme number of data files can slow down the "-xfer" directive.

E. The Index File

The index file format is very simple. It consists of a number of sorted index records, where each record contains a pointer to a location in a data file, a "deleted" flag, the "global" id of the attribute, and the key.

An index record has the following format.

<file offset:<data file no:<deleted flag:<global attribute id:<key



This indicates that the record containing the key "EDWARD" is 398 bytes into data file "0", it is not deleted, and it corresponds to the global attribute "8" (Last-Name). The key is always stored in uppercase letters.

Each index file contains the indexed keys of one or more data files, and each data file should only have one corresponding index file. While it is certainly possible to index a single data file into multiple index files using the provided indexer, this will produce "false multiples" of records. That is, a query that should result in one record being found will instead result in multiple identical records being found.

There are three different types of index files: EXACT, SOUNDEX, and CIDR. They all share the common index file format. The only difference between them is how they are treated by the search engine. For instance, when searching a SOUNDEX index file, a transform (soundex) is performed on the search key first.

There is no limit to the number of index files, but if there are more index files, the search will be slower. As the number of index files increases, the typical binary search will approach a linear search in performance.

F. Indexing

The MKDB indexes are generated using a basic process.

1. The first step of the process is to identify the actual data file(s) to be indexed, and the authority area and class to which those data files belong.
2. Once this is determined, all data files are added to the master file list in the locked state to get a file number. This is to set their place in the file list so another indexing process cannot inadvertently change the file number.
3. Then each file is read, record by record. As the records are read, they are checked for syntactic compliance with the record's schema.
4. If the record is valid, then the value of each attribute that was marked in the schema as indexable (index was not NONE) is added to one or more temporary index files.
5. Once all files have been indexed, the temporary index files are sorted (currently using /bin/sort) on the key portion.
6. If everything is correct, the index files are added to the master file list and all files are unlocked. Once an index file is part of the master file list in an unlocked state, it will be read as part of the search operation.

Indexing can occur in one of two ways: as part of the "-register" directive and "by hand" using the command line indexer. The indexing that occurs during the "-register" directive processing is handled automatically and uses a subset of the functionality available in the command line indexer. For instance, the syntax checks are skipped, because the register directive has already performed them. The "-register" directive also adds data in a fast, incremental fashion. Each "-register" action, if it succeeds, produces a data file and an index file. If "-register" is used often fairly severe fragmentation can ensue. In this case, the purge operation should be used to defragment the database; purging is discussed in the next section.

The command line indexer is probably the most convenient way to index data. In the most basic operation, it is used to index data initially. The most convenient way to do this is to place all of the data files in the appropriate data directories (as indicated by the "db-dir" attribute in the schema file) and name all of the files with a common suffix. Then, index all the files in a single step.

% rwhoisd_indexer -i -s "suffix"

The "-i" option removes all previous index files, and the "-s" option indicates that all files ending in "suffix" should be indexed. In the sample database, all data files end in ".txt" but could end in any suffix except ".ndx", which is the suffix for the index files themselves.

Please see the rwhois_indexer man page for more details.

G. Purging

To date, purge operations have not been written. However, there are two levels of purging that can be performed: index purges and data purges. Index purges would simply remove index entries marked for deletion and would perhaps merge sort the index files together. This is a fairly safe and efficient operation. Data purges involve rewriting data files to remove deleted records. Once the data files are rewritten, the files must be reindexed, since the position of records within those files may have changed.

V. Authority Areas

A. Overview

A more complete and accurate treatment of authority areas is given the RWhois Version 1.5 specification. This treatment is given to provide some reasoning for the RWhois behavior and configuration options.

An authority area is an identifier for an RWhois database containing data and its schema. It has a hierarchical structure that helps identify the position of the database in the global RWhois data information tree. In the RWhois 1.5 protocol, an authority area has the structure of either a domain name or an IP address in quad-octet prefix/prefix length format. The hierarchical structure of an authority area helps route a query that cannot be resolved locally up or down the tree.

B. Referral Model

There are two types of referrals. When a query is referred up the tree, it is called a punt referral. When a query is referred down the tree, it is called a link referral. The referral model for an RWhois server follows below.

1. Try to parse hierarchical value from the search value in each query term. For example, parse the domain name from an email address.
2. If the parsed hierarchical value is within one of the authority areas and it is within a referred-auth-area for that authority area, refer the query to the referred-auth-area. This is a link referral.
3. If the parsed hierarchical value is within one of the authority areas and it is not within any of the referred-auth-areas, do not refer the query. Otherwise, it could become a referral loop.
4. If the parsed hierarchical value is not within any of the authority areas, refer it up the tree using the referral records in the RWhois parent file (rwhois.root). This is a punt referral.
5. If the search value in each query term is non-hierarchical, do not refer the query. In the future, it will be referred to a non-hierarchical mesh such as the Common Index Protocol (CIP) mesh.

C. Setting Up Referrals

To set up punt referrals, the RWhois parent file (rwhois.root) must have at least one entry to an RWhois server up the tree. In the sample data, it is a referral to the root RWhois server.



To set up link referrals, the RWhois protocol Version 1.5 defines the referral class. It has the following attributes.


This is a unique identifier of the referral record. 


This is the authority area within which the referral record resides. 


The ID of a guardian object. This is optional and repeatable 


This lists the location of the referral in RWhois URL format 


The ID of the organization object maintaining the referred authority area. 


This is the referred authority area within the authority area. For example, is a referred authority area within authority area and is a referred authority area within authority area. 


This is the time the record was created. It uses the "yyyymmddhhmmss" format. 


This is the time the record was last updated. It uses the "yyyymmddhhmmss" format. 


This is the email address of the contact who last updated this record. 





Referral: rwhois://



Created: 19961022101010

Updated: 19961023101010



ID: 888.


Referral: rwhois://

Referral: rwhois://


Created: 19961022101010

Updated: 19961023101010


VI. Contacting the Authors

There is a mailing list for discussion of the RWhois protocol and software. Send a message to with the word "subscribe" in the body to subscribe. There is a mailing list for RWhois developers as well: send a message to with "subscribe" as the body.