
[Prev: TCONTEND][Resources][TOC][Next: TFIRSTPGLINK]

TEXTCLIPFUNC


Syntax

Envariable

N/A

Element

<TEXTCLIPFUNC>
function_name;source_file
</TEXTCLIPFUNC>

Command-line Option

N/A


Description

TEXTCLIPFUNC defines the Perl function to invoke when MHonArc clips text. For example, the specified function is invoked when a length specifier is used with a resource variable, e.g. $SUBJECTNA:72$.

The syntax for TEXTCLIPFUNC is as follows:

routine-name;file-of-routine

The definition of each semi-colon-separated value is as follows:

routine-name

The actual name of the clipping routine. The name should be fully qualified by the package it is defined in (e.g. "mypackage::filter").

file-of-routine

The name of the file that defines routine-name. If the file is not a full pathname, MHonArc finds the file by looking in the standard include paths of Perl, and the paths specified by the PERLINC resource.

file-of-routine can be left blank if routine-name is known to be loaded already. This is the case for the default value of this resource, since the routine is an internal MHonArc function.
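As an illustration of the syntax described above, the following is a minimal sketch of how a "routine-name;file-of-routine" value could be split and resolved. It is not MHonArc's own loading code; the helper names parse_clipfunc_spec and resolve_clipfunc are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT MHonArc's own code: split a
# "routine-name;file-of-routine" value into its two parts.
sub parse_clipfunc_spec {
    my ($spec) = @_;
    $spec =~ s/\A\s+//;                 # trim surrounding whitespace
    $spec =~ s/\s+\z//;
    my ($routine, $file) = split /;/, $spec, 2;
    $file = '' unless defined $file;    # no semicolon given at all
    return ($routine, $file);
}

# Hypothetical helper: load the file only when one is named, then
# return a code reference to the fully qualified routine (or undef).
sub resolve_clipfunc {
    my ($spec) = @_;
    my ($routine, $file) = parse_clipfunc_spec($spec);
    require $file if length $file;      # searched for in @INC unless a full pathname
    no strict 'refs';
    return defined &{$routine} ? \&{$routine} : undef;
}
```

With a blank file-of-routine, as in the default value, nothing is loaded and the routine is expected to exist already.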

Writing a Clipping Function

If you want to write your own function, you need to know the Perl programming language. The following information assumes you know Perl.

Function Interface

MHonArc interfaces with a text clipping function by calling the routine with a specific set of arguments. The prototype of the interface routine is as follows:

sub clip {
    my($text, $clip_length, $is_html, $has_tags) = @_;
    # code here
}

Parameter Descriptions

$text

The text to be clipped.

NOTE: Perl passes arguments by alias through @_, so a routine can modify its caller's data. The first argument should NOT be modified. If you copy arguments from @_ as shown above, then you will be okay, since the my operation creates a copy of the arguments in @_.

$clip_length

The number of characters $text should be clipped to.

$is_html

If true, the text may contain entity references, e.g. "&amp;". Each entity reference should be counted as a single character when clipping $text.

$has_tags

If true, the text may contain HTML tags, and the tags should be stripped from $text when generating the clip string. For example, if $text equals "<b>MHonArc</b>" and $clip_length equals 2, then the return value of the function should be "MH".

NOTE

The $has_tags argument is currently not used within MHonArc, but it will likely be used in a future release.
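The note on $text above can be illustrated with a short sketch. Both routines below are hypothetical and exist only to contrast modifying @_ directly with working on a copy.

```perl
use strict;
use warnings;

# Hypothetical routine that modifies $_[0] directly. Because @_ aliases
# the caller's variables, the caller's string is truncated as well.
sub clip_in_place {
    $_[0] = substr($_[0], 0, $_[1]);
    return $_[0];
}

# Hypothetical routine that copies its arguments first. The caller's
# string is left untouched.
sub clip_copy {
    my ($text, $clip_length) = @_;
    $text = substr($text, 0, $clip_length);
    return $text;
}

my $subject = "A rather long subject line";

clip_copy($subject, 8);        # $subject is unchanged
my $kept = $subject;

clip_in_place($subject, 8);    # $subject is now "A rather"
```

The second form is the one a clipping function should follow.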

Return Value

The return value should be the clipped version of $text.
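Putting the interface together, the following is a minimal sketch of a custom clipping function. The package name MyClip is hypothetical, and the entity pattern is a simplifying assumption: any "&...;" run without whitespace is counted as one character. Unlike the default routine, this sketch strips complete tags up front instead of handling tags cut by the clip boundary.

```perl
package MyClip;    # hypothetical package name

use strict;
use warnings;

sub clip {
    # Copy the arguments so the caller's data is never modified.
    my ($text, $clip_length, $is_html, $has_tags) = @_;

    # Drop complete tags when asked to.
    $text =~ s/<[^>]*>//g if $has_tags;

    # Plain text: a raw character count is enough.
    return substr($text, 0, $clip_length) unless $is_html;

    my $clipped = '';
    my $count   = 0;
    # Each token is either a whole entity reference or a single character.
    while ($count < $clip_length && $text =~ /\G(&[^;&\s]+;|.)/gs) {
        $clipped .= $1;
        $count++;
    }
    return $clipped;
}

1;
```

If such a file were saved as MyClip.pm (a hypothetical name) in one of Perl's include paths, the resource value would be MyClip::clip;MyClip.pm.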

Writing Tips

  • Qualify your function in its own package. This eliminates possible variable/routine conflicts with MHonArc.

  • Make sure your Perl source file ends with a true statement (like "1;"). MHonArc just performs a require on the file, and if the file does not return true, MHonArc will revert to the default value for TEXTCLIPFUNC.

  • Test your function before production use.
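The tip about ending the source file with a true statement can be checked outside MHonArc. The sketch below writes two temporary files (the file and package names are hypothetical) and shows that Perl's require fails on the one that does not return true.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

my $dir = tempdir(CLEANUP => 1);

# Write a small Perl source file and return its full pathname.
sub write_source {
    my ($name, $code) = @_;
    my $path = File::Spec->catfile($dir, $name);
    open(my $fh, '>', $path) or die "Cannot write $path: $!";
    print $fh $code;
    close($fh) or die "Cannot close $path: $!";
    return $path;
}

# Both hypothetical files define a routine; only the first ends with
# a true statement.
my $good = write_source('good_clip.pl',
    q{package GoodClip; sub clip { substr($_[0], 0, $_[1]) } 1;});
my $bad = write_source('bad_clip.pl',
    q{package BadClip; sub clip { substr($_[0], 0, $_[1]) } 0;});

my $good_loaded = eval { require $good; 1 } ? 1 : 0;
my $bad_loaded  = eval { require $bad;  1 } ? 1 : 0;
my $bad_error   = $@;    # holds the message from the failed require

print "good file loaded: $good_loaded\n";
print "bad file loaded:  $bad_loaded\n";
```

A check like this, together with a few calls to the routine on sample text, is one way to test a function before production use.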


Default Setting

mhonarc::clip_text;

Resource Variables

N/A


Examples

The Unicode example resource file sets TEXTCLIPFUNC to a routine that understands UTF-8 text.
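To see why UTF-8 text needs its own routine, consider the sketch below. It is not the routine used by the Unicode example resource file; clip_utf8 is a hypothetical name, and the sketch assumes the text arrives as UTF-8 encoded bytes and ignores entity references and tags.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical routine: clip UTF-8 encoded bytes on character
# boundaries instead of byte boundaries.
sub clip_utf8 {
    my ($text, $clip_length) = @_;
    my $chars = decode('UTF-8', $text);    # bytes -> characters
    return encode('UTF-8', substr($chars, 0, $clip_length));
}

# "naive" spelled with i-diaeresis: 6 bytes, 5 characters.
my $bytes = "na\xC3\xAFve";

my $by_bytes = substr($bytes, 0, 3);    # cuts the two-byte character in half
my $by_chars = clip_utf8($bytes, 3);    # keeps the two-byte character whole
```

A byte-oriented clip can leave an incomplete multi-byte sequence at the end of the string, which is what a UTF-8 aware routine avoids.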

The following is the implementation (as of this writing) of MHonArc's default clipping function:

sub clip_text {
    my $str      = \shift;  # Prevent unnecessary copy.
    my $len      = shift;   # Clip length
    my $is_html  = shift;   # If entity references should be considered
    my $has_tags = shift;   # If html tags should be stripped

    if (!$is_html) {
      return substr($$str, 0, $len);
    }

    my $text = "";
    my $subtext = "";
    my $html_len = length($$str);
    my($pos, $sublen, $erlen, $real_len);
    my $er_len = 0;
    
    for ( $pos=0, $sublen=$len; $pos < $html_len; ) {
	$subtext = substr($$str, $pos, $sublen);
	$pos += $sublen;

	# strip tags
	if ($has_tags) {
	    $subtext =~ s/\A[^<]*>//; # clipped tag
	    $subtext =~ s/<[^>]*>//g;
	    $subtext =~ s/<[^>]*\Z//; # clipped tag
	}

	# check for clipped entity reference
	if (($pos < $html_len) && ($subtext =~ /\&[^;]*\Z/)) {
	    my $semi = index($$str, ';', $pos);
	    if ($semi < 0) {
		# malformed entity reference
		$subtext .= substr($$str, $pos);
		$pos = $html_len;
	    } else {
		$subtext .= substr($$str, $pos, $semi-$pos+1)
		    if $semi > $pos;
		$pos = $semi+1;
	    }
	}

	# compute entity reference lengths to determine "real" character
	# count and not raw character count.
	while ($subtext =~ /(\&[^;]+);/g) {
	    $er_len += length($1);
	}

	$text .= $subtext;

	# done if we have enough
	$real_len = length($text)-$er_len;
	if ($real_len >= $len) {
	    last;
	}
	$sublen = $len - (length($text)-$er_len);
    }
    $text;
}

Version

2.5.10


See Also

Resource Variables



$Date: 2002/08/04 03:58:27 $
MHonArc
Copyright © 2002, Earl Hood, mhonarc@mhonarc.org