[This file contains a description of the hyperlink mechanism. See hacking.txt or
hacking.html for an overview of the manual.]
In contrast to most other aspects of functionality, there is no separate module
for link handling. The components forming the link mechanism are spread over
several other modules instead.
There is a links.c file (and a related links.h, of course), but it contains
only some simple helper functions -- at least for now.
The first step is done already in the layout module: The links need to be
extracted while parsing the structure of the page to generate the item tree.
This is described under Links in hacking-layout.*.
Link List
To allow easy access to all links on the page without having to scan every text
item, another structure is created. This structure doesn't itself store any
data; it only stores pointers to the entries inside the text items' link lists.
The "Link_list" structure (defined in items.h) contains:
- The number of links on the page, and
- An array of "struct Link_ptr", containing for every link:
  - The item containing the link
  - The number of the link inside the item's link list
This structure is created by make_link_list() (just after parse_struct()). This
function traverses all items, and for every encountered text item stores
pointers to all of its links. The only exceptions are links representing
"hidden" form input fields: These aren't stored in the link list, so they won't
be selectable.
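The traversal described above can be sketched roughly as follows. This is only
an illustration, not the actual netrik code: the struct layouts are reduced
stand-ins for the real ones in items.h, and the names make_link_list_sketch(),
FORM_TEXT, FORM_HIDDEN, "links", "link_count" and "num" are assumptions made
for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the real types in items.h. */
enum Item_type { ITEM_TEXT, ITEM_BOX };
enum Form_type { FORM_NO, FORM_TEXT, FORM_HIDDEN };

struct Link { enum Form_type form; };

struct Item {
    enum Item_type type;
    struct Item *list_next;   /* next item in traversal order */
    struct Link *links;       /* only used for text items */
    int link_count;
};

struct Link_ptr {
    struct Item *item;        /* item containing the link */
    int num;                  /* index in the item's own link list */
};

struct Link_list {
    int count;
    struct Link_ptr *link;
};

/* Collect pointers to all selectable links, starting at "first".
 * Returns NULL if memory runs out. */
struct Link_list *make_link_list_sketch(struct Item *first)
{
    struct Link_list *list = malloc(sizeof *list);
    if (list == NULL)
        return NULL;
    list->count = 0;
    list->link = NULL;

    for (struct Item *item = first; item != NULL; item = item->list_next) {
        if (item->type != ITEM_TEXT)
            continue;
        for (int i = 0; i < item->link_count; ++i) {
            if (item->links[i].form == FORM_HIDDEN)
                continue;     /* hidden inputs are never selectable */
            struct Link_ptr *grown =
                realloc(list->link, (list->count + 1) * sizeof *grown);
            if (grown == NULL) {
                free(list->link);
                free(list);
                return NULL;
            }
            list->link = grown;
            list->link[list->count].item = item;
            list->link[list->count].num = i;
            ++list->count;
        }
    }
    return list;
}
```

Note that the list holds only (item, index) pairs; the link data itself stays
in the text items.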
activate_link()
Actually activating (or deactivating) a selected link is done with the
activate_link() function. This function is explicitly called by the link
selection commands, but can also be called automatically by scroll_to() (see
hacking-pager.*), if the active link is scrolled out of the valid screen area.
As a special case, the current link is deactivated by calling activate_link
with -1 as argument, so no link is active afterwards.
Before activating the requested link, the page is scrolled so that the link is
actually in the valid screen area. (This isn't done when called with -1, as
links can be deactivated even if they are no longer on the screen.) Scrolling
is done using scroll_to().
The next step is deactivating a previously active link, if any. This is done by
recursively calling activate_link() (with -1 as argument). Of course this is
also done only when activating a new link -- if we are only to deactivate the
currently active one, we won't call ourselves to do so...
Afterwards, the screen area affected by the link activation is re-rendered.
The coordinates of the affected page area are normally determined by the
coordinates assigned to the link in pre_render(); if the link spans multiple
lines (contains line wraps), however, the x-coordinates of the item containing
the link are taken instead, so all lines containing parts of the link are
repainted as a whole. (It would be too complicated to determine the exact
affected line parts, and it doesn't make much of a difference anyhow.)
The coordinates are truncated to the part visible on the screen, and the area
is repainted.
Finally, "active_link" is set to the newly activated link. (Which is -1, if the
link was to be deactivated.)
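The order of the steps above can be illustrated with a small model. This is a
sketch under simplifying assumptions, not the real function: "struct
Page_state" and the two stub functions are invented for the example and merely
count calls where the real code would scroll and repaint.

```c
#include <assert.h>

/* Hypothetical page state; the real pager keeps much more than this. */
struct Page_state {
    int active_link;    /* -1 when no link is active */
    int scrolls;        /* counts stand-in scroll_to() calls */
    int repaints;       /* counts stand-in repaint calls */
};

static void scroll_to_stub(struct Page_state *p) { ++p->scrolls; }
static void repaint_stub(struct Page_state *p) { ++p->repaints; }

void activate_link_sketch(struct Page_state *p, int link)
{
    if (link == -1 && p->active_link == -1)
        return;                               /* nothing to deactivate */

    if (link != -1) {
        scroll_to_stub(p);                    /* bring the link on screen */
        if (p->active_link != -1)
            activate_link_sketch(p, -1);      /* deactivate the old link */
    }

    repaint_stub(p);          /* re-render the area affected by the change */
    p->active_link = link;    /* -1 if the link was only deactivated */
}
```

Switching from one link to another thus repaints twice: once for the link
being deactivated in the recursive call, and once for the new one.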
If some link is active, it can be followed in the pager by pressing <return>.
The pager immediately exits in this case, with a return value indicating that a
link was followed; everything else has to be done by the caller (see below).
The URL stored in the link structure is copied to the local "url".
If the link only references a local anchor (the URL starts with '#'), the old
page isn't freed; instead, its descriptor is passed as the "reference"
parameter to load_page(), so the page data will be reused and only the anchor
activated, instead of reloading the page.
The full URL of the new page to be loaded needs to be determined using the link
URL and the current page's URL. That is done in init_load() (called from
load_page()). This process is described in detail in hacking-load.*.
forms.c contains some helper functions for handling forms.
set_form()
set_form() writes the value of a form control (given by its link structure as
argument) to a string in the item tree, so it will be displayed on the output
page. It is called directly to set the initial value just after the form
element was extracted during structure parsing, when the link list doesn't
exist yet.
(It is presently also used instead of update_form() in one place -- which is a
hack; see below under Manipulating.)
First it has to find out where to store the data. For this, the string
containing the element is searched for the first div *after* the link start.
(The first div of a form link, starting at the link start, is the form
indicator, e.g. '['; the following div then contains the value.)
Having this, the value can be stored to the string. How this is done depends on
the type of the form control.
For text, password, and hidden input fields, as much of the "value" as the div
length is copied to the string div. If there is less than the div length, the
remaining space is padded with '_' characters. (Hidden input fields are
displayed like text fields, only they are "dim", and can't be selected.)
For radio buttons and checkboxes, either an '*' is put in, or an nbsp. This is
determined by the "enabled" flag of the form link. (It does *not* depend on the
"value"!)
For <select> options, the character introducing the option (the one *before*
the first char of the option text) is set either to '+' or to '-', also
depending on whether "enabled" is set.
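As an illustration of the text field and checkbox cases, here is a minimal
sketch. It is not the real set_form(): the function name, its flat parameter
list, the FORM_TEXT and FORM_CHECKBOX names and the BLANK constant are
assumptions for the example, and a plain space stands in for the nbsp.

```c
#include <assert.h>
#include <string.h>

enum Form_type { FORM_TEXT, FORM_CHECKBOX };   /* hypothetical subset */

#define BLANK ' '   /* stand-in; the text above says an nbsp is used */

/* Write a control's display form into the div text[start..end). */
void set_form_sketch(char *text, int start, int end,
                     enum Form_type type, const char *value, int enabled)
{
    int len = end - start;

    if (type == FORM_TEXT) {
        int n = (int)strlen(value);
        if (n > len)
            n = len;                                  /* cut to div length */
        memcpy(text + start, value, (size_t)n);
        memset(text + start + n, '_', (size_t)(len - n));   /* pad the rest */
    } else if (len > 0) {
        /* Only "enabled" matters here, never "value". */
        text[start] = enabled ? '*' : BLANK;
    }
}
```

For a five-character div, the value "abc" is shown as "abc__", while a longer
value is cut off after five characters.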
update_form() is very similar; the difference is that it takes the whole page
structure and a link number from the link list as arguments; the string and the
link structure are then extracted from the list, and set_form() is called to
actually set the data.
form_next()
form_next() (together with form_start()) is used to get all form elements of
some form from the item tree. (This is necessary when submitting, or when
activating "radio" input elements.) Depending on the "filter" flag, it returns
either all form elements, or only the ones that are "successful" (have a name,
a value, and are enabled), and should be submitted to the server.
Every time it is called, it returns one link pointer, belonging to a form
control. When there are no more elements in the form, NULL is returned.
form_next() takes/returns a "struct Form_handle" as parameter, which stores the
form item, the item of the last returned link, the link number of the last
returned link, and the "filter" flag. This is used to keep track of which form
elements have already been returned in previous calls. The handle has to be
initialized by form_start(), which takes the form item and the "filter" flag as
parameters, and returns the handle. (Not a pointer!)
form_next() traverses all items below the form item, using the "list_next"
pointers. In every text item found, it scans all of its links. Both loops work
directly with the handle values as loop variables. They are not initialized on
entering, so that they continue right where the last call to form_next()
stopped.
To ensure all items inside the form are scanned, form_start() has to find the
first item inside the form, so form_next() will start scanning there. This is
done by repeatedly descending to "first_child", until a childless item is found
-- this is always the first one in the form.
+------+            +------+
| text |            | form |  <-- start
+------+            +------+
                       |
             +---------+------------+
             |                      |
          +-----+                +-----+
          | box | <-- 1. descend | box |
          +-----+                +-----+
             |                      |
   +---------+---------+            |
   |         |         |            |
+------+  +------+  +-----+     +------+
| text |  | text |  | box |     | text |
+------+  +------+  +-----+     +------+
   ^                   |
   |                +------+
2. descend          | text |
 (goal)             +------+

Order of the "list_next" chain (children come before their parent): the text
item before the form, the two text items in the first box, the innermost text
item, the innermost box, the first box, the text item in the second box, the
second box, and finally the form itself.
(See Structure Tree in
hacking-layout.* for a description of the item tree.)
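The interplay of form_start() and form_next() might look roughly like the
following sketch. The struct layouts are reduced to what the example needs,
and the field names "cur_item", "cur_link", "links" and "link_count" are
assumptions; the sketch also treats every link in a text item as a form
control, which the real code need not do.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced versions of the item-tree types. */
struct Link { const char *name; const char *value; int enabled; };

struct Item {
    struct Item *first_child;
    struct Item *list_next;   /* children come before their parent */
    struct Link *links;       /* NULL for non-text items */
    int link_count;
};

struct Form_handle {
    struct Item *form_item;
    struct Item *cur_item;    /* item of the last returned link */
    int cur_link;             /* its index, or -1 before the first one */
    int filter;
};

/* Returns the handle by value, positioned at the first item in the form. */
struct Form_handle form_start_sketch(struct Item *form, int filter)
{
    struct Form_handle h = { form, form, -1, filter };
    while (h.cur_item->first_child != NULL)
        h.cur_item = h.cur_item->first_child;
    return h;
}

struct Link *form_next_sketch(struct Form_handle *h)
{
    /* Both loops use the handle fields as loop variables and do not
     * reset them on entry, so each call resumes where the last stopped. */
    for (; h->cur_item != h->form_item;
         h->cur_item = h->cur_item->list_next, h->cur_link = -1) {
        while (++h->cur_link < h->cur_item->link_count) {
            struct Link *l = &h->cur_item->links[h->cur_link];
            if (!h->filter || (l->name && l->value && l->enabled))
                return l;     /* with "filter": only successful controls */
        }
    }
    return NULL;              /* reached the form item: no more controls */
}
```

A caller would typically loop with something like
"for (h = form_start_sketch(form, 1); (l = form_next_sketch(&h)) != NULL; )".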
url_encode()
url_encode() is responsible for creating the string that will be submitted to
the server as part of the URL when a form is submitted using the "GET" method, or
in the HTTP request body when using the "POST" method with URL encoding.
This function retrieves all (successful) form controls from the given form,
using form_start()/form_next(). For each form control, the name and the value
are stored, separated by an '=' and followed by an '&'. In both name and value,
certain characters have to be escaped, which is done in the encode_string()
helper function.
Spaces are replaced by a '+'. Special characters, including all
non-ASCII-characters as well as a "real" '+' and some other characters with
special meaning inside the URL ('&', '=', '%' and '#') are replaced by a
hexadecimal representation, introduced with a '%' char. ("%HH") Other
characters are stored directly.
The memory for the created string is allocated in chunks, for efficiency
reasons. This is done inside encode_string(). As soon as the currently allocated
size of the string doesn't suffice to hold the next character, it is grown by
one chunk. The test is very crude, to keep it simple: The array is considered
too small if there is space for fewer than 5 characters, as up to 5 characters
may be stored before the next test, in the worst case: Up to 3 for the
currently processed char (if it has to be hex-escaped), plus the following '='
if we are at the last character in the name, plus the '&' if the value for this
form control is empty. (No checking is done before storing the '=' and '&', so
we have to reserve the place for them here!)
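The escaping rules and the crude five-byte test can be sketched as follows.
This is an illustration only: "struct Buf", the CHUNK size and the function
name are assumptions, and the real encode_string() may be organized
differently.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK 64   /* hypothetical chunk size */

struct Buf { char *data; size_t len, size; };

/* Append "s" in URL-encoded form; returns 0 on success, -1 on failure. */
int encode_string_sketch(struct Buf *b, const char *s)
{
    static const char hex[] = "0123456789ABCDEF";

    for (; *s != '\0'; ++s) {
        unsigned char c = (unsigned char)*s;

        /* Crude test: keep 5 bytes free (3 for "%HH", plus '=' and '&',
         * which the caller stores without any check of its own). */
        if (b->size - b->len < 5) {
            char *grown = realloc(b->data, b->size + CHUNK);
            if (grown == NULL)
                return -1;
            b->data = grown;
            b->size += CHUNK;
        }

        if (c == ' ') {
            b->data[b->len++] = '+';
        } else if (c >= 0x80 || strchr("+&=%#", c) != NULL) {
            b->data[b->len++] = '%';
            b->data[b->len++] = hex[c >> 4];
            b->data[b->len++] = hex[c & 15];
        } else {
            b->data[b->len++] = (char)c;     /* stored directly */
        }
    }
    return 0;
}
```

Encoding the name "a b+c" and the value "x&y" this way, with the separators
added by the caller, gives "a+b%2Bc=x%26y&".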
mime_encode()
mime_encode() does the same as url_encode, except that the data is encoded
using multipart MIME format.
This is much simpler: No encoding of the data itself is necessary; only the
MIME header information has to be stored along with the data.
As the size of a data block can be easily calculated here, the string is first
resized to fit the new block in each iteration (for each name/value pair), and
then the data is stored with a simple sprintf() call. This should be much more
efficient than the approach in url_encode for bigger data blocks, especially
files. (Though these aren't implemented yet...)
The size is estimated by adding the size of the data strings ("name" and
"value") to the size of the format string -- this is not exact (the format
specifiers ("%s") are counted, though they are replaced in the resulting
string; on the other hand, the trailing '\0' isn't counted explicitly), but
the few extra bytes shouldn't matter. (Especially as they will be used up
later anyway...)
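The size estimate can be illustrated like this. The sketch is not the real
mime_encode(): the format string, the fixed "boundary" text and the function
name are assumptions chosen only to show the resize-then-sprintf() pattern.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical part layout; the real header text and boundary may differ. */
static const char PART_FMT[] =
    "--boundary\r\n"
    "Content-Disposition: form-data; name=\"%s\"\r\n"
    "\r\n"
    "%s\r\n";

/* Append one name/value part to *buf (current length *len).
 * Returns 0 on success, -1 if memory runs out. */
int mime_append_sketch(char **buf, size_t *len,
                       const char *name, const char *value)
{
    /* Overestimate: the two "%s" (4 bytes) are counted although they
     * are replaced, which more than covers the uncounted '\0'. */
    size_t extra = strlen(PART_FMT) + strlen(name) + strlen(value);
    char *grown = realloc(*buf, *len + extra);
    if (grown == NULL)
        return -1;
    *buf = grown;
    *len += (size_t)sprintf(*buf + *len, PART_FMT, name, value);
    return 0;
}
```

Each call leaves three bytes of slack, which the next call's output simply
overwrites.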
Manipulating
When <return> is pressed on a selected form control (display() returns RET_LINK,
and "form" of "active_link" is not FORM_NO), special action is taken in main().
On a form control, this generally involves getting a new value (either
implicitly through the link activation, or by explicit input), which is then saved
either in "link->value" or in "link->enabled" (depending on the control type),
and also stored to the item tree with update_form(). (So the new value will
show up on the output page.)
For text and password input fields, the user is prompted for a new value, which
is stored in "link->value".
For checkboxes, simply the "link->enabled" flag is toggled.
For radio buttons, "link->enabled" is set for the activated button, and reset
for all other radio buttons in the form which have the same "name". This is
done by iterating through all form items (with form_start()/form_next() without
"filter"), and every time a control with the same name is found, disabling it.
Reflecting the disabling in the item tree can't be done with update_form()
however, as the link number in the link list isn't known here. (It's not
returned nor used by form_next().) Thus, we grab the current string item from
the handle used by form_next(), and call set_form() with that. This is a hack
(it depends on the implementation of "struct Form_handle", which it shouldn't);
however, I don't know any simple and reasonable way to implement that with the
current link storing mechanism...
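The radio button logic can be sketched as below. To keep the example short, a
plain array stands in for the sequence of controls that
form_start()/form_next() would return; the struct layout, the FORM_RADIO and
FORM_CHECKBOX names and the function name are assumptions.

```c
#include <assert.h>
#include <string.h>

enum Form_type { FORM_RADIO, FORM_CHECKBOX };   /* hypothetical subset */

struct Link { enum Form_type form; const char *name; int enabled; };

/* Select "chosen" and clear every other radio button sharing its name.
 * "chosen" must have a name. */
void select_radio_sketch(struct Link *controls, int count, struct Link *chosen)
{
    for (int i = 0; i < count; ++i) {
        struct Link *l = &controls[i];
        if (l->form == FORM_RADIO && l->name != NULL
            && strcmp(l->name, chosen->name) == 0)
            l->enabled = 0;   /* the real code also redraws it via set_form() */
    }
    chosen->enabled = 1;
}
```

Controls of other types, or radio buttons with a different name, are left
untouched.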
For <select> options, the behaviour depends on whether the select has the
"multiple" attribute. (Which is coded in the link type: Either "FORM_OPTION" or
"FORM_MULTIOPTION") If yes, they behave like checkboxes, otherwise like radio
buttons.
On a submit button, a page load is performed (using load_page() as usual); the
target URL is taken from the "data.form->url" field of the item representing
the form to which the button belongs, retrieved with get_form_item(). The
"form" parameter is set to the form item itself; load_page() passes this on to
init_load(), where it is handled appropriately. (See hacking-load.*)
Submitting
When init_load() is called with a "form" argument, the form data is extracted
and encoded, and passed to the HTTP server.
For forms using the "GET" submit method, this is done in init_load() itself.
The data is extracted and encoded using url_encode(); the resulting CGI
parameter string is then passed to merge_urls(), where it is stored as part of
the resulting target URL (in place of any other CGI parameters). As the form
data is now part of the URL, no other special handling is necessary -- it will
be passed to the server with the URL.
For "POST" forms, the form item is simply passed on to http_init_load() and
from there to get_http_cmd(), where either url_encode() or mime_encode() is
used to get the encoded form data, which is then stored in the body of the HTTP
request and passed to the server.
Anchors
If the loaded link contains a fragment identifier, "active_anchor" is set to
the desired anchor after loading the page. This is done in load_page()
(described in hacking-load.*), after all other steps of the loading process are
completed.
Similar to links, anchors are extracted while parsing the page structure.
However, they are stored in another way: Every anchor has an own item, either
of type ITEM_BLOCK_ANCHOR or ITEM_INLINE_ANCHOR, depending on the anchor type.
Both types are Virtual Items; this is described in hacking-layout.*.
Jumping to an anchor primarily implies retrieving the matching anchor number. This is
done (inside load_page()) by comparing the fragment identifier given in the URL
with the names of all anchors from the anchor list. When the right anchor is
found, its entry number in the anchor list is stored in "page->active_anchor".
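The lookup amounts to a linear search by name. A minimal sketch, assuming the
anchor names are available as a flat array of strings (the real anchor list
stores more than that) and that the fragment identifier is passed without the
leading '#':

```c
#include <assert.h>
#include <string.h>

/* Return the index of the anchor named by "fragment", or -1 if none
 * matches. "names" is a hypothetical flat view of the anchor list. */
int find_anchor_sketch(const char *const *names, int count,
                       const char *fragment)
{
    for (int i = 0; i < count; ++i)
        if (strcmp(names[i], fragment) == 0)
            return i;
    return -1;
}
```

The returned index is what would be stored in "page->active_anchor".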
activate_anchor()
This function first calculates the position to scroll the page to. This
position depends on the setting of the "anchor_offset" config variable: If this
has a nonzero value, the page is scrolled so that the anchor will start at the
reciprocal value of "anchor_offset" of the screen height; when "anchor_offset"
is five for example (current default), we will scroll to the anchor start
position minus one fifth of LINES -- the anchor start will appear one fifth of
the screen height below the screen top. If "anchor_offset" is zero, the anchor
will be shown "link_margin" lines below the screen top instead.
With this calculated "optimal" scroll position scroll_to() is called, and
returns the actual pager position -- this may differ from the requested when
the anchor is near the screen top or bottom. Having this, the actual screen
positions of the anchor start and end are calculated. If these are identical
(empty block anchor), the end position (pointing, as always, *after* the last
anchor line), is incremented by one to cause a mark being displayed in the next
line. The theoretical start and end positions are then truncated to the screen
boundaries.
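The position arithmetic above can be sketched in two small helpers. The
function and parameter names are assumptions for the example; "lines" stands
for the screen height (LINES) and "margin" for the "link_margin" setting.

```c
#include <assert.h>

/* Desired page line at the screen top for an anchor starting at "start". */
int anchor_scroll_pos_sketch(int start, int lines, int anchor_offset,
                             int margin)
{
    if (anchor_offset != 0)
        return start - lines / anchor_offset;   /* e.g. one fifth of LINES */
    return start - margin;
}

static int clamp_sketch(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Screen rows [*top, *bottom) covered by the anchor, given the page line
 * actually shown at the screen top ("pager_pos"). */
void anchor_screen_span_sketch(int start, int end, int pager_pos, int lines,
                               int *top, int *bottom)
{
    *top = start - pager_pos;
    *bottom = end - pager_pos;
    if (*top == *bottom)
        ++*bottom;            /* empty block anchor: mark one line anyway */
    *top = clamp_sketch(*top, 0, lines);
    *bottom = clamp_sketch(*bottom, 0, lines);
}
```

With a screen height of 25 and "anchor_offset" set to 5, an anchor starting at
page line 100 asks for scroll position 95, so the anchor begins on screen
row 5.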
Finally, a mark is printed in the rightmost column of every screen line in the
area spanned by the anchor, as calculated before. These marks are printed
directly to the curses screen; they aren't stored in the item tree.
When activate_anchor() is called with -1 as anchor number, the previously
activated anchor is deactivated. The page isn't scrolled in this case, but the
screen position of the anchor is calculated as well; the marks are cleared by
re-rendering the area where the marks were drawn. (Calling render() (described
in hacking-layout.*) with the "overpaint" flag.)
Deactivation is done by the pager after the first keypress. (*After* the
associated function was performed.)