PLT MzScheme: Language Manual

Chapter 14

Support Facilities

14.1 Eval and Load

(eval expr) evaluates the S-expression expr in the current namespace.³² (See section 8 and section 7.4.1.5 for more information about namespaces.)
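For instance, a quoted expression is ordinary data until it is handed to eval. The following is a minimal sketch that assumes it is entered at the read-eval-print loop, so that + and * have their usual bindings in the current namespace; the name expr is only illustrative.

```scheme
;; A quoted S-expression is ordinary data until it is passed to eval.
(define expr '(+ 1 2))
(eval expr)                       ; => 3

;; The expression can also be constructed at run time.
(eval (list '* 2 (list '+ 3 4)))  ; => 14
```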

(load file-path) evaluates each expression in the specified file using eval.³³ The return value from load is the value of the last expression from the loaded file (or void if the file contains no expressions). If file-path is a relative pathname, then it is resolved to an absolute pathname using the current directory. Before the first expression of file-path is evaluated, the current load-relative directory (the value of the current-load-relative-directory parameter; see section 7.4.1.6) is set to the absolute pathname of the directory containing file-path; after the last expression in file-path is evaluated (or when the load is aborted), the load-relative directory is restored to its pre-load value.

(load-relative file-path) is like load, but when file-path is a relative pathname, it is resolved to an absolute pathname using the current load-relative directory rather than the current directory. If the current load-relative directory is #f, then load-relative is the same as load.
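The difference between load and load-relative matters mostly when one file loads another. The sketch below assumes a hypothetical project layout; the directory and file names are illustrative only.

```scheme
;; Hypothetical layout (these names are assumptions, not part of MzScheme):
;;   /home/me/project/main.ss
;;   /home/me/project/util.ss
;;
;; Contents of main.ss:

;; While main.ss is being loaded, the load-relative directory is the
;; directory containing main.ss, so util.ss is found there regardless
;; of the current directory.
(load-relative "util.ss")

;; By contrast, (load "util.ss") would resolve "util.ss" against the
;; current directory, which may differ from /home/me/project.
```

With this layout, evaluating (load "/home/me/project/main.ss") from any current directory should locate util.ss next to main.ss.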

(load/use-compiled file-path) is like load-relative, but load/use-compiled also checks for .zo files (usually produced with compile-file; see Chapter 9 in PLT MzLib: Libraries Manual) and for .so (Unix and Mac OS Classic), .dll (Windows), or .dylib (Mac OS X) files.³⁴ The check for a compiled file occurs whenever file-path ends with a dotted extension of three characters or fewer (e.g., .ss or .scm) and a compiled subdirectory exists in the same directory as file-path. A .zo version of the file is loaded if it exists directly in the compiled subdirectory. An .so or .dll version of the file is loaded if it exists within a native subdirectory of the compiled directory, in a deeper subdirectory as named by system-library-subpath. A compiled file is loaded only if its modification date is not older than the date for file-path. If both .zo and .so or .dll files are available, the .so or .dll file is used.

Multiple files can be combined into a single .so or .dll file by creating a special dynamic extension _loader.so or _loader.dll. When such an extension is present where a normal .so or .dll would be loaded, the _loader extension is loaded first. The result returned by _loader must be a procedure that accepts a symbol. This procedure will be called with a symbol matching the base part of file-path (without the directory path part of the name and without the filename extension), and the result must be two values; if #f is returned as the first result, then load/use-compiled ignores _loader for file-path and continues as normal. Otherwise, the first return value is yet another procedure. When this procedure is applied to no arguments, it should have the same effect as loading file-path. The second return value is either a symbol or #f; a symbol indicates that calling the returned procedure has the effect of declaring the module named by the symbol (which is potentially useful information to a load handler; see section 5.8).

While a .zo, .so, or .dll file is loaded (or while a thunk returned by _loader is invoked), the current load-relative directory is set to the directory of the original file-path.

(load/cd file-path) is the same as (load file-path), but load/cd sets both the current directory and current load-relative directory to the directory of file-path before the file's expressions are evaluated.

(read-eval-print-loop) starts a new read-eval-print loop using the current input, output, and error ports. When read-eval-print-loop starts, it installs a new error escape procedure (see section 6.7) that does not exit the read-eval-print loop. The read-eval-print-loop procedure does not return until eof is read as an input expression; then it returns void.

The read-eval-print-loop procedure is parameterized by the current prompt read handler, the current evaluation handler, and the current print handler; a custom read-eval-print loop can be implemented as in the following example (see also section 7.4.1):

  (parameterize ([current-prompt-read my-read] 
                 [current-eval my-eval] 
                 [current-print my-print]) 
    (read-eval-print-loop))
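One possible set of definitions for the my-read, my-eval, and my-print names used above is sketched here. The names orig-eval and orig-print and the "calc> " prompt are assumptions made for illustration; the sketch relies only on the handler conventions described in this section (a prompt read handler of no arguments, and evaluation and print handlers of one argument).

```scheme
;; Capture the handlers in effect before they are replaced, so that the
;; custom handlers can delegate to them instead of calling themselves.
(define orig-eval (current-eval))
(define orig-print (current-print))

;; Prompt read handler: takes no arguments, shows a prompt, and returns
;; the next expression (or eof, which ends the loop).
(define (my-read)
  (display "calc> ")
  (read))

;; Evaluation handler: takes the expression to evaluate.
(define (my-eval expr)
  (orig-eval expr))

;; Print handler: takes one value to print.
(define (my-print v)
  (display "=> ")
  (orig-print v))

(parameterize ([current-prompt-read my-read]
               [current-eval my-eval]
               [current-print my-print])
  (read-eval-print-loop))
```

Capturing the original handlers before the parameterize matters: a my-eval that simply called eval would re-enter itself, because eval dispatches to the current evaluation handler.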

14.2 Exiting

(exit [v]) passes v on to the current exit handler (see exit-handler in section 7.4.1.9). The default value for v is #t. If the exit handler does not escape or terminate the thread, void is returned.

The default exit handler quits MzScheme (or MrEd), using its argument as the exit code if it is between 1 and 255 inclusive (meaning ``failure''), or 0 (meaning ``success'') otherwise.

When MzScheme is embedded within another application, the default exit handler may behave differently.
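Because exit only passes its argument to the current exit handler, the handler can be replaced to intercept an exit request. The following sketch assumes let/ec and the exit-handler parameter as provided by MzScheme; the names result and k and the exit-requested tag are illustrative.

```scheme
;; Install an exit handler that escapes to k instead of quitting, so the
;; value passed to exit can be inspected.
(define result
  (let/ec k
    (parameterize ([exit-handler
                    (lambda (v) (k (list 'exit-requested v)))])
      (exit 3)
      'not-reached)))

result   ; => (exit-requested 3)
```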

14.3 Input Parsing

MzScheme's input parser obeys the following non-standard rules:

Square brackets (``['' and ``]'') and curly braces (``{'' and ``}'') can be used in place of parentheses. An open square bracket must be closed by a closing square bracket and an open curly brace must be closed by a closing curly brace. Whether square brackets are treated as parentheses is controlled by the read-square-bracket-as-paren parameter (see section 7.4.1.3). Similarly, the parsing of curly braces is controlled with the read-curly-brace-as-paren parameter. By default, square brackets and curly braces are treated as parentheses.
Vector constants can be unquoted, and a vector size can be specified with a decimal integer between the # and opening parenthesis. If the specified size is larger than the number of vector elements that are provided, the last specified element is used to fill the remaining vector slots. For example, #4(1 2) is equivalent to #(1 2 2 2). If no vector elements are specified, the vector is filled with 0. If a vector size is provided and it is smaller than the number of elements provided, the exn:read exception is raised.
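A few expressions illustrate the vector shorthand just described. This is a sketch assuming the default reader settings.

```scheme
;; The last element supplied fills the remaining slots.
(equal? #4(1 2) (vector 1 2 2 2))   ; => #t

;; With no elements supplied, the vector is filled with 0.
(vector-length #3())                ; => 3
(vector-ref #3() 0)                 ; => 0
```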
Boxed constants can be created using #&. The S-expression following #& is treated as a quoted constant and put into the new box. (Spaces following the #& are ignored.) Box reading is controlled with the read-accept-box boolean parameter (see section 7.4.1.3). Box reading is enabled by default. When box reading is disabled and #& is provided as input, the exn:read exception is raised.
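As a brief illustration of box constants (a sketch assuming box reading is enabled, which is the default; the name b is illustrative):

```scheme
;; The datum after #& is treated as a quoted constant.
(define b #&(1 2))
(box? b)     ; => #t
(unbox b)    ; => (1 2)
```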
Expressions beginning with #' are wrapped with syntax, in the same way that expressions starting with ' are wrapped with quote. Similarly, #` generates quasisyntax, #, generates unsyntax, and #,@ generates unsyntax-splicing. See also section 12.2.1.2.
The following character constants are recognized:
- #\nul or #\null (ASCII 0)
- #\backspace (ASCII 8)
- #\tab (ASCII 9)
- #\newline or #\linefeed (ASCII 10)
- #\vtab (ASCII 11)
- #\page (ASCII 12)
- #\return (ASCII 13)
- #\space (ASCII 32)
- #\rubout (ASCII 127)
Whenever #\ is followed by at least two alphabetic characters, characters are read from the input port until the next non-alphabetic character is returned. If the resulting string of letters does not match one of the above constants (case-insensitively), the exn:read exception is raised.

Character constants can also be specified through direct ASCII values in octal notation: #\n₁n₂n₃ where n₁ is in the range [0, 3] and n₂ and n₃ are in the range [0, 7]. Whenever #\ is followed by at least two characters in the range [0, 7], the next character must also be in this range and the resulting octal number must be in the range 000₈ to 377₈.
Within string constants, the following escape sequences are recognized in addition to \" and \\:
- \a: alarm (ASCII 7)
- \b: backspace (ASCII 8)
- \t: tab (ASCII 9)
- \n: linefeed (ASCII 10)
- \v: vertical tab (ASCII 11)
- \f: formfeed (ASCII 12)
- \r: return (ASCII 13)
- \e: escape (ASCII 27)
- \o, \oo, or \ooo: ASCII for octal o, oo, or ooo, where each o is 0, 1, 2, 3, 4, 5, 6, or 7. The \ooo form takes precedence over the \oo form, and \oo takes precedence over \o.
- \xh or \xhh: ASCII for hexadecimal h or hh, where each h is 0, 1, 2, 3, 4, 5, 6, 7, a, A, b, B, c, C, d, D, e, E, f, or F. The \xhh form takes precedence over the \xh form.
Furthermore, a backslash followed by a linefeed, carriage return or return-linefeed combination is elided, allowing string constants to span lines. Any other use of backslash within a string constant is an error.
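The escape sequences above can be checked with a few string operations. This is a sketch assuming the reader behaves as described in this section.

```scheme
;; \t is a single tab character.
(string-length "a\tb")                 ; => 3

;; Hexadecimal 41 and octal 101 both denote ASCII 65, the letter A.
(string=? "\x41\101" "AA")             ; => #t

;; \e is the escape character.
(char->integer (string-ref "\e" 0))    ; => 27

;; A backslash at the end of a line is elided together with the line
;; break, so the constant continues on the next line.
(string=? "ab\
cd" "abcd")                            ; => #t
```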
Numbers containing a decimal point or exponent (e.g., 1.3, 2e78) are normally read as inexact. If the read-decimal-as-inexact parameter is set to #f, then such numbers are instead read as exact. The parameter does not affect the parsing of numbers with an explicit exactness tag (#e or #i).
A parenthesized sequence containing two delimited dots (``.'') triggers infix parsing. A single datum must appear between the dots, and one or more datums must appear before the first dot and after the last dot:
(left-datum ···¹ . first-datum . right-datum ···¹)
The resulting list consists of the datum between the dots, followed by the remaining datums in order:
(first-datum left-datum ···¹ right-datum ···¹)
Consequently, the input expression (1 . < . 2) produces #t, and (1 2 . + . 3 4 5) produces 15.
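Quoting an infix form shows the list that the reader actually produces. This is a sketch assuming the default reader settings.

```scheme
;; The datum between the dots moves to the front of the list.
(quote (1 . < . 2))          ; => (< 1 2)
(1 . < . 2)                  ; => #t

(quote (1 2 . + . 3 4 5))    ; => (+ 1 2 3 4 5)
(1 2 . + . 3 4 5)            ; => 15
```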
When the read-accept-dot parameter is set to #f, then a delimited dot (``.'') is disallowed in input. When the read-accept-quasiquote parameter is set to #f, then a backquote or comma is disallowed in input. These modes simplify Scheme's input model for students.
MzScheme's identifier and symbol syntax is considerably more liberal than the syntax specified by R5RS. When input is scanned for tokens, the following characters delimit an identifier:

" , ' ` ( ) [ ] { } space tab return newline page vtab

In addition, an identifier cannot start with a hash mark (``#'') unless the hash mark is immediately followed by a percent sign (``%''). The only other special characters are backslash (``\'') or quoting vertical bars (``|''); any other character is used as part of an identifier.

Symbols containing special characters (including delimiters) are expressed using an escaping backslash (``\'') or quoting vertical bars (``|''):
- A backslash preceding any character includes that character in the symbol literally; double backslashes produce a single backslash in the symbol.
- Characters between a pair of vertical bars are included in the symbol literally. Quoting bars can be used for any part of a symbol, or the whole symbol can be quoted. Backslashes and quoting bars can be mixed within a symbol, but a backslash is not a special character within a pair of quoting bars.
Characters quoted with a backslash or a vertical bar always preserve their case, even when identifiers are read case-insensitively (the default).

An input token constructed in this way is an identifier when it is not a numerical constant (following the extended number syntax described in section 3.3). A token containing a backslash or vertical bars is never treated as a numerical constant.

Examples:
- (quote a\(b) produces the same symbol as (string->symbol "a(b").
- (quote A\B) produces the same symbol as (string->symbol "aB") when identifiers are read without case-sensitivity.
- (quote a\ b), (quote |a b|), and (quote a| |b) all produce the same symbol as (string->symbol "a b").
- (quote |a||b|) is the same as (quote ab).
- (quote 10) is the number 10, but (quote |10|) produces the same symbol as (string->symbol "10").
Whether a vertical bar is used as a special or normal symbol character is controlled with the read-accept-bar-quote boolean parameter (see section 7.4.1.3). Vertical bar quotes are enabled by default. Quoting backslashes cannot be disabled.
By default, symbols are read case-insensitively (i.e., uppercase characters are downcased). Case sensitivity for reading can be controlled in three ways:
- Quoting part of a symbol with an escaping backslash (``\'') or quoting vertical bar (``|'') always preserves the case of the quoted portion, as described above.
- The sequence #cs can be used as a prefix for any expression to make reading symbols within the expression case-sensitive. A #ci prefix similarly makes reading symbols in an expression case-insensitive. Whitespace can appear between a #cs or #ci prefix and its expression, and prefixes can be nested. Backslash and vertical-bar quotes override a #ci prefix.
- When the read-case-sensitive parameter (see section 7.4.1.3) is set to #t, then case is preserved when reading symbols. The default is #f, and it is set to #f while loading a module (see section 5.8). A #cs or #ci prefix overrides the parameter setting, as does backslash or vertical-bar quoting.
Case conversions are not sensitive to the current locale (see section 7.4.1.11).
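The three mechanisms can be compared directly. The sketch below assumes the default setting described above, in which read-case-sensitive is #f.

```scheme
;; Default reading downcases unquoted symbols.
(eq? (quote Hello) (quote hello))                      ; => #t

;; A #cs prefix preserves case within the prefixed expression.
(eq? (quote #cs Hello) (string->symbol "Hello"))       ; => #t

;; Quoting with vertical bars also preserves case.
(eq? (quote |Hello|) (string->symbol "Hello"))         ; => #t

;; Prefixes nest: B is read case-insensitively inside a #cs form.
(equal? (quote #cs(A #ci B))
        (list (string->symbol "A") (string->symbol "b")))  ; => #t
```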
S-expressions with shared structure are expressed using #n= and #n#, where n is a decimal integer. See section 14.5.
Expressions of the form #%x are symbols, where x can be a symbol or a number.
Expressions beginning with #~ are interpreted as compiled MzScheme code. See section 14.6.
Multi-line comments are started with #| and terminated with |#. Comments of this form can be nested arbitrarily.
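For example, in the following sketch the reader skips everything up to the final |#, including the nested comment:

```scheme
#| outer comment
   #| nested comment |#
   still inside the outer comment
|#
(display "after the comment")
(newline)
```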
If the first line of a loaded file begins with #!, it is ignored by the default load handler. If an ignored line ends with a backslash (``\''), then the next line is also ignored. (The #! convention is for shell scripts; see Chapter 18 for details.)

Reading from a custom port can produce arbitrary values generated by the port; see section 11.1.6 for details. If the port generates a non-character value in a position where a character is required (e.g., within a string), the exn:read:non-char exception is raised.

14.4 Output Printing

MzScheme's printer obeys the following non-standard rules:

A vector can be printed by write and print using the shorthand described in section 14.3, where the vector's length is printed between the leading # and the opening parenthesis and repeated tail elements are omitted. For example, #(1 2 2 2) is printed as #4(1 2). The display procedure does not output vectors using this shorthand. Shorthand vector printing is controlled with the print-vector-length boolean parameter (see section 7.4.1.4). Shorthand vector printing is enabled by default.
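The effect of the print-vector-length parameter can be seen as follows. This is a sketch assuming the defaults described above; the name v is illustrative.

```scheme
(define v (vector 1 2 2 2))

(write v)                      ; prints #4(1 2)
(newline)

(display v)                    ; prints #(1 2 2 2)
(newline)

;; Disabling the shorthand makes write print every element.
(parameterize ([print-vector-length #f])
  (write v))                   ; prints #(1 2 2 2)
(newline)
```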
Boxes (see section 3.9) can be printed with the #& notation (see section 14.3). When box printing is disabled, all boxes are printed as #<box>. Box printing is controlled with the print-box boolean parameter (see section 7.4.1.4). Box printing is enabled by default.
Structures (see Chapter 4) can be printed using vector notation. In the vector, the first item is a symbol of the form struct:s -- where s is the name of the structure -- and the remaining elements are the elements of the structure, but the vector exposes only as much information about the structure as the current inspector can access (see section 4.6). When structure printing is disabled, or when no part of the structure is accessible to the current inspector, a structure is printed as #<struct:s>. Structure printing is controlled with the print-struct boolean parameter (see section 7.4.1.4). Structure printing is disabled by default.
Symbols containing spaces or special characters write using escaping backslashes and quoting vertical bars. When the read-case-sensitive parameter is set to #f, then symbols containing uppercase characters also use escaping backslashes or quoting vertical bars. In addition, symbols are quoted with vertical bars or a leading backslash when they would otherwise print the same as a numerical constant. If the value of the read-accept-bar-quote boolean parameter is #f (see section 7.4.1.3), then backslashes are always used to escape special characters instead of quoting them with vertical bars, and a vertical bar is not treated as a special character. See section 14.3 for more information about symbol parsing. Symbols display without escaping or quoting special characters.
Characters with the special names described in section 14.3 write using the same name. (Some characters have multiple names; the #\newline and #\nul names are used instead of #\linefeed and #\null). Other ``printable'' characters write as #\ followed by the single-byte character value, and ``unprintable'' characters are written in octal notation; unprintable-character detection depends on locale sensitivity (see section 7.4.1.11), and when the current locale is disabled, a character whose char->integer value is greater than 127 is always treated as unprintable. All characters display as the single-byte character value.
Strings containing ``unprintable'' characters (see above) write using the escape sequences described in section 14.3. All strings display as their literal character sequences.
S-expressions with shared structure can be printed using #n= and #n#, where n is a decimal integer. See section 14.5.

14.5 Data Sharing in Input and Output

MzScheme can read and print graphs, S-expressions with shared structure (e.g., a cycle). Graphs are described by tagging the shared structure once with #n= (using some decimal integer n with no more than eight digits) and then referencing it later with #n# (using the same number n). For example, the following S-expression describes the infinite list of ones:

#0=(1 . #0#)

If this graph is entered into MzScheme's read-eval-print loop, MzScheme's compiler will loop forever, trying to compile an infinite expression. In contrast, the following expression defines ones to be the infinite list of ones, using quote to hide the infinite list from the compiler:

(define ones (quote #0=(1 . #0#)))
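A few checks confirm that such a list really is a single cons cell whose tail is the cell itself. This is a sketch assuming graph reading is enabled, which is the default.

```scheme
(define ones (quote #0=(1 . #0#)))

(car ones)                 ; => 1
(car (cdr (cdr ones)))     ; => 1

;; The tail of the list is the list itself.
(eq? ones (cdr ones))      ; => #t
```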

A tagged structure can be referenced multiple times. Here, v is defined to be a vector containing the same cons cell in all three slots:

(define v #(#1=(1 . 2) #1# #1#))

A tag #n= must appear to the left of all references #n#, and all references must appear in the same top-level S-expression as the tag. By default, MzScheme's printer will display a value without showing the shared structure:

#((1 . 2) (1 . 2) (1 . 2))

Graph reading and printing are controlled with the read-accept-graph and print-graph boolean parameters (see section 7.4.1.4). Graph reading is enabled by default, and graph printing is disabled by default. However, when the printer encounters a graph containing a cycle, graph printing is automatically enabled, temporarily. (For this reason, the display, write, and print procedures require memory proportional to the depth of the value being printed.) When graph reading is disabled and a graph is provided as input, the exn:read exception is raised.
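Enabling print-graph makes shared (but acyclic) structure visible in the output. The sketch below uses a list rather than a vector so that the vector-length shorthand does not affect the result; the names p and lst are illustrative.

```scheme
(define p (cons 1 2))
(define lst (list p p))        ; both elements are the same cons cell

(write lst)                    ; prints ((1 . 2) (1 . 2))
(newline)

(parameterize ([print-graph #t])
  (write lst))                 ; prints (#0=(1 . 2) #0#)
(newline)
```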

If the n in a #n= form or a #n# form contains more than eight digits, the exn:read exception is raised. If a #n# form is not preceded by a #n= form using the same n, the exn:read exception is raised. If two #n= forms are in the same expression for the same n, the exn:read exception is raised.

14.6 Compilation

Normally, compilation happens automatically: when an S-expression is evaluated, it is first compiled and then the compiled code is executed. However, MzScheme can also write and read compiled code. MzScheme can read compiled code much faster than reading S-expression code and compiling it, so compilation can be used to speed up program loading. The MzLib procedure compile-file (see Chapter 9 in PLT MzLib: Libraries Manual) is sufficient for most compilation purposes.

(compile expr) compiles expr, where expr is any S-expression that can be passed to eval. The result is a compiled-expression value, which can be passed to eval to evaluate the compiled expression.
(compiled-expression? v) returns #t if v is a compiled expression, #f otherwise.
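As a brief illustration of these two procedures (a sketch assuming it is entered at the read-eval-print loop; the name c is illustrative):

```scheme
;; Compile once, then evaluate the compiled form.
(define c (compile '(+ 1 2)))

(compiled-expression? c)          ; => #t
(compiled-expression? '(+ 1 2))   ; => #f

(eval c)                          ; => 3
```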

When a compiled expression is written to an output port, the written form starts with #~. These expressions are essentially assembly code for the MzScheme interpreter, and reading such an expression produces a compiled expression. When a compiled expression contains syntax object constants, the #~ form of the expression drops location information and properties for the syntax objects (see section 12.2 and section 12.6.2).

Never ask MzScheme to evaluate an expression starting with #~ unless compile generated the expression. To keep users from accidentally specifying bad instructions, read will not accept expressions beginning with #~ unless it is specifically enabled through the read-accept-compiled boolean parameter (see section 7.4.1.3). When the default load handler is used to load a file, compiled expression reading is automatically (temporarily) enabled as each expression is read.

A compiled code object may contain uninterned symbols (see section 3.6) that were created by gensym, string->uninterned-symbol, and generate-temporaries. When the compiled object is read via #~, each uninterned symbol in the original expression is mapped to a new uninterned symbol, where multiple instances of a single symbol are consistently mapped to the same new symbol. The original and new symbols have the same printed representation.

Special problems arise when an uninterned symbol is used to construct an identifier for a module's exported variable. Since a module and its importers are typically compiled and written as separate expressions, different uninterned symbols are generated for identifiers when the different modules are loaded. MzScheme corrects for the problem by recording identifier-position pairs for each import, and then resolving imports at load time by checking the printed representation of a name at the expected position in the exporting module. (In principle, the position is sufficient, but MzScheme checks the printed form of the name to help avoid confusion due to mis-ordered compilation sequences.)

14.7 Dynamic Extensions

A dynamically-linked extension library is loaded into MzScheme with (load-extension file-path). The separate document Inside PLT MzScheme contains information about writing MzScheme extensions. An extension can only be loaded once during a MzScheme session, although the extension-writer can provide functionality to handle extra calls to load-extension for a single extension.

As with load, the current load-relative directory (the value of the current-load-relative-directory parameter; see section 7.4.1.6) is set while the extension is loaded. The load-relative-extension procedure is like load-extension, but it loads an extension with a pathname that is relative to the current load-relative directory instead of the current directory.

The load-extension procedure actually just dispatches to the current load extension handler (see section 7.4.1.6). The result of calling load-extension is determined by the extension. If the extension cannot be loaded, the exn:i/o:filesystem exception is raised. The detail field of the exception is 'wrong-version if the load fails because the extension has the wrong version.
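A failed load can be intercepted with an exception handler. The sketch below assumes that the exn:i/o:filesystem structure provides the usual predicate and a detail accessor named exn:i/o:filesystem-detail, following the structure naming conventions; the file name my-extension.so is hypothetical.

```scheme
;; Returns the extension's result on success, or #f after reporting
;; why the load failed.
(with-handlers ([exn:i/o:filesystem?
                 (lambda (exn)
                   (if (eq? (exn:i/o:filesystem-detail exn) 'wrong-version)
                       (display "extension was built for another version")
                       (display (exn-message exn)))
                   (newline)
                   #f)])
  (load-extension "my-extension.so"))   ; hypothetical file name
```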

14.8 Saving and Restoring Program Images

An image is a memory dump from a running MzScheme program that can be later restored (one or more times) to continue running the program from the point of the dump. Images are only supported for statically-linked Unix versions of MzScheme (and MrEd). There are a few special restrictions on images:

All files and TCP connections must be closed when an image is created.
No dynamic extensions can be loaded before an image is created.
No operating system subprocesses can be active when an image is created.

(write-image-to-file file-path [cont-proc]) copies the state of the entire MzScheme process³⁵ to file-path, replacing file-path if it already exists. If images are not supported, the exn:misc:unsupported exception is raised. If cont-proc is #f, then the MzScheme or MrEd process exits immediately after creating the image. Otherwise, cont-proc must be a procedure of no arguments, and the return value(s) of the call to write-image-to-file is (cont-proc). The default value for cont-proc is void.

(read-image-from-file file-path arg-vector) restores the image saved to file-path. Once the image is restored, execution of the original program continues with the return from write-image-to-file; the return value in the restored program is the vector of strings arg-vector. A successful call to read-image-from-file never returns because the restored program is overlaid on the current program. The vector arg-vector must contain no more than 20 strings, and the total length of the strings must be no more than 2048 characters.

If an error is encountered while reading or writing an image, the exn:i/o:filesystem or exn:misc exception is raised. Certain errors during read-image-from-file are unrecoverable; in case of such errors, MzScheme prints an error message and exits immediately.

An image can also be restored by starting the stand-alone version of MzScheme or MrEd with the --restore flag followed by the image filename. The return value from write-image-to-file in the restored program is a vector of strings that are the extra arguments provided on the command line after the image filename (if any).

³² The eval procedure actually calls the current evaluation handler (see section 7.4.1.5) with expr to evaluate the expression.

³³ The load procedure actually just sets the current load-relative directory and calls the current load handler (see section 7.4.1.6) with file-path to load the file. The description of load here is actually a description of the default load handler.

³⁴ The load/use-compiled procedure actually just calls the current load/use-compiled handler (see section 7.4.1.6). The default handler, in turn, calls the load or load-extension handler, depending on the type of file that is loaded.

³⁵ The set of environment variables is not saved. When an image is restored, the environment variables of the restoring program are transferred into the restored program.