The Macrological Fascicle

Chapter 3

Syntax objects

Syntax objects are the means by which the hygiene model (section 1.2) is implemented in the Scheme language. Through them, macros written by users obtain information about the forms used to invoke them at the site of macro use, and through them they produce their output.

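A syntax object may be wrapped, as described in the hygiene model. It may also be unwrapped, fully or partially, i.e., consist of list and vector structure with wrapped syntax objects or nonsymbol values at the leaves. More formally, a syntax object is one of: a pair of syntax objects; a vector of syntax objects; a nonpair, nonvector, nonsymbol value; or a wrapped syntax object.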

A syntax object may contain circular structures created by datum labels or by use of the datum->syntax procedure (section 3.2). An implementation may also consider other, non-datum values to be syntax objects, but the meaning and behaviour of such values when included in the output of transformers is not defined by this report.

The distinction between the terms ‘syntax object’ and ‘wrapped syntax object’ is important. For example, when invoked by the expander, a transformer procedure must accept a wrapped syntax object but may return any syntax object, including an unwrapped syntax object. Wrapped syntax objects are distinct from other types of values.

Identifiers

Syntax objects representing identifiers are always wrapped. A symbol which is not wrapped is never a valid syntax object.

(identifier? obj)
procedure

Returns #t if obj is an identifier, i.e., a syntax object representing an identifier, and #f otherwise.

Examples:

(identifier? #'x)
#t
(identifier? 'x)
#f
(identifier? #'(x))
#f

(identifier-defined? id)
procedure

Returns #t if the given id has a binding associated with it, or #f otherwise.

Operationally, identifier-defined? returns #t if the given identifier has a lexical address associated with it within its lexical environment, and #f otherwise.

Rationale: While it is possible to detect whether a particular identifier is bound or not using identifier properties (section 2.5), it is somewhat cumbersome to have to catch and deal with the exception raised by an attempted reference to a property on an unbound identifier. The identifier-defined? procedure also does not depend on the environment maintained by the expander, and can therefore be used outside of the dynamic extent of a call to a macro transformer.

In general, a test to determine whether an identifier is bound or not is useful to improve error reporting in macros which depend on some identifier named within them having been bound outside of the macro use.
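
Example: The error-reporting use described above might look like the following sketch. The names log-message and current-logger are hypothetical and serve only as an illustration. The macro expects the user to have bound current-logger at the macro use site and reports a descriptive error if it is not bound.

```scheme
;; Sketch only: log-message and current-logger are hypothetical names.
(define-syntax log-message
  (lambda (stx)
    (let* ((form (unwrap-syntax stx))
           (keyword (car form))          ; the identifier log-message
           (args (cdr form))             ; the remaining forms, still wrapped
           ;; Give the name current-logger the context of the macro use.
           (logger-id (datum->syntax keyword 'current-logger)))
      ;; Fail early, with a message that names the missing binding.
      (unless (identifier-defined? logger-id)
        (error "log-message: no binding for the expected identifier"
               (syntax->datum logger-id)))
      ;; Expand into a call of the user's logger on the arguments.
      `(,logger-id . ,args))))
```

Used within (let ((current-logger list)) ...), the form (log-message 1 2) would expand into (current-logger 1 2). Without such a binding the transformer signals the error above, rather than leaving the user with an unbound-variable error in code they did not write.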

Examples:

(identifier-defined? #'identifier-defined?)
#t

Assuming no identifier x is defined:

(identifier-defined? #'x)
#f
(let ((x 1)) (identifier-defined? #'x))
#t

(generate-identifier)
procedure
(generate-identifier symbol)
procedure

Returns a new identifier. The optional argument symbol specifies the symbolic name of the resulting identifier. The returned identifier is guaranteed not to be bound-identifier=? to any existing identifier. If the optional symbol argument is not given, the returned identifier should also not be symbolic-identifier=? to any existing identifier.

Operationally, (generate-identifier symbol) returns a new wrapped syntax object wrapping the symbol with a history containing a time-stamp for the end of a fictive macro transcription step.
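
Example: The following sketch illustrates the guarantees stated above. The variable names t1, t2 and named are chosen only for illustration, and the comments restate what the description requires rather than reporting the output of any particular implementation.

```scheme
(define t1 (generate-identifier))
(define t2 (generate-identifier))
(define named (generate-identifier 'tmp))

(identifier? t1)               ; #t: the result is an identifier
(bound-identifier=? t1 t2)     ; #f: never bound-identifier=? to an existing identifier
(symbolic-identifier=? t1 t2)  ; should be #f, since no symbol argument was given
(syntax->datum named)          ; tmp: the requested symbolic name
(bound-identifier=? named (generate-identifier 'tmp))
                               ; #f: same name, but different histories
```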

(generate-temporaries list-stx)
procedure

list-stx must be a list or syntax object representing a list-structured form.

Returns a list of generated identifiers as long as the input list list-stx. Each generated identifier is subject to the same requirements as imposed on generate-identifier when called without an argument.

Operationally, generate-temporaries first converts list-stx to a proper list, unwrapping the successive cdrs of any wrapped pairs, then calls map on the resulting list, generating a new identifier with (generate-identifier) for each item.
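
The operational description above can be sketched as follows. This is an illustration, not a normative definition: the names syntax->proper-list and my-generate-temporaries are hypothetical, and the sketch assumes that unwrap-syntax and generate-identifier behave as described in this chapter.

```scheme
;; Convert a list-structured syntax object to a proper list by
;; unwrapping the successive cdrs of any wrapped pairs.
(define (syntax->proper-list stx)
  (let ((x (unwrap-syntax stx)))   ; unchanged if stx is not wrapped
    (cond ((null? x) '())
          ((pair? x)
           (cons (car x) (syntax->proper-list (cdr x))))
          (else
           (error "not a list-structured form" stx)))))

;; Generate one new identifier per item, ignoring the items themselves.
(define (my-generate-temporaries list-stx)
  (map (lambda (ignored) (generate-identifier))
       (syntax->proper-list list-stx)))
```

Under these assumptions, (my-generate-temporaries (quote-syntax (a b c))) returns a list of three identifiers, none of which is bound-identifier=? to another.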

(bound-identifier=? id1 id2)
procedure

Returns #t if a binding for one id would capture a reference to the other in the output of the transformer, assuming that the reference appears within the scope of the binding, and #f otherwise. In general, two identifiers are bound-identifier=? only if both are present in the original program or both are introduced by the same transformer application (perhaps implicitly — see datum->syntax).

Operationally, bound-identifier=? returns #t if id1 and id2 both have the same symbolic names and the same histories (discarding time-stamps for the beginning and end of the same macro transcription step), and #f otherwise.
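
A typical use of bound-identifier=? is to detect duplicate identifiers in a binding construct, since two formal parameters that are bound-identifier=? would capture each other's references. The following is a minimal sketch; the name duplicate-identifier? is hypothetical.

```scheme
;; Returns #t if some identifier in the list ids is bound-identifier=?
;; to an identifier later in the list, and #f otherwise.
(define (duplicate-identifier? ids)
  (let outer ((ids ids))
    (and (pair? ids)
         (or (let inner ((rest (cdr ids)))
               (and (pair? rest)
                    (or (bound-identifier=? (car ids) (car rest))
                        (inner (cdr rest)))))
             (outer (cdr ids))))))

(duplicate-identifier? (list #'x #'y #'x))  ; #t
(duplicate-identifier? (list #'x #'y))      ; #f
```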

Examples:

(bound-identifier=? #'x #'x)
#t
(bound-identifier=? #'x #'y)
#f
(bound-identifier=? (generate-identifier 'x)
                    (generate-identifier 'x))
#f

(symbolic-identifier=? id1 id2)
procedure

Returns #t if the two ids have the same symbolic name, and #f otherwise.

Rationale: An example definition for this procedure was given in the R6RS, but the procedure was not actually made part of any library. Since it is part of the operation of free-identifier=? and is needed to implement the small language’s cond-expand, it is provided here.

Examples:

(symbolic-identifier=? (generate-identifier 'x)
                       (generate-identifier 'x))
#t
(symbolic-identifier=? #'x #'y)
#f

Implementation:

(define (symbolic-identifier=? id_1 id_2)
  (symbol=? (syntax->datum id_1)
            (syntax->datum id_2)))

(free-identifier=? id1 id2)
procedure

Returns #t if the two ids would refer to the same lexical binding if inserted as free identifiers in the output of the transformer. If neither of the ids is lexically bound, returns #t if they are symbolic-identifier=?. Otherwise, returns #f.

Operationally, free-identifier=? returns #t if id1 and id2 map to the same lexical address within their respective lexical environments, or if neither maps to any lexical address and their symbolic names are the same, and #f otherwise.

Free-identifier=? can be used within transformers to find uses of auxiliary syntax keywords.
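
As an illustration of recognizing an auxiliary syntax keyword, the following sketch tests whether a clause begins with else. The name else-clause? is hypothetical, and the sketch assumes that else is imported from (scheme base) where the helper is defined. To be called from a transformer, the helper would have to be available at expansion time.

```scheme
;; Returns #t if clause-stx represents a form whose first subform is an
;; identifier referring to the same binding as else, and #f otherwise.
(define (else-clause? clause-stx)
  (let ((clause (unwrap-syntax clause-stx)))
    (and (pair? clause)
         (identifier? (car clause))
         (free-identifier=? (car clause) (quote-syntax else)))))

(else-clause? (quote-syntax (else 1)))    ; #t
(else-clause? (quote-syntax (x 1)))       ; #f
(let ((else #f))
  (else-clause? (quote-syntax (else 1)))) ; #f: this else is a local variable
```

Comparing with free-identifier=? rather than by symbolic name means that a renamed import of else is still recognized, and that a local variable which happens to be called else is not.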

Examples:

(import (scheme base)
        (rename (scheme base)
                (else otherwise)))
(free-identifier=? #'else #'otherwise)
#t
(free-identifier=? #'else #'=>)
#f

The following examples show that unbound identifiers compare the same if they have the same symbolic names. The examples assume that no identifier x is defined.

(free-identifier=? (generate-identifier 'x)
                   (generate-identifier 'x))
#t
(free-identifier=? #'x (generate-identifier 'x))
#t
(let ((x 1))
  (free-identifier=? #'x (generate-identifier 'x)))
#f

Wrapped syntax objects

(quote-syntax syntactic datum)
syntax

Syntax: The syntactic datum is either an identifier, or a datum which is neither an identifier nor a list nor a vector, or one of the following.

(syntactic datum ...)

(syntactic datum ... . syntactic datum)

#(syntactic datum ...)

Semantics: Quote-syntax is the syntactic analogue of quote. It evaluates to a syntax object representation of the syntactic datum which retains hygiene information for the identifiers contained in the syntactic datum. The result of evaluating a quote-syntax expression is suitable for inclusion in the expansion of a macro use.

Examples:

(symbol? (quote-syntax x))
#f
(identifier? (unwrap-syntax (quote-syntax x)))
#t
(let-syntax ((car (lambda (x) (quote-syntax car))))
  ((car) '(0)))
0
(let-syntax
    ((quote-foo
      (lambda (stx)
        (quote-syntax (quote foo)))))
  (let ((quote (lambda (x) 'bar)))
    (quote-foo)))
foo

Note: This form was called syntax in the R4RS. In the R6RS, the form called syntax was extended with additional functionality; that is also the version which appears under that name in this report (section 4.3).

(unwrap-syntax stx)
procedure

Unwraps the immediate datum structure from the syntax object stx, leaving nested syntax structure (if any) in place, without stripping any syntactic information from identifiers.

Operationally, if stx is an identifier or is not a wrapped syntax object, then it is returned unchanged. Otherwise unwrap-syntax converts the outermost structure of stx into a data object, returning a pair whose car and cdr are syntax objects, a vector whose elements are syntax objects, or a Scheme value which is neither an identifier, a pair, nor a vector. Syntax objects within the pairs or vectors returned by unwrap-syntax retain their original hygiene information.

Examples:

(identifier? (unwrap-syntax (quote-syntax x)))
#t
(identifier? (cdr (unwrap-syntax (quote-syntax (x . y)))))
#t
(identifier? (cdr (unwrap-syntax (quote-syntax (x y z)))))
#f

(syntax->datum stx)
procedure

Strips all syntactic information from the syntax object stx and returns the corresponding Scheme datum. Identifiers stripped in this manner are converted to their symbolic names.

The result of syntax->datum must not be and must not contain any wrapped syntax objects. If a datum wrapped within stx contains cycles, these must be present (re-created in a copy, if necessary) within the returned datum.

Rationale: By processing the result of calling syntax->datum on a syntax object in parallel with unwrapping the original syntax object step by step, transformers which need to handle cyclical structures specially can detect and process such structures as appropriate.

This procedure irrevocably deletes hygiene information from identifiers: syntax->datum and datum->syntax cannot, in general, round-trip cleanly.
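
The loss of hygiene information can be illustrated with the following sketch. The names fresh, rebuilt and here are hypothetical, and the results in the comments follow from the operational descriptions in this chapter rather than from any particular implementation.

```scheme
;; An identifier whose history differs from that of any other identifier.
(define fresh (generate-identifier 'x))

;; Strip it to a symbol, then wrap the symbol again in the context of
;; an unrelated identifier.
(define rebuilt
  (datum->syntax (quote-syntax here) (syntax->datum fresh)))

(symbolic-identifier=? fresh rebuilt)  ; #t: the symbolic name survives
(bound-identifier=? fresh rebuilt)     ; #f: the original history does not
```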

Examples:

(symbol? (syntax->datum #'x))
#t
(syntax->datum (quote-syntax (quote #1=(a . #1#))))
(quote #1=(a . #1#))

Note: This procedure, which can operate on entire expressions and not just individual identifiers, replaces the procedure identifier->symbol of the R4RS.

(datum->syntax context-id datum)
procedure

context-id must be an identifier and datum should be a datum value.

Returns a syntax object representation of datum that contains the same contextual information as context-id, with the effect that the syntax object behaves as if it were introduced into the code when context-id was introduced.

Operationally, datum->syntax creates a new wrapped syntax object which wraps datum and which copies its hygiene information from context-id.

Note: This procedure, which can operate on entire expressions and not just individual symbols, replaces the procedure construct-identifier of the R4RS.

Todo: What if datum already contains wrapped syntax objects? Should they be unchanged in the output?

Example: The following macro makes an early return procedure available in its body under the name return, without this name having to be explicitly given to the macro.

(define-syntax with-return
  (lambda (stx)
    (let ((return-id
           (datum->syntax (car (unwrap-syntax stx)) 'return))
          (body (cdr (unwrap-syntax stx))))
      `(,(quote-syntax call-with-current-continuation)
         (,(quote-syntax lambda) (,return-id) . ,body)))))
(define (find-odd ls)
  (with-return
    (for-each
     (lambda (n) (if (odd? n) (return n)))
     ls)))
(find-odd '(6 2 8 3 1 8))
3

The following example shows how the name return is introduced into the code at the same time the keyword with-return in the macro use was introduced: it is available within the code introduced by the expansion of a hygienic syntax-rules macro, and not to the user of that macro, preserving the referential transparency of hygienic macros which make use of non-hygienic macros in their implementation.

(define-syntax suppress-exceptions
  (syntax-rules ()
    ((_ body_0 body_1 ...)
     (with-return
       (with-exception-handler
           (lambda (e) (return #f))
         (lambda () body_0 body_1 ...))))))
(suppress-exceptions (raise 'oops))
#f
(let ((return (lambda (ignored) #t)))
  (suppress-exceptions
    (return #f)))
#t

Todo: This example will probably need changing to use delimited control operators, once it is decided what form those will take in the Foundations.