Chapter 15. Lexical Blocking

Lexical, or static, blocking is another means of preventing identifier collisions in MDL. (The first was dynamic blocking -- binding and ENVIRONMENTs.) By using a subset of the MDL lexical blocking facilities, the "block structure" of such languages as Algol, PL/I, SAIL, etc., can be simulated, should you wish to do so.

15.1. Basic Considerations

Since what follows appears to be rather complex, a short discussion of the basic problem lexical blocking solves and MDL's basic solution will be given first.

ATOMs are identifiers. It is thus essential that whenever you type an ATOM, READ should respond with the unique identifier you wish to designate. The problem is that it is unreasonable to expect the PNAMEs of all ATOMs to be unique. When you use an ATOM A in a program, do you mean the A you typed two minutes ago, the A you used in another one of your programs, or the A used by some library program?

Dynamic blocking (pushing down of LVALs) solves many such problems. However, there are some which it does not solve -- such as state variables (whether they are impure or pure). Major problems with a system having only dynamic blocking usually arise only when attempts are made to share large numbers of significant programs among many people.

The solution used in MDL is basically as follows: READ must maintain at least one table of ATOMs to guarantee any uniqueness. So, MDL allows many such tables and makes it easy for the user to specify which one is wanted. Such a table is an object of TYPE OBLIST ("object list"). All the complication which follows arises out of a desire to provide a powerful, easily used method of working with OBLISTs, with reasonable values used by default.

15.2. OBLISTs

An OBLIST is of PRIMTYPE UVECTOR with UTYPE LIST; the LIST holds ATOMs. The ATOMs are ordered by a hash coding on their PNAMEs: each LIST is a hashing bucket.) What follows is information about OBLISTs as such.

15.2.1. OBLIST Names

Every normally constituted OBLIST has a name. The name of an OBLIST is an ATOM associated with the OBLIST under the indicator OBLIST. Thus,

<GETPROP oblist OBLIST>

or

<GET oblist OBLIST>

returns the name of oblist.

Similarly, every name of an OBLIST is associated with its OBLIST, again under the indicator OBLIST, so that

<GETPROP oblist-name:atom OBLIST>

or

<GET oblist-name:atom OBLIST>

returns the OBLIST whose name is oblist-name.

Since there is nothing special about the association of OBLISTs and their names, the name of an OBLIST can be changed by the use of PUTPROP, both on the OBLIST and its name. It is not wise to change the OBLIST association without changing the name association, since you are likely to confuse READ and PRINT terribly.

You can also use PUT or PUTPROP to remove the association between an OBLIST and its name completely. If you want the OBLIST to go away (be garbage collected), and you want to keep its name around, this must be done: otherwise the association will force it to stay, even if there are no other references to it. (If you have no references to either the name or the OBLIST (an ATOM -- including a TYPE name -- points to its OBLIST), both of them -- and their association -- will go away without your having to remove the association, of course.) It is not recommended that you remove the name of an OBLIST without having it go away, since then ATOMs in that OBLIST will PRINT the name as if they were in no OBLIST -- which is defeating the purpose of this whole exercise.

15.2.2. MOBLIST

<MOBLIST atom fix>

("make oblist") creates and returns a new OBLIST, containing no ATOMs, whose name is atom, unless there already exists an OBLIST of that name, in which case it returns the existing OBLIST. fix is the size of the OBLIST created -- the number of hashing buckets. fix is optional (ignored if the OBLIST already exists), 13 by default. If specified, fix should be a prime number, since that allows the hashing to work better.

15.2.3. OBLIST?

<OBLIST? atom>

returns #FALSE () if atom is not in any OBLIST. If atom is in an OBLIST, it returns that OBLIST.

15.3. READ and OBLISTs

READ can be explicitly told to look up an ATOM in a particular OBLIST by giving the ATOM a trailer. A trailer consists of the characters !- (exclamation-point dash) following the ATOM, immediately followed by the name of the OBLIST. For example,

A!-OB

specifies the unique ATOM of PNAME A which is in the OBLIST whose name is the ATOM OB.

Note that the name of the OBLIST must follow the !- with no separators (like space, tab, carriage-return, etc.). There is a name used by default (section 15.5) which types out and is typed in as !-separator.

Trailers can be used recursively:

B!-A!-OB

specified the unique ATOM of PNAME B which is in the OBLIST whose name is the unique ATOM of PNAME A which is in the OBLIST whose name is OB. (Whew!) The repetition is terminated by the look-up and insertion described below.

If an ATOM with a given PNAME is not found in the OBLIST specified by a trailer, a new ATOM with that PNAME is created and inserted into that OBLIST.

If an OBLIST whose name is given in a trailer does not exist, READ creates one, of length 13 buckets.

If trailer notation is not used (the "normal" case), and for an ATOM that terminates a trailer, READ looks up the PNAME of the ATOM in a LIST of OBLISTs, the LVAL of the ATOM OBLIST by default. This look-up starts with <1 .OBLIST> and continues until .OBLIST is exhausted. If the ATOM is not found. READ usually inserts it into <1 .OBLIST>. (It is possible to force READ to use a different element of the LIST of OBLISTs for new insertions. If the ATOM DEFAULT is in that LIST, the OBLIST following that ATOM will be used.)

15.4. PRINT and OBLISTs

When PRINT is given an ATOM to output, it outputs as little of the trailer as is necessary to specify the ATOM uniquely to READ. That is, if the ATOM is the first ATOM of that PNAME which READ would find in its normal look-up in the current .OBLIST, no trailer is output. Otherwise, !- is output and the name of the OBLIST is recursively PRIN1ed.

Warning: there are obscure cases, which do not occur in normal practice, for which the PRINT trailer does not terminate. For instance, if an ATOM must have a trailer printed, and the name of the OBLIST is an ATOM in that very same OBLIST, death. Any similar case will also give PRINT a hernia.

15.5. Initial State

In an initial MDL, .OBLIST contains two OBLISTs. <1 .OBLIST> initially contains no ATOMs, and <2 .OBLIST> contains all the ATOMs whose GVAL are SUBRs or FSUBRs, as well as OBLIST, DEFAULT, T, etc. It is difficult to lose track of the latter; the specific trailer !-separator will always cause references to that OBLIST. In addition, the SUBR ROOT, which takes no arguments, always returns that OBLIST.

The name of <ROOT> is ROOT; this ATOM is in <ROOT> and would cause infinite recursion were it not for the use of !-separator. The name of the initial <1 .OBLIST> is INITIAL (really INITIAL!-).

The ATOM OBLIST also has a GVAL. ,OBLIST is initially the same as .OBLIST; however, ,OBLIST is not affected by the SUBRs used to manipulate the OBLIST structure. It is instead used only when errors occur.

In the case of an error, the current .OBLIST is checked to see if it is "reasonable" -- that is, contains nothing of the wrong TYPE. (It is reasonable, but not standard, for .OBLIST to be a single OBLIST instead of a LIST of them.) If it is reasonable, that value stays current. Otherwise, OBLIST is SET to ,OBLIST. Note that changes made to the OBLISTs on ,OBLIST -- for example, new ATOMs added -- remain. If even ,OBLIST is unreasonable, OBLIST is SET and SETGed to its initial value. <ERRET> (section 16.4) always assumes that .OBLIST is unreasonable.

Three other OBLISTs exist in a virgin MDL: their names and purposes are as follows:

ERRORS!- contains ATOMs whose PNAMEs are used as error messages. It is returned by <ERRORS>.

INTERRUPTS!- is used by the interrupt system (section 21.5.1). It is returned by <INTERRUPTS>.

MUDDLE!- is used infrequently by the interpreter when loading compiled programs to fix up references to locations within the interpreter.

The pre-loading of compiled programs may create other OBLISTs in an initialized MDL (Lebling, 1979).

15.6. BLOCK and ENDBLOCK

These SUBRs are analogous to begin and end in Algol, etc., in the way they manipulate static blocking (and in no other way.)

<BLOCK look-up:list-of-oblists>

returns its argument after "pushing" the current LVAL of the ATOM OBLIST and making its argument the current LVAL. You usually want <ROOT> to be an element of look-up, normally its last.

<ENDBLOCK>

"pops" the LVAL of the ATOM OBLIST and returns the resultant LIST of OBLISTs.

Note that this "pushing" and "popping" of .OBLIST is entirely independent of functional application, binding, etc.

15.7. SUBRs Associated with Lexical Blocking

15.7.1. READ (again)

<READ channel eof-routine look-up>

This is a fuller call to READ. look-up is an OBLIST or a LIST of them, used as stated in section 15.3 to look up ATOMs and insert them in OBLISTs. If not specified, .OBLIST is used. See also section 11.1.1.1, 11.3, and 17.1.3 for other arguments.

15.7.2. PARSE and LPARSE (again)

<PARSE string radix:fix look-up>

as was previously mentioned, applies READ's algorithm to string and returns the first MDL object resulting. This includes looking up prospective ATOMs on look-up, if given, or .OBLIST. LPARSE can be called in the same way. See also section 7.6.6.2 and 17.1.3 for other arguments.

15.7.3. LOOKUP

<LOOKUP string oblist>

returns the ATOM of PNAME string in the OBLIST oblist, if there is such an ATOM; otherwise, it returns #FALSE (). If string would PARSE into an ATOM anyway, LOOKUP is faster, although it looks in only one OBLIST instead of a LIST of them.

15.7.4. ATOM

<ATOM string>

creates and returns a spanking new ATOM of PNAME string which is guaranteed not to be on any OBLIST.

An ATOM which is not on any OBLIST is PRINTed with a trailer of !-#FALSE ().

15.7.5. REMOVE

<REMOVE string oblist>

removes the ATOM of PNAME string from oblist and returns that ATOM. If there is no such ATOM, REMOVE returns #FALSE (). Also,

<REMOVE atom>

removes atom from its OBLIST, if it is on one. It returns atom if it was on an OBLIST; otherwise it returns #FALSE ().

15.7.6. INSERT

<INSERT string-or-atom oblist>

creates an ATOM of PNAME string, inserts it into oblist and returns it. If there is already an ATOM with the same PNAME as atom in oblist, an error occurs. The standard way to avoid the error and always get your atom is

<OR <LOOKUP string oblist> <INSERT string oblist>>

As with REMOVE, INSERT can also take an ATOM as its first argument; this ATOM must not be on any OBLIST -- it must have been REMOVEd, or just created by ATOM -- else an error occurs. The OBLIST argument is never optional. If you would like the new ATOM to live in the OBLIST that READ would have chosen, you can <PARSE string> instead.

15.7.7. PNAME

<PNAME atom>

returns a STRING (newly created) which is atom's PNAME ("printed name"). If trailers are not needed, PNAME is much faster than UNPARSE on atom. (In fact, UNPARSE has to go all the way through the PRINT algorithm twice, the first time to see how long a STRING is needed.)

15.7.8. SPNAME

SPNAME ("shared printed name") is identical to PNAME, except that the STRING it returns shares storage with atom (appendix 1), which is more efficient if the STRING will not be modified. PUTting into such a STRING will cause an error.

15.8. Example: Another Solution to the INC Problem

What follows is an example of the way OBLISTs are "normally" used to provide "externally available" ATOMs and "local" ATOMs which are not so readily available externally. Lebling (1979) describes a systematic way to accomplish the same thing and more.

<MOBLIST INCO 1>
        ;"Create an OBLIST to hold your external symbols.
        Its name is INCO!-INITIAL!- ."

INC!-INCO
        ;"Put your external symbols into that OBLIST.
    If you have many, just write them successively."

<BLOCK (<MOBLIST INCI!-INCO 1> <GET INCO OBLIST> <ROOT>)>
    ;"Create a local OBLIST, naming it INCI!-INCO, and set up
    .OBLIST for reading in your program. The OBLIST INCO is
    included in the BLOCK so that as your external symbols are
    used, they will be found in the right place. Note that the
    ATOM INCO is not in any OBLIST of the BLOCK; therefore,
    trailer notation of !-INCO will not work within the current
    BLOCK-ENDBLOCK pair."

<DEFINE INC ;"INC is found in the INCO OBLIST."
    (A) ;"A is not found and is therefore put into INCI by
READ."
    #DECL ((VALUE A) <OR FIX FLOAT>)
    <SET .A <+ ..A 1>>> ;"All other ATOMs are found in the
ROOT."
<ENDBLOCK>

This example is rather trivial, but it contains all of the issues, of which there are three.

The first idea is that you should create two OBLISTs, one to hold ATOMs which are to be known to other users (INCO), and the other to hold internal ATOMs which are not normally of interest to other (INCI). The case above has one ATOM in each category.

Second, INCO is explicitly used without trailers so that surrounding BLOCK and ENDBLOCKs will have an effect on it. Thus INCO will be in the OBLIST desired by the user; INC will be in INCO, and the user can refer to it by saying INC!-INCO; INCI will also be in INCO, and can be referred to in the same way; finally, A is really A!-INCI!-INCO. The point of all this is to structure the nesting of OBLISTs.

Finally, if for some reason (like saving storage space) you wish to throw INCI away, you can follow the ENDBLOCK with

<REMOVE "INCI" <GET INCO OBLIST>>

and thus remove all references to it. The ability to do such pruning is one reason for structuring OBLIST references.

Note that, even after removing INCI, you can "get A back" -- that is, be able to type it in -- by saying something of the form

<INSERT <1 <1 ,INC!-INCO>> <1 .OBLIST>>

thereby grabbing A out of the structure of INC and re-inserting it into an OBLIST. however, this resurrects the name collision caused by <INC!-INCO A>.