/[cvsroot]/skandha4/info/slisp-15
ViewVC logotype

Contents of /skandha4/info/slisp-15

Parent Directory Parent Directory | Revision Log Revision Log


Revision 1.1.1.1 - (show annotations) (download) (vendor branch)
Tue May 25 09:39:12 1999 UTC (21 years, 5 months ago) by jsp
Branch: biostr, MAIN
CVS Tags: skandha4, skandha4_1_6, dsg_1_0, HEAD
Branch point for: skandha4_pub, autoconf
Changes since 1.1: +0 -0 lines
ANSI C version of Skandha4


1 This is Info file slisp, produced by Makeinfo version 1.68 from the
2 input file slisp.texi.
3
4 This file documents slisp, (skandha-lisp), an extendable xlisp.
5
6 Copyright @copyright 1994 University of Washington Digital Anatomist
7 Program by Jeff Prothero (jsp@biostr.washington.edu) and Jim Brinkley
8 See slisp/SLISPCOPYING for details
9
10 
11 File: slisp, Node: zerop, Prev: write-char, Up: Xlisp Language Reference
12
13 zerop
14 =====
15
16 zerop
17 type: predicate function (subr)
18 location: built-in
19 source file: xlmath.c
20 Common LISP compatible: yes
21 supported on: all machines
22
23 SYNTAX
24
25 (zerop <expr> )
26 <expr> - the numeric expression to check
27
28 DESCRIPTION
29
30 The ZEROP predicate checks to see if the number <expr> is zero. T is
31 returned if the number is zero, NIL is returned otherwise. A bad
32 argument type error is generated if the <expr> is not a numeric
33 expression.
34
35 EXAMPLES
36
37 (zerop 0) ; returns T
38 (zerop 0.0) ; returns T
39 (zerop 99999.9) ; returns NIL
40 (zerop -0.000000000002) ; returns NIL
41 ;
42 (zerop 'a) ; error: bad argument type
43 (setq a 0) ; set value of A to 0
44 (zerop a) ; returns T
45
46
47
48 
49 File: slisp, Node: Xlisp Internals, Prev: Xlisp Language Reference, Up: Top
50
51 Xlisp Internals
52 ***************
53
54 jsp92Jan15 NOTE: For now, this chapter is just a single huge node.
55 It will be massaged into something closer to reasonable texinfo form
56 when I've decided whether to install xlisp21d in skandha4 and update
57 this file to reflect the changes.
58
59
60 92Jan14 jsp@glia.biostr.washington.edu (Jeff Prothero). Public Domain.
61
62 +---------------------+
63 | xlisp 2.1 internals |
64 +---------------------+
65
66 "Trust the Source, Luke, trust the Source!"
67
68
69 Contents
70 --------
71 Who should read this?
72 What is an LVAL?
73 What is the obarray?
74 The Interpreter Stacks
75 What is a context?
76 What is an environment?
77 How are xlisp entities stored and identified?
78 How are vectors implemented?
79 How are strings implemented?
80 How are symbols implemented?
81 How are closures implemented?
82 How are objects implemented?
83 How are classes implemented?
84 How is the class hierarchy laid out?
85 How do we look up the value of a variable?
86 How are function calls implemented?
87 How are message-sends implemented?
88 How is garbage collection implemented?
89 How are the source files laid out?
90 How do I add a new primitive fn to xlisp?
91 Minor Observations.
92 Acknowledgements.
93
94
95
96 Who should read this?
97 ---------------------
98
99 Anyone poking through the C implementation of xlisp for the first
100 time. This is intended to provide a rough roadmap of the global xlisp
101 structures and algorithms. If you just want to write lisp code in
102 xlisp, you don't need to read this file -- go read xlisp.doc,
103 XlispOOP.doc, and XlispRef.doc, in about that order. If you want to
104 tinker with the xlisp implementation code, you should *still* read
105 those three before reading this. The following isn't intended to be
106 exhaustively precise -- that's what the source code is for! It is
107 intended only to allow you a fighting change of understanding the code
108 the first time through (instead of the third time).
109
110 At the bottom of the file you'll find an example of how to add new
111 primitive functions to xlisp.
112
113
114
115 What is an LVAL?
116 ----------------
117
118 An "LVAL" is the C type for a generic pointer to an xlisp
119 garbage-collectable something. (Cons cell, object, string, closure,
120 symbol, vector, whatever.) Virtually every variable in the
121 interpreter is an LVAL. Cons cells contain two LVAL slots,
122 symbols contains four LVAL slots, etc.
123
124
125
126 What is the obarray?
127 -------------------
128
129 The obarray is the xlisp symbol table. More precisely, it is a
130 hashtable mapping ascii strings (SYMBOL names) to SYMBOLs. (The name
131 "obarray" is traditional but a bit of a misnomer, since it contains
132 only xlisp SYMBOLs, and in particular contains no xlisp OBJECTs.) It
133 is used when converting lisp expressions from text to internal form.
134 Since it is a root for the garbage collector, it also serves to
135 distinguish permanent global-variable SYMBOLs from other SYMBOLs --
136 you can permanently protect a SYMBOL from the garbage collector by
137 entering it into the obarray. This is called "interning" the SYMBOL.
138 The obarray is called "obarray" in C and "*OBARRAY*" in xlisp. It is
139 physically implemented as a VECTOR-valued SYMBOL.
140
141
142
143 The Interpreter Stacks
144 ----------------------
145
146 xlisp uses two stacks, an "evaluation stack" and an "argument stack".
147 Both are roots for the garbage collector. The evaluation stack is
148 largely private to the interpreter and protects internal values from
149 garbage collection, while the argument stack holds the conventional
150 user-visible stackframes.
151
152
153 The evaluation stack is an EDEPTH-long array of "LVAL" allocated by
154 xldmem.c:xlminit(). It grows zeroward.
155
156 xlstkbase points to the zero-near end of the evaluation stack.
157
158 xlstktop points to the zero-far end of the evaluation stack; the
159 occupied part of the stack lies between xlstack and xlstktop. NOTE
160 that xlstktop is *NOT* the top of the stack in the conventional sense
161 of indicating the most recent entry on the stack: xlstktop is a static
162 bounds pointer which never changes once the stack is allocated.
163
164 xlstack starts at the zero-far end of the evaluation stack. *xlstack
165 is the most recent LVAL on the stack. The garbage collector MARKs
166 everything reachable from the evaluation stack (among other things),
167 so we frequently push things on this stack while C code is
168 manipulating them. (Via xlsave(), xlprotect(), xlsave1(), xlprot1().)
169
170
171 The argument stack is an ADEPTH-long array of "LVAL". It also grows
172 zeroward. The evaluator pushes arguments on the argument stack at the
173 start of a function call (form evaluation). Built-in functions
174 usually eat them directly off the stack. For user-lisp functions
175 xleval.c:evfun() pops them off the stack and binds them to the
176 appropriate symbols before beginning execution of the function body
177 proper.
178
179 xlargstkbase is the zero-near end of argument stack.
180
181 xlargstktop is the zero-far end of argument stack. Like xlstktop,
182 xlargstktop is a static bounds pointer which never changes after
183 the stack is allocated.
184
185 *xlsp ("sp"=="stack pointer") is the most recent item on the argument stack.
186
187 xlfp ("fp"=="frame pointer") is the base of the current stackframe.
188
189
190
191 What is a context?
192 ------------------
193
194 An xlisp "context" is something like a checkpoint, recording a
195 particular point buried in the execution history so that we can
196 abort/return back to it. Contexts are used to implement call/return,
197 catch/throw, signals, gotos, and breaks. xlcontext points to the
198 chain of active contexts, the top one being the second-newest active
199 context. (The newest -- that is, current -- active context is
200 implemented by the variables xlstack xlenv xlfenv xldenv xlcontext
201 xlargv xlargc xlfp xlsp.) Context records are written by
202 xljump.c:xlbegin() and read by xljump.c:xljump(). Context records are
203 C structures on the C program stack; They are not in the dynamic
204 memory pool or on the lisp execution or argument stacks.
205
206
207
208 What is an environment?
209 -----------------------
210
211 An environment is basically a store of symbol-value pairs, used to
212 resolve variable references by the lisp program. xlisp maintains
213 three environments, in the global variables xlenv, xlfenv and xldenv.
214
215 Lisp supports two sorts of binding, "lexical binding" and "dynamic
216 binding". Lexically bound variables are 'visible' only in code
217 textually within their binding form. Dynamically bound variables are
218 'visible' in any code *called* from the binding form. (Either kind of
219 binding may be shadowed by other declarations, of course.)
220 Historically, lisp has been moving from dynamic binding (which is easy
221 for interpreters to handle), to lexical binding (which is easy for
222 humans and compilers to handle). Almost all xlisp binding forms are
223 lexically scoped. The most important exception is progv.
224
225 xlenv and xlfenv track lexical bindings. xlenv and xlfenf are
226 conceptually a single environment, although they are implemented
227 separately. They are linked-list stacks which are pushed when we
228 enter a function and popped when we exit it. We also switch
229 xlenv+xlfenf environments entirely when we begin executing a new
230 closure (user-fn written in lisp).
231
232 The xlenv environment is the most heavily used environment. It is
233 used to resolve everyday data references to local variables. It
234 consists of a list of frames (and objects). Each frame is a list of
235 sym-val pairs. In the case of an object, we check all the instance
236 and class variables of the object, then do the same for its
237 superclass, until we run out of superclasses.
238
239 The xlfenv environment is maintained strictly parallel to xlenv, but
240 is used to find function values instead of variable values. The
241 separation may be partly for lookup speed and partly for historical
242 reasons. Merging these two lists into a single list (while
243 distinguishing function bindings from variable bindings, of course)
244 would slightly decrease fn enter/exit overhead while increasing the
245 overhead of looking up each variable or function binding.
246
247 When we send a message, we set xlenv to the value it had when the
248 message CLOSURE was built, then push on (obj msg-class), where
249 msg-class is the [super]class defining the method. (We also set
250 xlfenv to the value xlfenv had when the method was built.) This makes
251 the object instance variables part of the environment, and saves the
252 information needed to correctly resolve references to class variables,
253 and to implement SEND-SUPER.
254
255 The xldenv environment tracks dynamic bindings. It is a simple list
256 of sym-val pairs, treated as a stack. Progv uses it to save the old
257 values of symbols it binds, and it is also used to save old values of
258 s_evalhook and s_applyhook (*EVALHOOK* and *APPLYHOOK*). (These
259 latter mostly support the debug facilities.)
260
261 These environments are manipulated in C via the xlisp.h macros
262 xlframe(e), xlbind(s,v), xlfbind(s,v), xlpbind(s,v,e), xldbind(s,v),
263 xlunbind(e).
264
265
266
267 How are xlisp entities stored and identified?
268 ---------------------------------------------
269
270 Conceptually, xlisp manages memory as a single array of fixed-size
271 objects. Keeping all objects the same size simplifies memory
272 management enormously, since any object can be allocated anywhere, and
273 complex compacting schemes aren't needed. Every LVAL pointer points
274 somewhere in this array. Every xlisp object has the basic format
275 (xldmem.h:typdef struct node)
276
277 struct node {
278 char n_type;
279 char n_flags;
280 LVAL car;
281 LVAL cdr;
282 }
283
284 where n_type is one of:
285
286 FREE A node on the freelist.
287 SUBR A function implemented in C. (Needs evaluated arguments.)
288 FSUBR A "special form" implemented in C. (Needs unevaluated arguments).
289 CONS A regular lisp cons cell.
290 SYMBOL A symbol.
291 FIXNUM An integer.
292 FLONUM A floating-point number.
293 STRING A string.
294 OBJECT Any object, including class objects.
295 STREAM An input or output file.
296 VECTOR A variable-size array of LVALs.
297 CLOSURE Result of DEFUN or LAMBDA -- a function written in lisp.
298 CHAR An ascii character.
299 USTREAM An internal stream.
300 STRUCT A structure.
301
302 Messages may be sent only to nodes with n_type == OBJECT.
303
304 Obviously, several of the above types won't fit in a fixed-size
305 two-slot node. The escape is to have them malloc() some memory
306 and have one of the slots point to it -- VECTOR is the archetype. For
307 example, see xldmem.c:newvector(). To some extent, this malloc()
308 hack simply exports the memory- fragmentation problem to the C
309 malloc()/free() routines. However, it helps keep xlisp simple, and it
310 has the happy side-effect of unpinning the body of the vector, so that
311 vectors can easily be expanded and contracted.
312
313 The garbage collector has special-case code for each of the above node
314 types, so it can find all LVAL slots and recycle any malloc()ed ram
315 when a node is garbage-collected.
316
317 Xlisp pre-allocates nodes for all ascii characters, and for small
318 integers. These nodes are never garbage-collected.
319
320 As a practical matter, allocating all nodes in a single array is not
321 very sensible. Instead, nodes are allocated as needed, in segments of
322 one or two thousand nodes, and the segments linked by a pointer chain
323 rooted at xldmem.c:segs.
324
325
326
327 How are vectors implemented?
328 ----------------------------
329
330 An xlisp vector is a generic array of LVAL slots. Vectors are also
331 the canonical illustration of xlisp's escape mechanism for node types
332 which need more than two LVAL slots (the maximum possible in the
333 fixed-size nodes in the dynamic memory pool). The node CAR/CDR slots
334 for a vector hold a size field plus a pointer to a malloc()ed ram
335 chunk, which is automatically free()ed when the vector is
336 garbage-collected.
337
338 xldmem.h defines macros for reading and writing vector fields and
339 slots: getsize(), getelement() and setelement(). It also defines
340 macros for accessing each of the other types of xlisp nodes.
341
342
343
344 How are strings implemented?
345 ----------------------------
346
347 Strings work much like vectors: The node has a pointer to a malloc()ed
348 ram chunk which is automatically free()ed when the string gets
349 garbage-collected.
350
351
352
353 How are symbols implemented?
354 ----------------------------
355
356 A symbol is a generic user-visible lisp variable, with separate slots
357 for print name, value, function, and property list. Any or all of
358 these slots (including name) may be NIL.
359
360 You create an xlisp symbol from C by calling "xlmakesym(name)" or
361 "xlenter(name)" (to make a symbol and enter it in the obarray).
362
363 You may create symbols from xlisp by explicitly calling GENSYM or
364 MAKE-SYMBOL (uninterned symbols), or INTERN (interned symbol).
365 However, the lisp reader will create symbols on sight, so most
366 lisp symbols are created as side-effects of expressions like 'name.
367
368 (A symbol is 'interned' if it is listed in *OBARRAY*. Various parts
369 of the system, like the lisp reader, treat *OBARRAY* essentially as
370 the list of all known symbols. It is unusual but occasionally useful
371 to create uninterned symbols. You can make READ create an uninterned
372 symbol by preceding it with #:. In xlisp, a newly created symbol has
373 no value unless its name begins with a ':', in which case it has
374 itself for its value -- handy for keywords and message selectors.)
375
376 Most of the symbol-specific code in the interpreter is in xlsym.c.
377
378 Physically, a symbol is implemented like a four-slot vector.
379
380 Random musing: Abstractly, the LISP symbols plus cons cells (etc)
381 constitute a single directed graph, and the symbols mark spots where
382 normal recursive evaluation should stop. Normal lisp programming
383 practice is to have a symbol in every cycle in the graph, so that
384 recursive traversal can be done without MARK bits.
385
386
387
388 How are closures implemented?
389 -----------------------------
390
391 A closure, the return value from a lambda, is a regular coded-in-lisp
392 fn. Physically, it is implemented like an eleven-slot vector, with the
393 node n_type field hacked to contain CLOSURE instead of VECTOR. The
394 vector slots contain:
395
396 name symbol -- 1st arg of DEFUN. NIL for LAMBDA closures.
397 type (s_lambda or s_macro). Must be s_lambda to be executable.
398 args List of "required" formal arguments (as symbols)
399 oargs List of "optional" args, each like: (name (default specified-p))
400 rest Name of "&rest" formal arg, else NIL.
401 kargs keyword args, each like: ((':foo 'bar default specified-p))
402 aargs &aux vars, each like: (('arg default))
403 body actual code (as lisp list) for fn.
404 env value of xlenv when the closure was built. NIL for macros.
405 fenv value of xlfend when the closure was built. NIL for macros.
406 lambda The original formal args list in the DEFUN or LAMBDA.
407
408 The lambda field is for printout purposes. The remaining fields store
409 a predigested version of the formal args list. This is a limited form
410 of compilation: by processing the args list at closure-creation time,
411 we reduce the work needed during calls to the closure.
412
413
414
415 How are objects implemented?
416 ----------------------------
417
418 An object is implemented like a vector, with the size determined by
419 the number of instance variables. The first slot in the vector points
420 to the class of the object; the remaining slots hold the instance
421 variables for the object. An object needs enough slots to hold all
422 the instance variables defined by its class, *plus* all the instance
423 variables defined by all of its superclasses.
424
425
426
427 How are classes implemented?
428 ----------------------------
429
430 A class is a specific kind of object, hence has a class pointer plus
431 instance variables. All classes have the following instance variables:
432
433 MESSAGES A list of (interned-symbol method-closure) pairs.
434 IVARS Instance variable names: A list of interned symbols.
435 CVARS Class variable names: A list of interned symbols.
436 CVALS Class variable values: A vector of values.
437 SUPERCLASS A pointer to the superclass.
438 IVARCNT Number of class instance variables, as a fixnum.
439 IVARTOTAL Total number of instance variables, as a fixnum.
440
441 IVARCNT is the count of the number of instance variables defined by
442 our class. IVARTOTAL is the total number of instance variables in an
443 object of this class -- IVARCNT for this class plus the IVARCNTs from
444 all of our superclasses.
445
446
447
448 How is the class hierarchy laid out?
449 ------------------------------------
450
451 The fundamental objects are the OBJECT and CLASS class objects. (Both
452 are instances of class CLASS, and since CLASSes are a particular kind
453 of OBJECT, both are also objects, with n_type==OBJECT. Bear with me!)
454
455 OBJECT is the root of the class hierarchy: everything you can send a
456 message to has OBJECT as its class or super*class. (Vectors, chars,
457 integers and so forth stand outside the object hierarchy -- you can't
458 send messages to them. I'm not sure why Dave did it this way.) OBJECT
459 defines the messages:
460
461 :isnew -- Does nothing but return the object.
462 :class -- Returns contents of class-pointer slot.
463 :show -- Prints names of obj, obj->class and instance vars.
464
465 Since a CLASS is a specialized type of OBJECT (with instance variables
466 like MESSAGES which generic OBJECTs lack), class CLASS has class
467 OBJECT as its superclass. The CLASS object defines the messages:
468
469 :new -- Create new object with self.IVARTOTAL LVAR slots, plus
470 one for the class pointer. Point class slot to self.
471 Set new.n_type char to OBJECT.
472 :isnew -- Fill in IVARS, CVARS, CVALS, SUPERCLASS, IVARCNT and
473 IVARTOTAL, using parameters from :new call. (The
474 :isnew msg inherits the :new msg parameters because
475 the :isnew msg is generated automatically after
476 each :new msg, courtesy of a special hack in
477 xlobj.c:sendmsg().)
478 :answer -- Add a (msg closure) pair to self.MESSAGES.
479
480
481
482 Here's a figure to summarize the above, with a generic object thrown
483 in for good measure. Note that all instances of CLASS will have a
484 SUPERCLASS pointer, but no normal object will. Note also that the
485 messages known to an object are those which can be reached by
486 following exactly one Class Ptr and then zero or more Superclass Ptrs.
487 For example, the generic object can respond to :ISNEW, :CLASS and
488 :SHOW, but not to :NEW or :ANSWER. (The functions implementing the
489 given messages are shown in parentheses.)
490
491 NIL
492 ^
493 |
494 |Superclass Ptr
495 |
496 Msg+--------+
497 :isnew (xlobj.c:obisnew) <----| class |Class Ptr
498 :class (xlobj.c:obclass) <----| OBJECT |------------------------------>+
499 :show (xlobj.c:objshow) <----| | |
500 +--------+ |
501 ^ ^ ^ |
502 +---------+ | | | |
503 | generic |Class Ptr | | +<---------------+ |
504 | object |-------------->+ |Superclass Ptr ^Superclass Ptr |
505 +---------+ | | |
506 Msg+--------+ +---------+ |
507 :isnew (xlobj.c:clnew) <----| class |Class Ptr | generic |Class Ptr |
508 :new (xlobj.c:clisnew) <----| CLASS |------->+ | CLASS |------->+ |
509 :answer(xlobj.c:clanswer)<----| | | | | | |
510 +--------+ | +---------+ | |
511 ^ ^ | | |
512 | | v v |
513 | +-----------+ <------------------+ v
514 +<------------------------------------+
515
516 Thus, class CLASS inherits the :CLASS and :SHOW messages from class
517 OBJECT, overrides the default :ISNEW message, and provides new
518 messages :NEW and :ANSWER.
519
520 New classes are created by (send CLASS :NEW ...) messages. Their
521 Class Ptr will point to CLASS. By default, they will have OBJECT as
522 their superclass, but this can be overridden by the second optional
523 argument to :NEW.
524
525 The above basic structure is set up by xlobj.c:xloinit().
526
527
528
529 How do we look up the value of a variable?
530 ------------------------------------------
531
532 When we're cruising along evaluating an expression and encounter a
533 symbol, the symbol might refer to a global variable, an instance
534 variable, or a class variable in any of our superclasses. Figuring
535 out which means digging through the environment. The canonical place
536 this happens is in xleval.c:xleval(), which simply passes the buck to
537 xlsym.c:xlgetvalue(), which in turn passes the buck to
538 xlxsym.c:xlxgetvalue(), where the fun of scanning down xlenv begins.
539 The xlenv environment looks something like
540
541 Backbone Environment frame contents
542 -------- --------------------------
543 xlenv --> frame ((sym val) (sym val) (sym val) ... )
544 frame ...
545 object (obj msg-class)
546 frame ...
547 object ...
548 frame ...
549 ...
550
551 The "frame" lines are due to everyday nested constructs like LET
552 expressions, while the "object" lines represent an object environment
553 entered via a message send. xlxgetvalue scans the enviroment left to
554 right, and then top to bottom. It scans down the regular environment
555 frames itself, and calls xlobj.c:xlobjgetvalue() to search the object
556 environment frames.
557
558 xlobjgetvalue() first searches for the symbol in the msg-class, then
559 in all the successive superclasses of msg-class. In each class, it
560 first checks the list of instance-variable names in the IVARS slot,
561 then the list of class-variables name in the CVARS slot.
562
563
564
565 How are function calls implemented?
566 -----------------------------------
567
568 xleval.c contains the central expression-evaluation code.
569 xleval.c:xleval() is the standard top-level entrypoint. The two
570 central functions are xleval.c:xlevform() and xleval.c:evfun().
571 xlevform() can evaluate four kinds of expression nodes:
572
573 SUBR: A normal primitive fn coded in C. We call evpushargs() to
574 evaluate and push the arguments, then call the primitive.
575
576 FSUBR: A special primitive fn coded in C, which (like IF) wants its
577 arguments unevaluated. We call pushargs() (instead of evpushargs())
578 and then the C fn.
579
580 CLOSURE: A preprocessed written-in-lisp fn from a DEFUN or LAMBDA. We
581 call evpushargs() and then evfun().
582
583 CONS: We issue an error if CONS.car isn't a LAMBDA, otherwise we call
584 xleval.c:xlclose() to build a CLOSURE from the LAMBDA, and fall into
585 the CLOSURE code.
586
587 The common thread in all the above cases is that we call evpushargs()
588 or pushargs() to push all the arguments on the evaluation stack,
589 leaving the number and location of the arguments in the global
590 variables xlargc and xlargv. The primitive C functions consume
591 their arguments directly from the argument stack.
592
593 xleval.c:evfun() evaluates a CLOSURE by:
594
595 (1) Switching xlenv and xlfenv to the values they had when
596 the CLOSURE was built. (These values are recorded in the CLOSURE.)
597
598 (2) Binding the arguments to the environment. This involves scanning
599 through the section of the argument stack indicated by xlargc/xlargv,
600 using information from the CLOSURE to resolve keyword arguments
601 correctly and assign appropriate default values to optional arguments,
602 among other things.
603
604 (3) Evaluating the body of the function via xleval.c:xleval().
605
606 (4) Cleaning up and restoring the original environment.
607
608
609
610 How are message-sends implemented?
611 ----------------------------------
612
613 We scan the MESSAGES list in the CLASS object of the recipient,
614 looking for a (message-symbol method) pair that matches our message
615 symbol. If necessary, we scan the MESSAGES lists of the recipient's
616 superclasses too. (xlobj.c:sendmsg().) Once we find it, we basically
617 do a normal function evaluation. (xlobjl.c:evmethod().)
618
619 Two differences between message-send and normal function invocation:
620
621 1) We need to replace the message-symbol by the recipient on the
622 argument stack to make "self" available in the method closure.
623
624 2) We need to push an 'object' stack entry on the xlenv environment to
625 record which class is handling the message (so that, for example,
626 SEND-SUPER can find our superclass).
627
628
629
630 How is garbage collection implemented?
631 --------------------------------------
632
633 The dynamic memory pool managed by xlisp consists of a chain of memory
634 segments (xldmem.h:struct segment) rooted at global C variable "segs".
635 Each segment contains an array of "struct node"s plus a pointer to the
636 next segment. Each node contains a n_type field and a MARK bit, which
637 is zero except during garbage collection.
638
639 Xlisp uses a simple, classical mark-and-sweep garbage collector. When
640 it runs out of memory (fnodes==NIL), it does a recursive traversal
641 setting the MARK flag on all nodes reachable from the obarray, the
642 three environments xlenv/xlfenv/xldenv, and the evaluation and
643 argument stacks. (A "switch" on the n_type field tells us how to find
644 all the LVAL slots in the node (plus associated storage), and a
645 pointer-reversal trick lets us avoid using too much stack space during
646 the traversal.) sweep() then adds all un-MARKed LVALs to fnodes, and
647 clears the MARK bit on the remaining nodes. If this fails to produce
648 enough free nodes, a new segment is malloc()ed.
649
650 The code to do this stuff is mostly in xldmem.c.
651
652
653
654 How are the source files laid out?
655 -----------------------------------------
656
657 To really understand the source, of course, you simply have to sit
658 down and read it. There's no royal road to hacking! So this is a map
659 of the source, not a picture of it.
660
661 The core portable xlisp files have the prefix 'xl' (for 'xlisp').
662
663 The best place to start reading the code is in the two main header
664 files, xlisp.h and xldmem.h.
665
666 The xldmem.h file ('dmem' for 'dynamic memory') defines the actual
667 structure in memory of all the primitive xlisp data types. This is
668 the file you will most likely refer to most often.
669
670 The xlisp.h file defines essentially all global constants and macros
671 which don't need to know about the structures in xldmem.h, mainly
672 manifest constants sizing various arrays for different machines, and
673 macros to test for the type of a list object and to help parse xlisp
674 argument lists.
675
676 The central code files to understand are xlmain.c, xleval.c, xlbfun.c,
677 and xlsym.c.
678
679 xlmain.c contains both main() and the central read-eval-print loop, so
680 it is the place from which to begin tracing flow of control.
681
682 xleval.c (with some support from xlbfun.) contains the heart of xlisp,
683 the code to evaluate functions and macros.
684
685 xlsym.c contains the code to find the value of a symbol.
686
687 Once you understand the above three, you know where xlisp decides to
688 evaluate an s-expression, how it evaluates it, and how it finds the
689 values needed by the expression.
690
691 A good file to tackle next is xlobj.c, which handles much of the
692 object-oriented support, and has much of the flavor of xleval.c.
693
694 Most of the other files are just libraries of functions to handle
695 particular types of processing, and are easily understood once the
696 central code is grokked.
697
698 One of the charms of xlisp *is* that it is small enough to easily read
699 and comprehend. I hope it stays that way!
700
701 Here's a very brief file-by-file overview of the source. Your files
702 will probably be laid out just a little differently. In particular,
703 if you're not running on unix, instead of 'unixstuff.c' you'll have
704 something like 'dosstuff.c'.
705
706 Size Name Contains
707 ----- -------- --------
708 2054 osdefs.h System specific declarations. Empty file in my case.
709 2050 osptrs.h System specific pointers. Empty file in my case.
710 14172 unixstuff.c Isolates I/O fns, which tend to be OS-specific. Unix version.
711 19049 xlbfun.c 'Basic FUNctions': highlevel eval stuff + random support.
712 30229 xlcont.c 'CONTrol': case, do, while, dotimes, other special forms.
713 6380 xldbug.c 'DeBUG': break, abort, error, baktrace...
714 18006 xldmem.c 'Dynamic MEMory': ram allocation, garbage collector.
715 9431 xldmem.h Declaration of LVAL, scads of useful macros.
716 21220 xleval.c 'EVALuation': eval, apply, macroexpand and support for them.
717 11935 xlfio.c 'File I/O': OPEN, READ-CHAR, WRITE-CHAR ...
718 18481 xlftab.c 'Function TABle': Array of all xlisp primitives.
719 4866 xlglob.c 'GLOBal variables': Boring list of declarations.
720 11048 xlimage.c 'memory IMAGE': Code to save/restore contents of xlisp.
721 10579 xlinit.c 'INITialization': Start-of-the-world setup code.
722 6016 xlio.c 'Input/Output': misc I/O stuff, some called by xlfio.c.
723 12664 xlisp.h consp() ..., xlgetarg() ..., misc types.
724 5853 xljump.c catch, throw, return ...
725 20723 xllist.c car, cdr, append, map ... basic list manipulation.
726 11975 xlmath.c Arithmetic functions -- addition, multiplication ...
727 16425 xlobj.c Object support -- create objects & classes, send msgs...
728 4134 xlpp.c A little prettyprinter.
729 9487 xlprin.c Print an arbitrary lisp value in ascii.
730 19535 xlread.c Read in an arbitrary ascii lisp expression.
731 15062 xlstr.c Manipulation of characters and strings.
732 12889 xlstruct.c Lisp structures -- defstruct and kin.
733 5957 xlsubr.c eq, equal, some internal utility fns.
734 7019 xlsym.c Symbols and obarrays ... maksym, getvalue, getprop...
735 5566 xlsys.c Misc stuff -- read file, print backtrace, peek/poke...
736 3926 xmain.c main(), with top read-eval-print loop.
737
738
739
740 How do I add a new primitive fn to xlisp?
741 -----------------------------------------
742
743 Two preliminary comments:
744
745 * You should have a copy of Guy L Steele's "Common Lisp: The Language"
746 (2nd Edition), and make your new primitives compatible with the
747 standard whenever practical.
748
749 * The Skandha4 version of xlisp has been modified to allow new
750 primitives to be added without changing the xlisp source files at
751 all. If you're using this xlisp, you should ignore the following
752 and refer to the Skandha4 documents instead.
753
754 Add a line to the end of xlftab.c:funtab[]. This table contains a
755 list of triples:
756
757 The first element of each triple is the function name as it will
758 appear to the programmer. Make it all upper case.
759
760 The second element is S (for SUBR) if (like most fns) your function
761 wants its arguments pre-evaluated, else F (for FSUBR).
762
763 The third element is the name of the C function to call.
764
765 Remember that your arguments arrive on the xlisp argument stack rather
766 than via the usual C parameter mechanism.
767
768 CAUTION: Try to keep your files separate from generic xlisp files, and
769 to minimize the number of changes you make in the generic xlisp files.
770 This way, you'll have an easier time re-installing your changes when
771 new versions of xlisp come out. For example, if you are going to add
772 many primitive functions to your xlisp, use an #include file rather
773 than putting them all in xlftab.c. It's a good idea to put a marker
774 (like a comment with your initials) on each line you change or insert
775 in the generic xlisp fileset.
776
777 CAUTION: Remember that you usually need to protect the LVAL variables
778 in your function from the garbage-collector. It never hurts to do
779 this, and often produces obscure bugs if you do not. You protect
780 uninitialized local variables with xlsave1() and initialized local
781 variables with xlprot1().
782
783 BE CAREFUL NOT TO PROTECT UNINITIALIZED LOCAL VARIABLES WITH XLPROT1()
784 OR XLPROTECT()! This will appear to work fine until garbage
785 collection happens at an inconvenient moment, at which point the
786 garbage collector will wind up following your uninitialized pointer
787 off to never-never land.
788
789 Note: If you have several pointers to protect, you can save a little
790 runtime and codespace by using
791 xlstkcheck(number-of-variables-to-protect) followed by xlsave()s and
792 xlprotect()s instead of the more expensive xlsave1()s and xlprot1()s.
793
794 Generic code for a new primitive fn:
795
796 /* xlsamplefun - do useless stuff. */
797 /* Called like (samplefun '(a c b) 1 2.0) */
798 LVAL xlsamplefun()
799 {
800 /* Variables to hold the arguments: */
801 LVAL list_arg, integer_arg, float_arg;
802
803 /* Get the arguments, with appropriate errors */
804 /* if any are of the wrong type. Look in */
805 /* xlisp.h for macros to read other types of */
806 /* arguments. Look in xlmath.c for examples */
807 /* of functions which can handle an argument */
808 /* which may be either int or float: */
809 list_arg = xlgalist() ; /* "XLisp Get A LIST" */
810 integer_arg = xlgafixnum(); /* "XLisp Get A FIXNUM" */
811 float_arg = xlgaflonum(); /* "XLisp Get A FLONUM" */
812
813 /* Issue an error message if there are any extra arguments: */
814 xllastarg();
815
816
817
818 /* Call a separate C function to do the actual */
819 /* work. This way, the main function can */
820 /* be called from both xlisp code and C code. */
821 /* By convention, the name of the xlisp wrapper */
822 /* starts with "xl", and the native C function */
823 /* has the same name minus the "xl" prefix: */
824 return samplefun( list_arg, integer_arg, float_arg );
825 }
826 LVAL samplefun( list_arg, integer_arg, float_arg )
827 LVAL list_arg, integer_arg, float_arg;
828 {
829 FIXTYPE val_of_integer_arg;
830 FLOTYPE val_of_float_arg;
831
832 /* Variables which will point to LISP objects: */
833 LVAL result;
834 LVAL list_ptr;
835 LVAL float_ptr;
836 LVAL int_ptr;
837
838 /* Protect our internal pointers by */
839 /* pushing them on the evaluation */
840 /* stack so the garbage collector */
841 /* can't recycle them in the middle */
842 /* of the routine: */
843 xlstkcheck(4); /* Make sure following xlsave */
844 /* calls won't overrun stack. */
845 xlsave(list_ptr); /* Use xlsave1() if you don't */
846 xlsave(float_ptr);/* do an xlstkcheck(). */
847 xlsave(int_ptr);
848 xlsave(result);
849
850 /* Semantic check, illustrating use of xlfail(): */
851 if (list_ptr == NIL) {
852 xlfail("null list");
853 /* Won't return. */
854 }
855
856 /* Create an internal list structure, protected */
857 /* against garbage collection until we exit fn: */
858 list_ptr = cons(list_arg,list_arg);
859
860 /* Get the actual values of our fixnum and flonum: */
861 val_of_integer_arg = getfixnum( integer_arg );
862 val_of_float_arg = getflonum( float_arg );
863
864 /* Semantic check, illustrating use of xlerror(): */
865 if (val_of_integer_arg < -2) {
866 xlerror("bad integer",cvfixnum(val_of_integer_arg));
867 /* Won't return. */
868 }
869
870
871
872 /*******************************************/
873 /* You can have any amount of intermediate */
874 /* computations at this point in the fn... */
875 /*******************************************/
876
877
878 /* Make new numeric values to return: */
879 integer_ptr = cvfixnum( val_of_integer_arg * 3 );
880 float_ptr = cvflonum( val_of_float_arg * 3.0 );
881
882 /* Cons it all together to produce a return value: */
883 result = cons( float_ptr, NIL );
884 result = cons( integer_ptr, result );
885 result = cons( list_ptr, result );
886
887 /* Restore the stack, canceling the xlsave()s: */
888 xlpopn(4); /* Use xlpop() for a single argument.*/
889
890 return result;
891 }
892
893
894
895 Example of what NOT to do:
896 --------------------------
897
898 Here's a function I wrote which does *NOT* correctly prevent the
899 garbage collector from stealing its dynamically allocated cells:
900
901 LVAL incorrect_Point_To_List( p )/*DON'T USE THIS CODE! */
902 geo_point* p;
903 /*-
904 Convert point to (x y z) list.
905 -*/
906 {
907 LVAL result;
908 xlsave1(result);
909 result = cons( /* THIS CODE IS BROKEN! */
910 cvflonum( p->x), /* THIS CODE IS BROKEN! */
911 cons( /* THIS CODE IS BROKEN! */
912 cvflonum( p->y), /* THIS CODE IS BROKEN! */
913 cons( /* THIS CODE IS BROKEN! */
914 cvflonum(p->z), /* THIS CODE IS BROKEN! */
915 NIL /* THIS CODE IS BROKEN! */
916 ) /* THIS CODE IS BROKEN! */
917 ) /* THIS CODE IS BROKEN! */
918 ); /* THIS CODE IS BROKEN! */
919 xlpop();
920 return result;
921 }
922
923 The problem with the above function is that the "z" cell will be
924 allocated first, and is not protected during the allocation of the "y"
925 flonum (or vice versa, depending on the order the compiler chooses to
926 evaluate these arguments). Similarly, the "y" cell is not protected
927 during allocation of the "x" flonum. Here is a correct version, in
928 which "result" always protects the list-to-date:
929
930 LVAL correct_Point_To_List( p )
931 geo_point* p;
932 /*-
933 Convert point to (x y z) list.
934 -*/
935 {
936 LVAL result;
937 xlsave1(result);
938 result = cons( cvflonum(p->z), NIL );
939 result = cons( cvflonum(p->y), result );
940 result = cons( cvflonum(p->x), result );
941 xlpop();
942 return result;
943 }
944
945
946
947 Minor Observations:
948 -------------------
949
950 xlapply, xlevform and sendmsg will issue an error if they encounter a
951 s_macro CLOSURE. You are not allowed to use APPLY or FUNCALL with
952 macros in Common Lisp. There is no way provided to declare macro
953 methods, nor do they make much sense...
954
955 Neither xlapply nor sendmsg will handle FSUBRs. Common Lisp does not
956 allow APPLYing a special form (FSUBR), and since SEND is a SUBR, all
957 of its arguments are already evaluated, so it is not possible to have
958 FSUBR methods.
959
960 Since xlisp tracks the three most recent input expressions (in
961 variables +, ++ and +++) and three most recent results (in variables
962 *, ** and ***), things may occasionally not get garbage-collected as
963 soon as you expect!
964
965 Both xlobj.c:xloinit() and xlobj.c:obsymvols() initialize the "object"
966 and "class" variables. This is neither redundant nor a bug:
967
968 xloinit creates the classes class and object, as well as the symbols,
969 but sets the C variables class and object to point to the class and
970 object.
971
972 obsymbols just sets the C variables by looking up the symbols. It is
973 needed because when you restore a workspace you don't create new
974 objects but still need to know where the existing objects are (they
975 might be in a different location in the saved workspace). Notice that
976 obsymbols is called by xlsymbols which is called both when
977 initializing a new workspace and when restoring an old workspace.
978
979
980
981
982 Acknowledgements
983 ----------------
984
985 This document is considerably improved thanks to careful reading and
986 thoughtful suggestions from Ken Whedbee (kcw@beach.cis.ufl.edu) and
987 (especially) Tom Almy (toma@sail.labs.tek.com).
988
989

brinkley@uw.edu
ViewVC Help
Powered by ViewVC 1.1.26