This is the README file for the JavaScript Reference (JSRef) implementation.
It consists of build conventions and instructions, source code conventions, a
design walk-through, and a brief file-by-file description of the source.

JSRef builds a library or DLL containing the JavaScript runtime (compiler,
interpreter, decompiler, garbage collector, atom manager, standard classes).
It then compiles a small "shell" program and links that with the library to
make an interpreter that can be used interactively and with test .js files to
run scripts.

The current version of JSRef lacks a conformance testsuite. We aim to provide
one as soon as possible.

Quick start tip: skip to "Using the JS API" below, build js, and play with the
object named "it" (start by setting 'it.noisy = true').

Brendan Eich, 9/17/96
------------------------------------------------------------------------------
Build conventions:

- On Windows, use MSDEV4.2 (js*.mdp) or 5.0 (js*.mak).

- On Mac, use CodeWarrior 3.x (JSRef.mcp in the macbuild subdirectory).

- On Unix, use vendor cc or gcc (ftp://prep.ai.mit.edu/pub/gnu) for compiling,
  and use gmake for building.

To compile optimized code, pass BUILD_OPT=1 on the nmake/gmake command line
or preset it in the environment or makefile. The C preprocessor macro DEBUG
will be undefined, and NDEBUG (archaic Unix-ism for "No Debugging") will be
defined. Without BUILD_OPT, DEBUG is predefined and NDEBUG is undefined.

On Unix, your own debug flag, DEBUG_$USER, will be defined or undefined as
BUILD_OPT is unset or set.

(Linux autoconf support way overdue; coming some day soon, I promise.)

- To add C compiler options from the make command line, set XCFLAGS=-Dfoo.
  To predefine -D or -U options in the makefile, set DEFINES.
  To predefine -I options in the makefile, set INCLUDES.

- To turn on GC instrumentation, define JS_GCMETER.

- To enable multi-threaded execution, define JS_THREADSAFE and flesh out the
  stubs and required headers in jslock.c/.h. See the JS API docs for more.

- To turn on the arena package's instrumentation, define PR_ARENAMETER.

- To turn on the hash table package's metering, define PR_HASHMETER.

Naming and coding conventions:

- Public function names begin with JS_ followed by capitalized "intercaps",
  e.g. JS_NewObject.

- Extern but library-private function names use a js_ prefix and mixed case,
  e.g. js_LookupSymbol.

- Most static function names have unprefixed, mixed-case names: GetChar.

- But static native methods of JS objects have lowercase, underscore-separated
  or intercaps names, e.g., str_indexOf.

- And library-private and static data use underscores, not intercaps (but
  library-private data do use a js_ prefix).

- Scalar type names are lowercase and js-prefixed: jsdouble.

- Aggregate type names are JS-prefixed and mixed-case: JSObject.

- Macros are generally ALL_CAPS and underscored, to call out potential
  side effects, multiple uses of a formal argument, etc.

- Four spaces of indentation per statement nesting level.

- Tabs are taken to be eight spaces, and an Emacs magic comment at the top of
  each file tries to help. If you're using MSVC or similar, you'll want to
  set tab width to 8, or convert these files to be space-filled.

- DLL entry points have their return type expanded within a PR_PUBLIC_API()
  macro call, to get the right Windows secret type qualifiers in the right
  places for both 16- and 32-bit builds.

- Callback functions that might be called from a DLL are similarly macroized
  with PR_STATIC_CALLBACK (if the function otherwise would be static to hide
  its name) or PR_CALLBACK (this macro takes no type argument; it should be
  used after the return type and before the function name).

Using the JS API:

- Starting up:

    /*
     * Tune this to avoid wasting space for shallow stacks, while saving on
     * malloc overhead/fragmentation for deep or highly-variable stacks.
     */
    #define STACK_CHUNK_SIZE    8192

    JSRuntime *rt;
    JSContext *cx;

    /* You need a runtime and one or more contexts to do anything with JS. */
    rt = JS_Init(1000000L);
    if (!rt)
        fail("can't create JavaScript runtime");
    cx = JS_NewContext(rt, STACK_CHUNK_SIZE);
    if (!cx)
        fail("can't create JavaScript context");

    /*
     * The context definitely wants a global object, in order to have standard
     * classes and functions like Date and parseInt. See below for details on
     * JS_NewObject.
     */
    JSObject *globalObj;

    globalObj = JS_NewObject(cx, &my_global_class, 0, 0);
    JS_InitStandardClasses(cx, globalObj);

- Defining objects and properties:

    /* Statically initialize a class to make "one-off" objects. */
    JSClass my_class = {
        "MyClass",

        /* All of these can be replaced with the corresponding JS_*Stub
           function pointers. */
        my_addProperty, my_delProperty, my_getProperty, my_setProperty,
        my_enumerate,   my_resolve,     my_convert,     my_finalize
    };

    JSObject *obj;

    /*
     * Define an object named "myObject" in the global scope that can be
     * enumerated by for/in loops. The parent object is passed as the second
     * argument, as with all other API calls that take an object/name pair.
     * The prototype passed in is null, so the default object prototype will
     * be used.
     */
    obj = JS_DefineObject(cx, globalObj, "myObject", &my_class, 0,
                          JSPROP_ENUMERATE);

    /*
     * Define a bunch of properties with a JSPropertySpec array statically
     * initialized and terminated with a null-name entry. Besides its name,
     * each property has a "tiny" identifier (MY_COLOR, e.g.) that can be used
     * in switch statements (in a common my_getProperty function, for example).
     */
    enum my_tinyid {
        MY_COLOR, MY_HEIGHT, MY_WIDTH, MY_FUNNY, MY_ARRAY, MY_RDONLY
    };

    static JSPropertySpec my_props[] = {
        {"color",       MY_COLOR,       JSPROP_ENUMERATE},
        {"height",      MY_HEIGHT,      JSPROP_ENUMERATE},
        {"width",       MY_WIDTH,       JSPROP_ENUMERATE},
        {"funny",       MY_FUNNY,       JSPROP_ENUMERATE},
        {"array",       MY_ARRAY,       JSPROP_ENUMERATE},
        {"rdonly",      MY_RDONLY,      JSPROP_READONLY},
        {0}
    };

    JS_DefineProperties(cx, obj, my_props);

    /*
     * Given the above definitions and call to JS_DefineProperties, obj will
     * need this sort of "getter" method in its class (my_class, above). See
     * the example for the "It" class in js.c.
     */
    static JSBool
    my_getProperty(JSContext *cx, JSObject *obj, jsval id, jsval *vp)
    {
        if (JSVAL_IS_INT(id)) {
            switch (JSVAL_TO_INT(id)) {
              case MY_COLOR:  *vp = . . .; break;
              case MY_HEIGHT: *vp = . . .; break;
              case MY_WIDTH:  *vp = . . .; break;
              case MY_FUNNY:  *vp = . . .; break;
              case MY_ARRAY:  *vp = . . .; break;
              case MY_RDONLY: *vp = . . .; break;
            }
        }
        return JS_TRUE;
    }

- Defining functions:

    /* Define a bunch of native functions first: */
    static JSBool
    my_abs(JSContext *cx, JSObject *obj, uintN argc, jsval *argv, jsval *rval)
    {
        jsdouble x, z;

        if (!JS_ValueToNumber(cx, argv[0], &x))
            return JS_FALSE;
        z = (x < 0) ? -x : x;
        return JS_NewDoubleValue(cx, z, rval);
    }

    . . .

    /*
     * Use a JSFunctionSpec array terminated with a null name to define a
     * bunch of native functions.
     */
    static JSFunctionSpec my_functions[] = {
    /*    name          native          nargs    */
        {"abs",         my_abs,         1},
        {"acos",        my_acos,        1},
        {"asin",        my_asin,        1},
        . . .
        {0}
    };

    /*
     * Pass a particular object to define methods for it alone. If you pass
     * a prototype object, the methods will apply to all instances past and
     * future of the prototype's class (see below for classes).
     */
    JS_DefineFunctions(cx, globalObj, my_functions);

- Defining classes:

    /*
     * This pulls together the above API elements by defining a constructor
     * function, a prototype object, and properties of the prototype and of
     * the constructor, all with one API call.
     *
     * Initialize a class by defining its constructor function, prototype, and
     * per-instance and per-class properties. The latter are called "static"
     * below by analogy to Java. They are defined in the constructor object's
     * scope, so that 'MyClass.myStaticProp' works along with 'new MyClass()'.
     *
     * JS_InitClass takes a lot of arguments, but you can pass null for any of
     * the last four if there are no such properties or methods.
     *
     * Note that you do not need to call JS_InitClass to make a new instance of
     * that class -- otherwise there would be a chicken-and-egg problem making
     * the global object -- but you should call JS_InitClass if you require a
     * constructor function for script authors to call via new, and/or a class
     * prototype object ('MyClass.prototype') for authors to extend with new
     * properties at run-time.
     */
    protoObj = JS_InitClass(cx, globalObj, &my_class,

                            /* native constructor function and min arg count */
                            MyClass, 0,

                            /* prototype object properties and methods -- these
                               will be "inherited" by all instances through
                               delegation up the instance's prototype link. */
                            my_props, my_methods,

                            /* class constructor properties and methods */
                            my_static_props, my_static_methods);

- Running scripts:

    /* These should indicate source location for diagnostics. */
    char *filename;
    uintN lineno;

    /*
     * The return value comes back here -- if it could be a GC thing, you must
     * add it to the GC's "root set" with JS_AddRoot(cx, &thing) where thing
     * is a JSString *, JSObject *, or jsdouble *, and remove the root before
     * rval goes out of scope, or when rval is no longer needed.
     */
    jsval rval;
    JSBool ok;

    /*
     * Some example source in a C string. Larger, non-null-terminated buffers
     * can be used, if you pass the buffer length to JS_EvaluateScript.
     */
    char *source = "x * f(y)";

    ok = JS_EvaluateScript(cx, globalObj, source, strlen(source),
                           filename, lineno, &rval);

    if (ok) {
        /* Should get a number back from the example source. */
        jsdouble d;

        ok = JS_ValueToNumber(cx, rval, &d);
        . . .
    }

- Calling functions:

    /* Call a global function named "foo" that takes no arguments. */
    ok = JS_CallFunctionName(cx, globalObj, "foo", 0, 0, &rval);

    jsval argv[2];

    /* Call a function in obj's scope named "method", passing two arguments. */
    argv[0] = . . .;
    argv[1] = . . .;
    ok = JS_CallFunctionName(cx, obj, "method", 2, argv, &rval);

- Shutting down:

    /* For each context you've created: */
    JS_DestroyContext(cx);

    /* And finally: */
    JS_Finish(rt);

- Debugging API:

  See the trap, untrap, watch, unwatch, line2pc, and pc2line commands in js.c.
  See also the (scant) comments in jsdbgapi.h.

Design walk-through:

This section must be brief for now -- it could easily turn into a book.

- JS "JavaScript Proper"

JS modules declare and implement the JavaScript compiler, interpreter,
decompiler, GC and atom manager, and standard classes.

JavaScript uses untyped bytecode and runtime type tagging of data values.
The jsval type is a signed machine word that contains either a signed integer
value (if the low bit is set), or a type-tagged pointer or boolean value (if
the low bit is clear). Tagged pointers all refer to 8-byte-aligned things in
the GC heap.
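
The tagging scheme just described can be sketched in portable C roughly as
follows. This is an illustration only: the names (val_t, int_to_val, and so
on) and the particular tag values are assumptions, not the real definitions,
which live in jsapi.h. Boolean values would use their own tag with a small
constant in place of the pointer bits; that case is not modeled here.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t val_t;         /* signed machine word (hypothetical name) */

/* Assumed tag values for non-integers; each leaves the low bit clear. */
enum { TAG_OBJECT = 0, TAG_DOUBLE = 2, TAG_STRING = 4, TAG_BOOLEAN = 6 };

int val_is_int(val_t v)
{
    return (v & 1) != 0;        /* low bit set: a signed integer */
}

val_t int_to_val(intptr_t i)
{
    /* One bit goes to the tag, so only half the word's range fits. */
    assert(i >= INTPTR_MIN / 2 && i <= INTPTR_MAX / 2);
    return i * 2 + 1;
}

intptr_t val_to_int(val_t v)
{
    assert(val_is_int(v));
    return (v - 1) / 2;         /* exact, because v - 1 is even */
}

val_t ptr_to_val(void *thing, int tag)
{
    uintptr_t bits = (uintptr_t)thing;

    assert((bits & 7) == 0);    /* 8-byte alignment frees three low bits */
    assert(tag >= 0 && tag <= 6 && (tag & 1) == 0);
    return (val_t)(bits | (uintptr_t)tag);
}

int val_tag(val_t v)
{
    assert(!val_is_int(v));
    return (int)(v & 7);
}

void *val_to_ptr(val_t v)
{
    assert(!val_is_int(v));
    return (void *)((uintptr_t)v & ~(uintptr_t)7);
}
```

Because the tag lives in bits that alignment guarantees to be zero, testing a
value's type costs one mask and compare, with no memory reference.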

Objects consist of a possibly shared structural description, called the map
or scope; and unshared property values in a vector, called the slots. Object
properties are associated with nonnegative integers stored in jsvals, or with
atoms (unique string descriptors) if named by an identifier or a non-integral
index expression.

Scripts contain bytecode, source annotations, and a pool of string, number,
and identifier literals. Functions are objects that extend scripts or native
functions with formal parameters, a literal syntax, and a distinct primitive
type ("function").

The compiler consists of a recursive-descent parser and a random-logic rather
than table-driven lexical scanner. Semantic and lexical feedback are used to
disambiguate hard cases such as missing semicolons, assignable expressions
("lvalues" in C parlance), etc. The parser generates bytecode as it parses,
using fixup lists for downward branches and code buffering and rewriting for
exceptional cases such as for loops. It attempts no error recovery.

The interpreter executes the bytecode of top-level scripts, and calls itself
indirectly to interpret function bodies (which are also scripts). All state
associated with an interpreter instance is passed through formal parameters
to the interpreter entry point; most implicit state is collected in a type
named JSContext. Therefore, all API and almost all other functions in JSRef
take a JSContext pointer as their first argument.

The decompiler translates postfix bytecode into infix source by consulting a
separate byte-sized code, called source notes, to disambiguate bytecodes that
result from more than one grammatical production.

The GC is a mark-and-sweep, non-conservative (perfect) collector. It can
allocate only fixed-sized things -- the current size is two machine words.
It is used to hold JS object and string descriptors (but not property lists
or string bytes), and double-precision floating point numbers. It runs
automatically only when maxbytes (as passed to JS_Init) bytes of GC things
have been allocated and another thing-allocation request is made. JS API
users should call JS_GC or JS_MaybeGC between script executions or from the
branch callback, as often as necessary.

An important point about the GC's "perfection": you must add roots for new
objects created by your native methods if you store references to them into
a non-JS structure in the malloc heap or in static data. Also, if you make
a new object in a native method, but do not store it through the rval result
parameter (see my_abs in the "Using the JS API" section above) so that it
is in a known root, the object is guaranteed to survive only until another
new object is created. Either lock the first new object when making two in
a row, or store it in a root you've added, or store it via rval.

The atom manager consists of a hash table associating strings uniquely with
scanner/parser information such as keyword type, index in script or function
literal pool, etc. Atoms play three roles in JSRef: as literals referred to
by unaligned 16-bit immediate bytecode operands, as unique string descriptors
for efficient property name hashing, and as members of the root GC set for
perfect GC. This design therefore requires atoms to be manually reference
counted, from script literal pools (JSAtomMap) and object symbol (JSSymbol)
entry keys.
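
The "unique string descriptor" and manual reference counting roles can be
sketched with a small chained hash table, as below. Everything named here
(Atom, atomize, atom_release, the bucket count, the string hash) is
hypothetical and chosen for brevity; the real code is js_Atomize() and
friends in jsatom.c, which also serve the literal-pool and GC-root roles
that this sketch leaves out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct Atom {
    struct Atom *next;      /* next atom on the same hash chain */
    unsigned     nrefs;     /* manual reference count */
    char         chars[];   /* private NUL-terminated copy of the string */
} Atom;

#define NBUCKETS 64
static Atom *buckets[NBUCKETS];

static unsigned hash_string(const char *s)
{
    unsigned h = 0;

    while (*s)
        h = h * 31 + (unsigned char)*s++;
    return h;
}

/* Return the one atom for s, creating it on first use; NULL if out of memory. */
Atom *atomize(const char *s)
{
    Atom **head = &buckets[hash_string(s) % NBUCKETS];
    Atom *a;

    for (a = *head; a; a = a->next) {
        if (strcmp(a->chars, s) == 0) {
            a->nrefs++;     /* existing atom: just take another reference */
            return a;
        }
    }
    a = malloc(sizeof *a + strlen(s) + 1);
    if (!a)
        return NULL;
    strcpy(a->chars, s);
    a->nrefs = 1;
    a->next = *head;
    *head = a;
    return a;
}

/* Drop one reference; unlink and free the atom when the last one goes. */
void atom_release(Atom *a)
{
    assert(a->nrefs > 0);
    if (--a->nrefs == 0) {
        Atom **link = &buckets[hash_string(a->chars) % NBUCKETS];

        while (*link != a)
            link = &(*link)->next;
        *link = a->next;
        free(a);
    }
}
```

Since equal strings always yield the same Atom pointer, property names can be
compared and hashed by address rather than by contents.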

Native objects and methods for arrays, booleans, dates, functions, numbers,
and strings are implemented using the JS API and certain internal interfaces
used as "fast paths".

In general, errors are signaled by false or unoverloaded-null return values,
and are reported using JS_ReportError or one of its variants by the lowest
level in order to provide the most detail. Client code can substitute its
own error reporting function and suppress errors, or reflect them into Java
or some other runtime system as exceptions, GUI dialogs, etc.

- PR "Portable Runtime"

PR modules declare and implement fundamental representation types and macros,
arenas, hash tables, 64-bit integers, double-precision floating point to
string and back conversions, and date/time functions that are used by the JS
modules. The PR code is independent of JavaScript and can be used without
linking with the JS code.

In general, errors are signaled by false or unoverloaded-null return values,
but are not reported. Therefore, JS calls to PR functions check returns and
report errors as specifically as possible.

File walk-through:
- jsapi.c, jsapi.h
The public API to be used by almost all client code.
If your client code can't make do with jsapi.h, and must reach into a friend
or private js* file, please let us know so we can extend jsapi.h to include
what you need in a fashion that we can support over the long run.
- jspubtd.h, jsprvtd.h
These files exist to group struct and scalar typedefs so they can be used
everywhere without dragging in struct definitions from N different files.
The jspubtd.h file contains public typedefs, and is included by jsapi.h.
The jsprvtd.h file contains private typedefs and is included by various .h
files that need type names, but not type sizes or declarations.
- jsdbgapi.c, jsdbgapi.h
The Debugging API, still very much under development. Provided so far:
- Traps, with which breakpoints, single-stepping, step over, step out, and
so on can be implemented. The debugger will have to consult jsopcode.def
on its own to figure out where to plant trap instructions to implement
functions like step out, but a future jsdbgapi.h will provide convenience
interfaces to do these things.
At most one trap per bytecode can be set. When a script (JSScript) is
destroyed, all traps set in its bytecode are cleared.
- Watchpoints, for intercepting set operations on properties and running a
debugger-supplied function that receives the old value and a pointer to
the new one, which it can use to modify the new value being set.
- Line number to PC and back mapping functions. The line-to-PC direction
"rounds" toward the next bytecode generated from a line greater than or
equal to the input line, and may return the PC of a for-loop update part,
if given the line number of the loop body's closing brace. Any line after
the last one in a script or function maps to a PC one byte beyond the last
bytecode in the script.
An example, from perfect.js:

    14   function perfect(n)
    15   {
    16       print("The perfect numbers up to " + n + " are:");
    17
    18       // We build sumOfDivisors[i] to hold a string expression for
    19       // the sum of the divisors of i, excluding i itself.
    20       var sumOfDivisors = new ExprArray(n+1,1);
    21       for (var divisor = 2; divisor <= n; divisor++) {
    22           for (var j = divisor + divisor; j <= n; j += divisor) {
    23               sumOfDivisors[j] += " + " + divisor;
    24           }
    25           // At this point everything up to 'divisor' has its sumOfDivisors
    26           // expression calculated, so we can determine whether it's perfect
    27           // already by evaluating.
    28           if (eval(sumOfDivisors[divisor]) == divisor) {
    29               print("" + divisor + " = " + sumOfDivisors[divisor]);
    30           }
    31       }
    32       delete sumOfDivisors;
    33       print("That's all.");
    34   }

The line number to PC and back mappings can be tested using the js program
with the following script:

    load("perfect.js")
    print(perfect)
    dis(perfect)

    print()
    for (var ln = 0; ln <= 40; ln++) {
        var pc = line2pc(perfect,ln)
        var ln2 = pc2line(perfect,pc)
        print("\tline " + ln + " => pc " + pc + " => line " + ln2)
    }

The result of the for loop over lines 0 to 40 inclusive is:
line 0 => pc 0 => line 16
line 1 => pc 0 => line 16
line 2 => pc 0 => line 16
line 3 => pc 0 => line 16
line 4 => pc 0 => line 16
line 5 => pc 0 => line 16
line 6 => pc 0 => line 16
line 7 => pc 0 => line 16
line 8 => pc 0 => line 16
line 9 => pc 0 => line 16
line 10 => pc 0 => line 16
line 11 => pc 0 => line 16
line 12 => pc 0 => line 16
line 13 => pc 0 => line 16
line 14 => pc 0 => line 16
line 15 => pc 0 => line 16
line 16 => pc 0 => line 16
line 17 => pc 19 => line 20
line 18 => pc 19 => line 20
line 19 => pc 19 => line 20
line 20 => pc 19 => line 20
line 21 => pc 36 => line 21
line 22 => pc 53 => line 22
line 23 => pc 74 => line 23
line 24 => pc 92 => line 22
line 25 => pc 106 => line 28
line 26 => pc 106 => line 28
line 27 => pc 106 => line 28
line 28 => pc 106 => line 28
line 29 => pc 127 => line 29
line 30 => pc 154 => line 21
line 31 => pc 154 => line 21
line 32 => pc 161 => line 32
line 33 => pc 172 => line 33
line 34 => pc 172 => line 33
line 35 => pc 172 => line 33
line 36 => pc 172 => line 33
line 37 => pc 172 => line 33
line 38 => pc 172 => line 33
line 39 => pc 172 => line 33
line 40 => pc 172 => line 33
- jsconfig.h
Various configuration macros defined as 0 or 1 depending on how JS_VERSION
is defined (as 10 for JavaScript 1.0, 11 for JavaScript 1.1, etc.). Not all
macros are tested around related code yet. In particular, JS 1.0 support is
missing from JSRef. JS 1.2 support will appear in a future JSRef release.
- js.c
The "JS shell", a simple interpreter program that uses the JS API and more
than a few internal interfaces (some of these internal interfaces could be
replaced by jsapi.h calls). The js program built from this source provides
a test vehicle for evaluating scripts and calling functions, trying out new
debugger primitives, etc.
- jsarray.c, jsarray.h
- jsbool.c, jsbool.h
- jsdate.c, jsdate.h
- jsfun.c, jsfun.h
- jsmath.c, jsmath.h
- jsnum.c, jsnum.h
- jsstr.c, jsstr.h
These file pairs implement the standard classes and (where they exist) their
underlying primitive types. They have similar structure, generally starting
with class definitions and continuing with internal constructors, finalizers,
and helper functions.
- jsobj.c, jsobj.h
- jsscope.c, jsscope.h
These two pairs declare and implement the JS object system. All of the
following happen here:
- creating objects by class and prototype, and finalizing objects;
- defining, looking up, getting, setting, and deleting properties;
- creating and destroying properties and binding names to them.
The details of an object map (scope) are mostly hidden in jsscope.[ch],
where scopes start out as linked lists of symbols, and grow after some
threshold into PR hash tables.
- jsatom.c, jsatom.h
The atom manager. Contains well-known string constants, their atoms, the
global atom hash table and related state, the js_Atomize() function that
turns a counted string of bytes into an atom, and literal pool (JSAtomMap)
methods.
- jsgc.c, jsgc.h
[TBD]
- jsinterp.c, jsinterp.h
- jscntxt.c, jscntxt.h
The bytecode interpreter, and related functions such as Call and AllocStack,
live in jsinterp.c. The JSContext constructor and destructor are factored out
into jscntxt.c for minimal linking when the compiler part of JS is split from
the interpreter part into a separate program.
- jsemit.c, jsemit.h
- jsopcode.def, jsopcode.c, jsopcode.h
- jsparse.c, jsparse.h
- jsscan.c, jsscan.h
- jsscript.c, jsscript.h
Compiler and decompiler modules. The jsopcode.def file is a C preprocessor
source that defines almost everything there is to know about JS bytecodes.
See its major comment for how to use it. For now, a debugger will use it
and its dependents such as jsopcode.h directly, but over time we intend to
extend jsdbgapi.h to hide uninteresting details and provide conveniences.
The code generator is split across paragraphs of code in jsparse.c, and the
utility methods called on JSCodeGenerator appear in jsemit.c. Source notes
generated by jsparse.c and jsemit.c are used in jsscript.c to map line number
to program counter and back.
- prtypes.h, prlog2.c
Fundamental representation types and utility macros. Alone among the .h
files in JSRef, prtypes.h must be included first by .c files. It is not
nested in other .h files, as prerequisite .h files generally are, since it is
also a direct dependency of most .c files and would be over-included if nested
in addition to being directly included.
The one "not-quite-a-macro macro" is the PR_CeilingLog2 function in prlog2.c.
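
For readers who have not met the operation, here is a minimal sketch of a
ceiling-log2 function: it returns the smallest k such that 2^k >= n. The
name ceiling_log2 and the loop below are illustrative assumptions, not the
code in prlog2.c.

```c
#include <assert.h>

/* Smallest k with 2^k >= n; n must be nonzero. */
unsigned ceiling_log2(unsigned long n)
{
    unsigned log2 = 0;

    assert(n != 0);
    if (n & (n - 1))        /* not an exact power of two: round up */
        log2++;
    while (n >>= 1)         /* count the remaining halvings */
        log2++;
    return log2;
}
```

A typical use is sizing a power-of-two table: a request for 1000 entries
rounds up to 2^10 = 1024 slots.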
- prarena.c, prarena.h
Last-In-First-Out allocation macros that amortize malloc costs and allow for
en-masse freeing. See the paper mentioned in prarena.h's major comment.
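
The idea can be sketched as a pool of malloc'ed chunks with mark and release
operations. This is a simplified illustration under assumed names (Pool,
Chunk, Mark, pool_alloc, and so on) and uses functions where prarena.h uses
macros; consult prarena.h for the real interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct Chunk {
    struct Chunk *prev;     /* chunk allocated before this one, or NULL */
    size_t        used;     /* payload bytes handed out so far */
    size_t        size;     /* payload capacity; payload follows the header */
} Chunk;

typedef struct Pool {
    Chunk *current;         /* most recently allocated chunk */
    size_t chunk_size;      /* default payload bytes per chunk */
} Pool;

typedef struct Mark {
    Chunk *chunk;
    size_t used;
} Mark;

void pool_init(Pool *p, size_t chunk_size)
{
    p->current = NULL;
    p->chunk_size = chunk_size;
}

/* Bump-allocate nbytes; malloc is called only when a chunk fills up. */
void *pool_alloc(Pool *p, size_t nbytes)
{
    size_t align = sizeof(void *);      /* assumed sufficient alignment */
    Chunk *c = p->current;

    nbytes = (nbytes + align - 1) / align * align;
    if (!c || c->size - c->used < nbytes) {
        size_t size = nbytes > p->chunk_size ? nbytes : p->chunk_size;

        c = malloc(sizeof *c + size);
        if (!c)
            return NULL;
        c->prev = p->current;
        c->used = 0;
        c->size = size;
        p->current = c;
    }
    void *result = (unsigned char *)(c + 1) + c->used;
    c->used += nbytes;
    return result;
}

/* Remember the current high-water point. */
Mark pool_mark(const Pool *p)
{
    Mark m;

    m.chunk = p->current;
    m.used = p->current ? p->current->used : 0;
    return m;
}

/* Free, en masse, everything allocated since the mark was taken. */
void pool_release(Pool *p, Mark m)
{
    while (p->current != m.chunk) {
        Chunk *dead = p->current;

        assert(dead != NULL);           /* the mark must belong to this pool */
        p->current = dead->prev;
        free(dead);
    }
    if (p->current)
        p->current->used = m.used;
}
```

Releasing to a mark taken on an empty pool frees every chunk, which is how a
caller would tear the whole pool down.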
- prassert.c, prassert.h
The PR_ASSERT macro is used throughout JSRef source as a proof device to make
invariants and preconditions clear to the reader, and to hold the line during
maintenance and evolution against regressions or violations of assumptions
that it would be too expensive to test unconditionally at run-time. Certain
assertions are followed by run-time tests that cope with assertion failure,
but only where I'm too smart or paranoid to believe the assertion will never
fail...
- prclist.h
Doubly-linked circular list struct and macros.
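
A minimal sketch of such a list follows, with an embedded link struct and
macros in the ALL_CAPS style described under the naming conventions. The
names (CList, CLIST_INIT, Job, sum_ids) are invented for this example;
prclist.h defines the real ones.

```c
#include <assert.h>
#include <stddef.h>

typedef struct CList {
    struct CList *next;
    struct CList *prev;
} CList;

/* An empty list is a header linked to itself, so no NULL checks are needed. */
#define CLIST_INIT(l)           ((l)->next = (l)->prev = (l))
#define CLIST_IS_EMPTY(l)       ((l)->next == (l))

/* Insert e just before l; when l is the header, this appends to the list. */
#define CLIST_INSERT_BEFORE(e, l)                                             \
    do {                                                                      \
        (e)->next = (l);                                                      \
        (e)->prev = (l)->prev;                                                \
        (l)->prev->next = (e);                                                \
        (l)->prev = (e);                                                      \
    } while (0)

/* Unlink e and leave it as a valid one-element list. */
#define CLIST_REMOVE(e)                                                       \
    do {                                                                      \
        (e)->prev->next = (e)->next;                                          \
        (e)->next->prev = (e)->prev;                                          \
        CLIST_INIT(e);                                                        \
    } while (0)

/* A client struct embeds the link first, so a cast recovers the container. */
typedef struct Job {
    CList link;
    int   id;
} Job;

/* Walk from the header around the circle until we come back to it. */
int sum_ids(CList *head)
{
    int sum = 0;

    assert(head != NULL);
    for (CList *p = head->next; p != head; p = p->next)
        sum += ((Job *)p)->id;
    return sum;
}
```

Note that the macros evaluate their arguments more than once, so they should
be passed simple pointer expressions without side effects.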
- prcpucfg.c
This standalone program generates prcpucfg.h, a header file containing bytes
per word and other constants that depend on CPU architecture and C compiler
type model. It tries to discover most of these constants by running its own
experiments on the build host, so if you are cross-compiling, beware.
- prdtoa.c, prdtoa.h
David Gay's portable double-precision floating point to string conversion
code, with Permission To Use notice included.
- prhash.c, prhash.h
Portable, extensible hash tables. These use multiplicative hash for strength
reduction over division hash, yet with very good key distribution over power
of two table sizes. Collisions resolve via chaining, so each entry burns a
malloc and can fragment the heap.
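
To illustrate the multiplicative technique: multiply the key's hash by a
large odd constant and keep the top bits, which replaces the division in a
"hash mod table size" scheme with one multiply and one shift. The sketch
below assumes 32-bit hashes and a golden-ratio multiplier, a common textbook
choice; whether prhash.c uses this exact constant is an assumption, and
bucket_index is a made-up name. Chaining is not shown.

```c
#include <assert.h>
#include <stdint.h>

/* 2^32 divided by the golden ratio (assumed multiplier). */
#define GOLDEN_RATIO 0x9E3779B9u

/* Map a 32-bit key hash to a bucket in a table of 2^log2_size buckets. */
uint32_t bucket_index(uint32_t keyhash, unsigned log2_size)
{
    assert(log2_size >= 1 && log2_size <= 32);
    /* The product wraps modulo 2^32; its high bits are the best mixed. */
    return (uint32_t)(keyhash * GOLDEN_RATIO) >> (32 - log2_size);
}
```

With a 16-bucket table, the consecutive hashes 0, 1, 2, 3, 4 land in buckets
0, 9, 3, 13, 7, so even sequential keys spread across the table.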
- prlong.c, prlong.h
64-bit integer emulation, and compatible macros that use C's long long type
where it exists (my last company mapped long long to a 128-bit type, but no
real architecture does 128-bit ints yet).
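
As a sketch of what emulation involves, the following represents a 64-bit
integer as two 32-bit halves and implements addition and negation. The
Int64 struct and function names are hypothetical, and functions are used
for clarity where prlong.h provides macros.

```c
#include <assert.h>
#include <stdint.h>

/* Two-word stand-in for compilers that lack a 64-bit integer type. */
typedef struct Int64 {
    uint32_t lo;            /* low-order 32 bits */
    uint32_t hi;            /* high-order 32 bits */
} Int64;

/* r = a + b, propagating the carry out of the low word. */
Int64 int64_add(Int64 a, Int64 b)
{
    Int64 r;

    r.lo = a.lo + b.lo;
    r.hi = a.hi + b.hi + (r.lo < a.lo);     /* wraparound signals a carry */
    return r;
}

/* r = -a in two's complement: invert every bit, then add one. */
Int64 int64_neg(Int64 a)
{
    Int64 r;

    r.lo = ~a.lo + 1u;
    r.hi = ~a.hi + (r.lo == 0);             /* carry only if low word wrapped */
    return r;
}
```

Where the compiler has a native 64-bit type, the same operations collapse to
plain + and unary -, which is why a macro layer can hide the difference.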
- prosdep.h, prmacos.h, prpcos.h, prunixos.h, os/*.h
A bunch of annoying OS dependencies rationalized into a few "feature-test"
macros such as HAVE_LONG_LONG.
- prprintf.c, prprintf.h
Portable, buffer-overrun-resistant sprintf and friends.
For no good reason save lack of time, the %e, %f, and %g formats cause your
system's native sprintf, rather than PR_dtoa, to be used. This bug doesn't
affect JSRef, because it uses its own PR_dtoa call in jsnum.c to convert
from double to string, but it's a bug that we'll fix later, and one you
should be aware of if you intend to use a PR_*printf function with your own
floating type arguments -- various vendor sprintf's mishandle NaN, +/-Inf,
and some even print normal floating values inaccurately.
- prtime.c, prtime.h
Time functions. These interfaces are named in a way that makes local vs.
universal time confusion likely. Caveat emptor, and we're working on it.
To make matters worse, Java (and therefore JavaScript) uses "local" time
numbers (offsets from the epoch) in its Date class.