gecko-dev/js/src/jsemit.h

/* -*- Mode: C; tab-width: 8; indent-tabs-mode: nil; c-basic-offset: 4 -*-
*
* ***** BEGIN LICENSE BLOCK *****
* Version: MPL 1.1/GPL 2.0/LGPL 2.1
*
* The contents of this file are subject to the Mozilla Public License Version
* 1.1 (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
* http://www.mozilla.org/MPL/
*
* Software distributed under the License is distributed on an "AS IS" basis,
* WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License
* for the specific language governing rights and limitations under the
* License.
*
* The Original Code is Mozilla Communicator client code, released
* March 31, 1998.
*
* The Initial Developer of the Original Code is
* Netscape Communications Corporation.
* Portions created by the Initial Developer are Copyright (C) 1998
* the Initial Developer. All Rights Reserved.
*
* Contributor(s):
*
* Alternatively, the contents of this file may be used under the terms of
* either of the GNU General Public License Version 2 or later (the "GPL"),
* or the GNU Lesser General Public License Version 2.1 or later (the "LGPL"),
* in which case the provisions of the GPL or the LGPL are applicable instead
* of those above. If you wish to allow use of your version of this file only
* under the terms of either the GPL or the LGPL, and not to allow others to
* use your version of this file under the terms of the MPL, indicate your
* decision by deleting the provisions above and replace them with the notice
* and other provisions required by the GPL or the LGPL. If you do not delete
* the provisions above, a recipient may use your version of this file under
* the terms of any one of the MPL, the GPL or the LGPL.
*
* ***** END LICENSE BLOCK ***** */
#ifndef jsemit_h___
#define jsemit_h___
/*
* JS bytecode generation.
*/
#include "jsstddef.h"
#include "jstypes.h"
#include "jsatom.h"
#include "jsopcode.h"
#include "jsprvtd.h"
#include "jspubtd.h"
JS_BEGIN_EXTERN_C
/*
* NB: If you add non-loop STMT_* enumerators, do so before STMT_DO_LOOP or
* you will break the STMT_IS_LOOP macro, just below this enum.
*/
typedef enum JSStmtType {
    STMT_BLOCK = 0,             /* compound statement: { s1[;... sN] } */
    STMT_LABEL = 1,             /* labeled statement: L: s */
    STMT_IF = 2,                /* if (then) statement */
    STMT_ELSE = 3,              /* else clause of if statement */
    STMT_SWITCH = 4,            /* switch statement */
    STMT_WITH = 5,              /* with statement */
    STMT_TRY = 6,               /* try statement */
    STMT_CATCH = 7,             /* catch block */
    STMT_FINALLY = 8,           /* finally statement */
    STMT_SUBROUTINE = 9,        /* gosub-target subroutine body */
    STMT_DO_LOOP = 10,          /* do/while loop statement */
    STMT_FOR_LOOP = 11,         /* for loop statement */
    STMT_FOR_IN_LOOP = 12,      /* for/in loop statement */
    STMT_WHILE_LOOP = 13        /* while loop statement */
} JSStmtType;
1998-03-28 05:44:41 +03:00
#define STMT_IS_LOOP(stmt) ((stmt)->type >= STMT_DO_LOOP)
typedef struct JSStmtInfo JSStmtInfo;
struct JSStmtInfo {
    JSStmtType      type;           /* statement type */
    ptrdiff_t       update;         /* loop update offset (top if none) */
    ptrdiff_t       breaks;         /* offset of last break in loop */
    ptrdiff_t       continues;      /* offset of last continue in loop */
    ptrdiff_t       gosub;          /* offset of last GOSUB for this finally */
    ptrdiff_t       catchJump;      /* offset of last end-of-catch jump */
    JSAtom          *label;         /* name of LABEL or CATCH var */
    JSStmtInfo      *down;          /* info for enclosing statement */
};
#define SET_STATEMENT_TOP(stmt, top)                                          \
    ((stmt)->update = (top), (stmt)->breaks =                                 \
     (stmt)->continues = (stmt)->catchJump = (stmt)->gosub = (-1))
struct JSTreeContext { /* tree context for semantic checks */
    uint16          flags;          /* statement state flags, see below */
    uint16          numGlobalVars;  /* max. no. of global variables/regexps */
    uint32          tryCount;       /* total count of try statements parsed */
    uint32          globalUses;     /* optimizable global var uses in total */
    uint32          loopyGlobalUses;/* optimizable global var uses in loops */
    JSStmtInfo      *topStmt;       /* top of statement info stack */
    JSAtomList      decls;          /* function, const, and var declarations */
    JSParseNode     *nodeList;      /* list of recyclable parse-node structs */
};
#define TCF_COMPILING 0x01 /* generating bytecode; this tc is a cg */
#define TCF_IN_FUNCTION 0x02 /* parsing inside function body */
#define TCF_RETURN_EXPR 0x04 /* function has 'return expr;' */
#define TCF_RETURN_VOID 0x08 /* function has 'return;' */
#define TCF_IN_FOR_INIT 0x10 /* parsing init expr of for; exclude 'in' */
#define TCF_FUN_CLOSURE_VS_VAR 0x20 /* function and var with same name */
#define TCF_FUN_USES_NONLOCALS 0x40 /* function refers to non-local names */
#define TCF_FUN_HEAVYWEIGHT 0x80 /* function needs Call object per call */
#define TCF_FUN_FLAGS 0xE0 /* flags to propagate from FunctionBody */
#define TCF_HAS_DEFXMLNS 0x100 /* default xml namespace = ...; parsed */
#define TREE_CONTEXT_INIT(tc)                                                 \
    ((tc)->flags = (tc)->numGlobalVars = 0,                                   \
     (tc)->tryCount = (tc)->globalUses = (tc)->loopyGlobalUses = 0,           \
     (tc)->topStmt = NULL, ATOM_LIST_INIT(&(tc)->decls),                      \
     (tc)->nodeList = NULL)

#define TREE_CONTEXT_FINISH(tc)                                               \
    ((void)0)
/*
* Span-dependent instructions are jumps whose span (from the jump bytecode to
* the jump target) may require 2 or 4 bytes of immediate operand.
*/
typedef struct JSSpanDep JSSpanDep;
typedef struct JSJumpTarget JSJumpTarget;
struct JSSpanDep {
    ptrdiff_t       top;        /* offset of first bytecode in an opcode */
    ptrdiff_t       offset;     /* offset - 1 within opcode of jump operand */
    ptrdiff_t       before;     /* original offset - 1 of jump operand */
    JSJumpTarget    *target;    /* tagged target pointer or backpatch delta */
};
/*
* Jump targets are stored in an AVL tree, for O(log(n)) lookup with targets
* sorted by offset from left to right, so that targets after a span-dependent
* instruction whose jump offset operand must be extended can be found quickly
* and adjusted upward (toward higher offsets).
*/
struct JSJumpTarget {
    ptrdiff_t       offset;         /* offset of span-dependent jump target */
    int             balance;        /* AVL tree balance number */
    JSJumpTarget    *kids[2];       /* left and right AVL tree child pointers */
};
#define JT_LEFT                 0
#define JT_RIGHT                1
#define JT_OTHER_DIR(dir)       (1 - (dir))
#define JT_IMBALANCE(dir)       (((dir) << 1) - 1)
#define JT_DIR(imbalance)       (((imbalance) + 1) >> 1)
/*
* Backpatch deltas are encoded in JSSpanDep.target if JT_TAG_BIT is clear,
* so we can maintain backpatch chains when using span dependency records to
* hold jump offsets that overflow 16 bits.
*/
#define JT_TAG_BIT              ((jsword) 1)
#define JT_UNTAG_SHIFT          1
#define JT_SET_TAG(jt)          ((JSJumpTarget *)((jsword)(jt) | JT_TAG_BIT))
#define JT_CLR_TAG(jt)          ((JSJumpTarget *)((jsword)(jt) & ~JT_TAG_BIT))
#define JT_HAS_TAG(jt)          ((jsword)(jt) & JT_TAG_BIT)

#define BITS_PER_PTRDIFF        (sizeof(ptrdiff_t) * JS_BITS_PER_BYTE)
#define BITS_PER_BPDELTA        (BITS_PER_PTRDIFF - 1 - JT_UNTAG_SHIFT)
#define BPDELTA_MAX             (((ptrdiff_t)1 << BITS_PER_BPDELTA) - 1)
#define BPDELTA_TO_JT(bp)       ((JSJumpTarget *)((bp) << JT_UNTAG_SHIFT))
#define JT_TO_BPDELTA(jt)       ((ptrdiff_t)((jsword)(jt) >> JT_UNTAG_SHIFT))

#define SD_SET_TARGET(sd,jt)    ((sd)->target = JT_SET_TAG(jt))
#define SD_GET_TARGET(sd)       (JS_ASSERT(JT_HAS_TAG((sd)->target)),         \
                                 JT_CLR_TAG((sd)->target))
#define SD_SET_BPDELTA(sd,bp)   ((sd)->target = BPDELTA_TO_JT(bp))
#define SD_GET_BPDELTA(sd)      (JS_ASSERT(!JT_HAS_TAG((sd)->target)),        \
                                 JT_TO_BPDELTA((sd)->target))
/* Avoid asserting twice by expanding SD_GET_TARGET in the "then" clause. */
#define SD_SPAN(sd,pivot)       (SD_GET_TARGET(sd)                            \
                                 ? JT_CLR_TAG((sd)->target)->offset - (pivot) \
                                 : 0)
struct JSCodeGenerator {
JSTreeContext treeContext; /* base state: statement info stack, etc. */
JSArenaPool *codePool; /* pointer to thread code arena pool */
JSArenaPool *notePool; /* pointer to thread srcnote arena pool */
void *codeMark; /* low watermark in cg->codePool */
void *noteMark; /* low watermark in cg->notePool */
void *tempMark; /* low watermark in cx->tempPool */
struct {
jsbytecode *base; /* base of JS bytecode vector */
jsbytecode *limit; /* one byte beyond end of bytecode */
jsbytecode *next; /* pointer to next free bytecode */
jssrcnote *notes; /* source notes, see below */
uintN noteCount; /* number of source notes so far */
uintN noteMask; /* growth increment for notes */
ptrdiff_t lastNoteOffset; /* code offset for last source note */
uintN currentLine; /* line number for tree-based srcnote gen */
} prolog, main, *current;
const char *filename; /* null or weak link to source filename */
uintN firstLine; /* first line, for js_NewScriptFromCG */
JSPrincipals *principals; /* principals for constant folding eval */
JSAtomList atomList; /* literals indexed for mapping */
intN stackDepth; /* current stack depth in script frame */
uintN maxStackDepth; /* maximum stack depth so far */
JSTryNote *tryBase; /* first exception handling note */
JSTryNote *tryNext; /* next available note */
size_t tryNoteSpace; /* # of bytes allocated at tryBase */
JSSpanDep *spanDeps; /* span dependent instruction records */
JSJumpTarget *jumpTargets; /* AVL tree of jump target offsets */
JSJumpTarget *jtFreeList; /* JT_LEFT-linked list of free structs */
uintN numSpanDeps; /* number of span dependencies */
uintN numJumpTargets; /* number of jump targets */
uintN emitLevel; /* js_EmitTree recursion level */
JSAtomList constList; /* compile time constants */
JSCodeGenerator *parent; /* Enclosing function or global context */
};
#define CG_BASE(cg) ((cg)->current->base)
#define CG_LIMIT(cg) ((cg)->current->limit)
#define CG_NEXT(cg) ((cg)->current->next)
#define CG_CODE(cg,offset) (CG_BASE(cg) + (offset))
#define CG_OFFSET(cg) PTRDIFF(CG_NEXT(cg), CG_BASE(cg), jsbytecode)
#define CG_NOTES(cg) ((cg)->current->notes)
#define CG_NOTE_COUNT(cg) ((cg)->current->noteCount)
#define CG_NOTE_MASK(cg) ((cg)->current->noteMask)
#define CG_LAST_NOTE_OFFSET(cg) ((cg)->current->lastNoteOffset)
#define CG_CURRENT_LINE(cg) ((cg)->current->currentLine)
#define CG_PROLOG_BASE(cg) ((cg)->prolog.base)
#define CG_PROLOG_LIMIT(cg) ((cg)->prolog.limit)
#define CG_PROLOG_NEXT(cg) ((cg)->prolog.next)
#define CG_PROLOG_CODE(cg,poff) (CG_PROLOG_BASE(cg) + (poff))
#define CG_PROLOG_OFFSET(cg) PTRDIFF(CG_PROLOG_NEXT(cg), CG_PROLOG_BASE(cg),\
jsbytecode)
#define CG_SWITCH_TO_MAIN(cg) ((cg)->current = &(cg)->main)
#define CG_SWITCH_TO_PROLOG(cg) ((cg)->current = &(cg)->prolog)
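The prolog/main pair and the CG_SWITCH_TO_* macros above amount to a two-vector emitter with a movable "current" pointer. A minimal standalone model of that mechanism (illustrative names and a fixed-size buffer; not the real JSCodeGenerator, which grows its vectors from arena pools):

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char bytecode;

/* Two bytecode vectors plus a "current" pointer, as in JSCodeGenerator's
   prolog, main, *current trio. */
struct emitter {
    struct vec { bytecode code[64]; bytecode *next; } prolog, main_, *current;
};

static void emitter_init(struct emitter *e) {
    e->prolog.next = e->prolog.code;
    e->main_.next = e->main_.code;
    e->current = &e->main_;                 /* like CG_SWITCH_TO_MAIN(cg) */
}

static void emit1(struct emitter *e, bytecode op) {
    *e->current->next++ = op;               /* append at CG_NEXT(cg) */
}

static ptrdiff_t offset(struct emitter *e) {
    return e->current->next - e->current->code;   /* CG_OFFSET(cg) */
}
```

Retargeting `e->current = &e->prolog` plays the role of CG_SWITCH_TO_PROLOG: subsequent emits and offsets apply to the prolog vector until the emitter switches back.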
/*
* Initialize cg to allocate bytecode space from codePool, source note space
* from notePool, and all other arena-allocated temporaries from cx->tempPool.
* Return true on success. Report an error and return false if the initial
* code segment can't be allocated.
*/
extern JS_FRIEND_API(JSBool)
js_InitCodeGenerator(JSContext *cx, JSCodeGenerator *cg,
JSArenaPool *codePool, JSArenaPool *notePool,
const char *filename, uintN lineno,
JSPrincipals *principals);
/*
* Release cg->codePool, cg->notePool, and cx->tempPool to marks set by
* js_InitCodeGenerator. Note that cgs are magic: they own the arena pool
* "tops-of-stack" space above their codeMark, noteMark, and tempMark points.
* This means you cannot alloc from tempPool and save the pointer beyond the
* next JS_FinishCodeGenerator.
*/
extern JS_FRIEND_API(void)
js_FinishCodeGenerator(JSContext *cx, JSCodeGenerator *cg);
/*
* Emit one bytecode.
*/
extern ptrdiff_t
js_Emit1(JSContext *cx, JSCodeGenerator *cg, JSOp op);
/*
* Emit two bytecodes, an opcode (op) with a byte of immediate operand (op1).
*/
extern ptrdiff_t
js_Emit2(JSContext *cx, JSCodeGenerator *cg, JSOp op, jsbytecode op1);
/*
* Emit three bytecodes, an opcode with two bytes of immediate operands.
*/
extern ptrdiff_t
js_Emit3(JSContext *cx, JSCodeGenerator *cg, JSOp op, jsbytecode op1,
jsbytecode op2);
/*
* Emit (1 + extra) bytecodes, for N bytes of op and its immediate operand.
*/
extern ptrdiff_t
js_EmitN(JSContext *cx, JSCodeGenerator *cg, JSOp op, size_t extra);
/*
 * Unsafe macro to call js_SetJumpOffset and return false if it fails.
*/
#define CHECK_AND_SET_JUMP_OFFSET(cx,cg,pc,off) \
JS_BEGIN_MACRO \
if (!js_SetJumpOffset(cx, cg, pc, off)) \
return JS_FALSE; \
JS_END_MACRO
#define CHECK_AND_SET_JUMP_OFFSET_AT(cx,cg,off) \
CHECK_AND_SET_JUMP_OFFSET(cx, cg, CG_CODE(cg,off), CG_OFFSET(cg) - (off))
extern JSBool
js_SetJumpOffset(JSContext *cx, JSCodeGenerator *cg, jsbytecode *pc,
ptrdiff_t off);
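CHECK_AND_SET_JUMP_OFFSET_AT implements the usual emit-then-backpatch pattern for forward jumps: it writes CG_OFFSET(cg) - (off), the signed delta from the jump opcode to the current emission point, into the jump's immediate operand. A standalone sketch of that pattern (the fixed buffer, helper names, and big-endian 16-bit operand are assumptions of this model, not SpiderMonkey API):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { CODE_MAX = 256 };

static uint8_t   code[CODE_MAX];
static ptrdiff_t next;                 /* analogue of CG_OFFSET(cg) */

/* Emit a jump opcode with a 2-byte placeholder operand; return the offset
   of the jump opcode so the caller can patch it later. */
static ptrdiff_t emit_jump(uint8_t op) {
    ptrdiff_t off = next;
    code[next++] = op;
    code[next++] = 0;                  /* placeholder hi byte */
    code[next++] = 0;                  /* placeholder lo byte */
    return off;
}

/* Patch the jump at off to target the current emission point, storing the
   delta CG_OFFSET(cg) - off, as CHECK_AND_SET_JUMP_OFFSET_AT does. */
static void set_jump_offset_at(ptrdiff_t off) {
    ptrdiff_t delta = next - off;
    code[off + 1] = (uint8_t)(delta >> 8);
    code[off + 2] = (uint8_t)delta;
}

static int16_t get_jump_offset(ptrdiff_t off) {
    return (int16_t)((code[off + 1] << 8) | code[off + 2]);
}
```

The delta is relative to the jump opcode's own offset, so the interpreter can compute the target as pc + operand.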
/* Test whether we're in a with statement. */
extern JSBool
js_InWithStatement(JSTreeContext *tc);
/* Test whether we're in a catch block with exception named by atom. */
extern JSBool
js_InCatchBlock(JSTreeContext *tc, JSAtom *atom);
/*
* Push the C-stack-allocated struct at stmt onto the stmtInfo stack.
*/
extern void
js_PushStatement(JSTreeContext *tc, JSStmtInfo *stmt, JSStmtType type,
ptrdiff_t top);
/*
* Pop tc->topStmt. If the top JSStmtInfo struct is not stack-allocated, it
* is up to the caller to free it.
*/
extern void
js_PopStatement(JSTreeContext *tc);
/*
 * Like js_PopStatement(&cg->treeContext), but also patch breaks and continues.
* May fail if a jump offset overflows.
*/
extern JSBool
js_PopStatementCG(JSContext *cx, JSCodeGenerator *cg);
/*
* Define and lookup a primitive jsval associated with the const named by atom.
* js_DefineCompileTimeConstant analyzes the constant-folded initializer at pn
* and saves the const's value in cg->constList, if it can be used at compile
* time. It returns true unless an error occurred.
*
* If the initializer's value could not be saved, js_LookupCompileTimeConstant
* calls will return the undefined value. js_LookupCompileTimeConstant tries
* to find a const value memorized for atom, returning true with *vp set to a
* value other than undefined if the constant was found, true with *vp set to
* JSVAL_VOID if not found, and false on error.
*/
extern JSBool
js_DefineCompileTimeConstant(JSContext *cx, JSCodeGenerator *cg, JSAtom *atom,
JSParseNode *pn);
extern JSBool
js_LookupCompileTimeConstant(JSContext *cx, JSCodeGenerator *cg, JSAtom *atom,
jsval *vp);
/*
* Emit code into cg for the tree rooted at pn.
*/
extern JSBool
js_EmitTree(JSContext *cx, JSCodeGenerator *cg, JSParseNode *pn);
/*
* Emit code into cg for the tree rooted at body, then create a persistent
* script for fun from cg.
*/
extern JSBool
js_EmitFunctionBody(JSContext *cx, JSCodeGenerator *cg, JSParseNode *body,
JSFunction *fun);
/*
* Source notes generated along with bytecode for decompiling and debugging.
* A source note is a uint8 with 5 bits of type and 3 of offset from the pc of
* the previous note. If 3 bits of offset aren't enough, extended delta notes
* (SRC_XDELTA) consisting of 2 set high order bits followed by 6 offset bits
* are emitted before the next note. Some notes have operand offsets encoded
* immediately after them, in note bytes or byte-triples.
*
* Source Note Extended Delta
* +7-6-5-4-3+2-1-0+ +7-6-5+4-3-2-1-0+
* |note-type|delta| |1 1| ext-delta |
* +---------+-----+ +---+-----------+
*
* At most one "gettable" note (i.e., a note of type other than SRC_NEWLINE,
* SRC_SETLINE, and SRC_XDELTA) applies to a given bytecode.
*
* NB: the js_SrcNoteSpec array in jsemit.c is indexed by this enum, so its
* initializers need to match the order here.
*/
typedef enum JSSrcNoteType {
SRC_NULL = 0, /* terminates a note vector */
SRC_IF = 1, /* JSOP_IFEQ bytecode is from an if-then */
SRC_IF_ELSE = 2, /* JSOP_IFEQ bytecode is from an if-then-else */
SRC_WHILE = 3, /* JSOP_IFEQ is from a while loop */
SRC_FOR = 4, /* JSOP_NOP or JSOP_POP in for loop head */
SRC_CONTINUE = 5, /* JSOP_GOTO is a continue, not a break;
also used on JSOP_ENDINIT if extra comma
at end of array literal: [1,2,,] */
SRC_VAR = 6, /* JSOP_NAME/SETNAME/FORNAME in a var decl */
SRC_PCDELTA = 7, /* offset from comma-operator to next POP,
or from CONDSWITCH to first CASE opcode */
SRC_ASSIGNOP = 8, /* += or another assign-op follows */
SRC_COND = 9, /* JSOP_IFEQ is from conditional ?: operator */
SRC_RESERVED0 = 10, /* reserved for future use */
SRC_HIDDEN = 11, /* opcode shouldn't be decompiled */
SRC_PCBASE = 12, /* offset of first obj.prop.subprop bytecode */
SRC_LABEL = 13, /* JSOP_NOP for label: with atomid immediate */
SRC_LABELBRACE = 14, /* JSOP_NOP for label: {...} begin brace */
SRC_ENDBRACE = 15, /* JSOP_NOP for label: {...} end brace */
SRC_BREAK2LABEL = 16, /* JSOP_GOTO for 'break label' with atomid */
SRC_CONT2LABEL = 17, /* JSOP_GOTO for 'continue label' with atomid */
SRC_SWITCH = 18, /* JSOP_*SWITCH with offset to end of switch,
2nd off to first JSOP_CASE if condswitch */
SRC_FUNCDEF = 19, /* JSOP_NOP for function f() with atomid */
SRC_CATCH = 20, /* catch block has guard */
SRC_CONST = 21, /* JSOP_SETCONST in a const decl */
SRC_NEWLINE = 22, /* bytecode follows a source newline */
SRC_SETLINE = 23, /* a file-absolute source line number note */
SRC_XDELTA = 24 /* 24-31 are for extended delta notes */
} JSSrcNoteType;
#define SN_TYPE_BITS 5
#define SN_DELTA_BITS 3
#define SN_XDELTA_BITS 6
#define SN_TYPE_MASK (JS_BITMASK(SN_TYPE_BITS) << SN_DELTA_BITS)
#define SN_DELTA_MASK ((ptrdiff_t)JS_BITMASK(SN_DELTA_BITS))
#define SN_XDELTA_MASK ((ptrdiff_t)JS_BITMASK(SN_XDELTA_BITS))
#define SN_MAKE_NOTE(sn,t,d) (*(sn) = (jssrcnote) \
(((t) << SN_DELTA_BITS) \
| ((d) & SN_DELTA_MASK)))
#define SN_MAKE_XDELTA(sn,d) (*(sn) = (jssrcnote) \
((SRC_XDELTA << SN_DELTA_BITS) \
| ((d) & SN_XDELTA_MASK)))
#define SN_IS_XDELTA(sn) ((*(sn) >> SN_DELTA_BITS) >= SRC_XDELTA)
#define SN_TYPE(sn) (SN_IS_XDELTA(sn) ? SRC_XDELTA \
: *(sn) >> SN_DELTA_BITS)
#define SN_SET_TYPE(sn,type) SN_MAKE_NOTE(sn, type, SN_DELTA(sn))
#define SN_IS_GETTABLE(sn) (SN_TYPE(sn) < SRC_NEWLINE)
#define SN_DELTA(sn) ((ptrdiff_t)(SN_IS_XDELTA(sn) \
? *(sn) & SN_XDELTA_MASK \
: *(sn) & SN_DELTA_MASK))
#define SN_SET_DELTA(sn,delta) (SN_IS_XDELTA(sn) \
? SN_MAKE_XDELTA(sn, delta) \
: SN_MAKE_NOTE(sn, SN_TYPE(sn), delta))
#define SN_DELTA_LIMIT ((ptrdiff_t)JS_BIT(SN_DELTA_BITS))
#define SN_XDELTA_LIMIT ((ptrdiff_t)JS_BIT(SN_XDELTA_BITS))
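The SN_* macros above pack a note into one byte: type in the top 5 bits, delta in the low 3, and any byte whose top two bits are both set ((*sn >> SN_DELTA_BITS) >= SRC_XDELTA) is read as an extended delta carrying 6 delta bits. A standalone re-derivation of that packing, with plain functions in place of the macros (illustrative names):

```c
#include <assert.h>

enum { DELTA_BITS = 3, XDELTA = 24 };   /* SN_DELTA_BITS, SRC_XDELTA */

/* Ordinary note: 5-bit type, 3-bit delta. */
static unsigned char make_note(unsigned type, unsigned delta) {
    return (unsigned char)((type << DELTA_BITS) | (delta & 0x7));
}

/* Extended delta note: top two bits set, 6 bits of delta. */
static unsigned char make_xdelta(unsigned delta) {
    return (unsigned char)((XDELTA << DELTA_BITS) | (delta & 0x3f));
}

static int is_xdelta(unsigned char sn) {
    return (sn >> DELTA_BITS) >= XDELTA;   /* types 24-31 are all XDELTA */
}

static unsigned note_type(unsigned char sn) {
    return is_xdelta(sn) ? XDELTA : sn >> DELTA_BITS;
}

static unsigned note_delta(unsigned char sn) {
    return is_xdelta(sn) ? (sn & 0x3f) : (sn & 0x7);
}
```

Because XDELTA << DELTA_BITS is 0xC0, every byte in [0xC0,0xFF] decodes as an extended delta, which is why types 24-31 are all reserved for SRC_XDELTA.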
/*
 * Offset fields follow certain notes and are frequency-encoded: an offset in
 * [0,0x7f] consumes one byte; an offset in [0x80,0x7fffff] consumes three
 * bytes, with the high bit of the first byte set to flag the long form.
*/
#define SN_3BYTE_OFFSET_FLAG 0x80
#define SN_3BYTE_OFFSET_MASK 0x7f
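A sketch of this one-or-three-byte offset encoding as standalone helpers (function names are illustrative, not from the source): a small offset is stored as-is in one byte, while a large one stores its top 7 bits in the first byte with SN_3BYTE_OFFSET_FLAG set, followed by two more bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Encode off into p; return the number of bytes consumed (1 or 3). */
static int encode_offset(uint8_t *p, uint32_t off) {
    if (off <= 0x7f) {
        p[0] = (uint8_t)off;               /* short form: high bit clear */
        return 1;
    }
    p[0] = (uint8_t)(0x80 | (off >> 16));  /* flag + bits 22..16 */
    p[1] = (uint8_t)(off >> 8);
    p[2] = (uint8_t)off;
    return 3;
}

/* Decode an offset encoded by encode_offset. */
static uint32_t decode_offset(const uint8_t *p) {
    if (!(p[0] & 0x80))
        return p[0];
    return ((uint32_t)(p[0] & 0x7f) << 16) | ((uint32_t)p[1] << 8) | p[2];
}
```

Offsets above 0x7fffff do not fit in the 3-byte form, which is why the emitter reports a too-large statement at that limit.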
- jsopcode.h, jsopcode.tbl The new extended jump opcodes, formats, and fundamental parameterization macros. Also, more comments. - jsparse.c Random and probably only aesthetic fix to avoid decorating a foo[i]++ or --foo[i] parse tree node with JSOP_SETCALL, wrongly (only foo(i)++ or --foo(i), or the other post- or prefix form operator, should have such an opcode decoration on its parse tree). - jsscript.h Random macro naming sanity: use trailing _ rather than leading _ for macro local variables in order to avoid invading the standard C global namespace.
2001-10-17 07:16:48 +04:00
typedef struct JSSrcNoteSpec {
    const char      *name;      /* name for disassembly/debugging output */
    uint8           arity;      /* number of offset operands */
    uint8           offsetBias; /* bias of offset(s) from annotated pc */
    int8            isSpanDep;  /* 1 or -1 if offsets could span extended ops,
                                   0 otherwise; sign tells span direction */
} JSSrcNoteSpec;
extern JS_FRIEND_DATA(JSSrcNoteSpec) js_SrcNoteSpec[];
extern JS_FRIEND_API(uintN) js_SrcNoteLength(jssrcnote *sn);
#define SN_LENGTH(sn)   ((js_SrcNoteSpec[SN_TYPE(sn)].arity == 0) ? 1        \
                         : js_SrcNoteLength(sn))
#define SN_NEXT(sn) ((sn) + SN_LENGTH(sn))
/* A source note array is terminated by an all-zero element. */
#define SN_MAKE_TERMINATOR(sn) (*(sn) = SRC_NULL)
#define SN_IS_TERMINATOR(sn) (*(sn) == SRC_NULL)
/*
 * Append a new source note of the given type (and therefore size) to cg's
 * notes dynamic array, updating cg->noteCount. Return the new note's index
 * within the array pointed at by cg->current->notes. Return -1 if out of
 * memory.
 */
extern intN
js_NewSrcNote(JSContext *cx, JSCodeGenerator *cg, JSSrcNoteType type);
extern intN
js_NewSrcNote2(JSContext *cx, JSCodeGenerator *cg, JSSrcNoteType type,
               ptrdiff_t offset);
extern intN
js_NewSrcNote3(JSContext *cx, JSCodeGenerator *cg, JSSrcNoteType type,
               ptrdiff_t offset1, ptrdiff_t offset2);
/*
 * NB: this function can add at most one extra extended delta note.
 */
extern jssrcnote *
js_AddToSrcNoteDelta(JSContext *cx, JSCodeGenerator *cg, jssrcnote *sn,
                     ptrdiff_t delta);
/*
 * Get and set the offset operand identified by which (0 for the first, etc.).
 */
extern JS_FRIEND_API(ptrdiff_t)
js_GetSrcNoteOffset(jssrcnote *sn, uintN which);
extern JSBool
js_SetSrcNoteOffset(JSContext *cx, JSCodeGenerator *cg, uintN index,
                    uintN which, ptrdiff_t offset);
/*
 * Finish taking source notes in cx's notePool, copying final notes to the new
 * stable store allocated by the caller and passed in via notes. Return false
 * on malloc failure, which means this function reported an error.
 *
 * To compute the number of jssrcnotes to allocate and pass in via notes, use
 * the CG_COUNT_FINAL_SRCNOTES macro. This macro knows a lot about details of
 * js_FinishTakingSrcNotes, SO DON'T CHANGE jsemit.c's js_FinishTakingSrcNotes
 * FUNCTION WITHOUT CHECKING WHETHER THIS MACRO NEEDS CORRESPONDING CHANGES!
 */
#define CG_COUNT_FINAL_SRCNOTES(cg, cnt)                                      \
    JS_BEGIN_MACRO                                                            \
        ptrdiff_t diff_ = CG_PROLOG_OFFSET(cg) - (cg)->prolog.lastNoteOffset; \
        cnt = (cg)->prolog.noteCount + (cg)->main.noteCount + 1;              \
        if ((cg)->prolog.noteCount &&                                         \
            (cg)->prolog.currentLine != (cg)->firstLine) {                    \
            if (diff_ > SN_DELTA_MASK)                                        \
                cnt += JS_HOWMANY(diff_ - SN_DELTA_MASK, SN_XDELTA_MASK);     \
            cnt += 2 + (((cg)->firstLine > SN_3BYTE_OFFSET_MASK) << 1);       \
        } else if (diff_ > 0) {                                               \
            if (cg->main.noteCount) {                                         \
                jssrcnote *sn_ = (cg)->main.notes;                            \
                diff_ -= SN_IS_XDELTA(sn_)                                    \
                         ? SN_XDELTA_MASK - (*sn_ & SN_XDELTA_MASK)           \
                         : SN_DELTA_MASK - (*sn_ & SN_DELTA_MASK);            \
            }                                                                 \
            if (diff_ > 0)                                                    \
                cnt += JS_HOWMANY(diff_, SN_XDELTA_MASK);                     \
        }                                                                     \
    JS_END_MACRO
extern JSBool
js_FinishTakingSrcNotes(JSContext *cx, JSCodeGenerator *cg, jssrcnote *notes);
/*
 * Allocate cg->treeContext.tryCount notes (plus one for the end sentinel)
 * from cx->tempPool and set up cg->tryBase/tryNext for exactly tryCount
 * js_NewTryNote calls. The storage is freed by js_FinishCodeGenerator.
 */
extern JSBool
js_AllocTryNotes(JSContext *cx, JSCodeGenerator *cg);
/*
 * Grab the next trynote slot in cg, filling it in appropriately.
 */
extern JSTryNote *
js_NewTryNote(JSContext *cx, JSCodeGenerator *cg, ptrdiff_t start,
              ptrdiff_t end, ptrdiff_t catchStart);
/*
 * Finish generating exception information into the space at notes. As with
 * js_FinishTakingSrcNotes, the caller must use CG_COUNT_FINAL_TRYNOTES(cg) to
 * preallocate enough space in a JSTryNote[] to pass as the notes parameter of
 * js_FinishTakingTryNotes.
 */
#define CG_COUNT_FINAL_TRYNOTES(cg, cnt)                                      \
    JS_BEGIN_MACRO                                                            \
        cnt = ((cg)->tryNext > (cg)->tryBase)                                 \
            ? PTRDIFF(cg->tryNext, cg->tryBase, JSTryNote) + 1                \
            : 0;                                                              \
    JS_END_MACRO
extern void
js_FinishTakingTryNotes(JSContext *cx, JSCodeGenerator *cg, JSTryNote *notes);
JS_END_EXTERN_C
#endif /* jsemit_h___ */