Layout System Overview
Layout's Job: Provide the Presentation
Layout is primarily concerned with providing a presentation to an HTML or
XML document. This presentation is typically formatted in accordance with
the requirements of the CSS1 and CSS2 specifications from the W3C.
Presentation formatting is also required to provide compatibility
with legacy browsers (Microsoft Internet Explorer and Netscape Navigator
4.x). The decision about when to apply CSS-specified formatting and when
to apply legacy formatting is controlled by the document's DOCTYPE specification.
These layout modes are referred to as 'Standards' and 'NavQuirks' modes. (DOCTYPE
and modes are explained in more detail here).
The presentation generally is constrained by the width of the window in which
the presentation is to be displayed, and a height that extends as far as
necessary. This is referred to as the Galley Mode presentation, and is what
one expects from a typical browser. Additionally, layout must support a paginated
presentation, where the width of the presentation is constrained by the dimensions
of the printed output (paper) and the height of each page is fixed. This
paged presentation presents several challenges not present in the galley
presentation, namely how to break up elements that are larger than a single
page, and how to handle changes to page dimensions.
The original design of the Layout system allowed for multiple, possibly different,
presentations to be supported simultaneously from the same content model.
In other words, the same HTML or XML document could be viewed as a normal
galley presentation in a browser window, while simultaneously being presented
in a paged presentation to a printer, or even an aural presentation to a
speech-synthesizer. To date the only real use of this multiple presentation
ability is seen in printing, where multiple presentations are managed, all
connected to the same content model. (note: it is unclear if this is really
a benefit - it may have been better to use a copy of the content model for
each presentation, and to remove the complexity of managing separate presentations
- analysis is needed here). The idea of supporting a non-visual presentation
is interesting. Layout's support for aural presentations is undeveloped,
though conceptually, it is possible and supported by the architecture.
How Layout Does its Job: Frames and Reflow
So, layout is concerned with providing a presentation, in galley or paged
mode. Given a content model, how does the layout system actually create the
presentation? Through the creation and management of frames. Frames are an
encapsulation of a region on the screen, a region that contains geometry
(size and location, stacking order). Generally frames correspond to the content
elements, though there is often a one-to-many correspondence between content
elements and frames. Layout creates frames for content based on either the
specific HTML rules for an element or based on the CSS display type of the
element. In the case of the HTML-specific elements, the frame types that
correspond to the element are hard-coded, but in the more general case where
the display type is needed, the layout system must determine that display
type by using the Style System. A content element is passed to the Style
System and a request is made to resolve the style for that element. This
causes the Style System to apply all style rules that correspond to the element
and results in a resolved Style Context - the style data specific to that
element. The Layout module looks at the 'display' field of the style context
to determine what kind of frame to create (block, inline, table, etc.). The
style context is associated with the frame via a reference because it is
needed for many other computations during the formatting of the frames.
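The frame-type decision described above can be sketched roughly as follows. This is a simplified Python model, not the actual implementation: the names `resolve_style`, `frame_type_for`, the lookup tables, and the frame type strings are all hypothetical, and the 'Style System' here is reduced to a dictionary lookup plus optional author rules.

```python
# Hypothetical stand-in for the Style System: the default 'display' value
# per tag. Real style resolution applies every matching style rule.
DEFAULT_DISPLAY = {"p": "block", "div": "block", "b": "inline", "table": "table"}

# Elements whose frame type is hard-coded by HTML rules (illustrative names).
HTML_SPECIFIC_FRAMES = {"img": "ImageFrame", "input": "FormControlFrame"}

# Frame type chosen from the CSS display type (illustrative names).
DISPLAY_FRAMES = {"block": "BlockFrame", "inline": "InlineFrame", "table": "TableFrame"}


def resolve_style(tag, author_rules=None):
    """Return a minimal 'style context' (a plain dict) for an element."""
    style = {"display": DEFAULT_DISPLAY.get(tag, "inline")}
    style.update((author_rules or {}).get(tag, {}))
    return style


def frame_type_for(tag, author_rules=None):
    """Pick a frame type: HTML-specific rule first, else the display type.

    The resolved style is returned alongside the frame type because the
    frame keeps a reference to it for later formatting computations.
    Display types missing from DISPLAY_FRAMES are not modeled here.
    """
    style = resolve_style(tag, author_rules)
    if tag in HTML_SPECIFIC_FRAMES:
        kind = HTML_SPECIFIC_FRAMES[tag]
    else:
        kind = DISPLAY_FRAMES[style["display"]]
    return kind, style
```

With these assumed tables, `frame_type_for("p")` yields a block frame, while an author rule such as `{"b": {"display": "block"}}` changes the frame chosen for a `b` element from inline to block.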
Once a frame is created for a content element, it must be formatted. We refer
to this as 'laying out' the frame, or as 'reflowing' the frame. Each frame
implements a Reflow method to compute its position and size, among other
things. For more details on the Reflow mechanism, see the Reflow Overview
document... The CSS formatting requirements present two distinct layout
models: 1) the in-flow model, where the geometry of an element is influenced
by the geometry of the elements that precede it, and 2) the positioned model,
where the geometry of an element is not influenced by the geometry of the
elements that precede it, or in any case, is influenced more locally. The
in-flow case is considered the 'normal' case, and corresponds to normal
HTML formatting. The latter case, called 'out of flow', puts the document author
in control of the layout, and the author must specify the locations and sizes
of all of the elements that are positioned. There is, of course, some complexity
involved with managing these two models simultaneously...
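To make the difference between the two models concrete, here is a minimal sketch that considers only vertical placement. It assumes a hypothetical `Box` type with a fixed height and an optional author-specified `top`; real CSS positioning involves containing blocks, offsets on all sides, and much more.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Box:
    height: int
    top: Optional[int] = None  # set only for positioned (out-of-flow) boxes
    y: int = 0                 # computed vertical position


def layout_vertical(boxes):
    """Place boxes top to bottom and return their computed y positions.

    In-flow boxes start where the preceding in-flow content ended.
    Positioned boxes use the author-specified 'top' and do not push
    later in-flow boxes down.
    """
    cursor = 0
    for box in boxes:
        if box.top is not None:
            box.y = box.top
        else:
            box.y = cursor
            cursor += box.height
    return [box.y for box in boxes]
```

For example, `layout_vertical([Box(20), Box(30, top=100), Box(10)])` places the third box at 20, directly after the first, because the positioned box in between takes no part in the flow.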
So far the general flow of layout looks like this:
1) Obtain a document's content model
2) Utilize the Style System to resolve the style of each element in the content
model
3) Construct the frames that correspond to the content model, according to
the resolved style data.
4) Perform the initial layout, or initial reflow, on the newly constructed
frame.
This is pretty straightforward, but is complicated somewhat by the notion
of incrementalism. One of the goals of the Layout system's design
is to create parts of the presentation as they become available, rather than
waiting for the entire document to be read, parsed, and then presented. This
is a major benefit for large documents because the user does not have to wait
for the 200th page of text to be read in before the first page can be displayed
- they can start reading something right away. So really, this sequence of
operations (Resolve Style, Create Frame, Layout Frame) gets repeated many
times as the content becomes available. In the normal in-flow case this is
quite natural because the sequential addition of new content results in sequential
addition of new frames, and because everything is in-flow, the new frames
do not influence the geometry of the frames that have already been formatted.
When out-of-flow frames are present this is a more difficult problem, however.
Sometimes a content element comes in incrementally, and invalidates the formatting
of some of the frames that precede it, frames that have already been formatted.
In this case the Layout System has to detect that impact and reflow again
the impacted frames. This is referred to as an incremental reflow.
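The repeated Resolve Style, Create Frame, Layout Frame sequence, together with the extra work caused by out-of-flow content, might be modeled like this. The sketch is deliberately crude: elements are plain dicts, the `positioned` key is a hypothetical flag, and an out-of-flow frame is assumed to disturb every earlier frame, whereas the real system detects which frames are actually impacted.

```python
def build_incrementally(chunks):
    """Process content elements as they arrive; return a log of layout work."""
    frames, log = [], []
    for element in chunks:
        # 1) Resolve style (reduced here to an in-flow / out-of-flow flag).
        style = {"in_flow": not element.get("positioned", False)}
        # 2) Create the frame for the new content.
        frame = {"content": element["name"], "style": style}
        frames.append(frame)
        # 3) Lay out (reflow) the new frame.
        log.append(("reflow", frame["content"]))
        # Out-of-flow content may invalidate frames that were already
        # formatted. Assumption: all earlier frames are reflowed again.
        if not style["in_flow"]:
            for earlier in frames[:-1]:
                log.append(("incremental-reflow", earlier["content"]))
    return log
```

When every element is in-flow, the log contains exactly one reflow per element, which is why sequential in-flow content is the cheap case.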
Another responsibility of layout is to manage
dynamic changes to the content model, changes that occur after the document
has been loaded and (possibly) presented. These dynamic changes are caused
by manipulations of the content model via the DOM (generally
through JavaScript). When a content element is modified, added or removed,
Layout is notified. If content elements have been inserted, new frames are
created and formatted (and impacts to other frames are managed by performing
further incremental reflows). If content is removed, the corresponding frames
are destroyed (again, impacts to other elements are managed). If a content
element is modified, layout must determine if the change influences the formatting
of that or other elements' presentations, and must then reformat, or re-reflow,
the impacted elements. In all cases, the determination of the impact is critical
to avoid either the problem of not updating impacted elements, thus presenting
an invalid presentation, or updating too much of the presentation and thus
doing too much work, potentially causing performance problems.
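A rough sketch of that impact determination follows. Everything in it is an assumption made for illustration: the notification kinds, the `handle_change` and `needs_reflow` names, and the small set of properties treated as affecting geometry. Impacts on neighbouring frames are not modeled.

```python
# Assumption: an illustrative subset of properties whose change alters geometry.
GEOMETRY_PROPERTIES = frozenset({"display", "width", "height", "font-size"})


def needs_reflow(old_style, new_style):
    """True if any property that differs between the styles affects geometry."""
    changed = {
        key
        for key in old_style.keys() | new_style.keys()
        if old_style.get(key) != new_style.get(key)
    }
    return bool(changed & GEOMETRY_PROPERTIES)


def handle_change(frames, kind, element, new_style=None):
    """Apply one content notification; return the elements to reflow.

    'frames' maps an element name to the style its frame was built with.
    """
    if kind == "inserted":
        frames[element] = new_style or {}
        return [element]          # a new frame must be created and formatted
    if kind == "removed":
        frames.pop(element, None)
        return []                 # the frame is destroyed
    if kind == "modified":
        old_style = frames[element]
        frames[element] = new_style
        return [element] if needs_reflow(old_style, new_style) else []
    raise ValueError(f"unknown notification kind: {kind}")
```

Under these assumptions a color change schedules no reflow, while a width change does, which is the kind of distinction that keeps layout from doing too much work.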
One very special case of dynamic content manipulation is the HTML Editor.
Layout is used to implement both a full-blown WYSIWYG HTML editor, and single-line
and multi-line text entry controls. In both cases, the content is manipulated
by the user (via the DOM) and the resulting visual impacts must be shown as
quickly as possible, without disconcerting flicker or other artifacts that
might bother the user. Consider a text entry field: the user types text into
a form on the web. As the user types a new character it is inserted into
the content model. This causes layout to be notified that a new piece of
content has been entered, which causes Layout to create a new frame and format
it. This must happen very fast, so the user's typing is not delayed. In the
case of the WYSIWYG HTML editor, the user expects that the modifications
they make to the document will appear immediately, not seconds later. This
is especially critical when the user is typing into the document: it would
be quite unusable if typing a character at the end of a document in the HTML
editor caused the entire document to be reformatted - it would be too slow,
at least on low-end machines. Thus the HTML editor and text controls put
considerable performance requirements on the layout system's handling of dynamic
content manipulation.
The Fundamentals of Frames: Block and Line
There are many types of frames that are designed to model various formatting
requirements of different HTML and XML elements. CSS2 defines several (block,
inline, list-item, marker, run-in, compact, and various table types) and
the standard HTML form controls require their own special frame types to
be formatted as expected. The most essential frame types are Block and Inline,
and these correspond to the most important Layout concepts, the Block and
Line.
A block is a rectangular region that is composed of one or more lines. A
line is a single row of text or other presentational elements. All layout
happens in the context of a block, and all of the contents of a block are
formatted into lines within that block. As the width of a block is changed,
the contents of the lines must be reformatted. Consider for example a large
block of text sitting in a paragraph element:
<p>
We need documentation for users, web developers, and developers working
on Mozilla. If you write your own code, document it. Much of the
existing code <b>isn’t very well documented</b>. In the process of figuring
things out, try and document your discoveries.
If you’d like to contribute, let us know.
</p>
There is one block that corresponds to the <p> element, and then a number
of lines of text that correspond to the text. As the width of the block changes
(due to the window being resized, for example) the length of the lines within
it changes, and thus more or less text appears on each line. The block is
responsible for managing the lines. Note that lines may contain only inline
elements, whereas a block may contain both inline elements and other blocks.
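The relationship between a block's width and its lines can be illustrated with a greedy line breaker. This is only a sketch: `break_into_lines` is a hypothetical helper, widths are measured in characters rather than rendered glyph widths, and inline markup such as the `<b>` element is ignored.

```python
def break_into_lines(text, block_width):
    """Greedily pack words into lines no wider than block_width characters.

    A single word longer than block_width is placed on a line by itself
    rather than being split.
    """
    lines, current = [], ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if len(candidate) <= block_width or not current:
            current = candidate
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines
```

For the sentence "If you write your own code, document it." a block width of 20 produces three lines, while a width of 80 fits everything on one, mirroring what happens when the window is resized.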
Other Layout Models: XUL
In addition to managing CSS-defined formatting, the layout system provides
a way to integrate other layout schemes into the presentation. Currently
layout supports the formatting of XUL elements, which utilize a constraint-based
layout language. The Box is introduced into the layout system's frame model
via an adapter (BoxToBlockAdapter) that translates the normal layout model
into the box formatting model. Conceptually, this could be used to introduce
other layout systems, but it might be worth noting that there was no specific
facility designed into the layout system to accommodate this. Layout deals
with frames, but as long as the layout system being considered has no need
to influence presentational elements from other layout systems, it can be
adapted using a frame-based adapter, à la XUL.
Core Classes
At the highest level, the layout system is a group of classes that manages
the presentation within a fixed width and either unlimited height (galley
presentation) or discrete page heights (paged presentation). Digging just
a tiny bit deeper into the system we find that the complexity (and interest)
mushrooms very rapidly. The idea of formatting text and graphics to fit within
a given screen area sounds simple, but the interaction of in-flow and out-of-flow
elements, the considerations of incremental page rendering, and the performance
concerns of dynamic content changes make for a system that has a lot of
work to do, and a lot of data to manage. Here are the high-level classes
that make up the layout system. Of course this is a small percentage of the
total classes in layout (see the detailed design documents for the details
on all of the classes, in the context of their actual role).
Presentation Shell / Presentation Context
Together the presentation shell and the presentation context provide the
root of the current presentation. The original design provided for a single
Presentation Shell to manage multiple Presentation Contexts, to allow a single
shell to handle multiple presentations. It is unclear if this is really possible,
however, and in general it is assumed that there is a one-to-one correspondence
between a presentation shell and a presentation context. The two classes
should probably be folded into one, or the distinction between them formalized
and utilized in the code. The Presentation Shell currently owns a controlling
reference to the Presentation Context. Further references to the Presentation
Shell and Presentation Context will be made by using the term Presentation
Shell.
The Presentation Shell is the root of the presentation, and as such it owns
and manages a lot of layout objects that are used to create and maintain a
presentation (note that the DocumentViewer is the owner
of the Presentation Shell, and in some cases the creator of the objects
used by the Presentation Shell to manage the presentation. More details
of the Document Viewer are needed here...). The Presentation Shell,
or PresShell, is first and foremost the owner of the formatting objects,
the frames. Management of the frames is facilitated by the Frame Manager,
an instance of which the PresShell owns. Additionally, the PresShell provides
a specialized storage heap for frames, called an Arena, that is used to make
allocation and deallocation of frames faster and less likely to fragment
the global heap.
The Presentation Shell also owns the root of the Style System, the Style
Set. In many cases the Presentation Shell provides pass-through methods to
the Style Set, and generally uses the Style Set to do style resolution and
Style Sheet management.
One of the critical aspects of the Presentation Shell is the handling of
frame formatting, or reflow. The Presentation Shell owns and maintains a
Reflow Queue where requests for reflow are held until it is time to perform
a reflow, and then pulled out and executed.
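The hold-then-execute behaviour of the Reflow Queue can be sketched as follows. The class and method names are hypothetical, and dropping duplicate requests for the same frame is an assumption of this sketch rather than something stated about the real queue.

```python
from collections import deque


class ReflowQueue:
    """Holds reflow requests until the owner decides to process them."""

    def __init__(self):
        self._pending = deque()

    def post(self, frame_id):
        """Queue a reflow request (assumption: duplicates are coalesced)."""
        if frame_id not in self._pending:
            self._pending.append(frame_id)

    def process(self, reflow):
        """Pull requests out in order, run 'reflow' on each, and return them."""
        processed = []
        while self._pending:
            frame_id = self._pending.popleft()
            reflow(frame_id)
            processed.append(frame_id)
        return processed
```

Separating `post` from `process` lets many requests accumulate cheaply and be handled in one pass when it is time to perform a reflow.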
It is also important to see the Presentation Shell as an observer of many
kinds of events in the system. For example, the Presentation Shell receives
notifications of document load events, which are used to trigger updates
to the formatting of the frames in some cases. The Presentation Shell also
receives notifications about changes in cursor and focus states, whereby
the selection and caret updates can be made visible.
There are dozens of other data objects managed by the Presentation Shell
and Presentation Context, all necessary for the internal implementation.
These data members and their roles will be discussed in the Presentation
Shell design documents. For this overview, the Frames, Style Set, and Reflow
Queue are the most important high-level parts of the Presentation Shell.
Frame Manager
The Frame Manager is used to, well, manage frames. Frames are basic formatting
objects used in layout, and the Frame Manager is responsible for making frames
available to clients. There are several collections of frames maintained
by the Frame Manager. The most basic is a list of all of the frames starting
at the root frame. Clients generally do not want to incur the expense of
traversing all of the frames from the root to find the frame they are interested
in, so the Frame Manager provides some other mappings based on the needs of
the clients.
The most crucial mapping is the Primary Frame Map. This collection provides
access to a frame that is designated as the primary frame for a piece of
content. When a frame is created for a piece of content, it may be the 'primary
frame' for that content element (content elements that require multiple
frames have primary and secondary frames; only the primary frame is mapped).
The Frame Manager is then instructed to store the mapping from a content
element to the primary frame. This mapping facilitates updates to frames
that result from changes to content (see the discussion of dynamic content
changes above).
Another important mapping maintained by the Frame Manager is that of the
undisplayed content. When a content element is defined as having no display
(via the CSS property 'display:none') it is noted by a special entry in the
undisplayed map. This is important because no frame is generated for these
elements yet changes to their style values and to the content elements still
need to be handled by layout, in case their display state changes from 'none'
to something else. The Undisplayed Map keeps track of all content and style
data for elements that currently have no frames. (note:
the original architecture of the layout system included the creation of frames
for elements with no display. This changed somewhere along the line, but
there is no indication of why the change was made. Presumably it is more
time and space-efficient to prevent frame creation for elements with no display.)
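The Primary Frame Map and the Undisplayed Map can be pictured as two lookup tables keyed by content element. The sketch below is a simplified model with hypothetical method names; content elements and frames are represented by strings, and style data by plain dicts.

```python
class FrameManager:
    """Toy model of the two content-keyed mappings described above."""

    def __init__(self):
        self.primary_frames = {}  # content element -> its primary frame
        self.undisplayed = {}     # content element -> style of frameless content

    def set_primary_frame(self, content, frame):
        self.primary_frames[content] = frame

    def get_primary_frame(self, content):
        """Return the primary frame, or None if the content has no frame."""
        return self.primary_frames.get(content)

    def note_undisplayed(self, content, style):
        """Record content that has 'display: none' and therefore no frame."""
        self.undisplayed[content] = style

    def restyle(self, content, new_style):
        """Update an undisplayed element; report whether it now needs a frame."""
        if content not in self.undisplayed:
            return False
        if new_style.get("display", "inline") == "none":
            self.undisplayed[content] = new_style  # still no frame
            return False
        del self.undisplayed[content]
        return True  # the caller should now construct and register a frame
```

Keeping style data for frameless content is what allows a later change from 'none' to another display value to be noticed without searching the content model.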
The Frame Manager also maintains a list of Forms and Form Controls, as content
nodes. This is presumably related to the fact that layout is responsible
for form submission, but this is a design flaw that is being corrected by
moving form submission into the content world. These collections of Forms
and Form Controls should be removed eventually.
CSS Frame Constructor
The Frame Constructor is responsible for resolving style values for content
elements and creating the appropriate frames corresponding to that element.
In addition to managing the creation of frames, the Frame Constructor is
responsible for managing changes to the frames. Frame Construction is generally
achieved by the use of stateless methods, but in some cases there is the
need to provide context to frames created as children of a container. The
Frame Constructor uses the Frame Constructor State class to manage the important
information about the container of a frame being created (and lots of other state-stuff too - need to describe more
fully).
Frame
The Frame is the most fundamental layout object. The class nsFrame is the
base class for all frames, and it inherits from the base class nsIFrame (note:
nsIFrame is NOT an interface, it is an abstract base class. It was once an
interface but was changed to be a base class when the Style System was modified
- the name was not changed to reflect that it is not an interface). The Frame
provides generic functionality that can be used by subclasses but cannot
itself be instantiated.
nsFrame:
The Frame provides a mechanism to navigate to a parent frame as well as child
frames. All frames have a parent except for the root frame. The Frame is
able to provide a reference to its parent and to its children upon request.
The basic data that all frames maintain include: a rectangle describing the
dimensions of the frame, a pointer to the content that the frame is representing,
the style context representing all of the stylistic data corresponding to
the frame, a parent frame pointer, a sibling frame pointer, and a series
of state bits.
Frames are chained primarily by the sibling links. Given a frame, one can
walk the siblings of that frame, and can also navigate back up to the parent
frame. Specializations of the frame also allow for the management of child
frames; this functionality is provided by the Container Frame.
Container Frames:
The Container Frame is a specialization of the base frame class that introduces
the ability to manage a list of child frames. All frames that need to manage
child frames (i.e. frames that are not themselves leaf frames) derive from
Container Frame.
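The basic frame data and the sibling-chained child list can be sketched as below. This is a simplified Python model and not the nsFrame or Container Frame implementation: the attribute and method names are hypothetical, and unlike the real base class the `Frame` here can be instantiated directly so that it can stand in for leaf frames.

```python
class Frame:
    """Basic per-frame data: geometry, content, style, and tree links."""

    def __init__(self, content, rect=(0, 0, 0, 0)):
        self.rect = rect            # (x, y, width, height)
        self.content = content      # the content element this frame represents
        self.style = {}             # stand-in for the style context
        self.parent = None          # every frame but the root has a parent
        self.next_sibling = None    # frames are chained by sibling links
        self.state_bits = 0


class ContainerFrame(Frame):
    """A frame that additionally manages a list of child frames."""

    def __init__(self, content, rect=(0, 0, 0, 0)):
        super().__init__(content, rect)
        self.first_child = None

    def append_child(self, child):
        """Link 'child' at the end of the sibling chain."""
        child.parent = self
        if self.first_child is None:
            self.first_child = child
            return
        last = self.first_child
        while last.next_sibling is not None:
            last = last.next_sibling
        last.next_sibling = child

    def children(self):
        """Walk the child list by following sibling links."""
        frame = self.first_child
        while frame is not None:
            yield frame
            frame = frame.next_sibling
```

Storing only a first-child pointer plus sibling links keeps the per-frame data small, at the cost of a linear walk to reach the last child.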
Block Frame
Reflow State
Reflow Metrics
Space Manager
StyleSet
StyleContext
Document History:
- 05/20/2002 - Marc Attinasi: created, wrote highest level introduction
to general layout concepts, links to relevant specs and existing documents.