update overview and readme

2017-01-16 22:52:14 -08:00 · 2017-01-16 22:52:14 -08:00 · 6f0dd9b851
--- a/Overview.md
+++ b/Overview.md
@@ -1,67 +1,33 @@
 # Overview
-At a high level, the parser accepts source code as an input, and
+The syntax tree produced by the parser ensures two key attributes:
-produces a syntax tree as an output.
+1. **All source information is held in full fidelity.** This means that the tree contains every piece of 
 information found in the source text, every grammatical construct, every lexical token, and everything
 else in between including whitespace and comments. The syntax trees also represent errors in source code
 when the program is incomplete or malformed, by representing skipped or missing tokens in the syntax tree.
 2. **A syntax tree obtained from the parser is completely round-trippable back to the text it was parsed from.**
 From any syntax node, it is possible to get the text representation of the subtree rooted at that node.
 This means that syntax trees can be used as a way to construct and edit source text.
-If you're familiar with Roslyn and TypeScript, many of the concepts presented here will be familiar
+## Key Concepts
-(albeit adapted, to account for the unique runtime characteristics of PHP.)
+The **Syntax Tree** produced is literally a tree data structure, where non-terminal structural elements parent other
-
+elements. Each syntax tree is made up of **Nodes** (non-terminal elements) and
-## Syntax Tree
+ **Tokens** (terminal elements).
 A syntax tree is literally a tree data structure, where non-terminal structural 
 elements parent other elements. Each syntax tree is made up of Nodes (represented by circles), 
 Tokens (represented by squares), and trivia (not represented, below, but attached to each Token).
 ![image](https://cloud.githubusercontent.com/assets/762848/19092929/e10e60aa-8a3d-11e6-8b90-51eabe5d1d8e.png)
 Syntax trees have two key attributes.
 1. The first attribute is that Syntax trees hold all the source information in full fidelity. 
 This means that the syntax tree contains every piece of information 
 found in the source text, every grammatical construct, every lexical 
 token, and everything else in between including whitespace, comments, 
 and preprocessor directives. For example, each literal mentioned in 
 the source is represented exactly as it was typed. The syntax trees 
 also represent errors in source code when the program is incomplete 
 or malformed, by representing skipped or missing tokens in the syntax tree.
 2. This enables the second attribute of syntax trees. A syntax tree obtained 
 from the parser is completely round-trippable back to the text it was parsed 
 from. From any syntax node, it is possible to get the text representation of 
 the sub-tree rooted at that node. This means that syntax trees can be used 
 as a way to construct and edit source text. By creating a tree you have by 
 implication created the equivalent text, and by editing a syntax tree, 
 making a new tree out of changes to an existing tree, you have effectively 
 edited the text.
 The syntax tree is composed of Nodes (represented by circles), 
 Tokens (represented by squares), and Trivia (not represented directly, but attached to 
 individual Tokens)
 Additionally associated with each Node and Token is **Positional Information**, **Errors**, and **Comment + Whitespace Trivia**.
 All trees guarantee a set of **Invariants** - properties of the tree that always hold true, no matter what the
 input. This set of invariants provides a consistent foundation 
 that makes it easier to ensure the tree is "structurally sound", and confidently reason about the tree 
 as we continue to build up our understanding. For instance, one such invariant is that the original text 
 (including whitespace and comments) should always be reproducible from a Node. See [Invariants](Invariants.md)
 for a complete list. 
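
The round-trip invariant described above can be illustrated with a small, self-contained sketch. `ToyToken` and `ToyNode` are hypothetical classes invented for this example, not the parser's real types, and the sketch assumes PHP 8.0 or later. Each token carries its leading whitespace and comments, so concatenating the full text of every child reproduces the original source.

```php
<?php
// Hypothetical toy model for illustration only; these are NOT the parser's classes.
final class ToyToken {
    public function __construct(
        public string $leadingTrivia, // whitespace and comments that precede the token
        public string $text           // the token text itself
    ) {}

    public function getFullText(): string {
        return $this->leadingTrivia . $this->text;
    }
}

final class ToyNode {
    /** @param array<ToyNode|ToyToken> $children children in source order */
    public function __construct(public array $children) {}

    // The text of a subtree is the concatenation of its children's full text.
    public function getFullText(): string {
        $text = '';
        foreach ($this->children as $child) {
            $text .= $child->getFullText();
        }
        return $text;
    }
}

$source = "<?php /* greet */ echo  'hi';";

$tree = new ToyNode([
    new ToyToken('', '<?php'),
    new ToyNode([
        new ToyToken(' /* greet */ ', 'echo'),
        new ToyToken('  ', "'hi'"),
        new ToyToken('', ';'),
    ]),
]);

// Check that the tree reproduces the source, trivia included.
var_dump($tree->getFullText() === $source);
```

Because trivia lives on the tokens rather than in separate child nodes, the traversal does not need any special handling for comments or whitespace.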
 ## Tree Elements
 ### Nodes
 Syntax nodes are one of the primary elements of syntax trees. These nodes represent 
 syntactic constructs such as declarations, statements, clauses, and expressions. 
-Each category of syntax nodes is represented by a separate class derived from SyntaxNode. 
+Each category of syntax nodes is represented by a separate class derived from `Node`.
 The set of node classes is not extensible.
 All syntax nodes are non-terminal nodes in the syntax tree, which means they always have 
 other nodes and tokens as children. As a child of another node, each node has a parent node
 that can be accessed through the Parent property. Because nodes and trees are immutable, 
 the parent of a node never changes. The root of the tree has a null parent.
 Each node has a ChildNodes method, which returns a list of child nodes in sequential order 
 based on its position in the source text. This list does not contain tokens. Each node also
 has a collection of Descendant methods - such as DescendantNodes, DescendantTokens, or 
 DescendantTrivia - that represent a list of all the nodes, tokens, or trivia that exist in 
 the sub-tree rooted by that node.
 In addition, each syntax node subclass exposes all the same children through 
 properties. For example, a BinaryExpressionSyntax node class has three additional properties 
 specific to binary operators: Left, OperatorToken, and Right.
 Some syntax nodes have optional children. For example, an IfStatementSyntax has an optional 
 ElseClauseSyntax. If the child is not present, the property returns null.
 ### Tokens
 Syntax tokens are the terminals of the language grammar, representing the smallest syntactic 
@@ -72,24 +38,23 @@ For efficiency purposes, unlike syntax nodes, there is only one structure for al
 kinds of tokens with a mix of properties that have meaning depending on the kind 
 of token that is being represented.
-### Trivia
+### Whitespace and Comment Trivia
-Syntax trivia represent the parts of the source text that are largely insignificant for 
+Because whitespace and comment trivia are not part of the normal language syntax and can appear anywhere between 
 normal understanding of the code, such as whitespace, comments, and preprocessor directives.
 Because trivia are not part of the normal language syntax and can appear anywhere between 
 any two tokens, they are not included in the syntax tree as a child of a node. Yet, because 
 they are important when implementing a feature like refactoring and to maintain full 
 fidelity with the source text, they do exist as part of the syntax tree.
-You can access trivia by inspecting a token's LeadingTrivia. 
+You can access trivia by inspecting a token's LeadingWhitespaceAndComments. When source text is parsed,
-When source text is parsed, sequences of trivia are associated with tokens. 
+sequences of trivia are associated with tokens. 
-### Kinds
+### Positional Information
-Each node, token, or trivia has a RawKind property (represented by a numeric literal), 
+Each node, token, or trivia knows its position within the source text and the number of 
-that identifies the exact syntax element represented.
+characters it consists of. A text position is represented as a 32-bit integer, which is 
 a zero-based byte index into the string. The width corresponds to a count of characters,
 represented as integers. Zero-length refers to a location between two characters.
-The RawKind property allows for easy disambiguation of syntax node types that share the 
+For efficiency purposes, the position refers to the absolute position within the text, 
-same node class. For tokens and trivia, this property is the only way to distinguish 
+and a helper function is available if you require Line/Column information.
 one type of element from another.
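
As a rough illustration of how an absolute position can be turned into Line/Column information, here is a minimal sketch. `toyLineColumn` is a hypothetical function written for this example and is not the helper the parser ships. It assumes zero-based byte offsets, `\n` line endings, and columns counted in bytes.

```php
<?php
/**
 * Hypothetical helper for illustration: converts a zero-based byte offset
 * into a zero-based line/column pair. Columns are counted in bytes.
 */
function toyLineColumn(string $text, int $offset): array {
    $before = substr($text, 0, $offset);
    // The line number is the count of newlines before the offset.
    $line = substr_count($before, "\n");
    // The column is the distance from the last newline, or from the start of the text.
    $lastNewline = strrpos($before, "\n");
    $column = $lastNewline === false ? $offset : $offset - $lastNewline - 1;
    return ['line' => $line, 'column' => $column];
}

$text = "<?php\necho 1;\n";
$startOfEcho = toyLineColumn($text, 6);   // offset of the "e" in "echo"
$literal     = toyLineColumn($text, 11);  // offset of the "1"
```

Storing only the absolute offset keeps each element small; line and column are derived on demand from the source text.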
 ### Errors
 Even when the source text contains syntax errors, a full syntax tree that is round-trippable
@@ -101,23 +66,12 @@ insert a missing token into the syntax tree in the location that the token was e
 A missing token represents the actual token that was expected, but it has an empty span.
 Second, the parser may skip tokens until it finds one where it can continue parsing. 
-In this case, the skipped tokens that were skipped are attached as a trivia node with 
+In this case, the tokens that were skipped are attached to the tree as skipped tokens.
 the kind SkippedTokens.
 Note that the parser produces trees in a tolerant fashion, and will not produce errors for
 all incorrect constructs (e.g. including a non-constant expression as the default value of
 a method parameter). Instead, it attaches these errors on a post-parse walk of the tree.
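
The two recovery strategies can be sketched with a toy model. `ToyElement`, its kind names, and the ordering of the elements are assumptions made for this example, not the parser's real output, and the sketch assumes PHP 8.0 or later. A missing token has an empty span, and a skipped token keeps its original text, so the source still round-trips.

```php
<?php
// Hypothetical flat list of tree elements for illustration only.
final class ToyElement {
    public function __construct(
        public string $kind,
        public int $start,            // zero-based byte offset into the source
        public string $text,
        public bool $isMissing = false,
        public bool $isSkipped = false
    ) {}

    public function getWidth(): int {
        return strlen($this->text);
    }
}

// Malformed input: a stray ")" and no terminating ";".
$source = 'echo 1 )';

$elements = [
    new ToyElement('EchoKeyword', 0, 'echo'),
    new ToyElement('Whitespace', 4, ' '),
    new ToyElement('IntegerLiteral', 5, '1'),
    new ToyElement('Whitespace', 6, ' '),
    new ToyElement('CloseParen', 7, ')', isSkipped: true),  // unexpected, so skipped
    new ToyElement('Semicolon', 8, '', isMissing: true),    // expected, so inserted with no text
];

// Concatenating every element still yields the original source.
$rebuilt = implode('', array_map(fn (ToyElement $e) => $e->text, $elements));
```

The missing semicolon records what the grammar expected without inventing text that was never typed.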
 ### Positional Information
 Each node, token, or trivia knows its position within the source text and the number of 
 characters it consists of. A text position is represented as a 32-bit integer, which is 
 a zero-based Unicode character index. A TextSpan object is the beginning position and a 
 count of characters, both represented as integers. If TextSpan has a zero length, it refers
 to a location between two characters.
 The position refers to the absolute position within the text, but a helper function is available
 if you require Line/Column information. 
 ## Next Steps
-Check out the [Documentation](GettingStarted.md) section for more information on how consume
+Check out the [Readme](Readme.md) for more information on how to consume
 the parser, or the [How It Works](HowItWorks.md) section if you want to dive deeper into the implementation.
--- a/README.md
+++ b/README.md
@@ -54,7 +54,9 @@ foreach ($astNode->getDescendantNodes() as $descendant) {
 }
 ```
-> Note: The API is still a work in progress, and will evolve according to user feedback.
+> Note: [the API](ApiDocumentation.md) is not yet finalized, so please file issues to let us know what functionality you want exposed, 
 and we'll see what we can do! Also please file any bugs with unexpected behavior in the parse tree. We're still
 in our early stages, and any feedback you have is much appreciated :smiley:.
 ## Design Goals
 * Error tolerant design - in IDE scenarios, code is, by definition, incomplete. In the case that invalid code is entered, the
@@ -111,8 +113,6 @@ own machine to see for yourself.
 ## Learn more
 **:dart: [Design Goals](#design-goals)** - learn about the design goals of the project (features, performance metrics, and more).
 **:sunrise_over_mountains: [Syntax Overview](Overview.md)** - learn about the composition and key properties of the syntax tree.
 **:seedling: [Documentation](GettingStarted.md#getting-started)** - learn how to reference the parser from your project, and how to perform
 operations on the AST to answer questions about your code.