Custom Formatting+Output Engine

PowerShell is not a text-based shell; it's an object-based shell. Commands write full objects to the "pipeline", not strings.

So when you are running PowerShell.exe, what determines what text you see in the console window?

The feature area of PowerShell that takes care of that is called "Formatting and Output", or "F+O" for short.

The basic idea is that whenever you interactively run some pipeline of commands, the PowerShell host tacks an extra "Out-Default" command onto the end of the pipeline you typed. (So if you type "Get-Foo | Select-Object -First 3", the pipeline that is actually executed is "Get-Foo | Select-Object -First 3 | Out-Default".) The Out-Default command is what figures out how to display whatever objects popped out at the end of your pipeline.

The Default Out-Default

(I may get a few details wrong here, but this is the general idea...) If a string object comes to Out-Default, it will just send that string right along. (The PowerShell host will get it and write it to the console.) But if something else comes along, Out-Default will try to "format" it.

In standard PowerShell, you can define one or more custom "view definitions" for a particular type by typing some XML into a .ps1xml file. You can create List views, Table views, Wide views, or Custom views. For instance, for a table view, you would define what the columns should be—labels, widths—and where the data should come from—property names or custom script.

If Out-Default finds a view definition for a particular object type, it will use that to "format" the object for display. So if you define a table view for an object, then Out-Default will call Format-Table with the specified view definition for your object. (And that will eventually yield strings, which the PowerShell host will write to the console.)

If Out-Default does not find any pre-defined view definitions for an object, it will try to generate one on the fly based on the properties of the object. If the object has 1-4 properties, it will generate a table view; else if there are more it will generate a list view. Sometimes the default generated view is fine; sometimes it's lousy.
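You can watch the generated views in action from any PowerShell prompt. This is just a quick sketch: the two objects and their property names are made up for illustration, and Out-String is used so that the formatted text can be captured instead of going straight to the console.

```powershell
# Two throwaway objects with no registered view definition:
# one with 2 properties, one with 5.
$few  = [pscustomobject]@{ Name = 'a'; Size = 1 }
$many = [pscustomobject]@{ P1 = 1; P2 = 2; P3 = 3; P4 = 4; P5 = 5 }

# Out-String runs the default formatting and returns the resulting text.
$fewText  = $few  | Out-String
$manyText = $many | Out-String

$fewText    # generated table view: header row, dashed rule, one data row
$manyText   # generated list view: one "name : value" line per property
```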

If for some reason the object has no properties (so a generated view based on properties would yield nothing), then Out-Default will just call .ToString() on the object. (Or is it the PowerShell host that calls .ToString()? I forget.)

Q: Seems pretty flexible. So why does DbgShell need a custom F+O engine?
A: There are a few reasons:

  • Support for color.
  • Better handling of generics.
  • The default F+O has other serious limitations.

Support for Color

DbgShell supports color by using ISO/IEC 6429 control sequences (see the Color page for more info). The built-in formatting engine does not recognize these control sequences, so it counts them toward the length of a string. They should really be treated as zero-width, since they are stripped out when displayed; counting them throws column widths and alignment way off.

In addition to treating control sequences as zero-width, sometimes formatting needs to truncate strings (such as to fit a value into a table column). Things like truncating in the middle of a control sequence or chopping off the end of a string with a POP control sequence wreak havoc on the color state.

So DbgShell needs an F+O engine that is control-sequence aware.
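To make "control-sequence aware" a bit more concrete, here is a minimal sketch of the two operations described above: measuring visible width and truncating without damaging color state. It is not DbgShell's implementation. It assumes the sequences have the common "ESC [ params m" shape, and the function names Get-VisibleLength and Limit-VisibleLength are invented for this example.

```powershell
# Assumption: color sequences look like ESC [ <digits and semicolons> m.
$sgr = '\x1b\[[0-9;]*m'

function Get-VisibleLength {
    param([string] $Text)
    # Strip the control sequences, then measure what is left.
    ($Text -replace $sgr, '').Length
}

function Limit-VisibleLength {
    param([string] $Text, [int] $MaxWidth)
    $result = New-Object System.Text.StringBuilder
    $room = $MaxWidth
    # Tokenize into whole control sequences (group 1) and plain-text runs (group 2).
    foreach ($token in [regex]::Matches($Text, "($sgr)|([^\x1b]+)")) {
        if ($token.Groups[1].Success) {
            # Always keep sequences whole, including ones after the cut,
            # so a trailing reset is never chopped off.
            [void] $result.Append($token.Value)
        }
        elseif ($room -gt 0) {
            $take = [Math]::Min($room, $token.Length)
            [void] $result.Append($token.Value.Substring(0, $take))
            $room -= $take
        }
    }
    $result.ToString()
}

$esc = [char]27
$colored = "${esc}[32mhello world${esc}[0m"

$colored.Length                          # raw length counts the control sequences
Get-VisibleLength $colored               # visible width counts only "hello world"
$short = Limit-VisibleLength $colored 5  # keeps both sequences, keeps "hello"
```

The naive alternative, $colored.Substring(0, 5), would cut inside the first control sequence, which is exactly the kind of havoc described above.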

Better Handling of Generics

When registering a view definition, you must say what type(s) it applies to, and this does not play well with generics:

  • If you look at the typename for KeyValuePair<string,int>, it is something horrible, like System.Collections.Generic.KeyValuePair`2[[System.String, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089],[System.Int32, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]]. Besides simply looking hideous, there are PublicKeyTokens and version numbers in there that are onerous to depend on.
  • If you want to apply some formatting in a generic way (for instance, if you want to apply certain formatting to all KeyValuePair<TKey,TValue>, no matter what TKey and TValue are), there is no way to do it.

One way to hack the formatting system is by sticking something else in the TypeNames list, and keying off that in your view definition. But with some very important cases, such as enumeration of dictionary key/value pairs, there is no way to inject yourself into the chain of events such that you can get at the TypeNames list.

DbgShell's solution is to let you define views for generic types just as easily as for other types (using type names more like you would see in source code, like System.Collections.Generic.KeyValuePair<System.String,System.Int32>). Additionally, you can register views for partially specified generic types by using special wildcards, like System.Collections.Generic.KeyValuePair<?,?>, which would apply to all KeyValuePairs no matter the type of key or value.
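Here is a rough sketch of how source-style names and "?" wildcards could work. This is an illustration only, not DbgShell's actual matching code: both function names are invented, and the wildcard handling is simplified so that each "?" matches one type argument that is not itself generic.

```powershell
function Get-SourceStyleTypeName {
    param([type] $Type)
    if (-not $Type.IsGenericType) { return $Type.FullName }

    # "System.Collections.Generic.KeyValuePair`2" -> "System.Collections.Generic.KeyValuePair"
    $definitionName = $Type.GetGenericTypeDefinition().FullName
    $baseName = $definitionName.Substring(0, $definitionName.IndexOf('`'))

    # Recurse so that type arguments are rendered source-style as well.
    $typeArgs = $Type.GetGenericArguments() | ForEach-Object { Get-SourceStyleTypeName $_ }
    return '{0}<{1}>' -f $baseName, ($typeArgs -join ',')
}

function Test-TypeNamePattern {
    param([string] $TypeName, [string] $Pattern)
    # Escape the pattern for regex use, then let each "?" stand for one
    # type argument (simplification: no nested generic arguments).
    $regex = '^' + [regex]::Escape($Pattern).Replace('\?', '[^<>,]+') + '$'
    return $TypeName -match $regex
}

$name = Get-SourceStyleTypeName ([System.Collections.Generic.KeyValuePair[string,int]])
$name   # System.Collections.Generic.KeyValuePair<System.String,System.Int32>

Test-TypeNamePattern $name 'System.Collections.Generic.KeyValuePair<?,?>'   # True
Test-TypeNamePattern $name 'System.Collections.Generic.List<?>'             # False
```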

The Default F+O Has Other Serious Limitations

Creating format view definitions for the built-in PowerShell formatting engine is painful and somewhat esoteric. We can make it easier. (Aside: It's strange that someone chose XML as the definition language for custom formatting views when they had a perfectly cromulent scripting language right in their lap.)

The built-in formatting engine is relatively "set"—with the exception of Format-Custom, there isn't much opportunity to do serious customization of the formatting and display experience. (Even "custom" formatting views are not as flexible as you might think.) By implementing our own engine, we can add useful features such as footers for tables, an auto-generated index column, etc.

Another deficiency of the default F+O is that if you have multiple objects written to the output stream, it will pick a formatting view definition based on the first object it sees, and then try to use that view for all subsequent objects. If the objects written to the output stream are not all the same type... terribleness.
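A closely related symptom is easy to reproduce in plain PowerShell, even without any registered view definitions. In this small sketch (the objects are made up), the generated table is shaped by the first object, and the second object, which has different properties, gets squeezed into it.

```powershell
# Two objects with different "shapes" written to the same output stream.
$mixed = @(
    [pscustomobject]@{ Name = 'first' }
    [pscustomobject]@{ Size = 42 }
)

# The table is built from the first object's properties (just "Name").
# The second object has no Name property, so its row comes out blank
# and its Size value never appears.
$text = $mixed | Out-String
$text
```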

DbgShell's Custom F+O

Q: Does DbgShell's custom formatting engine replace the built-in one?
A: No, it exists "side-by-side".

Q: How do you invoke the alternate formatting engine?
A: You can invoke the alternate formatting cmdlets directly (Format-AltTable, Format-AltList, Format-AltCustom, Format-AltSingleLine; or by their aliases fat, fal, fac, fas), but you can also just use the regular formatting cmdlets (Format-Table, etc.) because we have defined "proxy" functions for them. Each proxy function checks whether there is an alternate format view definition for the type of object you are trying to format. If there is, it forwards the call to the corresponding alternate formatting cmdlet; if not, it sends the call on to the built-in formatting cmdlet.

Q: How do you invoke the original, built-in formatting engine?
A: In case of problems with the alternate formatting engine, you can invoke the built-in formatting cmdlets directly by using their module-qualified names, e.g. Microsoft.PowerShell.Utility\Format-Table, OR use some handy aliases: bft, bfl, bfc.

Q: How do those proxy functions know if a particular object has a view definition for the alternate F+O engine?
A: When you register a view definition with the alternate formatting engine, you give the name of the type it should be applicable for. Then when a proxy function has an object to format, it looks for a matching view definition using the "TypeNames" of the object. N.B.: this does not mean it calls .GetType() on the object and uses the name of the type; it uses the PSObject.TypeNames list. This list will include base types, so you can define a single view definition for a base type, and it will be applied to derived objects too.
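As a sketch of the idea (not the actual proxy function code), a TypeNames-based lookup boils down to walking the list from most specific to least specific and taking the first registered match. The registry hashtable and the Find-ViewByTypeNames function below are hypothetical stand-ins.

```powershell
# Hypothetical registry: type name -> view definition (here just a label).
$registeredViews = @{
    'System.Exception' = 'view for all exceptions'
}

function Find-ViewByTypeNames {
    param($InputObject, [hashtable] $Registry)
    # TypeNames is ordered from the most derived type down to System.Object,
    # so the first hit is the most specific registered view.
    foreach ($typeName in $InputObject.PSObject.TypeNames) {
        if ($Registry.ContainsKey($typeName)) { return $Registry[$typeName] }
    }
    return $null   # no alternate view: the built-in engine would take over
}

$ex = New-Object System.ArgumentNullException 'someParam'
$ex.PSObject.TypeNames                       # includes System.Exception as a base type
Find-ViewByTypeNames $ex $registeredViews    # found via the base type name
Find-ViewByTypeNames 'text' $registeredViews # nothing registered for strings
```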

Q: How do I create format view definitions for the alternate formatting engine?
A: Instead of a .ps1xml file, you write a .psfmt file, which contains PowerShell script. In the script, you call various cmdlets to define your format views, such as New-AltScriptColumn, New-AltTableViewDefinition, etc., culminating in a call to Register-AltTypeFormatEntries. Take a look at the included Debugger.DebuggeeTypes.psfmt file for an example.

Q: What types of views does the custom F+O support?
A: It supports three view types similar to those in the default F+O (List, Table, and Custom), plus one that has no analog in the default F+O: Single-Line.

Debuggee/symbol values

Recall (from one of the Q/As above) that the alternate formatting engine uses an object's "TypeNames" to look up an alternate formatting view definition. (You can see an object's TypeNames list easily by typing something like "$foo.PSObject.TypeNames".) The TypeNames list will also include base types. For instance, for a System.String:

PS C:\Users\danthom> [string] $s = 'hello'
PS C:\Users\danthom> $s.PSObject.TypeNames
System.String
System.Object

When DbgShell creates objects that represent symbol values, it puts some additional pseudo type names into the TypeNames list. The pseudo type names represent the type of the symbol value. These pseudo type names are distinguished by having a '!' in them (a la the debugger syntax for a module!name). For instance:

PS Dbg:\WhatHappenedToThatCsRec.cab\Threads\22> $myGlobal = dt dfsrs!g_FrsStatusList
PS Dbg:\WhatHappenedToThatCsRec.cab\Threads\22> $myGlobal.PSObject.TypeNames
dfsrs!FrsStatusList*
!FrsStatusList*
MS.Dbg.DbgPointerValue
MS.Dbg.DbgPointerValueBase
MS.Dbg.DbgValue
System.Object

PS Dbg:\WhatHappenedToThatCsRec.cab\Threads\22>

The pseudo type names are not real .NET type names, but the alternate formatting engine does not care about that—they are just strings to be used in a lookup. This allows you to create view definitions (for the alternate formatting engine) to customize how certain symbol value objects are displayed.
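You can convince yourself that TypeNames entries are "just strings" with an ordinary object. In this sketch, the object and the "ntdll" module name are made up for illustration; DbgShell adds the pseudo type names for you when it creates symbol values.

```powershell
# A stand-in object; real symbol values come from DbgShell itself.
$value = [pscustomobject]@{ LockCount = -1 }

# Insert pseudo type names at the front of the list. After both inserts,
# the module-qualified name comes first, then the module-less variant.
$value.PSObject.TypeNames.Insert(0, '!_RTL_CRITICAL_SECTION')
$value.PSObject.TypeNames.Insert(0, 'ntdll!_RTL_CRITICAL_SECTION')

# The pseudo names now sit ahead of the real .NET type names.
$value.PSObject.TypeNames
```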

Single-line views

Recall (from one of the Q/As above) that the alternate formatting engine supports a "single-line" view type that has no equivalent in the default formatting engine. A single-line view definition is just what it sounds like—given an object, it should produce a single string (for display on a single line).

When it comes to view definition selection, single-line view definitions are somewhat special. When an object pops out the end of your pipeline, DbgShell's alternate formatting engine will check to see if there are any alternate formatting engine view definitions registered for it. If it finds a table view first, it will use that; if it finds a list view first, it will use that; if it finds a custom view first, it will use that. But if it finds a "single-line" view, it will not use it (and it will continue to look for the next view definition). Single-line view definitions are only used if you explicitly call Format-AltSingleLine. (If no alternate formatting engine view definitions are found, the object goes on to the PowerShell built-in formatting engine.)
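The selection rule can be sketched like this. Everything here is hypothetical (the Select-AltView function, the Kind and Name properties, the sample views); it only models the "skip single-line views unless explicitly asked" behavior described above.

```powershell
function Select-AltView {
    param([object[]] $Candidates, [switch] $SingleLineRequested)
    foreach ($view in $Candidates) {
        $isSingleLine = ($view.Kind -eq 'SingleLine')
        # Default display skips single-line views; an explicit
        # single-line request looks only for single-line views.
        if ($isSingleLine -eq [bool] $SingleLineRequested) { return $view }
    }
    return $null   # nothing suitable: fall back to the built-in engine
}

# Candidate views in registration order (made-up data).
$views = @(
    [pscustomobject]@{ Kind = 'SingleLine'; Name = 'summary' }
    [pscustomobject]@{ Kind = 'Table';      Name = 'grid' }
    [pscustomobject]@{ Kind = 'List';       Name = 'details' }
)

(Select-AltView $views).Kind                        # Table
(Select-AltView $views -SingleLineRequested).Kind   # SingleLine
```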

So when are single-line views useful?

You might not ever call Format-AltSingleLine yourself, but some of DbgShell's other view definitions will. In particular, DbgShell defines a custom view definition for UDT values (User-Defined Type; an "object" in C++ or C#), and that custom view uses Format-AltSingleLine to give a summary of each of the fields in the object. With appropriate single-line view definitions registered, this gives you a very powerful view of an object "at a glance", without needing to "drill" into fields to check what the values are.

Example

Here is an example that registers a single-line view for CRITICAL_SECTION objects.

Register-AltTypeFormatEntries {

    New-AltTypeFormatEntry -TypeName '!_RTL_CRITICAL_SECTION' {

        New-AltSingleLineViewDefinition {
            # The DbgUdtValue formatting code pre-screens for unavailable
            # values; we can assume the value is available here.
            if( $_.LockCount -lt 0 )
            {
                $cs = Format-DbgTypeName '_RTL_CRITICAL_SECTION: '
                $cs = $cs.AppendPushPopFg( [ConsoleColor]::Green, 'Not locked.' )
                return $cs
            }
            else
            {
                $cs = Format-DbgTypeName '_RTL_CRITICAL_SECTION: '
                $cs = $cs.AppendPushPopFg( [ConsoleColor]::Yellow, "LOCKED, by thread: $($_.OwningThread)" )
                return $cs
            }
        } # end AltSingleLineViewDefinition
    } # end Type !_RTL_CRITICAL_SECTION
}

The command on the first line says that we are going to register some view definitions for the alternate formatting engine. The parameter to that command is a script block that produces "format entries". A format entry is basically just a tuple mapping a type name to a set of view definitions.

The next line says we are going to create a format entry for objects that have "!_RTL_CRITICAL_SECTION" in their TypeNames list. The last parameter is another script block, which produces view definitions. (There is only one view definition here, but you can have more than one—for instance, you could also define a list view and a table view.)

Finally we come to the New-AltSingleLineViewDefinition command. The object that is being formatted is passed in as the "dollar underbar" ($_) variable. The first thing we do is check to see if the critical section is locked or not, so we can give different output for each case.

The Format-DbgTypeName command returns a ColorString object formatted consistently with how DbgShell displays type names (dark yellow on black). To that we append more text; either a green "Not locked", or a yellow "LOCKED" message, along with the owning thread.

This view definition produces output like below:

[Screenshot: single-line format display for a critsec]

For more examples, just take a look at some actual view definitions that come with DbgShell: run dir Bin:\Debugger\*.psfmt from within DbgShell to see a list of formatting files.