Intro to SmoCollectionBase initialization

SmoCollectionBase is the base class of most collections of schema objects in SMO. It provides common indexing and initialization routines.

Implicit initialization

A typical application relies on implicit initialization of collections upon creating an enumerator. The application can optimize the enumeration by using Server.SetDefaultInitFields(typeof(somesmoobjecttype), <array of property names> to assure that the objects in the collection are populated with their needed properties using a single query. Such a loop might look like this:


var server = new Server("myserver");
// the indexer implicitly initializes the Databases collection with only the Name and ID properties
var database = server.Databases["mydatabase"];
// Ensure the IsFileTable and FileTableDirectoryName properties are fetched during collection initialization
server.SetDefaultInitFields(typeof(Table), nameof(Table.Name), nameof(Table.IsFileTable), nameof(Table.FileTableDirectoryName));
// The enumerator implicitly initializes the Tables collection
foreach (var table in database.Tables.Cast<Table>()) 
{
    if (table.IsFileTable) 
    {
        Console.WriteLine($"Found file table {table.Name} using directory {table.FileTableDirectoryName}");  
    }
}

Explicit initialization

Sometimes an application needs to explicitly create or recreate a collection, often in response to external changes that may have invalidated. The most common way to recreate a collection is to call Refresh.

Refresh

Refresh has two overloads:

public void Refresh() {}

public void Refresh(bool refreshChildObjects) {}

The parameterless overload is equivalent to calling the second overload with false. Refresh starts by creating a temporary copy of the existing collection then issuing the appropriate query to generate a new copy of the collection. It then merges the old copy with the new copy. Any objects that existed in the collection previously will be preserved in their current state, while any new objects will be populated using whatever the most recent SetDefaultInitFields call set.

This preservation of prior state for existing objects has several crucial consequences.

If the application implicitly initializes the collection with one set of init fields, then later on calls SetDefaultInitFields with a different set of values and calls Refresh, only objects that were created externally and are new to the collection will actually have those new init fields prepopulated. Any loop through the collection that touches the new properties will require a query to fetch the missing values.
An application can get a reference to an object in a collection and know that this reference will be valid after Refresh calls on the collection.
Perhaps most important - objects that were in the collection before the Refresh will not reflect any external changes on properties that have already been cached. An application may desire to call Refresh on the object after retrieving it from the collection to make sure it gets updated.

To avoid the problems posed by #3, the application can pass refreshChildObjects:true. Then Refresh will loop through the updated collection and call Refresh on the individual objects. Unfortunately, this technique will impose the same performance penalty that implicit initialization without using SetDefaultInitFields has. Looping through the collection again will end up causing a query-per-object to fetch its properties.

Refresh is thus very problematic when performance matters. It's highly likely to issue a potentially time consuming query to collect many rows of data only to throw it all away because the contents of the collection didn't actually change.

ClearAndInitialize

The alternative to implicit initialization and to recreate via Refresh is the ClearAndInitialize method.

/// <summary>
/// Clears old objects and initializes the collection. Unlike Refresh(), any objects already listed in the collection will be replaced with new versions.
/// Use this method to assure all the objects in the collection have the complete set of properties you want. 
/// </summary>
/// <param name="filterQuery">the xpath to filter the objects by properties 
/// (e.g. setting the filter to [(@IsSystemObject = 0)] will exclude the system objects from the result. 
/// By setting the parameter to null or empty string, no filter will be applied to the result</param>        
/// <param name="extraFields">the list of fields to be loaded in each object. 
/// (e.g. setting the extraFields to "new string[] { "IsSystemVersioned" })" when calling this method for TableCollection 
/// will include "IsSystemVersioned" property for each table object. 
/// By setting the parameter to null or empty array, only the default fields will be included in the result</param>
public void ClearAndInitialize(string filterQuery, IEnumerable<string> extraFields)

ClearAndInitialize offers a parameterized explicit initialization combined with a complete reset of the collection contents. Unlike Refresh, prior object references from the collection are not preserved. For example, if your application had a variable named col that was a reference to a Column object from table.Columns, after table.Columns.Refresh() your reference would be the same as that returned by table.Columns[col.Name]. After ClearAndInitialize the indexer would return a different object. All objects in the collection will have the same set of populated properties after the call, using whatever list had been provided to SetDefaultInitFields plus those provided to the extraFields parameter. It's common for applications using this API to skip calling SetDefaultInitFields and just pass the property names to this function.

Perhaps the most interesting parameter to this function is filterQuery.

A full description of the Microsoft.SqlServer.Management.Sdk.Sfc.Urn class would require a whole new article, but essentially a Urn is an XPath-style expression of the server object hierarchy and can contain filter attributes. The Urn is what SMO passes down to the Sfc layer to generate a SQL query. SSMS users can see example URNs in the output window under the Object Explorer category when expanding nodes in Object Explorer.

This is an example from expanding the Tables node for WideWorldImporters:

[2021-12-03T16:05:30.1356237-05:00 End Query] URN:Server[@Name='myserver']/Database[@Name='WideWorldImporters']/Table[@IsSystemObject = 0 and @IsFileTable = 0 and @IsMemoryOptimized = 1 and @TemporalType = 0 and @IsExternal = 0]

At the end of the generated TSQL query you can see the filter being applied in the query itself:

and tbl.is_filetable=0 and CAST(tbl.is_memory_optimized AS bit)=1 and tbl.temporal_type=0 and CAST(tbl.is_external AS bit)=0)

The [] delimited sections are the filter nodes of the Urn. The available operators are =, >=, >, <=, != which can be combined using and and or. The @ identifies a property name on the object. String operand values are delimited by '. Boolean values can be represented by 0 for false and 1 for true or using the tsql operators true() and false().

Expressions can be grouped using (). Evaluation of and and or happens left to right.

We can rewrite the file table loop above using ClearAndInitialize:

var server = new Server("myserver");
// the indexer implicitly initializes the Databases collection with only the Name and ID properties
var database = server.Databases["mydatabase"];
// ClearAndInitialize fetches only file tables with the given properties prepopulated
database.Tables.ClearAndInitialize("[@IsFileTable = 1]", new string[] {nameof(Table.Name), nameof(Table.FileTableDirectoryName)});
foreach (var table in database.Tables.Cast<Table>()) 
{
    Console.WriteLine($"Found file table {table.Name} using directory {table.FileTableDirectoryName}");  
}