Added 2013 and 2014 LDM meeting notes. (#26)

Scott Dorman 2017-02-09 12:25:09 -05:00, committed by Neal Gafter
Parent e8a42bafa8
Commit 172a371cd9
15 changed files: 2280 additions and 0 deletions


@ -0,0 +1,166 @@
# C# Language Design Notes for Oct 7, 2013
## Agenda
We looked at a couple of feature ideas that either came up recently or deserved a second hearing.
1. Invariant meaning of names <_scrap the rule_>
2. Type testing expression <_can't decide on good syntax_>
3. Local functions <_not enough scenarios_>
4. nameof operator <_yes_>
## Invariant meaning of names
C# has a somewhat unique and obscure rule called “invariant meaning in blocks” (documented in section 7.6.2.1 of the language specification) which stipulates that if a simple name is used to mean one thing, then nowhere in the immediately enclosing block can the same simple name be used to mean something else.
The idea is to reduce confusion, make cut & paste refactoring a little more safe, and so on.
It is really hard to get data on who has been saved from a mistake by this rule. On the other hand, everyone on the design team has experience being limited by it in scenarios that seemed perfectly legit.
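A minimal sketch of the kind of code the rule rejects. The class and member names are hypothetical. In `Bump`, the simple name `count` first means the field; the nested block then declares a local with the same name, so `count` would mean two different things within the enclosing block. Under the rule as described above this is an error, even though the two meanings never overlap in scope; without the rule it is unambiguous:

``` c#
public class Counter
{
    private int count;

    public int Bump()
    {
        count = count + 1;      // here `count` means the field
        int bonus;
        {
            int count = 100;    // here `count` means a new local
            bonus = count;
        }
        return this.count + bonus;
    }
}
```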
The rule has proven to be surprisingly expensive to implement and uphold incrementally in Roslyn. This has to do with the fact that it cannot be tied to a declaration site: it is a rule about use sites only, and information must therefore be tracked per use – only to establish in the 99.9% case that no, the rule wasn't violated with this keystroke either.
### Conclusion
The invariant meaning rule is well intentioned, but causes significant nuisance for what seems to be very little benefit. It is time to let it go.
## Type testing expressions
With declaration expressions you can now test the type of a value and assign it to a fresh variable under the more specialized type, all in one expression. For reference types and nullable value types:
``` c#
if ((var s = e as string) != null) { … s … } // inline type test
```
For non-nullable value types it is a little more convoluted, but doable:
``` c#
if ((var i = e as int?) != null) { … i.Value … } // inline type test
```
One can imagine a slightly nicer syntax using a `TryConvert` method:
``` c#
if (MyHelpers.TryConvert(e, out string s)) { … s … }
```
The signature of the `TryConvert` method would be something like
``` c#
public static bool TryConvert<TSource, TResult>(TSource src, out TResult res);
```
The problem is that you cannot actually implement `TryConvert` efficiently: It needs different logic depending on whether `TResult` is a non-nullable value type or a nullable type. But you cannot overload a method on constraints alone, so you need two methods with different names (or in different classes).
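A minimal sketch of that two-method workaround. The helper names `TryConvertRef` and `TryConvertValue` are assumptions made for illustration, and the source parameter is simplified to `object`. The two methods differ only in their constraint, which is not part of the signature, so they cannot share the name `TryConvert`:

``` c#
public static class MyHelpers
{
    // Reference-type targets: `as` yields null when the test fails.
    public static bool TryConvertRef<TResult>(object src, out TResult res)
        where TResult : class
    {
        res = src as TResult;
        return res != null;
    }

    // Non-nullable value-type targets: go through TResult? and unwrap.
    public static bool TryConvertValue<TResult>(object src, out TResult res)
        where TResult : struct
    {
        TResult? boxed = src as TResult?;
        res = boxed.GetValueOrDefault();   // default(TResult) when the test fails
        return boxed.HasValue;
    }
}
```

Callers then have to pick the right helper by hand, for example `MyHelpers.TryConvertRef(e, out string s)` for `string` but `MyHelpers.TryConvertValue(e, out int i)` for `int`.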
This leads to the idea of having dedicated syntax for type testing: a syntax that will a) take an expression, a target type and a fresh variable name, b) return a boolean for whether the test succeeds, c) introduce a fresh variable of the target type, and d) assign the converted value to the fresh variable if possible, or the default value otherwise.
What should that syntax be? A previous proposal was an augmented version of the “is” operator, allowing an optional variable name to be tagged onto the type:
``` c#
if (e is string s) { … s … } // augmented is operator
```
Opinions on this syntax differ rather wildly. While we agree that some mix of “is” and “as” keywords is probably the way to go, no proposal seems appealing to everyone involved. Here are a few:
``` c#
e is T x
T x is e
T e as x
```
(A few sillier proposals were made:
``` c#
e x is T
T e x as
```
But this doesn't feel like the right time to put Easter eggs in the language.)
### Conclusion
Probably 90% of cases are with reference (or nullable) types, where the declaration-expression approach is not too horrible. As long as we cannot agree on a killer syntax, we are fine with not doing anything.
## Local functions
When we looked at local functions on Apr 15, we lumped them together with local class declarations, and dismissed them as a package. We may have given them somewhat short shrift, and as we have had more calls for them we want to make sure we do the right thing with local functions in their own right.
A certain class of scenarios is where you need to declare a helper function for a function body, but no other function needs it. Why would you need to declare a helper function? Here are a few scenarios:
* Task-returning functions may be fast-path optimized and not implemented as async functions: instead they delegate to an async function only when they cannot take the fast path.
* Iterators cannot do eager argument validation so they are almost always wrapped in a non-iterator function which does validation and delegates to a private iterator.
* Exception filters can only contain expressions – if they need to execute statements, they need to do it in a helper function.
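The first scenario can be sketched as follows. `LookupCache` and its members are hypothetical names, and the slow path only simulates asynchronous work. The point is that the public method is not `async` itself and hands off to a private async sibling only when the fast path misses:

``` c#
using System.Collections.Generic;
using System.Threading.Tasks;

public class LookupCache
{
    private readonly Dictionary<string, string> cache = new Dictionary<string, string>();

    // Not async: a cache hit returns an already completed task.
    public Task<string> GetAsync(string key)
    {
        string hit;
        if (cache.TryGetValue(key, out hit)) return Task.FromResult(hit);
        return GetSlowAsync(key);
    }

    // Private sibling that exists only to serve GetAsync.
    private async Task<string> GetSlowAsync(string key)
    {
        await Task.Yield();                     // stand-in for real asynchronous work
        string value = key.ToUpperInvariant();
        cache[key] = value;                     // not thread-safe; fine for a sketch
        return value;
    }
}
```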
Allowing these helper functions to be declared inside the enclosing method instead of as private siblings would not only avoid pollution of the class namespace, but would also allow them to capture type parameters, parameters and locals from the enclosing method. Instead of writing
``` c#
public static IEnumerable<T> Filter<T>(IEnumerable<T> s, Func<T, bool> p)
{
    if (s == null) throw new ArgumentNullException("s");
    if (p == null) throw new ArgumentNullException("p");
    return FilterImpl<T>(s, p);
}
private static IEnumerable<T> FilterImpl<T>(IEnumerable<T> s, Func<T, bool> p)
{
    foreach (var e in s)
        if (p(e)) yield return e;
}
```
You could just write this:
``` c#
public static IEnumerable<T> Filter<T>(IEnumerable<T> s, Func<T, bool> p)
{
    if (s == null) throw new ArgumentNullException("s");
    if (p == null) throw new ArgumentNullException("p");
    IEnumerable<T> Impl() // Doesn't need unique name, type params or params
    {
        foreach (var e in s) // s is in scope
            if (p(e)) yield return e; // p is in scope
    }
    return Impl();
}
```
The underlying mechanism would be exactly the same as for lambdas: we would generate a display class with a method on it. The only difference is that we would not take a delegate to that method.
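A hand-written approximation of that lowering for the `Filter` example, to make the mechanism concrete. The names `Filters` and `FilterClosure` are made up; a real compiler would pick its own names and layout:

``` c#
using System;
using System.Collections.Generic;

public static class Filters
{
    // Stand-in for the generated display class: one field per captured variable.
    private sealed class FilterClosure<T>
    {
        public IEnumerable<T> s;
        public Func<T, bool> p;

        // The local function becomes an ordinary method on the display class.
        public IEnumerable<T> Impl()
        {
            foreach (var e in s)
                if (p(e)) yield return e;
        }
    }

    public static IEnumerable<T> Filter<T>(IEnumerable<T> s, Func<T, bool> p)
    {
        if (s == null) throw new ArgumentNullException("s");
        if (p == null) throw new ArgumentNullException("p");
        var closure = new FilterClosure<T> { s = s, p = p };
        return closure.Impl();   // direct call; no delegate is created
    }
}
```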
While it is nicer, though, it is reasonable to ask if it has that much over private sibling methods. Also, those scenarios probably aren't super common.
### Inferred types for lambdas
This did bring up the discussion about possibly inferring a type for lambda expressions. One of the reasons they are so unfit for use as local functions is that you have to write out their delegate type. This is particularly annoying if you want to immediately invoke the function:
``` c#
((Func<int,int>)(x => x*x))(3); // What??!?
```
VB infers a type for lambdas, but a fresh one every time. This is ok only because VB also has more lax conversion rules between delegate types, and the result, if you are not careful, is a chain of costly delegate allocations.
One option would be to infer a type for lambdas only when there happens to be a suitable `Func<…>` or `Action<…>` type in scope. This would tie the compiler to the pattern used by the BCL, but not to the BCL itself. It would allow the BCL to add more (longer) overloads in the future, and it would allow others to add different overloads, e.g. with ref and out parameters.
### Conclusion
At the end of the day we are not ready to add anything here. No local functions and no type inference for lambdas.
## The nameof operator
The `nameof` operator is another feature that was summarily dismissed as a tag-along to a bigger, more unwieldy feature: the legendary `infoof` operator.
``` c#
PropertyInfo info = infoof(Point.X);
string name = nameof(Point.X); // "X"
```
While the `infoof` operator returns reflection information for a given language element, `nameof` just returns its name as a string. While this seems quite similar at the surface, there is actually quite a big difference when you think about it more deeply. So let's do that!
First of all, let's just observe that when people ask for the `infoof` operator, they are often really just looking to get the name. Common scenarios include throwing `ArgumentException`s and `ArgumentNullException`s (where the parameter name must be supplied to the exception's constructor), as well as raising `PropertyChanged` events when properties change, through the `INotifyPropertyChanged` interface.
The reason people don't want to just put strings in their source code is that it is not refactoring safe. If you change the name, VS will help you update all the references – but it will not know to update the string literal. If the string is instead provided by the compiler or the reflection layer, then it will change automatically when it's supposed to.
At a time when we are trying to minimize the use of reflection, it seems wrong to add a feature, `infoof`, that would contribute significantly to that use. `nameof` would not suffer from that problem, as the compiler would just replace it with a string in the generated IL. It is purely a design time feature.
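The two scenarios mentioned above can be sketched like this, assuming `nameof` behaves as proposed in these notes. `Customer` is a hypothetical class; `INotifyPropertyChanged`, `PropertyChangedEventArgs` and `ArgumentNullException` are the existing BCL types:

``` c#
using System;
using System.ComponentModel;

public class Customer : INotifyPropertyChanged
{
    private string name;

    public event PropertyChangedEventHandler PropertyChanged;

    public string Name
    {
        get { return name; }
        set
        {
            // The parameter name survives a rename because the compiler supplies it.
            if (value == null) throw new ArgumentNullException(nameof(value));
            name = value;
            var handler = PropertyChanged;
            if (handler != null)
                handler(this, new PropertyChangedEventArgs(nameof(Name)));
        }
    }
}
```

Renaming `Name` through a refactoring tool would update `nameof(Name)` along with every other reference, which a `"Name"` string literal would not get.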
There are deeper problems with `infoof`, pertaining to how the language element in question is uniquely identified. Let's say you want to get the `MethodInfo` for an overload of a method `M`. How do you designate which overload you are talking about?
``` c#
void M(int x);
void M(string s);
var info = infoof(M…); // Which M? How do you say it?
```
We'd need to create a whole new language for talking about specific overloads, not to mention members that don't have names, such as user defined operators and conversions, etc.
Again, though, `nameof` does not seem to have that problem. It is only for named entities (why else would you want the name), and it only gives you the name, so you don't need to choose between overloads, which all have the same name!
All in all, therefore, it seems that the concerns we have about `infoof` do not apply to `nameof`, whereas much (most?) of its value is retained. How exactly would a `nameof` operator work?
### Syntax
We would want the operator to be syntactically analogous to the existing `typeof(…)`. This means we would allow you to say `nameof(something)` in a way that at least sometimes would parse as an invocation of a method called `nameof`. This is ambiguous with existing code only to the extent that such a call binds. We would therefore take a page out of `var`'s book and give priority to the unlikely case that this binds as a method call, and fall back to `nameof` operator semantics otherwise. It would probably be ok for the IDE to always highlight `nameof` as a keyword.
The operand to `nameof` has to be “something that has a name”. In the spirit of the `typeof` operator we will limit it to a static path through names of namespaces, types and members. Type arguments would never need to be provided, but in every name but the last you may need to provide “generic dimension specifiers” of the shape `<,,,>` to disambiguate generic types overloaded on arity.
The grammar would be something like the following:
> _nameof-expression:_
> * `nameof` `(` _identifier_ `)`
> * `nameof` `(` _identifier_ `::` _identifier_ `)`
> * `nameof` `(` _unbound-type-name_ `.` _identifier_ `)`
Where _unbound-type-name_ is taken from the definition of `typeof`. Note that it always ends with an identifier. Even if that identifier designates a generic type, the type parameters (or dimension specifiers) are not and cannot be given.
### Semantics
It is an error to specify an operand to `nameof(…)` that doesn't “mean” anything. It can designate a namespace, a type, a member, a parameter or a local, as long as it is something that is in fact declared.
The result of the `nameof` expression would always be a string containing the final identifier of the operand. There are probably scenarios where it would be desirable to have a “fully qualified name”, but that would have to be constructed from the parts. After all, what would be the syntax used for that? C# syntax? What about VB consumers, or doc comments, or comparisons with strings provided by reflection? Better to leave those decisions outside of the language.
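A short sketch of the "final identifier" rule, again assuming the operator works as described here. `Point` mirrors the type used in the earlier snippet, and `NameofExamples` is a hypothetical holder class. Generic operands are deliberately left out, since the notes leave their exact form open:

``` c#
public struct Point
{
    public int X;
    public int Y;
}

public static class NameofExamples
{
    public static string MemberName() { return nameof(Point.X); }                  // "X"
    public static string TypeName() { return nameof(System.Text.StringBuilder); }  // "StringBuilder"
    public static string NamespaceName() { return nameof(System.Text); }           // "Text"
    public static string ParameterName(int count) { return nameof(count); }        // "count"
}
```

In each case the qualifying path is dropped; a fully qualified name would have to be assembled by the caller.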
### Conclusion
We like the `nameof` operator and see very few problems with it. It seems easy to spec and implement, and it would address a long-standing customer request. With a bit of luck, we would never (or at least rarely) hear requests for `infoof` again.
While not the highest priority feature, we would certainly like to do this.


@ -0,0 +1,91 @@
# C# Design Notes for Dec 16, 2013
## Agenda
This being the last design meeting of the year, we focused on firming up some of the features that we'd like to see prototyped first, so that developers can get going on implementing them.
1. Declaration expressions <_reaffirmed scope rules, clarified variable introduction_>
2. Semicolon operator <_reaffirmed enclosing parentheses_>
3. Lightweight dynamic member access <_decided on a syntax_>
## Declaration expressions
On Sep 23 we tentatively decided on rules for what the scope is when a variable is introduced by a declaration expression. Roughly speaking, the rules put scope boundaries around “structured statements”. While this mostly makes sense, there is one idiom, the “guard clause” pattern, that falls a little short here:
``` c#
if (!int.TryParse(s, out int i)) return false; // or throw, etc.
… // code ideally consuming i
```
Since the variable `i` would be scoped to the if statement, using a declaration expression to introduce it inside of the if would not work.
We talked again about whether to rethink the scope rules. Should we have a different rule to the effect essentially of the variable declaration being introduced “on the preceding statement”? I.e. the above would be equivalent to
``` c#
int i;
if (!int.TryParse(s, out i)) return false; // or throw, etc.
… // code consuming i
```
This opens its own can of worms. Not every statement _has_ a preceding statement (e.g. when being nested in another statement). Should it bubble up to the enclosing statement recursively? Should it introduce a block around the current statement to hold the generated preceding statement? And so on.
More damning to this idea is the issue of initialization. If a variable with an initializer introduced in e.g. a while loop body is understood to really be a single variable declared outside of the loop, then that same variable would be re-initialized every time around the loop. That seems quite counterintuitive.
This brings us to the related issue about variable lifetimes. When a variable is introduced somewhere in a loop, is it a fresh variable each time around? It would probably seem strange if it weren't. In fact we took a slight breaking change in C# 5.0 in order to make that the case for the loop variable in `foreach`, because the alternative didn't make sense to people, and tripped them up when they were capturing the variable in lambdas, etc.
### Conclusion
We keep the scope rules described earlier, despite the fact that the guard clause example doesn't work. We'll look at feedback on the implementation and see if we need to adjust. On variable lifetimes, each iteration of a loop will introduce its own instance of a variable introduced in the scope of that loop. The only exception is if it occurs in the initializer of a for loop, because that part is evaluated only on entry to the loop.
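Ordinary loop variables already show both lifetimes, which may help picture the rule. This sketch uses no declaration expressions, and `LoopCapture` is a hypothetical class. It captures a `foreach` variable (fresh per iteration since C# 5.0) and a `for` variable declared in the initializer (one instance for the whole loop):

``` c#
using System;
using System.Collections.Generic;

public static class LoopCapture
{
    // foreach: each iteration gets a fresh `n`, so every lambda sees its own value.
    public static List<int> ForEachResults()
    {
        var funcs = new List<Func<int>>();
        foreach (var n in new[] { 1, 2, 3 })
            funcs.Add(() => n);
        return Run(funcs);   // 1, 2, 3
    }

    // for: `i` is declared once in the initializer, so all lambdas share it.
    public static List<int> ForResults()
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
            funcs.Add(() => i);
        return Run(funcs);   // 3, 3, 3
    }

    private static List<int> Run(List<Func<int>> funcs)
    {
        var results = new List<int>();
        foreach (var f in funcs) results.Add(f());
        return results;
    }
}
```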
## Semicolon operator
We previously decided that a sequence of expressions separated by semicolons needs to be enclosed in parentheses. This is so that we don't get into situations where e.g. commas and semicolons are interspersed among expressions and it isn't visually clear what the precedence is.
We discussed briefly whether those parentheses are necessary even when there is other bracketing around the semicolon-separated expressions.
### Conclusion
It's not worth having special rules for this. Instead, semicolons are a relatively straightforward addition to parenthesized expressions – something like this:
> _parenthesized-expression:_
> * `(` _expression-sequence_<sub>opt</sub> _expression_ `)`
> _expression-sequence:_
> * _expression-sequence_<sub>opt</sub> _statement-expression_ `;`
> * _expression-sequence_<sub>opt</sub> _declaration-expression_ `;`
## Lightweight dynamic member access
On Nov 4 we decided that lightweight member access should take the form `x.<glyph>Foo` for some value of `<glyph>`. One candidate for the glyph is `#`, so that member access would look like this:
``` c#
payload.#People[i].#Name
```
However, `#` plays really badly with preprocessor directives. It seems almost impossible to come up with syntactic rules that reconcile with the deep lexer-based recognition of `#`-based preprocessor directives. After searching the keyboard for a while, we settled on `$` as a good candidate. It sort of invokes a “string” aspect in a tip of the hat to classic Basic:
``` c#
payload.$People[i].$Name
```
One key aspect of dynamicness that we haven't captured so far is the ability to declaratively construct an object with “lightweight dynamic members”, i.e. with entries accessible with string keys. A core syntactic insight here is to think of `$Foo` above as a unit – as a “lightweight dynamic member name”. What this enables is to think of object construction in terms of object initializers – where the member names are dynamic:
``` c#
var json = new JsonObject
{
    $foo = 1,
    $bar = new JsonArray { "Hello", "World" },
    $baz = new JsonObject { $x = 1, $y = 2 }
};
```
We could also consider allowing a free standing `$Foo` in analogy with a free standing simple name being able to reference an enclosing member in the implicit meaning of `this.$Foo`. This seems a little over the top, so we won't do that for now.
One thing to consider is that objects will sometimes have keys that aren't identifiers. This is quite common in Json payloads. That's fine as far as member access is concerned: just fall back to indexing syntax when necessary:
``` c#
payload.$People[i]["first name"]
```
It would be nice to also be able to initialize such “members” in object initializers. This leads to the idea of a generalized “dictionary initializer” syntax in object initializers:
``` c#
var payload = new JsonObject
{
    ["first name"] = "Donald",
    ["last name"] = "Duck",
    $city = "Duckburg" // equivalent to ["city"] = "Duckburg"
};
```
So just as `x.$Foo` is a shorthand for `x["Foo"]` in expressions, `$Foo=value` is shorthand for `["Foo"]=value` in object initializers. That syntax in turn is a generalized dictionary initializer syntax, that lets you index and assign on the newly created object, so that the above is equivalent to
``` c#
var __tmp = new JsonObject();
__tmp["first name"] = "Donald";
__tmp["last name"] = "Duck";
__tmp["city"] = "Duckburg";
var payload = __tmp;
```
It could be used equally well with non-string indexers.
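Setting the `$` shorthand aside, the `[index] = value` form can be sketched against an ordinary dictionary. Here `Dictionary<TKey, TValue>` stands in for the notes' `JsonObject`, and `InitializerExamples` is a hypothetical class; the second method shows a non-string indexer:

``` c#
using System.Collections.Generic;

public static class InitializerExamples
{
    public static Dictionary<string, string> StringKeys()
    {
        return new Dictionary<string, string>
        {
            ["first name"] = "Donald",
            ["last name"] = "Duck",
            ["city"] = "Duckburg"
        };
    }

    public static Dictionary<int, string> IntKeys()
    {
        // Each entry is an indexer assignment on the newly created object.
        return new Dictionary<int, string>
        {
            [1] = "one",
            [2] = "two"
        };
    }
}
```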
A remaining nuisance is having to write all the type names during construction. Could some of them be inferred, or a default be chosen? That's a topic for a later time.
### Conclusion
We'll introduce a notion of dictionary initializers `[index]=value`, which are indices enclosed in `[…]` being assigned to in object initializers. We'll also introduce a shorthand `x.$Foo` for indexing with strings `x["Foo"]`, and a shorthand `$Foo=value` for a string dictionary initializer `["Foo"]=value`.

meetings/2013/README.md Normal file

@ -0,0 +1,20 @@
# C# Language Design Notes for 2013
Overview of meetings and agendas for 2013
## Oct 7, 2013
[C# Language Design Notes for Oct 7, 2013](LDM-2013-10-07.md)
1. Invariant meaning of names <_scrap the rule_>
2. Type testing expression <_can't decide on good syntax_>
3. Local functions <_not enough scenarios_>
4. nameof operator <_yes_>
## Dec 16, 2013
[C# Language Design Notes for Dec 16, 2013](LDM-2013-12-16.md)
1. Declaration expressions <_reaffirmed scope rules, clarified variable introduction_>
2. Semicolon operator <_reaffirmed enclosing parentheses_>
3. Lightweight dynamic member access <_decided on a syntax_>


@ -0,0 +1,68 @@
# C# Design Notes for Jan 6, 2014
Notes are archived [here](https://roslyn.codeplex.com/wikipage?title=CSharp%20Language%20Design%20Notes).
## Agenda
In this meeting we iterated on the designs of a couple of features based on issues found during implementation or through feedback from MVPs and others.
1. Syntactic ambiguities with declaration expressions <_a solution adopted_>
2. Scopes for declaration expressions <_more refinement added to rules_>
## Syntactic ambiguities with declaration expressions
There are a couple of places where declaration expressions are grammatically ambiguous with existing expressions. The knee-jerk reaction would be to prefer the existing expressions for compatibility, but those turn out to nearly always lead to semantic errors later.
There are two kinds of full ambiguities (i.e. ones that don't resolve with more lookahead):
``` c#
a * b // multiplication expression or uninitialized declaration of pointer?
a < b > c // nested comparison or uninitialized declaration of generic type?
```
The latter one exists also in a method argument version that seems more realistic:
``` c#
a ( b < c , d > e ) // two arguments or one declaration expression?
```
However, in all these cases the expression, when interpreted as a declaration expression, is uninitialized. This means that in the vast majority of cases (except for structs with no accessible members) it will be considered unassigned. Which means that it will be a semantic error for it to occur as a value: it has to occur in one of the few places where an unassigned variable is allowed:
1. On the left hand side of an assignment expression
2. As an argument to an out parameter
3. In parentheses in one of the previous positions
Those are places where a multiplication or comparison expression cannot currently occur, because they are always values, never variables. We can therefore essentially split the world neatly for the two interpretations of the ambiguous expressions:
* If they occur in a place where a variable is required, they are parsed as declaration expressions
* Everywhere else, they are parsed as they are today.
This is a purely syntactic distinction. Rules of definite assignment etc. aren't actually used by the compiler to decide, just by us designers to justify that the rule isn't breaking.
There is one potential future conflict we can imagine: If we start allowing ref returning methods and those include user defined overloads of operators, then you could imagine someone defining an overload of “`*`” that returns a ref, and would therefore give meaning to the expression `(a*b) = c`, even when interpreted as multiplication. The rules as proposed here would not allow that; they would try to see `(a*b)` as a parenthesized declaration expression of a variable `b` with pointer type `a*`.
### Conclusion
We like the rule that parsing prefers declaration expressions in ambiguous cases _only_ in places where a variable is required: when occurring as an out argument, on the left hand side of an assignment, or nested in any number of parentheses within those. This is non-breaking, and doesn't seem too harmful to the future design space.
## Scopes for declaration expressions
The initial rules for declaration expression scopes are in need of some refinement in at least two scenarios:
``` c#
if (…) m(int x = 1); else m(x = 2); // cross pollution between branches?
while (…) m(int x = 1; x++); // survival across loop iterations?
```
We want to make sure that declarations are isolated between branches and loop iterations. This means we need to add more levels of scopes. Essentially, whenever an _embedded-expression_ occurs as an embedded expression (as opposed to where any _expression_ can occur), we want to introduce a new nested scope.
Additionally, for for-loops we want to nest scopes for each of the clauses in the header:
``` c#
for (int i = (int a = 0);
     i < (int b = 10);   // i and a in scope here
     i += (int c = 1))   // i, a and b in scope here
    (int d += i);        // i, a and b but not c in scope here
```
It's as if the for loop was rewritten as
``` c#
{
    int i = (int a = 0);
    while (i < (int b = 10))
    {
        { i += (int c = 1); }
        (int d += i);
    }
}
```
### Conclusion
We'll adopt these extra scope levels which guard against weird spill-over.


@ -0,0 +1,114 @@
# C# Language Design Notes for Feb 3, 2014
## Agenda
We iterated on some of the features currently under implementation
1. Capture of primary constructor parameters <_only when explicitly asked for with new syntax_>
1. Grammar around indexed names <_details settled_>
1. Null-propagating operator details <_allow indexing, bail with unconstrained generics_>
## Capture of primary constructor parameters
Primary constructors as currently designed and implemented lead to automatic capture of parameters into private, compiler-generated fields of the object whenever those parameters are used after initialization time.
It is becoming increasingly clear that this is quite a dangerous design. To illustrate, what's wrong with this code?
``` c#
public class Point(int x, int y)
{
    public int X { get; set; } = x;
    public int Y { get; set; } = y;
    public double Dist => Math.Sqrt(x * x + y * y);
    public void Move(int dx, int dy)
    {
        x += dx; y += dy;
    }
}
```
This appears quite benign, but is in fact catastrophically wrong. The use of `x` and `y` in `Dist` and `Move` causes these values to be captured as private fields. The auto-properties `X` and `Y` each cause their own backing fields to be generated, initialized with the `x` and `y` values passed in to the primary constructor. But from then on, `X` and `Y` lead completely distinct lives from `x` and `y`. Assignments to the `X` and `Y` properties will cause them to be observably updated, but the value of `Dist` remains unchanged. Conversely, changes through the `Move` method will reflect in the value of `Dist`, but not affect the value of the properties.
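The divergence can be reproduced without primary constructors by writing out by hand what implicit capture would roughly amount to. This is only an approximation: the class name `CapturingPoint` and the fields `capturedX` and `capturedY` are hypothetical stand-ins for the compiler-generated capture fields:

``` c#
using System;

public class CapturingPoint
{
    // Stand-ins for the fields that implicit capture of x and y would generate.
    private int capturedX;
    private int capturedY;

    public CapturingPoint(int x, int y)
    {
        capturedX = x; capturedY = y;   // captured copies
        X = x; Y = y;                   // separate auto-property backing fields
    }

    public int X { get; set; }
    public int Y { get; set; }

    // Reads the captured copies, not the properties.
    public double Dist { get { return Math.Sqrt(capturedX * capturedX + capturedY * capturedY); } }

    // Updates the captured copies, not the properties.
    public void Move(int dx, int dy)
    {
        capturedX += dx; capturedY += dy;
    }
}
```

Starting from `new CapturingPoint(3, 4)`, setting `X` and `Y` to `0` leaves `Dist` at `5`, and `Move(3, 4)` changes `Dist` to `10` while `X` and `Y` stay `0`.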
The way for the developer to avoid this is to be extremely disciplined about not referencing `x` and `y` except in initialization code. But that is like giving them a gun already pointing at their foot: sooner or later it will go subtly wrong, and they will have hard to find bugs.
There are other incarnations of this problem, e.g. where the parameter is passed to the base class and captured multiple times.
There are also other problems with implicit capture: we find, especially from MVP feedback, that people quickly want to specify certain things about the generated fields, such as readonly-ness, attributes, etc. We could allow those on the parameters, but they quickly don't look like parameters anymore.
The best way for us to deal with this is to simply disallow automatic capture. The above code would be disallowed, and given the same declarations of `x`, `y`, `X` and `Y`, `Dist` and `Move` would have to be written in terms of the properties:
``` c#
public double Dist => Math.Sqrt(X * X + Y * Y);
public void Move(int dx, int dy)
{
    X += dx; Y += dy;
}
```
Now this raises a new problem. What if you want to capture a constructor parameter in a private field and have no intention of exposing it publicly? You can do that explicitly:
``` c#
public class Person(string first, string last)
{
    private string _first = first;
    private string _last = last;
    public string Name => _first + " " + _last;
}
```
The problem is that the “good” lower case names in the class-level declaration space are already taken by the parameters, and the privates are left with (what many would consider) less attractive naming options.
We could address this in two ways (that we can think of) in the primary constructor feature:
1. Allow primary constructor parameters and class members to have the same names, with the excuse that their lifetimes are distinct: the former are only around during initialization, where access to the latter through `this` is not yet allowed.
1. Introduce a syntax for explicitly capturing a parameter. If you ask for it, presumably you thought through the consequences.
The former option seems mysterious: two potentially quite different entities get to timeshare on the same name? And then you'd get confusing initialization code like this:
``` c#
private string first = first; // WHAT???
private string last = last;
```
It seems that the latter option is the better one. We would allow field-like syntax to occur in a parameter list, which is a little odd, but kind of says what it means. Specifically, specifying an accessibility on a parameter (typically private) would be what triggers capture as a field:
``` c#
public class Person(private string first, private string last)
{
    public string Name => first + " " + last;
}
```
Once there's an accessibility specified, we would also allow other field modifiers on the parameter; readonly probably being the most common. Attributes could be applied to the field in the same manner as with auto-properties: through a field target.
### Conclusion
We like option two. Let's add syntax for capture and not do it implicitly.
## Grammar for indexed names
For the lightweight dynamic features, we've been working with a concept of “pseudo-member” or _indexed name_ for the `$identifier` notation.
We will introduce this as a non-terminal in the grammar, so that the concept is reified. However, for the constructs that use it (as well as ordinary identifiers) we will create separate productions, rather than unify indexed names and identifiers under a common grammatical category.
For the stand-alone dictionary initializer notation of `[expression]` we will not introduce a non-terminal.
## Null-propagating operator details
Nailing down the design of the null-propagating operator we need to decide a few things:
### Which operators does it combine with?
The main usage of course is with dot, as in `x?.y` and `x?.m(…)`. It also potentially makes sense for element access `x?[…]` and invocation `x?(…)`. And we also have to consider interaction with indexed names, as in `x?.$y`.
We'll do element access and indexed member access, but not invocation. The former two make sense in the context that lightweight dynamic is addressing. Invocation seems borderline ambiguous from a syntactic standpoint, and for delegates you can always get to it by explicitly calling Invoke, as in `d?.Invoke(…)`.
### Semantics
The semantics are like applying the ternary operator to a null equality check, a null literal and a non-question-marked application of the operator, except that the expression is evaluated only once:
``` c#
e?.m(…) => ((e == null) ? null : e0.m(…))
e?.x => ((e == null) ? null : e0.x)
e?.$x => ((e == null) ? null : e0.$x)
e?[…] => ((e == null) ? null : e0[…])
```
Where `e0` is the same as `e`, except if `e` is of a nullable value type, in which case `e0` is `e.Value`.
### Type
The type of the result depends on the type `T` of the right hand side of the underlying operator:
* If `T` is (known to be) a reference type, the type of the expression is `T`
* If `T` is (known to be) a non-nullable value type, the type of the expression is `T?`
* If `T` is (known to be) a nullable value type, the type of the expression is `T`
* Otherwise (i.e. if it is not known whether `T` is a reference or value type) the expression is a compile time error.
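A small sketch of the semantics and typing rules above for the forms that do not involve indexed names. `NullPropagation` is a hypothetical class, and the comments restate the rules from these notes:

``` c#
using System;

public static class NullPropagation
{
    // Right hand side is a non-nullable value type (int): the result is int?.
    public static int? LengthOf(string s) { return s?.Length; }

    // Element access; assumes s is non-empty whenever it is not null.
    public static char? FirstCharOf(string s) { return s?[0]; }

    // Right hand side is a reference type (string): the result stays string.
    public static string Trimmed(string s) { return s?.Trim(); }

    // Delegate invocation is spelled through Invoke, as described above.
    public static void Raise(Action handler) { handler?.Invoke(); }
}
```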


@ -0,0 +1,76 @@
# C# Design Notes for Feb 10, 2014
## Agenda
1. Design of using static <_design adopted_>
2. Initializers in structs <_allow in certain situations_>
3. Null-propagation and unconstrained generics <_keep current design_>
## Design of using static
The “using static” feature was added in some form to the Roslyn codebase years ago, and sat there quietly waiting for us to decide whether to add it to the language. Now that it's coming out, it's time to ensure it has the right design.
### Syntax
Should the feature have different syntax from namespace usings, or should it be just like that, but just specifying a type instead? The downside of keeping the current syntax is that we need to deal with ambiguities between types and namespaces with the same name. That seems relatively rare, though, and sticking with current syntax definitely makes it feel more baked in:
``` c#
using System.Console;
```
as opposed to, e.g.:
``` c#
using static System.Console;
```
#### Conclusion
We’ll stick with the current syntax.
### Ambiguities
This leads to the question of how to handle ambiguities when there are both namespaces and types of a given name. We clearly need to prefer namespaces over types for compatibility reasons. The question is whether we make a choice at the point of the specified name, or whether we allow “overlaying” the type and the namespace, disambiguating at the next level down by preferring names that came from the namespace over ones from the type.
#### Conclusion
We think overlaps are sufficiently rare that we’ll go with the simple rule: A namespace completely shadows a type of the same name, and you can’t import the members of such a type. If this turns out to be a problem we’re free to loosen it up later.
### Which types can you import?
Static classes, all classes, enums? It seems it is almost always a mistake to import non-static types: they will have names that are designed to be used with the type name, such as `Create`, `FromArray`, `Empty`, etc., that are likely to appear meaningless on their own, and clash with others. Enums are more of a gray area. Spilling the enum members to top-level would often be bad, and could very easily lead to massive name clashes, but sometimes it’s just what you want.
#### Conclusion
We’ll disallow both enums and non-static classes for now.
### Nested types
Should nested types be imported as top-level names?
#### Conclusion
Sure, why not? They are often used by the very members that are being “spilled”, so it makes sense that they are spilled also.
### Extension methods
Should extension methods be imported as extension methods? As ordinary static methods? When we first introduced extension methods, a lot of people asked for a more granular way of applying them. This could be it: get the extension methods just from a single class instead of the whole namespace. For instance:
``` c#
using System.Linq.Enumerable;
```
Would import just the query methods for in-memory collections, not those for `IQueryable<T>`.
On the other hand, extension methods are designed to be used as such: you only call them as static methods to disambiguate. So it seems wrong if they are allowed to pollute the top-level namespace as static methods. On the _other_ other hand, this would be the first place in the language where an extension method wouldn’t be treated like a static method.
#### Conclusion
We will import extension methods as extension methods, but not as static methods. This seems to hit the best usability point.
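A sketch of how this conclusion plays out in practice follows. It assumes the `using static` spelling that released compilers accept (C# 6 and later) rather than the bare `using` form adopted in these notes; `UsingStaticSketch` and `DoubledRange` are hypothetical names.

```csharp
using System.Collections.Generic;
using static System.Linq.Enumerable;

public static class UsingStaticSketch
{
    public static List<int> DoubledRange()
    {
        // Range is an ordinary static method of Enumerable,
        // so it can be called without qualification.
        IEnumerable<int> numbers = Range(1, 3);

        // Select and ToList are extension methods: they are available
        // through extension syntax on the receiver...
        return numbers.Select(n => n * 2).ToList();

        // ...but not as unqualified static methods. This would not compile:
        // return ToList(Select(numbers, n => n * 2));
    }
}
```

Note that no `using System.Linq;` directive is needed here: only the extension methods of the one imported class are brought in.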
## Initializers in structs
Currently, field initializers aren’t allowed in structs. The reason is that initializers _look_ like they will be executed every time the struct is created, whereas that would not be the case: If the struct wasn’t `new`ed, or it was `new`ed with the default constructor, no user defined code would run. People who put initializers on fields might not be aware that they don’t always run, so it’s better to prevent them.
It would be nice to have the benefits of primary constructors on structs, but that only really flies if the struct can make use of the parameters in scope through initializers. Also, we now have initializers for auto-properties, making the issue worse. What to do?
We can never prevent people from having uninitialized structs, and the struct type authors still need to make sure that an uninitialized struct is meaningful. However, if a struct has user-defined constructors, chances are they know what they’re doing and initializers wouldn’t make anything worse. Note, though, that initializers would only run if the user-defined constructors don’t chain to the default constructor with `this()`.
### Conclusion
Let’s allow field and property initializers in structs, but only if there is a user-defined constructor (explicit or primary) that does not chain to the default constructor. If people want to initialize with the default constructor first, they should call it from their constructor, rather than chain to it.
``` c#
struct S0 { public int x = 5; } // Bad
struct S1 { public int x = 5; S1(int i) : this() { x += i; } } // Bad
struct S2 { public int x = 5; S2(int i) { this = new S2(); x += i; } } // Good
struct S3(int i) { public int x = 5 + i; } // Good
```
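The underlying concern, that a struct value can come into existence without any user-defined code running, can be seen in ordinary C# without the proposed initializer syntax. In this sketch `Meters` and `StructDefaultSketch` are hypothetical names, and the `+ 5` merely stands in for work an initializer might have done.

```csharp
public struct Meters
{
    public int Value;

    // User-defined constructor: runs only when it is explicitly invoked.
    public Meters(int value)
    {
        Value = value + 5;
    }
}

public static class StructDefaultSketch
{
    // The constructor runs, so the adjustment is applied.
    public static int FromConstructor()
    {
        return new Meters(1).Value;
    }

    // default(...) zero-initializes the struct; no user code runs.
    public static int FromDefault()
    {
        return default(Meters).Value;
    }

    // Array elements are zero-initialized as well.
    public static int FromArray()
    {
        var items = new Meters[1];
        return items[0].Value;
    }
}
```

Only the first method observes the constructor's effect; the other two see a zeroed `Value`, which is why the struct author still has to make the all-zero state meaningful.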
## Unconstrained generics in null-propagating operator
We previously looked at a problem with the null-propagating operator, where if the member accessed is of an unconstrained generic type, we don’t know how to generate the result, and what its type should be:
``` c#
var result = x?.Y;
```
The answer is different when `Y` is instantiated with reference types, non-nullable value types and nullable value types, so there’s nothing reasonable we can do when we don’t know which.
The proposal has been raised to fall back to type `object`, and generate code that boxes (which is a harmless operation for values that are already of reference type).
### Conclusion
This seems like a hack. While usable in some cases, it is weirdly different from the mainline semantics of the feature. Let’s prohibit it and revisit if it becomes a problem.
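To make the three cases concrete, here is a minimal sketch in which constraints tell the compiler which situation applies. `Holder<T>`, `ConstrainedSketch` and the method names are assumptions made for the example.

```csharp
public sealed class Holder<T>
{
    public T Y;
}

public static class ConstrainedSketch
{
    // T is known to be a reference type: the result type is T.
    public static T ReadReference<T>(Holder<T> x) where T : class
    {
        return x?.Y;
    }

    // T is known to be a non-nullable value type: the result type is T?.
    public static T? ReadValue<T>(Holder<T> x) where T : struct
    {
        return x?.Y;
    }

    // With no constraint on T, `x?.Y` is prohibited under the decision above,
    // so the null check is written out with an explicit fallback value.
    public static T ReadOrDefault<T>(Holder<T> x)
    {
        return (x == null) ? default(T) : x.Y;
    }
}
```

The last method shows one possible workaround: the caller, not the language, decides what a missing value of an unconstrained `T` should be.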


@ -0,0 +1,259 @@
# C# Language Design Notes for Apr 21, 2014
## Agenda
In this design meeting we looked at some of the most persistent feedback on the language features showcased in the BUILD CTP, and fixed up many of the most glaring issues.
1. Indexed members <_lukewarm response, feature withdrawn_>
2. Initializer scope <_new scope solves all kinds of problems with initialization_>
3. Primary constructor bodies <_added syntax for a primary constructor body_>
4. Assignment to getter-only auto-properties from constructors <_added_>
5. Separate accessibility for type and primary constructor <_not worthy of new syntax_>
6. Separate doc comments for field parameters and fields <_not worthy of new syntax_>
7. Left associative vs short circuiting null propagation <_short circuiting_>
## Indexed members
The indexed member feature – `e.$x` and `new C { $x = e }` – has been received less than enthusiastically. People aren’t super happy with the syntax, but most of all they aren’t very excited about the feature.
We came to this feature in a roundabout way, where it started out having much more expressiveness. For instance, it was the way you could declaratively create an object with values at given indices. But given the dictionary initializer syntax – `new C { ["x"] = e }` – the `$` syntax is again just thin sugar for using string literals in indexers. Is that worth new syntax? It seems not.
### Conclusion
We’ll pull the feature. There’s little love for it, and we shouldn’t litter the language unnecessarily. If this causes an outcry, well, that’s different feedback, and we can then act on that.
## Initializer scope
There are a couple of things around primary constructors and scopes that are currently annoying:
1. You frequently want a constructor parameter and a (private) field with the same name. In fact we have a feature just for that – the so-called field parameters, where primary constructor parameters annotated with an accessibility modifier cause a field to be also emitted. However, if you try to declare this manually, we give an error because members and primary constructor parameters are in the same declaration space.
2. We have special rules for primary constructor parameters, making it illegal to use them after initialization time, even though they are “in scope”.
So in this code:
``` c#
public class ConfigurationException(Configuration configuration, string message)
: Exception(message)
{
private Configuration configuration = configuration;
public override string ToString() => message + "(" + configuration + ")";
}
```
The declaration of the field `configuration` is currently an error, because it clashes with the parameter of the same name in the same declaration space, but it would be nice if it just worked.
The use of `message` in a method body is and should be an error, but it would be preferable if that was a more natural consequence of existing scoping rules, instead of new specific restrictions.
An idea to fix this is to introduce what we’ll call the ___initialization scope___. This is a scope and declaration space that is nested within the type declaration’s scope and declaration space, and which includes the parameters and base initializer arguments of a primary constructor (if any) and the expressions in all member initializers of the type.
That immediately means that this line becomes legal and meaningful:
``` c#
private Configuration configuration = configuration;
```
The _field_ `configuration` no longer clashes with the _parameter_ `configuration`, because they are no longer declared in the same declaration space: the latter’s is nested within the former’s. Moreover the reference to `configuration` in the initializer refers to the parameter, not the field, because while both are in scope, the parameter is nearer.
Some would argue that a line like the above is a little confusing. You are using the same name to mean different things. That is a fair point. The best way to think of it is probably the corresponding line in a normal constructor body:
``` c#
this.configuration = configuration;
```
Which essentially means the same thing. Just as we’ve gotten used to `this` disambiguating that line, we’ll easily get used to the leading modifier and type of the field declaration disambiguating the field initializer.
The initialization scope also means that this line is naturally disallowed:
``` c#
public override string ToString() => message + "(" + configuration + ")";
```
Because the reference to `message` does not appear within the initialization scope, and the parameter is therefore not in scope. If there was a field with that name, the field would get picked up instead; it wouldn’t be shadowed by a parameter which would be illegal to reference.
A somewhat strange aspect of the initialization scope is that it is textually discontinuous: it is made up of bits and pieces throughout a type declaration. Hopefully this is not too confusing: conceptually it maps quite clearly to the notion of “initialization time”. Essentially, the scope is made up of “the code that runs when the object is initialized”.
There are some further desirable consequences of introducing the initialization scope:
### Field parameters
The feature of field parameters is currently the only way to get a primary constructor parameter and a field of the same name:
``` c#
public class ConfigurationException(private Configuration configuration, …)
```
With the initialization scope, the feature is no longer special magic, but just thin syntactic sugar over the field declaration above. If for some reason field parameters don’t work for you, you can easily fall back to an explicit field declaration with the same name as the parameter.
It raises the question of whether we’d even _need_ field parameters, but they still seem like a nice shorthand.
### The scope of declaration expressions in initializers
In the current design, each initializer provides its own isolated scope for declaration expressions: there was no other choice really. With the initialization scope, however, declaration expressions in initializers would naturally use that as their scope, allowing the use of locals to flow values between initializers. This may not be common, but you can certainly imagine situations where that comes in handy:
``` c#
public class ConfigurationException(Configuration configuration, string message)
: Exception(message)
{
private Configuration configuration = configuration;
public bool IsRemote { get; } = (var settings = configuration.Settings)["remote"];
public bool IsAsync { get; } = settings["async"];
public override string ToString() => Message + "(" + configuration + ")";
}
```
The declaration expression in `IsRemote`’s initializer captures the result of evaluating `configuration.Settings` into the local variable `settings`, so that it can be reused in the initializer for `IsAsync`.
We need to be a little careful about partial types. Since the “textual order” between different parts of a partial type is not defined, it does not seem reasonable to share variables from declaration expressions between different parts. Instead we should introduce a scope within each part of a type containing the field and property initializers contained in that part. This scope is then nested within the initializer scope, which itself covers all the parts.
A similar issue needs to be addressed around the argument to the base initializer. Textually it occurs _before_ the member initializers, but it is evaluated _after_. To avoid confusion, the argument list needs to be in its own scope, nested inside the scope that contains the field and property initializers (of that part of the type). That way, locals introduced in the argument list will not be in scope in the initializers, and members introduced in the initializers cannot be used in the argument list (because their use would textually precede their declaration).
### Primary constructors in VB
Importantly, the notion of the initialization scope would also make it possible to introduce primary constructors in VB. The main impediment to this has been that, because of case insensitivity, the restriction that primary constructor parameters could not coexist with members of the same name became too harsh. If you need both a parameter, a backing field and a property, you quickly run out of names!
The initialization scope helps with that, by introducing a separate scope that the parameters can live in, so they no longer clash with other names.
Unlike C#, VB today allows initializers to reference previously initialized fields. With the initialization scope this would still be possible, as long as there’s not a primary constructor parameter shadowing that field. And if there is, you probably want the parameter anyway. And if you don’t, you can always get at the field through `Me` (VB’s version of `this`).
It is up to the VB design team whether to actually add primary constructors this time around, but it is certainly nice to have a model that will work in both languages.
### Conclusion
The initialization scope solves many problems, and leaves the language cleaner and with less magic. This clearly outweighs the slight oddities it comes with.
## Primary constructor bodies
By far the most commonly reported reason why people cannot use primary constructors is that they don’t allow for easy argument validation: there is simply no “body” within which to perform checks and throw exceptions.
We could certainly change that. The simplest thing, syntactically, is to just let you write a block directly in the type body, and that block then gets executed when the object is constructed:
``` c#
public class ConfigurationException(Configuration configuration, string message)
: Exception(message)
{
{
if (configuration == null)
{
throw new ArgumentNullException(nameof(configuration));
}
}
private Configuration configuration = configuration;
public override string ToString() => Message + "(" + configuration + ")";
}
```
This looks nice, but there is a core question we need to answer: when _exactly_ is that block executed? There seem to be two coherent answers to that, and we need to choose:
1. The block is an ___initializer body___. It runs before the base call, following the same textual order as the surrounding field and property initializers. You could even imagine allowing multiple of them interspersed with field initialization, and they can occur regardless of whether there is a primary constructor.
2. The block is a ___constructor body___. It is the body of the primary constructor and therefore runs after the base call. You can only have one, and only if there is a primary constructor that it can be part of.
Both approaches have pros and cons. The initializer body corresponds to a similar feature in Java, and has the advantage that you can weed out bad parameters before you start digging into them or pass them to the base initializer (though arguments passed to the base initializer should probably be validated by the base initializer rather than in the derived class anyway).
As an example of this issue, our previous example where an initializer digs into the contents of a primary constructor parameter wouldn’t work if the validation was done in a constructor body, after initialization (here in a simplified version):
``` c#
public bool IsRemote { get; } = configuration.Settings["remote"];
```
If the passed-in `configuration` is null, this would yield a null reference exception before a constructor body would have a chance to complain (by throwing a better exception). Instead, in a constructor body interpretation, the initialization of `IsRemote` would either have to happen in the constructor body as well, following the check, or it would have to make copious use of the null propagating operator that we’re also adding:
``` c#
public bool IsRemote { get; } = configuration?.Settings?["remote"] ?? false;
```
On the other hand, the notion of a constructor body is certainly more familiar, and it is easy to understand that the block is stitched together with the parameter list and the base initializer to produce the constructor declaration underlying the primary constructor.
Moreover, a constructor body has access to fields and members, while `this` access during initialization time is prohibited. Therefore, a constructor body can call helper methods etc. on the instance under construction; also a common pattern.
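Both options lean on the order in which existing C# runs initializers, the base call and constructor bodies. That order can be traced with ordinary constructors, as in the sketch below; `Trace`, `BaseType` and `DerivedType` are hypothetical names used only to record the sequence.

```csharp
using System.Collections.Generic;

public static class Trace
{
    public static readonly List<string> Steps = new List<string>();

    // Records a step and returns a dummy value so it can be used in an initializer.
    public static int Log(string step)
    {
        Steps.Add(step);
        return 0;
    }
}

public class BaseType
{
    public readonly int BaseField = Trace.Log("base initializer");

    public BaseType()
    {
        Trace.Log("base constructor body");
    }
}

public class DerivedType : BaseType
{
    // Runs before the call to the base constructor.
    public readonly int DerivedField = Trace.Log("derived initializer");

    public DerivedType()
    {
        // Runs after the base constructor has completed.
        Trace.Log("derived constructor body");
    }
}
```

Creating a `DerivedType` records the derived initializer first, then the base initializer and base constructor body, and the derived constructor body last. An initializer body would sit with the first step, a constructor body with the last.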
### Conclusion
At the end of the day we have to make a choice. Here, familiarity wins. While the initializer body approach has allure, it is also very much a new thing. Constructor bodies on the other hand work the way they work. The downsides have workarounds. So a constructor body it is.
In a partial type, the constructor body must be in the same part as the primary constructor. Scope-wise, the constructor body is nested within the scope for the primary constructor’s base arguments, which in turn is nested within the scope for the field and property initializers of that part, which in turn is nested within the initialization scope that contains the primary constructor parameters:
``` c#
partial class C(int x1) : B(int x3 = x1 /* x2 in scope but can't be used */)
{
public int X0 { get; } = (int x2 = x1);
{
int x4 = X0 + x1 + x2 + x3;
}
}
```
Let’s look at the scopes (and corresponding declaration spaces) nested in each other here:
* The scope `S4` spans the primary constructor body. It directly contains the local variable `x4`, and is nested within `S3`.
* The scope `S3` spans `S4` plus the argument list to the primary constructor’s base initializer. It directly contains the local variable `x3`, and is nested within `S2`.
* The scope `S2` spans `S3` plus all field and property initializers in this part of the type declaration. It directly contains the local variable `x2`, and is nested within `S1`.
* The scope `S1` spans `S2` plus the corresponding `S2` scopes from other parts of the type declaration, plus the parameter list of the primary constructor. It directly contains the parameter `x1`, and is nested within `S0`.
* The scope `S0` spans all parts of the whole type declaration, including `S1`. It directly contains the property `X0`.
On top of this, the usual rule applies for local variables, that they cannot be used in a position that textually precedes their declaration.
## Assignment to getter-only auto-properties
There are situations where you cannot use a primary constructor. We have to make sure that you do not fall off too steep of a cliff when you are forced to abandon primary constructor syntax and use an ordinary constructor.
One of the main nuisances that has been pointed out is that the only way to initialize a getter-only auto-property is with an initializer. If you want to initialize it from constructor parameters, you therefore need to have a primary constructor, so those parameters can be in scope for initialization. If you cannot have a primary constructor, then the property cannot be a getter-only auto-property: You have to fall back to existing, more lengthy and probably less fitting property syntax.
That’s a shame. The best way to level the playing field here is to allow assignment to getter-only auto-properties from within constructors:
``` c#
public class ConfigurationException : Exception
{
private Configuration configuration;
public bool IsRemote { get; }
public ConfigurationException(Configuration configuration, string message)
: base(message)
{
if (configuration == null)
{
throw new ArgumentNullException(nameof(configuration));
}
this.configuration = configuration;
IsRemote = configuration.Settings["remote"];
}
}
```
The assignment to `IsRemote` would go directly to the underlying field (since there is no setter to call). Thus, semantics are a little different from assignment to get/set auto-properties, where the setter is called even if you assign from a constructor. The difference is observable if the property is virtual. We could restore symmetry by changing the meaning of assignment to a get/set auto-property to also go directly to the backing field, but that would be a breaking change.
### Conclusion
Let’s allow assignment to getter-only auto-properties from constructor bodies. It translates into assignment directly to the underlying field (which is `readonly`). We are ok with the slight difference in semantics from get/set auto-property assignment.
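The difference for virtual properties can be made visible with a small sketch. `Account`, `AuditedAccount` and their members are hypothetical names chosen for this illustration.

```csharp
public class Account
{
    public virtual int Balance { get; set; }   // get/set auto-property
    public virtual int Id { get; }             // getter-only auto-property

    public Account()
    {
        Balance = 1;   // calls the setter, including an override in a derived class
        Id = 2;        // writes the backing field directly; no accessor is called
    }
}

public class AuditedAccount : Account
{
    public int SetterCalls;

    public override int Balance
    {
        get { return base.Balance; }
        set { SetterCalls++; base.Balance = value; }
    }

    public override int Id { get { return 42; } }

    // Non-virtual call to the base getter, which reads the base backing field.
    public int StoredId { get { return base.Id; } }
}
```

Constructing an `AuditedAccount` runs the overriding `Balance` setter once from the base constructor, while the assignment to `Id` bypasses the override entirely and lands in the base class's backing field.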
## Separate accessibility on type and primary constructor
There are scenarios where you don’t want the constructors of your type to have the same accessibility as the type. A common case is where the type is public, but the constructor is private or protected, object construction being exposed only through factories.
Should we invent syntax so that a primary constructor can get a different accessibility than its type?
### Conclusion
No. There is no elegant way to address this. This is a fine example of a scenario where developers should just fall back to normal constructor syntax. With the previous decisions above, we’ve done our best to make sure that that cliff isn’t too steep.
## Separate doc comments for field parameters and their fields
Doc comments for a primary constructor parameter apply to the parameter. If the parameter is a field parameter, there is no way to add a doc comment that goes on the field itself. Should there be?
### Conclusion
No. If the field needs separate doc comments, it should just be declared as a normal field. With the introduction of initialization scopes above, this is now not only possible but easy.
## Null propagating operator associativity
What does the following mean?
``` c#
var x = a?.b.c;
```
People gravitate to two interpretations, which each side maintains is perfectly intuitive and the only thing that makes sense.
One interpretation is that `?.` is an operator much like `.`. It is left associative, and so the meaning of the above is roughly the same as
``` c#
var x = ((var tmp = a) == null ? null : tmp.b).c;
```
In other words, we access `b` only if `a` is not null, but `c` is accessed regardless. This is obviously likely to lead to a null reference exception; after all the use of the null propagating operator indicates that there’s a likelihood that `a` is null. So advocates of the “left associative” interpretation would put a diagnostic on the code above, warning that this is probably bad, and pushing people to write, instead:
``` c#
var x = a?.b?.c;
```
With a null-check again before accessing `c`.
The other interpretation has been called “right associative”, but that isn’t exactly right (no pun intended): better to call it “short circuiting”. It holds that the null propagating operator should short circuit past subsequent member access (and invocation and indexing) when the receiver is null, essentially pulling those subsequent operations into the conditional:
``` c#
var x = ((var tmp = a) == null ? null : tmp.b.c);
```
There are long discussions about this, which I will not attempt to repeat here. The “short circuiting” interpretation is slightly more efficient, and probably more useful. On the other hand it is more complicated to fit into the language, because it needs to “suck up” subsequent operations in a way those operations aren’t “used to”: since when would the evaluation of `e.x` not necessarily lead to `x` being accessed on `e`? So we’d need to come up with alternative versions of member access, indexing and invocation that can represent being part of a short-circuited chain following a null propagating operator.
### Conclusion
Despite the extra complexity and some disagreement on the design team, we’ve settled on the “short circuiting” interpretation.
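The chosen behavior, and how the other reading can still be spelled explicitly, can be sketched as follows. `Outer`, `Inner` and `ShortCircuitSketch` are hypothetical names, and the parenthesized form relies on parentheses ending the short-circuited chain, as they do in released C# compilers.

```csharp
using System;

public sealed class Inner { public string C = "c"; }
public sealed class Outer { public Inner B = new Inner(); }

public static class ShortCircuitSketch
{
    // Short circuiting: when a is null, neither B nor C is accessed
    // and the whole expression evaluates to null.
    public static string Chain(Outer a)
    {
        return a?.B.C;
    }

    // Parentheses end the chain, reproducing the "left associative" reading:
    // C is accessed on the result of a?.B even when that result is null.
    public static bool ParenthesizedThrows(Outer a)
    {
        try
        {
            var unused = (a?.B).C;
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }
}
```

With a null argument, `Chain` returns null whereas the parenthesized form throws.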


@ -0,0 +1,78 @@
# C# Language Design Notes for May 7, 2014
## Agenda
1. protected and internal <_feature cut: not worth the confusion_>
2. Field parameters in primary constructors <_feature cut: we want to keep the design space open_>
3. Property declarations in primary constructors <_interesting but not now_>
4. Typeswitch <_not now: more likely as part of a future more general matching feature_>
## protected and internal
Protected and internal was never a feature we were super enthusiastic about. The CLR supports it and it seemed reasonable to surface it in the language. However, the syntactic options are not great. For every suggestion there are significant and good reasons why it doesn’t work. The community has been incredibly helpful in its creativity about names, as well as in pointing out their flaws.
### Conclusion
We won’t do this feature. Guidance for the scenarios it addresses will be to use `internal`: the most important aspect is to hide the member from external consumers of the assembly. The `protected` aspect is more of a software engineering thing within the team. You could imagine at some point adding the protected aspect as an attribute, either recognized by the compiler or respected by a custom diagnostic.
## Field parameters in primary constructors
Now that we’ve added the initialization scope to classes, it is no longer a problem to have primary constructor parameters with the same name as members. This removes most of the motivation for having the field parameters feature, where an explicit accessibility modifier on a parameter would indicate that there should additionally be a field of that name.
### Conclusion
As the next topic demonstrates, there are more interesting things to consider using this design space for in the future. Let’s not occupy it now with this relatively unimportant feature. It is fine that people have to declare their fields explicitly.
## Property declarations in primary constructors
While declaration of fields in the primary constructor parameter list is of limited value, it is very often the case that a constructor parameter is accompanied by a corresponding property. It might be nice if there was a shorthand for this. You could imagine very terse class declarations completely without bodies in some cases.
A hurdle here is the convention that parameters are `camelCase` (start with lower case) and public properties are `PascalCase` (start with upper case). To be general, we’d need for each parameter to give not one but two names – something like this:
``` c#
public class Point(int x: X, int y: Y);
```
Which would yield public getter-only properties named `X` and `Y` as well as constructor parameters `x` and `y` with which the properties are initialized. It would expand to this:
``` c#
public class Point(int x, int y)
{
public int X { get; } = x;
public int Y { get; } = y;
}
```
This syntax looks fairly nice in the above example, but it gets a little unwieldy when the names are longer:
``` c#
public class Person(string firstName: FirstName, string lastName: LastName);
```
Maybe we could live with not having separate parameter names. We could reuse the syntax we’ve just dropped for field parameters and use it for property parameters instead:
``` c#
public class Person(public string FirstName, public string LastName);
```
This would be shorthand for writing
``` c#
public class Person(string FirstName, string LastName)
{
public string FirstName { get; } = FirstName;
public string LastName { get; } = LastName;
}
```
Now the parameters would show up as PascalCase. This does not seem like a big deal for new types, but it would mean that most current code couldn’t be moved forward to this syntax without breaking callers who use named arguments.
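The named-argument concern can be illustrated with an ordinary constructor. In this sketch `NamedArgumentSketch` and `Create` are hypothetical names; `Person` is written out by hand with conventional camelCase parameters.

```csharp
public class Person
{
    public string FirstName { get; }
    public string LastName { get; }

    public Person(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }
}

public static class NamedArgumentSketch
{
    public static Person Create()
    {
        // The parameter names are part of this call. If the constructor's
        // parameters were renamed to FirstName and LastName, this line
        // would no longer compile.
        return new Person(lastName: "Pascal", firstName: "Blaise");
    }
}
```

Any caller written this way binds to the parameter names, which is why changing their casing is a source-breaking change for existing types.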
The implied association of parameter and property could certainly be useful in its own right. You could imagine allowing the use of object initializers to initialize these getter-only properties. Instead of translating it into setter calls, the compiler would know the corresponding constructor parameters to pass the values to:
``` c#
var p = new Person { LastName = "Pascal", FirstName = "Blaise" };
```
Would turn into:
``` c#
var p = new Person("Blaise", "Pascal");
```
Also, in the future, if we were to consider pattern matching or deconstruction features, this association could be helpful.
### Conclusion
We like the idea of providing a shorthand in the primary constructor parameter list for generating simple corresponding properties. However, we are not ready to go down this route just yet. We need to decide on the upper-case/lower-case issue for one thing. We note that primary constructors already provide quite an improvement over what you have to write in C# 5.0. That’s just going to have to be good enough for now.
## Typeswitch
For a long time we’ve had the idea to add a typeswitch feature to C#. In this coming release, VB is seriously looking at expanding its `Select Case` statement to allow matching on types. Syntactically, this seems to fit right in as a natural extension in VB. In C#, maybe not so much: the `switch` statement is quite restrictive and only a little evolved from C’s original jump table oriented design. It doesn’t easily accommodate such a different form of case condition.
So if we were to add typeswitching capabilities to C#, we most likely would do it as a new feature with its own syntax. Options range from a switch-like construct with blocks for each match, to a more expression-oriented style reminiscent of pattern matching in functional languages.
A major point here is that type switching can be seen as a special case of pattern matching. Would we ever add generalized pattern matching to C#? It certainly seems like a reasonable possibility. If so, then we should think of any typeswitching feature in that light: it needs to have the credible ability to “grow up” into a pattern matching feature in the future.
### Conclusion
We’ve looked some at this, trying to imagine what a pattern matching future would look like. We have some great ideas, but we are not confident that we can map them out at this point to an extent where we would trust a current typeswitch design to fit well with it. And we do not have capacity to design and implement the full feature set in the current round.
Let’s rather hold off on the whole package and see if we can attack it in one go in the future.


@ -0,0 +1,120 @@
# C# Language Design Notes for May 21, 2014
## Agenda
1. Limit the nameof feature? <_keep current design_>
2. Extend params IEnumerable? <_keep current design_>
3. String interpolation <_design nailed down_>
## Limit the nameof feature?
The current design of the `nameof(x)` feature allows for the named entity to reference entities that are not uniquely identified by the (potentially dotted) name given: methods overloaded on signature, and types overloaded on generic arity.
This was rather the point of the feature: since you are only interested in the name, why insist on binding uniquely to just one symbol? As long as there’s at least one entity with the given name, it seems fine to yield the name without error. This sidesteps all the issues with the mythical `infoof()` feature (that would take an entity and return reflective information for it) of coming up with language syntax to uniquely identify overloads. (Also there’s no need to worry about distinguishing generic instantiations from uninstantiated generics, etc.: they all have the same name).
The design, however, does lead to some interesting challenges for the tooling experience:
``` c#
public void M();
public void M(int i);
public void M(string s);
WriteLine(nameof(M)); // writes the string "M"
```
Which definition of `M` should “Go To Definition” on `M` go to? When does renaming an overload cause a rename inside of the `nameof`? Which of the overloads does the occurrence of `nameof(M)` count as a reference to? Etc. The ambiguity is a neat trick at the language level, but a bit of a pain at the tooling level.
Should we limit the application of `nameof` to situations where it is unambiguous?
### Conclusion
No. Let's keep the current design. We can come up with reasonable answers for the tooling challenges. Hobbling the feature would hurt real scenarios.
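The language-level behavior under discussion can be checked with a minimal sketch. The class and method names here are made up for illustration, and the sketch assumes a compiler in which `nameof` is available (C# 6 or later):

``` c#
public static class NameofOverloads
{
    // Three overloads sharing one name, mirroring the example above.
    public static void M() { }
    public static void M(int i) { }
    public static void M(string s) { }

    // nameof does not pick an overload; it only yields the shared name.
    public static string NameOfM() => nameof(M);
}
```

Calling `NameofOverloads.NameOfM()` yields the string `"M"` without any ambiguity error, which is exactly what leaves the tooling questions open.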
## Extend params IEnumerable?
By current design, params is extended to apply to `IEnumerable<T>` parameters. The feature still works by the call site generating a `T[]` with the arguments in it; but that array is of course available inside the method only as an `IEnumerable<T>`.
It has been suggested that we might as well make this feature work for other generic interfaces (or even all types) that arrays are implicitly reference convertible to, instead of just `IEnumerable<T>`.
It would certainly be straightforward to do, though there are quite a lot of such types. We could even infer an element type for the array from the passed-in arguments for the cases where the collection type does not have an element type of its own.
On the other hand, it is usually bad practice for collection-expecting public APIs to take anything more specific than `IEnumerable<T>`. That is especially true if the API is not intending to modify the collection, and no meaningful params method would do so: after all, if your purpose is to cause a side effect on a passed-in collection, why would you give the caller the option not to pass one?
### Conclusion
Params only really makes sense on `IEnumerable<T>`. If we were designing the language from scratch today we wouldn't even have params on arrays, but only on `IEnumerable<T>`. So let's keep the design as is.
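To make the call-site behavior concrete, here is a hand-written sketch of what the design describes. It does not use the feature itself: the `params` modifier is left off so that the code compiles without it, and the names `Sum` and `CallSiteExpansion` are hypothetical:

``` c#
using System.Collections.Generic;
using System.Linq;

public static class ParamsSketch
{
    // Under the design, this parameter would be declared as
    // `params IEnumerable<int> values`. Inside the method the
    // arguments are visible only as an IEnumerable<int>.
    public static int Sum(IEnumerable<int> values) => values.Sum();

    // A call written as Sum(1, 2, 3) would have the call site
    // build an int[] holding the arguments, roughly like this.
    public static int CallSiteExpansion() => Sum(new int[] { 1, 2, 3 });
}
```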
## String interpolation
There have been a number of questions around how to add string interpolation to C#, some a matter of ambition versus simplicity, some just a matter of syntax. In the following we settle on these different design aspects.
### Safety
Concatenation of strings with contents of variables has a long tradition of leading to bugs or even attack vectors, when the resulting string is subsequently parsed and used as a command. Presumably if you make string concatenation easier, you are more vulnerable to such issues – or at least, by having a dedicated string interpolation feature, you have a natural place in the language to help address such problems.
Consequently, string interpolation in the upcoming EcmaScript 6 standard allows the user to indicate a function which will be charged with producing the result, based on compiler-generated lists of string fragments and expression results to be filled in. A given trusted function can prevent SQL injection or ensure the well-formedness of a URI.
#### Conclusion
We don't think accommodating custom interpolators in C# is the sweet spot at this point. Most people are probably just looking for simpler and more readable syntax for filling out holes in strings. However, as we settle on syntax we should keep an eye on our ability to extend for this in the future.
### Culture
In .NET there's a choice between rendering values in the current culture or an invariant culture. This determines how common values such as dates and even floating point numbers are shown in text. The default is current culture, which even language-recognized functions such as `ToString()` make use of.
Current culture is great if what you're producing is meant to be read by humans in the same culture as the program is run in. If you get more ambitious than that with human readers, the next step up is to localize in some more elaborate fashion: looking up resources and whatnot. At that point, you are reaching for heavier hammers than the language itself should probably provide.
There's an argument that when a string is produced for machine consumption it is better done in the invariant culture. After all, it is quite disruptive to a comma-separated list of floating point values if those values are rendered with commas instead of dots as the decimal point!
Should a string interpolation feature default to current or invariant culture, or maybe provide a choice?
#### Conclusion
We think this choice has already been made for us, with the language and .NET APIs broadly defaulting to current culture. That is probably the right choice for most quick-and-easy scenarios. If we were to accommodate custom interpolators in the future, there could certainly be one for culture-invariant rendering.
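The difference between the two defaults can be seen with plain `string.Format`, independent of any interpolation syntax. This is a small sketch with hypothetical helper names; what `Current` returns depends on the culture of the machine it runs on:

``` c#
using System.Globalization;
using System.Linq;

public static class CultureSketch
{
    // Always renders with a dot as the decimal separator.
    public static string Invariant(double value) =>
        string.Format(CultureInfo.InvariantCulture, "{0}", value);

    // Uses the current culture, so the decimal separator may be a comma.
    public static string Current(double value) =>
        string.Format("{0}", value);

    // Machine-readable output: the comma is safe as a separator
    // only because every value is rendered invariantly.
    public static string CsvLine(params double[] values) =>
        string.Join(",", values.Select(v => Invariant(v)));
}
```

For example, `CultureSketch.CsvLine(1.5, 2.25)` produces `1.5,2.25` regardless of the current culture, whereas joining the results of `Current` could produce an ambiguous line in a culture that uses the comma as decimal separator.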
### Syntax
The general format is strings with “holes”, the holes containing expressions to be “printed” in that spot. We'd like the syntax to stress the analogy to `String.Format` as much as possible, and we therefore want to use curly braces `{…}` in the delimiting of holes. We'll return to what exactly goes in the curly braces, but for now there is one central question: how do we know to do string interpolation at all?
There are two approaches we can think of:
1. Provide new syntax around the holes
2. Provide new syntax around the string itself
To the first approach, we previously settled on escaping the initial curly brace of each hole to mean this was a string interpolation hole, and the contents should be interpreted as expression syntax:
``` c#
"Hello, \{name}, you have \{amount} donut{s} left."
```
Here, `name` and `amount` refer to variables in scope, whereas `{s}` is just part of the text.
This has a few drawbacks. It doesn't look that much like a format string, because of the backslash characters in front of the curlies. You also need to visually scan the string to see if it is interpolated. Finally there'd be no natural place for us to indicate a custom interpolation function in the future.
An example of the second approach would be to add a prefix to the string to trigger interpolation, e.g.:
``` c#
$"Hello, {name}, you have {amount} donut\{s\} left."
```
Now the holes can be expressed with ordinary braces, and just like format strings you have to escape braces to actually get them in the text (though we are eager to use backslash escapes instead of the double braces that format strings use). You can see up front that the string is interpolated, and if we ever add support for custom interpolators, the function can be put immediately before or after the `$`; whichever we decide:
``` c#
LOC$"Hello, {name}, you have {amount} donut\{s\} left."
SQL$"…"
URI$"…"
```
The prefix certainly doesn't have to be a `$`, but that's the character we like best for it.
We don't actually have to do it with a prefix. JavaScript is going to use back ticks to surround the string. But a prefix certainly seems better than yet another kind of string delimiter.
#### Conclusion
The prefix approach seems better and more future proof. We are happy to use `$`. It wouldn't compose with the `@` sign used for verbatim strings; it would be either one or the other.
### Format specifiers
Format strings for `String.Format` allow various format specifiers in the placeholders introduced by commas and colons. We could certainly allow similar specifiers in interpolated strings. The semantics would be for the compiler to just turn an interpolated string into a call to `String.Format`, passing along any format specifiers unaltered:
``` c#
$"Hello, {name}, you have {amount,-3} donut\{s\} left."
```
This would be translated to
``` c#
String.Format("Hello, {0}, you have {1,-3} donut{{s}} left.", name, amount)
```
(Note that formatting of literal curlies needs to change if we want to keep our backslash escape syntax, which, tentatively, we do).
The compiler would be free to not call `String.Format`, if it knows how to do things more optimally. This would typically be the case when there are no format specifiers in the string.
#### Conclusion
Allow all format specifiers that are allowed in the format strings of `String.Format`, and just pass them on.
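The translation described above can be tried out as a sketch. The `String.Format` call follows the notes directly. The interpolated form uses the syntax as it eventually shipped in C# 6, which is an assumption relative to these notes: literal braces are written as doubled braces rather than with the backslash escapes considered here. The class and method names are hypothetical:

``` c#
public static class FormatSketch
{
    // The translated call: the alignment specifier is passed along
    // unaltered, and literal braces are doubled.
    public static string ViaFormat(string name, int amount) =>
        string.Format("Hello, {0}, you have {1,-3} donut{{s}} left.", name, amount);

    // The interpolated form as shipped in C# 6 (doubled braces,
    // not the backslash escapes discussed in these notes).
    public static string ViaInterpolation(string name, int amount) =>
        $"Hello, {name}, you have {amount,-3} donut{{s}} left.";
}
```

Both methods return the same string for the same arguments; `{1,-3}` left-aligns the amount in a field of width three.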
### Expressions in the holes
The final – important – question is which expressions can be put between the curly braces. In principle, we could imagine allowing almost any expression, but it quickly gets weird, both from a readability and from an implementation perspective. What if the expression itself has braces or strings in it? We wouldn't be able to just lex our way past it (when to stop?), and similarly a reader, even with the help of colorization, would get mightily confused about what gets closed out when exactly.
Additionally the choice to allow format specifiers limits the kinds of expressions that can unambiguously precede those.
``` c#
$"{a as b ? - c : d}" // ?: or nullable type and format specifier?
```
The other extreme is to allow just a very limited set of expressions. The common case is going to be simple variables anyway, and anything can be expressed by first assigning into variables and then using those in the string.
#### Conclusion
We want to be quite cautious here, at least to begin with. We can always extend the set of expressions allowed, but for now we want to be close to the restrictive extreme and allow only simple and dotted identifiers.


@ -0,0 +1,47 @@
# C# Language Design Notes for July 9, 2014
## Agenda
1. Detailed design of nameof <_details settled_>
2. Design of #pragma warning extensions <_allow identifiers_>
## Details of nameof
The overall design of `nameof` was decided in the design meeting on [Oct 7, 2013](https://roslyn.codeplex.com/discussions/552376). However, a number of issues weren't addressed at the time.
### Syntactic ambiguity
The use of `nameof(…)` as an expression can be ambiguous, as it looks like an invocation. In order to stay compatible, if there's an invokable `nameof` in scope we'll treat it as an invocation, regardless of whether that invocation is valid. This means that in those cases there is no way to apply the nameof operator. The recommendation of course will be to get rid of any use of `nameof` as an identifier, and we should think about having diagnostics helping with that.
### Which operands are allowed?
The symbols recognized in a nameof expression must represent locals, range variables, parameters, type parameters, members, types or namespaces. Labels and preprocessor symbols are not allowed in a nameof expression.
In general, free-standing identifiers are looked up like simple names, and dotted rightmost identifiers are looked up like member access. It is thus an error to reference locals before their declaration, or to reference inaccessible members. However, there are some exceptions:
_All members are treated as if they were static members._ This means that instance members are accessed by dotting off the type rather than an instance expression. It also means that the accessibility rules around protected instance members are the simpler rules that apply to static members.
_Generic types are recognized by name only._ Normally there needs to be a type parameter list (or at least dimension specifier) to disambiguate, but type parameter lists or dimension specifiers are not needed, and in fact not allowed, on the rightmost identifier in a nameof.
_Ambiguities are not an error._ Even if multiple entities with the same name are found, nameof will succeed. For instance, if a property named `M` is inherited through one interface and a method named `M` is inherited through another, the usual ambiguity error will not occur.
### The referenced set
Because ambiguities are allowed, a nameof operator can reference a set of different entities at the same time. The precise set of referenced entities in the presence of ambiguity can be loosely defined as “those it would be ambiguous between”. Thus, shadowed members or other entities that wouldn't normally be found by lookup, e.g. because they are in a base class or an enclosing scope of where an entity is found, will not be part of the referenced set.
The notion of referenced set has little importance for the language-level semantics, but is important for the tooling experience, e.g. for refactorings, go-to-definition, etc.
Reference to some entities, e.g. obsolete members, `Finalize` or `op_` methods, is normally an error. However, it is not an error in `nameof(…)` unless _all_ members of the referenced set would give an error. If all non-error references give warnings, then a warning is given.
### The resulting string
C# doesn't actually have a notion of canonical name. Instead, equality between names is currently defined directly _between_ names that may contain special symbols.
For `nameof(… i)` we want the resulting string to be the identifier `i` given, except that formatting characters are omitted, and Unicode escapes are resolved. Also, any leading `@` is removed.
In the case of aliases, this means that those are not resolved to their underlying meaning: the identifier is that of the alias itself.
As a result, the meaning of the identifier is always only used to check if it is valid, never to decide what the resulting string is. There is no semantic component to determining the result of a nameof operator, only to determining if it is allowed.
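A few of these rules can be illustrated with a short sketch. It assumes the behavior of `nameof` as it shipped in C# 6, and the alias `IntList` and the class name are made up for the example:

``` c#
using IntList = System.Collections.Generic.List<int>;

public static class NameofStrings
{
    // A leading @ is removed from the resulting string.
    public static string VerbatimIdentifier(int @class) => nameof(@class);

    // An alias is not resolved: the result is the alias itself.
    public static string Alias() => nameof(IntList);

    // An instance member is reached by dotting off the type,
    // and only the rightmost identifier ends up in the string.
    public static string InstanceMemberViaType() => nameof(string.Length);
}
```

The three methods return `"class"`, `"IntList"` and `"Length"` respectively; in each case the binding is only used to check that the operand is valid.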
## Pragma warning directives
Now that custom diagnostics are on their way, we want to allow users to turn these on and off from source code, just as we do with the compiler's own diagnostics today. To allow this, we need to extend the model of how a diagnostic is identified: today a number is used, but that is not a scalable model when multiple diagnostic providers are involved.
Instead the design is that diagnostics are identified by an identifier. For compatibility the C# compiler's own diagnostics can still be referenced with a number, but can also be referred to with the pattern `CS1234`:
``` c#
#pragma warning disable AsyncCoreSet
#pragma warning disable CS1234
```
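As a small sketch of the identifier form applied to one of the compiler's own diagnostics: CS0168 is the existing "variable is declared but never used" warning, and the class and method names are hypothetical. The numeric form `#pragma warning disable 168` would have the same effect:

``` c#
public static class PragmaSketch
{
    public static int Quiet()
    {
        // Suppress the "declared but never used" warning by identifier.
#pragma warning disable CS0168
        int unused;
#pragma warning restore CS0168
        return 42;
    }
}
```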


@ -0,0 +1,168 @@
# C# Design Notes for Aug 27, 2014
## Agenda
The meeting focused on rounding out the feature set around structs.
1. Allowing parameterless constructors in structs <_allow, but some unresolved details_>
2. Definite assignment for imported structs <_revert to Dev12 behavior_>
## Parameterless constructors in structs
Unlike classes, struct types cannot declare a parameterless constructor in C# and VB today. The reason is that the syntax `new S()` in C# has historically been reserved for producing a zero-initialized instance of the struct. VB.Net has always had an alternative syntax for that (`Nothing`) and C# 2.0 also added one: `default(T)`. So the `new S()` syntax is no longer necessary for this purpose.
It is possible to define parameterless constructors for structs in IL, but neither C#, VB nor F# allows you to. All three languages have mostly sane semantics when consuming one, though, _mostly_ having `new S()` call the constructor instead of zero-initializing the struct (except in some corner cases visited below).
Not being able to define parameterless constructors in structs has always been a bit of a pain, and now that we're adding initializers to structs it becomes outright annoying.
### Conclusion
We want to add the ability to declare explicit public parameterless constructors in structs, and we also want to think about reducing the number of occurrences of `new S()` that produce a default value. In the following we explore details and additional proposals.
### Accessibility
C#, VB and F# will all call an accessible parameterless constructor if they find one. If there is one, but it is not accessible, C# and VB will backfill `default(T)` instead. (F# will complain.)
It is problematic to have successful but different behavior of `new S()` depending on where you are in the code. To minimize this issue, we should make it so that explicit parameterless constructors have to be public. That way, if you want to replace the “default behavior” you do it everywhere.
#### Conclusion
Parameterless constructors will be required to be public.
### Compatibility
Non-generic instantiation of structs with (public) parameterless constructors does the right thing in all three languages today. With generics it gets a little more subtle. All structs satisfy the `new()` constraint. When `new T()` is called on a type parameter `T`, the compiler _should_ generate a call to `Activator.CreateInstance` – and in VB and F# it does. However, C# tries to be smart, discovers at runtime that `T` is a struct (if it doesn't already know from the `struct` constraint), and emits `default(T)` instead!
``` c#
public T M<T>() where T: new() { return new T(); }
```
Clearly we should remove this “optimization” and always call `Activator.CreateInstance` in C# as well. This is a bit of a breaking change, in two ways. Imagine the above method is in a library:
1. The more obvious – but also more esoteric – break is if people today call the library with a struct type (written directly in IL) which has a parameterless constructor, yet they depend on the library _not_ calling that parameterless constructor. That seems extremely unlikely, and we can safely ignore this aspect.
2. The more subtle issue is if such a library is not recompiled as we start populating the world with structs with parameterless constructors. The library is going to be wrongly not calling those constructors until someone recompiles it. But if it's a third party library and they've gone out of business, no-one ever will.
We believe even the second kind of break is relatively rare. The `new()` constraint isn't used much. But it would be nice to validate.
#### Conclusion
Change the codegen for generic `new T()` in C# but try to validate that the pattern is rare in known code.
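The two code generation strategies can be written out by hand as a sketch. `OldShape` is only an approximation of the optimization described above, expressed in source rather than IL, so its exact form is an assumption; the class and method names are hypothetical:

``` c#
using System;

public static class GenericNewSketch
{
    // What the source looks like: the compiler decides how to create T.
    public static T ViaNew<T>() where T : new() => new T();

    // Approximate shape of the "smart" codegen: value types take the
    // default(T) path and never reach Activator.CreateInstance.
    public static T OldShape<T>() where T : new() =>
        default(T) == null ? Activator.CreateInstance<T>() : default(T);

    // The proposed codegen: always go through Activator.CreateInstance,
    // so a struct's parameterless constructor would be run.
    public static T NewShape<T>() where T : new() =>
        Activator.CreateInstance<T>();
}
```

For types without a parameterless struct constructor, such as `int` or any class, all three produce the same result. They only diverge for structs that declare one.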
### Default arguments
For no good reason C# allows `new` expressions for value types to occur as default arguments to optional parameters:
``` c#
void M(S s = new S()){ … }
```
This is one place where we cannot (and do not) call a parameterless constructor even when there is one. This syntax is plain bad. It suggests one meaning but delivers another.
We should do what we can (custom diagnostic?) to move developers over to use `default(S)` with existing types. More importantly we should not allow this syntax at all when `S` has a parameterless constructor. This would be a slight breaking change for the vanishingly rare IL-authored structs that do today, but so be it.
#### Conclusion
Forbid `new S()` in default arguments when `S` has a parameterless constructor, and consider a custom diagnostic when it doesn't. People should use `default(S)` instead.
### Helpful diagnostic
In general, with this change we are trying to introduce more of a distinction between default values and constructed values for structs. Today it is very blurred by the use of `new S()` for both meanings.
Arguably the use of `new S()` to get the default value is fine as long as `S` does not have any explicit constructors. It can be viewed a bit like making use of the default constructor in classes, which gets generated for you if you do not have _any_ explicit constructors.
The confusion is when a struct type “intends” to be constructed, by advertising one or more constructors. Provided that none of those is parameterless, `new S()` _still_ creates an unconstructed default value. This may or may not be the intention of the calling code. Oftentimes it would represent a bug where they meant to construct it (and run initializers and so forth), but the lack of complaint from the compiler caused them to think everything was all right.
Occasionally a developer really does want to create an uninitialized value even of a struct that has constructors. In those cases, though, their intent would be much clearer if they used the `default(S)` syntax instead.
It therefore seems that everyone would be well served by a custom diagnostic that would help “clear up the confusion” as it were, by
* Flagging occurrences of `new S()` where `S` has constructors but not a parameterless one
* Offering a fix to change to `default(S)`, as well as fixes to call the constructors
This would help identify subtle bugs where they exist, and make the developer's intent clearer when the behavior is intentional.
The issue of course is how disruptive such a diagnostic would be to existing code. Would it be so annoying that they would just turn it off? Also, is the above assumption correct, that the occurrence of any constructor means that the library author intended for a constructor to always run?
#### Conclusion
We are cautiously interested in such a diagnostic, but concerned that it would be too disruptive. We should evaluate its impact on current code.
### Chaining to the default constructor when there's a parameterless one
A struct constructor must ensure that the struct is definitely assigned. It can do so by chaining to another constructor or by assigning to all fields.
For structs with auto-properties there is an annoying fact that you cannot assign to the underlying field because its name is hidden, and you cannot assign to the setter, because you are not allowed to invoke a function member until the whole struct is definitely assigned. Catch 22!
People usually deal with this today by chaining to the default constructor – which will zero-initialize the entire struct. If there is a user-defined parameterless constructor, however, that will not work. (Especially not if that is the constructor you are trying to implement!)
There is a workaround. Instead of writing
``` c#
S(int x): this() { this.X = x; }
```
You can make use of the fact that in a struct, `this` is an l-value:
``` c#
S(int x) { this = default(S); this.X = x; }
```
It's not pretty, though. In fact it's rather magical. We may want to consider adding an alternative syntax for zero-initializing from a constructor; e.g.:
``` c#
S(int x): default() { this.X = x; }
```
However, it is also worth noting that auto-properties themselves are evolving. You can now directly initialize their underlying field with an initializer on the auto-property. And for getter-only auto-properties, assignment in the constructor will also directly assign the underlying field. So maybe the problem just isn't there any longer. You can just zero-initialize the auto-properties directly:
``` c#
public int X { get; set; } = 0;
```
Now the definite assignment analysis will be happy when you get to running a constructor body.
#### Conclusion
Do nothing about this right now, but keep an eye on the issue.
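For reference, the two existing ways of getting a struct with an auto-property definitely assigned can be put side by side in a compilable sketch. The struct name `Meters` and the extra `viaAssignment` parameter, which exists only to tell the two constructors apart, are made up for the example:

``` c#
public struct Meters
{
    public int Value { get; set; }

    // Chains to the default constructor, which zero-initializes
    // the whole struct before the setter is invoked.
    public Meters(int value) : this()
    {
        this.Value = value;
    }

    // Uses the fact that `this` is an l-value in a struct.
    public Meters(int value, bool viaAssignment)
    {
        this = default(Meters);
        this.Value = value;
    }
}
```

Both constructors leave `Value` set to the argument. Only the second would keep working unchanged if `Meters` gained a user-defined parameterless constructor that does more than zero-initialize.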
### Generated parameterless constructors
The current rule is that initializers are only allowed if there are constructors that can run them. This seems reasonable, but look at the following code:
``` c#
struct S
{
string label = "<unknown>";
bool pending = true;
public S(){}
}
```
Do we _really_ want to force people to write that trivial constructor? Had this been a class, they would just have relied on the compiler-generated default constructor.
It is probably desirable to at least do what classes do and generate a default constructor when there are no other constructors. Of course we wouldn't generate one when there are no initializers either: that would be an unnecessary (and probably slightly breaking) change over what we do today, as the generated constructor would do exactly the same as the default `new S()` behavior anyway.
A question though is if we should generate a parameterless constructor to run initializers even if there are also parameterful ones. After all, don't we want to ensure that initializers get run in as many cases as possible?
This seems somewhat attractive, though it does mean that a struct with initializers doesn't get to choose _not_ to have a generated parameterless constructor that runs the initializers.
Also, in the case that there's a primary constructor it becomes uncertain what it would mean for a parameterless constructor to run the initializers: after all they may refer to primary constructor parameters that aren't available to the parameterless constructor:
``` c#
struct Name(string first, string last)
{
string first = first;
string last = last;
}
```
How is a generated parameterless constructor supposed to run those initializers? To make this work, we would probably have to make the parameterless constructor chain to the primary constructor (all other constructors must chain to the primary constructor), passing _default values_ as arguments.
Alternatively we could require that all structs with primary constructors _also_ provide a parameterless constructor. But that kind of defeats the purpose of primary constructors in the first place: doing the same with less code.
In all we seem to have the following options:
1. Don't generate anything. If you have initializers, you must also provide at least one constructor. The only change from today's design is that one of those constructors can be parameterless.
2. Only generate a parameterless constructor if there are no other constructors. This most closely mirrors class behavior, but it may be confusing that adding an explicit constructor “silently” changes the meaning of `new S()` back to zero-initialization. (The above diagnostic would help with that, though).
3. Generate a parameterless constructor only when there is not a primary constructor and
    a. Still fall back to zero-initialization for `new S()` in this case
    b. Require a parameterless constructor to be explicitly specified

    This seems to introduce an arbitrary distinction between primary constructors and other constructors that prevents easy refactoring back and forth between them.
4. Generate a parameterless constructor even when there is a primary constructor
    a. using default values and/or
    b. some syntax to provide the arguments as part of the primary constructor

    This seems overly magical, and again treats primary constructors more differently than was the intent with their design.
#### Conclusion
This is a hard one, and we didn't reach agreement. We probably want to do at least option 2, since part of our goal is for structs to become more like classes. But we need to think more about the tradeoffs between that and the more drastic (but also more helpful?) approaches.
## Definite assignments for imported structs
Unlike classes, private fields in structs do need to be observed in various ways on the consumer side – they cannot be considered entirely an implementation detail.
In particular, in order to know if a struct is definitely assigned we need to know if its fields have all been initialized. For inaccessible fields, there is no sure way to do that piecemeal, so if such inaccessible fields exist, the struct-consuming code must insist that the struct value as a whole has been constructed or otherwise initialized.
So the key is to know if the struct has inaccessible fields. The native compiler had a long-running bug that would cause it to check imported structs for inaccessible fields _only_ where those fields were of value type! So if the struct had only a private field of a reference type, the compiler would fail to ensure that it was definitely assigned.
In Roslyn we started out implementing the specification, which was of course stricter and turned out to break some code (that was buggy and should probably have been fixed). Instead we then went to the opposite extreme and just stopped ensuring definite assignment of these structs altogether. This led to its own set of problems, primarily in the form of a new set of bugs that went undetected because of the laxer rules.
Ideally we would go back to implementing the spec. This would break old code, but have the best experience for new code. If we had a “quirks” mode approach, we could allow e.g. the lang-version flag to be more lax on older versions. Part of migrating a code base to the new version of the language would involve fixing this kind of issue.
### Conclusion
Unfortunately we do not have the notion of a quirks mode. Like a number of issues before, this one alone does not warrant introducing one – after all, it is a new kind of upgrade tax on customers. We should compile a list of things we would do if we had a quirks mode, and evaluate if the combined value would be enough to justify it.
Definite assignment for structs should be on that list. In the meantime however, the best we can do is to revert to the behavior of the native compiler, so that's what we'll do.
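As background for why field accessibility matters here, this sketch shows piecemeal definite assignment in the case where it is possible, namely when every field is accessible. The type and member names are hypothetical:

``` c#
public struct Pair
{
    public int First;
    public int Second;
}

public static class DefiniteAssignmentSketch
{
    public static int SumPiecemeal()
    {
        Pair p;          // declared, but not yet definitely assigned
        p.First = 1;
        p.Second = 2;    // every field is now assigned, so p is too
        return Sum(p);   // allowed: p is definitely assigned here
    }

    private static int Sum(Pair p) => p.First + p.Second;
}
```

If `Pair` had a private field, consuming code could not assign it this way, and according to the specification would have to construct or otherwise initialize the value as a whole before using it.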


@ -0,0 +1,119 @@
# C# Design Notes for Sep 3, 2014
Quote of the day: “It's a design smell. But it's a good smell.”
## Agenda
The meeting focused on rounding out the design of declaration expressions
1. Removing “spill out” from declaration expressions in simple statements <_yes, remove_>
2. Same name declared in subsequent else-ifs <_condition decls out of scope in else-branch_>
3. Add semicolon expressions <_not in this version_>
4. Make variables in declaration expressions readonly <_no_>
## “Spill out”
The scope rules for variables introduced in declaration expressions are reasonably regular: the scope of such a variable extends to the nearest enclosing statement, and like all local variables, it may be used only after it has been defined, textually.
We did make a couple of exceptions, though: an expression-statement or a declaration-statement does _not_ serve as a boundary for such a variable – instead it “spills out” to the directly enclosing block – if there is one.
Similarly, a declaration expression in one field initializer is in scope for neighboring field initializers (as long as they are in the same part of the type declaration).
This was supposed to enable scenarios such as this:
``` c#
GetCoordinates(out var x, out var y);
… // use x and y;
```
to address the complaint that it is too much of a hassle to use out and ref parameters. But we have a nagging suspicion that this scenario – pick up the value in one statement and use it in the next – is not very common. Instead the typical scenario looks like this:
``` c#
if (int.TryParse(s, out int i)) { … i … }
```
where the introduced local is used in the _same_ statement as it is declared in.
Outside of conditions, probably the most common use is the inline common-subexpression refactoring, where the result of an expression is captured into a variable the first time it is used, so the variable can be applied to the remaining ones:
``` c#
Console.WriteLine("Result: {0}", (var x = GetValue()) * x);
```
The spill-out is actually a bit of a nuisance for the somewhat common scenario of passing dummies to ref or out parameters that you don't need (common in COM interop scenarios), because you cannot use the same dummy names in subsequent statements.
From a rule regularity perspective, the spilling is quite complicated to explain. It would be a meaningful simplification to get rid of it. While complexity of spec'ing and implementing shouldn't stand in the way of a good feature, it is often a smell that the design isn't quite right.
### Conclusion
Let's get rid of the spilling. Every declaration expression is now limited in scope to its nearest enclosing statement. We'll live with the (hopefully) slight reduction in usage scenarios.
## Else-ifs
Declaration expressions lend themselves particularly well to a style of programming where an if/else-if chain goes through various options, each represented by a variable declared in a condition, using those variables in the then-clause:
``` c#
if ((var i = o as int?) != null) { … i … }
else if ((var s = o as string) != null) { … s … }
else if …
```
This particular pattern _looks_ like a chain of subsequent options, and even indents like that, but linguistically the else clauses are nested. For that reason, with our current scope rules the variable `i` introduced in the first condition is in scope in all the rest of the statement – even though it is only meaningful and interesting in the then-branch. In particular, it blocks another variable with the same name from being introduced in a subsequent condition, which is quite annoying.
We do want to solve this problem. There is no killer option that we can think of, but there are a couple of plausible approaches:
1. Change the scope rules so that variables declared in the condition of an if are in scope in the then-branch but not in the else-branch
2. Remove the restriction that locals cannot be shadowed by other locals
3. Do something very scenario specific
### Changing the scope rules
Changing the scope rules would have the unfortunate consequence of breaking the symmetry of if-statements, so that
``` c#
if (b) S1 else S2
```
no longer means exactly the same as
``` c#
if (!b) S2 else S1
```
It kind of banks on the fact that the majority of folks who would introduce declaration expressions in a condition would do so for use in the then-branch only. That certainly seems to be likely, given the cases we have seen (type tests, uses of the `Try…` pattern, etc.). But it still may be surprising to some, and downright bothersome in certain cases.
Worse, there may be tools that rely on this symmetry principle. Refactorings to swap then and else branches (negating the condition) abound. These would no longer always work.
Moreover, of course, this breaks with the nice simple scoping principle for declaration expressions that we just established above: that they are bounded (only) by their enclosing statement.
### Removing the shadowing restriction
Since C# 1.0, it has been forbidden to shadow a local variable or parameter with another one. This is seen as one of the more successful rules of hygiene in C# - it makes code safe for refactoring in many scenarios, and just generally easier to read.
There are existing cases where this rule is annoying:
``` C#
task.ContinueWith(task => … task …); // Same task! Why can't I name it the same?
```
Here it seems the rule even runs counter to refactoring, because you need to change every occurrence of the name when you move code into a lambda.
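As a rough illustration of the workaround the rule forces, the sketch below renames the lambda parameter so it does not collide with the enclosing parameter. `ContinuationNaming` and `DoubleLater` are hypothetical names; only `Task.ContinueWith` and `Task.FromResult` are real library calls.

``` c#
using System;
using System.Threading.Tasks;

static class ContinuationNaming
{
    public static Task<int> DoubleLater(Task<int> task)
    {
        // The lambda parameter is called `t` only because `task` is already
        // taken by the enclosing method's parameter. Both names refer to the
        // same task object when the continuation runs.
        return task.ContinueWith(t => t.Result * 2);
    }

    static void Main()
    {
        Console.WriteLine(DoubleLater(Task.FromResult(21)).Result);
    }
}
```

Moving the body of the lambda in or out of the method means renaming every `t` to `task` or back, which is the refactoring friction described above.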
Lifting this restriction would certainly help the else-if scenario. While previous variables would still be in scope, you could now just shadow them with new ones if you choose.
If you do not choose to use the same name, however, the fact that those previous variables are in scope may lead to confusion or accidental use.
More importantly, are we really ready to part with this rule? It seems to be quite well appreciated as an aid to avoid subtle bugs.
### Special casing the scenario
Instead of breaking with general rules, maybe we can do something very localized? Some combination of the two?
It would have to work both in then and else branches; otherwise, it would still break the if symmetry, and be as bad as the first option.
We could allow only variables introduced in conditions of if-statements to be shadowed only by other variables introduced in conditions of if-statements?
This might work, but seems inexcusably ad-hoc, and is almost certain to cause a bug tail in many tools down the line, as well as confusion when refactoring code or trying to experiment with language semantics.
### Conclusion
It seems there truly is no great option here. However, we'd rather solve the problem with a wart or two than not address it at all. On balance, option 1, the special scope rule for else clauses, seems the most palatable, so that's what we'll do.
## Semicolon expressions
We previously proposed a semicolon operator, to be commonly used with declaration expressions, to make “let-like” scenarios a little nicer:
``` c#
Console.WriteLine("Result: {0}", (var x = GetValue(); x * x));
```
Instead of being captured on first use, the value is now captured first, _then_ used multiple times.
We are not currently on track to include this feature in the upcoming version of the language. The question is: should we be? There's an argument that declaration expressions only come to their full use when they can be part of such a let-like construct. Also, there are cases (such as conditional expressions) where you cannot just declare the variable on first use, since the use is in a branch separate from later uses.
Nevertheless, it might be rash to say that this is our let story. Is this how we want let to look in C#? We don't easily get another shot at addressing the long-standing request for a let-like expression. It probably needs more thought than we have time to devote to it now.
### Conclusion
Let's punt on this feature and reconsider in a later version.
## Should declaration expression variables be mutable?
C# is an imperative language, and locals are often used in a way that depends on mutating them sometime after initialization. However, you could argue that this is primarily useful when used across statements, whereas it generally would be a code smell to have a declaration that's only visible _inside_ one statement rely on mutation.
This may or may not be the case, but declaration _expressions_ also benefit from a strong analogy with declaration _statements_. It would be weird that `var s = GetString()` introduces a readonly variable in one setting but not another. (Note: it does in fact introduce a readonly variable in a few situations, like foreach and using statements, but those can be considered special).
### Conclusion
Let's keep declaration expressions similar to declaration statements. It is too weird if a slight refactoring causes the meaning to change. It may be worth looking at adding readonly locals at a later point, but that should be done in an orthogonal way.


@ -0,0 +1,118 @@
There were two agenda items...
1. Assignment to readonly autoprops in constructors (we fleshed out details)
2. A new compiler warning to prevent outsiders from implementing your interface? (no, leave this to analyzers)
# Assignment to readonly autoprops in constructors
```cs
public struct S {
    public int x {get;}
    public int y {get; set;}
    public Z z {get;}
    public S() {
        x = 15;
        y = 23;
        z.z1 = 1;
    }
}
public struct Z { int z1; }
```
_What are the rules under which assignments to autoprops are allowed in constructors?_
__Absolute__ We can't be more permissive in what we allow with readonly autoprops than we are with readonly fields, because this would break PEVerify. (Incidentally, PEVerify doesn't check definite assignment in the constructor of a struct; that's solely a C# language thing).
__Overall principle__ When reading/writing to an autoprop, do we go via the accessor (if there is one) or do we bypass it (if there is one) and access the underlying field directly?
_Option1:_ language semantics say the accessor is used, and codegen uses it.
_Option2:_ in an appropriate constructor, when there is a "final" autoprop (either non-virtual, or virtual in a sealed class), access to an autoprop _means_ an access to the underlying field. This meaning is used for definite assignment, and for codegen. Note that it is semantically visible whether we read from an underlying field vs through an accessor, e.g. in `int c { [CodeSecurity] get;}`
_Resolution: Option1_. Under Option2, if you set a breakpoint on the getter of an autoprop, gets of it would not hit the breakpoint if they were called in the constructor, which is weird. Also it would be weird that making the class sealed or the autoprop non-virtual would have this subtle change. And things like PostSharp wouldn't be able to inject. All round Option2 is weird and Option1 is clean and expected.
__Definite Assignment__. Within an appropriate constructor, what exactly are the rules for definite assignment? Currently if you try to read a property before _all_ fields have been assigned then it says CS0188 'this' cannot be used before all fields are assigned, but reading a field is allowed so long as merely that field has been assigned. More precisely, within an appropriate constructor, for purposes of definite assignment analysis, when does access of the autoprop behave as if it's an access of the backing field?
_Option1_: never
_Option2_: Only in case of writes to readonly autoprops
_Option3_: In the case of writes to all autoprops
_Option4_: In the case of reads and writes to all autoprops
_Resolution: Option4_. This is the most helpful to developers. You might wonder what happens if it's a virtual autoprop and someone overrides getter or setter in derived types in such a way that would violate the definite assignment assumptions. But for structs there won't be derived types, and for classes the semantics say that all fields are assigned to default(.) so there's no difference.
__Piecewise initialization of structs__. In the code above, do we allow `z.z1 = 1` to assign to the _field_ of a readonly struct autoprop?
_Option1:_ Yes by treating access to "z" for purposes of definite assignment as an access of the underlying field.
_Option2:_ No because in `z.z1` the read of `z` happens via the accessor as per the principle above, and thus returns an rvalue, and hence assignment to `z.z1` can't work. Instead you will have to write `z = new Z(...)`.
_Resolution: Option2_. If we went with Option1, then readonly autoprops would end up being more expressive than settable autoprops which would be odd! Note that in VB you can still write `_z.z1 = 15` if you do want piecewise assignment.
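A minimal sketch of what the resolutions so far mean in practice, assuming the feature behaves as described in these notes. `Holder` and its members are hypothetical, and unlike the example above, the field `z1` is made public and `Z` is given a constructor so that the snippet compiles on its own.

```cs
using System;

public struct Z
{
    public int z1;
    public Z(int z1) { this.z1 = z1; }
}

// Hypothetical type with two getter-only autoprops.
public class Holder
{
    public int X { get; }
    public Z Value { get; }

    public Holder()
    {
        // Assignment to a getter-only autoprop is allowed in the constructor.
        X = 15;

        // Value.z1 = 1;   // rejected: reading Value goes through the getter
        //                 // and yields a copy, not a variable

        // Whole-value assignment is the supported way to initialize it.
        Value = new Z(1);
    }

    static void Main()
    {
        var h = new Holder();
        Console.WriteLine(h.X + " " + h.Value.z1);
    }
}
```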
__Virtual__. What should happen if the readonly autoprop is virtual, and its getter is overridden in a derived class?
_Resolution:_ All reads of the autoprop go via its accessor, as is already the case.
__Semantic model__. In the line `x = 15` what should the Roslyn semantic model APIs say for the symbol `x` ?
_Resolution:_ they refer to the property x. Not the backing field. (Under the hood of the compiler, during lowering, if in an appropriate constructor, for write purposes, it is implicitly transformed into a reference to the backing field). More specifically, for access to an autoprop in the constructor,
1. It should _bind_ to the property, but the property should be treated as a readable+writable (for purposes of what's allowed) in the case of a readonly autoprop.
2. The definite assignment behavior should be as if directly accessing the backing field.
3. It should code gen to the property accessor (if one exists) or a field access (if not).
__Out/ref arguments in C#__. Can you pass a readonly autoprop as an out/ref argument in C#?
_Resolution: No_. For readonly autoprops passed as _ref_ arguments, that wouldn't obey the principle that access to the prop goes via its accessor. For passing readonly autoprops as _out_ arguments with the hope that it writes to the underlying field, that wouldn't obey the principle that we bind to the property rather than the backing field. For writeonly autoprops, they don't exist because they're not useful.
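Since a readonly autoprop cannot be passed as an `out` argument, the usual pattern is to go through a local. The following is a small sketch; `Parsed` and `Number` are hypothetical names.

```cs
using System;

public class Parsed
{
    public int Number { get; }

    public Parsed(string text)
    {
        // int.TryParse(text, out Number);   // rejected: a property is not a variable

        // Receive the out value in a local, then assign the autoprop.
        int number;
        int.TryParse(text, out number);
        Number = number;
    }

    static void Main()
    {
        Console.WriteLine(new Parsed("23").Number);
    }
}
```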
__Static readonly autoprops__ Should everything we've written also work for static readonly autoprops?
_Resolution: Yes._ Note there's currently a bug in the native compiler (fixed in Dev14) where the static constructor of a type `G<T>` is able to initialize static readonly fields in specializations of G, e.g. `G<T>.x=15;`. The CLR does indeed maintain separate storage locations for each static readonly field, so `G<int>.x` and `G<string>.x` are different variables. (The native compiler's bug where the static constructor of G could assign to all of them resulted in unverifiable code).
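The point about separate storage per specialization can be observed directly. In this sketch (the member name `X` is chosen for illustration), each constructed type runs its own static constructor and initializes its own static readonly autoprop.

```cs
using System;

public static class G<T>
{
    // One backing field exists per constructed type G<int>, G<string>, ...
    public static string X { get; }

    static G()
    {
        // Assigns only the X that belongs to this particular G<T>.
        X = typeof(T).Name;
    }
}

public static class GenericStaticsDemo
{
    static void Main()
    {
        Console.WriteLine(G<int>.X);
        Console.WriteLine(G<string>.X);
    }
}
```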
__VB rules in initializers as well as constructors__. VB initializers are allowed to refer to other members of a class, and VB initializers are all executed during construction time. Should everything we've said about behavior in C# constructors also apply to behavior in VB initializers?
_Resolution: Yes_.
__VB copyback for ByRef parameters__. In VB, when you pass an argument to a ByRef parameter, then either it passes it as an lvalue (if the argument was a local variable or field or similar) or it uses "copy-in to a temporary then invoke the method then copy-out from the temporary" (if the argument was a property), or it uses "copy-in to a temporary then invoke the method then ignore the output" (if the argument was an rvalue or a constant). What should happen when you pass a readonly autoprop to a ByRef parameter?
_Option1:_ Emit a compile-time error because copyback is mysterious and bites you in mysterious ways, and this new way is even more mysterious than what was there before.
_Option2:_ Within the constructor/initializers, copy-in by reading via the accessor, and copy-back by writing to the underlying field. Elsewhere, copy-in with no copy-out. Also, just as happens with readonly fields, emit an error if assignment to a readonly autoprop happens in a lambda in a constructor (see code example below)
_Resolution: Option2_. Exactly as happens today for readonly fields. Note incidentally that passing a readonly autoprop to a ByRef parameter will have one behavior in the constructor and initializers (it will do the copy-back), and will silently have different behavior elsewhere (it won't do any copy-back). This too is already the case with readonly fields. On a separate note, developers would like to have feedback in some cases (not constants or COM) where copyback in a ByRef argument isn't done. But that's not a question for the language design meeting.
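C# has no implicit copy-back, so the two VB behaviors can be spelled out by hand. This is only an emulation under that assumption, not what the VB compiler literally emits; `Counter`, `CopyBack` and `SetTo23` are hypothetical names.

```cs
using System;

public class Counter
{
    public int Value { get; set; }
}

public static class CopyBack
{
    static void SetTo23(ref int x) { x = 23; }

    // Emulates "copy-in to a temporary, invoke, copy-out from the temporary".
    public static void WithCopyBack(Counter c)
    {
        int temp = c.Value;   // copy-in via the getter
        SetTo23(ref temp);
        c.Value = temp;       // copy-out via the setter
    }

    // Emulates "copy-in to a temporary, invoke, ignore the output".
    public static void WithoutCopyBack(Counter c)
    {
        int temp = c.Value;   // copy-in via the getter
        SetTo23(ref temp);    // the callee's write is silently dropped
    }

    static void Main()
    {
        var a = new Counter { Value = 15 };
        var b = new Counter { Value = 15 };
        WithCopyBack(a);
        WithoutCopyBack(b);
        Console.WriteLine(a.Value + " " + b.Value);
    }
}
```

The second method shows why a silent missing copy-back can surprise callers: the call looks identical, yet the property keeps its old value.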
__VB copyin for writeonly autoprops__. VB tentatively has writeonly autoprops for symmetry, even though they're not useful. What should happen when you pass a writeonly autoprop as a ByRef argument?
_Resolution: Yuck._ This is a stupid corner case. Notionally the correct thing is to read from the backing field, and write via the setter. But if it's easier to just remove support for writeonly autoprops, then do that.
```vb
Class C
    ReadOnly x As Integer = 15
    Public Sub New()
        f(x)
        Dim lambda = Sub()
                         f(x) ' error BC36602: 'ReadOnly' variable
                         ' cannot be the target of an assignment in a lambda expression
                         ' inside a constructor.
                     End Sub
    End Sub
    Shared Sub f(ByRef x As Integer)
        x = 23
    End Sub
End Class
```
# A new compiler warning to prevent outsiders from implementing your interface?
We discussed a potential new error message in the compiler.
__Scenario:__ Roslyn ships with ISymbol interface. In a future release it wants to add additional members to the interface. But this will break anyone who implemented ISymbol in its current form. Therefore it would be good to have a way to prevent anyone _else_ from implementing ISymbol. That would allow us to add members without breaking people.
Is this scenario widespread? Presumably, but we don't have data and haven't heard asks for it. There are a number of workarounds today. Some workarounds provide solid code guarantees. Other workarounds provide "suggestions" or "encouragements" that might be enough for us to feel comfortable breaking people who took dependencies where we told them not to.
__Counter-scenario:__ Nevertheless, I want to _MOCK_ types. I want to construct a mock ISymbol myself, maybe using Moq, and pass it to my functions which take in an ISymbol, for testing purposes. I still want to be able to do this. (Note: Moq will automatically update whenever we add new members to ISymbol, so users of it won't be broken).
__Workarounds__
1. Ignore the problem and just break people.
2. Like COM, solve it by adding new incremental interfaces ISymbol2 with the additional members. As Adam Speight notes below, you can make ISymbol2 inherit from ISymbol.
3. Instead of interfaces, use abstract classes with internal constructors. Or abstract classes but never add abstract methods to it; only virtual methods.
4. Write documentation for the interface, on MSDN or in XML Doc-Comments, that says "Internal class only; do not implement it". We see this for instance on ICorThreadpool.
5. Declare a method on the interface which has an internal type in its signature. The CLR allows this but the language doesn't so it would have to be authored in IL. Every type which implements the interface would have to provide an implementation of that method.
6. Write run-time checks at the public entry points of key Roslyn methods that take in an ISymbol, and throw if the object given was implemented in the wrong assembly.
7. Write a Roslyn analyzer which is deployed by the same NuGet package that contains the definition of ISymbol, and have this analyzer warn if you're trying to implement the interface. This analyzer could be part of Roslyn, or it could be an independent third-party analyzer used by many libraries.
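Workarounds 2 and 3 can be sketched in a few lines. All type names here (`IWidget`, `IWidget2`, `WidgetBase`, `LegacyWidget`, `Button`) are hypothetical stand-ins; this is not how Roslyn's ISymbol is actually declared.

```cs
using System;

// Workaround 2: leave the shipped interface alone and add members
// in a new interface that inherits from it.
public interface IWidget { string Name { get; } }
public interface IWidget2 : IWidget { string Kind { get; } }

// An implementation written against the first release keeps compiling.
public class LegacyWidget : IWidget
{
    public string Name => "legacy";
}

// Workaround 3: an abstract class whose constructor is internal, so only
// this assembly (and its InternalsVisibleTo friends) can derive from it.
public abstract class WidgetBase
{
    internal WidgetBase() { }
    public abstract string Name { get; }

    // Added in a later release as virtual rather than abstract, so
    // existing derived classes are not broken.
    public virtual string Kind => "unknown";
}

public sealed class Button : WidgetBase
{
    public override string Name => "button";
}

public static class WorkaroundDemo
{
    static void Main()
    {
        IWidget w = new LegacyWidget();
        Console.WriteLine(w is IWidget2);      // callers must probe for the newer interface
        Console.WriteLine(new Button().Kind);  // inherited default implementation
    }
}
```

The trade-off is visible in `Main`: with incremental interfaces every consumer has to test for the newer interface, while the abstract class approach gives up interface-based mocking.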
__Proposal:__ Have the compiler recognize a new attribute. Given the following code
```cs
[System.Runtime.CompilerServices.InternalImplementationOnly] interface I<...> {...}
```
it should be a compile-time warning for a type to implement that interface, directly or indirectly, unless the class is in the same assembly as "I" or is in one of its InternalsVisibleTo friends. It will also be a compile-time error for an interface to inherit from the interface in the same way. Also, we might ask for the .NET Framework team to add this attribute in the same place as System.Runtime.CompilerServices.ExtensionAttribute, and CallerMemberNameAttribute. But doing it isn't necessary since the compiler will recognize any attribute with that exact fully-qualified name and the appropriate (empty) constructor.
Note that this rule would not be cast-iron, since it won't have CLR enforcement. It would still be possible to bypass it by writing IL by hand, or by compiling with an older compiler. But we're not looking for cast-iron. We're just looking for discouragement strong enough to allow us to add members to ISymbol in the future. (In the case of ISymbol, it's very likely that people will be using Roslyn to compile code relating to ISymbol, but that doesn't apply to other libraries).
__Resolution:__ Workaround #7 is a better option than adding this proposal to the language.


@ -0,0 +1,736 @@
# nameof operator: spec v5
The nameof(.) operator has the form nameof(expression). The expression must have a name, and may refer to either a single symbol or a method-group or a property-group. Depending on what the argument refers to, it can include static, instance and extension members.
This is v5 of the spec for the "nameof" operator. [[v1](https://roslyn.codeplex.com/discussions/552376), [v2](https://roslyn.codeplex.com/discussions/552377), [v3](https://roslyn.codeplex.com/discussions/570115), [v4](https://roslyn.codeplex.com/discussions/570364)]. The key decisions and rationales are summarized below. _Please let us know what you think!_
# Rationale
Question: why do we keep going back-and-forth on this feature?
Answer: I think we're converging on a design. It's how you do language design! (1) make a proposal, (2) _spec it out_ to flush out corner cases and make sure you've understood all implications, (3) _implement the spec_ to flush out more corner cases, (4) try it in practice, (5) if what you hear or learn at any stage raises concerns, goto 1.
* _v1/2 had the problem "Why can't I write nameof(this.p)? and why is it so hard to write analyzers?"_
* _v3 had the problem "Why can't I write nameof(MyClass1.p)? and why is it so hard to name method-groups?"_
* _v4 had the problem "What name should be returned for types? and how exactly does it relate to member lookup?"_
This particular "nameof v5" proposal came from a combined VB + C# LDM on 2014.10.22. We went through the key decisions:
1. __Allow to dot an instance member off a type? Yes.__ Settled on the answer "yes", based on the evidence that v1/v2 had it and it worked nicely, and v3 lacked it and ended up with unacceptably ugly "default(T)" constructions.
2. __Allow to dot instance members off an instance? Yes.__ Settled on the answer "yes", based on the evidence that v1/v2 lacked it and it didn't work well enough when we used the CTP, primarily for the case "this.p"
3. __Allow to name method-groups? Yes.__ Settled on the answer "yes", based on the evidence that v1/v2 had it and it worked nicely, and v3 lacked it and ended up with unacceptably ugly constructions to select which method overload.
4. __Allow to unambiguously select a single overload? No.__ Settled on the answer "no" based on the evidence that v3 let you do this but it looked too confusing. I know people want it, and it would be a stepping stone to infoof, but at LDM we rejected these (good) reasons as not worth the pain. The pain is that the expressions look like they'll be executed, and it's unclear whether you're getting the nameof the method or the nameof the result of the invocation, and they're cumbersome to write.
5. __Allow to use nameof(other-nonexpression-types)? No.__ Settled on the answer "only nameof(expression)". I know people want other non-expression arguments, and v1/v2 had them, and it would be more elegant to just write nameof(List<>.Length) rather than having to specify a concrete type argument. But at LDM we rejected these (good) reasons as not worth the pain. The pain is that the language rules for member access in expressions are too different from those for member access in the argument to nameof(.), and indeed member access for StrongBox<>.Value.Length doesn't really exist. The effort to unify the two concepts of member access would be way disproportionate to the value of nameof. This principle, of sticking to existing language concepts, also explains why v5 uses the standard language notions of "member lookup", and hence you can't do nameof(x.y) to refer to both a method and a type of name "y" at the same time.
6. __Use source names or metadata names? Source.__ Settled on source names, based on the evidence... v1/v2/v3 were source names because that's how we started; then we heard feedback that people were interested in metadata names and v4 explored metadata names. But I think the experiment was a failure: it ended up looking like _disallowing_ nameof on types was better than picking metadata names. At LDM, after heated debate, we settled on source names. For instance `using foo=X.Y.Z; nameof(foo)` will return "foo". Also `string f<T>() => nameof(T)` will return "T". There are pros and cons to this decision. In its favor, it keeps the rules of nameof very simple and predictable. In the case of nameof(member), it would be a hindrance in most (not all) bread-and-butter cases to give a fully qualified member name. Also it's convention that "System.Type.Name" refers to an unqualified name. Therefore also nameof(type) should be unqualified. If ever you want a fully-qualified type name you can use typeof(). If you want to use nameof on a type but also get generic type parameters or arguments then you can construct them yourself, e.g. nameof(List<int>) + "`1". Also the languages have no current notion of metadata name, and metadata name can change with obfuscation.
7. __Allow arbitrary expressions or just a subset?__ We want to try out the proposal "just a subset" because we're uneasy with full expressions. That's what v5 does. We haven't previously explored this avenue, and it deserves a try.
8. __Allow generic type arguments?__ Presumably 'yes' when naming a type since that's how expression binding already works. And presumably 'no' when naming a method-group since type arguments are used/inferred during overload resolution, and it would be confusing to also have to deal with that in nameof. [this item 8 was added after the initial v5 spec]
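The "source names" decision in item 6 can be illustrated with a short sketch, assuming nameof behaves as specified here. `SourceNames` and the alias `Alias` are hypothetical; the arity suffix is "`1" because `List<T>` has one type parameter.

```cs
using System;
using System.Collections.Generic;
using Alias = System.Text.StringBuilder;

public static class SourceNames
{
    // The name as written in source, not the type argument's name.
    public static string TypeParameterName<T>() => nameof(T);

    // The alias itself, not the aliased type.
    public static string AliasName() => nameof(Alias);

    // Unqualified and without any generic arity suffix.
    public static string GenericName() => nameof(List<int>);

    // Building a metadata-style name by hand.
    public static string WithArity() => nameof(List<int>) + "`1";

    static void Main()
    {
        Console.WriteLine(TypeParameterName<int>());
        Console.WriteLine(AliasName());
        Console.WriteLine(GenericName());
        Console.WriteLine(WithArity() == typeof(List<int>).Name);
    }
}
```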
I should say, we're not looking for unanimous consensus -- neither amongst the language design team nor amongst the codeplex OSS community! We hear and respect that some people would like something closer to infoof, or would make different tradeoffs, or have different use-cases. On the language design team we're stewards of VB/C#: we have a responsibility to listen to and understand _every opinion_, and then use our own judgment to weigh up the tradeoffs and use-cases. I'm glad we get to do language design in the open like this. We've all been reading the comments on codeplex and elsewhere, our opinions have been swayed, we've learned about new scenarios and issues, and we'll end up with a better language design thanks to the transparency and openness. I'm actually posting these notes on codeplex as my staging ground for sharing them with the rest of the team, so the codeplex audience really does see everything "in the raw".
# C# Syntax
```
expression: ... | nameof-expression

nameof-expression:
    nameof ( expression )
```
In addition to the syntax indicated by the grammar, there are some additional syntactic constraints: (1) the argument expression can only be made up out of simple-name, member-access, base-access, or "this", and (2) cannot be simply "this" or "base" on its own. These constraints ensure that the argument looks like it has a name, and doesn't look like it will be evaluated or have side effects. I found it easier to write the constraints in prose than in the grammar.
[clarification update] Note that member-access has three forms, `E.I<A1...AK>` and `predefined-type.I<A1...AK>` and `qualified-alias-member.I`. All three forms are allowed, although the first case `E` must only be made out of allowed forms of expression.
If the argument to nameof at its top level has an unacceptable form of expression, then it gives the error "This expression does not have a name". If the argument contains an unacceptable form of expression deeper within itself, then it gives the error "This sub-expression cannot be used as an argument to nameof".
It is helpful to list some things not allowed as the nameof argument:
```
invocation-expression                 e(args)
assignment                            x += 15
query-expression                      from y in z select y
lambda-expression                     () => e
conditional-expression                a ? b : c
null-coalescing-expression            a ?? b
binary-expression                     ||, &&, |, ^, &, ==, !=,
                                      <, >, <=, >=, is, as, <<,
                                      >>, +, -, *, /, %
prefix-expression                     +, -, !, ~, ++, --,
                                      *, &, (T)e
postfix-expression                    ++, --
array-creation-expression             new C[…]
object-creation-expression            new C(…)
delegate-creation-expression          new Action(…)
anonymous-object-creation-expression  new {…}
typeof-expression                     typeof(int)
checked-expression                    checked(…)
unchecked-expression                  unchecked(…)
default-value-expression              default(…)
anonymous-method-expression           delegate {…}
pointer-member-access                 e->x
sizeof-expression                     sizeof(int)
literal                               "hello", 15
parenthesized-expression              (x)
element-access                        e[i]
base-access-indexed                   base[i]
await-expression                      await e
nameof-expression                     nameof(e)
vb-dictionary-lookup                  e!foo
```
Note that there are some types which are not counted as expressions by the C# grammar. These are not allowed as nameof arguments (since the nameof syntax only allows expressions for its argument). There's no need to spell out that the following things are not valid expressions, since that's already said by the language syntax, but here's a selection of some of the types that are not expressions:
```
predefined-type         int, bool, float, object,
                        dynamic, string
nullable-type           Customer?
array-type              Customer[,]
pointer-type            Buffer*, void*
qualified-alias-member  A::B
void                    void
unbound-type-name       Dictionary<,>
```
# Semantics
The nameof expression is a constant. In all cases, nameof(...) is evaluated at compile-time to produce a string. Its argument is not evaluated at runtime, and is considered unreachable code (however it does not emit an "unreachable code" warning).
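A minimal sketch of what "constant" and "not evaluated" mean for callers, assuming the semantics above. The `Person` type and its members are hypothetical.

```cs
using System;

public class Person
{
    public string Name { get; set; }

    // Usable wherever a compile-time constant is required.
    public const string NameColumn = nameof(Name);

    public static string Classify(string propertyName)
    {
        switch (propertyName)
        {
            case nameof(Name): return "text";   // constant case label
            default: return "unknown";
        }
    }

    public static string NoEvaluation()
    {
        Person nobody = null;
        // The argument is never evaluated, so no null dereference happens here.
        return nameof(nobody.Name);
    }

    static void Main()
    {
        Console.WriteLine(NameColumn);
        Console.WriteLine(Classify("Name"));
        Console.WriteLine(NoEvaluation());
    }
}
```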
_Definite assignment._ The same rules of definite assignment apply to nameof arguments as they do to all other unreachable expressions.
_Name lookup_. In the following sections we will be discussing member lookup. This is discussed in §7.4 of the C# spec, and is left unspecified in VB. To understand nameof it is useful to know that the existing member lookup rules in both languages either return a single type, or a single instance/static field, or a single instance/static event, or a property-group consisting of overloaded instance/static properties of the same name (VB), or a method-group consisting of overloaded instance/static/extension methods of the same name. Or, lookup might fail either because no symbol was found or because ambiguous conflicting symbols were found that didn't fall within the above list of possibilities.
_Argument binding_. The nameof argument refers to one or more symbols as follows.
__nameof(simple-name)__, of the form I or I<A1...AK>
The normal rules of simple name lookup §7.6.2 are used but with one difference...
* The third bullet talks about member lookup of I in T with K type arguments. Its third sub-bullet says _"Otherwise, the result is the same as a member access of the form T.I or T.I<A1...AK>. In this case, it is a binding-time error for the simple-name to refer to an instance member."_ For the sake of nameof(simple-name), this case does not constitute a binding-time error.
__nameof(member-access)__, of the form E.I or E.I<A1...AK>
The normal rules of expression binding are used to evaluate "E", with _no changes_. After E has been evaluated, then E.I<A1...AK> is evaluated as per the normal rules of member access §7.6.4 but with some differences...
* The third bullet talks about member lookup of I in E. Its sub-bullets have rules for binding when I refers to static properties, fields and events. For the sake of nameof(member-access), each sub-bullet applies to instance properties, fields and events as well.
* The fourth bullet talks about member lookup of I in T. Its sub-bullets have rules for binding when I refers to instance properties, fields and events. For the sake of nameof(member-access), each sub-bullet applies to static properties, fields and events as well.
__nameof(base-access-named)__, of the form base.I or base.I<A1...AK>
This is treated as nameof(B.I) or nameof(B.I<A1...AK>) where B is the base class of the class or struct in which the construct occurs.
_Result of nameof_. The result of nameof is the identifier "I" with the _standard identifier transformations_. Note that, at the top level, every possible argument of nameof has "I<A1...AK>".
[update that was added after the initial v5 spec] If "I" binds to a method-group and the argument of nameof has generic type arguments at the top level, then it produces an error "Do not use generic type arguments to specify the name of methods". Likewise for property-groups.
The standard identifier transformations in C# are detailed in §2.4.2 of the C# spec: first any leading @ is removed, then Unicode escape sequences are transformed, and then any formatting-characters are removed. This of course still happens at compile-time. In VB, any surrounding [] is removed.
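The binding rules and the result rule can be seen together in a small sketch, assuming nameof behaves as this spec describes. The class `C` and its members mirror the single-letter naming used in these notes and are otherwise hypothetical.

```cs
using System;

public class C
{
    public int p { get; set; }
    public void f(int i) { }
    public void f(string s) { }

    public string Forms()
    {
        // Simple name, instance member off `this`, instance member off the
        // type, and an overloaded method-group: each yields the final identifier.
        return string.Join(",", nameof(p), nameof(this.p), nameof(C.p), nameof(f));
    }

    // The leading @ is removed by the standard identifier transformations.
    public static string Escaped(int @class) => nameof(@class);

    static void Main()
    {
        Console.WriteLine(new C().Forms());
        Console.WriteLine(Escaped(0));
    }
}
```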
# Implementation
In C#, nameof is stored in a normal InvocationExpressionSyntax node with a single argument. That is because in C# 'nameof' is a contextual keyword, which will only become the "nameof" operator if it doesn't already bind to a programmatic symbol named "nameof". TO BE DECIDED: what does its "Symbol" bind to?
In VB, NameOf is a reserved keyword. It therefore has its own node:
```vb
Class NameOfExpressionSyntax : Inherits ExpressionSyntax
Public ReadOnly Property Argument As ExpressionSyntax
End Class
```
TO BE DECIDED: Maybe VB should just be the same as C#. Or maybe C# should do the same as VB.
What is the return value from `var r = semanticModel.GetSymbolInfo(argument)`? In all cases, r.Candidates is the list of symbols. If there is only one symbol then it is in r.Symbol; otherwise r.Symbol is null and the reason is "ambiguity".
Analyzers and the IDE will just have to deal with this case.
# IDE behavior
```cs
class C {
    [3 references] static void f(int i) {...nameof(f)...}
    [3 references] void f(string s) {...nameof(this.f)...}
    [3 references] void f(object o) {...nameof(C.f)...}
}
static class E {
    [2 references] public static void f(this C c, double d) {}
}
```
__Highlight symbol from argument__: When you set your cursor on an argument to nameof, it highlights all symbols that the argument bound to. In the above examples, the simple name "nameof(f)" binds to the three members inside C. The two member accesses "nameof(this.f)" and "nameof(C.f)" both bind to extension members as well.
__Highlight symbol from declaration__: When you set your cursor on any declaration of f, it highlights all nameof arguments that bind to that declaration. Setting your cursor on the extension declaration will highlight only "this.f" and "C.f". Setting your cursor on any member of C will highlight all three nameof arguments.
__Goto Definition__: When you right-click on an argument to nameof in the above code and do GoToDef, it pops up a FindAllReferences dialog to let you choose which declaration. (If the nameof argument bound to only one symbol then it would go straight to that without the FAR dialog.)
__Rename declaration__: If you do a rename-refactor on one of the declarations of f in the above code, the rename will only rename this declaration (and will not rename any of the nameof arguments); the rename dialog will show informational text warning you about this. If you do a rename-refactor on the _last remaining_ declaration of f, then the rename will also rename nameof arguments. Note that if you turn on the "Rename All Overloads" checkbox of rename-refactor, then it will end up renaming all arguments.
__Rename argument__: If you do a rename-refactor on one of the nameof arguments in the above code, the rename dialog will by default check the "Rename All Overloads" button.
__Expand-reduce__: The IDE is free to rename "nameof(p)" to "nameof(this.p)" if it needs to do so to remove ambiguity during a rename. This might make nameof now bind to more things than it used to...
__Codelens__: We've articulated the rules about what the argument of nameof binds to. The CodeLens reference counts above are a straightforward consequence of this.
## Bread and butter cases
```cs
// Validate parameters
void f(string s) {
if (s == null) throw new ArgumentNullException(nameof(s));
}
```
```cs
// MVC Action links
<%= Html.ActionLink("Sign up",
@typeof(UserController),
@nameof(UserController.SignUp))
%>
```
```cs
// INotifyPropertyChanged
int p {
get { return this._p; }
  set { this._p = value; PropertyChanged(this, new PropertyChangedEventArgs(nameof(this.p))); }
}
// also allowed: just nameof(p)
```
```cs
// XAML dependency property
public static DependencyProperty AgeProperty = DependencyProperty.Register(nameof(Age), typeof(int), typeof(C));
```
```cs
// Logging
void f(int i) {
Log(nameof(f), "method entry");
}
```
```cs
// Attributes
[DebuggerDisplay("={" + nameof(getString) + "()}")]
class C {
string getString() { ... }
}
```
# Examples
```cs
void f(int x) {
nameof(x)
}
// result "x": Parameter (simple name lookup)
```
```cs
int x=2; nameof(x)
// result "x": Local (simple name lookup)
```
```cs
const int x=2; nameof(x)
// result "x": Constant (simple name lookup)
```
```cs
class C {
int x;
... nameof(x)
}
// result "x": Member (simple name lookup)
```
```cs
class C {
void f() {}
nameof(f)
}
// result "f": Member (simple name lookup)
```
```cs
class C {
void f() {}
nameof(f())
}
// result error "This expression does not have a name"
```
```cs
class C {
void f(){}
void f(int i){}
nameof(f)
}
// result "f": Method-group (simple name lookup)
```
```cs
Customer c; ... nameof(c.Age)
// result "Age": Property (member access)
```
```cs
Customer c; ... nameof(c._Age)
// result error "_Age is inaccessible due to its protection level": member access
```
```cs
nameof(Tuple.Create)
// result "Create": method-group (member access)
```
```cs
nameof(System.Tuple)
// result "Tuple": Type (member access). This binds to the non-generic Tuple class; not to all of the Tuple classes.
```
```cs
nameof(System.Exception)
// result "Exception": Type (member access)
```
```cs
nameof(List<int>)
// result "List": Type (simple name lookup)
```
```cs
nameof(List<>)
// result error "type expected": Unbound types are not valid expressions
```
```cs
nameof(List<int>.Count)
// result "Count": Member (member access)
```
```cs
nameof(default(List<int>))
// result error "This expression doesn't have a name": Not one of the allowed forms of nameof
```
```cs
nameof(default(List<int>).Length)
// result error "This expression cannot be used for nameof": default isn't one of the allowed forms
```
```cs
nameof(int)
// result error "Invalid expression term 'int'": Not an expression. Note that 'int' is a keyword, not a name.
```
```cs
nameof(System.Int32)
// result "Int32": Type (member access)
```
```cs
using foo=System.Int32;
nameof(foo)
// result "foo": Type (simple name lookup)
```
```cs
nameof(System.Globalization)
// result "Globalization": Namespace (member access)
```
```cs
nameof(x[2])
nameof("hello")
nameof(1+2)
// error "This expression does not have a name": Not one of the allowed forms of nameof
```
```vb
NameOf(a!Foo)
' error "This expression does not have a name": VB-specific. Not one of the allowed forms of NameOf.
```
```vb
NameOf(dict("Foo"))
' error "This expression does not have a name": VB-specific. This is a default property access, which is not one of the allowed forms.
```
```vb
NameOf(dict.Item("Foo"))
' error "This expression does not have a name": VB-specific. This is an index of a property, which is not one of the allowed forms.
```
```vb
NameOf(arr(2))
' error "This expression does not have a name": VB-specific. This is an array element index, which is not one of the allowed forms.
```
```vb
Dim x = Nothing
NameOf(x.ToString(2))
' error "This expression does not have a name": VB-specific. This resolves to .ToString()(2), which is not one of the allowed forms.
```
```vb
Dim o = Nothing
NameOf(o.Equals)
' result "Equals". Method-group. Warning "Access of static member of instance; instance will not be evaluated": VB-specific. VB allows access to static members off instances, but emits a warning.
```
```cs
[Foo(nameof(C))]
class C {}
// result "C": Nameof works fine in attributes, using the normal name lookup rules.
```
```cs
[Foo(nameof(D))]
class C { class D {} }
// result "D": Members of a class are in scope for attributes on that class
```
```cs
[Foo(nameof(f))]
class C { void f() {} }
// result "f": Members of a class are in scope for attributes on that class
```
```cs
[Foo(nameof(T))]
class C<T> {}
// result error "T is not defined": A class type parameter is not in scope in an attribute on that class
```
```cs
[Foo(nameof(T))] void f<T>() { }
// result error "T not defined": A method type parameter is not in scope in an attribute on that method
```
```cs
void f([Attr(nameof(x))] int x) {}
// result error "x is not defined": A parameter is not in scope in an attribute on that parameter, or any parameter in the method
```
```vb
Function f()
nameof(f)
End Function
' result "f": VB-specific. This is resolved as an expression which binds to the implicit function return variable
```
```vb
NameOf(New)
' result error "this expression does not have a name": VB-specific. Not one of the allowed forms of nameof. Note that New is not a name; it is a keyword used for construction.
```
```vb
Class C
Dim x As Integer
Dim s As String = NameOf(x)
End Class
' result "x": Field (simple name lookup)
```
```cs
class C {
int x;
string s = nameof(x);
}
// result "x". Field (simple name lookup)
```
```cs
class C {
static int x;
string s = nameof(x);
}
// result "x". Field (simple name lookup)
```
```cs
class C {
int x;
string s = nameof(C.x);
}
// result "x". Member (member access)
```
```cs
class C {
int x;
string s = nameof(default(C).x);
}
// result error "This expression isn't allowed in a nameof argument" - default.
```
```cs
struct S {
int x;
S() {var s = nameof(x); ...}
}
// result "x": Field access (simple name lookup). Nameof argument is considered unreachable, and so this doesn't violate definite assignment.
```
```cs
int x; ... nameof(x); x=1;
// result "x": Local access (simple name lookup). Nameof argument is unreachable, and so this doesn't violate definite assignment.
```
```cs
int x; nameof(f(ref x));
// result error "this expression does not have a name".
```
```cs
var @int=5; nameof(@int)
// result "int": C#-specific. Local (simple name lookup). The leading @ is removed.
```
```cs
nameof(m\u200c\u0065)
// result "me": C#-specific. The Unicode escapes are first resolved, and the formatting character \u200c is removed.
```
```vb
Dim [Sub]=5 : NameOf([Sub])
' result "Sub": VB-specific. Local (simple name lookup). The surrounding [.] is removed.
```
```cs
class C {
class D {}
class D<T> {}
nameof(C.D)
}
// result "D" and binds to the non-generic form: member access only finds the type with the matching arity.
```
```cs
class C<T> where T:Exception {
... nameof(C<string>)
}
// result error: the type 'string' doesn't satisfy the constraints
```
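Several of the cases above can be collected into one compilable sketch. The `Customer` and `NameofResults` types are hypothetical, and the expected strings in the comments assume a compiler that follows the rules in this spec (the shipped C# 6 compiler agrees on these particular cases).

```cs
using System;
using System.Collections.Generic;

class Customer
{
    public int Age { get; set; }
}

static class NameofResults
{
    public static string Parameter(int x) => nameof(x);               // "x": simple name lookup
    public static string Property(Customer c) => nameof(c.Age);       // "Age": member access
    public static string MethodGroup() => nameof(Tuple.Create);       // "Create": method group
    public static string Namespace() => nameof(System.Globalization); // "Globalization"
    public static string ConstructedType() => nameof(List<int>);      // "List": type arguments are not part of the name

    public static string EscapedIdentifier()
    {
        var @int = 5;
        return nameof(@int);                                          // "int": the leading @ is removed
    }

    static void Main()
    {
        Console.WriteLine(Parameter(0));
        Console.WriteLine(Property(new Customer()));
        Console.WriteLine(MethodGroup());
        Console.WriteLine(Namespace());
        Console.WriteLine(ConstructedType());
        Console.WriteLine(EscapedIdentifier());
    }
}
```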
# [String Interpolation for C#](http://1drv.ms/1tFUvbq) #
An *interpolated string* is a way to construct a value of type `String` (or `IFormattable`) by writing the text of the string along with expressions that will fill in "holes" in the string. The compiler constructs a format string and a sequence of fill-in values from the interpolated string.
When it is treated as a value of type `String`, it is a shorthand for an invocation of
```cs
String.Format(string format, params object[] args)
```
When it is converted to the type `IFormattable`, the result of the string interpolation is an object that stores a compiler-constructed *format string* along with an array storing the evaluated expressions. The object's implementation of
```cs
IFormattable.ToString(string format, IFormatProvider formatProvider)
```
is an invocation of
```cs
String.Format(IFormatProvider provider, string format, params object[] args)
```
By taking advantage of the conversion from an interpolated string expression to `IFormattable`, the user can cause the formatting to take place later in a selected locale. See the section `System.Runtime.CompilerServices.FormattedString` for details.
Note: the converted interpolated string may have more "holes" in the format string than there were interpolated expression holes in the interpolated string. That is because some characters (such as `\{` and `\}`) may be translated into a hole and a corresponding compiler-generated fill-in.
## Lexical Grammar ##
An interpolated string is treated initially as a token with the following lexical grammar:
```
interpolated-string:
    $ " "
    $ " interpolated-string-literal-characters "
interpolated-string-literal-characters:
    interpolated-string-literal-part interpolated-string-literal-characters
    interpolated-string-literal-part
interpolated-string-literal-part:
    single-interpolated-string-literal-character
    simple-escape-sequence
    hexadecimal-escape-sequence
    unicode-escape-sequence
    interpolation
simple-escape-sequence: one of
    \' \" \\ \0 \a \b \f \n \r \t \v \{ \}
single-interpolated-string-literal-character:
    Any character except " (U+0022), \ (U+005C), { (U+007B) and new-line-character
interpolation:
    { interpolation-contents }
interpolation-contents:
    balanced-text
    balanced-text : interpolation-format
balanced-text:
    balanced-text-part
    balanced-text-part balanced-text
balanced-text-part:
    Any character except ", (, [, {, /, \ and new-line-character
    ( balanced-text )
    { balanced-text }
    [ balanced-text ]
    regular-string-literal
    delimited-comment
    unicode-escape-sequence
    / after-slash
after-slash:
    Any character except ", (, [, {, /, \, * and new-line-character
    ( balanced-text )
    { balanced-text }
    [ balanced-text ]
    regular-string-literal
    * delimited-comment-text[opt] asterisks /
    unicode-escape-sequence
interpolation-format:
    regular-string-literal
    literal-interpolation-format
literal-interpolation-format:
    interpolation-format-part
    interpolation-format-part literal-interpolation-format
interpolation-format-part:
    Any character except ", :, \, } and new-line-character
```
With the additional restriction that a *delimited-comment-text* that is a *balanced-text-part* may not contain a *new-line-character*.
This lexical grammar is ambiguous in that it allows a colon appearing in *interpolation-contents* to be considered part of the *balanced-text*, or as the separator between the *balanced-text* and the *interpolation-format*. This ambiguity is resolved by considering it to be a separator between the *balanced-text* and *interpolation-format*.
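A practical consequence of this rule is that a conditional expression inside an interpolation has to be parenthesized, because an unparenthesized colon is always taken as the start of the format. The sketch below uses hypothetical names (`ColonInInterpolation`, `Describe`, `Hex`) and assumes a compiler that accepts the `$"..."` form described here.

```cs
using System;

static class ColonInInterpolation
{
    public static string Describe(bool ok)
    {
        // The colon sits inside "( balanced-text )", so it belongs to the
        // conditional expression and is not treated as a separator.
        return $"status = {(ok ? "yes" : "no")}";
    }

    public static string Hex(int n)
    {
        // A colon at the top level of the interpolation is the separator:
        // "X" is the interpolation-format (hexadecimal).
        return $"{n:X}";
    }

    static void Main()
    {
        Console.WriteLine(Describe(true)); // status = yes
        Console.WriteLine(Hex(255));       // FF
    }
}
```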
## Syntactic Grammar ##
An *interpolated-string* token is reclassified, and portions of it are reprocessed lexically and syntactically, during syntactic analysis as follows:
- If the *interpolated-string* contains no *interpolation*, then it is reclassified as a *regular-string-literal*.
- Otherwise
  - the portion of the *interpolated-string* before the first *interpolation* is reclassified as an *interpolated-string-start* terminal;
  - the portion of the *interpolated-string* after the last *interpolation* is reclassified as an *interpolated-string-end* terminal;
  - the portion of the *interpolated-string* between one *interpolation* and another *interpolation* is reclassified as an *interpolated-string-mid* terminal;
  - the *balanced-text* of each *interpolation-contents* is reprocessed according to the language's lexical grammar, yielding a sequence of terminals;
  - the colon in each *interpolation-contents* that contains an *interpolation-format* is classified as a colon terminal;
  - each *interpolation-format* is reclassified as a *regular-string-literal* terminal; and
  - the resulting sequence of terminals undergoes syntactic analysis as an *interpolated-string-expression*.
```
expression:
    interpolated-string-expression
interpolated-string-expression:
    interpolated-string-start interpolations interpolated-string-end
interpolations:
    single-interpolation
    single-interpolation interpolated-string-mid interpolations
single-interpolation:
    interpolation-start
    interpolation-start : regular-string-literal
interpolation-start:
    expression
    expression , expression
```
## Semantics ##
An *interpolated-string-expression* has type `string`, but there is an implicit *conversion from expression* from an *interpolated-string-expression* to the type `System.IFormattable`. By the existing rules of the language (7.5.3.3 Better conversion from expression), the conversion to `string` is a better conversion from expression.
An *interpolated-string-expression* is translated into an intermediate *format string* and *object array* which capture the contents of the interpolated string using the semantics of [Composite Formatting](http://msdn.microsoft.com/en-us/library/txafckwd(v=vs.110).aspx "Composite Formatting"). If treated as a value of type `string`, the formatting is performed using `string.Format(string format, params object[] args)` or equivalent code. If it is converted to `System.IFormattable`, an object of type [`System.Runtime.CompilerServices.FormattedString`](#FormattedString) is constructed using the format string and argument array, and that object is the value of the *interpolated-string-expression*.
The format string is constructed of the literal portions of the *interpolated-string-start*, *interpolated-string-mid*, and *interpolated-string-end* portions of the expression, with special treatment for `{` and `}` characters (see [Notes](#Notes)).
**The evaluation order needs to be specified**.
**The definite assignment rules need to be specified**.
## single-interpolation Semantics ##
**This section should describe in detail the construction of a [*format item*](http://msdn.microsoft.com/en-us/library/txafckwd(v=vs.110).aspx) from a single-interpolation, and the corresponding element of the object array**.
If an *interpolation-start* has a comma and a second expression, the second expression must evaluate to a compile-time constant of type `int`, which is used as the [*alignment* of a *format item*](http://msdn.microsoft.com/en-us/library/txafckwd(v=vs.110).aspx).
If a *single-interpolation* has a colon and a *regular-string-literal*, then the string literal is used as the [*formatString* of a *format item*](http://msdn.microsoft.com/en-us/library/txafckwd(v=vs.110).aspx).
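As a rough illustration of how the alignment and format string map onto a composite format item, the sketch below compares an interpolation with the equivalent `string.Format` call. The names `AlignmentAndFormat`, `Interpolated`, `Composite` and `LeftAligned` are invented for the example; an integer with the `D4` format is used so that the result does not depend on the current culture's decimal separator.

```cs
using System;

static class AlignmentAndFormat
{
    // Alignment 6 (right-aligned) and format string "D4".
    public static string Interpolated(int n) => $"[{n,6:D4}]";

    // The format item the interpolation above corresponds to.
    public static string Composite(int n) => string.Format("[{0,6:D4}]", n);

    // A negative alignment left-aligns within the field.
    public static string LeftAligned(int n) => $"[{n,-6:D4}]";

    static void Main()
    {
        Console.WriteLine(Interpolated(42)); // [  0042]
        Console.WriteLine(Composite(42));    // [  0042]
        Console.WriteLine(LeftAligned(42));  // [0042  ]
    }
}
```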
## Notes ##
The compiler is free to translate an interpolated string into a format string and object array where the number of objects in the object array is not the same as the number of interpolations in the *interpolated-string-expression*. In particular, the compiler may translate `{` and `}` characters into a fill-in in the format string and a corresponding string literal containing the character. For example, the interpolated string `$"\{ {n} \}"` may be translated to `String.Format("{0} {1} {2}", "{", n, "}")`.
The compiler is free to use any overload of `String.Format` in the translated code, as long as doing so preserves the semantics of calling `string.Format(string format, params object[] args)`.
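The freedom described in these notes can be checked with plain `String.Format` calls, without relying on any interpolation syntax. In this sketch (the names `BraceTranslation`, `WithFillIns` and `WithEscapes` are hypothetical), one method passes the braces as extra fill-in arguments, as in the translation shown above, and the other uses the doubled-brace escapes of composite formatting; both are expected to give the same text.

```cs
using System;

static class BraceTranslation
{
    // Braces supplied as compiler-generated fill-ins: three arguments
    // for a string that had only one interpolation.
    public static string WithFillIns(int n) =>
        String.Format("{0} {1} {2}", "{", n, "}");

    // Braces escaped in the format string itself: one argument.
    public static string WithEscapes(int n) =>
        String.Format("{{ {0} }}", n);

    static void Main()
    {
        Console.WriteLine(WithFillIns(5)); // { 5 }
        Console.WriteLine(WithEscapes(5)); // { 5 }
    }
}
```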
## Examples ##
The interpolated string
```
$"{hello}, {world}!"
```
is translated to
```
String.Format("{0}, {1}!", hello, world)
```
The interpolated string
```
$"Name = {myName}, hours = {DateTime.Now:hh}"
```
is translated to
```
String.Format("Name = {0}, hours = {1:hh}", myName, DateTime.Now)
```
The interpolated string
```
$"\{{6234:D}\}"
```
is translated to
```
String.Format("{0}{1:D}{2}", "{", 6234, "}")
```
For example, if you want to format something in the invariant locale, you can do so using this helper method
```cs
public static string INV(IFormattable formattable)
{
return formattable.ToString(null, System.Globalization.CultureInfo.InvariantCulture);
}
```
and writing your interpolated strings this way
```cs
string coordinates = INV($"longitude={longitude}; latitude={latitude}");
```
## System.Runtime.CompilerServices.FormattedString ##
The following platform class is used to translate an interpolated string to the type `System.IFormattable`.
```cs
namespace System.Runtime.CompilerServices
{
public class FormattedString : System.IFormattable
{
private readonly String format;
private readonly object[] args;
public FormattedString(String format, params object[] args)
{
this.format = format;
this.args = args;
}
string IFormattable.ToString(string ignored, IFormatProvider formatProvider)
{
return String.Format(formatProvider, format, args);
}
}
}
```
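To show how such an object defers formatting, here is a sketch that builds one by hand, roughly as the compiler would for `$"longitude={longitude}; latitude={latitude}"` converted to `IFormattable`. The class is a local copy placed in a made-up namespace (`FormattedStringSketch`) so the example compiles on its own; `Demo`, `Coordinates` and `CommaDecimalCulture` are likewise hypothetical. The second culture is cloned from the invariant culture with a comma as decimal separator, so no installed locale data is assumed.

```cs
using System;
using System.Globalization;

namespace FormattedStringSketch
{
    // Local copy of the class above, for illustration only.
    public class FormattedString : IFormattable
    {
        private readonly string format;
        private readonly object[] args;

        public FormattedString(string format, params object[] args)
        {
            this.format = format;
            this.args = args;
        }

        string IFormattable.ToString(string ignored, IFormatProvider formatProvider)
        {
            return string.Format(formatProvider, format, args);
        }
    }

    public static class Demo
    {
        // Hand-written stand-in for the compiler-generated conversion.
        public static IFormattable Coordinates(double longitude, double latitude) =>
            new FormattedString("longitude={0}; latitude={1}", longitude, latitude);

        public static CultureInfo CommaDecimalCulture()
        {
            var culture = (CultureInfo)CultureInfo.InvariantCulture.Clone();
            culture.NumberFormat.NumberDecimalSeparator = ",";
            return culture;
        }

        static void Main()
        {
            IFormattable coordinates = Coordinates(12.5, -3.25);
            // The same stored format and arguments, rendered per provider.
            Console.WriteLine(coordinates.ToString(null, CultureInfo.InvariantCulture)); // longitude=12.5; latitude=-3.25
            Console.WriteLine(coordinates.ToString(null, CommaDecimalCulture()));        // longitude=12,5; latitude=-3,25
        }
    }
}
```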
## Issues ##
1. As specified, an interpolated string with no interpolations cannot be converted to `IFormattable` because it is a string literal. It should have such a conversion.

100
meetings/2014/README.md Normal file

@ -0,0 +1,100 @@
# C# Language Design Notes for 2014
Overview of meetings and agendas for 2014
## Jan 6, 2014
[C# Language Design Notes for Jan 6, 2014](LDM-2014-01-06.md)
1. Syntactic ambiguities with declaration expressions <_a solution adopted_>
2. Scopes for declaration expressions <_more refinement added to rules_>
## Feb 3, 2014
[C# Language Design Notes for Feb 3, 2014](LDM-2014-02-03.md)
1. Capture of primary constructor parameters <_only when explicitly asked for with new syntax_>
2. Grammar around indexed names <_details settled_>
3. Null-propagating operator details <_allow indexing, bail with unconstrained generics_>
## Feb 10, 2014
[C# Language Design Notes for Feb 10, 2014](LDM-2014-02-10.md)
1. Design of using static <_design adopted_>
2. Initializers in structs <_allow in certain situations_>
3. Null-propagation and unconstrained generics <_keep current design_>
## Apr 21, 2014
[C# Language Design Notes for Apr 21, 2014](LDM-2014-04-21.md)
1. Indexed members <_lukewarm response, feature withdrawn_>
2. Initializer scope <_new scope solves all kinds of problems with initialization_>
3. Primary constructor bodies <_added syntax for a primary constructor body_>
4. Assignment to getter-only auto-properties from constructors <_added_>
5. Separate accessibility for type and primary constructor <_not worthy of new syntax_>
6. Separate doc comments for field parameters and fields <_not worthy of new syntax_>
7. Left associative vs short circuiting null propagation <_short circuiting_>
## May 7, 2014
[C# Language Design Notes for May 7, 2014](LDM-2014-05-07.md)
1. protected and internal <_feature cut: not worth the confusion_>
2. Field parameters in primary constructors <_feature cut: we want to keep the design space open_>
3. Property declarations in primary constructors <_interesting but not now_>
4. Typeswitch <_Not now: more likely as part of a future, more general matching feature_>
## May 21, 2014
[C# Language Design Notes for May 21, 2014](LDM-2014-05-21.md)
1. Limit the nameof feature? <_keep current design_>
2. Extend params IEnumerable? <_keep current design_>
3. String interpolation <_design nailed down_>
## Jul 9, 2014
[C# Language Design Notes for Jul 9, 2014](LDM-2014-07-09.md)
1. Detailed design of nameof <_details settled_>
2. Design of #pragma warning extensions <_allow identifiers_>
## Aug 27, 2014
[C# Language Design Notes for Aug 27, 2014](LDM-2014-08-27.md)
1. Allowing parameterless constructors in structs <_allow, but some unresolved details_>
2. Definite assignment for imported structs <_revert to Dev12 behavior_>
## Sep 3, 2014
[C# Language Design Notes for Sep 3, 2014](LDM-2014-09-03.md)
1. Removing “spill out” from declaration expressions in simple statements <_yes, remove_>
2. Same name declared in subsequent else-ifs <_condition decls out of scope in else-branch_>
3. Add semicolon expressions <_not in this version_>
4. Make variables in declaration expressions readonly <_no_>
## Oct 1, 2014
[C# Language Design Notes for Oct 1, 2014](LDM-2014-10-01.md)
1. Assignment to readonly autoprops in constructors (we fleshed out details)
2. A new compiler warning to prevent outsiders from implementing your interface? (no, leave this to analyzers)
## Oct 15, 2014
[C# Language Design Notes for Oct 15, 2014](LDM-2014-10-15.md)
1. nameof operator: spec v5
2. [String Interpolation for C#](http://1drv.ms/1tFUvbq)