csharplang/meetings/2022/LDM-2022-02-23.md

9.0 KiB

C# Language Design Meeting for February 16th, 2022

Agenda

  1. Pattern matching over Span<char>
  2. Checked operators

Quote of the Day

  • "If Memory serves?"

Discussion

Pattern matching over Span<char>

https://github.com/dotnet/csharplang/issues/1881
https://github.com/dotnet/csharplang/blob/main/proposals/pattern-match-span-of-char-on-string.md

We started today by going through the proposal for allowing matching Span<char> and ReadonlySpan<char> instances against string literals. This is an Any Time feature being implemented by a community contributor, and has a few open questions left before it will be ready to be merged. Most places in this section that use Span<char> are actually referring to both Span<char> and ReadonlySpan<char>, unless called out otherwise.

Non-constant matches

Matching a Span<char> to a string is interesting as a pattern feature because it's the first time that we're matching an input to a value that is non-constant. Matched constants can be subjected to a conversion today: for example, obj is uint and 1 will convert the 1 integer value in the pattern to a 1 unsigned integer value. The result of this conversion, though, is itself a constant value. For Span<char>, this isn't the same. spanValue is "Hello world" needs to convert "Hello world" to a ReadonlySpan<char>, the result of which is not a constant value. It could be argued, therefore, that this should not be permitted, and instead the scenario should wait for active patterns. However, we think that this scenario is still morally constant, even if not strictly semantically constant. We can additionally emit better code for the scenario than we currently think active patterns could, as it will benefit from our existing optimizations around string comparisons.

Conclusion

We are ok with allowing a morally constant pattern, even if it's not semantically constant.

Specification terms

This question is around how we should specify the comparison: should we specifically call out MemoryExtensions in the specification and require it to be provided by the framework, or should we have some kind of fallback ability?

In supported all distributions, MemoryExtensions and Span/ReadonlySpan have been either been directly in the framework together, or have been shipped as part of the same NuGet package. So we don't think there are any places where Span would be present, but the SequenceEquals method for comparison won't be. Specifying in terms of MemoryExtensions is also much easier for the implementation: if we have some kind of fallback ability, then that code has to be implemented and tested, all in service of a scenario that we think is extremely unlikely.

Conclusion

We will specify the feature in terms of System.Memory.MemoryExtensions.

null constants

We considered whether null should match against an empty Span<char>, since null is convertible to Span<char>, and the result is an empty Span.

There are some pros and cons to going either direction. In favor of allowing null to match, Span<char> s = null; works today, so it will be odd if s is null would then fail to compile. On the other hand, the subsumption rules start to differ from string, and in a way that's very hard to explain. s is null or "" would fail to compile and "" would say that it was already handled by a previous case: we think this is more confusing than a clear error message stating that null cannot be used to match against a Span<char>, please use "" instead.

Conclusion

We will not allow Span<char>s to be matched against null constants.

Type tests

We universally agreed that the input type must be known to be Span<char>. This is similar to how other constants work, such as this uint example:

using System;

uint u = 1;
Console.WriteLine(u is 1); // true
Console.WriteLine(((object)u) is 1); // false
Console.WriteLine(((object)u) is uint and 1); // true
Console.WriteLine(M1(u)); // false
Console.WriteLine(M2(u)); // true

bool M1<T>(T t) => t is 1;
bool M2<T>(T t) => t is uint and 1;
Conclusion

Input type must be Span<char> to be matched as a Span<char>. No implicit type tests will be emitted.

Subsumption rules

Based on our discussion around null constants, we think the subsumption rules for Span<char> should match the subsumption rules for strings, since they are both being matched with string constants. We think that there are a couple of open questions for Utf8 strings and list patterns though:

  1. Should Utf8 strings allow pattern matching as well?
  2. Should string patterns contribute to subsumption in list patterns over that same value? For example, should stringValue is [] or "" report an unreachable case?
Conclusion

Use the same subsumption rules as strings.

Checked operators

https://github.com/dotnet/csharplang/issues/4665

Last time we discussed checked operators, we made no solid conclusions on the lookup rules. At the core of our concerns with the lookup rules was concern over what principle was the "most important" part of lookup: nearness (how far up the type hierarchy did the compiler have to go to find the operator) or checkness (whether the current context is checked or unchecked). We've had a few different versions of the lookup rules here:

  • Version 1 was the original proposal for unary operators and binary operators. This was originally discussed on February 7th.
  • Version 2 was updated based on that feedback in this version. This was discussed on February 14th.
  • Finally, we have a new version of the rules to consider today:
    1. In an unchecked context, checked operators are ignored.
    2. In a checked context, when we find a checked operator in T0 we take that candidate but we ignore its regular counterpart (ie. in T0, same signature, but no checked) if any. All the other rules (resolution, betterness) can remain as they are today.

This final version is an attempt to codify that, on its own, a single unmarked operator means "all operations", but when paired with a marked checked operator, it's actually an unchecked operator. This is an idea that had been originally put into the checked operator proposal, but we had tried to step away from it and maintain existing operators as applicable to both checked and unchecked scenarios. As we tried to work through the effects on lookup, however, this seemed increasingly complicated and hard to reason about, so we are revisiting the idea.

We further considered whether we should require that, if a type defines a checked operator, it should also be required to define an unchecked version. There is the potential that this requirement could be onerous for some types that don't have a well-defined unchecked version, but those types have tools to express that: either continue to use a single unmarked operator, meaning it will be used for all checked and unchecked operations, or explicitly mark the unchecked version with Obsolete and throw from the body. We think that this nicely sets us up for success here: introducing a checked operator at level requires that we go from a default operator world, which just has a single unmarked operator, into a world that knows about checked and unchecked contexts. This allows us to design the lookup rules coherently to favor nearness, then checkedness, then betterness, as we wanted to. We still need to finalize the exact wording of the rules, but we liked the general principle of option 3. The compiler and the rules will still need to handle cases when a user manually defines a single checked operator without an unchecked version, either in IL or by making a method with the right names and attributes to appear to be an operator, but this is analogous to other operator scenarios the compiler needs to deal with today.

Finally, we discussed whether we should allow and/or require unchecked for unchecked counterpart operator. However, we did not have enough time to dig much into this, beyond realizing the room is evenly split among the three camps (required vs allowed vs disallowed), so a future meeting will need to resolve this.

Conclusion

We will require that declaring a checked operator means there must be a counterpart unchecked operator, and a future meeting will decide whether this counterpart operator can or must be marked with unchecked or not.