Revert "Bringing back essentials topics" (#661)
This commit is contained in:
Parent
60605a957b
Commit
0001f6e8ca
|
@ -119,7 +119,7 @@ wish to check if new ones are added, please check back here.
|
|||
* `linuxmint.17.2-x64`
|
||||
* `linuxmint.17.3-x64`
|
||||
|
||||
## macOS RIDs
|
||||
## OS X RIDs
|
||||
|
||||
* `osx.10.10-x64`
|
||||
* `osx.10.11-x64`
|
||||
|
|
|
@ -88,5 +88,5 @@ Publishes the current application using the `netcoreapp1.0` framework.
|
|||
|
||||
`dotnet publish --framework netcoreapp1.0 --runtime osx.10.11-x64`
|
||||
|
||||
Publishes the current application using the `netcoreapp1.0` framework and runtime for `macOS 10.10`. This RID has to
|
||||
Publishes the current application using the `netcoreapp1.0` framework and runtime for `OS X 10.10`. This RID has to
|
||||
exist in the `project.json` `runtimes` node.
|
||||
|
|
|
@ -0,0 +1,605 @@
|
|||
---
|
||||
title: Developing Libraries with Cross Platform Tools
|
||||
description: Developing Libraries with Cross Platform Tools
|
||||
keywords: .NET, .NET Core
|
||||
author: cartermp
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 9f6e8679-bd7e-4317-b3f9-7255a260d9cf
|
||||
---
|
||||
|
||||
# Developing Libraries with Cross Platform Tools
|
||||
|
||||
**Some details are subject to change as the toolchain evolves.**
|
||||
|
||||
This article covers how you can write libraries for .NET using cross-platform CLI tools. They provide an efficient and low-level experience that works across any supported OS. You can still build libraries with Visual Studio, and if that is your preferred experience then you should [refer to the Visual Studio guide](libraries-with-vs.md).
|
||||
|
||||
## Prerequisites
|
||||
|
||||
You must have .NET Core installed on your machine. You will need [the .NET Core SDK and CLI](https://www.microsoft.com/net/core).
|
||||
|
||||
The sections of this document dealing with .NET Framework versions or Portable Class Libraries (PCL) need the .NET Framework installed, so they are supported only on Windows. If you don't have it yet, [install the .NET Framework](http://getdotnet.azurewebsites.net).
|
||||
|
||||
Additionally, if you wish to support older targets, you will need to install targeting/developer packs for older framework versions from the [target platforms page](http://getdotnet.azurewebsites.net/target-dotnet-platforms.html). Refer to this table:
|
||||
|
||||
| .NET Framework Version | What to download |
|
||||
| ---------------------- | ----------------- |
|
||||
| 4.6 | .NET Framework 4.6 Targeting Pack |
|
||||
| 4.5.2 | .NET Framework 4.5.2 Developer Pack |
|
||||
| 4.5.1 | .NET Framework 4.5.1 Developer Pack |
|
||||
| 4.5 | Windows Software Development Kit for Windows 8 |
|
||||
| 4.0 | Windows SDK for Windows 7 and .NET Framework 4 |
|
||||
| 2.0, 3.0, and 3.5 | .NET Framework 3.5 SP1 Runtime (or Windows 8+ version) |
|
||||
|
||||
## How to target the .NET Standard
|
||||
|
||||
If you're not quite familiar with the .NET Standard, please refer to [the .NET Standard Library](../../standard/library.md) to learn more.
|
||||
|
||||
In that article, there is a table which maps .NET Standard versions to various implementations:
|
||||
|
||||
| Platform Name | Alias | | | | | | | |
|
||||
| :---------- | :--------- |:--------- |:--------- |:--------- |:--------- |:--------- |:--------- |:--------- |
|
||||
|.NET Standard | netstandard | 1.0 | 1.1 | 1.2 | 1.3 | 1.4 | 1.5 | 1.6 |
|
||||
|.NET Core|netcoreapp|→|→|→|→|→|→|1.0|
|
||||
|.NET Framework|net|→|4.5|4.5.1|4.6|4.6.1|4.6.2|4.6.3|
|
||||
|Mono/Xamarin Platforms||→|→|→|→|→|→|*|
|
||||
|Universal Windows Platform|uap|→|→|→|→|10.0|||
|
||||
|Windows|win|→|8.0|8.1|||||
|
||||
|Windows Phone|wpa|→|→|8.1|||||
|
||||
|Windows Phone Silverlight|wp|8.0|||||||
|
||||
|
||||
Here's what this table means for the purposes of creating a library:
|
||||
|
||||
The version of the .NET Standard you pick is a tradeoff between access to the newest APIs and the ability to target more .NET platforms and Framework versions. You control this by picking a version of `netstandardXX` (where `XX` is a version number) and adding it to your `project.json` file.
|
||||
|
||||
Additionally, the corresponding [NuGet package to depend on](https://www.nuget.org/packages/NETStandard.Library/) is `NETStandard.Library` version `1.6.0`. Although there's nothing preventing you from depending on `Microsoft.NETCore.App` like with console apps, it's generally not recommended. If you need APIs from a package not specified in `NETStandard.Library`, you can always specify that package in addition to `NETStandard.Library` in the `dependencies` section of your `project.json` file.
|
||||
|
||||
You have three primary options when targeting the .NET Standard, depending on your needs.
|
||||
|
||||
1. You can use the latest version of the .NET Standard - `netstandard1.6` - which is for when you want access to the most APIs and don't mind if you have less reach across implementations.
|
||||
2. You can use a lower version of the .NET Standard to target earlier .NET implementations. The cost here is not having access to some of the latest APIs.
|
||||
|
||||
For example, if you wanted to have guaranteed compatibility with .NET Framework 4.6 and higher, you would pick `netstandard1.3`:
|
||||
|
||||
```json
|
||||
{
|
||||
"dependencies":{
|
||||
"NETStandard.Library":"1.6.0"
|
||||
},
|
||||
"frameworks":{
|
||||
"netstandard1.3":{}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
The .NET Standard is versioned in a backward-compatible way. That means that `netstandard1.0` libraries run on `netstandard1.1` platforms and higher. However, there is no forward compatibility: lower .NET Standard platforms cannot reference higher ones. This means that `netstandard1.0` libraries cannot reference libraries targeting `netstandard1.1` or higher. You should select the Standard version that has the right mix of APIs and platform support for your needs.
|
||||
|
||||
3. If you want to target the .NET Framework versions 4.0 or below, or you wish to use an API available in the .NET Framework but not in the .NET Standard (for example, `System.Drawing`), read the following sections and learn how to multitarget.
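The compatibility rule from option 2 can be sketched in a few lines of C#. This is only an illustration: the `NetStandardSketch` class and its members are hypothetical names, and the lookup values are copied from the .NET Framework row of the table earlier in this article.

```csharp
using System;
using System.Collections.Generic;

public static class NetStandardSketch
{
    // Highest .NET Standard version each .NET Framework version supports,
    // copied from the .NET Framework row of the table in this article.
    private static readonly Dictionary<string, Version> HighestStandard =
        new Dictionary<string, Version>
        {
            { "net45",  new Version(1, 1) },
            { "net451", new Version(1, 2) },
            { "net46",  new Version(1, 3) },
            { "net461", new Version(1, 4) },
            { "net462", new Version(1, 5) },
            { "net463", new Version(1, 6) }
        };

    // A consumer at version `consumer` can reference a library built
    // against `library` only when `library` is the same version or lower.
    public static bool CanReference(Version consumer, Version library)
    {
        return library <= consumer;
    }

    // True when a library targeting `library` runs on the given framework.
    // Frameworks missing from the lookup are treated as unsupported.
    public static bool RunsOn(string frameworkTfm, Version library)
    {
        Version highest;
        return HighestStandard.TryGetValue(frameworkTfm, out highest)
            && CanReference(highest, library);
    }
}
```

Under these assumptions, `RunsOn("net46", new Version(1, 3))` is true and `RunsOn("net45", new Version(1, 3))` is false, which matches the `netstandard1.3` example in option 2.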
|
||||
|
||||
## How to target the .NET Framework
|
||||
|
||||
**NOTE:** These instructions assume you have the .NET Framework installed on your machine. Refer to the [Prerequisites](#prerequisites) to get dependencies installed.
|
||||
|
||||
Keep in mind that some of the .NET Framework versions used here are no longer in support. Refer to the [.NET Framework Support Lifecycle Policy FAQ](https://support.microsoft.com/gp/framework_faq/en-us) about unsupported versions.
|
||||
|
||||
If you want to reach the maximum number of developers and projects, use the .NET Framework 4 as your baseline target. To target the .NET Framework, begin by using the correct Target Framework Moniker (TFM) that corresponds to the .NET Framework version you wish to support.
|
||||
|
||||
```
|
||||
.NET Framework 2.0 --> net20
|
||||
.NET Framework 3.0 --> net30
|
||||
.NET Framework 3.5 --> net35
|
||||
.NET Framework 4.0 --> net40
|
||||
.NET Framework 4.5 --> net45
|
||||
.NET Framework 4.5.1 --> net451
|
||||
.NET Framework 4.5.2 --> net452
|
||||
.NET Framework 4.6 --> net46
|
||||
.NET Framework 4.6.1 --> net461
|
||||
.NET Framework 4.6.2 --> net462
|
||||
.NET Framework 4.6.3 --> net463
|
||||
```
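Every row in the list above follows the same pattern: the prefix `net` followed by the version number with its dots removed. A minimal sketch of that pattern is shown here; the `TfmSketch` class is a hypothetical helper, and it is only meant for the .NET Framework versions listed above.

```csharp
public static class TfmSketch
{
    // Builds a .NET Framework TFM by prefixing "net" and removing the dots
    // from the version string, which reproduces each row of the list above.
    public static string ToTfm(string frameworkVersion)
    {
        return "net" + frameworkVersion.Replace(".", "");
    }
}
```

For instance, `TfmSketch.ToTfm("4.5.2")` yields `net452` and `TfmSketch.ToTfm("4.0")` yields `net40`.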
|
||||
|
||||
For example, here's how you would write a library which targets the .NET Framework 4:
|
||||
|
||||
```json
|
||||
{
|
||||
"frameworks":{
|
||||
"net40":{}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
And that's it! Although this library is compiled only for the .NET Framework 4, you can use it on newer versions of the .NET Framework.
|
||||
|
||||
## How to target a Portable Class Library (PCL)
|
||||
|
||||
**NOTE:** These instructions assume you have the .NET Framework installed on your machine. Refer to the [Prerequisites](#prerequisites) to get dependencies installed.
|
||||
|
||||
Targeting a PCL profile is a bit trickier than targeting .NET Standard or the .NET Framework. For starters, [reference this list of PCL profiles](http://embed.plnkr.co/03ck2dCtnJogBKHJ9EjY/preview) to find the NuGet target which corresponds to the PCL profile you are targeting.
|
||||
|
||||
Then, you need to do the following:
|
||||
|
||||
1. Create a new entry under `frameworks` in your `project.json`, named `.NETPortable,Version=v{version},Profile=Profile{profile}`, where `{version}` and `{profile}` correspond to a PCL version number and Profile number, respectively.
|
||||
2. In this new entry, list every single assembly used for that target under a `frameworkAssemblies` entry. This includes `mscorlib`, `System`, and `System.Core`.
|
||||
3. If you are multitargeting (see the next section), you must explicitly list dependencies for each target under their target entries. You won't be able to use a global `dependencies` entry anymore.
|
||||
|
||||
The following is an example targeting PCL Profile 328. Profile 328 supports: .NET Standard 1.0, .NET Framework 4, Windows 8, Windows Phone 8.1, Windows Phone Silverlight 8, and Silverlight 5.
|
||||
|
||||
```json
|
||||
{
|
||||
"frameworks":{
|
||||
".NETPortable,Version=v4.0,Profile=Profile328":{
|
||||
"frameworkAssemblies":{
|
||||
"mscorlib":"",
|
||||
"System":"",
|
||||
"System.Core":""
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
You can now build.
|
||||
|
||||
```
|
||||
$ dotnet restore
|
||||
$ dotnet build
|
||||
```
|
||||
|
||||
Notice the following entry in the `/bin/Debug` folder:
|
||||
|
||||
```
|
||||
$ ls bin/Debug
|
||||
|
||||
portable-net40+sl50+netcore45+wpa81+wp8/
|
||||
```
|
||||
|
||||
This folder contains the `.dll` files necessary to run your library.
|
||||
|
||||
## How to Multitarget
|
||||
|
||||
**NOTE:** The following instructions assume you have the .NET Framework installed on your machine. Refer to the [Prerequisites](#prerequisites) section to learn which dependencies you need to install and where to download them from.
|
||||
|
||||
You may need to target older versions of the .NET Framework when your project supports both the .NET Framework and .NET Core. In this scenario, if you want to use newer APIs and language constructs for the newer targets, use `#if` directives in your code. You also might need to add different packages and dependencies in your `project.json` file for each platform you're targeting to include the different APIs needed for each case.
|
||||
|
||||
For example, let's say you have a library that performs networking operations over HTTP. For .NET Standard and the .NET Framework versions 4.5 or higher, you can use the `HttpClient` class from the `System.Net.Http` namespace. However, earlier versions of the .NET Framework don't have the `HttpClient` class, so you could use the `WebClient` class from the `System.Net` namespace for those instead.
|
||||
|
||||
So, the `project.json` file could look like this:
|
||||
|
||||
```json
|
||||
{
|
||||
"frameworks":{
|
||||
"net40":{
|
||||
"frameworkAssemblies": {
|
||||
"System.Net":"",
|
||||
"System.Text.RegularExpressions":""
|
||||
}
|
||||
},
|
||||
"net452":{
|
||||
"frameworkAssemblies":{
|
||||
"System.Net":"",
|
||||
"System.Net.Http":"",
|
||||
"System.Text.RegularExpressions":"",
|
||||
"System.Threading.Tasks":""
|
||||
}
|
||||
},
|
||||
"netstandard1.6":{
|
||||
"dependencies": {
|
||||
"NETStandard.Library":"1.6.0"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Note that the .NET Framework assemblies need to be referenced explicitly in the `net40` and `net452` targets, and NuGet references are also explicitly listed in the `netstandard1.6` target. This is required in multitargeting scenarios.
|
||||
|
||||
Next, the `using` statements in your source file can be adjusted like this:
|
||||
|
||||
```csharp
|
||||
#if NET40
|
||||
// This only compiles for the .NET Framework 4 targets
|
||||
using System.Net;
|
||||
#else
|
||||
// This compiles for all other targets
|
||||
using System.Net.Http;
|
||||
using System.Threading.Tasks;
|
||||
#endif
|
||||
```
|
||||
|
||||
The build system is aware of the following preprocessor symbols used in `#if` directives:
|
||||
|
||||
```
|
||||
.NET Framework 2.0 --> NET20
|
||||
.NET Framework 3.5 --> NET35
|
||||
.NET Framework 4.0 --> NET40
|
||||
.NET Framework 4.5 --> NET45
|
||||
.NET Framework 4.5.1 --> NET451
|
||||
.NET Framework 4.5.2 --> NET452
|
||||
.NET Framework 4.6 --> NET46
|
||||
.NET Framework 4.6.1 --> NET461
|
||||
.NET Framework 4.6.2 --> NET462
|
||||
.NET Standard 1.0 --> NETSTANDARD1_0
|
||||
.NET Standard 1.1 --> NETSTANDARD1_1
|
||||
.NET Standard 1.2 --> NETSTANDARD1_2
|
||||
.NET Standard 1.3 --> NETSTANDARD1_3
|
||||
.NET Standard 1.4 --> NETSTANDARD1_4
|
||||
.NET Standard 1.5 --> NETSTANDARD1_5
|
||||
.NET Standard 1.6 --> NETSTANDARD1_6
|
||||
```
|
||||
|
||||
And in the middle of the source, you can use `#if` directives to use those libraries conditionally. For example:
|
||||
|
||||
```csharp
|
||||
public class Library
|
||||
{
|
||||
#if NET40
|
||||
private readonly WebClient _client = new WebClient();
|
||||
private readonly object _locker = new object();
|
||||
#else
|
||||
private readonly HttpClient _client = new HttpClient();
|
||||
#endif
|
||||
|
||||
#if NET40
|
||||
// .NET Framework 4.0 does not have async/await
|
||||
public string GetDotNetCount()
|
||||
{
|
||||
string url = "http://www.dotnetfoundation.org/";
|
||||
|
||||
var uri = new Uri(url);
|
||||
|
||||
string result = "";
|
||||
|
||||
// Lock here to provide thread-safety.
|
||||
lock(_locker)
|
||||
{
|
||||
result = _client.DownloadString(uri);
|
||||
}
|
||||
|
||||
int dotNetCount = Regex.Matches(result, ".NET").Count;
|
||||
|
||||
return $"Dotnet Foundation mentions .NET {dotNetCount} times!";
|
||||
}
|
||||
#else
|
||||
// .NET 4.5+ can use async/await!
|
||||
public async Task<string> GetDotNetCountAsync()
|
||||
{
|
||||
string url = "http://www.dotnetfoundation.org/";
|
||||
|
||||
// HttpClient is thread-safe, so no need to explicitly lock here
|
||||
var result = await _client.GetStringAsync(url);
|
||||
|
||||
int dotNetCount = Regex.Matches(result, ".NET").Count;
|
||||
|
||||
return $"dotnetfoundation.org mentions .NET {dotNetCount} times in its HTML!";
|
||||
}
|
||||
#endif
|
||||
}
|
||||
```
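The same symbols can be chained with `#elif` when more than two targets need distinct code paths. The following is a minimal sketch assuming the three targets from the `project.json` shown earlier; the `TargetInfo` class is a hypothetical name, and the final `#else` branch is only a fallback for builds where none of those symbols is defined.

```csharp
public static class TargetInfo
{
    // Returns a label for the target this assembly was compiled for.
    // Only one branch survives preprocessing in any given build.
    public static string Describe()
    {
#if NET40
        return ".NET Framework 4.0";
#elif NET452
        return ".NET Framework 4.5.2";
#elif NETSTANDARD1_6
        return ".NET Standard 1.6";
#else
        return "unknown target";
#endif
    }
}
```

Because the branch is chosen at compile time, each output assembly contains exactly one of the `return` statements.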
|
||||
|
||||
Now you can build.
|
||||
|
||||
```
|
||||
$ dotnet restore
|
||||
$ dotnet build
|
||||
```
|
||||
|
||||
Your `/bin/Debug` folder will look like this:
|
||||
|
||||
```
|
||||
$ ls bin/Debug
|
||||
|
||||
net40/
|
||||
net452/
|
||||
netstandard1.6/
|
||||
```
|
||||
|
||||
### But What about Multitargeting with Portable Class Libraries?
|
||||
|
||||
If you want to cross-compile with a PCL target, you must add a build definition in your `project.json` file under `buildOptions` in your PCL target. You can then use `#if` directives in the source which use the build definition as a preprocessor symbol.
|
||||
|
||||
For example, if you want to target [PCL profile 328](http://embed.plnkr.co/03ck2dCtnJogBKHJ9EjY/preview) (the .NET Framework 4, Windows 8, Windows Phone Silverlight 8, Windows Phone 8.1, Silverlight 5), you could refer to it as "PORTABLE328" when cross-compiling. Simply add it to the `project.json` file as a `buildOptions` attribute:
|
||||
|
||||
```json
|
||||
{
|
||||
"frameworks":{
|
||||
"netstandard1.6":{
|
||||
"dependencies":{
|
||||
"NETStandard.Library":"1.6.0"
|
||||
}
|
||||
},
|
||||
".NETPortable,Version=v4.0,Profile=Profile328":{
|
||||
"buildOptions": {
|
||||
"define": [ "PORTABLE328" ]
|
||||
},
|
||||
"frameworkAssemblies":{
|
||||
"mscorlib":"",
|
||||
"System":"",
|
||||
"System.Core":"",
|
||||
"System.Net":""
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
```
|
||||
|
||||
Now you can conditionally compile against that target:
|
||||
|
||||
```csharp
|
||||
#if !PORTABLE328
|
||||
using System.Net.Http;
|
||||
using System.Threading.Tasks;
|
||||
// Potentially other namespaces which aren't compatible with Profile 328
|
||||
#endif
|
||||
```
|
||||
|
||||
Because `PORTABLE328` is now recognized by the compiler, the PCL Profile 328 library generated by the compiler will not include `System.Net.Http` or `System.Threading.Tasks`.
|
||||
|
||||
Now you can build.
|
||||
|
||||
```
|
||||
$ dotnet restore
|
||||
$ dotnet build
|
||||
```
|
||||
|
||||
Your `/bin/Debug` folder will look like this:
|
||||
|
||||
```
|
||||
$ ls bin/Debug
|
||||
|
||||
portable-net40+sl50+netcore45+wpa81+wp8/
|
||||
netstandard1.6/
|
||||
```
|
||||
|
||||
## How to use native dependencies
|
||||
|
||||
You may wish to write a library which depends on a native `.dll` file. If you're writing such a library, you have two options:
|
||||
|
||||
1. Reference the native `.dll` directly in your `project.json`.
|
||||
2. Package that `.dll` into its own NuGet package and depend on that package.
|
||||
|
||||
For the first option, you'll need to include the following in your `project.json` file:
|
||||
|
||||
1. Setting `allowUnsafe` to `true` in a `buildOptions` section.
|
||||
2. Specifying the path to the native `.dll`(s) with a [Runtime Identifier (RID)](../rid-catalog.md) under `files` in the `packOptions` section.
|
||||
|
||||
If you're distributing your library as a package, it's recommended that you place the `.dll` file at the root level of your project. Here's an example `project.json` for a native `.dll` file that runs on Windows x64:
|
||||
|
||||
```json
|
||||
{
|
||||
"buildOptions":{
|
||||
"allowUnsafe":true
|
||||
},
|
||||
"packOptions":{
|
||||
"files":{
|
||||
"runtimes/win7-x64/native/":"native-lib.dll"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
For the second option, you'll need to build a NuGet package out of your `.dll` file(s), host it on a NuGet or MyGet feed, and depend on it directly. You'll still need to set `allowUnsafe` to `true` in the `buildOptions` section of your `project.json`. Here's an example (assuming `MyNativeLib` is a NuGet package at version `1.2.0`):
|
||||
|
||||
```json
|
||||
{
|
||||
"buildOptions":{
|
||||
"allowUnsafe":true
|
||||
},
|
||||
"dependencies":{
|
||||
"MyNativeLib":"1.2.0"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
To see an example of packaging up cross-platform native binaries, check out the [ASP.NET Libuv Package](https://github.com/aspnet/libuv-package) and the [corresponding reference in KestrelHttpServer](https://github.com/aspnet/KestrelHttpServer/blob/dev/src/Microsoft.AspNetCore.Server.Kestrel/project.json#L18).
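Whichever option you choose, managed code usually reaches the native binary through P/Invoke. The following is a minimal sketch, not a definitive pattern: `native-lib.dll` is the file name from the first `project.json` example above, while the `NativeMethods` class and the exported function `native_add` are hypothetical and stand in for whatever your native library actually exposes.

```csharp
using System.Runtime.InteropServices;

public static class NativeMethods
{
    // Declares a managed entry point for a function exported by the native
    // library. "native_add" is a hypothetical export used for illustration;
    // the call only succeeds when native-lib.dll can be found at run time.
    [DllImport("native-lib.dll", EntryPoint = "native_add")]
    public static extern int Add(int left, int right);
}
```

The declaration compiles without the native file being present; the library is only loaded the first time `NativeMethods.Add` is called.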
|
||||
|
||||
## How to test libraries on .NET Core
|
||||
|
||||
It's important to be able to test across platforms. It's easiest to use [xUnit](http://xunit.github.io/), which is also the testing tool used by .NET Core projects. Setting up your solution with test projects will depend on the [structure of your solution](#structuring-a-solution). The following example assumes that all source projects are under a top-level `/src` folder and all test projects are under a top-level `/test` folder.
|
||||
|
||||
1. Ensure you have a `global.json` file at the solution level which understands where the test projects are:
|
||||
|
||||
```json
|
||||
{
|
||||
"projects":[ "src", "test"]
|
||||
}
|
||||
```
|
||||
|
||||
Your solution folder structure should then look like this:
|
||||
|
||||
```
|
||||
/SolutionWithSrcAndTest
|
||||
|__global.json
|
||||
|__/src
|
||||
|__/test
|
||||
```
|
||||
|
||||
2. Create a new test project by creating a `project.json` file under your `/test` folder. You can also run the `dotnet new` command and modify the `project.json` file afterwards. It should have the following:
|
||||
|
||||
* `netcoreapp1.0` listed as the only entry under `frameworks`.
|
||||
* `dnxcore50` and `portable-net45+win8` added as `imports` under `netcoreapp1.0`.
|
||||
* A reference to `Microsoft.NETCore.App` version `1.0.0`.
|
||||
* A reference to xUnit version `2.1.0`.
|
||||
* A reference to `dotnet-test-xunit` version `1.0.0-rc2-build10025`.
|
||||
* A project reference to the library being tested.
|
||||
* The entry `"testRunner":"xunit"`.
|
||||
|
||||
Here's an example (`LibraryUnderTest` version `1.0.0` is the library being tested):
|
||||
|
||||
```json
|
||||
{
|
||||
"testRunner":"xunit",
|
||||
"dependencies":{
|
||||
"LibraryUnderTest":{
|
||||
"version":"1.0.0",
|
||||
"target":"project"
|
||||
},
|
||||
"Microsoft.NETCore.App":{
|
||||
"type":"platform",
|
||||
"version":"1.0.0"
|
||||
},
|
||||
"xunit":"2.1.0",
|
||||
"dotnet-test-xunit":"1.0.0-rc2-build10025"
|
||||
},
|
||||
"frameworks":{
|
||||
"netcoreapp1.0":{
|
||||
"imports":[ "dnxcore50", "portable-net45+win8" ]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
3. Restore packages by running `dotnet restore`. You should do this at the solution level if you haven't restored packages yet.
|
||||
|
||||
4. Navigate to your test project and run tests with `dotnet test`:
|
||||
|
||||
```
|
||||
$ cd path-to-your-test-project
|
||||
$ dotnet test
|
||||
```
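A test file in the test project from step 2 might look like the following sketch. It assumes the `xunit` package referenced in that `project.json`; the `Calculator` class is a hypothetical stand-in defined inline so the example is complete, where a real test would call a type from the library being tested.

```csharp
using Xunit;

// Hypothetical stand-in for a type exposed by the library being tested.
public static class Calculator
{
    public static int Add(int left, int right)
    {
        return left + right;
    }
}

public class CalculatorTests
{
    // A [Fact] is a test with no parameters.
    [Fact]
    public void Add_ReturnsSumOfOperands()
    {
        Assert.Equal(5, Calculator.Add(2, 3));
    }

    // A [Theory] runs once for each [InlineData] row.
    [Theory]
    [InlineData(0, 0, 0)]
    [InlineData(-1, 1, 0)]
    public void Add_HandlesZeroAndNegatives(int left, int right, int expected)
    {
        Assert.Equal(expected, Calculator.Add(left, right));
    }
}
```

Running `dotnet test` in the test project discovers both methods and reports each `[InlineData]` row as a separate test case.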
|
||||
|
||||
And that's it! You can now test your library across all platforms using command line tools. Now that everything is set up, the ongoing testing loop is simple:
|
||||
|
||||
1. Make changes to your library.
|
||||
2. Run tests from the command line, in your test directory, with the `dotnet test` command.
|
||||
|
||||
Your code will be automatically rebuilt when you invoke the `dotnet test` command.
|
||||
|
||||
Just remember to run `dotnet restore` from the command line any time you add a new dependency and you'll be good to go!
|
||||
|
||||
## How to use multiple projects
|
||||
|
||||
A common need for larger libraries is to place functionality in different projects.
|
||||
|
||||
Imagine you wished to build a library which could be consumed in idiomatic C# and F#. That would mean that consumers of your library use it in ways which are natural to C# or F#. For example, in C# you might consume the library like this:
|
||||
|
||||
```csharp
|
||||
var convertResult = await AwesomeLibrary.ConvertAsync(data);
|
||||
var result = AwesomeLibrary.Process(convertResult);
|
||||
```
|
||||
|
||||
In F#, it might look like this:
|
||||
|
||||
```fsharp
|
||||
let result =
|
||||
data
|
||||
|> AwesomeLibrary.convertAsync
|
||||
|> Async.RunSynchronously
|
||||
|> AwesomeLibrary.process
|
||||
```
|
||||
|
||||
Consumption scenarios like this mean that the APIs being accessed have to have a different structure for C# and F#. A common approach to accomplishing this is to factor all of the logic of a library into a core project, with C# and F# projects defining the API layers that call into that core project. The rest of the section will use the following names:
|
||||
|
||||
* **AwesomeLibrary.Core** - A core project which contains all logic for the library
|
||||
* **AwesomeLibrary.CSharp** - A project with public APIs intended for consumption in C#
|
||||
* **AwesomeLibrary.FSharp** - A project with public APIs intended for consumption in F#
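To make the layering concrete, here is a minimal single-file sketch of the core logic and the C# API layer. The method bodies and the `CoreLogic` class name are hypothetical and chosen only for illustration; in a real solution each class would live in its own project, as described above.

```csharp
using System.Threading.Tasks;

// Would live in the AwesomeLibrary.Core project: all of the real logic.
public static class CoreLogic
{
    public static string Convert(string data)
    {
        return data.Trim().ToUpperInvariant();
    }

    public static int Process(string converted)
    {
        return converted.Length;
    }
}

// Would live in the AwesomeLibrary.CSharp project: a thin, Task-based API
// layer that forwards every call to the core project.
public static class AwesomeLibrary
{
    public static Task<string> ConvertAsync(string data)
    {
        return Task.Run(() => CoreLogic.Convert(data));
    }

    public static int Process(string converted)
    {
        return CoreLogic.Process(converted);
    }
}
```

With this shape, the C# consumption example shown earlier compiles against the `AwesomeLibrary` class, while an F# layer could expose `Async`-based functions over the same `CoreLogic` methods.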
|
||||
|
||||
### Project-to-project referencing
|
||||
|
||||
To reference a project, you need to do two things:
|
||||
|
||||
1. Understand the name and version number of the project you wish to reference.
|
||||
2. List that project as a dependency using the name and version number from (1).
|
||||
|
||||
In the above case, you may wish to set up the `project.json` for **AwesomeLibrary.Core** as follows:
|
||||
|
||||
```json
|
||||
{
|
||||
"name":"AwesomeLibrary.Core",
|
||||
"version":"1.0.0"
|
||||
}
|
||||
```
|
||||
|
||||
You can use these entries in the `project.json` to control the name and version of the project. If you don't specify these, the default configuration is to use the name of the containing folder as the name and 1.0.0 as the version number.
|
||||
|
||||
The `project.json` files for both **AwesomeLibrary.CSharp** and **AwesomeLibrary.FSharp** now need to reference **AwesomeLibrary.Core** as a `project` target. If you aren't multitargeting, you can use the global `dependencies` entry:
|
||||
|
||||
```json
|
||||
{
|
||||
"dependencies":{
|
||||
"AwesomeLibrary.Core":{
|
||||
"version":"1.0.0",
|
||||
"target":"project"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
> **Note:** Failure to list the reference as a `project` target may result in NuGet resolving the dependency with an existing NuGet package which happens to have the same name. Always specify `"target":"project"` when referencing a project in the same solution.
|
||||
|
||||
If you are multitargeting, you may not be able to use a global `dependencies` entry and may have to reference **AwesomeLibrary.Core** in a target-level `dependencies` entry. For example, if you were targeting `netstandard1.6`, you could do so like this:
|
||||
|
||||
```json
|
||||
{
|
||||
"frameworks":{
|
||||
"netstandard1.6":{
|
||||
"dependencies":{
|
||||
"AwesomeLibrary.Core":{
|
||||
"version":"1.0.0",
|
||||
"target":"project"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Structuring a Solution
|
||||
|
||||
Another important aspect of multi-project solutions is establishing a good overall project structure. To structure a multi-project library, you must use top-level `/src` and `/test` folders:
|
||||
|
||||
```
|
||||
/AwesomeLibrary
|
||||
|__global.json
|
||||
|__/src
|
||||
|__/AwesomeLibrary.Core
|
||||
|__Source Files
|
||||
|__project.json
|
||||
|__/AwesomeLibrary.CSharp
|
||||
|__Source Files
|
||||
|__project.json
|
||||
|__/AwesomeLibrary.FSharp
|
||||
|__Source Files
|
||||
|__project.json
|
||||
|__/test
|
||||
|__/AwesomeLibrary.Core.Tests
|
||||
|__Test Files
|
||||
|__project.json
|
||||
|__/AwesomeLibrary.CSharp.Tests
|
||||
|__Test Files
|
||||
|__project.json
|
||||
|__/AwesomeLibrary.FSharp.Tests
|
||||
|__Test Files
|
||||
|__project.json
|
||||
```
|
||||
|
||||
The `global.json` file for this solution would look like this:
|
||||
|
||||
```json
|
||||
{
|
||||
"projects":["src", "test"]
|
||||
}
|
||||
```
|
||||
|
||||
This approach follows the pattern established by the project templates in the `dotnet new` command, where all projects are placed under a `/src` directory and all tests are placed under a `/test` directory.
|
||||
|
||||
Here's how you could restore packages, build, and test your entire project:
|
||||
|
||||
```
|
||||
$ dotnet restore
|
||||
$ cd src/AwesomeLibrary.FSharp
|
||||
$ dotnet build
|
||||
$ cd ../AwesomeLibrary.CSharp
|
||||
$ dotnet build
|
||||
$ cd ../../test/AwesomeLibrary.Core.Tests
|
||||
$ dotnet test
|
||||
$ cd ../AwesomeLibrary.CSharp.Tests
|
||||
$ dotnet test
|
||||
$ cd ../AwesomeLibrary.FSharp.Tests
|
||||
$ dotnet test
|
||||
```
|
||||
|
||||
And that's it!
|
|
@ -0,0 +1,665 @@
|
|||
---
|
||||
title: Getting started with .NET Core on Windows/Linux/macOS using the command line
|
||||
description: Getting started with .NET Core on Windows, Linux, or macOS using the .NET Core command line interface (CLI)
|
||||
keywords: .NET, .NET Core
|
||||
author: cartermp
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: be988f09-7349-43b0-97fb-3a703d4587ce
|
||||
---
|
||||
|
||||
# Getting started with .NET Core on Windows/Linux/macOS using the command line
|
||||
|
||||
This guide will show you how to use the .NET Core CLI tooling to build cross-platform console apps. It will start with the most basic console app and eventually span multiple projects, including testing. You'll add these features step-by-step, building on what you've already seen and built.
|
||||
|
||||
If you're unfamiliar with the .NET Core CLI toolset, read [the .NET Core SDK overview](../sdk.md).
|
||||
|
||||
## Prerequisites
|
||||
|
||||
Before you begin, ensure you have the [latest .NET Core CLI tooling](https://www.microsoft.com/net/core). You'll also need a text editor.
|
||||
|
||||
## Hello, Console App!
|
||||
|
||||
First, navigate to or create a new folder with a name you like. "Hello" is the name chosen for the sample code, which can be found [here](https://github.com/dotnet/core-docs/tree/master/samples/core-projects/console-apps/Hello).
|
||||
|
||||
Open up a command prompt and type the following:
|
||||
|
||||
```
|
||||
$ dotnet new
|
||||
$ dotnet restore
|
||||
$ dotnet run
|
||||
```
|
||||
|
||||
Let's do a quick walkthrough:
|
||||
|
||||
1. `$ dotnet new`
|
||||
|
||||
[`dotnet new`](../tools/dotnet-new.md) creates an up-to-date `project.json` file with NuGet dependencies necessary to build a console app. It also creates a `Program.cs`, a basic file containing the entry point for the application.
|
||||
|
||||
`project.json`:
|
||||
```javascript
|
||||
{
|
||||
"version": "1.0.0-*",
|
||||
"buildOptions": {
|
||||
"emitEntryPoint": true
|
||||
},
|
||||
"dependencies": {
|
||||
"Microsoft.NETCore.App": {
|
||||
"type": "platform",
|
||||
"version": "1.0.0"
|
||||
}
|
||||
},
|
||||
"frameworks": {
|
||||
"netcoreapp1.0": {
|
||||
"imports": "dnxcore50"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
`Program.cs`:
|
||||
```csharp
|
||||
using System;
|
||||
|
||||
namespace ConsoleApplication
|
||||
{
|
||||
public class Program
|
||||
{
|
||||
public static void Main(string[] args)
|
||||
{
|
||||
Console.WriteLine("Hello World!");
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
2. `$ dotnet restore`
|
||||
|
||||
[`dotnet restore`](../tools/dotnet-restore.md) calls into NuGet to restore the tree of dependencies. NuGet analyzes the `project.json` file, downloads the dependencies stated in the file (or grabs them from a cache on your machine), and writes the `project.lock.json` file. The `project.lock.json` file is necessary to be able to compile and run.
|
||||
|
||||
The `project.lock.json` file is a persisted and complete set of the graph of NuGet dependencies and other information describing an app. This file is read by other tools, such as `dotnet build` and `dotnet run`, enabling them to process the source code with a correct set of NuGet dependencies and binding resolutions.
|
||||
|
||||
3. `$ dotnet run`
|
||||
|
||||
[`dotnet run`](../tools/dotnet-run.md) calls `dotnet build` to ensure that the build targets have been built, and then calls `dotnet <assembly.dll>` to run the target application.
|
||||
|
||||
```
|
||||
$ dotnet run
|
||||
Hello World!
|
||||
```
|
||||
|
||||
You can also execute [`dotnet build`](../tools/dotnet-build.md) to compile the code without running the built console application.
|
||||
|
||||
### Building a self-contained application
|
||||
|
||||
Let's try compiling a self-contained application instead of a portable application. You can read more about the [types of portability in .NET Core](../app-types.md) to learn about the different application types, and how they are deployed.
|
||||
|
||||
You need to make some changes to your `project.json`
|
||||
file to direct the tools to build a self-contained application. You can see these in the
|
||||
[HelloNative](https://github.com/dotnet/core-docs/tree/master/samples/core-projects/console-apps/HelloNative)
|
||||
project in the samples directory.
|
||||
|
||||
The first change is to remove the `"type": "platform"` element from all dependencies.
|
||||
This project's only dependency so far is `"Microsoft.NETCore.App"`. The `dependencies` section should look like this:
|
||||
|
||||
```javascript
|
||||
"dependencies": {
|
||||
"Microsoft.NETCore.App": {
|
||||
"version": "1.0.0"
|
||||
}
|
||||
},
|
||||
```
|
||||
|
||||
Next, you need to add a `runtimes` node to specify all the target execution environments. For example, the following
|
||||
`runtimes` node instructs the build system to create executables for the 64-bit version of Windows 10 and the 64-bit version of Mac OS X version 10.11.
|
||||
The build system will generate native executables for the current environment. If you are following these steps on a Windows machine,
|
||||
you'll build a Windows executable. If you are following these steps on a Mac, you'll build the OS X executable.
|
||||
|
||||
```javascript
"runtimes": {
    "win10-x64": {},
    "osx.10.11-x64": {}
}
```
|
||||
|
||||
See the full list of supported runtimes in the [RID catalog](../rid-catalog.md).
|
||||
|
||||
After making those two changes, execute `dotnet restore`, followed by `dotnet build`, to create the native executable. Then you can run the generated native executable.
|
||||
|
||||
The following example shows the commands for Windows. The example shows where the native executable gets generated and assumes that the project directory is named HelloNative.
|
||||
|
||||
```
$ dotnet restore
$ dotnet build
$ .\bin\Debug\netcoreapp1.0\win10-x64\HelloNative.exe
Hello World!
```
|
||||
|
||||
You may notice that the native application takes slightly longer to build, but executes slightly faster. This behavior
|
||||
becomes more noticeable as the application grows.
|
||||
|
||||
The build process generates several more files when your `project.json` creates a native build. These files
|
||||
are created in `bin\Debug\netcoreapp1.0\<platform>` where `<platform>` is the RID chosen. In addition to the
|
||||
project's `HelloNative.dll` there is a `HelloNative.exe` that loads the runtime and starts the application.
|
||||
Note that the name of the generated application changed because the project directory's name has changed.
|
||||
|
||||
You may want to package this application to execute it on a machine that does not include the .NET runtime. You do that using the `dotnet publish` command. The `dotnet publish` command creates a new subdirectory called `publish` under the `./bin/Debug/netcoreapp1.0/<platform>` directory. It copies the executable, all dependent DLLs, and the framework to this subdirectory. You can copy that directory to another machine (or a container) and execute the application there.
|
||||
|
||||
Let's contrast that with the behavior of `dotnet publish` in the first Hello World sample. That application is a *portable application*, which is the default type of application for .NET Core. A portable application requires that .NET Core is installed on the target machine. Portable applications can be built on one machine and executed anywhere. Native applications must be built separately for each target machine. For a portable application, `dotnet publish` creates a directory that has the application's DLL and any dependent DLLs that are not part of the platform installation.
|
||||
|
||||
### Augmenting the program
|
||||
|
||||
Let's change the file just a little bit. Fibonacci numbers are fun, so let's try that out (using
|
||||
the native version):
|
||||
|
||||
`Program.cs`:
|
||||
|
||||
```csharp
using static System.Console;

namespace ConsoleApplication
{
    public class Program
    {
        public static int FibonacciNumber(int n)
        {
            int a = 0;
            int b = 1;
            int tmp;

            for (int i = 0; i < n; i++)
            {
                tmp = a;
                a = b;
                b += tmp;
            }

            return a;
        }

        public static void Main(string[] args)
        {
            WriteLine("Hello World!");
            WriteLine("Fibonacci Numbers 1-15:");

            for (int i = 0; i < 15; i++)
            {
                WriteLine($"{i+1}: {FibonacciNumber(i)}");
            }
        }
    }
}
```
|
||||
|
||||
And running the program (assuming you're on Windows, and have changed the project directory name to Fibonacci):
|
||||
|
||||
```
$ dotnet build
$ .\bin\Debug\netcoreapp1.0\win10-x64\Fibonacci.exe
Hello World!
Fibonacci Numbers 1-15:
1: 0
2: 1
3: 1
4: 2
5: 3
6: 5
7: 8
8: 13
9: 21
10: 34
11: 55
12: 89
13: 144
14: 233
15: 377
```
|
||||
|
||||
And that's it! You can augment `Program.cs` any way you like.
|
||||
|
||||
## Adding some new files
|
||||
|
||||
Single files are fine for simple one-off programs, but if you're building anything with multiple components, you'll probably want to split the code across multiple files.
|
||||
|
||||
Create a new file and give it a unique namespace:
|
||||
|
||||
```csharp
using System;

namespace NumberFun
{
    // code can go here
}
```

Next, include it in your `Program.cs` file:

```csharp
using static System.Console;
using NumberFun;
```
|
||||
|
||||
And finally, you can build it:
|
||||
|
||||
`$ dotnet build`
|
||||
|
||||
Now the fun part: making the new file do something!
|
||||
|
||||
### Example: A Fibonacci Sequence Generator
|
||||
|
||||
Let's say you want to build off of the previous [Fibonacci example](https://github.com/dotnet/core-docs/tree/master/samples/core-projects/console-apps/Fibonacci) by caching some Fibonacci values and adding some recursive flair. Your code for a [better Fibonacci example](https://github.com/dotnet/core-docs/tree/master/samples/core-projects/console-apps/FibonacciBetter) might look something like this:
|
||||
|
||||
```csharp
using System;
using System.Collections.Generic;

namespace NumberFun
{
    public class FibonacciGenerator
    {
        private Dictionary<int, int> _cache = new Dictionary<int, int>();

        private int Fib(int n) => n < 2 ? n : FibValue(n - 1) + FibValue(n - 2);

        private int FibValue(int n)
        {
            if (!_cache.ContainsKey(n))
            {
                _cache.Add(n, Fib(n));
            }

            return _cache[n];
        }

        public IEnumerable<int> Generate(int n)
        {
            for (int i = 0; i < n; i++)
            {
                yield return FibValue(i);
            }
        }
    }
}
```
|
||||
|
||||
Note that `Dictionary<int, int>` and `IEnumerable<int>` come from the `System.Collections.Generic` namespace, so the project needs the `System.Collections` assembly. The `Microsoft.NETCore.App` package is a *metapackage* that contains many of the core assemblies from .NET Core. By including this metapackage, you've already included the `System.Collections.dll` assembly as part of your project. You can verify this by running `dotnet publish` and examining the files that are part of the installed package. You'll see `System.Collections.dll` in the list.
|
||||
|
||||
```javascript
{
    "version": "1.0.0-*",
    "buildOptions": {
        "emitEntryPoint": true
    },
    "dependencies": {
        "Microsoft.NETCore.App": {
            "version": "1.0.0-rc2-3002702"
        }
    },
    "frameworks": {
        "netcoreapp1.0": {
            "imports": "dnxcore50"
        }
    },
    "runtimes": {
        "win10-x64": {},
        "osx.10.11-x64": {}
    }
}
```
|
||||
|
||||
Now adjust the `Main()` method in your `Program.cs` file as shown below. The example assumes that `Program.cs` has a `using System;` statement. If you have a `using static System.Console;` statement, remove `Console.` from `Console.WriteLine`.
|
||||
|
||||
```csharp
public static void Main(string[] args)
{
    var generator = new FibonacciGenerator();
    foreach (var digit in generator.Generate(15))
    {
        Console.WriteLine(digit);
    }
}
```
|
||||
|
||||
Finally, run it!
|
||||
|
||||
```
$ dotnet run
0
1
1
2
3
5
8
13
21
34
55
89
144
233
377
```
|
||||
|
||||
And that's it!
|
||||
|
||||
## Using folders to organize code
|
||||
|
||||
Say you wanted to introduce some new types to do work on. You can do this by adding more files and making sure to give them namespaces you can include in your `Program.cs` file.
|
||||
|
||||
```
/MyProject
|__Program.cs
|__AccountInformation.cs
|__MonthlyReportRecords.cs
|__project.json
```
|
||||
|
||||
This works great when the size of your project is relatively small. However, if you have a larger app with many different data types and potentially multiple layers, you may wish to organize things logically. This is where folders come into play. You can either follow along with [the NewTypes sample project](https://github.com/dotnet/core-docs/tree/master/samples/core-projects/console-apps/NewTypes) that this guide covers, or create your own files and folders.
|
||||
|
||||
To begin, create a new folder under the root of your project. `/Model` is chosen here.
|
||||
|
||||
```
/NewTypes
|__/Model
|__Program.cs
|__project.json
```

Now add some new types to the folder:

```
/NewTypes
|__/Model
   |__AccountInformation.cs
   |__MonthlyReportRecords.cs
|__Program.cs
|__project.json
```
|
||||
|
||||
Now, just as if they were files in the same directory, give them all the same namespace so you can include them in your `Program.cs`.
|
||||
|
||||
### Example: Pet Types
|
||||
|
||||
This example creates two new types, `Dog` and `Cat`, and has them implement an interface, `IPet`.
|
||||
|
||||
Folder Structure:
|
||||
|
||||
```
/NewTypes
|__/Pets
   |__Dog.cs
   |__Cat.cs
   |__IPet.cs
|__Program.cs
|__project.json
```
|
||||
|
||||
`IPet.cs`:
```csharp
using System;

namespace Pets
{
    public interface IPet
    {
        string TalkToOwner();
    }
}
```

`Dog.cs`:
```csharp
using System;

namespace Pets
{
    public class Dog : IPet
    {
        public string TalkToOwner() => "Woof!";
    }
}
```

`Cat.cs`:
```csharp
using System;

namespace Pets
{
    public class Cat : IPet
    {
        public string TalkToOwner() => "Meow!";
    }
}
```

`Program.cs`:
```csharp
using System;
using Pets;
using System.Collections.Generic;

namespace ConsoleApplication
{
    public class Program
    {
        public static void Main(string[] args)
        {
            List<IPet> pets = new List<IPet>
            {
                new Dog(),
                new Cat()
            };

            foreach (var pet in pets)
            {
                Console.WriteLine(pet.TalkToOwner());
            }
        }
    }
}
```

`project.json`:
```javascript
{
    "version": "1.0.0-*",
    "buildOptions": {
        "emitEntryPoint": true
    },
    "dependencies": {
        "Microsoft.NETCore.App": {
            "type": "platform",
            "version": "1.0.0-rc2-3002702"
        }
    },
    "frameworks": {
        "netcoreapp1.0": {
            "imports": "dnxcore50"
        }
    }
}
```

And if you run this:

```
$ dotnet restore
$ dotnet run
Woof!
Meow!
```
|
||||
|
||||
New pet types can be added (such as a `Bird`), extending this project.
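
As a minimal sketch of that extension, a hypothetical `Bird` type only needs to implement `IPet`. The `"Tweet!"` text is an assumption chosen for illustration, and the `IPet` interface is repeated here only so the snippet compiles on its own; in the NewTypes project it already lives in `IPet.cs`.

```csharp
namespace Pets
{
    // Repeated from IPet.cs so this sketch is self-contained.
    public interface IPet
    {
        string TalkToOwner();
    }

    // Hypothetical new pet type; the returned text is illustrative only.
    public class Bird : IPet
    {
        public string TalkToOwner() => "Tweet!";
    }
}
```

Because the loop in `Program.cs` depends only on `IPet`, adding `new Bird()` to the `pets` list should be the only other change needed.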
|
||||
|
||||
## Testing your Console App
|
||||
|
||||
You'll probably want to test your projects at some point. Here's a good way to do it:
|
||||
|
||||
1. Move the source files of your existing project into a new `src` folder.

```
/Project
|__/src
```

2. Create a `/test` directory.

```
/Project
|__/src
|__/test
```

3. Create a new `global.json` file:

```
/Project
|__/src
|__/test
|__global.json
```

`global.json`:
```javascript
{
    "projects": [
        "src", "test"
    ]
}
```
|
||||
|
||||
This file tells the build system that this is a multi-project system, which allows it to look for dependencies in more than just the current folder it happens to be executing in. This is important because it allows you to place a dependency on the code under test in your test project.
|
||||
|
||||
### Example: Extending the NewTypes project
|
||||
|
||||
Now that the project system is in place, you can create your test project and start writing tests! From here on out, this guide will use and extend [the sample NewTypes project](https://github.com/dotnet/core-docs/tree/master/samples/core-projects/console-apps/NewTypes). Additionally, it will use the [xUnit](https://xunit.github.io/) test framework. Feel free to follow along or create your own multi-project system with tests.
|
||||
|
||||
|
||||
The whole project structure should look like this:
|
||||
|
||||
```
/NewTypes
|__/src
   |__/NewTypes
      |__/Pets
         |__Dog.cs
         |__Cat.cs
         |__IPet.cs
      |__Program.cs
      |__project.json
|__/test
   |__NewTypesTests
      |__PetTests.cs
      |__project.json
|__global.json
```
|
||||
|
||||
There are two new things to make sure you have in your test project:
|
||||
|
||||
1. A correct `project.json` with the following:

   * A reference to `xunit`
   * A reference to `dotnet-test-xunit`
   * A reference to the project under test (`NewTypes`)

2. An xUnit test class.
|
||||
|
||||
`NewTypesTests/project.json`:
```javascript
{
    "version": "1.0.0-*",
    "testRunner": "xunit",

    "dependencies": {
        "Microsoft.NETCore.App": {
            "type":"platform",
            "version": "1.0.0-rc2-3002702"
        },
        "xunit":"2.1.0",
        "dotnet-test-xunit": "1.0.0-rc2-build10015",
        "NewTypes": "1.0.0"
    },
    "frameworks": {
        "netcoreapp1.0": {
            "imports": [
                "dnxcore50",
                "portable-net45+win8"
            ]
        }
    }
}
```

`PetTests.cs`:
```csharp
using System;
using Xunit;
using Pets;
public class PetTests
{
    [Fact]
    public void DogTalkToOwnerTest()
    {
        string expected = "Woof!";
        string actual = new Dog().TalkToOwner();

        Assert.Equal(expected, actual);
    }

    [Fact]
    public void CatTalkToOwnerTest()
    {
        string expected = "Meow!";
        string actual = new Cat().TalkToOwner();

        Assert.Equal(expected, actual);
    }
}
```
|
||||
|
||||
Now you can run tests! The [`dotnet test`](../tools/dotnet-test.md) command runs the test runner you have specified in your project. Make sure you start at the top-level directory.
|
||||
|
||||
```
$ dotnet restore
$ cd test/NewTypesTests
$ dotnet test
```

Output should look like this:

```
xUnit.net .NET CLI test runner (64-bit win10-x64)
Discovering: NewTypesTests
Discovered: NewTypesTests
Starting: NewTypesTests
Finished: NewTypesTests
=== TEST EXECUTION SUMMARY ===
NewTypesTests Total: 2, Errors: 0, Failed: 0, Skipped: 0, Time: 0.144s
SUMMARY: Total: 1 targets, Passed: 1, Failed: 0.
```
|
||||
|
||||
## Conclusion
|
||||
|
||||
Hopefully this guide has helped you learn how to create a .NET Core console app, from the basics all the way up to a multi-project system with unit tests. The next step is to create awesome console apps of your own!
|
||||
|
||||
If a more advanced example of a console app interests you, check out the next tutorial: [Using the CLI tools to write console apps: An advanced step-by-step guide](cli-console-app-tutorial-advanced.md).
|
||||
|
||||
|
|
@ -22,7 +22,7 @@ The scripts in this document describe the steps necessary to build a number of t
|
|||
Prerequisites
|
||||
-------------
|
||||
|
||||
These scripts assume that .NET Core 1.0 and VS Code or another code editor are installed on the machine.
|
||||
These scripts assume that .NET Core RC2 CLI SDK preview and VS Code or another code editor are installed on the machine.
|
||||
|
||||
A solution using only .NET Core projects
|
||||
----------------------------------------
|
||||
|
|
|
@ -22,7 +22,7 @@ The scripts in this document describe the steps necessary to build a number of t
|
|||
Prerequisites
|
||||
-------------
|
||||
|
||||
These scripts assume that .NET Core 1.0 and VS Code or another code editor are installed on the machine.
|
||||
These scripts assume that .NET Core RC2 CLI SDK preview and VS Code or another code editor are installed on the machine.
|
||||
|
||||
A solution using only .NET Core projects
|
||||
----------------------------------------
|
||||
|
|
|
@ -1,446 +0,0 @@
|
|||
---
title: Common Type System
description: Common Type System
keywords: .NET, .NET Core
author: shoag
manager: wpickett
ms.date: 06/20/2016
ms.topic: article
ms.prod: .net-core
ms.technology: .net-core-technologies
ms.devlang: dotnet
ms.assetid: bc6245bd-46ce-4800-bdf9-00e2989cfa6a
---
|
||||
|
||||
# Common Type System
|
||||
|
||||
The common type system defines how types are declared, used, and managed in the Common Language Runtime, and is also an important part of the runtime's support for cross-language integration. The common type system performs the following functions:
|
||||
|
||||
* Establishes a framework that helps enable cross-language integration, type safety, and high-performance code execution.
|
||||
|
||||
* Provides an object-oriented model that supports the complete implementation of many programming languages.
|
||||
|
||||
* Defines rules that languages must follow, which helps ensure that objects written in different languages can interact with each other.
|
||||
|
||||
* Provides a library that contains the primitive data types (such as [Boolean](https://docs.microsoft.com/dotnet/core/api/System.Boolean), [Byte](https://docs.microsoft.com/dotnet/core/api/System.Byte), [Char](https://docs.microsoft.com/dotnet/core/api/System.Char), [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32), and [UInt64](https://docs.microsoft.com/dotnet/core/api/System.UInt64)) used in application development.
|
||||
|
||||
This topic contains the following sections:
|
||||
|
||||
* [Types in .NET Core](#Types-in-.NET-Core)
|
||||
|
||||
* [Type Definitions](#Type-Definitions)
|
||||
|
||||
* [Type Members](#Type-Members)
|
||||
|
||||
* [Characteristics of Type Members](#Characteristics-of-Type-Members)
|
||||
|
||||
|
||||
## Types in .NET Core
|
||||
|
||||
All types in .NET Core are either value types or reference types.
|
||||
|
||||
Value types are data types whose objects are represented by the object's actual value. If an instance of a value type is assigned to a variable, that variable is given a fresh copy of the value.
|
||||
|
||||
Reference types are data types whose objects are represented by a reference (similar to a pointer) to the object's actual value. If a reference type is assigned to a variable, that variable references (points to) the original value. No copy is made.
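
The difference in assignment behavior can be sketched as follows. `PointValue`, `PointReference`, and `CopySemantics` are hypothetical names introduced only for this illustration; they are not part of .NET Core.

```csharp
// Hypothetical value type.
public struct PointValue
{
    public int X;
}

// Hypothetical reference type.
public class PointReference
{
    public int X;
}

public static class CopySemantics
{
    // Returns the X of the original struct after a copy of it is modified.
    public static int ModifyStructCopy()
    {
        var original = new PointValue { X = 1 };
        var copy = original;   // the new variable receives a fresh copy of the value
        copy.X = 42;
        return original.X;     // the original is unchanged
    }

    // Returns the X of the original object after it is modified through a second variable.
    public static int ModifyClassAlias()
    {
        var original = new PointReference { X = 1 };
        var alias = original;  // both variables reference the same object; no copy is made
        alias.X = 42;
        return original.X;     // the change is visible through the original variable
    }
}
```

Here `ModifyStructCopy()` returns 1 and `ModifyClassAlias()` returns 42, which mirrors the two paragraphs above.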
|
||||
|
||||
The common type system in .NET Core supports the following five categories of types:
|
||||
|
||||
* [Classes](#Classes)
|
||||
|
||||
* [Structures](#Structures)
|
||||
|
||||
* [Enumerations](#Enumerations)
|
||||
|
||||
* [Interfaces](#Interfaces)
|
||||
|
||||
* [Delegates](#Delegates)
|
||||
|
||||
### Classes
|
||||
|
||||
A class is a reference type that can be derived directly from another class and that is derived implicitly from [System.Object](https://docs.microsoft.com/dotnet/core/api/System.Object). The class defines the operations that an object (which is an instance of the class) can perform (methods, events, or properties) and the data that the object contains (fields). Although a class generally includes both definition and implementation (unlike interfaces, for example, which contain only definition without implementation), it can have one or more members that have no implementation.
|
||||
|
||||
The following table describes some of the characteristics that a class may have. Each language that supports the runtime provides a way to indicate that a class or class member has one or more of these characteristics. However, individual programming languages that target .NET Core may not make all these characteristics available.
|
||||
|
||||
Characteristic | Description
-------------- | -----------
sealed | Specifies that another class cannot be derived from this type.
implements | Indicates that the class uses one or more interfaces by providing implementations of interface members.
abstract | Indicates that the class cannot be instantiated. To use it, you must derive another class from it.
inherits | Indicates that instances of the class can be used anywhere the base class is specified. A derived class that inherits from a base class can use the implementation of any public members provided by the base class, or the derived class can override the implementation of the public members with its own implementation.
exported or not exported | Indicates whether a class is visible outside the assembly in which it is defined. This characteristic applies only to top-level classes and not to nested classes.
|
||||
|
||||
> **Note**
>
> A class can also be nested in a parent class or structure. Nested classes also have member characteristics. For more information, see [Nested Types](#Nested-Types).
|
||||
|
||||
Class members that have no implementation are abstract members. A class that has one or more abstract members is itself abstract; new instances of it cannot be created. Some languages that target the runtime let you mark a class as abstract even if none of its members are abstract. You can use an abstract class when you want to encapsulate a basic set of functionality that derived classes can inherit or override when appropriate. Classes that are not abstract are referred to as concrete classes.
|
||||
|
||||
A class can implement any number of interfaces, but it can inherit from only one base class in addition to [System.Object](https://docs.microsoft.com/dotnet/core/api/System.Object), from which all classes inherit implicitly. All classes must have at least one constructor, which initializes new instances of the class. If you do not explicitly define a constructor, most compilers will automatically provide a default (parameterless) constructor.
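
A minimal C# sketch of these characteristics follows. The `Shape` and `Square` types are hypothetical and exist only to show an abstract member, an inherited concrete member, and a sealed concrete class.

```csharp
// Hypothetical abstract class: it cannot be instantiated directly.
public abstract class Shape
{
    // Abstract member: declared here, implemented by derived classes.
    public abstract double Area();

    // Concrete member that derived classes inherit as-is.
    public string Describe() => $"{GetType().Name} with area {Area()}";
}

// sealed: no other class can derive from Square.
public sealed class Square : Shape
{
    private readonly double _side;

    // Explicit constructor that initializes new instances.
    public Square(double side)
    {
        _side = side;
    }

    // Supplies the implementation for the abstract member.
    public override double Area() => _side * _side;
}
```

In C#, writing `new Shape()` is rejected by the compiler, while `new Square(2)` can be used anywhere a `Shape` is expected.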
|
||||
|
||||
### Structures
|
||||
|
||||
A structure is a value type that derives implicitly from [System.ValueType](https://docs.microsoft.com/dotnet/core/api/System.ValueType), which in turn is derived from [System.Object](https://docs.microsoft.com/dotnet/core/api/System.Object). A structure is very useful for representing values whose memory requirements are small, and for passing values as by-value parameters to methods that have strongly typed parameters. In .NET Core, all primitive data types ([Boolean](https://docs.microsoft.com/dotnet/core/api/System.Boolean), [Byte](https://docs.microsoft.com/dotnet/core/api/System.Byte), [Char](https://docs.microsoft.com/dotnet/core/api/System.Char), [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime), [Decimal](https://docs.microsoft.com/dotnet/core/api/System.Decimal), [Double](https://docs.microsoft.com/dotnet/core/api/System.Double), [Int16](https://docs.microsoft.com/dotnet/core/api/System.Int16), [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32), [Int64](https://docs.microsoft.com/dotnet/core/api/System.Int64), [SByte](https://docs.microsoft.com/dotnet/core/api/System.SByte), [Single](https://docs.microsoft.com/dotnet/core/api/System.Single), [UInt16](https://docs.microsoft.com/dotnet/core/api/System.UInt16), [UInt32](https://docs.microsoft.com/dotnet/core/api/System.UInt32), and [UInt64](https://docs.microsoft.com/dotnet/core/api/System.UInt64)) are defined as structures.
|
||||
|
||||
Like classes, structures define both data (the fields of the structure) and the operations that can be performed on that data (the methods of the structure). This means that you can call methods on structures, including the virtual methods defined on the `System.Object` and `System.ValueType` classes, and any methods defined on the value type itself. In other words, structures can have fields, properties, and events, as well as static and nonstatic methods. You can create instances of structures, pass them as parameters, store them as local variables, or store them in a field of another value type or reference type. Structures can also implement interfaces.
|
||||
|
||||
Value types also differ from classes in several respects. First, although they implicitly inherit from `System.ValueType`, they cannot directly inherit from any type. Similarly, all value types are sealed, which means that no other type can be derived from them. They also do not require constructors.
|
||||
|
||||
For each value type, the common language runtime supplies a corresponding boxed type, which is a class that has the same state and behavior as the value type. An instance of a value type is boxed when it is passed to a method that accepts a parameter of type `System.Object`. It is unboxed (that is, converted from an instance of a class back to an instance of a value type) when control returns from a method call that accepts a value type as a by-reference parameter. Some languages require that you use special syntax when the boxed type is required; others automatically use the boxed type when it is needed. When you define a value type, you are defining both the boxed and the unboxed type.
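
In C#, boxing happens implicitly on assignment to `object`, and unboxing needs an explicit cast. The following is a small sketch of that behavior; `BoxingSketch` is a hypothetical name used only here.

```csharp
public static class BoxingSketch
{
    // Shows that boxing copies the value into a separate object.
    public static int BoxThenChange()
    {
        int number = 5;
        object boxed = number;     // boxing: the value is copied into an object
        number = 10;               // does not affect the boxed copy
        int unboxed = (int)boxed;  // unboxing: explicit cast back to the value type
        return unboxed;
    }

    // The boxed object still reports the value type as its runtime type.
    public static bool BoxedTypeIsInt32()
    {
        object boxed = 5;
        return boxed is int && boxed.GetType() == typeof(int);
    }
}
```

`BoxThenChange()` returns 5, because the boxed object holds its own copy of the value.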
|
||||
|
||||
### Enumerations
|
||||
|
||||
An enumeration (enum) is a value type that inherits directly from [System.Enum](https://docs.microsoft.com/dotnet/core/api/System.Enum) and that supplies alternate names for the values of an underlying primitive type. An enumeration type has a name, an underlying type that must be one of the built-in signed or unsigned integer types (such as [Byte](https://docs.microsoft.com/dotnet/core/api/System.Byte), [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32), or [UInt64](https://docs.microsoft.com/dotnet/core/api/System.UInt64)), and a set of fields. The fields are static literal fields, each of which represents a constant. The same value can be assigned to multiple fields. When this occurs, you must mark one of the values as the primary enumeration value for reflection and string conversion.
|
||||
|
||||
You can assign a value of the underlying type to an enumeration and vice versa (no cast is required by the runtime). You can create an instance of an enumeration and call the methods of `System.Enum`, as well as any methods defined on the enumeration's underlying type. However, some languages might not let you pass an enumeration as a parameter when an instance of the underlying type is required (or vice versa).
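
C# is one language that requires an explicit cast in both directions, even though the runtime does not. The sketch below assumes a hypothetical `Priority` enumeration with `byte` as its underlying type.

```csharp
// Hypothetical enumeration with an explicit underlying type.
public enum Priority : byte
{
    Low = 1,
    High = 2,
    Urgent = 2   // the same value can be assigned to more than one field
}

public static class EnumConversions
{
    // C# requires an explicit cast to get the underlying value.
    public static byte ToUnderlying(Priority p) => (byte)p;

    // C# also requires an explicit cast to go from the underlying type to the enumeration.
    public static Priority FromUnderlying(byte b) => (Priority)b;
}
```

`System.Enum.GetUnderlyingType(typeof(Priority))` returns `typeof(byte)` for this declaration.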
|
||||
|
||||
The following additional restrictions apply to enumerations:
|
||||
|
||||
* They cannot define their own methods.
|
||||
|
||||
* They cannot implement interfaces.
|
||||
|
||||
* They cannot define properties or events.
|
||||
|
||||
* They cannot be generic, unless they are generic only because they are nested within a generic type. That is, an enumeration cannot have type parameters of its own.
|
||||
|
||||
> **Note**
>
> Nested types (including enumerations) created with C# include the type parameters of all enclosing generic types, and are therefore generic even if they do not have type parameters of their own. For more information, see the [Type.MakeGenericType](https://docs.microsoft.com/dotnet/core/api/System.Type#System_Type_MakeGenericType_System_Type___) reference topic.
|
||||
|
||||
The [FlagsAttribute](https://docs.microsoft.com/dotnet/core/api/System.FlagsAttribute) attribute denotes a special kind of enumeration called a bit field. The runtime itself does not distinguish between traditional enumerations and bit fields, but your language might do so. When this distinction is made, bitwise operators can be used on bit fields, but not on enumerations, to generate unnamed values. Enumerations are generally used for lists of unique elements, such as days of the week, country or region names, and so on. Bit fields are generally used for lists of qualities or quantities that might occur in combination, such as `Red And Big And Fast`.
|
||||
|
||||
The following example shows how to use both bit fields and traditional enumerations.
|
||||
|
||||
```csharp
using System;
using System.Collections.Generic;

// A traditional enumeration of some root vegetables.
public enum SomeRootVegetables
{
    HorseRadish,
    Radish,
    Turnip
}

// A bit field or flag enumeration of harvesting seasons.
[Flags]
public enum Seasons
{
    None = 0,
    Summer = 1,
    Autumn = 2,
    Winter = 4,
    Spring = 8,
    All = Summer | Autumn | Winter | Spring
}

public class Example
{
    public static void Main()
    {
        // Hash table of when vegetables are available.
        Dictionary<SomeRootVegetables, Seasons> AvailableIn = new Dictionary<SomeRootVegetables, Seasons>();

        AvailableIn[SomeRootVegetables.HorseRadish] = Seasons.All;
        AvailableIn[SomeRootVegetables.Radish] = Seasons.Spring;
        AvailableIn[SomeRootVegetables.Turnip] = Seasons.Spring |
            Seasons.Autumn;

        // Array of the seasons, using the enumeration.
        Seasons[] theSeasons = new Seasons[] { Seasons.Summer, Seasons.Autumn,
            Seasons.Winter, Seasons.Spring };

        // Print information of what vegetables are available each season.
        foreach (Seasons season in theSeasons)
        {
            Console.Write(String.Format(
                "The following root vegetables are harvested in {0}:\n",
                season.ToString("G")));
            foreach (KeyValuePair<SomeRootVegetables, Seasons> item in AvailableIn)
            {
                // A bitwise comparison.
                if (((Seasons)item.Value & season) > 0)
                    Console.Write(String.Format(" {0:G}\n",
                        (SomeRootVegetables)item.Key));
            }
        }
    }
}
// The example displays the following output:
// The following root vegetables are harvested in Summer:
// HorseRadish
// The following root vegetables are harvested in Autumn:
// Turnip
// HorseRadish
// The following root vegetables are harvested in Winter:
// HorseRadish
// The following root vegetables are harvested in Spring:
// Turnip
// Radish
// HorseRadish
```
|
||||
|
||||
### Interfaces
|
||||
|
||||
An interface defines a contract that specifies a "can do" relationship or a "has a" relationship. Interfaces are often used to implement functionality, such as comparing and sorting (the [IComparable](https://docs.microsoft.com/dotnet/core/api/System.IComparable) and [IComparable<T>](https://docs.microsoft.com/dotnet/core/api/System.IComparable%601) interfaces), testing for equality (the [IEquatable<T>](https://docs.microsoft.com/dotnet/core/api/System.IEquatable%601) interface), or enumerating items in a collection (the [IEnumerable](https://docs.microsoft.com/dotnet/core/api/System.Collections.IEnumerable) and [IEnumerable<T>](https://docs.microsoft.com/dotnet/core/api/System.Collections.Generic.IEnumerable%601) interfaces). Interfaces can have properties, methods, and events, all of which are abstract members; that is, although the interface defines the members and their signatures, it leaves it to the type that implements the interface to define the functionality of each interface member. This means that any class or structure that implements an interface must supply definitions for the abstract members declared in the interface. An interface can require any implementing class or structure to also implement one or more other interfaces.
|
||||
|
||||
The following restrictions apply to interfaces:
|
||||
|
||||
* An interface can be declared with any accessibility, but interface members must all have public accessibility.
|
||||
|
||||
* Interfaces cannot define constructors.
|
||||
|
||||
* Interfaces cannot define fields.
|
||||
|
||||
* Interfaces can define only instance members. They cannot define static members.
|
||||
|
||||
Each language must provide rules for mapping an implementation to the interface that requires the member, because more than one interface can declare a member with the same signature, and these members can have separate implementations.
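
As one illustration of such a mapping rule, C# offers explicit interface implementation. The following is a minimal sketch, not a prescribed pattern; the `IPrinter` and `ILogger` interfaces and the `Device` class are hypothetical names invented for this example.

```csharp
using System;

// Two hypothetical interfaces that declare a member with the same signature.
public interface IPrinter
{
    string Write(string text);
}

public interface ILogger
{
    string Write(string text);
}

// Explicit interface implementation gives each interface its own implementation.
public class Device : IPrinter, ILogger
{
    string IPrinter.Write(string text)
    {
        return "Printed: " + text;
    }

    string ILogger.Write(string text)
    {
        return "Logged: " + text;
    }
}

public class InterfaceMappingExample
{
    public static void Main()
    {
        Device device = new Device();
        // The member that runs depends on the interface used to make the call.
        Console.WriteLine(((IPrinter)device).Write("report"));   // Printed: report
        Console.WriteLine(((ILogger)device).Write("report"));    // Logged: report
    }
}
```

Because both members are implemented explicitly, neither is callable directly on a `Device` variable; the caller has to choose an interface first.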
|
||||
|
||||
### Delegates
|
||||
|
||||
Delegates are reference types that serve a purpose similar to that of function pointers in C++. They are used for event handlers and callback functions in .NET Core. Unlike function pointers, delegates are secure, verifiable, and type safe. A delegate type can represent any instance method or static method that has a compatible signature.
|
||||
|
||||
A parameter of a delegate is compatible with the corresponding parameter of a method if the type of the delegate parameter is more restrictive than the type of the method parameter, because this guarantees that an argument passed to the delegate can be passed safely to the method.
|
||||
|
||||
Similarly, the return type of a delegate is compatible with the return type of a method if the return type of the method is more restrictive than the return type of the delegate, because this guarantees that the return value of the method can be cast safely to the return type of the delegate.
|
||||
|
||||
For example, a delegate that has a parameter of type [IEnumerable](https://docs.microsoft.com/dotnet/core/api/System.Collections.IEnumerable) and a return type of [Object](https://docs.microsoft.com/dotnet/core/api/System.Object) can represent a method that has a parameter of type `Object` and a return value of type `IEnumerable`.
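
The compatibility rule described above can be sketched in C# as follows. The `SequenceHandler` delegate type and the `Wrap` method are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical delegate type: takes an IEnumerable and returns an Object.
public delegate object SequenceHandler(IEnumerable items);

public class DelegateCompatibilityExample
{
    // Takes the less restrictive Object and returns the more restrictive IEnumerable.
    public static IEnumerable Wrap(object value)
    {
        return new List<object> { value };
    }

    public static void Main()
    {
        // Allowed: any IEnumerable argument is also an Object, and the
        // IEnumerable return value can be treated as an Object.
        SequenceHandler handler = Wrap;
        object result = handler(new int[] { 1, 2, 3 });
        Console.WriteLine(result is IEnumerable);   // True
    }
}
```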
|
||||
|
||||
A delegate is said to be bound to the method it represents. In addition to being bound to the method, a delegate can be bound to an object. The object represents the first parameter of the method, and is passed to the method every time the delegate is invoked. If the method is an instance method, the bound object is passed as the implicit `this` parameter; if the method is static, the object is passed as the first formal parameter of the method, and the delegate signature must match the remaining parameters.
|
||||
|
||||
All delegates inherit from [System.MulticastDelegate](https://docs.microsoft.com/dotnet/core/api/System.MulticastDelegate), which inherits from [System.Delegate](https://docs.microsoft.com/dotnet/core/api/System.Delegate). The C# language doesn't allow inheritance from these types. Instead, it provides keywords for declaring delegates.
|
||||
|
||||
Because delegates inherit from `MulticastDelegate`, a delegate has an invocation list, which is a list of methods that the delegate represents and that are executed when the delegate is invoked. All methods in the list receive the arguments supplied when the delegate is invoked.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> The return value is not defined for a delegate that has more than one method in its invocation list, even if the delegate has a return type.
|
||||
|
||||
In many cases, such as with callback methods, a delegate represents only one method, and the only actions you have to take are creating the delegate and invoking it.
|
||||
|
||||
For delegates that represent multiple methods, .NET Core provides methods of the `Delegate` and `MulticastDelegate` delegate classes to support operations such as adding a method to a delegate's invocation list (the [Delegate.Combine](https://docs.microsoft.com/dotnet/core/api/System.Delegate#System_Delegate_Combine_System_Delegate_System_Delegate_) method), removing a method (the [Delegate.Remove](https://docs.microsoft.com/dotnet/core/api/System.Delegate#System_Delegate_Remove_System_Delegate_System_Delegate_) method), and getting the invocation list (the [Delegate.GetInvocationList](https://docs.microsoft.com/dotnet/core/api/System.Delegate#System_Delegate_GetInvocationList) method).
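
As a rough sketch of how these three methods fit together, the example below combines two methods into one delegate, inspects the invocation list, and then removes one of the methods. The class and method names are hypothetical, and `Action<string>` is used only to keep the example short.

```csharp
using System;
using System.Collections.Generic;

public class InvocationListExample
{
    private static readonly List<string> Log = new List<string>();

    private static void First(string message) { Log.Add("First: " + message); }
    private static void Second(string message) { Log.Add("Second: " + message); }

    // Builds a two-method delegate, invokes it, removes one method, and invokes it again.
    public static List<string> Run()
    {
        Log.Clear();
        Action<string> first = First;
        Action<string> second = Second;

        // Delegate.Combine returns a new delegate whose invocation list holds both methods.
        Action<string> both = (Action<string>)Delegate.Combine(first, second);
        Log.Add("Count: " + both.GetInvocationList().Length);
        both("hello");

        // Delegate.Remove returns a delegate without the removed method.
        Action<string> onlyFirst = (Action<string>)Delegate.Remove(both, second);
        Log.Add("Count: " + onlyFirst.GetInvocationList().Length);
        onlyFirst("again");
        return Log;
    }

    public static void Main()
    {
        foreach (string line in Run())
        {
            Console.WriteLine(line);
        }
    }
}
```

Delegates are immutable, so `Combine` and `Remove` leave their arguments unchanged and return new delegate instances.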
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> It is not necessary to use these methods for event-handler delegates in C#, because C# provides syntax for adding and removing event handlers.
|
||||
|
||||
## Type Definitions
|
||||
|
||||
A type definition includes the following:
|
||||
|
||||
* Any attributes defined on the type.
|
||||
|
||||
* The type's accessibility (visibility).
|
||||
|
||||
* The type's name.
|
||||
|
||||
* The type's base type.
|
||||
|
||||
* Any interfaces implemented by the type.
|
||||
|
||||
* Definitions for each of the type's members.
|
||||
|
||||
### Attributes
|
||||
|
||||
Attributes provide additional user-defined metadata. Most commonly, they are used to store additional information about a type in its assembly, or to modify the behavior of a type member in either the design-time or run-time environment.
|
||||
|
||||
Attributes are themselves classes that inherit from [System.Attribute](https://docs.microsoft.com/dotnet/core/api/System.Attribute). Languages that support the use of attributes each have their own syntax for applying attributes to a language element. Attributes can be applied to almost any language element; the specific elements to which an attribute can be applied are defined by the [AttributeUsageAttribute](https://docs.microsoft.com/dotnet/core/api/System.AttributeUsageAttribute) that is applied to that attribute class.
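
The following sketch shows one way a custom attribute might be declared, restricted with `AttributeUsageAttribute`, and read back through reflection. `AuthorAttribute` and `Widget` are hypothetical names invented for this example.

```csharp
using System;
using System.Reflection;

// Hypothetical attribute; AttributeUsage restricts it to classes and structures.
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Struct)]
public class AuthorAttribute : Attribute
{
    public AuthorAttribute(string name)
    {
        Name = name;
    }

    public string Name { get; private set; }
}

[Author("Pat")]
public class Widget
{
}

public class AttributeExample
{
    // Reads the metadata back at run time through reflection.
    public static string GetAuthor(Type type)
    {
        AuthorAttribute attribute = type.GetTypeInfo().GetCustomAttribute<AuthorAttribute>();
        return attribute == null ? null : attribute.Name;
    }

    public static void Main()
    {
        Console.WriteLine(GetAuthor(typeof(Widget)));                     // Pat
        Console.WriteLine(GetAuthor(typeof(AttributeExample)) == null);   // True
    }
}
```

Applying `[Author("Pat")]` to a method would be rejected by the compiler, because the attribute's usage is limited to classes and structures.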
|
||||
|
||||
### Type Accessibility
|
||||
|
||||
All types have a modifier that governs their accessibility from other types. The following table describes the type accessibilities supported by the runtime.
|
||||
|
||||
Accessibility | Description
|
||||
------------- | -----------
|
||||
public | The type is accessible by all assemblies.
|
||||
assembly | The type is accessible only from within its assembly.
|
||||
|
||||
The accessibility of a nested type depends on its accessibility domain, which is determined by both the declared accessibility of the member and the accessibility domain of the immediately containing type. However, the accessibility domain of a nested type cannot exceed that of the containing type.
|
||||
|
||||
The accessibility domain of a nested member `M` declared in a type `T` within a program `P` is defined as follows (noting that `M` might itself be a type):
|
||||
|
||||
* If the declared accessibility of `M` is `public`, the accessibility domain of `M` is the accessibility domain of `T`.
|
||||
|
||||
* If the declared accessibility of `M` is `protected internal`, the accessibility domain of `M` is the intersection of the accessibility domain of `T` with the program text of `P` and the program text of any type derived from `T` declared outside `P`.
|
||||
|
||||
* If the declared accessibility of `M` is `protected`, the accessibility domain of `M` is the intersection of the accessibility domain of `T` with the program text of `T` and any type derived from `T`.
|
||||
|
||||
* If the declared accessibility of `M` is `internal`, the accessibility domain of `M` is the intersection of the accessibility domain of `T` with the program text of `P`.
|
||||
|
||||
* If the declared accessibility of `M` is `private`, the accessibility domain of `M` is the program text of `T`.
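
The rules above can be made concrete with a small C# sketch. The `Account` and `SavingsAccount` types are hypothetical, and the comments describe each member's accessibility domain as an approximation of the formal definitions.

```csharp
using System;

public class Account
{
    private decimal balance;              // Domain: the program text of Account only.
    protected string Owner = "unknown";   // Domain: Account and types derived from it.
    internal int Id = 42;                 // Domain: the program (assembly) that declares Account.

    public decimal Balance                // Domain: the same as the domain of Account.
    {
        get { return balance; }
    }

    public void Deposit(decimal amount)
    {
        balance += amount;                // Allowed: inside the program text of Account.
    }
}

public class SavingsAccount : Account
{
    public string Describe()
    {
        // Allowed: the protected and internal members are both accessible here.
        return Owner + " #" + Id;
    }
}

public class AccessibilityExample
{
    public static void Main()
    {
        SavingsAccount account = new SavingsAccount();
        account.Deposit(10m);
        Console.WriteLine(account.Balance == 10m);   // True
        Console.WriteLine(account.Describe());       // unknown #42
        // account.balance and account.Owner would not compile here,
        // because this code lies outside their accessibility domains.
    }
}
```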
|
||||
|
||||
### Type Names
|
||||
|
||||
The common type system imposes only two restrictions on names:
|
||||
|
||||
* All names are encoded as strings of Unicode (16-bit) characters.
|
||||
|
||||
* Names are not permitted to have an embedded (16-bit) value of 0x0000.
|
||||
|
||||
However, most languages impose additional restrictions on type names. All comparisons are done on a byte-by-byte basis, and are therefore case-sensitive and locale-independent.
|
||||
|
||||
Although a type might reference types from other modules and assemblies, a type must be fully defined within one .NET Core module. (Depending on compiler support, however, it can be divided into multiple source code files.) Type names need be unique only within a namespace. To fully identify a type, the type name must be qualified by the namespace that contains the implementation of the type.
|
||||
|
||||
### Base Types and Interfaces
|
||||
|
||||
A type can inherit values and behaviors from another type. The common type system does not allow types to inherit from more than one base type.
|
||||
|
||||
A type can implement any number of interfaces. To implement an interface, a type must implement all the virtual members of that interface. A virtual method can be implemented by a derived type and can be invoked either statically or dynamically.
|
||||
|
||||
## Type Members
|
||||
|
||||
The runtime enables you to define members of your type, which specify the behavior and state of the type. Type members include the following:
|
||||
|
||||
* [Fields](#Fields)
|
||||
|
||||
* [Properties](#Properties)
|
||||
|
||||
* [Methods](#Methods)
|
||||
|
||||
* [Constructors](#Constructors)
|
||||
|
||||
* [Events](#Events)
|
||||
|
||||
* [Nested Types](#Nested-Types)
|
||||
|
||||
### Fields
|
||||
|
||||
A field describes and contains part of the type's state. Fields can be of any type supported by the runtime. Most commonly, fields are either `private` or `protected`, so that they are accessible only from within the class or from a derived class. If the value of a field can be modified from outside its type, a property set accessor is typically used. Publicly exposed fields are usually read-only and can be of two types:
|
||||
|
||||
* Constants, whose value is assigned at design time. These are static members of a class, although they are not defined using the `static` keyword.
|
||||
|
||||
* Read-only variables, whose values can be assigned in the class constructor.
|
||||
|
||||
The following example illustrates these two usages of read-only fields.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
|
||||
public class Constants
|
||||
{
|
||||
public const double Pi = 3.1416;
|
||||
public readonly DateTime BirthDate;
|
||||
|
||||
public Constants(DateTime birthDate)
|
||||
{
|
||||
this.BirthDate = birthDate;
|
||||
}
|
||||
}
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
Constants con = new Constants(new DateTime(1974, 8, 18));
|
||||
Console.Write(Constants.Pi + "\n");
|
||||
Console.Write(con.BirthDate.ToString("d") + "\n");
|
||||
}
|
||||
}
|
||||
// The example displays the following output if run on a system whose current
|
||||
// culture is en-US:
|
||||
// 3.1416
|
||||
// 8/18/1974
|
||||
```
|
||||
|
||||
### Properties
|
||||
|
||||
A property names a value or state of the type and defines methods for getting or setting the property's value. Properties can be primitive types, collections of primitive types, user-defined types, or collections of user-defined types. Properties are often used to keep the public interface of a type independent from the type's actual representation. This enables properties to reflect values that are not directly stored in the class (for example, when a property returns a computed value) or to perform validation before values are assigned to private fields. The following example illustrates the latter pattern.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
|
||||
public class Person
|
||||
{
|
||||
private int m_Age;
|
||||
|
||||
public int Age
|
||||
{
|
||||
get { return m_Age; }
|
||||
set {
|
||||
if (value < 0 || value > 125)
|
||||
{
|
||||
throw new ArgumentOutOfRangeException("value", "The value of the Age property must be between 0 and 125.");
|
||||
}
|
||||
else
|
||||
{
|
||||
m_Age = value;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
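
The other pattern mentioned above, a property that returns a computed value rather than a stored one, might look like the following sketch. The `Rectangle` class is a hypothetical example.

```csharp
using System;

public class Rectangle
{
    public Rectangle(double width, double height)
    {
        Width = width;
        Height = height;
    }

    public double Width { get; private set; }
    public double Height { get; private set; }

    // No backing field: the value is computed each time the property is read.
    public double Area
    {
        get { return Width * Height; }
    }
}

public class ComputedPropertyExample
{
    public static void Main()
    {
        Rectangle rect = new Rectangle(3, 4);
        Console.WriteLine(rect.Area);   // 12
    }
}
```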
|
||||
|
||||
In addition to including the property itself, the Microsoft intermediate language (MSIL) for a type that contains a readable property includes a `get_`*propertyname* method, and the MSIL for a type that contains a writable property includes a `set_`*propertyname* method.
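
One way to observe these accessor methods is through reflection, as in the sketch below. The `Sample` class is hypothetical.

```csharp
using System;
using System.Reflection;

public class Sample
{
    public int Age { get; set; }
}

public class AccessorNameExample
{
    public static void Main()
    {
        PropertyInfo property = typeof(Sample).GetTypeInfo().GetDeclaredProperty("Age");
        // The compiler emits one method per accessor alongside the property.
        Console.WriteLine(property.GetMethod.Name);   // get_Age
        Console.WriteLine(property.SetMethod.Name);   // set_Age
    }
}
```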
|
||||
|
||||
### Methods
|
||||
|
||||
A method describes operations that are available on the type. A method's signature specifies the allowable types of all its parameters and of its return value.
|
||||
|
||||
Although most methods define the precise number of parameters required for method calls, some methods support a variable number of parameters. The final declared parameter of these methods is marked with the [ParamArrayAttribute](https://docs.microsoft.com/dotnet/core/api/System.ParamArrayAttribute) attribute. Language compilers typically provide a keyword, such as `params` in C#, that makes explicit use of `ParamArrayAttribute` unnecessary.
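
A short sketch of a method that accepts a variable number of arguments through the C# `params` keyword follows. The `Sum` method is a hypothetical example.

```csharp
using System;

public class ParamsExample
{
    // The compiler marks the final parameter with ParamArrayAttribute.
    public static int Sum(params int[] values)
    {
        int total = 0;
        foreach (int value in values)
        {
            total += value;
        }
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(Sum());                     // 0
        Console.WriteLine(Sum(1, 2, 3));              // 6
        Console.WriteLine(Sum(new int[] { 4, 5 }));   // 9
    }
}
```

Callers can pass individual arguments, no arguments at all, or an existing array.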
|
||||
|
||||
### Constructors
|
||||
|
||||
A constructor is a special kind of method that creates new instances of a class or structure. Like any other method, a constructor can include parameters; however, constructors have no return value (that is, they return `void`).
|
||||
|
||||
If the source code for a class does not explicitly define a constructor, the compiler includes a default (parameterless) constructor. However, if the source code for a class defines only parameterized constructors, the C# compiler doesn't generate a parameterless constructor.
|
||||
|
||||
If the source code for a structure defines constructors, they must be parameterized; a structure cannot define a default (parameterless) constructor, and compilers do not generate parameterless constructors for structures or other value types. All value types do have an implicit default constructor. This constructor is implemented by the common language runtime and initializes all fields of the structure to their default values.
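
The behavior of the implicit default constructor for value types can be sketched as follows. The `Point` structure is a hypothetical example that defines only a parameterized constructor.

```csharp
using System;

public struct Point
{
    public int X;
    public int Y;

    // A parameterized constructor defined in source code.
    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }
}

public class ConstructorExample
{
    public static void Main()
    {
        // The implicit default constructor sets every field to its default value.
        Point origin = new Point();
        Console.WriteLine(origin.X + "," + origin.Y);   // 0,0

        Point p = new Point(3, 4);
        Console.WriteLine(p.X + "," + p.Y);             // 3,4
    }
}
```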
|
||||
|
||||
### Events
|
||||
|
||||
An event defines an incident that can be responded to, and defines methods for subscribing to, unsubscribing from, and raising the event. Events are often used to inform other types of state changes.
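
The following is a minimal sketch of a type that raises an event when its state changes, along with code that subscribes to and unsubscribes from it. The `Thermostat` class and its members are hypothetical names.

```csharp
using System;

public class Thermostat
{
    private int temperature;

    // Subscribers are notified whenever the temperature changes.
    public event EventHandler TemperatureChanged;

    public int Temperature
    {
        get { return temperature; }
        set
        {
            if (temperature != value)
            {
                temperature = value;
                OnTemperatureChanged();
            }
        }
    }

    protected virtual void OnTemperatureChanged()
    {
        // Copy to a local so the null check and the call see the same delegate.
        EventHandler handler = TemperatureChanged;
        if (handler != null)
        {
            handler(this, EventArgs.Empty);
        }
    }
}

public class EventExample
{
    private static int notifications;

    private static void HandleChange(object sender, EventArgs e)
    {
        notifications++;
    }

    public static void Main()
    {
        Thermostat thermostat = new Thermostat();
        thermostat.TemperatureChanged += HandleChange;   // subscribe
        thermostat.Temperature = 21;                     // raises the event
        thermostat.Temperature = 21;                     // no change, so no event
        thermostat.TemperatureChanged -= HandleChange;   // unsubscribe
        thermostat.Temperature = 25;                     // no subscribers remain
        Console.WriteLine(notifications);                // 1
    }
}
```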
|
||||
|
||||
### Nested Types
|
||||
|
||||
A nested type is a type that is a member of some other type. Nested types should be tightly coupled to their containing type and must not be useful as a general-purpose type. Nested types are useful when the declaring type uses and creates instances of the nested type, and use of the nested type is not exposed in public members.
|
||||
|
||||
Nested types are confusing to some developers and should not be publicly visible unless there is a compelling reason for visibility. In a well-designed library, developers should rarely have to use nested types to instantiate objects or declare variables.
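
As an illustration of a nested type that stays an implementation detail, the sketch below hides a `Node` class inside a hypothetical `SimpleStack` type. Callers use only `Push` and `Pop` and never see `Node`.

```csharp
using System;

public class SimpleStack
{
    // Private nested type: created and used only by the containing type.
    private class Node
    {
        public int Value;
        public Node Next;
    }

    private Node top;

    public void Push(int value)
    {
        top = new Node { Value = value, Next = top };
    }

    public int Pop()
    {
        if (top == null)
        {
            throw new InvalidOperationException("The stack is empty.");
        }
        int value = top.Value;
        top = top.Next;
        return value;
    }
}

public class NestedTypeExample
{
    public static void Main()
    {
        SimpleStack stack = new SimpleStack();
        stack.Push(1);
        stack.Push(2);
        Console.WriteLine(stack.Pop());   // 2
        Console.WriteLine(stack.Pop());   // 1
    }
}
```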
|
||||
|
||||
## Characteristics of Type Members
|
||||
|
||||
The common type system allows type members to have a variety of characteristics; however, languages are not required to support all these characteristics. The following table describes member characteristics.
|
||||
|
||||
Characteristic | Can apply to | Description
|
||||
-------------- | ------------ | -----------
|
||||
abstract | Methods, properties, and events | The type does not supply the method's implementation. Types that inherit or implement abstract methods must supply an implementation for the method. The only exception is when the derived type is itself an abstract type. All abstract methods are virtual.
|
||||
private | All | Accessible only from within the same type as the member, or within a nested type.
|
||||
family | All | Accessible from within the same type as the member, and from derived types that inherit from it.
|
||||
assembly | All | Accessible only in the assembly in which the type is defined.
|
||||
family and assembly | All | Accessible only from types that qualify for both family and assembly access.
|
||||
family or assembly | All | Accessible only from types that qualify for either family or assembly access.
|
||||
public | All | Accessible from any type.
|
||||
final | Methods, properties, and events | The virtual method cannot be overridden in a derived type.
|
||||
initialize-only | Fields | The value can only be initialized, and cannot be written after initialization.
|
||||
instance | Fields, methods, properties, and events | If a member is not marked as static or virtual, it is an instance member (there is no instance keyword). There will be as many copies of such members in memory as there are objects that use it.
|
||||
literal | Fields | The value assigned to the field is a fixed value, known at compile time, of a built-in value type. Literal fields are sometimes referred to as constants.
|
||||
newslot or override | All | Defines how the member interacts with inherited members that have the same signature: `newslot` hides inherited members that have the same signature; `override` replaces the definition of an inherited virtual method. The default is newslot.
|
||||
static | Fields, methods, properties, and events | The member belongs to the type it is defined on, not to a particular instance of the type; the member exists even if an instance of the type is not created, and it is shared among all instances of the type.
|
||||
virtual | Methods, properties, and events | The method can be implemented by a derived type and can be invoked either statically or dynamically. If dynamic invocation is used, the type of the instance that makes the call at run time (rather than the type known at compile time) determines which implementation of the method is called. To invoke a virtual method statically, the variable might have to be cast to a type that uses the desired version of the method.
|
||||
|
||||
### Overloading
|
||||
|
||||
Each type member has a unique signature. Method signatures consist of the method name and a parameter list (the order and types of the method's arguments). Multiple methods with the same name can be defined within a type as long as their signatures differ. When two or more methods with the same name are defined, the method is said to be overloaded. For example, in [System.Char](https://docs.microsoft.com/dotnet/core/api/System.Char), the `IsDigit` method is overloaded. One method takes a `Char`. The other method takes a `String` and an `Int32`.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> The return type is not considered part of a method's signature. That is, methods cannot be overloaded if they differ only by return type.
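
The two `Char.IsDigit` overloads mentioned above can be called as follows; the compiler selects the overload from the argument types.

```csharp
using System;

public class OverloadExample
{
    public static void Main()
    {
        // Overload that takes a Char.
        Console.WriteLine(Char.IsDigit('7'));       // True

        // Overload that takes a String and an Int32 index into that string.
        Console.WriteLine(Char.IsDigit("a1", 1));   // True
        Console.WriteLine(Char.IsDigit("a1", 0));   // False
    }
}
```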
|
||||
|
||||
### Inheriting, Overriding, and Hiding Members
|
||||
|
||||
A derived type inherits all members of its base type; that is, these members are defined on, and available to, the derived type. The behavior or qualities of inherited members can be modified in two ways:
|
||||
|
||||
* A derived type can hide an inherited member by defining a new member with the same signature. This might be done to make a previously public member private or to define new behavior for an inherited method that is marked as `final`.
|
||||
|
||||
* A derived type can override an inherited virtual method. The overriding method provides a new definition of the method that will be invoked based on the type of the value at run time rather than the type of the variable known at compile time. A method can override a virtual method only if the virtual method is not marked as `final` and the new method is at least as accessible as the virtual method.
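
The difference between hiding and overriding can be sketched in C#, where the `new` and `override` keywords correspond to the two options. The `Animal` and `Dog` classes are hypothetical.

```csharp
using System;

public class Animal
{
    public virtual string Speak() { return "..."; }
    public string Name() { return "Animal"; }
}

public class Dog : Animal
{
    // Overrides the inherited virtual method.
    public override string Speak() { return "Woof"; }

    // Hides the inherited non-virtual method.
    public new string Name() { return "Dog"; }
}

public class OverrideVersusHideExample
{
    public static void Main()
    {
        Animal animal = new Dog();

        // Override: the run-time type of the object decides which method runs.
        Console.WriteLine(animal.Speak());        // Woof

        // Hide: the compile-time type of the variable decides which method runs.
        Console.WriteLine(animal.Name());         // Animal
        Console.WriteLine(((Dog)animal).Name());  // Dog
    }
}
```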
|
||||
|
||||
## See Also
|
||||
|
||||
[Type Conversion in the .NET Framework](typeconversion.md)
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
|
@ -1,70 +0,0 @@
|
|||
---
|
||||
title: Type Conversion Tables
|
||||
description: Type Conversion Tables
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 741987bb-af39-4895-b2b0-0d94dc240abd
|
||||
---
|
||||
|
||||
# Type Conversion Tables
|
||||
|
||||
A widening conversion occurs when a value of one type is converted to another type that is of equal or greater size. A narrowing conversion occurs when a value of one type is converted to a value of another type that is of a smaller size. The tables in this topic illustrate the behaviors exhibited by both types of conversions.
|
||||
|
||||
## Widening Conversions
|
||||
|
||||
Type | Can be converted without data loss to
|
||||
---- | -------------------------------------
|
||||
[Byte](https://docs.microsoft.com/dotnet/core/api/System.Byte) | [UInt16](https://docs.microsoft.com/dotnet/core/api/System.UInt16), [Int16](https://docs.microsoft.com/dotnet/core/api/System.Int16), [UInt32](https://docs.microsoft.com/dotnet/core/api/System.UInt32), [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32), [UInt64](https://docs.microsoft.com/dotnet/core/api/System.UInt64), [Int64](https://docs.microsoft.com/dotnet/core/api/System.Int64), [Single](https://docs.microsoft.com/dotnet/core/api/System.Single), [Double](https://docs.microsoft.com/dotnet/core/api/System.Double), [Decimal](https://docs.microsoft.com/dotnet/core/api/System.Decimal)
|
||||
[SByte](https://docs.microsoft.com/dotnet/core/api/System.SByte) | [Int16](https://docs.microsoft.com/dotnet/core/api/System.Int16), [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32), [Int64](https://docs.microsoft.com/dotnet/core/api/System.Int64), [Single](https://docs.microsoft.com/dotnet/core/api/System.Single), [Double](https://docs.microsoft.com/dotnet/core/api/System.Double), [Decimal](https://docs.microsoft.com/dotnet/core/api/System.Decimal)
|
||||
[Int16](https://docs.microsoft.com/dotnet/core/api/System.Int16) | [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32), [Int64](https://docs.microsoft.com/dotnet/core/api/System.Int64), [Single](https://docs.microsoft.com/dotnet/core/api/System.Single), [Double](https://docs.microsoft.com/dotnet/core/api/System.Double), [Decimal](https://docs.microsoft.com/dotnet/core/api/System.Decimal)
|
||||
[UInt16](https://docs.microsoft.com/dotnet/core/api/System.UInt16) | [UInt32](https://docs.microsoft.com/dotnet/core/api/System.UInt32), [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32), [UInt64](https://docs.microsoft.com/dotnet/core/api/System.UInt64), [Int64](https://docs.microsoft.com/dotnet/core/api/System.Int64), [Single](https://docs.microsoft.com/dotnet/core/api/System.Single), [Double](https://docs.microsoft.com/dotnet/core/api/System.Double), [Decimal](https://docs.microsoft.com/dotnet/core/api/System.Decimal)
|
||||
[Char](https://docs.microsoft.com/dotnet/core/api/System.Char) | [UInt16](https://docs.microsoft.com/dotnet/core/api/System.UInt16), [UInt32](https://docs.microsoft.com/dotnet/core/api/System.UInt32), [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32), [UInt64](https://docs.microsoft.com/dotnet/core/api/System.UInt64), [Int64](https://docs.microsoft.com/dotnet/core/api/System.Int64), [Single](https://docs.microsoft.com/dotnet/core/api/System.Single), [Double](https://docs.microsoft.com/dotnet/core/api/System.Double), [Decimal](https://docs.microsoft.com/dotnet/core/api/System.Decimal)
|
||||
[Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32) | [Int64](https://docs.microsoft.com/dotnet/core/api/System.Int64), [Double](https://docs.microsoft.com/dotnet/core/api/System.Double), [Decimal](https://docs.microsoft.com/dotnet/core/api/System.Decimal)
|
||||
[UInt32](https://docs.microsoft.com/dotnet/core/api/System.UInt32) | [Int64](https://docs.microsoft.com/dotnet/core/api/System.Int64), [UInt64](https://docs.microsoft.com/dotnet/core/api/System.UInt64), [Double](https://docs.microsoft.com/dotnet/core/api/System.Double), [Decimal](https://docs.microsoft.com/dotnet/core/api/System.Decimal)
|
||||
[Int64](https://docs.microsoft.com/dotnet/core/api/System.Int64) | [Decimal](https://docs.microsoft.com/dotnet/core/api/System.Decimal)
|
||||
[UInt64](https://docs.microsoft.com/dotnet/core/api/System.UInt64) | [Decimal](https://docs.microsoft.com/dotnet/core/api/System.Decimal)
|
||||
[Single](https://docs.microsoft.com/dotnet/core/api/System.Single) | [Double](https://docs.microsoft.com/dotnet/core/api/System.Double)
|
||||
|
||||
Some widening conversions to [Single](https://docs.microsoft.com/dotnet/core/api/System.Single) or [Double](https://docs.microsoft.com/dotnet/core/api/System.Double) can cause a loss of precision. The following table describes the widening conversions that sometimes result in a loss of information.
|
||||
|
||||
Type | Can be converted to
|
||||
---- | -------------------
|
||||
[Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32) | [Single](https://docs.microsoft.com/dotnet/core/api/System.Single)
|
||||
[UInt32](https://docs.microsoft.com/dotnet/core/api/System.UInt32) | [Single](https://docs.microsoft.com/dotnet/core/api/System.Single)
|
||||
[Int64](https://docs.microsoft.com/dotnet/core/api/System.Int64) | [Single](https://docs.microsoft.com/dotnet/core/api/System.Single), [Double](https://docs.microsoft.com/dotnet/core/api/System.Double)
|
||||
[UInt64](https://docs.microsoft.com/dotnet/core/api/System.UInt64) | [Single](https://docs.microsoft.com/dotnet/core/api/System.Single), [Double](https://docs.microsoft.com/dotnet/core/api/System.Double)
|
||||
[Decimal](https://docs.microsoft.com/dotnet/core/api/System.Decimal) | [Single](https://docs.microsoft.com/dotnet/core/api/System.Single), [Double](https://docs.microsoft.com/dotnet/core/api/System.Double)
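
As a sketch of the precision loss described above, the following example widens an `Int64` value to `Double` and converts it back. The value 2^53 + 1 was chosen for illustration because it needs more significant bits than a `Double` can hold.

```csharp
using System;

public class WideningPrecisionExample
{
    // Widens a long to a double and narrows it back again.
    public static long RoundTrip(long value)
    {
        double widened = value;   // implicit widening conversion
        return (long)widened;
    }

    public static void Main()
    {
        long exact = 9007199254740993L;   // 2^53 + 1
        long roundTripped = RoundTrip(exact);
        Console.WriteLine(roundTripped == exact);   // False
        Console.WriteLine(exact - roundTripped);    // 1
    }
}
```

The conversion compiles without a cast and never throws, which is why this kind of loss is easy to overlook.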
|
||||
|
||||
## Narrowing Conversions
|
||||
|
||||
A narrowing conversion to [Single](https://docs.microsoft.com/dotnet/core/api/System.Single) or [Double](https://docs.microsoft.com/dotnet/core/api/System.Double) can cause a loss of information. If the target type cannot properly express the magnitude of the source, the resulting value is set to the constant `PositiveInfinity` or `NegativeInfinity`. `PositiveInfinity` results from dividing a positive number by zero and is also returned when the value of a [Single](https://docs.microsoft.com/dotnet/core/api/System.Single) or [Double](https://docs.microsoft.com/dotnet/core/api/System.Double) exceeds the value of the `MaxValue` field. `NegativeInfinity` results from dividing a negative number by zero and is also returned when the value of a [Single](https://docs.microsoft.com/dotnet/core/api/System.Single) or [Double](https://docs.microsoft.com/dotnet/core/api/System.Double) falls below the value of the `MinValue` field. A conversion from a [Double](https://docs.microsoft.com/dotnet/core/api/System.Double) to a [Single](https://docs.microsoft.com/dotnet/core/api/System.Single) might result in `PositiveInfinity` or `NegativeInfinity`.
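
A brief sketch of the `Double` to `Single` case follows. The value `1e300` is an arbitrary choice that lies well beyond the range of `Single`.

```csharp
using System;

public class NarrowingToSingleExample
{
    // Narrows a double to a float with an explicit cast.
    public static float Narrow(double value)
    {
        return (float)value;
    }

    public static void Main()
    {
        double big = 1e300;   // representable as a Double, too large for a Single
        Console.WriteLine(Single.IsPositiveInfinity(Narrow(big)));    // True
        Console.WriteLine(Single.IsNegativeInfinity(Narrow(-big)));   // True
    }
}
```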
|
||||
|
||||
A narrowing conversion can also result in a loss of information for other data types. However, an [OverflowException](https://docs.microsoft.com/dotnet/core/api/System.OverflowException) is thrown if the value of a type that is being converted falls outside of the range specified by the target type's `MaxValue` and `MinValue` fields, and the conversion is checked by the runtime to ensure that the value of the target type does not exceed its `MaxValue` or `MinValue`. Conversions that are performed with the [System.Convert](https://docs.microsoft.com/dotnet/core/api/System.Convert) class are always checked in this manner.
|
||||
|
||||
The following table lists conversions that throw an [OverflowException](https://docs.microsoft.com/dotnet/core/api/System.OverflowException) using [System.Convert](https://docs.microsoft.com/dotnet/core/api/System.Convert) or any checked conversion if the value of the type being converted is outside the defined range of the resulting type.
|
||||
|
||||
Type | Can be converted to
|
||||
---- | -------------------
|
||||
[Byte](https://docs.microsoft.com/dotnet/core/api/System.Byte) | [SByte](https://docs.microsoft.com/dotnet/core/api/System.SByte)
|
||||
[SByte](https://docs.microsoft.com/dotnet/core/api/System.SByte) | [Byte](https://docs.microsoft.com/dotnet/core/api/System.Byte), [UInt16](https://docs.microsoft.com/dotnet/core/api/System.UInt16), [UInt32](https://docs.microsoft.com/dotnet/core/api/System.UInt32), [UInt64](https://docs.microsoft.com/dotnet/core/api/System.UInt64)
|
||||
[Int16](https://docs.microsoft.com/dotnet/core/api/System.Int16) | [Byte](https://docs.microsoft.com/dotnet/core/api/System.Byte), [SByte](https://docs.microsoft.com/dotnet/core/api/System.SByte), [UInt16](https://docs.microsoft.com/dotnet/core/api/System.UInt16)
|
||||
[UInt16](https://docs.microsoft.com/dotnet/core/api/System.UInt16) | [Byte](https://docs.microsoft.com/dotnet/core/api/System.Byte), [SByte](https://docs.microsoft.com/dotnet/core/api/System.SByte), [Int16](https://docs.microsoft.com/dotnet/core/api/System.Int16)
|
||||
[Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32) | [Byte](https://docs.microsoft.com/dotnet/core/api/System.Byte), [SByte](https://docs.microsoft.com/dotnet/core/api/System.SByte), [Int16](https://docs.microsoft.com/dotnet/core/api/System.Int16), [UInt16](https://docs.microsoft.com/dotnet/core/api/System.UInt16), [UInt32](https://docs.microsoft.com/dotnet/core/api/System.UInt32)
|
||||
[UInt32](https://docs.microsoft.com/dotnet/core/api/System.UInt32) | [Byte](https://docs.microsoft.com/dotnet/core/api/System.Byte), [SByte](https://docs.microsoft.com/dotnet/core/api/System.SByte), [Int16](https://docs.microsoft.com/dotnet/core/api/System.Int16), [UInt16](https://docs.microsoft.com/dotnet/core/api/System.UInt16), [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32)
|
||||
[Int64](https://docs.microsoft.com/dotnet/core/api/System.Int64) | [Byte](https://docs.microsoft.com/dotnet/core/api/System.Byte), [SByte](https://docs.microsoft.com/dotnet/core/api/System.SByte), [Int16](https://docs.microsoft.com/dotnet/core/api/System.Int16), [UInt16](https://docs.microsoft.com/dotnet/core/api/System.UInt16), [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32), [UInt32](https://docs.microsoft.com/dotnet/core/api/System.UInt32), [UInt64](https://docs.microsoft.com/dotnet/core/api/System.UInt64)
|
||||
[UInt64](https://docs.microsoft.com/dotnet/core/api/System.UInt64) | [Byte](https://docs.microsoft.com/dotnet/core/api/System.Byte), [SByte](https://docs.microsoft.com/dotnet/core/api/System.SByte), [Int16](https://docs.microsoft.com/dotnet/core/api/System.Int16), [UInt16](https://docs.microsoft.com/dotnet/core/api/System.UInt16), [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32), [UInt32](https://docs.microsoft.com/dotnet/core/api/System.UInt32), [Int64](https://docs.microsoft.com/dotnet/core/api/System.Int64)
|
||||
[Decimal](https://docs.microsoft.com/dotnet/core/api/System.Decimal) | [Byte](https://docs.microsoft.com/dotnet/core/api/System.Byte), [SByte](https://docs.microsoft.com/dotnet/core/api/System.SByte), [Int16](https://docs.microsoft.com/dotnet/core/api/System.Int16), [UInt16](https://docs.microsoft.com/dotnet/core/api/System.UInt16), [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32), [UInt32](https://docs.microsoft.com/dotnet/core/api/System.UInt32), [Int64](https://docs.microsoft.com/dotnet/core/api/System.Int64), [UInt64](https://docs.microsoft.com/dotnet/core/api/System.UInt64)
|
||||
[Single](https://docs.microsoft.com/dotnet/core/api/System.Single) | [Byte](https://docs.microsoft.com/dotnet/core/api/System.Byte), [SByte](https://docs.microsoft.com/dotnet/core/api/System.SByte), [Int16](https://docs.microsoft.com/dotnet/core/api/System.Int16), [UInt16](https://docs.microsoft.com/dotnet/core/api/System.UInt16), [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32), [UInt32](https://docs.microsoft.com/dotnet/core/api/System.UInt32), [Int64](https://docs.microsoft.com/dotnet/core/api/System.Int64), [UInt64](https://docs.microsoft.com/dotnet/core/api/System.UInt64)
|
||||
[Double](https://docs.microsoft.com/dotnet/core/api/System.Double) | [Byte](https://docs.microsoft.com/dotnet/core/api/System.Byte), [SByte](https://docs.microsoft.com/dotnet/core/api/System.SByte), [Int16](https://docs.microsoft.com/dotnet/core/api/System.Int16), [UInt16](https://docs.microsoft.com/dotnet/core/api/System.UInt16), [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32), [UInt32](https://docs.microsoft.com/dotnet/core/api/System.UInt32), [Int64](https://docs.microsoft.com/dotnet/core/api/System.Int64), [UInt64](https://docs.microsoft.com/dotnet/core/api/System.UInt64)
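
The difference between checked and unchecked narrowing can be sketched with an `Int32` to `Byte` conversion. The `Describe` helper is a hypothetical method written for this example.

```csharp
using System;

public class NarrowingOverflowExample
{
    // Reports what Convert.ToByte does with the given value.
    public static string Describe(int value)
    {
        try
        {
            byte converted = Convert.ToByte(value);   // always checked
            return "Converted: " + converted;
        }
        catch (OverflowException)
        {
            return "Overflow";
        }
    }

    public static void Main()
    {
        int value = 300;   // outside the Byte range of 0 to 255

        Console.WriteLine(Describe(200));     // Converted: 200
        Console.WriteLine(Describe(value));   // Overflow

        // An unchecked cast does not throw; it keeps only the low-order byte.
        byte truncated = unchecked((byte)value);
        Console.WriteLine(truncated);         // 44
    }
}
```

An explicit cast inside a `checked` expression, such as `checked((byte)value)`, throws an `OverflowException` in the same way that `Convert.ToByte` does.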
|
||||
|
||||
## See Also
|
||||
|
||||
[System.Convert](https://docs.microsoft.com/dotnet/core/api/System.Convert)
|
||||
|
||||
|
|
@ -1,268 +0,0 @@
|
|||
---
|
||||
title: Composite Formatting
|
||||
description: Composite Formatting
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: b6f03ceb-d919-4ad9-9839-757dbfc2c0ac
|
||||
---
|
||||
|
||||
# Composite Formatting
|
||||
|
||||
The .NET Core composite formatting feature takes a list of objects and a composite format string as input. A composite format string consists of fixed text intermixed with indexed placeholders, called format items, that correspond to the objects in the list. The formatting operation yields a result string that consists of the original fixed text intermixed with the string representation of the objects in the list.
|
||||
|
||||
The composite formatting feature is supported by methods such as the following:
|
||||
|
||||
* [String.Format](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Format_System_IFormatProvider_System_String_System_Object_), which returns a formatted result string.
|
||||
|
||||
* [StringBuilder.AppendFormat](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder#System_Text_StringBuilder_AppendFormat_System_IFormatProvider_System_String_System_Object_), which appends a formatted result string to a [StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder) object.
|
||||
|
||||
* Some overloads of the [Console.WriteLine](https://docs.microsoft.com/dotnet/core/api/System.Console#System_Console_WriteLine) method, which display a formatted result string to the console.
|
||||
|
||||
* Some overloads of the [TextWriter.WriteLine](https://docs.microsoft.com/dotnet/core/api/System.IO.TextWriter#System_IO_TextWriter_WriteLine) method, which write the formatted result string to a stream or file. Classes derived from [TextWriter](https://docs.microsoft.com/dotnet/core/api/System.IO.TextWriter), such as [StreamWriter](https://docs.microsoft.com/dotnet/core/api/System.IO.StreamWriter), also share this functionality.
|
||||
|
||||
* [Debug.WriteLine(String, Object[])](https://docs.microsoft.com/dotnet/core/api/System.Diagnostics.Debug#System_Diagnostics_Debug_WriteLine_System_String_System_Object___), which outputs a formatted message to trace listeners.
|
||||
|
||||
* The [Trace.TraceError(String, Object[])](https://docs.microsoft.com/dotnet/core/api/System.Diagnostics.Trace#System_Diagnostics_Trace_TraceError_System_String_System_Object___), [Trace.TraceInformation(String, Object[])](https://docs.microsoft.com/dotnet/core/api/System.Diagnostics.Trace#System_Diagnostics_Trace_TraceInformation_System_String_System_Object___), and [Trace.TraceWarning(String, Object[])](https://docs.microsoft.com/dotnet/core/api/System.Diagnostics.Trace#System_Diagnostics_Trace_TraceWarning_System_String_System_Object___) methods, which output formatted messages to trace listeners.
|
||||
|
||||
* The [TraceSource.TraceInformation(String, Object[])](https://docs.microsoft.com/dotnet/core/api/System.Diagnostics.TraceSource#System_Diagnostics_TraceSource_TraceInformation_System_String_System_Object___) method, which writes an informational message to trace listeners.
|
||||
|
||||
## Composite Format String
|
||||
|
||||
A composite format string and object list are used as arguments of methods that support the composite formatting feature. A composite format string consists of zero or more runs of fixed text intermixed with one or more format items. The fixed text is any string that you choose, and each format item corresponds to an object or boxed structure in the list. The composite formatting feature returns a new result string where each format item is replaced by the string representation of the corresponding object in the list.
|
||||
|
||||
Consider the following [Format](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Format_System_IFormatProvider_System_String_System_Object_) code fragment.
|
||||
|
||||
```csharp
|
||||
string name = "Fred";
|
||||
String.Format("Name = {0}, hours = {1:hh}", name, DateTime.Now);
|
||||
```
|
||||
|
||||
The fixed text is `"Name = "` and `", hours = "`. The format items are `"{0}"`, whose index is 0 and which corresponds to the object `name`, and `"{1:hh}"`, whose index is 1 and which corresponds to the object `DateTime.Now`.
|
||||
|
||||
## Format Item Syntax
|
||||
|
||||
Each format item takes the following form and consists of the following components:
|
||||
|
||||
__{__*index*[,*alignment*][:*formatString*]__}__
|
||||
|
||||
The matching braces ("{" and "}") are required.
|
||||
|
||||
### Index Component
|
||||
|
||||
The mandatory *index* component, also called a parameter specifier, is a number starting from 0 that identifies a corresponding item in the list of objects. That is, the format item whose parameter specifier is 0 formats the first object in the list, the format item whose parameter specifier is 1 formats the second object in the list, and so on. The following example includes four parameter specifiers, numbered zero through three, to represent prime numbers less than ten:
|
||||
|
||||
```csharp
|
||||
string primes;
|
||||
primes = String.Format("Prime numbers less than 10: {0}, {1}, {2}, {3}",
|
||||
2, 3, 5, 7 );
|
||||
Console.WriteLine(primes);
|
||||
// The example displays the following output:
|
||||
// Prime numbers less than 10: 2, 3, 5, 7
|
||||
```
|
||||
|
||||
Multiple format items can refer to the same element in the list of objects by specifying the same parameter specifier. For example, you can format the same numeric value in hexadecimal, scientific, and number format by specifying a composite format string such as "0x{0:X} {0:E} {0:N}", as the following example shows.
|
||||
|
||||
```csharp
|
||||
string multiple = String.Format("0x{0:X} {0:E} {0:N}",
|
||||
Int64.MaxValue);
|
||||
Console.WriteLine(multiple);
|
||||
// The example displays the following output:
|
||||
// 0x7FFFFFFFFFFFFFFF 9.223372E+018 9,223,372,036,854,775,807.00
|
||||
```
|
||||
|
||||
Each format item can refer to any object in the list. For example, if there are three objects, you can format the second, first, and third object by specifying a composite format string like this: "{1} {0} {2}". An object that is not referenced by a format item is ignored. A [FormatException](https://docs.microsoft.com/dotnet/core/api/System.FormatException) is thrown at runtime if a parameter specifier designates an item outside the bounds of the list of objects.
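
As a minimal sketch of the two behaviors just described, the following program reorders three arguments by index and then shows that an index with no matching argument raises a [FormatException](https://docs.microsoft.com/dotnet/core/api/System.FormatException). The class and method names (`IndexDemo`, `Reorder`, `ThrowsOnMissingIndex`) are hypothetical and exist only for this illustration.

```csharp
using System;

public static class IndexDemo
{
    // Formats the second, first, and third argument, in that order.
    public static string Reorder(string a, string b, string c)
    {
        return String.Format("{1} {0} {2}", a, b, c);
    }

    // Returns true if formatting fails because index 3 has no
    // corresponding object in the three-element argument list.
    public static bool ThrowsOnMissingIndex()
    {
        try
        {
            String.Format("{0} {3}", "a", "b", "c");
            return false;
        }
        catch (FormatException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(Reorder("first", "second", "third"));
        Console.WriteLine(ThrowsOnMissingIndex());
    }
}
// The example displays the following output:
//      second first third
//      True
```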
|
||||
|
||||
### Alignment Component
|
||||
|
||||
The optional *alignment* component is a signed integer indicating the preferred formatted field width. If the value of *alignment* is less than the length of the formatted string, *alignment* is ignored and the length of the formatted string is used as the field width. The formatted data in the field is right-aligned if *alignment* is positive and left-aligned if *alignment* is negative. If padding is necessary, white space is used. The comma is required if *alignment* is specified.
|
||||
|
||||
The following example defines two arrays, one containing the names of employees and the other containing the hours they worked over a two-week period. The composite format string left-aligns the names in a 20-character field, and right-aligns their hours in a 5-character field. Note that the "N1" standard format string is also used to format the hours with one fractional digit.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string[] names = { "Adam", "Bridgette", "Carla", "Daniel",
|
||||
"Ebenezer", "Francine", "George" };
|
||||
decimal[] hours = { 40, 6.667m, 40.39m, 82, 40.333m, 80,
|
||||
16.75m };
|
||||
|
||||
Console.WriteLine("{0,-20} {1,5}\n", "Name", "Hours");
|
||||
for (int ctr = 0; ctr < names.Length; ctr++)
|
||||
Console.WriteLine("{0,-20} {1,5:N1}", names[ctr], hours[ctr]);
|
||||
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Name Hours
|
||||
//
|
||||
// Adam 40.0
|
||||
// Bridgette 6.7
|
||||
// Carla 40.4
|
||||
// Daniel 82.0
|
||||
// Ebenezer 40.3
|
||||
// Francine 80.0
|
||||
// George 16.8
|
||||
```
|
||||
|
||||
### Format String Component
|
||||
|
||||
The optional *formatString* component is a format string that is appropriate for the type of object being formatted. Specify a standard or custom numeric format string if the corresponding object is a numeric value, a standard or custom date and time format string if the corresponding object is a [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) object, or an [enumeration format string](enumerationformat.md) if the corresponding object is an enumeration value. If *formatString* is not specified, the general ("G") format specifier for a numeric, date and time, or enumeration type is used. The colon is required if *formatString* is specified.
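
The following sketch illustrates the default described above: a format item without a *formatString* component produces the same text as the general ("G") specifier, while an explicit format string changes the result. It assumes the invariant culture so that the output does not depend on the machine's settings; `DefaultFormatDemo` and `Describe` are hypothetical names used only here.

```csharp
using System;
using System.Globalization;

public static class DefaultFormatDemo
{
    // Formats a decimal with no format string, with "G", and with "N2",
    // then formats an enumeration value with no format string and with "D".
    public static string Describe(decimal amount, DayOfWeek day)
    {
        return String.Format(CultureInfo.InvariantCulture,
                             "{0} | {0:G} | {0:N2} | {1} | {1:D}",
                             amount, day);
    }

    public static void Main()
    {
        Console.WriteLine(Describe(1234.5m, DayOfWeek.Friday));
    }
}
// The example displays the following output:
//      1234.5 | 1234.5 | 1,234.50 | Friday | 5
```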
|
||||
|
||||
The following table lists types or categories of types in the .NET Framework class library that support a predefined set of format strings, and provides links to the topics that list the supported format strings. Note that string formatting is an extensible mechanism that makes it possible to define new format strings for all existing types as well as to define a set of format strings supported by an application-defined type. For more information, see the [IFormattable](https://docs.microsoft.com/dotnet/core/api/System.IFormattable) and [ICustomFormatter](https://docs.microsoft.com/dotnet/core/api/System.ICustomFormatter) interface topics.
|
||||
|
||||
Type or type category | See
|
||||
--------------------- | ---
|
||||
Date and time types ([DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime), [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset)) | [Standard Date and Time Format Strings](standarddatetime.md), [Custom Date and Time Format Strings](customdatetime.md)
|
||||
Enumeration types (all types derived from [System.Enum](https://docs.microsoft.com/dotnet/core/api/System.Enum)) | [Enumeration Format Strings](enumerationformat.md)
|
||||
Numeric types ([BigInteger](https://docs.microsoft.com/dotnet/core/api/System.Numerics.BigInteger), [Byte](https://docs.microsoft.com/dotnet/core/api/System.Byte), [Decimal](https://docs.microsoft.com/dotnet/core/api/System.Decimal), [Double](https://docs.microsoft.com/dotnet/core/api/System.Double), [Int16](https://docs.microsoft.com/dotnet/core/api/System.Int16), [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32), [Int64](https://docs.microsoft.com/dotnet/core/api/System.Int64), [SByte](https://docs.microsoft.com/dotnet/core/api/System.SByte), [Single](https://docs.microsoft.com/dotnet/core/api/System.Single), [UInt16](https://docs.microsoft.com/dotnet/core/api/System.UInt16), [UInt32](https://docs.microsoft.com/dotnet/core/api/System.UInt32), [UInt64](https://docs.microsoft.com/dotnet/core/api/System.UInt64)) | [Standard Numeric Format Strings](standardnumeric.md), [Custom Numeric Format Strings](customnumeric.md)
|
||||
[Guid](https://docs.microsoft.com/dotnet/core/api/System.Guid) | [Guid.ToString(String)](https://docs.microsoft.com/dotnet/core/api/System.Guid#System_Guid_ToString_System_String_)
|
||||
[TimeSpan](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan) | [Standard TimeSpan Format Strings](standardtimespan.md), [Custom TimeSpan Format Strings](customtimespan.md)
|
||||
|
||||
### Escaping Braces
|
||||
|
||||
Opening and closing braces are interpreted as starting and ending a format item. Consequently, you must use an escape sequence to display a literal opening brace or closing brace. Specify two opening braces ("{{") in the fixed text to display one opening brace ("{"), or two closing braces ("}}") to display one closing brace ("}"). Braces in a format item are interpreted sequentially in the order they are encountered. Interpreting nested braces is not supported.
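
As a small sketch of the escape sequences, the following example writes literal braces in the fixed text and also wraps a format item in braces. Note that the format item here has no *formatString* component; the case with a format string behaves differently, as the next paragraph explains. `BraceDemo` and `Describe` are hypothetical names.

```csharp
using System;
using System.Globalization;

public static class BraceDemo
{
    // "{{0}}" yields the literal text "{0}";
    // "{{{0}}}" yields an opening brace, the formatted value, and a closing brace.
    public static string Describe(int value)
    {
        return String.Format(CultureInfo.InvariantCulture,
                             "{{0}} is replaced by {0}; {{{0}}} wraps it in braces",
                             value);
    }

    public static void Main()
    {
        Console.WriteLine(Describe(42));
    }
}
// The example displays the following output:
//      {0} is replaced by 42; {42} wraps it in braces
```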
|
||||
|
||||
The way escaped braces are interpreted can lead to unexpected results. For example, consider the format item "{{{0:D}}}", which is intended to display an opening brace, a numeric value formatted as a decimal number, and a closing brace. However, the format item is actually interpreted in the following manner:
|
||||
|
||||
1. The first two opening braces ("{{") are escaped and yield one opening brace.
|
||||
|
||||
2. The next three characters ("{0:") are interpreted as the start of a format item.
|
||||
|
||||
3. The next character ("D") would be interpreted as the Decimal standard numeric format specifier, but the next two escaped braces ("}}") yield a single brace. Because the resulting string ("D}") is not a standard numeric format specifier, the resulting string is interpreted as a custom format string that means display the literal string "D}".
|
||||
|
||||
4. The last brace ("}") is interpreted as the end of the format item.
|
||||
|
||||
5. The final result that is displayed is the literal string, "{D}". The numeric value that was to be formatted is not displayed.
|
||||
|
||||
One way to write your code to avoid misinterpreting escaped braces and format items is to format the braces and format item separately. That is, in the first format operation display a literal opening brace, in the next operation display the result of the format item, then in the final operation display a literal closing brace. The following example illustrates this approach.
|
||||
|
||||
```csharp
|
||||
int value = 6324;
|
||||
string output = string.Format("{0}{1:D}{2}",
|
||||
"{", value, "}");
|
||||
Console.WriteLine(output);
|
||||
// The example displays the following output:
|
||||
// {6324}
|
||||
```
|
||||
|
||||
### Processing Order
|
||||
|
||||
If the call to the composite formatting method includes an [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) argument whose value is not null, the runtime calls its [IFormatProvider.GetFormat](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider#System_IFormatProvider_GetFormat_System_Type_) method to request an [ICustomFormatter](https://docs.microsoft.com/dotnet/core/api/System.ICustomFormatter) implementation. If the method is able to return an [ICustomFormatter](https://docs.microsoft.com/dotnet/core/api/System.ICustomFormatter) implementation, it is cached for later use.
|
||||
|
||||
Each value in the parameter list that corresponds to a format item is converted to a string by performing the following steps. If any condition in the first three steps is true, the string representation of the value is returned in that step, and subsequent steps are not executed.
|
||||
|
||||
1. If the value to be formatted is `null`, an empty string ("") is returned.
|
||||
|
||||
2. If an [ICustomFormatter](https://docs.microsoft.com/dotnet/core/api/System.ICustomFormatter) implementation is available, the runtime calls its [Format](https://docs.microsoft.com/dotnet/core/api/System.ICustomFormatter#System_ICustomFormatter_Format_System_String_System_Object_System_IFormatProvider_) method. It passes the method the format item's *formatString* value, if one is present, or `null` if it is not, along with the [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) implementation.
|
||||
|
||||
3. If the value implements the [IFormattable](https://docs.microsoft.com/dotnet/core/api/System.IFormattable) interface, the interface's [ToString(String, IFormatProvider)](https://docs.microsoft.com/dotnet/core/api/System.IFormattable#System_IFormattable_ToString_System_String_System_IFormatProvider_) method is called. The method is passed the *formatString* value, if one is present in the format item, or `null` if it is not. The [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) argument is determined as follows:
|
||||
|
||||
* For a numeric value, if a composite formatting method with a non-null [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) argument is called, the runtime requests a [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object from its [IFormatProvider.GetFormat](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider#System_IFormatProvider_GetFormat_System_Type_) method. If it is unable to supply one, if the value of the argument is `null`, or if the composite formatting method does not have an [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) parameter, the [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object for the current thread culture is used.
|
||||
|
||||
* For a date and time value, if a composite formatting method with a non-null [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) argument is called, the runtime requests a [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo) object from its [IFormatProvider.GetFormat](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider#System_IFormatProvider_GetFormat_System_Type_) method. If it is unable to supply one, if the value of the argument is `null`, or if the composite formatting method does not have an [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) parameter, the [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo) object for the current thread culture is used.
|
||||
|
||||
* For objects of other types, if a composite formatting method is called with an [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) argument, its value (including `null`, if no [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) object is supplied) is passed directly to the [IFormattable.ToString](https://docs.microsoft.com/dotnet/core/api/System.IFormattable#System_IFormattable_ToString_System_String_System_IFormatProvider_) implementation. Otherwise, a [CultureInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo) object that represents the current thread culture is passed to the [IFormattable.ToString](https://docs.microsoft.com/dotnet/core/api/System.IFormattable#System_IFormattable_ToString_System_String_System_IFormatProvider_) implementation.
|
||||
|
||||
4. The type's parameterless `ToString` method, which either overrides [Object.ToString()](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_ToString) or inherits the behavior of its base class, is called. In this case, the format string specified by the *formatString* component in the format item, if it is present, is ignored.
|
||||
|
||||
Alignment is applied after the preceding steps have been performed.
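
The steps above can be sketched with two small types. This is an illustration under stated assumptions, not a reference implementation: `Temperature` and `Label` are hypothetical types invented for the example, and no [ICustomFormatter](https://docs.microsoft.com/dotnet/core/api/System.ICustomFormatter) is supplied, so step 2 is not exercised. The example shows a `null` argument (step 1), an [IFormattable](https://docs.microsoft.com/dotnet/core/api/System.IFormattable) value with and without a format string (step 3), and a type that has only a parameterless `ToString` method, for which the format string is ignored (step 4). The last format item shows alignment being applied to the already-formatted string.

```csharp
using System;
using System.Globalization;

// Hypothetical type: implements IFormattable, so step 3 applies.
public class Temperature : IFormattable
{
    private readonly decimal celsius;

    public Temperature(decimal celsius)
    {
        this.celsius = celsius;
    }

    public string ToString(string format, IFormatProvider provider)
    {
        // format is null when the format item has no formatString component.
        if (format == "F")
            return (celsius * 9 / 5 + 32).ToString("N1", provider) + " F";
        return celsius.ToString("N1", provider) + " C";
    }

    public override string ToString()
    {
        return ToString(null, CultureInfo.InvariantCulture);
    }
}

// Hypothetical type: only a parameterless ToString, so step 4 applies.
public class Label
{
    public override string ToString()
    {
        return "label";
    }
}

public static class ProcessingOrderDemo
{
    public static string Run()
    {
        object missing = null;
        return String.Format(CultureInfo.InvariantCulture,
                             "[{0}] [{1}] [{1:F}] [{2:F}] [{2,8}]",
                             missing, new Temperature(20m), new Label());
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
// The example displays the following output:
//      [] [20.0 C] [68.0 F] [label] [   label]
```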
|
||||
|
||||
## Code Examples
|
||||
|
||||
The following example shows one string created using composite formatting and another created using an object's `ToString` method. Both types of formatting produce equivalent results.
|
||||
|
||||
```csharp
|
||||
string FormatString1 = String.Format("{0:dddd MMMM}", DateTime.Now);
|
||||
string FormatString2 = DateTime.Now.ToString("dddd MMMM");
|
||||
```
|
||||
|
||||
Assuming that the current day is a Thursday in May, the value of both strings in the preceding example is `Thursday May` in the U.S. English culture.
|
||||
|
||||
[Console.WriteLine](https://docs.microsoft.com/dotnet/core/api/System.Console#System_Console_WriteLine) exposes the same functionality as [String.Format](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Format_System_IFormatProvider_System_String_System_Object_). The only difference between the two methods is that [String.Format](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Format_System_IFormatProvider_System_String_System_Object_) returns its result as a string, while [Console.WriteLine](https://docs.microsoft.com/dotnet/core/api/System.Console#System_Console_WriteLine) writes the result to the output stream associated with the [Console](https://docs.microsoft.com/dotnet/core/api/System.Console) object. The following example uses the [Console.WriteLine](https://docs.microsoft.com/dotnet/core/api/System.Console#System_Console_WriteLine) method to format the value of `MyInt` to a currency value.
|
||||
|
||||
```csharp
|
||||
int MyInt = 100;
|
||||
Console.WriteLine("{0:C}", MyInt);
|
||||
// The example displays the following output
|
||||
// if en-US is the current culture:
|
||||
// $100.00
|
||||
```
|
||||
|
||||
The following example demonstrates formatting multiple objects, including formatting one object two different ways.
|
||||
|
||||
```csharp
|
||||
string myName = "Fred";
|
||||
Console.WriteLine(String.Format("Name = {0}, hours = {1:hh}, minutes = {1:mm}",
|
||||
myName, DateTime.Now));
|
||||
// Depending on the current time, the example displays output like the following:
|
||||
// Name = Fred, hours = 11, minutes = 30
|
||||
```
|
||||
|
||||
The following example demonstrates the use of alignment in formatting. The arguments that are formatted are placed between vertical bar characters (|) to highlight the resulting alignment.
|
||||
|
||||
```csharp
|
||||
string myFName = "Fred";
|
||||
string myLName = "Opals";
|
||||
int myInt = 100;
|
||||
string FormatFName = String.Format("First Name = |{0,10}|", myFName);
|
||||
string FormatLName = String.Format("Last Name = |{0,10}|", myLName);
|
||||
string FormatPrice = String.Format("Price = |{0,10:C}|", myInt);
|
||||
Console.WriteLine(FormatFName);
|
||||
Console.WriteLine(FormatLName);
|
||||
Console.WriteLine(FormatPrice);
|
||||
Console.WriteLine();
|
||||
|
||||
FormatFName = String.Format("First Name = |{0,-10}|", myFName);
|
||||
FormatLName = String.Format("Last Name = |{0,-10}|", myLName);
|
||||
FormatPrice = String.Format("Price = |{0,-10:C}|", myInt);
|
||||
Console.WriteLine(FormatFName);
|
||||
Console.WriteLine(FormatLName);
|
||||
Console.WriteLine(FormatPrice);
|
||||
// The example displays the following output on a system whose current
|
||||
// culture is en-US:
|
||||
// First Name = | Fred|
|
||||
// Last Name = | Opals|
|
||||
// Price = | $100.00|
|
||||
//
|
||||
// First Name = |Fred |
|
||||
// Last Name = |Opals |
|
||||
// Price = |$100.00 |
|
||||
```
|
||||
|
||||
## See Also
|
||||
|
||||
[Console.WriteLine](https://docs.microsoft.com/dotnet/core/api/System.Console#System_Console_WriteLine)
|
||||
|
||||
[String.Format](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Format_System_IFormatProvider_System_String_System_Object_)
|
||||
|
||||
[Standard Date and Time Format Strings](standarddatetime.md)
|
||||
|
||||
[Custom Date and Time Format Strings](customdatetime.md)
|
||||
|
||||
[Enumeration Format Strings](enumerationformat.md)
|
||||
|
||||
[Standard Numeric Format Strings](standardnumeric.md)
|
||||
|
||||
[Custom Numeric Format Strings](customnumeric.md)
|
||||
|
||||
[Standard TimeSpan Format Strings](standardtimespan.md)
|
||||
|
||||
[Custom TimeSpan Format Strings](customtimespan.md)
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
The diff for this file is not shown because it is too large.
|
@ -1,426 +0,0 @@
|
|||
---
|
||||
title: Custom Numeric Format Strings
|
||||
description: Custom Numeric Format Strings
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 88cf267e-9574-4182-9d64-160e8d9a1882
|
||||
---
|
||||
|
||||
# Custom Numeric Format Strings
|
||||
|
||||
You can create a custom numeric format string, which consists of one or more custom numeric specifiers, to define how to format numeric data. A custom numeric format string is any format string that is not a [standard numeric format string](standardnumeric.md).
|
||||
|
||||
Custom numeric format strings are supported by some overloads of the `ToString` method of all numeric types. For example, you can supply a numeric format string to the [ToString(String)](https://docs.microsoft.com/dotnet/core/api/System.Int32#System_Int32_ToString_System_String_) and [ToString(String, IFormatProvider)](https://docs.microsoft.com/dotnet/core/api/System.Int32#System_Int32_ToString_System_String_System_IFormatProvider_) methods of the [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32) type. Custom numeric format strings are also supported by the .NET Framework [composite formatting](compositeformat.md) feature, which is used by some `Write` and `WriteLine` methods of the [Console](https://docs.microsoft.com/dotnet/core/api/System.Console) and [StreamWriter](https://docs.microsoft.com/dotnet/core/api/System.IO.StreamWriter) classes, the [String.Format](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Format_System_IFormatProvider_System_String_System_Object_) method, and the [StringBuilder.AppendFormat](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder#System_Text_StringBuilder_AppendFormat_System_IFormatProvider_System_String_System_Object_) method.
|
||||
|
||||
The following table describes the custom numeric format specifiers and displays sample output produced by each format specifier. See the [Notes](#Notes) section for additional information about using custom numeric format strings, and the [Example](#Example) section for a comprehensive illustration of their use.
|
||||
|
||||
Format specifier | Name | Description | Examples
|
||||
---------------- | ---- | ----------- | --------
|
||||
"0" | Zero placeholder | Replaces the zero with the corresponding digit if one is present; otherwise, zero appears in the result string. | `1234.5678 ("00000") -> 01235`; `0.45678 ("0.00", en-US) -> 0.46`; `0.45678 ("0.00", fr-FR) -> 0,46`
|
||||
"#" | Digit placeholder | Replaces the "#" symbol with the corresponding digit if one is present; otherwise, no digit appears in the result string. Note that no digit appears in the result string if the corresponding digit in the input string is a non-significant 0. For example, 0003 ("####") -> 3. | `1234.5678 ("#####") -> 1235`; `0.45678 ("#.##", en-US) -> .46`; `0.45678 ("#.##", fr-FR) -> ,46`
|
||||
"." | Decimal point | Determines the location of the decimal separator in the result string. | `0.45678 ("0.00", en-US) -> 0.46`; `0.45678 ("0.00", fr-FR) -> 0,46`
|
||||
"," | Group separator and number scaling | Serves as both a group separator and a number scaling specifier. As a group separator, it inserts a localized group separator character between each group. As a number scaling specifier, it divides a number by 1000 for each comma specified. | Group separator specifier: `2147483647 ("##,#", en-US) -> 2,147,483,647`; `2147483647 ("##,#", es-ES) -> 2.147.483.647`. Scaling specifier: `2147483647 ("#,#,,", en-US) -> 2,147`; `2147483647 ("#,#,,", es-ES) -> 2.147`
|
||||
"%" | Percentage placeholder | Multiplies a number by 100 and inserts a localized percentage symbol in the result string. | `0.3697 ("%#0.00", en-US) -> %36.97`; `0.3697 ("%#0.00", el-GR) -> %36,97`; `0.3697 ("##.0 %", en-US) -> 37.0 %`; `0.3697 ("##.0 %", el-GR) -> 37,0 %`
|
||||
"‰" | Per mille placeholder | Multiplies a number by 1000 and inserts a localized per mille symbol in the result string. | `0.03697 ("#0.00‰", en-US) -> 36.97‰`; `0.03697 ("#0.00‰", ru-RU) -> 36,97‰`
|
||||
"E0", "E+0", "E-0", "e0", "e+0", "e-0" | Exponential notation | If followed by at least one 0 (zero), formats the result using exponential notation. The case of "E" or "e" indicates the case of the exponent symbol in the result string. The number of zeros following the "E" or "e" character determines the minimum number of digits in the exponent. A plus sign (+) indicates that a sign character always precedes the exponent. A minus sign (-) indicates that a sign character precedes only negative exponents. | `987654 ("#0.0e0") -> 98.8e4`; `1503.92311 ("0.0##e+00") -> 1.504e+03`; `1.8901385E-16 ("0.0e+00") -> 1.9e-16`
|
||||
\ | Escape character | Causes the next character to be interpreted as a literal rather than as a custom format specifier. | `987654 ("\###00\#") -> #987654#`
|
||||
'string', "string" | Literal string delimiter | Indicates that the enclosed characters should be copied to the result string unchanged. | `68 ("# ' degrees'") -> 68 degrees`
|
||||
; | Section separator | Defines sections with separate format strings for positive, negative, and zero numbers. | `12.345 ("#0.0#;(#0.0#);-\0-") -> 12.35`; `0 ("#0.0#;(#0.0#);-\0-") -> -0-`; `-12.345 ("#0.0#;(#0.0#);-\0-") -> (12.35)`; `12.345 ("#0.0#;(#0.0#)") -> 12.35`; `0 ("#0.0#;(#0.0#)") -> 0.0`; `-12.345 ("#0.0#;(#0.0#)") -> (12.35)`
|
||||
Other | All other characters | The character is copied to the result string unchanged. | `68 ("# °") -> 68 °`
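
A few rows of the preceding table can be checked with a short program. The sketch below is an illustration, not part of the original topic: it uses `decimal` and `int` values and the invariant culture so that the results do not depend on the current culture, and `CustomNumericDemo` and `Samples` are hypothetical names.

```csharp
using System;
using System.Globalization;

public static class CustomNumericDemo
{
    private static readonly CultureInfo Inv = CultureInfo.InvariantCulture;

    public static string[] Samples()
    {
        // Three sections: positive; negative; zero.
        // The verbatim string keeps the backslash escape character intact.
        const string sections = @"#0.0#;(#0.0#);-\0-";

        return new[]
        {
            1234.5678m.ToString("00000", Inv),     // zero placeholder
            0.45678m.ToString("#.##", Inv),        // digit placeholder
            2147483647.ToString("#,#,,", Inv),     // grouping plus scaling by 1,000,000
            0.3697m.ToString("##.0 %", Inv),       // percentage placeholder
            12.345m.ToString(sections, Inv),       // positive section
            (-12.345m).ToString(sections, Inv),    // negative section
            0m.ToString(sections, Inv),            // zero section
            68.ToString("# ' degrees'", Inv)       // literal string delimiter
        };
    }

    public static void Main()
    {
        foreach (string sample in Samples())
            Console.WriteLine(sample);
    }
}
// The example displays the following output:
//      01235
//      .46
//      2,147
//      37.0 %
//      12.35
//      (12.35)
//      -0-
//      68 degrees
```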
|
||||
|
||||
The following sections provide detailed information about each of the custom numeric format specifiers.
|
||||
|
||||
## The "0" Custom Specifier
|
||||
|
||||
The "0" custom format specifier serves as a zero-placeholder symbol. If the value that is being formatted has a digit in the position where the zero appears in the format string, that digit is copied to the result string; otherwise, a zero appears in the result string. The position of the leftmost zero before the decimal point and the rightmost zero after the decimal point determines the range of digits that are always present in the result string.
|
||||
|
||||
The "00" specifier causes the value to be rounded to the nearest digit preceding the decimal, where rounding away from zero is always used. For example, formatting 34.5 with "00" would result in the value 35.
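
To make the rounding rule concrete, the following sketch formats two midpoint values with "00" and contrasts the result with `Math.Round`, whose default mode rounds midpoints to the nearest even number. `RoundingDemo` and `FormatWhole` are hypothetical names used only for this illustration.

```csharp
using System;
using System.Globalization;

public static class RoundingDemo
{
    // Formats a value with two zero placeholders and no fractional digits.
    public static string FormatWhole(double value)
    {
        return value.ToString("00", CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        Console.WriteLine(FormatWhole(34.5));   // rounds away from zero
        Console.WriteLine(FormatWhole(2.5));    // rounds away from zero, padded to two digits
        Console.WriteLine(Math.Round(2.5));     // default Math.Round rounds to even
    }
}
// The example displays the following output:
//      35
//      03
//      2
```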
|
||||
|
||||
The following example displays several values that are formatted by using custom format strings that include zero placeholders.
|
||||
|
||||
```csharp
|
||||
// This fragment assumes: using System; using System.Globalization;
double value;
|
||||
|
||||
value = 123;
|
||||
Console.WriteLine(value.ToString("00000"));
|
||||
Console.WriteLine(String.Format("{0:00000}", value));
|
||||
// Displays 00123
|
||||
|
||||
value = 1.2;
|
||||
Console.WriteLine(value.ToString("0.00", CultureInfo.InvariantCulture));
|
||||
Console.WriteLine(String.Format(CultureInfo.InvariantCulture,
|
||||
"{0:0.00}", value));
|
||||
// Displays 1.20
|
||||
|
||||
Console.WriteLine(value.ToString("00.00", CultureInfo.InvariantCulture));
|
||||
Console.WriteLine(String.Format(CultureInfo.InvariantCulture,
|
||||
"{0:00.00}", value));
|
||||
// Displays 01.20
|
||||
|
||||
CultureInfo daDK = CultureInfo.CreateSpecificCulture("da-DK");
|
||||
Console.WriteLine(value.ToString("00.00", daDK));
|
||||
Console.WriteLine(String.Format(daDK, "{0:00.00}", value));
|
||||
// Displays 01,20
|
||||
|
||||
value = .56;
|
||||
Console.WriteLine(value.ToString("0.0", CultureInfo.InvariantCulture));
|
||||
Console.WriteLine(String.Format(CultureInfo.InvariantCulture,
|
||||
"{0:0.0}", value));
|
||||
// Displays 0.6
|
||||
|
||||
value = 1234567890;
|
||||
Console.WriteLine(value.ToString("0,0", CultureInfo.InvariantCulture));
|
||||
Console.WriteLine(String.Format(CultureInfo.InvariantCulture,
|
||||
"{0:0,0}", value));
|
||||
// Displays 1,234,567,890
|
||||
|
||||
CultureInfo elGR = CultureInfo.CreateSpecificCulture("el-GR");
|
||||
Console.WriteLine(value.ToString("0,0", elGR));
|
||||
Console.WriteLine(String.Format(elGR, "{0:0,0}", value));
|
||||
// Displays 1.234.567.890
|
||||
|
||||
value = 1234567890.123456;
|
||||
Console.WriteLine(value.ToString("0,0.0", CultureInfo.InvariantCulture));
|
||||
Console.WriteLine(String.Format(CultureInfo.InvariantCulture,
|
||||
"{0:0,0.0}", value));
|
||||
// Displays 1,234,567,890.1
|
||||
|
||||
value = 1234.567890;
|
||||
Console.WriteLine(value.ToString("0,0.00", CultureInfo.InvariantCulture));
|
||||
Console.WriteLine(String.Format(CultureInfo.InvariantCulture,
|
||||
"{0:0,0.00}", value));
|
||||
// Displays 1,234.57
|
||||
```
|
||||
|
||||
## The "#" Custom Specifier
|
||||
|
||||
The "#" custom format specifier serves as a digit-placeholder symbol. If the value that is being formatted has a digit in the position where the "#" symbol appears in the format string, that digit is copied to the result string. Otherwise, nothing is stored in that position in the result string.
|
||||
|
||||
Note that this specifier never displays a zero that is not a significant digit, even if zero is the only digit in the string. It will display zero only if it is a significant digit in the number that is being displayed.
|
||||
|
||||
The "##" format string causes the value to be rounded to the nearest digit preceding the decimal, where rounding away from zero is always used. For example, formatting 34.5 with "##" would result in the value 35.
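The difference between the "#" and "0" placeholders is easiest to see with a value of zero. The sketch below is a minimal illustration under the assumption that the invariant culture is used; `DigitPlaceholderSketch` and `Format` are hypothetical names, and the square brackets are added only to make an empty result visible.

```csharp
using System;
using System.Globalization;

public static class DigitPlaceholderSketch
{
    // Hypothetical helper: formats a value with the invariant culture.
    public static string Format(double value, string format)
    {
        return value.ToString(format, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        Console.WriteLine("[" + Format(0, "#") + "]");  // Displays [] (no significant digit)
        Console.WriteLine("[" + Format(0, "0") + "]");  // Displays [0]
        Console.WriteLine(Format(0.5, "#.##"));         // Displays .5
        Console.WriteLine(Format(34.5, "##"));          // Displays 35
    }
}
```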
|
||||
|
||||
The following example displays several values that are formatted by using custom format strings that include digit placeholders.
|
||||
|
||||
```csharp
|
||||
double value;
|
||||
|
||||
value = 1.2;
|
||||
Console.WriteLine(value.ToString("#.##", CultureInfo.InvariantCulture));
|
||||
Console.WriteLine(String.Format(CultureInfo.InvariantCulture,
|
||||
"{0:#.##}", value));
|
||||
// Displays 1.2
|
||||
|
||||
value = 123;
|
||||
Console.WriteLine(value.ToString("#####"));
|
||||
Console.WriteLine(String.Format("{0:#####}", value));
|
||||
// Displays 123
|
||||
|
||||
value = 123456;
|
||||
Console.WriteLine(value.ToString("[##-##-##]"));
|
||||
Console.WriteLine(String.Format("{0:[##-##-##]}", value));
|
||||
// Displays [12-34-56]
|
||||
|
||||
value = 1234567890;
|
||||
Console.WriteLine(value.ToString("#"));
|
||||
Console.WriteLine(String.Format("{0:#}", value));
|
||||
// Displays 1234567890
|
||||
|
||||
Console.WriteLine(value.ToString("(###) ###-####"));
|
||||
Console.WriteLine(String.Format("{0:(###) ###-####}", value));
|
||||
// Displays (123) 456-7890
|
||||
```
|
||||
|
||||
To return a result string in which absent digits or leading zeros are replaced by spaces, use the [composite formatting](compositeformat.md) feature and specify a field width, as the following example illustrates.
|
||||
|
||||
```csharp
using System;

public class Example
{
    public static void Main()
    {
        Double value = .324;
        Console.WriteLine("The value is: '{0,5:#.###}'", value);
    }
}
// The example displays the following output if the current culture
// is en-US:
//      The value is: ' .324'
```
|
||||
|
||||
## The "." Custom Specifier
|
||||
|
||||
The "." custom format specifier inserts a localized decimal separator into the result string. The first period in the format string determines the location of the decimal separator in the formatted value; any additional periods are ignored.
|
||||
|
||||
The character that is used as the decimal separator in the result string is not always a period; it is determined by the [NumberDecimalSeparator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NumberDecimalSeparator) property of the [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object that controls formatting.
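To show that the separator comes from the format provider rather than from the format string, the following sketch passes a `NumberFormatInfo` object with a custom `NumberDecimalSeparator` value. The `DecimalSeparatorSketch` class and `FormatWithSeparator` method are hypothetical names introduced for illustration.

```csharp
using System;
using System.Globalization;

public static class DecimalSeparatorSketch
{
    // Hypothetical helper: formats a value using a culture-independent
    // NumberFormatInfo whose decimal separator has been replaced.
    public static string FormatWithSeparator(double value, string format, string separator)
    {
        NumberFormatInfo nfi = new NumberFormatInfo();
        nfi.NumberDecimalSeparator = separator;
        return value.ToString(format, nfi);
    }

    public static void Main()
    {
        // The format string is the same in both calls; only the provider differs.
        Console.WriteLine(FormatWithSeparator(1.2, "00.00", "."));  // Displays 01.20
        Console.WriteLine(FormatWithSeparator(1.2, "00.00", ","));  // Displays 01,20
    }
}
```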
|
||||
|
||||
The following example uses the "." format specifier to define the location of the decimal point in several result strings.
|
||||
|
||||
```csharp
|
||||
double value;
|
||||
|
||||
value = 1.2;
|
||||
Console.WriteLine(value.ToString("0.00", CultureInfo.InvariantCulture));
|
||||
Console.WriteLine(String.Format(CultureInfo.InvariantCulture,
|
||||
"{0:0.00}", value));
|
||||
// Displays 1.20
|
||||
|
||||
Console.WriteLine(value.ToString("00.00", CultureInfo.InvariantCulture));
|
||||
Console.WriteLine(String.Format(CultureInfo.InvariantCulture,
|
||||
"{0:00.00}", value));
|
||||
// Displays 01.20
|
||||
|
||||
Console.WriteLine(value.ToString("00.00",
|
||||
CultureInfo.CreateSpecificCulture("da-DK")));
|
||||
Console.WriteLine(String.Format(CultureInfo.CreateSpecificCulture("da-DK"),
|
||||
"{0:00.00}", value));
|
||||
// Displays 01,20
|
||||
|
||||
value = .086;
|
||||
Console.WriteLine(value.ToString("#0.##%", CultureInfo.InvariantCulture));
|
||||
Console.WriteLine(String.Format(CultureInfo.InvariantCulture,
|
||||
"{0:#0.##%}", value));
|
||||
// Displays 8.6%
|
||||
|
||||
value = 86000;
|
||||
Console.WriteLine(value.ToString("0.###E+0", CultureInfo.InvariantCulture));
|
||||
Console.WriteLine(String.Format(CultureInfo.InvariantCulture,
|
||||
"{0:0.###E+0}", value));
|
||||
// Displays 8.6E+4
|
||||
```
|
||||
|
||||
## The "," Custom Specifier
|
||||
|
||||
The "," character serves as both a group separator and a number scaling specifier.
|
||||
|
||||
* Group separator: If one or more commas are specified between two digit placeholders (0 or #) that format the integral digits of a number, a group separator character is inserted between each number group in the integral part of the output.
|
||||
|
||||
The [NumberGroupSeparator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NumberGroupSeparator) and [NumberGroupSizes](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NumberGroupSizes) properties of the current [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object determine the character used as the number group separator and the size of each number group. For example, if the string "#,#" and the invariant culture are used to format the number 1000, the output is "1,000".
|
||||
|
||||
* Number scaling specifier: If one or more commas are specified immediately to the left of the explicit or implicit decimal point, the number to be formatted is divided by 1000 for each comma. For example, if the string "0,," is used to format the number 100 million, the output is "100".
|
||||
|
||||
You can use group separator and number scaling specifiers in the same format string. For example, if the string "#,0,," and the invariant culture are used to format the number one billion, the output is "1,000".
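The three cases mentioned in the text above can be reproduced with a short sketch. It assumes the invariant culture; `CommaSpecifierSketch` and `Format` are hypothetical names.

```csharp
using System;
using System.Globalization;

public static class CommaSpecifierSketch
{
    // Hypothetical helper: formats a value with the invariant culture.
    public static string Format(double value, string format)
    {
        return value.ToString(format, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        // Group separator only.
        Console.WriteLine(Format(1000, "#,#"));          // Displays 1,000

        // Number scaling only: each trailing comma divides by 1000.
        Console.WriteLine(Format(100000000, "0,,"));     // Displays 100

        // Group separator and number scaling combined.
        Console.WriteLine(Format(1000000000, "#,0,,"));  // Displays 1,000
    }
}
```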
|
||||
|
||||
The following example illustrates the use of the comma as a group separator.
|
||||
|
||||
```csharp
|
||||
double value = 1234567890;
|
||||
Console.WriteLine(value.ToString("#,#", CultureInfo.InvariantCulture));
|
||||
Console.WriteLine(String.Format(CultureInfo.InvariantCulture,
|
||||
"{0:#,#}", value));
|
||||
// Displays 1,234,567,890
|
||||
|
||||
Console.WriteLine(value.ToString("#,##0,,", CultureInfo.InvariantCulture));
|
||||
Console.WriteLine(String.Format(CultureInfo.InvariantCulture,
|
||||
"{0:#,##0,,}", value));
|
||||
// Displays 1,235
|
||||
```
|
||||
|
||||
The following example illustrates the use of the comma as a specifier for number scaling.
|
||||
|
||||
```csharp
|
||||
double value = 1234567890;
|
||||
Console.WriteLine(value.ToString("#,,", CultureInfo.InvariantCulture));
|
||||
Console.WriteLine(String.Format(CultureInfo.InvariantCulture,
|
||||
"{0:#,,}", value));
|
||||
// Displays 1235
|
||||
|
||||
Console.WriteLine(value.ToString("#,,,", CultureInfo.InvariantCulture));
|
||||
Console.WriteLine(String.Format(CultureInfo.InvariantCulture,
|
||||
"{0:#,,,}", value));
|
||||
// Displays 1
|
||||
|
||||
Console.WriteLine(value.ToString("#,##0,,", CultureInfo.InvariantCulture));
|
||||
Console.WriteLine(String.Format(CultureInfo.InvariantCulture,
|
||||
"{0:#,##0,,}", value));
|
||||
// Displays 1,235
|
||||
```
|
||||
|
||||
## The "%" Custom Specifier
|
||||
|
||||
A percent sign (%) in a format string causes a number to be multiplied by 100 before it is formatted. The localized percent symbol is inserted in the number at the location where the % appears in the format string. The percent character used is defined by the [PercentSymbol](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_PercentSymbol) property of the current [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object.
|
||||
|
||||
The following example defines several custom format strings that include the "%" custom specifier.
|
||||
|
||||
```csharp
|
||||
double value = .086;
|
||||
Console.WriteLine(value.ToString("#0.##%", CultureInfo.InvariantCulture));
|
||||
Console.WriteLine(String.Format(CultureInfo.InvariantCulture,
|
||||
"{0:#0.##%}", value));
|
||||
// Displays 8.6%
|
||||
```
|
||||
|
||||
## The "‰" Custom Specifier
|
||||
|
||||
A per mille character (‰ or \u2030) in a format string causes a number to be multiplied by 1000 before it is formatted. The appropriate per mille symbol is inserted in the returned string at the location where the ‰ symbol appears in the format string. The per mille character used is defined by the [NumberFormatInfo.PerMilleSymbol](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_PerMilleSymbol) property of the object that provides culture-specific formatting information.
|
||||
|
||||
The following example defines a custom format string that includes the "‰" custom specifier.
|
||||
|
||||
```csharp
|
||||
double value = .00354;
|
||||
string perMilleFmt = "#0.## " + '\u2030';
|
||||
Console.WriteLine(value.ToString(perMilleFmt, CultureInfo.InvariantCulture));
|
||||
Console.WriteLine(String.Format(CultureInfo.InvariantCulture,
|
||||
"{0:" + perMilleFmt + "}", value));
|
||||
// Displays 3.54 ‰
|
||||
```
|
||||
|
||||
## The "E" and "e" Custom Specifiers
|
||||
|
||||
If any of the strings "E", "E+", "E-", "e", "e+", or "e-" are present in the format string and are followed immediately by at least one zero, the number is formatted by using scientific notation with an "E" or "e" inserted between the number and the exponent. The number of zeros following the scientific notation indicator determines the minimum number of digits to output for the exponent. The "E+" and "e+" formats indicate that a plus sign or minus sign should always precede the exponent. The "E", "E-", "e", or "e-" formats indicate that a sign character should precede only negative exponents.
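The sign rules are easier to see when the exponent is negative. The following sketch is an illustration that assumes the invariant culture; `ExponentSketch` and `Format` are hypothetical names.

```csharp
using System;
using System.Globalization;

public static class ExponentSketch
{
    // Hypothetical helper: formats a value with the invariant culture.
    public static string Format(double value, string format)
    {
        return value.ToString(format, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        // A lowercase "e" in the format produces a lowercase "e" in the result.
        Console.WriteLine(Format(86000, "0.###e+00"));   // Displays 8.6e+04

        // Without "+", a positive exponent has no sign character.
        Console.WriteLine(Format(86000, "0.###E0"));     // Displays 8.6E4

        // A negative exponent always gets a minus sign.
        Console.WriteLine(Format(0.00086, "0.###E+0"));  // Displays 8.6E-4
        Console.WriteLine(Format(0.00086, "0.###E-0"));  // Displays 8.6E-4
    }
}
```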
|
||||
|
||||
The following example formats several numeric values using the specifiers for scientific notation.
|
||||
|
||||
```csharp
|
||||
double value = 86000;
|
||||
Console.WriteLine(value.ToString("0.###E+0", CultureInfo.InvariantCulture));
|
||||
Console.WriteLine(String.Format(CultureInfo.InvariantCulture,
|
||||
"{0:0.###E+0}", value));
|
||||
// Displays 8.6E+4
|
||||
|
||||
Console.WriteLine(value.ToString("0.###E+000", CultureInfo.InvariantCulture));
|
||||
Console.WriteLine(String.Format(CultureInfo.InvariantCulture,
|
||||
"{0:0.###E+000}", value));
|
||||
// Displays 8.6E+004
|
||||
|
||||
Console.WriteLine(value.ToString("0.###E-000", CultureInfo.InvariantCulture));
|
||||
Console.WriteLine(String.Format(CultureInfo.InvariantCulture,
|
||||
"{0:0.###E-000}", value));
|
||||
// Displays 8.6E004
|
||||
```
|
||||
|
||||
## The "\" Escape Character
|
||||
|
||||
The "#", "0", ".", ",", "%", and "‰" symbols in a format string are interpreted as format specifiers rather than as literal characters. Depending on their position in a custom format string, the uppercase and lowercase "E" as well as the + and - symbols may also be interpreted as format specifiers.
|
||||
|
||||
To prevent a character from being interpreted as a format specifier, you can precede it with a backslash, which is the escape character. The escape character signifies that the following character is a character literal that should be included in the result string unchanged.
|
||||
|
||||
To include a backslash in a result string, you must escape it with another backslash (\\).
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> Some compilers, such as the C# compiler, may also interpret a single backslash character as an escape character. To ensure that a string is interpreted correctly when formatting, you can use the verbatim string literal character (the @ character) before the string in C#, or add another backslash character before each backslash. The following C# example illustrates both approaches.
|
||||
|
||||
The following example uses the escape character to prevent the formatting operation from interpreting the "#", "0", and "\" characters as either escape characters or format specifiers. The example uses an additional backslash to ensure that a backslash is interpreted as a literal character.
|
||||
|
||||
```csharp
|
||||
int value = 123;
|
||||
Console.WriteLine(value.ToString("\\#\\#\\# ##0 dollars and \\0\\0 cents \\#\\#\\#"));
|
||||
Console.WriteLine(String.Format("{0:\\#\\#\\# ##0 dollars and \\0\\0 cents \\#\\#\\#}",
|
||||
value));
|
||||
// Displays ### 123 dollars and 00 cents ###
|
||||
|
||||
Console.WriteLine(value.ToString(@"\#\#\# ##0 dollars and \0\0 cents \#\#\#"));
|
||||
Console.WriteLine(String.Format(@"{0:\#\#\# ##0 dollars and \0\0 cents \#\#\#}",
|
||||
value));
|
||||
// Displays ### 123 dollars and 00 cents ###
|
||||
|
||||
Console.WriteLine(value.ToString("\\\\\\\\\\\\ ##0 dollars and \\0\\0 cents \\\\\\\\\\\\"));
|
||||
Console.WriteLine(String.Format("{0:\\\\\\\\\\\\ ##0 dollars and \\0\\0 cents \\\\\\\\\\\\}",
|
||||
value));
|
||||
// Displays \\\ 123 dollars and 00 cents \\\
|
||||
|
||||
Console.WriteLine(value.ToString(@"\\\\\\ ##0 dollars and \0\0 cents \\\\\\"));
|
||||
Console.WriteLine(String.Format(@"{0:\\\\\\ ##0 dollars and \0\0 cents \\\\\\}",
|
||||
value));
|
||||
// Displays \\\ 123 dollars and 00 cents \\\
|
||||
```
|
||||
|
||||
## The ";" Section Separator
|
||||
|
||||
The semicolon (;) is a conditional format specifier that applies different formatting to a number depending on whether its value is positive, negative, or zero. To produce this behavior, a custom format string can contain up to three sections separated by semicolons. These sections are described in the following table.
|
||||
|
||||
Number of sections | Description
|
||||
------------------ | -----------
|
||||
One section | The format string applies to all values.
|
||||
Two sections | The first section applies to positive values and zeros, and the second section applies to negative values. If the number to be formatted is negative, but becomes zero after rounding according to the format in the second section, the resulting zero is formatted according to the first section.
|
||||
Three sections | The first section applies to positive values, the second section applies to negative values, and the third section applies to zeros. The second section can be left empty (by having nothing between the semicolons), in which case the first section applies to all nonzero values. If the number to be formatted is nonzero, but becomes zero after rounding according to the format in the first or second section, the resulting zero is formatted according to the third section.
|
||||
|
||||
Section separators ignore any preexisting formatting associated with a number when the final value is formatted. For example, negative values are always displayed without a minus sign when section separators are used. If you want the final formatted value to have a minus sign, you should explicitly include the minus sign as part of the custom format specifier.
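As a small illustration of the missing minus sign, the sketch below formats the same negative value with three different second sections. It assumes the invariant culture; `SectionSeparatorSketch` and `Format` are hypothetical names.

```csharp
using System;
using System.Globalization;

public static class SectionSeparatorSketch
{
    // Hypothetical helper: formats a value with the invariant culture.
    public static string Format(double value, string format)
    {
        return value.ToString(format, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        // The second section has no minus sign, so none appears in the result.
        Console.WriteLine(Format(-1234, "##;##"));    // Displays 1234

        // The minus sign is included explicitly in the second section.
        Console.WriteLine(Format(-1234, "##;-##"));   // Displays -1234

        // Parentheses are a common alternative for negative values.
        Console.WriteLine(Format(-1234, "##;(##)"));  // Displays (1234)
    }
}
```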
|
||||
|
||||
The following example uses the ";" format specifier to format positive, negative, and zero numbers differently.
|
||||
|
||||
```csharp
|
||||
double posValue = 1234;
|
||||
double negValue = -1234;
|
||||
double zeroValue = 0;
|
||||
|
||||
string fmt2 = "##;(##)";
|
||||
string fmt3 = "##;(##);**Zero**";
|
||||
|
||||
Console.WriteLine(posValue.ToString(fmt2));
|
||||
Console.WriteLine(String.Format("{0:" + fmt2 + "}", posValue));
|
||||
// Displays 1234
|
||||
|
||||
Console.WriteLine(negValue.ToString(fmt2));
|
||||
Console.WriteLine(String.Format("{0:" + fmt2 + "}", negValue));
|
||||
// Displays (1234)
|
||||
|
||||
Console.WriteLine(zeroValue.ToString(fmt3));
|
||||
Console.WriteLine(String.Format("{0:" + fmt3 + "}", zeroValue));
|
||||
// Displays **Zero**
|
||||
```
|
||||
|
||||
## Notes
|
||||
|
||||
### Floating-Point Infinities and NaN
|
||||
|
||||
Regardless of the format string, if the value of a [Single](https://docs.microsoft.com/dotnet/core/api/System.Single) or [Double](https://docs.microsoft.com/dotnet/core/api/System.Double) floating-point type is positive infinity, negative infinity, or not a number (NaN), the formatted string is the value of the respective [PositiveInfinitySymbol](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_PositiveInfinitySymbol), [NegativeInfinitySymbol](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NegativeInfinitySymbol), or [NaNSymbol](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NaNSymbol) property that is specified by the currently applicable [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object.
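The following sketch illustrates this behavior by formatting the three special values with unrelated custom format strings and comparing each result with the corresponding `NumberFormatInfo` property. It assumes the invariant culture; `NonFiniteSketch` and `Format` are hypothetical names.

```csharp
using System;
using System.Globalization;

public static class NonFiniteSketch
{
    // Hypothetical helper: formats a value with the invariant NumberFormatInfo.
    public static string Format(double value, string format)
    {
        return value.ToString(format, NumberFormatInfo.InvariantInfo);
    }

    public static void Main()
    {
        NumberFormatInfo nfi = NumberFormatInfo.InvariantInfo;

        // The custom format strings have no effect on these values.
        Console.WriteLine(Format(Double.NaN, "0.00") == nfi.NaNSymbol);                            // Displays True
        Console.WriteLine(Format(Double.PositiveInfinity, "#,##0.00") == nfi.PositiveInfinitySymbol);  // Displays True
        Console.WriteLine(Format(Double.NegativeInfinity, "(###)") == nfi.NegativeInfinitySymbol);     // Displays True
    }
}
```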
|
||||
|
||||
### Rounding and Fixed-Point Format Strings
|
||||
|
||||
For fixed-point format strings (that is, format strings that do not contain scientific notation format characters), numbers are rounded to as many decimal places as there are digit placeholders to the right of the decimal point. If the format string does not contain a decimal point, the number is rounded to the nearest integer. If the number has more digits than there are digit placeholders to the left of the decimal point, the extra digits are copied to the result string immediately before the first digit placeholder.
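A minimal sketch of these rounding rules, assuming the invariant culture (`FixedPointRoundingSketch` and `Format` are hypothetical names):

```csharp
using System;
using System.Globalization;

public static class FixedPointRoundingSketch
{
    // Hypothetical helper: formats a value with the invariant culture.
    public static string Format(double value, string format)
    {
        return value.ToString(format, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        // Two placeholders after the decimal point: rounded to two decimal places.
        Console.WriteLine(Format(1234.5678, "0.00"));  // Displays 1234.57

        // No decimal point in the format string: rounded to the nearest integer.
        Console.WriteLine(Format(1234.5678, "0"));     // Displays 1235

        // More integral digits than placeholders: the extra digits are kept.
        Console.WriteLine(Format(12345, "00"));        // Displays 12345
    }
}
```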
|
||||
|
||||
## Example
|
||||
|
||||
The following example demonstrates two custom numeric format strings. In both cases, the digit placeholder (#) displays the numeric data, and all other characters are copied to the result string.
|
||||
|
||||
```csharp
|
||||
double number1 = 1234567890;
|
||||
string value1 = number1.ToString("(###) ###-####");
|
||||
Console.WriteLine(value1);
|
||||
|
||||
int number2 = 42;
|
||||
string value2 = number2.ToString("My Number = #");
|
||||
Console.WriteLine(value2);
|
||||
// The example displays the following output:
|
||||
// (123) 456-7890
|
||||
// My Number = 42
|
||||
```
|
||||
|
||||
## See Also
|
||||
|
||||
[System.Globalization.NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo)
|
||||
|
||||
[Standard Numeric Format Strings](standardnumeric.md)
|
||||
|
||||
[How to: Pad a Number with Leading Zeros](operations/padnumber.md)
|
||||
|
||||
[Composite Formatting](compositeformat.md)
|
||||
|
||||
|
File diff not shown because it is too large
|
@ -1,100 +0,0 @@
|
|||
---
|
||||
title: Enumeration Format Strings
|
||||
description: Enumeration Format Strings
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: f9cfe8e3-d8c8-4b77-a172-7940098cf4c2
|
||||
---
|
||||
|
||||
# Enumeration Format Strings
|
||||
|
||||
|
||||
You can use the [Enum.ToString](https://docs.microsoft.com/dotnet/core/api/System.Enum#System_Enum_ToString) method to create a new string object that represents the numeric, hexadecimal, or string value of an enumeration member. This method takes one of the enumeration formatting strings to specify the value that you want returned.
|
||||
|
||||
The following sections list the enumeration formatting strings and the values they return. These format specifiers are not case-sensitive.
|
||||
|
||||
## The G or g Format Strings
|
||||
|
||||
The G or g format strings display the enumeration entry as a string value, if possible, and otherwise display the integer value of the current instance. If the enumeration is defined with the `Flags` attribute set, the string values of each valid entry are concatenated together, separated by commas. If the `Flags` attribute is not set, an invalid value is displayed as a numeric entry. The following example illustrates the G format specifier.
|
||||
|
||||
```csharp
|
||||
Console.WriteLine(ConsoleColor.Red.ToString("G")); // Displays Red
|
||||
FileAttributes attributes = FileAttributes.Hidden |
|
||||
FileAttributes.Archive;
|
||||
Console.WriteLine(attributes.ToString("G")); // Displays Hidden, Archive
|
||||
```
|
||||
|
||||
## The F or f Format Strings
|
||||
|
||||
The F or f format strings display the enumeration entry as a string value, if possible. If the value can be completely displayed as a summation of the entries in the enumeration (even if the `Flags` attribute is not present), the string values of each valid entry are concatenated together, separated by commas. If the value cannot be completely determined by the enumeration entries, then the value is formatted as the integer value. The following example illustrates the F format specifier.
|
||||
|
||||
```csharp
|
||||
Console.WriteLine(ConsoleColor.Blue.ToString("F")); // Displays Blue
|
||||
FileAttributes attributes = FileAttributes.Hidden |
|
||||
FileAttributes.Archive;
|
||||
Console.WriteLine(attributes.ToString("F")); // Displays Hidden, Archive
|
||||
```
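The G and F format strings differ only for values that are not named members of the enumeration. The following sketch contrasts them by using a hypothetical `Permission` enumeration that is deliberately declared without the `Flags` attribute; the type and member names are assumptions made for illustration.

```csharp
using System;

// Hypothetical enumeration, declared without the Flags attribute.
public enum Permission { Read = 1, Write = 2 }

public static class EnumFormatSketch
{
    public static void Main()
    {
        // 3 is not a named member, but it is the sum of Read and Write.
        Permission combined = (Permission)3;

        Console.WriteLine(combined.ToString("G"));  // Displays 3
        Console.WriteLine(combined.ToString("F"));  // Displays Read, Write

        // 8 cannot be expressed as a combination of named members.
        Permission unknown = (Permission)8;
        Console.WriteLine(unknown.ToString("F"));   // Displays 8
    }
}
```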
|
||||
|
||||
## The D or d Format Strings
|
||||
|
||||
The D or d format strings display the enumeration entry as an integer value in the shortest representation possible. The following example illustrates the D format specifier.
|
||||
|
||||
```csharp
|
||||
Console.WriteLine(ConsoleColor.Cyan.ToString("D")); // Displays 11
|
||||
FileAttributes attributes = FileAttributes.Hidden |
|
||||
FileAttributes.Archive;
|
||||
Console.WriteLine(attributes.ToString("D")); // Displays 34
|
||||
```
|
||||
|
||||
## The X or x Format Strings
|
||||
|
||||
The X or x format strings display the enumeration entry as a hexadecimal value. The value is represented with leading zeros as necessary, to ensure that the value is a minimum eight digits in length. The following example illustrates the X format specifier.
|
||||
|
||||
```csharp
|
||||
Console.WriteLine(ConsoleColor.Cyan.ToString("X")); // Displays 0000000B
|
||||
FileAttributes attributes = FileAttributes.Hidden |
|
||||
FileAttributes.Archive;
|
||||
Console.WriteLine(attributes.ToString("X")); // Displays 00000022
|
||||
```
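The eight-digit width shown above corresponds to enumerations whose underlying type is a 32-bit integer, which is the default. As a sketch of how the width follows the underlying type, the example below uses two hypothetical enumerations, `SmallEnum` and `WideEnum`, whose names and members are assumptions made for illustration.

```csharp
using System;

// Hypothetical enumerations with explicit underlying types.
public enum SmallEnum : byte { Ten = 10 }
public enum WideEnum : long { Ten = 10 }

public static class EnumHexWidthSketch
{
    public static void Main()
    {
        Console.WriteLine(SmallEnum.Ten.ToString("X"));      // Displays 0A
        Console.WriteLine(ConsoleColor.Cyan.ToString("X"));  // Displays 0000000B
        Console.WriteLine(WideEnum.Ten.ToString("X"));       // Displays 000000000000000A
    }
}
```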
|
||||
|
||||
## Example
|
||||
|
||||
The following example defines an enumeration called `Color` that consists of three entries: `Red`, `Blue`, and `Green`.
|
||||
|
||||
```csharp
|
||||
public enum Color {Red = 1, Blue = 2, Green = 3}
|
||||
```
|
||||
|
||||
After the enumeration is defined, an instance can be declared in the following manner.
|
||||
|
||||
```csharp
|
||||
Color myColor = Color.Green;
|
||||
```
|
||||
|
||||
The `Color.ToString(System.String)` method can then be used to display the enumeration value in different ways, depending on the format specifier passed to it.
|
||||
|
||||
```csharp
|
||||
Console.WriteLine("The value of myColor is {0}.",
|
||||
myColor.ToString("G"));
|
||||
Console.WriteLine("The value of myColor is {0}.",
|
||||
myColor.ToString("F"));
|
||||
Console.WriteLine("The value of myColor is {0}.",
|
||||
myColor.ToString("D"));
|
||||
Console.WriteLine("The value of myColor is 0x{0}.",
|
||||
myColor.ToString("X"));
|
||||
// The example displays the following output to the console:
|
||||
// The value of myColor is Green.
|
||||
// The value of myColor is Green.
|
||||
// The value of myColor is 3.
|
||||
// The value of myColor is 0x00000003.
|
||||
```
|
||||
|
||||
|
||||
|
||||
|
|
@ -1,884 +0,0 @@
|
|||
---
|
||||
title: Formatting Types
|
||||
description: Formatting Types
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 371ad7d7-e3ce-4be3-83f3-1f2c2e4e6d14
|
||||
---
|
||||
|
||||
# Formatting Types
|
||||
|
||||
Formatting is the process of converting an instance of a class, structure, or enumeration value to its string representation, often so that the resulting string can be displayed to users or deserialized to restore the original data type. This conversion can pose a number of challenges:
|
||||
|
||||
* The way that values are stored internally does not necessarily reflect the way that users want to view them. For example, a telephone number might be stored in the form **8009999999**, which is not user-friendly. It should instead be displayed as **800-999-9999**. See the [Custom Format Strings](#Custom-Format-Strings) section for an example that formats a number in this way.
|
||||
|
||||
* Sometimes the conversion of an object to its string representation is not intuitive. For example, it is not clear how the string representation of a **Temperature** object or a **Person** object should appear. For an example that formats a **Temperature** object in a variety of ways, see the [Standard Format Strings](#Standard-Format-Strings) section.
|
||||
|
||||
* Values often require culture-sensitive formatting. For example, in an application that uses numbers to reflect monetary values, numeric strings should include the current culture’s currency symbol, group separator (which, in most cultures, is the thousands separator), and decimal symbol. For an example, see the [Culture-Sensitive Formatting with Format Providers and the IFormatProvider Interface](#Culture-Sensitive-Formatting-with-Format-Providers-and-the-IFormatProvider-Interface) section.
|
||||
|
||||
* An application may have to display the same value in different ways. For example, an application may represent an enumeration member by displaying a string representation of its name or by displaying its underlying value. For an example that formats a member of the [DayOfWeek](https://docs.microsoft.com/dotnet/core/api/System.DayOfWeek) enumeration in different ways, see the [Standard Format Strings](#Standard-Format-Strings) section.
|
||||
|
||||
.NET Core provides rich formatting support that enables developers to address these requirements.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> Formatting converts the value of a type into a string representation. Parsing is the inverse of formatting. A parsing operation creates an instance of a data type from its string representation.
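
To make the relationship between formatting and parsing concrete, the following sketch formats an integer as a hexadecimal string and then parses that string back to the original value. `RoundTripSketch`, `FormatHex`, and `ParseHex` are hypothetical names introduced for illustration.

```csharp
using System;
using System.Globalization;

public static class RoundTripSketch
{
    // Formatting: value to string representation.
    public static string FormatHex(int value)
    {
        return value.ToString("X", CultureInfo.InvariantCulture);
    }

    // Parsing: string representation back to a value.
    public static int ParseHex(string text)
    {
        return Int32.Parse(text, NumberStyles.HexNumber, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        int original = 60312;
        string formatted = FormatHex(original);
        int parsed = ParseHex(formatted);

        Console.WriteLine(formatted);           // Displays EB98
        Console.WriteLine(parsed == original);  // Displays True
    }
}
```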
|
||||
|
||||
This overview contains the following sections:
|
||||
|
||||
* [Formatting in .NET Core](#Formatting-in-.NET-Core)
|
||||
|
||||
* [Default Formatting Using the ToString Method](#Default-Formatting-Using-the-ToString-Method)
|
||||
|
||||
* [Overriding the ToString Method](#Overriding-the-ToString-Method)
|
||||
|
||||
* [The ToString Method and Format Strings](#The-ToString-Method-and-Format-Strings)
|
||||
|
||||
* [Standard Format Strings](#Standard-Format-Strings)
|
||||
|
||||
* [Custom Format Strings](#Custom-Format-Strings)
|
||||
|
||||
* [Format Strings and .NET Core Types](#Format-Strings-and-.NET-Core-Types)
|
||||
|
||||
* [Culture-Sensitive Formatting with Format Providers and the IFormatProvider Interface](#Culture-Sensitive-Formatting-with-Format-Providers-and-the-IFormatProvider-Interface)
|
||||
|
||||
* [Culture-Sensitive Formatting of Numeric Values](#Culture-Sensitive-Formatting-of-Numeric-Values)
|
||||
|
||||
* [Culture-Sensitive Formatting of Date and Time Values](#Culture-Sensitive-Formatting-of-Date-and-Time-Values)
|
||||
|
||||
* [The IFormattable Interface](#The-IFormattable-Interface)
|
||||
|
||||
* [Composite Formatting](#Composite-Formatting)
|
||||
|
||||
* [Custom Formatting with ICustomFormatter](#Custom-Formatting-with-ICustomFormatter)
|
||||
|
||||
* [Related Topics](#Related-Topics)
|
||||
|
||||
* [Reference](#Reference)
|
||||
|
||||
## Formatting in .NET Core
|
||||
|
||||
The basic mechanism for formatting is the default implementation of the [Object.ToString](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_ToString) method, which is discussed in the [Default Formatting Using the ToString Method](#Default-Formatting-Using-the-ToString-Method) section later in this topic. However, .NET Core provides several ways to modify and extend its default formatting support. These include the following:
|
||||
|
||||
* Overriding the [Object.ToString](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_ToString) method to define a custom string representation of an object’s value. For more information, see the [Overriding the ToString Method](#Overriding-the-ToString-Method) section later in this topic.
|
||||
|
||||
* Defining format specifiers that enable the string representation of an object’s value to take multiple forms. For example, the "X" format specifier in the following statement converts an integer to the string representation of a hexadecimal value.
|
||||
|
||||
```csharp
|
||||
int integerValue = 60312;
|
||||
Console.WriteLine(integerValue.ToString("X")); // Displays EB98.
|
||||
```
|
||||
|
||||
For more information about format specifiers, see [The ToString Method and Format Strings](#The-ToString-Method-and-Format-Strings).
|
||||
|
||||
* Using format providers to take advantage of the formatting conventions of a specific culture. For example, the following statement displays a currency value by using the formatting conventions of the en-US culture.
|
||||
|
||||
```csharp
|
||||
double cost = 1632.54;
|
||||
Console.WriteLine(cost.ToString("C",
|
||||
new System.Globalization.CultureInfo("en-US")));
|
||||
// The example displays the following output:
|
||||
// $1,632.54
|
||||
```
|
||||
|
||||
For more information about formatting with format providers, see the [Culture-Sensitive Formatting with Format Providers and the IFormatProvider Interface](#Culture-Sensitive-Formatting-with-Format-Providers-and-the-IFormatProvider-Interface) section.
|
||||
|
||||
* Implementing the [IFormattable](https://docs.microsoft.com/dotnet/core/api/System.IFormattable) interface to support both string conversion with the [Convert](https://docs.microsoft.com/dotnet/core/api/System.Convert) class and composite formatting. For more information, see [The IFormattable Interface](#The-IFormattable-Interface).
|
||||
|
||||
* Using composite formatting to embed the string representation of a value in a larger string. For more information, see the [Composite Formatting](#Composite-Formatting) section.
|
||||
|
||||
* Implementing [ICustomFormatter](https://docs.microsoft.com/dotnet/core/api/System.ICustomFormatter) and [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) to provide a complete custom formatting solution. For more information, see the [Custom Formatting with ICustomFormatter](#Custom-Formatting-with-ICustomFormatter) section.
|
||||
|
||||
The following sections examine these methods for converting an object to its string representation.
|
||||
|
||||
## Default Formatting Using the ToString Method
|
||||
|
||||
Every type that is derived from [System.Object](https://docs.microsoft.com/dotnet/core/api/System.Object) automatically inherits a parameterless [ToString](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_ToString) method, which returns the name of the type by default. The following example illustrates the default [ToString](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_ToString) method. It defines a class named `Automobile` that has no implementation. When the class is instantiated and its [ToString](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_ToString) method is called, it displays its type name. Note that the [ToString](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_ToString) method is not explicitly called in the example. The [Console.WriteLine(Object)](https://docs.microsoft.com/dotnet/core/api/System.Console#System_Console_WriteLine_System_Object_) method implicitly calls the [ToString](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_ToString) method of the object passed to it as an argument.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
|
||||
public class Automobile
|
||||
{
|
||||
// No implementation. All members are inherited from Object.
|
||||
}
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
Automobile firstAuto = new Automobile();
|
||||
Console.WriteLine(firstAuto);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Automobile
|
||||
```
|
||||
|
||||
Because all types other than interfaces are derived from [Object](https://docs.microsoft.com/dotnet/core/api/System.Object), this functionality is automatically provided to your custom classes or structures. However, the functionality offered by the default [ToString](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_ToString) method is limited: although it identifies the type, it fails to provide any information about an instance of the type. To provide a string representation of an object that provides information about that object, you must override the [ToString](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_ToString) method.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> Structures inherit from [ValueType](https://docs.microsoft.com/dotnet/core/api/System.ValueType), which in turn is derived from [Object](https://docs.microsoft.com/dotnet/core/api/System.Object). Although [ValueType](https://docs.microsoft.com/dotnet/core/api/System.ValueType) overrides [Object.ToString](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_ToString), its implementation is identical.
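
As a minimal sketch of the note above, the following code shows that a structure that does not override `ToString` also reports only its type name. The `Point` structure and the `StructExample` class are hypothetical names introduced here for illustration.

```csharp
using System;

// Hypothetical structure used only for illustration; it does not override ToString.
public struct Point
{
    public int X;
    public int Y;
}

public class StructExample
{
    // Returns whatever the inherited ValueType.ToString implementation produces.
    public static string Describe()
    {
        Point p = new Point { X = 3, Y = 4 };
        return p.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
    }
}
// The example displays the following output:
//       Point
```

The field values 3 and 4 do not appear in the result, which is why the next section overrides `ToString`.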
|
||||
|
||||
## Overriding the ToString Method
|
||||
|
||||
Displaying the name of a type is often of limited use and does not allow consumers of your types to differentiate one instance from another. However, you can override the [ToString](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_ToString) method to provide a more useful representation of an object’s value. The following example defines a `Temperature` class and overrides its [ToString](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_ToString) method to display the temperature in degrees Celsius.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
|
||||
public class Temperature
|
||||
{
|
||||
private decimal temp;
|
||||
|
||||
public Temperature(decimal temperature)
|
||||
{
|
||||
this.temp = temperature;
|
||||
}
|
||||
|
||||
public override string ToString()
|
||||
{
|
||||
return this.temp.ToString("N1") + "°C";
|
||||
}
|
||||
}
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
Temperature currentTemperature = new Temperature(23.6m);
|
||||
Console.WriteLine("The current temperature is " +
|
||||
currentTemperature.ToString());
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// The current temperature is 23.6°C
|
||||
```
|
||||
|
||||
In .NET Core, the [ToString](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_ToString) method of each primitive value type has been overridden to display the object’s value instead of its name. The following table shows the override for each primitive type. Note that most of the overridden methods call another overload of the [ToString](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_ToString) method and pass it the "G" format specifier, which defines the general format for its type, and an [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) object that represents the current culture.
|
||||
|
||||
Type | ToString override
|
||||
---- | -----------------
|
||||
[Boolean](https://docs.microsoft.com/dotnet/core/api/System.Boolean) | Returns either [Boolean.TrueString](https://docs.microsoft.com/dotnet/core/api/System.Boolean#System_Boolean_TrueString) or [Boolean.FalseString](https://docs.microsoft.com/dotnet/core/api/System.Boolean#System_Boolean_FalseString).
|
||||
[Byte](https://docs.microsoft.com/dotnet/core/api/System.Byte) | Calls `Byte.ToString("G", NumberFormatInfo.CurrentInfo)` to format the [Byte](https://docs.microsoft.com/dotnet/core/api/System.Byte) value for the current culture.
|
||||
[Char](https://docs.microsoft.com/dotnet/core/api/System.Char) | Returns the character as a string.
|
||||
[DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) | Calls `DateTime.ToString("G", DateTimeFormatInfo.CurrentInfo)` to format the date and time value for the current culture.
|
||||
[Decimal](https://docs.microsoft.com/dotnet/core/api/System.Decimal) | Calls `Decimal.ToString("G", NumberFormatInfo.CurrentInfo)` to format the [Decimal](https://docs.microsoft.com/dotnet/core/api/System.Decimal) value for the current culture.
|
||||
[Double](https://docs.microsoft.com/dotnet/core/api/System.Double) | Calls `Double.ToString("G", NumberFormatInfo.CurrentInfo)` to format the [Double](https://docs.microsoft.com/dotnet/core/api/System.Double) value for the current culture.
|
||||
[Int16](https://docs.microsoft.com/dotnet/core/api/System.Int16) | Calls `Int16.ToString("G", NumberFormatInfo.CurrentInfo)` to format the [Int16](https://docs.microsoft.com/dotnet/core/api/System.Int16) value for the current culture.
|
||||
[Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32) | Calls `Int32.ToString("G", NumberFormatInfo.CurrentInfo)` to format the [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32) value for the current culture.
|
||||
[Int64](https://docs.microsoft.com/dotnet/core/api/System.Int64) | Calls `Int64.ToString("G", NumberFormatInfo.CurrentInfo)` to format the [Int64](https://docs.microsoft.com/dotnet/core/api/System.Int64) value for the current culture.
|
||||
[SByte](https://docs.microsoft.com/dotnet/core/api/System.SByte) | Calls `SByte.ToString("G", NumberFormatInfo.CurrentInfo)` to format the [SByte](https://docs.microsoft.com/dotnet/core/api/System.SByte) value for the current culture.
|
||||
[Single](https://docs.microsoft.com/dotnet/core/api/System.Single) | Calls `Single.ToString("G", NumberFormatInfo.CurrentInfo)` to format the [Single](https://docs.microsoft.com/dotnet/core/api/System.Single) value for the current culture.
|
||||
[UInt16](https://docs.microsoft.com/dotnet/core/api/System.UInt16) | Calls `UInt16.ToString("G", NumberFormatInfo.CurrentInfo)` to format the [UInt16](https://docs.microsoft.com/dotnet/core/api/System.UInt16) value for the current culture.
|
||||
[UInt32](https://docs.microsoft.com/dotnet/core/api/System.UInt32) | Calls `UInt32.ToString("G", NumberFormatInfo.CurrentInfo)` to format the [UInt32](https://docs.microsoft.com/dotnet/core/api/System.UInt32) value for the current culture.
|
||||
[UInt64](https://docs.microsoft.com/dotnet/core/api/System.UInt64) | Calls `UInt64.ToString("G", NumberFormatInfo.CurrentInfo)` to format the [UInt64](https://docs.microsoft.com/dotnet/core/api/System.UInt64) value for the current culture.
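
The relationship described in the table can be checked with a short sketch such as the following, which compares the parameterless `ToString` call with the explicit "G" call for an `Int32` and a `Double` value. The class and method names are hypothetical, and the comparison is an illustration of the documented behavior rather than a view into the actual implementation.

```csharp
using System;
using System.Globalization;

public class DefaultToStringExample
{
    // Compares the parameterless ToString call with the explicit "G" call
    // described in the table, using the current culture in both cases.
    public static bool MatchesGeneralFormat()
    {
        int count = 1234;
        double ratio = 0.75;

        bool intMatches =
            count.ToString() == count.ToString("G", NumberFormatInfo.CurrentInfo);
        bool doubleMatches =
            ratio.ToString() == ratio.ToString("G", NumberFormatInfo.CurrentInfo);

        return intMatches && doubleMatches;
    }

    public static void Main()
    {
        Console.WriteLine(MatchesGeneralFormat());
    }
}
// The example displays the following output:
//       True
```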
|
||||
|
||||
## The ToString Method and Format Strings
|
||||
|
||||
Relying on the default [ToString](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_ToString) method or overriding [ToString](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_ToString) is appropriate when an object has a single string representation. However, the value of an object often has multiple representations. For example, a temperature can be expressed in degrees Fahrenheit, degrees Celsius, or kelvins. Similarly, the integer value 10 can be represented in numerous ways, including 10, 10.0, 1.0e01, or $10.00.
|
||||
|
||||
To enable a single value to have multiple string representations, .NET Core uses format strings. A format string is a string that contains one or more predefined format specifiers, which are single characters or groups of characters that define how the [ToString](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_ToString) method should format its output. The format string is then passed as a parameter to the object's [ToString](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_ToString) method and determines how the string representation of that object's value should appear.
|
||||
|
||||
All numeric types, date and time types, and enumeration types in .NET Core support a predefined set of format specifiers. You can also use format strings to define multiple string representations of your application-defined data types.
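
As a minimal sketch of this idea, the following code formats the integer value 10 with several standard format strings. The use of `CultureInfo.InvariantCulture` is an assumption made here so that the result does not depend on the machine's regional settings; the class and method names are hypothetical.

```csharp
using System;
using System.Globalization;

public class FormatStringExample
{
    // Formats one integer with several standard format strings.
    public static string[] FormatTen()
    {
        int value = 10;
        string[] formats = { "D3", "F1", "E2", "X" };
        string[] results = new string[formats.Length];

        for (int i = 0; i < formats.Length; i++)
        {
            // InvariantCulture keeps the output independent of regional settings.
            results[i] = value.ToString(formats[i], CultureInfo.InvariantCulture);
        }
        return results;
    }

    public static void Main()
    {
        foreach (string result in FormatTen())
            Console.WriteLine(result);
    }
}
// The example displays the following output:
//       010
//       10.0
//       1.00E+001
//       A
```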
|
||||
|
||||
### Standard Format Strings
|
||||
|
||||
A standard format string contains a single format specifier, which is an alphabetic character that defines the string representation of the object to which it is applied, along with an optional precision specifier that affects how many digits are displayed in the result string. If the precision specifier is omitted or is not supported, a standard format specifier is equivalent to a standard format string.
|
||||
|
||||
.NET Core defines a set of standard format specifiers for all numeric types, all date and time types, and all enumeration types. For example, each of these categories supports a "G" standard format specifier, which defines a general string representation of a value of that type.
|
||||
|
||||
Standard format strings for enumeration types directly control the string representation of a value. The format strings passed to an enumeration value’s [ToString](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_ToString) method determine whether the value is displayed using its string name (the "G" and "F" format specifiers), its underlying integral value (the "D" format specifier), or its hexadecimal value (the "X" format specifier). The following example illustrates the use of standard format strings to format a [DayOfWeek](https://docs.microsoft.com/dotnet/core/api/System.DayOfWeek) enumeration value.
|
||||
|
||||
```csharp
|
||||
DayOfWeek thisDay = DayOfWeek.Monday;
|
||||
string[] formatStrings = {"G", "F", "D", "X"};
|
||||
|
||||
foreach (string formatString in formatStrings)
|
||||
Console.WriteLine(thisDay.ToString(formatString));
|
||||
// The example displays the following output:
|
||||
// Monday
|
||||
// Monday
|
||||
// 1
|
||||
// 00000001
|
||||
```
|
||||
|
||||
For information about enumeration format strings, see [Enumeration Format Strings](enumerationformat.md).
|
||||
|
||||
Standard format strings for numeric types usually define a result string whose precise appearance is controlled by one or more property values. For example, the "C" format specifier formats a number as a currency value. When you call the [ToString](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_ToString) method with the "C" format specifier as the only parameter, the following property values from the current culture’s [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object are used to define the string representation of the numeric value:
|
||||
|
||||
* The [CurrencySymbol](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_CurrencySymbol) property, which specifies the current culture’s currency symbol.
|
||||
|
||||
* The [CurrencyNegativePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_CurrencyNegativePattern) or [CurrencyPositivePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_CurrencyPositivePattern) property, which returns an integer that determines the following:
|
||||
|
||||
* The placement of the currency symbol.
|
||||
|
||||
* Whether negative values are indicated by a leading negative sign, a trailing negative sign, or parentheses.
|
||||
|
||||
* Whether a space appears between the numeric value and the currency symbol.
|
||||
|
||||
* The [CurrencyDecimalDigits](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_CurrencyDecimalDigits) property, which defines the number of fractional digits in the result string.
|
||||
|
||||
* The [CurrencyDecimalSeparator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_CurrencyDecimalSeparator) property, which defines the decimal separator symbol in the result string.
|
||||
|
||||
* The [CurrencyGroupSeparator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_CurrencyGroupSeparator) property, which defines the group separator symbol.
|
||||
|
||||
* The [CurrencyGroupSizes](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_CurrencyGroupSizes) property, which defines the number of digits in each group to the left of the decimal.
|
||||
|
||||
* The [NegativeSign](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NegativeSign) property, which determines the negative sign used in the result string if parentheses are not used to indicate negative values.
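
To make the role of these properties concrete, the following sketch builds a culture-independent `NumberFormatInfo` object, overrides a few of the currency-related properties listed above, and passes it to `ToString` with the "C" format specifier. The chosen property values are illustrative assumptions and do not correspond to the settings of any real culture; the class and method names are hypothetical.

```csharp
using System;
using System.Globalization;

public class CurrencyPropertiesExample
{
    // Creates a writable, culture-independent NumberFormatInfo and
    // customizes some of the properties that the "C" specifier consults.
    public static NumberFormatInfo CreateFormat()
    {
        NumberFormatInfo format = new NumberFormatInfo();
        format.CurrencySymbol = "€";
        format.CurrencyDecimalDigits = 3;
        format.CurrencyPositivePattern = 3;   // pattern "n $"
        format.CurrencyNegativePattern = 8;   // pattern "-n $"
        return format;
    }

    public static void Main()
    {
        NumberFormatInfo format = CreateFormat();
        Console.WriteLine(1234.5m.ToString("C", format));
        Console.WriteLine((-1234.5m).ToString("C", format));
    }
}
// The example displays the following output:
//       1,234.500 €
//       -1,234.500 €
```

The group and decimal separators in the output come from the default (invariant) values of `CurrencyGroupSeparator` and `CurrencyDecimalSeparator`, which this sketch leaves unchanged.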
|
||||
|
||||
In addition, numeric format strings may include a precision specifier. The meaning of this specifier depends on the format string with which it is used, but it typically indicates either the total number of digits or the number of fractional digits that should appear in the result string. For example, the following code uses the "X4" standard numeric format string, that is, the "X" format specifier with a precision specifier of 4, to create a string value that has at least four hexadecimal digits.
|
||||
|
||||
```csharp
|
||||
byte[] byteValues = { 12, 163, 255 };
|
||||
foreach (byte byteValue in byteValues)
|
||||
Console.WriteLine(byteValue.ToString("X4"));
|
||||
// The example displays the following output:
|
||||
// 000C
|
||||
// 00A3
|
||||
// 00FF
|
||||
```
|
||||
|
||||
For more information about standard numeric formatting strings, see [Standard Numeric Format Strings](standardnumeric.md).
|
||||
|
||||
Standard format strings for date and time values are aliases for custom format strings stored by a particular [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo) class. For example, calling the [ToString](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_ToString) method of a date and time value with the "D" format specifier displays the date and time by using the custom format string stored in the current culture’s [DateTimeFormatInfo.LongDatePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_LongDatePattern) property. (For more information about custom format strings, see the [Custom Format Strings](#Custom-Format-Strings) section.) The following example illustrates this relationship.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Globalization;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
DateTime date1 = new DateTime(2017, 6, 30);
|
||||
Console.WriteLine("D Format Specifier: {0:D}", date1);
|
||||
string longPattern = CultureInfo.CurrentCulture.DateTimeFormat.LongDatePattern;
|
||||
Console.WriteLine("'{0}' custom format string: {1}",
|
||||
longPattern, date1.ToString(longPattern));
|
||||
}
|
||||
}
|
||||
// The example displays the following output when run on a system whose
|
||||
// current culture is en-US:
|
||||
// D Format Specifier: Friday, June 30, 2017
|
||||
// 'dddd, MMMM dd, yyyy' custom format string: Friday, June 30, 2017
|
||||
```
|
||||
|
||||
For more information about standard date and time format strings, see [Standard Date and Time Format Strings](standarddatetime.md).
|
||||
|
||||
You can also use standard format strings to define the string representation of an application-defined object that is produced by the object’s `ToString(String)` method. You can define the specific standard format specifiers that your object supports, and you can determine whether they are case-sensitive or case-insensitive. Your implementation of the `ToString(String)` method should support the following:
|
||||
|
||||
* A "G" format specifier that represents a customary or common format of the object. The parameterless overload of your object's `ToString` method should call its `ToString(String)` overload and pass it the "G" standard format string.
|
||||
|
||||
* Support for a format specifier that is equal to a null reference. A format specifier that is equal to a null reference should be considered equivalent to the "G" format specifier.
|
||||
|
||||
For example, a `Temperature` class can internally store the temperature in degrees Celsius and use format specifiers to represent the value of the `Temperature` object in degrees Celsius, degrees Fahrenheit, and kelvins. The following example provides an illustration.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
|
||||
public class Temperature
|
||||
{
|
||||
private decimal m_Temp;
|
||||
|
||||
public Temperature(decimal temperature)
|
||||
{
|
||||
this.m_Temp = temperature;
|
||||
}
|
||||
|
||||
public decimal Celsius
|
||||
{
|
||||
get { return this.m_Temp; }
|
||||
}
|
||||
|
||||
public decimal Kelvin
|
||||
{
|
||||
get { return this.m_Temp + 273.15m; }
|
||||
}
|
||||
|
||||
public decimal Fahrenheit
|
||||
{
|
||||
get { return Math.Round(((decimal) (this.m_Temp * 9 / 5 + 32)), 2); }
|
||||
}
|
||||
|
||||
public override string ToString()
|
||||
{
|
||||
return this.ToString("G");
|
||||
}
|
||||
|
||||
public string ToString(string format)
|
||||
{
|
||||
// Handle null or empty string.
|
||||
if (String.IsNullOrEmpty(format)) format = "G";
|
||||
// Remove spaces and convert to uppercase.
|
||||
format = format.Trim().ToUpperInvariant();
|
||||
|
||||
// Return the string representation that matches the format specifier.
|
||||
switch (format)
|
||||
{
|
||||
// Convert temperature to Fahrenheit and return string.
|
||||
case "F":
|
||||
return this.Fahrenheit.ToString("N2") + " °F";
|
||||
// Convert temperature to Kelvin and return string.
|
||||
case "K":
|
||||
return this.Kelvin.ToString("N2") + " K";
|
||||
// return temperature in Celsius.
|
||||
case "G":
|
||||
case "C":
|
||||
return this.Celsius.ToString("N2") + " °C";
|
||||
default:
|
||||
throw new FormatException(String.Format("The '{0}' format string is not supported.", format));
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
Temperature temp1 = new Temperature(0m);
|
||||
Console.WriteLine(temp1.ToString());
|
||||
Console.WriteLine(temp1.ToString("G"));
|
||||
Console.WriteLine(temp1.ToString("C"));
|
||||
Console.WriteLine(temp1.ToString("F"));
|
||||
Console.WriteLine(temp1.ToString("K"));
|
||||
|
||||
Temperature temp2 = new Temperature(-40m);
|
||||
Console.WriteLine(temp2.ToString());
|
||||
Console.WriteLine(temp2.ToString("G"));
|
||||
Console.WriteLine(temp2.ToString("C"));
|
||||
Console.WriteLine(temp2.ToString("F"));
|
||||
Console.WriteLine(temp2.ToString("K"));
|
||||
|
||||
Temperature temp3 = new Temperature(16m);
|
||||
Console.WriteLine(temp3.ToString());
|
||||
Console.WriteLine(temp3.ToString("G"));
|
||||
Console.WriteLine(temp3.ToString("C"));
|
||||
Console.WriteLine(temp3.ToString("F"));
|
||||
Console.WriteLine(temp3.ToString("K"));
|
||||
|
||||
Console.WriteLine(String.Format("The temperature is now {0:F}.", temp3));
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// 0.00 °C
|
||||
// 0.00 °C
|
||||
// 0.00 °C
|
||||
// 32.00 °F
|
||||
// 273.15 K
|
||||
// -40.00 °C
|
||||
// -40.00 °C
|
||||
// -40.00 °C
|
||||
// -40.00 °F
|
||||
// 233.15 K
|
||||
// 16.00 °C
|
||||
// 16.00 °C
|
||||
// 16.00 °C
|
||||
// 60.80 °F
|
||||
// 289.15 K
|
||||
// The temperature is now 16.00 °C.
|
||||
```
|
||||
|
||||
### Custom Format Strings
|
||||
|
||||
In addition to the standard format strings, .NET Core defines custom format strings for both numeric values and date and time values. A custom format string consists of one or more custom format specifiers that define the string representation of a value. For example, the custom date and time format string "yyyy/MM/dd hh:mm:ss.ffff t zzz" converts a date to its string representation in the form "2008/11/15 07:45:00.0000 P -08:00" for the en-US culture. Similarly, the custom format string "0000" converts the integer value 12 to "0012". For a complete list of custom format strings, see [Custom Date and Time Format Strings](customdatetime.md) and [Custom Numeric Format Strings](customnumeric.md).
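
The two conversions mentioned above can be sketched as follows. The sample date, the use of [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) to carry the -08:00 offset, and the use of `CultureInfo.InvariantCulture` instead of en-US are assumptions made here so that the result is reproducible on any machine. Note that the month is written with the uppercase "MM" specifier; lowercase "mm" denotes minutes.

```csharp
using System;
using System.Globalization;

public class CustomFormatExample
{
    // Formats 7:45 PM on November 15, 2008, at a UTC offset of -08:00.
    public static string FormatDate()
    {
        DateTimeOffset value =
            new DateTimeOffset(2008, 11, 15, 19, 45, 0, TimeSpan.FromHours(-8));
        return value.ToString("yyyy/MM/dd hh:mm:ss.ffff t zzz",
                              CultureInfo.InvariantCulture);
    }

    // Pads the integer 12 with leading zeros to four digits.
    public static string FormatNumber()
    {
        int number = 12;
        return number.ToString("0000", CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        Console.WriteLine(FormatDate());
        Console.WriteLine(FormatNumber());
    }
}
// The example displays the following output:
//       2008/11/15 07:45:00.0000 P -08:00
//       0012
```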
|
||||
|
||||
If a format string consists of a single custom format specifier, the format specifier should be preceded by the percent (%) symbol to avoid confusion with a standard format specifier. The following example uses the "M" custom format specifier to display a one-digit or two-digit number of the month of a particular date.
|
||||
|
||||
```csharp
|
||||
DateTime date1 = new DateTime(2009, 9, 8);
|
||||
Console.WriteLine(date1.ToString("%M"));
|
||||
// Displays 9
|
||||
```
|
||||
|
||||
Many standard format strings for date and time values are aliases for custom format strings that are defined by properties of the [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo) object. Custom format strings also offer considerable flexibility in providing application-defined formatting for numeric values or date and time values. You can define your own custom result strings for both numeric values and date and time values by combining multiple custom format specifiers into a single custom format string. The following example defines a custom format string that displays the day of the week in parentheses after the month name, day, and year.
|
||||
|
||||
```csharp
|
||||
string customFormat = "MMMM dd, yyyy (dddd)";
|
||||
DateTime date1 = new DateTime(2009, 8, 28);
|
||||
Console.WriteLine(date1.ToString(customFormat));
|
||||
// The example displays the following output if run on a system
|
||||
// whose language is English:
|
||||
// August 28, 2009 (Friday)
|
||||
```
|
||||
|
||||
The following example defines a custom format string that displays an [Int64](https://docs.microsoft.com/dotnet/core/api/System.Int64) value as a standard, seven-digit U.S. telephone number along with its area code.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
long number = 8009999999;
|
||||
string fmt = "000-000-0000";
|
||||
Console.WriteLine(number.ToString(fmt));
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// 800-999-9999
|
||||
```
|
||||
|
||||
Although standard format strings can generally handle most of the formatting needs for your application-defined types, you may also define custom format specifiers to format your types.
|
||||
|
||||
### Format Strings and .NET Core Types
|
||||
|
||||
All numeric types (that is, the [Byte](https://docs.microsoft.com/dotnet/core/api/System.Byte), [Decimal](https://docs.microsoft.com/dotnet/core/api/System.Decimal), [Double](https://docs.microsoft.com/dotnet/core/api/System.Double), [Int16](https://docs.microsoft.com/dotnet/core/api/System.Int16), [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32), [Int64](https://docs.microsoft.com/dotnet/core/api/System.Int64), [SByte](https://docs.microsoft.com/dotnet/core/api/System.SByte), [Single](https://docs.microsoft.com/dotnet/core/api/System.Single), [UInt16](https://docs.microsoft.com/dotnet/core/api/System.UInt16), [UInt32](https://docs.microsoft.com/dotnet/core/api/System.UInt32), [UInt64](https://docs.microsoft.com/dotnet/core/api/System.UInt64), and [BigInteger](https://docs.microsoft.com/dotnet/core/api/System.Numerics.BigInteger) types), as well as the [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime), [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset), [TimeSpan](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan), [Guid](https://docs.microsoft.com/dotnet/core/api/System.Guid), and all enumeration types, support formatting with format strings. For information on the specific format strings supported by each type, see the following topics:
|
||||
|
||||
Title | Definition
|
||||
----- | ----------
|
||||
[Standard Numeric Format Strings](standardnumeric.md) | Describes standard format strings that create commonly used string representations of numeric values.
|
||||
[Custom Numeric Format Strings](customnumeric.md) | Describes custom format strings that create application-specific formats for numeric values.
|
||||
[Standard Date and Time Format Strings](standarddatetime.md) | Describes standard format strings that create commonly used string representations of [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) values.
|
||||
[Custom Date and Time Format Strings](customdatetime.md) | Describes custom format strings that create application-specific formats for [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) values.
|
||||
[Standard TimeSpan Format Strings](standardtimespan.md) | Describes standard format strings that create commonly used string representations of time intervals.
|
||||
[Custom TimeSpan Format Strings](customtimespan.md) | Describes custom format strings that create application-specific formats for time intervals.
|
||||
[Enumeration Format Strings](enumerationformat.md) | Describes standard format strings that are used to create string representations of enumeration values.
|
||||
[Guid.ToString(String)](https://docs.microsoft.com/dotnet/core/api/System.Guid#System_Guid_ToString_System_String_) | Describes standard format strings for [Guid](https://docs.microsoft.com/dotnet/core/api/System.Guid) values.
|
||||
|
||||
## Culture-Sensitive Formatting with Format Providers and the IFormatProvider Interface
|
||||
|
||||
Although format specifiers let you customize the formatting of objects, producing a meaningful string representation of objects often requires additional formatting information. For example, formatting a number as a currency value by using either the "C" standard format string or a custom format string such as "$ #,#.00" requires, at a minimum, information about the correct currency symbol, group separator, and decimal separator to be available to include in the formatted string. In .NET Core, this additional formatting information is made available through the [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) interface, which is provided as a parameter to one or more overloads of the `ToString` method of numeric types and date and time types. [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) implementations are used in .NET Core to support culture-specific formatting. The following example illustrates how the string representation of an object changes when it is formatted with three [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) objects that represent different cultures.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Globalization;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
decimal value = 1603.42m;
|
||||
Console.WriteLine(value.ToString("C3", new CultureInfo("en-US")));
|
||||
Console.WriteLine(value.ToString("C3", new CultureInfo("fr-FR")));
|
||||
Console.WriteLine(value.ToString("C3", new CultureInfo("de-DE")));
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// $1,603.420
|
||||
// 1 603,420 €
|
||||
// 1.603,420 €
|
||||
```
|
||||
|
||||
The [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) interface includes one method, [GetFormat(Type)](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider#System_IFormatProvider_GetFormat_System_Type_), which has a single parameter that specifies the type of object that provides formatting information. If the method can provide an object of that type, it returns it. Otherwise, it returns a null reference.
|
||||
|
||||
[IFormatProvider.GetFormat](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider#System_IFormatProvider_GetFormat_System_Type_) is a callback method. When you call a `ToString` method overload that includes an [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) parameter, it calls the [GetFormat](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider#System_IFormatProvider_GetFormat_System_Type_) method of that [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) object. The [GetFormat](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider#System_IFormatProvider_GetFormat_System_Type_) method is responsible for returning an object that provides the necessary formatting information, as specified by its *formatType* parameter, to the `ToString` method.
|
||||
|
||||
A number of formatting or string conversion methods include a parameter of type [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider), but in many cases the value of the parameter is ignored when the method is called. The following table lists some of the formatting methods that use the parameter and the type of the [Type](https://docs.microsoft.com/dotnet/core/api/System.Type) object that they pass to the [IFormatProvider.GetFormat](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider#System_IFormatProvider_GetFormat_System_Type_) method.
|
||||
|
||||
Method | Type of *formatType* parameter
|
||||
------ | ------------------------------
|
||||
`ToString` method of numeric types | [System.Globalization.NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo)
|
||||
`ToString` method of date and time types | [System.Globalization.DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo)
|
||||
[String.Format](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Format_System_IFormatProvider_System_String_System_Object_) | [System.ICustomFormatter](https://docs.microsoft.com/dotnet/core/api/System.ICustomFormatter)
|
||||
[StringBuilder.AppendFormat](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder#System_Text_StringBuilder_AppendFormat_System_IFormatProvider_System_String_System_Object_) | [System.ICustomFormatter](https://docs.microsoft.com/dotnet/core/api/System.ICustomFormatter)
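
The callback behavior summarized in the table can be observed with a small sketch like the one below. `RecordingProvider` is a hypothetical class introduced here for illustration: it records each type that a formatting method requests through `GetFormat` and then delegates to the invariant culture. The exact order and number of requests is an implementation detail, so the sketch lists them at run time rather than asserting them.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical provider that records every type requested through GetFormat
// and then delegates to the invariant culture for the actual answer.
public class RecordingProvider : IFormatProvider
{
    public readonly List<Type> Requests = new List<Type>();

    public object GetFormat(Type formatType)
    {
        Requests.Add(formatType);
        return CultureInfo.InvariantCulture.GetFormat(formatType);
    }
}

public class CallbackExample
{
    public static void Main()
    {
        RecordingProvider provider = new RecordingProvider();

        // Numeric ToString asks the provider for a NumberFormatInfo object.
        Console.WriteLine(12.5m.ToString("N1", provider));

        // Date and time ToString asks for a DateTimeFormatInfo object.
        Console.WriteLine(new DateTime(2009, 8, 28).ToString("d", provider));

        // String.Format asks for an ICustomFormatter before formatting the argument.
        Console.WriteLine(String.Format(provider, "{0:N1}", 12.5m));

        // List the types that were requested, in the order they were requested.
        foreach (Type request in provider.Requests)
            Console.WriteLine(request.Name);
    }
}
```

Because the invariant culture returns a null reference when asked for an `ICustomFormatter`, `String.Format` falls back to the argument's own formatting in this sketch.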
|
||||
|
||||
.NET Core provides three classes that implement [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider):
|
||||
|
||||
* [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo), a class that provides formatting information for date and time values for a specific culture. Its [IFormatProvider.GetFormat](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider#System_IFormatProvider_GetFormat_System_Type_) implementation returns an instance of itself.
|
||||
|
||||
* [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo), a class that provides numeric formatting information for a specific culture. Its [IFormatProvider.GetFormat](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider#System_IFormatProvider_GetFormat_System_Type_) implementation returns an instance of itself.
|
||||
|
||||
* [CultureInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo). Its [IFormatProvider.GetFormat](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider#System_IFormatProvider_GetFormat_System_Type_) implementation can return either a [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object to provide numeric formatting information or a [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo) object to provide formatting information for date and time values.
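
The behavior of these `GetFormat` implementations can be verified directly, as in the following sketch. It uses the invariant culture so that it does not depend on which cultures are installed; the class and method names are hypothetical.

```csharp
using System;
using System.Globalization;

public class ProviderExample
{
    // CultureInfo.GetFormat hands back the culture's NumberFormatInfo object.
    public static bool CultureReturnsNumberFormat()
    {
        CultureInfo culture = CultureInfo.InvariantCulture;
        return Object.ReferenceEquals(
            culture.GetFormat(typeof(NumberFormatInfo)), culture.NumberFormat);
    }

    // CultureInfo.GetFormat hands back the culture's DateTimeFormatInfo object.
    public static bool CultureReturnsDateTimeFormat()
    {
        CultureInfo culture = CultureInfo.InvariantCulture;
        return Object.ReferenceEquals(
            culture.GetFormat(typeof(DateTimeFormatInfo)), culture.DateTimeFormat);
    }

    // NumberFormatInfo.GetFormat returns the instance itself.
    public static bool NumberFormatReturnsItself()
    {
        NumberFormatInfo info = NumberFormatInfo.InvariantInfo;
        return Object.ReferenceEquals(info.GetFormat(typeof(NumberFormatInfo)), info);
    }

    // NumberFormatInfo.GetFormat returns null for any other requested type.
    public static bool NumberFormatRejectsOtherTypes()
    {
        NumberFormatInfo info = NumberFormatInfo.InvariantInfo;
        return info.GetFormat(typeof(DateTimeFormatInfo)) == null;
    }

    public static void Main()
    {
        Console.WriteLine(CultureReturnsNumberFormat());
        Console.WriteLine(CultureReturnsDateTimeFormat());
        Console.WriteLine(NumberFormatReturnsItself());
        Console.WriteLine(NumberFormatRejectsOtherTypes());
    }
}
// The example displays the following output:
//       True
//       True
//       True
//       True
```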
|
||||
|
||||
You can also implement your own format provider to replace any one of these classes. However, your implementation’s `GetFormat` method must return an object of the type listed in the previous table if it is to provide formatting information to the `ToString` method.
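
As a minimal sketch of such a replacement provider, the following class returns a customized [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object from its `GetFormat` method. The class name `ApostropheGroupingProvider` and the separator choices are assumptions made for illustration; they are not part of .NET Core.

```csharp
using System;
using System.Globalization;

// Hypothetical provider: supplies numeric formatting methods with a
// NumberFormatInfo that uses custom separators.
public class ApostropheGroupingProvider : IFormatProvider
{
    private readonly NumberFormatInfo _numberFormat;

    public ApostropheGroupingProvider()
    {
        // Clone the invariant settings so the result does not depend on
        // the culture data installed on the machine.
        _numberFormat = (NumberFormatInfo) NumberFormatInfo.InvariantInfo.Clone();
        _numberFormat.NumberGroupSeparator = "'";
        _numberFormat.NumberDecimalSeparator = ",";
    }

    public object GetFormat(Type formatType)
    {
        // Numeric ToString methods request typeof(NumberFormatInfo).
        return formatType == typeof(NumberFormatInfo) ? _numberFormat : null;
    }
}

public class ProviderExample
{
    public static void Main()
    {
        decimal value = 1234567.891m;
        Console.WriteLine(value.ToString("N2", new ApostropheGroupingProvider()));
    }
}
// The example displays the following output:
//       1'234'567,89
```

Because `GetFormat` returns `null` for any other requested type, date and time formatting methods that receive this provider fall back to the conventions of the current culture.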
|
||||
|
||||
### Culture-Sensitive Formatting of Numeric Values
|
||||
|
||||
By default, the formatting of numeric values is culture-sensitive. If you do not specify a culture when you call a formatting method, the formatting conventions of the current thread culture are used. This is illustrated in the following example, which changes the current thread culture four times and then calls the [Decimal.ToString(String)](https://docs.microsoft.com/dotnet/core/api/System.Decimal#System_Decimal_ToString) method. In each case, the result string reflects the formatting conventions of the current culture. This is because the `ToString` and `ToString(String)` methods wrap calls to each numeric type's `ToString(String, IFormatProvider)` method.
|
||||
|
||||
```csharp
using System;
using System.Globalization;
using System.Threading;

public class Example
{
    public static void Main()
    {
        string[] cultureNames = { "en-US", "fr-FR", "es-MX", "de-DE" };
        Decimal value = 1043.17m;

        foreach (var cultureName in cultureNames) {
            // Change the current thread culture.
            Thread.CurrentThread.CurrentCulture = CultureInfo.CreateSpecificCulture(cultureName);
            Console.WriteLine("The current culture is {0}",
                              Thread.CurrentThread.CurrentCulture.Name);
            Console.WriteLine(value.ToString("C2"));
            Console.WriteLine();
        }
    }
}
// The example displays the following output:
//       The current culture is en-US
//       $1,043.17
//
//       The current culture is fr-FR
//       1 043,17 €
//
//       The current culture is es-MX
//       $1,043.17
//
//       The current culture is de-DE
//       1.043,17 €
```
|
||||
|
||||
You can also format a numeric value for a specific culture by calling a `ToString` overload that has a *provider* parameter and passing it either of the following:
|
||||
|
||||
* A [CultureInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo) object that represents the culture whose formatting conventions are to be used. Its [CultureInfo.GetFormat](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo#System_Globalization_CultureInfo_GetFormat_System_Type_) method returns the value of the [CultureInfo.NumberFormat](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo#System_Globalization_CultureInfo_NumberFormat) property, which is the [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object that provides culture-specific formatting information for numeric values.
|
||||
|
||||
* A [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object that defines the culture-specific formatting conventions to be used. Its [GetFormat](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_GetFormat_System_Type_) method returns an instance of itself.
|
||||
|
||||
The following example uses [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) objects that represent the English (United States) and English (Great Britain) cultures and the French and Russian neutral cultures to format a floating-point number.
|
||||
|
||||
```csharp
using System;
using System.Globalization;

public class Example
{
    public static void Main()
    {
        Double value = 1043.62957;
        string[] cultureNames = { "en-US", "en-GB", "ru", "fr" };

        foreach (var name in cultureNames) {
            NumberFormatInfo nfi = CultureInfo.CreateSpecificCulture(name).NumberFormat;
            Console.WriteLine("{0,-6} {1}", name + ":", value.ToString("N3", nfi));
        }
    }
}
// The example displays the following output:
//       en-US: 1,043.630
//       en-GB: 1,043.630
//       ru:    1 043,630
//       fr:    1 043,630
```
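
The two kinds of provider described in the list above produce the same result for the same culture. The following sketch illustrates this with the invariant culture, which is chosen here only so that the output does not depend on the machine's culture data. The class and method names are hypothetical.

```csharp
using System;
using System.Globalization;

public class ProviderKindsExample
{
    // Option 1: pass the CultureInfo; its GetFormat method supplies the NumberFormatInfo.
    public static string ViaCulture(double value)
    {
        return value.ToString("N3", CultureInfo.InvariantCulture);
    }

    // Option 2: pass the NumberFormatInfo object directly.
    public static string ViaNumberFormat(double value)
    {
        return value.ToString("N3", CultureInfo.InvariantCulture.NumberFormat);
    }

    public static void Main()
    {
        double value = 1043.62957;
        Console.WriteLine(ViaCulture(value));
        Console.WriteLine(ViaCulture(value) == ViaNumberFormat(value));
    }
}
// The example displays the following output:
//       1,043.630
//       True
```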
|
||||
|
||||
### Culture-Sensitive Formatting of Date and Time Values
|
||||
|
||||
By default, the formatting of date and time values is culture-sensitive. If you do not specify a culture when you call a formatting method, the formatting conventions of the current thread culture are used. This is illustrated in the following example, which changes the current thread culture four times and then calls the [DateTime.ToString(String)](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_ToString_System_String_) method. In each case, the result string reflects the formatting conventions of the current culture. This is because the [DateTime.ToString()](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_ToString), [DateTime.ToString(String)](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_ToString_System_String_), [DateTimeOffset.ToString()](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#System_DateTimeOffset_ToString), and [DateTimeOffset.ToString(String)](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#System_DateTimeOffset_ToString_System_String_) methods wrap calls to the [DateTime.ToString(String, IFormatProvider)](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_ToString_System_String_System_IFormatProvider_) and [DateTimeOffset.ToString(String, IFormatProvider)](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#System_DateTimeOffset_ToString_System_String_System_IFormatProvider_) methods.
|
||||
|
||||
```csharp
using System;
using System.Globalization;
using System.Threading;

public class Example
{
    public static void Main()
    {
        string[] cultureNames = { "en-US", "fr-FR", "es-MX", "de-DE" };
        DateTime dateToFormat = new DateTime(2012, 5, 28, 11, 30, 0);

        foreach (var cultureName in cultureNames) {
            // Change the current thread culture.
            Thread.CurrentThread.CurrentCulture = CultureInfo.CreateSpecificCulture(cultureName);
            Console.WriteLine("The current culture is {0}",
                              Thread.CurrentThread.CurrentCulture.Name);
            Console.WriteLine(dateToFormat.ToString("F"));
            Console.WriteLine();
        }
    }
}
// The example displays the following output:
//       The current culture is en-US
//       Monday, May 28, 2012 11:30:00 AM
//
//       The current culture is fr-FR
//       lundi 28 mai 2012 11:30:00
//
//       The current culture is es-MX
//       lunes, 28 de mayo de 2012 11:30:00 a.m.
//
//       The current culture is de-DE
//       Montag, 28. Mai 2012 11:30:00
```
|
||||
|
||||
You can also format a date and time value for a specific culture by calling a [DateTime.ToString](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_ToString_System_IFormatProvider_) or [DateTimeOffset.ToString](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#System_DateTimeOffset_ToString_System_IFormatProvider_) overload that has a provider parameter and passing it either of the following:
|
||||
|
||||
* A [CultureInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo) object that represents the culture whose formatting conventions are to be used. Its [CultureInfo.GetFormat](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo#System_Globalization_CultureInfo_GetFormat_System_Type_) method returns the value of the [CultureInfo.DateTimeFormat](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo#System_Globalization_CultureInfo_DateTimeFormat) property, which is the [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo) object that provides culture-specific formatting information for date and time values.
|
||||
|
||||
* A [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo) object that defines the culture-specific formatting conventions to be used. Its [GetFormat](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_GetFormat_System_Type_) method returns an instance of itself.
|
||||
|
||||
The following example uses [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo) objects that represent the English (United States) and English (Great Britain) cultures and the French and Russian neutral cultures to format a date.
|
||||
|
||||
```csharp
using System;
using System.Globalization;

public class Example
{
    public static void Main()
    {
        DateTime dat1 = new DateTime(2012, 5, 28, 11, 30, 0);
        string[] cultureNames = { "en-US", "en-GB", "ru", "fr" };

        foreach (var name in cultureNames) {
            DateTimeFormatInfo dtfi = CultureInfo.CreateSpecificCulture(name).DateTimeFormat;
            Console.WriteLine("{0}: {1}", name, dat1.ToString(dtfi));
        }
    }
}
// The example displays the following output:
//       en-US: 5/28/2012 11:30:00 AM
//       en-GB: 28/05/2012 11:30:00
//       ru: 28.05.2012 11:30:00
//       fr: 28/05/2012 11:30:00
```
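
A [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo) object does not have to come unchanged from a culture. As a sketch under stated assumptions, the following code clones the invariant culture's object, replaces its short date pattern, and passes the result as the provider. The class and method names, and the choice of the `yyyy-MM-dd` pattern, are illustrative only.

```csharp
using System;
using System.Globalization;

public class IsoDateExample
{
    public static string FormatShortDate(DateTime date)
    {
        // Clone returns a writable copy; the invariant object itself is read-only.
        DateTimeFormatInfo dtfi =
            (DateTimeFormatInfo) CultureInfo.InvariantCulture.DateTimeFormat.Clone();
        dtfi.ShortDatePattern = "yyyy-MM-dd";

        // The "d" standard format string resolves to the provider's ShortDatePattern.
        return date.ToString("d", dtfi);
    }

    public static void Main()
    {
        Console.WriteLine(FormatShortDate(new DateTime(2012, 5, 28, 11, 30, 0)));
    }
}
// The example displays the following output:
//       2012-05-28
```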
|
||||
|
||||
## The IFormattable Interface
|
||||
|
||||
Typically, types that overload the `ToString` method with a format string and an [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) parameter also implement the [IFormattable](https://docs.microsoft.com/dotnet/core/api/System.IFormattable) interface. This interface has a single member, [IFormattable.ToString(String, IFormatProvider)](https://docs.microsoft.com/dotnet/core/api/System.IFormattable#System_IFormattable_ToString_System_String_System_IFormatProvider_), that includes both a format string and a format provider as parameters.
|
||||
|
||||
Implementing the [IFormattable](https://docs.microsoft.com/dotnet/core/api/System.IFormattable) interface for your application-defined class offers two advantages:
|
||||
|
||||
* Support for string conversion by the [Convert](https://docs.microsoft.com/dotnet/core/api/System.Convert) class. Calls to the [Convert.ToString(Object)](https://docs.microsoft.com/dotnet/core/api/System.Convert#System_Convert_ToString_System_Object_) and [Convert.ToString(Object, IFormatProvider)](https://docs.microsoft.com/dotnet/core/api/System.Convert#System_Convert_ToString_System_Object_System_IFormatProvider_) methods call your [IFormattable](https://docs.microsoft.com/dotnet/core/api/System.IFormattable) implementation automatically.
|
||||
|
||||
* Support for composite formatting. If a format item that includes a format string is used to format your custom type, the Common Language Runtime automatically calls your [IFormattable](https://docs.microsoft.com/dotnet/core/api/System.IFormattable) implementation and passes it the format string. For more information about composite formatting with methods such as `String.Format` or `Console.WriteLine`, see the [Composite Formatting](#Composite-Formatting) section.
|
||||
|
||||
The following example defines a `Temperature` class that implements the [IFormattable](https://docs.microsoft.com/dotnet/core/api/System.IFormattable) interface. It supports the "C" or "G" format specifiers to display the temperature in Celsius, the "F" format specifier to display the temperature in Fahrenheit, and the "K" format specifier to display the temperature in Kelvin.
|
||||
|
||||
```csharp
using System;
using System.Globalization;

public class Temperature : IFormattable
{
    private decimal m_Temp;

    public Temperature(decimal temperature)
    {
        this.m_Temp = temperature;
    }

    public decimal Celsius
    {
        get { return this.m_Temp; }
    }

    public decimal Kelvin
    {
        get { return this.m_Temp + 273.15m; }
    }

    public decimal Fahrenheit
    {
        get { return Math.Round((decimal) this.m_Temp * 9 / 5 + 32, 2); }
    }

    public override string ToString()
    {
        return this.ToString("G", null);
    }

    public string ToString(string format)
    {
        return this.ToString(format, null);
    }

    public string ToString(string format, IFormatProvider provider)
    {
        // Handle null or empty arguments.
        if (String.IsNullOrEmpty(format)) format = "G";
        // Remove any white space and convert to uppercase.
        format = format.Trim().ToUpperInvariant();

        if (provider == null) provider = NumberFormatInfo.CurrentInfo;

        switch (format)
        {
            // Convert temperature to Fahrenheit and return string.
            case "F":
                return this.Fahrenheit.ToString("N2", provider) + "°F";
            // Convert temperature to Kelvin and return string.
            case "K":
                return this.Kelvin.ToString("N2", provider) + "K";
            // Return temperature in Celsius.
            case "C":
            case "G":
                return this.Celsius.ToString("N2", provider) + "°C";
            default:
                throw new FormatException(String.Format("The '{0}' format string is not supported.", format));
        }
    }
}
```
|
||||
|
||||
The following example instantiates a `Temperature` object. It then calls the [Convert.ToString(Object, IFormatProvider)](https://docs.microsoft.com/dotnet/core/api/System.Convert#System_Convert_ToString_System_Object_System_IFormatProvider_) method and uses several composite format strings to obtain different string representations of the `Temperature` object. Each of these method calls, in turn, calls the [IFormattable](https://docs.microsoft.com/dotnet/core/api/System.IFormattable) implementation of the `Temperature` class.
|
||||
|
||||
```csharp
public class Example
{
    public static void Main()
    {
        Temperature temp1 = new Temperature(22m);
        Console.WriteLine(Convert.ToString(temp1, new CultureInfo("ja-JP")));
        Console.WriteLine("Temperature: {0:K}", temp1);
        Console.WriteLine("Temperature: {0:F}", temp1);
        Console.WriteLine(String.Format(new CultureInfo("fr-FR"), "Temperature: {0:F}", temp1));
    }
}
// The example displays the following output:
//       22.00°C
//       Temperature: 295.15K
//       Temperature: 71.60°F
//       Temperature: 71,60°F
```
|
||||
|
||||
## Composite Formatting
|
||||
|
||||
Some methods, such as `String.Format` and `StringBuilder.AppendFormat`, support composite formatting. A composite format string is a kind of template that returns a single string that incorporates the string representation of zero, one, or more objects. Each object is represented in the composite format string by an indexed format item. The index of the format item corresponds to the position of the object that it represents in the method's parameter list. Indexes are zero-based. For example, in the following call to the `String.Format` method, the first format item, `{0:d}`, is replaced by the string representation of `thatDate`; the second format item, `{1}`, is replaced by the string representation of `item1`; and the third format item, `{2:C2}`, is replaced by the string representation of `item1.Value`.
|
||||
|
||||
```csharp
result = String.Format("On {0:d}, the inventory of {1} was worth {2:C2}.",
                       thatDate, item1, item1.Value);
Console.WriteLine(result);
// The example displays output like the following if run on a system
// whose current culture is en-US:
//       On 5/1/2009, the inventory of WidgetA was worth $107.44.
```
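
The fragment above relies on variables that are defined elsewhere. A self-contained variant might look like the following sketch. The `Item` class and the `Describe` method are hypothetical. The sketch passes the invariant culture and uses the "N2" format string instead of "C2" so that its output does not depend on the current culture or on a currency symbol.

```csharp
using System;
using System.Globalization;

// Hypothetical inventory item; the class and its members are illustrative only.
public class Item
{
    public string Name { get; set; }
    public decimal Value { get; set; }

    // Composite formatting calls ToString() for arguments that are not IFormattable.
    public override string ToString() { return Name; }
}

public class CompositeExample
{
    public static string Describe(DateTime thatDate, Item item1)
    {
        return String.Format(CultureInfo.InvariantCulture,
                             "On {0:d}, the inventory of {1} was worth {2:N2}.",
                             thatDate, item1, item1.Value);
    }

    public static void Main()
    {
        var item1 = new Item { Name = "WidgetA", Value = 107.44m };
        Console.WriteLine(Describe(new DateTime(2009, 5, 1), item1));
    }
}
// The example displays the following output:
//       On 05/01/2009, the inventory of WidgetA was worth 107.44.
```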
|
||||
|
||||
In addition to replacing a format item with the string representation of its corresponding object, format items also let you control the following:
|
||||
|
||||
* The specific way in which an object is represented as a string, if the object implements the [IFormattable](https://docs.microsoft.com/dotnet/core/api/System.IFormattable) interface and supports format strings. You do this by following the format item's index with a : (colon) followed by a valid format string. The previous example did this by formatting a date value with the "d" (short date pattern) format string (e.g., `{0:d}`) and by formatting a numeric value with the "C2" format string (e.g., `{2:C2}`) to represent the number as a currency value with two fractional decimal digits.
|
||||
|
||||
* The width of the field that contains the object's string representation, and the alignment of the string representation in that field. You do this by following the format item's index with a , (comma) followed by the field width. The string is right-aligned in the field if the field width is a positive value, and it is left-aligned if the field width is a negative value. The following example left-aligns date values in a 20-character field, and it right-aligns decimal values with one fractional digit in an 11-character field.
|
||||
|
||||
```csharp
DateTime startDate = new DateTime(2015, 8, 28, 6, 0, 0);
decimal[] temps = { 73.452m, 68.98m, 72.6m, 69.24563m,
                    74.1m, 72.156m, 72.228m };
Console.WriteLine("{0,-20} {1,11}\n", "Date", "Temperature");
for (int ctr = 0; ctr < temps.Length; ctr++)
    Console.WriteLine("{0,-20:g} {1,11:N1}", startDate.AddDays(ctr), temps[ctr]);

// The example displays the following output:
//       Date                 Temperature
//
//       8/28/2015 6:00 AM           73.5
//       8/29/2015 6:00 AM           69.0
//       8/30/2015 6:00 AM           72.6
//       8/31/2015 6:00 AM           69.2
//       9/1/2015 6:00 AM            74.1
//       9/2/2015 6:00 AM            72.2
//       9/3/2015 6:00 AM            72.2
```
|
||||
|
||||
Note that, if both the alignment string component and the format string component are present, the former precedes the latter (for example, `{0,-20:g}`).
|
||||
|
||||
For more information about composite formatting, see [Composite Formatting](compositeformat.md).
|
||||
|
||||
## Custom Formatting with ICustomFormatter
|
||||
|
||||
Two composite formatting methods, [String.Format(IFormatProvider, String, Object[])](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Format_System_IFormatProvider_System_String_System_Object_) and [StringBuilder.AppendFormat(IFormatProvider, String, Object[])](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder#System_Text_StringBuilder_AppendFormat_System_IFormatProvider_System_String_System_Object_), include a format provider parameter that supports custom formatting. When either of these formatting methods is called, it passes a [Type](https://docs.microsoft.com/dotnet/core/api/System.Type) object that represents an [ICustomFormatter](https://docs.microsoft.com/dotnet/core/api/System.ICustomFormatter) interface to the format provider’s `GetFormat` method. The `GetFormat` method is then responsible for returning the [ICustomFormatter](https://docs.microsoft.com/dotnet/core/api/System.ICustomFormatter) implementation that provides custom formatting.
|
||||
|
||||
The [ICustomFormatter](https://docs.microsoft.com/dotnet/core/api/System.ICustomFormatter) interface has a single method, [Format(String, Object, IFormatProvider)](https://docs.microsoft.com/dotnet/core/api/System.ICustomFormatter#System_ICustomFormatter_Format_System_String_System_Object_System_IFormatProvider_), that is called automatically by a composite formatting method, once for each format item in a composite format string. The [Format(String, Object, IFormatProvider)](https://docs.microsoft.com/dotnet/core/api/System.ICustomFormatter#System_ICustomFormatter_Format_System_String_System_Object_System_IFormatProvider_) method has three parameters: a format string, which represents the *formatString* argument in a format item, an object to format, and an [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) object that provides formatting services. Typically, the class that implements [ICustomFormatter](https://docs.microsoft.com/dotnet/core/api/System.ICustomFormatter) also implements [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider), so this last parameter is a reference to the custom formatting class itself. The method returns a custom formatted string representation of the object to be formatted. If the method cannot format the object, it should return a null reference.
|
||||
|
||||
The following example provides an [ICustomFormatter](https://docs.microsoft.com/dotnet/core/api/System.ICustomFormatter) implementation named `ByteByByteFormatter` that displays integer values as a sequence of two-digit hexadecimal values followed by a space.
|
||||
|
||||
```csharp
using System;

public class ByteByByteFormatter : IFormatProvider, ICustomFormatter
{
    public object GetFormat(Type formatType)
    {
        if (formatType == typeof(ICustomFormatter))
            return this;
        else
            return null;
    }

    public string Format(string format, object arg,
                         IFormatProvider formatProvider)
    {
        if (! formatProvider.Equals(this)) return null;

        // Handle only hexadecimal format string.
        // A format item without a format string can supply a null format.
        if (format == null || ! format.StartsWith("X")) return null;

        byte[] bytes;
        string output = null;

        // Handle only integral types.
        if (arg is Byte)
            bytes = BitConverter.GetBytes((Byte) arg);
        else if (arg is Int16)
            bytes = BitConverter.GetBytes((Int16) arg);
        else if (arg is Int32)
            bytes = BitConverter.GetBytes((Int32) arg);
        else if (arg is Int64)
            bytes = BitConverter.GetBytes((Int64) arg);
        else if (arg is SByte)
            bytes = BitConverter.GetBytes((SByte) arg);
        else if (arg is UInt16)
            bytes = BitConverter.GetBytes((UInt16) arg);
        else if (arg is UInt32)
            bytes = BitConverter.GetBytes((UInt32) arg);
        else if (arg is UInt64)
            bytes = BitConverter.GetBytes((UInt64) arg);
        else
            return null;

        // Emit the bytes from most significant to least significant
        // (this order assumes a little-endian byte array).
        for (int ctr = bytes.Length - 1; ctr >= 0; ctr--)
            output += String.Format("{0:X2} ", bytes[ctr]);

        return output.Trim();
    }
}
```
|
||||
|
||||
The following example uses the `ByteByByteFormatter` class to format integer values. Note that the [ICustomFormatter.Format](https://docs.microsoft.com/dotnet/core/api/System.ICustomFormatter#System_ICustomFormatter_Format_System_String_System_Object_System_IFormatProvider_) method is called more than once in the second [String.Format(IFormatProvider, String, Object[])](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Format_System_IFormatProvider_System_String_System_Object_) method call, and that the default [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) provider is used in the third method call because the `ByteByByteFormatter.Format` method does not recognize the "N0" format string and returns a null reference.
|
||||
|
||||
```csharp
public class Example
{
    public static void Main()
    {
        long value = 3210662321;
        byte value1 = 214;
        byte value2 = 19;

        Console.WriteLine(String.Format(new ByteByByteFormatter(), "{0:X}", value));
        // The & operator promotes its Byte operands to Int32, so the result
        // is cast back to Byte before it is formatted.
        Console.WriteLine(String.Format(new ByteByByteFormatter(), "{0:X} And {1:X} = {2:X} ({2:000})",
                                        value1, value2, (byte) (value1 & value2)));
        Console.WriteLine(String.Format(new ByteByByteFormatter(), "{0,10:N0}", value));
    }
}
// The example displays the following output:
//       00 00 00 00 BF 5E D1 B1
//       00 D6 And 00 13 = 00 12 (018)
//       3,210,662,321
```
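
The fallback behavior can also be seen in isolation. The following sketch assumes a hypothetical `BinaryDigitsFormatter` class that recognizes only a "B" format string for [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32) values and returns a null reference for everything else, so the remaining format items are handled by the arguments themselves.

```csharp
using System;

// Hypothetical formatter that understands only a "B" (binary) format string
// for Int32 values.
public class BinaryDigitsFormatter : IFormatProvider, ICustomFormatter
{
    public object GetFormat(Type formatType)
    {
        return formatType == typeof(ICustomFormatter) ? this : null;
    }

    public string Format(string format, object arg, IFormatProvider formatProvider)
    {
        if (format == "B" && arg is int)
            return Convert.ToString((int) arg, 2);

        // Returning null lets the composite formatting method fall back to
        // the argument's own IFormattable or ToString implementation.
        return null;
    }
}

public class FallbackExample
{
    public static void Main()
    {
        Console.WriteLine(String.Format(new BinaryDigitsFormatter(),
                                        "{0:B} | {0:D3} | {1}", 5, "text"));
    }
}
// The example displays the following output:
//       101 | 005 | text
```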
|
||||
|
||||
## Related Topics
|
||||
|
||||
Title | Definition
----- | ----------
[Standard Numeric Format Strings](standardnumeric.md) | Describes standard format strings that create commonly used string representations of numeric values.
[Custom Numeric Format Strings](customnumeric.md) | Describes custom format strings that create application-specific formats for numeric values.
[Standard Date and Time Format Strings](standarddatetime.md) | Describes standard format strings that create commonly used string representations of [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) values.
[Custom Date and Time Format Strings](customdatetime.md) | Describes custom format strings that create application-specific formats for [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) values.
[Standard TimeSpan Format Strings](standardtimespan.md) | Describes standard format strings that create commonly used string representations of time intervals.
[Custom TimeSpan Format Strings](customtimespan.md) | Describes custom format strings that create application-specific formats for time intervals.
[Enumeration Format Strings](enumerationformat.md) | Describes standard format strings that are used to create string representations of enumeration values.
[Composite Formatting](compositeformat.md) | Describes how to embed one or more formatted values in a string. The string can subsequently be displayed on the console or written to a stream.
|
||||
|
||||
## Reference
|
||||
|
||||
[System.IFormattable](https://docs.microsoft.com/dotnet/core/api/System.IFormattable)
|
||||
|
||||
[System.IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider)
|
||||
|
||||
[System.ICustomFormatter](https://docs.microsoft.com/dotnet/core/api/System.ICustomFormatter)
|
||||
|
||||
|
||||
|
||||
|
|
@ -1,162 +0,0 @@
|
|||
---
title: How to: Define and Use Custom Numeric Format Providers
description: How to: Define and Use Custom Numeric Format Providers
keywords: .NET, .NET Core
author: shoag
manager: wpickett
ms.date: 06/20/2016
ms.topic: article
ms.prod: .net-core
ms.technology: .net-core-technologies
ms.devlang: dotnet
ms.assetid: f8b97fda-595a-4626-af24-0f6e25fb3165
---
|
||||
|
||||
# How to: Define and Use Custom Numeric Format Providers
|
||||
|
||||
.NET Core gives you extensive control over the string representation of numeric values. It supports the following features for customizing the format of numeric values:
|
||||
|
||||
* Standard numeric format strings, which provide a predefined set of formats for converting numbers to their string representation. You can use them with any numeric formatting method, such as [Decimal.ToString(String)](https://docs.microsoft.com/dotnet/core/api/System.Decimal#System_Decimal_ToString_System_String_), that has a format parameter.
|
||||
|
||||
* Custom numeric format strings, which provide a set of symbols that can be combined to define custom numeric format specifiers. They can also be used with any numeric formatting method, such as [Decimal.ToString(String)](https://docs.microsoft.com/dotnet/core/api/System.Decimal#System_Decimal_ToString_System_String_), that has a format parameter.
|
||||
|
||||
* Custom [CultureInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo) or [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) objects, which define the symbols and format patterns used in displaying the string representations of numeric values. You can use them with any numeric formatting method, such as [ToString](https://docs.microsoft.com/dotnet/core/api/System.Int32#System_Int32_ToString_System_IFormatProvider_), that has a *provider* parameter. Typically, the *provider* parameter is used to specify culture-specific formatting.
|
||||
|
||||
In some cases (such as when an application must display a formatted account number, an identification number, or a postal code) these three techniques are inappropriate. .NET Core also enables you to define a formatting object that is neither a [CultureInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo) nor a [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object to determine how a numeric value is formatted. This topic provides the step-by-step instructions for implementing such an object, and provides an example that formats telephone numbers.
|
||||
|
||||
## To define a custom format provider
|
||||
|
||||
1. Define a class that implements the [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) and [ICustomFormatter](https://docs.microsoft.com/dotnet/core/api/System.ICustomFormatter) interfaces.
|
||||
|
||||
2. Implement the [IFormatProvider.GetFormat](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider#System_IFormatProvider_GetFormat_System_Type_) method. [GetFormat](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider#System_IFormatProvider_GetFormat_System_Type_) is a callback method that the formatting method (such as the [String.Format(IFormatProvider, String, Object[])](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Format_System_IFormatProvider_System_String_System_Object_) method) invokes to retrieve the object that is actually responsible for performing custom formatting. A typical implementation of [GetFormat](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider#System_IFormatProvider_GetFormat_System_Type_) does the following:
|
||||
|
||||
a. Determines whether the [Type](https://docs.microsoft.com/dotnet/core/api/System.Type) object passed as a method parameter represents an [ICustomFormatter](https://docs.microsoft.com/dotnet/core/api/System.ICustomFormatter) interface.
|
||||
|
||||
b. If the parameter does represent the [ICustomFormatter](https://docs.microsoft.com/dotnet/core/api/System.ICustomFormatter) interface, [GetFormat](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider#System_IFormatProvider_GetFormat_System_Type_) returns an object that implements the [ICustomFormatter](https://docs.microsoft.com/dotnet/core/api/System.ICustomFormatter) interface that is responsible for providing custom formatting. Typically, the custom formatting object returns itself.
|
||||
|
||||
c. If the parameter does not represent the [ICustomFormatter](https://docs.microsoft.com/dotnet/core/api/System.ICustomFormatter) interface, [GetFormat](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider#System_IFormatProvider_GetFormat_System_Type_) returns `null`.
|
||||
|
||||
3. Implement the [ICustomFormatter.Format](https://docs.microsoft.com/dotnet/core/api/System.ICustomFormatter#System_ICustomFormatter_Format_System_String_System_Object_System_IFormatProvider_) method. This method is called by the [String.Format(IFormatProvider, String, Object[])](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Format_System_IFormatProvider_System_String_System_Object_) method and is responsible for returning the string representation of a number. Implementing the method typically involves the following:
|
||||
|
||||
a. Optionally, make sure that the method is legitimately intended to provide formatting services by examining the *provider* parameter. For formatting objects that implement both [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) and [ICustomFormatter](https://docs.microsoft.com/dotnet/core/api/System.ICustomFormatter), this involves testing the *provider* parameter for equality with the current formatting object.
|
||||
|
||||
b. Determine whether the formatting object should support custom format specifiers. (For example, an "N" format specifier might indicate that a U.S. telephone number should be output in NANP format, and an "I" might indicate output in ITU-T Recommendation E.123 format.) If format specifiers are used, the method should handle each supported specifier, which is passed to the method in the *format* parameter. If no specifier is present, the value of the *format* parameter is [String.Empty](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Empty).
|
||||
|
||||
c. Retrieve the numeric value passed to the method as the *arg* parameter. Perform whatever manipulations are required to convert it to its string representation.
|
||||
|
||||
d. Return the string representation of the *arg* parameter.
|
||||
|
||||
## To use a custom numeric formatting object
|
||||
|
||||
1. Create a new instance of the custom formatting class.
|
||||
|
||||
2. Call the [String.Format(IFormatProvider, String, Object[])](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Format_System_IFormatProvider_System_String_System_Object_) formatting method, passing it the custom formatting object, the formatting specifier (or [String.Empty](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Empty), if one is not used), and the numeric value to be formatted.
|
||||
|
||||
## Example
|
||||
|
||||
The following example defines a custom numeric format provider named `TelephoneFormatter` that converts a number that represents a U.S. telephone number to its NANP or E.123 format. The method handles two format specifiers, "N" (which outputs the NANP format) and "I" (which outputs the international E.123 format).
|
||||
|
||||
```csharp
using System;
using System.Globalization;

public class TelephoneFormatter : IFormatProvider, ICustomFormatter
{
    public object GetFormat(Type formatType)
    {
        if (formatType == typeof(ICustomFormatter))
            return this;
        else
            return null;
    }

    public string Format(string format, object arg, IFormatProvider formatProvider)
    {
        // Check whether this is an appropriate callback
        if (! this.Equals(formatProvider))
            return null;

        // Set default format specifier
        if (string.IsNullOrEmpty(format))
            format = "N";

        string numericString = arg.ToString();

        if (format == "N")
        {
            if (numericString.Length <= 4)
                return numericString;
            else if (numericString.Length == 7)
                return numericString.Substring(0, 3) + "-" + numericString.Substring(3, 4);
            else if (numericString.Length == 10)
                return "(" + numericString.Substring(0, 3) + ") " +
                       numericString.Substring(3, 3) + "-" + numericString.Substring(6);
            else
                throw new FormatException(
                          string.Format("'{0}' cannot be used to format {1}.",
                                        format, arg.ToString()));
        }
        else if (format == "I")
        {
            if (numericString.Length < 10)
                throw new FormatException(string.Format("{0} does not have 10 digits.", arg.ToString()));
            else
                numericString = "+1 " + numericString.Substring(0, 3) + " " + numericString.Substring(3, 3) + " " + numericString.Substring(6);
        }
        else
        {
            throw new FormatException(string.Format("The {0} format specifier is invalid.", format));
        }
        return numericString;
    }
}

public class TestTelephoneFormatter
{
    public static void Main()
    {
        Console.WriteLine(String.Format(new TelephoneFormatter(), "{0}", 0));
        Console.WriteLine(String.Format(new TelephoneFormatter(), "{0}", 911));
        Console.WriteLine(String.Format(new TelephoneFormatter(), "{0}", 8490216));
        Console.WriteLine(String.Format(new TelephoneFormatter(), "{0}", 4257884748));

        Console.WriteLine(String.Format(new TelephoneFormatter(), "{0:N}", 0));
        Console.WriteLine(String.Format(new TelephoneFormatter(), "{0:N}", 911));
        Console.WriteLine(String.Format(new TelephoneFormatter(), "{0:N}", 8490216));
        Console.WriteLine(String.Format(new TelephoneFormatter(), "{0:N}", 4257884748));

        Console.WriteLine(String.Format(new TelephoneFormatter(), "{0:I}", 4257884748));
    }
}
```
|
||||
|
||||
The custom numeric format provider can be used only with the [String.Format(IFormatProvider, String, Object[])](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Format_System_IFormatProvider_System_String_System_Object_) method. The other overloads of numeric formatting methods (such as `ToString`) that have a parameter of type [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) all pass the [IFormatProvider.GetFormat](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider#System_IFormatProvider_GetFormat_System_Type_) implementation a [Type](https://docs.microsoft.com/dotnet/core/api/System.Type) object that represents the [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) type. In return, they expect the method to return a [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object. If it does not, the custom numeric format provider is ignored, and the [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object for the current culture is used in its place. In the example, the `TelephoneFormatter.GetFormat` method handles the possibility that it may be inappropriately passed to a numeric formatting method by examining the method parameter and returning `null` if it represents a type other than [ICustomFormatter](https://docs.microsoft.com/dotnet/core/api/System.ICustomFormatter).
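To see which type each formatting method requests from a provider, a small probe can help. The sketch below is illustrative only: `ProbeProvider`, its `Requested` list, and `ProbeDemo` are hypothetical names that are not part of the example above. The numeric call uses the "N0" format string on the assumption that a format which needs culture data is what causes the runtime to consult the provider.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper: records every Type passed to GetFormat.
public class ProbeProvider : IFormatProvider
{
    public List<Type> Requested { get; } = new List<Type>();

    public object GetFormat(Type formatType)
    {
        Requested.Add(formatType);
        // Returning null signals that this provider supplies no formatter,
        // so the caller falls back to its default behavior.
        return null;
    }
}

public static class ProbeDemo
{
    // A numeric formatting method asks the provider for NumberFormatInfo.
    public static List<Type> TypesRequestedByToString()
    {
        var probe = new ProbeProvider();
        long number = 4257884748;
        number.ToString("N0", probe);
        return probe.Requested;
    }

    // Composite formatting asks the provider for ICustomFormatter.
    public static List<Type> TypesRequestedByStringFormat()
    {
        var probe = new ProbeProvider();
        String.Format(probe, "{0}", 4257884748);
        return probe.Requested;
    }
}
```

Because the probe returns `null` in both cases, both calls fall back to default formatting; only the recorded types differ.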
|
||||
|
||||
If a custom numeric format provider supports a set of format specifiers, make sure you provide a default behavior if no format specifier is supplied in the format item used in the [String.Format(IFormatProvider, String, Object[])](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Format_System_IFormatProvider_System_String_System_Object_) method call. In the example, "N" is the default format specifier. This allows for a number to be converted to a formatted telephone number by providing an explicit format specifier. The following example illustrates such a method call.
|
||||
|
||||
```csharp
Console.WriteLine(String.Format(new TelephoneFormatter(), "{0:N}", 4257884748));
```
|
||||
|
||||
But it also allows the conversion to occur if no format specifier is present. The following example illustrates such a method call.
|
||||
|
||||
```csharp
Console.WriteLine(String.Format(new TelephoneFormatter(), "{0}", 4257884748));
```
|
||||
|
||||
If no default format specifier is defined, your implementation of the [ICustomFormatter.Format](https://docs.microsoft.com/dotnet/core/api/System.ICustomFormatter#System_ICustomFormatter_Format_System_String_System_Object_System_IFormatProvider_) method should include code such as the following so that .NET Core can provide formatting that your code does not support.
|
||||
|
||||
```csharp
if (arg is IFormattable)
    s = ((IFormattable)arg).ToString(format, formatProvider);
else if (arg != null)
    s = arg.ToString();
```
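For context, here is a minimal sketch of a complete formatter that uses this fallback. `BinaryDigitsFormatter` and its "B" specifier are hypothetical and serve only to illustrate the pattern: the formatter handles the one case it knows about and defers everything else to the argument's own formatting.

```csharp
using System;

// Hypothetical formatter: "B" renders an Int32 as binary digits;
// every other specifier or argument type is passed through.
public class BinaryDigitsFormatter : IFormatProvider, ICustomFormatter
{
    public object GetFormat(Type formatType)
    {
        return formatType == typeof(ICustomFormatter) ? this : null;
    }

    public string Format(string format, object arg, IFormatProvider formatProvider)
    {
        // The one case this formatter supports.
        if (format == "B" && arg is int)
            return Convert.ToString((int)arg, 2);

        // Fallback: let the argument format itself.
        if (arg is IFormattable)
            return ((IFormattable)arg).ToString(format, formatProvider);
        else if (arg != null)
            return arg.ToString();
        else
            return String.Empty;
    }
}
```

With this sketch, `String.Format(new BinaryDigitsFormatter(), "{0:B} {1:X} {2}", 5, 255, "text")` returns `101 FF text`: the first item is handled by the custom code, the second by `Int32.ToString`, and the third by `String.ToString`.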
|
||||
|
||||
In the case of this example, the method that implements [ICustomFormatter.Format](https://docs.microsoft.com/dotnet/core/api/System.ICustomFormatter#System_ICustomFormatter_Format_System_String_System_Object_System_IFormatProvider_) is intended to serve as a callback method for the [String.Format(IFormatProvider, String, Object[])](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Format_System_IFormatProvider_System_String_System_Object_) method. Therefore, it examines the *formatProvider* parameter to determine whether it contains a reference to the current `TelephoneFormatter` object. However, the method can also be called directly from code. In that case, you can use the *formatProvider* parameter to provide a [CultureInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo) or [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object that supplies culture-specific formatting information.
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
|
@ -1,209 +0,0 @@
|
|||
---
|
||||
title: How to: Display Dates in Non-Gregorian Calendars
|
||||
description: How to: Display Dates in Non-Gregorian Calendars
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 117f853f-8b81-4842-bf80-5a6b15484586
|
||||
---
|
||||
|
||||
# How to: Display Dates in Non-Gregorian Calendars
|
||||
|
||||
The [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) and [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) types use the Gregorian calendar as their default calendar. This means that calling a date and time value's `ToString` method displays the string representation of that date and time in the Gregorian calendar, even if that date and time was created using another calendar. This is illustrated in the following example, which uses two different ways to create a date and time value with the Persian calendar, but still displays those date and time values in the Gregorian calendar when it calls the [ToString](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_ToString) method. This example reflects two commonly used but incorrect techniques for displaying the date in a particular calendar.
|
||||
|
||||
```csharp
PersianCalendar persianCal = new PersianCalendar();

DateTime persianDate = persianCal.ToDateTime(1387, 3, 18, 12, 0, 0, 0);
Console.WriteLine(persianDate.ToString());

persianDate = new DateTime(1387, 3, 18, persianCal);
Console.WriteLine(persianDate.ToString());
// The example displays the following output to the console:
//       6/7/2008 12:00:00 PM
//       6/7/2008 12:00:00 AM
```
|
||||
|
||||
Two different techniques can be used to display the date in a particular calendar. The first requires that the calendar be the default calendar for a particular culture. The second can be used with any calendar.
|
||||
|
||||
## To display the date for a culture's default calendar
|
||||
|
||||
1. Instantiate a calendar object derived from the [Calendar](https://docs.microsoft.com/dotnet/core/api/System.Globalization.Calendar) class that represents the calendar to be used.
|
||||
|
||||
2. Instantiate a [CultureInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo) object representing the culture whose formatting will be used to display the date.
|
||||
|
||||
3. Call the [Array.Exists<T>](https://docs.microsoft.com/dotnet/core/api/System.Array#System_Array_Exists__1___0___System_Predicate___0__) method to determine whether the calendar object is a member of the array returned by the [CultureInfo.OptionalCalendars](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo#System_Globalization_CultureInfo_OptionalCalendars) property. This indicates that the calendar can serve as the default calendar for the [CultureInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo) object. If it is not a member of the array, follow the instructions in the "To display the date in any calendar" section.
|
||||
|
||||
4. Assign the calendar object to the [Calendar](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_Calendar) property of the [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo) object returned by the [CultureInfo.DateTimeFormat](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo#System_Globalization_CultureInfo_DateTimeFormat) property.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> The [CultureInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo) class also has a [Calendar](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo#System_Globalization_CultureInfo_Calendar) property. However, it is read-only and constant; it does not change to reflect the new default calendar assigned to the [DateTimeFormatInfo.Calendar](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_Calendar) property.
|
||||
|
||||
5. Call either the [DateTime.ToString(IFormatProvider)](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_ToString_System_IFormatProvider_) or the [DateTime.ToString(String, IFormatProvider)](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_ToString_System_String_System_IFormatProvider_) method, and pass it the [CultureInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo) object whose default calendar was modified in the previous step.
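The five steps above can be sketched as a single helper. This is a minimal illustration, not the article's own utility: `DefaultCalendarSketch` and `FormatWithCalendar` are hypothetical names, the membership test compares calendar types only (a simplification), and the culture is cloned so that the caller's object is left untouched and read-only cultures can be used.

```csharp
using System;
using System.Globalization;

public static class DefaultCalendarSketch
{
    // Returns the short date string, or null when the calendar is not
    // one of the culture's optional calendars.
    public static string FormatWithCalendar(DateTime date, CultureInfo culture, Calendar calendar)
    {
        // Work on a writable copy so the caller's CultureInfo is not modified.
        CultureInfo copy = (CultureInfo)culture.Clone();

        // Step 3: is the calendar among the culture's optional calendars?
        bool supported = Array.Exists(copy.OptionalCalendars,
                                      c => c.GetType() == calendar.GetType());
        if (!supported)
            return null;

        // Step 4: make it the default calendar used for formatting.
        copy.DateTimeFormat.Calendar = calendar;

        // Step 5: format the date with the modified culture.
        return date.ToString("d", copy);
    }
}
```

A `null` result indicates that the calendar cannot serve as the culture's default, in which case the date has to be assembled from the calendar's own members instead.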
|
||||
|
||||
## To display the date in any calendar
|
||||
|
||||
1. Instantiate a calendar object derived from the [Calendar](https://docs.microsoft.com/dotnet/core/api/System.Globalization.Calendar) class that represents the calendar to be used.
|
||||
|
||||
2. Determine which date and time elements should appear in the string representation of the date and time value.
|
||||
|
||||
3. For each date and time element that you want to display, call the calendar object's `Get…` method. The following methods are available:
|
||||
|
||||
* [GetYear](https://docs.microsoft.com/dotnet/core/api/System.Globalization.Calendar#System_Globalization_Calendar_GetYear_System_DateTime_), to display the year in the appropriate calendar.
|
||||
|
||||
* [GetMonth](https://docs.microsoft.com/dotnet/core/api/System.Globalization.Calendar#System_Globalization_Calendar_GetMonth_System_DateTime_), to display the month in the appropriate calendar.
|
||||
|
||||
* [GetDayOfMonth](https://docs.microsoft.com/dotnet/core/api/System.Globalization.Calendar#System_Globalization_Calendar_GetDayOfMonth_System_DateTime_), to display the number of the day of the month in the appropriate calendar.
|
||||
|
||||
* [GetHour](https://docs.microsoft.com/dotnet/core/api/System.Globalization.Calendar#System_Globalization_Calendar_GetHour_System_DateTime_), to display the hour of the day in the appropriate calendar.
|
||||
|
||||
* [GetMinute](https://docs.microsoft.com/dotnet/core/api/System.Globalization.Calendar#System_Globalization_Calendar_GetMinute_System_DateTime_), to display the minutes in the hour in the appropriate calendar.
|
||||
|
||||
* [GetSecond](https://docs.microsoft.com/dotnet/core/api/System.Globalization.Calendar#System_Globalization_Calendar_GetSecond_System_DateTime_), to display the seconds in the minute in the appropriate calendar.
|
||||
|
||||
* [GetMilliseconds](https://docs.microsoft.com/dotnet/core/api/System.Globalization.Calendar#System_Globalization_Calendar_GetMilliseconds_System_DateTime_), to display the milliseconds in the second in the appropriate calendar.
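As a minimal sketch of these steps, the helper below assembles a year/month/day string from a calendar's `Get…` methods. `AnyCalendarSketch` and `FormatDate` are hypothetical names, and the fixed "/" separator and invariant-culture digits are assumptions made to keep the sketch short.

```csharp
using System;
using System.Globalization;

public static class AnyCalendarSketch
{
    // Builds "yyyy/MM/dd" in the supplied calendar.
    public static string FormatDate(DateTime date, Calendar calendar)
    {
        CultureInfo inv = CultureInfo.InvariantCulture;
        return calendar.GetYear(date).ToString("0000", inv) + "/" +
               calendar.GetMonth(date).ToString("00", inv) + "/" +
               calendar.GetDayOfMonth(date).ToString("00", inv);
    }

    // Usage: create a date from Persian calendar components and display
    // it again in the Persian calendar.
    public static string Demo()
    {
        PersianCalendar persianCal = new PersianCalendar();
        DateTime date = persianCal.ToDateTime(1387, 3, 18, 12, 0, 0, 0);
        return FormatDate(date, persianCal);   // "1387/03/18"
    }
}
```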
|
||||
|
||||
## Example
|
||||
|
||||
The example displays a date using two different calendars. It displays the date after defining the Hijri calendar as the default calendar for the ar-JO culture, and displays the date using the Persian calendar, which is not supported as an optional calendar by the fa-IR culture.
|
||||
|
||||
```csharp
using System;
using System.Globalization;

public class CalendarDates
{
    public static void Main()
    {
        HijriCalendar hijriCal = new HijriCalendar();
        CalendarUtility hijriUtil = new CalendarUtility(hijriCal);
        DateTime dateValue1 = new DateTime(1429, 6, 29, hijriCal);
        DateTimeOffset dateValue2 = new DateTimeOffset(dateValue1,
                                    TimeZoneInfo.Local.GetUtcOffset(dateValue1));
        CultureInfo jc = CultureInfo.CreateSpecificCulture("ar-JO");

        // Display the date using the Gregorian calendar.
        Console.WriteLine("Using the system default culture: {0}",
                          dateValue1.ToString("d"));
        // Display the date using the ar-JO culture's original default calendar.
        Console.WriteLine("Using the ar-JO culture's original default calendar: {0}",
                          dateValue1.ToString("d", jc));
        // Display the date using the Hijri calendar.
        Console.WriteLine("Using the ar-JO culture with Hijri as the default calendar:");
        // Display a Date value.
        Console.WriteLine(hijriUtil.DisplayDate(dateValue1, jc));
        // Display a DateTimeOffset value.
        Console.WriteLine(hijriUtil.DisplayDate(dateValue2, jc));

        Console.WriteLine();

        PersianCalendar persianCal = new PersianCalendar();
        CalendarUtility persianUtil = new CalendarUtility(persianCal);
        CultureInfo ic = CultureInfo.CreateSpecificCulture("fa-IR");

        // Display the date using the fa-IR culture's default calendar.
        Console.WriteLine("Using the fa-IR culture's default calendar: {0}",
                          dateValue1.ToString("d", ic));
        // Display a Date value.
        Console.WriteLine(persianUtil.DisplayDate(dateValue1, ic));
        // Display a DateTimeOffset value.
        Console.WriteLine(persianUtil.DisplayDate(dateValue2, ic));
    }
}

public class CalendarUtility
{
    private Calendar thisCalendar;
    private CultureInfo targetCulture;

    public CalendarUtility(Calendar cal)
    {
        this.thisCalendar = cal;
    }

    private bool CalendarExists(CultureInfo culture)
    {
        this.targetCulture = culture;
        return Array.Exists(this.targetCulture.OptionalCalendars,
                            this.HasSameName);
    }

    private bool HasSameName(Calendar cal)
    {
        if (cal.ToString() == thisCalendar.ToString())
            return true;
        else
            return false;
    }

    public string DisplayDate(DateTime dateToDisplay, CultureInfo culture)
    {
        DateTimeOffset displayOffsetDate = dateToDisplay;
        return DisplayDate(displayOffsetDate, culture);
    }

    public string DisplayDate(DateTimeOffset dateToDisplay,
                              CultureInfo culture)
    {
        string specifier = "yyyy/MM/dd";

        if (this.CalendarExists(culture))
        {
            Console.WriteLine("Displaying date in supported {0} calendar...",
                              this.thisCalendar.GetType().Name);
            culture.DateTimeFormat.Calendar = this.thisCalendar;
            return dateToDisplay.ToString(specifier, culture);
        }
        else
        {
            Console.WriteLine("Displaying date in unsupported {0} calendar...",
                              thisCalendar.GetType().Name);

            string separator = targetCulture.DateTimeFormat.DateSeparator;

            return thisCalendar.GetYear(dateToDisplay.DateTime).ToString("0000") +
                   separator +
                   thisCalendar.GetMonth(dateToDisplay.DateTime).ToString("00") +
                   separator +
                   thisCalendar.GetDayOfMonth(dateToDisplay.DateTime).ToString("00");
        }
    }
}
// The example displays the following output to the console:
//       Using the system default culture: 7/3/2008
//       Using the ar-JO culture's original default calendar: 03/07/2008
//       Using the ar-JO culture with Hijri as the default calendar:
//       Displaying date in supported HijriCalendar calendar...
//       1429/06/29
//       Displaying date in supported HijriCalendar calendar...
//       1429/06/29
//
//       Using the fa-IR culture's default calendar: 7/3/2008
//       Displaying date in unsupported PersianCalendar calendar...
//       1387/04/13
//       Displaying date in unsupported PersianCalendar calendar...
//       1387/04/13
```
|
||||
|
||||
Each [CultureInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo) object can support one or more calendars, which are indicated by the [OptionalCalendars](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo#System_Globalization_CultureInfo_OptionalCalendars) property. One of these is designated as the culture's default calendar and is returned by the read-only [CultureInfo.Calendar](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo#System_Globalization_CultureInfo_Calendar) property. Another of the optional calendars can be designated as the default by assigning a [Calendar](https://docs.microsoft.com/dotnet/core/api/System.Globalization.Calendar) object that represents that calendar to the [DateTimeFormatInfo.Calendar](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_Calendar) property returned by the [CultureInfo.DateTimeFormat](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo#System_Globalization_CultureInfo_DateTimeFormat) property. However, some calendars, such as the Persian calendar represented by the [PersianCalendar](https://docs.microsoft.com/dotnet/core/api/System.Globalization.PersianCalendar) class, do not serve as optional calendars for any culture.
|
||||
|
||||
The example defines a reusable calendar utility class, `CalendarUtility`, to handle many of the details of generating the string representation of a date using a particular calendar. The `CalendarUtility` class has the following members:
|
||||
|
||||
* A parameterized constructor whose single parameter is a [Calendar](https://docs.microsoft.com/dotnet/core/api/System.Globalization.Calendar) object in which a date is to be represented. This is assigned to a private field of the class.
|
||||
|
||||
* `CalendarExists`, a private method that returns a Boolean value indicating whether the calendar represented by the `CalendarUtility` object is supported by the [CultureInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo) object that is passed to the method as a parameter. The method wraps a call to the [Array.Exists<T>](https://docs.microsoft.com/dotnet/core/api/System.Array#System_Array_Exists__1___0___System_Predicate___0__) method, to which it passes the [CultureInfo.OptionalCalendars](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo#System_Globalization_CultureInfo_OptionalCalendars) array.
|
||||
|
||||
* `HasSameName`, a private method assigned to the [Predicate<T>](https://docs.microsoft.com/dotnet/core/api/System.Predicate%601) delegate that is passed as a parameter to the [Array.Exists<T>](https://docs.microsoft.com/dotnet/core/api/System.Array#System_Array_Exists__1___0___System_Predicate___0__) method. Each member of the array is passed to the method until the method returns `true`. The method determines whether the name of an optional calendar is the same as the calendar represented by the `CalendarUtility` object.
|
||||
|
||||
* `DisplayDate`, an overloaded public method that is passed two parameters: either a [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) or [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) value to express in the calendar represented by the `CalendarUtility` object; and the culture whose formatting rules are to be used. Its behavior in returning the string representation of a date depends on whether the target calendar is supported by the culture whose formatting rules are to be used.
|
||||
|
||||
Regardless of the calendar used to create a [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) or [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) value in this example, that value is typically expressed as a Gregorian date. This is because the [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) and [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) types do not preserve any calendar information. Internally, they are represented as the number of ticks that have elapsed since midnight of January 1, 0001. The interpretation of that number depends on the calendar. For most cultures, the default calendar is the Gregorian calendar.
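A short sketch can make this concrete. The class and method names below are hypothetical; the point is that two values built from the same Persian calendar components hold the same tick count, and that the `Year` property then interprets those ticks in the Gregorian calendar.

```csharp
using System;
using System.Globalization;

public static class TicksSketch
{
    // Both construction techniques yield the same instant (same ticks).
    public static bool SameInstant()
    {
        PersianCalendar persianCal = new PersianCalendar();
        DateTime viaCalendar = persianCal.ToDateTime(1387, 3, 18, 0, 0, 0, 0);
        DateTime viaConstructor = new DateTime(1387, 3, 18, persianCal);
        return viaCalendar.Ticks == viaConstructor.Ticks;
    }

    // DateTime.Year interprets the ticks in the Gregorian calendar.
    public static int GregorianYear()
    {
        return new DateTime(1387, 3, 18, new PersianCalendar()).Year;
    }

    // The calendar object is needed to recover the Persian year.
    public static int PersianYear()
    {
        PersianCalendar persianCal = new PersianCalendar();
        return persianCal.GetYear(new DateTime(1387, 3, 18, persianCal));
    }
}
```

Here `GregorianYear()` returns 2008 while `PersianYear()` returns 1387, even though both read the same stored value.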
|
||||
|
||||
|
||||
|
|
@ -1,116 +0,0 @@
|
|||
---
|
||||
title: How to: Display Milliseconds in Date and Time Values
|
||||
description: How to: Display Milliseconds in Date and Time Values
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: a9ccd37b-9d8a-416a-89f8-0edaa951e50a
|
||||
---
|
||||
|
||||
# How to: Display Milliseconds in Date and Time Values
|
||||
|
||||
The default date and time formatting methods, such as [DateTime.ToString()](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_ToString), include the hours, minutes, and seconds of a time value but exclude its milliseconds component. This topic shows how to include a date and time's millisecond component in formatted date and time strings.
|
||||
|
||||
## To display the millisecond component of a DateTime value
|
||||
|
||||
1. If you are working with the string representation of a date, convert it to a [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) or a [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) value by using the static [DateTime.Parse(String)](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_Parse_System_String_) or [DateTimeOffset.Parse(String)](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#System_DateTimeOffset_Parse_System_String_) method.
|
||||
|
||||
2. To extract the string representation of a time's millisecond component, call the date and time value's [DateTime.ToString(String)](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_ToString_System_String_) or [DateTimeOffset.ToString](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#System_DateTimeOffset_ToString_System_String_) method, and pass the `fff` or `FFF` custom format pattern either alone or with other custom format specifiers as the format parameter.
|
||||
|
||||
## Example
|
||||
|
||||
The example displays the millisecond component of a [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) and a [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) value to the console, both alone and included in a longer date and time string.
|
||||
|
||||
```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public class MillisecondDisplay
{
    public static void Main()
    {
        string dateString = "7/16/2008 8:32:45.126 AM";

        try
        {
            DateTime dateValue = DateTime.Parse(dateString);
            DateTimeOffset dateOffsetValue = DateTimeOffset.Parse(dateString);

            // Display Millisecond component alone.
            Console.WriteLine("Millisecond component only: {0}",
                              dateValue.ToString("fff"));
            Console.WriteLine("Millisecond component only: {0}",
                              dateOffsetValue.ToString("fff"));

            // Display Millisecond component with full date and time.
            Console.WriteLine("Date and Time with Milliseconds: {0}",
                              dateValue.ToString("MM/dd/yyyy hh:mm:ss.fff tt"));
            Console.WriteLine("Date and Time with Milliseconds: {0}",
                              dateOffsetValue.ToString("MM/dd/yyyy hh:mm:ss.fff tt"));

            // Append millisecond pattern to current culture's full date time pattern
            string fullPattern = DateTimeFormatInfo.CurrentInfo.FullDateTimePattern;
            fullPattern = Regex.Replace(fullPattern, "(:ss|:s)", "$1.fff");

            // Display Millisecond component with modified full date and time pattern.
            Console.WriteLine("Modified full date time pattern: {0}",
                              dateValue.ToString(fullPattern));
            Console.WriteLine("Modified full date time pattern: {0}",
                              dateOffsetValue.ToString(fullPattern));
        }
        catch (FormatException)
        {
            Console.WriteLine("Unable to convert {0} to a date.", dateString);
        }
    }
}
// The example displays the following output if the current culture is en-US:
//    Millisecond component only: 126
//    Millisecond component only: 126
//    Date and Time with Milliseconds: 07/16/2008 08:32:45.126 AM
//    Date and Time with Milliseconds: 07/16/2008 08:32:45.126 AM
//    Modified full date time pattern: Wednesday, July 16, 2008 8:32:45.126 AM
//    Modified full date time pattern: Wednesday, July 16, 2008 8:32:45.126 AM
```
|
||||
|
||||
The `fff` format pattern includes any trailing zeros in the millisecond value. The `FFF` format pattern suppresses them. The difference is illustrated in the following example.
|
||||
|
||||
```csharp
DateTime dateValue = new DateTime(2008, 7, 16, 8, 32, 45, 180);
Console.WriteLine(dateValue.ToString("fff"));
Console.WriteLine(dateValue.ToString("FFF"));
// The example displays the following output to the console:
//    180
//    18
```
|
||||
|
||||
A problem with defining a complete custom format specifier that includes the millisecond component of a date and time is that it defines a hard-coded format that may not correspond to the arrangement of time elements in the application's current culture. A better alternative is to retrieve one of the date and time display patterns defined by the current culture's [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo) object and modify it to include milliseconds. The example also illustrates this approach. It retrieves the current culture's full date and time pattern from the [DateTimeFormatInfo.FullDateTimePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_FullDateTimePattern) property, and then inserts the custom pattern `.fff` after its seconds pattern. Note that the example uses a regular expression to perform this operation in a single method call.
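The same technique can be applied to an explicitly chosen culture rather than the current one. The following is a minimal sketch under that assumption; `MillisecondPatternSketch` and `FormatWithMilliseconds` are hypothetical names. Like the example, it assumes the seconds pattern is preceded by a colon, so a culture whose pattern uses a different time separator is returned unchanged.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class MillisecondPatternSketch
{
    // Formats a value with the culture's full date and time pattern,
    // extended to show milliseconds after the seconds.
    public static string FormatWithMilliseconds(DateTime value, CultureInfo culture)
    {
        string pattern = culture.DateTimeFormat.FullDateTimePattern;
        // Insert ".fff" after ":ss" (or ":s") in the pattern.
        pattern = Regex.Replace(pattern, "(:ss|:s)", "$1.fff");
        return value.ToString(pattern, culture);
    }
}
```

For the invariant culture, whose full pattern is `dddd, dd MMMM yyyy HH:mm:ss`, formatting `new DateTime(2008, 7, 16, 8, 32, 45, 126)` this way yields `Wednesday, 16 July 2008 08:32:45.126`.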
|
||||
|
||||
You can also use a custom format specifier to display a fractional part of seconds other than milliseconds. For example, the `f` or `F` custom format specifier displays tenths of a second, the `ff` or `FF` custom format specifier displays hundredths of a second, and the `ffff` or `FFFF` custom format specifier displays ten thousandths of a second. Fractional parts of a millisecond are truncated instead of rounded in the returned string. These format specifiers are used in the following example.
|
||||
|
||||
```csharp
DateTime dateValue = new DateTime(2008, 7, 16, 8, 32, 45, 180);
Console.WriteLine("{0} seconds", dateValue.ToString("s.f"));
Console.WriteLine("{0} seconds", dateValue.ToString("s.ff"));
Console.WriteLine("{0} seconds", dateValue.ToString("s.ffff"));
// The example displays the following output to the console:
//    45.1 seconds
//    45.18 seconds
//    45.1800 seconds
```
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> It is possible to display very small fractional units of a second, such as ten thousandths of a second or hundred-thousandths of a second. However, these values may not be meaningful. The precision of date and time values depends on the resolution of the system clock.
|
||||
|
||||
## See Also
|
||||
|
||||
[System.Globalization.DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo)
|
||||
|
||||
|
|
@ -1,236 +0,0 @@
|
|||
---
|
||||
title: How to: Extract the Day of the Week from a Specific Date
|
||||
description: How to: Extract the Day of the Week from a Specific Date
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: c562e432-ef15-4c27-b580-c6f888504289
|
||||
---
|
||||
|
||||
# How to: Extract the Day of the Week from a Specific Date
|
||||
|
||||
.NET Core makes it easy to determine the ordinal day of the week for a particular date, and to display the localized weekday name for a particular date. An enumerated value that indicates the day of the week corresponding to a particular date is available from the [DateTime.DayOfWeek](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_DayOfWeek) or [DateTimeOffset.DayOfWeek](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#System_DateTimeOffset_DayOfWeek) property. In contrast, retrieving the weekday name is a formatting operation that can be performed by calling a formatting method, such as a date and time value's `ToString` method or the [String.Format](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Format_System_IFormatProvider_System_String_System_Object_) method. This topic shows how to perform these formatting operations.
|
||||
|
||||
## To extract a number indicating the day of the week from a specific date
|
||||
|
||||
1. If you are working with the string representation of a date, convert it to a [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) or a [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) value by using the static [DateTime.Parse](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_Parse_System_String_) or [DateTimeOffset.Parse](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) method.
|
||||
|
||||
2. Use the [DateTime.DayOfWeek](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_DayOfWeek) or [DateTimeOffset.DayOfWeek](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#System_DateTimeOffset_DayOfWeek) property to retrieve a [DayOfWeek](https://docs.microsoft.com/dotnet/core/api/System.DayOfWeek) value that indicates the day of the week.
|
||||
|
||||
3. If necessary, cast the [DayOfWeek](https://docs.microsoft.com/dotnet/core/api/System.DayOfWeek) value to an integer.
|
||||
|
||||
The following example displays an integer that represents the day of the week of a specific date.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
DateTime dateValue = new DateTime(2008, 6, 11);
|
||||
Console.WriteLine((int) dateValue.DayOfWeek);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// 3
|
||||
```
|
||||
|
||||
## To extract the abbreviated weekday name from a specific date
|
||||
|
||||
1. If you are working with the string representation of a date, convert it to a [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) or a [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) value by using the static [DateTime.Parse](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_Parse_System_String_) or [DateTimeOffset.Parse](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) method.
|
||||
|
||||
2. You can extract the abbreviated weekday name of the current culture or of a specific culture:
|
||||
|
||||
a. To extract the abbreviated weekday name for the current culture, call the date and time value's [DateTime.ToString(String)](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_ToString_System_String_) or [DateTimeOffset.ToString(String)](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#System_DateTimeOffset_ToString_System_String_) instance method, and pass the string "ddd" as the *format* parameter. The following example illustrates the call to the `ToString(String)` method.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
DateTime dateValue = new DateTime(2008, 6, 11);
|
||||
Console.WriteLine(dateValue.ToString("ddd"));
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Wed
|
||||
```
|
||||
|
||||
b. To extract the abbreviated weekday name for a specific culture, call the date and time value’s [DateTime.ToString(String, IFormatProvider)](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_ToString_System_String_System_IFormatProvider_) or [DateTimeOffset.ToString(String, IFormatProvider)](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#System_DateTimeOffset_ToString_System_String_System_IFormatProvider_) instance method. Pass the string "ddd" as the *format* parameter. Pass either a [CultureInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo) or a [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo) object that represents the culture whose weekday name you want to retrieve as the *provider* parameter. The following code illustrates a call to the [ToString(String, IFormatProvider)](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_ToString_System_String_System_IFormatProvider_) method using a [CultureInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo) object that represents the fr-FR culture.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Globalization;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
DateTime dateValue = new DateTime(2008, 6, 11);
|
||||
Console.WriteLine(dateValue.ToString("ddd",
|
||||
new CultureInfo("fr-FR")));
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// mer.
|
||||
```
|
||||
|
||||
## To extract the full weekday name from a specific date
|
||||
|
||||
1. If you are working with the string representation of a date, convert it to a [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) or a [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) value by using the static [DateTime.Parse](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_Parse_System_String_) or [DateTimeOffset.Parse](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) method.
|
||||
|
||||
2. You can extract the full weekday name of the current culture or of a specific culture:
|
||||
|
||||
a. To extract the full weekday name for the current culture, call the date and time value's [DateTime.ToString(String)](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_ToString_System_String_) or [DateTimeOffset.ToString(String)](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#System_DateTimeOffset_ToString_System_String_) instance method, and pass the string "dddd" as the *format* parameter. The following example illustrates the call to the `ToString(String)` method.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
DateTime dateValue = new DateTime(2008, 6, 11);
|
||||
Console.WriteLine(dateValue.ToString("dddd"));
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Wednesday
|
||||
```
|
||||
|
||||
b. To extract the weekday name for a specific culture, call the date and time value’s [DateTime.ToString(String, IFormatProvider)](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_ToString_System_String_System_IFormatProvider_) or [DateTimeOffset.ToString(String, IFormatProvider)](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#System_DateTimeOffset_ToString_System_String_System_IFormatProvider_) instance method. Pass the string "dddd" as the *format* parameter. Pass either a [CultureInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo) or a [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo) object that represents the culture whose weekday name you want to retrieve as the *provider* parameter. The following code illustrates a call to the [ToString(String, IFormatProvider)](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_ToString_System_String_System_IFormatProvider_) method using a [CultureInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo) object that represents the es-ES culture.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Globalization;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
DateTime dateValue = new DateTime(2008, 6, 11);
|
||||
Console.WriteLine(dateValue.ToString("dddd",
|
||||
new CultureInfo("es-ES")));
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// miércoles
|
||||
```
|
||||
|
||||
## Example
|
||||
|
||||
The example illustrates calls to the [DateTime.DayOfWeek](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_DayOfWeek) and [DateTimeOffset.DayOfWeek](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#System_DateTimeOffset_DayOfWeek) properties and the [DateTime.ToString(String)](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_ToString_System_String_) and [DateTimeOffset.ToString(String)](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#System_DateTimeOffset_ToString_System_String_) methods to retrieve the number that represents the day of the week, the abbreviated weekday name, and the full weekday name for a particular date.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Globalization;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string dateString = "6/11/2007";
|
||||
DateTime dateValue;
|
||||
DateTimeOffset dateOffsetValue;
|
||||
|
||||
try
|
||||
{
|
||||
DateTimeFormatInfo dateTimeFormats;
|
||||
// Convert date representation to a date value
|
||||
dateValue = DateTime.Parse(dateString, CultureInfo.InvariantCulture);
|
||||
dateOffsetValue = new DateTimeOffset(dateValue,
|
||||
TimeZoneInfo.Local.GetUtcOffset(dateValue));
|
||||
|
||||
// Convert date representation to a number indicating the day of week
|
||||
Console.WriteLine((int) dateValue.DayOfWeek);
|
||||
Console.WriteLine((int) dateOffsetValue.DayOfWeek);
|
||||
|
||||
// Display abbreviated weekday name using current culture
|
||||
Console.WriteLine(dateValue.ToString("ddd"));
|
||||
Console.WriteLine(dateOffsetValue.ToString("ddd"));
|
||||
|
||||
// Display full weekday name using current culture
|
||||
Console.WriteLine(dateValue.ToString("dddd"));
|
||||
Console.WriteLine(dateOffsetValue.ToString("dddd"));
|
||||
|
||||
// Display abbreviated weekday name for de-DE culture
|
||||
Console.WriteLine(dateValue.ToString("ddd", new CultureInfo("de-DE")));
|
||||
Console.WriteLine(dateOffsetValue.ToString("ddd",
|
||||
new CultureInfo("de-DE")));
|
||||
|
||||
// Display abbreviated weekday name with de-DE DateTimeFormatInfo object
|
||||
dateTimeFormats = new CultureInfo("de-DE").DateTimeFormat;
|
||||
Console.WriteLine(dateValue.ToString("ddd", dateTimeFormats));
|
||||
Console.WriteLine(dateOffsetValue.ToString("ddd", dateTimeFormats));
|
||||
|
||||
// Display abbreviated weekday name for fr-FR culture
|
||||
Console.WriteLine(dateValue.ToString("ddd", new CultureInfo("fr-FR")));
|
||||
Console.WriteLine(dateOffsetValue.ToString("ddd",
|
||||
new CultureInfo("fr-FR")));
|
||||
|
||||
// Display full weekday name with fr-FR DateTimeFormatInfo object
|
||||
dateTimeFormats = new CultureInfo("fr-FR").DateTimeFormat;
|
||||
Console.WriteLine(dateValue.ToString("dddd", dateTimeFormats));
|
||||
Console.WriteLine(dateOffsetValue.ToString("dddd", dateTimeFormats));
|
||||
}
|
||||
catch (FormatException)
|
||||
{
|
||||
Console.WriteLine("Unable to convert {0} to a date.", dateString);
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// 1
|
||||
// 1
|
||||
// Mon
|
||||
// Mon
|
||||
// Monday
|
||||
// Monday
|
||||
// Mo
|
||||
// Mo
|
||||
// Mo
|
||||
// Mo
|
||||
// lun.
|
||||
// lun.
|
||||
// lundi
|
||||
// lundi
|
||||
```
|
||||
|
||||
You can also use the value returned by the [DateTime.DayOfWeek](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_DayOfWeek) property to retrieve the weekday name of a particular date. This requires only a call to the [Enum.ToString](https://docs.microsoft.com/dotnet/core/api/System.Enum#System_Enum_System_IConvertible_ToString_System_IFormatProvider_) method on the [DayOfWeek](https://docs.microsoft.com/dotnet/core/api/System.DayOfWeek) value returned by the property. However, this technique does not produce a localized weekday name for the current culture, as the following example illustrates.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Globalization;
|
||||
using System.Threading;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
// Change current culture to fr-FR
|
||||
CultureInfo originalCulture = Thread.CurrentThread.CurrentCulture;
|
||||
Thread.CurrentThread.CurrentCulture = new CultureInfo("fr-FR");
|
||||
|
||||
DateTime dateValue = new DateTime(2008, 6, 11);
|
||||
// Display the DayOfWeek string representation
|
||||
Console.WriteLine(dateValue.DayOfWeek.ToString());
|
||||
// Restore original current culture
|
||||
Thread.CurrentThread.CurrentCulture = originalCulture;
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Wednesday
|
||||
```
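
If you already hold a [DayOfWeek](https://docs.microsoft.com/dotnet/core/api/System.DayOfWeek) value and still need a localized name, one option is to look the name up through a culture's `DateTimeFormatInfo` object instead of calling `ToString` on the enumeration. The following is a minimal sketch of that idea; the `LocalizedDayName` helper is a hypothetical name, and the fr-FR result assumes that culture data is available on the machine.

```csharp
using System;
using System.Globalization;

public class Example
{
   // Hypothetical helper: returns the full weekday name that the
   // specified culture defines for a DayOfWeek value.
   public static string LocalizedDayName(DayOfWeek day, CultureInfo culture)
   {
      return culture.DateTimeFormat.GetDayName(day);
   }

   public static void Main()
   {
      DateTime dateValue = new DateTime(2008, 6, 11);

      // The invariant culture returns the English name.
      Console.WriteLine(LocalizedDayName(dateValue.DayOfWeek, CultureInfo.InvariantCulture));

      // A specific culture returns its own name (mercredi for fr-FR
      // when full culture data is installed).
      Console.WriteLine(LocalizedDayName(dateValue.DayOfWeek, new CultureInfo("fr-FR")));
   }
}
```

`DateTimeFormatInfo.GetAbbreviatedDayName` can be used the same way when the abbreviated name is needed.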
|
||||
|
||||
|
||||
|
||||
|
||||
|
|
@ -1,31 +0,0 @@
|
|||
---
|
||||
title: Performing Formatting Operations
|
||||
description: Performing Formatting Operations
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 40a9e983-d9b0-4d1d-8aeb-f54c41b37111
|
||||
---
|
||||
|
||||
# Performing Formatting Operations
|
||||
|
||||
The following topics provide step-by-step instructions for performing specific formatting operations.
|
||||
|
||||
* [How to: Pad a Number with Leading Zeros](padnumber.md)
|
||||
|
||||
* [How to: Define and Use Custom Numeric Format Providers](definecustom.md)
|
||||
|
||||
* [How to: Extract the Day of the Week from a Specific Date](extractday.md)
|
||||
|
||||
* [How to: Round-trip Date and Time Values](roundtrip.md)
|
||||
|
||||
* [How to: Display Milliseconds in Date and Time Values](displaymilliseconds.md)
|
||||
|
||||
* [How to: Display Dates in Non-Gregorian Calendars](displaydates.md)
|
||||
|
||||
|
|
@ -1,185 +0,0 @@
|
|||
---
|
||||
title: How to: Pad a Number with Leading Zeros
|
||||
description: How to: Pad a Number with Leading Zeros
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: fb17b3f4-01ec-449f-8f84-dcee820b3ecc
|
||||
---
|
||||
|
||||
# How to: Pad a Number with Leading Zeros
|
||||
|
||||
You can add leading zeros to an integer by using the "D" standard numeric format string with a precision specifier. You can add leading zeros to both integer and floating-point numbers by using a custom numeric format string. This topic shows how to use both methods to pad a number with leading zeros.
|
||||
|
||||
## To pad an integer with leading zeros to a specific length
|
||||
|
||||
1. Determine the minimum number of digits you want the integer value to display. Include any leading zeros in this number.
|
||||
|
||||
2. Determine whether you want to display the integer as a decimal value or a hexadecimal value.
|
||||
|
||||
* To display the integer as a decimal value, call its `ToString(String)` method, and pass the string "D*n*" as the value of the format parameter, where *n* represents the minimum length of the string.
|
||||
|
||||
* To display the integer as a hexadecimal value, call its `ToString(String)` method and pass the string "X*n*" as the value of the format parameter, where *n* represents the minimum length of the string.
|
||||
|
||||
You can also use the format string in a method, such as [Console.WriteLine](https://docs.microsoft.com/dotnet/core/api/System.Console#System_Console_WriteLine) or [String.Format](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Format_System_IFormatProvider_System_String_System_Object_), that uses composite formatting.
|
||||
|
||||
The following example formats several integer values with leading zeros so that the total length of the formatted number is at least eight characters.
|
||||
|
||||
```csharp
|
||||
byte byteValue = 254;
|
||||
short shortValue = 10342;
|
||||
int intValue = 1023983;
|
||||
long lngValue = 6985321;
|
||||
ulong ulngValue = UInt64.MaxValue;
|
||||
|
||||
// Display integer values by calling the ToString method.
|
||||
Console.WriteLine("{0,22} {1,22}", byteValue.ToString("D8"), byteValue.ToString("X8"));
|
||||
Console.WriteLine("{0,22} {1,22}", shortValue.ToString("D8"), shortValue.ToString("X8"));
|
||||
Console.WriteLine("{0,22} {1,22}", intValue.ToString("D8"), intValue.ToString("X8"));
|
||||
Console.WriteLine("{0,22} {1,22}", lngValue.ToString("D8"), lngValue.ToString("X8"));
|
||||
Console.WriteLine("{0,22} {1,22}", ulngValue.ToString("D8"), ulngValue.ToString("X8"));
|
||||
Console.WriteLine();
|
||||
|
||||
// Display the same integer values by using composite formatting.
|
||||
Console.WriteLine("{0,22:D8} {0,22:X8}", byteValue);
|
||||
Console.WriteLine("{0,22:D8} {0,22:X8}", shortValue);
|
||||
Console.WriteLine("{0,22:D8} {0,22:X8}", intValue);
|
||||
Console.WriteLine("{0,22:D8} {0,22:X8}", lngValue);
|
||||
Console.WriteLine("{0,22:D8} {0,22:X8}", ulngValue);
|
||||
// The example displays the following output:
|
||||
// 00000254 000000FE
|
||||
// 00010342 00002866
|
||||
// 01023983 000F9FEF
|
||||
// 06985321 006A9669
|
||||
// 18446744073709551615 FFFFFFFFFFFFFFFF
|
||||
//
|
||||
// 00000254 000000FE
|
||||
// 00010342 00002866
|
||||
// 01023983 000F9FEF
|
||||
// 06985321 006A9669
|
||||
// 18446744073709551615 FFFFFFFFFFFFFFFF
|
||||
```
|
||||
|
||||
## To pad an integer with a specific number of leading zeros
|
||||
|
||||
1. Determine how many leading zeros you want the integer value to display.
|
||||
|
||||
2. Determine whether you want to display the integer as a decimal value or a hexadecimal value. Formatting it as a decimal value requires that you use the "D" standard format specifier; formatting it as a hexadecimal value requires that you use the "X" standard format specifier.
|
||||
|
||||
3. Determine the length of the unpadded numeric string by reading the `Length` property of the string returned by the integer value's `ToString("D")` or `ToString("X")` method.
|
||||
|
||||
4. Add the number of leading zeros that you want to include in the formatted string to the length of the unpadded numeric string. This defines the total length of the padded string.
|
||||
|
||||
5. Call the integer value's `ToString(String)` method, and pass the string "D*n*" for decimal strings and "X*n*" for hexadecimal strings, where *n* represents the total length of the padded string. You can also use the "D*n*" or "X*n*" format string in a method that supports composite formatting.
|
||||
|
||||
The following example pads an integer value with five leading zeros.
|
||||
|
||||
```csharp
|
||||
int value = 160934;
|
||||
int decimalLength = value.ToString("D").Length + 5;
|
||||
int hexLength = value.ToString("X").Length + 5;
|
||||
Console.WriteLine(value.ToString("D" + decimalLength.ToString()));
|
||||
Console.WriteLine(value.ToString("X" + hexLength.ToString()));
|
||||
// The example displays the following output:
|
||||
// 00000160934
|
||||
// 00000274A6
|
||||
```
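
The length-based steps above count every character of the unpadded string, which includes the sign of a negative number. Because the precision specifier in "D*n*" counts digits only, a negative value would receive one extra zero. The following is a minimal sketch of a helper that accounts for this; the `WithLeadingZeros` name is hypothetical, and the sketch assumes the invariant culture, whose negative sign is a single character.

```csharp
using System;
using System.Globalization;

public class Example
{
   // Hypothetical helper: returns the decimal string for value with
   // exactly the requested number of leading zeros.
   public static string WithLeadingZeros(long value, int zeros)
   {
      string unpadded = value.ToString("D", CultureInfo.InvariantCulture);

      // Exclude the sign character from the digit count.
      int digitCount = value < 0 ? unpadded.Length - 1 : unpadded.Length;

      return value.ToString("D" + (digitCount + zeros), CultureInfo.InvariantCulture);
   }

   public static void Main()
   {
      Console.WriteLine(WithLeadingZeros(160934, 5));
      Console.WriteLine(WithLeadingZeros(-42, 5));
   }
}
// The example displays the following output:
//       00000160934
//       -0000042
```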
|
||||
|
||||
## To pad a numeric value with leading zeros to a specific length
|
||||
|
||||
1. Determine how many digits to the left of the decimal you want the string representation of the number to have. Include any leading zeros in this total number of digits.
|
||||
|
||||
2. Define a custom numeric format string that uses the zero placeholder ("0") to represent the minimum number of zeros.
|
||||
|
||||
3. Call the number's `ToString(String)` method and pass it the custom format string. You can also use the custom format string with a method that supports composite formatting.
|
||||
|
||||
The following example formats several numeric values with leading zeros so that the total length of the formatted number is at least eight digits to the left of the decimal.
|
||||
|
||||
```csharp
|
||||
string fmt = "00000000.##";
|
||||
int intValue = 1053240;
|
||||
decimal decValue = 103932.52m;
|
||||
float sngValue = 1549230.10873992f;
|
||||
double dblValue = 9034521202.93217412;
|
||||
|
||||
// Display the numbers using the ToString method.
|
||||
Console.WriteLine(intValue.ToString(fmt));
|
||||
Console.WriteLine(decValue.ToString(fmt));
|
||||
Console.WriteLine(sngValue.ToString(fmt));
|
||||
Console.WriteLine(dblValue.ToString(fmt));
|
||||
Console.WriteLine();
|
||||
|
||||
// Display the numbers using composite formatting.
|
||||
string formatString = " {0,15:" + fmt + "}";
|
||||
Console.WriteLine(formatString, intValue);
|
||||
Console.WriteLine(formatString, decValue);
|
||||
Console.WriteLine(formatString, sngValue);
|
||||
Console.WriteLine(formatString, dblValue);
|
||||
// The example displays the following output:
|
||||
// 01053240
|
||||
// 00103932.52
|
||||
// 01549230
|
||||
// 9034521202.93
|
||||
//
|
||||
// 01053240
|
||||
// 00103932.52
|
||||
// 01549230
|
||||
// 9034521202.93
|
||||
```
|
||||
|
||||
## To pad a numeric value with a specific number of leading zeros
|
||||
|
||||
1. Determine how many leading zeros you want the numeric value to have.
|
||||
|
||||
2. Determine the number of digits to the left of the decimal in the unpadded numeric string. To do this:
|
||||
|
||||
a. Determine whether the string representation of a number includes a decimal point symbol.
|
||||
|
||||
b. If it does include a decimal point symbol, determine the number of characters to the left of the decimal point.
|
||||
|
||||
-or-
|
||||
|
||||
If it does not include a decimal point symbol, determine the string's length.
|
||||
|
||||
3. Create a custom format string that uses the zero placeholder ("0") for each of the leading zeros to appear in the string, and that uses either the zero placeholder or the digit placeholder ("#") to represent each digit in the default string.
|
||||
|
||||
4. Supply the custom format string as a parameter either to the number's `ToString(String)` method or to a method that supports composite formatting.
|
||||
|
||||
The following example pads two [Double](https://docs.microsoft.com/dotnet/core/api/System.Double) values with five leading zeros.
|
||||
|
||||
```csharp
double[] dblValues = { 9034521202.93217412, 9034521202 };
foreach (double dblValue in dblValues)
{
   string decSeparator = System.Globalization.NumberFormatInfo.CurrentInfo.NumberDecimalSeparator;
   string fmt, formatString;

   if (dblValue.ToString().Contains(decSeparator))
   {
      int digits = dblValue.ToString().IndexOf(decSeparator);
      fmt = new String('0', 5) + new String('#', digits) + ".##";
   }
   else
   {
      // Five leading zeros plus one zero placeholder per existing digit.
      fmt = new String('0', 5 + dblValue.ToString().Length);
   }
   formatString = "{0,20:" + fmt + "}";

   Console.WriteLine(dblValue.ToString(fmt));
   Console.WriteLine(formatString, dblValue);
}
// The example displays the following output:
//       000009034521202.93
//         000009034521202.93
//       000009034521202
//            000009034521202
```
|
||||
|
||||
|
||||
|
|
@ -1,112 +0,0 @@
|
|||
---
|
||||
title: How to: Round-trip Date and Time Values
|
||||
description: How to: Round-trip Date and Time Values
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 4eecf836-2bf7-4dfe-a5be-16a62e85d1cb
|
||||
---
|
||||
|
||||
# How to: Round-trip Date and Time Values
|
||||
|
||||
In many applications, a date and time value is intended to unambiguously identify a single point in time. This topic shows how to save and restore a [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) value and a [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) value so that the restored value identifies the same time as the saved value.
|
||||
|
||||
## To round-trip a DateTime value
|
||||
|
||||
1. Convert the [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) value to its string representation by calling the [DateTime.ToString(String)](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_ToString_System_String_) method with the "o" format specifier.
|
||||
|
||||
2. Save the string representation of the [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) value to a file, or pass it across a process, application domain, or machine boundary.
|
||||
|
||||
3. Retrieve the string that represents the [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) value.
|
||||
|
||||
4. Call the [DateTime.Parse(String, IFormatProvider, DateTimeStyles)](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_Parse_System_String_System_IFormatProvider_System_Globalization_DateTimeStyles_) method, and pass [DateTimeStyles.RoundtripKind](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeStyles#System_Globalization_DateTimeStyles_RoundtripKind) as the value of the *styles* parameter.
|
||||
|
||||
The following example illustrates how to round-trip a [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) value.
|
||||
|
||||
```csharp
|
||||
const string fileName = @".\DateFile.txt";
|
||||
|
||||
StreamWriter outFile = new StreamWriter(fileName);
|
||||
|
||||
// Save DateTime value.
|
||||
DateTime dateToSave = DateTime.SpecifyKind(new DateTime(2008, 6, 12, 18, 45, 15),
|
||||
DateTimeKind.Local);
|
||||
string dateString = dateToSave.ToString("o");
|
||||
Console.WriteLine("Converted {0} ({1}) to {2}.",
|
||||
dateToSave.ToString(),
|
||||
dateToSave.Kind.ToString(),
|
||||
dateString);
|
||||
outFile.WriteLine(dateString);
|
||||
Console.WriteLine("Wrote {0} to {1}.", dateString, fileName);
|
||||
outFile.Close();
|
||||
|
||||
// Restore DateTime value.
|
||||
DateTime restoredDate;
|
||||
|
||||
StreamReader inFile = new StreamReader(fileName);
|
||||
dateString = inFile.ReadLine();
|
||||
inFile.Close();
|
||||
restoredDate = DateTime.Parse(dateString, null, DateTimeStyles.RoundtripKind);
|
||||
Console.WriteLine("Read {0} ({2}) from {1}.", restoredDate.ToString(),
|
||||
fileName,
|
||||
restoredDate.Kind.ToString());
|
||||
// The example displays the following output:
|
||||
// Converted 6/12/2008 6:45:15 PM (Local) to 2008-06-12T18:45:15.0000000-05:00.
|
||||
// Wrote 2008-06-12T18:45:15.0000000-05:00 to .\DateFile.txt.
|
||||
// Read 6/12/2008 6:45:15 PM (Local) from .\DateFile.txt.
|
||||
```
|
||||
|
||||
When round-tripping a [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) value, this technique successfully preserves the time for all local and universal times. For example, if a local [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) value is saved on a system in the U.S. Pacific Standard Time zone and is restored on a system in the U.S. Central Standard Time zone, the restored date and time will be two hours later than the original time, which reflects the time difference between the two time zones. However, this technique is not necessarily accurate for unspecified times. All [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) values whose [Kind](https://docs.microsoft.com/dotnet/core/api/System.DateTime) property is [Unspecified](https://docs.microsoft.com/dotnet/core/api/System.DateTimeKind#System_DateTimeKind_Unspecified) are treated as if they are local times. If this is not the case, the [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) will not successfully identify the correct point in time. The workaround for this limitation is to tightly couple a date and time value with its time zone for the save and restore operation.
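
The difference between the kinds is visible in the saved string itself. The following is a minimal sketch, not part of the original example: it shows that the "o" format marks a universal time with `Z` but writes an unspecified time with no time zone information at all, and that parsing with `DateTimeStyles.RoundtripKind` restores the kind that was written. The `RoundTrip` helper is a hypothetical name, and the string stands in for the file used above.

```csharp
using System;
using System.Globalization;

public class Example
{
   // Hypothetical helper: saves a value with the "o" format specifier
   // and restores it, using a string in place of a file.
   public static DateTime RoundTrip(DateTime value)
   {
      string saved = value.ToString("o", CultureInfo.InvariantCulture);
      return DateTime.Parse(saved, CultureInfo.InvariantCulture,
                            DateTimeStyles.RoundtripKind);
   }

   public static void Main()
   {
      DateTime utc = new DateTime(2008, 6, 12, 18, 45, 15, DateTimeKind.Utc);
      DateTime unspecified = new DateTime(2008, 6, 12, 18, 45, 15);

      Console.WriteLine(utc.ToString("o"));
      Console.WriteLine(unspecified.ToString("o"));
      Console.WriteLine(RoundTrip(utc).Kind);
      Console.WriteLine(RoundTrip(unspecified).Kind);
   }
}
// The example displays the following output:
//       2008-06-12T18:45:15.0000000Z
//       2008-06-12T18:45:15.0000000
//       Utc
//       Unspecified
```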
|
||||
|
||||
## To round-trip a DateTimeOffset value
|
||||
|
||||
1. Convert the [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) value to its string representation by calling the [DateTimeOffset.ToString(String)](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#System_DateTimeOffset_ToString_System_String_) method with the "o" format specifier.
|
||||
|
||||
2. Save the string representation of the [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) value to a file, or pass it across a process, application domain, or machine boundary.
|
||||
|
||||
3. Retrieve the string that represents the [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) value.
|
||||
|
||||
4. Call the [DateTimeOffset.Parse(String, IFormatProvider, DateTimeStyles)](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#System_DateTimeOffset_Parse_System_String_System_IFormatProvider_System_Globalization_DateTimeStyles_) method, and pass [DateTimeStyles.RoundtripKind](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeStyles#System_Globalization_DateTimeStyles_RoundtripKind) as the value of the *styles* parameter.
|
||||
|
||||
The following example illustrates how to round-trip a [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) value.
|
||||
|
||||
```csharp
|
||||
const string fileName = @".\DateOff.txt";
|
||||
|
||||
StreamWriter outFile = new StreamWriter(fileName);
|
||||
|
||||
// Save DateTimeOffset value.
|
||||
DateTimeOffset dateToSave = new DateTimeOffset(2008, 6, 12, 18, 45, 15,
|
||||
new TimeSpan(7, 0, 0));
|
||||
string dateString = dateToSave.ToString("o");
|
||||
Console.WriteLine("Converted {0} to {1}.", dateToSave.ToString(),
|
||||
dateString);
|
||||
outFile.WriteLine(dateString);
|
||||
Console.WriteLine("Wrote {0} to {1}.", dateString, fileName);
|
||||
outFile.Close();
|
||||
|
||||
// Restore DateTimeOffset value.
|
||||
DateTimeOffset restoredDateOff;
|
||||
|
||||
StreamReader inFile = new StreamReader(fileName);
|
||||
dateString = inFile.ReadLine();
|
||||
inFile.Close();
|
||||
restoredDateOff = DateTimeOffset.Parse(dateString, null,
|
||||
DateTimeStyles.RoundtripKind);
|
||||
Console.WriteLine("Read {0} from {1}.", restoredDateOff.ToString(),
|
||||
fileName);
|
||||
// The example displays the following output:
|
||||
// Converted 6/12/2008 6:45:15 PM +07:00 to 2008-06-12T18:45:15.0000000+07:00.
|
||||
// Wrote 2008-06-12T18:45:15.0000000+07:00 to .\DateOff.txt.
|
||||
// Read 6/12/2008 6:45:15 PM +07:00 from .\DateOff.txt.
|
||||
```
|
||||
|
||||
This technique always unambiguously identifies a [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) value as a single point in time. The value can then be converted to Coordinated Universal Time (UTC) by calling the [DateTimeOffset.ToUniversalTime](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#System_DateTimeOffset_ToUniversalTime) method, or it can be converted to the time in a particular time zone by calling the [DateTimeOffset.ToOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#System_DateTimeOffset_ToOffset_System_TimeSpan_) or [TimeZoneInfo.ConvertTime(DateTimeOffset, TimeZoneInfo)](https://docs.microsoft.com/dotnet/core/api/System.TimeZoneInfo#System_TimeZoneInfo_ConvertTime_System_DateTime_System_TimeZoneInfo_) method. The major limitation of this technique is that date and time arithmetic, when performed on a [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) value that represents the time in a particular time zone, may not produce accurate results for that time zone. This is because when a [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) value is instantiated, it is disassociated from its time zone. Therefore, that time zone's adjustment rules can no longer be applied when you perform date and time calculations. You can work around this problem by defining a custom type that includes both a date and time value and its accompanying time zone.
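The custom-type workaround mentioned above could look roughly like the following. This is only a sketch: the `ZonedDateTime` name and its `Add` method are hypothetical and not part of .NET; the type simply keeps a `TimeZoneInfo` next to the value so the zone's adjustment rules can be re-applied after arithmetic.

```csharp
using System;

// Hypothetical helper type (not part of .NET): pairs an instant with its time zone.
public struct ZonedDateTime
{
    public DateTimeOffset Value { get; }
    public TimeZoneInfo Zone { get; }

    public ZonedDateTime(DateTimeOffset value, TimeZoneInfo zone)
    {
        Zone = zone;
        // Normalize the value so its offset follows the zone's rules at that instant.
        Value = TimeZoneInfo.ConvertTime(value, zone);
    }

    // Adds elapsed time, then re-applies the zone's adjustment rules to the result.
    public ZonedDateTime Add(TimeSpan elapsed)
    {
        return new ZonedDateTime(Value.Add(elapsed), Zone);
    }
}

public static class ZonedDateTimeExample
{
    public static void Main()
    {
        var start = new ZonedDateTime(
            new DateTimeOffset(2008, 6, 12, 18, 45, 15, new TimeSpan(7, 0, 0)),
            TimeZoneInfo.Local);

        // The offset of the result reflects the local zone's rules 180 days later.
        ZonedDateTime later = start.Add(TimeSpan.FromDays(180));
        Console.WriteLine("{0} -> {1}", start.Value, later.Value);
    }
}
```

The output depends on the local time zone of the machine, so none is shown here.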
|
||||
|
||||
|
||||
|
|
@ -1,526 +0,0 @@
|
|||
---
|
||||
title: Standard Date and Time Format Strings
|
||||
description: Standard Date and Time Format Strings
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 76214bda-5607-48bd-b9ee-b8888bdaf1e7
|
||||
---
|
||||
|
||||
# Standard Date and Time Format Strings
|
||||
|
||||
A standard date and time format string uses a single format specifier to define the text representation of a date and time value. Any date and time format string that contains more than one character, including white space, is interpreted as a custom date and time format string; for more information, see [Custom Date and Time Format Strings](customdatetime.md). A standard or custom format string can be used in two ways:
|
||||
|
||||
* To define the string that results from a formatting operation.
|
||||
|
||||
* To define the text representation of a date and time value that can be converted to a [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime ) or [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset ) value by a parsing operation.
|
||||
|
||||
Standard date and time format strings can be used with both [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime ) and [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset ) values.
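As a minimal sketch of the two uses listed above, the following formats a value with the "d" specifier and then parses the resulting text with the same specifier. The `RoundTripShortDate` helper is a hypothetical name introduced for illustration, and the invariant culture is chosen only to keep the output predictable.

```csharp
using System;
using System.Globalization;

public static class FormatAndParseExample
{
    // Formats a value with "d", then parses the text back using the same specifier.
    public static DateTime RoundTripShortDate(DateTime value, IFormatProvider provider)
    {
        string text = value.ToString("d", provider);
        return DateTime.ParseExact(text, "d", provider);
    }

    public static void Main()
    {
        var original = new DateTime(2009, 6, 15, 13, 45, 30);
        CultureInfo invariant = CultureInfo.InvariantCulture;

        Console.WriteLine(original.ToString("d", invariant));   // Displays 06/15/2009

        // The short date pattern carries no time, so only the date part survives.
        DateTime parsed = RoundTripShortDate(original, invariant);
        Console.WriteLine(parsed == original.Date);              // Displays True

        // The same specifier works with DateTimeOffset values.
        var offsetValue = new DateTimeOffset(original, TimeSpan.FromHours(-7));
        Console.WriteLine(offsetValue.ToString("d", invariant)); // Displays 06/15/2009
    }
}
```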
|
||||
|
||||
The following table describes the standard date and time format specifiers. Unless otherwise noted, a particular standard date and time format specifier produces an identical string representation regardless of whether it is used with a [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime ) or a [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset ) value. See the [Notes](#Notes) section for additional information about using standard date and time format strings.
|
||||
|
||||
Format specifier | Description | Examples
|
||||
---------------- | ----------- | --------
|
||||
"d" | Short date pattern. | `2009-06-15T13:45:30 -> 6/15/2009 (en-US)`; `2009-06-15T13:45:30 -> 15/06/2009 (fr-FR)`; `2009-06-15T13:45:30 -> 2009/06/15 (ja-JP)`
|
||||
"D" | Long date pattern. | `2009-06-15T13:45:30 -> Monday, June 15, 2009 (en-US)`; `2009-06-15T13:45:30 -> 15 июня 2009 г. (ru-RU)`; `2009-06-15T13:45:30 -> Montag, 15. Juni 2009 (de-DE)`
|
||||
"f" | Full date/time pattern (short time). | `2009-06-15T13:45:30 -> Monday, June 15, 2009 1:45 PM (en-US)`; `2009-06-15T13:45:30 -> den 15 juni 2009 13:45 (sv-SE)`; `2009-06-15T13:45:30 -> Δευτέρα, 15 Ιουνίου 2009 1:45 μμ (el-GR)`
|
||||
"F" | Full date/time pattern (long time). | `2009-06-15T13:45:30 -> Monday, June 15, 2009 1:45:30 PM (en-US)`; `2009-06-15T13:45:30 -> den 15 juni 2009 13:45:30 (sv-SE)`; `2009-06-15T13:45:30 -> Δευτέρα, 15 Ιουνίου 2009 1:45:30 μμ (el-GR)`
|
||||
"g" | General date/time pattern (short time). | `2009-06-15T13:45:30 -> 6/15/2009 1:45 PM (en-US)`; `2009-06-15T13:45:30 -> 15/06/2009 13:45 (es-ES)`; `2009-06-15T13:45:30 -> 2009/6/15 13:45 (zh-CN)`
|
||||
"G" | General date/time pattern (long time). | `2009-06-15T13:45:30 -> 6/15/2009 1:45:30 PM (en-US)`; `2009-06-15T13:45:30 -> 15/06/2009 13:45:30 (es-ES)`; `2009-06-15T13:45:30 -> 2009/6/15 13:45:30 (zh-CN)`
|
||||
"M", "m" | Month/day pattern. | `2009-06-15T13:45:30 -> June 15 (en-US)`; `2009-06-15T13:45:30 -> 15. juni (da-DK)`; `2009-06-15T13:45:30 -> 15 Juni (id-ID)`
|
||||
"O", "o" | Round-trip date/time pattern. | [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime ) values: `2009-06-15T13:45:30 (DateTimeKind.Local) --> 2009-06-15T13:45:30.0000000-07:00`; `2009-06-15T13:45:30 (DateTimeKind.Utc) --> 2009-06-15T13:45:30.0000000Z`; `2009-06-15T13:45:30 (DateTimeKind.Unspecified) --> 2009-06-15T13:45:30.0000000`. [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset ) values: `2009-06-15T13:45:30-07:00 --> 2009-06-15T13:45:30.0000000-07:00`
|
||||
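"R", "r" | RFC1123 pattern. | With a [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) value (no conversion to UTC is performed): `2009-06-15T13:45:30 -> Mon, 15 Jun 2009 13:45:30 GMT`. With a [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) value: `2009-06-15T13:45:30-07:00 -> Mon, 15 Jun 2009 20:45:30 GMT`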
|
||||
"s" | Sortable date/time pattern. | `2009-06-15T13:45:30 (DateTimeKind.Local) -> 2009-06-15T13:45:30`; `2009-06-15T13:45:30 (DateTimeKind.Utc) -> 2009-06-15T13:45:30`
|
||||
"t" | Short time pattern. | `2009-06-15T13:45:30 -> 1:45 PM (en-US)`; `2009-06-15T13:45:30 -> 13:45 (hr-HR)`; `2009-06-15T13:45:30 -> 01:45 م (ar-EG)`
|
||||
"T" | Long time pattern. | `2009-06-15T13:45:30 -> 1:45:30 PM (en-US)`; `2009-06-15T13:45:30 -> 13:45:30 (hr-HR)`; `2009-06-15T13:45:30 -> 01:45:30 م (ar-EG)`
|
||||
"u" | Universal sortable date/time pattern. | With a [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime ) value: `2009-06-15T13:45:30 -> 2009-06-15 13:45:30Z`. With a [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset ) value: `2009-06-15T13:45:30 -> 2009-06-15 20:45:30Z`
|
||||
"U" | Universal full date/time pattern. | `2009-06-15T13:45:30 -> Monday, June 15, 2009 8:45:30 PM (en-US)`; `2009-06-15T13:45:30 -> den 15 juni 2009 20:45:30 (sv-SE)`; `2009-06-15T13:45:30 -> Δευτέρα, 15 Ιουνίου 2009 8:45:30 μμ (el-GR)`
|
||||
"Y", "y" | Year month pattern. | `2009-06-15T13:45:30 -> June, 2009 (en-US)`; `2009-06-15T13:45:30 -> juni 2009 (da-DK)`; `2009-06-15T13:45:30 -> Juni 2009 (id-ID)`
|
||||
Any other single character | Unknown specifier. | Throws a run-time [FormatException](https://docs.microsoft.com/dotnet/core/api/System.FormatException ).
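The distinction between single-character standard specifiers, multi-character custom strings, and unknown specifiers can be illustrated with a short sketch. The `IsRejected` helper is a hypothetical name, and "Q" is used only as an example of a character that is not in the table above.

```csharp
using System;
using System.Globalization;

public static class SpecifierKindExample
{
    // Returns true if the format string is rejected with a FormatException.
    public static bool IsRejected(string format)
    {
        try
        {
            new DateTime(2009, 6, 15).ToString(format, CultureInfo.InvariantCulture);
            return false;
        }
        catch (FormatException)
        {
            return true;
        }
    }

    public static void Main()
    {
        var value = new DateTime(2009, 6, 15, 13, 45, 30);
        CultureInfo invariant = CultureInfo.InvariantCulture;

        // One character: treated as a standard format specifier.
        Console.WriteLine(value.ToString("s", invariant));   // Displays 2009-06-15T13:45:30

        // Two characters: treated as a custom format string (day of the month).
        Console.WriteLine(value.ToString("%d", invariant));  // Displays 15

        // A single character that is not a standard specifier throws.
        Console.WriteLine(IsRejected("Q"));                  // Displays True
    }
}
```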
|
||||
|
||||
## How Standard Format Strings Work
|
||||
|
||||
In a formatting operation, a standard format string is simply an alias for a custom format string. The advantage of using an alias to refer to a custom format string is that, although the alias remains invariant, the custom format string itself can vary. This is important because the string representations of date and time values typically vary by culture. For example, the "d" standard format string indicates that a date and time value is to be displayed using a short date pattern. For the invariant culture, this pattern is "MM/dd/yyyy". For the fr-FR culture, it is "dd/MM/yyyy". For the ja-JP culture, it is "yyyy/MM/dd".
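One way to see the alias relationship is to compare the result of the "d" specifier with the result of passing the culture's own short date pattern as a custom format string. The sketch below assumes a culture whose `ShortDatePattern` is longer than one character (true for the invariant culture); `ShortDateMatchesPattern` is a hypothetical helper name.

```csharp
using System;
using System.Globalization;

public static class AliasExample
{
    // Compares "d" with the culture's ShortDatePattern used as a custom format string.
    public static bool ShortDateMatchesPattern(DateTime value, CultureInfo culture)
    {
        string viaAlias = value.ToString("d", culture);
        string viaPattern = value.ToString(culture.DateTimeFormat.ShortDatePattern, culture);
        return viaAlias == viaPattern;
    }

    public static void Main()
    {
        var value = new DateTime(2008, 3, 15);
        CultureInfo invariant = CultureInfo.InvariantCulture;

        Console.WriteLine(invariant.DateTimeFormat.ShortDatePattern);   // Displays MM/dd/yyyy
        Console.WriteLine(ShortDateMatchesPattern(value, invariant));   // Displays True
    }
}
```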
|
||||
|
||||
If a standard format string in a formatting operation maps to a particular culture's custom format string, your application can define the specific culture whose custom format strings are used in one of these ways:
|
||||
|
||||
* You can use the default (or current) culture. The following example displays a date using the current culture's short date format. In this case, the current culture is en-US.
|
||||
|
||||
```csharp
|
||||
// Display using current (en-US) culture's short date format
|
||||
DateTime thisDate = new DateTime(2008, 3, 15);
|
||||
Console.WriteLine(thisDate.ToString("d")); // Displays 3/15/2008
|
||||
```
|
||||
|
||||
* You can pass a [CultureInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo ) object representing the culture whose formatting is to be used to a method that has an [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider ) parameter. The following example displays a date using the short date format of the pt-BR culture.
|
||||
|
||||
```csharp
|
||||
// Display using pt-BR culture's short date format
|
||||
DateTime thisDate = new DateTime(2008, 3, 15);
|
||||
CultureInfo culture = new CultureInfo("pt-BR");
|
||||
Console.WriteLine(thisDate.ToString("d", culture)); // Displays 15/3/2008
|
||||
```
|
||||
|
||||
* You can pass a [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo ) object that provides formatting information to a method that has an [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider ) parameter. The following example displays a date using the short date format from a DateTimeFormatInfo object for the hr-HR culture.
|
||||
|
||||
```csharp
|
||||
// Display using date format information from hr-HR culture
|
||||
DateTime thisDate = new DateTime(2008, 3, 15);
|
||||
DateTimeFormatInfo fmt = (new CultureInfo("hr-HR")).DateTimeFormat;
|
||||
Console.WriteLine(thisDate.ToString("d", fmt)); // Displays 15.3.2008
|
||||
```
|
||||
|
||||
In some cases, the standard format string serves as a convenient abbreviation for a longer custom format string that is invariant. Four standard format strings fall into this category: "O" (or "o"), "R" (or "r"), "s", and "u". These strings correspond to custom format strings defined by the invariant culture. They produce string representations of date and time values that are intended to be identical across cultures. The following table provides information on these four standard date and time format strings.
|
||||
|
||||
Standard format string | Defined by DateTimeFormatInfo.InvariantInfo property | Custom format string
|
||||
---------------------- | ---------------------------------------------------- | --------------------
|
||||
"O" or "o" | None | `yyyy'-'MM'-'dd'T'HH':'mm':'ss'.'fffffffK` for DateTime values; `yyyy'-'MM'-'dd'T'HH':'mm':'ss'.'fffffffzzz` for DateTimeOffset values
|
||||
"R" or "r" | [RFC1123Pattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_RFC1123Pattern) | `ddd, dd MMM yyyy HH':'mm':'ss 'GMT'`
|
||||
"s" | [SortableDateTimePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_SortableDateTimePattern) | `yyyy'-'MM'-'dd'T'HH':'mm':'ss`
|
||||
"u" | [UniversalSortableDateTimePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_UniversalSortableDateTimePattern) | `yyyy'-'MM'-'dd HH':'mm':'ss'Z'`
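The following sketch checks whether a format string yields the same text for several cultures. It assumes that culture data for en-US, fr-FR, and ja-JP is available on the machine; `IsSameAcrossCultures` is a hypothetical helper name. The four invariant specifiers are expected to report the same text for every culture, while "d" is included as a culture-sensitive contrast.

```csharp
using System;
using System.Globalization;

public static class InvariantSpecifierExample
{
    // Returns true when the format string yields the same text for every supplied culture.
    public static bool IsSameAcrossCultures(DateTime value, string format, params CultureInfo[] cultures)
    {
        string first = value.ToString(format, cultures[0]);
        foreach (CultureInfo culture in cultures)
        {
            if (value.ToString(format, culture) != first)
            {
                return false;
            }
        }
        return true;
    }

    public static void Main()
    {
        var value = new DateTime(2009, 6, 15, 13, 45, 30, DateTimeKind.Utc);
        CultureInfo[] cultures =
        {
            new CultureInfo("en-US"), new CultureInfo("fr-FR"), new CultureInfo("ja-JP")
        };

        foreach (string format in new[] { "o", "r", "s", "u", "d" })
        {
            Console.WriteLine("{0}: same across cultures = {1}",
                              format, IsSameAcrossCultures(value, format, cultures));
        }
    }
}
```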
|
||||
|
||||
The following sections describe the standard format specifiers for [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime ) and [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset ) values.
|
||||
|
||||
## The Short Date ("d") Format Specifier
|
||||
|
||||
The "d" standard format specifier represents a custom date and time format string that is defined by a specific culture's [DateTimeFormatInfo.ShortDatePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_ShortDatePattern) property. For example, the custom format string that is returned by the [ShortDatePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_ShortDatePattern) property of the invariant culture is "MM/dd/yyyy".
|
||||
|
||||
The following example uses the "d" format specifier to display a date and time value.
|
||||
|
||||
```csharp
|
||||
DateTime date1 = new DateTime(2008, 4, 10);
|
||||
Console.WriteLine(date1.ToString("d", DateTimeFormatInfo.InvariantInfo));
|
||||
// Displays 04/10/2008
|
||||
Console.WriteLine(date1.ToString("d",
|
||||
CultureInfo.CreateSpecificCulture("en-US")));
|
||||
// Displays 4/10/2008
|
||||
Console.WriteLine(date1.ToString("d",
|
||||
CultureInfo.CreateSpecificCulture("en-NZ")));
|
||||
// Displays 10/04/2008
|
||||
Console.WriteLine(date1.ToString("d",
|
||||
CultureInfo.CreateSpecificCulture("de-DE")));
|
||||
// Displays 10.04.2008
|
||||
```
|
||||
|
||||
## The Long Date ("D") Format Specifier
|
||||
|
||||
The "D" standard format specifier represents a custom date and time format string that is defined by the current [DateTimeFormatInfo.LongDatePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_LongDatePattern) property. For example, the custom format string for the invariant culture is "dddd, dd MMMM yyyy".
|
||||
|
||||
The following table lists the [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo ) object properties that control the formatting of the returned string.
|
||||
|
||||
Property | Description
|
||||
-------- | -----------
|
||||
[LongDatePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_LongDatePattern) | Defines the overall format of the result string.
|
||||
[DayNames](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_DayNames) | Defines the localized day names that can appear in the result string.
|
||||
[MonthNames](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_MonthNames) | Defines the localized month names that can appear in the result string.
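To illustrate how the `LongDatePattern` property drives the "D" specifier, the sketch below clones the invariant format information and replaces its long date pattern. The `FormatLongDate` helper and the replacement pattern are illustrative assumptions, not values taken from any particular culture.

```csharp
using System;
using System.Globalization;

public static class LongDatePatternExample
{
    // Formats a date with "D" after overriding LongDatePattern on a writable clone.
    public static string FormatLongDate(DateTime value, string longDatePattern)
    {
        var fmt = (DateTimeFormatInfo)DateTimeFormatInfo.InvariantInfo.Clone();
        fmt.LongDatePattern = longDatePattern;
        return value.ToString("D", fmt);
    }

    public static void Main()
    {
        var date1 = new DateTime(2008, 4, 10);

        Console.WriteLine(date1.ToString("D", DateTimeFormatInfo.InvariantInfo));
        // Displays Thursday, 10 April 2008

        Console.WriteLine(FormatLongDate(date1, "MMMM d, yyyy (dddd)"));
        // Displays April 10, 2008 (Thursday)
    }
}
```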
|
||||
|
||||
The following example uses the "D" format specifier to display a date and time value.
|
||||
|
||||
```csharp
|
||||
DateTime date1 = new DateTime(2008, 4, 10);
|
||||
Console.WriteLine(date1.ToString("D",
|
||||
CultureInfo.CreateSpecificCulture("en-US")));
|
||||
// Displays Thursday, April 10, 2008
|
||||
Console.WriteLine(date1.ToString("D",
|
||||
CultureInfo.CreateSpecificCulture("pt-BR")));
|
||||
// Displays quinta-feira, 10 de abril de 2008
|
||||
Console.WriteLine(date1.ToString("D",
|
||||
CultureInfo.CreateSpecificCulture("es-MX")));
|
||||
// Displays jueves, 10 de abril de 2008
|
||||
```
|
||||
|
||||
## The Full Date Short Time ("f") Format Specifier
|
||||
|
||||
The "f" standard format specifier represents a combination of the long date ("D") and short time ("t") patterns, separated by a space.
|
||||
|
||||
The result string is affected by the formatting information of a specific [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo) object. The following table lists the [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo) object properties that may control the formatting of the returned string. The custom format specifier returned by the [DateTimeFormatInfo.LongDatePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_LongDatePattern) and [DateTimeFormatInfo.ShortTimePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_ShortTimePattern) properties of some cultures may not make use of all properties.
|
||||
|
||||
Property | Description
|
||||
-------- | -----------
|
||||
[LongDatePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_LongDatePattern) | Defines the format of the date component of the result string.
|
||||
[ShortTimePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_ShortTimePattern) | Defines the format of the time component of the result string.
|
||||
[DayNames](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_DayNames) | Defines the localized day names that can appear in the result string.
|
||||
[MonthNames](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_MonthNames) | Defines the localized month names that can appear in the result string.
|
||||
[AMDesignator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_AMDesignator) | Defines the string that indicates times from midnight to before noon in a 12-hour clock.
|
||||
[PMDesignator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_PMDesignator) | Defines the string that indicates times from noon to before midnight in a 12-hour clock.
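The effect of the time-related properties in this table can be sketched by customizing a clone of the invariant format information. The invariant short time pattern uses a 24-hour clock, so the sketch first switches it to a 12-hour pattern; the designator strings and the `FormatWithDesignators` helper are illustrative assumptions.

```csharp
using System;
using System.Globalization;

public static class DesignatorExample
{
    // Formats a value with "f" using a 12-hour short time pattern and custom designators.
    public static string FormatWithDesignators(DateTime value, string am, string pm)
    {
        var fmt = (DateTimeFormatInfo)DateTimeFormatInfo.InvariantInfo.Clone();
        fmt.ShortTimePattern = "h:mm tt";   // 12-hour clock so that the designator is emitted
        fmt.AMDesignator = am;
        fmt.PMDesignator = pm;
        return value.ToString("f", fmt);
    }

    public static void Main()
    {
        var morning = new DateTime(2008, 4, 10, 6, 30, 0);
        var evening = new DateTime(2008, 4, 10, 18, 30, 0);

        Console.WriteLine(FormatWithDesignators(morning, "a.m.", "p.m."));
        // Displays Thursday, 10 April 2008 6:30 a.m.
        Console.WriteLine(FormatWithDesignators(evening, "a.m.", "p.m."));
        // Displays Thursday, 10 April 2008 6:30 p.m.
    }
}
```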
|
||||
|
||||
The following example uses the "f" format specifier to display a date and time value.
|
||||
|
||||
```csharp
|
||||
DateTime date1 = new DateTime(2008, 4, 10, 6, 30, 0);
|
||||
Console.WriteLine(date1.ToString("f",
|
||||
CultureInfo.CreateSpecificCulture("en-US")));
|
||||
// Displays Thursday, April 10, 2008 6:30 AM
|
||||
Console.WriteLine(date1.ToString("f",
|
||||
CultureInfo.CreateSpecificCulture("fr-FR")));
|
||||
// Displays jeudi 10 avril 2008 06:30
|
||||
```
|
||||
|
||||
## The Full Date Long Time ("F") Format Specifier
|
||||
|
||||
The "F" standard format specifier represents a custom date and time format string that is defined by the current [DateTimeFormatInfo.FullDateTimePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_FullDateTimePattern) property. For example, the custom format string for the invariant culture is "dddd, dd MMMM yyyy HH:mm:ss".
|
||||
|
||||
The following table lists the [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo) object properties that may control the formatting of the returned string. The custom format specifier that is returned by the [FullDateTimePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_FullDateTimePattern) property of some cultures may not make use of all properties.
|
||||
|
||||
Property | Description
|
||||
-------- | -----------
|
||||
[FullDateTimePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_FullDateTimePattern) | Defines the overall format of the result string.
|
||||
[DayNames](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_DayNames) | Defines the localized day names that can appear in the result string.
|
||||
[MonthNames](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_MonthNames) | Defines the localized month names that can appear in the result string.
|
||||
[AMDesignator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_AMDesignator) | Defines the string that indicates times from midnight to before noon in a 12-hour clock.
|
||||
[PMDesignator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_PMDesignator) | Defines the string that indicates times from noon to before midnight in a 12-hour clock.
|
||||
|
||||
The following example uses the "F" format specifier to display a date and time value.
|
||||
|
||||
```csharp
|
||||
DateTime date1 = new DateTime(2008, 4, 10, 6, 30, 0);
|
||||
Console.WriteLine(date1.ToString("F",
|
||||
CultureInfo.CreateSpecificCulture("en-US")));
|
||||
// Displays Thursday, April 10, 2008 6:30:00 AM
|
||||
Console.WriteLine(date1.ToString("F",
|
||||
CultureInfo.CreateSpecificCulture("fr-FR")));
|
||||
// Displays jeudi 10 avril 2008 06:30:00
|
||||
```
|
||||
|
||||
## The General Date Short Time ("g") Format Specifier
|
||||
|
||||
The "g" standard format specifier represents a combination of the short date ("d") and short time ("t") patterns, separated by a space.
|
||||
|
||||
The result string is affected by the formatting information of a specific [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo) object. The following table lists the [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo) object properties that may control the formatting of the returned string. The custom format specifier that is returned by the [DateTimeFormatInfo.ShortDatePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_ShortDatePattern) and [DateTimeFormatInfo.ShortTimePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_ShortTimePattern) properties of some cultures may not make use of all properties.
|
||||
|
||||
Property | Description
|
||||
-------- | -----------
|
||||
[ShortDatePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_ShortDatePattern) | Defines the format of the date component of the result string.
|
||||
[ShortTimePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_ShortTimePattern) | Defines the format of the time component of the result string.
|
||||
[AMDesignator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_AMDesignator) | Defines the string that indicates times from midnight to before noon in a 12-hour clock.
|
||||
[PMDesignator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_PMDesignator) | Defines the string that indicates times from noon to before midnight in a 12-hour clock.
|
||||
|
||||
The following example uses the "g" format specifier to display a date and time value.
|
||||
|
||||
```csharp
|
||||
DateTime date1 = new DateTime(2008, 4, 10, 6, 30, 0);
|
||||
Console.WriteLine(date1.ToString("g",
|
||||
DateTimeFormatInfo.InvariantInfo));
|
||||
// Displays 04/10/2008 06:30
|
||||
Console.WriteLine(date1.ToString("g",
|
||||
CultureInfo.CreateSpecificCulture("en-us")));
|
||||
// Displays 4/10/2008 6:30 AM
|
||||
Console.WriteLine(date1.ToString("g",
|
||||
CultureInfo.CreateSpecificCulture("fr-BE")));
|
||||
// Displays 10/04/2008 6:30
|
||||
```
|
||||
|
||||
## The General Date Long Time ("G") Format Specifier
|
||||
|
||||
The "G" standard format specifier represents a combination of the short date ("d") and long time ("T") patterns, separated by a space.
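The "combination of patterns, separated by a space" description for the general date specifiers can be checked directly. The sketch below compares "g" and "G" with the short date joined to the short and long time, using the invariant culture; `IsShortDatePlus` is a hypothetical helper name.

```csharp
using System;
using System.Globalization;

public static class GeneralPatternExample
{
    // True when the combined specifier equals "d" + " " + the given time specifier.
    public static bool IsShortDatePlus(DateTime value, string combined, string timePart,
                                       IFormatProvider provider)
    {
        string expected = value.ToString("d", provider) + " " + value.ToString(timePart, provider);
        return value.ToString(combined, provider) == expected;
    }

    public static void Main()
    {
        var date1 = new DateTime(2008, 4, 10, 6, 30, 0);
        CultureInfo invariant = CultureInfo.InvariantCulture;

        Console.WriteLine(date1.ToString("g", invariant));              // Displays 04/10/2008 06:30
        Console.WriteLine(IsShortDatePlus(date1, "g", "t", invariant)); // Displays True

        Console.WriteLine(date1.ToString("G", invariant));              // Displays 04/10/2008 06:30:00
        Console.WriteLine(IsShortDatePlus(date1, "G", "T", invariant)); // Displays True
    }
}
```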
|
||||
|
||||
The result string is affected by the formatting information of a specific [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo) object. The following table lists the [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo) object properties that may control the formatting of the returned string. The custom format specifier that is returned by the [DateTimeFormatInfo.ShortDatePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_ShortDatePattern) and [DateTimeFormatInfo.LongTimePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_LongTimePattern) properties of some cultures may not make use of all properties.
|
||||
|
||||
Property | Description
|
||||
-------- | -----------
|
||||
[ShortDatePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_ShortDatePattern) | Defines the format of the date component of the result string.
|
||||
[LongTimePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_LongTimePattern) | Defines the format of the time component of the result string.
|
||||
[AMDesignator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_AMDesignator) | Defines the string that indicates times from midnight to before noon in a 12-hour clock.
|
||||
[PMDesignator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_PMDesignator) | Defines the string that indicates times from noon to before midnight in a 12-hour clock.
|
||||
|
||||
The following example uses the "G" format specifier to display a date and time value.
|
||||
|
||||
```csharp
|
||||
DateTime date1 = new DateTime(2008, 4, 10, 6, 30, 0);
|
||||
Console.WriteLine(date1.ToString("G",
|
||||
DateTimeFormatInfo.InvariantInfo));
|
||||
// Displays 04/10/2008 06:30:00
|
||||
Console.WriteLine(date1.ToString("G",
|
||||
CultureInfo.CreateSpecificCulture("en-us")));
|
||||
// Displays 4/10/2008 6:30:00 AM
|
||||
Console.WriteLine(date1.ToString("G",
|
||||
CultureInfo.CreateSpecificCulture("nl-BE")));
|
||||
// Displays 10/04/2008 6:30:00
|
||||
```
|
||||
|
||||
## The Month ("M", "m") Format Specifier
|
||||
|
||||
The "M" or "m" standard format specifier represents a custom date and time format string that is defined by the current [DateTimeFormatInfo.MonthDayPattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_MonthDayPattern) property. For example, the custom format string for the invariant culture is "MMMM dd".
|
||||
|
||||
The following table lists the [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo ) object properties that control the formatting of the returned string.
|
||||
|
||||
Property | Description
|
||||
-------- | -----------
|
||||
[MonthDayPattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_MonthDayPattern) | Defines the overall format of the result string.
|
||||
[MonthNames](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo#System_Globalization_DateTimeFormatInfo_MonthNames) | Defines the localized month names that can appear in the result string.
|
||||
|
||||
The following example uses the "m" format specifier to display a date and time value.
|
||||
|
||||
```csharp
|
||||
DateTime date1 = new DateTime(2008, 4, 10, 6, 30, 0);
|
||||
Console.WriteLine(date1.ToString("m",
|
||||
CultureInfo.CreateSpecificCulture("en-us")));
|
||||
// Displays April 10
|
||||
Console.WriteLine(date1.ToString("m",
|
||||
CultureInfo.CreateSpecificCulture("ms-MY")));
|
||||
// Displays 10 April
|
||||
```
|
||||
|
||||
## The Round-trip ("O", "o") Format Specifier
|
||||
|
||||
The "O" or "o" standard format specifier represents a custom date and time format string using a pattern that preserves time zone information and emits a result string that complies with ISO 8601. For [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) values, this format specifier is designed to preserve date and time values along with the [DateTime.Kind](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_Kind) property in text. The formatted string can be parsed back by using the [DateTime.Parse(String, IFormatProvider, DateTimeStyles)](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_Parse_System_String_System_IFormatProvider_System_Globalization_DateTimeStyles_) or [DateTime.ParseExact](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_Parse_System_String_System_IFormatProvider_System_Globalization_DateTimeStyles_) method if the *styles* parameter is set to [DateTimeStyles.RoundtripKind](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeStyles#System_Globalization_DateTimeStyles_RoundtripKind).
|
||||
|
||||
The "O" or "o" standard format specifier corresponds to the "yyyy'-'MM'-'dd'T'HH':'mm':'ss'.'fffffffK" custom format string for DateTime values and to the "yyyy'-'MM'-'dd'T'HH':'mm':'ss'.'fffffffzzz" custom format string for [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset ) values. In this string, the pairs of single quotation marks that delimit individual characters, such as the hyphens, the colons, and the letter "T", indicate that the individual character is a literal that cannot be changed. The apostrophes do not appear in the output string.
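As a quick check of this correspondence, the following sketch compares the output of "o" with the output of the two custom format strings quoted above. The class and method names are hypothetical, and the custom strings are formatted with the invariant culture.

```csharp
using System;
using System.Globalization;

public static class RoundTripPatternExample
{
    private const string DateTimePattern = "yyyy'-'MM'-'dd'T'HH':'mm':'ss'.'fffffffK";
    private const string DateTimeOffsetPattern = "yyyy'-'MM'-'dd'T'HH':'mm':'ss'.'fffffffzzz";

    // True when "o" and the equivalent custom format string produce the same text.
    public static bool Matches(DateTime value)
    {
        return value.ToString("o") == value.ToString(DateTimePattern, CultureInfo.InvariantCulture);
    }

    public static bool Matches(DateTimeOffset value)
    {
        return value.ToString("o") == value.ToString(DateTimeOffsetPattern, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        var utc = new DateTime(2009, 6, 15, 13, 45, 30, DateTimeKind.Utc);
        Console.WriteLine(utc.ToString("o"));   // Displays 2009-06-15T13:45:30.0000000Z
        Console.WriteLine(Matches(utc));        // Displays True

        var dto = new DateTimeOffset(2009, 6, 15, 13, 45, 30, TimeSpan.FromHours(-7));
        Console.WriteLine(dto.ToString("o"));   // Displays 2009-06-15T13:45:30.0000000-07:00
        Console.WriteLine(Matches(dto));        // Displays True
    }
}
```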
|
||||
|
||||
The "O" or "o" standard format specifier (and the "yyyy'-'MM'-'dd'T'HH':'mm':'ss'.'fffffffK" custom format string) takes advantage of the three ways that ISO 8601 represents time zone information to preserve the [Kind](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_Kind) property of [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) values:
|
||||
|
||||
* The time zone component of [DateTimeKind.Local](https://docs.microsoft.com/dotnet/core/api/System.DateTimeKind#System_DateTimeKind_Local) date and time values is an offset from UTC (for example, +01:00, -07:00). All DateTimeOffset values are also represented in this format.
|
||||
|
||||
* The time zone component of [DateTimeKind.Utc](https://docs.microsoft.com/dotnet/core/api/System.DateTimeKind#System_DateTimeKind_Utc) date and time values uses "Z" (which stands for zero offset) to represent UTC.
|
||||
|
||||
* [DateTimeKind.Unspecified](https://docs.microsoft.com/dotnet/core/api/System.DateTimeKind#System_DateTimeKind_Unspecified) date and time values have no time zone information.
|
||||
|
||||
Because the "O" or "o" standard format specifier conforms to an international standard, the formatting or parsing operation that uses the specifier always uses the invariant culture and the Gregorian calendar.
|
||||
|
||||
Strings that are passed to the `Parse`, `TryParse`, `ParseExact`, and `TryParseExact` methods of [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) and [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) can be parsed by using the "O" or "o" format specifier if they are in one of these formats. In the case of [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) objects, the parsing overload that you call should also include a *styles* parameter with a value of [DateTimeStyles.RoundtripKind](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeStyles#System_Globalization_DateTimeStyles_RoundtripKind). Note that if you call a parsing method with the custom format string that corresponds to the "O" or "o" format specifier, you won't get the same results as "O" or "o". This is because parsing methods that use a custom format string can't parse the string representation of date and time values that lack a time zone component or use "Z" to indicate UTC.
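A minimal sketch of parsing with the "o" specifier follows. It passes `DateTimeStyles.RoundtripKind` for `DateTime` values, as described above; the `ParseRoundTrip` helper is a hypothetical name, and the input strings are taken from the formats shown earlier in this section.

```csharp
using System;
using System.Globalization;

public static class RoundTripParseExample
{
    // Parses round-trip text and preserves the DateTimeKind encoded in it.
    public static DateTime ParseRoundTrip(string text)
    {
        return DateTime.ParseExact(text, "o", CultureInfo.InvariantCulture,
                                   DateTimeStyles.RoundtripKind);
    }

    public static void Main()
    {
        DateTime utc = ParseRoundTrip("2009-06-15T13:45:30.0000000Z");
        Console.WriteLine(utc.Kind);          // Displays Utc

        DateTime unspecified = ParseRoundTrip("2009-06-15T13:45:30.0000000");
        Console.WriteLine(unspecified.Kind);  // Displays Unspecified

        DateTimeOffset dto = DateTimeOffset.ParseExact(
            "2009-06-15T13:45:30.0000000-07:00", "o", CultureInfo.InvariantCulture);
        Console.WriteLine(dto.Offset);        // Displays -07:00:00
    }
}
```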
|
||||
|
||||
The following example uses the "o" format specifier to display a series of [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime ) values and a [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset ) value on a system in the U.S. Pacific Time zone.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
DateTime dat = new DateTime(2009, 6, 15, 13, 45, 30,
|
||||
DateTimeKind.Unspecified);
|
||||
Console.WriteLine("{0} ({1}) --> {0:O}", dat, dat.Kind);
|
||||
|
||||
DateTime uDat = new DateTime(2009, 6, 15, 13, 45, 30,
|
||||
DateTimeKind.Utc);
|
||||
Console.WriteLine("{0} ({1}) --> {0:O}", uDat, uDat.Kind);
|
||||
|
||||
DateTime lDat = new DateTime(2009, 6, 15, 13, 45, 30,
|
||||
DateTimeKind.Local);
|
||||
Console.WriteLine("{0} ({1}) --> {0:O}\n", lDat, lDat.Kind);
|
||||
|
||||
DateTimeOffset dto = new DateTimeOffset(lDat);
|
||||
Console.WriteLine("{0} --> {0:O}", dto);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// 6/15/2009 1:45:30 PM (Unspecified) --> 2009-06-15T13:45:30.0000000
|
||||
// 6/15/2009 1:45:30 PM (Utc) --> 2009-06-15T13:45:30.0000000Z
|
||||
// 6/15/2009 1:45:30 PM (Local) --> 2009-06-15T13:45:30.0000000-07:00
|
||||
//
|
||||
// 6/15/2009 1:45:30 PM -07:00 --> 2009-06-15T13:45:30.0000000-07:00
|
||||
```
|
||||
|
||||
The following example uses the "o" format specifier to create a formatted string, and then restores the original date and time value by calling a date and time `Parse` method.
|
||||
|
||||
```csharp
|
||||
// Round-trip DateTime values.
|
||||
DateTime originalDate, newDate;
|
||||
string dateString;
|
||||
// Round-trip a local time.
|
||||
originalDate = DateTime.SpecifyKind(new DateTime(2008, 4, 10, 6, 30, 0), DateTimeKind.Local);
|
||||
dateString = originalDate.ToString("o");
|
||||
newDate = DateTime.Parse(dateString, null, DateTimeStyles.RoundtripKind);
|
||||
Console.WriteLine("Round-tripped {0} {1} to {2} {3}.", originalDate, originalDate.Kind,
|
||||
newDate, newDate.Kind);
|
||||
// Round-trip a UTC time.
|
||||
originalDate = DateTime.SpecifyKind(new DateTime(2008, 4, 12, 9, 30, 0), DateTimeKind.Utc);
|
||||
dateString = originalDate.ToString("o");
|
||||
newDate = DateTime.Parse(dateString, null, DateTimeStyles.RoundtripKind);
|
||||
Console.WriteLine("Round-tripped {0} {1} to {2} {3}.", originalDate, originalDate.Kind,
|
||||
newDate, newDate.Kind);
|
||||
// Round-trip time in an unspecified time zone.
|
||||
originalDate = DateTime.SpecifyKind(new DateTime(2008, 4, 13, 12, 30, 0), DateTimeKind.Unspecified);
|
||||
dateString = originalDate.ToString("o");
|
||||
newDate = DateTime.Parse(dateString, null, DateTimeStyles.RoundtripKind);
|
||||
Console.WriteLine("Round-tripped {0} {1} to {2} {3}.", originalDate, originalDate.Kind,
|
||||
newDate, newDate.Kind);
|
||||
|
||||
// Round-trip a DateTimeOffset value.
|
||||
DateTimeOffset originalDTO = new DateTimeOffset(2008, 4, 12, 9, 30, 0, new TimeSpan(-8, 0, 0));
|
||||
dateString = originalDTO.ToString("o");
|
||||
DateTimeOffset newDTO = DateTimeOffset.Parse(dateString, null, DateTimeStyles.RoundtripKind);
|
||||
Console.WriteLine("Round-tripped {0} to {1}.", originalDTO, newDTO);
|
||||
// The example displays the following output:
|
||||
// Round-tripped 4/10/2008 6:30:00 AM Local to 4/10/2008 6:30:00 AM Local.
|
||||
// Round-tripped 4/12/2008 9:30:00 AM Utc to 4/12/2008 9:30:00 AM Utc.
|
||||
// Round-tripped 4/13/2008 12:30:00 PM Unspecified to 4/13/2008 12:30:00 PM Unspecified.
|
||||
// Round-tripped 4/12/2008 9:30:00 AM -08:00 to 4/12/2008 9:30:00 AM -08:00.
|
||||
```
|
||||
|
||||
## The RFC1123 ("R", "r") Format Specifier
|
||||
|
||||
The "R" or "r" standard format specifier represents a custom date and time format string that is defined by the [DateTimeFormatInfo.RFC1123Pattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo #System_Globalization_DateTimeFormatInfo_RFC1123Pattern) property. The pattern reflects a defined standard, and the property is read-only. Therefore, it is always the same, regardless of the culture used or the format provider supplied. The custom format string is "ddd, dd MMM yyyy HH':'mm':'ss 'GMT'". When this standard format specifier is used, the formatting or parsing operation always uses the invariant culture.
|
||||
|
||||
The result string is affected by the following properties of the [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo ) object returned by the [DateTimeFormatInfo.InvariantInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo #System_Globalization_DateTimeFormatInfo_InvariantInfo) property that represents the invariant culture.
|
||||
|
||||
Property | Description
|
||||
-------- | -----------
|
||||
[RFC1123Pattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo #System_Globalization_DateTimeFormatInfo_RFC1123Pattern) | Defines the format of the result string.
|
||||
[AbbreviatedDayNames](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo #System_Globalization_DateTimeFormatInfo_AbbreviatedDayNames) | Defines the abbreviated day names that can appear in the result string.
|
||||
[AbbreviatedMonthNames](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo #System_Globalization_DateTimeFormatInfo_AbbreviatedMonthNames) | Defines the abbreviated month names that can appear in the result string.
|
||||
|
||||
Although the RFC 1123 standard expresses a time as Coordinated Universal Time (UTC), the formatting operation does not modify the value of the [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime ) object that is being formatted. Therefore, you must convert the [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime ) value to UTC by calling the [DateTime.ToUniversalTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime #System_DateTime_ToUniversalTime) method before you perform the formatting operation. In contrast, [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset ) values perform this conversion automatically; there is no need to call the [DateTimeOffset.ToUniversalTime](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset #System_DateTimeOffset_ToUniversalTime) method before the formatting operation.
|
||||
|
||||
The following example uses the "r" format specifier to display a [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime ) and a [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset ) value on a system in the U.S. Pacific Time zone.
|
||||
|
||||
```csharp
|
||||
DateTime date1 = new DateTime(2008, 4, 10, 6, 30, 0);
|
||||
DateTimeOffset dateOffset = new DateTimeOffset(date1,
|
||||
TimeZoneInfo.Local.GetUtcOffset(date1));
|
||||
Console.WriteLine(date1.ToUniversalTime().ToString("r"));
|
||||
// Displays Thu, 10 Apr 2008 13:30:00 GMT
|
||||
Console.WriteLine(dateOffset.ToUniversalTime().ToString("r"));
|
||||
// Displays Thu, 10 Apr 2008 13:30:00 GMT
|
||||
```
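The difference between the two types can also be sketched without relying on the local time zone. The following code assumes a fixed offset of -07:00 (chosen here only to mirror the Pacific daylight time of the example above) and calls neither `ToUniversalTime` method. The class and method names are hypothetical.

```csharp
using System;

public static class Rfc1123Sketch
{
    // Hypothetical helper: formats a DateTimeOffset with "r" and no explicit conversion.
    public static string FormatOffset(DateTimeOffset value)
    {
        return value.ToString("r");
    }

    public static void Main()
    {
        // A DateTime is formatted as-is; no conversion to UTC takes place.
        DateTime date1 = new DateTime(2008, 4, 10, 6, 30, 0);
        Console.WriteLine(date1.ToString("r"));
        // Displays Thu, 10 Apr 2008 06:30:00 GMT

        // A DateTimeOffset is converted to UTC by the formatting operation.
        DateTimeOffset dateOffset = new DateTimeOffset(2008, 4, 10, 6, 30, 0,
                                                       TimeSpan.FromHours(-7));
        Console.WriteLine(FormatOffset(dateOffset));
        // Displays Thu, 10 Apr 2008 13:30:00 GMT
    }
}
```

The first result string labels a time as GMT even though it was never converted, which is why the `DateTime` value should be converted before it is formatted.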
|
||||
|
||||
## The Sortable ("s") Format Specifier
|
||||
|
||||
The "s" standard format specifier represents a custom date and time format string that is defined by the [DateTimeFormatInfo.SortableDateTimePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo #System_Globalization_DateTimeFormatInfo_SortableDateTimePattern) property. The pattern reflects a defined standard (ISO 8601), and the property is read-only. Therefore, it is always the same, regardless of the culture used or the format provider supplied. The custom format string is "yyyy'-'MM'-'dd'T'HH':'mm':'ss".
|
||||
|
||||
The purpose of the "s" format specifier is to produce result strings that sort consistently in ascending or descending order based on date and time values. As a result, although the "s" standard format specifier represents a date and time value in a consistent format, the formatting operation does not modify the value of the date and time object that is being formatted to reflect its [DateTime.Kind](https://docs.microsoft.com/dotnet/core/api/System.DateTime #System_DateTime_Kind) property or its [DateTimeOffset.Offset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset #System_DateTimeOffset_Offset) value. For example, the result strings produced by formatting the date and time values 2014-11-15T18:32:17+00:00 and 2014-11-15T18:32:17+08:00 are identical.
|
||||
|
||||
When this standard format specifier is used, the formatting or parsing operation always uses the invariant culture.
|
||||
|
||||
The following example uses the "s" format specifier to display a [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime ) value.
|
||||
|
||||
```csharp
|
||||
DateTime date1 = new DateTime(2008, 4, 10, 6, 30, 0);
|
||||
Console.WriteLine(date1.ToString("s"));
|
||||
// Displays 2008-04-10T06:30:00
|
||||
```
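The point about offsets being ignored can be sketched with the two values mentioned earlier in this section. The class and method names below are hypothetical and serve only as an illustration.

```csharp
using System;

public static class SortableSketch
{
    // Hypothetical helper: formats a DateTimeOffset with the "s" specifier.
    public static string ToSortable(DateTimeOffset value)
    {
        return value.ToString("s");
    }

    public static void Main()
    {
        DateTimeOffset utcValue = new DateTimeOffset(2014, 11, 15, 18, 32, 17,
                                                     TimeSpan.Zero);
        DateTimeOffset plusEight = new DateTimeOffset(2014, 11, 15, 18, 32, 17,
                                                      TimeSpan.FromHours(8));

        Console.WriteLine(ToSortable(utcValue));
        // Displays 2014-11-15T18:32:17
        Console.WriteLine(ToSortable(plusEight));
        // Displays 2014-11-15T18:32:17

        // The strings match even though the values are different points in time.
        Console.WriteLine(ToSortable(utcValue) == ToSortable(plusEight));
        // Displays True
        Console.WriteLine(utcValue == plusEight);
        // Displays False
    }
}
```

If the sort order must reflect actual points in time, one option is to normalize the values (for example, to UTC) before formatting them.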
|
||||
|
||||
## The Short Time ("t") Format Specifier
|
||||
|
||||
The "t" standard format specifier represents a custom date and time format string that is defined by the current [DateTimeFormatInfo.ShortTimePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo #System_Globalization_DateTimeFormatInfo_ShortTimePattern) property. For example, the custom format string for the invariant culture is "HH:mm".
|
||||
|
||||
The result string is affected by the formatting information of a specific [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo ) object. The following table lists the [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo ) object properties that may control the formatting of the returned string. The custom format specifier that is returned by the [DateTimeFormatInfo.ShortTimePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo #System_Globalization_DateTimeFormatInfo_ShortTimePattern) property of some cultures may not make use of all properties.
|
||||
|
||||
Property | Description
|
||||
-------- | -----------
|
||||
[DateTimeFormatInfo.ShortTimePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo #System_Globalization_DateTimeFormatInfo_ShortTimePattern) | Defines the format of the time component of the result string.
|
||||
[AMDesignator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo #System_Globalization_DateTimeFormatInfo_AMDesignator) | Defines the string that indicates times from midnight to before noon in a 12-hour clock.
|
||||
[PMDesignator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo #System_Globalization_DateTimeFormatInfo_PMDesignator) | Defines the string that indicates times from noon to before midnight in a 12-hour clock.
|
||||
|
||||
The following example uses the "t" format specifier to display a date and time value.
|
||||
|
||||
```csharp
|
||||
DateTime date1 = new DateTime(2008, 4, 10, 6, 30, 0);
|
||||
Console.WriteLine(date1.ToString("t",
|
||||
CultureInfo.CreateSpecificCulture("en-us")));
|
||||
// Displays 6:30 AM
|
||||
Console.WriteLine(date1.ToString("t",
|
||||
CultureInfo.CreateSpecificCulture("es-ES")));
|
||||
// Displays 6:30
|
||||
```
|
||||
|
||||
## The Long Time ("T") Format Specifier
|
||||
|
||||
The "T" standard format specifier represents a custom date and time format string that is defined by a specific culture's [DateTimeFormatInfo.LongTimePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo #System_Globalization_DateTimeFormatInfo_LongTimePattern) property. For example, the custom format string for the invariant culture is "HH:mm:ss".
|
||||
|
||||
The following table lists the [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo ) object properties that may control the formatting of the returned string. The custom format specifier that is returned by the [DateTimeFormatInfo.LongTimePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo #System_Globalization_DateTimeFormatInfo_LongTimePattern) property of some cultures may not make use of all properties.
|
||||
|
||||
Property | Description
|
||||
-------- | -----------
|
||||
[LongTimePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo #System_Globalization_DateTimeFormatInfo_LongTimePattern) | Defines the format of the time component of the result string.
|
||||
[AMDesignator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo #System_Globalization_DateTimeFormatInfo_AMDesignator) | Defines the string that indicates times from midnight to before noon in a 12-hour clock.
|
||||
[PMDesignator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo #System_Globalization_DateTimeFormatInfo_PMDesignator) | Defines the string that indicates times from noon to before midnight in a 12-hour clock.
|
||||
|
||||
The following example uses the "T" format specifier to display a date and time value.
|
||||
|
||||
```csharp
|
||||
DateTime date1 = new DateTime(2008, 4, 10, 6, 30, 0);
|
||||
Console.WriteLine(date1.ToString("T",
|
||||
CultureInfo.CreateSpecificCulture("en-us")));
|
||||
// Displays 6:30:00 AM
|
||||
Console.WriteLine(date1.ToString("T",
|
||||
CultureInfo.CreateSpecificCulture("es-ES")));
|
||||
// Displays 6:30:00
|
||||
```
|
||||
|
||||
## The Universal Sortable ("u") Format Specifier
|
||||
|
||||
The "u" standard format specifier represents a custom date and time format string that is defined by the [DateTimeFormatInfo.UniversalSortableDateTimePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo #System_Globalization_DateTimeFormatInfo_UniversalSortableDateTimePattern) property. The pattern reflects a defined standard, and the property is read-only. Therefore, it is always the same, regardless of the culture used or the format provider supplied. The custom format string is "yyyy'-'MM'-'dd HH':'mm':'ss'Z'". When this standard format specifier is used, the formatting or parsing operation always uses the invariant culture.
|
||||
|
||||
Although the result string should express a time as Coordinated Universal Time (UTC), no conversion of the original [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime ) value is performed during the formatting operation. Therefore, you must convert a [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime ) value to UTC by calling the [DateTime.ToUniversalTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime #System_DateTime_ToUniversalTime) method before formatting it. In contrast, [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset ) values perform this conversion automatically; there is no need to call the [DateTimeOffset.ToUniversalTime](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset #System_DateTimeOffset_ToUniversalTime) method before the formatting operation.
|
||||
|
||||
The following example uses the "u" format specifier to display a date and time value.
|
||||
|
||||
```csharp
|
||||
DateTime date1 = new DateTime(2008, 4, 10, 6, 30, 0);
|
||||
Console.WriteLine(date1.ToUniversalTime().ToString("u"));
|
||||
// Displays 2008-04-10 13:30:00Z
|
||||
```
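The automatic conversion of [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset ) values can be sketched as follows. The fixed offset of -07:00 is an assumption chosen to mirror the Pacific daylight time of the example above, and the class and method names are hypothetical.

```csharp
using System;

public static class UniversalSortableSketch
{
    // Hypothetical helper: formats a DateTimeOffset with "u" and no explicit conversion.
    public static string FormatOffset(DateTimeOffset value)
    {
        return value.ToString("u");
    }

    public static void Main()
    {
        // A DateTime is formatted as-is, so the trailing "Z" can be misleading.
        DateTime date1 = new DateTime(2008, 4, 10, 6, 30, 0);
        Console.WriteLine(date1.ToString("u"));
        // Displays 2008-04-10 06:30:00Z

        // A DateTimeOffset is converted to UTC by the formatting operation.
        DateTimeOffset dateOffset = new DateTimeOffset(2008, 4, 10, 6, 30, 0,
                                                       TimeSpan.FromHours(-7));
        Console.WriteLine(FormatOffset(dateOffset));
        // Displays 2008-04-10 13:30:00Z
    }
}
```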
|
||||
|
||||
## The Universal Full ("U") Format Specifier
|
||||
|
||||
The "U" standard format specifier represents a custom date and time format string that is defined by a specified culture's [DateTimeFormatInfo.FullDateTimePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo #System_Globalization_DateTimeFormatInfo_FullDateTimePattern) property. The pattern is the same as the "F" pattern. However, the [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime ) value is automatically converted to UTC before it is formatted.
|
||||
|
||||
The following table lists the [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo ) object properties that may control the formatting of the returned string. The custom format specifier that is returned by the [FullDateTimePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo #System_Globalization_DateTimeFormatInfo_FullDateTimePattern) property of some cultures may not make use of all properties.
|
||||
|
||||
Property | Description
|
||||
-------- | -----------
|
||||
[FullDateTimePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo #System_Globalization_DateTimeFormatInfo_FullDateTimePattern) | Defines the overall format of the result string.
|
||||
[DayNames](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo #System_Globalization_DateTimeFormatInfo_DayNames) | Defines the localized day names that can appear in the result string.
|
||||
[MonthNames](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo #System_Globalization_DateTimeFormatInfo_MonthNames) | Defines the localized month names that can appear in the result string.
|
||||
[AMDesignator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo #System_Globalization_DateTimeFormatInfo_AMDesignator) | Defines the string that indicates times from midnight to before noon in a 12-hour clock.
|
||||
[PMDesignator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo #System_Globalization_DateTimeFormatInfo_PMDesignator) | Defines the string that indicates times from noon to before midnight in a 12-hour clock.
|
||||
|
||||
The "U" format specifier is not supported by the [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset ) type and throws a [FormatException](https://docs.microsoft.com/dotnet/core/api/System.FormatException ) if it is used to format a [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset ) value.
|
||||
|
||||
The following example uses the "U" format specifier to display a date and time value.
|
||||
|
||||
```csharp
|
||||
DateTime date1 = new DateTime(2008, 4, 10, 6, 30, 0);
|
||||
Console.WriteLine(date1.ToString("U",
|
||||
CultureInfo.CreateSpecificCulture("en-US")));
|
||||
// Displays Thursday, April 10, 2008 1:30:00 PM
|
||||
Console.WriteLine(date1.ToString("U",
|
||||
CultureInfo.CreateSpecificCulture("sv-FI")));
|
||||
// Displays den 10 april 2008 13:30:00
|
||||
```
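The restriction on [DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset ) values can be sketched as shown below. The helper `TryFormatUniversalFull` is a hypothetical name, and formatting the `UtcDateTime` property instead is only one possible workaround, not something this article prescribes. The invariant culture is used so that the output does not depend on the machine's settings.

```csharp
using System;
using System.Globalization;

public static class UniversalFullSketch
{
    // Hypothetical helper: returns false if the "U" specifier is rejected.
    public static bool TryFormatUniversalFull(DateTimeOffset value, out string result)
    {
        try
        {
            result = value.ToString("U", CultureInfo.InvariantCulture);
            return true;
        }
        catch (FormatException)
        {
            result = null;
            return false;
        }
    }

    public static void Main()
    {
        DateTimeOffset dateOffset = new DateTimeOffset(2008, 4, 10, 6, 30, 0,
                                                       TimeSpan.FromHours(-7));
        string text;
        if (!TryFormatUniversalFull(dateOffset, out text))
        {
            Console.WriteLine("\"U\" is not supported for DateTimeOffset values.");
        }

        // Possible workaround: format the equivalent UTC DateTime instead.
        Console.WriteLine(dateOffset.UtcDateTime.ToString("U",
                          CultureInfo.InvariantCulture));
        // Displays Thursday, 10 April 2008 13:30:00
    }
}
```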
|
||||
|
||||
## The Year Month ("Y", "y") Format Specifier
|
||||
|
||||
The "Y" or "y" standard format specifier represents a custom date and time format string that is defined by the [DateTimeFormatInfo.YearMonthPattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo #System_Globalization_DateTimeFormatInfo_YearMonthPattern) property of a specified culture. For example, the custom format string for the invariant culture is "yyyy MMMM".
|
||||
|
||||
The following table lists the [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo ) object properties that control the formatting of the returned string.
|
||||
|
||||
Property | Description
|
||||
-------- | -----------
|
||||
[YearMonthPattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo #System_Globalization_DateTimeFormatInfo_YearMonthPattern) | Defines the overall format of the result string.
|
||||
[MonthNames](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo #System_Globalization_DateTimeFormatInfo_MonthNames) | Defines the localized month names that can appear in the result string.
|
||||
|
||||
The following example uses the "y" format specifier to display a date and time value.
|
||||
|
||||
```csharp
|
||||
DateTime date1 = new DateTime(2008, 4, 10, 6, 30, 0);
|
||||
Console.WriteLine(date1.ToString("Y",
|
||||
CultureInfo.CreateSpecificCulture("en-US")));
|
||||
// Displays April, 2008
|
||||
Console.WriteLine(date1.ToString("y",
|
||||
CultureInfo.CreateSpecificCulture("af-ZA")));
|
||||
// Displays April 2008
|
||||
```
|
||||
|
||||
## Notes
|
||||
|
||||
### DateTimeFormatInfo Properties
|
||||
|
||||
Formatting is influenced by properties of the current [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo ) object, which is provided implicitly by the current thread culture or explicitly by the [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider ) parameter of the method that invokes formatting. For the [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider ) parameter, your application should specify a [CultureInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo ) object, which represents a culture, or a [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo ) object, which represents a particular culture's date and time formatting conventions. Many of the standard date and time format specifiers are aliases for formatting patterns defined by properties of the current [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo ) object. Your application can change the result produced by some standard date and time format specifiers by changing the corresponding pattern properties of that [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo ) object.
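As a minimal sketch of changing a pattern, the following code creates a writable, culture-independent [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo ) object, replaces its `ShortTimePattern`, and passes it as the format provider for the "t" specifier. The pattern `HH'h'mm` and the class and method names are assumptions made only for this illustration.

```csharp
using System;
using System.Globalization;

public static class CustomPatternSketch
{
    // Hypothetical helper: formats a value with "t" after overriding the short time pattern.
    public static string FormatShortTime(DateTime value, string pattern)
    {
        // The parameterless constructor returns a writable object
        // that is initialized with invariant (culture-independent) settings.
        DateTimeFormatInfo dtfi = new DateTimeFormatInfo();
        dtfi.ShortTimePattern = pattern;
        return value.ToString("t", dtfi);
    }

    public static void Main()
    {
        DateTime date1 = new DateTime(2008, 4, 10, 6, 30, 0);

        Console.WriteLine(date1.ToString("t", new DateTimeFormatInfo()));
        // Displays 06:30

        Console.WriteLine(FormatShortTime(date1, "HH'h'mm"));
        // Displays 06h30
    }
}
```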
|
||||
|
||||
## See Also
|
||||
|
||||
[Custom Date and Time Format Strings](customdatetime.md)
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
|
@ -1,486 +0,0 @@
|
|||
---
|
||||
title: Standard Numeric Format Strings
|
||||
description: Standard Numeric Format Strings
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 80e64c39-00ce-4032-944a-fc07290fc3c8
|
||||
---
|
||||
|
||||
# Standard Numeric Format Strings
|
||||
|
||||
Standard numeric format strings are used to format common numeric types. A standard numeric format string takes the form **A**_xx_, where:
|
||||
|
||||
* **A** is a single alphabetic character called the *format specifier*. Any numeric format string that contains more than one alphabetic character, including white space, is interpreted as a custom numeric format string. For more information, see [Custom Numeric Format Strings](customnumeric.md).
|
||||
|
||||
* *xx* is an optional integer called the *precision specifier*. The precision specifier ranges from 0 to 99 and affects the number of digits in the result. Note that the precision specifier controls the number of digits in the string representation of a number. It does not round the number itself. To perform a rounding operation, use the [Math.Ceiling](https://docs.microsoft.com/dotnet/core/api/System.Math#methods), [Math.Floor](https://docs.microsoft.com/dotnet/core/api/System.Math#methods), or [Math.Round](https://docs.microsoft.com/dotnet/core/api/System.Math#methods) methods.
|
||||
|
||||
When *precision specifier* controls the number of fractional digits in the result string, the result strings reflect numbers that are rounded away from zero (that is, using [MidpointRounding.AwayFromZero](https://docs.microsoft.com/dotnet/core/api/System.MidpointRounding#System_MidpointRounding_AwayFromZero)).
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> The precision specifier determines the number of digits in the result string. To pad a result string with leading or trailing spaces, use the [composite formatting](compositeformat.md) feature and define an *alignment component* in the format item.
|
||||
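The distinction between the precision specifier and the alignment component can be sketched as follows. The value `1234.5678` and the class and method names are hypothetical, and the invariant culture is passed explicitly so that the output does not depend on the machine's settings.

```csharp
using System;
using System.Globalization;

public static class PrecisionSketch
{
    // Hypothetical helper: applies a composite format string with the invariant culture.
    public static string Apply(string compositeFormat, double value)
    {
        return string.Format(CultureInfo.InvariantCulture, compositeFormat, value);
    }

    public static void Main()
    {
        double value = 1234.5678;

        // The precision specifier affects only the string representation.
        Console.WriteLine(value.ToString("F2", CultureInfo.InvariantCulture));
        // Displays 1234.57
        Console.WriteLine(value == 1234.5678);
        // Displays True

        // The alignment component pads the result to a field width.
        Console.WriteLine(Apply("[{0,10:F2}]", value));
        // Displays [   1234.57]
        Console.WriteLine(Apply("[{0,-10:F2}]", value));
        // Displays [1234.57   ]
    }
}
```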
|
||||
Standard numeric format strings are supported by some overloads of the `ToString` method of all numeric types. For example, you can supply a numeric format string to the [ToString(String)](https://docs.microsoft.com/dotnet/core/api/System.Int32#System_Int32_ToString_System_String_) and [ToString(String, IFormatProvider)](https://docs.microsoft.com/dotnet/core/api/System.Int32#System_Int32_ToString_System_String_System_IFormatProvider_) methods of the [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32) type. Standard numeric format strings are also supported by the .NET Core [composite formatting](compositeformat.md) feature, which is used by some `Write` and `WriteLine` methods of the [Console](https://docs.microsoft.com/dotnet/core/api/System.Console) and [StreamWriter](https://docs.microsoft.com/dotnet/core/api/System.IO.StreamWriter) classes, the [String.Format](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Format_System_IFormatProvider_System_String_System_Object_) method, and the [StringBuilder.AppendFormat](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder#System_Text_StringBuilder_AppendFormat_System_IFormatProvider_System_String_System_Object_) method. The composite format feature allows you to include the string representation of multiple data items in a single string, to specify field width, and to align numbers in a field. For more information, see [Composite Formatting](compositeformat.md).
|
||||
|
||||
The following table describes the standard numeric format specifiers and displays sample output produced by each format specifier. See the [Notes](#Notes) section for additional information about using standard numeric format strings, and the [Example](#Example) section for a comprehensive illustration of their use.
|
||||
|
||||
Format specifier | Name | Result | Supported By | Precision specifier | Default precision specifier | Examples
|
||||
---------------- | ---- | ------ | ------------ | ------------------- | --------------------------- | --------
|
||||
"C" or "c" | Currency | A currency value | All numeric types | Number of decimal digits | Defined by [NumberFormatInfo.CurrencyDecimalDigits](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_CurrencyDecimalDigits) | `123.456 ("C", en-US) -> $123.46`; `123.456 ("C", fr-FR) -> 123,46 €`; `123.456 ("C", ja-JP) -> ¥123`; `-123.456 ("C3", en-US) -> ($123.456)`; `-123.456 ("C3", fr-FR) -> -123,456 €`; `-123.456 ("C3", ja-JP) -> -¥123.456`
|
||||
"D" or "d" | Decimal | Integer digits with optional negative sign | Integral types only | Minimum number of digits | Minimum number of digits required | `1234 ("D") -> 1234`; `-1234 ("D6") -> -001234`
|
||||
"E" or "e" | Exponential (scientific) | Exponential notation | All numeric types | Number of decimal digits | 6 | `1052.0329112756 ("E", en-US) -> 1.052033E+003`; `1052.0329112756 ("e", fr-FR) -> 1,052033e+003`; `-1052.0329112756 ("e2", en-US) -> -1.05e+003`; `-1052.0329112756 ("E2", fr-FR) -> -1,05E+003`
|
||||
"F" or "f" | Fixed-point | Integral and decimal digits with optional negative sign | All numeric types | Number of decimal digits | Defined by [NumberFormatInfo.NumberDecimalDigits](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NumberDecimalDigits) | `1234.567 ("F", en-US) -> 1234.57`; `1234.567 ("F", de-DE) -> 1234,57`; `1234 ("F1", en-US) -> 1234.0`; `1234 ("F1", de-DE) -> 1234,0`; `-1234.56 ("F4", en-US) -> -1234.5600`; `-1234.56 ("F4", de-DE) -> -1234,5600`
|
||||
"G" or "g" | General | The more compact of either fixed-point or scientific notation | All numeric types | Number of significant digits | Depends on numeric type | `-123.456 ("G", en-US) -> -123.456`; `-123.456 ("G", sv-SE) -> -123,456`; `123.4546 ("G4", en-US) -> 123.5`; `123.4546 ("G4", sv-SE) -> 123,5`; `-1.234567890e-25 ("G", en-US) -> -1.23456789E-25`; `-1.234567890e-25 ("G", sv-SE) -> -1,23456789E-25`
|
||||
"N" or "n" | Number | Integral and decimal digits, group separators, and a decimal separator with optional negative sign | All numeric types | Desired number of decimal places | Defined by [NumberFormatInfo.NumberDecimalDigits](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NumberDecimalDigits) | `1234.567 ("N", en-US) -> 1,234.57`; `1234.567 ("N", ru-RU) -> 1 234,57`; `1234 ("N1", en-US) -> 1,234.0`; `1234 ("N1", ru-RU) -> 1 234,0`; `-1234.56 ("N3", en-US) -> -1,234.560`; `-1234.56 ("N3", ru-RU) -> -1 234,560`
|
||||
"P" or "p" | Percent | Number multiplied by 100 and displayed with a percent symbol | All numeric types | Desired number of decimal places | Defined by [NumberFormatInfo.PercentDecimalDigits](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_PercentDecimalDigits) | `1 ("P", en-US) -> 100.00 %`; `1 ("P", fr-FR) -> 100,00 %`; `-0.39678 ("P1", en-US) -> -39.7 %`; `-0.39678 ("P1", fr-FR) -> -39,7 %`
|
||||
"R" or "r" | Round-trip | A string that can round-trip to an identical number | [Single](https://docs.microsoft.com/dotnet/core/api/System.Single), [Double](https://docs.microsoft.com/dotnet/core/api/System.Double), and [BigInteger](https://docs.microsoft.com/dotnet/core/api/System.Numerics.BigInteger) | Ignored | | `123456789.12345678 ("R") -> 123456789.12345678`; `-1234567890.12345678 ("R") -> -1234567890.1234567`
|
||||
"X" or "x" | Hexadecimal | A hexadecimal string | Integral types only | Number of digits in the result string | | `255 ("X") -> FF`; `-1 ("x") -> ff`; `255 ("x4") -> 00ff`; `-1 ("X4") -> 00FF`
|
||||
Any other single character | Unknown specifier | Throws a [FormatException](https://docs.microsoft.com/dotnet/core/api/System.FormatException) at run time | | | |
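A few of the integral rows in the table can be sketched in code. The result for a negative value with the "X" specifier depends on the width of the integral type. The `-1` rows in the table appear to correspond to an 8-bit signed type, which is an assumption here and not something the table states. The class and method names are hypothetical.

```csharp
using System;
using System.Globalization;

public static class IntegralFormatSketch
{
    // Hypothetical helper: formats an 8-bit signed value as hexadecimal.
    public static string HexOfSByte(sbyte value, string format)
    {
        return value.ToString(format);
    }

    public static void Main()
    {
        Console.WriteLine(255.ToString("X"));
        // Displays FF
        Console.WriteLine(255.ToString("x4"));
        // Displays 00ff

        // A negative Int32 value uses all 32 bits.
        Console.WriteLine((-1).ToString("x"));
        // Displays ffffffff

        // A negative SByte value uses only 8 bits, then is padded to 4 digits.
        Console.WriteLine(HexOfSByte(-1, "X4"));
        // Displays 00FF

        // The "D" precision specifier sets the minimum number of digits.
        Console.WriteLine((-1234).ToString("D6", CultureInfo.InvariantCulture));
        // Displays -001234
    }
}
```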
|
||||
|
||||
## Using Standard Numeric Format Strings
|
||||
|
||||
A standard numeric format string can be used to define the formatting of a numeric value in one of two ways:
|
||||
|
||||
* It can be passed to an overload of the `ToString` method that has a *format* parameter. The following example formats a numeric value as a currency string in the current (in this case, the en-US) culture.
|
||||
|
||||
```csharp
|
||||
decimal value = 123.456m;
|
||||
Console.WriteLine(value.ToString("C2"));
|
||||
// Displays $123.46
|
||||
```
|
||||
|
||||
* It can be supplied as the *formatString* argument in a format item used with such methods as [String.Format](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Format_System_IFormatProvider_System_String_System_Object_), [Console.WriteLine](https://docs.microsoft.com/dotnet/core/api/System.Console#System_Console_WriteLine), and [StringBuilder.AppendFormat](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder#System_Text_StringBuilder_AppendFormat_System_IFormatProvider_System_String_System_Object_). For more information, see [Composite Formatting](compositeformat.md). The following example uses a format item to insert a currency value in a string.
|
||||
|
||||
```csharp
|
||||
decimal value = 123.456m;
|
||||
Console.WriteLine("Your account balance is {0:C2}.", value);
|
||||
// Displays "Your account balance is $123.46."
|
||||
```
|
||||
|
||||
Optionally, you can supply an alignment argument to specify the width of the numeric field and whether its value is right- or left-aligned. The following example left-aligns a currency value in a 28-character field, and it right-aligns a currency value in a 14-character field.
|
||||
|
||||
```csharp
|
||||
decimal[] amounts = { 16305.32m, 18794.16m };
|
||||
Console.WriteLine("   Beginning Balance           Ending Balance");
|
||||
Console.WriteLine("   {0,-28:C2}{1,14:C2}", amounts[0], amounts[1]);
|
||||
// Displays:
|
||||
//    Beginning Balance           Ending Balance
|
||||
//    $16,305.32                      $18,794.16
|
||||
```
|
||||
|
||||
The following sections provide detailed information about each of the standard numeric format strings.
|
||||
|
||||
## The Currency ("C") Format Specifier
|
||||
|
||||
The "C" (or currency) format specifier converts a number to a string that represents a currency amount. The precision specifier indicates the desired number of decimal places in the result string. If the precision specifier is omitted, the default precision is defined by the [NumberFormatInfo.CurrencyDecimalDigits](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_CurrencyDecimalDigits) property.
|
||||
|
||||
If the value to be formatted has more than the specified or default number of decimal places, the fractional value is rounded in the result string. If the value to the right of the number of specified decimal places is 5 or greater, the last digit in the result string is rounded away from zero.
|
||||
|
||||
The result string is affected by the formatting information of the current [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object. The following table lists the [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) properties that control the formatting of the returned string.
|
||||
|
||||
NumberFormatInfo property | Description
|
||||
------------------------- | -----------
|
||||
[CurrencyPositivePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_CurrencyPositivePattern) | Defines the placement of the currency symbol for positive values.
|
||||
[CurrencyNegativePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_CurrencyNegativePattern) | Defines the placement of the currency symbol for negative values, and specifies whether the negative sign is represented by parentheses or the [NegativeSign](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NegativeSign) property.
|
||||
[NegativeSign](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NegativeSign) | Defines the negative sign used if [CurrencyNegativePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_CurrencyNegativePattern) indicates that parentheses are not used.
|
||||
[CurrencySymbol](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_CurrencySymbol) | Defines the currency symbol.
|
||||
[CurrencyDecimalDigits](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_CurrencyDecimalDigits) | Defines the default number of decimal digits in a currency value. This value can be overridden by using the precision specifier.
|
||||
[CurrencyDecimalSeparator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_CurrencyDecimalSeparator) | Defines the string that separates integral and decimal digits.
|
||||
[CurrencyGroupSeparator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_CurrencyGroupSeparator) | Defines the string that separates groups of integral numbers.
|
||||
[CurrencyGroupSizes](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_CurrencyGroupSizes) | Defines the number of integer digits that appear in a group.
|
||||
|
||||
The following example formats a [Double](https://docs.microsoft.com/dotnet/core/api/System.Double) value with the currency format specifier.
|
||||
|
||||
```csharp
|
||||
double value = 12345.6789;
|
||||
Console.WriteLine(value.ToString("C", CultureInfo.CurrentCulture));
|
||||
|
||||
Console.WriteLine(value.ToString("C3", CultureInfo.CurrentCulture));
|
||||
|
||||
Console.WriteLine(value.ToString("C3",
|
||||
CultureInfo.CreateSpecificCulture("da-DK")));
|
||||
// The example displays the following output on a system whose
|
||||
// current culture is English (United States):
|
||||
// $12,345.68
|
||||
// $12,345.679
|
||||
// kr 12.345,679
|
||||
```
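The rounding rule and the effect of the `NumberFormatInfo` properties can be seen together in a small sketch. It assumes a writable copy of the invariant culture's settings so that the output does not depend on the machine's culture; the `CurrencyPatternDemo` class and its `CreateInfo` helper are hypothetical names used only for illustration.

```csharp
using System;
using System.Globalization;

public static class CurrencyPatternDemo
{
    // Builds a writable NumberFormatInfo from the invariant culture and
    // overrides two of the properties listed in the table above.
    public static NumberFormatInfo CreateInfo()
    {
        var nfi = (NumberFormatInfo)CultureInfo.InvariantCulture.NumberFormat.Clone();
        nfi.CurrencySymbol = "$";
        nfi.CurrencyNegativePattern = 1;   // "-$n" instead of "($n)"
        return nfi;
    }

    public static void Main()
    {
        NumberFormatInfo nfi = CreateInfo();

        // No precision specifier: CurrencyDecimalDigits (2) is used.
        Console.WriteLine(1234.5m.ToString("C", nfi));      // $1,234.50

        // The digit after the second decimal place is 5, so the
        // last kept digit is rounded away from zero.
        Console.WriteLine(2.125m.ToString("C2", nfi));      // $2.13
        Console.WriteLine((-2.125m).ToString("C2", nfi));   // -$2.13
    }
}
```

The sketch uses `Decimal` values so that 2.125 is represented exactly and the midpoint rounding is visible.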
|
||||
|
||||
## The Decimal ("D") Format Specifier
|
||||
|
||||
The "D" (or decimal) format specifier converts a number to a string of decimal digits (0-9), prefixed by a minus sign if the number is negative. This format is supported only for integral types.
|
||||
|
||||
The precision specifier indicates the minimum number of digits desired in the resulting string. If required, the number is padded with zeros to its left to produce the number of digits given by the precision specifier. If no precision specifier is specified, the default is the minimum value required to represent the integer without leading zeros.
|
||||
|
||||
The result string is affected by the formatting information of the current [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object. As the following table shows, a single property affects the formatting of the result string.
|
||||
|
||||
NumberFormatInfo property | Description
|
||||
------------------------- | -----------
|
||||
[NegativeSign](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NegativeSign) | Defines the string that indicates that a number is negative.
|
||||
|
||||
The following example formats an [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32) value with the decimal format specifier.
|
||||
|
||||
```csharp
|
||||
int value;
|
||||
|
||||
value = 12345;
|
||||
Console.WriteLine(value.ToString("D"));
|
||||
// Displays 12345
|
||||
Console.WriteLine(value.ToString("D8"));
|
||||
// Displays 00012345
|
||||
|
||||
value = -12345;
|
||||
Console.WriteLine(value.ToString("D"));
|
||||
// Displays -12345
|
||||
Console.WriteLine(value.ToString("D8"));
|
||||
// Displays -00012345
|
||||
```
|
||||
|
||||
## The Exponential ("E") Format Specifier
|
||||
|
||||
The exponential ("E") format specifier converts a number to a string of the form "-d.ddd…E+ddd" or "-d.ddd…e+ddd", where each "d" indicates a digit (0-9). The string starts with a minus sign if the number is negative. Exactly one digit always precedes the decimal point.
|
||||
|
||||
The precision specifier indicates the desired number of digits after the decimal point. If the precision specifier is omitted, a default of six digits after the decimal point is used.
|
||||
|
||||
The case of the format specifier indicates whether to prefix the exponent with an "E" or an "e". The exponent always consists of a plus or minus sign and a minimum of three digits. The exponent is padded with zeros to meet this minimum, if required.
|
||||
|
||||
The result string is affected by the formatting information of the current [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object. The following table lists the [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) properties that control the formatting of the returned string.
|
||||
|
||||
NumberFormatInfo property | Description
|
||||
------------------------- | -----------
|
||||
[NegativeSign](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NegativeSign) | Defines the string that indicates that a number is negative for both the coefficient and exponent.
|
||||
[NumberDecimalSeparator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NumberDecimalSeparator) | Defines the string that separates the integral digit from decimal digits in the coefficient.
|
||||
[PositiveSign](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_PositiveSign) | Defines the string that indicates that an exponent is positive.
|
||||
|
||||
The following example formats a [Double](https://docs.microsoft.com/dotnet/core/api/System.Double) value with the exponential format specifier.
|
||||
|
||||
```csharp
|
||||
double value = 12345.6789;
|
||||
Console.WriteLine(value.ToString("E", CultureInfo.InvariantCulture));
|
||||
// Displays 1.234568E+004
|
||||
|
||||
Console.WriteLine(value.ToString("E10", CultureInfo.InvariantCulture));
|
||||
// Displays 1.2345678900E+004
|
||||
|
||||
Console.WriteLine(value.ToString("e4", CultureInfo.InvariantCulture));
|
||||
// Displays 1.2346e+004
|
||||
|
||||
Console.WriteLine(value.ToString("E",
|
||||
CultureInfo.CreateSpecificCulture("fr-FR")));
|
||||
// Displays 1,234568E+004
|
||||
```
|
||||
|
||||
## The Fixed-Point ("F") Format Specifier
|
||||
|
||||
The fixed-point ("F") format specifier converts a number to a string of the form "-ddd.ddd…" where each "d" indicates a digit (0-9). The string starts with a minus sign if the number is negative.
|
||||
|
||||
The precision specifier indicates the desired number of decimal places. If the precision specifier is omitted, the current [NumberFormatInfo.NumberDecimalDigits](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NumberDecimalDigits) property supplies the numeric precision.
|
||||
|
||||
The result string is affected by the formatting information of the current [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object. The following table lists the properties of the [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object that control the formatting of the result string.
|
||||
|
||||
NumberFormatInfo property | Description
|
||||
------------------------- | -----------
|
||||
[NegativeSign](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NegativeSign) | Defines the string that indicates that a number is negative.
|
||||
[NumberDecimalSeparator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NumberDecimalSeparator) | Defines the string that separates integral digits from decimal digits.
|
||||
[NumberFormatInfo.NumberDecimalDigits](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NumberDecimalDigits) | Defines the default number of decimal digits. This value can be overridden by using the precision specifier.
|
||||
|
||||
The following example formats a [Double](https://docs.microsoft.com/dotnet/core/api/System.Double) and an [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32) value with the fixed-point format specifier.
|
||||
|
||||
```csharp
|
||||
int integerNumber;
|
||||
integerNumber = 17843;
|
||||
Console.WriteLine(integerNumber.ToString("F",
|
||||
CultureInfo.InvariantCulture));
|
||||
// Displays 17843.00
|
||||
|
||||
integerNumber = -29541;
|
||||
Console.WriteLine(integerNumber.ToString("F3",
|
||||
CultureInfo.InvariantCulture));
|
||||
// Displays -29541.000
|
||||
|
||||
double doubleNumber;
|
||||
doubleNumber = 18934.1879;
|
||||
Console.WriteLine(doubleNumber.ToString("F", CultureInfo.InvariantCulture));
|
||||
// Displays 18934.19
|
||||
|
||||
Console.WriteLine(doubleNumber.ToString("F0", CultureInfo.InvariantCulture));
|
||||
// Displays 18934
|
||||
|
||||
doubleNumber = -1898300.1987;
|
||||
Console.WriteLine(doubleNumber.ToString("F1", CultureInfo.InvariantCulture));
|
||||
// Displays -1898300.2
|
||||
|
||||
Console.WriteLine(doubleNumber.ToString("F3",
|
||||
CultureInfo.CreateSpecificCulture("es-ES")));
|
||||
// Displays -1898300,199
|
||||
```
|
||||
|
||||
## The General ("G") Format Specifier
|
||||
|
||||
The general ("G") format specifier converts a number to the more compact of either fixed-point or scientific notation, depending on the type of the number and whether a precision specifier is present. The precision specifier defines the maximum number of significant digits that can appear in the result string. If the precision specifier is omitted or zero, the type of the number determines the default precision, as indicated in the following table.
|
||||
|
||||
Numeric type | Default precision
|
||||
------------ | -----------------
|
||||
[Byte](https://docs.microsoft.com/dotnet/core/api/System.Byte) or [SByte](https://docs.microsoft.com/dotnet/core/api/System.SByte) | 3 digits
|
||||
[Int16](https://docs.microsoft.com/dotnet/core/api/System.Int16) or [UInt16](https://docs.microsoft.com/dotnet/core/api/System.UInt16) | 5 digits
|
||||
[Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32) or [UInt32](https://docs.microsoft.com/dotnet/core/api/System.UInt32) | 10 digits
|
||||
[Int64](https://docs.microsoft.com/dotnet/core/api/System.Int64) | 19 digits
|
||||
[UInt64](https://docs.microsoft.com/dotnet/core/api/System.UInt64) | 20 digits
|
||||
[BigInteger](https://docs.microsoft.com/dotnet/core/api/System.Numerics.BigInteger) | Unlimited (same as "R")
|
||||
[Single](https://docs.microsoft.com/dotnet/core/api/System.Single) | 7 digits
|
||||
[Double](https://docs.microsoft.com/dotnet/core/api/System.Double) | 15 digits
|
||||
[Decimal](https://docs.microsoft.com/dotnet/core/api/System.Decimal) | 29 digits
|
||||
|
||||
Fixed-point notation is used if the exponent that would result from expressing the number in scientific notation is greater than -5 and less than the precision specifier; otherwise, scientific notation is used. The result contains a decimal point if required, and trailing zeros after the decimal point are omitted. If the precision specifier is present and the number of significant digits in the result exceeds the specified precision, the excess trailing digits are removed by rounding.
|
||||
|
||||
However, if the number is a [Decimal](https://docs.microsoft.com/dotnet/core/api/System.Decimal) and the precision specifier is omitted, fixed-point notation is always used and trailing zeros are preserved.
|
||||
|
||||
If scientific notation is used, the exponent in the result is prefixed with "E" if the format specifier is "G", or "e" if the format specifier is "g". The exponent contains a minimum of two digits. This differs from the format for scientific notation that is produced by the exponential format specifier, which includes a minimum of three digits in the exponent.
|
||||
|
||||
The result string is affected by the formatting information of the current [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object. The following table lists the [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) properties that control the formatting of the result string.
|
||||
|
||||
NumberFormatInfo property | Description
|
||||
------------------------- | -----------
|
||||
[NegativeSign](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NegativeSign) | Defines the string that indicates that a number is negative.
|
||||
[NumberDecimalSeparator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NumberDecimalSeparator) | Defines the string that separates integral digits from decimal digits.
|
||||
[PositiveSign](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_PositiveSign) | Defines the string that indicates that an exponent is positive.
|
||||
|
||||
The following example formats assorted floating-point values with the general format specifier.
|
||||
|
||||
```csharp
|
||||
double number;
|
||||
|
||||
number = 12345.6789;
|
||||
Console.WriteLine(number.ToString("G", CultureInfo.InvariantCulture));
|
||||
// Displays 12345.6789
|
||||
Console.WriteLine(number.ToString("G",
|
||||
CultureInfo.CreateSpecificCulture("fr-FR")));
|
||||
// Displays 12345,6789
|
||||
|
||||
Console.WriteLine(number.ToString("G7", CultureInfo.InvariantCulture));
|
||||
// Displays 12345.68
|
||||
|
||||
number = .0000023;
|
||||
Console.WriteLine(number.ToString("G", CultureInfo.InvariantCulture));
|
||||
// Displays 2.3E-06
|
||||
Console.WriteLine(number.ToString("G",
|
||||
CultureInfo.CreateSpecificCulture("fr-FR")));
|
||||
// Displays 2,3E-06
|
||||
|
||||
number = .0023;
|
||||
Console.WriteLine(number.ToString("G", CultureInfo.InvariantCulture));
|
||||
// Displays 0.0023
|
||||
|
||||
number = 1234;
|
||||
Console.WriteLine(number.ToString("G2", CultureInfo.InvariantCulture));
|
||||
// Displays 1.2E+03
|
||||
|
||||
number = Math.PI;
|
||||
Console.WriteLine(number.ToString("G5", CultureInfo.InvariantCulture));
|
||||
// Displays 3.1416
|
||||
```
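The rules for choosing between fixed-point and scientific notation, and the special handling of [Decimal](https://docs.microsoft.com/dotnet/core/api/System.Decimal) values, can be illustrated with a few boundary cases. This is a minimal sketch; the `GeneralFormatDemo` class and its `FormatG` helpers are hypothetical names, and the invariant culture is used so that the separators are predictable.

```csharp
using System;
using System.Globalization;

public static class GeneralFormatDemo
{
    public static string FormatG(double value, string format)
    {
        return value.ToString(format, CultureInfo.InvariantCulture);
    }

    public static string FormatG(decimal value, string format)
    {
        return value.ToString(format, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        // Exponent is -4, which is greater than -5: fixed-point notation.
        Console.WriteLine(FormatG(0.0001234, "G"));      // 0.0001234

        // Exponent is -5, which is not greater than -5: scientific notation.
        Console.WriteLine(FormatG(0.00001234, "G"));     // 1.234E-05

        // Exponent is 8, which is not less than the precision 4: scientific notation.
        Console.WriteLine(FormatG(123456789.0, "G4"));   // 1.235E+08

        // Decimal with no precision specifier: trailing zeros are preserved.
        Console.WriteLine(FormatG(1.2300m, "G"));        // 1.2300

        // Decimal with a precision specifier: trailing zeros are dropped.
        Console.WriteLine(FormatG(1.2300m, "G3"));       // 1.23
    }
}
```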
|
||||
|
||||
## The Numeric ("N") Format Specifier
|
||||
|
||||
The numeric ("N") format specifier converts a number to a string of the form "-d,ddd,ddd.ddd…", where "-" indicates a negative number symbol if required, "d" indicates a digit (0-9), "," indicates a group separator, and "." indicates a decimal point symbol. The precision specifier indicates the desired number of digits after the decimal point. If the precision specifier is omitted, the number of decimal places is defined by the current [NumberFormatInfo.NumberDecimalDigits](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NumberDecimalDigits) property.
|
||||
|
||||
The result string is affected by the formatting information of the current [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object. The following table lists the [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) properties that control the formatting of the result string.
|
||||
|
||||
NumberFormatInfo property | Description
|
||||
------------------------- | -----------
|
||||
[NegativeSign](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NegativeSign) | Defines the string that indicates that a number is negative.
|
||||
[NumberNegativePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NumberNegativePattern) | Defines the format of negative values, and specifies whether the negative sign is represented by parentheses or the [NegativeSign](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NegativeSign) property.
|
||||
[NumberGroupSizes](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NumberGroupSizes) | Defines the number of integral digits that appear between group separators.
|
||||
[NumberGroupSeparator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NumberGroupSeparator) | Defines the string that separates groups of integral numbers.
|
||||
[NumberDecimalSeparator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NumberDecimalSeparator) | Defines the string that separates integral digits from decimal digits.
|
||||
[NumberDecimalDigits](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NumberDecimalDigits) | Defines the default number of decimal digits. This value can be overridden by using a precision specifier.
|
||||
|
||||
The following example formats a floating-point value and an integer value with the numeric format specifier.
|
||||
|
||||
```csharp
|
||||
double dblValue = -12445.6789;
|
||||
Console.WriteLine(dblValue.ToString("N", CultureInfo.InvariantCulture));
|
||||
// Displays -12,445.68
|
||||
Console.WriteLine(dblValue.ToString("N1",
|
||||
CultureInfo.CreateSpecificCulture("sv-SE")));
|
||||
// Displays -12 445,7
|
||||
|
||||
int intValue = 123456789;
|
||||
Console.WriteLine(intValue.ToString("N1", CultureInfo.InvariantCulture));
|
||||
// Displays 123,456,789.0
|
||||
```
|
||||
|
||||
## The Percent ("P") Format Specifier
|
||||
|
||||
The percent ("P") format specifier multiplies a number by 100 and converts it to a string that represents a percentage. The precision specifier indicates the desired number of decimal places. If the precision specifier is omitted, the default numeric precision supplied by the current [PercentDecimalDigits](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_PercentDecimalDigits) property is used.
|
||||
|
||||
The following table lists the [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) properties that control the formatting of the returned string.
|
||||
|
||||
NumberFormatInfo property | Description
|
||||
------------------------- | -----------
|
||||
[PercentPositivePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_PercentPositivePattern) | Defines the placement of the percent symbol for positive values.
|
||||
[PercentNegativePattern](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_PercentNegativePattern) | Defines the placement of the percent symbol and the negative symbol for negative values.
|
||||
[NegativeSign](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NegativeSign) | Defines the string that indicates that a number is negative.
|
||||
[PercentSymbol](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_PercentSymbol) | Defines the percent symbol.
|
||||
[PercentDecimalDigits](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_PercentDecimalDigits) | Defines the default number of decimal digits in a percentage value. This value can be overridden by using the precision specifier.
|
||||
[PercentDecimalSeparator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_PercentDecimalSeparator) | Defines the string that separates integral and decimal digits.
|
||||
[PercentGroupSeparator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_PercentGroupSeparator) | Defines the string that separates groups of integral numbers.
|
||||
[PercentGroupSizes](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_PercentGroupSizes) | Defines the number of integer digits that appear in a group.
|
||||
|
||||
The following example formats floating-point values with the percent format specifier.
|
||||
|
||||
```csharp
|
||||
double number = .2468013;
|
||||
Console.WriteLine(number.ToString("P", CultureInfo.InvariantCulture));
|
||||
// Displays 24.68 %
|
||||
Console.WriteLine(number.ToString("P",
|
||||
CultureInfo.CreateSpecificCulture("hr-HR")));
|
||||
// Displays 24,68%
|
||||
Console.WriteLine(number.ToString("P1", CultureInfo.InvariantCulture));
|
||||
// Displays 24.7 %
|
||||
```
|
||||
|
||||
## The Round-trip ("R") Format Specifier
|
||||
|
||||
The round-trip ("R") format specifier is used to ensure that a numeric value that is converted to a string will be parsed back into the same numeric value. This format is supported only for the [Single](https://docs.microsoft.com/dotnet/core/api/System.Single), [Double](https://docs.microsoft.com/dotnet/core/api/System.Double), and [BigInteger](https://docs.microsoft.com/dotnet/core/api/System.Numerics.BigInteger) types.
|
||||
|
||||
When a [BigInteger](https://docs.microsoft.com/dotnet/core/api/System.Numerics.BigInteger) value is formatted using this specifier, its string representation contains all the significant digits in the [BigInteger](https://docs.microsoft.com/dotnet/core/api/System.Numerics.BigInteger) value. When a [Single](https://docs.microsoft.com/dotnet/core/api/System.Single) or [Double](https://docs.microsoft.com/dotnet/core/api/System.Double) value is formatted using this specifier, it is first tested using the general format, with 15 digits of precision for a [Double](https://docs.microsoft.com/dotnet/core/api/System.Double) and 7 digits of precision for a [Single](https://docs.microsoft.com/dotnet/core/api/System.Single). If the value is successfully parsed back to the same numeric value, it is formatted using the general format specifier. If the value is not successfully parsed back to the same numeric value, it is formatted using 17 digits of precision for a [Double](https://docs.microsoft.com/dotnet/core/api/System.Double) and 9 digits of precision for a [Single](https://docs.microsoft.com/dotnet/core/api/System.Single).
|
||||
|
||||
Although you can include a precision specifier, it is ignored. Round trips are given precedence over precision when using this specifier.
|
||||
|
||||
The result string is affected by the formatting information of the current [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object. The following table lists the [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) properties that control the formatting of the result string.
|
||||
|
||||
NumberFormatInfo property | Description
|
||||
------------------------- | -----------
|
||||
[NegativeSign](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NegativeSign) | Defines the string that indicates that a number is negative.
|
||||
[NumberDecimalSeparator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NumberDecimalSeparator) | Defines the string that separates integral digits from decimal digits.
|
||||
[PositiveSign](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_PositiveSign) | Defines the string that indicates that an exponent is positive.
|
||||
|
||||
The following example formats [Double](https://docs.microsoft.com/dotnet/core/api/System.Double) values with the round-trip format specifier.
|
||||
|
||||
```csharp
|
||||
double value;
|
||||
|
||||
value = Math.PI;
|
||||
Console.WriteLine(value.ToString("r"));
|
||||
// Displays 3.1415926535897931
|
||||
Console.WriteLine(value.ToString("r",
|
||||
CultureInfo.CreateSpecificCulture("fr-FR")));
|
||||
// Displays 3,1415926535897931
|
||||
value = 1.623e-21;
|
||||
Console.WriteLine(value.ToString("r"));
|
||||
// Displays 1.623E-21
|
||||
```
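The purpose of the round-trip specifier is easiest to see by parsing the formatted string back and comparing it with the original value. The following is a minimal sketch; the `RoundTripDemo` class and its `Survives` helper are hypothetical names, and the invariant culture is assumed for both formatting and parsing.

```csharp
using System;
using System.Globalization;

public static class RoundTripDemo
{
    // Formats the value with the given specifier, parses the string back,
    // and reports whether the parsed value equals the original.
    public static bool Survives(double value, string format)
    {
        string text = value.ToString(format, CultureInfo.InvariantCulture);
        double parsed = double.Parse(text, CultureInfo.InvariantCulture);
        return parsed == value;
    }

    public static void Main()
    {
        // In binary floating point, 0.1 + 0.2 is slightly larger than 0.3.
        double value = 0.1 + 0.2;

        // Fifteen significant digits are not enough to recover this value.
        Console.WriteLine(Survives(value, "G15"));   // False

        // The round-trip specifier emits as many digits as are needed.
        Console.WriteLine(Survives(value, "R"));     // True
    }
}
```

Formatting and parsing with the same culture matters here, because the decimal separator in the string must match the one the parser expects.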
|
||||
|
||||
## The Hexadecimal ("X") Format Specifier
|
||||
|
||||
The hexadecimal ("X") format specifier converts a number to a string of hexadecimal digits. The case of the format specifier indicates whether to use uppercase or lowercase characters for hexadecimal digits that are greater than 9. For example, use "X" to produce "ABCDEF", and "x" to produce "abcdef". This format is supported only for integral types.
|
||||
|
||||
The precision specifier indicates the minimum number of digits desired in the resulting string. If required, the number is padded with zeros to its left to produce the number of digits given by the precision specifier.
|
||||
|
||||
The result string is not affected by the formatting information of the current [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object.
|
||||
|
||||
The following example formats [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32) values with the hexadecimal format specifier.
|
||||
|
||||
```csharp
|
||||
int value;
|
||||
|
||||
value = 0x2045e;
|
||||
Console.WriteLine(value.ToString("x"));
|
||||
// Displays 2045e
|
||||
Console.WriteLine(value.ToString("X"));
|
||||
// Displays 2045E
|
||||
Console.WriteLine(value.ToString("X8"));
|
||||
// Displays 0002045E
|
||||
|
||||
value = 123456789;
|
||||
Console.WriteLine(value.ToString("X"));
|
||||
// Displays 75BCD15
|
||||
Console.WriteLine(value.ToString("X2"));
|
||||
// Displays 75BCD15
|
||||
```
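Negative values are not covered by the example above. As a hedged illustration, the sketch below formats -1 stored in three integral types of different widths; the result is the two's complement bit pattern of the value, so its length depends on the size of the type. The `HexDemo` class is a hypothetical name.

```csharp
using System;

public static class HexDemo
{
    public static void Main()
    {
        sbyte small = -1;    // 8 bits
        short medium = -1;   // 16 bits
        int large = -1;      // 32 bits

        Console.WriteLine(small.ToString("x"));    // ff
        Console.WriteLine(medium.ToString("X"));   // FFFF
        Console.WriteLine(large.ToString("X"));    // FFFFFFFF

        // The precision specifier pads with leading zeros.
        Console.WriteLine(255.ToString("x4"));     // 00ff
    }
}
```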
|
||||
|
||||
## Notes
|
||||
|
||||
### NumberFormatInfo Properties
|
||||
|
||||
Formatting is influenced by the properties of the current [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object, which is provided implicitly by the current thread culture or explicitly by the [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) parameter of the method that invokes formatting. Specify a [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) or [CultureInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo) object for that parameter.
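
Both kinds of provider can be passed to the same `ToString` overload. The following sketch is illustrative only: the `ProviderDemo` class and its helper methods are hypothetical names, and the customized separators are arbitrary choices rather than the settings of any particular culture.

```csharp
using System;
using System.Globalization;

public static class ProviderDemo
{
    // Passes a CultureInfo object as the IFormatProvider.
    public static string WithCulture(double value)
    {
        return value.ToString("N2", CultureInfo.InvariantCulture);
    }

    // Passes a customized NumberFormatInfo object as the IFormatProvider.
    public static string WithCustomInfo(double value)
    {
        // Clone returns a writable copy of the read-only invariant settings.
        var nfi = (NumberFormatInfo)CultureInfo.InvariantCulture.NumberFormat.Clone();
        nfi.NumberDecimalSeparator = ",";
        nfi.NumberGroupSeparator = ".";
        return value.ToString("N2", nfi);
    }

    public static void Main()
    {
        Console.WriteLine(WithCulture(1234567.891));      // 1,234,567.89
        Console.WriteLine(WithCustomInfo(1234567.891));   // 1.234.567,89
    }
}
```

Passing the provider explicitly keeps the output independent of the current thread culture.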
|
||||
|
||||
### Integral and Floating-Point Numeric Types
|
||||
|
||||
Some descriptions of standard numeric format specifiers refer to integral or floating-point numeric types. The integral numeric types are [Byte](https://docs.microsoft.com/dotnet/core/api/System.Byte), [SByte](https://docs.microsoft.com/dotnet/core/api/System.SByte), [Int16](https://docs.microsoft.com/dotnet/core/api/System.Int16), [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32), [Int64](https://docs.microsoft.com/dotnet/core/api/System.Int64), [UInt16](https://docs.microsoft.com/dotnet/core/api/System.UInt16), [UInt32](https://docs.microsoft.com/dotnet/core/api/System.UInt32), [UInt64](https://docs.microsoft.com/dotnet/core/api/System.UInt64), and [BigInteger](https://docs.microsoft.com/dotnet/core/api/System.Numerics.BigInteger). The floating-point numeric types are [Decimal](https://docs.microsoft.com/dotnet/core/api/System.Decimal), [Single](https://docs.microsoft.com/dotnet/core/api/System.Single), and [Double](https://docs.microsoft.com/dotnet/core/api/System.Double).
|
||||
|
||||
### Floating-Point Infinities and NaN
|
||||
|
||||
Regardless of the format string, if the value of a [Single](https://docs.microsoft.com/dotnet/core/api/System.Single) or [Double](https://docs.microsoft.com/dotnet/core/api/System.Double) floating-point type is positive infinity, negative infinity, or not a number (NaN), the formatted string is the value of the respective [PositiveInfinitySymbol](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_PositiveInfinitySymbol), [NegativeInfinitySymbol](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NegativeInfinitySymbol), or [NaNSymbol](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NaNSymbol) property that is specified by the currently applicable [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object.
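
A short sketch can make this behavior concrete. It assumes a writable copy of the invariant `NumberFormatInfo` with made-up symbol strings, so the custom symbols, the `SpecialValuesDemo` class, and its helpers are hypothetical and not the defaults of any culture.

```csharp
using System;
using System.Globalization;

public static class SpecialValuesDemo
{
    // Creates a NumberFormatInfo with custom symbols for the special values.
    public static NumberFormatInfo CreateInfo()
    {
        var nfi = (NumberFormatInfo)NumberFormatInfo.InvariantInfo.Clone();
        nfi.NaNSymbol = "not-a-number";
        nfi.PositiveInfinitySymbol = "+inf";
        nfi.NegativeInfinitySymbol = "-inf";
        return nfi;
    }

    public static string Format(double value, string format, NumberFormatInfo nfi)
    {
        return value.ToString(format, nfi);
    }

    public static void Main()
    {
        NumberFormatInfo nfi = CreateInfo();

        // The format specifier has no effect on the special values.
        Console.WriteLine(Format(double.NaN, "F2", nfi));                // not-a-number
        Console.WriteLine(Format(double.PositiveInfinity, "C", nfi));    // +inf
        Console.WriteLine(Format(double.NegativeInfinity, "E3", nfi));   // -inf
    }
}
```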
|
||||
|
||||
## Example
|
||||
|
||||
The following example formats an integral and a floating-point numeric value using the en-US culture and all the standard numeric format specifiers. This example uses two particular numeric types ([Double](https://docs.microsoft.com/dotnet/core/api/System.Double) and [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32)), but it would yield similar results for any of the other numeric base types ([Byte](https://docs.microsoft.com/dotnet/core/api/System.Byte), [SByte](https://docs.microsoft.com/dotnet/core/api/System.SByte), [Int16](https://docs.microsoft.com/dotnet/core/api/System.Int16), [Int64](https://docs.microsoft.com/dotnet/core/api/System.Int64), [UInt16](https://docs.microsoft.com/dotnet/core/api/System.UInt16), [UInt32](https://docs.microsoft.com/dotnet/core/api/System.UInt32), [UInt64](https://docs.microsoft.com/dotnet/core/api/System.UInt64), [BigInteger](https://docs.microsoft.com/dotnet/core/api/System.Numerics.BigInteger), [Decimal](https://docs.microsoft.com/dotnet/core/api/System.Decimal), and [Single](https://docs.microsoft.com/dotnet/core/api/System.Single)).
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Globalization;
|
||||
using System.Threading;
|
||||
|
||||
public class NumericFormats
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
// Display string representations of numbers for en-us culture
|
||||
CultureInfo ci = new CultureInfo("en-us");
|
||||
|
||||
// Output floating point values
|
||||
double floating = 10761.937554;
|
||||
Console.WriteLine("C: {0}",
|
||||
floating.ToString("C", ci)); // Displays "C: $10,761.94"
|
||||
Console.WriteLine("E: {0}",
|
||||
floating.ToString("E03", ci)); // Displays "E: 1.076E+004"
|
||||
Console.WriteLine("F: {0}",
|
||||
floating.ToString("F04", ci)); // Displays "F: 10761.9376"
|
||||
Console.WriteLine("G: {0}",
|
||||
floating.ToString("G", ci)); // Displays "G: 10761.937554"
|
||||
Console.WriteLine("N: {0}",
|
||||
floating.ToString("N03", ci)); // Displays "N: 10,761.938"
|
||||
Console.WriteLine("P: {0}",
|
||||
(floating/10000).ToString("P02", ci)); // Displays "P: 107.62 %"
|
||||
Console.WriteLine("R: {0}",
|
||||
floating.ToString("R", ci)); // Displays "R: 10761.937554"
|
||||
Console.WriteLine();
|
||||
|
||||
// Output integral values
|
||||
int integral = 8395;
|
||||
Console.WriteLine("C: {0}",
|
||||
integral.ToString("C", ci)); // Displays "C: $8,395.00"
|
||||
Console.WriteLine("D: {0}",
|
||||
integral.ToString("D6", ci)); // Displays "D: 008395"
|
||||
Console.WriteLine("E: {0}",
|
||||
integral.ToString("E03", ci)); // Displays "E: 8.395E+003"
|
||||
Console.WriteLine("F: {0}",
|
||||
integral.ToString("F01", ci)); // Displays "F: 8395.0"
|
||||
Console.WriteLine("G: {0}",
|
||||
integral.ToString("G", ci)); // Displays "G: 8395"
|
||||
Console.WriteLine("N: {0}",
|
||||
integral.ToString("N01", ci)); // Displays "N: 8,395.0"
|
||||
Console.WriteLine("P: {0}",
|
||||
(integral/10000.0).ToString("P02", ci)); // Displays "P: 83.95 %"
|
||||
Console.WriteLine("X: 0x{0}",
|
||||
integral.ToString("X", ci)); // Displays "X: 0x20CB"
|
||||
Console.WriteLine();
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## See Also
|
||||
|
||||
[System.Globalization.NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo)
|
||||
|
||||
[Custom Numeric Format Strings](customnumeric.md)
|
||||
|
||||
[Composite Formatting](compositeformat.md)
|
|
@ -1,243 +0,0 @@
|
|||
---
|
||||
title: Standard TimeSpan Format Strings
|
||||
description: Standard TimeSpan Format Strings
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: df592c05-fb7f-47a9-b615-2cc696b111d7
|
||||
---
|
||||
|
||||
# Standard TimeSpan Format Strings
|
||||
|
||||
A standard [TimeSpan](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan) format string uses a single format specifier to define the text representation of a [TimeSpan](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan) value that results from a formatting operation. Any format string that contains more than one character, including white space, is interpreted as a custom [TimeSpan](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan) format string. For more information, see [Custom TimeSpan Format Strings](customtimespan.md).
|
||||
|
||||
The string representations of [TimeSpan](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan) values are produced by calls to the overloads of the [TimeSpan.ToString](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan#System_TimeSpan_ToString) method, as well as by methods that support composite formatting, such as [String.Format](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Format_System_IFormatProvider_System_String_System_Object_). For more information, see [Formatting Types](../formattingtypes.md) and [Composite Formatting](compositeformat.md). The following example illustrates the use of standard format strings in formatting operations.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
TimeSpan duration = new TimeSpan(1, 12, 23, 62);
|
||||
string output = "Time of Travel: " + duration.ToString("c");
|
||||
Console.WriteLine(output);
|
||||
|
||||
Console.WriteLine("Time of Travel: {0:c}", duration);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Time of Travel: 1.12:24:02
|
||||
// Time of Travel: 1.12:24:02
|
||||
```
|
||||
|
||||
Standard [TimeSpan](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan) format strings are also used by the [TimeSpan.ParseExact](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan#System_TimeSpan_ParseExact_System_String_System_String_System_IFormatProvider_) and [TimeSpan.TryParseExact](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan#System_TimeSpan_TryParseExact_System_String_System_String_System_IFormatProvider_System_Globalization_TimeSpanStyles_System_TimeSpan__) methods to define the required format of input strings for parsing operations. (Parsing converts the string representation of a value to that value.) The following example illustrates the use of standard format strings in parsing operations.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string value = "1.03:14:56.1667";
|
||||
TimeSpan interval;
|
||||
try {
|
||||
interval = TimeSpan.ParseExact(value, "c", null);
|
||||
Console.WriteLine("Converted '{0}' to {1}", value, interval);
|
||||
}
|
||||
catch (FormatException) {
|
||||
Console.WriteLine("{0}: Bad Format", value);
|
||||
}
|
||||
catch (OverflowException) {
|
||||
Console.WriteLine("{0}: Out of Range", value);
|
||||
}
|
||||
|
||||
if (TimeSpan.TryParseExact(value, "c", null, out interval))
|
||||
Console.WriteLine("Converted '{0}' to {1}", value, interval);
|
||||
else
|
||||
Console.WriteLine("Unable to convert {0} to a time interval.",
|
||||
value);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Converted '1.03:14:56.1667' to 1.03:14:56.1667000
|
||||
// Converted '1.03:14:56.1667' to 1.03:14:56.1667000
|
||||
```
|
||||
|
||||
The following table lists the standard time interval format specifiers.
|
||||
|
||||
Format specifier | Name | Description | Examples
|
||||
---------------- | ---- | ----------- | --------
|
||||
"c" | Constant (invariant) format | This specifier is not culture-sensitive. It takes the form [-][d'.']hh':'mm':'ss['.'fffffff]. (The "t" and "T" format strings produce the same results.) | `TimeSpan.Zero -> 00:00:00`; `new TimeSpan(0, 0, 30, 0) -> 00:30:00`; `new TimeSpan(3, 17, 25, 30, 500) -> 3.17:25:30.5000000`
|
||||
"g" | General short format | This specifier outputs only what is needed. It is culture-sensitive and takes the form [-][d':']h':'mm':'ss[.FFFFFFF]. | `new TimeSpan(1, 3, 16, 50, 500) -> 1:3:16:50.5 (en-US)`; `new TimeSpan(1, 3, 16, 50, 500) -> 1:3:16:50,5 (fr-FR)`; `new TimeSpan(1, 3, 16, 50, 599) -> 1:3:16:50.599 (en-US)`; `new TimeSpan(1, 3, 16, 50, 599) -> 1:3:16:50,599 (fr-FR)`
|
||||
"G" | General long format | This specifier always outputs days and seven fractional digits. It is culture-sensitive and takes the form [-]d':'hh':'mm':'ss.fffffff. | `new TimeSpan(18, 30, 0) -> 0:18:30:00.0000000 (en-US)`; `new TimeSpan(18, 30, 0) -> 0:18:30:00,0000000 (fr-FR)`
|
||||
|
||||
## The Constant ("c") Format Specifier
|
||||
|
||||
The "c" format specifier returns the string representation of a [TimeSpan](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan) value in the following form:
|
||||
|
||||
[-][_d_.]_hh_:_mm_:_ss_[._fffffff_]
|
||||
|
||||
Elements in square brackets ([ and ]) are optional. The period (.) and colon (:) are literal symbols. The following table describes the remaining elements.
|
||||
|
||||
Element | Description
|
||||
------- | -----------
|
||||
- | An optional negative sign, which indicates a negative time interval.
|
||||
*d* | The optional number of days, with no leading zeros.
|
||||
*hh* | The number of hours, which ranges from "00" to "23".
|
||||
*mm* | The number of minutes, which ranges from "00" to "59".
|
||||
*ss* | The number of seconds, which ranges from "00" to "59".
|
||||
*fffffff* | The optional fractional portion of a second. Its value can range from "0000001" (one tick, or one ten-millionth of a second) to "9999999" (9,999,999 ten-millionths of a second, or one second less one tick).
|
||||
|
||||
Unlike the "g" and "G" format specifiers, the "c" format specifier is not culture-sensitive. It produces a string representation of a [TimeSpan](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan) value that is invariant and that matches the format used by all versions of the .NET Framework before the .NET Framework 4. "c" is the default [TimeSpan](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan) format string; the parameterless [TimeSpan.ToString](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan#System_TimeSpan_ToString) method formats a time interval value by using the "c" format string.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> [TimeSpan](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan) also supports the "t" and "T" standard format strings, which are identical in behavior to the "c" standard format string.
|
||||
|
||||
The following example instantiates two [TimeSpan](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan) objects, uses them to perform arithmetic operations, and displays the result. In each case, it uses composite formatting to display the [TimeSpan](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan) value by using the "c" format specifier.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
TimeSpan interval1, interval2;
|
||||
interval1 = new TimeSpan(7, 45, 16);
|
||||
interval2 = new TimeSpan(18, 12, 38);
|
||||
|
||||
Console.WriteLine("{0:c} - {1:c} = {2:c}", interval1,
|
||||
interval2, interval1 - interval2);
|
||||
Console.WriteLine("{0:c} + {1:c} = {2:c}", interval1,
|
||||
interval2, interval1 + interval2);
|
||||
|
||||
interval1 = new TimeSpan(0, 0, 1, 14, 365);
|
||||
interval2 = TimeSpan.FromTicks(2143756);
|
||||
Console.WriteLine("{0:c} + {1:c} = {2:c}", interval1,
|
||||
interval2, interval1 + interval2);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// 07:45:16 - 18:12:38 = -10:27:22
|
||||
// 07:45:16 + 18:12:38 = 1.01:57:54
|
||||
// 00:01:14.3650000 + 00:00:00.2143756 = 00:01:14.5793756
|
||||
```
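As a small additional sketch, the equivalence described in this section between "c", "t", "T", and the parameterless `ToString` method can be verified directly. The class name `ConstantFormatExample` is a hypothetical name used only for this illustration; the sample value is the one from the specifier table.

```csharp
using System;

public class ConstantFormatExample
{
    public static void Main()
    {
        TimeSpan interval = new TimeSpan(3, 17, 25, 30, 500);

        // "c" is the default format, so ToString() produces the same text.
        string constant = interval.ToString("c");
        Console.WriteLine(constant);                            // 3.17:25:30.5000000
        Console.WriteLine(constant == interval.ToString());     // True
        // "t" and "T" behave identically to "c".
        Console.WriteLine(constant == interval.ToString("t"));  // True
        Console.WriteLine(constant == interval.ToString("T"));  // True
    }
}
```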
|
||||
|
||||
## The General Short ("g") Format Specifier
|
||||
|
||||
The "g" [TimeSpan](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan) format specifier returns the string representation of a [TimeSpan](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan) value in a compact form by including only the elements that are necessary. It has the following form:
|
||||
|
||||
[-][_d_:]_h_:_mm_:_ss_[._FFFFFFF_]
|
||||
|
||||
Elements in square brackets ([ and ]) are optional. The colon (:) is a literal symbol. The following table describes the remaining elements.
|
||||
|
||||
Element | Description
|
||||
------- | -----------
|
||||
- | An optional negative sign, which indicates a negative time interval.
|
||||
*d* | The optional number of days, with no leading zeros.
|
||||
*h* | The number of hours, which ranges from "0" to "23", with no leading zeros.
|
||||
*mm* | The number of minutes, which ranges from "00" to "59".
|
||||
*ss* | The number of seconds, which ranges from "00" to "59".
|
||||
. | The fractional seconds separator.
|
||||
*FFFFFFF* | The fractional seconds. As few digits as possible are displayed.
|
||||
|
||||
Like the "G" format specifier, the "g" format specifier is localized. Its fractional seconds separator is based on the current culture.
|
||||
|
||||
The following example instantiates two [TimeSpan](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan) objects, uses them to perform arithmetic operations, and displays the result. In each case, it uses composite formatting to display the [TimeSpan](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan) value by using the "g" format specifier. In addition, it formats the [TimeSpan](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan) value by using the formatting conventions of the current system culture (which, in this case, is English - United States or en-US) and the French - France (fr-FR) culture.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Globalization;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
TimeSpan interval1, interval2;
|
||||
interval1 = new TimeSpan(7, 45, 16);
|
||||
interval2 = new TimeSpan(18, 12, 38);
|
||||
|
||||
Console.WriteLine("{0:g} - {1:g} = {2:g}", interval1,
|
||||
interval2, interval1 - interval2);
|
||||
Console.WriteLine(String.Format(new CultureInfo("fr-FR"),
|
||||
"{0:g} + {1:g} = {2:g}", interval1,
|
||||
interval2, interval1 + interval2));
|
||||
|
||||
interval1 = new TimeSpan(0, 0, 1, 14, 36);
|
||||
interval2 = TimeSpan.FromTicks(2143756);
|
||||
Console.WriteLine("{0:g} + {1:g} = {2:g}", interval1,
|
||||
interval2, interval1 + interval2);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// 7:45:16 - 18:12:38 = -10:27:22
|
||||
// 7:45:16 + 18:12:38 = 1:1:57:54
|
||||
// 0:01:14.036 + 0:00:00.2143756 = 0:01:14.2503756
|
||||
```
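The culture sensitivity of the "g" specifier can also be seen by passing a culture explicitly to `TimeSpan.ToString(String, IFormatProvider)`. The following is a minimal sketch; the class name `GeneralShortExample` and the helper `FormatShort` are hypothetical, and the separator shown for fr-FR depends on the culture data available on the machine (for example, an application running in a globalization-invariant mode would fall back to the invariant separator).

```csharp
using System;
using System.Globalization;

public class GeneralShortExample
{
    // Hypothetical helper: formats a time interval with "g" for a given culture.
    public static string FormatShort(TimeSpan value, CultureInfo culture)
    {
        return value.ToString("g", culture);
    }

    public static void Main()
    {
        TimeSpan interval = new TimeSpan(1, 3, 16, 50, 500);

        // The invariant culture uses a period as the fractional seconds separator.
        Console.WriteLine("invariant: {0}",
                          FormatShort(interval, CultureInfo.InvariantCulture));

        // Other cultures supply their own separator.
        foreach (string name in new[] { "en-US", "fr-FR" })
        {
            CultureInfo culture = new CultureInfo(name);
            Console.WriteLine("{0}: {1}", name, FormatShort(interval, culture));
        }
    }
}
```

Only the fractional seconds separator changes between cultures; the colons between days, hours, minutes, and seconds are literal.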
|
||||
|
||||
## The General Long ("G") Format Specifier
|
||||
|
||||
The "G" [TimeSpan](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan) format specifier returns the string representation of a [TimeSpan](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan) value in a long form that always includes both days and fractional seconds. The string that results from the "G" standard format specifier has the following form:
|
||||
|
||||
[-]*d*:*hh*:*mm*:*ss*.*fffffff*
|
||||
|
||||
Elements in square brackets ([ and ]) are optional. The colon (:) is a literal symbol. The following table describes the remaining elements.
|
||||
|
||||
Element | Description
|
||||
------- | -----------
|
||||
- | An optional negative sign, which indicates a negative time interval.
|
||||
*d* | The number of days, with no leading zeros.
|
||||
*hh* | The number of hours, which ranges from "00" to "23".
|
||||
*mm* | The number of minutes, which ranges from "00" to "59".
|
||||
*ss* | The number of seconds, which ranges from "00" to "59".
|
||||
. | The fractional seconds separator.
|
||||
*fffffff* | The fractional seconds.
|
||||
|
||||
Like the "g" format specifier, the "G" format specifier is localized. Its fractional seconds separator is based on the current culture.
|
||||
|
||||
The following example instantiates two [TimeSpan](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan) objects, uses them to perform arithmetic operations, and displays the result. In each case, it uses composite formatting to display the [TimeSpan](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan) value by using the "G" format specifier. In addition, it formats the [TimeSpan](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan) value by using the formatting conventions of the current system culture (which, in this case, is English - United States or en-US) and the French - France (fr-FR) culture.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Globalization;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
TimeSpan interval1, interval2;
|
||||
interval1 = new TimeSpan(7, 45, 16);
|
||||
interval2 = new TimeSpan(18, 12, 38);
|
||||
|
||||
Console.WriteLine("{0:G} - {1:G} = {2:G}", interval1,
|
||||
interval2, interval1 - interval2);
|
||||
Console.WriteLine(String.Format(new CultureInfo("fr-FR"),
|
||||
"{0:G} + {1:G} = {2:G}", interval1,
|
||||
interval2, interval1 + interval2));
|
||||
|
||||
interval1 = new TimeSpan(0, 0, 1, 14, 36);
|
||||
interval2 = TimeSpan.FromTicks(2143756);
|
||||
Console.WriteLine("{0:G} + {1:G} = {2:G}", interval1,
|
||||
interval2, interval1 + interval2);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// 0:07:45:16.0000000 - 0:18:12:38.0000000 = -0:10:27:22.0000000
|
||||
// 0:07:45:16,0000000 + 0:18:12:38,0000000 = 1:01:57:54,0000000
|
||||
// 0:00:01:14.0360000 + 0:00:00:00.2143756 = 0:00:01:14.2503756
|
||||
```
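To compare the three standard specifiers side by side, the following sketch formats one value with each of them. It assumes the invariant culture so that the output does not depend on the machine's settings; the class name `SpecifierComparison` is hypothetical.

```csharp
using System;
using System.Globalization;

public class SpecifierComparison
{
    public static void Main()
    {
        TimeSpan interval = new TimeSpan(7, 45, 16);
        CultureInfo invariant = CultureInfo.InvariantCulture;

        // "c": two-digit hours; days and fraction only when they are nonzero.
        Console.WriteLine(interval.ToString("c", invariant));   // 07:45:16
        // "g": only the elements that are needed, hours without a leading zero.
        Console.WriteLine(interval.ToString("g", invariant));   // 7:45:16
        // "G": always days and seven fractional digits.
        Console.WriteLine(interval.ToString("G", invariant));   // 0:07:45:16.0000000
    }
}
```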
|
||||
|
||||
## See Also
|
||||
|
||||
[Composite Formatting](compositeformat.md)
|
||||
|
||||
|
|
@ -1,32 +0,0 @@
|
|||
---
|
||||
title: Working with Base Types
|
||||
description: Working with Base Types
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: c2279b39-4de6-4df3-b5a4-3836509884d2
|
||||
---
|
||||
|
||||
# Working with Base Types
|
||||
|
||||
This section describes .NET Core base type operations, including formatting, conversion, and common operations.
|
||||
|
||||
## In This Section
|
||||
|
||||
[Type Conversion](conversio/conversiontables.md) - Describes how to convert from one type to another.
|
||||
|
||||
[Formatting Types](format/index.md) - Describes how to format strings using the string format specifiers.
|
||||
|
||||
[Manipulating Strings](manipulating/index.md) - Describes how to manipulate and format strings.
|
||||
|
||||
[Parsing Strings](parsing/index.md) - Describes how to convert strings into types.
|
||||
|
||||
## Related Sections
|
||||
|
||||
[Common Type System](commontypesystem.md) - Describes types used in .NET Core.
|
||||
|
|
@ -1,163 +0,0 @@
|
|||
---
|
||||
title: How to: Perform Basic String Manipulations
|
||||
description: How to: Perform Basic String Manipulations
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: a5f414e3-acb8-47f2-a1fd-4d327d593205
|
||||
---
|
||||
|
||||
# How to: Perform Basic String Manipulations
|
||||
|
||||
The following example uses some of the methods discussed in the Basic String Operations topics to construct a class that performs string manipulations in a manner that might be found in a real-world application. The `MailToData` class stores the name and address of an individual in separate properties and provides a way to combine the `City`, `State`, and `Zip` fields into a single string for display to the user. Furthermore, the class allows the user to enter the city, state, and ZIP Code information as a single string; the application automatically parses the single string and enters the proper information into the corresponding property.
|
||||
|
||||
For simplicity, this example uses a console application with a command-line interface.
|
||||
|
||||
## Example
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
|
||||
class MainClass
|
||||
{
|
||||
static void Main()
|
||||
{
|
||||
MailToData MyData = new MailToData();
|
||||
|
||||
Console.Write("Enter Your Name: ");
|
||||
MyData.Name = Console.ReadLine();
|
||||
Console.Write("Enter Your Address: ");
|
||||
MyData.Address = Console.ReadLine();
|
||||
Console.Write("Enter Your City, State, and ZIP Code separated by spaces: ");
|
||||
MyData.CityStateZip = Console.ReadLine();
|
||||
Console.WriteLine();
|
||||
|
||||
if (MyData.Validated) {
|
||||
Console.WriteLine("Name: {0}", MyData.Name);
|
||||
Console.WriteLine("Address: {0}", MyData.Address);
|
||||
Console.WriteLine("City: {0}", MyData.City);
|
||||
Console.WriteLine("State: {0}", MyData.State);
|
||||
Console.WriteLine("Zip: {0}", MyData.Zip);
|
||||
|
||||
Console.WriteLine("\nThe following address will be used:");
|
||||
Console.WriteLine(MyData.Address);
|
||||
Console.WriteLine(MyData.CityStateZip);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
public class MailToData
|
||||
{
|
||||
string name = "";
|
||||
string address = "";
|
||||
string citystatezip = "";
|
||||
string city = "";
|
||||
string state = "";
|
||||
string zip = "";
|
||||
bool parseSucceeded = false;
|
||||
|
||||
public string Name
|
||||
{
|
||||
get{return name;}
|
||||
set{name = value;}
|
||||
}
|
||||
|
||||
public string Address
|
||||
{
|
||||
get{return address;}
|
||||
set{address = value;}
|
||||
}
|
||||
|
||||
public string CityStateZip
|
||||
{
|
||||
get {
|
||||
return String.Format("{0}, {1} {2}", city, state, zip);
|
||||
}
|
||||
set {
|
||||
citystatezip = value.Trim();
|
||||
ParseCityStateZip();
|
||||
}
|
||||
}
|
||||
|
||||
public string City
|
||||
{
|
||||
get{return city;}
|
||||
set{city = value;}
|
||||
}
|
||||
|
||||
public string State
|
||||
{
|
||||
get{return state;}
|
||||
set{state = value;}
|
||||
}
|
||||
|
||||
public string Zip
|
||||
{
|
||||
get{return zip;}
|
||||
set{zip = value;}
|
||||
}
|
||||
|
||||
public bool Validated
|
||||
{
|
||||
get { return parseSucceeded; }
|
||||
}
|
||||
|
||||
private void ParseCityStateZip()
|
||||
{
|
||||
string msg = "";
|
||||
const string msgEnd = "\nYou must enter spaces between city, state, and zip code.\n";
|
||||
|
||||
// Throw a FormatException if the user did not enter the necessary spaces
|
||||
// between elements.
|
||||
try
|
||||
{
|
||||
// City may consist of multiple words, so we'll have to parse the
|
||||
// string from right to left starting with the zip code.
|
||||
int zipIndex = citystatezip.LastIndexOf(" ");
|
||||
if (zipIndex == -1) {
|
||||
msg = "\nCannot identify a zip code." + msgEnd;
|
||||
throw new FormatException(msg);
|
||||
}
|
||||
zip = citystatezip.Substring(zipIndex + 1);
|
||||
|
||||
int stateIndex = citystatezip.LastIndexOf(" ", zipIndex - 1);
|
||||
if (stateIndex == -1) {
|
||||
msg = "\nCannot identify a state." + msgEnd;
|
||||
throw new FormatException(msg);
|
||||
}
|
||||
state = citystatezip.Substring(stateIndex + 1, zipIndex - stateIndex - 1);
|
||||
state = state.ToUpper();
|
||||
|
||||
city = citystatezip.Substring(0, stateIndex);
|
||||
if (city.Length == 0) {
|
||||
msg = "\nCannot identify a city." + msgEnd;
|
||||
throw new FormatException(msg);
|
||||
}
|
||||
parseSucceeded = true;
|
||||
}
|
||||
catch (FormatException ex)
|
||||
{
|
||||
Console.WriteLine(ex.Message);
|
||||
}
|
||||
}
|
||||
|
||||
private string ReturnCityStateZip()
|
||||
{
|
||||
// Make state uppercase.
|
||||
state = state.ToUpper();
|
||||
|
||||
// Put the value of city, state, and zip together in the proper manner.
|
||||
string MyCityStateZip = String.Concat(city, ", ", state, " ", zip);
|
||||
|
||||
return MyCityStateZip;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
When the preceding code is executed, the user is asked to enter his or her name and address. The application places the information in the appropriate properties and displays the information back to the user, creating a single string that displays the city, state, and ZIP Code information.
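The right-to-left parsing step is the core of the example, so it may help to see it in isolation. The following is a simplified sketch, not a replacement for the `MailToData` class: the `CityStateZipParser` class and its `TryParse` method are hypothetical names, and the method reports failure through its return value instead of throwing a `FormatException`.

```csharp
using System;

public static class CityStateZipParser
{
    // Hypothetical helper: splits "City Name ST 12345" from right to left,
    // because the city may contain spaces but the state and ZIP Code do not.
    public static bool TryParse(string input, out string city,
                                out string state, out string zip)
    {
        city = state = zip = "";
        if (input == null) return false;

        string text = input.Trim();

        // The last space separates the ZIP Code from the rest.
        int zipIndex = text.LastIndexOf(' ');
        if (zipIndex <= 0) return false;

        // The space before that separates the state from the city.
        int stateIndex = text.LastIndexOf(' ', zipIndex - 1);
        if (stateIndex <= 0) return false;

        string parsedCity = text.Substring(0, stateIndex).Trim();
        string parsedState = text.Substring(stateIndex + 1, zipIndex - stateIndex - 1);
        if (parsedCity.Length == 0 || parsedState.Length == 0) return false;

        city = parsedCity;
        state = parsedState.ToUpper();
        zip = text.Substring(zipIndex + 1);
        return true;
    }

    public static void Main()
    {
        string city, state, zip;
        if (TryParse("New York ny 10001", out city, out state, out zip))
            Console.WriteLine("{0}, {1} {2}", city, state, zip);   // New York, NY 10001
    }
}
```

Returning a Boolean rather than throwing keeps the sketch short; the original class instead catches the exception internally and exposes the outcome through its `Validated` property.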
|
||||
|
|
@ -1,52 +0,0 @@
|
|||
---
|
||||
title: Changing Case
|
||||
description: Changing Case
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: b4fbbe41-e16f-4767-ae19-fdc9bc0b6f10
|
||||
---
|
||||
|
||||
# Changing Case
|
||||
|
||||
If you write an application that accepts input from a user, you can never be sure what case he or she will use to enter the data. Often, you want strings to be cased consistently, particularly if you are displaying them in the user interface. The following table describes two case-changing methods.
|
||||
|
||||
Method name | Use
|
||||
----------- | ---
|
||||
[String.ToUpper](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_ToUpper) | Converts all characters in a string to uppercase.
|
||||
[String.ToLower](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_ToLower) | Converts all characters in a string to lowercase.
|
||||
|
||||
> **Warning**
|
||||
>
|
||||
> Note that the `String.ToUpper` and `String.ToLower` methods should not be used to convert strings in order to compare them or test them for equality.
|
||||
|
||||
## Comparing strings of mixed case
|
||||
|
||||
To compare strings of mixed case to determine whether they are equal, call one of the overloads of the [String.Equals](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Equals_System_String_System_StringComparison_) method with a *comparisonType* parameter, and provide a value of either [StringComparison.CurrentCultureIgnoreCase](https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_CurrentCultureIgnoreCase) or [StringComparison.OrdinalIgnoreCase](https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_OrdinalIgnoreCase) for the *comparisonType* argument.
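A minimal sketch of such a comparison follows. The class name `CaseInsensitiveEquality` and the helper `SameIgnoringCase` are hypothetical; the sketch uses `StringComparison.OrdinalIgnoreCase`, which is one of the two values mentioned above and does not depend on the current culture.

```csharp
using System;

public class CaseInsensitiveEquality
{
    // Hypothetical helper: compares two strings for equality, ignoring case,
    // without converting either string to uppercase or lowercase first.
    public static bool SameIgnoringCase(string first, string second)
    {
        return String.Equals(first, second, StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        Console.WriteLine(SameIgnoringCase("Hello World!", "HELLO WORLD!"));  // True
        Console.WriteLine(SameIgnoringCase("Hello World!", "Hello World?"));  // False
        // The static overload also accepts null arguments.
        Console.WriteLine(SameIgnoringCase(null, null));                      // True
    }
}
```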
|
||||
|
||||
## ToUpper
|
||||
|
||||
The [String.ToUpper](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_ToUpper) method changes all characters in a string to uppercase. The following example converts the string "Hello World!" from mixed case to uppercase.
|
||||
|
||||
```csharp
|
||||
string properString = "Hello World!";
|
||||
Console.WriteLine(properString.ToUpper());
|
||||
// This example displays the following output:
|
||||
// HELLO WORLD!
|
||||
```
|
||||
|
||||
## ToLower
|
||||
|
||||
The [String.ToLower](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_ToLower) method is similar to the previous method, but instead converts all the characters in a string to lowercase. The following example converts the string "Hello World!" to lowercase.
|
||||
|
||||
```csharp
|
||||
string properString = "Hello World!";
|
||||
Console.WriteLine(properString.ToLower());
|
||||
// This example displays the following output:
|
||||
// hello world!
|
||||
```
|
|
@ -1,167 +0,0 @@
|
|||
---
|
||||
title: Comparing Strings
|
||||
description: Comparing Strings
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: bb028741-1d8b-4719-acb7-c8fa917608c6
|
||||
---
|
||||
|
||||
# Comparing Strings
|
||||
|
||||
.NET Core provides several methods to compare the values of strings. The following table lists and describes the value-comparison methods.
|
||||
|
||||
Method name | Use
|
||||
----------- | ---
|
||||
[String.Compare](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Compare_System_String_System_Int32_System_String_System_Int32_System_Int32_) | Compares the values of two strings. Returns an integer value.
|
||||
[String.CompareOrdinal](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_CompareOrdinal_System_String_System_Int32_System_String_System_Int32_System_Int32_) | Compares two strings without regard to local culture. Returns an integer value.
|
||||
[String.CompareTo](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_CompareTo_System_String_) | Compares the current string object to another string. Returns an integer value.
|
||||
[String.StartsWith](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_StartsWith_System_String_) | Determines whether a string begins with the string passed. Returns a Boolean value.
|
||||
[String.EndsWith](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_EndsWith_System_String_) | Determines whether a string ends with the string passed. Returns a Boolean value.
|
||||
[String.Equals](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Equals_System_String_System_String_System_StringComparison_) | Determines whether two strings are the same. Returns a Boolean value.
|
||||
[String.IndexOf](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_IndexOf_System_Char_) | Returns the index position of a character or string, starting from the beginning of the string you are examining. Returns an integer value.
|
||||
[String.LastIndexOf](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_LastIndexOf_System_Char_) | Returns the index position of a character or string, starting from the end of the string you are examining. Returns an integer value.
|
||||
|
||||
## Compare
|
||||
|
||||
The static [String.Compare](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Compare_System_String_System_Int32_System_String_System_Int32_System_Int32_) method provides a thorough way of comparing two strings. This method is culturally aware. You can use this function to compare two strings or substrings of two strings. Additionally, overloads are provided that regard or disregard case and cultural variance. The following table shows the three integer values that this method might return.
|
||||
|
||||
Return value | Condition
|
||||
------------ | ---------
|
||||
A negative integer | The first string precedes the second string in the sort order, or the first string is `null`.
|
||||
0 | The first string and the second string are equal, or both strings are `null`.
|
||||
A positive integer | The first string follows the second string in the sort order, or the second string is `null`.
|
||||
|
||||
> **Important**
|
||||
>
|
||||
> The [String.Compare](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Compare_System_String_System_Int32_System_String_System_Int32_System_Int32_) method is primarily intended for use when ordering or sorting strings. You should not use the [String.Compare](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Compare_System_String_System_Int32_System_String_System_Int32_System_Int32_) method to test for equality (that is, to explicitly look for a return value of 0 with no regard for whether one string is less than or greater than the other). Instead, to determine whether two strings are equal, use the [String.Equals(String, String, StringComparison)](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Equals_System_String_System_String_System_StringComparison_) method.
|
||||
|
||||
The following example uses the [String.Compare](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Compare_System_String_System_Int32_System_String_System_Int32_System_Int32_) method to determine the relative values of two strings.
|
||||
|
||||
```csharp
|
||||
string string1 = "Hello World!";
|
||||
Console.WriteLine(String.Compare(string1, "Hello World?"));
|
||||
```
|
||||
|
||||
This example displays `-1` to the console.
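
Because `String.Compare` is intended for ordering, a common use is as the comparison inside a sort. The following is a minimal sketch, not a definitive pattern: `SortWords` is a hypothetical helper name, and it assumes the `String.Compare(String, String, StringComparison)` overload to choose between case-sensitive and case-insensitive ordinal ordering.

```csharp
using System;

public static class CompareSortExample
{
    // Hypothetical helper: returns a sorted copy of the input using the given comparison rules.
    public static string[] SortWords(string[] words, StringComparison comparison)
    {
        string[] copy = (string[])words.Clone();
        Array.Sort(copy, (a, b) => String.Compare(a, b, comparison));
        return copy;
    }

    public static void Main()
    {
        string[] words = { "pear", "Banana", "apple" };

        // Ordinal comparison orders by character code, so uppercase letters sort first.
        Console.WriteLine(String.Join(", ", SortWords(words, StringComparison.Ordinal)));
        // Banana, apple, pear

        // Ignoring case gives plain alphabetical order for these words.
        Console.WriteLine(String.Join(", ", SortWords(words, StringComparison.OrdinalIgnoreCase)));
        // apple, Banana, pear
    }
}
```

Only the sign of the value returned by `String.Compare` matters to the sort, which is why the example never inspects the exact number.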
|
||||
|
||||
## CompareOrdinal
|
||||
|
||||
The [String.CompareOrdinal](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_CompareOrdinal_System_String_System_Int32_System_String_System_Int32_System_Int32_) method compares two string objects without considering the local culture. The return values of this method are identical to the values returned by the `Compare` method in the previous table.
|
||||
|
||||
> **Important**
|
||||
>
|
||||
> The [String.CompareOrdinal](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_CompareOrdinal_System_String_System_Int32_System_String_System_Int32_System_Int32_) method is primarily intended for use when ordering or sorting strings. You should not use the [String.CompareOrdinal](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_CompareOrdinal_System_String_System_Int32_System_String_System_Int32_System_Int32_) method to test for equality (that is, to explicitly look for a return value of 0 with no regard for whether one string is less than or greater than the other). Instead, to determine whether two strings are equal, use the [String.Equals(String, String, StringComparison)](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Equals_System_String_System_String_System_StringComparison_) method.
|
||||
|
||||
The following example uses the `CompareOrdinal` method to compare the values of two strings.
|
||||
|
||||
```csharp
|
||||
string string1 = "Hello World!";
|
||||
Console.WriteLine(String.CompareOrdinal(string1, "hello world!"));
|
||||
```
|
||||
|
||||
This example displays `-32` to the console.
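
The exact number returned by `CompareOrdinal` is an implementation detail, so code usually relies only on its sign. The sketch below normalizes the result with `Math.Sign`; `OrdinalSign` is a hypothetical helper introduced for illustration.

```csharp
using System;

public static class OrdinalExample
{
    // Hypothetical helper: reduces an ordinal comparison result to -1, 0, or 1.
    public static int OrdinalSign(string a, string b)
    {
        return Math.Sign(String.CompareOrdinal(a, b));
    }

    public static void Main()
    {
        // 'H' (U+0048) has a lower code point than 'h' (U+0068).
        Console.WriteLine(OrdinalSign("Hello World!", "hello world!")); // -1

        // 'a' (U+0061) has a higher code point than 'B' (U+0042).
        Console.WriteLine(OrdinalSign("a", "B"));                       // 1

        Console.WriteLine(OrdinalSign("same", "same"));                 // 0
    }
}
```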
|
||||
|
||||
## CompareTo
|
||||
|
||||
The [String.CompareTo](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_CompareTo_System_String_) method compares the string that the current string object encapsulates to another string or object. The return values of this method are identical to the values returned by the `String.Compare` method in the previous table.
|
||||
|
||||
> **Important**
|
||||
>
|
||||
> The [String.CompareTo](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_CompareTo_System_String_) method is primarily intended for use when ordering or sorting strings. You should not use the [String.CompareTo](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_CompareTo_System_String_) method to test for equality (that is, to explicitly look for a return value of 0 with no regard for whether one string is less than or greater than the other). Instead, to determine whether two strings are equal, use the [String.Equals(String, String, StringComparison)](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Equals_System_String_System_String_System_StringComparison_) method.
|
||||
|
||||
The following example uses the `String.CompareTo` method to compare the `string1` object to the `string2` object.
|
||||
|
||||
```csharp
|
||||
string string1 = "Hello World";
|
||||
string string2 = "Hello World!";
|
||||
int MyInt = string1.CompareTo(string2);
|
||||
Console.WriteLine(MyInt);
|
||||
```
|
||||
|
||||
This example displays `-1` to the console.
|
||||
|
||||
## Equals
|
||||
|
||||
The [String.Equals](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Equals_System_String_) method determines whether two strings are the same. This case-sensitive method returns a `true` or `false` Boolean value. You can call it on an existing string instance. The following example uses the `Equals` method to determine whether a string object contains the phrase "Hello World".
|
||||
|
||||
```csharp
|
||||
string string1 = "Hello World";
|
||||
Console.WriteLine(string1.Equals("Hello World"));
|
||||
```
|
||||
|
||||
This example displays `True` to the console.
|
||||
|
||||
This method can also be used as a static method. The following example compares two string objects using a static method.
|
||||
|
||||
```csharp
|
||||
string string1 = "Hello World";
|
||||
string string2 = "Hello World";
|
||||
Console.WriteLine(String.Equals(string1, string2));
|
||||
```
|
||||
|
||||
This example displays `True` to the console.
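
When case should be ignored, or when either value might be `null`, the static overload that takes a `StringComparison` value is a reasonable choice. The following sketch assumes ordinal, case-insensitive matching is what you want; `EqualsIgnoreCase` is a hypothetical helper name.

```csharp
using System;

public static class EqualsExample
{
    // Hypothetical helper: case-insensitive, culture-independent equality.
    public static bool EqualsIgnoreCase(string a, string b)
    {
        return String.Equals(a, b, StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        string string1 = "Hello World";
        string missing = null;

        // The instance method shown earlier is case-sensitive.
        Console.WriteLine(string1.Equals("hello world"));           // False

        // An explicit StringComparison value ignores case here.
        Console.WriteLine(EqualsIgnoreCase(string1, "hello world")); // True

        // The static method accepts null arguments without throwing.
        Console.WriteLine(EqualsIgnoreCase(missing, string1));       // False
        Console.WriteLine(EqualsIgnoreCase(missing, null));          // True
    }
}
```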
|
||||
|
||||
## StartsWith and EndsWith
|
||||
|
||||
You can use the [String.StartsWith](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_StartsWith_System_String_) method to determine whether a string object begins with the characters of another string. This case-sensitive method returns `true` if the current string object begins with the passed string and `false` if it does not. The following example uses this method to determine if a string object begins with "Hello".
|
||||
|
||||
```csharp
|
||||
string string1 = "Hello World";
|
||||
Console.WriteLine(string1.StartsWith("Hello"));
|
||||
```
|
||||
|
||||
This example displays `True` to the console.
|
||||
|
||||
The [String.EndsWith](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_EndsWith_System_String_) method compares a passed string to the characters that exist at the end of the current string object. It also returns a Boolean value. The following example checks the end of a string using the `EndsWith` method.
|
||||
|
||||
```csharp
|
||||
string string1 = "Hello World";
|
||||
Console.WriteLine(string1.EndsWith("Hello"));
|
||||
```
|
||||
|
||||
This example displays `False` to the console.
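
Both methods also have overloads that take a `StringComparison` value, which is useful when case should not matter. The sketch below checks a file extension; `HasExtension` is a hypothetical helper, and ordinal case-insensitive matching is an assumption about what the caller wants.

```csharp
using System;

public static class PrefixSuffixExample
{
    // Hypothetical helper: case-insensitive check for a file extension.
    public static bool HasExtension(string fileName, string extension)
    {
        return fileName.EndsWith(extension, StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        Console.WriteLine(HasExtension("REPORT.TXT", ".txt"));  // True
        Console.WriteLine(HasExtension("report.pdf", ".txt"));  // False

        // An ordinal comparison is case-sensitive, so "hello" does not match "Hello".
        Console.WriteLine("Hello World".StartsWith("hello", StringComparison.Ordinal)); // False
    }
}
```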
|
||||
|
||||
## IndexOf and LastIndexOf
|
||||
|
||||
You can use the [String.IndexOf](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_IndexOf_System_Char_) method to determine the position of the first occurrence of a particular character within a string. This case-sensitive method starts counting from the beginning of a string and returns the position of a passed character using a zero-based index. If the character cannot be found, a value of `-1` is returned.
|
||||
|
||||
The following example uses the `IndexOf` method to search for the first occurrence of the '`l`' character in a string.
|
||||
|
||||
```csharp
|
||||
string string1 = "Hello World";
|
||||
Console.WriteLine(string1.IndexOf('l'));
|
||||
```
|
||||
|
||||
This example displays `2` to the console.
|
||||
|
||||
The [String.LastIndexOf](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_LastIndexOf_System_Char_) method is similar to the `String.IndexOf` method except that it returns the position of the last occurrence of a particular character within a string. It is case-sensitive and uses a zero-based index.
|
||||
|
||||
The following example uses the `LastIndexOf` method to search for the last occurrence of the '`l`' character in a string.
|
||||
|
||||
```csharp
|
||||
string string1 = "Hello World";
|
||||
Console.WriteLine(string1.LastIndexOf('l'));
|
||||
```
|
||||
|
||||
This example displays `9` to the console.
|
||||
|
||||
Both methods are useful in conjunction with the [String.Remove](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Remove_System_Int32_) method. You can use either the `IndexOf` or the `LastIndexOf` method to retrieve the position of a character, and then supply that position to the `Remove` method to remove a character or a word that begins with that character.
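
The combination described above can be sketched as follows. `RemoveFrom` is a hypothetical helper; it guards against the `-1` result, because passing a negative index to `Remove` throws an exception.

```csharp
using System;

public static class IndexAndRemoveExample
{
    // Hypothetical helper: removes everything from the first occurrence of marker onward.
    public static string RemoveFrom(string text, char marker)
    {
        int index = text.IndexOf(marker);

        // IndexOf returns -1 when the character is absent; return the text unchanged.
        return index < 0 ? text : text.Remove(index);
    }

    public static void Main()
    {
        string string1 = "Hello World";

        // The first space is at index 5, so everything from there on is removed.
        Console.WriteLine(RemoveFrom(string1, ' '));                      // Hello

        // The last 'o' is at index 7; remove just that one character.
        Console.WriteLine(string1.Remove(string1.LastIndexOf('o'), 1));   // Hello Wrld
    }
}
```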
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
|
@ -1,117 +0,0 @@
|
|||
---
|
||||
title: Creating New Strings
|
||||
description: Creating New Strings
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 8e8ddb1b-218e-4007-ae6d-b2e7f91f155d
|
||||
---
|
||||
|
||||
# Creating New Strings
|
||||
|
||||
.NET Core allows strings to be created using simple assignment, and also overloads a class constructor to support string creation using a number of different parameters. .NET Core also provides several methods in the [System.String](https://docs.microsoft.com/dotnet/core/api/System.String) class that create new string objects by combining several strings, arrays of strings, or objects.
|
||||
|
||||
## Creating Strings Using Assignment
|
||||
|
||||
The easiest way to create a new [String](https://docs.microsoft.com/dotnet/core/api/System.String) object is simply to assign a string literal to a [String](https://docs.microsoft.com/dotnet/core/api/System.String) object.
|
||||
|
||||
## Creating Strings Using a Class Constructor
|
||||
|
||||
You can use overloads of the [String](https://docs.microsoft.com/dotnet/core/api/System.String) class constructor to create strings from character arrays. You can also create a new string by duplicating a particular character a specified number of times.
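
As a brief illustration of assignment and the two constructor overloads mentioned above, consider the following sketch. The variable names are arbitrary.

```csharp
using System;

public static class CreateStringsExample
{
    public static void Main()
    {
        // Assignment from a string literal.
        string greeting = "Hello World!";

        // Constructor overload that takes a character array.
        char[] letters = { 'H', 'e', 'l', 'l', 'o' };
        string fromChars = new string(letters);

        // Constructor overload that repeats one character a given number of times.
        string rule = new string('-', 12);

        Console.WriteLine(greeting);   // Hello World!
        Console.WriteLine(fromChars);  // Hello
        Console.WriteLine(rule);       // ------------
    }
}
```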
|
||||
|
||||
## Methods that Return Strings
|
||||
|
||||
The following table lists several useful methods that return new string objects.
|
||||
|
||||
Method name | Use
|
||||
----------- | ---
|
||||
[String.Format](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Format_System_String_System_Object_) | Builds a formatted string from a set of input objects.
|
||||
[String.Concat](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Concat_System_String_System_String_) | Builds strings from two or more strings.
|
||||
[String.Join](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Join_System_String_System_String___) | Builds a new string by combining an array of strings.
|
||||
[String.Insert](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Insert_System_Int32_System_String_) | Builds a new string by inserting a string into the specified index of an existing string.
|
||||
[String.CopyTo](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_CopyTo_System_Int32_System_Char___System_Int32_System_Int32_) | Copies specified characters in a string into a specified position in an array of characters.
|
||||
|
||||
### Format
|
||||
|
||||
You can use the `String.Format` method to create formatted strings and concatenate strings representing multiple objects. This method automatically converts any passed object into a string. For example, if your application must display an [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32) value and a [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) value to the user, you can easily construct a string to represent these values using the `Format` method.
|
||||
|
||||
The following example uses the `Format` method to create a string that uses an integer variable.
|
||||
|
||||
```csharp
|
||||
int numberOfFleas = 12;
|
||||
string miscInfo = String.Format("Your dog has {0} fleas. " +
|
||||
"It is time to get a flea collar. " +
|
||||
"The current universal date is: {1:u}.",
|
||||
numberOfFleas, DateTime.Now);
|
||||
Console.WriteLine(miscInfo);
|
||||
// The example displays the following output:
|
||||
// Your dog has 12 fleas. It is time to get a flea collar.
|
||||
// The current universal date is: 2008-03-28 13:31:40Z.
|
||||
```
|
||||
|
||||
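In this example, the `u` format specifier formats the [DateTime.Now](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_Now) value using the universal sortable date and time pattern, which is the same regardless of the culture associated with the current thread. The specifier does not convert the value to UTC.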
|
||||
|
||||
### Concat
|
||||
|
||||
The `String.Concat` method can be used to easily create a new string object from two or more existing objects. It provides a language-independent way to concatenate strings. This method accepts any class that derives from `System.Object`. The following example creates a string from two existing string objects and a separating character.
|
||||
|
||||
```csharp
|
||||
string helloString1 = "Hello";
|
||||
string helloString2 = "World!";
|
||||
Console.WriteLine(String.Concat(helloString1, ' ', helloString2));
|
||||
// The example displays the following output:
|
||||
// Hello World!
|
||||
```
|
||||
|
||||
### Join
|
||||
|
||||
The `String.Join` method creates a new string from an array of strings and a separator string. This method is useful if you want to concatenate multiple strings together, making a list perhaps separated by a comma.
|
||||
|
||||
The following example uses a space as the separator to join the elements of a string array.
|
||||
|
||||
```csharp
|
||||
string[] words = {"Hello", "and", "welcome", "to", "my" , "world!"};
|
||||
Console.WriteLine(String.Join(" ", words));
|
||||
// The example displays the following output:
|
||||
// Hello and welcome to my world!
|
||||
```
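
The comma-separated list mentioned above might look like the following sketch. `ToCommaList` is a hypothetical helper name.

```csharp
using System;

public static class JoinExample
{
    // Hypothetical helper: joins the items with a comma and a space.
    public static string ToCommaList(string[] items)
    {
        return String.Join(", ", items);
    }

    public static void Main()
    {
        string[] fruits = { "apples", "oranges", "pears" };

        // The separator appears only between elements, never at the ends.
        Console.WriteLine(ToCommaList(fruits));
        // apples, oranges, pears
    }
}
```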
|
||||
|
||||
### Insert
|
||||
|
||||
The `String.Insert` method creates a new string by inserting a string into a specified position in another string. This method uses a zero-based index. The following example inserts a string at index 4 (the fifth character position) of `sentence` and creates a new string with this value.
|
||||
|
||||
```csharp
|
||||
string sentence = "Once a time.";
|
||||
Console.WriteLine(sentence.Insert(4, " upon"));
|
||||
// The example displays the following output:
|
||||
// Once upon a time.
|
||||
```
|
||||
|
||||
### CopyTo
|
||||
|
||||
The `String.CopyTo` method copies portions of a string into an array of characters. You can specify both the beginning index of the string and the number of characters to be copied. This method takes the source index, an array of characters, the destination index, and the number of characters to copy. All indexes are zero-based.
|
||||
|
||||
The following example uses the `CopyTo` method to copy the characters of the word "Hello" from a string object to the first index position of an array of characters.
|
||||
|
||||
```csharp
|
||||
string greeting = "Hello World!";
|
||||
char[] charArray = {'W','h','e','r','e'};
|
||||
Console.WriteLine("The original character array: {0}", new string(charArray));
|
||||
greeting.CopyTo(0, charArray, 0, 5);
|
||||
Console.WriteLine("The new character array: {0}", new string(charArray));
|
||||
// The example displays the following output:
|
||||
// The original character array: Where
|
||||
// The new character array: Hello
|
||||
```
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
|
@ -1,35 +0,0 @@
|
|||
---
|
||||
title: Basic String Operations
|
||||
description: Basic String Operations
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: c8d8f231-a6e2-422f-b218-d6fcfff91b8a
|
||||
---
|
||||
|
||||
# Basic String Operations
|
||||
|
||||
Applications often respond to users by constructing messages based on user input. For example, it is not uncommon for Web sites to respond to a newly logged-on user with a specialized greeting that includes the user's name. Several methods in the [System.String](https://docs.microsoft.com/dotnet/core/api/System.String) and [System.Text.StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder) classes allow you to dynamically construct custom strings to display in your user interface. These methods also help you perform a number of basic string operations like creating new strings from arrays of bytes, comparing the values of strings, and modifying existing strings.
|
||||
|
||||
## In This Section
|
||||
|
||||
[Creating New Strings](creatingnew.md) - Describes basic ways to convert objects into strings and to combine strings.
|
||||
|
||||
[Trimming and Removing Characters](trimming.md) - Describes how to trim or remove characters in a string.
|
||||
|
||||
[Padding Strings](padding.md) - Describes how to insert characters or empty spaces into a string.
|
||||
|
||||
[Comparing Strings](comparing.md) - Describes how to compare the contents of two or more strings.
|
||||
|
||||
[Changing Case](../changingcase.md) - Describes how to change the case of characters within a string.
|
||||
|
||||
[Using the StringBuilder Class](stringbuilder.md) - Describes how to create and modify dynamic string objects with the [StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder) class.
|
||||
|
||||
[How to: Perform Basic String Manipulations](basicmanipulations.md) - Demonstrates the use of basic string operations.
|
||||
|
||||
|
|
@ -1,47 +0,0 @@
|
|||
---
|
||||
title: Padding Strings
|
||||
description: Padding Strings
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: f143f80f-ca6f-4d38-a643-22ea1bbfa597
|
||||
---
|
||||
|
||||
# Padding Strings
|
||||
|
||||
Use one of the following [System.String](https://docs.microsoft.com/dotnet/core/api/System.String) methods to create a new string that consists of an original string padded with leading or trailing characters to a specified total length. The padding character can be a space or a character that you specify; the original text consequently appears right-aligned or left-aligned in the result.
|
||||
|
||||
Method name | Use
|
||||
----------- | ---
|
||||
[String.PadLeft](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_PadLeft_System_Int32_) | Pads a string with leading characters to a specified total length.
|
||||
[String.PadRight](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_PadRight_System_Int32_) | Pads a string with trailing characters to a specified total length.
|
||||
|
||||
## PadLeft
|
||||
|
||||
The [String.PadLeft](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_PadLeft_System_Int32_) method creates a new string by concatenating enough leading pad characters to an original string to achieve a specified total length. The [String.PadLeft(Int32)](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_PadLeft_System_Int32_) method uses white space as the padding character and the [String.PadLeft(Int32, Char)](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_PadLeft_System_Int32_System_Char_) method enables you to specify your own padding character.
|
||||
|
||||
The following code example uses the [PadLeft(Int32, Char)](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_PadLeft_System_Int32_System_Char_) method to create a new string that is twenty characters long. The example displays "`--------Hello World!`" to the console.
|
||||
|
||||
```csharp
|
||||
string MyString = "Hello World!";
|
||||
Console.WriteLine(MyString.PadLeft(20, '-'));
|
||||
```
|
||||
|
||||
## PadRight
|
||||
|
||||
The [String.PadRight](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_PadRight_System_Int32_) method creates a new string by concatenating enough trailing pad characters to an original string to achieve a specified total length. The [String.PadRight(Int32)](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_PadRight_System_Int32_) method uses white space as the padding character and the [String.PadRight(Int32, Char)](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_PadRight_System_Int32_System_Char_) method enables you to specify your own padding character.
|
||||
|
||||
The following code example uses the [PadRight(Int32, Char)](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_PadRight_System_Int32_System_Char_) method to create a new string that is twenty characters long. The example displays "`Hello World!--------`" to the console.
|
||||
|
||||
```csharp
|
||||
string MyString = "Hello World!";
|
||||
Console.WriteLine(MyString.PadRight(20, '-'));
|
||||
```
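
A typical use of the two padding methods together is lining up columns in console output. The following is a minimal sketch; `FormatRow` and the column widths of 10 and 5 are illustrative assumptions.

```csharp
using System;

public static class PaddingExample
{
    // Hypothetical helper: left-aligns the name in 10 columns and right-aligns the number in 5.
    public static string FormatRow(string name, int quantity)
    {
        return name.PadRight(10) + quantity.ToString().PadLeft(5);
    }

    public static void Main()
    {
        Console.WriteLine(FormatRow("Apples", 3));
        Console.WriteLine(FormatRow("Cherries", 120));

        // A string that is already longer than the requested width is returned unchanged.
        Console.WriteLine("Hello World!".PadLeft(5));  // Hello World!
    }
}
```

Each row produced by `FormatRow` is 15 characters wide as long as the name fits in 10 characters and the number fits in 5.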
|
||||
|
||||
|
||||
|
|
@ -1,163 +0,0 @@
|
|||
---
|
||||
title: Using the StringBuilder Class
|
||||
description: Using the StringBuilder Class
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: a66da95e-e26e-4f41-b2a0-dfcb943cb4a9
|
||||
---
|
||||
|
||||
# Using the StringBuilder Class
|
||||
|
||||
The [String](https://docs.microsoft.com/dotnet/core/api/System.String) object is immutable. Every time you use one of the methods in the [System.String](https://docs.microsoft.com/dotnet/core/api/System.String) class, you create a new string object in memory, which requires a new allocation of space for that new object. In situations where you need to perform repeated modifications to a string, the overhead associated with creating a new [String](https://docs.microsoft.com/dotnet/core/api/System.String) object can be costly. The [System.Text.StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder) class can be used when you want to modify a string without creating a new object. For example, using the [StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder) class can boost performance when concatenating many strings together in a loop.
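
The loop scenario mentioned above can be sketched as follows. `JoinNumbers` is a hypothetical helper; the point is that a single `StringBuilder` is reused for every append, and a `String` is created only once at the end.

```csharp
using System;
using System.Text;

public static class BuilderLoopExample
{
    // Hypothetical helper: builds "0,1,2,..." for the given count.
    public static string JoinNumbers(int count)
    {
        StringBuilder builder = new StringBuilder();

        for (int i = 0; i < count; i++)
        {
            // Add a separator before every element except the first.
            if (i > 0)
            {
                builder.Append(',');
            }
            builder.Append(i);
        }

        // Convert to an immutable String once, after all modifications are done.
        return builder.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(JoinNumbers(5));  // 0,1,2,3,4
    }
}
```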
|
||||
|
||||
## Importing the System.Text Namespace
|
||||
|
||||
The [StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder) class is found in the [System.Text](https://docs.microsoft.com/dotnet/core/api/System.Text) namespace. To avoid having to provide a fully qualified type name in your code, you can import the [System.Text](https://docs.microsoft.com/dotnet/core/api/System.Text) namespace:
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text;
|
||||
```
|
||||
|
||||
## Instantiating a StringBuilder Object
|
||||
|
||||
You can create a new instance of the [StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder) class by initializing your variable with one of the overloaded constructor methods, as illustrated in the following example.
|
||||
|
||||
```csharp
|
||||
StringBuilder MyStringBuilder = new StringBuilder("Hello World!");
|
||||
```
|
||||
|
||||
## Setting the Capacity and Length
|
||||
|
||||
Although the [StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder) is a dynamic object that allows you to expand the number of characters in the string that it encapsulates, you can specify a value for the maximum number of characters that it can hold. This value is called the capacity of the object and should not be confused with the length of the string that the current [StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder) holds. For example, you might create a new instance of the [StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder) class with the string "Hello", which has a length of 5, and you might specify that the object has a maximum capacity of 25. When you modify the [StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder), it does not reallocate size for itself until the capacity is reached. When this occurs, the new space is allocated automatically and the capacity is doubled. You can specify the capacity of the [StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder) class using one of the overloaded constructors. The following example specifies that the `MyStringBuilder` object can be expanded to a maximum of 25 spaces.
|
||||
|
||||
```csharp
|
||||
StringBuilder MyStringBuilder = new StringBuilder("Hello World!", 25);
|
||||
```
|
||||
|
||||
Additionally, you can use the read/write [Capacity](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder#System_Text_StringBuilder_Capacity) property to set the maximum length of your object. The following example uses the [Capacity](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder#System_Text_StringBuilder_Capacity) property to define the maximum object length.
|
||||
|
||||
```csharp
|
||||
MyStringBuilder.Capacity = 25;
|
||||
```
|
||||
|
||||
The [EnsureCapacity](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder#System_Text_StringBuilder_EnsureCapacity_System_Int32_) method can be used to check the capacity of the current [StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder). If the capacity is greater than the passed value, no change is made; however, if the capacity is smaller than the passed value, the current capacity is changed to match the passed value.
|
||||
|
||||
The [Length](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder#System_Text_StringBuilder_Length) property can also be viewed or set. If you set the [Length](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder#System_Text_StringBuilder_Length) property to a value that is greater than the [Capacity](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder#System_Text_StringBuilder_Capacity) property, the [Capacity](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder#System_Text_StringBuilder_Capacity) property is automatically changed to the same value as the [Length](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder#System_Text_StringBuilder_Length) property. Setting the [Length](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder#System_Text_StringBuilder_Length) property to a value that is less than the length of the string within the current [StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder) shortens the string.
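
The following sketch shows the `Capacity`, `EnsureCapacity`, and `Length` members working together. `KeepFirst` is a hypothetical helper, and the capacity checks use `>=` because the exact capacity chosen when the object grows is an implementation detail.

```csharp
using System;
using System.Text;

public static class CapacityExample
{
    // Hypothetical helper: shortens the text by setting the Length property.
    public static string KeepFirst(string text, int count)
    {
        StringBuilder builder = new StringBuilder(text);
        builder.Length = count;
        return builder.ToString();
    }

    public static void Main()
    {
        StringBuilder builder = new StringBuilder("Hello World!", 25);

        // Length counts the characters currently stored, not the reserved space.
        Console.WriteLine(builder.Length);          // 12
        Console.WriteLine(builder.Capacity >= 25);  // True

        // EnsureCapacity grows the reserved space only if it is too small.
        builder.EnsureCapacity(50);
        Console.WriteLine(builder.Capacity >= 50);  // True

        // Setting Length below the current length truncates the string.
        Console.WriteLine(KeepFirst("Hello World!", 5));  // Hello
    }
}
```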
|
||||
|
||||
## Modifying the StringBuilder String
|
||||
|
||||
The following table lists the methods you can use to modify the contents of a [StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder).
|
||||
|
||||
Method name | Use
|
||||
----------- | ---
|
||||
[StringBuilder.Append](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder#System_Text_StringBuilder_Append_System_Char_) | Appends information to the end of the current [StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder).
|
||||
[StringBuilder.AppendFormat](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder#System_Text_StringBuilder_AppendFormat_System_IFormatProvider_System_String_System_Object_) | Replaces a format specifier passed in a string with formatted text.
|
||||
[StringBuilder.Insert](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder#System_Text_StringBuilder_Insert_System_Int32_System_Char_) | Inserts a string or object into the specified index of the current [StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder).
|
||||
[StringBuilder.Remove](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder#System_Text_StringBuilder_Remove_System_Int32_System_Int32_) | Removes a specified number of characters from the current [StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder).
|
||||
[StringBuilder.Replace](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder#System_Text_StringBuilder_Replace_System_Char_System_Char_) | Replaces all occurrences of a specified character or string with another specified character or string.
|
||||
|
||||
### Append
|
||||
|
||||
The [StringBuilder.Append](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder#System_Text_StringBuilder_Append_System_Char_) method can be used to add text or a string representation of an object to the end of a string represented by the current [StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder). The following example initializes a [StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder) to "Hello World" and then appends some text to the end of the object. Space is allocated automatically as needed.
|
||||
|
||||
```csharp
|
||||
StringBuilder MyStringBuilder = new StringBuilder("Hello World!");
|
||||
MyStringBuilder.Append(" What a beautiful day.");
|
||||
Console.WriteLine(MyStringBuilder);
|
||||
// The example displays the following output:
|
||||
// Hello World! What a beautiful day.
|
||||
```
|
||||
|
||||
### AppendFormat
|
||||
|
||||
The [StringBuilder.AppendFormat](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder#System_Text_StringBuilder_AppendFormat_System_IFormatProvider_System_String_System_Object_) method adds text to the end of the [StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder) object. It supports the composite formatting feature by calling the [IFormattable](https://docs.microsoft.com/dotnet/core/api/System.IFormattable) implementation of the object or objects to be formatted. Therefore, it accepts the standard format strings for numeric, date and time, and enumeration values, the custom format strings for numeric and date and time values, and the format strings defined for custom types. You can use this method to customize the format of variables and append those values to a [StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder). The following example uses the AppendFormat method to place an integer value formatted as a currency value at the end of a [StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder) object.
|
||||
|
||||
```csharp
|
||||
int MyInt = 25;
|
||||
StringBuilder MyStringBuilder = new StringBuilder("Your total is ");
|
||||
MyStringBuilder.AppendFormat("{0:C} ", MyInt);
|
||||
Console.WriteLine(MyStringBuilder);
|
||||
// The example displays the following output:
|
||||
// Your total is $25.00
|
||||
```
|
||||
|
||||
### Insert
|
||||
|
||||
The [StringBuilder.Insert](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder#System_Text_StringBuilder_Insert_System_Int32_System_Char_) method adds a string or object to a specified position in the current [StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder) object. The following example uses this method to insert a word at index 6 (the seventh character position) of a [StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder) object.
|
||||
|
||||
```csharp
|
||||
StringBuilder MyStringBuilder = new StringBuilder("Hello World!");
|
||||
MyStringBuilder.Insert(6,"Beautiful ");
|
||||
Console.WriteLine(MyStringBuilder);
|
||||
// The example displays the following output:
|
||||
// Hello Beautiful World!
|
||||
```
|
||||
|
||||
### Remove
|
||||
|
||||
You can use the [StringBuilder.Remove](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder#System_Text_StringBuilder_Remove_System_Int32_System_Int32_) method to remove a specified number of characters from the current [StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder) object, beginning at a specified zero-based index. The following example uses the [Remove](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder#System_Text_StringBuilder_Remove_System_Int32_System_Int32_) method to shorten a [StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder) object.
|
||||
|
||||
```csharp
|
||||
StringBuilder MyStringBuilder = new StringBuilder("Hello World!");
|
||||
MyStringBuilder.Remove(5,7);
|
||||
Console.WriteLine(MyStringBuilder);
|
||||
// The example displays the following output:
|
||||
// Hello
|
||||
```
|
||||
|
||||
### Replace
|
||||
|
||||
The [StringBuilder.Replace](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder#System_Text_StringBuilder_Replace_System_Char_System_Char_) method can be used to replace characters within the [StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder) object with another specified character. The following example uses the [Replace](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder#System_Text_StringBuilder_Replace_System_Char_System_Char_) method to search a [StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder) object for all instances of the exclamation point character (!) and replace them with the question mark character (?).
|
||||
|
||||
```csharp
|
||||
StringBuilder MyStringBuilder = new StringBuilder("Hello World!");
|
||||
MyStringBuilder.Replace('!', '?');
|
||||
Console.WriteLine(MyStringBuilder);
|
||||
// The example displays the following output:
|
||||
// Hello World?
|
||||
```
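
The `Insert`, `Remove`, and `Replace` methods each modify the current instance and return a reference to it, so the calls can be chained. The following is a minimal sketch of that pattern; the `ChainingSketch` class and its `Rewrite` helper are hypothetical names used only for illustration.

```csharp
using System;
using System.Text;

public static class ChainingSketch
{
    // Hypothetical helper: applies Insert, Replace, and Remove in one chained expression.
    public static string Rewrite(string text)
    {
        StringBuilder sb = new StringBuilder(text);
        // Each call mutates sb and returns the same instance.
        sb.Insert(6, "Beautiful ")   // "Hello Beautiful World!"
          .Replace('!', '?')         // "Hello Beautiful World?"
          .Remove(0, 6);             // "Beautiful World?"
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Rewrite("Hello World!"));
        // The example displays the following output:
        //       Beautiful World?
    }
}
```

Because every step operates on the same buffer, no intermediate strings are created until `ToString` is called.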
|
||||
|
||||
## Converting a StringBuilder Object to a String
|
||||
|
||||
You must convert the [StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder) object to a String object before you can pass the string represented by the [StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder) object to a method that has a [String](https://docs.microsoft.com/dotnet/core/api/System.String) parameter or display it in the user interface. You do this conversion by calling the [StringBuilder.ToString](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder#System_Text_StringBuilder_ToString) method. The following example calls a number of [StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder) methods and then calls the [StringBuilder.ToString](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder#System_Text_StringBuilder_ToString) method to display the string.
|
||||
|
||||
```csharp
using System;
using System.Text;

public class Example
{
    public static void Main()
    {
        StringBuilder sb = new StringBuilder();
        bool flag = true;
        string[] spellings = { "recieve", "receeve", "receive" };
        sb.AppendFormat("Which of the following spellings is {0}:", flag);
        sb.AppendLine();
        for (int ctr = 0; ctr <= spellings.GetUpperBound(0); ctr++) {
            sb.AppendFormat(" {0}. {1}", ctr, spellings[ctr]);
            sb.AppendLine();
        }
        sb.AppendLine();
        Console.WriteLine(sb.ToString());
    }
}
// The example displays the following output:
//       Which of the following spellings is True:
//        0. recieve
//        1. receeve
//        2. receive
```
|
||||
|
||||
## See Also
|
||||
|
||||
[System.Text.StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder)
|
|
@ -1,138 +0,0 @@
|
|||
---
|
||||
title: Trimming and Removing Characters from Strings
|
||||
description: Trimming and Removing Characters from Strings
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 8eb77d66-4f6d-44ad-96b3-e6e8d5426cfb
|
||||
---
|
||||
|
||||
# Trimming and Removing Characters from Strings
|
||||
|
||||
If you are parsing a sentence into individual words, you might end up with words that have blank spaces (also called white spaces) on either end of the word. In this situation, you can use one of the trim methods in the [System.String](https://docs.microsoft.com/dotnet/core/api/System.String) class to remove any number of spaces or other characters from a specified position in the string. The following table describes the available trim methods.
|
||||
|
||||
Method name | Use
|
||||
----------- | ---
|
||||
[String.Trim](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Trim) | Removes white spaces or characters specified in an array of characters from the beginning and end of a string.
|
||||
[String.TrimEnd](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_TrimEnd_System_Char___) | Removes characters specified in an array of characters from the end of a string.
|
||||
[String.TrimStart](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_TrimStart_System_Char___) | Removes characters specified in an array of characters from the beginning of a string.
|
||||
[String.Remove](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Remove_System_Int32_) | Removes a specified number of characters from a specified index position in a string.
|
||||
|
||||
|
||||
## Trim
|
||||
|
||||
You can easily remove white spaces from both ends of a string by using the [String.Trim](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Trim) method, as shown in the following example.
|
||||
|
||||
```csharp
|
||||
string MyString = " Big ";
|
||||
Console.WriteLine("Hello{0}World!", MyString);
|
||||
string TrimString = MyString.Trim();
|
||||
Console.WriteLine("Hello{0}World!", TrimString);
|
||||
// The example displays the following output:
|
||||
// Hello Big World!
|
||||
// HelloBigWorld!
|
||||
```
|
||||
|
||||
You can also remove characters that you specify in a character array from the beginning and end of a string. The following example removes white-space characters, periods, and asterisks.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
String header = "* A Short String. *";
|
||||
Console.WriteLine(header);
|
||||
Console.WriteLine(header.Trim( new Char[] { ' ', '*', '.' } ));
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// * A Short String. *
|
||||
// A Short String
|
||||
```
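
The scenario from the introduction, splitting a sentence into words and cleaning up each word, can be sketched as follows. The `WordTrimSketch` class, the `ToWords` helper, and the choice of punctuation characters are assumptions made for illustration; they are not part of the original samples.

```csharp
using System;

public static class WordTrimSketch
{
    // Hypothetical helper: splits on spaces, then trims punctuation from both ends of each word.
    public static string[] ToWords(string sentence)
    {
        char[] punctuation = { '.', ',', '!', '?', ';', ':' };
        string[] words = sentence.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        for (int i = 0; i < words.Length; i++)
            words[i] = words[i].Trim(punctuation);
        return words;
    }

    public static void Main()
    {
        foreach (string word in ToWords("Hello, big   World!"))
            Console.WriteLine("[{0}]", word);
        // The example displays the following output:
        //       [Hello]
        //       [big]
        //       [World]
    }
}
```

`Trim` only removes characters at the ends of a word, so punctuation inside a word (for example, an apostrophe) is left in place.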
|
||||
|
||||
## TrimEnd
|
||||
|
||||
The [String.TrimEnd](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_TrimEnd_System_Char___) method removes characters from the end of a string, creating a new string object. An array of characters is passed to this method to specify the characters to be removed. The order of the elements in the character array does not affect the trim operation. The trim stops when a character not specified in the array is found.
|
||||
|
||||
The following example removes the last letters of a string using the TrimEnd method. In this example, the positions of the `'r'` character and the `'W'` character are reversed to illustrate that the order of characters in the array does not matter. Notice that this code removes the last word of `MyString` plus part of the first.
|
||||
|
||||
```csharp
|
||||
string MyString = "Hello World!";
|
||||
char[] MyChar = {'r','o','W','l','d','!',' '};
|
||||
string NewString = MyString.TrimEnd(MyChar);
|
||||
Console.WriteLine(NewString);
|
||||
```
|
||||
|
||||
This code displays `He` to the console.
|
||||
|
||||
The following example removes the last word of a string using the [TrimEnd](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_TrimEnd_System_Char___) method. In this code, a comma follows the word `Hello` and, because the comma is not specified in the array of characters to trim, the trim ends at the comma.
|
||||
|
||||
```csharp
|
||||
string MyString = "Hello, World!";
|
||||
char[] MyChar = {'r','o','W','l','d','!',' '};
|
||||
string NewString = MyString.TrimEnd(MyChar);
|
||||
Console.WriteLine(NewString);
|
||||
```
|
||||
|
||||
This code displays `Hello,` to the console.
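
A common practical use of `TrimEnd` is stripping line-ending characters from text read from a file or stream. The sketch below assumes a hypothetical `StripLineEnding` helper and passes the characters directly, relying on the fact that the `trimChars` parameter is a `params` array.

```csharp
using System;

public static class TrimEndSketch
{
    // Hypothetical helper: removes trailing carriage returns and line feeds only.
    public static string StripLineEnding(string line)
    {
        return line.TrimEnd('\r', '\n');
    }

    public static void Main()
    {
        string raw = "value \r\n";
        string cleaned = StripLineEnding(raw);
        Console.WriteLine("[{0}]", cleaned);
        // The example displays the following output:
        //       [value ]
    }
}
```

The trailing space survives because it is not in the set of characters to trim; the trim stops at the first character that is not listed.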
|
||||
|
||||
## TrimStart
|
||||
|
||||
The [String.TrimStart](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_TrimStart_System_Char___) method is similar to the [String.TrimEnd](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_TrimEnd_System_Char___) method except that it creates a new string by removing characters from the beginning of an existing string object. An array of characters is passed to the [TrimStart](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_TrimStart_System_Char___) method to specify the characters to be removed. As with the [TrimEnd](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_TrimEnd_System_Char___) method, the order of the elements in the character array does not affect the trim operation. The trim stops when a character not specified in the array is found.
|
||||
|
||||
The following example removes the first word of a string. In this example, the positions of the `'e'` character and the `'H'` character are reversed to illustrate that the order of characters in the array does not matter.
|
||||
|
||||
```csharp
|
||||
string MyString = "Hello World!";
|
||||
char[] MyChar = {'e', 'H','l','o',' ' };
|
||||
string NewString = MyString.TrimStart(MyChar);
|
||||
Console.WriteLine(NewString);
|
||||
```
|
||||
|
||||
This code displays `World!` to the console.
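
As a further illustration, `TrimStart` is often used to drop leading padding characters, such as zeros in a numeric identifier. This is a minimal sketch; the `StripLeadingZeros` helper name is hypothetical.

```csharp
using System;

public static class TrimStartSketch
{
    // Hypothetical helper: removes every leading '0' character.
    public static string StripLeadingZeros(string digits)
    {
        return digits.TrimStart('0');
    }

    public static void Main()
    {
        Console.WriteLine(StripLeadingZeros("000123"));
        // The example displays the following output:
        //       123
    }
}
```

Note that a string consisting only of zeros becomes an empty string, so callers that need at least one digit would have to handle that case separately.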
|
||||
|
||||
## Remove
|
||||
|
||||
The [String.Remove](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Remove_System_Int32_) method removes a specified number of characters that begin at a specified position in an existing string. This method assumes a zero-based index.
|
||||
|
||||
The following example removes ten characters from a string beginning at position five of a zero-based index of the string.
|
||||
|
||||
```csharp
|
||||
string MyString = "Hello Beautiful World!";
|
||||
Console.WriteLine(MyString.Remove(5,10));
|
||||
// The example displays the following output:
|
||||
// Hello World!
|
||||
```
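
The single-argument overload, `Remove(Int32)`, deletes everything from the given index to the end of the string. The following sketch combines it with `IndexOf(Char)` to keep only the first word; the `FirstWord` helper is a hypothetical name introduced for this illustration.

```csharp
using System;

public static class RemoveSketch
{
    // Hypothetical helper: keeps the text before the first space, if there is one.
    public static string FirstWord(string text)
    {
        int index = text.IndexOf(' ');
        return index < 0 ? text : text.Remove(index);
    }

    public static void Main()
    {
        Console.WriteLine(FirstWord("Hello Beautiful World!"));
        // The example displays the following output:
        //       Hello
    }
}
```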
|
||||
|
||||
You can also remove a specified character or substring from a string by calling the [String.Replace(String, String)](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Replace_System_String_System_String_) method and specifying an empty string ([String.Empty](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Empty)) as the replacement. The following example removes all commas from a string.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
String phrase = "a cold, dark night";
|
||||
Console.WriteLine("Before: {0}", phrase);
|
||||
phrase = phrase.Replace(",", "");
|
||||
Console.WriteLine("After: {0}", phrase);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Before: a cold, dark night
|
||||
// After: a cold dark night
|
||||
```
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
|
@ -1,662 +0,0 @@
|
|||
---
|
||||
title: Best Practices for Using Strings
|
||||
description: Best Practices for Using Strings
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 558265e1-6fab-4e81-aa7b-d589aa94cb13
|
||||
---
|
||||
|
||||
# Best Practices for Using Strings
|
||||
|
||||
.NET Core provides extensive support for developing localized and globalized applications, and makes it easy to apply the conventions of either the current culture or a specific culture when performing common operations such as sorting and displaying strings. But sorting or comparing strings is not always a culture-sensitive operation. For example, strings that are used internally by an application typically should be handled identically across all cultures. When culturally independent string data, such as XML tags, HTML tags, user names, file paths, and the names of system objects, are interpreted as if they were culture-sensitive, application code can be subject to subtle bugs, poor performance, and, in some cases, security issues.
|
||||
|
||||
This article examines the string sorting, comparison, and casing methods in .NET Core, presents recommendations for selecting an appropriate string-handling method, and provides additional information about string-handling methods. It also examines how formatted data, such as numeric data and date and time data, is handled for display and for storage.
|
||||
|
||||
This article contains the following sections:
|
||||
|
||||
* [Recommendations for String Usage](#Recommendations-for-String-Usage)
|
||||
|
||||
* [Specifying String Comparisons Explicitly](#Specifying-String-Comparisons-Explicitly)
|
||||
|
||||
* [The Details of String Comparison](#The-Details-of-String-Comparison)
|
||||
|
||||
* [Choosing a StringComparison Member for Your Method Call](#Choosing-a-StringComparison-Member-for-Your-Method-Call)
|
||||
|
||||
* [Common String Comparison Methods](#Common-String-Comparison-Methods)
|
||||
|
||||
* [Methods that Perform String Comparison Indirectly](#Methods-that-Perform-String-Comparison-Indirectly)
|
||||
|
||||
* [Displaying and Persisting Formatted Data](#Displaying-and-Persisting-Formatted-Data)
|
||||
|
||||
## Recommendations for String Usage
|
||||
|
||||
When you develop with .NET Core, follow these simple recommendations when you use strings:
|
||||
|
||||
* Use overloads that explicitly specify the string comparison rules for string operations. Typically, this involves calling a method overload that has a parameter of type [StringComparison]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison).
|
||||
|
||||
* Use [StringComparison.Ordinal]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_Ordinal) or [StringComparison.OrdinalIgnoreCase]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_OrdinalIgnoreCase) for comparisons as your safe default for culture-agnostic string matching.
|
||||
|
||||
* Use comparisons with [StringComparison.Ordinal]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_Ordinal) or [StringComparison.OrdinalIgnoreCase]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_OrdinalIgnoreCase) for better performance.
|
||||
|
||||
* Use string operations that are based on [StringComparison.CurrentCulture]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_CurrentCulture) when you display output to the user.
|
||||
|
||||
* Use the non-linguistic [StringComparison.Ordinal]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_Ordinal) or [StringComparison.OrdinalIgnoreCase]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_OrdinalIgnoreCase) values instead of string operations based on [CultureInfo.InvariantCulture]( https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo#System_Globalization_CultureInfo_InvariantCulture) when the comparison is linguistically irrelevant (symbolic, for example).
|
||||
|
||||
* Use the [String.ToUpperInvariant]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_ToUpperInvariant) method instead of the [String.ToLowerInvariant]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_ToLowerInvariant) method when you normalize strings for comparison.
|
||||
|
||||
* Use an overload of the [String.Equals]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Equals_System_String_) method to test whether two strings are equal.
|
||||
|
||||
* Use the [String.Compare]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Compare_System_String_System_String_) and [String.CompareTo]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_CompareTo_System_String_) methods to sort strings, not to check for equality.
|
||||
|
||||
* Use culture-sensitive formatting to display non-string data, such as numbers and dates, in a user interface. Use formatting with the invariant culture to persist non-string data in string form.
|
||||
|
||||
Avoid the following practices when you use strings:
|
||||
|
||||
* Do not use overloads that do not explicitly or implicitly specify the string comparison rules for string operations.
|
||||
|
||||
* Do not use an overload of the [String.Compare]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Compare_System_String_System_String_) or [String.CompareTo]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_CompareTo_System_String_) method and test for a return value of zero to determine whether two strings are equal.
|
||||
|
||||
* Do not use culture-sensitive formatting to persist numeric data or date and time data in string form.
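
The recommendations about equality and sorting can be sketched as follows: `String.Equals` with an explicit `StringComparison` value tests equality, and `String.Compare` with an explicit value drives a sort. The class name, the `SameIdentifier` helper, and the sample data are assumptions for illustration only.

```csharp
using System;

public static class EqualitySketch
{
    // Hypothetical helper: tests two non-linguistic identifiers for equality.
    public static bool SameIdentifier(string left, string right)
    {
        return String.Equals(left, right, StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        Console.WriteLine(SameIdentifier("Content-Type", "content-type"));   // True
        Console.WriteLine(SameIdentifier("Content-Type", "Content-Length")); // False

        // String.Compare is used for ordering, not for the equality test above.
        string[] names = { "delta", "Alpha", "charlie", "Bravo" };
        Array.Sort(names, (a, b) => String.Compare(a, b, StringComparison.OrdinalIgnoreCase));
        Console.WriteLine(String.Join(", ", names));
        // Alpha, Bravo, charlie, delta
    }
}
```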
|
||||
|
||||
## Specifying String Comparisons Explicitly
|
||||
|
||||
Most of the string manipulation methods in .NET Core are overloaded. Typically, one or more overloads accept default settings, whereas others accept no defaults and instead define the precise way in which strings are to be compared or manipulated. Most of the methods that do not rely on defaults include a parameter of type [StringComparison]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison), which is an enumeration that explicitly specifies rules for string comparison by culture and case. The following table describes the [StringComparison]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison) enumeration members.
|
||||
|
||||
StringComparison member | Description
|
||||
----------------------- | -----------
|
||||
[StringComparison.CurrentCulture]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_CurrentCulture) | Performs a case-sensitive comparison using the current culture.
|
||||
[StringComparison.CurrentCultureIgnoreCase]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_CurrentCultureIgnoreCase) | Performs a case-insensitive comparison using the current culture.
|
||||
[StringComparison.Ordinal]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_Ordinal) | Performs an ordinal comparison.
|
||||
[StringComparison.OrdinalIgnoreCase]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_OrdinalIgnoreCase) | Performs a case-insensitive ordinal comparison.
|
||||
|
||||
For example, the [String.IndexOf]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_IndexOf_System_String_) method, which returns the index of a substring in a [String]( https://docs.microsoft.com/dotnet/core/api/System.String) object that matches either a character or a string, has nine overloads:
|
||||
|
||||
* [IndexOf(Char)]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_IndexOf_System_Char_), [IndexOf(Char, Int32)]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_IndexOf_System_Char_System_Int32_), and [IndexOf(Char, Int32, Int32)]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_IndexOf_System_Char_System_Int32_System_Int32_), which by default perform an ordinal (case-sensitive and culture-insensitive) search for a character in the string.
|
||||
|
||||
* [IndexOf(String)]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_IndexOf_System_String_), [IndexOf(String, Int32)]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_IndexOf_System_String_System_Int32_), and [IndexOf(String, Int32, Int32)]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_IndexOf_System_String_System_Int32_System_Int32_), which by default perform a case-sensitive and culture-sensitive search for a substring in the string.
|
||||
|
||||
* [IndexOf(String, StringComparison)]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_IndexOf_System_String_System_StringComparison_), [IndexOf(String, Int32, StringComparison)]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_IndexOf_System_String_System_Int32_System_StringComparison_), and [IndexOf(String, Int32, Int32, StringComparison)]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_IndexOf_System_String_System_Int32_System_Int32_System_StringComparison_), which include a parameter of type StringComparison that allows the form of the comparison to be specified.
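
The difference between the overload groups listed above can be seen in a short sketch. The `IndexOfSketch` class and its `Find` helper are hypothetical; only the `IndexOf(String, StringComparison)` and `IndexOf(Char)` calls come from the overload list.

```csharp
using System;

public static class IndexOfSketch
{
    // Hypothetical helper: searches with an explicitly stated comparison rule.
    public static int Find(string text, string value, StringComparison kind)
    {
        return text.IndexOf(value, kind);
    }

    public static void Main()
    {
        string text = "Hello World";
        Console.WriteLine(Find(text, "world", StringComparison.Ordinal));           // -1
        Console.WriteLine(Find(text, "world", StringComparison.OrdinalIgnoreCase)); // 6
        Console.WriteLine(text.IndexOf('W'));                                       // 6 (character search is ordinal)
    }
}
```

Passing the `StringComparison` value at the call site documents the intended rule, so a reader does not need to remember which default each overload uses.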
|
||||
|
||||
We recommend that you select an overload that does not use default values, for the following reasons:
|
||||
|
||||
* Some overloads with default parameters (those that search for a [Char]( https://docs.microsoft.com/dotnet/core/api/System.Char) in the string instance) perform an ordinal comparison, whereas others (those that search for a string in the string instance) are culture-sensitive. It is difficult to remember which method uses which default value, and easy to confuse the overloads.
|
||||
|
||||
* The intent of code that relies on default values for method calls is not clear. Without an explicit [StringComparison]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison) argument, it is difficult to know whether the developer intended an ordinal or a linguistic comparison of two strings, or whether a case difference between `protocol` and "http" might cause the test for equality to return `false`. The following example removes that ambiguity by passing `StringComparison.OrdinalIgnoreCase` explicitly.
|
||||
|
||||
```csharp
|
||||
string protocol = GetProtocol(url);
|
||||
if (String.Equals(protocol, "http", StringComparison.OrdinalIgnoreCase)) {
|
||||
// ...Code to handle HTTP protocol.
|
||||
}
|
||||
else {
|
||||
throw new InvalidOperationException();
|
||||
}
|
||||
```
|
||||
|
||||
## The Details of String Comparison
|
||||
|
||||
String comparison is the heart of many string-related operations, particularly sorting and testing for equality. Strings sort in a determined order: If "my" appears before "string" in a sorted list of strings, "my" must compare less than or equal to "string". Additionally, comparison implicitly defines equality. The comparison operation returns zero for strings it deems equal. A good interpretation is that neither string is less than the other. Most meaningful operations involving strings include one or both of these procedures: comparing with another string, and executing a well-defined sort operation.
|
||||
|
||||
However, evaluating two strings for equality or sort order does not yield a single, correct result; the outcome depends on the criteria used to compare the strings. In particular, string comparisons that are ordinal or that are based on the casing and sorting conventions of the current culture or the invariant culture (a locale-agnostic culture based on the English language) may produce different results.
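
As a small sketch of how the criteria can disagree, the following code compares the same pair of strings under different `StringComparison` values. The `Sign` helper is a hypothetical name. Only the ordinal result is stated as expected output, because the linguistic results depend on the culture data available at run time.

```csharp
using System;

public static class ComparisonKindsSketch
{
    // Hypothetical helper: reduces the result of String.Compare to -1, 0, or 1.
    public static int Sign(string left, string right, StringComparison kind)
    {
        return Math.Sign(String.Compare(left, right, kind));
    }

    public static void Main()
    {
        // Ordinal compares numeric character values: 'a' (U+0061) follows 'B' (U+0042).
        Console.WriteLine(Sign("a", "B", StringComparison.Ordinal));   // 1

        // Linguistic rules normally place "a" before "B", which would give -1,
        // but the value is environment-dependent, so no output is claimed here.
        Console.WriteLine(Sign("a", "B", StringComparison.InvariantCulture));
        Console.WriteLine(Sign("a", "B", StringComparison.CurrentCulture));
    }
}
```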
|
||||
|
||||
### String Comparisons that Use the Current Culture
|
||||
|
||||
One criterion involves using the conventions of the current culture when comparing strings. Comparisons that are based on the current culture use the thread's current culture or locale. You should always use comparisons that are based on the current culture when data is linguistically relevant, and when it reflects culture-sensitive user interaction.
|
||||
|
||||
However, comparison and casing behavior in .NET Core changes when the culture changes. This happens when an application executes on a computer that has a different culture than the computer on which the application was developed, or when the executing thread changes its culture. This behavior is intentional, but it remains non-obvious to many developers. The following example illustrates differences in sort order between the U.S. English ("en-US") and Swedish ("sv-SE") cultures. Note that the words "ångström", "Windows", and "Visual Studio" appear in different positions in the sorted string arrays.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Globalization;
|
||||
using System.Threading;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string[] values= { "able", "ångström", "apple", "Æble",
|
||||
"Windows", "Visual Studio" };
|
||||
Array.Sort(values);
|
||||
DisplayArray(values);
|
||||
|
||||
// Change culture to Swedish (Sweden).
|
||||
string originalCulture = CultureInfo.CurrentCulture.Name;
|
||||
Thread.CurrentThread.CurrentCulture = new CultureInfo("sv-SE");
|
||||
Array.Sort(values);
|
||||
DisplayArray(values);
|
||||
|
||||
// Restore the original culture.
|
||||
Thread.CurrentThread.CurrentCulture = new CultureInfo(originalCulture);
|
||||
}
|
||||
|
||||
private static void DisplayArray(string[] values)
|
||||
{
|
||||
Console.WriteLine("Sorting using the {0} culture:",
|
||||
CultureInfo.CurrentCulture.Name);
|
||||
foreach (string value in values)
|
||||
Console.WriteLine(" {0}", value);
|
||||
|
||||
Console.WriteLine();
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Sorting using the en-US culture:
|
||||
// able
|
||||
// Æble
|
||||
// ångström
|
||||
// apple
|
||||
// Visual Studio
|
||||
// Windows
|
||||
//
|
||||
// Sorting using the sv-SE culture:
|
||||
// able
|
||||
// Æble
|
||||
// apple
|
||||
// Windows
|
||||
// Visual Studio
|
||||
// ångström
|
||||
```
|
||||
|
||||
Case-insensitive comparisons that use the current culture are the same as culture-sensitive comparisons, except that they ignore case as dictated by the thread's current culture. This behavior may manifest itself in sort orders as well.
|
||||
|
||||
Comparisons that use current culture semantics are the default for the following methods:
|
||||
|
||||
* [String.Compare]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Compare_System_String_System_String_) overloads that do not include a [StringComparison]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison) parameter.
|
||||
|
||||
* [String.CompareTo]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_CompareTo_System_String_) overloads.
|
||||
|
||||
* The default [String.StartsWith(String)]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_StartsWith_System_String_) method.
|
||||
|
||||
* The default [String.EndsWith(String)]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_EndsWith_System_String_) method.
|
||||
|
||||
* [String.IndexOf]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_IndexOf_System_String_) overloads that accept a [String]( https://docs.microsoft.com/dotnet/core/api/System.String) as a search parameter and that do not have a [StringComparison]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison) parameter.
|
||||
|
||||
* [String.LastIndexOf]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_LastIndexOf_System_String_) overloads that accept a [String]( https://docs.microsoft.com/dotnet/core/api/System.String) as a search parameter and that do not have a [StringComparison]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison) parameter.
|
||||
|
||||
In any case, we recommend that you call an overload that has a [StringComparison]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison) parameter to make the intent of the method call clear.
|
||||
|
||||
Subtle and not so subtle bugs can emerge when non-linguistic string data is interpreted linguistically, or when string data from a particular culture is interpreted using the conventions of another culture. The canonical example is the Turkish-I problem.
|
||||
|
||||
For nearly all Latin alphabets, including U.S. English, the character "i" (\u0069) is the lowercase version of the character "I" (\u0049). This casing rule quickly becomes the default for someone programming in such a culture. However, the Turkish ("tr-TR") alphabet includes an "I with a dot" character "İ" (\u0130), which is the capital version of "i". Turkish also includes a lowercase "i without a dot" character, "ı" (\u0131), which capitalizes to "I". This behavior occurs in the Azerbaijani ("az") culture as well.
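
The casing rule described above can be observed directly by uppercasing "i" under different cultures. This is a sketch under the assumption that Turkish culture data is installed on the machine; the `Upper` helper is a hypothetical name. Code points are printed instead of the characters themselves so that the result does not depend on the console encoding.

```csharp
using System;
using System.Globalization;

public static class TurkishCasingSketch
{
    // Hypothetical helper: uppercases text using the casing rules of the given culture.
    public static string Upper(string text, CultureInfo culture)
    {
        return culture.TextInfo.ToUpper(text);
    }

    public static void Main()
    {
        string invariant = Upper("i", CultureInfo.InvariantCulture);
        Console.WriteLine("Invariant: U+{0:X4}", (int)invariant[0]);   // Invariant: U+0049

        // Assumption: "tr-TR" culture data is available. In that case the
        // expected code point is U+0130 (capital I with a dot).
        string turkish = Upper("i", new CultureInfo("tr-TR"));
        Console.WriteLine("tr-TR:     U+{0:X4}", (int)turkish[0]);
    }
}
```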
|
||||
|
||||
Therefore, assumptions made about capitalizing "i" or lowercasing "I" are not valid among all cultures. If you use the default overloads for string comparison routines, they will be subject to variance between cultures. If the data to be compared is non-linguistic, using the default overloads can produce undesirable results, as the following attempt to perform a case-insensitive comparison of the strings "file" and "FILE" illustrates.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Globalization;
|
||||
using System.Threading;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string fileUrl = "file";
|
||||
Thread.CurrentThread.CurrentCulture = new CultureInfo("en-US");
|
||||
Console.WriteLine("Culture = {0}",
|
||||
Thread.CurrentThread.CurrentCulture.DisplayName);
|
||||
Console.WriteLine("(file == FILE) = {0}",
|
||||
fileUrl.StartsWith("FILE", true, null));
|
||||
Console.WriteLine();
|
||||
|
||||
Thread.CurrentThread.CurrentCulture = new CultureInfo("tr-TR");
|
||||
Console.WriteLine("Culture = {0}",
|
||||
Thread.CurrentThread.CurrentCulture.DisplayName);
|
||||
Console.WriteLine("(file == FILE) = {0}",
|
||||
fileUrl.StartsWith("FILE", true, null));
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Culture = English (United States)
|
||||
// (file == FILE) = True
|
||||
//
|
||||
// Culture = Turkish (Turkey)
|
||||
// (file == FILE) = False
|
||||
```
|
||||
|
||||
This comparison could cause significant problems if the culture is inadvertently used in security-sensitive settings, as in the following example. A method call such as `IsFileURI("file:")` returns `true` if the current culture is U.S. English, but `false` if the current culture is Turkish. Thus, on Turkish systems, someone could circumvent security measures that block access to case-insensitive URIs that begin with "FILE:".
|
||||
|
||||
```csharp
|
||||
public static bool IsFileURI(String path)
|
||||
{
|
||||
return path.StartsWith("FILE:", true, null);
|
||||
}
|
||||
```
|
||||
|
||||
In this case, because "file:" is meant to be interpreted as a non-linguistic, culture-insensitive identifier, the code should instead be written as shown in the following example.
|
||||
|
||||
```csharp
|
||||
public static bool IsFileURI(string path)
|
||||
{
|
||||
return path.StartsWith("FILE:", StringComparison.OrdinalIgnoreCase);
|
||||
}
|
||||
```
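
One way to check that the ordinal version is independent of the current culture is to call it under both cultures. The harness below is a sketch; the `FileUriCheck` class name and the sample path are assumptions, and it presumes that the "en-US" and "tr-TR" cultures can be created on the machine.

```csharp
using System;
using System.Globalization;

public static class FileUriCheck
{
    public static bool IsFileURI(string path)
    {
        return path.StartsWith("FILE:", StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        CultureInfo original = CultureInfo.CurrentCulture;
        try
        {
            foreach (string name in new[] { "en-US", "tr-TR" })
            {
                CultureInfo.CurrentCulture = new CultureInfo(name);
                Console.WriteLine("{0}: {1}", name, IsFileURI("file:///c:/temp"));
            }
        }
        finally
        {
            // Restore the original culture even if culture creation fails.
            CultureInfo.CurrentCulture = original;
        }
        // The example displays the following output:
        //       en-US: True
        //       tr-TR: True
    }
}
```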
|
||||
|
||||
### Ordinal String Operations
|
||||
|
||||
Specifying the [StringComparison.Ordinal]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_Ordinal) or [StringComparison.OrdinalIgnoreCase]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_OrdinalIgnoreCase) value in a method call signifies a non-linguistic comparison in which the features of natural languages are ignored. Methods that are invoked with these [StringComparison]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison) values base string operation decisions on simple byte comparisons instead of casing or equivalence tables that are parameterized by culture. In most cases, this approach best fits the intended interpretation of strings while making code faster and more reliable.
|
||||
|
||||
Ordinal comparisons are string comparisons in which each byte of each string is compared without linguistic interpretation; for example, "windows" does not match "Windows". Use this comparison when the context dictates that strings should be matched exactly or demands conservative matching policy. Additionally, ordinal comparison is the fastest comparison operation because it applies no linguistic rules when determining a result.
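
The "windows" and "Windows" case mentioned above can be sketched in a few lines. The `OrdinalSketch` class and `Matches` helper are hypothetical names used only for this illustration.

```csharp
using System;

public static class OrdinalSketch
{
    // Hypothetical helper: compares two strings under an explicit rule.
    public static bool Matches(string left, string right, StringComparison kind)
    {
        return String.Equals(left, right, kind);
    }

    public static void Main()
    {
        Console.WriteLine(Matches("windows", "Windows", StringComparison.Ordinal));           // False
        Console.WriteLine(Matches("windows", "Windows", StringComparison.OrdinalIgnoreCase)); // True
    }
}
```

`StringComparison.Ordinal` is the conservative choice when strings must match exactly; `StringComparison.OrdinalIgnoreCase` relaxes only the case requirement and still applies no linguistic rules.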
|
||||
|
||||
Strings in .NET Core can contain embedded null characters. One of the clearest differences between ordinal and culture-sensitive comparison (including comparisons that use the invariant culture) concerns the handling of embedded null characters in a string. These characters are ignored when you use the [String.Compare]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Compare_System_String_System_Int32_System_String_System_Int32_System_Int32_) and [String.Equals]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Equals_System_String_) methods to perform culture-sensitive comparisons (including comparisons that use the invariant culture). As a result, in culture-sensitive comparisons, strings that contain embedded null characters can be considered equal to strings that do not.
|
||||
|
||||
> **Important**
|
||||
>
|
||||
> Although string comparison methods disregard embedded null characters, string search methods such as [String.Contains]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Contains_System_String_), [String.EndsWith]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_EndsWith_System_String_), [String.IndexOf]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_IndexOf_System_String_), [String.LastIndexOf]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_LastIndexOf_System_String_), and [String.StartsWith]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_StartsWith_System_String_) do not.
|
||||
|
||||
The following example performs a culture-sensitive comparison of the string "Aa" with a similar string that contains several embedded null characters between "A" and "a", and shows how the two strings are considered equal.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string str1 = "Aa";
|
||||
string str2 = "A" + new String('\u0000', 3) + "a";
|
||||
Console.WriteLine("Comparing '{0}' ({1}) and '{2}' ({3}):",
|
||||
str1, ShowBytes(str1), str2, ShowBytes(str2));
|
||||
Console.WriteLine(" With String.Compare:");
|
||||
Console.WriteLine(" Current Culture: {0}",
|
||||
String.Compare(str1, str2, StringComparison.CurrentCulture));
|
||||
Console.WriteLine(" Invariant Culture: {0}",
|
||||
String.Compare(str1, str2, StringComparison.InvariantCulture));
|
||||
|
||||
Console.WriteLine(" With String.Equals:");
|
||||
Console.WriteLine(" Current Culture: {0}",
|
||||
String.Equals(str1, str2, StringComparison.CurrentCulture));
|
||||
Console.WriteLine(" Invariant Culture: {0}",
|
||||
String.Equals(str1, str2, StringComparison.InvariantCulture));
|
||||
}
|
||||
|
||||
private static string ShowBytes(string str)
|
||||
{
|
||||
string hexString = String.Empty;
|
||||
for (int ctr = 0; ctr < str.Length; ctr++)
|
||||
{
|
||||
string result = String.Empty;
|
||||
result = Convert.ToInt32(str[ctr]).ToString("X4");
|
||||
result = " " + result.Substring(0,2) + " " + result.Substring(2, 2);
|
||||
hexString += result;
|
||||
}
|
||||
return hexString.Trim();
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Comparing 'Aa' (00 41 00 61) and 'A a' (00 41 00 00 00 00 00 00 00 61):
|
||||
// With String.Compare:
|
||||
// Current Culture: 0
|
||||
// Invariant Culture: 0
|
||||
// With String.Equals:
|
||||
// Current Culture: True
|
||||
// Invariant Culture: True
|
||||
```
|
||||
|
||||
However, the strings are not considered equal when you use ordinal comparison, as the following example shows.
|
||||
|
||||
```csharp
|
||||
Console.WriteLine("Comparing '{0}' ({1}) and '{2}' ({3}):",
|
||||
str1, ShowBytes(str1), str2, ShowBytes(str2));
|
||||
Console.WriteLine(" With String.Compare:");
|
||||
Console.WriteLine(" Ordinal: {0}",
|
||||
String.Compare(str1, str2, StringComparison.Ordinal));
|
||||
|
||||
Console.WriteLine(" With String.Equals:");
|
||||
Console.WriteLine(" Ordinal: {0}",
|
||||
String.Equals(str1, str2, StringComparison.Ordinal));
|
||||
// The example displays the following output:
|
||||
// Comparing 'Aa' (00 41 00 61) and 'A a' (00 41 00 00 00 00 00 00 00 61):
|
||||
// With String.Compare:
|
||||
// Ordinal: 97
|
||||
// With String.Equals:
|
||||
// Ordinal: False
|
||||
```
|
||||
|
||||
Case-insensitive ordinal comparisons are the next most conservative approach. These comparisons ignore most casing; for example, "windows" matches "Windows". When dealing with ASCII characters, this policy is equivalent to [StringComparison.Ordinal]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_Ordinal), except that it ignores the usual ASCII casing. Therefore, any character in [A, Z] (\u0041-\u005A) matches the corresponding character in [a, z] (\u0061-\u007A). Casing outside the ASCII range uses the invariant culture's tables. Therefore, the following comparison:
|
||||
|
||||
```csharp
|
||||
String.Compare(strA, strB, StringComparison.OrdinalIgnoreCase);
|
||||
```
|
||||
|
||||
is equivalent to (but faster than) this comparison:
|
||||
|
||||
```csharp
|
||||
String.Compare(strA.ToUpperInvariant(), strB.ToUpperInvariant(),
|
||||
StringComparison.Ordinal);
|
||||
```
|
||||
|
||||
These comparisons are still very fast.
|
||||
|
||||
Both [StringComparison.Ordinal]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_Ordinal) and [StringComparison.OrdinalIgnoreCase]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_OrdinalIgnoreCase) use the binary values directly, and are best suited for matching. When you are not sure about your comparison settings, use one of these two values. However, because they perform a byte-by-byte comparison, they do not sort by a linguistic sort order (like an English dictionary) but by a binary sort order. The results may look odd in most contexts if displayed to users.
|
||||
|
||||
Ordinal semantics are the default for [String.Equals]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Equals_System_String_) overloads that do not include a [StringComparison]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison) argument (including the equality operator). In any case, we recommend that you call an overload that has a [StringComparison]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison) parameter.
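As a minimal sketch of that recommendation, the following example states the comparison rule in every call. The class name `ExplicitEqualsExample` and the helper `SameIdentifier` are hypothetical names chosen for illustration.

```csharp
using System;

public static class ExplicitEqualsExample
{
    // Hypothetical helper: compares two identifiers exactly, with no
    // linguistic interpretation.
    public static bool SameIdentifier(string left, string right)
    {
        return String.Equals(left, right, StringComparison.Ordinal);
    }

    public static void Main()
    {
        // Explicit ordinal comparison: case differences matter.
        Console.WriteLine(SameIdentifier("windows", "Windows"));   // False
        // The equality operator is also ordinal, but the rule is implicit.
        Console.WriteLine("windows" == "Windows");                 // False
        // Explicit case-insensitive ordinal comparison.
        Console.WriteLine(String.Equals("windows", "Windows",
                          StringComparison.OrdinalIgnoreCase));    // True
    }
}
```

Writing the `StringComparison` value out also makes it easy to search a code base for every place that relies on a particular interpretation.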
|
||||
|
||||
### String Operations that Use the Invariant Culture
|
||||
|
||||
Comparisons with the invariant culture use the [CompareInfo]( https://docs.microsoft.com/dotnet/core/api/System.Globalization.CompareInfo#System_Globalization_CompareInfo) property returned by the static [CultureInfo.InvariantCulture]( https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo#System_Globalization_CultureInfo_InvariantCulture) property. This behavior is the same on all systems; it translates any characters outside its range into what it believes are equivalent invariant characters. This policy can be useful for maintaining one set of string behavior across cultures, but it often provides unexpected results.
|
||||
|
||||
Case-insensitive comparisons with the invariant culture use the static [CompareInfo]( https://docs.microsoft.com/dotnet/core/api/System.Globalization.CompareInfo#System_Globalization_CompareInfo) property returned by the static [CultureInfo.InvariantCulture]( https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo#System_Globalization_CultureInfo_InvariantCulture) property for comparison information as well. Any case differences among these translated characters are ignored.
|
||||
|
||||
The `CultureInfo.InvariantCulture.CompareInfo` object makes the [String.Compare]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Compare_System_String_System_String_) method interpret certain sets of characters as equivalent. For example, the following equivalence is valid under the invariant culture:
|
||||
|
||||
InvariantCulture: a + ̊ = å
|
||||
|
||||
The LATIN SMALL LETTER A character "a" (\u0061), when it is next to the COMBINING RING ABOVE character " ̊" (\u030a), is interpreted as the LATIN SMALL LETTER A WITH RING ABOVE character "å" (\u00e5).
|
||||
|
||||
When interpreting file names, cookies, or anything else where a combination such as "å" can appear, ordinal comparisons still offer the most transparent and fitting behavior.
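The difference can be sketched as follows. The class and member names are hypothetical. The result of the invariant-culture comparison depends on the culture data available to the runtime, so no output is shown for it. The ordinal result follows directly from the differing code units.

```csharp
using System;

public static class CombiningSequenceExample
{
    // "a" followed by COMBINING RING ABOVE (two UTF-16 code units).
    public const string Decomposed = "a\u030a";
    // LATIN SMALL LETTER A WITH RING ABOVE (one UTF-16 code unit).
    public const string Precomposed = "\u00e5";

    public static bool OrdinalEquals(string x, string y)
    {
        return String.Equals(x, y, StringComparison.Ordinal);
    }

    public static void Main()
    {
        // Linguistic comparison: expected to treat the two forms as
        // equivalent when culture data is available (an assumption
        // about how the runtime is configured).
        Console.WriteLine(String.Equals(Decomposed, Precomposed,
                          StringComparison.InvariantCulture));

        // Ordinal comparison: the code units differ, so the strings differ.
        Console.WriteLine(OrdinalEquals(Decomposed, Precomposed));   // False
    }
}
```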
|
||||
|
||||
On balance, the invariant culture has very few properties that make it useful for comparison. It does comparison in a linguistically relevant manner, which prevents it from guaranteeing full symbolic equivalence, but it is not the choice for display in any culture. One of the few reasons to use it for comparison is to persist ordered data so that it is displayed identically across cultures. For example, if a large data file that contains a list of sorted identifiers for display accompanies an application, adding to this list would require an insertion with invariant-style sorting.
|
||||
|
||||
## Choosing a StringComparison Member for Your Method Call
|
||||
|
||||
The following table outlines the mapping from semantic string context to a [StringComparison]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison) enumeration member.
|
||||
|
||||
Data | Behavior | Corresponding System.StringComparison value
|
||||
---- | -------- | -------------------------------------------
|
||||
Case-sensitive internal identifiers, case-sensitive identifiers in standards such as XML and HTTP, or case-sensitive security-related settings. | A non-linguistic identifier, where bytes match exactly. | [StringComparison.Ordinal]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_Ordinal)
|
||||
Case-insensitive internal identifiers, case-insensitive identifiers in standards such as XML and HTTP, file paths, registry keys and values, environment variables, resource identifiers (for example, handle names), or case-insensitive security-related settings. | A non-linguistic identifier, where case is irrelevant. | [StringComparison.OrdinalIgnoreCase]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_OrdinalIgnoreCase)
|
||||
Data displayed to the user or most user input. | Data that requires local linguistic customs. | [StringComparison.CurrentCulture]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_CurrentCulture) or [CurrentCultureIgnoreCase]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_CurrentCultureIgnoreCase)
|
||||
|
||||
## Common String Comparison Methods
|
||||
|
||||
The following sections describe the methods that are most commonly used for string comparison.
|
||||
|
||||
### String.Compare
|
||||
|
||||
Default interpretation: [StringComparison.CurrentCulture]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_CurrentCulture).
|
||||
|
||||
As the operation most central to string interpretation, all instances of these method calls should be examined to determine whether strings should be interpreted according to the current culture, or dissociated from the culture (symbolically). Typically, it is the latter, and a [StringComparison.Ordinal]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_Ordinal) comparison should be used instead.
|
||||
|
||||
The [System.Globalization.CompareInfo]( https://docs.microsoft.com/dotnet/core/api/System.Globalization.CompareInfo#System_Globalization_CompareInfo) class, which is returned by the [CultureInfo.CompareInfo]( https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo#System_Globalization_CultureInfo_CompareInfo) property, also includes a [Compare]( https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo#System_Globalization_CultureInfo_CompareInfo) method that provides a large number of matching options (ordinal, ignoring white space, ignoring kana type, and so on) by means of the [CompareOptions]( https://docs.microsoft.com/dotnet/core/api/System.Globalization.CompareOptions) flag enumeration.
|
||||
|
||||
### String.CompareTo
|
||||
|
||||
Default interpretation: [StringComparison.CurrentCulture]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_CurrentCulture).
|
||||
|
||||
This method does not currently offer an overload that specifies a [StringComparison]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison) type. It is usually possible to convert this method to the recommended [String.Compare(String, String, StringComparison)]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Compare_System_String_System_String_System_StringComparison_) form.
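A minimal sketch of such a conversion follows. The class name and the `CompareNames` helper are hypothetical, and ordinal comparison is only one possible choice for the third argument.

```csharp
using System;

public static class CompareToConversionExample
{
    public static int CompareNames(string first, string second)
    {
        // Before: first.CompareTo(second) uses the current culture implicitly.
        // After: the comparison rule is stated in the call.
        return String.Compare(first, second, StringComparison.Ordinal);
    }

    public static void Main()
    {
        Console.WriteLine(CompareNames("apple", "apple") == 0);   // True
        // 'A' (0x41) precedes 'a' (0x61) in ordinal order.
        Console.WriteLine(CompareNames("Apple", "apple") < 0);    // True
    }
}
```

Unlike the instance `CompareTo` method, the static `String.Compare` method also accepts a `null` first argument without throwing.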
|
||||
|
||||
Types that implement the [IComparable]( https://docs.microsoft.com/dotnet/core/api/System.IComparable) and [IComparable<T>]( https://docs.microsoft.com/dotnet/core/api/System.IComparable%601) interfaces implement this method. Because it does not offer the option of a [StringComparison]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison) parameter, implementing types often let the user specify a [StringComparer]( https://docs.microsoft.com/dotnet/core/api/System.StringComparer) in their constructor. The following example defines a `FileName` class whose class constructor includes a [StringComparer]( https://docs.microsoft.com/dotnet/core/api/System.StringComparer) parameter. This [StringComparer]( https://docs.microsoft.com/dotnet/core/api/System.StringComparer) object is then used in the `FileName.CompareTo` method.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
|
||||
public class FileName : IComparable
|
||||
{
|
||||
string fname;
|
||||
StringComparer comparer;
|
||||
|
||||
public FileName(string name, StringComparer comparer)
|
||||
{
|
||||
if (String.IsNullOrEmpty(name))
|
||||
throw new ArgumentNullException("name");
|
||||
|
||||
this.fname = name;
|
||||
|
||||
if (comparer != null)
|
||||
this.comparer = comparer;
|
||||
else
|
||||
this.comparer = StringComparer.OrdinalIgnoreCase;
|
||||
}
|
||||
|
||||
public string Name
|
||||
{
|
||||
get { return fname; }
|
||||
}
|
||||
|
||||
public int CompareTo(object obj)
|
||||
{
|
||||
if (obj == null) return 1;
|
||||
|
||||
if (! (obj is FileName))
|
||||
return comparer.Compare(this.fname, obj.ToString());
|
||||
else
|
||||
return comparer.Compare(this.fname, ((FileName) obj).Name);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### String.Equals
|
||||
|
||||
Default interpretation: [StringComparison.Ordinal]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_Ordinal).
|
||||
|
||||
The [String]( https://docs.microsoft.com/dotnet/core/api/System.String) class lets you test for equality by calling either the static or instance [Equals]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Equals_System_String_) method overloads, or by using the static equality operator. The overloads and operator use ordinal comparison by default. However, we still recommend that you call an overload that explicitly specifies the [StringComparison]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison) type even if you want to perform an ordinal comparison; this makes it easier to search code for a certain string interpretation.
|
||||
|
||||
### String.ToUpper and String.ToLower
|
||||
|
||||
Default interpretation: [StringComparison.CurrentCulture]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_CurrentCulture).
|
||||
|
||||
You should be careful when you use these methods, because forcing a string to uppercase or lowercase is often used as a small normalization for comparing strings regardless of case. If so, consider using a case-insensitive comparison.
|
||||
|
||||
The [String.ToUpperInvariant]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_ToUpperInvariant) and [String.ToLowerInvariant]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_ToLowerInvariant) methods are also available. [ToUpperInvariant]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_ToUpperInvariant) is the standard way to normalize case. Comparisons made using [StringComparison.OrdinalIgnoreCase]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_OrdinalIgnoreCase) are behaviorally the composition of two calls: calling [ToUpperInvariant]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_ToUpperInvariant) on both string arguments, and doing a comparison using [StringComparison.Ordinal]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_Ordinal).
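The following sketch contrasts the normalization approach with a direct case-insensitive comparison. The class and method names are hypothetical.

```csharp
using System;

public static class CaseNormalizationExample
{
    // Culture-dependent: ToUpper() uses the current culture's casing rules,
    // and it allocates two temporary strings.
    public static bool EqualsViaToUpper(string x, string y)
    {
        return x.ToUpper() == y.ToUpper();
    }

    // Culture-independent and allocation-free.
    public static bool EqualsIgnoreCase(string x, string y)
    {
        return String.Equals(x, y, StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        Console.WriteLine(EqualsIgnoreCase("FILE", "file"));   // True

        // The composition described above gives the same answer.
        Console.WriteLine(String.Equals("FILE".ToUpperInvariant(),
                                        "file".ToUpperInvariant(),
                                        StringComparison.Ordinal));   // True
    }
}
```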
|
||||
|
||||
Overloads are also available for converting to uppercase and lowercase in a specific culture, by passing a [CultureInfo]( https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo) object that represents that culture to the method.
|
||||
|
||||
### Char.ToUpper and Char.ToLower
|
||||
|
||||
Default interpretation: [StringComparison.CurrentCulture]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_CurrentCulture).
|
||||
|
||||
These methods work similarly to the [String.ToUpper]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_ToUpper) and [String.ToLower]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_ToLower) methods described in the previous section.
|
||||
|
||||
### String.StartsWith and String.EndsWith
|
||||
|
||||
Default interpretation: [StringComparison.CurrentCulture]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_CurrentCulture).
|
||||
|
||||
By default, both of these methods perform a culture-sensitive comparison.
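As with the other methods, the culture-sensitive default can be avoided by calling an overload that takes a `StringComparison` value. The sketch below assumes that URL schemes and file extensions should be matched without regard to case. The class and method names are hypothetical.

```csharp
using System;

public static class PrefixSuffixExample
{
    public static bool IsHttpUrl(string url)
    {
        return url.StartsWith("http://", StringComparison.OrdinalIgnoreCase);
    }

    public static bool HasTxtExtension(string path)
    {
        return path.EndsWith(".txt", StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        Console.WriteLine(IsHttpUrl("HTTP://example.com"));     // True
        Console.WriteLine(HasTxtExtension("notes.TXT"));        // True
        Console.WriteLine(HasTxtExtension("notes.txt.bak"));    // False
    }
}
```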
|
||||
|
||||
### String.IndexOf and String.LastIndexOf
|
||||
|
||||
Default interpretation: [StringComparison.CurrentCulture]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_CurrentCulture).
|
||||
|
||||
There is a lack of consistency in how the default overloads of these methods perform comparisons. All [String.IndexOf]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_IndexOf_System_Char_) and [String.LastIndexOf]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_LastIndexOf_System_Char_) methods that include a [Char]( https://docs.microsoft.com/dotnet/core/api/System.Char) parameter perform an ordinal comparison, but the default [String.IndexOf]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_IndexOf_System_Char_) and [String.LastIndexOf]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_LastIndexOf_System_Char_) methods that include a [String]( https://docs.microsoft.com/dotnet/core/api/System.String) parameter perform a culture-sensitive comparison.
|
||||
|
||||
If you call the [String.IndexOf]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_IndexOf_System_String_) or [String.LastIndexOf]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_LastIndexOf_System_String_) method and pass it a string to locate in the current instance, we recommend that you call an overload that explicitly specifies the [StringComparison]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison) type. The overloads that include a [Char]( https://docs.microsoft.com/dotnet/core/api/System.Char) argument do not allow you to specify a [StringComparison]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison) type.
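A short sketch of the recommended call shape follows. The class name and the `FindOrdinal` helper are hypothetical.

```csharp
using System;

public static class IndexOfExample
{
    public static int FindOrdinal(string text, string value)
    {
        // The comparison rule is explicit instead of culture-sensitive.
        return text.IndexOf(value, StringComparison.Ordinal);
    }

    public static void Main()
    {
        string text = "key=Value";

        Console.WriteLine(FindOrdinal(text, "Value"));    // 4
        Console.WriteLine(FindOrdinal(text, "value"));    // -1
        Console.WriteLine(text.IndexOf("value",
                          StringComparison.OrdinalIgnoreCase));   // 4

        // The Char overload is always ordinal.
        Console.WriteLine(text.IndexOf('='));             // 3
    }
}
```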
|
||||
|
||||
## Methods that Perform String Comparison Indirectly
|
||||
|
||||
Some non-string methods that have string comparison as a central operation use the [StringComparer]( https://docs.microsoft.com/dotnet/core/api/System.StringComparer) type. The [StringComparer]( https://docs.microsoft.com/dotnet/core/api/System.StringComparer) class includes four static properties that return [StringComparer]( https://docs.microsoft.com/dotnet/core/api/System.StringComparer) instances whose [StringComparer.Compare]( https://docs.microsoft.com/dotnet/core/api/System.StringComparer#System_StringComparer_Compare_System_String_System_String_) methods perform the following types of string comparisons:
|
||||
|
||||
* Culture-sensitive string comparisons using the current culture. This [StringComparer]( https://docs.microsoft.com/dotnet/core/api/System.StringComparer) object is returned by the [StringComparer.CurrentCulture]( https://docs.microsoft.com/dotnet/core/api/System.StringComparer#System_StringComparer_CurrentCulture) property.
|
||||
|
||||
* Case-insensitive comparisons using the current culture. This [StringComparer]( https://docs.microsoft.com/dotnet/core/api/System.StringComparer) object is returned by the [StringComparer.CurrentCultureIgnoreCase]( https://docs.microsoft.com/dotnet/core/api/System.StringComparer#System_StringComparer_CurrentCultureIgnoreCase) property.
|
||||
|
||||
* Ordinal comparison. This [StringComparer]( https://docs.microsoft.com/dotnet/core/api/System.StringComparer) object is returned by the [StringComparer.Ordinal]( https://docs.microsoft.com/dotnet/core/api/System.StringComparer#System_StringComparer_Ordinal) property.
|
||||
|
||||
* Case-insensitive ordinal comparison. This [StringComparer]( https://docs.microsoft.com/dotnet/core/api/System.StringComparer) object is returned by the [StringComparer.OrdinalIgnoreCase]( https://docs.microsoft.com/dotnet/core/api/System.StringComparer#System_StringComparer_OrdinalIgnoreCase) property.
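The following sketch shows two of these comparers passed to collection types. It uses the generic `Dictionary<TKey, TValue>` and `List<T>` classes, and the class name, method name, and sample data are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class StringComparerExample
{
    public static Dictionary<string, int> CreateHeaderTable()
    {
        // Keys are matched without regard to case and without culture rules.
        var table = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        table["Content-Length"] = 42;
        return table;
    }

    public static void Main()
    {
        Dictionary<string, int> headers = CreateHeaderTable();
        Console.WriteLine(headers.ContainsKey("content-length"));   // True

        // Ordinal sort: uppercase letters precede lowercase letters.
        var names = new List<string> { "b", "a", "B", "A" };
        names.Sort(StringComparer.Ordinal);
        Console.WriteLine(String.Join(",", names));                 // A,B,a,b
    }
}
```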
|
||||
|
||||
### Array.Sort and Array.BinarySearch
|
||||
|
||||
Default interpretation: [StringComparison.CurrentCulture]( https://docs.microsoft.com/dotnet/core/api/System.StringComparison#System_StringComparison_CurrentCulture).
|
||||
|
||||
When you store any data in a collection, or read persisted data from a file or database into a collection, switching the current culture can invalidate the invariants in the collection. The [Array.BinarySearch]( https://docs.microsoft.com/dotnet/core/api/System.Array#System_Array_BinarySearch_System_Array_System_Object_) method assumes that the elements in the array to be searched are already sorted. To sort any string element in the array, the [Array.Sort]( https://docs.microsoft.com/dotnet/core/api/System.Array#System_Array_Sort_System_Array_) method calls the [String.Compare]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Compare_System_String_System_String_) method to order individual elements. Using a culture-sensitive comparer can be dangerous if the culture changes between the time that the array is sorted and its contents are searched. For example, in the following code, storage and retrieval operate on the comparer that is provided implicitly by the `Thread.CurrentThread.CurrentCulture` property. If the culture can change between the calls to `StoreNames` and `DoesNameExist`, and especially if the array contents are persisted somewhere between the two method calls, the binary search may fail.
|
||||
|
||||
```csharp
|
||||
// Incorrect.
|
||||
string []storedNames;
|
||||
|
||||
public void StoreNames(string [] names)
|
||||
{
|
||||
int index = 0;
|
||||
storedNames = new string[names.Length];
|
||||
|
||||
foreach (string name in names)
|
||||
{
|
||||
this.storedNames[index++] = name;
|
||||
}
|
||||
|
||||
Array.Sort(this.storedNames); // Line A. Sorts the stored copy using the current culture.
|
||||
}
|
||||
|
||||
public bool DoesNameExist(string name)
|
||||
{
|
||||
return (Array.BinarySearch(this.storedNames, name) >= 0); // Line B.
|
||||
}
|
||||
```
|
||||
|
||||
A recommended variation appears in the following example, which uses the same ordinal (culture-insensitive) comparison method both to sort and to search the array. The changed code is reflected in the lines labeled `Line A` and `Line B` in the two examples.
|
||||
|
||||
```csharp
|
||||
// Correct.
|
||||
string []storedNames;
|
||||
|
||||
public void StoreNames(string [] names)
|
||||
{
|
||||
int index = 0;
|
||||
storedNames = new string[names.Length];
|
||||
|
||||
foreach (string name in names)
|
||||
{
|
||||
this.storedNames[index++] = name;
|
||||
}
|
||||
|
||||
Array.Sort(this.storedNames, StringComparer.Ordinal); // Line A. Sorts the stored copy.
|
||||
}
|
||||
|
||||
public bool DoesNameExist(string name)
|
||||
{
|
||||
return (Array.BinarySearch(this.storedNames, name, StringComparer.Ordinal) >= 0); // Line B.
|
||||
}
|
||||
```
|
||||
|
||||
### Collections Example: Hashtable Constructor
|
||||
|
||||
Hashing strings provides a second example of an operation that is affected by the way in which strings are compared.
|
||||
|
||||
The following example instantiates a [Hashtable]( https://docs.microsoft.com/dotnet/core/api/System.Collections.Hashtable) object by passing it the [StringComparer]( https://docs.microsoft.com/dotnet/core/api/System.StringComparer) object that is returned by the [StringComparer.OrdinalIgnoreCase]( https://docs.microsoft.com/dotnet/core/api/System.StringComparer#System_StringComparer_OrdinalIgnoreCase) property. Because a class that is derived from [StringComparer]( https://docs.microsoft.com/dotnet/core/api/System.StringComparer) implements the [IEqualityComparer]( https://docs.microsoft.com/dotnet/core/api/System.Collections.IEqualityComparer) interface, its [GetHashCode]( https://docs.microsoft.com/dotnet/core/api/System.Collections.IEqualityComparer#System_Collections_IEqualityComparer_GetHashCode_System_Object_) method is used to compute the hash code of strings in the hash table.
|
||||
|
||||
```csharp
|
||||
const int initialTableCapacity = 100;
|
||||
Hashtable h;
|
||||
|
||||
public void PopulateFileTable(string directory)
|
||||
{
|
||||
h = new Hashtable(initialTableCapacity,
|
||||
StringComparer.OrdinalIgnoreCase);
|
||||
|
||||
foreach (string file in Directory.GetFiles(directory))
|
||||
h.Add(file, File.GetCreationTime(file));
|
||||
}
|
||||
|
||||
public void PrintCreationTime(string targetFile)
|
||||
{
|
||||
Object dt = h[targetFile];
|
||||
if (dt != null)
|
||||
{
|
||||
Console.WriteLine("File {0} was created at time {1}.",
|
||||
targetFile,
|
||||
(DateTime) dt);
|
||||
}
|
||||
else
|
||||
{
|
||||
Console.WriteLine("File {0} does not exist.", targetFile);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Displaying and Persisting Formatted Data
|
||||
|
||||
When you display non-string data such as numbers and dates and times to users, format them by using the user's cultural settings. By default, the [String.Format]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Format_System_String_System_Object_) method and the `ToString` methods of the numeric types and the date and time types use the current thread culture for formatting operations. To explicitly specify that the formatting method should use the current culture, you can call an overload of a formatting method that has a provider parameter, such as [String.Format(IFormatProvider, String, Object[])]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Format_System_IFormatProvider_System_String_System_Object___) or [DateTime.ToString(IFormatProvider)]( https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_ToString_System_IFormatProvider_), and pass it the [CultureInfo.CurrentCulture]( https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo#System_Globalization_CultureInfo_CurrentCulture) property.
|
||||
|
||||
You can persist non-string data either as binary data or as formatted data. If you choose to save it as formatted data, you should call a formatting method overload that includes a *provider* parameter and pass it the [CultureInfo.InvariantCulture]( https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo#System_Globalization_CultureInfo_InvariantCulture) property. The invariant culture provides a consistent format for formatted data that is independent of culture and machine. In contrast, persisting data that is formatted by using cultures other than the invariant culture has a number of limitations:
|
||||
|
||||
* The data is likely to be unusable if it is retrieved on a system that has a different culture, or if the user of the current system changes the current culture and tries to retrieve the data.
|
||||
|
||||
* The properties of a culture on a specific computer can differ from standard values. At any time, a user can customize culture-sensitive display settings. Because of this, formatted data that is saved on a system may not be readable after the user customizes cultural settings. The portability of formatted data across computers is likely to be even more limited.
|
||||
|
||||
* International, regional, or national standards that govern the formatting of numbers or dates and times change over time, and these changes are incorporated into operating system updates. When formatting conventions change, data that was formatted by using the previous conventions may become unreadable.
|
||||
|
||||
The following example illustrates the limited portability that results from using culture-sensitive formatting to persist data. The example saves an array of date and time values to a file. These are formatted by using the conventions of the English (United States) culture. After the application changes the current thread culture to French (Switzerland), it tries to read the saved values by using the formatting conventions of the current culture. The attempt to read two of the data items throws a [FormatException]( https://docs.microsoft.com/dotnet/core/api/System.FormatException) exception, and the array of dates now contains two incorrect elements that are equal to [MinValue]( https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_MinValue).
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Globalization;
|
||||
using System.IO;
|
||||
using System.Text;
|
||||
using System.Threading;
|
||||
|
||||
public class Example
|
||||
{
|
||||
private static string filename = @".\dates.dat";
|
||||
|
||||
public static void Main()
|
||||
{
|
||||
DateTime[] dates = { new DateTime(1758, 5, 6, 21, 26, 0),
|
||||
new DateTime(1818, 5, 5, 7, 19, 0),
|
||||
new DateTime(1870, 4, 22, 23, 54, 0),
|
||||
new DateTime(1890, 9, 8, 6, 47, 0),
|
||||
new DateTime(1905, 2, 18, 15, 12, 0) };
|
||||
// Write the data to a file using the current culture.
|
||||
WriteData(dates);
|
||||
// Change the current culture.
|
||||
Thread.CurrentThread.CurrentCulture = CultureInfo.CreateSpecificCulture("fr-CH");
|
||||
// Read the data using the current culture.
|
||||
DateTime[] newDates = ReadData();
|
||||
foreach (var newDate in newDates)
|
||||
Console.WriteLine(newDate.ToString("g"));
|
||||
}
|
||||
|
||||
private static void WriteData(DateTime[] dates)
|
||||
{
|
||||
StreamWriter sw = new StreamWriter(filename, false, Encoding.UTF8);
|
||||
for (int ctr = 0; ctr < dates.Length; ctr++) {
|
||||
sw.Write("{0}", dates[ctr].ToString("g", CultureInfo.CurrentCulture));
|
||||
if (ctr < dates.Length - 1) sw.Write("|");
|
||||
}
|
||||
sw.Close();
|
||||
}
|
||||
|
||||
private static DateTime[] ReadData()
|
||||
{
|
||||
bool exceptionOccurred = false;
|
||||
|
||||
// Read file contents as a single string, then split it.
|
||||
StreamReader sr = new StreamReader(filename, Encoding.UTF8);
|
||||
string output = sr.ReadToEnd();
|
||||
sr.Close();
|
||||
|
||||
string[] values = output.Split( new char[] { '|' } );
|
||||
DateTime[] newDates = new DateTime[values.Length];
|
||||
for (int ctr = 0; ctr < values.Length; ctr++) {
|
||||
try {
|
||||
newDates[ctr] = DateTime.Parse(values[ctr], CultureInfo.CurrentCulture);
|
||||
}
|
||||
catch (FormatException) {
|
||||
Console.WriteLine("Failed to parse {0}", values[ctr]);
|
||||
exceptionOccurred = true;
|
||||
}
|
||||
}
|
||||
if (exceptionOccurred) Console.WriteLine();
|
||||
return newDates;
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Failed to parse 4/22/1870 11:54 PM
|
||||
// Failed to parse 2/18/1905 3:12 PM
|
||||
//
|
||||
// 05.06.1758 21:26
|
||||
// 05.05.1818 07:19
|
||||
// 01.01.0001 00:00
|
||||
// 09.08.1890 06:47
|
||||
// 01.01.0001 00:00
|
||||
```
|
||||
|
||||
However, if you replace the [CultureInfo.CurrentCulture]( https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo#System_Globalization_CultureInfo_CurrentCulture) property with [CultureInfo.InvariantCulture]( https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo#System_Globalization_CultureInfo_InvariantCulture) in the calls to [DateTime.ToString(String, IFormatProvider)]( https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_ToString_System_String_System_IFormatProvider_) and [DateTime.Parse(String, IFormatProvider)]( https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_Parse_System_String_System_IFormatProvider_), the persisted date and time data is successfully restored, as the following output shows.
|
||||
|
||||
```csharp
|
||||
// 06.05.1758 21:26
|
||||
// 05.05.1818 07:19
|
||||
// 22.04.1870 23:54
|
||||
// 08.09.1890 06:47
|
||||
// 18.02.1905 15:12
|
||||
```
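
The change described above can be sketched in a few lines. This is a minimal illustration rather than the original sample: the `InvariantRoundTrip` class, its `Persist` and `Restore` helpers, and the choice of the general ("g") format string are assumptions made for the sketch.

```csharp
using System;
using System.Globalization;

public static class InvariantRoundTrip
{
    // Format with the invariant culture so the persisted text does not
    // depend on the culture of the machine that writes it.
    public static string Persist(DateTime value) =>
        value.ToString("g", CultureInfo.InvariantCulture);

    // Parse with the same invariant culture, whatever the current culture is.
    public static DateTime Restore(string text) =>
        DateTime.Parse(text, CultureInfo.InvariantCulture);

    public static void Main()
    {
        var original = new DateTime(1905, 2, 18, 15, 12, 0);
        string text = Persist(original);
        DateTime restored = Restore(text);
        Console.WriteLine("{0} -> round-trips: {1}", text, restored == original);
    }
}
```

Because both calls name the same culture explicitly, the result does not change when the data is written on one system and read on another with different regional settings.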
|
||||
|
||||
|
||||
|
||||
|
||||
|
|
@ -1,890 +0,0 @@
|
|||
---
|
||||
title: Character Encoding in .NET Core
|
||||
description: Character Encoding in .NET Core
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: edfdc925-db67-4ff9-9f26-01340023f231
|
||||
---
|
||||
|
||||
# Character Encoding in .NET Core
|
||||
|
||||
Characters are abstract entities that can be represented in many different ways. A character encoding is a system that pairs each character in a supported character set with some value that represents that character. For example, Morse code is a character encoding that pairs each character in the Roman alphabet with a pattern of dots and dashes that are suitable for transmission over telegraph lines. A character encoding for computers pairs each character in a supported character set with a numeric value that represents that character. A character encoding has two distinct components:
|
||||
|
||||
* An encoder, which translates a sequence of characters into a sequence of numeric values (bytes).
|
||||
|
||||
* A decoder, which translates a sequence of bytes into a sequence of characters.
|
||||
|
||||
Character encoding describes the rules by which an encoder and a decoder operate. For example, the [UTF8Encoding](https://docs.microsoft.com/dotnet/core/api/System.Text.UTF8Encoding) class describes the rules for encoding to, and decoding from, 8-bit Unicode Transformation Format (UTF-8), which uses one to four bytes to represent a single Unicode character. Encoding and decoding can also include validation. For example, the [UnicodeEncoding](https://docs.microsoft.com/dotnet/core/api/System.Text.UnicodeEncoding) class checks all surrogates to make sure they constitute valid surrogate pairs. (A surrogate pair consists of a character with a code point that ranges from U+D800 to U+DBFF followed by a character with a code point that ranges from U+DC00 to U+DFFF.) A fallback strategy determines how an encoder handles invalid characters or how a decoder handles invalid bytes.
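
As a small sketch of the one-to-four-byte rule, the following code encodes four sample characters with UTF-8 and decodes them again. The `Utf8Widths` class and the sample characters are illustrative choices, not part of the original topic.

```csharp
using System;
using System.Text;

public static class Utf8Widths
{
    // Number of bytes the UTF-8 encoder produces for the given text.
    public static int ByteCount(string text) => Encoding.UTF8.GetByteCount(text);

    public static void Main()
    {
        // U+0041, U+00E4, U+20AC, and U+10005 (a supplementary character).
        string[] samples = { "A", "\u00e4", "\u20ac", "\U00010005" };
        foreach (string s in samples)
        {
            byte[] encoded = Encoding.UTF8.GetBytes(s);        // encoder side
            string decoded = Encoding.UTF8.GetString(encoded); // decoder side
            Console.WriteLine("{0} byte(s), round-trips: {1}",
                              encoded.Length, decoded == s);
        }
    }
}
```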
|
||||
|
||||
> **Warning**
|
||||
>
|
||||
> The .NET Core encoding classes provide a way to store and convert character data. They should not be used to store binary data in string form. Depending on the encoding used, converting binary data to string format with the encoding classes can introduce unexpected behavior and produce inaccurate or corrupted data. To convert binary data to a string form, use the [Convert.ToBase64String](https://docs.microsoft.com/dotnet/core/api/System.Convert#System_Convert_ToBase64String_System_Byte___) method.
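
The warning above can be illustrated with a short sketch. The `BinaryAsText` class and the sample bytes are hypothetical; the point is only that Base64 preserves arbitrary bytes, whereas a text encoding such as UTF-8 replaces byte sequences it considers invalid.

```csharp
using System;
using System.Linq;
using System.Text;

public static class BinaryAsText
{
    // Base64 is designed for arbitrary bytes, so the round trip is lossless.
    public static bool Base64RoundTrips(byte[] data) =>
        Convert.FromBase64String(Convert.ToBase64String(data)).SequenceEqual(data);

    // A text encoding substitutes U+FFFD for invalid sequences, so the
    // original bytes cannot be recovered.
    public static bool Utf8RoundTrips(byte[] data) =>
        Encoding.UTF8.GetBytes(Encoding.UTF8.GetString(data)).SequenceEqual(data);

    public static void Main()
    {
        byte[] data = { 0xFF, 0xFE, 0x00, 0x80 }; // not valid UTF-8
        Console.WriteLine("Base64 preserves the bytes: {0}", Base64RoundTrips(data));
        Console.WriteLine("UTF-8 preserves the bytes:  {0}", Utf8RoundTrips(data));
    }
}
```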
|
||||
|
||||
.NET Core uses the UTF-16 encoding (represented by the [UnicodeEncoding](https://docs.microsoft.com/dotnet/core/api/System.Text.UnicodeEncoding) class) to represent characters and strings. Applications that target the common language runtime use encoders to map Unicode character representations supported by the common language runtime to other encoding schemes. They use decoders to map characters from non-Unicode encodings to Unicode.
|
||||
|
||||
This topic consists of the following sections:
|
||||
|
||||
* [Encodings in .NET Core](#Encodings-in-.NET-Core)
|
||||
|
||||
* [Selecting an Encoding Class](#Selecting-an-Encoding-Class)
|
||||
|
||||
* [Using an Encoding Object](#Using-an-Encoding-Object)
|
||||
|
||||
* [Choosing a Fallback Strategy](#Choosing-a-Fallback-Strategy)
|
||||
|
||||
* [Implementing a Custom Fallback Strategy](#Implementing-a-Custom-Fallback-Strategy)
|
||||
|
||||
## Encodings in .NET Core
|
||||
|
||||
All character encoding classes in .NET Core inherit from the [System.Text.Encoding](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoding) class, which is an abstract class that defines the functionality common to all character encodings. To access the individual encoding objects implemented in .NET Core, do the following:
|
||||
|
||||
* Use the static properties of the [Encoding](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoding) class, which return objects that represent the standard character encodings available in .NET Core (ASCII, UTF-7, UTF-8, UTF-16, and UTF-32). For example, the [Encoding.Unicode](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoding#System_Text_Encoding_Unicode) property returns a [UnicodeEncoding](https://docs.microsoft.com/dotnet/core/api/System.Text.UnicodeEncoding) object. Each object uses replacement fallback to handle strings that it cannot encode and bytes that it cannot decode. (For more information, see the [Replacement Fallback](#Replacement-Fallback) section.)
|
||||
|
||||
* Call the encoding's class constructor. Objects for the ASCII, UTF-7, UTF-8, UTF-16, and UTF-32 encodings can be instantiated in this way. By default, each object uses replacement fallback to handle strings that it cannot encode and bytes that it cannot decode, but you can specify that an exception should be thrown instead. (For more information, see the [Replacement Fallback](#Replacement-Fallback) and [Exception Fallback](#Exception-Fallback) sections.)
|
||||
|
||||
* Call the [Encoding.Encoding(Int32)](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoding#System_Text_Encoding__ctor_System_Int32_) constructor and pass it an integer that represents the encoding. Standard encoding objects use replacement fallback, and code page and double-byte character set (DBCS) encoding objects use best-fit fallback to handle strings that they cannot encode and bytes that they cannot decode. (For more information, see the [Best-Fit Fallback](#Best-Fit-Fallback) section.)
|
||||
|
||||
* Call the [Encoding.GetEncoding](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoding#System_Text_Encoding_GetEncoding_System_Int32_) method, which returns any standard, code page, or DBCS encoding available in .NET Core. Overloads let you specify a fallback object for both the encoder and the decoder.
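
Three of the approaches listed above (static property, class constructor, and `GetEncoding`) might look like the following sketch. The `EncodingAccess` class and its method names are assumptions made for illustration; the constructor call asks for an exception on invalid bytes instead of the default replacement behavior.

```csharp
using System;
using System.Text;

public static class EncodingAccess
{
    // Static property: returns a UTF-16 (little-endian) encoding object.
    public static Encoding FromStaticProperty() => Encoding.Unicode;

    // Constructor: no byte order mark, and throw on invalid bytes.
    public static Encoding FromConstructor() =>
        new UTF8Encoding(encoderShouldEmitUTF8Identifier: false,
                         throwOnInvalidBytes: true);

    // GetEncoding: look the encoding up by name.
    public static Encoding FromGetEncoding() => Encoding.GetEncoding("utf-8");

    public static void Main()
    {
        Console.WriteLine(FromStaticProperty().WebName);
        Console.WriteLine(FromConstructor().WebName);
        Console.WriteLine(FromGetEncoding().WebName);
    }
}
```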
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> The Unicode Standard assigns a code point (a number) and a name to each character in every supported script. For example, the character "A" is represented by the code point U+0041 and the name "LATIN CAPITAL LETTER A". The Unicode Transformation Format (UTF) encodings define ways to encode that code point into a sequence of one or more bytes. A Unicode encoding scheme simplifies world-ready application development because it allows characters from any character set to be represented in a single encoding. Application developers no longer have to keep track of the encoding scheme that was used to produce characters for a specific language or writing system, and data can be shared among systems internationally without being corrupted.
|
||||
>
|
||||
> .NET Core supports three encodings defined by the Unicode standard: UTF-8, UTF-16, and UTF-32. For more information, see The Unicode Standard at the [Unicode](http://www.unicode.org/) home page.
|
||||
|
||||
.NET Core supports the character encoding systems listed in the following table.
|
||||
|
||||
Encoding | Class | Description | Advantages/disadvantages
|
||||
-------- | ----- | ----------- | ------------------------
|
||||
ASCII | [ASCIIEncoding](https://docs.microsoft.com/dotnet/core/api/System.Text.ASCIIEncoding) | Encodes a limited range of characters by using the lower seven bits of a byte. | Because this encoding only supports character values from U+0000 through U+007F, in most cases it is inadequate for internationalized applications.
|
||||
UTF-7 | [UTF7Encoding](https://docs.microsoft.com/dotnet/core/api/System.Text.UTF7Encoding) | Represents characters as sequences of 7-bit ASCII characters. Non-ASCII Unicode characters are represented by an escape sequence of ASCII characters. | UTF-7 supports protocols such as e-mail and newsgroup protocols. However, UTF-7 is not particularly secure or robust. In some cases, changing one bit can radically alter the interpretation of an entire UTF-7 string. In other cases, different UTF-7 strings can encode the same text. For sequences that include non-ASCII characters, UTF-7 requires more space than UTF-8, and encoding/decoding is slower. Consequently, you should use UTF-8 instead of UTF-7 if possible.
|
||||
UTF-8 | [UTF8Encoding](https://docs.microsoft.com/dotnet/core/api/System.Text.UTF8Encoding) | Represents each Unicode code point as a sequence of one to four bytes. | UTF-8 supports 8-bit data sizes and works well with many existing operating systems. For the ASCII range of characters, UTF-8 is identical to ASCII encoding and allows a broader set of characters. However, for Chinese-Japanese-Korean (CJK) scripts, UTF-8 can require three bytes for each character, and can potentially cause larger data sizes than UTF-16. Note that sometimes the amount of ASCII data, such as HTML tags, justifies the increased size for the CJK range.
|
||||
UTF-16 | [UnicodeEncoding](https://docs.microsoft.com/dotnet/core/api/System.Text.UnicodeEncoding) | Represents each Unicode code point as a sequence of one or two 16-bit integers. Most common Unicode characters require only one UTF-16 code unit, although Unicode supplementary characters (U+10000 and greater) require two UTF-16 code units (a surrogate pair). Both little-endian and big-endian byte orders are supported. | UTF-16 encoding is used by the common language runtime to represent Char and String values, and it is used by the Windows operating system to represent WCHAR values.
|
||||
UTF-32 | [UTF32Encoding](https://docs.microsoft.com/dotnet/core/api/System.Text.UTF32Encoding) | Represents each Unicode code point as a 32-bit integer. Both little-endian and big-endian byte orders are supported. | UTF-32 encoding is used when applications want to avoid the surrogate code point behavior of UTF-16 encoding on operating systems for which encoded space is too important. Single glyphs rendered on a display can still be encoded with more than one UTF-32 character.
|
||||
|
||||
These encodings enable you to work with Unicode characters as well as with encodings that are most commonly used in legacy applications. In addition, you can create a custom encoding by defining a class that derives from [Encoding](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoding) and overriding its members.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> By default, .NET Core does not make available any code page encodings other than code page 28591 and the Unicode encodings, such as UTF-8 and UTF-16. However, you can add the code page encodings found in standard Windows apps that target the .NET Framework to your app. For complete information, see the [CodePagesEncodingProvider](https://docs.microsoft.com/dotnet/core/api/System.Text.CodePagesEncodingProvider) topic.
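
A minimal sketch of making the Windows code pages available follows. It assumes the project references the `System.Text.Encoding.CodePages` package; the `CodePageSetup` class and `GetWindows1252` method are hypothetical names.

```csharp
using System;
using System.Text;

public static class CodePageSetup
{
    // Registers the code page provider, then requests code page 1252.
    // Without the registration, the GetEncoding(1252) call is expected to
    // fail on .NET Core because the code page is not available by default.
    public static Encoding GetWindows1252()
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(1252);
    }

    public static void Main()
    {
        byte[] bytes = GetWindows1252().GetBytes("\u00e9"); // LATIN SMALL LETTER E WITH ACUTE
        Console.WriteLine("{0:X2}", bytes[0]);
    }
}
```

Registering the provider once at application startup is usually enough; later calls to `Encoding.GetEncoding` can then resolve the additional code pages.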
|
||||
|
||||
## Selecting an Encoding Class
|
||||
|
||||
If you have the opportunity to choose the encoding to be used by your application, you should use a Unicode encoding, preferably either [UTF8Encoding](https://docs.microsoft.com/dotnet/core/api/System.Text.UTF8Encoding) or [UnicodeEncoding](https://docs.microsoft.com/dotnet/core/api/System.Text.UnicodeEncoding). (.NET Core also supports a third Unicode encoding, [UTF32Encoding](https://docs.microsoft.com/dotnet/core/api/System.Text.UTF32Encoding).)
|
||||
|
||||
If you are planning to use an ASCII encoding ([ASCIIEncoding](https://docs.microsoft.com/dotnet/core/api/System.Text.ASCIIEncoding)), choose [UTF8Encoding](https://docs.microsoft.com/dotnet/core/api/System.Text.UTF8Encoding) instead. The two encodings are identical for the ASCII character set, but [UTF8Encoding](https://docs.microsoft.com/dotnet/core/api/System.Text.UTF8Encoding) has the following advantages:
|
||||
|
||||
* It can represent every Unicode character, whereas [ASCIIEncoding](https://docs.microsoft.com/dotnet/core/api/System.Text.ASCIIEncoding) supports only the Unicode character values between U+0000 and U+007F.
|
||||
|
||||
* It provides error detection and better security.
|
||||
|
||||
* It has been tuned to be as fast as possible and should be faster than any other encoding. Even for content that is entirely ASCII, operations performed with [UTF8Encoding](https://docs.microsoft.com/dotnet/core/api/System.Text.UTF8Encoding) are faster than operations performed with [ASCIIEncoding](https://docs.microsoft.com/dotnet/core/api/System.Text.ASCIIEncoding).
|
||||
|
||||
You should consider using [ASCIIEncoding](https://docs.microsoft.com/dotnet/core/api/System.Text.ASCIIEncoding) only for legacy applications. However, even for legacy applications, [UTF8Encoding](https://docs.microsoft.com/dotnet/core/api/System.Text.UTF8Encoding) might be a better choice for the following reasons (assuming default settings):
|
||||
|
||||
* If your application has content that is not strictly ASCII and encodes it with [ASCIIEncoding](https://docs.microsoft.com/dotnet/core/api/System.Text.ASCIIEncoding), each non-ASCII character encodes as a question mark (?). If the application then decodes this data, the information is lost.
|
||||
|
||||
|
||||
* If your application has content that is not strictly ASCII and encodes it with [UTF8Encoding](https://docs.microsoft.com/dotnet/core/api/System.Text.UTF8Encoding), the result seems unintelligible if interpreted as ASCII. However, if the application then uses a UTF-8 decoder to decode this data, the data performs a round trip successfully.
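
The two cases above can be compared directly. In this sketch the `AsciiVersusUtf8` class and the sample word are illustrative; both encodings are used with their default settings.

```csharp
using System;
using System.Text;

public static class AsciiVersusUtf8
{
    // Encode the text and immediately decode it with the same encoding.
    public static string RoundTrip(Encoding encoding, string text) =>
        encoding.GetString(encoding.GetBytes(text));

    public static void Main()
    {
        string text = "caf\u00e9"; // contains one non-ASCII character

        // The ASCII encoder replaces U+00E9 with a question mark.
        Console.WriteLine(RoundTrip(Encoding.ASCII, text) == text);

        // The UTF-8 encoder preserves every character.
        Console.WriteLine(RoundTrip(Encoding.UTF8, text) == text);
    }
}
```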
|
||||
|
||||
In a web application, characters sent to the client in response to a web request should reflect the encoding used on the client. In most cases, you should set the [HttpResponse.ContentEncoding](https://docs.microsoft.com/dotnet/core/api/System.Net.HttpResponseHeader#System_Net_HttpResponseHeader_ContentEncoding) property to the value returned by the [HttpRequest.ContentEncoding](https://docs.microsoft.com/dotnet/core/api/System.Net.HttpRequestHeader#System_Net_HttpRequestHeader_ContentEncoding) property to display text in the encoding that the user expects.
|
||||
|
||||
## Using an Encoding Object
|
||||
|
||||
An encoder converts a string of characters (most commonly, Unicode characters) to its numeric (byte) equivalent. For example, you might use an ASCII encoder to convert Unicode characters to ASCII so that they can be displayed at the console. To perform the conversion, you call the [Encoding.GetBytes](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoding#System_Text_Encoding_GetBytes_System_Char___) method. If you want to determine how many bytes are needed to store the encoded characters before performing the encoding, you can call the [GetByteCount](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoding#System_Text_Encoding_GetByteCount_System_Char___) method.
|
||||
|
||||
The following example uses a single byte array to encode strings in two separate operations. It maintains an index that indicates the starting position in the byte array for the next set of ASCII-encoded bytes. It calls the [ASCIIEncoding.GetByteCount(String)](https://docs.microsoft.com/dotnet/core/api/System.Text.ASCIIEncoding#System_Text_ASCIIEncoding_GetByteCount_System_String_) method to ensure that the byte array is large enough to accommodate the encoded string. It then calls the [ASCIIEncoding.GetBytes(String, Int32, Int32, Byte[], Int32)](https://docs.microsoft.com/dotnet/core/api/System.Text.ASCIIEncoding#System_Text_ASCIIEncoding_GetBytes_System_String_System_Int32_System_Int32_System_Byte___System_Int32_) method to encode the characters in the string.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string[] strings= { "This is the first sentence. ",
|
||||
"This is the second sentence. " };
|
||||
Encoding asciiEncoding = Encoding.ASCII;
|
||||
|
||||
// Create array of adequate size.
|
||||
byte[] bytes = new byte[49];
|
||||
// Create index for current position of array.
|
||||
int index = 0;
|
||||
|
||||
Console.WriteLine("Strings to encode:");
|
||||
foreach (var stringValue in strings) {
|
||||
Console.WriteLine(" {0}", stringValue);
|
||||
|
||||
int count = asciiEncoding.GetByteCount(stringValue);
|
||||
if (count + index >= bytes.Length)
|
||||
Array.Resize(ref bytes, bytes.Length + 50);
|
||||
|
||||
int written = asciiEncoding.GetBytes(stringValue, 0,
|
||||
stringValue.Length,
|
||||
bytes, index);
|
||||
|
||||
index = index + written;
|
||||
}
|
||||
Console.WriteLine("\nEncoded bytes:");
|
||||
Console.WriteLine("{0}", ShowByteValues(bytes, index));
|
||||
Console.WriteLine();
|
||||
|
||||
// Decode the ASCII-encoded byte array to a string.
|
||||
string newString = asciiEncoding.GetString(bytes, 0, index);
|
||||
Console.WriteLine("Decoded: {0}", newString);
|
||||
}
|
||||
|
||||
private static string ShowByteValues(byte[] bytes, int last )
|
||||
{
|
||||
string returnString = " ";
|
||||
for (int ctr = 0; ctr <= last - 1; ctr++) {
|
||||
if (ctr % 20 == 0)
|
||||
returnString += "\n ";
|
||||
returnString += String.Format("{0:X2} ", bytes[ctr]);
|
||||
}
|
||||
return returnString;
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Strings to encode:
|
||||
// This is the first sentence.
|
||||
// This is the second sentence.
|
||||
//
|
||||
// Encoded bytes:
|
||||
//
|
||||
// 54 68 69 73 20 69 73 20 74 68 65 20 66 69 72 73 74 20 73 65
|
||||
// 6E 74 65 6E 63 65 2E 20 54 68 69 73 20 69 73 20 74 68 65 20
|
||||
// 73 65 63 6F 6E 64 20 73 65 6E 74 65 6E 63 65 2E 20
|
||||
//
|
||||
// Decoded: This is the first sentence. This is the second sentence.
|
||||
```
|
||||
|
||||
A decoder converts a byte array that reflects a particular character encoding into a set of characters, either in a character array or in a string. To decode a byte array into a character array, you call the [Encoding.GetChars](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoding#System_Text_Encoding_GetChars_System_Byte___) method. To decode a byte array into a string, you call the [GetString](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoding#System_Text_Encoding_GetString_System_Byte___) method. If you want to determine how many characters are needed to store the decoded bytes before performing the decoding, you can call the [GetCharCount](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoding#System_Text_Encoding_GetCharCount_System_Byte___) method.
|
||||
|
||||
The following example encodes three strings and then decodes them into a single array of characters. It maintains an index that indicates the starting position in the character array for the next set of decoded characters. It calls the [GetCharCount](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoding#System_Text_Encoding_GetCharCount_System_Byte___) method to ensure that the character array is large enough to accommodate all the decoded characters. It then calls the [ASCIIEncoding.GetChars(Byte[], Int32, Int32, Char[], Int32)](https://docs.microsoft.com/dotnet/core/api/System.Text.ASCIIEncoding#System_Text_ASCIIEncoding_GetChars_System_Byte___System_Int32_System_Int32_System_Char___System_Int32_) method to decode the byte array.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string[] strings = { "This is the first sentence. ",
|
||||
"This is the second sentence. ",
|
||||
"This is the third sentence. " };
|
||||
Encoding asciiEncoding = Encoding.ASCII;
|
||||
// Array to hold encoded bytes.
|
||||
byte[] bytes;
|
||||
// Array to hold decoded characters.
|
||||
char[] chars = new char[50];
|
||||
// Create index for current position of character array.
|
||||
int index = 0;
|
||||
|
||||
foreach (var stringValue in strings) {
|
||||
Console.WriteLine("String to Encode: {0}", stringValue);
|
||||
// Encode the string to a byte array.
|
||||
bytes = asciiEncoding.GetBytes(stringValue);
|
||||
// Display the encoded bytes.
|
||||
Console.Write("Encoded bytes: ");
|
||||
for (int ctr = 0; ctr < bytes.Length; ctr++)
|
||||
Console.Write(" {0}{1:X2}",
|
||||
ctr % 20 == 0 ? Environment.NewLine : "",
|
||||
bytes[ctr]);
|
||||
Console.WriteLine();
|
||||
|
||||
// Decode the bytes to a single character array.
|
||||
int count = asciiEncoding.GetCharCount(bytes);
|
||||
if (count + index >= chars.Length)
|
||||
Array.Resize(ref chars, chars.Length + 50);
|
||||
|
||||
int written = asciiEncoding.GetChars(bytes, 0,
|
||||
bytes.Length,
|
||||
chars, index);
|
||||
index = index + written;
|
||||
Console.WriteLine();
|
||||
}
|
||||
|
||||
// Instantiate a single string containing the characters.
|
||||
string decodedString = new string(chars, 0, index - 1);
|
||||
Console.WriteLine("Decoded string: ");
|
||||
Console.WriteLine(decodedString);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// String to Encode: This is the first sentence.
|
||||
// Encoded bytes:
|
||||
// 54 68 69 73 20 69 73 20 74 68 65 20 66 69 72 73 74 20 73 65
|
||||
// 6E 74 65 6E 63 65 2E 20
|
||||
//
|
||||
// String to Encode: This is the second sentence.
|
||||
// Encoded bytes:
|
||||
// 54 68 69 73 20 69 73 20 74 68 65 20 73 65 63 6F 6E 64 20 73
|
||||
// 65 6E 74 65 6E 63 65 2E 20
|
||||
//
|
||||
// String to Encode: This is the third sentence.
|
||||
// Encoded bytes:
|
||||
// 54 68 69 73 20 69 73 20 74 68 65 20 74 68 69 72 64 20 73 65
|
||||
// 6E 74 65 6E 63 65 2E 20
|
||||
//
|
||||
// Decoded string:
|
||||
// This is the first sentence. This is the second sentence. This is the third sentence.
|
||||
```
|
||||
|
||||
The encoding and decoding methods of a class derived from [Encoding](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoding) are designed to work on a complete set of data; that is, all the data to be encoded or decoded is supplied in a single method call. However, in some cases, data is available in a stream, and the data to be encoded or decoded may be available only from separate read operations. This requires the encoding or decoding operation to remember any saved state from its previous invocation. Methods of classes derived from [Encoder](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoder) and [Decoder](https://docs.microsoft.com/dotnet/core/api/System.Text.Decoder) are able to handle encoding and decoding operations that span multiple method calls.
|
||||
|
||||
An [Encoder](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoder) object for a particular encoding is available from that encoding's [Encoding.GetEncoder](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoding#System_Text_Encoding_GetEncoder) method. A [Decoder](https://docs.microsoft.com/dotnet/core/api/System.Text.Decoder) object for a particular encoding is available from that encoding's [Encoding.GetDecoder](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoding#System_Text_Encoding_GetDecoder) method. For decoding operations, note that classes derived from [Decoder](https://docs.microsoft.com/dotnet/core/api/System.Text.Decoder) include a [Decoder.GetChars](https://docs.microsoft.com/dotnet/core/api/System.Text.Decoder#System_Text_Decoder_GetChars_System_Byte___System_Int32_System_Int32_System_Char___System_Int32_) method, but they do not have a method that corresponds to [Encoding.GetString](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoding#System_Text_Encoding_GetString_System_Byte___).
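
The encoding direction can be sketched the same way. In the following illustration, a surrogate pair is supplied to an `Encoder` object in two separate calls; the `SplitSurrogateEncoding` class and `EncodeInTwoCalls` method are hypothetical names, and the buffer passed in is assumed to hold at least four bytes.

```csharp
using System;
using System.Text;

public static class SplitSurrogateEncoding
{
    // Encodes a surrogate pair one char at a time and returns the number
    // of bytes written by each call. The encoder keeps the high surrogate
    // as internal state until the low surrogate arrives.
    public static int[] EncodeInTwoCalls(byte[] buffer)
    {
        char[] pair = { '\uD800', '\uDC05' };
        Encoder encoder = Encoding.UTF8.GetEncoder();

        // First call: only the high surrogate; flush is false, so nothing is emitted.
        int first = encoder.GetBytes(pair, 0, 1, buffer, 0, false);
        // Second call: the low surrogate completes the pair; flush the encoder.
        int second = encoder.GetBytes(pair, 1, 1, buffer, first, true);
        return new[] { first, second };
    }

    public static void Main()
    {
        byte[] buffer = new byte[8];
        int[] written = EncodeInTwoCalls(buffer);
        Console.WriteLine("First call: {0} byte(s); second call: {1} byte(s)",
                          written[0], written[1]);
    }
}
```

Calling `Encoding.UTF8.GetBytes` on each half separately would instead treat each lone surrogate as invalid, because the `Encoding` methods keep no state between calls.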
|
||||
|
||||
The following example illustrates the difference between using the [Encoding.GetChars](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoding#System_Text_Encoding_GetChars_System_Byte___) and [Decoder.GetChars](https://docs.microsoft.com/dotnet/core/api/System.Text.Decoder#System_Text_Decoder_GetChars_System_Byte___System_Int32_System_Int32_System_Char___System_Int32_) methods for decoding a Unicode byte array. The example encodes a string that contains some Unicode characters to a file, and then uses the two decoding methods to decode them ten bytes at a time. Because a surrogate pair occupies the tenth and eleventh characters of the string (bytes 19 through 22), the ten-byte reads split it, and its two halves are decoded in separate method calls. As the output shows, the [Encoding.GetChars](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoding#System_Text_Encoding_GetChars_System_Byte___) method is not able to correctly decode the bytes and instead replaces them with U+FFFD (REPLACEMENT CHARACTER). On the other hand, the [Decoder.GetChars](https://docs.microsoft.com/dotnet/core/api/System.Text.Decoder#System_Text_Decoder_GetChars_System_Byte___System_Int32_System_Int32_System_Char___System_Int32_) method is able to successfully decode the byte array to get the original string.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.IO;
|
||||
using System.Text;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
// Use default replacement fallback for invalid encoding.
|
||||
UnicodeEncoding enc = new UnicodeEncoding(true, false, false);
|
||||
|
||||
// Define a string with various Unicode characters.
|
||||
string str1 = "AB YZ 19 \uD800\udc05 \u00e4";
|
||||
str1 += "Unicode characters. \u00a9 \u010C s \u0062\u0308";
|
||||
Console.WriteLine("Created original string...\n");
|
||||
|
||||
// Convert string to byte array.
|
||||
byte[] bytes = enc.GetBytes(str1);
|
||||
|
||||
FileStream fs = File.Create(@".\characters.bin");
|
||||
BinaryWriter bw = new BinaryWriter(fs);
|
||||
bw.Write(bytes);
|
||||
bw.Close();
|
||||
|
||||
// Read bytes from file.
|
||||
FileStream fsIn = File.OpenRead(@".\characters.bin");
|
||||
BinaryReader br = new BinaryReader(fsIn);
|
||||
|
||||
const int count = 10; // Number of bytes to read at a time.
|
||||
byte[] bytesRead = new byte[10]; // Buffer (byte array).
|
||||
int read; // Number of bytes actually read.
|
||||
string str2 = String.Empty; // Decoded string.
|
||||
|
||||
// Try using Encoding object for all operations.
|
||||
do {
|
||||
read = br.Read(bytesRead, 0, count);
|
||||
str2 += enc.GetString(bytesRead, 0, read);
|
||||
} while (read == count);
|
||||
br.Close();
|
||||
Console.WriteLine("Decoded string using UnicodeEncoding.GetString()...");
|
||||
CompareForEquality(str1, str2);
|
||||
Console.WriteLine();
|
||||
|
||||
// Use Decoder for all operations.
|
||||
fsIn = File.OpenRead(@".\characters.bin");
|
||||
br = new BinaryReader(fsIn);
|
||||
Decoder decoder = enc.GetDecoder();
|
||||
char[] chars = new char[50];
|
||||
int index = 0; // Next character to write in array.
|
||||
int written = 0; // Number of chars written to array.
|
||||
do {
|
||||
read = br.Read(bytesRead, 0, count);
|
||||
if (index + decoder.GetCharCount(bytesRead, 0, read) - 1 >= chars.Length)
|
||||
Array.Resize(ref chars, chars.Length + 50);
|
||||
|
||||
written = decoder.GetChars(bytesRead, 0, read, chars, index);
|
||||
index += written;
|
||||
} while (read == count);
|
||||
br.Close();
|
||||
// Instantiate a string with the decoded characters.
|
||||
string str3 = new String(chars, 0, index);
|
||||
Console.WriteLine("Decoded string using UnicodeEncoding.Decoder.GetString()...");
|
||||
CompareForEquality(str1, str3);
|
||||
}
|
||||
|
||||
private static void CompareForEquality(string original, string decoded)
|
||||
{
|
||||
bool result = original.Equals(decoded);
|
||||
Console.WriteLine("original = decoded: {0}",
|
||||
original.Equals(decoded, StringComparison.Ordinal));
|
||||
if (! result) {
|
||||
Console.WriteLine("Code points in original string:");
|
||||
foreach (var ch in original)
|
||||
Console.Write("{0} ", Convert.ToUInt16(ch).ToString("X4"));
|
||||
Console.WriteLine();
|
||||
|
||||
Console.WriteLine("Code points in decoded string:");
|
||||
foreach (var ch in decoded)
|
||||
Console.Write("{0} ", Convert.ToUInt16(ch).ToString("X4"));
|
||||
Console.WriteLine();
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Created original string...
|
||||
//
|
||||
// Decoded string using UnicodeEncoding.GetString()...
|
||||
// original = decoded: False
|
||||
// Code points in original string:
|
||||
// 0041 0042 0020 0059 005A 0020 0031 0039 0020 D800 DC05 0020 00E4 0055 006E 0069 0063 006F
|
||||
// 0064 0065 0020 0063 0068 0061 0072 0061 0063 0074 0065 0072 0073 002E 0020 00A9 0020 010C
|
||||
// 0020 0073 0020 0062 0308
|
||||
// Code points in decoded string:
|
||||
// 0041 0042 0020 0059 005A 0020 0031 0039 0020 FFFD FFFD 0020 00E4 0055 006E 0069 0063 006F
|
||||
// 0064 0065 0020 0063 0068 0061 0072 0061 0063 0074 0065 0072 0073 002E 0020 00A9 0020 010C
|
||||
// 0020 0073 0020 0062 0308
|
||||
//
|
||||
// Decoded string using UnicodeEncoding.Decoder.GetString()...
|
||||
// original = decoded: True
|
||||
```
|
||||
|
||||
## Choosing a Fallback Strategy
|
||||
|
||||
When a method tries to encode or decode a character but no mapping exists, it must implement a fallback strategy that determines how the failed mapping should be handled. There are three types of fallback strategies:
|
||||
|
||||
* Best-fit fallback
|
||||
|
||||
* Replacement fallback
|
||||
|
||||
* Exception fallback
|
||||
|
||||
> **Important**
|
||||
>
|
||||
> The most common problems in encoding operations occur when a Unicode character cannot be mapped to a particular code page encoding. The most common problems in decoding operations occur when invalid byte sequences cannot be translated into valid Unicode characters. For these reasons, you should know which fallback strategy a particular encoding object uses. Whenever possible, you should specify the fallback strategy used by an encoding object when you instantiate the object.
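
As a sketch of specifying the fallback strategy at instantiation time, the following code requests an ASCII encoding twice: once with a replacement fallback and once with an exception fallback. The `FallbackChoice` class, its method names, and the `*` replacement string are assumptions made for illustration.

```csharp
using System;
using System.Text;

public static class FallbackChoice
{
    // ASCII encoding that substitutes the given string for unmappable data.
    public static Encoding AsciiWithReplacement(string replacement) =>
        Encoding.GetEncoding("us-ascii",
                             new EncoderReplacementFallback(replacement),
                             new DecoderReplacementFallback(replacement));

    // ASCII encoding that throws instead of silently substituting.
    public static Encoding AsciiWithException() =>
        Encoding.GetEncoding("us-ascii",
                             new EncoderExceptionFallback(),
                             new DecoderExceptionFallback());

    public static void Main()
    {
        string text = "\u24c8"; // CIRCLED LATIN CAPITAL LETTER S, not in ASCII

        Encoding replacing = AsciiWithReplacement("*");
        Console.WriteLine(replacing.GetString(replacing.GetBytes(text)));

        try {
            AsciiWithException().GetBytes(text);
        }
        catch (EncoderFallbackException e) {
            Console.WriteLine("Cannot encode U+{0:X4}",
                              Convert.ToUInt16(e.CharUnknown));
        }
    }
}
```

Choosing the exception variant makes data loss visible at the point where it occurs, which is often preferable when the encoded data is persisted or sent to another system.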
|
||||
|
||||
### Best-Fit Fallback
|
||||
|
||||
When a character does not have an exact match in the target encoding, the encoder can try to map it to a similar character. (Best-fit fallback is mostly an encoding rather than a decoding issue. There are very few code pages that contain characters that cannot be successfully mapped to Unicode.) Best-fit fallback is the default for code page and double-byte character set encodings that are retrieved by the [Encoding.GetEncoding(Int32)](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoding#System_Text_Encoding_GetEncoding_System_Int32_) and [Encoding.GetEncoding(String)](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoding#System_Text_Encoding_GetEncoding_System_String_) overloads.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
. In theory, the Unicode encoding classes provided in .NET Core ([UTF8Encoding](https://docs.microsoft.com/dotnet/core/api/System.Text.UTF8Encoding), [UnicodeEncoding](https://docs.microsoft.com/dotnet/core/api/System.Text.UnicodeEncoding), and [UTF32Encoding](https://docs.microsoft.com/dotnet/core/api/System.Text.UTF32Encoding)) support every character in every character set, so they can be used to eliminate best-fit fallback issues.
|
||||
|
||||
|
||||
Best-fit strategies vary for different code pages, and they are not documented in detail. For example, for some code pages, full-width Latin characters map to the more common half-width Latin characters. For other code pages, this mapping is not made. Even under an aggressive best-fit strategy, there is no imaginable fit for some characters in some encodings. For example, a Chinese ideograph has no reasonable mapping to code page 1252. In this case, a replacement string is used. By default, this string is just a single QUESTION MARK (U+003F).
|
||||
|
||||
The following example uses code page 1252 (the Windows code page for Western European languages) to illustrate best-fit mapping and its drawbacks. The [Encoding.GetEncoding(Int32)](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoding#System_Text_Encoding_GetEncoding_System_Int32_) method is used to retrieve an encoding object for code page 1252. By default, it uses a best-fit mapping for Unicode characters that it does not support. The example instantiates a string that contains three non-ASCII characters - CIRCLED LATIN CAPITAL LETTER S (U+24C8), SUPERSCRIPT FIVE (U+2075), and INFINITY (U+221E) - separated by spaces. As the output from the example shows, when the string is encoded, the three original non-space characters are replaced by QUESTION MARK (U+003F), DIGIT FIVE (U+0035), and DIGIT EIGHT (U+0038). DIGIT EIGHT is a particularly poor replacement for the unsupported INFINITY character, and QUESTION MARK indicates that no mapping was available for the original character.
|
||||
|
||||
```csharp
using System;
using System.Text;

public class Example
{
    public static void Main()
    {
        // Get an encoding for code page 1252 (Western Europe character set).
        Encoding cp1252 = Encoding.GetEncoding(1252);

        // Define and display a string.
        string str = "\u24c8 \u2075 \u221e";
        Console.WriteLine("Original string: " + str);
        Console.Write("Code points in string: ");
        foreach (var ch in str)
            Console.Write("{0} ", Convert.ToUInt16(ch).ToString("X4"));

        Console.WriteLine("\n");

        // Encode a Unicode string.
        Byte[] bytes = cp1252.GetBytes(str);
        Console.Write("Encoded bytes: ");
        foreach (byte byt in bytes)
            Console.Write("{0:X2} ", byt);
        Console.WriteLine("\n");

        // Decode the string.
        string str2 = cp1252.GetString(bytes);
        Console.WriteLine("String round-tripped: {0}", str.Equals(str2));
        if (! str.Equals(str2)) {
            Console.WriteLine(str2);
            foreach (var ch in str2)
                Console.Write("{0} ", Convert.ToUInt16(ch).ToString("X4"));
        }
    }
}
// The example displays the following output:
//       Original string: Ⓢ ⁵ ∞
//       Code points in string: 24C8 0020 2075 0020 221E
//
//       Encoded bytes: 3F 20 35 20 38
//
//       String round-tripped: False
//       ? 5 8
//       003F 0020 0035 0020 0038
```
|
||||
|
||||
Best-fit mapping is the default behavior for an [Encoding](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoding) object that encodes Unicode data into code page data, and there are legacy applications that rely on this behavior. However, most new applications should avoid best-fit behavior for security reasons. For example, applications should not put a domain name through a best-fit encoding.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> You can also implement a custom best-fit fallback mapping for an encoding. For more information, see the [Implementing a Custom Fallback Strategy](#Implementing-a-Custom-Fallback-Strategy) section.
|
||||
|
||||
If best-fit fallback is the default for an encoding object, you can choose another fallback strategy when you retrieve an [Encoding](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoding) object by calling the [Encoding.GetEncoding(Int32, EncoderFallback, DecoderFallback)](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoding#System_Text_Encoding_GetEncoding_System_Int32_System_Text_EncoderFallback_System_Text_DecoderFallback_) or [Encoding.GetEncoding(String, EncoderFallback, DecoderFallback)](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoding#System_Text_Encoding_GetEncoding_System_String_System_Text_EncoderFallback_System_Text_DecoderFallback_) overload. The following example replaces each character that cannot be mapped to code page 1252 with an asterisk (*).
|
||||
|
||||
```csharp
using System;
using System.Text;

public class Example
{
    public static void Main()
    {
        Encoding cp1252r = Encoding.GetEncoding(1252,
                                    new EncoderReplacementFallback("*"),
                                    new DecoderReplacementFallback("*"));

        string str1 = "\u24C8 \u2075 \u221E";
        Console.WriteLine(str1);
        foreach (var ch in str1)
            Console.Write("{0} ", Convert.ToUInt16(ch).ToString("X4"));

        Console.WriteLine();

        byte[] bytes = cp1252r.GetBytes(str1);
        string str2 = cp1252r.GetString(bytes);
        Console.WriteLine("Round-trip: {0}", str1.Equals(str2));
        if (! str1.Equals(str2)) {
            Console.WriteLine(str2);
            foreach (var ch in str2)
                Console.Write("{0} ", Convert.ToUInt16(ch).ToString("X4"));

            Console.WriteLine();
        }
    }
}
// The example displays the following output:
//       Ⓢ ⁵ ∞
//       24C8 0020 2075 0020 221E
//       Round-trip: False
//       * * *
//       002A 0020 002A 0020 002A
```
|
||||
|
||||
### Replacement Fallback
|
||||
|
||||
When a character does not have an exact match in the target scheme, and there is no appropriate character that it can be mapped to, the application can specify a replacement character or string. This is the default behavior for the Unicode decoder, which replaces any two-byte sequence that it cannot decode with REPLACEMENT CHARACTER (U+FFFD). It is also the default behavior of the [ASCIIEncoding](https://docs.microsoft.com/dotnet/core/api/System.Text.ASCIIEncoding) class, which replaces each character that it cannot encode or decode with a question mark. The following example illustrates character replacement for the Unicode string from the previous example. As the output shows, each character that cannot be encoded into an ASCII byte value is replaced by 0x3F, which is the ASCII code for a question mark.
|
||||
|
||||
```csharp
using System;
using System.Text;

public class Example
{
    public static void Main()
    {
        Encoding enc = Encoding.ASCII;

        string str1 = "\u24C8 \u2075 \u221E";
        Console.WriteLine(str1);
        foreach (var ch in str1)
            Console.Write("{0} ", Convert.ToUInt16(ch).ToString("X4"));

        Console.WriteLine("\n");

        // Encode the original string using the ASCII encoder.
        byte[] bytes = enc.GetBytes(str1);
        Console.Write("Encoded bytes: ");
        foreach (var byt in bytes)
            Console.Write("{0:X2} ", byt);
        Console.WriteLine("\n");

        // Decode the ASCII bytes.
        string str2 = enc.GetString(bytes);
        Console.WriteLine("Round-trip: {0}", str1.Equals(str2));
        if (! str1.Equals(str2)) {
            Console.WriteLine(str2);
            foreach (var ch in str2)
                Console.Write("{0} ", Convert.ToUInt16(ch).ToString("X4"));

            Console.WriteLine();
        }
    }
}
// The example displays the following output:
//       Ⓢ ⁵ ∞
//       24C8 0020 2075 0020 221E
//
//       Encoded bytes: 3F 20 3F 20 3F
//
//       Round-trip: False
//       ? ? ?
//       003F 0020 003F 0020 003F
```
|
||||
|
||||
.NET Core includes the [EncoderReplacementFallback](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderReplacementFallback) and [DecoderReplacementFallback](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderReplacementFallback) classes, which substitute a replacement string if a character does not map exactly in an encoding or decoding operation. By default, this replacement string is a question mark, but you can call a class constructor overload to choose a different string. Typically, the replacement string is a single character, although this is not a requirement. The following example changes the behavior of the code page 1252 encoder by instantiating an [EncoderReplacementFallback](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderReplacementFallback) object that uses an asterisk (*) as a replacement string.
|
||||
|
||||
```csharp
using System;
using System.Text;

public class Example
{
    public static void Main()
    {
        Encoding cp1252r = Encoding.GetEncoding(1252,
                                    new EncoderReplacementFallback("*"),
                                    new DecoderReplacementFallback("*"));

        string str1 = "\u24C8 \u2075 \u221E";
        Console.WriteLine(str1);
        foreach (var ch in str1)
            Console.Write("{0} ", Convert.ToUInt16(ch).ToString("X4"));

        Console.WriteLine();

        byte[] bytes = cp1252r.GetBytes(str1);
        string str2 = cp1252r.GetString(bytes);
        Console.WriteLine("Round-trip: {0}", str1.Equals(str2));
        if (! str1.Equals(str2)) {
            Console.WriteLine(str2);
            foreach (var ch in str2)
                Console.Write("{0} ", Convert.ToUInt16(ch).ToString("X4"));

            Console.WriteLine();
        }
    }
}
// The example displays the following output:
//       Ⓢ ⁵ ∞
//       24C8 0020 2075 0020 221E
//       Round-trip: False
//       * * *
//       002A 0020 002A 0020 002A
```
|
||||
> **Note**
|
||||
>
|
||||
> You can also implement a replacement class for an encoding. For more information, see the [Implementing a Custom Fallback Strategy](#Implementing-a-Custom-Fallback-Strategy) section.
|
||||
|
||||
In addition to QUESTION MARK (U+003F), the Unicode REPLACEMENT CHARACTER (U+FFFD) is commonly used as a replacement string, particularly when decoding byte sequences that cannot be successfully translated into Unicode characters. However, you are free to choose any replacement string, and it can contain multiple characters.
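
As a rough illustration of that last point, the following sketch uses a three-character replacement string for encoding and U+FFFD for decoding. It relies on the built-in `us-ascii` encoding used elsewhere in this article. The class name `ReplacementDemo` and its helper methods are hypothetical names chosen for this example only.

```csharp
using System;
using System.Text;

public static class ReplacementDemo
{
    // ASCII encoding whose encoder substitutes "[?]" and whose decoder substitutes U+FFFD.
    static readonly Encoding enc = Encoding.GetEncoding("us-ascii",
                                        new EncoderReplacementFallback("[?]"),
                                        new DecoderReplacementFallback("\uFFFD"));

    // Encodes the text, then returns the encoded bytes as ASCII text for display.
    public static string EncodeToText(string text)
    {
        byte[] bytes = enc.GetBytes(text);
        return Encoding.ASCII.GetString(bytes);
    }

    // Decodes the bytes; any byte above 0x7F is not valid ASCII and is replaced.
    public static string Decode(byte[] bytes)
    {
        return enc.GetString(bytes);
    }

    public static void Main()
    {
        Console.WriteLine(EncodeToText("\u24C8 \u2075 \u221E"));

        string decoded = Decode(new byte[] { 0x41, 0x80, 0x42 });
        foreach (var ch in decoded)
            Console.Write("{0} ", Convert.ToUInt16(ch).ToString("X4"));

        Console.WriteLine();
    }
}
// The example displays the following output:
//       [?] [?] [?]
//       0041 FFFD 0042
```

A multi-character replacement string makes substitutions easier to spot in the encoded data, at the cost of changing the length of the output.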
|
||||
|
||||
### Exception Fallback
|
||||
|
||||
Instead of providing a best-fit fallback or a replacement string, an encoder can throw an [EncoderFallbackException](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallbackException) if it is unable to encode a set of characters, and a decoder can throw a [DecoderFallbackException](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderFallbackException) if it is unable to decode a byte array. To throw an exception in encoding and decoding operations, you supply an [EncoderExceptionFallback](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderExceptionFallback) object and a [DecoderExceptionFallback](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderExceptionFallback) object, respectively, to the [Encoding.GetEncoding(String, EncoderFallback, DecoderFallback)](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoding#System_Text_Encoding_GetEncoding_System_String_System_Text_EncoderFallback_System_Text_DecoderFallback_) method. The following example illustrates exception fallback with the [ASCIIEncoding](https://docs.microsoft.com/dotnet/core/api/System.Text.ASCIIEncoding) class.
|
||||
|
||||
```csharp
using System;
using System.Text;

public class Example
{
    public static void Main()
    {
        Encoding enc = Encoding.GetEncoding("us-ascii",
                                new EncoderExceptionFallback(),
                                new DecoderExceptionFallback());

        string str1 = "\u24C8 \u2075 \u221E";
        Console.WriteLine(str1);
        foreach (var ch in str1)
            Console.Write("{0} ", Convert.ToUInt16(ch).ToString("X4"));

        Console.WriteLine("\n");

        // Encode the original string using the ASCII encoder.
        byte[] bytes = {};
        try {
            bytes = enc.GetBytes(str1);
            Console.Write("Encoded bytes: ");
            foreach (var byt in bytes)
                Console.Write("{0:X2} ", byt);

            Console.WriteLine();
        }
        catch (EncoderFallbackException e) {
            Console.Write("Exception: ");
            if (e.IsUnknownSurrogate())
                Console.WriteLine("Unable to encode surrogate pair 0x{0:X4} 0x{1:X4} at index {2}.",
                                  Convert.ToUInt16(e.CharUnknownHigh),
                                  Convert.ToUInt16(e.CharUnknownLow),
                                  e.Index);
            else
                Console.WriteLine("Unable to encode 0x{0:X4} at index {1}.",
                                  Convert.ToUInt16(e.CharUnknown),
                                  e.Index);
            return;
        }
        Console.WriteLine();

        // Decode the ASCII bytes.
        try {
            string str2 = enc.GetString(bytes);
            Console.WriteLine("Round-trip: {0}", str1.Equals(str2));
            if (! str1.Equals(str2)) {
                Console.WriteLine(str2);
                foreach (var ch in str2)
                    Console.Write("{0} ", Convert.ToUInt16(ch).ToString("X4"));

                Console.WriteLine();
            }
        }
        catch (DecoderFallbackException e) {
            Console.Write("Unable to decode byte(s) ");
            // Pass each unknown byte to the format string.
            foreach (byte unknown in e.BytesUnknown)
                Console.Write("0x{0:X2} ", unknown);

            Console.WriteLine("at index {0}", e.Index);
        }
    }
}
// The example displays the following output:
//       Ⓢ ⁵ ∞
//       24C8 0020 2075 0020 221E
//
//       Exception: Unable to encode 0x24C8 at index 0.
```
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> You can also implement a custom exception handler for an encoding operation. For more information, see the [Implementing a Custom Fallback Strategy](#Implementing-a-Custom-Fallback-Strategy) section.
|
||||
|
||||
The [EncoderFallbackException](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallbackException) and [DecoderFallbackException](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderFallbackException) objects provide the following information about the condition that caused the exception:
|
||||
|
||||
* The [EncoderFallbackException](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallbackException) object includes an [IsUnknownSurrogate](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallbackException#System_Text_EncoderFallbackException_IsUnknownSurrogate) method, which indicates whether the character or characters that cannot be encoded represent an unknown surrogate pair (in which case, the method returns `true`) or an unknown single character (in which case, the method returns `false`). The characters in the surrogate pair are available from the [EncoderFallbackException.CharUnknownHigh](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallbackException#System_Text_EncoderFallbackException_CharUnknownHigh) and [EncoderFallbackException.CharUnknownLow](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallbackException#System_Text_EncoderFallbackException_CharUnknownLow) properties. The unknown single character is available from the [EncoderFallbackException.CharUnknown](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallbackException#System_Text_EncoderFallbackException_CharUnknown) property. The [EncoderFallbackException.Index](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallbackException#System_Text_EncoderFallbackException_Index) property indicates the position in the string at which the first character that could not be encoded was found.
|
||||
|
||||
* The [DecoderFallbackException](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderFallbackException) object includes a [BytesUnknown](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderFallbackException#System_Text_DecoderFallbackException_BytesUnknown) property that returns an array of bytes that cannot be decoded. The [DecoderFallbackException.Index](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderFallbackException#System_Text_DecoderFallbackException_Index) property indicates the starting position of the unknown bytes.
|
||||
|
||||
Although the [EncoderFallbackException](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallbackException) and [DecoderFallbackException](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderFallbackException) objects provide adequate diagnostic information about the exception, they do not provide access to the encoding or decoding buffer. Therefore, they do not allow invalid data to be replaced or corrected within the encoding or decoding method.
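
One way to work within that limitation is to catch the exception and decide at the call site whether to reject the input or re-encode it with a different fallback. The following is a minimal sketch of the rejection case, not a prescribed approach. `StrictAscii` and `TryEncode` are hypothetical names introduced for this example, and the second test string contains CYRILLIC SMALL LETTER A (U+0430) in place of the Latin letter "a".

```csharp
using System;
using System.Text;

public static class StrictAscii
{
    // ASCII encoding that throws instead of substituting characters.
    static readonly Encoding strict = Encoding.GetEncoding("us-ascii",
                                           new EncoderExceptionFallback(),
                                           new DecoderExceptionFallback());

    // Returns true and the encoded bytes if every character is ASCII; otherwise
    // returns false and the index reported by the EncoderFallbackException.
    public static bool TryEncode(string text, out byte[] bytes, out int badIndex)
    {
        try {
            bytes = strict.GetBytes(text);
            badIndex = -1;
            return true;
        }
        catch (EncoderFallbackException e) {
            bytes = new byte[0];
            badIndex = e.Index;
            return false;
        }
    }

    public static void Main()
    {
        byte[] bytes;
        int badIndex;
        foreach (var text in new[] { "example.com", "ex\u0430mple.com" }) {
            if (TryEncode(text, out bytes, out badIndex))
                Console.WriteLine("Encoded {0} bytes.", bytes.Length);
            else
                Console.WriteLine("Rejected: the character at index {0} is not ASCII.", badIndex);
        }
    }
}
```

Because the exception carries only diagnostic information, the caller handles recovery: it can reject the input, as this sketch does, or encode it again with a replacement fallback.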
|
||||
|
||||
## Implementing a Custom Fallback Strategy
|
||||
|
||||
In addition to the best-fit mapping that is implemented internally by code pages, .NET Core includes the following classes for implementing a fallback strategy:
|
||||
|
||||
* Use [EncoderReplacementFallback](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderReplacementFallback) and [EncoderFallbackBuffer](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallbackBuffer) to replace characters in encoding operations.
|
||||
|
||||
* Use [DecoderReplacementFallback](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderReplacementFallback) and [DecoderFallbackBuffer](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderFallbackBuffer) to replace characters in decoding operations.
|
||||
|
||||
* Use [EncoderExceptionFallback](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderExceptionFallback) and [EncoderFallbackBuffer](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallbackBuffer) to throw an [EncoderFallbackException](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallbackException) when a character cannot be encoded.
|
||||
|
||||
* Use [DecoderExceptionFallback](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderExceptionFallback) and [DecoderFallbackBuffer](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderFallbackBuffer) to throw a [DecoderFallbackException](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderFallbackException) when a character cannot be decoded.
|
||||
|
||||
In addition, you can implement a custom solution that uses best-fit fallback, replacement fallback, or exception fallback, by following these steps:
|
||||
|
||||
1. Derive a class from [EncoderFallback](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallback) for encoding operations, and from [DecoderFallback](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderFallback) for decoding operations.
|
||||
|
||||
2. Derive a class from [EncoderFallbackBuffer](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallbackBuffer) for encoding operations, and from [DecoderFallbackBuffer](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderFallbackBuffer) for decoding operations.
|
||||
|
||||
3. For exception fallback, if the predefined [EncoderFallbackException](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallbackException) and [DecoderFallbackException](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderFallbackException) classes do not meet your needs, derive a class from an exception object such as [Exception](https://docs.microsoft.com/dotnet/core/api/System.Exception) or [ArgumentException](https://docs.microsoft.com/dotnet/core/api/System.ArgumentException).
|
||||
|
||||
### Deriving from EncoderFallback or DecoderFallback
|
||||
|
||||
To implement a custom fallback solution, you must create a class that inherits from [EncoderFallback](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallback) for encoding operations, and from [DecoderFallback](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderFallback) for decoding operations. Instances of these classes are passed to the [Encoding.GetEncoding(String, EncoderFallback, DecoderFallback)](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoding#System_Text_Encoding_GetEncoding_System_String_System_Text_EncoderFallback_System_Text_DecoderFallback_) method and serve as the intermediary between the encoding class and the fallback implementation.
|
||||
|
||||
When you create a custom fallback solution for an encoder or decoder, you must implement the following members:
|
||||
|
||||
* The [EncoderFallback.MaxCharCount](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallback#System_Text_EncoderFallback_MaxCharCount) or [DecoderFallback.MaxCharCount](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderFallback#System_Text_DecoderFallback_MaxCharCount) property, which returns the maximum possible number of characters that the best-fit, replacement, or exception fallback can return to replace a single character. For a custom exception fallback, its value is zero.
|
||||
|
||||
* The [EncoderFallback.CreateFallbackBuffer](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallback#System_Text_EncoderFallback_CreateFallbackBuffer) or [DecoderFallback.CreateFallbackBuffer](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderFallback#System_Text_DecoderFallback_CreateFallbackBuffer) method, which returns your custom [EncoderFallbackBuffer](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallbackBuffer) or [DecoderFallbackBuffer](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderFallbackBuffer) implementation. The method is called by the encoder when it encounters the first character that it is unable to successfully encode, or by the decoder when it encounters the first byte that it is unable to successfully decode.
|
||||
|
||||
### Deriving from EncoderFallbackBuffer or DecoderFallbackBuffer
|
||||
|
||||
To implement a custom fallback solution, you must also create a class that inherits from [EncoderFallbackBuffer](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallbackBuffer) for encoding operations, and from [DecoderFallbackBuffer](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderFallbackBuffer) for decoding operations. Instances of these classes are returned by the `CreateFallbackBuffer` method of the [EncoderFallback](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallback) and [DecoderFallback](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderFallback) classes. The [EncoderFallback.CreateFallbackBuffer](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallback#System_Text_EncoderFallback_CreateFallbackBuffer) method is called by the encoder when it encounters the first character that it is not able to encode, and the [DecoderFallback.CreateFallbackBuffer](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderFallback#System_Text_DecoderFallback_CreateFallbackBuffer) method is called by the decoder when it encounters one or more bytes that it is not able to decode. The [EncoderFallbackBuffer](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallbackBuffer) and [DecoderFallbackBuffer](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderFallbackBuffer) classes provide the fallback implementation. Each instance represents a buffer that contains the fallback characters that will replace the character that cannot be encoded or the byte sequence that cannot be decoded.
|
||||
|
||||
When you create a custom fallback solution for an encoder or decoder, you must implement the following members:
|
||||
|
||||
* The [EncoderFallbackBuffer.Fallback](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallbackBuffer#System_Text_EncoderFallbackBuffer_Fallback_System_Char_System_Char_System_Int32_) or [DecoderFallbackBuffer.Fallback](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderFallbackBuffer#System_Text_DecoderFallbackBuffer_Fallback_System_Byte___System_Int32_) method. [EncoderFallbackBuffer.Fallback](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallbackBuffer#System_Text_EncoderFallbackBuffer_Fallback_System_Char_System_Char_System_Int32_) is called by the encoder to provide the fallback buffer with information about the character that it cannot encode. Because the character to be encoded may be a surrogate pair, this method is overloaded. One overload is passed the character to be encoded and its index in the string. The second overload is passed the high and low surrogate along with its index in the string. The [DecoderFallbackBuffer.Fallback](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderFallbackBuffer#System_Text_DecoderFallbackBuffer_Fallback_System_Byte___System_Int32_) method is called by the decoder to provide the fallback buffer with information about the bytes that it cannot decode. This method is passed an array of bytes that it cannot decode, along with the index of the first byte. The fallback method should return `true` if the fallback buffer can supply a best-fit or replacement character or characters; otherwise, it should return `false`. For an exception fallback, the fallback method should throw an exception.
|
||||
|
||||
* The [EncoderFallbackBuffer.GetNextChar](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallbackBuffer#System_Text_EncoderFallbackBuffer_GetNextChar) or [DecoderFallbackBuffer.GetNextChar](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderFallbackBuffer#System_Text_DecoderFallbackBuffer_GetNextChar) method, which is called repeatedly by the encoder or decoder to get the next character from the fallback buffer. When all fallback characters have been returned, the method should return U+0000.
|
||||
|
||||
* The [EncoderFallbackBuffer.Remaining](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallbackBuffer#System_Text_EncoderFallbackBuffer_Remaining) or [DecoderFallbackBuffer.Remaining](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderFallbackBuffer#System_Text_DecoderFallbackBuffer_Remaining) property, which returns the number of characters remaining in the fallback buffer.
|
||||
|
||||
* The [EncoderFallbackBuffer.MovePrevious](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallbackBuffer#System_Text_EncoderFallbackBuffer_MovePrevious) or [DecoderFallbackBuffer.MovePrevious](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderFallbackBuffer#System_Text_DecoderFallbackBuffer_MovePrevious) method, which moves the current position in the fallback buffer to the previous character.
|
||||
|
||||
* The [EncoderFallbackBuffer.Reset](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallbackBuffer#System_Text_EncoderFallbackBuffer_Reset) or [DecoderFallbackBuffer.Reset](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderFallbackBuffer#System_Text_DecoderFallbackBuffer_Reset) method, which reinitializes the fallback buffer.
|
||||
|
||||
If the fallback implementation is a best-fit fallback or a replacement fallback, the classes derived from [EncoderFallbackBuffer](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallbackBuffer) and [DecoderFallbackBuffer](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderFallbackBuffer) also maintain two private instance fields: the exact number of characters in the buffer, and the index of the next character in the buffer to return.
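
For the decoding side, the members listed above might be implemented along the following lines. This is a sketch under stated assumptions rather than a reference implementation: `HexDecoderFallback`, `HexDecoderFallbackBuffer`, and `HexDecoderDemo` are hypothetical classes, and the policy of replacing each undecodable byte with its hexadecimal value in brackets is chosen only for illustration.

```csharp
using System;
using System.Text;

// Hypothetical decoder fallback: replaces each undecodable byte with "[XX]".
public class HexDecoderFallback : DecoderFallback
{
    // "[XX]" is four characters for each unknown byte.
    public override int MaxCharCount
    {
        get { return 4; }
    }

    public override DecoderFallbackBuffer CreateFallbackBuffer()
    {
        return new HexDecoderFallbackBuffer();
    }
}

public class HexDecoderFallbackBuffer : DecoderFallbackBuffer
{
    string chars = String.Empty;      // Replacement characters for the current fallback
    int position = 0;                 // Index of the next character to return

    public override bool Fallback(byte[] bytesUnknown, int index)
    {
        var sb = new StringBuilder();
        foreach (byte b in bytesUnknown)
            sb.AppendFormat("[{0:X2}]", b);

        chars = sb.ToString();
        position = 0;
        return chars.Length > 0;
    }

    public override char GetNextChar()
    {
        // Return U+0000 when all fallback characters have been returned.
        if (position >= chars.Length) return '\u0000';
        return chars[position++];
    }

    public override bool MovePrevious()
    {
        if (position <= 0) return false;
        position--;
        return true;
    }

    public override int Remaining
    {
        get { return chars.Length - position; }
    }

    public override void Reset()
    {
        chars = String.Empty;
        position = 0;
    }
}

public static class HexDecoderDemo
{
    public static string Decode(byte[] bytes)
    {
        Encoding enc = Encoding.GetEncoding("us-ascii",
                                new EncoderExceptionFallback(),
                                new HexDecoderFallback());
        return enc.GetString(bytes);
    }

    public static void Main()
    {
        // 0x80 is not a valid ASCII byte, so the fallback supplies "[80]".
        Console.WriteLine(Decode(new byte[] { 0x48, 0x69, 0x80, 0x21 }));
    }
}
// The example displays the following output:
//       Hi[80]!
```

Unlike the two private fields described above, this sketch tracks only a position in the replacement string and derives the remaining count from it, which keeps `GetNextChar` and `MovePrevious` symmetric.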
|
||||
|
||||
### An EncoderFallback Example
|
||||
|
||||
An earlier example used replacement fallback to replace Unicode characters that could not be mapped to code page 1252 with an asterisk (*). The following example instead uses a custom best-fit fallback implementation with the ASCII encoding to provide a more meaningful mapping of non-ASCII characters.
|
||||
|
||||
The following code defines a class named `CustomMapper` that is derived from [EncoderFallback](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallback) to handle the best-fit mapping of non-ASCII characters. Its `CreateFallbackBuffer` method returns a `CustomMapperFallbackBuffer` object, which provides the [EncoderFallbackBuffer](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallbackBuffer) implementation. The `CustomMapper` class uses a [Dictionary<TKey, TValue>](https://docs.microsoft.com/dotnet/core/api/System.Collections.Generic.Dictionary%602) object to store the mappings of unsupported Unicode characters (the key value) and their corresponding 8-bit characters (which are stored in two consecutive bytes in a 64-bit integer). To make this mapping available to the fallback buffer, the `CustomMapper` instance is passed as a parameter to the `CustomMapperFallbackBuffer` class constructor. Because the longest mapping is the string "INF" for the Unicode character U+221E, the `MaxCharCount` property returns 3.
|
||||
|
||||
```csharp
public class CustomMapper : EncoderFallback
{
    public string DefaultString;
    internal Dictionary<ushort, ulong> mapping;

    public CustomMapper() : this("*")
    {
    }

    public CustomMapper(string defaultString)
    {
        this.DefaultString = defaultString;

        // Create table of mappings
        mapping = new Dictionary<ushort, ulong>();
        mapping.Add(0x24C8, 0x53);
        mapping.Add(0x2075, 0x35);
        mapping.Add(0x221E, 0x49004E0046);
    }

    public override EncoderFallbackBuffer CreateFallbackBuffer()
    {
        return new CustomMapperFallbackBuffer(this);
    }

    public override int MaxCharCount
    {
        get { return 3; }
    }
}
```
|
||||
|
||||
The following code defines the `CustomMapperFallbackBuffer` class, which is derived from [EncoderFallbackBuffer](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallbackBuffer). The `CustomMapper` instance that defines the dictionary of best-fit mappings is passed to its class constructor. Its single-character `Fallback` overload loads the mapped characters if the character is defined in the mapping dictionary, or the default string if it is not, and returns `true`; it returns `false` only if characters from a previous fallback are still waiting to be returned. The surrogate-pair overload always returns `false`. For each fallback, the private `count` variable indicates the number of characters that remain to be returned, and the private `index` variable indicates the position in the string buffer, `charsToReturn`, of the next character to return.
|
||||
|
||||
```csharp
public class CustomMapperFallbackBuffer : EncoderFallbackBuffer
{
    int count = -1;                   // Number of characters to return
    int index = -1;                   // Index of character to return
    CustomMapper fb;
    string charsToReturn;

    public CustomMapperFallbackBuffer(CustomMapper fallback)
    {
        this.fb = fallback;
    }

    public override bool Fallback(char charUnknownHigh, char charUnknownLow, int index)
    {
        // Do not try to map surrogates to ASCII.
        return false;
    }

    public override bool Fallback(char charUnknown, int index)
    {
        // Return false if there are already characters to map.
        if (count >= 1) return false;

        // Determine number of characters to return.
        charsToReturn = String.Empty;

        ushort key = Convert.ToUInt16(charUnknown);
        if (fb.mapping.ContainsKey(key)) {
            // BitConverter.GetBytes returns the low-order byte first on little-endian
            // systems, so the characters are collected in reverse order here and
            // GetNextChar reads them back starting from the end of the string.
            byte[] bytes = BitConverter.GetBytes(fb.mapping[key]);
            int ctr = 0;
            foreach (var byt in bytes) {
                if (byt > 0) {
                    ctr++;
                    charsToReturn += (char) byt;
                }
            }
            count = ctr;
        }
        else {
            // Return default.
            charsToReturn = fb.DefaultString;
            count = 1;
        }
        this.index = charsToReturn.Length - 1;

        return true;
    }

    public override char GetNextChar()
    {
        // We'll return a character if possible, so subtract from the count of chars to return.
        count--;
        // If count is less than zero, we've returned all characters.
        if (count < 0)
            return '\u0000';

        this.index--;
        return charsToReturn[this.index + 1];
    }

    public override bool MovePrevious()
    {
        // Original: if count >= -1 and pos >= 0
        if (count >= -1) {
            count++;
            return true;
        }
        else {
            return false;
        }
    }

    public override int Remaining
    {
        get { return count < 0 ? 0 : count; }
    }

    public override void Reset()
    {
        count = -1;
        index = -1;
    }
}
```
|
||||
|
||||
The following code then instantiates the `CustomMapper` object and passes an instance of it to the [Encoding.GetEncoding(String, EncoderFallback, DecoderFallback)](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoding#System_Text_Encoding_GetEncoding_System_String_System_Text_EncoderFallback_System_Text_DecoderFallback_) method. The output indicates that the best-fit fallback implementation successfully handles the three non-ASCII characters in the original string.
|
||||
|
||||
```csharp
using System;
using System.Collections.Generic;
using System.Text;

class Program
{
    static void Main()
    {
        Encoding enc = Encoding.GetEncoding("us-ascii", new CustomMapper(), new DecoderExceptionFallback());

        string str1 = "\u24C8 \u2075 \u221E";
        Console.WriteLine(str1);
        for (int ctr = 0; ctr <= str1.Length - 1; ctr++) {
            Console.Write("{0} ", Convert.ToUInt16(str1[ctr]).ToString("X4"));
            if (ctr == str1.Length - 1)
                Console.WriteLine();
        }
        Console.WriteLine();

        // Encode the original string using the ASCII encoder.
        byte[] bytes = enc.GetBytes(str1);
        Console.Write("Encoded bytes: ");
        foreach (var byt in bytes)
            Console.Write("{0:X2} ", byt);

        Console.WriteLine("\n");

        // Decode the ASCII bytes.
        string str2 = enc.GetString(bytes);
        Console.WriteLine("Round-trip: {0}", str1.Equals(str2));
        if (! str1.Equals(str2)) {
            Console.WriteLine(str2);
            foreach (var ch in str2)
                Console.Write("{0} ", Convert.ToUInt16(ch).ToString("X4"));

            Console.WriteLine();
        }
    }
}
```
|
||||
|
||||
## See Also
|
||||
|
||||
[System.Text.Encoder](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoder)
|
||||
|
||||
[System.Text.EncoderFallback](https://docs.microsoft.com/dotnet/core/api/System.Text.EncoderFallback)
|
||||
|
||||
[System.Text.Decoder](https://docs.microsoft.com/dotnet/core/api/System.Text.Decoder)
|
||||
|
||||
[System.Text.DecoderFallback](https://docs.microsoft.com/dotnet/core/api/System.Text.DecoderFallback)
|
||||
|
||||
[System.Text.Encoding](https://docs.microsoft.com/dotnet/core/api/System.Text.Encoding)
|
||||
|
||||
|
||||
|
||||
|
|
@ -1,32 +0,0 @@
|
|||
---
|
||||
title: Manipulating Strings
|
||||
description: Manipulating Strings
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 2692b299-7a1a-49f3-82fe-08b2a3de6bf5
|
||||
---
|
||||
|
||||
# Manipulating Strings
|
||||
|
||||
.NET Core provides an extensive set of routines that enable you to efficiently create, compare, and modify strings as well as rapidly parse large amounts of text and data to search for, remove, and replace text patterns.
|
||||
|
||||
## In This Section
|
||||
|
||||
[Best Practices for Using Strings](bestpractices.md) - Examines string-sorting, comparison, and casing methods in .NET Core, and provides recommendations for selecting a string-handling method.
|
||||
|
||||
[Regular Expressions](regularexpressions.md) - Provides detailed information about .NET Core regular expressions, including language elements, regular expression behavior, and examples.
|
||||
|
||||
[Basic String Operations](basicstringoperations.md) - Describes string operations provided by the [System.String](https://docs.microsoft.com/dotnet/core/api/System.String) and [System.Text.StringBuilder](https://docs.microsoft.com/dotnet/core/api/System.Text.StringBuilder) classes, including creating new strings from arrays of bytes, comparing string values, and modifying existing strings.
|
||||
|
||||
[Character Encoding in .NET Core](characterencoding.md) - Describes how to encode and decode character formats such as Unicode.
|
||||
|
||||
|
||||
|
||||
|
||||
|
|
@ -1,438 +0,0 @@
|
|||
---
|
||||
title: Backtracking in Regular Expressions
|
||||
description: Backtracking in Regular Expressions
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 702f05ee-3831-4556-bc32-e2dac0046032
|
||||
---
|
||||
|
||||
# Backtracking in Regular Expressions
|
||||
|
||||
Backtracking occurs when a regular expression pattern contains optional quantifiers or alternation constructs, and the regular expression engine returns to a previous saved state to continue its search for a match. Backtracking is central to the power of regular expressions; it makes it possible for expressions to be powerful and flexible, and to match very complex patterns. At the same time, this power comes at a cost. Backtracking is often the single most important factor that affects the performance of the regular expression engine. Fortunately, the developer has control over the behavior of the regular expression engine and how it uses backtracking. This topic explains how backtracking works and how it can be controlled.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> In general, a Nondeterministic Finite Automaton (NFA) engine like the .NET Core regular expression engine places the responsibility for crafting efficient, fast regular expressions on the developer.
|
||||
|
||||
This topic contains the following sections:
|
||||
|
||||
* [Linear Comparison Without Backtracking](#Linear-Comparison-Without-Backtracking)
|
||||
|
||||
* [Backtracking with Optional Quantifiers or Alternation Constructs](#Backtracking-with-Optional-Quantifiers-or-Alternation-Constructs)
|
||||
|
||||
* [Backtracking with Nested Optional Quantifiers](#Backtracking-with-Nested-Optional-Quantifiers)
|
||||
|
||||
* [Controlling Backtracking](#Controlling-Backtracking)
|
||||
|
||||
## Linear Comparison Without Backtracking
|
||||
|
||||
If a regular expression pattern has no optional quantifiers or alternation constructs, the regular expression engine executes in linear time. That is, after the regular expression engine matches the first language element in the pattern with text in the input string, it tries to match the next language element in the pattern with the next character or group of characters in the input string. This continues until the match either succeeds or fails. In either case, the regular expression engine advances by one character at a time in the input string.
|
||||
|
||||
The following example provides an illustration. The regular expression `e{2}\w\b` looks for two occurrences of the letter "e" followed by any word character followed by a word boundary.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string input = "needing a reed";
|
||||
string pattern = @"e{2}\w\b";
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine("{0} found at position {1}",
|
||||
match.Value, match.Index);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// eed found at position 11
|
||||
```
|
||||
|
||||
Although this regular expression includes the quantifier `{2}`, it is evaluated in a linear manner. The regular expression engine does not backtrack because `{2}` is not an optional quantifier; it specifies an exact number and not a variable number of times that the previous subexpression must match. As a result, the regular expression engine tries to match the regular expression pattern with the input string as shown in the following table.
|
||||
|
||||
Operation | Position in pattern | Position in string | Result
|
||||
--------- | ------------------- | ------------------ | ------
|
||||
1 | e | "needing a reed" (index 0) | No match.
|
||||
2 | e | "eeding a reed" (index 1) | Possible match.
|
||||
3 | e{2} | "eding a reed" (index 2) | Possible match.
|
||||
4 | \w | "ding a reed" (index 3) | Possible match.
|
||||
5 | \b | "ing a reed" (index 4) | Possible match fails.
|
||||
6 | e | "eding a reed" (index 2) | Possible match.
|
||||
7 | e{2} | "ding a reed" (index 3) | Possible match fails.
|
||||
8 | e | "ding a reed" (index 3) | Match fails.
|
||||
9 | e | "ing a reed" (index 4) | No match.
|
||||
10 | e | "ng a reed" (index 5) | No match.
|
||||
11 | e | "g a reed" (index 6) | No match.
|
||||
12 | e | " a reed" (index 7) | No match.
|
||||
13 | e | "a reed" (index 8) | No match.
|
||||
14 | e | " reed" (index 9) | No match.
|
||||
15 | e | "reed" (index 10) | No match.
|
||||
16 | e | "eed" (index 11) | Possible match.
|
||||
17 | e{2} | "ed" (index 12) | Possible match.
|
||||
18 | \w | "d" (index 13) | Possible match.
|
||||
19 | \b | "" (index 14) | Match.
|
||||
|
||||
|
||||
If a regular expression pattern includes no optional quantifiers or alternation constructs, the maximum number of comparisons required to match the regular expression pattern with the input string is roughly equivalent to the number of characters in the input string. In this case, the regular expression engine uses 19 comparisons to identify possible matches in this 14-character string. In other words, the regular expression engine runs in near-linear time if the pattern contains no optional quantifiers or alternation constructs.
|
||||
|
||||
## Backtracking with Optional Quantifiers or Alternation Constructs
|
||||
|
||||
When a regular expression includes optional quantifiers or alternation constructs, the evaluation of the input string is no longer linear. Pattern matching with an NFA engine is driven by the language elements in the regular expression and not by the characters to be matched in the input string. Therefore, the regular expression engine tries to fully match optional or alternative subexpressions. When it advances to the next language element in the subexpression and the match is unsuccessful, the regular expression engine can abandon a portion of its successful match and return to an earlier saved state in the interest of matching the regular expression as a whole with the input string. This process of returning to a previous saved state to find a match is known as backtracking.
|
||||
|
||||
For example, consider the regular expression pattern `.*(es)`, which matches the characters "es" and all the characters that precede it. As the following example shows, if the input string is "Essential services are provided by regular expressions.", the pattern matches the whole string up to and including the "es" in "expressions".
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string input = "Essential services are provided by regular expressions.";
|
||||
string pattern = ".*(es)";
|
||||
Match m = Regex.Match(input, pattern, RegexOptions.IgnoreCase);
|
||||
if (m.Success) {
|
||||
Console.WriteLine("'{0}' found at position {1}",
|
||||
m.Value, m.Index);
|
||||
Console.WriteLine("'es' found at position {0}",
|
||||
m.Groups[1].Index);
|
||||
}
|
||||
}
|
||||
}
|
||||
// 'Essential services are provided by regular expres' found at position 0
|
||||
// 'es' found at position 47
|
||||
```
|
||||
|
||||
To do this, the regular expression engine uses backtracking as follows:
|
||||
|
||||
* It matches the `.*` (which matches zero, one, or more occurrences of any character) with the whole input string.
|
||||
|
||||
* It attempts to match "e" in the regular expression pattern. However, the input string has no remaining characters available to match.
|
||||
|
||||
* It backtracks to its last successful match, "Essential services are provided by regular expressions", and attempts to match "e" with the period at the end of the sentence. The match fails.
|
||||
|
||||
* It continues to backtrack to a previous successful match one character at a time until the tentatively matched substring is "Essential services are provided by regular expr". It then compares the "e" in the pattern to the second "e" in "expressions" and finds a match.
|
||||
|
||||
* It compares "s" in the pattern to the "s" that follows the matched "e" character (the first "s" in "expressions"). The match is successful.
|
||||
|
||||
When you use backtracking, matching the regular expression pattern with the input string, which is 55 characters long, requires 67 comparison operations. Interestingly, if the regular expression pattern included a lazy quantifier, `.*?(es)`, matching the regular expression would require additional comparisons. In this case, instead of having to backtrack from the end of the string to the "r" in "expressions", the regular expression engine would have to backtrack all the way to the beginning of the string to match "Es" and would require 113 comparisons. Generally, if a regular expression pattern has a single alternation construct or a single optional quantifier, the number of comparison operations required to match the pattern is more than twice the number of characters in the input string.
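The difference between the greedy and lazy forms can also be seen in what each one matches. The following is a minimal sketch, not part of the original example: the class name `GreedyVersusLazy` and the helper `FirstMatch` are hypothetical names chosen for illustration. It does not count comparisons; it only reports where the captured "es" ends up for each pattern.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyVersusLazy
{
    // Hypothetical helper: returns the first match, ignoring case as the example above does.
    public static Match FirstMatch(string input, string pattern)
    {
        return Regex.Match(input, pattern, RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        string input = "Essential services are provided by regular expressions.";

        // Greedy: .* takes the whole string, then backtracks to the last "es".
        Match greedy = FirstMatch(input, ".*(es)");
        // Lazy: .*? takes as little as possible, so the first "Es" satisfies the pattern.
        Match lazy = FirstMatch(input, ".*?(es)");

        Console.WriteLine("Greedy: 'es' captured at index {0}, match length {1}",
                          greedy.Groups[1].Index, greedy.Length);
        Console.WriteLine("Lazy:   'es' captured at index {0}, match length {1}",
                          lazy.Groups[1].Index, lazy.Length);
    }
}
// The sketch displays the following output:
//    Greedy: 'es' captured at index 47, match length 49
//    Lazy:   'es' captured at index 0, match length 2
```

The two patterns are therefore not interchangeable: switching a quantifier from greedy to lazy changes which substring is returned as well as the amount of work the engine performs.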
|
||||
|
||||
## Backtracking with Nested Optional Quantifiers
|
||||
|
||||
The number of comparison operations required to match a regular expression pattern can increase exponentially if the pattern includes a large number of alternation constructs, if it includes nested alternation constructs, or, most commonly, if it includes nested optional quantifiers. For example, the regular expression pattern `^(a+)+$` is designed to match a complete string that contains one or more "a" characters. The example provides two input strings of identical length, but only the first string matches the pattern. The [System.Diagnostics.Stopwatch](https://docs.microsoft.com/dotnet/core/api/System.Diagnostics.Stopwatch) class is used to determine how long the match operation takes.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Diagnostics;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = "^(a+)+$";
|
||||
string[] inputs = { "aaaaaa", "aaaaa!" };
|
||||
Regex rgx = new Regex(pattern);
|
||||
Stopwatch sw;
|
||||
|
||||
foreach (string input in inputs) {
|
||||
sw = Stopwatch.StartNew();
|
||||
Match match = rgx.Match(input);
|
||||
sw.Stop();
|
||||
if (match.Success)
|
||||
Console.WriteLine("Matched {0} in {1}", match.Value, sw.Elapsed);
|
||||
else
|
||||
Console.WriteLine("No match found in {0}", sw.Elapsed);
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
As the output from the example shows, the regular expression engine took about twice as long to find that an input string did not match the pattern as it did to identify a matching string. This is because an unsuccessful match always represents a worst-case scenario. The regular expression engine must use the regular expression to follow all possible paths through the data before it can conclude that the match is unsuccessful, and the nested parentheses create many additional paths through the data. The regular expression engine concludes that the second string did not match the pattern by doing the following:
|
||||
|
||||
* It checks that it is at the beginning of the string, and then matches the first five characters in the string with the pattern `a+`. It then determines that there are no additional groups of "a" characters in the string. Finally, it tests for the end of the string. Because one additional character remains in the string, the match fails. This failed match requires 9 comparisons. The regular expression engine also saves state information from its matches of "a" (which we will call match 1), "aa" (match 2), "aaa" (match 3), and "aaaa" (match 4).
|
||||
|
||||
* It returns to the previously saved match 4. It determines that there is one additional "a" character to assign to an additional captured group. Finally, it tests for the end of the string. Because one additional character remains in the string, the match fails. This failed match requires 4 comparisons. So far, a total of 13 comparisons have been performed.
|
||||
|
||||
* It returns to the previously saved match 3. It determines that there are two additional "a" characters to assign to an additional captured group. However, the end-of-string test fails. It then returns to match 3 and tries to match the two additional "a" characters in two additional captured groups. The end-of-string test still fails. These failed matches require 12 comparisons. So far, a total of 25 comparisons have been performed.
|
||||
|
||||
Comparison of the input string with the regular expression continues in this way until the regular expression engine has tried all possible combinations of matches, and then concludes that there is no match. Because of the nested quantifiers, this comparison is an O(2^n), or exponential, operation, where n is the number of characters in the input string. This means that in the worst case, an input string of 30 characters requires approximately 1,073,741,824 comparisons, and an input string of 40 characters requires approximately 1,099,511,627,776 comparisons. If you use strings of these or even greater lengths, regular expression methods can take an extremely long time to complete when they process input that does not match the regular expression pattern.
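The figures quoted above are simply powers of two. As a rough sketch of the arithmetic only (the class `BacktrackingGrowth` and the method `WorstCaseComparisons` are hypothetical names, and the result is the worst-case estimate from the text, not a measurement of the regular expression engine):

```csharp
using System;

public static class BacktrackingGrowth
{
    // Worst-case estimate of 2^n comparisons for an n-character input.
    // Assumes 0 <= n <= 62 so that the result fits in a long.
    public static long WorstCaseComparisons(int n)
    {
        return 1L << n;
    }

    public static void Main()
    {
        foreach (int n in new[] { 10, 20, 30, 40 })
            Console.WriteLine("{0} characters: about {1} comparisons",
                              n, WorstCaseComparisons(n));
    }
}
```

Each additional character doubles the estimate, which is why input that is only slightly longer can take dramatically more time to reject.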
|
||||
|
||||
## Controlling Backtracking
|
||||
|
||||
Backtracking lets you create powerful, flexible regular expressions. However, as the previous section showed, these benefits may be coupled with unacceptably poor performance. To prevent excessive backtracking, you should define a time-out interval when you instantiate a [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) object or call a static regular expression matching method. This is discussed in the next section. In addition, .NET Core supports three regular expression language elements that limit or suppress backtracking and that support complex regular expressions with little or no performance penalty: [nonbacktracking subexpressions](#nonbacktracking-subexpression), [lookbehind assertions](#lookbehind-assertions), and [lookahead assertions](#lookahead-assertions).
|
||||
|
||||
### Defining a Time-out Interval
|
||||
|
||||
You can set a time-out value that represents the longest interval the regular expression engine will search for a single match before it abandons the attempt and throws a [RegexMatchTimeoutException](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexMatchTimeoutException) exception. You specify the time-out interval by supplying a [TimeSpan](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan) value to the [Regex.Regex(String, RegexOptions, TimeSpan)](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex__ctor_System_String_System_Text_RegularExpressions_RegexOptions_System_TimeSpan_) constructor for instance regular expressions. In addition, each static pattern-matching method has an overload with a [TimeSpan](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan) parameter that allows you to specify a time-out value. By default, the time-out interval is set to [Regex.InfiniteMatchTimeout](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_InfiniteMatchTimeout) and the regular expression engine does not time out.
|
||||
|
||||
> **Important**
|
||||
>
|
||||
> We recommend that you always set a time-out interval if your regular expression relies on backtracking.
|
||||
|
||||
A [RegexMatchTimeoutException](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexMatchTimeoutException) exception indicates that the regular expression engine was unable to find a match within the specified time-out interval but does not indicate why the exception was thrown. The reason might be excessive backtracking, but it is also possible that the time-out interval was set too low given the system load at the time the exception was thrown. When you handle the exception, you can choose to abandon further matches with the input string or increase the time-out interval and retry the matching operation.
|
||||
|
||||
For example, the following code calls the [Regex.Regex(String, RegexOptions, TimeSpan)](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex__ctor_System_String_System_Text_RegularExpressions_RegexOptions_System_TimeSpan_) constructor to instantiate a Regex object with a time-out value of one second. The regular expression pattern `(a+)+$`, which matches one or more sequences of one or more "a" characters at the end of a line, is subject to excessive backtracking. If a [RegexMatchTimeoutException](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexMatchTimeoutException) is thrown, the example increases the time-out value up to a maximum interval of three seconds. After that, it abandons the attempt to match the pattern.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.ComponentModel;
|
||||
using System.Diagnostics;
|
||||
using System.Security;
|
||||
using System.Text.RegularExpressions;
|
||||
using System.Threading;
|
||||
|
||||
public class Example
|
||||
{
|
||||
const int MaxTimeoutInSeconds = 3;
|
||||
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"(a+)+$"; // DO NOT REUSE THIS PATTERN.
|
||||
Regex rgx = new Regex(pattern, RegexOptions.IgnoreCase, TimeSpan.FromSeconds(1));
|
||||
Stopwatch sw = null;
|
||||
|
||||
string[] inputs= { "aa", "aaaa>",
|
||||
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
|
||||
"aaaaaaaaaaaaaaaaaaaaaa>",
|
||||
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa>" };
|
||||
|
||||
foreach (var inputValue in inputs) {
|
||||
Console.WriteLine("Processing {0}", inputValue);
|
||||
bool timedOut = false;
|
||||
do {
|
||||
try {
|
||||
sw = Stopwatch.StartNew();
|
||||
// Display the result.
|
||||
if (rgx.IsMatch(inputValue)) {
|
||||
sw.Stop();
|
||||
Console.WriteLine(@"Valid: '{0}' ({1:ss\.fffffff} seconds)",
|
||||
inputValue, sw.Elapsed);
|
||||
}
|
||||
else {
|
||||
sw.Stop();
|
||||
Console.WriteLine(@"'{0}' is not a valid string. ({1:ss\.fffff} seconds)",
|
||||
inputValue, sw.Elapsed);
|
||||
}
|
||||
}
|
||||
catch (RegexMatchTimeoutException e) {
|
||||
sw.Stop();
|
||||
// Display the elapsed time until the exception.
|
||||
Console.WriteLine(@"Timeout with '{0}' after {1:ss\.fffff}",
|
||||
inputValue, sw.Elapsed);
|
||||
Thread.Sleep(1500); // Pause for 1.5 seconds.
|
||||
|
||||
// Increase the timeout interval and retry.
|
||||
TimeSpan timeout = e.MatchTimeout.Add(TimeSpan.FromSeconds(1));
|
||||
if (timeout.TotalSeconds > MaxTimeoutInSeconds) {
|
||||
Console.WriteLine("Maximum timeout interval of {0} seconds exceeded.",
|
||||
MaxTimeoutInSeconds);
|
||||
timedOut = false;
|
||||
}
|
||||
else {
|
||||
Console.WriteLine("Changing the timeout interval to {0}",
|
||||
timeout);
|
||||
rgx = new Regex(pattern, RegexOptions.IgnoreCase, timeout);
|
||||
timedOut = true;
|
||||
}
|
||||
}
|
||||
} while (timedOut);
|
||||
Console.WriteLine();
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays output like the following:
|
||||
// Processing aa
|
||||
// Valid: 'aa' (00.0000779 seconds)
|
||||
//
|
||||
// Processing aaaa>
|
||||
// 'aaaa>' is not a valid string. (00.00005 seconds)
|
||||
//
|
||||
// Processing aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
|
||||
// Valid: 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa' (00.0000043 seconds)
|
||||
//
|
||||
// Processing aaaaaaaaaaaaaaaaaaaaaa>
|
||||
// Timeout with 'aaaaaaaaaaaaaaaaaaaaaa>' after 01.00469
|
||||
// Changing the timeout interval to 00:00:02
|
||||
// Timeout with 'aaaaaaaaaaaaaaaaaaaaaa>' after 02.01202
|
||||
// Changing the timeout interval to 00:00:03
|
||||
// Timeout with 'aaaaaaaaaaaaaaaaaaaaaa>' after 03.01043
|
||||
// Maximum timeout interval of 3 seconds exceeded.
|
||||
//
|
||||
// Processing aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa>
|
||||
// Timeout with 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa>' after 03.01018
|
||||
// Maximum timeout interval of 3 seconds exceeded.
|
||||
```
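The example above sets the time-out through the constructor. The static overloads described earlier in this section can be used in much the same way. The following is a minimal sketch rather than a recommended pattern: the class `StaticTimeoutSketch` and the helper `TryIsMatch` are hypothetical names, and returning `null` to signal a time-out is an assumption made for brevity.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StaticTimeoutSketch
{
    // Hypothetical helper: returns true or false when the match completes,
    // or null when the engine gives up after the time-out interval.
    public static bool? TryIsMatch(string input, string pattern, TimeSpan timeout)
    {
        try {
            // Static overload that accepts a TimeSpan time-out value.
            return Regex.IsMatch(input, pattern, RegexOptions.None, timeout);
        }
        catch (RegexMatchTimeoutException) {
            return null;
        }
    }

    public static void Main()
    {
        TimeSpan timeout = TimeSpan.FromSeconds(1);
        foreach (string input in new[] { "aa", "aaaa>" }) {
            bool? result = TryIsMatch(input, @"(a+)+$", timeout);
            Console.WriteLine("{0}: {1}", input,
                              result.HasValue ? result.Value.ToString() : "timed out");
        }
    }
}
```

A caller that receives `null` can then decide whether to retry with a longer interval or abandon the input, as the instance-based example does.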
|
||||
|
||||
### Nonbacktracking Subexpression
|
||||
|
||||
The **(?>** _subexpression_**)** language element suppresses backtracking in a subexpression. It is useful for preventing the performance problems associated with failed matches.
|
||||
|
||||
The following example illustrates how suppressing backtracking improves performance when using nested quantifiers. It measures the time required for the regular expression engine to determine that an input string does not match two regular expressions. The first regular expression uses backtracking to attempt to match a string that contains one or more occurrences of one or more hexadecimal digits, followed by a colon, followed by one or more hexadecimal digits, followed by two colons. The second regular expression is identical to the first, except that it disables backtracking. As the output from the example shows, the performance improvement from disabling backtracking is significant.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Diagnostics;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string input = "b51:4:1DB:9EE1:5:27d60:f44:D4:cd:E:5:0A5:4a:D24:41Ad:";
|
||||
bool matched;
|
||||
Stopwatch sw;
|
||||
|
||||
Console.WriteLine("With backtracking:");
|
||||
string backPattern = "^(([0-9a-fA-F]{1,4}:)*([0-9a-fA-F]{1,4}))*(::)$";
|
||||
sw = Stopwatch.StartNew();
|
||||
matched = Regex.IsMatch(input, backPattern);
|
||||
sw.Stop();
|
||||
Console.WriteLine("Match: {0} in {1}", matched, sw.Elapsed);
|
||||
Console.WriteLine();
|
||||
|
||||
Console.WriteLine("Without backtracking:");
|
||||
string noBackPattern = "^((?>[0-9a-fA-F]{1,4}:)*(?>[0-9a-fA-F]{1,4}))*(::)$";
|
||||
sw = Stopwatch.StartNew();
|
||||
matched = Regex.IsMatch(input, noBackPattern);
|
||||
sw.Stop();
|
||||
Console.WriteLine("Match: {0} in {1}", matched, sw.Elapsed);
|
||||
}
|
||||
}
|
||||
// The example displays output like the following:
|
||||
// With backtracking:
|
||||
// Match: False in 00:00:27.4282019
|
||||
//
|
||||
// Without backtracking:
|
||||
// Match: False in 00:00:00.0001391
|
||||
```
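Suppressing backtracking affects which strings match, not only how long a match takes. The following small sketch illustrates this with a deliberately simple pattern; the class `AtomicGroupSketch` and the helper `Matches` are hypothetical names used only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AtomicGroupSketch
{
    // Hypothetical helper that wraps Regex.IsMatch.
    public static bool Matches(string input, string pattern)
    {
        return Regex.IsMatch(input, pattern);
    }

    public static void Main()
    {
        string input = "aaa";

        // (a+) first consumes "aaa", then gives one "a" back so the final "a" can match.
        Console.WriteLine("With backtracking:    {0}", Matches(input, "(a+)a"));

        // (?>a+) also consumes every "a" it can but never gives one back,
        // so nothing is left for the final "a" and the match fails.
        Console.WriteLine("Without backtracking: {0}", Matches(input, "(?>a+)a"));
    }
}
// The sketch displays the following output:
//    With backtracking:    True
//    Without backtracking: False
```

Because of this, a nonbacktracking subexpression is a safe replacement only when the text that follows it never needs characters that the subexpression has already consumed.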
|
||||
|
||||
### Lookbehind Assertions
|
||||
|
||||
.NET Core includes two language elements, **(?<=**_subexpression_**)** and **(?<!**_subexpression_**)**, that match the previous character or characters in the input string. Both language elements are zero-width assertions; that is, they determine whether the character or characters that immediately precede the current character can be matched by *subexpression*, without advancing or backtracking.
|
||||
|
||||
**(?<=**_subexpression_**)** is a positive lookbehind assertion; that is, the character or characters before the current position must match *subexpression*. **(?<!**_subexpression_**)** is a negative lookbehind assertion; that is, the character or characters before the current position must not match *subexpression*. Both positive and negative lookbehind assertions are most useful when *subexpression* is a subset of the previous *subexpression*.
|
||||
|
||||
The following example uses two equivalent regular expression patterns that validate the user name in an e-mail address. The first pattern is subject to poor performance because of excessive backtracking. The second pattern modifies the first regular expression by replacing a nested quantifier with a positive lookbehind assertion. The output from the example displays the execution time of the [Regex.IsMatch](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_IsMatch_System_String_) method.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Diagnostics;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
Stopwatch sw;
|
||||
string input = "aaaaaaaaaaaaaaaaaaaa";
|
||||
bool result;
|
||||
|
||||
string pattern = @"^[0-9A-Z]([-.\w]*[0-9A-Z])?@";
|
||||
sw = Stopwatch.StartNew();
|
||||
result = Regex.IsMatch(input, pattern, RegexOptions.IgnoreCase);
|
||||
sw.Stop();
|
||||
Console.WriteLine("Match: {0} in {1}", result, sw.Elapsed);
|
||||
|
||||
string behindPattern = @"^[0-9A-Z][-.\w]*(?<=[0-9A-Z])@";
|
||||
sw = Stopwatch.StartNew();
|
||||
result = Regex.IsMatch(input, behindPattern, RegexOptions.IgnoreCase);
|
||||
sw.Stop();
|
||||
Console.WriteLine("Match with Lookbehind: {0} in {1}", result, sw.Elapsed);
|
||||
}
|
||||
}
|
||||
// The example displays output similar to the following:
|
||||
// Match: False in 00:00:00.0017549
|
||||
// Match with Lookbehind: False in 00:00:00.0000659
|
||||
```
|
||||
|
||||
The first regular expression pattern, `^[0-9A-Z]([-.\w]*[0-9A-Z])?@`, is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`^` | Start the match at the beginning of the string.
|
||||
`[0-9A-Z]` | Match an alphanumeric character. This comparison is case-insensitive, because the [Regex.IsMatch](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_IsMatch_System_String_) method is called with the [RegexOptions.IgnoreCase](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexOptions#System_Text_RegularExpressions_RegexOptions_IgnoreCase) option.
|
||||
`[-.\w]*` | Match zero, one, or more occurrences of a hyphen, period, or word character.
|
||||
`[0-9A-Z]` | Match an alphanumeric character.
|
||||
`([-.\w]*[0-9A-Z])?` | Match zero or one occurrence of the combination of zero or more hyphens, periods, or word characters, followed by an alphanumeric character. This is the first capturing group.
|
||||
`@` | Match an at sign ("@").
|
||||
|
||||
The second regular expression pattern, `^[0-9A-Z][-.\w]*(?<=[0-9A-Z])@`, uses a positive lookbehind assertion. It is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`^` | Start the match at the beginning of the string.
|
||||
`[0-9A-Z]` | Match an alphanumeric character. This comparison is case-insensitive, because the [Regex.IsMatch](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_IsMatch_System_String_) method is called with the [RegexOptions.IgnoreCase](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexOptions#System_Text_RegularExpressions_RegexOptions_IgnoreCase) option.
|
||||
`[-.\w]*` | Match zero or more occurrences of a hyphen, period, or word character.
|
||||
`(?<=[0-9A-Z])` | Look back at the last matched character and continue the match if it is alphanumeric. Note that alphanumeric characters are a subset of the set that consists of periods, hyphens, and all word characters.
|
||||
`@` | Match an at sign ("@").
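To see what the lookbehind assertion accepts and rejects, the second pattern can be tried against a few short inputs. This is a minimal sketch: the class `LookbehindSketch`, the helper `IsValidUserName`, and the sample inputs are hypothetical and are not part of the original example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookbehindSketch
{
    // The lookbehind pattern from the table above.
    const string Pattern = @"^[0-9A-Z][-.\w]*(?<=[0-9A-Z])@";

    // Hypothetical helper: true if the text up to the at sign is a valid user name.
    public static bool IsValidUserName(string input)
    {
        return Regex.IsMatch(input, Pattern, RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        // "jack.@" fails because the character before "@" is a period, not alphanumeric.
        // "jack" fails because it contains no at sign.
        foreach (string input in new[] { "jack@", "j@", "jack.@", "jack" })
            Console.WriteLine("{0}: {1}", input, IsValidUserName(input));
    }
}
// The sketch displays the following output:
//    jack@: True
//    j@: True
//    jack.@: False
//    jack: False
```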
|
||||
|
||||
### Lookahead Assertions
|
||||
|
||||
.NET Core includes two language elements, **(?=**_subexpression_**)** and **(?!**_subexpression_**)**, that match the next character or characters in the input string. Both language elements are zero-width assertions; that is, they determine whether the character or characters that immediately follow the current character can be matched by *subexpression*, without advancing or backtracking.
|
||||
|
||||
**(?=**_subexpression_**)** is a positive lookahead assertion; that is, the character or characters after the current position must match *subexpression*. **(?!**_subexpression_**)** is a negative lookahead assertion; that is, the character or characters after the current position must not match *subexpression*. Both positive and negative lookahead assertions are most useful when *subexpression* is a subset of the next *subexpression*.
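As a brief illustration of the zero-width behavior before the performance comparison, the sketch below applies one positive and one negative lookahead to short inputs. The class `LookaheadSketch`, the helper `FirstMatch`, and the sample patterns are hypothetical and chosen only to keep the illustration small.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadSketch
{
    // Hypothetical helper: returns the matched text, or null if there is no match.
    public static string FirstMatch(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Value : null;
    }

    public static void Main()
    {
        // Positive lookahead: a period must follow, but it is not part of the match.
        Console.WriteLine(FirstMatch("end.", @"\w+(?=\.)") ?? "(no match)");

        // Negative lookahead: "q" matches only when the next character is not "u".
        Console.WriteLine(FirstMatch("Iraq", "q(?!u)") ?? "(no match)");
        Console.WriteLine(FirstMatch("queen", "q(?!u)") ?? "(no match)");
    }
}
// The sketch displays the following output:
//    end
//    q
//    (no match)
```

In the first call the period is tested but not consumed, which is why the returned value is "end" rather than "end.".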
|
||||
|
||||
The following example uses two equivalent regular expression patterns that validate a fully qualified type name. The first pattern is subject to poor performance because of excessive backtracking. The second modifies the first regular expression by replacing a nested quantifier with a positive lookahead assertion. The output from the example displays the execution time of the [Regex.IsMatch](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_IsMatch_System_String_) method.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Diagnostics;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string input = "aaaaaaaaaaaaaaaaaaaaaa.";
|
||||
bool result;
|
||||
Stopwatch sw;
|
||||
|
||||
string pattern = @"^(([A-Z]\w*)+\.)*[A-Z]\w*$";
|
||||
sw = Stopwatch.StartNew();
|
||||
result = Regex.IsMatch(input, pattern, RegexOptions.IgnoreCase);
|
||||
sw.Stop();
|
||||
Console.WriteLine("{0} in {1}", result, sw.Elapsed);
|
||||
|
||||
string aheadPattern = @"^((?=[A-Z])\w+\.)*[A-Z]\w*$";
|
||||
sw = Stopwatch.StartNew();
|
||||
result = Regex.IsMatch(input, aheadPattern, RegexOptions.IgnoreCase);
|
||||
sw.Stop();
|
||||
Console.WriteLine("{0} in {1}", result, sw.Elapsed);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// False in 00:00:03.8003793
|
||||
// False in 00:00:00.0000866
|
||||
```
|
||||
|
||||
The first regular expression pattern, `^(([A-Z]\w*)+\.)*[A-Z]\w*$`, is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`^` | Start the match at the beginning of the string.
|
||||
`([A-Z]\w*)+\.` | Match an alphabetical character (A-Z) followed by zero or more word characters one or more times, followed by a period. This comparison is case-insensitive, because the [Regex.IsMatch](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_IsMatch_System_String_) method is called with the [RegexOptions.IgnoreCase](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexOptions#System_Text_RegularExpressions_RegexOptions_IgnoreCase) option.
|
||||
`(([A-Z]\w*)+\.)*` | Match the previous pattern zero or more times.
|
||||
`[A-Z]\w*` | Match an alphabetical character followed by zero or more word characters.
|
||||
`$` | End the match at the end of the input string.
|
||||
|
||||
|
||||
The second regular expression pattern, `^((?=[A-Z])\w+\.)*[A-Z]\w*$`, uses a positive lookahead assertion. It is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`^` | Start the match at the beginning of the string.
|
||||
`(?=[A-Z])` | Look ahead to the first character and continue the match if it is alphabetical (A-Z). This comparison is case-insensitive, because the [Regex.IsMatch](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_IsMatch_System_String_) method is called with the [RegexOptions.IgnoreCase](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexOptions#System_Text_RegularExpressions_RegexOptions_IgnoreCase) option.
|
||||
`\w+\.` | Match one or more word characters followed by a period.
|
||||
`((?=[A-Z])\w+\.)*` | Match the pattern of one or more word characters followed by a period zero or more times. The initial word character must be alphabetical.
|
||||
`[A-Z]\w*` | Match an alphabetical character followed by zero or more word characters.
|
||||
`$` | End the match at the end of the input string.
|
||||
|
|
@ -1,43 +0,0 @@
|
|||
---
|
||||
title: Compilation and Reuse in Regular Expressions
|
||||
description: Compilation and Reuse in Regular Expressions
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 8f61d405-e44f-44ef-8978-dba1a6128e39
|
||||
---
|
||||
|
||||
# Compilation and Reuse in Regular Expressions
|
||||
|
||||
You can optimize the performance of applications that make extensive use of regular expressions by understanding how the regular expression engine compiles expressions and by understanding how regular expressions are cached. This topic discusses both compilation and caching.
|
||||
|
||||
## Compiled Regular Expressions
|
||||
|
||||
By default, the regular expression engine compiles a regular expression to a sequence of internal instructions (these are high-level codes that are different from Microsoft intermediate language, or MSIL). When the engine executes a regular expression, it interprets the internal codes.
|
||||
|
||||
If a [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) object is constructed with the [RegexOptions.Compiled](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexOptions#System_Text_RegularExpressions_RegexOptions_Compiled) option, it compiles the regular expression to explicit MSIL code instead of high-level regular expression internal instructions. This allows .NET Core's just-in-time (JIT) compiler to convert the expression to native machine code for higher performance.
|
||||
|
||||
However, generated MSIL cannot be unloaded. The only way to unload code is to unload an entire application domain (that is, to unload all of your application's code). Effectively, once a regular expression is compiled with the [RegexOptions.Compiled](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexOptions#System_Text_RegularExpressions_RegexOptions_Compiled) option, .NET Core never releases the resources used by the compiled expression, even if the regular expression was created by a [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) object that is itself released to garbage collection.
|
||||
|
||||
You must be careful to limit the number of different regular expressions you compile with the [RegexOptions.Compiled](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexOptions#System_Text_RegularExpressions_RegexOptions_Compiled) option to avoid consuming too many resources. If an application must use a large or unbounded number of regular expressions, each expression should be interpreted, not compiled. However, if a small number of regular expressions are used repeatedly, they should be compiled with [RegexOptions.Compiled](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexOptions#System_Text_RegularExpressions_RegexOptions_Compiled) for better performance.
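
As a minimal sketch of the "small number of expressions used repeatedly" case, the following example compiles one pattern and keeps it in a static field so that the generated MSIL is created only once. The pattern `\d+`, the field name `s_digits`, and the `CountNumbers` helper are assumptions made for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public class Example
{
    // Hypothetical pattern: a run of one or more digits.
    // The Regex is compiled once and reused for every call.
    private static readonly Regex s_digits =
        new Regex(@"\d+", RegexOptions.Compiled);

    // Hypothetical helper that counts the numbers in a line of text.
    public static int CountNumbers(string input)
    {
        return s_digits.Matches(input).Count;
    }

    public static void Main()
    {
        string[] lines = { "10 apples and 20 pears", "no numbers here", "7" };
        foreach (string line in lines)
            Console.WriteLine("{0}: {1}", line, CountNumbers(line));
    }
}
// The example displays the following output:
//       10 apples and 20 pears: 2
//       no numbers here: 0
//       7: 1
```

Constructing a new compiled [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) inside `CountNumbers` would instead pay the compilation cost on every call.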
|
||||
|
||||
## The Regular Expressions Cache
|
||||
|
||||
To improve performance, the regular expression engine maintains an application-wide cache of compiled regular expressions. The cache stores regular expression patterns that are used only in static method calls. (Regular expression patterns supplied to instance methods are not cached.) This avoids the need to reparse an expression into high-level byte code each time it is used.
|
||||
|
||||
The maximum number of cached regular expressions is determined by the value of the static [Regex.CacheSize](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_CacheSize) property. By default, the regular expression engine caches up to 15 compiled regular expressions. If the number of compiled regular expressions exceeds the cache size, the least recently used regular expression is discarded and the new regular expression is cached.
|
||||
|
||||
Your application can take advantage of precompiled regular expressions in one of the following two ways:
|
||||
|
||||
* By using a static method of the [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) object to define the regular expression. If you are using a regular expression pattern that has already been defined in another static method call, the regular expression engine will retrieve it from the cache. If not, the engine will compile the regular expression and add it to the cache.
|
||||
|
||||
* By reusing an existing [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) object as long as its regular expression pattern is needed.
|
||||
|
||||
|
||||
Because of the overhead of object instantiation and regular expression compilation, creating and rapidly destroying numerous [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) objects is a very expensive process. For applications that use a large number of different regular expressions, you can optimize performance by using calls to static [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) methods and possibly by increasing the size of the regular expression cache.
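
The following sketch shows one way to combine the two techniques described above: static method calls, which go through the cache, and a larger cache size. The cache size of 50, the sample inputs, and the `CountValid` helper are assumptions for illustration, not recommended values.

```csharp
using System;
using System.Text.RegularExpressions;

public class Example
{
    // Hypothetical helper: counts the inputs that match the pattern.
    public static int CountValid(string[] inputs, string pattern)
    {
        int count = 0;
        foreach (string input in inputs)
        {
            // Static call: after the first use, the parsed pattern
            // is retrieved from the regular expression cache.
            if (Regex.IsMatch(input, pattern))
                count++;
        }
        return count;
    }

    public static void Main()
    {
        // Assumed workload: about 50 distinct patterns used in rotation.
        Regex.CacheSize = 50;

        string[] inputs = { "A-1001", "B-22", "c-3333", "D-4444" };
        Console.WriteLine(CountValid(inputs, @"^[A-Z]-\d{4}$"));
    }
}
// The example displays the following output:
//       2
```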
|
||||
|
|
@ -1,384 +0,0 @@
|
|||
---
|
||||
title: Details of Regular Expression Behavior
|
||||
description: Details of Regular Expression Behavior
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 11b5e2d1-d660-460f-9a1d-76da61bbcc83
|
||||
---
|
||||
|
||||
# Details of Regular Expression Behavior
|
||||
|
||||
|
||||
The .NET Core regular expression engine is a backtracking regular expression matcher that incorporates a traditional Nondeterministic Finite Automaton (NFA) engine such as that used by Perl, Python, Emacs, and Tcl. This distinguishes it from faster, but more limited, pure regular expression Deterministic Finite Automaton (DFA) engines such as those found in awk, egrep, or lex. This also distinguishes it from standardized, but slower, POSIX NFAs. The following section describes the three types of regular expression engines, and explains why regular expressions in .NET Core are implemented by using a traditional NFA engine.
|
||||
|
||||
## Benefits of the NFA Engine
|
||||
|
||||
When DFA engines perform pattern matching, their processing order is driven by the input string. A DFA engine starts at the beginning of the input string and proceeds sequentially to determine whether the next character matches the regular expression pattern. It is guaranteed to match the longest string possible. Because it never tests the same character twice, a DFA engine does not support backtracking. However, because a DFA engine contains only finite state, it cannot match a pattern with backreferences, and because it does not construct an explicit expansion, it cannot capture subexpressions.
|
||||
|
||||
Unlike DFA engines, when traditional NFA engines perform pattern matching, their processing order is driven by the regular expression pattern. As it processes a particular language element, the engine uses greedy matching; that is, it matches as much of the input string as it possibly can. But it also saves its state after successfully matching a subexpression. If a match eventually fails, the engine can return to a saved state so it can try additional matches. This process of abandoning a successful subexpression match so that later language elements in the regular expression can also match is known as backtracking. NFA engines use backtracking to test all possible expansions of a regular expression in a specific order and accept the first match. Because a traditional NFA engine constructs a specific expansion of the regular expression for a successful match, it can capture subexpression matches and matching backreferences. However, because a traditional NFA backtracks, it can visit the same state multiple times if it arrives at the state over different paths. As a result, it can run exponentially slowly in the worst case. Because a traditional NFA engine accepts the first match it finds, it can also leave other (possibly longer) matches undiscovered.
|
||||
|
||||
POSIX NFA engines are like traditional NFA engines, except that they continue to backtrack until they can guarantee that they have found the longest match possible. As a result, a POSIX NFA engine is slower than a traditional NFA engine, and when you use a POSIX NFA engine, you cannot favor a shorter match over a longer one by changing the order of the backtracking search.
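
A short sketch of this difference: because a traditional NFA engine tries alternatives in the order in which they are written and accepts the first one that succeeds, reordering the alternatives changes which match is returned. The patterns and the `FirstMatch` helper are hypothetical and exist only to illustrate the point.

```csharp
using System;
using System.Text.RegularExpressions;

public class Example
{
    // Hypothetical helper: returns the text of the first match.
    public static string FirstMatch(string input, string pattern)
    {
        return Regex.Match(input, pattern).Value;
    }

    public static void Main()
    {
        string input = "abcd";

        // The shorter alternative is listed first, so it wins.
        Console.WriteLine(FirstMatch(input, "ab|abcd"));

        // The longer alternative is listed first, so it wins.
        Console.WriteLine(FirstMatch(input, "abcd|ab"));
    }
}
// The example displays the following output:
//       ab
//       abcd
```

A POSIX NFA engine would report the longer match for both patterns.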
|
||||
|
||||
Traditional NFA engines are favored by programmers because they offer greater control over string matching than either DFA or POSIX NFA engines. Although, in the worst case, they can run slowly, you can steer them to find matches in linear or polynomial time by using patterns that reduce ambiguities and limit backtracking. In other words, although NFA engines trade performance for power and flexibility, in most cases they offer good to acceptable performance if a regular expression is well-written and avoids cases in which backtracking degrades performance exponentially.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> For information about the performance penalty caused by excessive backtracking and ways to craft a regular expression to work around it, see [Backtracking in Regular Expressions](../backtracking.md).
|
||||
|
||||
## .NET Core Engine Capabilities
|
||||
|
||||
To take advantage of the benefits of a traditional NFA engine, the .NET Core regular expression engine includes a complete set of constructs to enable programmers to steer the backtracking engine. These constructs can be used to find matches faster or to favor specific expansions over others.
|
||||
|
||||
Other features of the .NET Core regular expression engine include the following:
|
||||
|
||||
### Lazy quantifiers
|
||||
|
||||
Lazy quantifiers: **??**, __*?__, **+?**, **{**_n_**,**_m_**}?**. These constructs tell the backtracking engine to search the minimum number of repetitions first. In contrast, ordinary greedy quantifiers try to match the maximum number of repetitions first. The following example illustrates the difference between the two. A regular expression matches a sentence that ends in a number, and a capturing group is intended to extract that number. The regular expression `.+(\d+)\.` includes the greedy quantifier `.+`, which causes the regular expression engine to capture only the last digit of the number. In contrast, the regular expression `.+?(\d+)\.` includes the lazy quantifier `.+?`, which causes the regular expression engine to capture the entire number.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string greedyPattern = @".+(\d+)\.";
|
||||
string lazyPattern = @".+?(\d+)\.";
|
||||
string input = "This sentence ends with the number 107325.";
|
||||
Match match;
|
||||
|
||||
// Match using greedy quantifier .+.
|
||||
match = Regex.Match(input, greedyPattern);
|
||||
if (match.Success)
|
||||
Console.WriteLine("Number at end of sentence (greedy): {0}",
|
||||
match.Groups[1].Value);
|
||||
else
|
||||
Console.WriteLine("{0} finds no match.", greedyPattern);
|
||||
// Match using lazy quantifier .+?.
|
||||
match = Regex.Match(input, lazyPattern);
|
||||
if (match.Success)
|
||||
Console.WriteLine("Number at end of sentence (lazy): {0}",
|
||||
match.Groups[1].Value);
|
||||
else
|
||||
Console.WriteLine("{0} finds no match.", lazyPattern);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Number at end of sentence (greedy): 5
|
||||
// Number at end of sentence (lazy): 107325
|
||||
```
|
||||
|
||||
The greedy and lazy versions of this regular expression are defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`.+` (greedy quantifier) | Match at least one occurrence of any character. This causes the regular expression engine to match the entire string, and then to backtrack as needed to match the remainder of the pattern.
|
||||
`.+?` (lazy quantifier) | Match at least one occurrence of any character, but match as few as possible.
|
||||
`(\d+)` | Match at least one numeric character, and assign it to the first capturing group.
|
||||
`\.` | Match a period.
|
||||
|
||||
### Positive lookahead
|
||||
|
||||
Positive lookahead: **(?**=_subexpression_**)**. This feature allows the backtracking engine to return to the same spot in the text after matching a subexpression. It is useful for searching throughout the text by verifying multiple patterns that start from the same position. It also allows the engine to verify that a substring exists at the end of the match without including the substring in the matched text. The following example uses positive lookahead to extract the words in a sentence that are not followed by punctuation symbols.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"\b[A-Z]+\b(?=\P{P})";
|
||||
string input = "If so, what comes next?";
|
||||
foreach (Match match in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
|
||||
Console.WriteLine(match.Value);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// If
|
||||
// what
|
||||
// comes
|
||||
```
|
||||
|
||||
The regular expression `\b[A-Z]+\b(?=\P{P})` is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Begin the match at a word boundary.
|
||||
`[A-Z]+` | Match any alphabetic character one or more times. Because the [Regex.Matches](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Matches_System_String_) method is called with the [RegexOptions.IgnoreCase](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexOptions#System_Text_RegularExpressions_RegexOptions_IgnoreCase) option, the comparison is case-insensitive.
|
||||
`\b` | End the match at a word boundary.
|
||||
`(?=\P{P})` | Look ahead to determine whether the next character is a punctuation symbol. If it is not, the match succeeds.
|
||||
|
||||
### Negative lookahead
|
||||
|
||||
Negative lookahead: **(?!**_subexpression_**)**. This feature adds the ability to match an expression only if a subexpression fails to match. This is particularly powerful for pruning a search, because it is often simpler to provide an expression for a case that should be eliminated than an expression for cases that must be included. For example, it is difficult to write an expression for words that do not begin with "non". The following example uses negative lookahead to exclude them.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"\b(?!non)\w+\b";
|
||||
string input = "Nonsense is not always non-functional.";
|
||||
foreach (Match match in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
|
||||
Console.WriteLine(match.Value);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// is
|
||||
// not
|
||||
// always
|
||||
// functional
|
||||
```
|
||||
|
||||
The regular expression pattern `\b(?!non)\w+\b` is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Begin the match at a word boundary.
|
||||
`(?!non)` | Look ahead to ensure that the current string does not begin with "non". If it does, the match fails.
|
||||
`\w+` | Match one or more word characters.
|
||||
`\b` | End the match at a word boundary.
|
||||
|
||||
### Conditional evaluation
|
||||
|
||||
Conditional evaluation: **(?(**_expression_**)**_yes_|_no_**)** and **(?(**_name_**)**_yes_|_no_**)**, where *expression* is a subexpression to match, *name* is the name of a capturing group, *yes* is the subexpression to match if *expression* is matched or *name* is a valid, non-empty captured group, and *no* is the subexpression to match if *expression* is not matched or *name* is not a valid, non-empty captured group. This feature allows the engine to search by using more than one alternate pattern, depending on the result of a previous subexpression match or the result of a zero-width assertion. This allows a more powerful form of backreference that permits, for example, matching a subexpression based on whether a previous subexpression was matched. The regular expression in the following example matches paragraphs that are intended for both public and internal use. Paragraphs intended only for internal use begin with a `<PRIVATE>` tag. The regular expression pattern `^(?<Pvt>\<PRIVATE\>\s)?(?(Pvt)((\w+\p{P}?\s)+)|((\w+\p{P}?\s)+))\r?$` uses conditional evaluation to assign the contents of paragraphs intended for public and for internal use to separate capturing groups. These paragraphs can then be handled differently.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string input = "<PRIVATE> This is not for public consumption." + Environment.NewLine +
|
||||
"But this is for public consumption." + Environment.NewLine +
|
||||
"<PRIVATE> Again, this is confidential.\n";
|
||||
string pattern = @"^(?<Pvt>\<PRIVATE\>\s)?(?(Pvt)((\w+\p{P}?\s)+)|((\w+\p{P}?\s)+))\r?$";
|
||||
string publicDocument = null, privateDocument = null;
|
||||
|
||||
foreach (Match match in Regex.Matches(input, pattern, RegexOptions.Multiline))
|
||||
{
|
||||
if (match.Groups[1].Success) {
|
||||
privateDocument += match.Groups[1].Value + "\n";
|
||||
}
|
||||
else {
|
||||
publicDocument += match.Groups[3].Value + "\n";
|
||||
privateDocument += match.Groups[3].Value + "\n";
|
||||
}
|
||||
}
|
||||
|
||||
Console.WriteLine("Private Document:");
|
||||
Console.WriteLine(privateDocument);
|
||||
Console.WriteLine("Public Document:");
|
||||
Console.WriteLine(publicDocument);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Private Document:
|
||||
// This is not for public consumption.
|
||||
// But this is for public consumption.
|
||||
// Again, this is confidential.
|
||||
//
|
||||
// Public Document:
|
||||
// But this is for public consumption.
|
||||
```
|
||||
|
||||
The regular expression pattern is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`^` | Begin the match at the beginning of a line.
|
||||
`(?<Pvt>\<PRIVATE\>\s)?` | Match zero or one occurrence of the string `<PRIVATE>` followed by a white-space character. Assign the match to a capturing group named Pvt.
|
||||
`(?(Pvt)((\w+\p{P}?\s)+)` | If the `Pvt` capturing group exists, match one or more occurrences of one or more word characters followed by zero or one punctuation separator followed by a white-space character. Assign the substring to the first capturing group.
|
||||
`\|((\w+\p{P}?\s)+))` | If the `Pvt` capturing group does not exist, match one or more occurrences of one or more word characters followed by zero or one punctuation separator followed by a white-space character. Assign the substring to the third capturing group.
|
||||
`\r?$` | Match the end of a line or the end of the string.
|
||||
|
||||
### Balancing group definitions
|
||||
|
||||
Balancing group definitions: **(?<**_name1-name2_**>** _subexpression_**)**. This feature allows the regular expression engine to keep track of nested constructs such as parentheses or opening and closing brackets.
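
As a minimal sketch of how a balancing group can track nesting, the following example checks whether the parentheses in a string are balanced. The pattern, the group names `Open` and `Close`, and the `IsBalanced` helper are assumptions made for illustration. Each `(` pushes a capture onto the `Open` group, each `)` must pop one, and the final conditional fails the match if any `Open` capture is left over.

```csharp
using System;
using System.Text.RegularExpressions;

public class Example
{
    // Hypothetical pattern:
    //   [^()]               any character that is not a parenthesis
    //   (?<Open>\()         push an opening parenthesis onto the Open group
    //   (?<Close-Open>\))   pop Open for each closing parenthesis
    //   (?(Open)(?!))       fail if an unmatched opening parenthesis remains
    private const string Pattern =
        @"^(?:[^()]|(?<Open>\()|(?<Close-Open>\)))*(?(Open)(?!))$";

    // Hypothetical helper: true if every parenthesis has a partner.
    public static bool IsBalanced(string input)
    {
        return Regex.IsMatch(input, Pattern);
    }

    public static void Main()
    {
        string[] inputs = { "(a(b)c)", "(a(b)c", "a)b(", "no parens" };
        foreach (string input in inputs)
            Console.WriteLine("{0}: {1}", input, IsBalanced(input));
    }
}
// The example displays the following output:
//       (a(b)c): True
//       (a(b)c: False
//       a)b(: False
//       no parens: True
```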
|
||||
|
||||
### Nonbacktracking subexpressions
|
||||
|
||||
Nonbacktracking subexpressions (also known as greedy subexpressions): **(?>**_subexpression_**)**. This feature allows the backtracking engine to guarantee that a subexpression matches only the first match found for that subexpression, as if the expression were running independent of its containing expression. If you do not use this construct, backtracking searches from the larger expression can change the behavior of a subexpression. For example, the regular expression `(a+)\w` matches one or more "a" characters, along with a word character that follows the sequence of "a" characters, and assigns the sequence of "a" characters to the first capturing group. However, if the final character of the input string is also an "a", it is matched by the `\w` language element and is not included in the captured group.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string[] inputs = { "aaaaa", "aaaaab" };
|
||||
string backtrackingPattern = @"(a+)\w";
|
||||
Match match;
|
||||
|
||||
foreach (string input in inputs) {
|
||||
Console.WriteLine("Input: {0}", input);
|
||||
match = Regex.Match(input, backtrackingPattern);
|
||||
Console.WriteLine(" Pattern: {0}", backtrackingPattern);
|
||||
if (match.Success) {
|
||||
Console.WriteLine(" Match: {0}", match.Value);
|
||||
Console.WriteLine(" Group 1: {0}", match.Groups[1].Value);
|
||||
}
|
||||
else {
|
||||
Console.WriteLine(" Match failed.");
|
||||
}
|
||||
}
|
||||
Console.WriteLine();
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Input: aaaaa
|
||||
// Pattern: (a+)\w
|
||||
// Match: aaaaa
|
||||
// Group 1: aaaa
|
||||
// Input: aaaaab
|
||||
// Pattern: (a+)\w
|
||||
// Match: aaaaab
|
||||
// Group 1: aaaaa
|
||||
```
|
||||
|
||||
The regular expression `((?>a+))\w` prevents this behavior. Because all consecutive "a" characters are matched without backtracking, the first capturing group includes all consecutive "a" characters. If the "a" characters are not followed by at least one more character other than "a", the match fails.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string[] inputs = { "aaaaa", "aaaaab" };
|
||||
string nonbacktrackingPattern = @"((?>a+))\w";
|
||||
Match match;
|
||||
|
||||
foreach (string input in inputs) {
|
||||
Console.WriteLine("Input: {0}", input);
|
||||
match = Regex.Match(input, nonbacktrackingPattern);
|
||||
Console.WriteLine(" Pattern: {0}", nonbacktrackingPattern);
|
||||
if (match.Success) {
|
||||
Console.WriteLine(" Match: {0}", match.Value);
|
||||
Console.WriteLine(" Group 1: {0}", match.Groups[1].Value);
|
||||
}
|
||||
else {
|
||||
Console.WriteLine(" Match failed.");
|
||||
}
|
||||
}
|
||||
Console.WriteLine();
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Input: aaaaa
|
||||
// Pattern: ((?>a+))\w
|
||||
// Match failed.
|
||||
// Input: aaaaab
|
||||
// Pattern: ((?>a+))\w
|
||||
// Match: aaaaab
|
||||
// Group 1: aaaaa
|
||||
```
|
||||
|
||||
### Right-to-left matching
|
||||
|
||||
Right-to-left matching, which is specified by supplying the [RegexOptions.RightToLeft](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexOptions#System_Text_RegularExpressions_RegexOptions_RightToLeft) option to a [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) class constructor or static pattern-matching method. This feature is useful when searching from right to left instead of from left to right, or in cases where it is more efficient to begin a match at the right part of the pattern instead of the left. As the following example illustrates, using right-to-left matching can change the behavior of greedy quantifiers. The example conducts two searches for a sentence that ends in a number. The left-to-right search that uses the greedy quantifier `+` matches one of the six digits in the sentence, whereas the right-to-left search matches all six digits. For a description of the regular expression pattern, see the example that illustrates lazy quantifiers earlier in this section.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string greedyPattern = @".+(\d+)\.";
|
||||
string input = "This sentence ends with the number 107325.";
|
||||
Match match;
|
||||
|
||||
// Match from left-to-right using greedy quantifier .+.
|
||||
match = Regex.Match(input, greedyPattern);
|
||||
if (match.Success)
|
||||
Console.WriteLine("Number at end of sentence (left-to-right): {0}",
|
||||
match.Groups[1].Value);
|
||||
else
|
||||
Console.WriteLine("{0} finds no match.", greedyPattern);
|
||||
|
||||
// Match from right-to-left using greedy quantifier .+.
|
||||
match = Regex.Match(input, greedyPattern, RegexOptions.RightToLeft);
|
||||
if (match.Success)
|
||||
Console.WriteLine("Number at end of sentence (right-to-left): {0}",
|
||||
match.Groups[1].Value);
|
||||
else
|
||||
Console.WriteLine("{0} finds no match.", greedyPattern);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Number at end of sentence (left-to-right): 5
|
||||
// Number at end of sentence (right-to-left): 107325
|
||||
```
|
||||
|
||||
### Positive and negative lookbehind
|
||||
|
||||
Positive and negative lookbehind: **(?<**=_subexpression_**)** for positive lookbehind, and **(?<!**_subexpression_**)** for negative lookbehind. This feature is similar to lookahead, which is discussed earlier in this topic. Because the regular expression engine allows complete right-to-left matching, regular expressions allow unrestricted lookbehinds. Positive and negative lookbehind can also be used to avoid nesting quantifiers when the nested subexpression is a superset of an outer expression. Regular expressions with such nested quantifiers often offer poor performance. For instance, the following example verifies that a string begins and ends with an alphanumeric character, and that any other character in the string belongs to a larger set of permitted characters. It forms a portion of the regular expression used to validate e-mail addresses.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string[] inputs = { "jack.sprat", "dog#", "dog#1", "me.myself",
|
||||
"me.myself!" };
|
||||
string pattern = @"^[A-Z0-9]([-!#$%&'.*+/=?^`{}|~\w])*(?<=[A-Z0-9])$";
|
||||
foreach (string input in inputs) {
|
||||
if (Regex.IsMatch(input, pattern, RegexOptions.IgnoreCase))
|
||||
Console.WriteLine("{0}: Valid", input);
|
||||
else
|
||||
Console.WriteLine("{0}: Invalid", input);
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// jack.sprat: Valid
|
||||
// dog#: Invalid
|
||||
// dog#1: Valid
|
||||
// me.myself: Valid
|
||||
// me.myself!: Invalid
|
||||
```
|
||||
|
||||
The regular expression ``^[A-Z0-9]([-!#$%&'.*+/=?^`{}|~\w])*(?<=[A-Z0-9])$`` is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`^` | Begin the match at the beginning of the string.
|
||||
`[A-Z0-9]` | Match any alphanumeric character. (The comparison is case-insensitive.)
|
||||
``([-!#$%&'.*+/=?^`{}\|~\w])*`` | Match zero or more occurrences of any word character, or any of the following characters: -, !, #, $, %, &, ', ., *, +, /, =, ?, ^, `` ` ``, {, }, \|, or ~.
|
||||
`(?<=[A-Z0-9])` | Look behind to the previous character, which must be alphanumeric. (The comparison is case-insensitive.)
|
||||
`$` | End the match at the end of the string.
|
||||
|
||||
## Related Topics
|
||||
|
||||
Title | Description
|
||||
----- | -----------
|
||||
[Backtracking](../backtracking.md) | Provides information about how regular expression backtracking branches to find alternative matches.
|
||||
[Compilation and Reuse](../compilation.md) | Provides information about compiling and reusing regular expressions to increase performance.
|
||||
[Thread Safety](../threadsafety.md) | Provides information about regular expression thread safety and explains when you should synchronize access to regular expression objects.
|
||||
|
||||
## Reference
|
||||
|
||||
[System.Text.RegularExpressions](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions)
|
||||
|
|
@ -1,24 +0,0 @@
|
|||
---
|
||||
title: Thread Safety in Regular Expressions
|
||||
description: Thread Safety in Regular Expressions
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 33eaf2c6-d1c7-4c10-9b14-fe55b3550013
|
||||
---
|
||||
|
||||
# Thread Safety in Regular Expressions
|
||||
|
||||
The [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) class itself is thread safe and immutable (read-only). That is, [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) objects can be created on any thread and shared between threads; matching methods can be called from any thread and never alter any global state.
|
||||
|
||||
However, result objects ([Match](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match) and [MatchCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.MatchCollection)) returned by [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) should be used on a single thread. Although many of these objects are logically immutable, their implementations could delay computation of some results to improve performance, and as a result, callers must serialize access to them.
|
||||
|
||||
If there is a need to share [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) result objects on multiple threads, these objects can be converted to thread-safe instances by calling their synchronized methods. With the exception of enumerators, all regular expression classes are thread safe or can be converted into thread-safe objects by a synchronized method.
|
||||
|
||||
Enumerators are the only exception. An application must serialize calls to collection enumerators. The rule is that if a collection can be enumerated on more than one thread simultaneously, you should synchronize enumerator methods on the root object of the collection traversed by the enumerator.
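
The sharing rule above can be sketched as follows. This is a minimal illustration, not part of the original topic: the class and method names are hypothetical, and it assumes `Parallel.For` from `System.Threading.Tasks` is available on your target framework. One `Regex` instance is shared by all threads, while each `MatchCollection` is created and consumed on the thread that requested it.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

public class SharedRegexExample
{
   // One immutable Regex instance shared by every thread.
   private static readonly Regex WordRegex = new Regex(@"\b\w+\b");

   // The MatchCollection is created and read on the calling thread only,
   // so no result object is ever shared between threads.
   public static int CountWords(string text)
   {
      return WordRegex.Matches(text).Count;
   }

   public static void Main()
   {
      string[] inputs = { "one two three", "four five", "six" };
      int[] counts = new int[inputs.Length];

      // Iterations may run on different threads; each writes only its own slot.
      Parallel.For(0, inputs.Length, i => counts[i] = CountWords(inputs[i]));

      for (int i = 0; i < inputs.Length; i++)
         Console.WriteLine("'{0}': {1} words", inputs[i], counts[i]);
   }
}
```

Returning a plain value (here, a count) from the worker rather than the `Match` or `MatchCollection` itself is one simple way to keep result objects confined to a single thread.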
|
||||
|
|
@ -1,668 +0,0 @@
|
|||
---
|
||||
title: Best Practices for Regular Expressions
|
||||
description: Best Practices for Regular Expressions
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 34bf2530-41a2-4f20-a9b8-be00c4f87c27
|
||||
---
|
||||
|
||||
# Best Practices for Regular Expressions
|
||||
|
||||
The regular expression engine in .NET Core is a powerful, full-featured tool that processes text based on pattern matches rather than on comparing and matching literal text. In most cases, it performs pattern matching rapidly and efficiently. However, in some cases, the regular expression engine can appear to be very slow. In extreme cases, it can even appear to stop responding as it processes a relatively small input over the course of hours or even days.
|
||||
|
||||
This topic outlines some of the best practices that developers can adopt to ensure that their regular expressions achieve optimal performance. It contains the following sections:
|
||||
|
||||
* [Consider the Input Source](#Consider-the-Input-Source)
|
||||
|
||||
* [Handle Object Instantiation Appropriately](#Handle-Object-Instantiation-Appropriately)
|
||||
|
||||
* [Take Charge of Backtracking](#Take-Charge-of-Backtracking)
|
||||
|
||||
* [Use Time-out Values](#Use-Time-out-Values)
|
||||
|
||||
* [Capture Only When Necessary](#Capture-Only-When-Necessary)
|
||||
|
||||
* [Related Topics](#Related-Topics)
|
||||
|
||||
## Consider the Input Source
|
||||
|
||||
In general, regular expressions can accept two types of input: constrained or unconstrained. Constrained input is text that originates from a known or reliable source and follows a predefined format. Unconstrained input is text that originates from an unreliable source, such as a web user, and may not follow a predefined or expected format.
|
||||
|
||||
Regular expression patterns are typically written to match valid input. That is, developers examine the text that they want to match and then write a regular expression pattern that matches it. Developers then determine whether this pattern requires correction or further elaboration by testing it with multiple valid input items. When the pattern matches all presumed valid inputs, it is declared to be production-ready and can be included in a released application. This makes a regular expression pattern suitable for matching constrained input. However, it does not make it suitable for matching unconstrained input.
|
||||
|
||||
To match unconstrained input, a regular expression must be able to efficiently handle three kinds of text:
|
||||
|
||||
* Text that matches the regular expression pattern.
|
||||
|
||||
* Text that does not match the regular expression pattern.
|
||||
|
||||
* Text that nearly matches the regular expression pattern.
|
||||
|
||||
The last text type is especially problematic for a regular expression that has been written to handle constrained input. If that regular expression also relies on extensive backtracking, the regular expression engine can spend an inordinate amount of time (in some cases, many hours or days) processing seemingly innocuous text.
|
||||
|
||||
> **Warning**
|
||||
>
|
||||
> The following example uses a regular expression that is prone to excessive backtracking and that is likely to reject valid email addresses. You should not use it in an email validation routine.
|
||||
|
||||
|
||||
For example, consider a very commonly used but extremely problematic regular expression for validating the alias of an email address. The regular expression `^[0-9A-Z]([-.\w]*[0-9A-Z])*$` is written to process what is considered to be a valid email alias, which consists of an alphanumeric character, followed by zero or more characters that can be alphanumeric, periods, or hyphens. The alias must end with an alphanumeric character. However, as the following example shows, although this regular expression handles valid input easily, it performs very poorly when it processes nearly valid input.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Diagnostics;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
Stopwatch sw;
|
||||
string[] addresses = { "AAAAAAAAAAA@contoso.com",
|
||||
"AAAAAAAAAAaaaaaaaaaa!@contoso.com" };
|
||||
// The following regular expression should not actually be used to
|
||||
// validate an email address.
|
||||
string pattern = @"^[0-9A-Z]([-.\w]*[0-9A-Z])*$";
|
||||
string input;
|
||||
|
||||
foreach (var address in addresses) {
|
||||
string mailBox = address.Substring(0, address.IndexOf("@"));
|
||||
int index = 0;
|
||||
for (int ctr = mailBox.Length - 1; ctr >= 0; ctr--) {
|
||||
index++;
|
||||
|
||||
input = mailBox.Substring(ctr, index);
|
||||
sw = Stopwatch.StartNew();
|
||||
Match m = Regex.Match(input, pattern, RegexOptions.IgnoreCase);
|
||||
sw.Stop();
|
||||
if (m.Success)
|
||||
Console.WriteLine("{0,2}. Matched '{1,25}' in {2}",
|
||||
index, m.Value, sw.Elapsed);
|
||||
else
|
||||
Console.WriteLine("{0,2}. Failed '{1,25}' in {2}",
|
||||
index, input, sw.Elapsed);
|
||||
}
|
||||
Console.WriteLine();
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// The example displays output similar to the following:
|
||||
// 1. Matched ' A' in 00:00:00.0007122
|
||||
// 2. Matched ' AA' in 00:00:00.0000282
|
||||
// 3. Matched ' AAA' in 00:00:00.0000042
|
||||
// 4. Matched ' AAAA' in 00:00:00.0000038
|
||||
// 5. Matched ' AAAAA' in 00:00:00.0000042
|
||||
// 6. Matched ' AAAAAA' in 00:00:00.0000042
|
||||
// 7. Matched ' AAAAAAA' in 00:00:00.0000042
|
||||
// 8. Matched ' AAAAAAAA' in 00:00:00.0000087
|
||||
// 9. Matched ' AAAAAAAAA' in 00:00:00.0000045
|
||||
// 10. Matched ' AAAAAAAAAA' in 00:00:00.0000045
|
||||
// 11. Matched ' AAAAAAAAAAA' in 00:00:00.0000045
|
||||
//
|
||||
// 1. Failed ' !' in 00:00:00.0000447
|
||||
// 2. Failed ' a!' in 00:00:00.0000071
|
||||
// 3. Failed ' aa!' in 00:00:00.0000071
|
||||
// 4. Failed ' aaa!' in 00:00:00.0000061
|
||||
// 5. Failed ' aaaa!' in 00:00:00.0000081
|
||||
// 6. Failed ' aaaaa!' in 00:00:00.0000126
|
||||
// 7. Failed ' aaaaaa!' in 00:00:00.0000359
|
||||
// 8. Failed ' aaaaaaa!' in 00:00:00.0000414
|
||||
// 9. Failed ' aaaaaaaa!' in 00:00:00.0000758
|
||||
// 10. Failed ' aaaaaaaaa!' in 00:00:00.0001462
|
||||
// 11. Failed ' aaaaaaaaaa!' in 00:00:00.0002885
|
||||
// 12. Failed ' Aaaaaaaaaaa!' in 00:00:00.0005780
|
||||
// 13. Failed ' AAaaaaaaaaaa!' in 00:00:00.0011628
|
||||
// 14. Failed ' AAAaaaaaaaaaa!' in 00:00:00.0022851
|
||||
// 15. Failed ' AAAAaaaaaaaaaa!' in 00:00:00.0045864
|
||||
// 16. Failed ' AAAAAaaaaaaaaaa!' in 00:00:00.0093168
|
||||
// 17. Failed ' AAAAAAaaaaaaaaaa!' in 00:00:00.0185993
|
||||
// 18. Failed ' AAAAAAAaaaaaaaaaa!' in 00:00:00.0366723
|
||||
// 19. Failed ' AAAAAAAAaaaaaaaaaa!' in 00:00:00.1370108
|
||||
// 20. Failed ' AAAAAAAAAaaaaaaaaaa!' in 00:00:00.1553966
|
||||
// 21. Failed ' AAAAAAAAAAaaaaaaaaaa!' in 00:00:00.3223372
|
||||
```
|
||||
|
||||
As the output from the example shows, the regular expression engine processes the valid email alias in about the same time interval regardless of its length. On the other hand, when the nearly valid email address has more than five characters, processing time approximately doubles for each additional character in the string. This means that a nearly valid 28-character string would take over an hour to process, and a nearly valid 33-character string would take nearly a day to process.
|
||||
|
||||
Because this regular expression was developed solely by considering the format of input to be matched, it fails to take account of input that does not match the pattern. This, in turn, can allow unconstrained input that nearly matches the regular expression pattern to significantly degrade performance.
|
||||
|
||||
To solve this problem, you can do the following:
|
||||
|
||||
* When developing a pattern, you should consider how backtracking might affect the performance of the regular expression engine, particularly if your regular expression is designed to process unconstrained input. For more information, see the [Take Charge of Backtracking](#Take-Charge-of-Backtracking) section.
|
||||
|
||||
* Thoroughly test your regular expression using invalid and near-valid input as well as valid input. To generate input for a particular regular expression randomly, you can use [Rex](http://research.microsoft.com/en-us/projects/rex/), which is a regular expression exploration tool from Microsoft Research.
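
The testing advice above can be sketched as a small harness that times one pattern against valid, invalid, and nearly valid input. This is an illustrative sketch only: the class and the `TimeMatch` helper are hypothetical names, and the near-valid input is deliberately kept short so that even the problematic alias pattern finishes quickly.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public class NearValidInputTest
{
   // Runs a single IsMatch call and reports both the result and the elapsed time.
   public static TimeSpan TimeMatch(string pattern, string input, out bool matched)
   {
      Stopwatch sw = Stopwatch.StartNew();
      matched = Regex.IsMatch(input, pattern, RegexOptions.IgnoreCase);
      sw.Stop();
      return sw.Elapsed;
   }

   public static void Main()
   {
      // The problematic alias pattern discussed above; do not use it for validation.
      string pattern = @"^[0-9A-Z]([-.\w]*[0-9A-Z])*$";

      string[] inputs = {
         "AAAAAAAAAAAAAAAA",    // valid
         "!",                   // clearly invalid
         "AAAAAAAAAAAAAAAA!"    // nearly valid: only the last character is wrong
      };

      foreach (string input in inputs) {
         bool matched;
         TimeSpan elapsed = TimeMatch(pattern, input, out matched);
         Console.WriteLine("{0,-20} matched: {1,-5} elapsed: {2}",
                           input, matched, elapsed);
      }
   }
}
```

If the nearly valid input takes noticeably longer than the other two, and the gap widens as you lengthen it by a character or two, the pattern is a candidate for the rewrites described in the backtracking section.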
|
||||
|
||||
## Handle Object Instantiation Appropriately
|
||||
|
||||
At the heart of .NET Core’s regular expression object model is the [System.Text.RegularExpressions.Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) class, which represents the regular expression engine. Often, the single greatest factor that affects regular expression performance is the way in which the [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) engine is used. Defining a regular expression involves tightly coupling the regular expression engine with a regular expression pattern. That coupling process, whether it involves instantiating a [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) object by passing its constructor a regular expression pattern or calling a static method by passing it the regular expression pattern along with the string to be analyzed, is by necessity an expensive one.
|
||||
|
||||
You can couple the regular expression engine with a particular regular expression pattern and then use the engine to match text in several ways:
|
||||
|
||||
* You can call a static pattern-matching method, such as [Regex.Match(String, String)](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Match_System_String_System_String_). This does not require instantiation of a regular expression object.
|
||||
|
||||
* You can instantiate a [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) object and call an instance pattern-matching method of an interpreted regular expression. This is the default method for binding the regular expression engine to a regular expression pattern. It results when a [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) object is instantiated without an options argument that includes the [RegexOptions.Compiled](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexOptions#System_Text_RegularExpressions_RegexOptions_Compiled) flag.
|
||||
|
||||
* You can instantiate a [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) object and call an instance pattern-matching method of a compiled regular expression. Regular expression objects represent compiled patterns when a [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) object is instantiated with an options argument that includes the [RegexOptions.Compiled](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexOptions#System_Text_RegularExpressions_RegexOptions_Compiled) flag.
|
||||
> **Important**
|
||||
>
|
||||
> The form of the method call (static, interpreted, compiled) affects performance if the same regular expression is used repeatedly in method calls, or if an application makes extensive use of regular expression objects.
|
||||
|
||||
### Static Regular Expressions
|
||||
|
||||
Static regular expression methods are recommended as an alternative to repeatedly instantiating a regular expression object with the same regular expression. Unlike regular expression patterns used by regular expression objects, either the operation codes or the compiled Microsoft intermediate language (MSIL) from patterns used in static method calls is cached internally by the regular expression engine.
|
||||
|
||||
For example, you might call a method to validate user input. In this example, a method named `IsValidCurrency` checks whether the user has entered a currency symbol followed by at least one decimal digit. A very inefficient implementation of the `IsValidCurrency` method is shown in the following example. Note that each method call reinstantiates a [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) object with the same pattern. This, in turn, means that the regular expression pattern must be recompiled each time the method is called.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class RegexLib
|
||||
{
|
||||
public static bool IsValidCurrency(string currencyValue)
|
||||
{
|
||||
string pattern = @"\p{Sc}+\s*\d+";
|
||||
Regex currencyRegex = new Regex(pattern);
|
||||
return currencyRegex.IsMatch(currencyValue);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
You should replace this inefficient code with a call to the static [Regex.IsMatch(String, String)](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_IsMatch_System_String_System_String_) method. This eliminates the need to instantiate a [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) object each time you want to call a pattern-matching method, and enables the regular expression engine to retrieve a compiled version of the regular expression from its cache.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class RegexLib
|
||||
{
|
||||
public static bool IsValidCurrency(string currencyValue)
|
||||
{
|
||||
string pattern = @"\p{Sc}+\s*\d+";
|
||||
return Regex.IsMatch(currencyValue, pattern);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
By default, the 15 most recently used static regular expression patterns are cached. For applications that require a larger number of cached static regular expressions, the size of the cache can be adjusted by setting the [Regex.CacheSize](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_CacheSize) property.
|
||||
|
||||
The regular expression `\p{Sc}+\s*\d+` that is used in this example verifies that the input string consists of a currency symbol and at least one decimal digit. The pattern is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\p{Sc}+` | Match one or more characters in the Unicode Symbol, Currency category.
|
||||
`\s*` | Match zero or more white-space characters.
|
||||
`\d+` | Match one or more decimal digits.
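
The cache-size adjustment mentioned above might look like the following sketch. It assumes that the static `Regex.CacheSize` property is available on your target framework, and the value of 50 is an arbitrary figure chosen for illustration (for an application that cycles through a few dozen distinct static patterns).

```csharp
using System;
using System.Text.RegularExpressions;

public class CacheSizeExample
{
   public static void Main()
   {
      Console.WriteLine("Current cache size: {0}", Regex.CacheSize);

      // Hypothetical scenario: the application uses about 40 distinct patterns
      // in static method calls, so the cache is enlarged to hold them all.
      Regex.CacheSize = 50;
      Console.WriteLine("New cache size: {0}", Regex.CacheSize);
   }
}
```

Set the property once, early in application startup, so that patterns used later in static calls are not evicted and reparsed.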
|
||||
|
||||
### Interpreted vs. Compiled Regular Expressions
|
||||
|
||||
Regular expression patterns that are not bound to the regular expression engine through the specification of the [RegexOptions.Compiled](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexOptions#System_Text_RegularExpressions_RegexOptions_Compiled) option are interpreted. When a regular expression object is instantiated, the regular expression engine converts the regular expression to a set of operation codes. When an instance method is called, the operation codes are converted to MSIL and executed by the JIT compiler. Similarly, when a static regular expression method is called and the regular expression cannot be found in the cache, the regular expression engine converts the regular expression to a set of operation codes and stores them in the cache. It then converts these operation codes to MSIL so that the JIT compiler can execute them. Interpreted regular expressions reduce startup time at the cost of slower execution time. Because of this, they are best used when the regular expression is used in a small number of method calls, or if the exact number of calls to regular expression methods is unknown but is expected to be small. As the number of method calls increases, the performance gain from reduced startup time is outstripped by the slower execution speed.
|
||||
|
||||
Regular expression patterns that are bound to the regular expression engine through the specification of the [RegexOptions.Compiled](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexOptions#System_Text_RegularExpressions_RegexOptions_Compiled) option are compiled. This means that, when a regular expression object is instantiated, or when a static regular expression method is called and the regular expression cannot be found in the cache, the regular expression engine converts the regular expression to an intermediary set of operation codes, which it then converts to MSIL. When a method is called, the JIT compiler executes the MSIL. In contrast to interpreted regular expressions, compiled regular expressions increase startup time but execute individual pattern-matching methods faster. As a result, the performance benefit that results from compiling the regular expression increases in proportion to the number of regular expression methods called.
|
||||
|
||||
To summarize, we recommend that you use interpreted regular expressions when you call regular expression methods with a specific regular expression relatively infrequently. You should use compiled regular expressions when you call regular expression methods with a specific regular expression relatively frequently. The exact threshold at which the slower execution speeds of interpreted regular expressions outweigh gains from their reduced startup time, or the threshold at which the slower startup times of compiled regular expressions outweigh gains from their faster execution speeds, is difficult to determine. It depends on a variety of factors, including the complexity of the regular expression and the specific data that it processes. To determine whether interpreted or compiled regular expressions offer the best performance for your particular application scenario, you can use the [Stopwatch](https://docs.microsoft.com/dotnet/core/api/System.Diagnostics.Stopwatch) class to compare their execution times.
|
||||
|
||||
The following example compares the performance of compiled and interpreted regular expressions when reading the first ten sentences and when reading all the sentences in the text of Theodore Dreiser's The Financier. As the output from the example shows, when only ten calls are made to regular expression matching methods, an interpreted regular expression offers better performance than a compiled regular expression. However, a compiled regular expression offers better performance when a large number of calls (in this case, over 13,000) are made.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Diagnostics;
|
||||
using System.IO;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"\b(\w+((\r?\n)|,?\s))*\w+[.?:;!]";
|
||||
Stopwatch sw;
|
||||
Match match;
|
||||
int ctr;
|
||||
|
||||
StreamReader inFile = new StreamReader(@".\Dreiser_TheFinancier.txt");
|
||||
string input = inFile.ReadToEnd();
|
||||
inFile.Close();
|
||||
|
||||
// Read first ten sentences with interpreted regex.
|
||||
Console.WriteLine("10 Sentences with Interpreted Regex:");
|
||||
sw = Stopwatch.StartNew();
|
||||
Regex int10 = new Regex(pattern, RegexOptions.Singleline);
|
||||
match = int10.Match(input);
|
||||
for (ctr = 0; ctr <= 9; ctr++) {
|
||||
if (match.Success)
|
||||
// Do nothing with the match except get the next match.
|
||||
match = match.NextMatch();
|
||||
else
|
||||
break;
|
||||
}
|
||||
sw.Stop();
|
||||
Console.WriteLine(" {0} matches in {1}", ctr, sw.Elapsed);
|
||||
|
||||
// Read first ten sentences with compiled regex.
|
||||
Console.WriteLine("10 Sentences with Compiled Regex:");
|
||||
sw = Stopwatch.StartNew();
|
||||
Regex comp10 = new Regex(pattern,
|
||||
RegexOptions.Singleline | RegexOptions.Compiled);
|
||||
match = comp10.Match(input);
|
||||
for (ctr = 0; ctr <= 9; ctr++) {
|
||||
if (match.Success)
|
||||
// Do nothing with the match except get the next match.
|
||||
match = match.NextMatch();
|
||||
else
|
||||
break;
|
||||
}
|
||||
sw.Stop();
|
||||
Console.WriteLine(" {0} matches in {1}", ctr, sw.Elapsed);
|
||||
|
||||
// Read all sentences with interpreted regex.
|
||||
Console.WriteLine("All Sentences with Interpreted Regex:");
|
||||
sw = Stopwatch.StartNew();
|
||||
Regex intAll = new Regex(pattern, RegexOptions.Singleline);
|
||||
match = intAll.Match(input);
|
||||
int matches = 0;
|
||||
while (match.Success) {
|
||||
matches++;
|
||||
// Do nothing with the match except get the next match.
|
||||
match = match.NextMatch();
|
||||
}
|
||||
sw.Stop();
|
||||
Console.WriteLine(" {0:N0} matches in {1}", matches, sw.Elapsed);
|
||||
|
||||
      // Read all sentences with compiled regex.
|
||||
Console.WriteLine("All Sentences with Compiled Regex:");
|
||||
sw = Stopwatch.StartNew();
|
||||
Regex compAll = new Regex(pattern,
|
||||
RegexOptions.Singleline | RegexOptions.Compiled);
|
||||
match = compAll.Match(input);
|
||||
matches = 0;
|
||||
while (match.Success) {
|
||||
matches++;
|
||||
// Do nothing with the match except get the next match.
|
||||
match = match.NextMatch();
|
||||
}
|
||||
sw.Stop();
|
||||
Console.WriteLine(" {0:N0} matches in {1}", matches, sw.Elapsed);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// 10 Sentences with Interpreted Regex:
|
||||
// 10 matches in 00:00:00.0047491
|
||||
// 10 Sentences with Compiled Regex:
|
||||
// 10 matches in 00:00:00.0141872
|
||||
// All Sentences with Interpreted Regex:
|
||||
// 13,443 matches in 00:00:01.1929928
|
||||
// All Sentences with Compiled Regex:
|
||||
// 13,443 matches in 00:00:00.7635869
|
||||
```
|
||||
|
||||
The regular expression pattern used in the example, `\b(\w+((\r?\n)|,?\s))*\w+[.?:;!]`, is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Begin the match at a word boundary.
|
||||
`\w+` | Match one or more word characters.
|
||||
`((\r?\n)|,?\s)` | Match either zero or one carriage return followed by a newline character, or zero or one comma followed by a white-space character.
|
||||
`(\w+((\r?\n)|,?\s))*` | Match zero or more occurrences of one or more word characters that are followed either by zero or one carriage return and a newline character, or by zero or one comma followed by a white-space character.
|
||||
`\w+` | Match one or more word characters.
|
||||
`[.?:;!]` | Match a period, question mark, colon, semicolon, or exclamation point.
|
||||
|
||||
## Take Charge of Backtracking
|
||||
|
||||
Ordinarily, the regular expression engine uses linear progression to move through an input string and compare it to a regular expression pattern. However, when indeterminate quantifiers such as __*__, **+**, and **?** are used in a regular expression pattern, the regular expression engine may give up a portion of successful partial matches and return to a previously saved state in order to search for a successful match for the entire pattern. This process is known as backtracking.
|
||||
|
||||
Support for backtracking gives regular expressions power and flexibility. It also places the responsibility for controlling the operation of the regular expression engine in the hands of regular expression developers. Because developers are often not aware of this responsibility, their misuse of backtracking or reliance on excessive backtracking often plays the most significant role in degrading regular expression performance. In a worst-case scenario, execution time can double for each additional character in the input string. In fact, by using backtracking excessively, it is easy to create the programmatic equivalent of an endless loop if input nearly matches the regular expression pattern; the regular expression engine may take hours or even days to process a relatively short input string.
|
||||
|
||||
Often, applications pay a performance penalty for using backtracking despite the fact that backtracking is not essential for a match. For example, the regular expression `\b\p{Lu}\w*\b` matches all words that begin with an uppercase character, as the following table shows.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Begin the match at a word boundary.
|
||||
`\p{Lu}` | Match an uppercase character.
|
||||
`\w*` | Match zero or more word characters.
|
||||
`\b` | End the match at a word boundary.
|
||||
|
||||
Because a word boundary is not the same as, or a subset of, a word character, there is no possibility that the regular expression engine will cross a word boundary when matching word characters. This means that for this regular expression, backtracking can never contribute to the overall success of any match -- it can only degrade performance, because the regular expression engine is forced to save its state for each successful preliminary match of a word character.
|
||||
|
||||
If you determine that backtracking is not necessary, you can disable it by using the **(?>**_subexpression_**)** language element. The following example parses an input string by using two regular expressions. The first, `\b\p{Lu}\w*\b`, relies on backtracking. The second, `\b\p{Lu}(?>\w*)\b`, disables backtracking. As the output from the example shows, they both produce the same result.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string input = "This this word Sentence name Capital";
|
||||
string pattern = @"\b\p{Lu}\w*\b";
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine(match.Value);
|
||||
|
||||
Console.WriteLine();
|
||||
|
||||
pattern = @"\b\p{Lu}(?>\w*)\b";
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine(match.Value);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// This
|
||||
// Sentence
|
||||
// Capital
|
||||
//
|
||||
// This
|
||||
// Sentence
|
||||
// Capital
|
||||
```
|
||||
|
||||
In many cases, backtracking is essential for matching a regular expression pattern to input text. However, excessive backtracking can severely degrade performance and create the impression that an application has stopped responding. In particular, this happens when quantifiers are nested and the text that matches the outer subexpression is a subset of the text that matches the inner subexpression.
|
||||
|
||||
> **Warning**
|
||||
>
|
||||
> In addition to avoiding excessive backtracking, you should use the time-out feature to ensure that excessive backtracking does not severely degrade regular expression performance. For more information, see the [Use Time-out Values](#Use-Time-out-Values) section.
|
||||
|
||||
For example, the regular expression pattern `^[0-9A-Z]([-.\w]*[0-9A-Z])*\$$` is intended to match a part number that consists of at least one alphanumeric character. Any additional characters can consist of an alphanumeric character, a hyphen, an underscore, or a period, though the last character must be alphanumeric. A dollar sign terminates the part number. In some cases, this regular expression pattern can exhibit extremely poor performance because quantifiers are nested, and because the subexpression `[0-9A-Z]` is a subset of the subexpression `[-.\w]*`.
|
||||
|
||||
In these cases, you can optimize regular expression performance by removing the nested quantifiers and replacing the outer subexpression with a zero-width lookahead or lookbehind assertion. Lookahead and lookbehind assertions are anchors; they do not move the pointer in the input string, but instead look ahead or behind to check whether a specified condition is met. For example, the part number regular expression can be rewritten as `^[0-9A-Z][-.\w]*(?<=[0-9A-Z])\$$`. This regular expression pattern is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`^` | Begin the match at the beginning of the input string.
|
||||
`[0-9A-Z]` | Match an alphanumeric character. The part number must consist of at least this character.
|
||||
`[-.\w]*` | Match zero or more occurrences of any word character, hyphen, or period.
|
||||
`(?<=[0-9A-Z])` | Look behind the current position to ensure that the previous character is alphanumeric.
|
||||
`\$` | Match a dollar sign.
|
||||
`$` | End the match at the end of the input string.
|
||||
|
||||
The following example illustrates the use of this regular expression to match an array containing possible part numbers.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"^[0-9A-Z][-.\w]*(?<=[0-9A-Z])\$$";
|
||||
string[] partNos = { "A1C$", "A4", "A4$", "A1603D$", "A1603D#" };
|
||||
|
||||
foreach (var input in partNos) {
|
||||
Match match = Regex.Match(input, pattern);
|
||||
if (match.Success)
|
||||
Console.WriteLine(match.Value);
|
||||
else
|
||||
Console.WriteLine("Match not found.");
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// A1C$
|
||||
// Match not found.
|
||||
// A4$
|
||||
// A1603D$
|
||||
// Match not found.
|
||||
```
|
||||
|
||||
The regular expression language in .NET Core includes the following language elements that you can use to eliminate nested quantifiers.
|
||||
|
||||
Language element | Description
|
||||
---------------- | -----------
|
||||
**(?**=_subexpression_**)** | Zero-width positive lookahead. Look ahead of the current position to determine whether *subexpression* matches the input string.
|
||||
**(?!**_subexpression_**)** | Zero-width negative lookahead. Look ahead of the current position to determine whether *subexpression* does not match the input string.
|
||||
**(?<**=_subexpression_**)** | Zero-width positive lookbehind. Look behind the current position to determine whether *subexpression* matches the input string.
|
||||
**(?<!**_subexpression_**)** | Zero-width negative lookbehind. Look behind the current position to determine whether *subexpression* does not match the input string.
|
||||
|
||||
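The four assertions in the preceding table can be compared side by side. The following is a minimal sketch rather than part of the original example set: the `LookaroundSketch` class, the `FirstMatch` helper, and the sample input string are hypothetical names chosen for illustration, and the one-second time-out is an arbitrary value.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaroundSketch
{
    // Hypothetical helper: returns the first match value, or null if there is no match.
    public static string FirstMatch(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern, RegexOptions.None,
                              TimeSpan.FromSeconds(1));
        return m.Success ? m.Value : null;
    }

    public static void Main()
    {
        string input = "price: $42, code: 17A";

        // Positive lookahead: digits that are followed by an uppercase letter.
        Console.WriteLine(FirstMatch(input, @"\d+(?=[A-Z])"));
        // Negative lookahead: digits that are not followed by a digit or uppercase letter.
        Console.WriteLine(FirstMatch(input, @"\d+(?![\dA-Z])"));
        // Positive lookbehind: digits that are preceded by a dollar sign.
        Console.WriteLine(FirstMatch(input, @"(?<=\$)\d+"));
        // Negative lookbehind: digits that are not preceded by a dollar sign or a digit.
        Console.WriteLine(FirstMatch(input, @"(?<![\$\d])\d+"));
    }
}
// The example displays the following output:
//       17
//       42
//       42
//       17
```

Because the assertions are zero-width, none of them adds characters to the matched text; each one only decides whether the digits at a given position qualify.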
## Use Time-out Values
|
||||
|
||||
If your regular expression processes input that nearly matches the regular expression pattern, it can often rely on excessive backtracking, which significantly degrades its performance. In addition to carefully considering your use of backtracking and testing the regular expression against near-matching input, you should always set a time-out value to ensure that the impact of excessive backtracking, if it occurs, is minimized.
|
||||
|
||||
The regular expression time-out interval defines the period of time that the regular expression engine will look for a single match before it times out. The default time-out interval is [Regex.InfiniteMatchTimeout](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_InfiniteMatchTimeout), which means that the regular expression will not time out. You can override this value and define a time-out interval as follows:
|
||||
|
||||
* By providing a time-out value when you instantiate a [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) object by calling the [Regex.Regex(String, RegexOptions, TimeSpan)](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex__ctor_System_String_System_Text_RegularExpressions_RegexOptions_System_TimeSpan_) constructor.
|
||||
|
||||
* By calling a static pattern matching method, such as [Regex.Match(String, String, RegexOptions, TimeSpan)](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Match_System_String_System_String_System_Text_RegularExpressions_RegexOptions_System_TimeSpan_) or [Regex.Replace(String, String, String, RegexOptions, TimeSpan)](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Replace_System_String_System_String_System_String_System_Text_RegularExpressions_RegexOptions_System_TimeSpan_), that includes a *matchTimeout* parameter.
|
||||
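The two approaches listed above can be sketched briefly as follows. This is an illustrative sketch only: the `TimeoutSketch` class and its `TryIsMatch` helper are hypothetical names, the 500-millisecond interval is an arbitrary choice, and returning `null` on a time-out is just one of several reasonable handling strategies.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TimeoutSketch
{
    // Hypothetical helper: returns true or false for a completed match attempt,
    // or null if the attempt exceeded the time-out interval.
    public static bool? TryIsMatch(string input, string pattern, TimeSpan timeout)
    {
        try {
            // Static matching method that includes a matchTimeout parameter.
            return Regex.IsMatch(input, pattern, RegexOptions.None, timeout);
        }
        catch (RegexMatchTimeoutException) {
            return null;
        }
    }

    public static void Main()
    {
        TimeSpan timeout = TimeSpan.FromMilliseconds(500);

        // Approach 1: supply the time-out interval to the Regex constructor.
        Regex rgx = new Regex(@"\b\w+\b", RegexOptions.None, timeout);
        Console.WriteLine("Instance time-out: {0:N0} ms",
                          rgx.MatchTimeout.TotalMilliseconds);

        // Approach 2: supply the time-out interval to a static matching method.
        string pattern = @"^[0-9A-Z][-.\w]*(?<=[0-9A-Z])\$$";
        Console.WriteLine("A1603D$ is a match: {0}",
                          TryIsMatch("A1603D$", pattern, timeout));
    }
}
```

An instance created with a time-out applies that interval to every matching method called on it, whereas the static overloads take the interval on each call.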
|
||||
If you have defined a time-out interval and a match is not found at the end of that interval, the regular expression method throws a [RegexMatchTimeoutException](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexMatchTimeoutException) exception. In your exception handler, you can choose to retry the match with a longer time-out interval, abandon the match attempt and assume that there is no match, or abandon the match attempt and log the exception information for future analysis.
|
||||
|
||||
The following example defines a `GetWordData` method that instantiates a regular expression with a time-out interval of 350 milliseconds to calculate the number of words and average number of characters in a word in a text document. If the matching operation times out, the time-out interval is increased by 350 milliseconds and the [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) object is re-instantiated. If the new time-out interval exceeds 1 second, the method re-throws the exception to the caller.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Collections.Generic;
|
||||
using System.IO;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
RegexUtilities util = new RegexUtilities();
|
||||
string title = "Doyle - The Hound of the Baskervilles.txt";
|
||||
try {
|
||||
var info = util.GetWordData(title);
|
||||
Console.WriteLine("Words: {0:N0}", info.Item1);
|
||||
Console.WriteLine("Average Word Length: {0:N2} characters", info.Item2);
|
||||
}
|
||||
catch (IOException e) {
|
||||
Console.WriteLine("IOException reading file '{0}'", title);
|
||||
Console.WriteLine(e.Message);
|
||||
}
|
||||
catch (RegexMatchTimeoutException e) {
|
||||
Console.WriteLine("The operation timed out after {0:N0} milliseconds",
|
||||
e.MatchTimeout.TotalMilliseconds);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
public class RegexUtilities
|
||||
{
|
||||
public Tuple<int, double> GetWordData(string filename)
|
||||
{
|
||||
const int MAX_TIMEOUT = 1000; // Maximum timeout interval in milliseconds.
|
||||
const int INCREMENT = 350; // Milliseconds increment of timeout.
|
||||
|
||||
List<string> exclusions = new List<string>( new string[] { "a", "an", "the" });
|
||||
int[] wordLengths = new int[29]; // Allocate an array of more than ample size.
|
||||
string input = null;
|
||||
StreamReader sr = null;
|
||||
try {
|
||||
sr = new StreamReader(filename);
|
||||
input = sr.ReadToEnd();
|
||||
}
|
||||
catch (FileNotFoundException e) {
|
||||
string msg = String.Format("Unable to find the file '{0}'", filename);
|
||||
throw new IOException(msg, e);
|
||||
}
|
||||
catch (IOException e) {
|
||||
throw new IOException(e.Message, e);
|
||||
}
|
||||
finally {
|
||||
if (sr != null) sr.Close();
|
||||
}
|
||||
|
||||
int timeoutInterval = INCREMENT;
|
||||
bool init = false;
|
||||
Regex rgx = null;
|
||||
Match m = null;
|
||||
int indexPos = 0;
|
||||
do {
|
||||
try {
|
||||
if (! init) {
|
||||
rgx = new Regex(@"\b\w+\b", RegexOptions.None,
|
||||
TimeSpan.FromMilliseconds(timeoutInterval));
|
||||
m = rgx.Match(input, indexPos);
|
||||
init = true;
|
||||
}
|
||||
else {
|
||||
m = m.NextMatch();
|
||||
}
|
||||
if (m.Success) {
|
||||
if ( !exclusions.Contains(m.Value.ToLower()))
|
||||
wordLengths[m.Value.Length]++;
|
||||
|
||||
indexPos += m.Length + 1;
|
||||
}
|
||||
}
|
||||
catch (RegexMatchTimeoutException e) {
|
||||
if (e.MatchTimeout.TotalMilliseconds < MAX_TIMEOUT) {
|
||||
timeoutInterval += INCREMENT;
|
||||
init = false;
|
||||
}
|
||||
else {
|
||||
// Rethrow the exception.
|
||||
throw;
|
||||
}
|
||||
}
|
||||
} while (m.Success);
|
||||
|
||||
// If regex completed successfully, calculate number of words and average length.
|
||||
int nWords = 0;
|
||||
long totalLength = 0;
|
||||
|
||||
for (int ctr = wordLengths.GetLowerBound(0); ctr <= wordLengths.GetUpperBound(0); ctr++) {
|
||||
nWords += wordLengths[ctr];
|
||||
totalLength += ctr * wordLengths[ctr];
|
||||
}
|
||||
return new Tuple<int, double>(nWords, (double) totalLength / nWords);
|
||||
}
|
||||
}
|
||||
```
|
||||
## Capture Only When Necessary
|
||||
|
||||
Regular expressions in .NET Core support a number of grouping constructs, which let you group a regular expression pattern into one or more subexpressions. The most commonly used grouping constructs in .NET Core regular expression language are **(**_subexpression_**)**, which defines a numbered capturing group, and **(?<**_name_**>**_subexpression_**)**, which defines a named capturing group. Grouping constructs are essential for creating backreferences and for defining a subexpression to which a quantifier is applied.
|
||||
|
||||
However, the use of these language elements has a cost. They cause the [GroupCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.GroupCollection) object returned by the [Match.Groups](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match#System_Text_RegularExpressions_Match_Groups) property to be populated with the most recent unnamed or named captures, and if a single grouping construct has captured multiple substrings in the input string, they also populate the [CaptureCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.CaptureCollection) object returned by the [Group.Captures](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group#System_Text_RegularExpressions_Group_Captures) property of a particular capturing group with multiple [Capture](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Capture) objects.
|
||||
|
||||
Often, grouping constructs are used in a regular expression only so that quantifiers can be applied to them, and the groups captured by these subexpressions are not subsequently used. For example, the regular expression `\b(\w+[;,]?\s?)+[.?!]` is designed to capture an entire sentence. The following table describes the language elements in this regular expression pattern and their effect on the [Match](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match) object's [Match.Groups](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match#System_Text_RegularExpressions_Match_Groups) and [Group.Captures](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group#System_Text_RegularExpressions_Group_Captures) collections.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Begin the match at a word boundary.
|
||||
`\w+` | Match one or more word characters.
|
||||
`[;,]?` | Match zero or one comma or semicolon.
|
||||
`\s?` | Match zero or one white-space character.
|
||||
`(\w+[;,]?\s?)+` | Match one or more occurrences of one or more word characters followed by an optional comma or semicolon followed by an optional white-space character. This defines the first capturing group, which is necessary so that the combination of multiple word characters (that is, a word) followed by an optional punctuation symbol will be repeated until the regular expression engine reaches the end of a sentence.
|
||||
`[.?!]` | Match a period, question mark, or exclamation point.
|
||||
|
||||
As the following example shows, when a match is found, both the [GroupCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.GroupCollection) and [CaptureCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.CaptureCollection) objects are populated with captures from the match. In this case, the capturing group `(\w+[;,]?\s?)` exists so that the **+** quantifier can be applied to it, which enables the regular expression pattern to match each word in a sentence. Otherwise, it would match the last word in a sentence.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string input = "This is one sentence. This is another.";
|
||||
string pattern = @"\b(\w+[;,]?\s?)+[.?!]";
|
||||
|
||||
foreach (Match match in Regex.Matches(input, pattern)) {
|
||||
Console.WriteLine("Match: '{0}' at index {1}.",
|
||||
match.Value, match.Index);
|
||||
int grpCtr = 0;
|
||||
foreach (Group grp in match.Groups) {
|
||||
Console.WriteLine(" Group {0}: '{1}' at index {2}.",
|
||||
grpCtr, grp.Value, grp.Index);
|
||||
int capCtr = 0;
|
||||
foreach (Capture cap in grp.Captures) {
|
||||
Console.WriteLine(" Capture {0}: '{1}' at {2}.",
|
||||
capCtr, cap.Value, cap.Index);
|
||||
capCtr++;
|
||||
}
|
||||
grpCtr++;
|
||||
}
|
||||
Console.WriteLine();
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Match: 'This is one sentence.' at index 0.
|
||||
// Group 0: 'This is one sentence.' at index 0.
|
||||
// Capture 0: 'This is one sentence.' at 0.
|
||||
// Group 1: 'sentence' at index 12.
|
||||
// Capture 0: 'This ' at 0.
|
||||
// Capture 1: 'is ' at 5.
|
||||
// Capture 2: 'one ' at 8.
|
||||
// Capture 3: 'sentence' at 12.
|
||||
//
|
||||
// Match: 'This is another.' at index 22.
|
||||
// Group 0: 'This is another.' at index 22.
|
||||
// Capture 0: 'This is another.' at 22.
|
||||
// Group 1: 'another' at index 30.
|
||||
// Capture 0: 'This ' at 22.
|
||||
// Capture 1: 'is ' at 27.
|
||||
// Capture 2: 'another' at 30.
|
||||
```
|
||||
|
||||
When you use subexpressions only to apply quantifiers to them, and you are not interested in the captured text, you should disable group captures. For example, the **(?:**_subexpression_**)** language element prevents the group to which it applies from capturing matched substrings. In the following example, the regular expression pattern from the previous example is changed to `\b(?:\w+[;,]?\s?)+[.?!]`. As the output shows, it prevents the regular expression engine from populating the [GroupCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.GroupCollection) and [CaptureCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.CaptureCollection) collections.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string input = "This is one sentence. This is another.";
|
||||
string pattern = @"\b(?:\w+[;,]?\s?)+[.?!]";
|
||||
|
||||
foreach (Match match in Regex.Matches(input, pattern)) {
|
||||
Console.WriteLine("Match: '{0}' at index {1}.",
|
||||
match.Value, match.Index);
|
||||
int grpCtr = 0;
|
||||
foreach (Group grp in match.Groups) {
|
||||
Console.WriteLine(" Group {0}: '{1}' at index {2}.",
|
||||
grpCtr, grp.Value, grp.Index);
|
||||
int capCtr = 0;
|
||||
foreach (Capture cap in grp.Captures) {
|
||||
Console.WriteLine(" Capture {0}: '{1}' at {2}.",
|
||||
capCtr, cap.Value, cap.Index);
|
||||
capCtr++;
|
||||
}
|
||||
grpCtr++;
|
||||
}
|
||||
Console.WriteLine();
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Match: 'This is one sentence.' at index 0.
|
||||
// Group 0: 'This is one sentence.' at index 0.
|
||||
// Capture 0: 'This is one sentence.' at 0.
|
||||
//
|
||||
// Match: 'This is another.' at index 22.
|
||||
// Group 0: 'This is another.' at index 22.
|
||||
// Capture 0: 'This is another.' at 22.
|
||||
```
|
||||
|
||||
You can disable captures in one of the following ways:
|
||||
|
||||
* Use the **(?:**_subexpression_**)** language element. This element prevents the capture of matched substrings in the group to which it applies. It does not disable substring captures in any nested groups.
|
||||
|
||||
* Use the [RegexOptions.ExplicitCapture](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexOptions#System_Text_RegularExpressions_RegexOptions_ExplicitCapture) option. It disables all unnamed or implicit captures in the regular expression pattern. When you use this option, only substrings that match named groups defined with the **(?<**_name_**>**_subexpression_**)** language element can be captured. The [ExplicitCapture](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexOptions#System_Text_RegularExpressions_RegexOptions_ExplicitCapture) flag can be passed to the options parameter of a [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) class constructor or to the options parameter of a [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) static matching method.
|
||||
|
||||
* Use the **n** option in the **(?imnsx)** language element. This option disables all unnamed or implicit captures from the point in the regular expression pattern at which the element appears. Captures are disabled either until the end of the pattern or until the **(?-n)** option enables unnamed or implicit captures.
|
||||
|
||||
* Use the **n** option in the **(?imnsx:**_subexpression_**)** language element. This option disables all unnamed or implicit captures in *subexpression*. Captures by any unnamed or implicit nested capturing groups are disabled as well.
|
||||
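The options listed above can be compared by counting the groups that a match reports. The following is a minimal sketch under stated assumptions: the `CaptureSketch` class and `GroupCount` helper are hypothetical names, and the sentence pattern is reused from the earlier example. The count always includes group 0, which represents the entire match.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaptureSketch
{
    // Hypothetical helper: returns the number of groups in the first match,
    // including group 0 (the entire match), or 0 if there is no match.
    public static int GroupCount(string input, string pattern, RegexOptions options)
    {
        Match m = Regex.Match(input, pattern, options, TimeSpan.FromSeconds(1));
        return m.Success ? m.Groups.Count : 0;
    }

    public static void Main()
    {
        string input = "This is one sentence.";

        // Capturing group: group 0 plus one numbered group.
        Console.WriteLine(GroupCount(input, @"\b(\w+[;,]?\s?)+[.?!]", RegexOptions.None));
        // Noncapturing group.
        Console.WriteLine(GroupCount(input, @"\b(?:\w+[;,]?\s?)+[.?!]", RegexOptions.None));
        // RegexOptions.ExplicitCapture option.
        Console.WriteLine(GroupCount(input, @"\b(\w+[;,]?\s?)+[.?!]", RegexOptions.ExplicitCapture));
        // Inline n option that applies from this point in the pattern.
        Console.WriteLine(GroupCount(input, @"(?n)\b(\w+[;,]?\s?)+[.?!]", RegexOptions.None));
        // Inline n option that applies to one subexpression and its nested groups.
        Console.WriteLine(GroupCount(input, @"\b(?n:(\w+[;,]?\s?)+)[.?!]", RegexOptions.None));
    }
}
// The example displays the following output:
//       2
//       1
//       1
//       1
//       1
```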
|
||||
|
||||
|
|
@ -1,98 +0,0 @@
|
|||
---
|
||||
title: Regular Expression Example: Changing Date Formats
|
||||
description: Regular Expression Example: Changing Date Formats
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 3c7cad06-fcd2-4967-b8d7-86bd7fb33d82
|
||||
---
|
||||
|
||||
# Regular Expression Example: Changing Date Formats
|
||||
|
||||
|
||||
The following code example uses the [Regex.Replace](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Replace_System_String_System_String_System_String_System_Text_RegularExpressions_RegexOptions_System_TimeSpan_) method to replace dates that have the form *mm/dd/yy* with dates that have the form *dd-mm-yy*.
|
||||
|
||||
## Example
|
||||
|
||||
```csharp
|
||||
static string MDYToDMY(string input)
|
||||
{
|
||||
try {
|
||||
return Regex.Replace(input,
|
||||
"\\b(?<month>\\d{1,2})/(?<day>\\d{1,2})/(?<year>\\d{2,4})\\b",
|
||||
"${day}-${month}-${year}", RegexOptions.None,
|
||||
TimeSpan.FromMilliseconds(150));
|
||||
}
|
||||
catch (RegexMatchTimeoutException) {
|
||||
return input;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
The following code shows how the `MDYToDMY` method can be called in an application.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Globalization;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Class1
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string dateString = DateTime.Today.ToString("d",
|
||||
DateTimeFormatInfo.InvariantInfo);
|
||||
string resultString = MDYToDMY(dateString);
|
||||
Console.WriteLine("Converted {0} to {1}.", dateString, resultString);
|
||||
}
|
||||
|
||||
static string MDYToDMY(string input)
|
||||
{
|
||||
try {
|
||||
return Regex.Replace(input,
|
||||
"\\b(?<month>\\d{1,2})/(?<day>\\d{1,2})/(?<year>\\d{2,4})\\b",
|
||||
"${day}-${month}-${year}", RegexOptions.None,
|
||||
TimeSpan.FromMilliseconds(150));
|
||||
}
|
||||
catch (RegexMatchTimeoutException) {
|
||||
return input;
|
||||
}
|
||||
}
|
||||
|
||||
}
|
||||
// The example displays the following output to the console if run on 8/21/2007:
|
||||
// Converted 08/21/2007 to 21-08-2007.
|
||||
```
|
||||
|
||||
## Comments
|
||||
|
||||
The regular expression pattern `\b(?<month>\d{1,2})/(?<day>\d{1,2})/(?<year>\d{2,4})\b` is interpreted as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Begin the match at a word boundary.
|
||||
`(?<month>\d{1,2})` | Match one or two decimal digits. This is the `month` captured group.
|
||||
`/` | Match the slash mark.
|
||||
`(?<day>\d{1,2})` | Match one or two decimal digits. This is the `day` captured group.
|
||||
`/` | Match the slash mark.
|
||||
`(?<year>\d{2,4})` | Match from two to four decimal digits. This is the `year` captured group.
|
||||
`\b` | End the match at a word boundary.
|
||||
|
||||
The pattern `${day}-${month}-${year}` defines the replacement string as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`${day}` | Add the string captured by the `day` capturing group.
|
||||
`-` | Add a hyphen.
|
||||
`${month}` | Add the string captured by the `month` capturing group.
|
||||
`-` | Add a hyphen.
|
||||
`${year}` | Add the string captured by the `year` capturing group.
|
||||
|
||||
## See Also
|
||||
|
||||
[Regular Expression Examples](index.md)
|
|
@ -1,65 +0,0 @@
|
|||
---
|
||||
title: How to: Extract a Protocol and Port Number from a URL
|
||||
description: How to: Extract a Protocol and Port Number from a URL
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: a047da6e-f200-413d-9613-797c57986c18
|
||||
---
|
||||
|
||||
# How to: Extract a Protocol and Port Number from a URL
|
||||
|
||||
The following example extracts a protocol and port number from a URL.
|
||||
|
||||
## Example
|
||||
|
||||
The example uses the [Match.Result](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match#System_Text_RegularExpressions_Match_Result_System_String_) method to return the protocol followed by a colon followed by the port number.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string url = "http://www.contoso.com:8080/letters/readme";
|
||||
|
||||
Regex r = new Regex(@"^(?<proto>\w+)://[^/]+?(?<port>:\d+)?/",
|
||||
RegexOptions.None, TimeSpan.FromMilliseconds(150));
|
||||
Match m = r.Match(url);
|
||||
if (m.Success)
|
||||
Console.WriteLine(m.Result("${proto}${port}"));
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// http:8080
|
||||
```
|
||||
|
||||
The regular expression pattern `^(?<proto>\w+)://[^/]+?(?<port>:\d+)?/` can be interpreted as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`^` | Begin the match at the start of the string.
|
||||
`(?<proto>\w+)` | Match one or more word characters. Name this group proto.
|
||||
`://` | Match a colon followed by two slash marks.
|
||||
`[^/]+?` | Match one or more occurrences (but as few as possible) of any character other than a slash mark.
|
||||
`(?<port>:\d+)?` | Match zero or one occurrence of a colon followed by one or more digit characters. Name this group port.
|
||||
`/` | Match a slash mark.
|
||||
|
||||
The [Match.Result](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match#System_Text_RegularExpressions_Match_Result_System_String_) method expands the `${proto}${port}` replacement sequence, which concatenates the value of the two named groups captured in the regular expression pattern. It is a convenient alternative to explicitly concatenating the strings retrieved from the collection object returned by the [Match.Groups](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match#System_Text_RegularExpressions_Match_Groups) property.
|
||||
|
||||
The example uses the [Match.Result](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match#System_Text_RegularExpressions_Match_Result_System_String_) method with two substitutions, `${proto}` and `${port}`, to include the captured groups in the output string. You can retrieve the captured groups from the match's [GroupCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.GroupCollection) object instead, as the following code shows.
|
||||
|
||||
```csharp
|
||||
Console.WriteLine(m.Groups["proto"].Value + m.Groups["port"].Value);
|
||||
```
|
||||
|
||||
## See Also
|
||||
|
||||
[Regular Expression Examples](index.md)
|
|
@ -1,36 +0,0 @@
|
|||
---
|
||||
title: Regular Expression Examples
|
||||
description: Regular Expression Examples
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: e5ed6aad-401b-49c8-9624-193fea2d213f
|
||||
---
|
||||
|
||||
# Regular Expression Examples
|
||||
|
||||
This section contains code examples that illustrate the use of regular expressions in common applications.
|
||||
|
||||
## In This Section
|
||||
|
||||
[Regular Expression Example: Scanning for HREFs](scanning.md) - Provides an example that searches an input string and prints out all the href="…" values and their locations in the string.
|
||||
|
||||
[Regular Expression Example: Changing Date Formats](changingformats.md) - Provides an example that replaces dates in the form mm/dd/yy with dates in the form dd-mm-yy.
|
||||
|
||||
[How to: Extract a Protocol and Port Number from a URL](extractprotocol.md) - Provides an example that extracts a protocol and port number from a string that contains a URL. For example, "http://www.contoso.com:8080/letters/readme.html" returns "http:8080".
|
||||
|
||||
[How to: Strip Invalid Characters from a String](stripcharacters.md) - Provides an example that strips invalid non-alphanumeric characters from a string.
|
||||
|
||||
[How to: Verify that Strings Are in Valid Email Format](verifyformat.md) - Provides an example that you can use to verify that a string is in valid email format.
|
||||
|
||||
## Reference
|
||||
|
||||
[System.Text.RegularExpressions](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions) - Provides class library reference information for the .NET Core `System.Text.RegularExpressions` namespace.
|
||||
|
||||
|
||||
|
|
@ -1,92 +0,0 @@
|
|||
---
|
||||
title: Regular Expression Example: Scanning for HREFs
|
||||
description: Regular Expression Example: Scanning for HREFs
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 8704e75e-4a35-4c54-bf06-ed4d8e472b36
|
||||
---
|
||||
|
||||
# Regular Expression Example: Scanning for HREFs
|
||||
|
||||
The following example searches an input string and displays all the href="…" values and their locations in the string.
|
||||
|
||||
## The Regex Object
|
||||
|
||||
Because the `DumpHRefs` method can be called multiple times from user code, it uses the `static` [Regex.Match(String, String, RegexOptions, TimeSpan)](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Match_System_String_System_String_System_Text_RegularExpressions_RegexOptions_System_TimeSpan_) method. This enables the regular expression engine to cache the regular expression and avoids the overhead of instantiating a new [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) object each time the method is called. A [Match](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match) object is then used to iterate through all matches in the string.
|
||||
|
||||
```csharp
|
||||
private static void DumpHRefs(string inputString)
|
||||
{
|
||||
Match m;
|
||||
string HRefPattern = "href\\s*=\\s*(?:[\"'](?<1>[^\"']*)[\"']|(?<1>\\S+))";
|
||||
|
||||
try {
|
||||
m = Regex.Match(inputString, HRefPattern,
|
||||
RegexOptions.IgnoreCase | RegexOptions.Compiled,
|
||||
TimeSpan.FromSeconds(1));
|
||||
while (m.Success)
|
||||
{
|
||||
Console.WriteLine("Found href " + m.Groups[1] + " at "
|
||||
+ m.Groups[1].Index);
|
||||
m = m.NextMatch();
|
||||
}
|
||||
}
|
||||
catch (RegexMatchTimeoutException) {
|
||||
Console.WriteLine("The matching operation timed out.");
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
The following example then illustrates a call to the `DumpHRefs` method.
|
||||
|
||||
```csharp
|
||||
public static void Main()
|
||||
{
|
||||
string inputString = "My favorite web sites include:</P>" +
|
||||
"<A HREF=\"http://msdn2.microsoft.com\">" +
|
||||
"MSDN Home Page</A></P>" +
|
||||
"<A HREF=\"http://www.microsoft.com\">" +
|
||||
"Microsoft Corporation Home Page</A></P>" +
|
||||
"<A HREF=\"http://blogs.msdn.com/bclteam\">" +
|
||||
".NET Base Class Library blog</A></P>";
|
||||
DumpHRefs(inputString);
|
||||
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Found href http://msdn2.microsoft.com at 43
|
||||
// Found href http://www.microsoft.com at 102
|
||||
// Found href http://blogs.msdn.com/bclteam at 176
|
||||
```
|
||||
|
||||
The regular expression pattern `href\s*=\s*(?:["'](?<1>[^"']*)["']|(?<1>\S+))` is interpreted as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`href` | Match the literal string "href". The match is case-insensitive.
|
||||
`\s*` | Match zero or more white-space characters.
|
||||
`=` | Match the equals sign.
|
||||
`\s*` | Match zero or more white-space characters.
|
||||
`(?:["'](?<1>[^"']*)["']|(?<1>\S+))` | Match one of the following without assigning the result to a captured group: A quotation mark or apostrophe, followed by zero or more occurrences of any character other than a quotation mark or apostrophe, followed by a quotation mark or apostrophe. The group named `1` is included in this pattern. -or- One or more non-white-space characters. The group named `1` is included in this pattern.
|
||||
`(?<1>[^"']*)` | Assign zero or more occurrences of any character other than a quotation mark or apostrophe to the capturing group named `1`.
|
||||
`(?<1>\S+)` | Assign one or more non-white-space characters to the capturing group named `1`.
|
||||
|
||||
## Match Result Class
|
||||
|
||||
The results of a search are stored in the [Match](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match) class, which provides access to all the substrings extracted by the search. It also remembers the string being searched and the regular expression being used, so it can call the [Match.NextMatch](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match#System_Text_RegularExpressions_Match_NextMatch) method to perform another search starting where the last one ended.
|
||||
|
||||
## Explicitly Named Captures
|
||||
|
||||
In traditional regular expressions, capturing parentheses are automatically numbered sequentially. This leads to two problems. First, if a regular expression is modified by inserting or removing a set of parentheses, all code that refers to the numbered captures must be rewritten to reflect the new numbering. Second, because different sets of parentheses often are used to provide two alternative expressions for an acceptable match, it might be difficult to determine which of the two expressions actually returned a result.
|
||||
|
||||
To address these problems, the [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) class supports the syntax `(?<name>…)` for capturing a match into a specified slot (the slot can be named using a string or an integer; integers can be recalled more quickly). Thus, alternative matches for the same string all can be directed to the same place. In case of a conflict, the last match dropped into a slot is the successful match. (However, a complete list of multiple matches for a single slot is available. See the [Group.Captures](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group#System_Text_RegularExpressions_Group_Captures) collection for details.)
|
||||
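A short sketch can make the shared slot concrete. It is illustrative only: the `NamedSlotSketch` class, its two helper methods, the `item` group name, and the sample inputs are hypothetical, while the href pattern is the one used by `DumpHRefs`. Both alternatives write to slot `1`, so the calling code reads the same group whether or not the value was quoted.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedSlotSketch
{
    // Hypothetical helper: returns the href value from slot 1, or null if none is found.
    public static string HrefValue(string input)
    {
        Match m = Regex.Match(input,
                     "href\\s*=\\s*(?:[\"'](?<1>[^\"']*)[\"']|(?<1>\\S+))",
                     RegexOptions.IgnoreCase, TimeSpan.FromSeconds(1));
        return m.Success ? m.Groups[1].Value : null;
    }

    // Hypothetical helper: counts every substring captured by the "item" slot.
    public static int ItemCaptureCount(string input)
    {
        Match m = Regex.Match(input, @"(?:(?<item>\w+),?)+",
                              RegexOptions.None, TimeSpan.FromSeconds(1));
        return m.Groups["item"].Captures.Count;
    }

    public static void Main()
    {
        // The quoted alternative fills slot 1.
        Console.WriteLine(HrefValue("<a href=\"http://www.contoso.com\">"));
        // The unquoted alternative fills the same slot.
        Console.WriteLine(HrefValue("<a href=index.html >"));
        // The slot holds the last capture, but all captures remain available.
        Console.WriteLine(ItemCaptureCount("a,b,c"));
    }
}
// The example displays the following output:
//       http://www.contoso.com
//       index.html
//       3
```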
|
||||
## See Also
|
||||
|
||||
[Regular Expression Examples](index.md)
|
||||
|
|
@ -1,50 +0,0 @@
|
|||
---
|
||||
title: How to: Strip Invalid Characters from a String
|
||||
description: How to: Strip Invalid Characters from a String
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 96a2cd0b-4f86-41a1-893b-460d31933fa6
|
||||
---
|
||||
|
||||
# How to: Strip Invalid Characters from a String
|
||||
|
||||
|
||||
The following example uses the static [Regex.Replace](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Replace_System_String_System_String_System_String_System_Text_RegularExpressions_RegexOptions_System_TimeSpan_) method to strip invalid characters from a string.
|
||||
|
||||
## Example
|
||||
|
||||
You can use the `CleanInput` method defined in this example to strip potentially harmful characters that have been entered into a text field that accepts user input. In this case, `CleanInput` strips out all nonalphanumeric characters except periods (.), at symbols (@), and hyphens (-), and returns the remaining string. However, you can modify the regular expression pattern so that it strips out any characters that should not be included in an input string.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
static string CleanInput(string strIn)
|
||||
{
|
||||
// Replace invalid characters with empty strings.
|
||||
try {
|
||||
return Regex.Replace(strIn, @"[^\w\.@-]", "",
|
||||
RegexOptions.None, TimeSpan.FromSeconds(1.5));
|
||||
}
|
||||
// If we timeout when replacing invalid characters,
|
||||
// we should return Empty.
|
||||
catch (RegexMatchTimeoutException) {
|
||||
return String.Empty;
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
The regular expression pattern `[^\w\.@-]` matches any character that is not a word character, a period, an @ symbol, or a hyphen. A word character is any letter, decimal digit, or punctuation connector such as an underscore. Any character that matches this pattern is replaced by [String.Empty](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Empty), which is the string defined by the replacement pattern. To allow additional characters in user input, add those characters to the character class in the regular expression pattern. For example, the regular expression pattern `[^\w\.@\\%-]` also allows a percentage symbol and a backslash in an input string. (Keep the hyphen last in the character class so that it is read as a literal hyphen rather than as a range operator.)
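
As an illustration of extending the character class, the sketch below allows a percentage symbol and a backslash in addition to the characters that `CleanInput` keeps. The `CleanInputExtended` name and the sample input are hypothetical. The hyphen is placed last in the class so that it is treated as a literal character and not as a range operator.

```csharp
using System;
using System.Text.RegularExpressions;

public class CleanInputSketch
{
    // Same approach as CleanInput, with '%' and '\' added to the allowed set.
    public static string CleanInputExtended(string strIn)
    {
        try {
            return Regex.Replace(strIn, @"[^\w\.@\\%-]", "",
                                 RegexOptions.None, TimeSpan.FromSeconds(1.5));
        }
        // Mirror the original example: return Empty if the replacement times out.
        catch (RegexMatchTimeoutException) {
            return String.Empty;
        }
    }

    public static void Main()
    {
        // Spaces, '!', '<', ':' and '>' are stripped; '%', '\', '@' and '.' survive.
        Console.WriteLine(CleanInputExtended(@"50% off! <c:\temp> user@example.com"));
    }
}
```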
|
||||
|
||||
## See Also
|
||||
|
||||
[Regular Expression Examples](index.md)
|
|
@ -1,155 +0,0 @@
|
|||
---
|
||||
title: How to: Verify that Strings Are in Valid Email Format
|
||||
description: How to: Verify that Strings Are in Valid Email Format
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: c3361216-4a76-42ce-80fb-0f8c5c325f7b
|
||||
---
|
||||
|
||||
# How to: Verify that Strings Are in Valid Email Format
|
||||
|
||||
The following example uses a regular expression to verify that a string is in valid email format.
|
||||
|
||||
## Example
|
||||
|
||||
The example defines an `IsValidEmail` method, which returns `true` if the string contains a valid email address and `false` if it does not, but takes no other action.
|
||||
|
||||
To verify that the email address is valid, the `IsValidEmail` method calls the [Regex.Replace(String, String, MatchEvaluator)]( https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Replace_System_String_System_String_System_Text_RegularExpressions_MatchEvaluator_) method with the `(@)(.+)$` regular expression pattern to separate the domain name from the email address. The third parameter is a [MatchEvaluator]( https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.MatchEvaluator) delegate that represents the method that processes and replaces the matched text. The regular expression pattern is interpreted as follows.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`(@)` | Match the @ character. This is the first capturing group.
|
||||
`(.+)` | Match one or more occurrences of any character. This is the second capturing group.
|
||||
`$` | End the match at the end of the string.
|
||||
|
||||
The domain name along with the @ character is passed to the `DomainMapper` method, which uses the [IdnMapping]( https://docs.microsoft.com/dotnet/core/api/System.Globalization.IdnMapping) class to translate Unicode characters that are outside the US-ASCII character range to Punycode. The method also sets the `invalid` flag to `true` if the [IdnMapping.GetAscii]( https://docs.microsoft.com/dotnet/core/api/System.Globalization.IdnMapping#System_Globalization_IdnMapping_GetAscii_System_String_) method detects any invalid characters in the domain name. The method returns the Punycode domain name preceded by the @ symbol to the `IsValidEmail` method.
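
To see the Punycode conversion in isolation, the following sketch calls `IdnMapping.GetAscii` outside of a match evaluator. The `TryToAscii` helper and the class name are hypothetical, and the exact set of names that the mapping rejects may vary by platform, so treat this as an illustration and not as a specification.

```csharp
using System;
using System.Globalization;

public class IdnSketch
{
    // Hypothetical helper: converts a domain name to its ASCII (Punycode) form.
    // Returns false if IdnMapping rejects the name.
    public static bool TryToAscii(string domain, out string ascii)
    {
        try {
            ascii = new IdnMapping().GetAscii(domain);
            return true;
        }
        catch (ArgumentException) {
            ascii = String.Empty;
            return false;
        }
    }

    public static void Main()
    {
        string[] domains = { "proseware.com", "contoso.中国" };
        foreach (string domain in domains) {
            string ascii;
            if (TryToAscii(domain, out ascii))
                Console.WriteLine("{0} -> {1}", domain, ascii);
            else
                Console.WriteLine("{0} is not a valid domain name", domain);
        }
    }
}
```

A name that is already ASCII passes through unchanged, while labels that contain non-ASCII characters come back with the `xn--` prefix.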
|
||||
|
||||
The `IsValidEmail` method then calls the [Regex.IsMatch(String, String)]( https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_IsMatch_System_String_System_String_) method to verify that the address conforms to a regular expression pattern.
|
||||
|
||||
Note that the `IsValidEmail` method does not perform authentication to validate the email address. It merely determines whether its format is valid for an email address. In addition, the `IsValidEmail` method does not verify that the top-level domain name is a valid domain name listed at the [IANA Root Zone Database](https://www.iana.org/domains/root/db), which would require a look-up operation. Instead, the regular expression merely verifies that the top-level domain name consists of between two and twenty-four ASCII characters, with alphanumeric first and last characters and the remaining characters being either alphanumeric or a hyphen (-).
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Globalization;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class RegexUtilities
|
||||
{
|
||||
bool invalid = false;
|
||||
|
||||
public bool IsValidEmail(string strIn)
|
||||
{
|
||||
invalid = false;
|
||||
if (String.IsNullOrEmpty(strIn))
|
||||
return false;
|
||||
|
||||
// Use IdnMapping class to convert Unicode domain names.
|
||||
try {
|
||||
strIn = Regex.Replace(strIn, @"(@)(.+)$", this.DomainMapper,
|
||||
RegexOptions.None, TimeSpan.FromMilliseconds(200));
|
||||
}
|
||||
catch (RegexMatchTimeoutException) {
|
||||
return false;
|
||||
}
|
||||
|
||||
if (invalid)
|
||||
return false;
|
||||
|
||||
// Return true if strIn is in valid e-mail format.
|
||||
try {
|
||||
return Regex.IsMatch(strIn,
|
||||
@"^(?("")("".+?(?<!\\)""@)|(([0-9a-z]((\.(?!\.))|[-!#\$%&'\*\+/=\?\^`\{\}\|~\w])*)(?<=[0-9a-z])@))" +
|
||||
@"(?(\[)(\[(\d{1,3}\.){3}\d{1,3}\])|(([0-9a-z][-\w]*[0-9a-z]*\.)+[a-z0-9][\-a-z0-9]{0,22}[a-z0-9]))$",
|
||||
RegexOptions.IgnoreCase, TimeSpan.FromMilliseconds(250));
|
||||
}
|
||||
catch (RegexMatchTimeoutException) {
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
private string DomainMapper(Match match)
|
||||
{
|
||||
// IdnMapping class with default property values.
|
||||
IdnMapping idn = new IdnMapping();
|
||||
|
||||
string domainName = match.Groups[2].Value;
|
||||
try {
|
||||
domainName = idn.GetAscii(domainName);
|
||||
}
|
||||
catch (ArgumentException) {
|
||||
invalid = true;
|
||||
}
|
||||
return match.Groups[1].Value + domainName;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
In this example, the regular expression pattern ``^(?(")(".+?(?<!\\)"@)|(([0-9a-z]((\.(?!\.))|[-!#\$%&'\*\+/=\?\^`\{\}\|~\w])*)(?<=[0-9a-z])@))(?(\[)(\[(\d{1,3}\.){3}\d{1,3}\])|(([0-9a-z][-\w]*[0-9a-z]*\.)+[a-z0-9][\-a-z0-9]{0,22}[a-z0-9]))$`` is interpreted as shown in the following table. Note that the regular expression is compiled using the [RegexOptions.IgnoreCase]( https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexOptions#System_Text_RegularExpressions_RegexOptions_IgnoreCase) flag.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`^` | Begin the match at the start of the string.
|
||||
`(?(")` | Determine whether the first character is a quotation mark. `(?(")` is the beginning of an alternation construct.
|
||||
`(?(")(".+?(?<!\\)"@)` | If the first character is a quotation mark, match a beginning quotation mark followed by at least one occurrence of any character, followed by an ending quotation mark. The ending quotation mark must not be preceded by a backslash character (`\`). `(?<!` is the beginning of a zero-width negative lookbehind assertion. The string should conclude with an at sign (@).
|
||||
`|(([0-9a-z]` | If the first character is not a quotation mark, match any alphabetic character from a to z or A to Z (the comparison is case insensitive), or any numeric character from 0 to 9.
|
||||
`(\.(?!\.))` | If the next character is a period, match it. If it is not a period, look ahead to the next character and continue the match. `(?!\.)` is a zero-width negative lookahead assertion that prevents two consecutive periods from appearing in the local part of an email address.
|
||||
``|[-!#\$%&'\*\+/=\?\^`\{\}\|~\w]`` | If the next character is not a period, match any word character or one of the following characters: ``-!#$%&'*+/=?^`{}|~``.
|
||||
``((\.(?!\.))|[-!#\$%&'\*\+/=\?\^`\{\}\|~\w])*`` | Match the alternation pattern (a period followed by a non-period, or one of a number of characters) zero or more times.
|
||||
`@` | Match the @ character.
|
||||
`(?<=[0-9a-z])` | Continue the match if the character that precedes the @ character is A through Z, a through z, or 0 through 9. The `(?<=[0-9a-z])` construct defines a zero-width positive lookbehind assertion.
|
||||
`(?(\[)` | Check whether the character that follows @ is an opening bracket.
|
||||
`(\[(\d{1,3}\.){3}\d{1,3}\])` | If it is an opening bracket, match the opening bracket followed by an IP address (four sets of one to three digits, with each set separated by a period) and a closing bracket.
|
||||
`|(([0-9a-z][-\w]*[0-9a-z]*\.)+` | If the character that follows @ is not an opening bracket, match one alphanumeric character with a value of A-Z, a-z, or 0-9, followed by zero or more occurrences of a word character or a hyphen, followed by zero or more alphanumeric characters with a value of A-Z, a-z, or 0-9, followed by a period. This pattern can be repeated one or more times, and must be followed by the top-level domain name.
|
||||
`[a-z0-9][\-a-z0-9]{0,22}[a-z0-9]))` | The top-level domain name must begin and end with an alphanumeric character (a-z, A-Z, and 0-9). It can also include from zero to 22 ASCII characters that are either alphanumeric or hyphens.
|
||||
`$` | End the match at the end of the string.
|
||||
|
||||
You can call the `IsValidEmail` method, which in turn invokes the private `DomainMapper` method, by using code such as the following:
|
||||
|
||||
```csharp
|
||||
public class Application
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
RegexUtilities util = new RegexUtilities();
|
||||
string[] emailAddresses = { "david.jones@proseware.com", "d.j@server1.proseware.com",
|
||||
"jones@ms1.proseware.com", "j.@server1.proseware.com",
|
||||
"j@proseware.com9", "js#internal@proseware.com",
|
||||
"j_9@[129.126.118.1]", "j..s@proseware.com",
|
||||
"js*@proseware.com", "js@proseware..com",
|
||||
"js@proseware.com9", "j.s@server1.proseware.com",
|
||||
"\"j\\\"s\\\"\"@proseware.com", "js@contoso.中国" };
|
||||
|
||||
foreach (var emailAddress in emailAddresses) {
|
||||
if (util.IsValidEmail(emailAddress))
|
||||
Console.WriteLine("Valid: {0}", emailAddress);
|
||||
else
|
||||
Console.WriteLine("Invalid: {0}", emailAddress);
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Valid: david.jones@proseware.com
|
||||
// Valid: d.j@server1.proseware.com
|
||||
// Valid: jones@ms1.proseware.com
|
||||
// Invalid: j.@server1.proseware.com
|
||||
// Valid: j@proseware.com9
|
||||
// Valid: js#internal@proseware.com
|
||||
// Valid: j_9@[129.126.118.1]
|
||||
// Invalid: j..s@proseware.com
|
||||
// Invalid: js*@proseware.com
|
||||
// Invalid: js@proseware..com
|
||||
// Valid: js@proseware.com9
|
||||
// Valid: j.s@server1.proseware.com
|
||||
// Valid: "j\"s\""@proseware.com
|
||||
// Valid: js@contoso.中国
|
||||
```
|
||||
|
||||
## See Also
|
||||
|
||||
[Regular Expression Examples](index.md)
|
|
@ -1,215 +0,0 @@
|
|||
---
|
||||
title: Regular Expressions in .NET Core
|
||||
description: Regular Expressions in .NET Core
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: d60198a1-78ed-4862-981f-d7dcc2713a0d
|
||||
---
|
||||
|
||||
# Regular Expressions in .NET Core
|
||||
|
||||
Regular expressions provide a powerful, flexible, and efficient method for processing text. The extensive pattern-matching notation of regular expressions enables you to quickly parse large amounts of text to find specific character patterns; to validate text to ensure that it matches a predefined pattern (such as an e-mail address); to extract, edit, replace, or delete text substrings; and to add the extracted strings to a collection in order to generate a report. For many applications that deal with strings or that parse large blocks of text, regular expressions are an indispensable tool.
|
||||
|
||||
## How Regular Expressions Work
|
||||
|
||||
The centerpiece of text processing with regular expressions is the regular expression engine, which is represented by the [System.Text.RegularExpressions.Regex]( https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) object in .NET Core. At a minimum, processing text using regular expressions requires that the regular expression engine be provided with the following two items of information:
|
||||
|
||||
* The regular expression pattern to identify in the text.
|
||||
|
||||
In .NET Core, regular expression patterns are defined by a special syntax or language, which is compatible with Perl 5 regular expressions and adds some additional features such as right-to-left matching.
|
||||
|
||||
* The text to parse for the regular expression pattern.
|
||||
|
||||
The methods of the [Regex]( https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) class let you perform the following operations:
|
||||
|
||||
* Determine whether the regular expression pattern occurs in the input text by calling the [Regex.IsMatch]( https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_IsMatch_System_String_) method.
|
||||
|
||||
* Retrieve one or all occurrences of text that matches the regular expression pattern by calling the [Regex.Match]( https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Match_System_String_) or [Regex.Matches]( https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Matches_System_String_) method. The former method returns a [System.Text.RegularExpressions.Match]( https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match) object that provides information about the matching text. The latter returns a [MatchCollection]( https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.MatchCollection) object that contains one [System.Text.RegularExpressions.Match]( https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match) object for each match found in the parsed text.
|
||||
|
||||
* Replace text that matches the regular expression pattern by calling the [Regex.Replace]( https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Replace_System_String_System_String_) method.
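
The operations listed above can be sketched with the static overloads as follows. The input string and the `\d+` pattern are illustrative assumptions and do not come from the original topic.

```csharp
using System;
using System.Text.RegularExpressions;

public class RegexOperationsSketch
{
    public static void Main()
    {
        string input = "Order 66 shipped; order 7 pending.";
        string pattern = @"\d+";   // one or more decimal digits

        // Does the pattern occur anywhere in the input?
        Console.WriteLine(Regex.IsMatch(input, pattern));

        // Retrieve the first occurrence only.
        Console.WriteLine(Regex.Match(input, pattern).Value);

        // Retrieve every occurrence.
        foreach (Match m in Regex.Matches(input, pattern))
            Console.WriteLine("{0} at index {1}", m.Value, m.Index);

        // Replace every occurrence.
        Console.WriteLine(Regex.Replace(input, pattern, "#"));
    }
}
```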
|
||||
|
||||
For an overview of the regular expression object model, see [The Regular Expression Object Model](objectmodel.md).
|
||||
|
||||
## Regular Expression Examples
|
||||
|
||||
The [String]( https://docs.microsoft.com/dotnet/core/api/System.String) class includes a number of string search and replacement methods that you can use when you want to locate literal strings in a larger string. Regular expressions are most useful either when you want to locate one of several substrings in a larger string, or when you want to identify patterns in a string, as the following examples illustrate.
|
||||
|
||||
### Example 1: Replacing Substrings
|
||||
|
||||
Assume that a mailing list contains names that sometimes include a title (Mr., Mrs., Miss, or Ms.) along with a first and last name. If you do not want to include the titles when you generate envelope labels from the list, you can use a regular expression to remove the titles, as the following example illustrates.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = "(Mr\\.? |Mrs\\.? |Miss |Ms\\.? )";
|
||||
string[] names = { "Mr. Henry Hunt", "Ms. Sara Samuels",
|
||||
"Abraham Adams", "Ms. Nicole Norris" };
|
||||
foreach (string name in names)
|
||||
Console.WriteLine(Regex.Replace(name, pattern, String.Empty));
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Henry Hunt
|
||||
// Sara Samuels
|
||||
// Abraham Adams
|
||||
// Nicole Norris
|
||||
```
|
||||
|
||||
The regular expression pattern `(Mr\.? |Mrs\.? |Miss |Ms\.? )` matches any occurrence of "Mr ", "Mr. ", "Mrs ", "Mrs. ", "Miss ", "Ms ", or "Ms. ". The call to the [Regex.Replace]( https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Replace_System_String_System_String_) method replaces the matched string with [String.Empty]( https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Empty); in other words, it removes it from the original string.
|
||||
|
||||
### Example 2: Identifying Duplicated Words
|
||||
|
||||
Accidentally duplicating words is a common error that writers make. A regular expression can be used to identify duplicated words, as the following example shows.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Class1
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"\b(\w+?)\s\1\b";
|
||||
string input = "This this is a nice day. What about this? This tastes good. I saw a a dog.";
|
||||
foreach (Match match in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
|
||||
Console.WriteLine("{0} (duplicates '{1}') at position {2}",
|
||||
match.Value, match.Groups[1].Value, match.Index);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// This this (duplicates 'This') at position 0
|
||||
// a a (duplicates 'a') at position 66
|
||||
```
|
||||
|
||||
The regular expression pattern `\b(\w+?)\s\1\b` can be interpreted as follows:
|
||||
|
||||
Syntax | Meaning
|
||||
------ | -------
|
||||
`\b` | Start at a word boundary.
|
||||
`(\w+?)` | Match one or more word characters, but as few characters as possible. Together, they form a group that can be referred to as `\1`.
|
||||
`\s` | Match a white-space character.
|
||||
`\1` | Match the substring that is equal to the group named `\1`.
|
||||
`\b` | Match a word boundary.
|
||||
|
||||
The [Regex.Matches]( https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Matches_System_String_System_String_System_Text_RegularExpressions_RegexOptions_) method is called with regular expression options set to [RegexOptions.IgnoreCase]( https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexOptions#System_Text_RegularExpressions_RegexOptions_IgnoreCase). Therefore, the match operation is case-insensitive, and the example identifies the substring "This this" as a duplication.
|
||||
|
||||
Note that the input string includes the substring "this? This". However, because of the intervening punctuation mark, it is not identified as a duplication.
|
||||
|
||||
### Example 3: Dynamically Building a Culture-Sensitive Regular Expression
|
||||
|
||||
The following example illustrates the power of regular expressions combined with the flexibility offered by .NET Core's globalization features. It uses the [NumberFormatInfo]( https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object to determine the format of currency values in the system's current culture. It then uses that information to dynamically construct a regular expression that extracts currency values from the text. For each match, it extracts the subgroup that contains the numeric string only, converts it to a [Decimal]( https://docs.microsoft.com/dotnet/core/api/System.Decimal) value, and calculates a running total.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Collections.Generic;
|
||||
using System.Globalization;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
// Define text to be parsed.
|
||||
string input = "Office expenses on 2/13/2008:\n" +
|
||||
"Paper (500 sheets) $3.95\n" +
|
||||
"Pencils (box of 10) $1.00\n" +
|
||||
"Pens (box of 10) $4.49\n" +
|
||||
"Erasers $2.19\n" +
|
||||
"Ink jet printer $69.95\n\n" +
|
||||
"Total Expenses $ 81.58\n";
|
||||
|
||||
// Get current culture's NumberFormatInfo object.
|
||||
NumberFormatInfo nfi = CultureInfo.CurrentCulture.NumberFormat;
|
||||
// Assign needed property values to variables.
|
||||
string currencySymbol = nfi.CurrencySymbol;
|
||||
bool symbolPrecedesIfPositive = nfi.CurrencyPositivePattern % 2 == 0;
|
||||
string groupSeparator = nfi.CurrencyGroupSeparator;
|
||||
string decimalSeparator = nfi.CurrencyDecimalSeparator;
|
||||
|
||||
// Form regular expression pattern.
|
||||
string pattern = Regex.Escape( symbolPrecedesIfPositive ? currencySymbol : "") +
|
||||
@"\s*[-+]?" + "([0-9]{0,3}(" + Regex.Escape(groupSeparator) + "[0-9]{3})*(" +
|
||||
Regex.Escape(decimalSeparator) + "[0-9]+)?)" +
|
||||
Regex.Escape(!symbolPrecedesIfPositive ? currencySymbol : "");
|
||||
Console.WriteLine( "The regular expression pattern is:");
|
||||
Console.WriteLine(" " + pattern);
|
||||
|
||||
// Get text that matches regular expression pattern.
|
||||
MatchCollection matches = Regex.Matches(input, pattern,
|
||||
RegexOptions.IgnorePatternWhitespace);
|
||||
Console.WriteLine("Found {0} matches.", matches.Count);
|
||||
|
||||
// Get numeric string, convert it to a value, and add it to List object.
|
||||
List<decimal> expenses = new List<Decimal>();
|
||||
|
||||
foreach (Match match in matches)
|
||||
expenses.Add(Decimal.Parse(match.Groups[1].Value));
|
||||
|
||||
// Determine whether total is present and if present, whether it is correct.
|
||||
decimal total = 0;
|
||||
foreach (decimal value in expenses)
|
||||
total += value;
|
||||
|
||||
if (total / 2 == expenses[expenses.Count - 1])
|
||||
Console.WriteLine("The expenses total {0:C2}.", expenses[expenses.Count - 1]);
|
||||
else
|
||||
Console.WriteLine("The expenses total {0:C2}.", total);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// The regular expression pattern is:
|
||||
// \$\s*[-+]?([0-9]{0,3}(,[0-9]{3})*(\.[0-9]+)?)
|
||||
// Found 6 matches.
|
||||
// The expenses total $81.58.
|
||||
```
|
||||
|
||||
On a computer whose current culture is English - United States (en-US), the example dynamically builds the regular expression `\$\s*[-+]?([0-9]{0,3}(,[0-9]{3})*(\.[0-9]+)?)`. This regular expression pattern can be interpreted as follows:
|
||||
|
||||
Syntax | Meaning
|
||||
------ | -------
|
||||
`\$` | Look for a single occurrence of the dollar symbol ($) in the input string. The regular expression pattern string includes a backslash to indicate that the dollar symbol is to be interpreted literally rather than as a regular expression anchor. (The $ symbol alone would indicate that the regular expression engine should try to begin its match at the end of a string.) To ensure that the current culture's currency symbol is not misinterpreted as a regular expression symbol, the example calls the [Escape]( https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Escape_System_String_) method to escape the character.
|
||||
`\s*` | Look for zero or more occurrences of a white-space character.
|
||||
`[-+]?` | Look for zero or one occurrence of either a positive sign or a negative sign.
|
||||
`([0-9]{0,3}(,[0-9]{3})*(\.[0-9]+)?)` | The outer parentheses around this expression define it as a capturing group or a subexpression. If a match is found, information about this part of the matching string can be retrieved from the second [Group]( https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group) object in the [GroupCollection]( https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.GroupCollection) object returned by the [Match.Groups]( https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match#System_Text_RegularExpressions_Match_Groups) property. (The first element in the collection represents the entire match.)
|
||||
`[0-9]{0,3}` | Look for zero to three occurrences of the decimal digits 0 through 9.
|
||||
`(,[0-9]{3})*` | Look for zero or more occurrences of a group separator followed by three decimal digits.
|
||||
`\.` | Look for a single occurrence of the decimal separator.
|
||||
`[0-9]+` | Look for one or more decimal digits.
|
||||
`(\.[0-9]+)?` | Look for zero or one occurrence of the decimal separator followed by at least one decimal digit.
|
||||
|
||||
## Related Topics
|
||||
|
||||
Title | Description
|
||||
----- | -----------
|
||||
[The Regular Expression Object Model](objectmodel.md) | Provides information and code examples that illustrate how to use the regular expression classes.
|
||||
[Details of Regular Expression Behavior](behavior/regexbehavior.md) | Provides information about the capabilities and behavior of .NET Core regular expressions.
|
||||
[Regular Expression Examples](examples/regexexamples.md) | Provides code examples that illustrate typical uses of regular expressions.
|
||||
|
||||
|
||||
## Reference
|
||||
|
||||
[System.Text.RegularExpressions]( https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions)
|
||||
|
||||
[System.Text.RegularExpressions.Regex]( https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex)
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
|
@ -1,707 +0,0 @@
|
|||
---
|
||||
title: The Regular Expression Object Model
|
||||
description: The Regular Expression Object Model
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 198b6a08-1867-4da4-aa4d-dcc6413c6421
|
||||
---
|
||||
|
||||
# The Regular Expression Object Model
|
||||
|
||||
|
||||
This topic describes the object model used in working with .NET Core regular expressions. It contains the following sections:
|
||||
|
||||
* [The Regular Expression Engine](#The-Regular-Expression-Engine)
|
||||
|
||||
* [The MatchCollection and Match Objects](#The-MatchCollection-and-Match-Objects)
|
||||
|
||||
* [The Group Collection](#The-Group-Collection)
|
||||
|
||||
* [The Captured Group](#The-Captured-Group)
|
||||
|
||||
* [The Capture Collection](#The-Capture-Collection)
|
||||
|
||||
* [The Individual Capture](#The-Individual-Capture)
|
||||
|
||||
## The Regular Expression Engine
|
||||
|
||||
The regular expression engine in .NET Core is represented by the [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) class. The regular expression engine is responsible for parsing and compiling a regular expression, and for performing operations that match the regular expression pattern with an input string. The engine is the central component in the .NET Core regular expression object model.
|
||||
|
||||
You can use the regular expression engine in either of two ways:
|
||||
|
||||
* By calling the static methods of the [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) class. The method parameters include the input string and the regular expression pattern. The regular expression engine caches regular expressions that are used in static method calls, so repeated calls to static regular expression methods that use the same regular expression offer relatively good performance.
|
||||
|
||||
* By instantiating a [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) object, by passing a regular expression to the class constructor. In this case, the [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) object is immutable (read-only) and represents a regular expression engine that is tightly coupled with a single regular expression. Because regular expressions used by Regex instances are not cached, you should not instantiate a [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) object multiple times with the same regular expression.
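
The two ways of using the engine can be sketched side by side as follows. The ZIP code pattern and the member names are hypothetical. The point of the sketch is only that the instance is created once and then reused, instead of being constructed on every call.

```csharp
using System;
using System.Text.RegularExpressions;

public class EngineUsageSketch
{
    const string ZipPattern = @"^\d{5}(-\d{4})?$";

    // Instance approach: construct the Regex once and reuse it for every call.
    static readonly Regex ZipCode = new Regex(ZipPattern);

    // Static approach: the pattern is passed with each call.
    public static bool IsZipStatic(string value)
    {
        return Regex.IsMatch(value, ZipPattern);
    }

    public static bool IsZipInstance(string value)
    {
        return ZipCode.IsMatch(value);
    }

    public static void Main()
    {
        string[] values = { "98052", "98052-6399", "9805" };
        foreach (string value in values)
            Console.WriteLine("{0}: static={1}, instance={2}",
                              value, IsZipStatic(value), IsZipInstance(value));
    }
}
```

Both approaches return the same results; they differ only in where the parsed pattern is kept between calls.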
|
||||
|
||||
You can call the methods of the [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) class to perform the following operations:
|
||||
|
||||
* Determine whether a string matches a regular expression pattern.
|
||||
|
||||
* Extract a single match or the first match.
|
||||
|
||||
* Extract all matches.
|
||||
|
||||
* Replace a matched substring.
|
||||
|
||||
* Split a single string into an array of strings.
|
||||
|
||||
These operations are described in the following sections.
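
As a brief preview of the last operation in the list, the following sketch splits a delimited string with `Regex.Split`. The delimiter pattern, the `SplitList` helper, and the sample input are illustrative assumptions.

```csharp
using System;
using System.Text.RegularExpressions;

public class SplitSketch
{
    // Splits on a comma or semicolon and discards any white space around the delimiter.
    public static string[] SplitList(string input)
    {
        return Regex.Split(input, @"\s*[,;]\s*");
    }

    public static void Main()
    {
        foreach (string item in SplitList("plum,  pear ;apple"))
            Console.WriteLine("'{0}'", item);
    }
}
```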
|
||||
|
||||
### Matching a Regular Expression Pattern
|
||||
|
||||
The [Regex.IsMatch](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_IsMatch_System_String_) method returns `true` if the string matches the pattern, or `false` if it does not. The [IsMatch](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_IsMatch_System_String_) method is often used to validate string input. For example, the following code ensures that a string matches a valid social security number in the United States.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string[] values = { "111-22-3333", "111-2-3333"};
|
||||
string pattern = @"^\d{3}-\d{2}-\d{4}$";
|
||||
foreach (string value in values) {
|
||||
if (Regex.IsMatch(value, pattern))
|
||||
Console.WriteLine("{0} is a valid SSN.", value);
|
||||
else
|
||||
Console.WriteLine("{0}: Invalid", value);
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// 111-22-3333 is a valid SSN.
|
||||
// 111-2-3333: Invalid
|
||||
```
|
||||
|
||||
The regular expression pattern `^\d{3}-\d{2}-\d{4}$` is interpreted as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`^` | Match the beginning of the input string.
|
||||
`\d{3}` | Match three decimal digits.
|
||||
`-` | Match a hyphen.
|
||||
`\d{2}` | Match two decimal digits.
|
||||
`-` | Match a hyphen.
|
||||
`\d{4}` | Match four decimal digits.
|
||||
`$` | Match the end of the input string.
|
||||
|
||||
### Extracting a Single Match or the First Match
|
||||
|
||||
The [Regex.Match](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Match_System_String_) method returns a [Match](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match) object that contains information about the first substring that matches a regular expression pattern. If the `Match.Success` property returns `true`, indicating that a match was found, you can retrieve information about subsequent matches by calling the [Match.NextMatch](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match#System_Text_RegularExpressions_Match_NextMatch) method. These method calls can continue until the `Match.Success` property returns `false`. For example, the following code uses the [Regex.Match(String, String)](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Match_System_String_System_String_) method to find the first occurrence of a duplicated word in a string. It then calls the [Match.NextMatch](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match#System_Text_RegularExpressions_Match_NextMatch) method to find any additional occurrences. The example examines the `Match.Success` property after each method call to determine whether the current match was successful and whether a call to the [Match.NextMatch](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match#System_Text_RegularExpressions_Match_NextMatch) method should follow.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string input = "This is a a farm that that raises dairy cattle.";
|
||||
string pattern = @"\b(\w+)\W+(\1)\b";
|
||||
Match match = Regex.Match(input, pattern);
|
||||
while (match.Success)
|
||||
{
|
||||
Console.WriteLine("Duplicate '{0}' found at position {1}.",
|
||||
match.Groups[1].Value, match.Groups[2].Index);
|
||||
match = match.NextMatch();
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Duplicate 'a' found at position 10.
|
||||
// Duplicate 'that' found at position 22.
|
||||
```
|
||||
|
||||
The regular expression pattern `\b(\w+)\W+(\1)\b` is interpreted as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Begin the match on a word boundary.
|
||||
`(\w+)` | Match one or more word characters. This is the first capturing group.
|
||||
`\W+` | Match one or more non-word characters.
|
||||
`(\1)` | Match the first captured string. This is the second capturing group.
|
||||
`\b` | End the match on a word boundary.
|
||||
|
||||
### Extracting All Matches
|
||||
|
||||
The [Regex.Matches](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Matches_System_String_) method returns a [MatchCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.MatchCollection) object that contains information about all matches that the regular expression engine found in the input string. For example, the previous example could be rewritten to call the [Matches](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Matches_System_String_) method instead of the [Match](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Match_System_String_) and [NextMatch](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match#System_Text_RegularExpressions_Match_NextMatch) methods.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string input = "This is a a farm that that raises dairy cattle.";
|
||||
string pattern = @"\b(\w+)\W+(\1)\b";
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine("Duplicate '{0}' found at position {1}.",
|
||||
match.Groups[1].Value, match.Groups[2].Index);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Duplicate 'a' found at position 10.
|
||||
// Duplicate 'that' found at position 22.
|
||||
```
|
||||
|
||||
### Replacing a Matched Substring
|
||||
|
||||
The [Regex.Replace](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Replace_System_String_System_String_) method replaces each substring that matches the regular expression pattern with a specified string or replacement pattern, and returns the entire input string with the replacements applied. For example, the following code adds a U.S. currency symbol before a decimal number in a string.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"\b\d+\.\d{2}\b";
|
||||
string replacement = "$$$&";
|
||||
string input = "Total Cost: 103.64";
|
||||
Console.WriteLine(Regex.Replace(input, pattern, replacement));
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Total Cost: $103.64
|
||||
```
|
||||
|
||||
The regular expression pattern `\b\d+\.\d{2}\b` is interpreted as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Begin the match at a word boundary.
|
||||
`\d+` | Match one or more decimal digits.
|
||||
`\.` | Match a period.
|
||||
`\d{2}` | Match two decimal digits.
|
||||
`\b` | End the match at a word boundary.
|
||||
|
||||
The replacement pattern `$$$&` is interpreted as shown in the following table.
|
||||
|
||||
Pattern | Replacement string
|
||||
------- | ------------------
|
||||
`$$` | The dollar sign (**$**) character.
|
||||
`$&` | The entire matched substring.
|
||||
|
||||
### Splitting a Single String into an Array of Strings
|
||||
|
||||
The [Regex.Split](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Split_System_String_) method splits the input string at the positions defined by a regular expression match. For example, the following code places the items in a numbered list into a string array.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string input = "1. Eggs 2. Bread 3. Milk 4. Coffee 5. Tea";
|
||||
string pattern = @"\b\d{1,2}\.\s";
|
||||
foreach (string item in Regex.Split(input, pattern))
|
||||
{
|
||||
if (! String.IsNullOrEmpty(item))
|
||||
Console.WriteLine(item);
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Eggs
|
||||
// Bread
|
||||
// Milk
|
||||
// Coffee
|
||||
// Tea
|
||||
```
|
||||
|
||||
The regular expression pattern `\b\d{1,2}\.\s` is interpreted as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Begin the match at a word boundary.
|
||||
`\d{1,2}` | Match one or two decimal digits.
|
||||
`\.` | Match a period.
|
||||
`\s` | Match a white-space character.
|
||||
|
||||
## The MatchCollection and Match Objects
|
||||
|
||||
[Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) methods return two objects that are part of the regular expression object model: the [MatchCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.MatchCollection) object, and the [Match](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match) object.
|
||||
|
||||
### The Match Collection
|
||||
|
||||
The [Regex.Matches](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Matches_System_String_) method returns a [MatchCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.MatchCollection) object that contains [Match](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match) objects that represent all the matches that the regular expression engine found, in the order in which they occur in the input string. If there are no matches, the method returns a [MatchCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.MatchCollection) object with no members. The [MatchCollection.Item](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.MatchCollection#System_Text_RegularExpressions_MatchCollection_Item_System_Int32_) property lets you access individual members of the collection by index, from zero to one less than the value of the [MatchCollection.Count](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.MatchCollection#System_Text_RegularExpressions_MatchCollection_Count) property. [Item](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.MatchCollection#System_Text_RegularExpressions_MatchCollection_Item_System_Int32_) is the collection's indexer.
|
||||
|
||||
By default, the call to the [Regex.Matches](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Matches_System_String_) method uses lazy evaluation to populate the [MatchCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.MatchCollection) object. Access to properties that require a fully populated collection, such as the [MatchCollection.Count](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.MatchCollection#System_Text_RegularExpressions_MatchCollection_Count) and [MatchCollection.Item](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.MatchCollection#System_Text_RegularExpressions_MatchCollection_Item_System_Int32_) properties, may involve a performance penalty. As a result, we recommend that you access the collection by using the [IEnumerator](https://docs.microsoft.com/dotnet/core/api/System.Collections.IEnumerator) object that is returned by the [MatchCollection.GetEnumerator](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.MatchCollection#System_Text_RegularExpressions_MatchCollection_GetEnumerator) method. Individual languages provide constructs, such as `foreach` in C#, that wrap the collection's [IEnumerator](https://docs.microsoft.com/dotnet/core/api/System.Collections.IEnumerator) interface.
|
||||
|
||||
The following example uses the [Regex.Matches(String)](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Matches_System_String_) method to populate a [MatchCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.MatchCollection) object with all the matches found in an input string. The example enumerates the collection, copies the matches to a list of strings, and records the character positions in a list of integers.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Collections.Generic;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
MatchCollection matches;
|
||||
List<string> results = new List<string>();
|
||||
List<int> matchposition = new List<int>();
|
||||
|
||||
// Create a new Regex object and define the regular expression.
|
||||
Regex r = new Regex("abc");
|
||||
// Use the Matches method to find all matches in the input string.
|
||||
matches = r.Matches("123abc4abcd");
|
||||
// Enumerate the collection to retrieve all matches and positions.
|
||||
foreach (Match match in matches)
|
||||
{
|
||||
// Add the match string to the string array.
|
||||
results.Add(match.Value);
|
||||
// Record the character position where the match was found.
|
||||
matchposition.Add(match.Index);
|
||||
}
|
||||
// List the results.
|
||||
for (int ctr = 0; ctr < results.Count; ctr++)
|
||||
Console.WriteLine("'{0}' found at position {1}.",
|
||||
results[ctr], matchposition[ctr]);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// 'abc' found at position 3.
|
||||
// 'abc' found at position 7.
|
||||
```
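
For comparison, the following sketch accesses the same matches by index through the `Count` property and the collection's indexer. This form is convenient when you need random access, but reading `Count` requires the collection to be fully populated before the loop starts. The `GetMatchPositions` helper is a hypothetical name used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public class Example
{
   // Hypothetical helper: returns the starting index of every match.
   public static List<int> GetMatchPositions(string input, string pattern)
   {
      MatchCollection matches = Regex.Matches(input, pattern);
      List<int> positions = new List<int>();
      // Reading Count populates the whole collection up front.
      for (int ctr = 0; ctr < matches.Count; ctr++)
         positions.Add(matches[ctr].Index);
      return positions;
   }

   public static void Main()
   {
      foreach (int position in GetMatchPositions("123abc4abcd", "abc"))
         Console.WriteLine("'abc' found at position {0}.", position);
   }
}
// The example displays the following output:
//       'abc' found at position 3.
//       'abc' found at position 7.
```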
|
||||
|
||||
### The Match
|
||||
|
||||
The [Match](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match) class represents the result of a single regular expression match. You can access [Match](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match) objects in two ways:
|
||||
|
||||
* By retrieving them from the [MatchCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.MatchCollection) object that is returned by the Regex.Matches method. To retrieve individual [Match](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match) objects, iterate the collection by using a `foreach` construct, or use the [MatchCollection.Item](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.MatchCollection#System_Text_RegularExpressions_MatchCollection_Item_System_Int32_) property to retrieve a specific [Match](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match) object by index. You can also retrieve individual [Match](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match) objects from the collection by iterating the collection by index, from zero to one less than the number of objects in the collection. However, this method does not take advantage of lazy evaluation, because it accesses the [MatchCollection.Count](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.MatchCollection#System_Text_RegularExpressions_MatchCollection_Count) property.
|
||||
|
||||
The following example retrieves individual [Match](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match) objects from a [MatchCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.MatchCollection) object by iterating the collection using the `foreach` construct. The regular expression simply matches the string "abc" in the input string.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = "abc";
|
||||
string input = "abc123abc456abc789";
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine("{0} found at position {1}.",
|
||||
match.Value, match.Index);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// abc found at position 0.
|
||||
// abc found at position 6.
|
||||
// abc found at position 12.
|
||||
```
|
||||
|
||||
* By calling the [Regex.Match](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Match_System_String_) method, which returns a [Match](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match) object that represents the first match in a string or a portion of a string. You can determine whether the match has been found by retrieving the value of the `Match.Success` property. To retrieve [Match](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match) objects that represent subsequent matches, call the [Match.NextMatch](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match#System_Text_RegularExpressions_Match_NextMatch) method repeatedly, until the `Success` property of the returned [Match](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match) object is `false`.
|
||||
|
||||
The following example uses the [Regex.Match(String, String)](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Match_System_String_System_String_) and [Match.NextMatch](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match#System_Text_RegularExpressions_Match_NextMatch) methods to match the string "abc" in the input string.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = "abc";
|
||||
string input = "abc123abc456abc789";
|
||||
Match match = Regex.Match(input, pattern);
|
||||
while (match.Success)
|
||||
{
|
||||
Console.WriteLine("{0} found at position {1}.",
|
||||
match.Value, match.Index);
|
||||
match = match.NextMatch();
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// abc found at position 0.
|
||||
// abc found at position 6.
|
||||
// abc found at position 12.
|
||||
```
|
||||
|
||||
Two properties of the [Match](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match) class return collection objects:
|
||||
|
||||
* The [Match.Groups](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match#System_Text_RegularExpressions_Match_Groups) property returns a [GroupCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.GroupCollection) object that contains information about the substrings that match capturing groups in the regular expression pattern.
|
||||
|
||||
* The `Match.Captures` property returns a [CaptureCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.CaptureCollection) object that is of limited use. The collection is not populated for a [Match](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match) object whose `Success` property is `false`. Otherwise, it contains a single [Capture](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Capture) object that has the same information as the [Match](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match) object.
|
||||
|
||||
For more information about these objects, see [The Group Collection](#The-Group-Collection) and [The Capture Collection](#The-Capture-Collection) sections later in this topic.
|
||||
|
||||
Two additional properties of the [Match](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match) class provide information about the match. The `Match.Value` property returns the substring in the input string that matches the regular expression pattern. The `Match.Index` property returns the zero-based starting position of the matched string in the input string.
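
The relationship between a successful match and its `Captures` collection can be sketched as follows. The example checks that the collection holds exactly one [Capture](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Capture) object whose `Value` and `Index` agree with those of the match itself. The `CaptureMirrorsMatch` helper and the input string are assumptions made for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public class Example
{
   // Hypothetical helper: true when the match has exactly one capture
   // that carries the same value and position as the match itself.
   public static bool CaptureMirrorsMatch(Match match)
   {
      return match.Captures.Count == 1 &&
             match.Captures[0].Value == match.Value &&
             match.Captures[0].Index == match.Index;
   }

   public static void Main()
   {
      Match match = Regex.Match("abc123abc456", "abc");
      Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
      Console.WriteLine("Captures: {0}", match.Captures.Count);
      Console.WriteLine("Capture mirrors match: {0}", CaptureMirrorsMatch(match));
   }
}
// The example displays the following output:
//       'abc' found at position 0.
//       Captures: 1
//       Capture mirrors match: True
```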
|
||||
|
||||
The [Match](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match) class also has two pattern-matching methods:
|
||||
|
||||
* The [Match.NextMatch](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match#System_Text_RegularExpressions_Match_NextMatch) method finds the match after the match represented by the current [Match](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match) object, and returns a [Match](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match) object that represents that match.
|
||||
|
||||
* The [Match.Result](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match#System_Text_RegularExpressions_Match_Result_System_String_) method performs a specified replacement operation on the matched string and returns the result.
|
||||
|
||||
|
||||
The following example uses the [Match.Result](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match#System_Text_RegularExpressions_Match_Result_System_String_) method to prepend a **$** symbol and a space before every number that includes two fractional digits.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"\b\d+(,\d{3})*\.\d{2}\b";
|
||||
string input = "16.32\n194.03\n1,903,672.08";
|
||||
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine(match.Result("$$ $&"));
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// $ 16.32
|
||||
// $ 194.03
|
||||
// $ 1,903,672.08
|
||||
```
|
||||
|
||||
The regular expression pattern `\b\d+(,\d{3})*\.\d{2}\b` is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Begin the match at a word boundary.
|
||||
`\d+` | Match one or more decimal digits.
|
||||
`(,\d{3})*` | Match zero or more occurrences of a comma followed by three decimal digits.
|
||||
`\.` | Match the decimal point character.
|
||||
`\d{2}` | Match two decimal digits.
|
||||
`\b` | End the match at a word boundary.
|
||||
|
||||
The replacement pattern `$$ $&` indicates that the matched substring should be replaced by a dollar sign (**$**) symbol (the `$$` pattern), a space, and the value of the match (the `$&` pattern).
|
||||
|
||||
## The Group Collection
|
||||
|
||||
The [Match.Groups](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match#System_Text_RegularExpressions_Match_Groups) property returns a [GroupCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.GroupCollection) object that contains [Group](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group) objects that represent captured groups in a single match. The first [Group](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group) object in the collection (at index 0) represents the entire match. Each object that follows represents the results of a single capturing group.
|
||||
|
||||
You can retrieve individual [Group](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group) objects in the collection by using the [GroupCollection.Item](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.GroupCollection#System_Text_RegularExpressions_GroupCollection_Item_System_Int32_) property. You can retrieve unnamed groups by their ordinal position in the collection, and retrieve named groups either by name or by ordinal position. Unnamed captures appear first in the collection, and are indexed from left to right in the order in which they appear in the regular expression pattern. Named captures are indexed after unnamed captures, from left to right in the order in which they appear in the regular expression pattern. To determine what numbered groups are available in the collection returned for a particular regular expression matching method, you can call the instance [Regex.GetGroupNumbers](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_GetGroupNumbers) method. To determine what named groups are available in the collection, you can call the instance [Regex.GetGroupNames](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_GetGroupNames) method. Both methods are particularly useful in general-purpose routines that analyze the matches found by any regular expression.
|
||||
|
||||
The [GroupCollection.Item](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.GroupCollection#System_Text_RegularExpressions_GroupCollection_Item_System_Int32_) property is the indexer of the collection. This means that individual [Group](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group) objects can be accessed by index (or by name, in the case of named groups) as follows:
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"\b(\w+)\s(\d{1,2}),\s(\d{4})\b";
|
||||
string input = "Born: July 28, 1989";
|
||||
Match match = Regex.Match(input, pattern);
|
||||
if (match.Success)
|
||||
for (int ctr = 0; ctr < match.Groups.Count; ctr++)
|
||||
Console.WriteLine("Group {0}: {1}", ctr, match.Groups[ctr].Value);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Group 0: July 28, 1989
|
||||
// Group 1: July
|
||||
// Group 2: 28
|
||||
// Group 3: 1989
|
||||
```
|
||||
|
||||
The regular expression pattern `\b(\w+)\s(\d{1,2}),\s(\d{4})\b` is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Begin the match at a word boundary.
|
||||
`(\w+)` | Match one or more word characters. This is the first capturing group.
|
||||
`\s` | Match a white-space character.
|
||||
`(\d{1,2})` | Match one or two decimal digits. This is the second capturing group.
|
||||
`,` | Match a comma.
|
||||
`\s` | Match a white-space character.
|
||||
`(\d{4})` | Match four decimal digits. This is the third capturing group.
|
||||
`\b` | End the match on a word boundary.
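
The ordering of unnamed and named groups described earlier in this section can be sketched with a variation of the previous pattern. In this example, which is an illustrative assumption rather than part of the original sample, the month and year groups are named (`month` and `year`) and the day group is left unnamed, so the unnamed group is numbered before the named ones. The `GetGroupValues` helper is a hypothetical name.

```csharp
using System;
using System.Text.RegularExpressions;

public class Example
{
   // Hypothetical helper: returns each group's value, ordered by group number.
   public static string[] GetGroupValues(Regex rgx, string input)
   {
      Match match = rgx.Match(input);
      int[] numbers = rgx.GetGroupNumbers();
      string[] values = new string[numbers.Length];
      for (int ctr = 0; ctr < numbers.Length; ctr++)
         values[ctr] = match.Groups[numbers[ctr]].Value;
      return values;
   }

   public static void Main()
   {
      // One unnamed group (the day) sits between two named groups.
      Regex rgx = new Regex(@"\b(?<month>\w+)\s(\d{1,2}),\s(?<year>\d{4})\b");
      Console.WriteLine("Group names: {0}", String.Join(", ", rgx.GetGroupNames()));

      string[] values = GetGroupValues(rgx, "Born: July 28, 1989");
      for (int ctr = 0; ctr < values.Length; ctr++)
         Console.WriteLine("Group {0}: {1}", ctr, values[ctr]);
   }
}
// The example displays the following output:
//       Group names: 0, 1, month, year
//       Group 0: July 28, 1989
//       Group 1: 28
//       Group 2: July
//       Group 3: 1989
```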
|
||||
|
||||
## The Captured Group
|
||||
|
||||
The [Group](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group) class represents the result from a single capturing group. [Group](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group) objects that represent the capturing groups defined in a regular expression are returned by the [Item](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.GroupCollection#System_Text_RegularExpressions_GroupCollection_Item_System_Int32_) property of the [GroupCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.GroupCollection) object returned by the [Match.Groups](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match#System_Text_RegularExpressions_Match_Groups) property. The [Item](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.GroupCollection#System_Text_RegularExpressions_GroupCollection_Item_System_Int32_) property is the indexer of the [GroupCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.GroupCollection) class. You can also retrieve individual members by iterating the collection using the `foreach` construct. For an example, see the previous section.
|
||||
|
||||
The following example uses nested grouping constructs to capture substrings into groups. The regular expression pattern `(a(b))c` matches the string "abc". It assigns the substring "ab" to the first capturing group, and the substring "b" to the second capturing group.
|
||||
|
||||
```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      List<int> matchposition = new List<int>();
      List<string> results = new List<string>();
      // Define substrings abc, ab, b.
      Regex r = new Regex("(a(b))c");
      Match m = r.Match("abdabc");
      // Group 0 is the entire match; groups 1 and 2 are the capturing groups.
      for (int i = 0; i < m.Groups.Count; i++)
      {
         // Add the group's captured text to the results list.
         results.Add(m.Groups[i].Value);
         // Record character position.
         matchposition.Add(m.Groups[i].Index);
      }

      // Display the capture groups.
      for (int ctr = 0; ctr < results.Count; ctr++)
         Console.WriteLine("{0} at position {1}",
                           results[ctr], matchposition[ctr]);
   }
}
// The example displays the following output:
//       abc at position 3
//       ab at position 3
//       b at position 4
```
|
||||
|
||||
The following example uses named grouping constructs to capture substrings from a string that contains data in the format "DATANAME:VALUE", which the regular expression splits at the colon (:).
|
||||
|
||||
```csharp
using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      Regex r = new Regex("^(?<name>\\w+):(?<value>\\w+)");
      Match m = r.Match("Section1:119900");
      Console.WriteLine(m.Groups["name"].Value);
      Console.WriteLine(m.Groups["value"].Value);
   }
}
// The example displays the following output:
//       Section1
//       119900
```
|
||||
|
||||
The regular expression pattern `^(?<name>\w+):(?<value>\w+)` is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`^` | Begin the match at the beginning of the input string.
|
||||
`(?<name>\w+)` | Match one or more word characters. The name of this capturing group is `name`.
|
||||
`:` | Match a colon.
|
||||
`(?<value>\w+)` | Match one or more word characters. The name of this capturing group is `value`.
|
||||
|
||||
The properties of the [Group](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group) class provide information about the captured group: The `Group.Value` property contains the captured substring, the `Group.Index` property indicates the starting position of the captured group in the input text, the `Group.Length` property contains the length of the captured text, and the `Group.Success` property indicates whether a substring matched the pattern defined by the capturing group.
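
As a short sketch of these properties, the following example reuses the "DATANAME:VALUE" pattern and input from the preceding example and prints `Success`, `Value`, `Index`, and `Length` for each named group. The `DescribeGroup` helper is a hypothetical name introduced only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public class Example
{
   // Hypothetical helper: formats the main properties of a captured group.
   public static string DescribeGroup(string name, Group group)
   {
      return String.Format("{0}: Success={1}, Value={2}, Index={3}, Length={4}",
                           name, group.Success, group.Value, group.Index, group.Length);
   }

   public static void Main()
   {
      Match m = Regex.Match("Section1:119900", @"^(?<name>\w+):(?<value>\w+)");
      foreach (string groupName in new[] { "name", "value" })
         Console.WriteLine(DescribeGroup(groupName, m.Groups[groupName]));
   }
}
// The example displays the following output:
//       name: Success=True, Value=Section1, Index=0, Length=8
//       value: Success=True, Value=119900, Index=9, Length=6
```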
|
||||
|
||||
Applying quantifiers to a group modifies the relationship of one capture per capturing group in two ways:
|
||||
|
||||
* If the `*` or `*?` quantifier (which specifies zero or more matches) is applied to a group, a capturing group may not have a match in the input string. When there is no captured text, the properties of the [Group](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group) object are set as shown in the following table.
|
||||
|
||||
Group property | Value
|
||||
-------------- | -----
|
||||
`Success` | `false`
|
||||
`Value` | [String.Empty](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Empty)
|
||||
`Length` | 0
|
||||
|
||||
The following example provides an illustration. In the regular expression pattern `aaa(bbb)*ccc`, the first capturing group (the substring "bbb") can be matched zero or more times. The input string "aaaccc" matches the pattern with zero occurrences of "bbb", so the capturing group does not have a match.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = "aaa(bbb)*ccc";
|
||||
string input = "aaaccc";
|
||||
Match match = Regex.Match(input, pattern);
|
||||
Console.WriteLine("Match value: {0}", match.Value);
|
||||
if (match.Groups[1].Success)
|
||||
Console.WriteLine("Group 1 value: {0}", match.Groups[1].Value);
|
||||
else
|
||||
Console.WriteLine("The first capturing group has no match.");
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Match value: aaaccc
|
||||
// The first capturing group has no match.
|
||||
```
|
||||
|
||||
* Quantifiers can match multiple occurrences of a pattern that is defined by a capturing group. In this case, the `Value` and `Length` properties of a [Group](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group) object contain information only about the last captured substring. For example, the following regular expression matches a single sentence that ends in a period. It uses two grouping constructs: The first captures individual words along with a white-space character; the second captures individual words. As the output from the example shows, although the regular expression succeeds in capturing an entire sentence, the second capturing group captures only the last word.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"\b((\w+)\s?)+\.";
|
||||
string input = "This is a sentence. This is another sentence.";
|
||||
Match match = Regex.Match(input, pattern);
|
||||
if (match.Success)
|
||||
{
|
||||
Console.WriteLine("Match: " + match.Value);
|
||||
Console.WriteLine("Group 2: " + match.Groups[2].Value);
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Match: This is a sentence.
|
||||
// Group 2: sentence
|
||||
```
|
||||
|
||||
## The Capture Collection
|
||||
|
||||
The [Group](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group) object contains information only about the last capture. However, the entire set of captures made by a capturing group is still available from the [CaptureCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.CaptureCollection) object that is returned by the [Group.Captures](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group#System_Text_RegularExpressions_Group_Captures) property. Each member of the collection is a [Capture](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Capture) object that represents a capture made by that capturing group, in the order in which they were captured (and, therefore, in the order in which the captured strings were matched from left to right in the input string). You can retrieve individual [Capture](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Capture) objects from the collection in either of two ways:
|
||||
|
||||
* By iterating through the collection using a construct such as `foreach`.
|
||||
|
||||
* By using the [CaptureCollection.Item](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.CaptureCollection#System_Text_RegularExpressions_CaptureCollection_Item_System_Int32_) property to retrieve a specific object by index. The Item property is the [CaptureCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.CaptureCollection) object's indexer.
|
||||
|
||||
|
||||
If a quantifier is not applied to a capturing group, the [CaptureCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.CaptureCollection) object contains a single [Capture](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Capture) object that is of little interest, because it provides information about the same match as its [Group](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group) object. If a quantifier is applied to a capturing group, the [CaptureCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.CaptureCollection) object contains all captures made by the capturing group, and the last member of the collection represents the same capture as the [Group](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group) object.
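The two retrieval approaches described above can be sketched as follows. This is a minimal illustration rather than part of the original topic: the class name `CaptureAccessExample`, the helper `GetCaptures`, and the sample pattern `(\d)+` are assumptions chosen for brevity. It enumerates a group's captures with `foreach`, reads one capture through the indexer, and checks that the last capture reports the same value as the `Group` object.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public class CaptureAccessExample
{
   // Hypothetical helper: returns the value of every capture made by one group.
   public static List<string> GetCaptures(string input, string pattern, int groupNumber)
   {
      var values = new List<string>();
      Match match = Regex.Match(input, pattern);
      if (match.Success)
      {
         // Approach 1: iterate the CaptureCollection with foreach.
         foreach (Capture capture in match.Groups[groupNumber].Captures)
            values.Add(capture.Value);
      }
      return values;
   }

   public static void Main()
   {
      string pattern = @"(\d)+";   // illustrative pattern: a quantified capturing group
      string input = "123";

      foreach (string value in GetCaptures(input, pattern, 1))
         Console.WriteLine("Capture: {0}", value);

      // Approach 2: use the indexer (the Item property) to read a specific capture.
      Group group = Regex.Match(input, pattern).Groups[1];
      Capture last = group.Captures[group.Captures.Count - 1];
      Console.WriteLine("Last capture equals Group.Value: {0}", last.Value == group.Value);
   }
}
```

With a quantified group, the collection holds one entry per iteration of the group, which is why the last entry and the `Group` object agree.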
|
||||
|
||||
For example, if you use the regular expression pattern `((a(b))c)+` (where the `+` quantifier specifies one or more matches) to capture matches from the string "abcabcabc", the [CaptureCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.CaptureCollection) object for each [Group](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group) object contains three members.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = "((a(b))c)+";
|
||||
string input = "abcabcabc";
|
||||
|
||||
Match match = Regex.Match(input, pattern);
|
||||
if (match.Success)
|
||||
{
|
||||
Console.WriteLine("Match: '{0}' at position {1}",
|
||||
match.Value, match.Index);
|
||||
GroupCollection groups = match.Groups;
|
||||
for (int ctr = 0; ctr < groups.Count; ctr++) {
|
||||
Console.WriteLine(" Group {0}: '{1}' at position {2}",
|
||||
ctr, groups[ctr].Value, groups[ctr].Index);
|
||||
CaptureCollection captures = groups[ctr].Captures;
|
||||
for (int ctr2 = 0; ctr2 < captures.Count; ctr2++) {
|
||||
Console.WriteLine(" Capture {0}: '{1}' at position {2}",
|
||||
ctr2, captures[ctr2].Value, captures[ctr2].Index);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Match: 'abcabcabc' at position 0
|
||||
// Group 0: 'abcabcabc' at position 0
|
||||
// Capture 0: 'abcabcabc' at position 0
|
||||
// Group 1: 'abc' at position 6
|
||||
// Capture 0: 'abc' at position 0
|
||||
// Capture 1: 'abc' at position 3
|
||||
// Capture 2: 'abc' at position 6
|
||||
// Group 2: 'ab' at position 6
|
||||
// Capture 0: 'ab' at position 0
|
||||
// Capture 1: 'ab' at position 3
|
||||
// Capture 2: 'ab' at position 6
|
||||
// Group 3: 'b' at position 7
|
||||
// Capture 0: 'b' at position 1
|
||||
// Capture 1: 'b' at position 4
|
||||
// Capture 2: 'b' at position 7
|
||||
```
|
||||
|
||||
The following example uses the regular expression `(Abc)+` to find one or more consecutive runs of the string "Abc" in the string "XYZAbcAbcAbcXYZAbcAb". The example illustrates the use of the [Group.Captures](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group#System_Text_RegularExpressions_Group_Captures) property to return multiple groups of captured substrings.
|
||||
|
||||
```csharp
|
||||
{
|
||||
int counter;
|
||||
Match m;
|
||||
CaptureCollection cc;
|
||||
GroupCollection gc;
|
||||
|
||||
// Look for groupings of "Abc".
|
||||
Regex r = new Regex("(Abc)+");
|
||||
// Define the string to search.
|
||||
m = r.Match("XYZAbcAbcAbcXYZAbcAb");
|
||||
gc = m.Groups;
|
||||
|
||||
// Display the number of groups.
|
||||
Console.WriteLine("Captured groups = " + gc.Count.ToString());
|
||||
|
||||
// Loop through each group.
|
||||
for (int i=0; i < gc.Count; i++)
|
||||
{
|
||||
cc = gc[i].Captures;
|
||||
counter = cc.Count;
|
||||
|
||||
// Display the number of captures in this group.
|
||||
Console.WriteLine("Captures count = " + counter.ToString());
|
||||
|
||||
// Loop through each capture in the group.
|
||||
for (int ii = 0; ii < counter; ii++)
|
||||
{
|
||||
// Display the capture and its position.
|
||||
Console.WriteLine(cc[ii] + " Starts at character " +
|
||||
cc[ii].Index);
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Captured groups = 2
|
||||
// Captures count = 1
|
||||
// AbcAbcAbc Starts at character 3
|
||||
// Captures count = 3
|
||||
// Abc Starts at character 3
|
||||
// Abc Starts at character 6
|
||||
// Abc Starts at character 9
|
||||
```
|
||||
|
||||
## The Individual Capture
|
||||
|
||||
The [Capture](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Capture) class contains the results from a single subexpression capture. The [Capture.Value](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Capture#System_Text_RegularExpressions_Capture_Value) property contains the matched text, and the [Capture.Index](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Capture#System_Text_RegularExpressions_Capture_Index) property indicates the zero-based position in the input string at which the matched substring begins.
|
||||
|
||||
The following example parses an input string for the temperature of selected cities. A comma (",") is used to separate a city and its temperature, and a semicolon (";") is used to separate each city's data. The entire input string represents a single match. In the regular expression pattern `((\w+(\s\w+)*),(\d+);)+`, which is used to parse the string, the city name is assigned to the second capturing group, and the temperature is assigned to the fourth capturing group.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string input = "Miami,78;Chicago,62;New York,67;San Francisco,59;Seattle,58;";
|
||||
string pattern = @"((\w+(\s\w+)*),(\d+);)+";
|
||||
Match match = Regex.Match(input, pattern);
|
||||
if (match.Success)
|
||||
{
|
||||
Console.WriteLine("Current temperatures:");
|
||||
for (int ctr = 0; ctr < match.Groups[2].Captures.Count; ctr++)
|
||||
Console.WriteLine("{0,-20} {1,3}", match.Groups[2].Captures[ctr].Value,
|
||||
match.Groups[4].Captures[ctr].Value);
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Current temperatures:
|
||||
// Miami 78
|
||||
// Chicago 62
|
||||
// New York 67
|
||||
// San Francisco 59
// Seattle 58
|
||||
```
|
||||
|
||||
The regular expression is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\w+` | Match one or more word characters.
|
||||
`(\s\w+)*` | Match zero or more occurrences of a white-space character followed by one or more word characters. This pattern matches multi-word city names. This is the third capturing group.
|
||||
`(\w+(\s\w+)*)` | Match one or more word characters followed by zero or more occurrences of a white-space character and one or more word characters. This is the second capturing group.
|
||||
`,` | Match a comma.
|
||||
`(\d+)` | Match one or more digits. This is the fourth capturing group.
|
||||
`;` | Match a semicolon.
|
||||
`((\w+(\s\w+)*),(\d+);)+` | Match the pattern of a word followed by any additional words followed by a comma, one or more digits, and a semicolon, one or more times. This is the first capturing group.
|
||||
|
||||
## See Also
|
||||
|
||||
[System.Text.RegularExpressions](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions)
|
||||
|
||||
[.NET Core Regular Expressions](index.md)
|
||||
|
|
@ -1,228 +0,0 @@
|
|||
---
|
||||
title: Alternation Constructs in Regular Expressions
|
||||
description: Alternation Constructs in Regular Expressions
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: f5a63995-37b2-448f-b029-9de06f9016c5
|
||||
---
|
||||
|
||||
# Alternation Constructs in Regular Expressions
|
||||
|
||||
Alternation constructs modify a regular expression to enable either/or or conditional matching. .NET Core supports three alternation constructs:
|
||||
|
||||
* Pattern matching with **|**
|
||||
|
||||
* Conditional matching with **(?(**_expression_**)**_yes_**|**_no_**)**
|
||||
|
||||
* Conditional matching based on a valid captured group
|
||||
|
||||
## Pattern Matching with |
|
||||
|
||||
You can use the vertical bar (|) character to match any one of a series of patterns, where the | character separates each pattern.
|
||||
|
||||
Like the positive character class, the | character can be used to match any one of a number of single characters. The following example uses both a positive character class and either/or pattern matching with the | character to locate occurrences of the words "gray" or "grey" in a string. In this case, the | character produces a regular expression that is more verbose.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
// Regular expression using character class.
|
||||
string pattern1 = @"\bgr[ae]y\b";
|
||||
// Regular expression using either/or.
|
||||
string pattern2 = @"\bgr(a|e)y\b";
|
||||
|
||||
string input = "The gray wolf blended in among the grey rocks.";
|
||||
foreach (Match match in Regex.Matches(input, pattern1))
|
||||
Console.WriteLine("'{0}' found at position {1}",
|
||||
match.Value, match.Index);
|
||||
Console.WriteLine();
|
||||
foreach (Match match in Regex.Matches(input, pattern2))
|
||||
Console.WriteLine("'{0}' found at position {1}",
|
||||
match.Value, match.Index);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// 'gray' found at position 4
|
||||
// 'grey' found at position 35
|
||||
//
|
||||
// 'gray' found at position 4
|
||||
// 'grey' found at position 35
|
||||
```
|
||||
|
||||
The regular expression that uses the | character, `\bgr(a|e)y\b`, is interpreted as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Start at a word boundary.
|
||||
`gr` | Match the characters "gr".
|
||||
`(a|e)` | Match either an "a" or an "e".
|
||||
`y\b` | Match a "y" on a word boundary.
|
||||
|
||||
|
||||
The | character can also be used to perform an either/or match with multiple characters or subexpressions, which can include any combination of character literals and regular expression language elements. (The character class does not provide this functionality.) The following example uses the | character to extract either a U.S. Social Security Number (SSN), which is a 9-digit number with the format *ddd-dd-dddd*, or a U.S. Employer Identification Number (EIN), which is a 9-digit number with the format *dd-ddddddd*.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"\b(\d{2}-\d{7}|\d{3}-\d{2}-\d{4})\b";
|
||||
string input = "01-9999999 020-333333 777-88-9999";
|
||||
Console.WriteLine("Matches for {0}:", pattern);
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine(" {0} at position {1}", match.Value, match.Index);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Matches for \b(\d{2}-\d{7}|\d{3}-\d{2}-\d{4})\b:
|
||||
// 01-9999999 at position 0
|
||||
// 777-88-9999 at position 22
|
||||
```
|
||||
|
||||
The regular expression `\b(\d{2}-\d{7}|\d{3}-\d{2}-\d{4})\b` is interpreted as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Start at a word boundary.
|
||||
`(\d{2}-\d{7}|\d{3}-\d{2}-\d{4})` | Match either of the following: two decimal digits followed by a hyphen followed by seven decimal digits; or three decimal digits, a hyphen, two decimal digits, another hyphen, and four decimal digits.
|
||||
`\b` | End the match at a word boundary.
|
||||
|
||||
## Conditional Matching with an Expression
|
||||
|
||||
This language element attempts to match one of two patterns depending on whether it can match an initial pattern. Its syntax is:
|
||||
|
||||
**(?(**_expression_**)**_yes_**|**_no_**)**
|
||||
|
||||
where *expression* is the initial pattern to match, *yes* is the pattern to match if *expression* is matched, and *no* is the optional pattern to match if *expression* is not matched. The regular expression engine treats *expression* as a zero-width assertion; that is, the regular expression engine does not advance in the input stream after it evaluates *expression*. Therefore, this construct is equivalent to the following:
|
||||
|
||||
**(?(?**=_expression_**)**_yes_**|**_no_**)**
|
||||
|
||||
where **(?**=_expression_**)** is a zero-width assertion construct. (For more information, see [Grouping Constructs in Regular Expressions](grouping.md).) Because the regular expression engine interprets *expression* as an anchor (a zero-width assertion), *expression* must either be a zero-width assertion (for more information, see [Anchors in Regular Expressions](anchors.md)) or a subexpression that is also contained in *yes*. Otherwise, the *yes* pattern cannot be matched.
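As a rough check of the equivalence described above, the following sketch runs the implicit form and the explicit look-ahead form against the same input and prints the matches from each. The class name `ConditionalEquivalenceExample` and the helper `FindAll` are hypothetical names introduced only for this illustration; the patterns reuse the EIN/SSN formats discussed in this topic.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public class ConditionalEquivalenceExample
{
   // Hypothetical helper: formats each match as "value@index".
   public static List<string> FindAll(string input, string pattern)
   {
      var results = new List<string>();
      foreach (Match match in Regex.Matches(input, pattern))
         results.Add(match.Value + "@" + match.Index);
      return results;
   }

   public static void Main()
   {
      string input = "01-9999999 020-333333 777-88-9999";
      // Implicit form: the condition is treated as a zero-width assertion.
      string implicitForm = @"\b(?(\d{2}-)\d{2}-\d{7}|\d{3}-\d{2}-\d{4})\b";
      // Explicit form: the same condition written as a positive look-ahead.
      string explicitForm = @"\b(?(?=\d{2}-)\d{2}-\d{7}|\d{3}-\d{2}-\d{4})\b";

      Console.WriteLine(String.Join(", ", FindAll(input, implicitForm)));
      Console.WriteLine(String.Join(", ", FindAll(input, explicitForm)));
   }
}
```

Both lines of output should list the same two matches, because the condition never consumes input in either form.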
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> If *expression* is a named or numbered capturing group, the alternation construct is interpreted as a capture test; for more information, see the next section, [Conditional Matching Based on a Valid Captured Group](#Conditional-Matching-Based-on-a-Valid-Captured-Group). In other words, the regular expression engine does not attempt to match the captured substring, but instead tests for the presence or absence of the group.
|
||||
|
||||
|
||||
The following example is a variation of the example that appears in the previous section. It uses conditional matching to determine whether the first three characters after a word boundary are two digits followed by a hyphen. If they are, it attempts to match a U.S. Employer Identification Number (EIN). If not, it attempts to match a U.S. Social Security Number (SSN).
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"\b(?(\d{2}-)\d{2}-\d{7}|\d{3}-\d{2}-\d{4})\b";
|
||||
string input = "01-9999999 020-333333 777-88-9999";
|
||||
Console.WriteLine("Matches for {0}:", pattern);
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine(" {0} at position {1}", match.Value, match.Index);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Matches for \b(?(\d{2}-)\d{2}-\d{7}|\d{3}-\d{2}-\d{4})\b:
|
||||
// 01-9999999 at position 0
|
||||
// 777-88-9999 at position 22
|
||||
```
|
||||
|
||||
The regular expression pattern `\b(?(\d{2}-)\d{2}-\d{7}|\d{3}-\d{2}-\d{4})\b` is interpreted as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Start at a word boundary.
|
||||
`(?(\d{2}-)` | Determine whether the next three characters consist of two digits followed by a hyphen.
|
||||
`\d{2}-\d{7}` | If the previous pattern matches, match two digits followed by a hyphen followed by seven digits.
|
||||
`\d{3}-\d{2}-\d{4}` | If the previous pattern does not match, match three decimal digits, a hyphen, two decimal digits, another hyphen, and four decimal digits.
|
||||
`\b` | Match a word boundary.
|
||||
|
||||
## Conditional Matching Based on a Valid Captured Group
|
||||
|
||||
This language element attempts to match one of two patterns depending on whether it has matched a specified capturing group. Its syntax is:
|
||||
|
||||
**(?(**_name_**)**_yes_**|**_no_**)**
|
||||
|
||||
or
|
||||
|
||||
**(?(**_number_**)**_yes_**|**_no_**)**
|
||||
|
||||
where *name* is the name and *number* is the number of a capturing group, *yes* is the expression to match if *name* or *number* has a match, and *no* is the optional expression to match if it does not.
|
||||
|
||||
If *name* does not correspond to the name of a capturing group that is used in the regular expression pattern, the alternation construct is interpreted as an expression test, as explained in the previous section. Typically, this means that *expression* evaluates to `false`. If *number* does not correspond to a numbered capturing group that is used in the regular expression pattern, the regular expression engine throws an [ArgumentException](https://docs.microsoft.com/dotnet/core/api/System.ArgumentException).
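The behavior for an undefined group number can be observed with a short sketch such as the one below. The class name `UndefinedGroupExample` and the helper `ThrowsArgumentException` are hypothetical and exist only for this illustration; the patterns are simplified variants of the EIN/SSN pattern used in this topic.

```csharp
using System;
using System.Text.RegularExpressions;

public class UndefinedGroupExample
{
   // Hypothetical helper: reports whether constructing a Regex from the
   // pattern throws an ArgumentException.
   public static bool ThrowsArgumentException(string pattern)
   {
      try
      {
         new Regex(pattern);
         return false;
      }
      catch (ArgumentException)
      {
         return true;
      }
   }

   public static void Main()
   {
      // Group 1 is defined, so the capture test (?(1)...) is accepted.
      Console.WriteLine(ThrowsArgumentException(@"(\d{2}-)?(?(1)\d{7}|\d{3}-\d{2}-\d{4})"));
      // The pattern defines no group 2, so the capture test (?(2)...) is rejected.
      Console.WriteLine(ThrowsArgumentException(@"(\d{2}-)?(?(2)\d{7}|\d{3}-\d{2}-\d{4})"));
   }
}
// The example displays the following output:
//       False
//       True
```

Because the exception is raised when the pattern is parsed, wrapping the `Regex` construction (rather than the match call) in the `try` block is sufficient here.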
|
||||
|
||||
The following example is a variation of the example that appears in the previous section. It uses a capturing group named `n2` that consists of two digits followed by a hyphen. The alternation construct tests whether this capturing group has been matched in the input string. If it has, the alternation construct attempts to match the last seven digits of a nine-digit U.S. Employer Identification Number (EIN). If it has not, it attempts to match a nine-digit U.S. Social Security Number (SSN).
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"\b(?<n2>\d{2}-)*(?(n2)\d{7}|\d{3}-\d{2}-\d{4})\b";
|
||||
string input = "01-9999999 020-333333 777-88-9999";
|
||||
Console.WriteLine("Matches for {0}:", pattern);
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine(" {0} at position {1}", match.Value, match.Index);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Matches for \b(?<n2>\d{2}-)*(?(n2)\d{7}|\d{3}-\d{2}-\d{4})\b:
|
||||
// 01-9999999 at position 0
|
||||
// 777-88-9999 at position 22
|
||||
```
|
||||
|
||||
The regular expression pattern `\b(?<n2>\d{2}-)*(?(n2)\d{7}|\d{3}-\d{2}-\d{4})\b` is interpreted as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Start at a word boundary.
|
||||
`(?<n2>\d{2}-)*` | Match zero or more occurrences of two digits followed by a hyphen. Name this capturing group `n2`.
|
||||
`(?(n2)` | Test whether `n2` was matched in the input string.
|
||||
`\d{7}` | If `n2` was matched, match seven decimal digits.
|
||||
`|\d{3}-\d{2}-\d{4}` | If `n2` was not matched, match three decimal digits, a hyphen, two decimal digits, another hyphen, and four decimal digits.
|
||||
`\b` | Match a word boundary.
|
||||
|
||||
A variation of this example that uses a numbered group instead of a named group is shown in the following example. Its regular expression pattern is `\b(\d{2}-)*(?(1)\d{7}|\d{3}-\d{2}-\d{4})\b`.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"\b(\d{2}-)*(?(1)\d{7}|\d{3}-\d{2}-\d{4})\b";
|
||||
string input = "01-9999999 020-333333 777-88-9999";
|
||||
Console.WriteLine("Matches for {0}:", pattern);
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine(" {0} at position {1}", match.Value, match.Index);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Matches for \b(\d{2}-)*(?(1)\d{7}|\d{3}-\d{2}-\d{4})\b:
|
||||
// 01-9999999 at position 0
|
||||
// 777-88-9999 at position 22
|
||||
```
|
||||
|
||||
## See Also
|
||||
|
||||
[Regular Expression Language - Quick Reference](index.md)
|
||||
|
||||
|
|
@ -1,513 +0,0 @@
|
|||
---
|
||||
title: Anchors in Regular Expressions
|
||||
description: Anchors in Regular Expressions
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 72c0a33c-1ae6-4747-af07-58511149cac7
|
||||
---
|
||||
|
||||
# Anchors in Regular Expressions
|
||||
|
||||
Anchors, or atomic zero-width assertions, specify a position in the string where a match must occur. When you use an anchor in your search expression, the regular expression engine does not advance through the string or consume characters; it looks for a match in the specified position only. For example, **^** specifies that the match must start at the beginning of a line or string. Therefore, the regular expression `^http:` matches "http:" only when it occurs at the beginning of a line. The following table lists the anchors supported by the regular expressions in .NET Core.
|
||||
|
||||
Anchor | Description
|
||||
------ | -----------
|
||||
**^** | The match must occur at the beginning of the string or line.
|
||||
**$** | The match must occur at the end of the string or line, or before \n at the end of the string or line.
|
||||
**\A** | The match must occur at the beginning of the string only (no multiline support).
|
||||
**\Z** | The match must occur at the end of the string, or before \n at the end of the string.
|
||||
**\z** | The match must occur at the end of the string only.
|
||||
**\G** | The match must start at the position where the previous match ended.
|
||||
**\b** | The match must occur on a word boundary.
|
||||
**\B** | The match must not occur on a word boundary.
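To make the difference between a few of these anchors concrete, here is a minimal sketch built around the `^http:` pattern mentioned above. The class name `AnchorBasicsExample`, the helper `CountMatches`, and the sample input are assumptions made for this illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

public class AnchorBasicsExample
{
   // Hypothetical helper: counts the matches of a pattern under the given options.
   public static int CountMatches(string input, string pattern, RegexOptions options)
   {
      return Regex.Matches(input, pattern, options).Count;
   }

   public static void Main()
   {
      // Three lines; the first and third begin with "http:".
      string input = "http://a.example\nsee http://b.example\nhttp://c.example";

      // ^ without Multiline: only the start of the string qualifies.
      Console.WriteLine(CountMatches(input, "^http:", RegexOptions.None));
      // ^ with Multiline: the start of each line qualifies.
      Console.WriteLine(CountMatches(input, "^http:", RegexOptions.Multiline));
      // \A ignores Multiline and matches at the start of the string only.
      Console.WriteLine(CountMatches(input, @"\Ahttp:", RegexOptions.Multiline));
   }
}
// The example displays the following output:
//       1
//       2
//       1
```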
|
||||
|
||||
## Start of String or Line: ^
|
||||
|
||||
The **^** anchor specifies that the following pattern must begin at the first character position of the string. If you use **^** with the [RegexOptions.Multiline](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexOptions#System_Text_RegularExpressions_RegexOptions_Multiline) option (see [Regular Expression Options](options.md)), the match must occur at the beginning of each line.
|
||||
|
||||
The following example uses the **^** anchor in a regular expression that extracts information about the years during which some professional baseball teams existed. The example calls two overloads of the `Regex.Matches` method:
|
||||
|
||||
* The call to the [Matches(String, String)](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Matches_System_String_System_String_) overload finds only the first substring in the input string that matches the regular expression pattern.
|
||||
|
||||
* The call to the [Matches(String, String, RegexOptions)](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Matches_System_String_System_String_System_Text_RegularExpressions_RegexOptions_) overload with the options parameter set to [RegexOptions.Multiline](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexOptions#System_Text_RegularExpressions_RegexOptions_Multiline) finds all five substrings.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
int startPos = 0, endPos = 70;
|
||||
string input = "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957\n" +
|
||||
"Chicago Cubs, National League, 1903-present\n" +
|
||||
"Detroit Tigers, American League, 1901-present\n" +
|
||||
"New York Giants, National League, 1885-1957\n" +
|
||||
"Washington Senators, American League, 1901-1960\n";
|
||||
string pattern = @"^((\w+(\s?)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))?,?)+";
|
||||
Match match;
|
||||
|
||||
if (input.Substring(startPos, endPos).Contains(",")) {
|
||||
match = Regex.Match(input, pattern);
|
||||
while (match.Success) {
|
||||
Console.Write("The {0} played in the {1} in",
|
||||
match.Groups[1].Value, match.Groups[4].Value);
|
||||
foreach (Capture capture in match.Groups[5].Captures)
|
||||
Console.Write(capture.Value);
|
||||
|
||||
Console.WriteLine(".");
|
||||
startPos = match.Index + match.Length;
|
||||
endPos = startPos + 70 <= input.Length ? 70 : input.Length - startPos;
|
||||
if (! input.Substring(startPos, endPos).Contains(",")) break;
|
||||
match = match.NextMatch();
|
||||
}
|
||||
Console.WriteLine();
|
||||
}
|
||||
|
||||
if (input.Substring(startPos, endPos).Contains(",")) {
|
||||
match = Regex.Match(input, pattern, RegexOptions.Multiline);
|
||||
while (match.Success) {
|
||||
Console.Write("The {0} played in the {1} in",
|
||||
match.Groups[1].Value, match.Groups[4].Value);
|
||||
foreach (Capture capture in match.Groups[5].Captures)
|
||||
Console.Write(capture.Value);
|
||||
|
||||
Console.WriteLine(".");
|
||||
startPos = match.Index + match.Length;
|
||||
endPos = startPos + 70 <= input.Length ? 70 : input.Length - startPos;
|
||||
if (! input.Substring(startPos, endPos).Contains(",")) break;
|
||||
match = match.NextMatch();
|
||||
}
|
||||
Console.WriteLine();
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
|
||||
//
|
||||
// The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
|
||||
// The Chicago Cubs played in the National League in 1903-present.
|
||||
// The Detroit Tigers played in the American League in 1901-present.
|
||||
// The New York Giants played in the National League in 1885-1957.
|
||||
// The Washington Senators played in the American League in 1901-1960.
|
||||
```
|
||||
|
||||
The regular expression pattern `^((\w+(\s?)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))?,?)+` is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`^` | Begin the match at the beginning of the input string (or the beginning of the line if the method is called with the `RegexOptions.Multiline` option).
|
||||
`((\w+(\s?)){2,})` | Match one or more word characters followed by zero or one space, at least two times. This is the first capturing group. This expression also defines a second and third capturing group: The second consists of the captured word, and the third consists of the captured spaces.
|
||||
`,\s` | Match a comma followed by a white-space character.
|
||||
`(\w+\s\w+)` | Match one or more word characters followed by a space, followed by one or more word characters. This is the fourth capturing group.
|
||||
`,` | Match a comma.
|
||||
`\s\d{4}` | Match a space followed by four decimal digits.
|
||||
`(-(\d{4}`|`present))?` | Match zero or one occurrence of a hyphen followed by four decimal digits or the string "present". This is the sixth capturing group. It also includes a seventh capturing group.
|
||||
`,?` | Match zero or one occurrence of a comma.
|
||||
`(\s\d{4}(-(\d{4}`|`present))?,?)+` | Match one or more occurrences of the following: a space, four decimal digits, zero or one occurrence of a hyphen followed by four decimal digits or the string "present", and zero or one comma. This is the fifth capturing group.
|
||||
|
||||
## End of String or Line: $
|
||||
|
||||
The **$** anchor specifies that the preceding pattern must occur at the end of the input string, or before \n at the end of the input string.
|
||||
|
||||
If you use **$** with the [RegexOptions.Multiline](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexOptions#System_Text_RegularExpressions_RegexOptions_Multiline) option, the match can also occur at the end of a line. Note that **$** matches **\n** but does not match **\r\n** (the combination of carriage return and newline characters, or CR/LF). To match the CR/LF character combination, include **\r?$** in the regular expression pattern.
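The CR/LF behavior described above can be seen in isolation with a small sketch like the following. The class name `EndOfLineExample`, the helper `CountLines`, and the two-line input are hypothetical and serve only to illustrate the point; the input hard-codes `\r\n` so that the result does not depend on the platform's newline sequence.

```csharp
using System;
using System.Text.RegularExpressions;

public class EndOfLineExample
{
   // Hypothetical helper: counts matches using the Multiline option.
   public static int CountLines(string input, string pattern)
   {
      return Regex.Matches(input, pattern, RegexOptions.Multiline).Count;
   }

   public static void Main()
   {
      string input = "one\r\ntwo\r\n";

      // $ matches before \n, but each word here is followed by \r, so nothing matches.
      Console.WriteLine(CountLines(input, @"^\w+$"));
      // Allowing an optional \r before $ lets both lines match.
      Console.WriteLine(CountLines(input, @"^\w+\r?$"));
   }
}
// The example displays the following output:
//       0
//       2
```

Note that with `\r?$` the matched text includes the carriage return, so callers that need the bare line may want to capture `\w+` in a group instead of using the whole match.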
|
||||
|
||||
The following example adds the **$** anchor to the regular expression pattern used in the example in the previous "Start of String or Line" section. When used with the original input string, which includes five lines of text, the [Regex.Matches(String, String)](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Matches_System_String_System_String_) method is unable to find a match, because the end of the first line does not match the **$** pattern. When the original input string is split into a string array, the `Regex.Matches(String, String)` method succeeds in matching each of the five lines. When the [Regex.Matches(String, String, RegexOptions)](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Matches_System_String_System_String_System_Text_RegularExpressions_RegexOptions_) method is called with the *options* parameter set to `RegexOptions.Multiline`, no matches are found because the regular expression pattern does not account for the carriage return character (U+000D). However, when the regular expression pattern is modified by replacing **$** with **\r?$**, calling the `Regex.Matches(String, String, RegexOptions)` method with the *options* parameter set to `RegexOptions.Multiline` again finds five matches.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
int startPos = 0, endPos = 70;
|
||||
string cr = Environment.NewLine;
|
||||
string input = "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957" + cr +
|
||||
"Chicago Cubs, National League, 1903-present" + cr +
|
||||
"Detroit Tigers, American League, 1901-present" + cr +
|
||||
"New York Giants, National League, 1885-1957" + cr +
|
||||
"Washington Senators, American League, 1901-1960" + cr;
|
||||
Match match;
|
||||
|
||||
string basePattern = @"^((\w+(\s?)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))?,?)+";
|
||||
string pattern = basePattern + "$";
|
||||
Console.WriteLine("Attempting to match the entire input string:");
|
||||
if (input.Substring(startPos, endPos).Contains(",")) {
|
||||
match = Regex.Match(input, pattern);
|
||||
while (match.Success) {
|
||||
Console.Write("The {0} played in the {1} in",
|
||||
match.Groups[1].Value, match.Groups[4].Value);
|
||||
foreach (Capture capture in match.Groups[5].Captures)
|
||||
Console.Write(capture.Value);
|
||||
|
||||
Console.WriteLine(".");
|
||||
startPos = match.Index + match.Length;
|
||||
endPos = startPos + 70 <= input.Length ? 70 : input.Length - startPos;
|
||||
if (! input.Substring(startPos, endPos).Contains(",")) break;
|
||||
match = match.NextMatch();
|
||||
}
|
||||
Console.WriteLine();
|
||||
}
|
||||
|
||||
string[] teams = input.Split(new String[] { cr }, StringSplitOptions.RemoveEmptyEntries);
|
||||
Console.WriteLine("Attempting to match each element in a string array:");
|
||||
foreach (string team in teams)
|
||||
{
|
||||
if (team.Length > 70) continue;
|
||||
|
||||
match = Regex.Match(team, pattern);
|
||||
if (match.Success)
|
||||
{
|
||||
Console.Write("The {0} played in the {1} in",
|
||||
match.Groups[1].Value, match.Groups[4].Value);
|
||||
foreach (Capture capture in match.Groups[5].Captures)
|
||||
Console.Write(capture.Value);
|
||||
Console.WriteLine(".");
|
||||
}
|
||||
}
|
||||
Console.WriteLine();
|
||||
|
||||
startPos = 0;
|
||||
endPos = 70;
|
||||
Console.WriteLine("Attempting to match each line of an input string with '$':");
|
||||
if (input.Substring(startPos, endPos).Contains(",")) {
|
||||
match = Regex.Match(input, pattern, RegexOptions.Multiline);
|
||||
while (match.Success) {
|
||||
Console.Write("The {0} played in the {1} in",
|
||||
match.Groups[1].Value, match.Groups[4].Value);
|
||||
foreach (Capture capture in match.Groups[5].Captures)
|
||||
Console.Write(capture.Value);
|
||||
|
||||
Console.WriteLine(".");
|
||||
startPos = match.Index + match.Length;
|
||||
endPos = startPos + 70 <= input.Length ? 70 : input.Length - startPos;
|
||||
if (! input.Substring(startPos, endPos).Contains(",")) break;
|
||||
match = match.NextMatch();
|
||||
}
|
||||
Console.WriteLine();
|
||||
}
|
||||
|
||||
startPos = 0;
|
||||
endPos = 70;
|
||||
pattern = basePattern + "\r?$";
|
||||
Console.WriteLine(@"Attempting to match each line of an input string with '\r?$':");
|
||||
if (input.Substring(startPos, endPos).Contains(",")) {
|
||||
match = Regex.Match(input, pattern, RegexOptions.Multiline);
|
||||
while (match.Success) {
|
||||
Console.Write("The {0} played in the {1} in",
|
||||
match.Groups[1].Value, match.Groups[4].Value);
|
||||
foreach (Capture capture in match.Groups[5].Captures)
|
||||
Console.Write(capture.Value);
|
||||
|
||||
Console.WriteLine(".");
|
||||
startPos = match.Index + match.Length;
|
||||
endPos = startPos + 70 <= input.Length ? 70 : input.Length - startPos;
|
||||
if (! input.Substring(startPos, endPos).Contains(",")) break;
|
||||
match = match.NextMatch();
|
||||
}
|
||||
Console.WriteLine();
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Attempting to match the entire input string:
|
||||
//
|
||||
// Attempting to match each element in a string array:
|
||||
// The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
|
||||
// The Chicago Cubs played in the National League in 1903-present.
|
||||
// The Detroit Tigers played in the American League in 1901-present.
|
||||
// The New York Giants played in the National League in 1885-1957.
|
||||
// The Washington Senators played in the American League in 1901-1960.
|
||||
//
|
||||
// Attempting to match each line of an input string with '$':
|
||||
//
|
||||
// Attempting to match each line of an input string with '\r?$':
|
||||
// The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
|
||||
// The Chicago Cubs played in the National League in 1903-present.
|
||||
// The Detroit Tigers played in the American League in 1901-present.
|
||||
// The New York Giants played in the National League in 1885-1957.
|
||||
// The Washington Senators played in the American League in 1901-1960.
|
||||
```
|
||||
|
||||
## Start of String Only: \A
|
||||
|
||||
The **\A** anchor specifies that a match must occur at the beginning of the input string. It is identical to the **^** anchor, except that **\A** ignores the [RegexOptions.Multiline](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexOptions#System_Text_RegularExpressions_RegexOptions_Multiline) option. Therefore, it can only match the start of the first line in a multiline input string.
|
||||
|
||||
The following example is similar to the examples for the **^** and **$** anchors. It uses the **\A** anchor in a regular expression that extracts information about the years during which some professional baseball teams existed. The input string includes five lines. The call to the [Regex.Matches(String, String, RegexOptions)](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Matches_System_String_System_String_System_Text_RegularExpressions_RegexOptions_) method finds only the first substring in the input string that matches the regular expression pattern. As the example shows, the `Multiline` option has no effect.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
int startPos = 0, endPos = 70;
|
||||
string input = "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957\n" +
|
||||
"Chicago Cubs, National League, 1903-present\n" +
|
||||
"Detroit Tigers, American League, 1901-present\n" +
|
||||
"New York Giants, National League, 1885-1957\n" +
|
||||
"Washington Senators, American League, 1901-1960\n";
|
||||
|
||||
string pattern = @"\A((\w+(\s?)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))?,?)+";
|
||||
Match match;
|
||||
|
||||
if (input.Substring(startPos, endPos).Contains(",")) {
|
||||
match = Regex.Match(input, pattern, RegexOptions.Multiline);
|
||||
while (match.Success) {
|
||||
Console.Write("The {0} played in the {1} in",
|
||||
match.Groups[1].Value, match.Groups[4].Value);
|
||||
foreach (Capture capture in match.Groups[5].Captures)
|
||||
Console.Write(capture.Value);
|
||||
|
||||
Console.WriteLine(".");
|
||||
startPos = match.Index + match.Length;
|
||||
endPos = startPos + 70 <= input.Length ? 70 : input.Length - startPos;
|
||||
if (! input.Substring(startPos, endPos).Contains(",")) break;
|
||||
match = match.NextMatch();
|
||||
}
|
||||
Console.WriteLine();
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
|
||||
```
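
To contrast **\A** with **^** directly, the following minimal sketch counts matches for both anchors under `RegexOptions.Multiline`. The class name `StartAnchorDemo`, the helper `CountMatches`, and the three-line input are illustrative assumptions, not part of the original example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StartAnchorDemo
{
    // Hypothetical helper: counts matches of pattern in input with the
    // Multiline option always enabled.
    public static int CountMatches(string input, string pattern)
    {
        return Regex.Matches(input, pattern, RegexOptions.Multiline).Count;
    }

    public static void Main()
    {
        // Illustrative input with three lines separated by \n.
        string input = "one\ntwo\nthree";

        // ^ honors Multiline, so it can match at the start of every line.
        Console.WriteLine(CountMatches(input, @"^\w+"));

        // \A ignores Multiline, so it can match only at the start of the string.
        Console.WriteLine(CountMatches(input, @"\A\w+"));
    }
}
```

With this input, the **^** pattern should find one match per line, while the **\A** pattern should find only the first word of the string.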
|
||||
|
||||
## End of String or Before Ending Newline: \Z
|
||||
|
||||
The **\Z** anchor specifies that a match must occur at the end of the input string, or before **\n** at the end of the input string. It is identical to the **$** anchor, except that **\Z** ignores the [RegexOptions.Multiline](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexOptions#System_Text_RegularExpressions_RegexOptions_Multiline) option. Therefore, in a multiline string, it can only match the end of the last line, or the last line before **\n**.
|
||||
|
||||
Note that **\Z** matches **\n** but does not match **\r\n** (the CR/LF character combination). To match CR/LF, include **\r?\Z** in the regular expression pattern.
|
||||
|
||||
The following example uses the **\Z** anchor in a regular expression that is similar to the example in the previous "Start of String or Line" section, which extracts information about the years during which some professional baseball teams existed. The subexpression `\r?\Z` in the regular expression `^((\w+(\s?)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))?,?)+\r?\Z` matches the end of a string, and also matches a string that ends with **\n** or **\r\n**. As a result, each element in the array matches the regular expression pattern.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string[] inputs = { "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957",
|
||||
"Chicago Cubs, National League, 1903-present" + Environment.NewLine,
|
||||
"Detroit Tigers, American League, 1901-present" + Regex.Unescape(@"\n"),
|
||||
"New York Giants, National League, 1885-1957",
|
||||
"Washington Senators, American League, 1901-1960" + Environment.NewLine};
|
||||
string pattern = @"^((\w+(\s?)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))?,?)+\r?\Z";
|
||||
|
||||
foreach (string input in inputs)
|
||||
{
|
||||
if (input.Length > 70 || ! input.Contains(",")) continue;
|
||||
|
||||
Console.WriteLine(Regex.Escape(input));
|
||||
Match match = Regex.Match(input, pattern);
|
||||
if (match.Success)
|
||||
Console.WriteLine(" Match succeeded.");
|
||||
else
|
||||
Console.WriteLine(" Match failed.");
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Brooklyn\ Dodgers,\ National\ League,\ 1911,\ 1912,\ 1932-1957
|
||||
// Match succeeded.
|
||||
// Chicago\ Cubs,\ National\ League,\ 1903-present\r\n
|
||||
// Match succeeded.
|
||||
// Detroit\ Tigers,\ American\ League,\ 1901-present\n
|
||||
// Match succeeded.
|
||||
// New\ York\ Giants,\ National\ League,\ 1885-1957
|
||||
// Match succeeded.
|
||||
// Washington\ Senators,\ American\ League,\ 1901-1960\r\n
|
||||
// Match succeeded.
|
||||
```
|
||||
|
||||
## End of String Only: \z
|
||||
|
||||
The **\z** anchor specifies that a match must occur at the end of the input string. Like the **\Z** language element, **\z** ignores the [RegexOptions.Multiline](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexOptions#System_Text_RegularExpressions_RegexOptions_Multiline) option. Unlike the **\Z** language element, **\z** does not match before a **\n** character at the end of a string. Therefore, it can only match the very end of the input string.
|
||||
|
||||
The following example uses the **\z** anchor in a regular expression that is otherwise identical to the example in the previous section, which extracts information about the years during which some professional baseball teams existed. The example tries to match each of five elements in a string array with the regular expression pattern `^((\w+(\s?)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))?,?)+\r?\z`. Two of the strings end with carriage return and line feed characters, one ends with a line feed character, and two end with neither a carriage return nor a line feed character. As the output shows, only the strings without a carriage return or line feed character match the pattern.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string[] inputs = { "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957",
|
||||
"Chicago Cubs, National League, 1903-present" + Environment.NewLine,
|
||||
"Detroit Tigers, American League, 1901-present\n",
|
||||
"New York Giants, National League, 1885-1957",
|
||||
"Washington Senators, American League, 1901-1960" + Environment.NewLine };
|
||||
string pattern = @"^((\w+(\s?)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))?,?)+\r?\z";
|
||||
|
||||
foreach (string input in inputs)
|
||||
{
|
||||
if (input.Length > 70 || ! input.Contains(",")) continue;
|
||||
|
||||
Console.WriteLine(Regex.Escape(input));
|
||||
Match match = Regex.Match(input, pattern);
|
||||
if (match.Success)
|
||||
Console.WriteLine(" Match succeeded.");
|
||||
else
|
||||
Console.WriteLine(" Match failed.");
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Brooklyn\ Dodgers,\ National\ League,\ 1911,\ 1912,\ 1932-1957
|
||||
// Match succeeded.
|
||||
// Chicago\ Cubs,\ National\ League,\ 1903-present\r\n
|
||||
// Match failed.
|
||||
// Detroit\ Tigers,\ American\ League,\ 1901-present\n
|
||||
// Match failed.
|
||||
// New\ York\ Giants,\ National\ League,\ 1885-1957
|
||||
// Match succeeded.
|
||||
// Washington\ Senators,\ American\ League,\ 1901-1960\r\n
|
||||
// Match failed.
|
||||
```
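
The difference between **\Z** and **\z** can also be seen side by side in a smaller sketch. The class name `EndAnchorDemo`, the helper `EndsLine`, and the short inputs below are assumptions made for illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EndAnchorDemo
{
    // Hypothetical helper: tests whether the literal text "line",
    // followed by the given end anchor, matches the input.
    public static bool EndsLine(string input, string endAnchor)
    {
        return Regex.IsMatch(input, "line" + endAnchor);
    }

    public static void Main()
    {
        string[] inputs = { "line", "line\n", "line\r\n" };
        string[] anchors = { @"\Z", @"\z", @"\r?\Z", @"\r?\z" };

        foreach (string input in inputs)
            foreach (string anchor in anchors)
                Console.WriteLine("{0,-10} {1,-6} {2}",
                                  Regex.Escape(input), anchor, EndsLine(input, anchor));
    }
}
```

In this sketch, **\Z** accepts a single trailing **\n** but **\z** does not, and neither accepts a trailing **\r\n** unless the pattern accounts for the carriage return explicitly. Even then, only `\r?\Z` matches `"line\r\n"`, because **\z** still requires the very end of the string.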
|
||||
|
||||
## Contiguous Matches: \G
|
||||
|
||||
The **\G** anchor specifies that a match must occur at the point where the previous match ended. When you use this anchor with the [Regex.Matches](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Matches_System_String_) or [Match.NextMatch](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match#System_Text_RegularExpressions_Match_NextMatch) method, it ensures that all matches are contiguous.
|
||||
|
||||
The following example uses a regular expression to extract the names of rodent species from a comma-delimited string.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string input = "capybara,squirrel,chipmunk,porcupine,gopher," +
|
||||
"beaver,groundhog,hamster,guinea pig,gerbil," +
|
||||
"chinchilla,prairie dog,mouse,rat";
|
||||
string pattern = @"\G(\w+\s?\w*),?";
|
||||
Match match = Regex.Match(input, pattern);
|
||||
while (match.Success)
|
||||
{
|
||||
Console.WriteLine(match.Groups[1].Value);
|
||||
match = match.NextMatch();
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// capybara
|
||||
// squirrel
|
||||
// chipmunk
|
||||
// porcupine
|
||||
// gopher
|
||||
// beaver
|
||||
// groundhog
|
||||
// hamster
|
||||
// guinea pig
|
||||
// gerbil
|
||||
// chinchilla
|
||||
// prairie dog
|
||||
// mouse
|
||||
// rat
|
||||
```
|
||||
|
||||
The regular expression `\G(\w+\s?\w*),?` is interpreted as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\G` | Begin where the last match ended.
|
||||
`\w+` | Match one or more word characters.
|
||||
`\s?` | Match zero or one space.
|
||||
`\w*` | Match zero or more word characters.
|
||||
`(\w+\s?\w*)` | Match one or more word characters followed by zero or one space, followed by zero or more word characters. This is the first capturing group.
|
||||
`,?` | Match zero or one occurrence of a literal comma character.
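
To see what the contiguity requirement means in practice, the following sketch runs the same pattern with and without **\G** against an input in which a semicolon interrupts the comma-delimited list. The class name `ContiguousDemo`, the helper `Capture`, and the input string are hypothetical and are not taken from the example above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ContiguousDemo
{
    // Hypothetical helper: returns the first capturing group of every match.
    public static List<string> Capture(string input, string pattern)
    {
        var names = new List<string>();
        foreach (Match match in Regex.Matches(input, pattern))
            names.Add(match.Groups[1].Value);
        return names;
    }

    public static void Main()
    {
        // Illustrative input: the ';' breaks the run of contiguous matches.
        string input = "capybara,squirrel;chipmunk,porcupine";

        // With \G, matching stops at the first position where a match
        // cannot begin exactly where the previous one ended.
        Console.WriteLine(string.Join(" | ", Capture(input, @"\G(\w+\s?\w*),?")));

        // Without \G, the engine skips the ';' and keeps finding matches.
        Console.WriteLine(string.Join(" | ", Capture(input, @"(\w+\s?\w*),?")));
    }
}
```

With **\G**, only the names before the semicolon should be returned; without it, all four names are found.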
|
||||
|
||||
## Word Boundary: \b
|
||||
|
||||
The **\b** anchor specifies that the match must occur on a boundary between a word character (the **\w** language element) and a non-word character (the **\W** language element). Word characters consist of alphanumeric characters and underscores; a non-word character is any character that is not alphanumeric or an underscore. (For more information, see [Character Classes in Regular Expressions](characterclasses.md).) The match may also occur on a word boundary at the beginning or end of the string.
|
||||
|
||||
The **\b** anchor is frequently used to ensure that a subexpression matches an entire word instead of just the beginning or end of a word. The regular expression `\bare\w*\b` in the following example illustrates this usage. It matches any word that begins with the substring "are". The output from the example also illustrates that **\b** matches both the beginning and the end of the input string.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string input = "area bare arena mare";
|
||||
string pattern = @"\bare\w*\b";
|
||||
Console.WriteLine("Words that begin with 'are':");
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine("'{0}' found at position {1}",
|
||||
match.Value, match.Index);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Words that begin with 'are':
|
||||
// 'area' found at position 0
|
||||
// 'arena' found at position 10
|
||||
```
|
||||
|
||||
The regular expression pattern is interpreted as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Begin the match at a word boundary.
|
||||
`are` | Match the substring "are".
|
||||
`\w*` | Match zero or more word characters.
|
||||
`\b` | End the match at a word boundary.
|
||||
|
||||
## Non-Word Boundary: \B
|
||||
|
||||
The **\B** anchor specifies that the match must not occur on a word boundary. It is the opposite of the **\b** anchor.
|
||||
|
||||
The following example uses the **\B** anchor to locate occurrences of the substring "qu" in a word. The regular expression pattern `\Bqu\w+` matches a substring that begins with a "qu" that does not start a word and that continues to the end of the word.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string input = "equity queen equip acquaint quiet";
|
||||
string pattern = @"\Bqu\w+";
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine("'{0}' found at position {1}",
|
||||
match.Value, match.Index);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// 'quity' found at position 1
|
||||
// 'quip' found at position 14
|
||||
// 'quaint' found at position 21
|
||||
```
|
||||
|
||||
The regular expression pattern is interpreted as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\B` | Do not begin the match at a word boundary.
|
||||
`qu` | Match the substring "qu".
|
||||
`\w+` | Match one or more word characters.
|
||||
|
||||
## See Also
|
||||
|
||||
[Regular Expression Language - Quick Reference](index.md)
|
||||
|
||||
|
|
@ -1,245 +0,0 @@
|
|||
---
|
||||
title: Backreference Constructs in Regular Expressions
|
||||
description: Backreference Constructs in Regular Expressions
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 1bbc9818-133a-43c3-97eb-d0575174e6f9
|
||||
---
|
||||
|
||||
# Backreference Constructs in Regular Expressions
|
||||
|
||||
Backreferences provide a convenient way to identify a repeated character or substring within a string. For example, if the input string contains multiple occurrences of an arbitrary substring, you can match the first occurrence with a capturing group, and then use a backreference to match subsequent occurrences of the substring.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> A separate syntax is used to refer to named and numbered capturing groups in replacement strings. For more information, see [Substitutions in Regular Expressions](substitutions.md).
|
||||
|
||||
.NET Core defines separate language elements to refer to numbered and named capturing groups. For more information about capturing groups, see [Grouping Constructs in Regular Expressions](grouping.md).
|
||||
|
||||
## Numbered Backreferences
|
||||
|
||||
A numbered backreference uses the following syntax:
|
||||
|
||||
**\**_number_
|
||||
|
||||
where *number* is the ordinal position of the capturing group in the regular expression. For example, `\4` matches the contents of the fourth capturing group. If *number* is not defined in the regular expression pattern, a parsing error occurs, and the regular expression engine throws an [ArgumentException](https://docs.microsoft.com/dotnet/core/api/System.ArgumentException). For example, the regular expression `\b(\w+)\s\1` is valid, because `(\w+)` is the first and only capturing group in the expression. On the other hand, `\b(\w+)\s\2` is invalid and throws an argument exception, because there is no capturing group numbered `\2`.
|
||||
|
||||
Note the ambiguity between octal escape codes (such as `\16`) and **\**_number_ backreferences that use the same notation. This ambiguity is resolved as follows:
|
||||
|
||||
* The expressions `\1` through `\9` are always interpreted as backreferences, and not as octal codes.
|
||||
|
||||
* If the first digit of a multidigit expression is 8 or 9 (such as `\80` or `\91`), the expression is interpreted as a literal.
|
||||
|
||||
* Expressions from `\10` and greater are considered backreferences if there is a capturing group corresponding to that number; otherwise, they are interpreted as octal codes.
|
||||
|
||||
* If a regular expression contains a backreference to an undefined group number, a parsing error occurs, and the regular expression engine throws an [ArgumentException](https://docs.microsoft.com/dotnet/core/api/System.ArgumentException).
|
||||
|
||||
If the ambiguity is a problem, you can use the **\k<**_name_**>** notation, which is unambiguous and cannot be confused with octal character codes. Similarly, hexadecimal codes such as `\xdd` are unambiguous and cannot be confused with backreferences.
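
The parsing rules above can be checked with a short sketch that reports whether the `Regex` constructor accepts a pattern. The class name `BackreferenceParsingDemo` and the helper `IsValidPattern` are hypothetical, and the group names `word` and `other` are chosen only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BackreferenceParsingDemo
{
    // Hypothetical helper: returns true if the pattern parses, or false if
    // the Regex constructor rejects it with an ArgumentException.
    public static bool IsValidPattern(string pattern)
    {
        try
        {
            new Regex(pattern);
            return true;
        }
        catch (ArgumentException)
        {
            return false;
        }
    }

    public static void Main()
    {
        // \1 refers to the only capturing group, so the pattern is valid.
        Console.WriteLine(IsValidPattern(@"\b(\w+)\s\1"));

        // There is no second capturing group, so \2 is a parsing error.
        Console.WriteLine(IsValidPattern(@"\b(\w+)\s\2"));

        // A named backreference to a defined group is valid.
        Console.WriteLine(IsValidPattern(@"\b(?<word>\w+)\s\k<word>"));

        // A named backreference to an undefined group is a parsing error.
        Console.WriteLine(IsValidPattern(@"\b(?<word>\w+)\s\k<other>"));
    }
}
```

Catching `ArgumentException` here is only a way to observe the parsing error; production code would normally validate patterns once, at construction time.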
|
||||
|
||||
The following example finds doubled word characters in a string. It defines a regular expression, `(\w)\1`, which consists of the following elements.
|
||||
|
||||
Element | Description
|
||||
------- | -----------
|
||||
`(\w)` | Match a word character and assign it to the first capturing group.
|
||||
`\1` | Match the next character that is the same as the value of the first capturing group.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"(\w)\1";
|
||||
string input = "trellis llama webbing dresser swagger";
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine("Found '{0}' at position {1}.",
|
||||
match.Value, match.Index);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Found 'll' at position 3.
|
||||
// Found 'll' at position 8.
|
||||
// Found 'bb' at position 16.
|
||||
// Found 'ss' at position 25.
|
||||
// Found 'gg' at position 33.
|
||||
```
|
||||
|
||||
## Named Backreferences
|
||||
|
||||
A named backreference is defined by using the following syntax:
|
||||
|
||||
**\k<**_name_**>**
|
||||
|
||||
or:
|
||||
|
||||
**\k'**_name_**'**
|
||||
|
||||
where *name* is the name of a capturing group defined in the regular expression pattern. If *name* is not defined in the regular expression pattern, a parsing error occurs, and the regular expression engine throws an [ArgumentException](https://docs.microsoft.com/dotnet/core/api/System.ArgumentException).
|
||||
|
||||
The following example finds doubled word characters in a string. It defines a regular expression, `(?<char>\w)\k<char>`, which consists of the following elements.
|
||||
|
||||
Element | Description
|
||||
------- | -----------
|
||||
`(?<char>\w)` | Match a word character and assign it to a capturing group named char.
|
||||
`\k<char>` | Match the next character that is the same as the value of the char capturing group.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"(?<char>\w)\k<char>";
|
||||
string input = "trellis llama webbing dresser swagger";
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine("Found '{0}' at position {1}.",
|
||||
match.Value, match.Index);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Found 'll' at position 3.
|
||||
// Found 'll' at position 8.
|
||||
// Found 'bb' at position 16.
|
||||
// Found 'ss' at position 25.
|
||||
// Found 'gg' at position 33.
|
||||
```
|
||||
|
||||
Note that *name* can also be the string representation of a number. For example, the following example uses the regular expression `(?<2>\w)\k<2>` to find doubled word characters in a string.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"(?<2>\w)\k<2>";
|
||||
string input = "trellis llama webbing dresser swagger";
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine("Found '{0}' at position {1}.",
|
||||
match.Value, match.Index);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Found 'll' at position 3.
|
||||
// Found 'll' at position 8.
|
||||
// Found 'bb' at position 16.
|
||||
// Found 'ss' at position 25.
|
||||
// Found 'gg' at position 33.
|
||||
```
|
||||
|
||||
## What Backreferences Match
|
||||
|
||||
A backreference refers to the most recent definition of a group (the definition most immediately to the left, when matching left to right). When a group makes multiple captures, a backreference refers to the most recent capture.
|
||||
|
||||
The following example includes a regular expression pattern, `(?<1>a)(?<1>\1b)*`, which redefines the \1 named group. The following table describes each pattern in the regular expression.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`(?<1>a)` | Match the character "a" and assign the result to the capturing group named 1.
|
||||
`(?<1>\1b)*` | Match zero or more occurrences of the group named 1 along with a "b", and assign the result to the capturing group named 1.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"(?<1>a)(?<1>\1b)*";
|
||||
string input = "aababb";
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
{
|
||||
Console.WriteLine("Match: " + match.Value);
|
||||
foreach (Group group in match.Groups)
|
||||
Console.WriteLine(" Group: " + group.Value);
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Match: aababb
// Group: aababb
|
||||
// Group: abb
|
||||
```
|
||||
|
||||
In comparing the regular expression with the input string ("aababb"), the regular expression engine performs the following operations:
|
||||
|
||||
1. It starts at the beginning of the string, and successfully matches "a" with the expression `(?<1>a)`. The value of the 1 group is now "a".
|
||||
|
||||
2. It advances to the second character, and successfully matches the string "ab" with the expression `\1b`, or "ab". It then assigns the result, "ab" to `\1`.
|
||||
|
||||
3. It advances to the fourth character. The expression `(?<1>\1b)` is to be matched zero or more times, so it successfully matches the string "abb" with the expression `\1b`. It assigns the result, "abb", back to `\1`.
|
||||
|
||||
In this example, \* is a looping quantifier -- it is evaluated repeatedly until the regular expression engine cannot match the pattern it defines. Looping quantifiers do not clear group definitions.
|
||||
|
||||
If a group has not captured any substrings, a backreference to that group is undefined and never matches. This is illustrated by the regular expression pattern `\b(\p{Lu}{2})(\d{2})?(\p{Lu}{2})\b`, which is defined as follows:
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Begin the match on a word boundary.
|
||||
`(\p{Lu}{2})` | Match two uppercase letters. This is the first capturing group.
|
||||
`(\d{2})?` | Match zero or one occurrence of two decimal digits. This is the second capturing group.
|
||||
`(\p{Lu}{2})` | Match two uppercase letters. This is the third capturing group.
|
||||
`\b` | End the match on a word boundary.
|
||||
|
||||
An input string can match this regular expression even if the two decimal digits that are defined by the second capturing group are not present. The following example shows that even though the match is successful, an empty capturing group is found between two successful capturing groups.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"\b(\p{Lu}{2})(\d{2})?(\p{Lu}{2})\b";
|
||||
string[] inputs = { "AA22ZZ", "AABB" };
|
||||
foreach (string input in inputs)
|
||||
{
|
||||
Match match = Regex.Match(input, pattern);
|
||||
if (match.Success)
|
||||
{
|
||||
Console.WriteLine("Match in {0}: {1}", input, match.Value);
|
||||
if (match.Groups.Count > 1)
|
||||
{
|
||||
for (int ctr = 1; ctr <= match.Groups.Count - 1; ctr++)
|
||||
{
|
||||
if (match.Groups[ctr].Success)
|
||||
Console.WriteLine("Group {0}: {1}",
|
||||
ctr, match.Groups[ctr].Value);
|
||||
else
|
||||
Console.WriteLine("Group {0}: <no match>", ctr);
|
||||
}
|
||||
}
|
||||
}
|
||||
Console.WriteLine();
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Match in AA22ZZ: AA22ZZ
|
||||
// Group 1: AA
|
||||
// Group 2: 22
|
||||
// Group 3: ZZ
|
||||
//
|
||||
// Match in AABB: AABB
|
||||
// Group 1: AA
|
||||
// Group 2: <no match>
|
||||
// Group 3: BB
|
||||
```
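
The example above shows an empty capturing group but does not contain a backreference to it. The following sketch, which assumes the default options (in particular, that `RegexOptions.ECMAScript` is not set), adds one. The pattern `^(a)?\1b$`, the class name `UnsetGroupDemo`, and the inputs are hypothetical and chosen only to illustrate that a backreference to a group with no capture does not match.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnsetGroupDemo
{
    // Hypothetical pattern: an optional first group followed by a
    // backreference to that group, then a literal "b".
    public const string Pattern = @"^(a)?\1b$";

    public static bool Matches(string input)
    {
        return Regex.IsMatch(input, Pattern);
    }

    public static void Main()
    {
        // Group 1 captures "a", so \1 matches the second "a".
        Console.WriteLine("aab: {0}", Matches("aab"));

        // Group 1 captures nothing, so \1 is undefined and cannot match,
        // even though the rest of the pattern would fit the input.
        Console.WriteLine("b:   {0}", Matches("b"));
    }
}
```

Only the input in which the optional group actually captures should match; the bare `"b"` fails because the backreference has nothing to refer to.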
|
||||
|
||||
## See Also
|
||||
|
||||
[Regular Expression Language - Quick Reference](index.md)
|
||||
|
|
@ -1,892 +0,0 @@
|
|||
---
|
||||
title: Character Classes in Regular Expressions
|
||||
description: Character Classes in Regular Expressions
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: da569db8-0702-4bcd-9bbc-559ee64fa4b3
|
||||
---
|
||||
|
||||
# Character Classes in Regular Expressions
|
||||
|
||||
A character class defines a set of characters, any one of which can occur in an input string for a match to succeed. The regular expression language in .NET Core supports the following character classes:
|
||||
|
||||
* Positive character groups. A character in the input string must match one of a specified set of characters. For more information, see [Positive Character Group](#Positive-Character-Group:-[-]).
|
||||
|
||||
* Negative character groups. A character in the input string must not match one of a specified set of characters. For more information, see [Negative Character Group](#Negative-Character-Group:-[^]).
|
||||
|
||||
* Any character. The . (dot or period) character in a regular expression is a wildcard character that matches any character except **\n**. For more information, see [Any Character](#Any-Character:.).
|
||||
|
||||
* A general Unicode category or named block. A character in the input string must be a member of a particular Unicode category or must fall within a contiguous range of Unicode characters for a match to succeed. For more information, see [Unicode Category or Unicode Block](#Unicode-Category-or-Unicode-Block:\p{}).
|
||||
|
||||
* A negative general Unicode category or named block. A character in the input string must not be a member of a particular Unicode category or must not fall within a contiguous range of Unicode characters for a match to succeed. For more information, see [Negative Unicode Category or Unicode Block](#Negative-Unicode-Category-or-Unicode-Block:\P{}).
|
||||
|
||||
* A word character. A character in the input string can belong to any of the Unicode categories that are appropriate for characters in words. For more information, see [Word Character](#Word-Character:\w).
|
||||
|
||||
* A non-word character. A character in the input string can belong to any Unicode category that is not a word character. For more information, see [Non-Word Character](#Non-Word-Character:\W).
|
||||
|
||||
* A white-space character. A character in the input string can be any Unicode separator character, as well as any one of a number of control characters. For more information, see [White-Space Character](#White-Space-Character:\s).
|
||||
|
||||
* A non-white-space character. A character in the input string can be any character that is not a white-space character. For more information, see [Non-White-Space Character](#Non-White-Space-Character:\S).
|
||||
|
||||
* A decimal digit. A character in the input string can be any of a number of characters classified as Unicode decimal digits. For more information, see [Decimal Digit Character](#Decimal-Digit-Character:\d).
|
||||
|
||||
* A non-decimal digit. A character in the input string can be anything other than a Unicode decimal digit. For more information, see [Non-Digit Character](#Non-Digit-Character:\D).
|
||||
|
||||
|
||||
.NET Core supports character class subtraction expressions, which enable you to define a set of characters as the result of excluding one character class from another character class. For more information, see [Character Class Subtraction](#Character-Class-Subtraction:-[base_group---[excluded_group]]).
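
As a brief illustration of character class subtraction, the following sketch uses the pattern `[a-z-[aeiou]]+` to match runs of lowercase ASCII consonants. The class name, pattern, and input string are chosen here for illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

public class SubtractionExample
{
   public static void Main()
   {
      // [a-z-[aeiou]] is the range a-z with the vowels a, e, i, o, u excluded.
      string pattern = @"[a-z-[aeiou]]+";
      string input = "education";

      foreach (Match match in Regex.Matches(input, pattern))
         Console.WriteLine(match.Value);
   }
}
// The example displays the following output:
//       d
//       c
//       t
//       n
```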
|
||||
|
||||
## Positive Character Group: [ ]
|
||||
|
||||
A positive character group specifies a list of characters, any one of which may appear in an input string for a match to occur. This list of characters may be specified individually, as a range, or both.
|
||||
|
||||
The syntax for specifying a list of individual characters is as follows:
|
||||
|
||||
[*character*_*group*]
|
||||
|
||||
where *character_group* is a list of the individual characters that can appear in the input string for a match to succeed. *character*_*group* can consist of any combination of one or more literal characters, [escape characters](escapes.md), or character classes.
|
||||
|
||||
The syntax for specifying a range of characters is as follows:
|
||||
|
||||
```
|
||||
[firstCharacter-lastCharacter]
|
||||
```
|
||||
|
||||
where *firstCharacter* is the character that begins the range and *lastCharacter* is the character that ends the range. A character range is a contiguous series of characters defined by specifying the first character in the series, a hyphen (-), and then the last character in the series. Two characters are contiguous if they have adjacent Unicode code points.
|
||||
|
||||
Some common regular expression patterns that contain positive character classes are listed in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`[aeiou]` | Match all vowels.
|
||||
`[\p{P}\d]` | Match all punctuation and decimal digit characters.
|
||||
`[\s\p{P}]` | Match all white-space and punctuation.
|
||||
|
||||
The following example defines a positive character group that contains the characters "a" and "e" so that the input string must contain the words "grey" or "gray" followed by another word for a match to occur.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"gr[ae]y\s\S+?[\s\p{P}]";
|
||||
string input = "The gray wolf jumped over the grey wall.";
|
||||
MatchCollection matches = Regex.Matches(input, pattern);
|
||||
foreach (Match match in matches)
|
||||
Console.WriteLine(match.Value);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// gray wolf
|
||||
// grey wall.
|
||||
```
|
||||
|
||||
The regular expression `gr[ae]y\s\S+?[\s\p{P}]` is defined as follows:
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`gr` | Match the literal characters "gr".
|
||||
`[ae]` | Match either an "a" or an "e".
|
||||
`y\s` | Match the literal character "y" followed by a white-space character.
|
||||
`\S+?` | Match one or more non-white-space characters, but as few as possible.
|
||||
`[\s\p{P}]` | Match either a white-space character or a punctuation mark.
|
||||
|
||||
The following example matches words that begin with any capital letter. It uses the subexpression `[A-Z]` to represent the range of capital letters from A to Z.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"\b[A-Z]\w*\b";
|
||||
string input = "A city Albany Zulu maritime Marseilles";
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine(match.Value);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// A
|
||||
// Albany
|
||||
// Zulu
|
||||
// Marseilles
|
||||
```
|
||||
|
||||
The regular expression `\b[A-Z]\w*\b` is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Start at a word boundary.
|
||||
`[A-Z]` | Match any uppercase character from A to Z.
|
||||
`\w*` | Match zero or more word characters.
|
||||
`\b` | Match a word boundary.
|
||||
|
||||
## Negative Character Group: [^]
|
||||
|
||||
A negative character group specifies a list of characters that must not appear in an input string for a match to occur. The list of characters may be specified individually, as a range, or both.
|
||||
|
||||
The syntax for specifying a list of individual characters is as follows:
|
||||
|
||||
[^*character*_*group*]
|
||||
|
||||
where *character_group* is a list of the individual characters that cannot appear in the input string for a match to succeed. *character*_*group* can consist of any combination of one or more literal characters, [escape characters](escapes.md), or character classes.
|
||||
|
||||
The syntax for specifying a range of characters is as follows:
|
||||
|
||||
[^*firstCharacter-lastCharacter*]
|
||||
|
||||
where *firstCharacter* is the character that begins the range, and *lastCharacter* is the character that ends the range. A character range is a contiguous series of characters defined by specifying the first character in the series, a hyphen (-), and then the last character in the series. Two characters are contiguous if they have adjacent Unicode code points.
|
||||
|
||||
Two or more character ranges can be concatenated. For example, to specify the range of decimal digits from "0" through "9", the range of lowercase letters from "a" through "f", and the range of uppercase letters from "A" through "F", use `[0-9a-fA-F]`.
|
||||
|
||||
The leading caret character (^) in a negative character group is mandatory and indicates that the character group is a negative character group instead of a positive character group.
|
||||
|
||||
> **Important**
|
||||
>
|
||||
> A negative character group in a larger regular expression pattern is not a zero-width assertion. That is, after evaluating the negative character group, the regular expression engine advances one character in the input string.
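
To make the point about width concrete, the following sketch contrasts a negative character group with a zero-width negative lookahead. The class name and input strings are illustrative assumptions: `q[^u]` needs an actual character after the "q", so it fails when "q" is the last character, while `q(?!u)` only asserts that no "u" follows.

```csharp
using System;
using System.Text.RegularExpressions;

public class NegativeGroupWidthExample
{
   public static void Main()
   {
      // [^u] must consume one character; nothing follows the "q" in "Iraq".
      Console.WriteLine(Regex.IsMatch("Iraq", @"q[^u]"));

      // (?!u) is a zero-width assertion, so it succeeds at the end of the string.
      Console.WriteLine(Regex.IsMatch("Iraq", @"q(?!u)"));

      // When a character does follow, the negative group includes it in the match.
      Console.WriteLine(Regex.Match("Iraqi", @"q[^u]").Value);
   }
}
// The example displays the following output:
//       False
//       True
//       qi
```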
|
||||
|
||||
Some common regular expression patterns that contain negative character groups are listed in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`[^aeiou]` | Match all characters except vowels.
|
||||
`[^\p{P}\d]` | Match all characters except punctuation and decimal digit characters.
|
||||
|
||||
The following example matches any word that begins with the characters "th" and is not followed by an "o".
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"\bth[^o]\w+\b";
|
||||
string input = "thought thing though them through thus thorough this";
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine(match.Value);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// thing
|
||||
// them
|
||||
// through
|
||||
// thus
|
||||
// this
|
||||
```
|
||||
|
||||
The regular expression `\bth[^o]\w+\b` is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Start at a word boundary.
|
||||
`th` | Match the literal characters "th".
|
||||
`[^o]` | Match any character that is not an "o".
|
||||
`\w+` | Match one or more word characters.
|
||||
`\b` | End at a word boundary.
|
||||
|
||||
## Any Character: .
|
||||
|
||||
The period character (.) matches any character except **\n** (the newline character, **\u000A**), with the following two qualifications:
|
||||
|
||||
* If a regular expression pattern is modified by the `RegexOptions.Singleline` option, or if the portion of the pattern that contains the . character class is modified by the **s** option, . matches any character. For more information, see [Regular Expression Options](options.md).
|
||||
|
||||
The following example illustrates the different behavior of the . character class by default and with the `RegexOptions.Singleline` option. The regular expression `^.+` starts at the beginning of the string and matches every character. By default, the match ends at the end of the first line; the regular expression pattern matches the carriage return character, **\r** or **\u000D**, but it does not match **\n**. Because the `RegexOptions.Singleline` option interprets the entire input string as a single line, it matches every character in the input string, including **\n**.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = "^.+";
|
||||
string input = "This is one line and" + Environment.NewLine + "this is the second.";
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine(Regex.Escape(match.Value));
|
||||
|
||||
Console.WriteLine();
|
||||
foreach (Match match in Regex.Matches(input, pattern, RegexOptions.Singleline))
|
||||
Console.WriteLine(Regex.Escape(match.Value));
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// This\ is\ one\ line\ and\r
|
||||
//
|
||||
// This\ is\ one\ line\ and\r\nthis\ is\ the\ second\.
|
||||
```
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> Because it matches any character except **\n**, the . character class also matches **\r** (the carriage return character, **\u000D**).
|
||||
|
||||
* In a positive or negative character group, a period is treated as a literal period character, and not as a character class. For more information, see [Positive Character Group](#Positive-Character-Group:-[-]) or [Negative Character Group](#Negative-Character-Group:-[^]) earlier in this topic. The following example provides an illustration by defining a regular expression that includes the period character (**.**) both as a character class and as a member of a positive character group. The regular expression `\b.*[.?!;:](\s|\z)` begins at a word boundary, matches any character until it encounters one of five punctuation marks, including a period, and then matches either a white-space character or the end of the string.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"\b.*[.?!;:](\s|\z)";
|
||||
string input = "this. what: is? go, thing.";
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine(match.Value);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// this. what: is? go, thing.
|
||||
```
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> Because it matches any character, the . language element is often used with a lazy quantifier if a regular expression pattern attempts to match any character multiple times. For more information, see [Quantifiers in Regular Expressions](quantifiers.md).
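
The difference between a greedy and a lazy quantifier applied to the . character class can be sketched as follows. The class name, input string, and patterns are illustrative assumptions and are not part of the examples above.

```csharp
using System;
using System.Text.RegularExpressions;

public class LazyDotExample
{
   public static void Main()
   {
      string input = "<b>bold</b> and <i>italic</i>";

      // Greedy: .+ consumes as much as possible, so the match runs to the last ">".
      Console.WriteLine(Regex.Match(input, "<.+>").Value);

      // Lazy: .+? consumes as little as possible, so the match stops at the first ">".
      Console.WriteLine(Regex.Match(input, "<.+?>").Value);
   }
}
// The example displays the following output:
//       <b>bold</b> and <i>italic</i>
//       <b>
```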
|
||||
|
||||
## Unicode Category or Unicode Block: \p{}
|
||||
|
||||
The Unicode standard assigns each character a general category. For example, a particular character can be an uppercase letter (represented by the **Lu** category), a decimal digit (the **Nd** category), a math symbol (the **Sm** category), or a paragraph separator (the **Zp** category). Specific character sets in the Unicode standard also occupy a specific range or block of consecutive code points. For example, the basic Latin character set is found from **\u0000** through **\u007F**, while the Arabic character set is found from **\u0600** through **\u06FF**.
|
||||
|
||||
The regular expression construct
|
||||
|
||||
**\p{**_name_**}**
|
||||
|
||||
matches any character that belongs to a Unicode general category or named block, where *name* is the category abbreviation or named block name. For a list of category abbreviations, see the [Supported Unicode General Categories](#Supported-Unicode-General-Categories) section later in this topic. For a list of named blocks, see the [Supported Named Blocks](#Supported-Named-Blocks) section later in this topic.
|
||||
|
||||
The following example uses the **\p{**_name_**}** construct to match both a Unicode general category (in this case, the **Pd**, or Punctuation, Dash category) and a named block (the **IsGreek** and **IsBasicLatin** named blocks).
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"\b(\p{IsGreek}+(\s)?)+\p{Pd}\s(\p{IsBasicLatin}+(\s)?)+";
|
||||
string input = "Κατα Μαθθαίον - The Gospel of Matthew";
|
||||
|
||||
Console.WriteLine(Regex.IsMatch(input, pattern)); // Displays True.
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
The regular expression `\b(\p{IsGreek}+(\s)?)+\p{Pd}\s(\p{IsBasicLatin}+(\s)?)+` is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Start at a word boundary.
|
||||
`\p{IsGreek}+` | Match one or more Greek characters.
|
||||
`(\s)?` | Match zero or one white-space character.
|
||||
`(\p{IsGreek}+(\s)?)+` | Match the pattern of one or more Greek characters followed by zero or one white-space characters one or more times.
|
||||
`\p{Pd}` | Match a Punctuation, Dash character.
|
||||
`\s` | Match a white-space character.
|
||||
`\p{IsBasicLatin}+` | Match one or more basic Latin characters.
|
||||
`(\s)?` | Match zero or one white-space character.
|
||||
`(\p{IsBasicLatin}+(\s)?)+` | Match the pattern of one or more basic Latin characters followed by zero or one white-space characters one or more times.
|
||||
|
||||
## Negative Unicode Category or Unicode Block: \P{}
|
||||
|
||||
The Unicode standard assigns each character a general category. For example, a particular character can be an uppercase letter (represented by the **Lu** category), a decimal digit (the **Nd** category), a math symbol (the **Sm** category), or a paragraph separator (the **Zp** category). Specific character sets in the Unicode standard also occupy a specific range or block of consecutive code points. For example, the basic Latin character set is found from **\u0000** through **\u007F**, while the Arabic character set is found from **\u0600** through **\u06FF**.
|
||||
|
||||
The regular expression construct
|
||||
|
||||
**\P{**_name_**}**
|
||||
|
||||
matches any character that does not belong to a Unicode general category or named block, where *name* is the category abbreviation or named block name. For a list of category abbreviations, see the [Supported Unicode General Categories](#Supported-Unicode-General-Categories) section later in this topic. For a list of named blocks, see the [Supported Named Blocks](#Supported-Named-Blocks) section later in this topic.
|
||||
|
||||
The following example uses the **\P{**_name_**}** construct to remove any currency symbols (in this case, the **Sc**, or Symbol, Currency category) from numeric strings.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"(\P{Sc})+";
|
||||
|
||||
string[] values = { "$164,091.78", "£1,073,142.68", "73¢", "€120" };
|
||||
foreach (string value in values)
|
||||
Console.WriteLine(Regex.Match(value, pattern).Value);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// 164,091.78
|
||||
// 1,073,142.68
|
||||
// 73
|
||||
// 120
|
||||
```
|
||||
|
||||
The regular expression pattern `(\P{Sc})+` matches one or more characters that are not currency symbols; it effectively strips any currency symbol from the result string.
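
An alternative way to reach the same result is to remove the currency symbols directly with the positive **\p{Sc}** construct and `Regex.Replace`. The following is a minimal sketch under that assumption; the class name and sample values are illustrative. Unlike matching `(\P{Sc})+`, this approach removes a currency symbol wherever it appears in the string.

```csharp
using System;
using System.Text.RegularExpressions;

public class RemoveCurrencyExample
{
   public static void Main()
   {
      // \p{Sc} matches any character in the Symbol, Currency category.
      string pattern = @"\p{Sc}";
      string[] values = { "$164,091.78", "73¢", "€120" };

      foreach (string value in values)
         Console.WriteLine(Regex.Replace(value, pattern, ""));
   }
}
// The example displays the following output:
//       164,091.78
//       73
//       120
```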
|
||||
|
||||
## Word Character: \w
|
||||
|
||||
**\w** matches any word character. A word character is a member of any of the Unicode categories listed in the following table.
|
||||
|
||||
Category | Description
|
||||
-------- | -----------
|
||||
Ll | Letter, Lowercase
|
||||
Lu | Letter, Uppercase
|
||||
Lt | Letter, Titlecase
|
||||
Lo | Letter, Other
|
||||
Lm | Letter, Modifier
|
||||
Mn | Mark, Nonspacing
|
||||
Nd | Number, Decimal Digit
|
||||
Pc | Punctuation, Connector. This category includes ten characters, the most commonly used of which is the LOWLINE character (_), U+005F.
|
||||
|
||||
If ECMAScript-compliant behavior is specified, **\w** is equivalent to `[a-zA-Z_0-9]`. For information on ECMAScript regular expressions, see the "ECMAScript Matching Behavior" section in [Regular Expression Options](options.md).
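
The following sketch compares the default behavior of **\w** with its behavior under `RegexOptions.ECMAScript`. The class name and test characters are illustrative assumptions; characters are written as escapes and reported by code point to avoid console encoding differences.

```csharp
using System;
using System.Text.RegularExpressions;

public class WordCharacterOptionsExample
{
   public static void Main()
   {
      // 'a', LATIN SMALL LETTER E WITH ACUTE (category Ll), '_', and '-'.
      char[] chars = { 'a', '\u00E9', '_', '-' };

      foreach (char c in chars)
      {
         string s = c.ToString();
         Console.WriteLine("U+{0:X4}: default = {1}, ECMAScript = {2}",
                           (int)c,
                           Regex.IsMatch(s, @"^\w$"),
                           Regex.IsMatch(s, @"^\w$", RegexOptions.ECMAScript));
      }
   }
}
// The example displays the following output:
//       U+0061: default = True, ECMAScript = True
//       U+00E9: default = True, ECMAScript = False
//       U+005F: default = True, ECMAScript = True
//       U+002D: default = False, ECMAScript = False
```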
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> Because it matches any word character, the \w language element is often used with a lazy quantifier if a regular expression pattern attempts to match any word character multiple times, followed by a specific word character. For more information, see [Quantifiers in Regular Expressions](quantifiers.md).
|
||||
|
||||
The following example uses the **\w** language element to match duplicate characters in a word. The example defines a regular expression pattern, **(\w)\1**, which can be interpreted as follows.
|
||||
|
||||
Element | Description
|
||||
------- | -----------
|
||||
(\w) | Match a word character. This is the first capturing group.
|
||||
\1 | Match the value of the first capture.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"(\w)\1";
|
||||
string[] words = { "trellis", "seer", "latter", "summer",
|
||||
"hoarse", "lesser", "aardvark", "stunned" };
|
||||
foreach (string word in words)
|
||||
{
|
||||
Match match = Regex.Match(word, pattern);
|
||||
if (match.Success)
|
||||
Console.WriteLine("'{0}' found in '{1}' at position {2}.",
|
||||
match.Value, word, match.Index);
|
||||
else
|
||||
Console.WriteLine("No double characters in '{0}'.", word);
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// 'll' found in 'trellis' at position 3.
|
||||
// 'ee' found in 'seer' at position 1.
|
||||
// 'tt' found in 'latter' at position 2.
|
||||
// 'mm' found in 'summer' at position 2.
|
||||
// No double characters in 'hoarse'.
|
||||
// 'ss' found in 'lesser' at position 2.
|
||||
// 'aa' found in 'aardvark' at position 0.
|
||||
// 'nn' found in 'stunned' at position 3.
|
||||
```
|
||||
|
||||
## Non-Word Character: \W
|
||||
|
||||
**\W** matches any non-word character. The **\W** language element is equivalent to the following character class:
|
||||
|
||||
```
|
||||
[^\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Lm}\p{Mn}\p{Nd}\p{Pc}]
|
||||
```
|
||||
|
||||
In other words, it matches any character except for those in the Unicode categories listed in the following table.
|
||||
|
||||
Category | Description
|
||||
-------- | -----------
|
||||
Ll | Letter, Lowercase
|
||||
Lu | Letter, Uppercase
|
||||
Lt | Letter, Titlecase
|
||||
Lo | Letter, Other
|
||||
Lm | Letter, Modifier
|
||||
Mn | Mark, Nonspacing
|
||||
Nd | Number, Decimal Digit
|
||||
Pc | Punctuation, Connector. This category includes ten characters, the most commonly used of which is the LOWLINE character (_), U+005F.
|
||||
|
||||
If ECMAScript-compliant behavior is specified, **\W** is equivalent to `[^a-zA-Z_0-9]`. For information on ECMAScript regular expressions, see the "ECMAScript Matching Behavior" section in [Regular Expression Options](options.md).
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> Because it matches any non-word character, the \W language element is often used with a lazy quantifier if a regular expression pattern attempts to match any non-word character multiple times, followed by a specific non-word character. For more information, see [Quantifiers in Regular Expressions](quantifiers.md).
|
||||
|
||||
The following example illustrates the **\W** character class. It defines a regular expression pattern, `\b(\w+)(\W){1,2}`, that matches a word followed by one or two non-word characters, such as white space or punctuation. The regular expression is interpreted as shown in the following table.
|
||||
|
||||
Element | Description
|
||||
------- | -----------
|
||||
\b | Begin the match at a word boundary.
|
||||
(\w+) | Match one or more word characters. This is the first capturing group.
|
||||
(\W){1,2} | Match a non-word character either one or two times. This is the second capturing group.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"\b(\w+)(\W){1,2}";
|
||||
string input = "The old, grey mare slowly walked across the narrow, green pasture.";
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
{
|
||||
Console.WriteLine(match.Value);
|
||||
Console.Write(" Non-word character(s):");
|
||||
CaptureCollection captures = match.Groups[2].Captures;
|
||||
for (int ctr = 0; ctr < captures.Count; ctr++)
|
||||
Console.Write(@"'{0}' (\u{1}){2}", captures[ctr].Value,
|
||||
Convert.ToUInt16(captures[ctr].Value[0]).ToString("X4"),
|
||||
ctr < captures.Count - 1 ? ", " : "");
|
||||
Console.WriteLine();
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// The
|
||||
// Non-word character(s):' ' (\u0020)
|
||||
// old,
|
||||
// Non-word character(s):',' (\u002C), ' ' (\u0020)
|
||||
// grey
|
||||
// Non-word character(s):' ' (\u0020)
|
||||
// mare
|
||||
// Non-word character(s):' ' (\u0020)
|
||||
// slowly
|
||||
// Non-word character(s):' ' (\u0020)
|
||||
// walked
|
||||
// Non-word character(s):' ' (\u0020)
|
||||
// across
|
||||
// Non-word character(s):' ' (\u0020)
|
||||
// the
|
||||
// Non-word character(s):' ' (\u0020)
|
||||
// narrow,
|
||||
// Non-word character(s):',' (\u002C), ' ' (\u0020)
|
||||
// green
|
||||
// Non-word character(s):' ' (\u0020)
|
||||
// pasture.
|
||||
// Non-word character(s):'.' (\u002E)
|
||||
```
|
||||
|
||||
Because the `Group` object for the second capturing group contains only a single captured non-word character, the example retrieves all captured non-word characters from the `CaptureCollection` object that is returned by the `Group.Captures` property.
|
||||
|
||||
## White-Space Character: \s
|
||||
|
||||
**\s** matches any white-space character. It is equivalent to the escape sequences and Unicode categories listed in the following table.
|
||||
|
||||
Category | Description
|
||||
-------- | -----------
|
||||
**\f** | The form feed character, \u000C.
|
||||
**\n** | The newline character, \u000A.
|
||||
**\r** | The carriage return character, \u000D.
|
||||
**\t** | The tab character, \u0009.
|
||||
**\v** | The vertical tab character, \u000B.
|
||||
**\x85** | The ellipsis or NEXT LINE (NEL) character (…), \u0085.
|
||||
**\p{Z}** | Matches any separator character.
|
||||
|
||||
|
||||
If ECMAScript-compliant behavior is specified, **\s** is equivalent to `[ \f\n\r\t\v]`. For information on ECMAScript regular expressions, see the "ECMAScript Matching Behavior" section in [Regular Expression Options](options.md).
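
The effect of the ECMAScript option on **\s** can be sketched with the no-break space character, U+00A0, which belongs to the **Zs** (Separator, Space) category but is not in the ECMAScript set. The class name and test characters are illustrative assumptions.

```csharp
using System;
using System.Text.RegularExpressions;

public class WhiteSpaceOptionsExample
{
   public static void Main()
   {
      // SPACE, CHARACTER TABULATION, and NO-BREAK SPACE (category Zs).
      char[] chars = { '\u0020', '\u0009', '\u00A0' };

      foreach (char c in chars)
      {
         string s = c.ToString();
         Console.WriteLine("U+{0:X4}: default = {1}, ECMAScript = {2}",
                           (int)c,
                           Regex.IsMatch(s, @"\s"),
                           Regex.IsMatch(s, @"\s", RegexOptions.ECMAScript));
      }
   }
}
// The example displays the following output:
//       U+0020: default = True, ECMAScript = True
//       U+0009: default = True, ECMAScript = True
//       U+00A0: default = True, ECMAScript = False
```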
|
||||
|
||||
The following example illustrates the \s character class. It defines a regular expression pattern, `\b\w+(e)?s(\s|$)`, that matches a word ending in either "s" or "es" followed by either a white-space character or the end of the input string. The regular expression is interpreted as shown in the following table.
|
||||
|
||||
Element | Description
|
||||
------- | -----------
|
||||
`\b` | Begin the match at a word boundary.
|
||||
`\w+` | Match one or more word characters.
|
||||
`(e)?` | Match an "e" either zero or one time.
|
||||
`s` | Match an "s".
|
||||
`(\s\|$)` | Match either a white-space character or the end of the input string.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"\b\w+(e)?s(\s|$)";
|
||||
string input = "matches stores stops leave leaves";
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine(match.Value);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// matches
|
||||
// stores
|
||||
// stops
|
||||
// leaves
|
||||
```
|
||||
|
||||
## Non-White-Space Character: \S
|
||||
|
||||
**\S** matches any non-white-space character. It is equivalent to the `[^\f\n\r\t\v\x85\p{Z}]` regular expression pattern, or the opposite of the regular expression pattern that is equivalent to **\s**, which matches white-space characters. For more information, see the previous section, "White-Space Character: \s".
|
||||
|
||||
If ECMAScript-compliant behavior is specified, **\S** is equivalent to `[^ \f\n\r\t\v]`. For information on ECMAScript regular expressions, see the "ECMAScript Matching Behavior" section in [Regular Expression Options](options.md).
|
||||
|
||||
The following example illustrates the **\S** language element. The regular expression pattern `\b(\S+)\s?` matches strings that are delimited by white-space characters. The second element in the match's `GroupCollection` object contains the matched string. The regular expression can be interpreted as shown in the following table.
|
||||
|
||||
Element | Description
|
||||
------- | -----------
|
||||
`\b` | Begin the match at a word boundary.
|
||||
`(\S+)` | Match one or more non-white-space characters. This is the first capturing group.
|
||||
`\s?` | Match zero or one white-space character.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"\b(\S+)\s?";
|
||||
string input = "This is the first sentence of the first paragraph. " +
|
||||
"This is the second sentence.\n" +
|
||||
"This is the only sentence of the second paragraph.";
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine(match.Groups[1]);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// This
|
||||
// is
|
||||
// the
|
||||
// first
|
||||
// sentence
|
||||
// of
|
||||
// the
|
||||
// first
|
||||
// paragraph.
|
||||
// This
|
||||
// is
|
||||
// the
|
||||
// second
|
||||
// sentence.
|
||||
// This
|
||||
// is
|
||||
// the
|
||||
// only
|
||||
// sentence
|
||||
// of
|
||||
// the
|
||||
// second
|
||||
// paragraph.
|
||||
```
|
||||
|
||||
## Decimal Digit Character: \d
|
||||
|
||||
**\d** matches any decimal digit. It is equivalent to the `\p{Nd}` regular expression pattern, which includes the standard decimal digits 0-9 as well as the decimal digits of a number of other character sets.
|
||||
|
||||
If ECMAScript-compliant behavior is specified, **\d** is equivalent to `[0-9]`. For information on ECMAScript regular expressions, see the "ECMAScript Matching Behavior" section in [Regular Expression Options](options.md).
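
The following sketch shows the practical difference between the two behaviors by testing an ASCII digit and ARABIC-INDIC DIGIT THREE (U+0663), which is in the **Nd** category. The class name and test characters are illustrative assumptions.

```csharp
using System;
using System.Text.RegularExpressions;

public class DecimalDigitOptionsExample
{
   public static void Main()
   {
      // DIGIT THREE and ARABIC-INDIC DIGIT THREE; both are Unicode decimal digits.
      char[] chars = { '3', '\u0663' };

      foreach (char c in chars)
      {
         string s = c.ToString();
         Console.WriteLine("U+{0:X4}: default = {1}, ECMAScript = {2}",
                           (int)c,
                           Regex.IsMatch(s, @"^\d$"),
                           Regex.IsMatch(s, @"^\d$", RegexOptions.ECMAScript));
      }
   }
}
// The example displays the following output:
//       U+0033: default = True, ECMAScript = True
//       U+0663: default = True, ECMAScript = False
```

If an application should accept only the digits 0-9, either specify `RegexOptions.ECMAScript` or use the explicit character group `[0-9]` instead of **\d**.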
|
||||
|
||||
The following example illustrates the **\d** language element. It tests whether an input string represents a valid telephone number in the United States and Canada. The regular expression pattern `^(\(?\d{3}\)?[\s-])?\d{3}-\d{4}$` is defined as shown in the following table.
|
||||
|
||||
Element | Description
|
||||
------- | -----------
|
||||
`^` | Begin the match at the beginning of the input string.
|
||||
`\(?` | Match zero or one literal "(" character.
|
||||
`\d{3}` | Match three decimal digits.
|
||||
`\)?` | Match zero or one literal ")" character.
|
||||
`[\s-]` | Match a hyphen or a white-space character.
|
||||
`(\(?\d{3}\)?[\s-])?` | Match an optional opening parenthesis followed by three decimal digits, an optional closing parenthesis, and either a white-space character or a hyphen zero or one time. This is the first capturing group.
|
||||
`\d{3}-\d{4}` | Match three decimal digits followed by a hyphen and four more decimal digits.
|
||||
`$` | Match the end of the input string.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"^(\(?\d{3}\)?[\s-])?\d{3}-\d{4}$";
|
||||
string[] inputs = { "111 111-1111", "222-2222", "222 333-444",
|
||||
"(212) 111-1111", "111-AB1-1111",
|
||||
"212-111-1111", "01 999-9999" };
|
||||
|
||||
foreach (string input in inputs)
|
||||
{
|
||||
if (Regex.IsMatch(input, pattern))
|
||||
Console.WriteLine(input + ": matched");
|
||||
else
|
||||
Console.WriteLine(input + ": match failed");
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// 111 111-1111: matched
|
||||
// 222-2222: matched
|
||||
// 222 333-444: match failed
|
||||
// (212) 111-1111: matched
|
||||
// 111-AB1-1111: match failed
|
||||
// 212-111-1111: matched
|
||||
// 01 999-9999: match failed
|
||||
```
|
||||
|
||||
## Non-Digit Character: \D
|
||||
|
||||
**\D** matches any non-digit character. It is equivalent to the `\P{Nd}` regular expression pattern.
|
||||
|
||||
If ECMAScript-compliant behavior is specified, **\D** is equivalent to `[^0-9]`. For information on ECMAScript regular expressions, see the "ECMAScript Matching Behavior" section in [Regular Expression Options](options.md).
|
||||
|
||||
The following example illustrates the **\D** language element. It tests whether a string such as a part number consists of the appropriate combination of decimal and non-decimal characters. The regular expression pattern `^\D\d{1,5}\D*$` is defined as shown in the following table.
|
||||
|
||||
Element | Description
|
||||
------- | -----------
|
||||
`^` | Begin the match at the beginning of the input string.
|
||||
`\D` | Match a non-digit character.
|
||||
`\d{1,5}` | Match from one to five decimal digits.
|
||||
`\D*` | Match zero, one, or more non-decimal characters.
|
||||
`$` | Match the end of the input string.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"^\D\d{1,5}\D*$";
|
||||
string[] inputs = { "A1039C", "AA0001", "C18A", "Y938518" };
|
||||
|
||||
foreach (string input in inputs)
|
||||
{
|
||||
if (Regex.IsMatch(input, pattern))
|
||||
Console.WriteLine(input + ": matched");
|
||||
else
|
||||
Console.WriteLine(input + ": match failed");
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// A1039C: matched
|
||||
// AA0001: match failed
|
||||
// C18A: matched
|
||||
// Y938518: match failed
|
||||
```
|
||||
|
||||
## Supported Unicode General Categories
|
||||
|
||||
Unicode defines the general categories listed in the following table. For more information, see the "UCD File Format" and "General Category Values" subtopics at the [Unicode Character Database](http://www.unicode.org/reports/tr44/).
|
||||
|
||||
Category | Description
|
||||
-------- | -----------
|
||||
**Lu** | Letter, Uppercase
|
||||
**Ll** | Letter, Lowercase
|
||||
**Lt** | Letter, Titlecase
|
||||
**Lm** | Letter, Modifier
|
||||
**Lo** | Letter, Other
|
||||
**L** | All letter characters. This includes the **Lu**, **Ll**, **Lt**, **Lm**, and **Lo** characters.
|
||||
**Mn** | Mark, Nonspacing
|
||||
**Mc** | Mark, Spacing Combining
|
||||
**Me** | Mark, Enclosing
|
||||
**M** | All diacritic marks. This includes the **Mn**, **Mc**, and **Me** categories.
|
||||
**Nd** | Number, Decimal Digit
|
||||
**Nl** | Number, Letter
|
||||
**No** | Number, Other
|
||||
**N** | All numbers. This includes the **Nd**, **Nl**, and **No** categories.
|
||||
**Pc** | Punctuation, Connector
|
||||
**Pd** | Punctuation, Dash
|
||||
**Ps** | Punctuation, Open
|
||||
**Pe** | Punctuation, Close
|
||||
**Pi** | Punctuation, Initial quote (may behave like Ps or Pe depending on usage)
|
||||
**Pf** | Punctuation, Final quote (may behave like Ps or Pe depending on usage)
|
||||
**Po** | Punctuation, Other
|
||||
**P** | All punctuation characters. This includes the **Pc**, **Pd**, **Ps**, **Pe**, **Pi**, **Pf**, and **Po** categories.
|
||||
**Sm** | Symbol, Math
|
||||
**Sc** | Symbol, Currency
|
||||
**Sk** | Symbol, Modifier
|
||||
**So** | Symbol, Other
|
||||
**S** | All symbols. This includes the **Sm**, **Sc**, **Sk**, and **So** categories.
|
||||
**Zs** | Separator, Space
|
||||
**Zl** | Separator, Line
|
||||
**Zp** | Separator, Paragraph
|
||||
**Z** | All separator characters. This includes the **Zs**, **Zl**, and **Zp** categories.
|
||||
**Cc** | Other, Control
|
||||
**Cf** | Other, Format
|
||||
**Cs** | Other, Surrogate
|
||||
**Co** | Other, Private Use
|
||||
**Cn** | Other, Not Assigned (no characters have this property)
|
||||
**C** | All control characters. This includes the **Cc**, **Cf**, **Cs**, **Co**, and **Cn** categories.
|
||||
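A short sketch of how a few of these categories can be tested with the `\p{`*name*`}` construct follows. The helper method `InCategory` is a made-up name used only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public class GeneralCategoryExample
{
   // Returns true if the input is a single character in the given
   // Unicode general category (for example "Lu" or "Nd").
   public static bool InCategory(string input, string category)
   {
      return Regex.IsMatch(input, @"^\p{" + category + "}$");
   }

   public static void Main()
   {
      Console.WriteLine(InCategory("A", "Lu"));   // uppercase letter
      Console.WriteLine(InCategory("a", "Lu"));   // lowercase letter, so not Lu
      Console.WriteLine(InCategory("5", "Nd"));   // decimal digit
      Console.WriteLine(InCategory("-", "Pd"));   // dash punctuation
      Console.WriteLine(InCategory("+", "Sm"));   // math symbol
   }
}
// The example displays the following output:
//       True
//       False
//       True
//       True
//       True
```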
|
||||
## Supported Named Blocks
|
||||
|
||||
.NET Core provides the named blocks listed in the following table. The set of supported named blocks is based on Unicode 4.0 and Perl 5.6.
|
||||
|
||||
Code point range | Block name
|
||||
---------------- | ----------
|
||||
0000 - 007F | **IsBasicLatin**
|
||||
0080 - 00FF | **IsLatin-1Supplement**
|
||||
0100 - 017F | **IsLatinExtended-A**
|
||||
0180 - 024F | **IsLatinExtended-B**
|
||||
0250 - 02AF | **IsIPAExtensions**
|
||||
02B0 - 02FF | **IsSpacingModifierLetters**
|
||||
0300 - 036F | **IsCombiningDiacriticalMarks**
|
||||
0370 - 03FF | **IsGreek** -or- **IsGreekandCoptic**
|
||||
0400 - 04FF | **IsCyrillic**
|
||||
0500 - 052F | **IsCyrillicSupplement**
|
||||
0530 - 058F | **IsArmenian**
|
||||
0590 - 05FF | **IsHebrew**
|
||||
0600 - 06FF | **IsArabic**
|
||||
0700 - 074F | **IsSyriac**
|
||||
0780 - 07BF | **IsThaana**
|
||||
0900 - 097F | **IsDevanagari**
|
||||
0980 - 09FF | **IsBengali**
|
||||
0A00 - 0A7F | **IsGurmukhi**
|
||||
0A80 - 0AFF | **IsGujarati**
|
||||
0B00 - 0B7F | **IsOriya**
|
||||
0B80 - 0BFF | **IsTamil**
|
||||
0C00 - 0C7F | **IsTelugu**
|
||||
0C80 - 0CFF | **IsKannada**
|
||||
0D00 - 0D7F | **IsMalayalam**
|
||||
0D80 - 0DFF | **IsSinhala**
|
||||
0E00 - 0E7F | **IsThai**
|
||||
0E80 - 0EFF | **IsLao**
|
||||
0F00 - 0FFF | **IsTibetan**
|
||||
1000 - 109F | **IsMyanmar**
|
||||
10A0 - 10FF | **IsGeorgian**
|
||||
1100 - 11FF | **IsHangulJamo**
|
||||
1200 - 137F | **IsEthiopic**
|
||||
13A0 - 13FF | **IsCherokee**
|
||||
1400 - 167F | **IsUnifiedCanadianAboriginalSyllabics**
|
||||
1680 - 169F | **IsOgham**
|
||||
16A0 - 16FF | **IsRunic**
|
||||
1700 - 171F | **IsTagalog**
|
||||
1720 - 173F | **IsHanunoo**
|
||||
1740 - 175F | **IsBuhid**
|
||||
1760 - 177F | **IsTagbanwa**
|
||||
1780 - 17FF | **IsKhmer**
|
||||
1800 - 18AF | **IsMongolian**
|
||||
1900 - 194F | **IsLimbu**
|
||||
1950 - 197F | **IsTaiLe**
|
||||
19E0 - 19FF | **IsKhmerSymbols**
|
||||
1D00 - 1D7F | **IsPhoneticExtensions**
|
||||
1E00 - 1EFF | **IsLatinExtendedAdditional**
|
||||
1F00 - 1FFF | **IsGreekExtended**
|
||||
2000 - 206F | **IsGeneralPunctuation**
|
||||
2070 - 209F | **IsSuperscriptsandSubscripts**
|
||||
20A0 - 20CF | **IsCurrencySymbols**
|
||||
20D0 - 20FF | **IsCombiningDiacriticalMarksforSymbols** -or- **IsCombiningMarksforSymbols**
|
||||
2100 - 214F | **IsLetterlikeSymbols**
|
||||
2150 - 218F | **IsNumberForms**
|
||||
2190 - 21FF | **IsArrows**
|
||||
2200 - 22FF | **IsMathematicalOperators**
|
||||
2300 - 23FF | **IsMiscellaneousTechnical**
|
||||
2400 - 243F | **IsControlPictures**
|
||||
2440 - 245F | **IsOpticalCharacterRecognition**
|
||||
2460 - 24FF | **IsEnclosedAlphanumerics**
|
||||
2500 - 257F | **IsBoxDrawing**
|
||||
2580 - 259F | **IsBlockElements**
|
||||
25A0 - 25FF | **IsGeometricShapes**
|
||||
2600 - 26FF | **IsMiscellaneousSymbols**
|
||||
2700 - 27BF | **IsDingbats**
|
||||
27C0 - 27EF | **IsMiscellaneousMathematicalSymbols-A**
|
||||
27F0 - 27FF | **IsSupplementalArrows-A**
|
||||
2800 - 28FF | **IsBraillePatterns**
|
||||
2900 - 297F | **IsSupplementalArrows-B**
|
||||
2980 - 29FF | **IsMiscellaneousMathematicalSymbols-B**
|
||||
2A00 - 2AFF | **IsSupplementalMathematicalOperators**
|
||||
2B00 - 2BFF | **IsMiscellaneousSymbolsandArrows**
|
||||
2E80 - 2EFF | **IsCJKRadicalsSupplement**
|
||||
2F00 - 2FDF | **IsKangxiRadicals**
|
||||
2FF0 - 2FFF | **IsIdeographicDescriptionCharacters**
|
||||
3000 - 303F | **IsCJKSymbolsandPunctuation**
|
||||
3040 - 309F | **IsHiragana**
|
||||
30A0 - 30FF | **IsKatakana**
|
||||
3100 - 312F | **IsBopomofo**
|
||||
3130 - 318F | **IsHangulCompatibilityJamo**
|
||||
3190 - 319F | **IsKanbun**
|
||||
31A0 - 31BF | **IsBopomofoExtended**
|
||||
31F0 - 31FF | **IsKatakanaPhoneticExtensions**
|
||||
3200 - 32FF | **IsEnclosedCJKLettersandMonths**
|
||||
3300 - 33FF | **IsCJKCompatibility**
|
||||
3400 - 4DBF | **IsCJKUnifiedIdeographsExtensionA**
|
||||
4DC0 - 4DFF | **IsYijingHexagramSymbols**
|
||||
4E00 - 9FFF | **IsCJKUnifiedIdeographs**
|
||||
A000 - A48F | **IsYiSyllables**
|
||||
A490 - A4CF | **IsYiRadicals**
|
||||
AC00 - D7AF | **IsHangulSyllables**
|
||||
D800 - DB7F | **IsHighSurrogates**
|
||||
DB80 - DBFF | **IsHighPrivateUseSurrogates**
|
||||
DC00 - DFFF | **IsLowSurrogates**
|
||||
E000 - F8FF | **IsPrivateUse** -or- **IsPrivateUseArea**
|
||||
F900 - FAFF | **IsCJKCompatibilityIdeographs**
|
||||
FB00 - FB4F | **IsAlphabeticPresentationForms**
|
||||
FB50 - FDFF | **IsArabicPresentationForms-A**
|
||||
FE00 - FE0F | **IsVariationSelectors**
|
||||
FE20 - FE2F | **IsCombiningHalfMarks**
|
||||
FE30 - FE4F | **IsCJKCompatibilityForms**
|
||||
FE50 - FE6F | **IsSmallFormVariants**
|
||||
FE70 - FEFF | **IsArabicPresentationForms-B**
|
||||
FF00 - FFEF | **IsHalfwidthandFullwidthForms**
|
||||
FFF0 - FFFF | **IsSpecials**
|
||||
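The named blocks in the table can be used with the `\p{`*name*`}` construct. The following is a minimal sketch; the helper method `InBlock` is a hypothetical name chosen for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public class NamedBlockExample
{
   // Returns true if every character of the input falls in the named block.
   public static bool InBlock(string input, string blockName)
   {
      return Regex.IsMatch(input, @"^\p{" + blockName + "}+$");
   }

   public static void Main()
   {
      Console.WriteLine(InBlock("\u0416", "IsCyrillic"));   // U+0416 is in 0400 - 04FF
      Console.WriteLine(InBlock("\u03A9", "IsGreek"));      // U+03A9 is in 0370 - 03FF
      Console.WriteLine(InBlock("A", "IsBasicLatin"));      // U+0041 is in 0000 - 007F
      Console.WriteLine(InBlock("A", "IsCyrillic"));        // outside the Cyrillic block
   }
}
// The example displays the following output:
//       True
//       True
//       True
//       False
```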
|
||||
## Character Class Subtraction: [base_group - [excluded_group]]
|
||||
|
||||
A character class defines a set of characters. Character class subtraction yields a set of characters that is the result of excluding the characters in one character class from another character class.
|
||||
|
||||
A character class subtraction expression has the following form:
|
||||
|
||||
**[**_base_group_**-[**_excluded_group_**]]**
|
||||
|
||||
The square brackets (**[]**) and hyphen (-) are mandatory. The *base_group* is a positive character group or a negative character group. The *excluded_group* component is another positive or negative character group, or another character class subtraction expression (that is, you can nest character class subtraction expressions).
|
||||
|
||||
For example, suppose you have a base group that consists of the character range from "a" through "z". To define the set of characters that consists of the base group except for the character "m", use `[a-z-[m]]`. To define the set of characters that consists of the base group except for the set of characters "d", "j", and "p", use `[a-z-[djp]]`. To define the set of characters that consists of the base group except for the character range from "m" through "p", use `[a-z-[m-p]]`.
|
||||
|
||||
Consider the nested character class subtraction expression, `[a-z-[d-w-[m-o]]]`. The expression is evaluated from the innermost character range outward. First, the character range from "m" through "o" is subtracted from the character range "d" through "w", which yields the set of characters from "d" through "l" and "p" through "w". That set is then subtracted from the character range from "a" through "z", which yields the set of characters `[abcmnoxyz]`.
|
||||
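One way to check how a subtraction expression is evaluated is to test every lowercase letter against it. The sketch below does that for two of the expressions discussed above; the helper `MatchingLetters` is a made-up name for this illustration.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public class SubtractionExample
{
   // Returns the letters from "a" through "z" that the character class accepts.
   public static string MatchingLetters(string characterClass)
   {
      string pattern = "^" + characterClass + "$";
      var result = new StringBuilder();
      for (char c = 'a'; c <= 'z'; c++)
      {
         if (Regex.IsMatch(c.ToString(), pattern))
            result.Append(c);
      }
      return result.ToString();
   }

   public static void Main()
   {
      Console.WriteLine(MatchingLetters("[a-z-[m-p]]"));
      Console.WriteLine(MatchingLetters("[a-z-[d-w-[m-o]]]"));
   }
}
// The example displays the following output:
//       abcdefghijklqrstuvwxyz
//       abcmnoxyz
```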
|
||||
You can use any character class with character class subtraction. To define the set of characters that consists of all Unicode characters from \u0000 through \uFFFF except white-space characters (**\s**), the characters in the punctuation general category (**\p{P}**), the characters in the **IsGreek** named block (**\p{IsGreek}**), and the Unicode NEXT LINE control character (\x85), use `[\u0000-\uFFFF-[\s\p{P}\p{IsGreek}\x85]]`.
|
||||
|
||||
Choose character classes for a character class subtraction expression that will yield useful results. Avoid an expression that yields an empty set of characters, which cannot match anything, or an expression that is equivalent to the original base group. For example, the empty set is the result of the expression `[\p{IsBasicLatin}-[\x00-\x7F]]`, which subtracts all characters in the range \x00 through \x7F from the **IsBasicLatin** named block, which covers exactly that range. Similarly, the original base group is the result of the expression `[a-z-[0-9]]`. This is because the base group, which is the character range of letters from "a" through "z", does not contain any characters in the excluded group, which is the character range of decimal digits from "0" through "9".
|
||||
|
||||
The following example defines a regular expression, `^[0-9-[2468]]+$`, that matches zero and odd digits in an input string. The regular expression is interpreted as shown in the following table.
|
||||
|
||||
Element | Description
|
||||
------- | -----------
|
||||
`^` | Begin the match at the start of the input string.
|
||||
`[0-9-[2468]]+` | Match one or more occurrences of any character from 0 to 9 except for 2, 4, 6, and 8. In other words, match one or more occurrences of zero or an odd digit.
|
||||
`$` | End the match at the end of the input string.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string[] inputs = { "123", "13579753", "3557798", "335599901" };
|
||||
string pattern = @"^[0-9-[2468]]+$";
|
||||
|
||||
foreach (string input in inputs)
|
||||
{
|
||||
Match match = Regex.Match(input, pattern);
|
||||
if (match.Success)
|
||||
Console.WriteLine(match.Value);
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// 13579753
|
||||
// 335599901
|
||||
```
|
||||
|
||||
## See Also
|
||||
|
||||
[Regular Expression Language - Quick Reference](index.md)
|
||||
|
||||
[Regular Expression Options](options.md)
|
||||
|
|
@ -1,100 +0,0 @@
|
|||
---
|
||||
title: Character Escapes in Regular Expressions
|
||||
description: Character Escapes in Regular Expressions
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 5189baaf-fd90-4fe4-ad1a-37758cbf2b48
|
||||
---
|
||||
|
||||
# Character Escapes in Regular Expressions
|
||||
|
||||
The backslash (\) in a regular expression indicates one of the following:
|
||||
|
||||
* The character that follows it is a special character, as shown in the table in the following section. For example, **\b** is an anchor that indicates that a regular expression match should begin on a word boundary, **\t** represents a tab, and **\x20** represents a space.
|
||||
|
||||
* A character that otherwise would be interpreted as an unescaped language construct should be interpreted literally. For example, a brace (**{**) begins the definition of a quantifier, but a backslash followed by a brace (**\{**) indicates that the regular expression engine should match the brace. Similarly, a single backslash marks the beginning of an escaped language construct, but two backslashes (**\\**) indicate that the regular expression engine should match the backslash.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> Character escapes are recognized in regular expression patterns but not in replacement patterns.
|
||||
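The following sketch illustrates the note: a `\t` sequence that reaches the replacement pattern unchanged is inserted as a literal backslash followed by "t", whereas a tab produced by the C# string literal itself is inserted as a tab. The class and method names are invented for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public class ReplacementEscapeExample
{
   // Replaces each white-space character in the input with the replacement pattern.
   public static string ReplaceWhitespace(string input, string replacement)
   {
      return Regex.Replace(input, @"\s", replacement);
   }

   public static void Main()
   {
      // Verbatim string: the replacement pattern is the two characters '\' and 't'.
      string literal = ReplaceWhitespace("a b", @"\t");
      Console.WriteLine(literal.Length);

      // Regular string: the C# compiler turns \t into a tab before Regex sees it.
      string tabbed = ReplaceWhitespace("a b", "\t");
      Console.WriteLine(tabbed.Length);
   }
}
// The example displays the following output:
//       4
//       3
```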
|
||||
## Character Escapes in .NET Core
|
||||
|
||||
The following table lists the character escapes supported by regular expressions in .NET Core.
|
||||
|
||||
Character or sequence | Description
|
||||
--------------------- | -----------
|
||||
All characters except for the following: **. $ ^ { [ ( | ) * + ? \** | Characters other than those listed in the **Character or sequence** column have no special meaning in regular expressions; they match themselves. The characters included in the **Character or sequence** column are special regular expression language elements. To match them in a regular expression, they must be escaped or included in a positive character group. For example, the regular expression `\$\d+` or `[$]\d+` matches "$1200".
|
||||
**\a** | Matches a bell (alarm) character, **\u0007**.
|
||||
**\b** | In a __[__*character*_*group*__]__ character class, matches a backspace, **\u0008**. (See [Character Classes in Regular Expressions](classes.md).) Outside a character class, **\b** is an anchor that matches a word boundary. (See [Anchors in Regular Expressions](anchors.md).)
|
||||
**\t** | Matches a tab, **\u0009**.
|
||||
**\r** | Matches a carriage return, **\u000D**. Note that **\r** is not equivalent to the newline character, **\n**.
|
||||
**\v** | Matches a vertical tab, **\u000B**.
|
||||
**\f** | Matches a form feed, **\u000C**.
|
||||
**\n** | Matches a new line, **\u000A**.
|
||||
**\e** | Matches an escape, **\u001B**.
|
||||
**\\**_nnn_ | Matches an ASCII character, where *nnn* consists of two or three digits that represent the octal character code. For example, `\040` represents a space character. This construct is interpreted as a backreference if it has only one digit (for example, `\2`) or if it corresponds to the number of a capturing group. (See [Backreference Constructs in Regular Expressions](backreference.md).)
|
||||
**\x**_nn_ | Matches an ASCII character, where *nn* is a two-digit hexadecimal character code.
|
||||
**\c**_X_ | Matches an ASCII control character, where *X* is the letter of the control character. For example, `\cC` is CTRL-C.
|
||||
**\u**_nnnn_ | Matches a UTF-16 code unit whose value is *nnnn* hexadecimal. **Note** The Perl 5 character escape that is used to specify Unicode is not supported by .NET Core. The Perl 5 character escape has the form **\x{####…}**, where **####…** is a series of hexadecimal digits. Instead, use **\u**_nnnn_.
|
||||
**\** | When followed by a character that is not recognized as an escaped character, matches that character. For example, `\*` matches an asterisk (*) and is the same as `\x2A`.
|
||||
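As a small sketch of the table above, the following code checks that several escapes denote the characters the table lists for them. The helper `EscapeMatches` is a hypothetical name used only here.

```csharp
using System;
using System.Text.RegularExpressions;

public class EscapeEquivalenceExample
{
   // Returns true if the escape sequence, used as the whole pattern,
   // matches the given text exactly.
   public static bool EscapeMatches(string escape, string text)
   {
      return Regex.IsMatch(text, "^(?:" + escape + ")$");
   }

   public static void Main()
   {
      // Three spellings of the space character (U+0020): hexadecimal, octal, and Unicode.
      Console.WriteLine(EscapeMatches(@"\x20", " "));
      Console.WriteLine(EscapeMatches(@"\040", " "));
      Console.WriteLine(EscapeMatches(@"\u0020", " "));

      // \cC is CTRL-C (U+0003), and \e is the escape character (U+001B).
      Console.WriteLine(EscapeMatches(@"\cC", "\u0003"));
      Console.WriteLine(EscapeMatches(@"\e", "\u001B"));
   }
}
// The example displays the following output:
//       True
//       True
//       True
//       True
//       True
```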
|
||||
## Example
|
||||
|
||||
The following example illustrates the use of character escapes in a regular expression. It parses a string that contains the names of the world's largest cities and their populations in 2009. Each city name is separated from its population by a tab (**\t**) or a vertical bar (| or `\u007c`). Individual cities and their populations are separated from each other by a carriage return and line feed.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string delimited = @"\G(.+)[\t\u007c](.+)\r?\n";
|
||||
string input = "Mumbai, India|13,922,125\t\n" +
|
||||
"Shanghai, China\t13,831,900\n" +
|
||||
"Karachi, Pakistan|12,991,000\n" +
|
||||
"Delhi, India\t12,259,230\n" +
|
||||
"Istanbul, Turkey|11,372,613\n";
|
||||
Console.WriteLine("Population of the World's Largest Cities, 2009");
|
||||
Console.WriteLine();
|
||||
Console.WriteLine("{0,-20} {1,10}", "City", "Population");
|
||||
Console.WriteLine();
|
||||
foreach (Match match in Regex.Matches(input, delimited))
|
||||
Console.WriteLine("{0,-20} {1,10}", match.Groups[1].Value,
|
||||
match.Groups[2].Value);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Population of the World's Largest Cities, 2009
|
||||
//
|
||||
// City Population
|
||||
//
|
||||
// Mumbai, India 13,922,125
|
||||
// Shanghai, China 13,831,900
|
||||
// Karachi, Pakistan 12,991,000
|
||||
// Delhi, India 12,259,230
|
||||
// Istanbul, Turkey 11,372,613
|
||||
```
|
||||
|
||||
The regular expression `\G(.+)[\t\u007c](.+)\r?\n` is interpreted as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\G` | Begin the match where the last match ended.
|
||||
`(.+)` | Match any character one or more times. This is the first capturing group.
|
||||
`[\t\u007c]` | Match a tab (**\t**) or a vertical bar (|).
|
||||
`(.+)` | Match any character one or more times. This is the second capturing group.
|
||||
`\r?\n` | Match zero or one occurrence of a carriage return followed by a new line.
|
||||
|
||||
## See Also
|
||||
|
||||
[Regular Expression Language - Quick Reference](index.md)
|
||||
|
|
@ -1,823 +0,0 @@
|
|||
---
|
||||
title: Grouping Constructs in Regular Expressions
|
||||
description: Grouping Constructs in Regular Expressions
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 18cdccb1-6abb-433d-b646-e2dc69acba29
|
||||
---
|
||||
|
||||
# Grouping Constructs in Regular Expressions
|
||||
|
||||
Grouping constructs delineate the subexpressions of a regular expression and capture the substrings of an input string. You can use grouping constructs to do the following:
|
||||
|
||||
* Match a subexpression that is repeated in the input string.
|
||||
|
||||
* Apply a quantifier to a subexpression that has multiple regular expression language elements. For more information about quantifiers, see [Quantifiers in Regular Expressions](quantifiers.md).
|
||||
|
||||
* Include a subexpression in the string that is returned by the [Regex.Replace](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Replace_System_String_System_String_) and [Match.Result](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match#System_Text_RegularExpressions_Match_Result_System_String_) methods.
|
||||
|
||||
* Retrieve individual subexpressions from the [Match.Groups](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match#System_Text_RegularExpressions_Match_Groups) property and process them separately from the matched text as a whole.
|
||||
|
||||
The following table lists the grouping constructs supported by the .NET Core regular expression engine and indicates whether they are capturing or noncapturing.
|
||||
|
||||
Grouping construct | Capturing or noncapturing
|
||||
------------------ | -------------------------
|
||||
[Matched subexpressions](#Matched-subexpressions) | Capturing
|
||||
[Named matched subexpressions](#Named-matched-subexpressions) | Capturing
|
||||
[Balancing group definitions](#Balancing-group-definitions) | Capturing
|
||||
[Noncapturing groups](#Noncapturing-groups) | Noncapturing
|
||||
[Group options](#Group-options) | Noncapturing
|
||||
[Zero-width positive lookahead assertions](#Zero-width-positive-lookahead-assertions) | Noncapturing
|
||||
[Zero-width negative lookahead assertions](#Zero-width-negative-lookahead-assertions) | Noncapturing
|
||||
[Zero-width positive lookbehind assertions](#Zero-width-positive-lookbehind-assertions) | Noncapturing
|
||||
[Zero-width negative lookbehind assertions](#Zero-width-negative-lookbehind-assertions) | Noncapturing
|
||||
[Nonbacktracking subexpressions](#Nonbacktracking-subexpressions) | Noncapturing
|
||||
|
||||
For information on groups and the regular expression object model, see [Grouping Constructs and Regular Expression Objects](#Grouping-constructs-and-regular-expression-objects).
|
||||
|
||||
## Matched Subexpressions
|
||||
|
||||
The following grouping construct captures a matched subexpression:
|
||||
|
||||
**(**_subexpression_**)**
|
||||
|
||||
where *subexpression* is any valid regular expression pattern. Captures that use parentheses are numbered automatically from left to right based on the order of the opening parentheses in the regular expression, starting from one. The capture that is numbered zero is the text matched by the entire regular expression pattern.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> By default, the (subexpression) language element captures the matched subexpression. But if the RegexOptions parameter of a regular expression pattern matching method includes the RegexOptions.ExplicitCapture flag, or if the n option is applied to this subexpression (see Group options later in this topic), the matched subexpression is not captured.
|
||||
|
||||
You can access captured groups in four ways:
|
||||
|
||||
* By using the backreference construct within the regular expression. The matched subexpression is referenced in the same regular expression by using the syntax **\**_number_, where *number* is the ordinal number of the captured subexpression.
|
||||
|
||||
* By using the named backreference construct within the regular expression. The matched subexpression is referenced in the same regular expression by using the syntax **\k<**_name_**>**, where *name* is the name of a capturing group, or **\k<**_number_**>**, where *number* is the ordinal number of a capturing group. A capturing group has a default name that is identical to its ordinal number. For more information, see Grouping constructs and regular expression objects later in this topic.
|
||||
|
||||
* By using the **$**_number_ replacement sequence in a [Regex.Replace](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Replace_System_String_System_String_) or [Match.Result](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match#System_Text_RegularExpressions_Match_Result_System_String_) method call, where *number* is the ordinal number of the captured subexpression.
|
||||
|
||||
* Programmatically, by using the [GroupCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.GroupCollection) object returned by the [Match.Groups](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match#System_Text_RegularExpressions_Match_Groups) property. The member at position zero in the collection represents the entire regular expression match. Each subsequent member represents a matched subexpression. For more information, see the [Grouping Constructs and Regular Expression Objects](#Grouping-constructs-and-regular-expression-objects) section.
|
||||
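The four access paths listed above can be sketched side by side on one small input. The class name, the method name `CollapseDuplicates`, and the sample strings are assumptions made for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public class CaptureAccessExample
{
   // Uses the \1 backreference to find a doubled word and the $1
   // replacement sequence to keep a single copy of it.
   public static string CollapseDuplicates(string input)
   {
      return Regex.Replace(input, @"\b(\w+)\s\1\b", "$1");
   }

   public static void Main()
   {
      string input = "the the";

      // 1. Numbered backreference inside the pattern.
      Console.WriteLine(Regex.IsMatch(input, @"(\w+)\s\1"));

      // 2. Named backreference syntax with the group's default name, "1".
      Console.WriteLine(Regex.IsMatch(input, @"(\w+)\s\k<1>"));

      // 3. The $1 replacement sequence.
      Console.WriteLine(CollapseDuplicates("He said that that was the the correct answer."));

      // 4. The GroupCollection returned by Match.Groups.
      Console.WriteLine(Regex.Match(input, @"(\w+)\s\1").Groups[1].Value);
   }
}
// The example displays the following output:
//       True
//       True
//       He said that was the correct answer.
//       the
```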
|
||||
The following example illustrates a regular expression that identifies duplicated words in text. The regular expression pattern's two capturing groups represent the two instances of the duplicated word. The second instance is captured to report its starting position in the input string.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"(\w+)\s(\1)\W";
|
||||
string input = "He said that that was the the correct answer.";
|
||||
foreach (Match match in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
|
||||
Console.WriteLine("Duplicate '{0}' found at positions {1} and {2}.",
|
||||
match.Groups[1].Value, match.Groups[1].Index, match.Groups[2].Index);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Duplicate 'that' found at positions 8 and 13.
|
||||
// Duplicate 'the' found at positions 22 and 26.
|
||||
```
|
||||
|
||||
The regular expression pattern is the following:
|
||||
|
||||
```
|
||||
(\w+)\s(\1)\W
|
||||
```
|
||||
|
||||
The following table shows how the regular expression pattern is interpreted.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`(\w+)` | Match one or more word characters. This is the first capturing group.
|
||||
`\s` | Match a white-space character.
|
||||
`(\1)` | Match the string in the first captured group. This is the second capturing group. The example assigns it to a captured group so that the starting position of the duplicate word can be retrieved from the `Match.Index` property.
|
||||
`\W` | Match a non-word character, including white space and punctuation. This prevents the regular expression pattern from matching a word that starts with the word from the first captured group.
|
||||
|
||||
## Named Matched Subexpressions
|
||||
|
||||
The following grouping construct captures a matched subexpression and lets you access it by name or by number:
|
||||
|
||||
```
|
||||
(?<name>subexpression)
|
||||
```
|
||||
|
||||
or:
|
||||
|
||||
```
|
||||
(?'name'subexpression)
|
||||
```
|
||||
|
||||
where *name* is a valid group name, and *subexpression* is any valid regular expression pattern. *name* must not contain any punctuation characters and cannot begin with a number.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> If the [RegexOptions](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexOptions) parameter of a regular expression pattern matching method includes the [RegexOptions.ExplicitCapture](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexOptions#System_Text_RegularExpressions_RegexOptions_ExplicitCapture) flag, or if the **n** option is applied to this subexpression (see [Group options](#Group-options) later in this topic), the only way to capture a subexpression is to explicitly name capturing groups.
|
||||
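The effect described in the note can be observed by counting the groups a pattern defines with and without explicit capture. The following is a sketch only; the class and method names are invented for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public class ExplicitCaptureExample
{
   // Returns the number of groups the pattern defines, including group 0.
   public static int GroupCount(string pattern, RegexOptions options)
   {
      return new Regex(pattern, options).GetGroupNumbers().Length;
   }

   public static void Main()
   {
      string pattern = @"(\w+)\s(?<second>\w+)";

      // Default: group 0, the unnamed group, and the named group.
      Console.WriteLine(GroupCount(pattern, RegexOptions.None));

      // ExplicitCapture: the unnamed parentheses no longer capture.
      Console.WriteLine(GroupCount(pattern, RegexOptions.ExplicitCapture));

      // The inline n option applied to one subexpression has the same effect on it.
      Console.WriteLine(GroupCount(@"(?n:(\w+))\s(?<second>\w+)", RegexOptions.None));
   }
}
// The example displays the following output:
//       3
//       2
//       2
```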
|
||||
You can access named captured groups in the following ways:
|
||||
|
||||
* By using the named backreference construct within the regular expression. The matched subexpression is referenced in the same regular expression by using the syntax **\k<**_name_**>**, where *name* is the name of the captured subexpression.
|
||||
|
||||
* By using the backreference construct within the regular expression. The matched subexpression is referenced in the same regular expression by using the syntax **\**_number_, where *number* is the ordinal number of the captured subexpression. Named matched subexpressions are numbered consecutively from left to right after matched subexpressions.
|
||||
|
||||
* By using the **${**_name_**}** replacement sequence in a [Regex.Replace](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Replace_System_String_System_String_) or [Match.Result](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match#System_Text_RegularExpressions_Match_Result_System_String_) method call, where *name* is the name of the captured subexpression.
|
||||
|
||||
* By using the **$**_number_ replacement sequence in a [Regex.Replace](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Replace_System_String_System_String_) or [Match.Result](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match#System_Text_RegularExpressions_Match_Result_System_String_) method call, where *number* is the ordinal number of the captured subexpression.
|
||||
|
||||
* Programmatically, by using the [GroupCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.GroupCollection) object returned by the [Match.Groups](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match#System_Text_RegularExpressions_Match_Groups) property. The member at position zero in the collection represents the entire regular expression match. Each subsequent member represents a matched subexpression. Named captured groups are stored in the collection after numbered captured groups.
|
||||
|
||||
* Programmatically, by providing the subexpression name to the [GroupCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.GroupCollection) object's indexer.
|
||||
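A brief sketch of three of these access paths follows: the `${name}` replacement sequence, the `\k<name>` backreference, and the string indexer of the group collection. The group names `first` and `second`, like the class and method names, are assumptions made for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public class NamedCaptureAccessExample
{
   // Swaps two space-separated words by using ${name} replacement sequences.
   public static string SwapWords(string input)
   {
      return Regex.Replace(input, @"(?<first>\w+)\s(?<second>\w+)", "${second} ${first}");
   }

   public static void Main()
   {
      // ${name} in a replacement pattern.
      Console.WriteLine(SwapWords("hello world"));

      // \k<name> inside the pattern itself.
      Console.WriteLine(Regex.IsMatch("echo echo", @"(?<word>\w+)\s\k<word>"));

      // Access by name through the GroupCollection indexer.
      Match match = Regex.Match("hello world", @"(?<first>\w+)\s(?<second>\w+)");
      Console.WriteLine(match.Groups["second"].Value);
   }
}
// The example displays the following output:
//       world hello
//       True
//       world
```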
|
||||
A simple regular expression pattern illustrates how numbered (unnamed) and named groups can be referenced either programmatically or by using regular expression language syntax. The regular expression `((?<One>abc)\d+)?(?<Two>xyz)(.*)` produces the following capturing groups by number and by name. The first capturing group (number 0) always refers to the entire pattern.
|
||||
|
||||
Number | Name | Pattern
|
||||
------ | ---- | -------
|
||||
0 | 0 (default name) | `((?<One>abc)\d+)?(?<Two>xyz)(.*)`
|
||||
1 | 1 (default name) | `((?<One>abc)\d+)`
|
||||
2 | 2 (default name) | `(.*)`
|
||||
3 | One | `(?<One>abc)`
|
||||
4 | Two | `(?<Two>xyz)`
|
||||
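The numbering in the table can be checked against the `Regex` object itself. The following minimal sketch uses `Regex.GetGroupNames` and `Regex.GroupNumberFromName`; the class and method names are invented for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public class GroupNumberingExample
{
   // Returns the group names of the pattern, in group-number order, as one string.
   public static string DescribeGroups(string pattern)
   {
      return string.Join(", ", new Regex(pattern).GetGroupNames());
   }

   public static void Main()
   {
      string pattern = @"((?<One>abc)\d+)?(?<Two>xyz)(.*)";
      Regex rx = new Regex(pattern);

      // Unnamed groups are numbered first; named groups follow.
      Console.WriteLine(DescribeGroups(pattern));
      Console.WriteLine(rx.GroupNumberFromName("One"));
      Console.WriteLine(rx.GroupNumberFromName("Two"));
   }
}
// The example displays the following output:
//       0, 1, 2, One, Two
//       3
//       4
```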
|
||||
The following example illustrates a regular expression that identifies duplicated words and the word that immediately follows each duplicated word. The regular expression pattern defines two named subexpressions: `duplicateWord`, which represents the duplicated word; and `nextWord`, which represents the word that follows the duplicated word.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"(?<duplicateWord>\w+)\s\k<duplicateWord>\W(?<nextWord>\w+)";
|
||||
string input = "He said that that was the the correct answer.";
|
||||
foreach (Match match in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
|
||||
Console.WriteLine("A duplicate '{0}' at position {1} is followed by '{2}'.",
|
||||
match.Groups["duplicateWord"].Value, match.Groups["duplicateWord"].Index,
|
||||
match.Groups["nextWord"].Value);
|
||||
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// A duplicate 'that' at position 8 is followed by 'was'.
|
||||
// A duplicate 'the' at position 22 is followed by 'correct'.
|
||||
```
|
||||
|
||||
The regular expression pattern is as follows:
|
||||
|
||||
```
|
||||
(?<duplicateWord>\w+)\s\k<duplicateWord>\W(?<nextWord>\w+)
|
||||
```
|
||||
|
||||
The following table shows how the regular expression is interpreted.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`(?<duplicateWord>\w+)` | Match one or more word characters. Name this capturing group `duplicateWord`.
|
||||
`\s` | Match a white-space character.
|
||||
`\k<duplicateWord>` | Match the string from the captured group that is named `duplicateWord`.
|
||||
`\W` | Match a non-word character, including white space and punctuation. This prevents the regular expression pattern from matching a word that starts with the word from the first captured group.
|
||||
`(?<nextWord>\w+)` | Match one or more word characters. Name this capturing group `nextWord`.
|
||||
|
||||
Note that a group name can be repeated in a regular expression. For example, it is possible for more than one group to be named `digit`, as the following example illustrates. In the case of duplicate names, the value of the [Group](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group) object is determined by the last successful capture in the input string. In addition, the [CaptureCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.CaptureCollection) is populated with information about each capture just as it would be if the group name was not duplicated.
|
||||
|
||||
In the following example, the regular expression `\D+(?<digit>\d+)\D+(?<digit>\d+)?` includes two occurrences of a group named `digit`. The first `digit` named group captures one or more digit characters. The second `digit` named group captures either zero or one occurrence of one or more digit characters. As the output from the example shows, if the second capturing group successfully matches text, the value of that text defines the value of the [Group](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group) object. If the second capturing group does not match the input string, the value of the last successful match defines the value of the [Group](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group) object.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
String pattern = @"\D+(?<digit>\d+)\D+(?<digit>\d+)?";
|
||||
String[] inputs = { "abc123def456", "abc123def" };
|
||||
foreach (var input in inputs) {
|
||||
Match m = Regex.Match(input, pattern);
|
||||
if (m.Success) {
|
||||
Console.WriteLine("Match: {0}", m.Value);
|
||||
for (int grpCtr = 1; grpCtr < m.Groups.Count; grpCtr++) {
|
||||
Group grp = m.Groups[grpCtr];
|
||||
Console.WriteLine("Group {0}: {1}", grpCtr, grp.Value);
|
||||
for (int capCtr = 0; capCtr < grp.Captures.Count; capCtr++)
|
||||
Console.WriteLine(" Capture {0}: {1}", capCtr,
|
||||
grp.Captures[capCtr].Value);
|
||||
}
|
||||
}
|
||||
else {
|
||||
Console.WriteLine("The match failed.");
|
||||
}
|
||||
Console.WriteLine();
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Match: abc123def456
|
||||
// Group 1: 456
|
||||
// Capture 0: 123
|
||||
// Capture 1: 456
|
||||
//
|
||||
// Match: abc123def
|
||||
// Group 1: 123
|
||||
// Capture 0: 123
|
||||
```
|
||||
|
||||
The following table shows how the regular expression is interpreted.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\D+` | Match one or more non-decimal digit characters.
|
||||
`(?<digit>\d+)` | Match one or more decimal digit characters. Assign the match to the `digit` named group.
|
||||
`\D+` | Match one or more non-decimal digit characters.
|
||||
`(?<digit>\d+)?` | Match zero or one occurrence of one or more decimal digit characters. Assign the match to the `digit` named group.
|
||||
|
||||
## Balancing Group Definitions
|
||||
|
||||
A balancing group definition deletes the definition of a previously defined group and stores, in the current group, the interval between the previously defined group and the current group. This grouping construct has the following format:
|
||||
|
||||
```
|
||||
(?<name1-name2>subexpression)
|
||||
```
|
||||
|
||||
or:
|
||||
|
||||
```
|
||||
(?'name1-name2' subexpression)
|
||||
```
|
||||
|
||||
where *name1* is the current group (optional), *name2* is a previously defined group, and *subexpression* is any valid regular expression pattern. The balancing group definition deletes the definition of *name2* and stores the interval between *name2* and *name1* in *name1*. If no *name2* group is defined, the match backtracks. Because deleting the last definition of *name2* reveals the previous definition of *name2*, this construct lets you use the stack of captures for group *name2* as a counter for keeping track of nested constructs such as parentheses or opening and closing brackets.
|
||||
|
||||
The balancing group definition uses *name2* as a stack. The beginning character of each nested construct is placed in the group and in its [Group.Captures](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group#System_Text_RegularExpressions_Group_Captures) collection. When the closing character is matched, its corresponding opening character is removed from the group, and the [Captures](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group#System_Text_RegularExpressions_Group_Captures) collection is decreased by one. After the opening and closing characters of all nested constructs have been matched, *name2* is empty.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> After you modify the regular expression in the following example to use the appropriate opening and closing character of a nested construct, you can use it to handle most nested constructs, such as mathematical expressions or lines of program code that include multiple nested method calls.
|
||||
|
||||
The following example uses a balancing group definition to match left and right angle brackets (<>) in an input string. The example defines two named groups, `Open` and `Close`, that are used like a stack to track matching pairs of angle brackets. Each captured left angle bracket is pushed into the capture collection of the `Open` group, and each captured right angle bracket is pushed into the capture collection of the `Close` group. The balancing group definition ensures that there is a matching right angle bracket for each left angle bracket. If there is not, the final subpattern, `(?(Open)(?!))`, is evaluated only if the `Open` group is not empty (and, therefore, if all nested constructs have not been closed). If the final subpattern is evaluated, the match fails, because the `(?!)` subpattern is a zero-width negative lookahead assertion that always fails.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = "^[^<>]*" +
|
||||
"(" +
|
||||
"((?'Open'<)[^<>]*)+" +
|
||||
"((?'Close-Open'>)[^<>]*)+" +
|
||||
")*" +
|
||||
"(?(Open)(?!))$";
|
||||
string input = "<abc><mno<xyz>>";
|
||||
|
||||
Match m = Regex.Match(input, pattern);
|
||||
if (m.Success == true)
|
||||
{
|
||||
Console.WriteLine("Input: \"{0}\" \nMatch: \"{1}\"", input, m);
|
||||
int grpCtr = 0;
|
||||
foreach (Group grp in m.Groups)
|
||||
{
|
||||
Console.WriteLine(" Group {0}: {1}", grpCtr, grp.Value);
|
||||
grpCtr++;
|
||||
int capCtr = 0;
|
||||
foreach (Capture cap in grp.Captures)
|
||||
{
|
||||
Console.WriteLine(" Capture {0}: {1}", capCtr, cap.Value);
|
||||
capCtr++;
|
||||
}
|
||||
}
|
||||
}
|
||||
else
|
||||
{
|
||||
Console.WriteLine("Match failed.");
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Input: "<abc><mno<xyz>>"
|
||||
// Match: "<abc><mno<xyz>>"
|
||||
// Group 0: <abc><mno<xyz>>
|
||||
// Capture 0: <abc><mno<xyz>>
|
||||
// Group 1: <mno<xyz>>
|
||||
// Capture 0: <abc>
|
||||
// Capture 1: <mno<xyz>>
|
||||
// Group 2: <xyz
|
||||
// Capture 0: <abc
|
||||
// Capture 1: <mno
|
||||
// Capture 2: <xyz
|
||||
// Group 3: >
|
||||
// Capture 0: >
|
||||
// Capture 1: >
|
||||
// Capture 2: >
|
||||
// Group 4:
|
||||
// Group 5: mno<xyz>
|
||||
// Capture 0: abc
|
||||
// Capture 1: xyz
|
||||
// Capture 2: mno<xyz>
|
||||
```
|
||||
|
||||
The regular expression pattern is:
|
||||
|
||||
```
|
||||
^[^<>]*(((?'Open'<)[^<>]*)+((?'Close-Open'>)[^<>]*)+)*(?(Open)(?!))$
|
||||
```
|
||||
|
||||
The regular expression is interpreted as follows:
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`^` | Begin at the start of the string.
|
||||
`[^<>]*` | Match zero or more characters that are not left or right angle brackets.
|
||||
`(?'Open'<)` | Match a left angle bracket and assign it to a group named `Open`.
|
||||
`[^<>]*` | Match zero or more characters that are not left or right angle brackets.
|
||||
`((?'Open'<)[^<>]*)+` | Match one or more occurrences of a left angle bracket followed by zero or more characters that are not left or right angle brackets. This is the second capturing group.
|
||||
`(?'Close-Open'>)` | Match a right angle bracket, assign the substring between the `Open` group and the current group to the `Close` group, and delete the definition of the `Open` group.
|
||||
`[^<>]*` | Match zero or more occurrences of any character that is neither a left nor a right angle bracket.
|
||||
`((?'Close-Open'>)[^<>]*)+` | Match one or more occurrences of a right angle bracket, followed by zero or more occurrences of any character that is neither a left nor a right angle bracket. When matching the right angle bracket, assign the substring between the `Open` group and the current group to the `Close` group, and delete the definition of the `Open` group. This is the third capturing group.
|
||||
`(((?'Open'<)[^<>]*)+((?'Close-Open'>)[^<>]*)+)*` | Match zero or more occurrences of the following pattern: one or more occurrences of a left angle bracket, followed by zero or more non-angle bracket characters, followed by one or more occurrences of a right angle bracket, followed by zero or more occurrences of non-angle brackets. When matching the right angle bracket, delete the definition of the `Open` group, and assign the substring between the `Open` group and the current group to the `Close` group. This is the first capturing group.
|
||||
`(?(Open)(?!))` | If the `Open` group exists, abandon the match if an empty string can be matched, but do not advance the position of the regular expression engine in the string. This is a zero-width negative lookahead assertion. Because an empty string is always implicitly present in an input string, this match always fails. Failure of this match indicates that the angle brackets are not balanced.
|
||||
`$` | Match the end of the input string.
|
||||
|
||||
The final subexpression, `(?(Open)(?!))`, indicates whether the nesting constructs in the input string are properly balanced (for example, whether each left angle bracket is matched by a right angle bracket). It uses conditional matching based on a valid captured group; for more information, see [Alternation Constructs in Regular Expressions](alternation.md). If the `Open` group is defined, the regular expression engine attempts to match the subexpression `(?!)` in the input string. The `Open` group should be defined only if nesting constructs are unbalanced. Therefore, the pattern to be matched in the input string should be one that always causes the match to fail. In this case, `(?!)` is a zero-width negative lookahead assertion that always fails, because an empty string is always implicitly present at the next position in the input string.
|
||||
|
||||
In the example, the regular expression engine evaluates the input string "<abc><mno<xyz>>" as shown in the following table.
|
||||
|
||||
Step | Pattern | Result
|
||||
---- | ------- | ------
|
||||
1 | `^` | Starts the match at the beginning of the input string
|
||||
2 | `[^<>]*` | Looks for non-angle bracket characters before the left angle bracket; finds no matches.
|
||||
3 | `(((?'Open'<)` | Matches the left angle bracket in "<abc>" and assigns it to the `Open` group.
|
||||
4 | `[^<>]*` | Matches "abc".
|
||||
5 | `)+` | "<abc" is the value of the second captured group. The next character in the input string is not a left angle bracket, so the regular expression engine does not loop back to the `((?'Open'<)[^<>]*)` subpattern.
|
||||
6 | `((?'Close-Open'>)` | Matches the right angle bracket in "<abc>", assigns "abc", which is the substring between the `Open` group and the right angle bracket, to the `Close` group, and deletes the current value ("<") of the `Open` group, leaving it empty.
|
||||
7 | `[^<>]*` | Looks for non-angle bracket characters after the right angle bracket; finds no matches.
|
||||
8 | `)+` | The value of the third captured group is ">". The next character in the input string is not a right angle bracket, so the regular expression engine does not loop back to the `((?'Close-Open'>)[^<>]*)` subpattern.
|
||||
9 | `)*` | The value of the first captured group is "<abc>". The next character in the input string is a left angle bracket, so the regular expression engine loops back to the `(((?'Open'<)` subpattern.
|
||||
10 | `(((?'Open'<)` | Matches the left angle bracket in "<mno>" and assigns it to the `Open` group. Its `Group.Captures` collection now has a single value, "<".
|
||||
11 | `[^<>]*` | Matches "mno".
|
||||
12 | `)+` | "<mno" is the value of the second captured group. The next character in the input string is a left angle bracket, so the regular expression engine loops back to the `((?'Open'<)[^<>]*)` subpattern.
|
||||
13 | `(((?'Open'<)` | Matches the left angle bracket in "<xyz>" and assigns it to the `Open` group. The `Group.Captures` collection of the `Open` group now includes two captures: the left angle bracket from "<mno>", and the left angle bracket from "<xyz>".
|
||||
14 | `[^<>]*` | Matches "xyz".
|
||||
15 | `)+` | "<xyz" is the value of the second captured group. The next character in the input string is not a left angle bracket, so the regular expression engine does not loop back to the `((?'Open'<)[^<>]*)` subpattern.
|
||||
16 | `((?'Close-Open'>)` | Matches the right angle bracket in "<xyz>", assigns "xyz" (the substring between the `Open` group and the right angle bracket) to the `Close` group, and deletes the current value of the `Open` group. The value of the previous capture (the left angle bracket in "<mno>") becomes the current value of the `Open` group. The `Captures` collection of the `Open` group now includes a single capture, the left angle bracket from "<mno>".
|
||||
17 | `[^<>]*` | Looks for non-angle bracket characters; finds no matches.
|
||||
18 | `)+` | The value of the third captured group is ">". The next character in the input string is a right angle bracket, so the regular expression engine loops back to the `((?'Close-Open'>)[^<>]*)` subpattern.
|
||||
19 | `((?'Close-Open'>)` | Matches the final right angle bracket in "xyz>>", assigns "mno<xyz>" (the substring between the `Open` group and the right angle bracket) to the `Close` group, and deletes the current value of the `Open` group. The `Open` group is now empty.
|
||||
20 | `[^<>]*` | Looks for non-angle bracket characters; finds no matches.
|
||||
21 | `)+` | The value of the third captured group is ">". The next character in the input string is not a right angle bracket, so the regular expression engine does not loop back to the `((?'Close-Open'>)[^<>]*)` subpattern.
|
||||
22 | `)*` | The value of the first captured group is "<mno<xyz>>". The next character in the input string is not a left angle bracket, so the regular expression engine does not loop back to the `(((?'Open'<)` subpattern.
|
||||
23 | `(?(Open)(?!))` | The `Open` group is not defined, so no match is attempted.
|
||||
24 | `$` | Matches the end of the input string.
|
||||
|
||||
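As the earlier note suggests, the angle-bracket pattern can be adapted to other nested constructs by swapping the opening and closing characters. The following is a minimal sketch of that adaptation for parentheses; the class name `ParenBalanceSketch` and the method `IsBalanced` are hypothetical names chosen for illustration, and the pattern is a character-for-character variant of the one walked through above rather than a general-purpose parser.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: checks whether parentheses in a string are balanced
// by reusing the balancing group pattern from the angle-bracket example.
public static class ParenBalanceSketch
{
    // '<' and '>' are replaced by escaped parentheses; nothing else changes.
    private const string Pattern =
        @"^[^()]*" +
        @"(" +
        @"((?'Open'\()[^()]*)+" +          // push each '(' onto the Open stack
        @"((?'Close-Open'\))[^()]*)+" +    // pop one Open capture for each ')'
        @")*" +
        @"(?(Open)(?!))$";                 // fail if any '(' is still on the stack

    public static bool IsBalanced(string input)
    {
        return Regex.IsMatch(input, Pattern);
    }

    public static void Main()
    {
        string[] inputs = { "f(g(x), h(y))", "f(g(x)", "a)b(" };
        foreach (string input in inputs)
            Console.WriteLine("{0}: {1}", input, IsBalanced(input));
    }
}
```

In this sketch, an unmatched opening parenthesis is rejected by the `(?(Open)(?!))` test, while an unmatched closing parenthesis is rejected because `(?'Close-Open'\))` cannot match when the `Open` stack is empty.
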
## Noncapturing Groups
|
||||
|
||||
The following grouping construct does not capture the substring that is matched by a subexpression:
|
||||
|
||||
```
|
||||
(?:subexpression)
|
||||
```
|
||||
|
||||
where *subexpression* is any valid regular expression pattern. The noncapturing group construct is typically used when a quantifier is applied to a group, but the substrings captured by the group are of no interest.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> If a regular expression includes nested grouping constructs, an outer noncapturing group construct does not apply to the inner nested group constructs.
|
||||
|
||||
The following example illustrates a regular expression that includes noncapturing groups. Note that the output does not include any captured groups.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"(?:\b(?:\w+)\W*)+\.";
|
||||
string input = "This is a short sentence.";
|
||||
Match match = Regex.Match(input, pattern);
|
||||
Console.WriteLine("Match: {0}", match.Value);
|
||||
for (int ctr = 1; ctr < match.Groups.Count; ctr++)
|
||||
Console.WriteLine(" Group {0}: {1}", ctr, match.Groups[ctr].Value);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Match: This is a short sentence.
|
||||
```
|
||||
|
||||
The regular expression `(?:\b(?:\w+)\W*)+\.` matches a sentence that is terminated by a period. Because the regular expression focuses on sentences and not on individual words, grouping constructs are used exclusively as quantifiers. The regular expression pattern is interpreted as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Begin the match at a word boundary.
|
||||
`(?:\w+)` | Match one or more word characters. Do not assign the matched text to a captured group.
|
||||
`\W*` | Match zero or more non-word characters.
|
||||
`(?:\b(?:\w+)\W*)+` | Match the pattern of one or more word characters starting at a word boundary, followed by zero or more non-word characters, one or more times. Do not assign the matched text to a captured group.
|
||||
`\.` | Match a period.
|
||||
|
||||
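To make the effect on the [GroupCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.GroupCollection) concrete, the following sketch compares three variants of the sentence pattern against the same input: one with capturing groups, one with only noncapturing groups, and one with a capturing group nested inside a noncapturing group (the case described in the note above). The class name `GroupCountSketch` and its methods are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: reports how many groups a pattern produces.
public static class GroupCountSketch
{
    public const string Input = "This is a short sentence.";

    // Groups.Count always includes group 0, the entire match.
    public static int CountGroups(string pattern)
    {
        return Regex.Match(Input, pattern).Groups.Count;
    }

    // Returns the value of group 1 (empty if the pattern defines no such group).
    public static string FirstGroupValue(string pattern)
    {
        return Regex.Match(Input, pattern).Groups[1].Value;
    }

    public static void Main()
    {
        // Both groups capture: group 0 plus two numbered groups.
        Console.WriteLine(CountGroups(@"(\b(\w+)\W*)+\."));
        // Neither group captures: only group 0 remains.
        Console.WriteLine(CountGroups(@"(?:\b(?:\w+)\W*)+\."));
        // The outer group is noncapturing, but the inner (\w+) still captures.
        Console.WriteLine(CountGroups(@"(?:\b(\w+)\W*)+\."));
        Console.WriteLine(FirstGroupValue(@"(?:\b(\w+)\W*)+\."));
    }
}
```

The third pattern shows that marking the outer group as noncapturing does not suppress the inner group: its last capture, the final word of the sentence, is still available as group 1.
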
## Group Options
|
||||
|
||||
The following grouping construct applies or disables the specified options within a subexpression:
|
||||
|
||||
**(?imnsx-imnsx:**_subexpression_**)**
|
||||
|
||||
where *subexpression* is any valid regular expression pattern. For example, `(?i-s:)` turns on case insensitivity and disables single-line mode. For more information about the inline options you can specify, see [Regular Expression Options](options.md).
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> You can specify options that apply to an entire regular expression rather than a subexpression by using a [System.Text.RegularExpressions.Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) class constructor or a static method. You can also specify inline options that apply after a specific point in a regular expression by using the `(?imnsx-imnsx)` language construct.
|
||||
|
||||
The group options construct is not a capturing group. That is, although any portion of a string that is captured by *subexpression* is included in the match, it is not included in a captured group nor used to populate the [GroupCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.GroupCollection) object.
|
||||
|
||||
For example, the regular expression `\b(?ix: d \w+)\s` in the following example uses inline options in a grouping construct to enable case-insensitive matching and ignore pattern whitespace in identifying all words that begin with the letter "d". The regular expression is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Begin the match at a word boundary.
|
||||
`(?ix: d \w+)` | Using case-insensitive matching and ignoring white space in this pattern, match a "d" followed by one or more word characters.
|
||||
`\s` | Match a white-space character.
|
||||
|
||||
```csharp
|
||||
string pattern = @"\b(?ix: d \w+)\s";
|
||||
string input = "Dogs are decidedly good pets.";
|
||||
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
   Console.WriteLine("'{0}' found at index {1}.", match.Value, match.Index);
// The example displays the following output:
//    'Dogs ' found at index 0.
//    'decidedly ' found at index 9.
|
||||
```
|
||||
|
||||
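Because the options apply only within the group, text outside the group is still matched with the options that are otherwise in effect. The following sketch illustrates this scoping with a hypothetical pattern, `\b(?i:d)og\b`, in which only the first letter is matched case-insensitively; the class name `ScopedOptionSketch` is an assumption made for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: demonstrates that (?i:...) affects only its own subexpression.
public static class ScopedOptionSketch
{
    // "d" is matched case-insensitively; "og" remains case-sensitive.
    private const string Pattern = @"\b(?i:d)og\b";

    public static bool Matches(string input)
    {
        return Regex.IsMatch(input, Pattern);
    }

    // The group options construct does not capture, so only group 0 exists.
    public static int GroupCount(string input)
    {
        return Regex.Match(input, Pattern).Groups.Count;
    }

    public static void Main()
    {
        foreach (string word in new[] { "dog", "Dog", "DOG" })
            Console.WriteLine("{0}: {1}", word, Matches(word));
    }
}
```

Here "dog" and "Dog" match, but "DOG" does not, because the uppercase "OG" falls outside the group in which case-insensitive matching is enabled.
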
## Zero-Width Positive Lookahead Assertions
|
||||
|
||||
The following grouping construct defines a zero-width positive lookahead assertion:
|
||||
|
||||
**(?=**_subexpression_**)**
|
||||
|
||||
where *subexpression* is any regular expression pattern. For a match to be successful, the input string must match the regular expression pattern in *subexpression*, although the matched substring is not included in the match result. A zero-width positive lookahead assertion does not backtrack.
|
||||
|
||||
Typically, a zero-width positive lookahead assertion is found at the end of a regular expression pattern. It defines a substring that must be found at the end of a string for a match to occur but that should not be included in the match. It is also useful for preventing excessive backtracking. You can use a zero-width positive lookahead assertion to ensure that a particular captured group begins with text that matches a subset of the pattern defined for that captured group. For example, if a capturing group matches consecutive word characters, you can use a zero-width positive lookahead assertion to require that the first character be an alphabetical uppercase character.
|
||||
|
||||
The following example uses a zero-width positive lookahead assertion to match the word that precedes the verb "is" in the input string.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"\b\w+(?=\sis\b)";
|
||||
string[] inputs = { "The dog is a Malamute.",
|
||||
"The island has beautiful birds.",
|
||||
"The pitch missed home plate.",
|
||||
"Sunday is a weekend day." };
|
||||
|
||||
foreach (string input in inputs)
|
||||
{
|
||||
Match match = Regex.Match(input, pattern);
|
||||
if (match.Success)
|
||||
Console.WriteLine("'{0}' precedes 'is'.", match.Value);
|
||||
else
|
||||
Console.WriteLine("'{0}' does not match the pattern.", input);
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// 'dog' precedes 'is'.
|
||||
// 'The island has beautiful birds.' does not match the pattern.
|
||||
// 'The pitch missed home plate.' does not match the pattern.
|
||||
// 'Sunday' precedes 'is'.
|
||||
```
|
||||
|
||||
The regular expression `\b\w+(?=\sis\b)` is interpreted as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Begin the match at a word boundary.
|
||||
`\w+` | Match one or more word characters.
|
||||
`(?=\sis\b)` | Determine whether the word characters are followed by a white-space character and the string "is", which ends on a word boundary. If so, the match is successful.
|
||||
|
||||
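The other use mentioned above, requiring that a captured word begin with an uppercase letter, can be sketched in the same way. The pattern `\b(?=\p{Lu})\w+\b` below is an illustrative assumption, not part of the original example: the lookahead inspects the first character without consuming it, and `\w+` then matches the whole word. The class name `UppercaseWordSketch` is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: finds words whose first character is an uppercase letter.
public static class UppercaseWordSketch
{
    public static List<string> Find(string input)
    {
        var words = new List<string>();
        // (?=\p{Lu}) checks the next character but does not advance the position.
        foreach (Match match in Regex.Matches(input, @"\b(?=\p{Lu})\w+\b"))
            words.Add(match.Value);
        return words;
    }

    public static void Main()
    {
        foreach (string word in Find("The quick Brown fox"))
            Console.WriteLine(word);
    }
}
```
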
## Zero-Width Negative Lookahead Assertions
|
||||
|
||||
The following grouping construct defines a zero-width negative lookahead assertion:
|
||||
|
||||
**(?!**_subexpression_**)**
|
||||
|
||||
where *subexpression* is any regular expression pattern. For the match to be successful, the input string must not match the regular expression pattern in *subexpression*, although the matched string is not included in the match result.
|
||||
|
||||
A zero-width negative lookahead assertion is typically used either at the beginning or at the end of a regular expression. At the beginning of a regular expression, it can define a specific pattern that should not be matched when the beginning of the regular expression defines a similar but more general pattern to be matched. In this case, it is often used to limit backtracking. At the end of a regular expression, it can define a subexpression that cannot occur at the end of a match.
|
||||
|
||||
The following example defines a regular expression that uses a zero-width negative lookahead assertion at the beginning of the regular expression to match words that do not begin with "un".
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"\b(?!un)\w+\b";
|
||||
string input = "unite one unethical ethics use untie ultimate";
|
||||
foreach (Match match in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
|
||||
Console.WriteLine(match.Value);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// one
|
||||
// ethics
|
||||
// use
|
||||
// ultimate
|
||||
```
|
||||
|
||||
The regular expression `\b(?!un)\w+\b` is interpreted as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Begin the match at a word boundary.
|
||||
`(?!un)` | Determine whether the next two characters are "un". If they are not, a match is possible.
|
||||
`\w+` | Match one or more word characters.
|
||||
`\b` | End the match at a word boundary.
|
||||
|
||||
The following example defines a regular expression that uses a zero-width negative lookahead assertion at the end of the regular expression to match words that do not end with a punctuation character.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"\b\w+\b(?!\p{P})";
|
||||
string input = "Disconnected, disjointed thoughts in a sentence fragment.";
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine(match.Value);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// disjointed
|
||||
// thoughts
|
||||
// in
|
||||
// a
|
||||
// sentence
|
||||
```
|
||||
|
||||
The regular expression `\b\w+\b(?!\p{P})` is interpreted as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Begin the match at a word boundary.
|
||||
`\w+` | Match one or more word characters.
|
||||
`\b` | End the match at a word boundary.
|
||||
`(?!\p{P})` | If the next character is not a punctuation symbol (such as a period or a comma), the match succeeds.
|
||||
|
||||
## Zero-Width Positive Lookbehind Assertions
|
||||
|
||||
The following grouping construct defines a zero-width positive lookbehind assertion:
|
||||
|
||||
**(?<=**_subexpression_**)**
|
||||
|
||||
where *subexpression* is any regular expression pattern. For a match to be successful, *subexpression* must occur at the input string to the left of the current position, although *subexpression* is not included in the match result. A zero-width positive lookbehind assertion does not backtrack.
|
||||
|
||||
Zero-width positive lookbehind assertions are typically used at the beginning of regular expressions. The pattern that they define is a precondition for a match, although it is not a part of the match result.
|
||||
|
||||
The following example matches the last two digits of a year in the twenty-first century (that is, it requires that the digits "20" precede the matched string).
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string input = "2010 1999 1861 2140 2009";
|
||||
string pattern = @"(?<=\b20)\d{2}\b";
|
||||
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine(match.Value);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// 10
|
||||
// 09
|
||||
```
|
||||
|
||||
The regular expression pattern `(?<=\b20)\d{2}\b` is interpreted as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\d{2}` | Match two decimal digits.
|
||||
`(?<=\b20)` | Continue the match if the two decimal digits are preceded by the decimal digits "20" on a word boundary.
|
||||
`\b` | End the match at a word boundary.
|
||||
|
||||
Zero-width positive lookbehind assertions are also used to limit backtracking when the last character or characters in a captured group must be a subset of the characters that match that group's regular expression pattern. For example, if a group captures all consecutive word characters, you can use a zero-width positive lookbehind assertion to require that the last character be alphabetical.
|
||||
|
||||
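The scenario in the preceding paragraph can be sketched as follows. The pattern `\b\w+(?<=\p{L})\b` is an illustrative assumption: `\w+` captures consecutive word characters, and the lookbehind then requires that the character just matched be a letter. The class name `LetterEndingSketch` is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: finds words whose last character is a letter.
public static class LetterEndingSketch
{
    public static List<string> Find(string input)
    {
        var words = new List<string>();
        // (?<=\p{L}) looks back one character and requires it to be a letter.
        foreach (Match match in Regex.Matches(input, @"\b\w+(?<=\p{L})\b"))
            words.Add(match.Value);
        return words;
    }

    public static void Main()
    {
        // "item1" ends in a digit and "abc_" ends in an underscore, so neither matches.
        foreach (string word in Find("item1 value abc_ code"))
            Console.WriteLine(word);
    }
}
```
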
## Zero-Width Negative Lookbehind Assertions
|
||||
|
||||
The following grouping construct defines a zero-width negative lookbehind assertion:
|
||||
|
||||
**(?<!**_subexpression_**)**
|
||||
|
||||
where *subexpression* is any regular expression pattern. For a match to be successful, *subexpression* must not occur at the input string to the left of the current position. However, any substring that does not match subexpression is not included in the match result.
|
||||
|
||||
Zero-width negative lookbehind assertions are typically used at the beginning of regular expressions. The pattern that they define precludes a match in the string that follows. They are also used to limit backtracking when the last character or characters in a captured group must not be one or more of the characters that match that group's regular expression pattern. For example, if a group captures all consecutive word characters, you can use a zero-width negative lookbehind assertion to require that the last character not be an underscore (_).
|
||||
|
||||
The following example matches the date for any day of the week that is not a weekend (that is, neither Saturday nor Sunday).
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string[] dates = { "Monday February 1, 2010",
|
||||
"Wednesday February 3, 2010",
|
||||
"Saturday February 6, 2010",
|
||||
"Sunday February 7, 2010",
|
||||
"Monday, February 8, 2010" };
|
||||
string pattern = @"(?<!(Saturday|Sunday) )\b\w+ \d{1,2}, \d{4}\b";
|
||||
|
||||
foreach (string dateValue in dates)
|
||||
{
|
||||
Match match = Regex.Match(dateValue, pattern);
|
||||
if (match.Success)
|
||||
Console.WriteLine(match.Value);
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// February 1, 2010
|
||||
// February 3, 2010
|
||||
// February 8, 2010
|
||||
```
|
||||
|
||||
The regular expression pattern `(?<!(Saturday|Sunday) )\b\w+ \d{1,2}, \d{4}\b` is interpreted as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Begin the match at a word boundary.
|
||||
`\w+` | Match one or more word characters followed by a white-space character.
|
||||
`\d{1,2},` | Match either one or two decimal digits followed by a comma and a white-space character.
|
||||
`\d{4}\b` | Match four decimal digits, and end the match at a word boundary.
|
||||
`(?<!(Saturday|Sunday) )` | If the match is preceded by something other than the strings "Saturday" or "Sunday" followed by a space, the match is successful.
|
||||
|
||||
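The underscore scenario described earlier in this section can be sketched as follows. The pattern `\b\w+(?<!_)\b` is an illustrative assumption: `\w+` captures consecutive word characters (which include the underscore), and the negative lookbehind rejects the match if the last character consumed is an underscore. The class name `NoTrailingUnderscoreSketch` is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: finds words that do not end with an underscore.
public static class NoTrailingUnderscoreSketch
{
    public static List<string> Find(string input)
    {
        var words = new List<string>();
        // (?<!_) fails if the character immediately to the left is an underscore.
        foreach (Match match in Regex.Matches(input, @"\b\w+(?<!_)\b"))
            words.Add(match.Value);
        return words;
    }

    public static void Main()
    {
        // "total_" and "name_" end with an underscore, so neither matches.
        foreach (string word in Find("total_ count name_ id"))
            Console.WriteLine(word);
    }
}
```
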
## Nonbacktracking Subexpressions
|
||||
|
||||
The following grouping construct represents a nonbacktracking subexpression (also known as a "greedy" subexpression):
|
||||
|
||||
**(?>**_subexpression_**)**
|
||||
|
||||
where *subexpression* is any regular expression pattern.
|
||||
|
||||
Ordinarily, if a regular expression includes an optional or alternative matching pattern and a match does not succeed, the regular expression engine can branch in multiple directions to match an input string with a pattern. If a match is not found when it takes the first branch, the regular expression engine can back up or backtrack to the point where it took the first match and attempt the match using the second branch. This process can continue until all branches have been tried.
|
||||
|
||||
The **(?>**_subexpression_**)** language construct disables backtracking. The regular expression engine will match as many characters in the input string as it can. When no further match is possible, it will not backtrack to attempt alternate pattern matches. (That is, the subexpression matches only strings that would be matched by the subexpression alone; it does not attempt to match a string based on the subexpression and any subexpressions that follow it.)
|
||||
|
||||
This option is recommended if you know that backtracking will not succeed. Preventing the regular expression engine from performing unnecessary searching improves performance.
|
||||
|
||||
The following example illustrates how a nonbacktracking subexpression modifies the results of a pattern match. The backtracking regular expression successfully matches a series of repeated characters followed by one more occurrence of the same character on a word boundary, but the nonbacktracking regular expression does not.
|
||||
|
||||
```csharp
using System;
using System.Text.RegularExpressions;

public class Example
{
    public static void Main()
    {
        string[] inputs = { "cccd.", "aaad", "aaaa" };
        string back = @"(\w)\1+.\b";
        string noback = @"(?>(\w)\1+).\b";

        foreach (string input in inputs)
        {
            Match match1 = Regex.Match(input, back);
            Match match2 = Regex.Match(input, noback);
            Console.WriteLine("{0}: ", input);

            Console.Write(" Backtracking : ");
            if (match1.Success)
                Console.WriteLine(match1.Value);
            else
                Console.WriteLine("No match");

            Console.Write(" Nonbacktracking: ");
            if (match2.Success)
                Console.WriteLine(match2.Value);
            else
                Console.WriteLine("No match");
        }
    }
}
// The example displays the following output:
// cccd.:
// Backtracking : cccd
// Nonbacktracking: cccd
// aaad:
// Backtracking : aaad
// Nonbacktracking: aaad
// aaaa:
// Backtracking : aaaa
// Nonbacktracking: No match
```
|
||||
|
||||
The nonbacktracking regular expression `(?>(\w)\1+).\b` is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`(\w)` | Match a single word character and assign it to the first capturing group.
|
||||
`\1+` | Match the value of the first captured substring one or more times.
|
||||
`.` | Match any character.
|
||||
`\b` | End the match on a word boundary.
|
||||
`(?>(\w)\1+)` | Match one or more occurrences of a duplicated word character, but do not backtrack to match the last character on a word boundary.
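
As a smaller illustration of the same behavior, the following sketch contrasts `a+ab` with `(?>a+)ab` on the input "aaab". It is not part of the original example; the class name `AtomicGroupSketch` and the helper `FirstMatch` are hypothetical names chosen for this sketch.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AtomicGroupSketch
{
    // Returns the matched text, or null when the pattern does not match.
    public static string FirstMatch(string input, string pattern)
    {
        Match match = Regex.Match(input, pattern);
        return match.Success ? match.Value : null;
    }

    public static void Main()
    {
        // a+ gives back one "a" so that the trailing "ab" can match.
        Console.WriteLine(FirstMatch("aaab", @"a+ab") ?? "No match");
        // (?>a+) keeps every "a" it consumed, so the trailing "ab" cannot match.
        Console.WriteLine(FirstMatch("aaab", @"(?>a+)ab") ?? "No match");
    }
}
// The sketch should display:
//    aaab
//    No match
```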
|
||||
|
||||
## Grouping Constructs and Regular Expression Objects
|
||||
|
||||
Substrings that are matched by a regular expression capturing group are represented by [System.Text.RegularExpressions.Group](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group) objects, which can be retrieved from the [System.Text.RegularExpressions.GroupCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.GroupCollection) object that is returned by the [Match.Groups](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match#System_Text_RegularExpressions_Match_Groups) property. The [GroupCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.GroupCollection) object is populated as follows:
|
||||
|
||||
* The first [Group](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group) object in the collection (the object at index zero) represents the entire match.
|
||||
|
||||
* The next set of [Group](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group) objects represent unnamed (numbered) capturing groups. They appear in the order in which they are defined in the regular expression, from left to right. The index values of these groups range from 1 to the number of unnamed capturing groups in the collection. (The index of a particular group is equivalent to its numbered backreference. For more information about backreferences, see [Backreference Constructs in Regular Expressions](backreference.md).)
|
||||
|
||||
* The final set of [Group](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group) objects represent named capturing groups. They appear in the order in which they are defined in the regular expression, from left to right. The index value of the first named capturing group is one greater than the index of the last unnamed capturing group. If there are no unnamed capturing groups in the regular expression, the index value of the first named capturing group is one.
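
The ordering rules in the list above can be checked with a short sketch. It assumes a pattern in which a named group is written before an unnamed one; the class name `GroupOrderSketch`, the helper `GroupValues`, and the group name `first` are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupOrderSketch
{
    // Returns the value of every group in the first match, in index order.
    public static string[] GroupValues(string input, string pattern)
    {
        Match match = Regex.Match(input, pattern);
        string[] values = new string[match.Groups.Count];
        for (int ctr = 0; ctr < match.Groups.Count; ctr++)
            values[ctr] = match.Groups[ctr].Value;
        return values;
    }

    public static void Main()
    {
        // The named group is written first but is indexed after the unnamed group.
        string[] values = GroupValues("hello world", @"(?<first>\w+)\s(\w+)");
        for (int ctr = 0; ctr < values.Length; ctr++)
            Console.WriteLine("Group {0}: '{1}'", ctr, values[ctr]);
    }
}
// The sketch should display:
//    Group 0: 'hello world'
//    Group 1: 'world'
//    Group 2: 'hello'
```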
|
||||
|
||||
|
||||
If you apply a quantifier to a capturing group, the corresponding [Group](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group) object's [Capture.Value](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Capture#properties), [Capture.Index](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Capture#properties), and [Capture.Length](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Capture#properties) properties reflect the last substring that is captured by a capturing group. You can retrieve a complete set of substrings that are captured by groups that have quantifiers from the [CaptureCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.CaptureCollection) object that is returned by the [Group.Captures](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group#System_Text_RegularExpressions_Group_Captures) property.
|
||||
|
||||
The following example clarifies the relationship between the [Group](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group) and [Capture](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Capture) objects.
|
||||
|
||||
```csharp
using System;
using System.Text.RegularExpressions;

public class Example
{
    public static void Main()
    {
        string pattern = @"(\b(\w+)\W+)+";
        string input = "This is a short sentence.";
        Match match = Regex.Match(input, pattern);
        Console.WriteLine("Match: '{0}'", match.Value);
        for (int ctr = 1; ctr < match.Groups.Count; ctr++)
        {
            Console.WriteLine(" Group {0}: '{1}'", ctr, match.Groups[ctr].Value);
            int capCtr = 0;
            foreach (Capture capture in match.Groups[ctr].Captures)
            {
                Console.WriteLine(" Capture {0}: '{1}'", capCtr, capture.Value);
                capCtr++;
            }
        }
    }
}
// The example displays the following output:
// Match: 'This is a short sentence.'
// Group 1: 'sentence.'
// Capture 0: 'This '
// Capture 1: 'is '
// Capture 2: 'a '
// Capture 3: 'short '
// Capture 4: 'sentence.'
// Group 2: 'sentence'
// Capture 0: 'This'
// Capture 1: 'is'
// Capture 2: 'a'
// Capture 3: 'short'
// Capture 4: 'sentence'
```
|
||||
|
||||
The regular expression pattern `(\b(\w+)\W+)+` extracts individual words from a string. It is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Begin the match at a word boundary.
|
||||
`(\w+)` | Match one or more word characters. Together, these characters form a word. This is the second capturing group.
|
||||
`\W+` | Match one or more non-word characters.
|
||||
`(\b(\w+)\W+)+` | Match the pattern of one or more word characters followed by one or more non-word characters one or more times. This is the first capturing group.
|
||||
|
||||
The second capturing group matches each word of the sentence. The first capturing group matches each word along with the punctuation and white space that follow the word. The [Group](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group) object whose index is 2 provides information about the text matched by the second capturing group. The complete set of words captured by the capturing group is available from the [CaptureCollection](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.CaptureCollection) object returned by the [Group.Captures](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Group#System_Text_RegularExpressions_Group_Captures) property.
|
||||
|
||||
## See Also
|
||||
|
||||
[Regular Expression Language - Quick Reference](index.md)
|
||||
|
||||
[Backtracking in Regular Expressions](backtracking.md)
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
|
@ -1,214 +0,0 @@
|
|||
---
|
||||
title: Regular Expression Language - Quick Reference
|
||||
description: Regular Expression Language - Quick Reference
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: a3c514d9-f7a9-446b-af84-c4dcbbe7c552
|
||||
---
|
||||
|
||||
# Regular Expression Language - Quick Reference
|
||||
|
||||
A regular expression is a pattern that the regular expression engine attempts to match in input text. A pattern consists of one or more character literals, operators, or constructs.
|
||||
|
||||
Each section in this quick reference lists a particular category of characters, operators, and constructs that you can use to define regular expressions:
|
||||
|
||||
* [Character escapes](#Character-escapes)
|
||||
|
||||
* [Character classes](#Character-classes)
|
||||
|
||||
* [Anchors](#Anchors)
|
||||
|
||||
* [Grouping constructs](#Grouping-constructs)
|
||||
|
||||
* [Quantifiers](#Quantifiers)
|
||||
|
||||
* [Backreference constructs](#Backreference-constructs)
|
||||
|
||||
* [Alternation constructs](#Alternation-constructs)
|
||||
|
||||
* [Substitutions](#Substitutions)
|
||||
|
||||
* [Regular expression options](#Regular-expression-options)
|
||||
|
||||
* [Miscellaneous constructs](#Miscellaneous-constructs)
|
||||
|
||||
We’ve also provided this information in two formats that you can download and print for easy reference:
|
||||
|
||||
* [Download in Word (.docx) format](http://download.microsoft.com/download/D/2/4/D240EBF6-A9BA-4E4F-A63F-AEB6DA0B921C/Regular%20expressions%20quick%20reference.docx)
|
||||
|
||||
* [Download in PDF (.pdf) format](http://download.microsoft.com/download/D/2/4/D240EBF6-A9BA-4E4F-A63F-AEB6DA0B921C/Regular%20expressions%20quick%20reference.pdf)
|
||||
|
||||
## Character Escapes
|
||||
|
||||
The backslash character (\) in a regular expression indicates that the character that follows it either is a special character (as shown in the following table), or should be interpreted literally. For more information, see [Character Escapes in Regular Expressions](escapes.md).
|
||||
|
||||
Escaped character | Description | Pattern | Matches
|
||||
----------------- | ----------- | ------- | -------
|
||||
**\a** | Matches a bell character, \u0007. | `\a` | "\u0007" in "Error!" + '\u0007'
|
||||
**\b** | In a character class, matches a backspace, \u0008. | `[\b]{3,}` | "\b\b\b\b" in "\b\b\b\b"
|
||||
**\t** | Matches a tab, \u0009. | `(\w+)\t` | "item1\t", "item2\t" in "item1\titem2\t"
|
||||
**\r** | Matches a carriage return, \u000D. (**\r** is not equivalent to the newline character, **\n**.) | `\r\n(\w+)` | "\r\nThese" in "\r\nThese are\ntwo lines."
|
||||
**\v** | Matches a vertical tab, \u000B. | `[\v]{2,}` | "\v\v\v" in "\v\v\v"
|
||||
**\f** | Matches a form feed, \u000C. | `[\f]{2,}` | "\f\f\f" in "\f\f\f"
|
||||
**\n** | Matches a new line, \u000A. | `\r\n(\w+)` | "\r\nThese" in "\r\nThese are\ntwo lines."
|
||||
**\e** | Matches an escape, \u001B. | `\e` | "\x001B" in "\x001B"
|
||||
**\**_nnn_ | Uses octal representation to specify a character (*nnn* consists of two or three digits). | `\w\040\w` | "a b", "c d" in "a bc d"
|
||||
**\x**_nn_ | Uses hexadecimal representation to specify a character (*nn* consists of exactly two digits). | `\w\x20\w` | "a b", "c d" in "a bc d"
|
||||
**\c**_X_ or **\c**_x_ | Matches the ASCII control character that is specified by *X* or *x*, where *X* or *x* is the letter of the control character. | `\cC` | "\x0003" in "\x0003" (Ctrl-C)
|
||||
**\u**_nnnn_ | Matches a Unicode character by using hexadecimal representation (exactly four digits, as represented by *nnnn*). | `\w\u0020\w` | "a b", "c d" in "a bc d"
|
||||
**\** | When followed by a character that is not recognized as an escaped character in this and other tables in this topic, matches that character. For example, __\*__ is the same as **\x2A**, and **\.** is the same as **\x2E**. This allows the regular expression engine to disambiguate language elements (such as `*` or `?`) and character literals (represented by `\*` or `\?`). | `\d+[\+-x\*]\d+` | "2+2" and "3*9" in "(2+2) * 3*9"
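
A few of the escapes in the table can be tried with a minimal sketch like the following. The class name `EscapeSketch` and the helper `Matches` are hypothetical; the patterns are variations on the ones in the table.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeSketch
{
    // Thin wrapper around Regex.IsMatch, used only to keep the calls short.
    public static bool Matches(string input, string pattern)
    {
        return Regex.IsMatch(input, pattern);
    }

    public static void Main()
    {
        // Octal, hexadecimal, and Unicode escapes that all denote a space.
        Console.WriteLine(Matches("a b", @"a\040b"));
        Console.WriteLine(Matches("a b", @"a\x20b"));
        Console.WriteLine(Matches("a b", @"a\u0020b"));
        // \* is a literal asterisk, not a quantifier.
        Console.WriteLine(Matches("3*9", @"\d\*\d"));
        Console.WriteLine(Matches("39", @"\d\*\d"));
    }
}
// The sketch should display True four times, followed by False.
```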
|
||||
|
||||
## Character Classes
|
||||
|
||||
A character class matches any one of a set of characters. Character classes include the language elements listed in the following table. For more information, see [Character Classes in Regular Expressions](classes.md).
|
||||
|
||||
Character class | Description | Pattern | Matches
|
||||
--------------- | ----------- | ------- | -------
|
||||
__[__*character_group*__]__ | Matches any single character in *character_group*. By default, the match is case-sensitive. | `[ae]` | "a" in "gray", "a", "e" in "lane"
|
||||
__[^__*character_group*__]__ | Negation: Matches any single character that is not in *character_group*. By default, characters in *character_group* are case-sensitive. | `[^aei]` | "r", "g", "n" in "reign"
|
||||
__[__*first-last*__]__ | Character range: Matches any single character in the range from *first* to *last*. | `[A-Z]` | "A", "B" in "AB123"
|
||||
**.** | Wildcard: Matches any single character except \n. To match a literal period character (. or \u002E), you must precede it with the escape character (\.). | `a.e` | "ave" in "nave", "ate" in "water"
|
||||
__\p{__*name*__}__ | Matches any single character in the Unicode general category or named block specified by *name*. | `\p{Lu}`, `\p{IsCyrillic}` | "C", "L" in "City Lights", "?", "?" in "??em"
|
||||
__\P{__*name*__}__ | Matches any single character that is not in the Unicode general category or named block specified by *name*. | `\P{Lu}`, `\P{IsCyrillic}` |` "i", "t", "y" in "City", "e", "m" in "??em"
|
||||
**\w** | Matches any word character. | `\w` | "I", "D", "A", "1", "3" in "ID A1.3"
|
||||
**\W** | Matches any non-word character. | `\W` | " ", "." in "ID A1.3"
|
||||
**\s** | Matches any white-space character. | `\w\s` | "D " in "ID A1.3"
|
||||
**\S** | Matches any non-white-space character. | `\s\S` | " _" in "int __ctr"
|
||||
**\d** | Matches any decimal digit. | `\d` | "4" in "4 = IV"
|
||||
**\D** | Matches any character other than a decimal digit. | `\D` | " ", "=", " ", "I", "V" in "4 = IV"
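
Two rows of the table can be reproduced with a short sketch. The class name `CharacterClassSketch` and the helper `AllMatches` are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CharacterClassSketch
{
    // Collects the text of every match of the pattern in the input.
    public static string[] AllMatches(string input, string pattern)
    {
        MatchCollection matches = Regex.Matches(input, pattern);
        string[] values = new string[matches.Count];
        for (int ctr = 0; ctr < matches.Count; ctr++)
            values[ctr] = matches[ctr].Value;
        return values;
    }

    public static void Main()
    {
        // \p{Lu} matches uppercase letters only.
        Console.WriteLine(string.Join(", ", AllMatches("City Lights", @"\p{Lu}")));
        // [^aei] matches any character other than "a", "e", or "i".
        Console.WriteLine(string.Join(", ", AllMatches("reign", @"[^aei]")));
    }
}
// The sketch should display:
//    C, L
//    r, g, n
```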
|
||||
|
||||
## Anchors
|
||||
|
||||
Anchors, or atomic zero-width assertions, cause a match to succeed or fail depending on the current position in the string, but they do not cause the engine to advance through the string or consume characters. The metacharacters listed in the following table are anchors. For more information, see [Anchors in Regular Expressions](anchors.md).
|
||||
|
||||
Assertion | Description | Pattern | Matches
|
||||
--------- | ----------- | ------- | -------
|
||||
**^** | The match must start at the beginning of the string or line. | `^\d{3}` | "901" in "901-333-"
|
||||
**$** | The match must occur at the end of the string or before **\n** at the end of the line or string. | `-\d{3}$` | "-333" in "-901-333"
|
||||
**\A** | The match must occur at the start of the string. | `\A\d{3}` | "901" in "901-333-"
|
||||
**\Z** | The match must occur at the end of the string or before **\n** at the end of the string. | `-\d{3}\Z` | "-333" in "-901-333"
|
||||
**\z** | The match must occur at the end of the string. | `-\d{3}\z` | "-333" in "-901-333"
|
||||
**\G** | The match must occur at the point where the previous match ended. | `\G\(\d\)` | "(1)", "(3)", "(5)" in "(1)(3)(5)[7](9)"
|
||||
**\b** | The match must occur on a boundary between a **\w** (alphanumeric) and a **\W** (nonalphanumeric) character. | `\b\w+\s\w+\b` | "them theme", "them them" in "them theme them them"
|
||||
**\B** | The match must not occur on a **\b** boundary. | `\Bend\w*\b` | "ends", "ender" in "end sends endure lender"
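
The `\G` and `\B` rows are the least intuitive ones in the table, so here is a minimal sketch that runs both patterns against the inputs listed. The class name `AnchorSketch` and the helper `AllMatches` are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorSketch
{
    // Collects the text of every match of the pattern in the input.
    public static string[] AllMatches(string input, string pattern)
    {
        MatchCollection matches = Regex.Matches(input, pattern);
        string[] values = new string[matches.Count];
        for (int ctr = 0; ctr < matches.Count; ctr++)
            values[ctr] = matches[ctr].Value;
        return values;
    }

    public static void Main()
    {
        // \G requires each match to start where the previous one ended,
        // so matching stops at "[7]" and never reaches "(9)".
        Console.WriteLine(string.Join(", ", AllMatches("(1)(3)(5)[7](9)", @"\G\(\d\)")));
        // \B requires "end" to start inside a word rather than at a word boundary.
        Console.WriteLine(string.Join(", ", AllMatches("end sends endure lender", @"\Bend\w*\b")));
    }
}
// The sketch should display:
//    (1), (3), (5)
//    ends, ender
```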
|
||||
|
||||
## Grouping Constructs
|
||||
|
||||
Grouping constructs delineate subexpressions of a regular expression and typically capture substrings of an input string. Grouping constructs include the language elements listed in the following table. For more information, see [Grouping Constructs in Regular Expressions](grouping.md).
|
||||
|
||||
Grouping construct | Description | Pattern | Matches
------------------ | ----------- | ------- | -------
**(**_subexpression_**)** | Captures the matched subexpression and assigns it a one-based ordinal number. | `(\w)\1` | "ee" in "deep"
**(?<**_name_**>**_subexpression_**)** | Captures the matched subexpression into a named group. | `(?<double>\w)\k<double>` | "ee" in "deep"
**(?<**_name1_**-**_name2_**>**_subexpression_**)** | Defines a balancing group definition. For more information, see the "Balancing Group Definition" section in [Grouping Constructs in Regular Expressions](grouping.md). | `(((?'Open'\()[^\(\)]*)+((?'Close-Open'\))[^\(\)]*)+)*(?(Open)(?!))$` | "((1-3)*(3-1))" in "3+2^((1-3)*(3-1))"
**(?:**_subexpression_**)** | Defines a noncapturing group. | `Write(?:Line)?` | "WriteLine" in "Console.WriteLine()", "Write" in "Console.Write(value)"
**(?imnsx-imnsx:**_subexpression_**)** | Applies or disables the specified options within _subexpression_. For more information, see [Regular Expression Options](options.md). | `A\d{2}(?i:\w+)\b` | "A12xl", "A12XL" in "A12xl A12XL a12xl"
**(?=**_subexpression_**)** | Zero-width positive lookahead assertion. | `\w+(?=\.)` | "is", "ran", and "out" in "He is. The dog ran. The sun is out."
**(?!**_subexpression_**)** | Zero-width negative lookahead assertion. | `\b(?!un)\w+\b` | "sure", "used" in "unsure sure unity used"
**(?<=**_subexpression_**)** | Zero-width positive lookbehind assertion. | `(?<=19)\d{2}\b` | "99", "50", "05" in "1851 1999 1950 1905 2003"
**(?<!**_subexpression_**)** | Zero-width negative lookbehind assertion. | `(?<!19)\d{2}\b` | "51", "03" in "1851 1999 1950 1905 2003"
**(?>**_subexpression_**)** | Nonbacktracking (or "greedy") subexpression. | `[13579](?>A+B+)` | "1ABB", "3ABB", and "5AB" in "1ABB 3ABBC 5AB 5AC"
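
The lookahead and lookbehind rows can be verified with a minimal sketch such as the one below. The class name `LookaroundSketch` and the helper `AllMatches` are hypothetical; the patterns and inputs are taken from the table.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaroundSketch
{
    // Collects the text of every match of the pattern in the input.
    public static string[] AllMatches(string input, string pattern)
    {
        MatchCollection matches = Regex.Matches(input, pattern);
        string[] values = new string[matches.Count];
        for (int ctr = 0; ctr < matches.Count; ctr++)
            values[ctr] = matches[ctr].Value;
        return values;
    }

    public static void Main()
    {
        string years = "1851 1999 1950 1905 2003";
        // Positive lookahead: a word that is immediately followed by a period.
        Console.WriteLine(string.Join(", ", AllMatches("He is. The dog ran. The sun is out.", @"\w+(?=\.)")));
        // Positive lookbehind: the last two digits of years that start with 19.
        Console.WriteLine(string.Join(", ", AllMatches(years, @"(?<=19)\d{2}\b")));
        // Negative lookbehind: the last two digits of years that do not start with 19.
        Console.WriteLine(string.Join(", ", AllMatches(years, @"(?<!19)\d{2}\b")));
    }
}
// The sketch should display:
//    is, ran, out
//    99, 50, 05
//    51, 03
```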
|
||||
|
||||
## Quantifiers
|
||||
|
||||
A quantifier specifies how many instances of the previous element (which can be a character, a group, or a character class) must be present in the input string for a match to occur. Quantifiers include the language elements listed in the following table. For more information, see [Quantifiers in Regular Expressions](quantifiers.md).
|
||||
|
||||
Quantifier | Description | Pattern | Matches
|
||||
---------- | ----------- | ------- | -------
|
||||
__*__ | Matches the previous element zero or more times. | `\d*\.\d` | ".0", "19.9", "219.9"
|
||||
**+** | Matches the previous element one or more times. | `"be+"` | "bee" in "been", "be" in "bent"
|
||||
**?** | Matches the previous element zero or one time. | `"rai?n"` | "ran", "rain"
|
||||
**{**_n_**}** | Matches the previous element exactly *n* times. | `",\d{3}"` | ",043" in "1,043.6", ",876", ",543", and ",210" in "9,876,543,210"
|
||||
**{**_n_,**}** | Matches the previous element at least *n* times. | `"\d{2,}"` | "166", "29", "1930"
|
||||
**{**_n_,_m_**}** | Matches the previous element at least *n* times, but no more than *m* times. | `"\d{3,5}"` | "166", "17668"; "19302" in "193024"
|
||||
__*?__ | Matches the previous element zero or more times, but as few times as possible. | `\d*?\.\d` | ".0", "19.9", "219.9"
|
||||
**+?** | Matches the previous element one or more times, but as few times as possible. | `"be+?"` | "be" in "been", "be" in "bent"
|
||||
**??** | Matches the previous element zero or one time, but as few times as possible. | `"rai??n"` | "ran", "rain"
|
||||
**{**_n_**}?** | Matches the preceding element exactly *n* times. | `",\d{3}?"` | ",043" in "1,043.6", ",876", ",543", and ",210" in "9,876,543,210"
|
||||
**{**_n_,**}?** | Matches the previous element at least *n* times, but as few times as possible. | `"\d{2,}?"` | "166", "29", "1930"
|
||||
**{**_n_,_m_**}?** | Matches the previous element between *n* and *m* times, but as few times as possible. | `"\d{3,5}?"` | "166", "17668"; "193", "024" in "193024"
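
The difference between the greedy quantifiers and their lazy counterparts is easiest to see side by side. The following sketch is illustrative only; the class name `QuantifierSketch` and the helper `AllMatches` are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuantifierSketch
{
    // Collects the text of every match of the pattern in the input.
    public static string[] AllMatches(string input, string pattern)
    {
        MatchCollection matches = Regex.Matches(input, pattern);
        string[] values = new string[matches.Count];
        for (int ctr = 0; ctr < matches.Count; ctr++)
            values[ctr] = matches[ctr].Value;
        return values;
    }

    public static void Main()
    {
        // Greedy + takes as many "e" characters as it can; lazy +? takes one.
        Console.WriteLine(string.Join(", ", AllMatches("been", @"be+")));
        Console.WriteLine(string.Join(", ", AllMatches("been", @"be+?")));
        // Greedy {3,5} takes five digits; lazy {3,5}? stops after three.
        Console.WriteLine(string.Join(", ", AllMatches("193024", @"\d{3,5}")));
        Console.WriteLine(string.Join(", ", AllMatches("193024", @"\d{3,5}?")));
    }
}
// The sketch should display:
//    bee
//    be
//    19302
//    193, 024
```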
|
||||
|
||||
## Backreference Constructs
|
||||
|
||||
A backreference allows a previously matched subexpression to be identified subsequently in the same regular expression. The following table lists the backreference constructs supported by regular expressions in the .NET Framework. For more information, see [Backreference Constructs in Regular Expressions](backreference.md).
|
||||
|
||||
Backreference construct | Description | Pattern | Matches
|
||||
----------------------- | ----------- | ------- | -------
|
||||
**\**_number_ | Backreference. Matches the value of a numbered subexpression. | `(\w)\1 ` | "ee" in "seek"
|
||||
**\k<**_name_**>** | Named backreference. Matches the value of a named expression. | `(?<char>\w)\k<char>` | "ee" in "seek"
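
Both backreference forms in the table find the same doubled character, as the following minimal sketch shows. The class name `BackreferenceSketch` and the helper `FirstMatch` are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BackreferenceSketch
{
    // Returns the text of the first match, or null when there is none.
    public static string FirstMatch(string input, string pattern)
    {
        Match match = Regex.Match(input, pattern);
        return match.Success ? match.Value : null;
    }

    public static void Main()
    {
        // Numbered backreference to the first capturing group.
        Console.WriteLine(FirstMatch("seek", @"(\w)\1"));
        // Named backreference to the group called "char".
        Console.WriteLine(FirstMatch("seek", @"(?<char>\w)\k<char>"));
    }
}
// The sketch should display "ee" twice.
```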
|
||||
|
||||
## Alternation Constructs
|
||||
|
||||
Alternation constructs modify a regular expression to enable either/or matching. These constructs include the language elements listed in the following table. For more information, see [Alternation Constructs in Regular Expressions](alternation.md).
|
||||
|
||||
Alternation construct | Description | Pattern | Matches
--------------------- | ----------- | ------- | -------
**\|** | Matches any one element separated by the vertical bar (\|) character. | `th(e\|is\|at)` | "the", "this" in "this is the day. "
**(?(**_expression_**)**_yes_**\|**_no_**)** | Matches *yes* if the regular expression pattern designated by *expression* matches; otherwise, matches the optional *no* part. *expression* is interpreted as a zero-width assertion. | `(?(A)A\d{2}\b\|\b\d{3}\b)` | "A10", "910" in "A10 C103 910"
**(?(**_name_**)**_yes_**\|**_no_**)** | Matches *yes* if *name*, a named or numbered capturing group, has a match; otherwise, matches the optional *no*. | `(?<quoted>")?(?(quoted).+?"\|\S+\s)` | Dogs.jpg, "Yiska playing.jpg" in "Dogs.jpg "Yiska playing.jpg""
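
The plain alternation and the conditional on an expression can be tried with the sketch below. The class name `AlternationSketch` and the helper `AllMatches` are hypothetical; in the C# source the vertical bar is written without any escaping.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AlternationSketch
{
    // Collects the text of every match of the pattern in the input.
    public static string[] AllMatches(string input, string pattern)
    {
        MatchCollection matches = Regex.Matches(input, pattern);
        string[] values = new string[matches.Count];
        for (int ctr = 0; ctr < matches.Count; ctr++)
            values[ctr] = matches[ctr].Value;
        return values;
    }

    public static void Main()
    {
        // Either/or matching after the common prefix "th".
        Console.WriteLine(string.Join(", ", AllMatches("this is the day. ", @"th(e|is|at)")));
        // If the next character is "A", match A plus two digits;
        // otherwise match a standalone three-digit number.
        Console.WriteLine(string.Join(", ", AllMatches("A10 C103 910", @"(?(A)A\d{2}\b|\b\d{3}\b)")));
    }
}
// The sketch should display:
//    this, the
//    A10, 910
```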
|
||||
|
||||
## Substitutions
|
||||
|
||||
Substitutions are regular expression language elements that are supported in replacement patterns. For more information, see [Substitutions in Regular Expressions](substitutions.md).
|
||||
|
||||
Character | Description | Pattern | Replacement pattern | Input string | Result string
|
||||
--------- | ----------- | ------- | ------------------- | ------------ | -------------
|
||||
**$**_number_ | Substitutes the substring matched by group *number*. | `\b(\w+)(\s)(\w+)\b` | `$3$2$1` | "one two" | "two one"
|
||||
**${**_name_**}** | Substitutes the substring matched by the named group *name*. | `\b(?<word1>\w+)(\s)(?<word2>\w+)\b` | `${word2} ${word1}` | "one two" | "two one"
|
||||
**$$** | Substitutes a literal "$". | `\b(\d+)\s?USD` | `$$$1` | "103 USD" | "$103"
|
||||
**$&** | Substitutes a copy of the whole match. | `\$?\d*\.?\d+` | `**$&**` | "$1.30" | "**$1.30**"
|
||||
**$`** | Substitutes all the text of the input string before the match. | `B+` | `` $` `` | "AABBCC" | "AAAACC"
|
||||
**$'** | Substitutes all the text of the input string after the match. | `B+` | `$'` | "AABBCC" | "AACCCC"
|
||||
**$+** | Substitutes the last group that was captured. | `B+(C+)` | `$+` | "AABBCCDD" | "AACCDD"
|
||||
**$_** | Substitutes the entire input string. | `B+` | `$_` | "AABBCC" | "AAAABBCCCC"
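
Substitutions are used through the replacement argument of `Regex.Replace`. The following sketch runs three rows of the table; the class name `SubstitutionSketch` is hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SubstitutionSketch
{
    // Swaps two words by referring to numbered groups in the replacement pattern.
    public static string SwapWords(string input)
    {
        return Regex.Replace(input, @"\b(\w+)(\s)(\w+)\b", "$3$2$1");
    }

    public static void Main()
    {
        Console.WriteLine(SwapWords("one two"));
        // $$ emits a literal dollar sign; $1 emits the captured digits.
        Console.WriteLine(Regex.Replace("103 USD", @"\b(\d+)\s?USD", "$$$1"));
        // $_ replaces each match with the entire input string.
        Console.WriteLine(Regex.Replace("AABBCC", "B+", "$_"));
    }
}
// The sketch should display:
//    two one
//    $103
//    AAAABBCCCC
```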
|
||||
|
||||
## Regular Expression Options
|
||||
|
||||
You can specify options that control how the regular expression engine interprets a regular expression pattern. Many of these options can be specified either inline (in the regular expression pattern) or as one or more `RegexOptions` constants. This quick reference lists only inline options. For more information about inline and `RegexOptions` options, see the article [Regular Expression Options](options.md).
|
||||
|
||||
You can specify an inline option in two ways:
|
||||
|
||||
* By using the miscellaneous construct **(?imnsx-imnsx)**, where a minus sign (-) before an option or set of options turns those options off. For example, **(?i-mn)** turns case-insensitive matching (**i**) on, turns multiline mode (**m**) off, and turns unnamed group captures (**n**) off. The option applies to the regular expression pattern from the point at which the option is defined, and is effective either to the end of the pattern or to the point where another construct reverses the option.
|
||||
|
||||
* By using the grouping construct **(?imnsx-imnsx:**_subexpression_**)**, which defines options for the specified group only.
|
||||
|
||||
The .NET Core regular expression engine supports the following inline options.
|
||||
|
||||
Option | Description | Pattern | Matches
------ | ----------- | ------- | -------
**i** | Use case-insensitive matching. | `\b(?i)a(?-i)a\w+\b` | "aardvark", "aaaAuto" in "aardvark AAAuto aaaAuto Adam breakfast"
**m** | Use multiline mode. **^** and **$** match the beginning and end of a line, instead of the beginning and end of a string. | For an example, see the "Multiline Mode" section in [Regular Expression Options](options.md). |
**n** | Do not capture unnamed groups. | For an example, see the "Explicit Captures Only" section in [Regular Expression Options](options.md). |
**s** | Use single-line mode. | For an example, see the "Single-line Mode" section in [Regular Expression Options](options.md). |
**x** | Ignore unescaped white space in the regular expression pattern. | `\b(?x) \d+ \s \w+` | "1 aardvark", "2 cats" in "1 aardvark 2 cats IV centurions"
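
The following sketch shows the two ways of specifying an option, inline and through a `RegexOptions` constant, and then runs the **i** row of the table. The class name `OptionSketch` and the helper `AllMatches` are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OptionSketch
{
    // Collects the text of every match of the pattern in the input.
    public static string[] AllMatches(string input, string pattern)
    {
        MatchCollection matches = Regex.Matches(input, pattern);
        string[] values = new string[matches.Count];
        for (int ctr = 0; ctr < matches.Count; ctr++)
            values[ctr] = matches[ctr].Value;
        return values;
    }

    public static void Main()
    {
        // The same option, once inline and once as a RegexOptions constant.
        Console.WriteLine(Regex.IsMatch("ABC", @"(?i)abc"));
        Console.WriteLine(Regex.IsMatch("ABC", @"abc", RegexOptions.IgnoreCase));
        // (?i) turns case-insensitivity on for the first "a" only; (?-i) turns it off again.
        string input = "aardvark AAAuto aaaAuto Adam breakfast";
        Console.WriteLine(string.Join(", ", AllMatches(input, @"\b(?i)a(?-i)a\w+\b")));
    }
}
// The sketch should display:
//    True
//    True
//    aardvark, aaaAuto
```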
|
||||
|
||||
## Miscellaneous Constructs
|
||||
|
||||
Miscellaneous constructs either modify a regular expression pattern or provide information about it. The following table lists the miscellaneous constructs supported by .NET Core. For more information, see [Miscellaneous Constructs in Regular Expressions](miscellaneous.md).
|
||||
|
||||
Construct | Definition | Example
|
||||
--------- | ---------- | -------
|
||||
**(?imnsx-imnsx)** | Sets or disables options such as case insensitivity in the middle of a pattern. For more information, see [Regular Expression Options](options.md). | `\bA(?i)b\w+\b` matches "ABA", "Able" in "ABA Able Act"
|
||||
**(?#** _comment_**)** | Inline comment. The comment ends at the first closing parenthesis. | `\bA(?#Matches words starting with A)\w+\b`
**#** [to end of line] | X-mode comment. The comment starts at an unescaped # and continues to the end of the line. | `(?x)\bA\w+\b#Matches words starting with A`
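
Both comment styles leave the matching behavior untouched, which the sketch below checks by running a commented pattern of each kind against the same input. The class name `CommentSketch` and the helper `AllMatches` are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CommentSketch
{
    // Collects the text of every match of the pattern in the input.
    public static string[] AllMatches(string input, string pattern)
    {
        MatchCollection matches = Regex.Matches(input, pattern);
        string[] values = new string[matches.Count];
        for (int ctr = 0; ctr < matches.Count; ctr++)
            values[ctr] = matches[ctr].Value;
        return values;
    }

    public static void Main()
    {
        string input = "ABA Able Act";
        // Inline comment: everything between (?# and the first ) is ignored.
        Console.WriteLine(string.Join(", ", AllMatches(input, @"\bA(?#Matches words starting with A)\w+\b")));
        // X-mode comment: with (?x), an unescaped # starts a comment that runs to the end of the line.
        Console.WriteLine(string.Join(", ", AllMatches(input, @"(?x)\bA\w+\b#Matches words starting with A")));
    }
}
// The sketch should display "ABA, Able, Act" twice.
```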
|
||||
|
||||
## See Also
|
||||
|
||||
[System.Text.RegularExpressions](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions)
|
||||
|
||||
[Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex)
|
||||
|
||||
[Download in Word (.docx) format](http://download.microsoft.com/download/D/2/4/D240EBF6-A9BA-4E4F-A63F-AEB6DA0B921C/Regular%20expressions%20quick%20reference.docx)
|
||||
|
||||
[Download in PDF (.pdf) format](http://download.microsoft.com/download/D/2/4/D240EBF6-A9BA-4E4F-A63F-AEB6DA0B921C/Regular%20expressions%20quick%20reference.pdf)
|
||||
|
||||
|
||||
|
||||
|
||||
|
|
@ -1,199 +0,0 @@
|
|||
---
|
||||
title: Miscellaneous Constructs in Regular Expressions
|
||||
description: Miscellaneous Constructs in Regular Expressions
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: c815d613-5d6b-40a5-a732-df7b309ff4ee
|
||||
---
|
||||
|
||||
# Miscellaneous Constructs in Regular Expressions
|
||||
|
||||
|
||||
Regular expressions in .NET Core include three miscellaneous language constructs. One lets you enable or disable particular matching options in the middle of a regular expression pattern. The remaining two let you include comments in a regular expression.
|
||||
|
||||
## Inline Options
|
||||
|
||||
You can set or disable specific pattern matching options for part of a regular expression by using the syntax
|
||||
|
||||
```
|
||||
(?imnsx-imnsx)
|
||||
```
|
||||
|
||||
You list the options you want to enable after the question mark, and the options you want to disable after the minus sign. The following table describes each option. For more information about each option, see [Regular Expression Options](options.md).
|
||||
|
||||
Option | Description
|
||||
------ | -----------
|
||||
**i** | Case-insensitive matching.
|
||||
**m** | Multiline mode.
|
||||
**n** | Explicit captures only. (Parentheses do not act as capturing groups.)
|
||||
**s** | Single-line mode.
|
||||
**x** | Ignore unescaped white space, and allow x-mode comments.
|
||||
|
||||
Any change in regular expression options defined by the **(?imnsx-imnsx)** construct remains in effect until the end of the enclosing group.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> The **(?imnsx-imnsx**:_subexpression_**)** grouping construct provides identical functionality for a subexpression. For more information, see [Grouping Constructs in Regular Expressions](grouping.md).
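
The scoping rule stated above (an option set inside a group lasts only until that group closes) can be checked with a small sketch. The pattern and the class name `OptionScopeSketch` are hypothetical and chosen only to isolate the rule.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OptionScopeSketch
{
    // (?i) is set inside the group, so it applies to "b" only:
    // "a" precedes the option, and "c" follows the end of the enclosing group.
    public const string Pattern = @"^(a(?i)b)c$";

    public static bool IsMatch(string input)
    {
        return Regex.IsMatch(input, Pattern);
    }

    public static void Main()
    {
        Console.WriteLine(IsMatch("aBc"));   // "B" is matched case-insensitively
        Console.WriteLine(IsMatch("aBC"));   // "C" is outside the group
        Console.WriteLine(IsMatch("ABc"));   // "A" comes before the option is set
    }
}
// The sketch should display:
//    True
//    False
//    False
```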
|
||||
|
||||
The following example uses the **i**, **n**, and **x** options to enable case insensitivity and explicit captures, and to ignore white space in the pattern, starting in the middle of a regular expression.
|
||||
|
||||
```csharp
using System;
using System.Text.RegularExpressions;

public class Example
{
    public static void Main()
    {
        string pattern;
        string input = "double dare double Double a Drooling dog The Dreaded Deep";

        pattern = @"\b(D\w+)\s(d\w+)\b";
        // Match pattern using default options.
        foreach (Match match in Regex.Matches(input, pattern))
        {
            Console.WriteLine(match.Value);
            if (match.Groups.Count > 1)
                for (int ctr = 1; ctr < match.Groups.Count; ctr++)
                    Console.WriteLine(" Group {0}: {1}", ctr, match.Groups[ctr].Value);
        }
        Console.WriteLine();

        // Change regular expression pattern to include options.
        pattern = @"\b(D\w+)(?ixn) \s (d\w+) \b";
        // Match new pattern with options.
        foreach (Match match in Regex.Matches(input, pattern))
        {
            Console.WriteLine(match.Value);
            if (match.Groups.Count > 1)
                for (int ctr = 1; ctr < match.Groups.Count; ctr++)
                    Console.WriteLine(" Group {0}: '{1}'", ctr, match.Groups[ctr].Value);
        }
    }
}
// The example displays the following output:
// Drooling dog
// Group 1: Drooling
// Group 2: dog
//
// Drooling dog
// Group 1: 'Drooling'
// Dreaded Deep
// Group 1: 'Dreaded'
```
|
||||
|
||||
The example defines two regular expressions. The first, `\b(D\w+)\s(d\w+)\b`, matches two consecutive words that begin with an uppercase "D" and a lowercase "d". The second regular expression, `\b(D\w+)(?ixn) \s (d\w+) \b`, uses inline options to modify this pattern, as described in the following table. A comparison of the results confirms the effect of the `(?ixn)` construct.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Start at a word boundary.
|
||||
`(D\w+)` | Match a capital "D" followed by one or more word characters. This is the first capture group.
|
||||
`(?ixn)` | From this point on, make comparisons case-insensitive, make only explicit captures, and ignore white space in the regular expression pattern.
|
||||
`\s` | Match a white-space character.
|
||||
`(d\w+)` | Match an uppercase or lowercase "d" followed by one or more word characters. This group is not captured because the **n** (explicit capture) option was enabled.
|
||||
`\b` | Match a word boundary.
|
||||
|
||||
## Inline Comment
|
||||
|
||||
The **(?#** _comment_**)** construct lets you include an inline comment in a regular expression. The regular expression engine does not use any part of the comment in pattern matching, although the comment is included in the string that is returned by the [Regex.ToString](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_ToString) method. The comment ends at the first closing parenthesis.
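
A minimal sketch of this behavior follows, assuming a short hypothetical pattern: the comment has no effect on what is matched, but it is still part of the string returned by `Regex.ToString`. The class name `InlineCommentSketch` is hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InlineCommentSketch
{
    public const string Pattern = @"\bA(?#starts with A)\w+\b";

    public static void Main()
    {
        Regex rgx = new Regex(Pattern);
        // The comment is ignored when matching.
        Console.WriteLine(rgx.Match("An Apple").Value);
        // The comment is preserved in the pattern string returned by ToString.
        Console.WriteLine(rgx.ToString());
    }
}
// The sketch should display:
//    An
//    \bA(?#starts with A)\w+\b
```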
|
||||
|
||||
The following example repeats the regular expression pattern with inline options from the example in the previous section. It adds two inline comments to the regular expression to indicate whether the comparison is case-sensitive. The regular expression pattern, `\b((?# case sensitive comparison)D\w+)\s(?ixn)((?#case insensitive comparison)d\w+)\b`, is defined as follows.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Start at a word boundary.
|
||||
`(?# case sensitive comparison)` | A comment. It does not affect pattern-matching behavior.
|
||||
`(D\w+)` | Match a capital "D" followed by one or more word characters. This is the first capturing group.
|
||||
`\s` | Match a white-space character.
|
||||
`(?ixn)` | From this point on, make comparisons case-insensitive, make only explicit captures, and ignore white space in the regular expression pattern.
|
||||
`(?#case insensitive comparison)` | A comment. It does not affect pattern-matching behavior.
|
||||
`(d\w+)` | Match an uppercase or lowercase "d" followed by one or more word characters. This group is not captured because the n (explicit capture) option is enabled.
|
||||
`\b` | Match a word boundary.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"\b((?# case sensitive comparison)D\w+)\s(?ixn)((?#case insensitive comparison)d\w+)\b";
|
||||
Regex rgx = new Regex(pattern);
|
||||
string input = "double dare double Double a Drooling dog The Dreaded Deep";
|
||||
|
||||
Console.WriteLine("Pattern: " + pattern.ToString());
|
||||
// Match pattern using default options.
|
||||
foreach (Match match in rgx.Matches(input))
|
||||
{
|
||||
Console.WriteLine(match.Value);
|
||||
if (match.Groups.Count > 1)
|
||||
{
|
||||
for (int ctr = 1; ctr < match.Groups.Count; ctr++)
|
||||
Console.WriteLine(" Group {0}: {1}", ctr, match.Groups[ctr].Value);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Pattern: \b((?# case sensitive comparison)D\w+)\s(?ixn)((?#case insensitive comparison)d\w+)\b
|
||||
// Drooling dog
|
||||
// Group 1: Drooling
|
||||
// Dreaded Deep
|
||||
// Group 1: Dreaded
|
||||
```
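As a smaller sketch of the same idea, the following code compares a pattern with and without an inline comment. The class name `InlineCommentSketch`, the helper `SameMatch`, and the sample input are illustrative assumptions, not part of the original example. The sketch suggests that the comment leaves the match unchanged while still being part of the string that `Regex.ToString` returns.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class used only for this sketch.
public class InlineCommentSketch
{
    // Returns true when both patterns produce the same first match for the input.
    public static bool SameMatch(string input, string plain, string commented)
    {
        return Regex.Match(input, plain).Value == Regex.Match(input, commented).Value;
    }

    public static void Main()
    {
        string plain = @"\bD\w+";
        string commented = @"\b(?# word starting with D)D\w+";
        string input = "a Drooling dog";

        // The comment is ignored during matching, so both patterns find "Drooling".
        Console.WriteLine(SameMatch(input, plain, commented));

        // Regex.ToString returns the pattern as it was supplied, comment included.
        Console.WriteLine(new Regex(commented).ToString() == commented);
    }
}
// The sketch displays the following output:
// True
// True
```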
|
||||
|
||||
## End-of-Line Comment
|
||||
|
||||
A number sign (**#**) marks an x-mode comment, which starts at the unescaped # character at the end of the regular expression pattern and continues until the end of the line. To use this construct, you must either enable the **x** option (through inline options) or supply the [RegexOptions.IgnorePatternWhitespace](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexOptions#System_Text_RegularExpressions_RegexOptions_IgnorePatternWhitespace) value to the *options* parameter when instantiating the [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) object or calling a static [Regex](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex) method.
|
||||
|
||||
The following example illustrates the end-of-line comment construct. It determines whether a string is a composite format string that includes at least one format item. The following table describes the constructs in the regular expression pattern:
|
||||
|
||||
`\{\d+(,-*\d+)*(\:\w{1,4}?)*\}(?x) # Looks for a composite format item.`
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\{` | Match an opening brace.
|
||||
`\d+` | Match one or more decimal digits.
|
||||
`(,-*\d+)*` | Match zero or more occurrences of a comma, followed by zero or more minus signs, followed by one or more decimal digits.
|
||||
`(\:\w{1,4}?)*` | Match zero or more occurrences of a colon, followed by one to four, but as few as possible, word characters.
|
||||
`\}` | Match a closing brace.
|
||||
`(?x)` | Enable the ignore pattern white-space option so that the end-of-line comment will be recognized.
|
||||
`# Looks for a composite format item.` | An end-of-line comment.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"\{\d+(,-*\d+)*(\:\w{1,4}?)*\}(?x) # Looks for a composite format item.";
|
||||
string input = "{0,-3:F}";
|
||||
Console.WriteLine("'{0}':", input);
|
||||
if (Regex.IsMatch(input, pattern))
|
||||
Console.WriteLine(" contains a composite format item.");
|
||||
else
|
||||
Console.WriteLine(" does not contain a composite format item.");
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// '{0,-3:F}':
|
||||
// contains a composite format item.
|
||||
```
|
||||
|
||||
Note that, instead of providing the `(?x)` construct in the regular expression, the comment could also have been recognized by calling the [Regex.IsMatch(String, String, RegexOptions)](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_IsMatch_System_String_System_String_System_Text_RegularExpressions_RegexOptions_) method and passing it the [RegexOptions.IgnorePatternWhitespace](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexOptions#System_Text_RegularExpressions_RegexOptions_IgnorePatternWhitespace) enumeration value.
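A minimal sketch of that alternative follows. It assumes the same composite format pattern, spread over several lines so that each part can carry its own end-of-line comment; the class name `XModeSketch`, the `HasFormatItem` helper, and the comment wording are illustrative assumptions.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class used only for this sketch.
public class XModeSketch
{
    // White space and # comments in this pattern are ignored only because
    // RegexOptions.IgnorePatternWhitespace is passed to Regex.IsMatch below.
    public const string Pattern = @"
        \{ \d+            # opening brace and the item index
        (,-*\d+)*         # alignment component, if any
        (\:\w{1,4}?)*     # format string component, if any
        \}                # closing brace";

    public static bool HasFormatItem(string input)
    {
        return Regex.IsMatch(input, Pattern, RegexOptions.IgnorePatternWhitespace);
    }

    public static void Main()
    {
        Console.WriteLine(HasFormatItem("{0,-3:F}"));
        Console.WriteLine(HasFormatItem("no items here"));
    }
}
// The sketch displays the following output:
// True
// False
```

Passing the option at the call site keeps the pattern free of the `(?x)` construct, at the cost of requiring every caller to remember the option.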
|
||||
|
||||
## See Also
|
||||
|
||||
[Regular Expression Language - Quick Reference](index.md)
|
||||
|
File diff not shown because it is too large
|
@ -1,522 +0,0 @@
|
|||
---
|
||||
title: Quantifiers in Regular Expressions
|
||||
description: Quantifiers in Regular Expressions
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: a0ef6db0-2563-45c3-a61a-71a5d5075766
|
||||
---
|
||||
|
||||
# Quantifiers in Regular Expressions
|
||||
|
||||
|
||||
Quantifiers specify how many instances of a character, group, or character class must be present in the input for a match to be found. The following table lists the quantifiers supported by .NET Core.
|
||||
|
||||
Greedy quantifier | Lazy quantifier | Description
|
||||
----------------- | --------------- | -----------
|
||||
__*__ | __*?__ | Match zero or more times.
|
||||
**+** | **+?** | Match one or more times.
|
||||
**?** | **??** | Match zero or one time.
|
||||
**{**_n_**}** | **{**_n_**}?** | Match exactly n times.
|
||||
**{**_n_**,}** | **{**_n_**,}?** | Match at least n times.
|
||||
**{**_n_**,**_m_**}** | **{**_n_**,**_m_**}?** | Match from n to m times.
|
||||
|
||||
The quantities *n* and *m* are integer constants. Ordinarily, quantifiers are greedy; they cause the regular expression engine to match as many occurrences of particular patterns as possible. Appending the `?` character to a quantifier makes it lazy; it causes the regular expression engine to match as few occurrences as possible. For a complete description of the difference between greedy and lazy quantifiers, see the section [Greedy and Lazy Quantifiers](#Greedy-and-Lazy-Quantifiers) later in this topic.
|
||||
|
||||
> **Important**
|
||||
>
|
||||
> Nesting quantifiers (for example, as the regular expression pattern `(a*)*` does) can increase the number of comparisons that the regular expression engine must perform, as an exponential function of the number of characters in the input string. For more information about this behavior and its workarounds, see [Backtracking in Regular Expressions](backtracking.md).
|
||||
|
||||
## Regular Expression Quantifiers
|
||||
|
||||
The following sections list the quantifiers supported by .NET Core regular expressions.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> If the \*, +, ?, {, and } characters are encountered in a regular expression pattern, the regular expression engine interprets them as quantifiers or part of quantifier constructs unless they are included in a [character class](class.md). To interpret these as literal characters outside a character class, you must escape them by preceding them with a backslash. For example, the string `\*` in a regular expression pattern is interpreted as a literal asterisk ("*") character.
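The escaping rule in the note can be sketched as follows. The class name `EscapeSketch`, the `CountLiteral` helper, and the sample strings are illustrative assumptions; `Regex.Escape` is the existing framework method that adds the backslashes for you.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class used only for this sketch.
public class EscapeSketch
{
    // Counts occurrences of a literal string by escaping its metacharacters first.
    public static int CountLiteral(string input, string literal)
    {
        return Regex.Matches(input, Regex.Escape(literal)).Count;
    }

    public static void Main()
    {
        // Escaped: \* matches a literal asterisk.
        Console.WriteLine(Regex.IsMatch("2*3", @"2\*3"));

        // Unescaped: * is a quantifier, so the pattern means zero or more "2" followed by "3".
        Console.WriteLine(Regex.IsMatch("2*3", @"^2*3$"));

        // Regex.Escape inserts the backslashes automatically.
        Console.WriteLine(Regex.Escape("1+1=2?"));

        // Only the two literal "a+b" strings are counted; "aab" is not.
        Console.WriteLine(CountLiteral("a+b a+b aab", "a+b"));
    }
}
// The sketch displays the following output:
// True
// False
// 1\+1=2\?
// 2
```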
|
||||
|
||||
### Match Zero or More Times: *
|
||||
|
||||
The \* quantifier matches the preceding element zero or more times. It is equivalent to the **{0,}** quantifier. __*__ is a greedy quantifier whose lazy equivalent is __*?__.
|
||||
|
||||
The following example illustrates this regular expression. Of the nine numbers in the input string, five match the pattern and four (`95`, `929`, `9219`, and `9919`) do not.
|
||||
|
||||
```csharp
|
||||
string pattern = @"\b91*9*\b";
|
||||
string input = "99 95 919 929 9119 9219 999 9919 91119";
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
|
||||
|
||||
// The example displays the following output:
|
||||
// '99' found at position 0.
|
||||
// '919' found at position 6.
|
||||
// '9119' found at position 14.
|
||||
// '999' found at position 24.
|
||||
// '91119' found at position 33.
|
||||
```
|
||||
|
||||
The regular expression pattern is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Start at a word boundary.
|
||||
`91*` | Match a "9" followed by zero or more "1" characters.
|
||||
`9*` | Match zero or more "9" characters.
|
||||
`\b` | End at a word boundary.
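The stated equivalence between `*` and `{0,}` can be checked with a short sketch. It reuses the input string from the example above; the class name `StarEquivalenceSketch` and the `Values` helper are illustrative assumptions.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class used only for this sketch.
public class StarEquivalenceSketch
{
    // Collects the matched substrings into an array.
    public static string[] Values(string input, string pattern)
    {
        MatchCollection matches = Regex.Matches(input, pattern);
        string[] values = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
            values[i] = matches[i].Value;
        return values;
    }

    public static void Main()
    {
        string input = "99 95 919 929 9119 9219 999 9919 91119";
        string star = string.Join(",", Values(input, @"\b91*9*\b"));
        string braces = string.Join(",", Values(input, @"\b91{0,}9{0,}\b"));

        Console.WriteLine(star);
        // Both spellings of the quantifier yield the same matches.
        Console.WriteLine(star == braces);
    }
}
// The sketch displays the following output:
// 99,919,9119,999,91119
// True
```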
|
||||
|
||||
### Match One or More Times: +
|
||||
|
||||
The **+** quantifier matches the preceding element one or more times. It is equivalent to **{1,}**. **+** is a greedy quantifier whose lazy equivalent is **+?**.
|
||||
|
||||
For example, the regular expression `\ban+\w*?\b` tries to match entire words that begin with the letter `a` followed by one or more instances of the letter `n`. The following example illustrates this regular expression. The regular expression matches the words `an`, `annual`, `announcement`, and `antique`, and correctly fails to match `autumn` and `all`.
|
||||
|
||||
```csharp
|
||||
string pattern = @"\ban+\w*?\b";
|
||||
|
||||
string input = "Autumn is a great time for an annual announcement to all antique collectors.";
|
||||
foreach (Match match in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
|
||||
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
|
||||
|
||||
// The example displays the following output:
|
||||
// 'an' found at position 27.
|
||||
// 'annual' found at position 30.
|
||||
// 'announcement' found at position 37.
|
||||
// 'antique' found at position 57.
|
||||
```
|
||||
|
||||
The regular expression pattern is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Start at a word boundary.
|
||||
`an+` | Match an "a" followed by one or more "n" characters.
|
||||
`\w*?` | Match a word character zero or more times, but as few times as possible.
|
||||
`\b` | End at a word boundary.
|
||||
|
||||
### Match Zero or One Time: ?
|
||||
|
||||
The **?** quantifier matches the preceding element zero or one time. It is equivalent to **{0,1}**. **?** is a greedy quantifier whose lazy equivalent is **??**.
|
||||
|
||||
For example, the regular expression `\ban?\b` tries to match entire words that begin with the letter `a` followed by zero or one instance of the letter `n`. In other words, it tries to match the words `a` and `an`. The following example illustrates this regular expression.
|
||||
|
||||
```csharp
|
||||
string pattern = @"\ban?\b";
|
||||
string input = "An amiable animal with a large snount and an animated nose.";
|
||||
foreach (Match match in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
|
||||
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
|
||||
|
||||
// The example displays the following output:
|
||||
// 'An' found at position 0.
|
||||
// 'a' found at position 23.
|
||||
// 'an' found at position 42.
|
||||
```
|
||||
|
||||
The regular expression pattern is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Start at a word boundary.
|
||||
`an?` | Match an "a" followed by zero or one "n" character.
|
||||
`\b` | End at a word boundary.
|
||||
|
||||
### Match Exactly n Times: {n}
|
||||
|
||||
|
||||
|
||||
The **{**_n_**}** quantifier matches the preceding element exactly *n* times, where *n* is any integer. **{**_n_**}** is a greedy quantifier whose lazy equivalent is **{**_n_**}?**.
|
||||
|
||||
For example, the regular expression `\b\d+\,\d{3}\b` tries to match a word boundary followed by one or more decimal digits, followed by a comma, followed by three decimal digits, followed by a word boundary. The following example illustrates this regular expression.
|
||||
|
||||
```csharp
|
||||
string pattern = @"\b\d+\,\d{3}\b";
|
||||
string input = "Sales totaled 103,524 million in January, " +
|
||||
"106,971 million in February, but only " +
|
||||
"943 million in March.";
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
|
||||
|
||||
// The example displays the following output:
|
||||
// '103,524' found at position 14.
|
||||
// '106,971' found at position 42.
|
||||
```
|
||||
|
||||
The regular expression pattern is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Start at a word boundary.
|
||||
`\d+` | Match one or more decimal digits.
|
||||
`\,` | Match a comma character.
|
||||
`\d{3}` | Match three decimal digits.
|
||||
`\b` | End at a word boundary.
|
||||
|
||||
### Match at Least n Times: {n,}
|
||||
|
||||
The **{**_n_**,}** quantifier matches the preceding element at least *n* times, where *n* is any integer. **{**_n_**,}** is a greedy quantifier whose lazy equivalent is **{**_n_**,}?**.
|
||||
|
||||
For example, the regular expression `\b\d{2,}\b\D+` tries to match a word boundary followed by at least two digits, followed by a word boundary and one or more non-digit characters. The following example illustrates this regular expression. The regular expression fails to match the phrase "7 days" because it contains just one decimal digit, but it successfully matches the phrases "10 weeks" and "300 years".
|
||||
|
||||
```csharp
|
||||
string pattern = @"\b\d{2,}\b\D+";
|
||||
string input = "7 days, 10 weeks, 300 years";
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
|
||||
|
||||
// The example displays the following output:
|
||||
// '10 weeks, ' found at position 8.
|
||||
// '300 years' found at position 18.
|
||||
```
|
||||
|
||||
The regular expression pattern is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Start at a word boundary.
|
||||
`\d{2,}` | Match at least two decimal digits.
|
||||
`\b` | Match a word boundary.
|
||||
`\D+` | Match at least one non-digit character.
|
||||
|
||||
### Match Between n and m Times: {n,m}
|
||||
|
||||
The **{**_n_**,**_m_**}** quantifier matches the preceding element at least *n* times, but no more than *m* times, where *n* and *m* are integers. **{**_n_**,**_m_**}** is a greedy quantifier whose lazy equivalent is **{**_n_**,**_m_**}?**.
|
||||
|
||||
In the following example, the regular expression `(00\s){2,4}` tries to match between two and four occurrences of two zero digits followed by a space. Note that the final portion of the input string contains five pairs of zeros rather than the maximum of four. However, only the initial portion of this substring (up to the space that precedes the fifth pair of zeros) matches the regular expression pattern.
|
||||
|
||||
```csharp
|
||||
string pattern = @"(00\s){2,4}";
|
||||
string input = "0x00 FF 00 00 18 17 FF 00 00 00 21 00 00 00 00 00";
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
|
||||
|
||||
// The example displays the following output:
|
||||
// '00 00 ' found at position 8.
|
||||
// '00 00 00 ' found at position 23.
|
||||
// '00 00 00 00 ' found at position 35.
|
||||
```
|
||||
|
||||
### Match Zero or More Times (Lazy Match): *?
|
||||
|
||||
The __*?__ quantifier matches the preceding element zero or more times, but as few times as possible. It is the lazy counterpart of the greedy quantifier __*__.
|
||||
|
||||
In the following example, the regular expression `\b\w*?oo\w*?\b` matches all words that contain the string `oo`.
|
||||
|
||||
```csharp
|
||||
string pattern = @"\b\w*?oo\w*?\b";
|
||||
string input = "woof root root rob oof woo woe";
|
||||
foreach (Match match in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
|
||||
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
|
||||
|
||||
// The example displays the following output:
|
||||
// 'woof' found at position 0.
|
||||
// 'root' found at position 5.
|
||||
// 'root' found at position 10.
|
||||
// 'oof' found at position 19.
|
||||
// 'woo' found at position 23.
|
||||
```
|
||||
|
||||
The regular expression pattern is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Start at a word boundary.
|
||||
`\w*?` | Match zero or more word characters, but as few characters as possible.
|
||||
`oo` | Match the string "oo".
|
||||
`\w*?` | Match zero or more word characters, but as few characters as possible.
|
||||
`\b` | End on a word boundary.
|
||||
|
||||
### Match One or More Times (Lazy Match): +?
|
||||
|
||||
The **+?** quantifier matches the preceding element one or more times, but as few times as possible. It is the lazy counterpart of the greedy quantifier **+**.
|
||||
|
||||
For example, the regular expression `\b\w+?\b` matches one or more characters separated by word boundaries. The following example illustrates this regular expression.
|
||||
|
||||
```csharp
|
||||
string pattern = @"\b\w+?\b";
|
||||
string input = "Aa Bb Cc Dd Ee Ff";
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
|
||||
|
||||
// The example displays the following output:
|
||||
// 'Aa' found at position 0.
|
||||
// 'Bb' found at position 3.
|
||||
// 'Cc' found at position 6.
|
||||
// 'Dd' found at position 9.
|
||||
// 'Ee' found at position 12.
|
||||
// 'Ff' found at position 15.
|
||||
```
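In the example above, the trailing `\b` forces the lazy quantifier to keep expanding until it reaches the end of each word. The following sketch, which uses the hypothetical class `LazyPlusSketch` and a shortened input string, suggests what `+?` does when nothing after it forces expansion: it stops after a single character.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class used only for this sketch.
public class LazyPlusSketch
{
    // Joins all matched substrings with commas.
    public static string Join(string input, string pattern)
    {
        MatchCollection matches = Regex.Matches(input, pattern);
        string[] values = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
            values[i] = matches[i].Value;
        return string.Join(",", values);
    }

    public static void Main()
    {
        string input = "Aa Bb";
        Console.WriteLine(Join(input, @"\w+"));      // greedy: whole words
        Console.WriteLine(Join(input, @"\w+?"));     // lazy: one character per match
        Console.WriteLine(Join(input, @"\b\w+?\b")); // lazy, but \b forces whole words
    }
}
// The sketch displays the following output:
// Aa,Bb
// A,a,B,b
// Aa,Bb
```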
|
||||
|
||||
### Match Zero or One Time (Lazy Match): ??
|
||||
|
||||
The **??** quantifier matches the preceding element zero or one time, but as few times as possible. It is the lazy counterpart of the greedy quantifier **?**.
|
||||
|
||||
For example, the regular expression `^\s*(System.)??Console.Write(Line)??\(??` attempts to match the strings "Console.Write" or "Console.WriteLine". The string can also include "System." before "Console", and it can be followed by an opening parenthesis. The string must be at the beginning of a line, although it can be preceded by white space. The following example illustrates this regular expression.
|
||||
|
||||
```csharp
|
||||
string pattern = @"^\s*(System.)??Console.Write(Line)??\(??";
|
||||
string input = "System.Console.WriteLine(\"Hello!\")\n" +
|
||||
"Console.Write(\"Hello!\")\n" +
|
||||
"Console.WriteLine(\"Hello!\")\n" +
|
||||
"Console.ReadLine()\n" +
|
||||
" Console.WriteLine";
|
||||
foreach (Match match in Regex.Matches(input, pattern,
|
||||
RegexOptions.IgnorePatternWhitespace |
|
||||
RegexOptions.IgnoreCase |
|
||||
RegexOptions.Multiline))
|
||||
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
|
||||
|
||||
// The example displays the following output:
|
||||
// 'System.Console.Write' found at position 0.
|
||||
// 'Console.Write' found at position 35.
|
||||
// 'Console.Write' found at position 59.
|
||||
// ' Console.Write' found at position 106.
|
||||
```
|
||||
|
||||
The regular expression pattern is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`^` | Match the start of the input stream or, because the example passes the RegexOptions.Multiline option, the start of a line.
|
||||
`\s*` | Match zero or more white-space characters.
|
||||
`(System.)??` | Match zero or one occurrence of the string "System.".
|
||||
`Console.Write` | Match the string "Console.Write".
|
||||
`(Line)??` | Match zero or one occurrence of the string "Line".
|
||||
`\(??` | Match zero or one occurrence of the opening parenthesis.
|
||||
|
||||
### Match Exactly n Times (Lazy Match): {n}?
|
||||
|
||||
The **{**_n_**}?** quantifier matches the preceding element exactly *n* times, where *n* is any integer. It is the lazy counterpart of the greedy quantifier **{**_n_**}**.
|
||||
|
||||
In the following example, the regular expression `\b(\w{3,}?\.){2}?\w{3,}?\b` is used to identify a Web site address. Note that it matches "www.microsoft.com" and "msdn.microsoft.com", but does not match "mywebsite" or "mycompany.com".
|
||||
|
||||
```csharp
|
||||
string pattern = @"\b(\w{3,}?\.){2}?\w{3,}?\b";
|
||||
string input = "www.microsoft.com msdn.microsoft.com mywebsite mycompany.com";
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
|
||||
|
||||
// The example displays the following output:
|
||||
// 'www.microsoft.com' found at position 0.
|
||||
// 'msdn.microsoft.com' found at position 18.
|
||||
```
|
||||
|
||||
The regular expression pattern is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Start at a word boundary.
|
||||
`(\w{3,}?\.)` | Match at least 3 word characters, but as few characters as possible, followed by a dot or period character. This is the first capturing group.
|
||||
`(\w{3,}?\.){2}?` | Match the pattern in the first group two times, but as few times as possible.
|
||||
`\w{3,}?` | Match at least 3 word characters, but as few characters as possible.
|
||||
`\b` | End the match on a word boundary.
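Because an exact count leaves the engine no choice about how many repetitions to take, `{n}` and `{n}?` should produce the same matches. The following sketch illustrates this under assumed names (`ExactCountSketch`, `Join`) and an assumed digit string.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class used only for this sketch.
public class ExactCountSketch
{
    // Joins all matched substrings with commas.
    public static string Join(string input, string pattern)
    {
        MatchCollection matches = Regex.Matches(input, pattern);
        string[] values = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
            values[i] = matches[i].Value;
        return string.Join(",", values);
    }

    public static void Main()
    {
        string input = "12345678";
        Console.WriteLine(Join(input, @"\d{3}"));  // greedy form
        Console.WriteLine(Join(input, @"\d{3}?")); // lazy form, same result
    }
}
// The sketch displays the following output:
// 123,456
// 123,456
```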
|
||||
|
||||
### Match at Least n Times (Lazy Match): {n,}?
|
||||
|
||||
The **{**_n_**,}?** quantifier matches the preceding element at least *n* times, where *n* is any integer, but as few times as possible. It is the lazy counterpart of the greedy quantifier **{**_n_**,}**.
|
||||
|
||||
See the example for the **{**_n_**}?** quantifier in the previous section for an illustration. The regular expression in that example uses the **{**_n_**,}?** quantifier to match a string that has at least three characters followed by a period.
|
||||
|
||||
### Match Between n and m Times (Lazy Match): {n,m}?
|
||||
|
||||
The **{**_n_**,**_m_**}?** quantifier matches the preceding element between *n* and *m* times, where *n* and *m* are integers, but as few times as possible. It is the lazy counterpart of the greedy quantifier **{**_n_**,**_m_**}**.
|
||||
|
||||
In the following example, the regular expression `\b[A-Z](\w*?\s*?){1,10}[.!?]` matches sentences that contain between one and ten words. It matches all the sentences in the input string except for one sentence that contains 18 words.
|
||||
|
||||
```csharp
|
||||
string pattern = @"\b[A-Z](\w*?\s*?){1,10}[.!?]";
|
||||
string input = "Hi. I am writing a short note. Its purpose is " +
|
||||
"to test a regular expression that attempts to find " +
|
||||
"sentences with ten or fewer words. Most sentences " +
|
||||
"in this note are short.";
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine("'{0}' found at position {1}.", match.Value, match.Index);
|
||||
|
||||
// The example displays the following output:
|
||||
// 'Hi.' found at position 0.
|
||||
// 'I am writing a short note.' found at position 4.
|
||||
// 'Most sentences in this note are short.' found at position 132.
|
||||
```
|
||||
|
||||
The regular expression pattern is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Start at a word boundary.
|
||||
`[A-Z]` | Match an uppercase character from A to Z.
|
||||
`(\w*?\s*?)` | Match zero or more word characters, followed by zero or more white-space characters, in each case as few as possible. This is the first capture group.
|
||||
`{1,10}` | Match the previous pattern between 1 and 10 times.
|
||||
`[.!?]` | Match any one of the punctuation characters ".", "!", or "?".
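To isolate the effect of the `?` suffix on a bounded quantifier, the following sketch compares `a{2,4}` with `a{2,4}?` on a run of six "a" characters. The class name `BoundedLazySketch`, the `Join` helper, and the input are illustrative assumptions. The greedy form takes up to four characters per match, while the lazy form stops at the minimum of two.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class used only for this sketch.
public class BoundedLazySketch
{
    // Joins all matched substrings with commas.
    public static string Join(string input, string pattern)
    {
        MatchCollection matches = Regex.Matches(input, pattern);
        string[] values = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
            values[i] = matches[i].Value;
        return string.Join(",", values);
    }

    public static void Main()
    {
        string input = "aaaaaa";
        Console.WriteLine(Join(input, @"a{2,4}"));  // greedy: as many as allowed
        Console.WriteLine(Join(input, @"a{2,4}?")); // lazy: as few as allowed
    }
}
// The sketch displays the following output:
// aaaa,aa
// aa,aa,aa
```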
|
||||
|
||||
## Greedy and Lazy Quantifiers
|
||||
|
||||
A number of the quantifiers have two versions:
|
||||
|
||||
* A greedy version.
|
||||
|
||||
A greedy quantifier tries to match an element as many times as possible.
|
||||
|
||||
|
||||
* A non-greedy (or lazy) version.
|
||||
|
||||
A non-greedy quantifier tries to match an element as few times as possible. You can turn a greedy quantifier into a lazy quantifier by simply adding a **?**.
|
||||
|
||||
Consider a simple regular expression that is intended to extract the last four digits from a string of numbers such as a credit card number. The version of the regular expression that uses the __*__ greedy quantifier is `\b.*([0-9]{4})\b`. However, if a string contains two numbers, this regular expression matches the last four digits of the second number only, as the following example shows.
|
||||
|
||||
```csharp
|
||||
string greedyPattern = @"\b.*([0-9]{4})\b";
|
||||
string input1 = "1112223333 3992991999";
|
||||
foreach (Match match in Regex.Matches(input1, greedyPattern))
|
||||
Console.WriteLine("Account ending in ******{0}.", match.Groups[1].Value);
|
||||
|
||||
// The example displays the following output:
|
||||
// Account ending in ******1999.
|
||||
```
|
||||
|
||||
The regular expression fails to match the first number because the __*__ quantifier tries to match the previous element as many times as possible in the entire string, and so it finds its match at the end of the string.
|
||||
|
||||
This is not the desired behavior. Instead, you can use the __*?__ lazy quantifier to extract digits from both numbers, as the following example shows.
|
||||
|
||||
```csharp
|
||||
string lazyPattern = @"\b.*?([0-9]{4})\b";
|
||||
string input2 = "1112223333 3992991999";
|
||||
foreach (Match match in Regex.Matches(input2, lazyPattern))
|
||||
Console.WriteLine("Account ending in ******{0}.", match.Groups[1].Value);
|
||||
|
||||
// The example displays the following output:
|
||||
// Account ending in ******3333.
|
||||
// Account ending in ******1999.
|
||||
```
|
||||
|
||||
In most cases, regular expressions with greedy and lazy quantifiers return the same matches. They most commonly return different results when they are used with the wildcard (**.**) metacharacter, which matches any character.
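The wildcard case can be sketched with a pattern that extracts quoted text. The class name `WildcardSketch`, the `Join` helper, and the sample sentence are illustrative assumptions. With the greedy `.*`, a single match runs from the first quotation mark to the last; with the lazy `.*?`, each quoted word is matched separately.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class used only for this sketch.
public class WildcardSketch
{
    // Joins all matched substrings with " | ".
    public static string Join(string input, string pattern)
    {
        MatchCollection matches = Regex.Matches(input, pattern);
        string[] values = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
            values[i] = matches[i].Value;
        return string.Join(" | ", values);
    }

    public static void Main()
    {
        string input = "say \"one\" and \"two\"";
        Console.WriteLine(Join(input, "\".*\""));  // greedy wildcard
        Console.WriteLine(Join(input, "\".*?\"")); // lazy wildcard
    }
}
// The sketch displays the following output:
// "one" and "two"
// "one" | "two"
```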
|
||||
|
||||
## Quantifiers and Empty Matches
|
||||
|
||||
The quantifiers __*__, **+**, and **{**_n_**,**_m_**}** and their lazy counterparts never repeat after an empty match when the minimum number of captures has been found. This rule prevents quantifiers from entering infinite loops on empty subexpression matches when the maximum number of possible group captures is infinite or near infinite.
|
||||
|
||||
For example, the following code shows the result of a call to the [Regex.Match](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Match_System_String_) method with the regular expression pattern `(a?)*`, which matches zero or one "a" character zero or more times. Note that the single capturing group captures each "a" as well as [String.Empty](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Empty), but that there is no second empty match, because the first empty match causes the quantifier to stop repeating.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = "(a?)*";
|
||||
string input = "aaabbb";
|
||||
Match match = Regex.Match(input, pattern);
|
||||
Console.WriteLine("Match: '{0}' at index {1}",
|
||||
match.Value, match.Index);
|
||||
if (match.Groups.Count > 1) {
|
||||
GroupCollection groups = match.Groups;
|
||||
for (int grpCtr = 1; grpCtr <= groups.Count - 1; grpCtr++) {
|
||||
Console.WriteLine(" Group {0}: '{1}' at index {2}",
|
||||
grpCtr,
|
||||
groups[grpCtr].Value,
|
||||
groups[grpCtr].Index);
|
||||
int captureCtr = 0;
|
||||
foreach (Capture capture in groups[grpCtr].Captures) {
|
||||
captureCtr++;
|
||||
Console.WriteLine(" Capture {0}: '{1}' at index {2}",
|
||||
captureCtr, capture.Value, capture.Index);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Match: 'aaa' at index 0
|
||||
// Group 1: '' at index 3
|
||||
// Capture 1: 'a' at index 0
|
||||
// Capture 2: 'a' at index 1
|
||||
// Capture 3: 'a' at index 2
|
||||
// Capture 4: '' at index 3
|
||||
```
|
||||
|
||||
To see the practical difference between a capturing group that defines a minimum and a maximum number of captures and one that defines a fixed number of captures, consider the regular expression patterns `(a\1|(?(1)\1)){0,2}` and `(a\1|(?(1)\1)){2}`. Both regular expressions consist of a single capturing group, which is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`(a\1` | Either match "a" along with the value of the first captured group …
|
||||
`|(?(1)` | … or test whether the first captured group has been defined. (Note that the **(?(1)** construct does not define a capturing group.)
|
||||
`\1))` | If the first captured group exists, match its value. If the group does not exist, the group will match [String.Empty](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Empty).
|
||||
|
||||
The first regular expression tries to match this pattern between zero and two times; the second, exactly two times. Because the first pattern reaches its minimum number of captures with its first capture of [String.Empty](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Empty), it never repeats to try to match `a\1`; the `{0,2}` quantifier allows only empty matches in the last iteration. In contrast, the second regular expression does match "a" because it evaluates `a\1` a second time; the minimum number of iterations, 2, forces the engine to repeat after an empty match.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern, input;
|
||||
|
||||
pattern = @"(a\1|(?(1)\1)){0,2}";
|
||||
input = "aaabbb";
|
||||
|
||||
Console.WriteLine("Regex pattern: {0}", pattern);
|
||||
Match match = Regex.Match(input, pattern);
|
||||
Console.WriteLine("Match: '{0}' at position {1}.",
|
||||
match.Value, match.Index);
|
||||
if (match.Groups.Count > 1) {
|
||||
for (int groupCtr = 1; groupCtr <= match.Groups.Count - 1; groupCtr++)
|
||||
{
|
||||
Group group = match.Groups[groupCtr];
|
||||
Console.WriteLine(" Group: {0}: '{1}' at position {2}.",
|
||||
groupCtr, group.Value, group.Index);
|
||||
int captureCtr = 0;
|
||||
foreach (Capture capture in group.Captures) {
|
||||
captureCtr++;
|
||||
Console.WriteLine(" Capture: {0}: '{1}' at position {2}.",
|
||||
captureCtr, capture.Value, capture.Index);
|
||||
}
|
||||
}
|
||||
}
|
||||
Console.WriteLine();
|
||||
|
||||
pattern = @"(a\1|(?(1)\1)){2}";
|
||||
Console.WriteLine("Regex pattern: {0}", pattern);
|
||||
match = Regex.Match(input, pattern);
|
||||
Console.WriteLine("Matched '{0}' at position {1}.",
|
||||
match.Value, match.Index);
|
||||
if (match.Groups.Count > 1) {
|
||||
for (int groupCtr = 1; groupCtr <= match.Groups.Count - 1; groupCtr++)
|
||||
{
|
||||
Group group = match.Groups[groupCtr];
|
||||
Console.WriteLine(" Group: {0}: '{1}' at position {2}.",
|
||||
groupCtr, group.Value, group.Index);
|
||||
int captureCtr = 0;
|
||||
foreach (Capture capture in group.Captures) {
|
||||
captureCtr++;
|
||||
Console.WriteLine(" Capture: {0}: '{1}' at position {2}.",
|
||||
captureCtr, capture.Value, capture.Index);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Regex pattern: (a\1|(?(1)\1)){0,2}
|
||||
// Match: '' at position 0.
|
||||
// Group: 1: '' at position 0.
|
||||
// Capture: 1: '' at position 0.
|
||||
//
|
||||
// Regex pattern: (a\1|(?(1)\1)){2}
|
||||
// Matched 'a' at position 0.
|
||||
// Group: 1: 'a' at position 0.
|
||||
// Capture: 1: '' at position 0.
|
||||
// Capture: 2: 'a' at position 0.
|
||||
```
|
||||
|
||||
## See Also
|
||||
|
||||
[Regular Expression Language - Quick Reference](index.md)
|
||||
|
||||
[Backtracking in Regular Expressions](backtracking.md)
|
||||
|
|
@ -1,382 +0,0 @@
|
|||
---
|
||||
title: Substitutions in Regular Expressions
|
||||
description: Substitutions in Regular Expressions
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: b1a753f6-febc-4c6d-9d04-283203370aed
|
||||
---
|
||||
|
||||
# Substitutions in Regular Expressions
|
||||
|
||||
|
||||
Substitutions are language elements that are recognized only within replacement patterns. They use a regular expression pattern to define all or part of the text that is to replace matched text in the input string. The replacement pattern can consist of one or more substitutions along with literal characters. Replacement patterns are provided to overloads of the [Regex.Replace](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Matches_System_String_System_String_System_Text_RegularExpressions_RegexOptions_) method that have a *replacement* parameter and to the [Match.Result](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Match#System_Text_RegularExpressions_Match_Result_System_String_) method. The methods replace the matched pattern with the pattern that is defined by the *replacement* parameter.
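As a minimal sketch of the two entry points named above, the following example applies the same replacement pattern through `Match.Result` and through `Regex.Replace`. The class name, the helper method `SwapNames`, and the sample input are illustrative assumptions, not part of the original topic.

```csharp
using System;
using System.Text.RegularExpressions;

public class ResultExample
{
    // Hypothetical helper: reorders "First Last" into "Last, First".
    public static string SwapNames(string input)
    {
        Match match = Regex.Match(input, @"(\w+)\s(\w+)");
        // Match.Result expands the replacement pattern against this single match.
        return match.Success ? match.Result("$2, $1") : input;
    }

    public static void Main()
    {
        Console.WriteLine(SwapNames("John Smith"));
        // Regex.Replace applies the same replacement pattern to every match in the input.
        Console.WriteLine(Regex.Replace("John Smith", @"(\w+)\s(\w+)", "$2, $1"));
    }
}
// Both calls display:
//       Smith, John
```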
|
||||
|
||||
.NET Core defines the substitution elements listed in the following table.
|
||||
|
||||
Substitution | Description
|
||||
------------ | -----------
|
||||
**$**_number_ | Includes the last substring matched by the capturing group that is identified by *number*, where *number* is a decimal value, in the replacement string. For more information, see [Substituting a Numbered Group](#Substituting-a-Numbered-Group).
|
||||
**${**_name_**}** | Includes the last substring matched by the named group that is designated by **(?<**_name_**>)** in the replacement string. For more information, see [Substituting a Named Group](#Substituting-a-Named-Group).
|
||||
**$$** | Includes a single "$" literal in the replacement string. For more information, see [Substituting a "$" Character](#Substituting-a-"$"-Character).
|
||||
**$&** | Includes a copy of the entire match in the replacement string. For more information, see [Substituting the Entire Match](#Substituting-the-Entire-Match).
|
||||
**$`** | Includes all the text of the input string before the match in the replacement string. For more information, see [Substituting the Text before the Match](#Substituting-the-Text-before-the-Match).
|
||||
**$'** | Includes all the text of the input string after the match in the replacement string. For more information, see [Substituting the Text after the Match](#Substituting-the-Text-after-the-Match).
|
||||
**$+** | Includes the last group captured in the replacement string. For more information, see [Substituting the Last Captured Group](#Substituting-the-Last-Captured-Group).
|
||||
**$_** | Includes the entire input string in the replacement string. For more information, see [Substituting the Entire Input String](#Substituting-the-Entire-Input-String).
|
||||
|
||||
## Substitution Elements and Replacement Patterns
|
||||
|
||||
Substitutions are the only special constructs recognized in a replacement pattern. None of the other regular expression language elements, including character escapes and the period (**.**), which matches any character, are supported. Similarly, substitution language elements are recognized only in replacement patterns and are never valid in regular expression patterns.
|
||||
|
||||
The only character that can appear either in a regular expression pattern or in a substitution is the **$** character, although it has a different meaning in each context. In a regular expression pattern, **$** is an anchor that matches the end of the string. In a replacement pattern, **$** indicates the beginning of a substitution.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> For functionality similar to a replacement pattern within a regular expression, use a backreference. For more information about backreferences, see [Backreference Constructs](backreference.md).
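The difference between the two contexts can be sketched as follows. The class name and inputs are illustrative assumptions; the example only shows that `$` acts as an anchor in the pattern, that `$$` produces a literal "$" in the replacement, and that a character class such as `\d` is treated as plain text in a replacement pattern.

```csharp
using System;
using System.Text.RegularExpressions;

public class DollarExample
{
    public static void Main()
    {
        // Pattern "$": an anchor that matches the empty position at the end of the input.
        // Replacement " $$": a space followed by one literal "$".
        Console.WriteLine(Regex.Replace("cost 5", "$", " $$"));

        // In the pattern, \d matches a digit. In the replacement, \d is not a
        // language element, so it is copied to the result as the two characters "\d".
        Console.WriteLine(Regex.Replace("a1b", @"\d", @"\d"));
    }
}
// The example displays the following output:
//       cost 5 $
//       a\db
```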
|
||||
|
||||
## Substituting a Numbered Group
|
||||
|
||||
The **$**_number_ language element includes the last substring matched by the *number* capturing group in the replacement string, where *number* is the index of the capturing group. For example, the replacement pattern `$1` indicates that the matched substring is to be replaced by the first captured group. For more information about numbered capturing groups, see [Grouping Constructs in Regular Expressions](grouping.md).
|
||||
|
||||
All digits that follow **$** are interpreted as belonging to the *number* group. If this is not your intent, you can substitute a named group instead. For example, you can use the replacement string **${1}1** instead of **$11** to define the replacement string as the value of the first captured group along with the number "1". For more information, see [Substituting a Named Group](#Substituting-a-Named-Group).
|
||||
|
||||
Capturing groups that are not explicitly assigned names using the **(?<**_name_**>)** syntax are numbered from left to right starting at one. Named groups are also numbered from left to right, starting at one greater than the index of the last unnamed group. For example, in the regular expression `(\w)(?<digit>\d)`, the index of the `digit` named group is 2.
|
||||
|
||||
If *number* does not specify a valid capturing group defined in the regular expression pattern, **$**_number_ is interpreted as a literal character sequence that is used to replace each match.
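The rules in the preceding paragraphs can be illustrated with a short sketch. The class name, patterns, and inputs are assumptions chosen for illustration: the first two calls contrast `${1}1` with `$11`, and the third shows that an unnamed group is numbered before a named group even when the named group appears first in the pattern.

```csharp
using System;
using System.Text.RegularExpressions;

public class NumberedGroupExample
{
    public static void Main()
    {
        // ${1}1 is the first captured group followed by the literal digit 1.
        Console.WriteLine(Regex.Replace("abc", "(b)", "${1}1"));

        // $11 refers to group 11. The pattern defines no such group,
        // so the text "$11" is inserted literally.
        Console.WriteLine(Regex.Replace("abc", "(b)", "$11"));

        // The unnamed group (\d) is group 1; the named group "first" is group 2.
        Console.WriteLine(Regex.Replace("a1", @"(?<first>\w)(\d)", "$1-$2"));
    }
}
// The example displays the following output:
//       ab1c
//       a$11c
//       1-a
```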
|
||||
|
||||
The following example uses the **$**_number_ substitution to strip the currency symbol from a decimal value. It removes currency symbols found at the beginning or end of a monetary value, and recognizes the two most common decimal separators ("." and ",").
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"\p{Sc}*(\s?\d+[.,]?\d*)\p{Sc}*";
|
||||
string replacement = "$1";
|
||||
string input = "$16.32 12.19 £16.29 €18.29 €18,29";
|
||||
string result = Regex.Replace(input, pattern, replacement);
|
||||
Console.WriteLine(result);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// 16.32 12.19 16.29 18.29 18,29
|
||||
```
|
||||
|
||||
The regular expression pattern `\p{Sc}*(\s?\d+[.,]?\d*)\p{Sc}*` is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\p{Sc}*` | Match zero or more currency symbol characters.
|
||||
`\s?` | Match zero or one white-space characters.
|
||||
`\d+` | Match one or more decimal digits.
|
||||
`[.,]?` | Match zero or one period or comma.
|
||||
`\d*` | Match zero or more decimal digits.
|
||||
`(\s?\d+[.,]?\d*)` | Match zero or one white-space character, followed by one or more decimal digits, followed by zero or one period or comma, followed by zero or more decimal digits. This is the first capturing group. Because the replacement pattern is `$1`, the call to the [Regex.Replace](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Matches_System_String_System_String_System_Text_RegularExpressions_RegexOptions_) method replaces the entire matched substring with this captured group.
|
||||
|
||||
## Substituting a Named Group
|
||||
|
||||
The **${**_name_**}** language element substitutes the last substring matched by the *name* capturing group, where *name* is the name of a capturing group defined by the **(?<**_name_**>)** language element. For more information about named capturing groups, see [Grouping Constructs in Regular Expressions](grouping.md).
|
||||
|
||||
If *name* doesn't specify a valid named capturing group defined in the regular expression pattern but consists of digits, **${**_name_**}** is interpreted as a numbered group.
|
||||
|
||||
If *name* specifies neither a valid named capturing group nor a valid numbered capturing group defined in the regular expression pattern, **${**_name_**}** is interpreted as a literal character sequence that is used to replace each match.
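The three cases described above can be sketched as follows. The class name, the group name `word`, the deliberately undefined name `missing`, and the input are illustrative assumptions.

```csharp
using System;
using System.Text.RegularExpressions;

public class NamedGroupExample
{
    public static void Main()
    {
        string pattern = @"(?<word>\w+)";
        string input = "hello";

        // A valid group name: the captured text is substituted.
        Console.WriteLine(Regex.Replace(input, pattern, "[${word}]"));

        // Digits inside the braces are interpreted as a numbered group.
        // The named group "word" is also group 1.
        Console.WriteLine(Regex.Replace(input, pattern, "[${1}]"));

        // Neither a valid name nor a valid number: the text is inserted literally.
        Console.WriteLine(Regex.Replace(input, pattern, "[${missing}]"));
    }
}
// The example displays the following output:
//       [hello]
//       [hello]
//       [${missing}]
```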
|
||||
|
||||
The following example uses the **${**_name_**}** substitution to strip the currency symbol from a decimal value. It removes currency symbols found at the beginning or end of a monetary value, and recognizes the two most common decimal separators ("." and ",").
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"\p{Sc}*(?<amount>\s?\d+[.,]?\d*)\p{Sc}*";
|
||||
string replacement = "${amount}";
|
||||
string input = "$16.32 12.19 £16.29 €18.29 €18,29";
|
||||
string result = Regex.Replace(input, pattern, replacement);
|
||||
Console.WriteLine(result);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// 16.32 12.19 16.29 18.29 18,29
|
||||
```
|
||||
|
||||
The regular expression pattern `\p{Sc}*(?<amount>\s?\d+[.,]?\d*)\p{Sc}*` is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\p{Sc}*` | Match zero or more currency symbol characters.
|
||||
`\s?` | Match zero or one white-space characters.
|
||||
`\d+` | Match one or more decimal digits.
|
||||
`[.,]?` | Match zero or one period or comma.
|
||||
`\d*` | Match zero or more decimal digits.
|
||||
`(?<amount>\s?\d+[.,]?\d*)` | Match zero or one white-space character, followed by one or more decimal digits, followed by zero or one period or comma, followed by zero or more decimal digits. This is the capturing group named `amount`. Because the replacement pattern is `${amount}`, the call to the [Regex.Replace](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex#System_Text_RegularExpressions_Regex_Matches_System_String_System_String_System_Text_RegularExpressions_RegexOptions_) method replaces the entire matched substring with this captured group.
|
||||
|
||||
## Substituting a "$" Character
|
||||
|
||||
The **$$** substitution inserts a literal "$" character in the replaced string.
|
||||
|
||||
The following example uses the [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object to determine the current culture's currency symbol and its placement in a currency string. It then builds both a regular expression pattern and a replacement pattern dynamically. If the example is run on a computer whose current culture is en-US, it generates the regular expression pattern `\b(\d+)(\.(\d+))?` and the replacement pattern `$$ $1$2`. The replacement pattern replaces the matched text with a currency symbol and a space followed by the first and second captured groups.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Globalization;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
// Define array of decimal values.
|
||||
string[] values= { "16.35", "19.72", "1234", "0.99"};
|
||||
// Determine whether currency precedes (True) or follows (False) number.
|
||||
bool precedes = NumberFormatInfo.CurrentInfo.CurrencyPositivePattern % 2 == 0;
|
||||
// Get decimal separator.
|
||||
string cSeparator = NumberFormatInfo.CurrentInfo.CurrencyDecimalSeparator;
|
||||
// Get currency symbol.
|
||||
string symbol = NumberFormatInfo.CurrentInfo.CurrencySymbol;
|
||||
// If symbol is a "$", add an extra "$".
|
||||
if (symbol == "$") symbol = "$$";
|
||||
|
||||
// Define regular expression pattern and replacement string.
|
||||
string pattern = @"\b(\d+)(" + Regex.Escape(cSeparator) + @"(\d+))?";
|
||||
string replacement = "$1$2";
|
||||
replacement = precedes ? symbol + " " + replacement : replacement + " " + symbol;
|
||||
foreach (string value in values)
|
||||
Console.WriteLine("{0} --> {1}", value, Regex.Replace(value, pattern, replacement));
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// 16.35 --> $ 16.35
|
||||
// 19.72 --> $ 19.72
|
||||
// 1234 --> $ 1234
|
||||
// 0.99 --> $ 0.99
|
||||
```
|
||||
|
||||
The regular expression pattern `\b(\d+)(\.(\d+))?` is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Start the match at the beginning of a word boundary.
|
||||
`(\d+)` | Match one or more decimal digits. This is the first capturing group.
|
||||
`\.` | Match a period (the decimal separator).
|
||||
`(\d+)` | Match one or more decimal digits. This is the third capturing group.
|
||||
`(\.(\d+))?` | Match zero or one occurrence of a period followed by one or more decimal digits. This is the second capturing group.
|
||||
|
||||
## Substituting the Entire Match
|
||||
|
||||
The **$&** substitution includes the entire match in the replacement string. Often, it is used to add a substring to the beginning or end of the matched string. For example, the `($&)` replacement pattern adds parentheses to the beginning and end of each match. If there is no match, the **$&** substitution has no effect.
|
||||
|
||||
The following example uses the **$&** substitution to add quotation marks at the beginning and end of book titles stored in a string array.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"^(\w+\s?)+$";
|
||||
string[] titles = { "A Tale of Two Cities",
|
||||
"The Hound of the Baskervilles",
|
||||
"The Protestant Ethic and the Spirit of Capitalism",
|
||||
"The Origin of Species" };
|
||||
string replacement = "\"$&\"";
|
||||
foreach (string title in titles)
|
||||
Console.WriteLine(Regex.Replace(title, pattern, replacement));
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// "A Tale of Two Cities"
|
||||
// "The Hound of the Baskervilles"
|
||||
// "The Protestant Ethic and the Spirit of Capitalism"
|
||||
// "The Origin of Species"
|
||||
```
|
||||
|
||||
The regular expression pattern `^(\w+\s?)+$` is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`^` | Start the match at the beginning of the input string.
|
||||
`(\w+\s?)+` | Match the pattern of one or more word characters followed by zero or one white-space characters one or more times.
|
||||
`$` | Match the end of the input string.
|
||||
|
||||
The `"$&"` replacement pattern adds a literal quotation mark to the beginning and end of each match.
|
||||
|
||||
## Substituting the Text Before the Match
|
||||
|
||||
The **$`** substitution replaces the matched string with the entire input string before the match. That is, it duplicates the input string up to the match while removing the matched text. Any text that follows the matched text is unchanged in the result string. If there are multiple matches in an input string, the replacement text is derived from the original input string, rather than from the string in which text has been replaced by earlier matches. (The example provides an illustration.) If there is no match, the **$`** substitution has no effect.
|
||||
|
||||
The following example uses the regular expression pattern `\d+` to match a sequence of one or more decimal digits in the input string. The replacement string **$`** replaces these digits with the text that precedes the match.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string input = "aa1bb2cc3dd4ee5";
|
||||
string pattern = @"\d+";
|
||||
string substitution = "$`";
|
||||
Console.WriteLine("Matches:");
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine(" {0} at position {1}", match.Value, match.Index);
|
||||
|
||||
Console.WriteLine("Input string: {0}", input);
|
||||
Console.WriteLine("Output string: " +
|
||||
Regex.Replace(input, pattern, substitution));
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Matches:
|
||||
// 1 at position 2
|
||||
// 2 at position 5
|
||||
// 3 at position 8
|
||||
// 4 at position 11
|
||||
// 5 at position 14
|
||||
// Input string: aa1bb2cc3dd4ee5
|
||||
// Output string: aaaabbaa1bbccaa1bb2ccddaa1bb2cc3ddeeaa1bb2cc3dd4ee
|
||||
```
|
||||
|
||||
In this example, the input string `"aa1bb2cc3dd4ee5"` contains five matches. The following table illustrates how the **$`** substitution causes the regular expression engine to replace each match in the input string. Inserted text is shown in bold in the Result string column.
|
||||
|
||||
Match | Position | String before match | Result string
|
||||
----- | -------- | ------------------- | -------------
|
||||
1 | 2 | aa | aa**aa**bb2cc3dd4ee5
|
||||
2 | 5 | aa1bb | aaaabb**aa1bb**cc3dd4ee5
|
||||
3 | 8 | aa1bb2cc | aaaabbaa1bbcc**aa1bb2cc**dd4ee5
|
||||
4 | 11 | aa1bb2cc3dd | aaaabbaa1bbccaa1bb2ccdd**aa1bb2cc3dd**ee5
|
||||
5 | 14 | aa1bb2cc3dd4ee | aaaabbaa1bbccaa1bb2ccddaa1bb2cc3ddee**aa1bb2cc3dd4ee**
|
||||
|
||||
## Substituting the Text After the Match
|
||||
|
||||
The **$'** substitution replaces the matched string with the entire input string after the match. That is, it duplicates the input string after the match while removing the matched text. Any text that precedes the matched text is unchanged in the result string. If there is no match, the **$'** substitution has no effect.
|
||||
|
||||
The following example uses the regular expression pattern `\d+` to match a sequence of one or more decimal digits in the input string. The replacement string **$'** replaces these digits with the text that follows the match.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string input = "aa1bb2cc3dd4ee5";
|
||||
string pattern = @"\d+";
|
||||
string substitution = "$'";
|
||||
Console.WriteLine("Matches:");
|
||||
foreach (Match match in Regex.Matches(input, pattern))
|
||||
Console.WriteLine(" {0} at position {1}", match.Value, match.Index);
|
||||
Console.WriteLine("Input string: {0}", input);
|
||||
Console.WriteLine("Output string: " +
|
||||
Regex.Replace(input, pattern, substitution));
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Matches:
|
||||
// 1 at position 2
|
||||
// 2 at position 5
|
||||
// 3 at position 8
|
||||
// 4 at position 11
|
||||
// 5 at position 14
|
||||
// Input string: aa1bb2cc3dd4ee5
|
||||
// Output string: aabb2cc3dd4ee5bbcc3dd4ee5ccdd4ee5ddee5ee
|
||||
```
|
||||
|
||||
In this example, the input string `"aa1bb2cc3dd4ee5"` contains five matches. The following table illustrates how the **$'** substitution causes the regular expression engine to replace each match in the input string. Inserted text is shown in bold in the Result string column.
|
||||
|
||||
Match | Position | String after match | Result string
|
||||
----- | -------- | ------------------- | -------------
|
||||
1 | 2 | bb2cc3dd4ee5 | aa**bb2cc3dd4ee5**bb2cc3dd4ee5
|
||||
2 | 5 | cc3dd4ee5 | aabb2cc3dd4ee5bb**cc3dd4ee5**cc3dd4ee5
|
||||
3 | 8 | dd4ee5 | aabb2cc3dd4ee5bbcc3dd4ee5cc**dd4ee5**dd4ee5
|
||||
4 | 11 | ee5 | aabb2cc3dd4ee5bbcc3dd4ee5ccdd4ee5dd**ee5**ee5
|
||||
5 | 14 | [String.Empty](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Empty) | aabb2cc3dd4ee5bbcc3dd4ee5ccdd4ee5ddee5ee
|
||||
|
||||
## Substituting the Last Captured Group
|
||||
|
||||
The **$+** substitution replaces the matched string with the last captured group. If there are no captured groups or if the value of the last captured group is [String.Empty](https://docs.microsoft.com/dotnet/core/api/System.String#System_String_Empty), the **$+** substitution has no effect.
|
||||
|
||||
The following example identifies duplicate words in a string and uses the **$+** substitution to replace them with a single occurrence of the word. The [RegexOptions.IgnoreCase](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.RegexOptions#System_Text_RegularExpressions_RegexOptions_IgnoreCase) option is used to ensure that words that differ in case but that are otherwise identical are considered duplicates.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string pattern = @"\b(\w+)\s\1\b";
|
||||
string substitution = "$+";
|
||||
string input = "The the dog jumped over the fence fence.";
|
||||
Console.WriteLine(Regex.Replace(input, pattern, substitution,
|
||||
RegexOptions.IgnoreCase));
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// The dog jumped over the fence.
|
||||
```
|
||||
|
||||
The regular expression pattern `\b(\w+)\s\1\b` is defined as shown in the following table.
|
||||
|
||||
Pattern | Description
|
||||
------- | -----------
|
||||
`\b` | Begin the match at a word boundary.
|
||||
`(\w+)` | Match one or more word characters. This is the first capturing group.
|
||||
`\s` | Match a white-space character.
|
||||
`\1` | Match the first captured group.
|
||||
`\b` | End the match at a word boundary.
|
||||
|
||||
## Substituting the Entire Input String
|
||||
|
||||
The **$_** substitution replaces the matched string with the entire input string. That is, it removes the matched text and replaces it with the entire string, including the matched text.
|
||||
|
||||
The following example matches one or more decimal digits in the input string. It uses the **$_** substitution to replace them with the entire input string.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Text.RegularExpressions;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string input = "ABC123DEF456";
|
||||
string pattern = @"\d+";
|
||||
string substitution = "$_";
|
||||
Console.WriteLine("Original string: {0}", input);
|
||||
Console.WriteLine("String with substitution: {0}",
|
||||
Regex.Replace(input, pattern, substitution));
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Original string: ABC123DEF456
|
||||
// String with substitution: ABCABC123DEF456DEFABC123DEF456
|
||||
```
|
||||
|
||||
In this example, the input string `"ABC123DEF456"` contains two matches. The following table illustrates how the **$_** substitution causes the regular expression engine to replace each match in the input string. Inserted text is shown in bold in the Result string column.
|
||||
|
||||
Match | Position | Matched text | Result string
|
||||
----- | -------- | ------------------- | -------------
|
||||
1 | 3 | 123 | ABC**ABC123DEF456**DEF456
|
||||
2 | 9 | 456 | ABCABC123DEF456DEF**ABC123DEF456**
|
||||
|
||||
## See Also
|
||||
|
||||
[Regular Expression Language - Quick Reference](index.md)
|
||||
|
|
@ -1,27 +0,0 @@
|
|||
---
|
||||
title: Parsing Strings in .NET Core
|
||||
description: Parsing Strings in .NET Core
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 8d9c21fa-ad9c-4296-b595-5eb6f82adf4c
|
||||
---
|
||||
|
||||
# Parsing Strings in .NET Core
|
||||
|
||||
A parsing operation converts a string that represents a .NET Core base type into that base type. For example, a parsing operation is used to convert a string to a floating-point number or to a date and time value. The method most commonly used to perform a parsing operation is the `Parse` method. Because parsing is the reverse operation of formatting (which involves converting a base type into its string representation), many of the same rules and conventions apply. Just as formatting uses an object that implements the [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) interface to provide culture-sensitive formatting information, parsing also uses an object that implements the [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) interface to determine how to interpret a string representation.
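As a rough sketch of how a format provider steers both directions, the following example parses and formats a number with a custom [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object. The class name, the variable `commaDecimal`, and its separator settings are assumptions made for illustration; they are not tied to any particular culture.

```csharp
using System;
using System.Globalization;

public class ParseProviderExample
{
    public static void Main()
    {
        // Hypothetical provider: a comma as the decimal separator and a period as the group separator.
        NumberFormatInfo commaDecimal = new NumberFormatInfo
        {
            NumberDecimalSeparator = ",",
            NumberGroupSeparator = "."
        };

        // Parsing: the provider determines how "1,5" is interpreted.
        double value = double.Parse("1,5", commaDecimal);
        Console.WriteLine(value.ToString(CultureInfo.InvariantCulture));

        // Formatting: the same provider produces the reverse conversion.
        Console.WriteLine(value.ToString(commaDecimal));
    }
}
// The example displays the following output:
//       1.5
//       1,5
```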
|
||||
|
||||
## In This Section
|
||||
|
||||
[Parsing Numeric Strings in .NET Core](parsingnumeric.md) - Describes how to convert strings into .NET Core numeric types.
|
||||
|
||||
[Parsing Date and Time Strings in .NET Core](parsingdatetime.md) - Describes how to convert strings into .NET Core `DateTime` types.
|
||||
|
||||
[Parsing Other Strings in .NET Core](parsingother.md) - Describes how to convert strings into `Char`, `Boolean`, and `Enum` types.
|
||||
|
||||
|
|
@ -1,122 +0,0 @@
|
|||
---
|
||||
title: Parsing Date and Time Strings in .NET Core
|
||||
description: Parsing Date and Time Strings in .NET Core
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: ae0dd8dc-98eb-4d18-bd57-ac3e04046ff3
|
||||
---
|
||||
|
||||
# Parsing Date and Time Strings in .NET Core
|
||||
|
||||
Parsing methods convert the string representation of a date and time to an equivalent [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) object. The [Parse](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_Parse_System_String_) and [TryParse](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_TryParse_System_String_System_DateTime__) methods convert any of several common representations of a date and time. The [ParseExact](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_ParseExact_System_String_System_String_System_IFormatProvider_) and [TryParseExact](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_TryParseExact_System_String_System_String_System_IFormatProvider_System_Globalization_DateTimeStyles_System_DateTime__) methods convert a string representation that conforms to the pattern specified by a date and time format string.
|
||||
|
||||
Parsing is influenced by the properties of a format provider that supplies information such as the strings used for date and time separators, and the names of months, days, and eras. The format provider is the current [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo) object, which is provided implicitly by the current thread culture or explicitly by the [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) parameter of a parsing method. For the [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) parameter, specify a [CultureInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo) object, which represents a culture, or a [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo) object.
|
||||
|
||||
The string representation of a date to be parsed must include the month and at least a day or year. The string representation of a time must include the hour and at least minutes or the AM/PM designator. However, parsing supplies default values for omitted components if possible. A missing date defaults to the current date, a missing year defaults to the current year, a missing day of the month defaults to the first day of the month, and a missing time defaults to midnight.
|
||||
|
||||
If the string representation specifies only a time, parsing returns a [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) object with its [Year](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_Year), [Month](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_Month), and [Day](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_Day) properties set to the corresponding values of the [Today](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_Today) property. However, if the [DateTimeStyles.NoCurrentDateDefault](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeStyles#System_Globalization_DateTimeStyles_NoCurrentDateDefault) constant is specified in the parsing method, the resulting year, month, and day properties are set to the value 1.
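The effect of `DateTimeStyles.NoCurrentDateDefault` on a time-only string can be sketched as follows. The class name and the input string are illustrative assumptions, and the invariant culture is used only to keep the sketch independent of the machine's settings.

```csharp
using System;
using System.Globalization;

public class TimeOnlyExample
{
    public static void Main()
    {
        CultureInfo culture = CultureInfo.InvariantCulture;

        // No date in the string: the date components come from DateTime.Today.
        DateTime withToday = DateTime.Parse("10:30 AM", culture);
        Console.WriteLine(withToday.Date == DateTime.Today);

        // With NoCurrentDateDefault, the year, month, and day are each set to 1.
        DateTime noDefault = DateTime.Parse("10:30 AM", culture,
                                            DateTimeStyles.NoCurrentDateDefault);
        Console.WriteLine("{0}-{1}-{2} {3}:{4}", noDefault.Year, noDefault.Month,
                          noDefault.Day, noDefault.Hour, noDefault.Minute);
    }
}
// The example displays the following output
// (the first line assumes the date does not change between the two calls):
//       True
//       1-1-1 10:30
```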
|
||||
|
||||
In addition to a date and a time component, the string representation of a date and time can include an offset that indicates how much the time differs from Coordinated Universal Time (UTC). For example, the string "2/14/2007 5:32:00 -7:00" defines a time that is seven hours earlier than UTC. If an offset is omitted from the string representation of a time, parsing returns a [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) object with its [Kind](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_Kind) property set to [DateTimeKind.Unspecified](https://docs.microsoft.com/dotnet/core/api/System.DateTimeKind#System_DateTimeKind_Unspecified). If an offset is specified, parsing returns a [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) object with its [Kind](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_Kind) property set to [Local](https://docs.microsoft.com/dotnet/core/api/System.DateTimeKind#System_DateTimeKind_Local) and its value adjusted to the local time zone of your machine. You can modify this behavior by using a [DateTimeStyles](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeStyles) constant with the parsing method.
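A minimal sketch of the offset behavior described above follows. The class name is an assumption; the input reuses the string from the preceding paragraph, and `DateTimeStyles.AdjustToUniversal` is shown as one example of a constant that changes the default result.

```csharp
using System;
using System.Globalization;

public class OffsetExample
{
    public static void Main()
    {
        CultureInfo culture = CultureInfo.InvariantCulture;

        // No offset in the string: Kind is Unspecified.
        DateTime noOffset = DateTime.Parse("2/14/2007 5:32:00", culture);
        Console.WriteLine(noOffset.Kind);

        // Offset present: the value is converted to local time and Kind is Local.
        DateTime withOffset = DateTime.Parse("2/14/2007 5:32:00 -7:00", culture);
        Console.WriteLine(withOffset.Kind);

        // AdjustToUniversal converts to UTC instead: 5:32 at -7:00 is 12:32 UTC.
        DateTime utc = DateTime.Parse("2/14/2007 5:32:00 -7:00", culture,
                                      DateTimeStyles.AdjustToUniversal);
        Console.WriteLine("{0} {1}", utc.ToString("s"), utc.Kind);
    }
}
// The example displays the following output:
//       Unspecified
//       Local
//       2007-02-14T12:32:00 Utc
```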
|
||||
|
||||
The format provider is also used to interpret an ambiguous numeric date. For example, it is not clear which components of the date represented by the string "02/03/04" are the month, day, and year. In this case, the components are interpreted according to the order of similar date formats in the format provider.
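The following sketch parses the same ambiguous string with two different format providers. The class name and the choice of the en-US and en-GB cultures are assumptions for illustration, and the output assumes that culture data for both cultures is available on the system.

```csharp
using System;
using System.Globalization;

public class AmbiguousDateExample
{
    public static void Main()
    {
        string input = "02/03/04";

        foreach (string name in new[] { "en-US", "en-GB" })
        {
            // The culture's date patterns decide which component is the month and which is the day.
            DateTime date = DateTime.Parse(input, new CultureInfo(name));
            Console.WriteLine("{0}: {1}", name,
                              date.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture));
        }
    }
}
// The example displays the following output:
//       en-US: 2004-02-03
//       en-GB: 2004-03-02
```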
|
||||
|
||||
## Parse
|
||||
|
||||
The following code example illustrates the use of the `Parse` method to convert a string into a `DateTime`. This example uses the culture associated with the current thread to perform the parse. If the [CultureInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo) associated with the current culture cannot parse the input string, a [FormatException](https://docs.microsoft.com/dotnet/core/api/System.FormatException) is thrown.
|
||||
|
||||
```csharp
|
||||
string MyString = "Jan 1, 2009";
|
||||
DateTime MyDateTime = DateTime.Parse(MyString);
|
||||
Console.WriteLine(MyDateTime);
|
||||
// Displays the following output on a system whose culture is en-US:
|
||||
// 1/1/2009 12:00:00 AM
|
||||
```
|
||||
|
||||
You can also specify a `CultureInfo` set to one of the cultures defined by that object, or you can specify one of the standard [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo) objects returned by the [CultureInfo.DateTimeFormat](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo#System_Globalization_CultureInfo_DateTimeFormat) property. The following code example uses a format provider to parse a German string into a `DateTime`. A `CultureInfo` representing the de-DE culture is defined and passed with the string being parsed to ensure successful parsing of this particular string. This overrides whatever culture is set in the `CurrentCulture` of the `CurrentThread`.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Globalization;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
CultureInfo MyCultureInfo = new CultureInfo("de-DE");
|
||||
string MyString = "12 Juni 2008";
|
||||
DateTime MyDateTime = DateTime.Parse(MyString, MyCultureInfo);
|
||||
Console.WriteLine(MyDateTime);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// 6/12/2008 12:00:00 AM
|
||||
```
|
||||
|
||||
However, although you can use overloads of the [Parse](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_Parse_System_String_) method to specify custom format providers, the method does not support the use of non-standard format providers. To parse a date and time expressed in a non-standard format, use the [ParseExact](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_ParseExact_System_String_System_String_System_IFormatProvider_) method instead.
|
||||
|
||||
The following code example uses the [DateTimeStyles](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeStyles) enumeration to specify that the current date and time information should not be added to the `DateTime` for fields that the string does not define.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Globalization;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
CultureInfo MyCultureInfo = new CultureInfo("de-DE");
|
||||
string MyString = "12 Juni 2008";
|
||||
DateTime MyDateTime = DateTime.Parse(MyString, MyCultureInfo,
|
||||
DateTimeStyles.NoCurrentDateDefault);
|
||||
Console.WriteLine(MyDateTime);
|
||||
}
|
||||
}
|
||||
// The example displays the following output if the current culture is en-US:
|
||||
// 6/12/2008 12:00:00 AM
|
||||
```
|
||||
|
||||
## ParseExact
|
||||
|
||||
The [DateTime.ParseExact](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_ParseExact_System_String_System_String_System_IFormatProvider_) method converts a string that conforms to a specified string pattern to a `DateTime` object. When a string that is not of the form specified is passed to this method, a [FormatException](https://docs.microsoft.com/dotnet/core/api/System.FormatException) is thrown. You can specify one of the standard date and time format specifiers or a limited combination of the custom date and time format specifiers. By using the custom format specifiers, you can construct a custom recognition string.
|
||||
|
||||
Each overload of the [ParseExact](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_ParseExact_System_String_System_String_System_IFormatProvider_) method also has an [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) parameter that typically provides culture-specific information about the formatting of the string. Typically, this [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) object is a [CultureInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo) object that represents a standard culture or a [DateTimeFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.DateTimeFormatInfo) object that is returned by the [CultureInfo.DateTimeFormat](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo#System_Globalization_CultureInfo_DateTimeFormat) property. However, unlike the other date and time parsing functions, this method also supports an [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) that defines a non-standard date and time format.
|
||||
|
||||
In the following code example, the `ParseExact` method is passed a string object to parse, followed by a format specifier, followed by a `CultureInfo` object. This `ParseExact` method can only parse strings that exhibit the long date pattern in the en-US culture.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Globalization;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
CultureInfo MyCultureInfo = new CultureInfo("en-US");
|
||||
string[] MyString = {" Friday, April 10, 2009", "Friday, April 10, 2009"};
|
||||
foreach (string dateString in MyString)
|
||||
{
|
||||
try {
|
||||
DateTime MyDateTime = DateTime.ParseExact(dateString, "D", MyCultureInfo);
|
||||
Console.WriteLine(MyDateTime);
|
||||
}
|
||||
catch (FormatException) {
|
||||
Console.WriteLine("Unable to parse '{0}'", dateString);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Unable to parse ' Friday, April 10, 2009'
|
||||
// 4/10/2009 12:00:00 AM
|
||||
```
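The example above uses the standard "D" format specifier. A custom recognition string works the same way, as the following sketch suggests. The pattern `yyyy-MM-dd HH:mm`, the class name, and the helper `TryParseStamp` are assumptions chosen for illustration; the sketch also uses `DateTime.TryParseExact`, which reports failure through its return value instead of throwing.

```csharp
using System;
using System.Globalization;

public static class CustomFormatSketch
{
    // An example custom pattern; it is not one prescribed by this article.
    public const string Pattern = "yyyy-MM-dd HH:mm";

    // Hypothetical helper: returns false instead of throwing when the
    // input does not match the pattern exactly.
    public static bool TryParseStamp(string input, out DateTime result)
    {
        return DateTime.TryParseExact(input, Pattern, CultureInfo.InvariantCulture,
                                      DateTimeStyles.None, out result);
    }

    public static void Main()
    {
        DateTime exact = DateTime.ParseExact("2009-04-10 14:30", Pattern,
                                             CultureInfo.InvariantCulture);
        Console.WriteLine(exact.ToString("o"));

        DateTime parsed;
        Console.WriteLine(TryParseStamp("10/04/2009 14:30", out parsed));
    }
}
// The example displays the following output:
//       2009-04-10T14:30:00.0000000
//       False
```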
|
||||
|
||||
## See Also
|
||||
|
||||
[Parsing Strings in .NET Core](index.md)
|
||||
|
|
@ -1,206 +0,0 @@
|
|||
---
|
||||
title: Parsing Numeric Strings in .NET Core
|
||||
description: Parsing Numeric Strings in .NET Core
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 260db216-f9ae-4a54-ac13-ec5358302663
|
||||
---
|
||||
|
||||
# Parsing Numeric Strings in .NET Core
|
||||
|
||||
All numeric types have two static parsing methods, `Parse` and `TryParse`, that you can use to convert the string representation of a number into a numeric type. These methods enable you to parse strings that were produced by using the standard numeric and custom numeric format strings. By default, the `Parse` and `TryParse` methods can successfully convert strings that contain integral decimal digits only to integer values. They can successfully convert strings that contain integral and fractional decimal digits, group separators, and a decimal separator to floating-point values. The `Parse` method throws an exception if the operation fails, whereas the `TryParse` method returns `false`.
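The difference in failure behavior between `Parse` and `TryParse` can be sketched as follows. The class name and the helper `ParseOrDefault` are hypothetical, and the invariant culture is used only to keep the sketch independent of machine settings.

```csharp
using System;
using System.Globalization;

public static class ParseVersusTryParseSketch
{
    // Hypothetical helper: returns a fallback value when TryParse reports failure.
    public static int ParseOrDefault(string text, int fallback)
    {
        int value;
        return Int32.TryParse(text, NumberStyles.Integer,
                              CultureInfo.InvariantCulture, out value)
            ? value
            : fallback;
    }

    public static void Main()
    {
        Console.WriteLine(ParseOrDefault("42", -1));
        Console.WriteLine(ParseOrDefault("forty-two", -1));

        try
        {
            Int32.Parse("forty-two", CultureInfo.InvariantCulture);
        }
        catch (FormatException)
        {
            Console.WriteLine("Parse threw FormatException");
        }
    }
}
// The example displays the following output:
//       42
//       -1
//       Parse threw FormatException
```

`TryParse` tends to be the better fit for untrusted input, because a failed conversion is an expected outcome there rather than an exceptional one.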
|
||||
|
||||
## Parsing and Format Providers
|
||||
|
||||
Typically, the string representations of numeric values differ by culture. Elements of numeric strings such as currency symbols, group (or thousands) separators, and decimal separators all vary by culture. Parsing methods either implicitly or explicitly use a format provider that recognizes these culture-specific variations. If no format provider is specified in a call to the `Parse` or `TryParse` method, the format provider associated with the current thread culture (the [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object returned by the [NumberFormatInfo.CurrentInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_CurrentInfo) property) is used.
|
||||
|
||||
A format provider is represented by an [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) implementation. This interface has a single member, the [GetFormat](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider#System_IFormatProvider_GetFormat_System_Type_) method, whose single parameter is a [Type](https://docs.microsoft.com/dotnet/core/api/System.Type) object that represents the type to be formatted. This method returns the object that provides formatting information. .NET Core supports the following two [IFormatProvider](https://docs.microsoft.com/dotnet/core/api/System.IFormatProvider) implementations for parsing numeric strings:
|
||||
|
||||
* A [CultureInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo) object whose [CultureInfo.GetFormat](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo#System_Globalization_CultureInfo_GetFormat_System_Type_) method returns a [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object that provides culture-specific formatting information.
|
||||
|
||||
* A [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object whose [NumberFormatInfo.GetFormat](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_GetFormat_System_Type_) method returns itself.
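
The two implementations listed above can be compared directly. In the following sketch, the class name and the helper `Resolve` are hypothetical; the sketch only calls `GetFormat` the way a parsing method would and checks which object comes back.

```csharp
using System;
using System.Globalization;

public static class FormatProviderSketch
{
    // Hypothetical helper: asks a provider for its NumberFormatInfo.
    public static NumberFormatInfo Resolve(IFormatProvider provider)
    {
        return (NumberFormatInfo)provider.GetFormat(typeof(NumberFormatInfo));
    }

    public static void Main()
    {
        CultureInfo french = new CultureInfo("fr-FR");

        // A CultureInfo hands back its NumberFormat object.
        NumberFormatInfo fromCulture = Resolve(french);
        Console.WriteLine(ReferenceEquals(fromCulture, french.NumberFormat));

        // A NumberFormatInfo hands back itself.
        Console.WriteLine(ReferenceEquals(Resolve(fromCulture), fromCulture));

        // Either object can be passed as the provider, with the same result.
        double viaCulture = Double.Parse("1304,16", french);
        double viaInfo = Double.Parse("1304,16", fromCulture);
        Console.WriteLine(viaCulture == viaInfo);
    }
}
// The example displays the following output:
//       True
//       True
//       True
```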
|
||||
|
||||
The following example tries to convert each string in an array to a [Double](https://docs.microsoft.com/dotnet/core/api/System.Double) value. It first tries to parse the string by using a format provider that reflects the conventions of the English (United States) culture. If this operation throws a [FormatException](https://docs.microsoft.com/dotnet/core/api/System.FormatException), it tries to parse the string by using a format provider that reflects the conventions of the French (France) culture.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Globalization;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string[] values = { "1,304.16", "$1,456.78", "1,094", "152",
|
||||
"123,45 €", "1 304,16", "Ae9f" };
|
||||
double number;
|
||||
CultureInfo culture = null;
|
||||
|
||||
foreach (string value in values) {
|
||||
try {
|
||||
culture = CultureInfo.CreateSpecificCulture("en-US");
|
||||
number = Double.Parse(value, culture);
|
||||
Console.WriteLine("{0}: {1} --> {2}", culture.Name, value, number);
|
||||
}
|
||||
catch (FormatException) {
|
||||
Console.WriteLine("{0}: Unable to parse '{1}'.",
|
||||
culture.Name, value);
|
||||
culture = CultureInfo.CreateSpecificCulture("fr-FR");
|
||||
try {
|
||||
number = Double.Parse(value, culture);
|
||||
Console.WriteLine("{0}: {1} --> {2}", culture.Name, value, number);
|
||||
}
|
||||
catch (FormatException) {
|
||||
Console.WriteLine("{0}: Unable to parse '{1}'.",
|
||||
culture.Name, value);
|
||||
}
|
||||
}
|
||||
Console.WriteLine();
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// en-US: 1,304.16 --> 1304.16
|
||||
//
|
||||
// en-US: Unable to parse '$1,456.78'.
|
||||
// fr-FR: Unable to parse '$1,456.78'.
|
||||
//
|
||||
// en-US: 1,094 --> 1094
|
||||
//
|
||||
// en-US: 152 --> 152
|
||||
//
|
||||
// en-US: Unable to parse '123,45 €'.
|
||||
// fr-FR: Unable to parse '123,45 €'.
|
||||
//
|
||||
// en-US: Unable to parse '1 304,16'.
|
||||
// fr-FR: 1 304,16 --> 1304.16
|
||||
//
|
||||
// en-US: Unable to parse 'Ae9f'.
|
||||
// fr-FR: Unable to parse 'Ae9f'.
|
||||
```
|
||||
|
||||
## Parsing and NumberStyles Values
|
||||
|
||||
The style elements (such as white space, group separators, and decimal separator) that the parse operation can handle are defined by a [NumberStyles](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles) enumeration value. By default, strings that represent integer values are parsed by using the [NumberStyles.Integer](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_Integer) value, which permits only numeric digits, leading and trailing white space, and a leading sign. Strings that represent floating-point values are parsed using a combination of the [NumberStyles.Float](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_Float) and [NumberStyles.AllowThousands](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowThousands) values; this composite style permits decimal digits along with leading and trailing white space, a leading sign, a decimal separator, a group separator, and an exponent. By calling an overload of the `Parse` or `TryParse` method that includes a parameter of type [NumberStyles](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles) and setting one or more [NumberStyles](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles) flags, you can control the style elements that can be present in the string for the parse operation to succeed.
|
||||
|
||||
For example, a string that contains a group separator cannot be converted to an [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32) value by using the [Int32.Parse(String)](https://docs.microsoft.com/dotnet/core/api/System.Int32#System_Int32_Parse_System_String_) method. However, the conversion succeeds if you use the [NumberStyles.AllowThousands](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowThousands) flag, as the following example illustrates.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Globalization;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string value = "1,304";
|
||||
int number;
|
||||
IFormatProvider provider = CultureInfo.CreateSpecificCulture("en-US");
|
||||
if (Int32.TryParse(value, out number))
|
||||
Console.WriteLine("{0} --> {1}", value, number);
|
||||
else
|
||||
Console.WriteLine("Unable to convert '{0}'", value);
|
||||
|
||||
if (Int32.TryParse(value, NumberStyles.Integer | NumberStyles.AllowThousands,
|
||||
provider, out number))
|
||||
Console.WriteLine("{0} --> {1}", value, number);
|
||||
else
|
||||
Console.WriteLine("Unable to convert '{0}'", value);
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Unable to convert '1,304'
|
||||
// 1,304 --> 1304
|
||||
```
|
||||
|
||||
> **Warning**
|
||||
>
|
||||
> The parse operation always uses the formatting conventions of a particular culture. If you do not specify a culture by passing a [CultureInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.CultureInfo) or [NumberFormatInfo](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo) object, the culture associated with the current thread is used.
|
||||
|
||||
The following table lists the members of the [NumberStyles](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles) enumeration and describes the effect that they have on the parsing operation.
|
||||
|
||||
NumberStyles value | Effect on the string to be parsed
|
||||
------------------ | ---------------------------------
|
||||
[NumberStyles.None](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_None) | Only numeric digits are permitted.
|
||||
[NumberStyles.AllowDecimalPoint](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowDecimalPoint) | The decimal separator and fractional digits are permitted. For integer values, only zero is permitted as a fractional digit. Valid decimal separators are determined by the [NumberFormatInfo.NumberDecimalSeparator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NumberDecimalSeparator) or [NumberFormatInfo.CurrencyDecimalSeparator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_CurrencyDecimalSeparator) property.
|
||||
[NumberStyles.AllowExponent](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowExponent) | The "e" or "E" character can be used to indicate exponential notation.
|
||||
[NumberStyles.AllowLeadingWhite](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowLeadingWhite) | Leading white space is permitted.
|
||||
[NumberStyles.AllowTrailingWhite](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowTrailingWhite) | Trailing white space is permitted.
|
||||
[NumberStyles.AllowLeadingSign](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowLeadingSign) | A positive or negative sign can precede numeric digits.
|
||||
[NumberStyles.AllowTrailingSign](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowTrailingSign) | A positive or negative sign can follow numeric digits.
|
||||
[NumberStyles.AllowParentheses](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowParentheses) | Parentheses can be used to indicate negative values.
|
||||
[NumberStyles.AllowThousands](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowThousands) | The group separator is permitted. The group separator character is determined by the [NumberFormatInfo.NumberGroupSeparator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_NumberGroupSeparator) or [NumberFormatInfo.CurrencyGroupSeparator](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_CurrencyGroupSeparator) property.
|
||||
[NumberStyles.AllowCurrencySymbol](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowCurrencySymbol) | The currency symbol is permitted. The currency symbol is defined by the [NumberFormatInfo.CurrencySymbol](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberFormatInfo#System_Globalization_NumberFormatInfo_CurrencySymbol) property.
|
||||
[NumberStyles.AllowHexSpecifier](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowHexSpecifier) | The string to be parsed is interpreted as a hexadecimal number. It can include the hexadecimal digits 0-9, A-F, and a-f. This flag can be used only to parse integer values.
|
||||
|
||||
In addition, the [NumberStyles](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles) enumeration provides the following composite styles, which include multiple [NumberStyles](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles) flags.
|
||||
|
||||
Composite NumberStyles value | Includes members
|
||||
---------------------------- | ----------------
|
||||
[NumberStyles.Integer](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_Integer) | Includes the [NumberStyles.AllowLeadingWhite](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowLeadingWhite), [NumberStyles.AllowTrailingWhite](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowTrailingWhite), and [NumberStyles.AllowLeadingSign](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowLeadingSign) styles. This is the default style used to parse integer values.
|
||||
[NumberStyles.Number](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_Number) | Includes the [NumberStyles.AllowLeadingWhite](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowLeadingWhite), [NumberStyles.AllowTrailingWhite](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowTrailingWhite), [NumberStyles.AllowLeadingSign](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowLeadingSign), [NumberStyles.AllowTrailingSign](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowTrailingSign), [NumberStyles.AllowDecimalPoint](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowDecimalPoint), and [NumberStyles.AllowThousands](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowThousands) styles.
|
||||
[NumberStyles.Float](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_Float) | Includes the [NumberStyles.AllowLeadingWhite](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowLeadingWhite), [NumberStyles.AllowTrailingWhite](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowTrailingWhite), [NumberStyles.AllowLeadingSign](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowLeadingSign), [NumberStyles.AllowDecimalPoint](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowDecimalPoint), and [NumberStyles.AllowExponent](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowExponent) styles.
|
||||
[NumberStyles.Currency](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_Currency) | Includes all styles except [NumberStyles.AllowExponent](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowExponent) and [NumberStyles.AllowHexSpecifier](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowHexSpecifier).
|
||||
[NumberStyles.Any](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_Any) | Includes all styles except [NumberStyles.AllowHexSpecifier](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowHexSpecifier).
|
||||
[NumberStyles.HexNumber](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_HexNumber) | Includes the [NumberStyles.AllowLeadingWhite](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowLeadingWhite), [NumberStyles.AllowTrailingWhite](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowTrailingWhite), and [NumberStyles.AllowHexSpecifier](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles#System_Globalization_NumberStyles_AllowHexSpecifier) styles.
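
To make the composite styles in the table more concrete, the following sketch applies three of them. The class and helper names are hypothetical, and the input strings are examples chosen for this illustration.

```csharp
using System;
using System.Globalization;

public static class CompositeStylesSketch
{
    // Currency permits the currency symbol, group separators, and a decimal separator.
    public static decimal ParseCurrency(string text, string cultureName)
    {
        return Decimal.Parse(text, NumberStyles.Currency, new CultureInfo(cultureName));
    }

    // HexNumber permits hexadecimal digits plus leading and trailing white space.
    public static int ParseHex(string text)
    {
        return Int32.Parse(text, NumberStyles.HexNumber, CultureInfo.InvariantCulture);
    }

    // Float permits a decimal separator and an exponent.
    public static double ParseFloat(string text)
    {
        return Double.Parse(text, NumberStyles.Float, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        Console.WriteLine(ParseCurrency("$1,456.78", "en-US").ToString(CultureInfo.InvariantCulture));
        Console.WriteLine(ParseHex(" FF "));
        Console.WriteLine(ParseFloat("1.5e3").ToString(CultureInfo.InvariantCulture));
    }
}
// The example displays the following output:
//       1456.78
//       255
//       1500
```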
|
||||
|
||||
## Parsing and Unicode Digits
|
||||
|
||||
The Unicode standard defines code points for digits in various writing systems. For example, code points from U+0030 to U+0039 represent the basic Latin digits 0 through 9, code points from U+09E6 to U+09EF represent the Bangla digits 0 through 9, and code points from U+FF10 to U+FF19 represent the Fullwidth digits 0 through 9. However, the only numeric digits recognized by parsing methods are the basic Latin digits 0-9 with code points from U+0030 to U+0039. If a numeric parsing method is passed a string that contains any other digits, the method throws a [FormatException](https://docs.microsoft.com/dotnet/core/api/System.FormatException).
|
||||
|
||||
The following example uses the [Int32.Parse](https://docs.microsoft.com/dotnet/core/api/System.Int32#System_Int32_Parse_System_String_) method to parse strings that consist of digits in different writing systems. As the output from the example shows, the attempt to parse the basic Latin digits succeeds, but the attempt to parse the Fullwidth, Arabic-Indic, and Bangla digits fails.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
string value;
|
||||
// Define a string of basic Latin digits 1-5.
|
||||
value = "\u0031\u0032\u0033\u0034\u0035";
|
||||
ParseDigits(value);
|
||||
|
||||
// Define a string of Fullwidth digits 1-5.
|
||||
value = "\uFF11\uFF12\uFF13\uFF14\uFF15";
|
||||
ParseDigits(value);
|
||||
|
||||
// Define a string of Arabic-Indic digits 1-5.
|
||||
value = "\u0661\u0662\u0663\u0664\u0665";
|
||||
ParseDigits(value);
|
||||
|
||||
// Define a string of Bangla digits 1-5.
|
||||
value = "\u09e7\u09e8\u09e9\u09ea\u09eb";
|
||||
ParseDigits(value);
|
||||
}
|
||||
|
||||
static void ParseDigits(string value)
|
||||
{
|
||||
try {
|
||||
int number = Int32.Parse(value);
|
||||
Console.WriteLine("'{0}' --> {1}", value, number);
|
||||
}
|
||||
catch (FormatException) {
|
||||
Console.WriteLine("Unable to parse '{0}'.", value);
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// '12345' --> 12345
|
||||
//       Unable to parse '１２３４５'.
|
||||
// Unable to parse '١٢٣٤٥'.
|
||||
// Unable to parse '১২৩৪৫'.
|
||||
```
|
||||
|
||||
## See Also
|
||||
|
||||
[System.Globalization.NumberStyles](https://docs.microsoft.com/dotnet/core/api/System.Globalization.NumberStyles)
|
||||
|
||||
[Parsing Strings in .NET Core](index.md)
|
||||
|
|
@ -1,58 +0,0 @@
|
|||
---
|
||||
title: Parsing Other Strings in .NET Core
|
||||
description: Parsing Other Strings in .NET Core
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 53dfea6a-c90f-4bb2-81e2-be6e41ef1eb7
|
||||
---
|
||||
|
||||
# Parsing Other Strings in .NET Core
|
||||
|
||||
|
||||
In addition to numeric and [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) strings, you can also parse strings that represent the types [Char](https://docs.microsoft.com/dotnet/core/api/System.Char), [Boolean](https://docs.microsoft.com/dotnet/core/api/System.Boolean), and [Enum](https://docs.microsoft.com/dotnet/core/api/System.Enum) into data types.
|
||||
|
||||
## Char
|
||||
|
||||
The static parse method associated with the [Char](https://docs.microsoft.com/dotnet/core/api/System.Char) data type is useful for converting a string that contains a single character into its Unicode value. The following code example parses a string into a Unicode character.
|
||||
|
||||
```csharp
|
||||
string MyString1 = "A";
|
||||
char MyChar = Char.Parse(MyString1);
|
||||
// MyChar now contains a Unicode "A" character.
|
||||
```
|
||||
|
||||
## Boolean
|
||||
|
||||
The [Boolean](https://docs.microsoft.com/dotnet/core/api/System.Boolean) data type contains a [Parse](https://docs.microsoft.com/dotnet/core/api/System.Boolean#System_Boolean_Parse_System_String_) method that you can use to convert a string that represents a `Boolean` value into an actual `Boolean` type. This method is not case-sensitive and can successfully parse a string containing "True" or "False". The `Parse` method associated with the `Boolean` type can also parse strings that are surrounded by white space. If any other string is passed, a [FormatException](https://docs.microsoft.com/dotnet/core/api/System.FormatException) is thrown.
|
||||
|
||||
The following code example uses the `Parse` method to convert a string into a `Boolean` value.
|
||||
|
||||
```csharp
|
||||
string MyString2 = "True";
|
||||
bool MyBool = bool.Parse(MyString2);
|
||||
// MyBool now contains a True Boolean value.
|
||||
```
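The case-insensitivity, white-space handling, and failure behavior described above can be sketched as follows. The class name and the helper `ParseOrNull` are hypothetical; `Boolean.TryParse` is used here as the non-throwing counterpart of `Parse`.

```csharp
using System;

public static class BooleanParsingSketch
{
    // Hypothetical helper: returns null when the string is not a Boolean string.
    public static bool? ParseOrNull(string text)
    {
        bool value;
        return Boolean.TryParse(text, out value) ? value : (bool?)null;
    }

    public static void Main()
    {
        // Mixed case and surrounding white space are accepted.
        Console.WriteLine(Boolean.Parse("  tRuE  "));

        // Any other string is rejected.
        Console.WriteLine(ParseOrNull("yes").HasValue);

        try
        {
            Boolean.Parse("yes");
        }
        catch (FormatException)
        {
            Console.WriteLine("'yes' is not a valid Boolean string");
        }
    }
}
// The example displays the following output:
//       True
//       False
//       'yes' is not a valid Boolean string
```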
|
||||
|
||||
## Enumeration
|
||||
|
||||
You can use the static [Parse](https://docs.microsoft.com/dotnet/core/api/System.Enum#System_Enum_Parse_System_Type_System_String_) method to convert a string into a value of an enumeration type. This method accepts the enumeration type you are parsing, the string to parse, and an optional `Boolean` flag indicating whether the parse ignores case. The string you are parsing can contain several values separated by commas, each of which can be preceded or followed by white space. When the string contains multiple values, the value of the returned object is the value of all specified values combined with a bitwise OR operation.
|
||||
|
||||
The following example uses the `Parse` method to convert a string representation into an enumeration value. The [DayOfWeek](https://docs.microsoft.com/dotnet/core/api/System.DayOfWeek) enumeration is initialized to Thursday from a string.
|
||||
|
||||
```csharp
|
||||
string MyString3 = "Thursday";
|
||||
DayOfWeek MyDays = (DayOfWeek)Enum.Parse(typeof(DayOfWeek), MyString3);
|
||||
Console.WriteLine(MyDays);
|
||||
// The result is Thursday.
|
||||
```
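The comma-separated form and the case flag can be illustrated with a flags enumeration. In the following sketch, the `Permissions` enumeration and the `ParsePermissions` helper are hypothetical names introduced only for this example.

```csharp
using System;

// Hypothetical flags enumeration used only for this illustration.
[Flags]
public enum Permissions { None = 0, Read = 1, Write = 2, Execute = 4 }

public static class EnumParsingSketch
{
    public static Permissions ParsePermissions(string text)
    {
        // The third argument (ignoreCase) is true, so "read" matches Read.
        return (Permissions)Enum.Parse(typeof(Permissions), text, true);
    }

    public static void Main()
    {
        // Two names separated by a comma are combined with a bitwise OR.
        Permissions parsed = ParsePermissions("read, execute");
        Console.WriteLine(parsed);
        Console.WriteLine(parsed == (Permissions.Read | Permissions.Execute));
    }
}
// The example displays the following output:
//       Read, Execute
//       True
```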
|
||||
|
||||
## See Also
|
||||
|
||||
[Parsing Strings in .NET Core](index.md)
|
||||
|
|
@ -1,732 +0,0 @@
|
|||
|
||||
---
|
||||
title: Type Conversion
|
||||
description: Type Conversion
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: e113269f-b753-4dc3-a4d1-dea161e994df
|
||||
---
|
||||
|
||||
# Type Conversion
|
||||
|
||||
Every value has an associated type, which defines attributes such as the amount of space allocated to the value, the range of possible values it can have, and the members that it makes available. Many values can be expressed as more than one type. For example, the value `4` can be expressed as an integer or a floating-point value. Type conversion creates a value in a new type that is equivalent to the value of an old type, but does not necessarily preserve the identity (or exact value) of the original object.
|
||||
|
||||
.NET Core automatically supports the following conversions:
|
||||
|
||||
* Conversion from a derived class to a base class. This means, for example, that an instance of any class or structure can be converted to an [Object](https://docs.microsoft.com/dotnet/core/api/System.Object) instance. This conversion does not require a casting operator.
|
||||
|
||||
* Conversion from a base class back to the original derived class. In C#, this conversion requires a casting operator.
|
||||
|
||||
* Conversion from a type that implements an interface to an interface object that represents that interface. This conversion does not require a casting operator.
|
||||
|
||||
* Conversion from an interface object back to the original type that implements that interface. In C#, this conversion requires a casting operator.
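
The four automatic conversions listed above can be sketched in a few lines. The class name and the `Downcast` helper are hypothetical; the types involved (`String`, `Object`, `List<int>`, and `IList<int>`) were picked only because they are familiar.

```csharp
using System;
using System.Collections.Generic;

public static class ReferenceConversionSketch
{
    // Hypothetical helper: a cast back to a more specific type is checked at run time.
    public static T Downcast<T>(object value)
    {
        return (T)value;
    }

    public static void Main()
    {
        string text = "hello";
        object asObject = text;                  // derived class to base class: no cast
        string backToString = (string)asObject;  // base class to derived class: cast required

        List<int> numbers = new List<int> { 1, 2, 3 };
        IList<int> asInterface = numbers;                // implementing type to interface: no cast
        List<int> backToList = (List<int>)asInterface;   // interface to implementing type: cast required

        // The conversions change the static type only; the object is the same.
        Console.WriteLine(ReferenceEquals(text, backToString));
        Console.WriteLine(ReferenceEquals(numbers, backToList));

        try
        {
            // asObject holds a string, so this cast fails at run time.
            Downcast<List<int>>(asObject);
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("Cast to the wrong derived type failed");
        }
    }
}
// The example displays the following output:
//       True
//       True
//       Cast to the wrong derived type failed
```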
|
||||
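The four built-in conversions listed above can be sketched as follows. The `IShape` interface and the `Circle` class are hypothetical names introduced only for this illustration; they are not part of .NET Core.

```csharp
using System;

// Hypothetical types used only for this sketch.
public interface IShape
{
    double Area();
}

public class Circle : IShape
{
    public double Radius { get; set; }

    public double Area() { return Math.PI * Radius * Radius; }
}

public static class BuiltInConversions
{
    public static void Main()
    {
        Circle circle = new Circle { Radius = 2.0 };

        // Derived class to base class: no casting operator required.
        object asObject = circle;

        // Base class back to the original derived class: casting operator required.
        Circle fromObject = (Circle) asObject;

        // Implementing type to interface: no casting operator required.
        IShape asShape = circle;

        // Interface back to the implementing type: casting operator required.
        Circle fromShape = (Circle) asShape;

        // Every reference still points to the same object.
        Console.WriteLine(ReferenceEquals(circle, fromObject));   // True
        Console.WriteLine(ReferenceEquals(circle, fromShape));    // True
    }
}
```

None of these conversions creates a new object; only the static type of the reference changes.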
|
||||
In addition to these automatic conversions, .NET Core provides several features that support custom type conversion. These include the following:
|
||||
|
||||
* The `Implicit` operator, which defines the available widening conversions between types. For more information, see the [Implicit Conversion with the Implicit Operator](#Implicit-Conversion-with-the-Implicit-Operator) section.
|
||||
|
||||
* The `Explicit` operator, which defines the available narrowing conversions between types. For more information, see the [Explicit Conversion with the Explicit Operator](#Explicit-Conversion-with-the-Explicit-Operator) section.
|
||||
|
||||
* The [IConvertible](https://docs.microsoft.com/dotnet/core/api/System.IConvertible) interface, which defines conversions to each of the base .NET Core data types. For more information, see [The IConvertible Interface](#The-IConvertible-Interface).
|
||||
|
||||
* The [Convert](https://docs.microsoft.com/dotnet/core/api/System.Convert) class, which provides a set of methods that implement the methods in the `IConvertible` interface. For more information, see [The Convert Class](#The-Convert-Class).
|
||||
|
||||
* The [TypeConverter](https://docs.microsoft.com/dotnet/core/api/System.ComponentModel.TypeConverter) class, which is a base class that can be extended to support the conversion of a specified type to any other type. For more information, see [The TypeConverter Class](#The-TypeConverter-Class).
|
||||
|
||||
## Implicit Conversion with the Implicit Operator
|
||||
|
||||
Widening conversions involve the creation of a new value from the value of an existing type that has either a more restrictive range or a more restricted member list than the target type. Widening conversions cannot result in data loss (although they may result in a loss of precision). Because data cannot be lost, compilers can handle the conversion implicitly or transparently, without requiring the use of an explicit conversion method or a casting operator.
|
||||
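The distinction between data loss and loss of precision can be seen in a small sketch. The `RoundTrip` helper below is a hypothetical name used only for this illustration. It widens an `Int64` to a `Double` implicitly and then narrows it back; because a `Double` has a 53-bit significand, a value such as 2^53 + 1 is not expected to survive the round trip unchanged.

```csharp
using System;

public static class WideningPrecision
{
    // Widens a long to a double implicitly, then narrows it back explicitly.
    public static long RoundTrip(long value)
    {
        double widened = value;   // implicit widening conversion
        return (long) widened;    // explicit narrowing conversion
    }

    public static void Main()
    {
        long original = 9007199254740993;   // 2^53 + 1
        long result = RoundTrip(original);

        Console.WriteLine("Original:   {0}", original);
        Console.WriteLine("Round trip: {0}", result);
        Console.WriteLine("Preserved:  {0}", result == original);
    }
}
```

The magnitude of the value is kept, which is why the compiler accepts the conversion without a cast, but the low-order digit is not.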
|
||||
> **Note**
|
||||
>
|
||||
> Although code that performs an implicit conversion can call a conversion method or use a casting operator, their use is not required by compilers that support implicit conversions.
|
||||
|
||||
For example, the [Decimal](https://docs.microsoft.com/dotnet/core/api/System.Decimal) type supports implicit conversions from [Byte](https://docs.microsoft.com/dotnet/core/api/System.Byte), [Char](https://docs.microsoft.com/dotnet/core/api/System.Char), [Int16](https://docs.microsoft.com/dotnet/core/api/System.Int16), [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32), [Int64](https://docs.microsoft.com/dotnet/core/api/System.Int64), [SByte](https://docs.microsoft.com/dotnet/core/api/System.SByte), [UInt16](https://docs.microsoft.com/dotnet/core/api/System.UInt16), [UInt32](https://docs.microsoft.com/dotnet/core/api/System.UInt32), and [UInt64](https://docs.microsoft.com/dotnet/core/api/System.UInt64) values. The following example illustrates some of these implicit conversions in assigning values to a `Decimal` variable.
|
||||
|
||||
```csharp
|
||||
byte byteValue = 16;
|
||||
short shortValue = -1024;
|
||||
int intValue = -1034000;
|
||||
long longValue = 1152921504606846976;
|
||||
ulong ulongValue = UInt64.MaxValue;
|
||||
|
||||
decimal decimalValue;
|
||||
|
||||
decimalValue = byteValue;
|
||||
Console.WriteLine("After assigning a {0} value, the Decimal value is {1}.",
|
||||
byteValue.GetType().Name, decimalValue);
|
||||
|
||||
decimalValue = shortValue;
|
||||
Console.WriteLine("After assigning a {0} value, the Decimal value is {1}.",
|
||||
shortValue.GetType().Name, decimalValue);
|
||||
|
||||
decimalValue = intValue;
|
||||
Console.WriteLine("After assigning a {0} value, the Decimal value is {1}.",
|
||||
intValue.GetType().Name, decimalValue);
|
||||
|
||||
decimalValue = longValue;
|
||||
Console.WriteLine("After assigning a {0} value, the Decimal value is {1}.",
|
||||
longValue.GetType().Name, decimalValue);
|
||||
|
||||
decimalValue = ulongValue;
|
||||
Console.WriteLine("After assigning a {0} value, the Decimal value is {1}.",
|
||||
ulongValue.GetType().Name, decimalValue);
|
||||
// The example displays the following output:
|
||||
// After assigning a Byte value, the Decimal value is 16.
|
||||
// After assigning a Int16 value, the Decimal value is -1024.
|
||||
// After assigning a Int32 value, the Decimal value is -1034000.
|
||||
// After assigning a Int64 value, the Decimal value is 1152921504606846976.
|
||||
// After assigning a UInt64 value, the Decimal value is 18446744073709551615.
|
||||
```
|
||||
|
||||
If a particular language compiler supports custom operators, you can also define implicit conversions in your own custom types. The following example provides a partial implementation of a signed byte data type named `ByteWithSign` that uses sign-and-magnitude representation. It supports implicit conversion of [Byte](https://docs.microsoft.com/dotnet/core/api/System.Byte) and [SByte](https://docs.microsoft.com/dotnet/core/api/System.SByte) values to `ByteWithSign` values.
|
||||
|
||||
```csharp
|
||||
public struct ByteWithSign
|
||||
{
|
||||
private SByte signValue;
|
||||
private Byte value;
|
||||
|
||||
public static implicit operator ByteWithSign(SByte value)
|
||||
{
|
||||
ByteWithSign newValue;
|
||||
newValue.signValue = (SByte) Math.Sign(value);
|
||||
newValue.value = (byte) Math.Abs(value);
|
||||
return newValue;
|
||||
}
|
||||
|
||||
public static implicit operator ByteWithSign(Byte value)
|
||||
{
|
||||
ByteWithSign newValue;
|
||||
newValue.signValue = 1;
|
||||
newValue.value = value;
|
||||
return newValue;
|
||||
}
|
||||
|
||||
public override string ToString()
|
||||
{
|
||||
return (signValue * value).ToString();
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Client code can then declare a `ByteWithSign` variable and assign it [Byte](https://docs.microsoft.com/dotnet/core/api/System.Byte) and [SByte](https://docs.microsoft.com/dotnet/core/api/System.SByte) values without performing any explicit conversions or using any casting operators, as the following example shows.
|
||||
|
||||
```csharp
|
||||
SByte sbyteValue = -120;
|
||||
ByteWithSign value = sbyteValue;
|
||||
Console.WriteLine(value);
|
||||
value = Byte.MaxValue;
|
||||
Console.WriteLine(value);
|
||||
// The example displays the following output:
|
||||
// -120
|
||||
// 255
|
||||
```
|
||||
|
||||
## Explicit Conversion with the Explicit Operator
|
||||
|
||||
Narrowing conversions involve the creation of a new value from the value of an existing type that has either a greater range or a larger member list than the target type. Because a narrowing conversion can result in a loss of data, compilers often require that the conversion be made explicit through a call to a conversion method or a casting operator. That is, the conversion must be handled explicitly in developer code.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> The major purpose of requiring a conversion method or casting operator for narrowing conversions is to make the developer aware of the possibility of data loss or an [OverflowException](https://docs.microsoft.com/dotnet/core/api/System.OverflowException) so that it can be handled in code. However, some compilers can relax this requirement.
|
||||
|
||||
For example, the [UInt32](https://docs.microsoft.com/dotnet/core/api/System.UInt32), [Int64](https://docs.microsoft.com/dotnet/core/api/System.Int64), and [UInt64](https://docs.microsoft.com/dotnet/core/api/System.UInt64) data types have ranges that exceed that of the [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32) data type, as the following table shows.
|
||||
|
||||
Type | Comparison with range of Int32
|
||||
---- | ------------------------------
|
||||
[Int64](https://docs.microsoft.com/dotnet/core/api/System.Int64) | [Int64.MaxValue](https://docs.microsoft.com/dotnet/core/api/System.Int64#System_Int64_MaxValue) is greater than [Int32.MaxValue](https://docs.microsoft.com/dotnet/core/api/System.Int32#System_Int32_MaxValue), and [Int64.MinValue](https://docs.microsoft.com/dotnet/core/api/System.Int64#System_Int64_MinValue) is less than (has a greater negative range than) [Int32.MinValue](https://docs.microsoft.com/dotnet/core/api/System.Int32#System_Int32_MinValue).
|
||||
[UInt32](https://docs.microsoft.com/dotnet/core/api/System.UInt32) | [UInt32.MaxValue](https://docs.microsoft.com/dotnet/core/api/System.UInt32#System_UInt32_MaxValue) is greater than [Int32.MaxValue](https://docs.microsoft.com/dotnet/core/api/System.Int32#System_Int32_MaxValue).
|
||||
[UInt64](https://docs.microsoft.com/dotnet/core/api/System.UInt64) | [UInt64.MaxValue](https://docs.microsoft.com/dotnet/core/api/System.UInt64#System_UInt64_MaxValue) is greater than [Int32.MaxValue](https://docs.microsoft.com/dotnet/core/api/System.Int32#System_Int32_MaxValue).
|
||||
|
||||
To handle such narrowing conversions, .NET Core allows types to define an `Explicit` operator. Individual language compilers can then implement this operator using their own syntax, or a member of the [Convert](https://docs.microsoft.com/dotnet/core/api/System.Convert) class can be called to perform the conversion. (For more information about the `Convert` class, see [The Convert Class](#The-Convert-Class) later in this topic.) The following example illustrates the use of language features to handle the explicit conversion of these potentially out-of-range integer values to [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32) values.
|
||||
|
||||
```csharp
|
||||
long number1 = int.MaxValue + 20L;
|
||||
uint number2 = int.MaxValue - 1000;
|
||||
ulong number3 = int.MaxValue;
|
||||
|
||||
int intNumber;
|
||||
|
||||
try {
|
||||
intNumber = checked((int) number1);
|
||||
Console.WriteLine("After assigning a {0} value, the Integer value is {1}.",
|
||||
number1.GetType().Name, intNumber);
|
||||
}
|
||||
catch (OverflowException) {
|
||||
if (number1 > int.MaxValue)
|
||||
Console.WriteLine("Conversion failed: {0} exceeds {1}.",
|
||||
number1, int.MaxValue);
|
||||
else
|
||||
Console.WriteLine("Conversion failed: {0} is less than {1}.",
|
||||
number1, int.MinValue);
|
||||
}
|
||||
|
||||
try {
|
||||
intNumber = checked((int) number2);
|
||||
Console.WriteLine("After assigning a {0} value, the Integer value is {1}.",
|
||||
number2.GetType().Name, intNumber);
|
||||
}
|
||||
catch (OverflowException) {
|
||||
Console.WriteLine("Conversion failed: {0} exceeds {1}.",
|
||||
number2, int.MaxValue);
|
||||
}
|
||||
|
||||
try {
|
||||
intNumber = checked((int) number3);
|
||||
Console.WriteLine("After assigning a {0} value, the Integer value is {1}.",
|
||||
number3.GetType().Name, intNumber);
|
||||
}
|
||||
catch (OverflowException) {
|
||||
Console.WriteLine("Conversion failed: {0} exceeds {1}.",
|
||||
number3, int.MaxValue);
|
||||
}
|
||||
|
||||
// The example displays the following output:
|
||||
// Conversion failed: 2147483667 exceeds 2147483647.
|
||||
// After assigning a UInt32 value, the Integer value is 2147482647.
|
||||
// After assigning a UInt64 value, the Integer value is 2147483647.
|
||||
```
|
||||
|
||||
Explicit conversions can produce different results in different languages, and these results can differ from the value returned by the corresponding [Convert](https://docs.microsoft.com/dotnet/core/api/System.Convert) method. For example, if the [Double](https://docs.microsoft.com/dotnet/core/api/System.Double) value **12.63251** is converted to an [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32), the .NET Core [Convert.ToInt32(Double)](https://docs.microsoft.com/dotnet/core/api/System.Convert#System_Convert_ToInt32_System_Double_) method rounds the [Double](https://docs.microsoft.com/dotnet/core/api/System.Double) to return a value of **13**, but the C# `(int)` operator truncates the [Double](https://docs.microsoft.com/dotnet/core/api/System.Double) to return a value of **12**. Similarly, the C# `(int)` operator does not support Boolean-to-integer conversion, but the .NET Core [Convert.ToInt32(Boolean)](https://docs.microsoft.com/dotnet/core/api/System.Convert#System_Convert_ToInt32_System_Boolean_) method converts a value of `true` to **1**.
|
||||
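The differences described above can be checked with a short sketch. It uses only the `Convert.ToInt32` overloads and the C# `(int)` cast already mentioned; the two midpoint values are added here to show that `Convert.ToInt32(Double)` rounds halfway cases to the nearest even integer.

```csharp
using System;

public static class RoundVersusTruncate
{
    public static void Main()
    {
        double value = 12.63251;

        Console.WriteLine(Convert.ToInt32(value));   // 13 (rounded)
        Console.WriteLine((int) value);              // 12 (truncated)

        // Midpoint values are rounded to the nearest even integer.
        Console.WriteLine(Convert.ToInt32(2.5));     // 2
        Console.WriteLine(Convert.ToInt32(3.5));     // 4

        // C# has no cast from bool to int, but Convert supports the conversion.
        Console.WriteLine(Convert.ToInt32(true));    // 1
    }
}
```

If truncation toward zero is the intended behavior, the cast is the simpler choice; if rounding is intended, calling `Convert.ToInt32` makes that explicit.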
|
||||
Most compilers allow explicit conversions to be performed in a checked or unchecked manner. When a checked conversion is performed, an [OverflowException](https://docs.microsoft.com/dotnet/core/api/System.OverflowException) is thrown when the value of the type to be converted is outside the range of the target type. When an unchecked conversion is performed under the same conditions, the conversion might not throw an exception, but the exact behavior becomes undefined and an incorrect value might result.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> In C#, checked conversions can be performed by using the `checked` keyword together with a casting operator, or by specifying the `/checked+` compiler option. Conversely, unchecked conversions can be performed by using the `unchecked` keyword together with the casting operator, or by specifying the `/checked-` compiler option. By default, explicit conversions are unchecked.
|
||||
|
||||
The following C# example uses the `checked` and `unchecked` keywords to illustrate the difference in behavior when a value outside the range of a [Byte](https://docs.microsoft.com/dotnet/core/api/System.Byte) is converted to a `Byte`. The checked conversion throws an exception, but the unchecked conversion discards the high-order bits of the value, which in this case leaves [Byte.MaxValue](https://docs.microsoft.com/dotnet/core/api/System.Byte#System_Byte_MaxValue) in the `Byte` variable.
|
||||
|
||||
```csharp
|
||||
int largeValue = Int32.MaxValue;
|
||||
byte newValue;
|
||||
|
||||
try {
|
||||
newValue = unchecked((byte) largeValue);
|
||||
Console.WriteLine("Converted the {0} value {1} to the {2} value {3}.",
|
||||
largeValue.GetType().Name, largeValue,
|
||||
newValue.GetType().Name, newValue);
|
||||
}
|
||||
catch (OverflowException) {
|
||||
Console.WriteLine("{0} is outside the range of the Byte data type.",
|
||||
largeValue);
|
||||
}
|
||||
|
||||
try {
|
||||
newValue = checked((byte) largeValue);
|
||||
Console.WriteLine("Converted the {0} value {1} to the {2} value {3}.",
|
||||
largeValue.GetType().Name, largeValue,
|
||||
newValue.GetType().Name, newValue);
|
||||
}
|
||||
catch (OverflowException) {
|
||||
Console.WriteLine("{0} is outside the range of the Byte data type.",
|
||||
largeValue);
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Converted the Int32 value 2147483647 to the Byte value 255.
|
||||
// 2147483647 is outside the range of the Byte data type.
|
||||
```
|
||||
|
||||
If a particular language compiler supports custom overloaded operators, you can also define explicit conversions in your own custom types. The following example provides a partial implementation of a signed byte data type named `ByteWithSign` that uses sign-and-magnitude representation. It supports explicit conversion of [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32) and [UInt32](https://docs.microsoft.com/dotnet/core/api/System.UInt32) values to `ByteWithSign` values.
|
||||
|
||||
```csharp
|
||||
public struct ByteWithSign
|
||||
{
|
||||
private SByte signValue;
|
||||
private Byte value;
|
||||
|
||||
private const byte MaxValue = byte.MaxValue;
|
||||
private const int MinValue = -1 * byte.MaxValue;
|
||||
|
||||
public static explicit operator ByteWithSign(int value)
|
||||
{
|
||||
// Check for overflow.
|
||||
if (value > ByteWithSign.MaxValue || value < ByteWithSign.MinValue)
|
||||
throw new OverflowException(String.Format("'{0}' is out of range of the ByteWithSign data type.",
|
||||
value));
|
||||
|
||||
ByteWithSign newValue;
|
||||
newValue.signValue = (SByte) Math.Sign(value);
|
||||
newValue.value = (byte) Math.Abs(value);
|
||||
return newValue;
|
||||
}
|
||||
|
||||
public static explicit operator ByteWithSign(uint value)
|
||||
{
|
||||
if (value > ByteWithSign.MaxValue)
|
||||
throw new OverflowException(String.Format("'{0}' is out of range of the ByteWithSign data type.",
|
||||
value));
|
||||
|
||||
ByteWithSign newValue;
|
||||
newValue.signValue = 1;
|
||||
newValue.value = (byte) value;
|
||||
return newValue;
|
||||
}
|
||||
|
||||
public override string ToString()
|
||||
{
|
||||
return (signValue * value).ToString();
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Client code can then declare a `ByteWithSign` variable and assign it [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32) and [UInt32](https://docs.microsoft.com/dotnet/core/api/System.UInt32) values if the assignments include a casting operator, as the following example shows.
|
||||
|
||||
```csharp
|
||||
ByteWithSign value;
|
||||
|
||||
try {
|
||||
int intValue = -120;
|
||||
value = (ByteWithSign) intValue;
|
||||
Console.WriteLine(value);
|
||||
}
|
||||
catch (OverflowException e) {
|
||||
Console.WriteLine(e.Message);
|
||||
}
|
||||
|
||||
try {
|
||||
uint uintValue = 1024;
|
||||
value = (ByteWithSign) uintValue;
|
||||
Console.WriteLine(value);
|
||||
}
|
||||
catch (OverflowException e) {
|
||||
Console.WriteLine(e.Message);
|
||||
}
|
||||
// The example displays the following output:
|
||||
// -120
|
||||
// '1024' is out of range of the ByteWithSign data type.
|
||||
```
|
||||
|
||||
## The IConvertible Interface
|
||||
|
||||
To support the conversion of any type to a common language runtime base type, .NET Core provides the [IConvertible](https://docs.microsoft.com/dotnet/core/api/System.IConvertible) interface. The implementing type is required to provide the following:
|
||||
|
||||
* A method that returns the [TypeCode](https://docs.microsoft.com/dotnet/core/api/System.TypeCode) of the implementing type.
|
||||
|
||||
* Methods to convert the implementing type to each common language runtime base type ([Boolean](https://docs.microsoft.com/dotnet/core/api/System.Boolean), [Byte](https://docs.microsoft.com/dotnet/core/api/System.Byte), [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime), [Decimal](https://docs.microsoft.com/dotnet/core/api/System.Decimal), [Double](https://docs.microsoft.com/dotnet/core/api/System.Double), and so on).
|
||||
|
||||
* A generalized conversion method to convert an instance of the implementing type to another specified type. Conversions that are not supported should throw an [InvalidCastException](https://docs.microsoft.com/dotnet/core/api/System.InvalidCastException).
|
||||
|
||||
Each common language runtime base type (that is, [Boolean](https://docs.microsoft.com/dotnet/core/api/System.Boolean), [Byte](https://docs.microsoft.com/dotnet/core/api/System.Byte), [Char](https://docs.microsoft.com/dotnet/core/api/System.Char), [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime), [Decimal](https://docs.microsoft.com/dotnet/core/api/System.Decimal), [Double](https://docs.microsoft.com/dotnet/core/api/System.Double), [Int16](https://docs.microsoft.com/dotnet/core/api/System.Int16), [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32), [Int64](https://docs.microsoft.com/dotnet/core/api/System.Int64), [SByte](https://docs.microsoft.com/dotnet/core/api/System.SByte), [Single](https://docs.microsoft.com/dotnet/core/api/System.Single), [String](https://docs.microsoft.com/dotnet/core/api/System.String), [UInt16](https://docs.microsoft.com/dotnet/core/api/System.UInt16), [UInt32](https://docs.microsoft.com/dotnet/core/api/System.UInt32), and [UInt64](https://docs.microsoft.com/dotnet/core/api/System.UInt64)), as well as the [DBNull](https://docs.microsoft.com/dotnet/core/api/System.DBNull) and [Enum](https://docs.microsoft.com/dotnet/core/api/System.Enum) types, implements the [IConvertible](https://docs.microsoft.com/dotnet/core/api/System.IConvertible) interface. However, these are explicit interface implementations; the conversion method can be called only through an [IConvertible](https://docs.microsoft.com/dotnet/core/api/System.IConvertible) interface variable, as the following example shows. This example converts an [Int32](https://docs.microsoft.com/dotnet/core/api/System.Int32) value to its equivalent [Char](https://docs.microsoft.com/dotnet/core/api/System.Char) value.
|
||||
|
||||
```csharp
|
||||
int codePoint = 1067;
|
||||
IConvertible iConv = codePoint;
|
||||
char ch = iConv.ToChar(null);
|
||||
Console.WriteLine("Converted {0} to {1}.", codePoint, ch);
|
||||
```
|
||||
|
||||
The requirement to call the conversion method on its interface rather than on the implementing type makes explicit interface implementations relatively expensive. Instead, we recommend that you call the appropriate member of the [Convert](https://docs.microsoft.com/dotnet/core/api/System.Convert) class to convert between common language runtime base types. For more information, see the next section, [The Convert Class](#The-Convert-Class).
|
||||
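As a rough comparison, the sketch below performs the same `Int32`-to-`Char` conversion both ways. Assigning the `int` to an `IConvertible` variable boxes the value, whereas `Convert.ToChar(Int32)` takes the value directly. The class name is an arbitrary choice for this illustration.

```csharp
using System;

public static class ConvertInsteadOfInterface
{
    public static void Main()
    {
        int codePoint = 1067;

        // Explicit interface implementation: requires an IConvertible variable,
        // which boxes the Int32 value.
        IConvertible iConv = codePoint;
        char viaInterface = iConv.ToChar(null);

        // Convert class: no interface variable is needed.
        char viaConvert = Convert.ToChar(codePoint);

        Console.WriteLine(viaInterface == viaConvert);   // True
    }
}
```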
|
||||
> **Note**
|
||||
>
|
||||
> In addition to the [IConvertible](https://docs.microsoft.com/dotnet/core/api/System.IConvertible) interface and the [Convert](https://docs.microsoft.com/dotnet/core/api/System.Convert) class provided by .NET Core, individual languages may also provide ways to perform conversions. For example, C# uses casting operators.
|
||||
|
||||
For the most part, the [IConvertible](https://docs.microsoft.com/dotnet/core/api/System.IConvertible) interface is designed to support conversion between the base types in .NET Core. However, the interface can also be implemented by a custom type to support conversion of that type to other custom types. For more information, see the section [Custom Conversions with the ChangeType Method](#Custom-Conversions-with-the-ChangeType-Method) later in this topic.
|
||||
|
||||
## The Convert Class
|
||||
|
||||
Although each base type's [IConvertible](https://docs.microsoft.com/dotnet/core/api/System.IConvertible) interface implementation can be called to perform a type conversion, calling the methods of the [System.Convert](https://docs.microsoft.com/dotnet/core/api/System.Convert) class is the recommended language-neutral way to convert from one base type to another. In addition, the [Convert.ChangeType(Object, Type, IFormatProvider)](https://docs.microsoft.com/dotnet/core/api/System.Convert#System_Convert_ChangeType_System_Object_System_Type_System_IFormatProvider_) method can be used to convert from a specified custom type to another type.
|
||||
|
||||
### Conversions Between Base Types
|
||||
|
||||
The [Convert](https://docs.microsoft.com/dotnet/core/api/System.Convert) class provides a language-neutral way to perform conversions between base types and is available to all languages that target the common language runtime. It provides a complete set of methods for both widening and narrowing conversions, and throws an [InvalidCastException](https://docs.microsoft.com/dotnet/core/api/System.InvalidCastException) for conversions that are not supported (such as the conversion of a [DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) value to an integer value). Narrowing conversions are performed in a checked context, and an [OverflowException](https://docs.microsoft.com/dotnet/core/api/System.OverflowException) is thrown if the conversion fails.
|
||||
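The two failure modes described above can be told apart by the exception type. The following is a minimal sketch; the helper names `IsUnsupported` and `Overflows` are hypothetical and exist only for this illustration.

```csharp
using System;

public static class ConvertFailures
{
    // Returns true if Convert reports the DateTime-to-Int32 conversion as unsupported.
    public static bool IsUnsupported(DateTime value)
    {
        try
        {
            // Passing the value as an object routes the call through
            // DateTime's IConvertible implementation.
            Convert.ToInt32((object) value);
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    // Returns true if the value does not fit in a Byte.
    public static bool Overflows(int value)
    {
        try
        {
            Convert.ToByte(value);
            return false;
        }
        catch (OverflowException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(IsUnsupported(new DateTime(2016, 6, 20)));   // True
        Console.WriteLine(Overflows(256));                             // True
        Console.WriteLine(Overflows(255));                             // False
    }
}
```

An `InvalidCastException` means the conversion is never possible for that pair of types; an `OverflowException` means it is possible in general but not for that particular value.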
|
||||
> **Important**
|
||||
>
|
||||
> Because the [Convert](https://docs.microsoft.com/dotnet/core/api/System.Convert) class includes methods to convert to and from each base type, it eliminates the need to call each base type's [IConvertible](https://docs.microsoft.com/dotnet/core/api/System.IConvertible) explicit interface implementation.
|
||||
|
||||
The following example illustrates the use of the [System.Convert](https://docs.microsoft.com/dotnet/core/api/System.Convert) class to perform several widening and narrowing conversions between .NET Core base types.
|
||||
|
||||
```csharp
|
||||
// Convert an Int32 value to a Decimal (a widening conversion).
|
||||
int integralValue = 12534;
|
||||
decimal decimalValue = Convert.ToDecimal(integralValue);
|
||||
Console.WriteLine("Converted the {0} value {1} to " +
|
||||
"the {2} value {3:N2}.",
|
||||
integralValue.GetType().Name,
|
||||
integralValue,
|
||||
decimalValue.GetType().Name,
|
||||
decimalValue);
|
||||
// Convert a Byte value to an Int32 value (a widening conversion).
|
||||
byte byteValue = Byte.MaxValue;
|
||||
int integralValue2 = Convert.ToInt32(byteValue);
|
||||
Console.WriteLine("Converted the {0} value {1} to " +
|
||||
"the {2} value {3:G}.",
|
||||
byteValue.GetType().Name,
|
||||
byteValue,
|
||||
integralValue2.GetType().Name,
|
||||
integralValue2);
|
||||
|
||||
// Convert a Double value to an Int64 value (a narrowing conversion).
|
||||
double doubleValue = 16.32513e12;
|
||||
try {
|
||||
long longValue = Convert.ToInt64(doubleValue);
|
||||
Console.WriteLine("Converted the {0} value {1:E} to " +
|
||||
"the {2} value {3:N0}.",
|
||||
doubleValue.GetType().Name,
|
||||
doubleValue,
|
||||
longValue.GetType().Name,
|
||||
longValue);
|
||||
}
|
||||
catch (OverflowException) {
|
||||
Console.WriteLine("Unable to convert the {0} value {1:E}.",
|
||||
doubleValue.GetType().Name, doubleValue);
|
||||
}
|
||||
|
||||
// Convert a signed byte to a byte (a narrowing conversion).
|
||||
sbyte sbyteValue = -16;
|
||||
try {
|
||||
byte byteValue2 = Convert.ToByte(sbyteValue);
|
||||
Console.WriteLine("Converted the {0} value {1} to " +
|
||||
"the {2} value {3:G}.",
|
||||
sbyteValue.GetType().Name,
|
||||
sbyteValue,
|
||||
byteValue2.GetType().Name,
|
||||
byteValue2);
|
||||
}
|
||||
catch (OverflowException) {
|
||||
Console.WriteLine("Unable to convert the {0} value {1}.",
|
||||
sbyteValue.GetType().Name, sbyteValue);
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Converted the Int32 value 12534 to the Decimal value 12,534.00.
|
||||
// Converted the Byte value 255 to the Int32 value 255.
|
||||
// Converted the Double value 1.632513E+013 to the Int64 value 16,325,130,000,000.
|
||||
// Unable to convert the SByte value -16.
|
||||
```
|
||||
|
||||
In some cases, particularly when converting to and from floating-point values, a conversion may involve a loss of precision, even though it does not throw an [OverflowException](https://docs.microsoft.com/dotnet/core/api/System.OverflowException). The following example illustrates this loss of precision. In the first case, a [Decimal](https://docs.microsoft.com/dotnet/core/api/System.Decimal) value has less precision (fewer significant digits) when it is converted to a [Double](https://docs.microsoft.com/dotnet/core/api/System.Double). In the second case, a [Double](https://docs.microsoft.com/dotnet/core/api/System.Double) value is rounded from **42.72** to **43** in order to complete the conversion.
|
||||
|
||||
```csharp
|
||||
double doubleValue;
|
||||
|
||||
// Convert a Decimal to a Double.
|
||||
decimal decimalValue = 13956810.96702888123451471211m;
|
||||
doubleValue = Convert.ToDouble(decimalValue);
|
||||
Console.WriteLine("{0} converted to {1}.", decimalValue, doubleValue);
|
||||
|
||||
doubleValue = 42.72;
|
||||
try {
|
||||
int integerValue = Convert.ToInt32(doubleValue);
|
||||
Console.WriteLine("{0} converted to {1}.",
|
||||
doubleValue, integerValue);
|
||||
}
|
||||
catch (OverflowException) {
|
||||
Console.WriteLine("Unable to convert {0} to an integer.",
|
||||
doubleValue);
|
||||
}
|
||||
// The example displays the following output:
|
||||
// 13956810.96702888123451471211 converted to 13956810.9670289.
|
||||
// 42.72 converted to 43.
|
||||
```
|
||||
|
||||
For a table that lists both the widening and narrowing conversions supported by the [Convert](https://docs.microsoft.com/dotnet/core/api/System.Convert) class, see [Type Conversion Tables](conversio/conversiontables.md).
|
||||
|
||||
### Custom Conversions with the ChangeType Method
|
||||
|
||||
In addition to supporting conversions to each of the base types, the [Convert](https://docs.microsoft.com/dotnet/core/api/System.Convert) class can be used to convert a custom type to one or more predefined types. This conversion is performed by the [Convert.ChangeType(Object, Type, IFormatProvider)](https://docs.microsoft.com/dotnet/core/api/System.Convert#System_Convert_ChangeType_System_Object_System_Type_System_IFormatProvider_) method, which in turn wraps a call to the [IConvertible.ToType](https://docs.microsoft.com/dotnet/core/api/System.IConvertible#System_IConvertible_ToType_System_Type_System_IFormatProvider_) method of the value parameter. This means that the object represented by the value parameter must provide an implementation of the [IConvertible](https://docs.microsoft.com/dotnet/core/api/System.IConvertible) interface.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> Because the [Convert.ChangeType(Object, Type)](https://docs.microsoft.com/dotnet/core/api/System.Convert#System_Convert_ChangeType_System_Object_System_Type_) and [Convert.ChangeType(Object, Type, IFormatProvider)](https://docs.microsoft.com/dotnet/core/api/System.Convert#System_Convert_ChangeType_System_Object_System_Type_System_IFormatProvider_) methods use a [Type](https://docs.microsoft.com/dotnet/core/api/System.Type) object to specify the target type to which value is converted, they can be used to perform a dynamic conversion to an object whose type is not known at compile time. However, note that the [IConvertible](https://docs.microsoft.com/dotnet/core/api/System.IConvertible) implementation of value must still support this conversion.
|
||||
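A dynamic conversion of this kind might look like the following sketch, in which the target type is chosen at run time from an array. The `ConvertTo` helper is a hypothetical wrapper, and the invariant culture is an assumption made here so that the result does not depend on the current culture.

```csharp
using System;
using System.Globalization;

public static class DynamicConversion
{
    // Converts value to a target type that is known only at run time.
    public static object ConvertTo(object value, Type targetType)
    {
        return Convert.ChangeType(value, targetType, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        Type[] targets = { typeof(int), typeof(double), typeof(string) };

        foreach (Type target in targets)
        {
            object result = ConvertTo("42", target);
            Console.WriteLine("{0}: {1}", result.GetType().Name, result);
        }
    }
}
// The example displays the following output:
//       Int32: 42
//       Double: 42
//       String: 42
```

This works because [String](https://docs.microsoft.com/dotnet/core/api/System.String) implements `IConvertible`; a source type without such an implementation would cause `ChangeType` to throw an [InvalidCastException](https://docs.microsoft.com/dotnet/core/api/System.InvalidCastException).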
|
||||
The following example illustrates a possible implementation of the [IConvertible](https://docs.microsoft.com/dotnet/core/api/System.IConvertible) interface that allows a `TemperatureCelsius` object to be converted to a `TemperatureFahrenheit` object and vice versa. The example defines a base class, `Temperature`, that implements the [IConvertible](https://docs.microsoft.com/dotnet/core/api/System.IConvertible) interface and overrides the [Object.ToString](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_ToString) method. The derived `TemperatureCelsius` and `TemperatureFahrenheit` classes each override the `ToType` and the `ToString` methods of the base class.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
|
||||
public abstract class Temperature : IConvertible
|
||||
{
|
||||
protected decimal temp;
|
||||
|
||||
public Temperature(decimal temperature)
|
||||
{
|
||||
this.temp = temperature;
|
||||
}
|
||||
|
||||
public decimal Value
|
||||
{
|
||||
get { return this.temp; }
|
||||
set { this.temp = value; }
|
||||
}
|
||||
|
||||
public override string ToString()
|
||||
{
|
||||
return temp.ToString(null as IFormatProvider) + "°";
|
||||
}
|
||||
|
||||
// IConvertible implementations.
|
||||
public TypeCode GetTypeCode() {
|
||||
return TypeCode.Object;
|
||||
}
|
||||
|
||||
public bool ToBoolean(IFormatProvider provider) {
|
||||
throw new InvalidCastException("Temperature-to-Boolean conversion is not supported.");
|
||||
}
|
||||
|
||||
public byte ToByte(IFormatProvider provider) {
|
||||
if (temp < Byte.MinValue || temp > Byte.MaxValue)
|
||||
throw new OverflowException(String.Format("{0} is out of range of the Byte data type.", temp));
|
||||
else
|
||||
return (byte) temp;
|
||||
}
|
||||
|
||||
public char ToChar(IFormatProvider provider) {
|
||||
throw new InvalidCastException("Temperature-to-Char conversion is not supported.");
|
||||
}
|
||||
|
||||
public DateTime ToDateTime(IFormatProvider provider) {
|
||||
throw new InvalidCastException("Temperature-to-DateTime conversion is not supported.");
|
||||
}
|
||||
|
||||
public decimal ToDecimal(IFormatProvider provider) {
|
||||
return temp;
|
||||
}
|
||||
|
||||
public double ToDouble(IFormatProvider provider) {
|
||||
return (double) temp;
|
||||
}
|
||||
|
||||
public short ToInt16(IFormatProvider provider) {
|
||||
if (temp < Int16.MinValue || temp > Int16.MaxValue)
|
||||
throw new OverflowException(String.Format("{0} is out of range of the Int16 data type.", temp));
|
||||
else
|
||||
return (short) Math.Round(temp);
|
||||
}
|
||||
|
||||
public int ToInt32(IFormatProvider provider) {
|
||||
if (temp < Int32.MinValue || temp > Int32.MaxValue)
|
||||
throw new OverflowException(String.Format("{0} is out of range of the Int32 data type.", temp));
|
||||
else
|
||||
return (int) Math.Round(temp);
|
||||
}
|
||||
|
||||
public long ToInt64(IFormatProvider provider) {
|
||||
if (temp < Int64.MinValue || temp > Int64.MaxValue)
|
||||
throw new OverflowException(String.Format("{0} is out of range of the Int64 data type.", temp));
|
||||
else
|
||||
return (long) Math.Round(temp);
|
||||
}
|
||||
|
||||
public sbyte ToSByte(IFormatProvider provider) {
|
||||
if (temp < SByte.MinValue || temp > SByte.MaxValue)
|
||||
throw new OverflowException(String.Format("{0} is out of range of the SByte data type.", temp));
|
||||
else
|
||||
return (sbyte) temp;
|
||||
}
|
||||
|
||||
public float ToSingle(IFormatProvider provider) {
|
||||
return (float) temp;
|
||||
}
|
||||
|
||||
public virtual string ToString(IFormatProvider provider) {
|
||||
return temp.ToString(provider) + "°";
|
||||
}
|
||||
|
||||
// If conversionType is implemented by another IConvertible method, call it.
|
||||
public virtual object ToType(Type conversionType, IFormatProvider provider) {
|
||||
switch (Type.GetTypeCode(conversionType))
|
||||
{
|
||||
case TypeCode.Boolean:
|
||||
return this.ToBoolean(provider);
|
||||
case TypeCode.Byte:
|
||||
return this.ToByte(provider);
|
||||
case TypeCode.Char:
|
||||
return this.ToChar(provider);
|
||||
case TypeCode.DateTime:
|
||||
return this.ToDateTime(provider);
|
||||
case TypeCode.Decimal:
|
||||
return this.ToDecimal(provider);
|
||||
case TypeCode.Double:
|
||||
return this.ToDouble(provider);
|
||||
case TypeCode.Empty:
|
||||
throw new NullReferenceException("The target type is null.");
|
||||
case TypeCode.Int16:
|
||||
return this.ToInt16(provider);
|
||||
case TypeCode.Int32:
|
||||
return this.ToInt32(provider);
|
||||
case TypeCode.Int64:
|
||||
return this.ToInt64(provider);
|
||||
case TypeCode.Object:
|
||||
// Leave conversion of non-base types to derived classes.
|
||||
throw new InvalidCastException(String.Format("Cannot convert from Temperature to {0}.",
|
||||
conversionType.Name));
|
||||
case TypeCode.SByte:
|
||||
return this.ToSByte(provider);
|
||||
case TypeCode.Single:
|
||||
return this.ToSingle(provider);
|
||||
case TypeCode.String:
|
||||
IConvertible iconv = this;
|
||||
return iconv.ToString(provider);
|
||||
case TypeCode.UInt16:
|
||||
return this.ToUInt16(provider);
|
||||
case TypeCode.UInt32:
|
||||
return this.ToUInt32(provider);
|
||||
case TypeCode.UInt64:
|
||||
return this.ToUInt64(provider);
|
||||
default:
|
||||
throw new InvalidCastException("Conversion not supported.");
|
||||
}
|
||||
}
|
||||
|
||||
public ushort ToUInt16(IFormatProvider provider) {
|
||||
if (temp < UInt16.MinValue || temp > UInt16.MaxValue)
|
||||
throw new OverflowException(String.Format("{0} is out of range of the UInt16 data type.", temp));
|
||||
else
|
||||
return (ushort) Math.Round(temp);
|
||||
}
|
||||
|
||||
public uint ToUInt32(IFormatProvider provider) {
|
||||
if (temp < UInt32.MinValue || temp > UInt32.MaxValue)
|
||||
throw new OverflowException(String.Format("{0} is out of range of the UInt32 data type.", temp));
|
||||
else
|
||||
return (uint) Math.Round(temp);
|
||||
}
|
||||
|
||||
public ulong ToUInt64(IFormatProvider provider) {
|
||||
if (temp < UInt64.MinValue || temp > UInt64.MaxValue)
|
||||
throw new OverflowException(String.Format("{0} is out of range of the UInt64 data type.", temp));
|
||||
else
|
||||
return (ulong) Math.Round(temp);
|
||||
}
|
||||
}
|
||||
|
||||
public class TemperatureCelsius : Temperature, IConvertible
|
||||
{
|
||||
public TemperatureCelsius(decimal value) : base(value)
|
||||
{
|
||||
}
|
||||
|
||||
// Override ToString methods.
|
||||
public override string ToString()
|
||||
{
|
||||
return this.ToString(null);
|
||||
}
|
||||
|
||||
public override string ToString(IFormatProvider provider)
|
||||
{
|
||||
return temp.ToString(provider) + "°C";
|
||||
}
|
||||
|
||||
// If conversionType is implemented by another IConvertible method, call it.
|
||||
public override object ToType(Type conversionType, IFormatProvider provider) {
|
||||
// For non-objects, call base method.
|
||||
if (Type.GetTypeCode(conversionType) != TypeCode.Object) {
|
||||
return base.ToType(conversionType, provider);
|
||||
}
|
||||
else
|
||||
{
|
||||
if (conversionType.Equals(typeof(TemperatureCelsius)))
|
||||
return this;
|
||||
else if (conversionType.Equals(typeof(TemperatureFahrenheit)))
|
||||
return new TemperatureFahrenheit((decimal) this.temp * 9 / 5 + 32);
|
||||
else
|
||||
throw new InvalidCastException(String.Format("Cannot convert from Temperature to {0}.",
|
||||
conversionType.Name));
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
public class TemperatureFahrenheit : Temperature, IConvertible
|
||||
{
|
||||
public TemperatureFahrenheit(decimal value) : base(value)
|
||||
{
|
||||
}
|
||||
|
||||
// Override ToString methods.
|
||||
public override string ToString()
|
||||
{
|
||||
return this.ToString(null);
|
||||
}
|
||||
|
||||
public override string ToString(IFormatProvider provider)
|
||||
{
|
||||
return temp.ToString(provider) + "°F";
|
||||
}
|
||||
|
||||
public override object ToType(Type conversionType, IFormatProvider provider)
|
||||
{
|
||||
// For non-objects, call base method.
|
||||
if (Type.GetTypeCode(conversionType) != TypeCode.Object) {
|
||||
return base.ToType(conversionType, provider);
|
||||
}
|
||||
else
|
||||
{
|
||||
// Handle conversion between derived classes.
|
||||
if (conversionType.Equals(typeof(TemperatureFahrenheit)))
|
||||
return this;
|
||||
else if (conversionType.Equals(typeof(TemperatureCelsius)))
|
||||
return new TemperatureCelsius((decimal) (this.temp - 32) * 5 / 9);
|
||||
// Unspecified object type: throw an InvalidCastException.
|
||||
else
|
||||
throw new InvalidCastException(String.Format("Cannot convert from Temperature to {0}.",
|
||||
conversionType.Name));
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
The following example illustrates several calls to these [IConvertible](https://docs.microsoft.com/dotnet/core/api/System.IConvertible) implementations to convert `TemperatureCelsius` objects to `TemperatureFahrenheit` objects and vice versa.
|
||||
|
||||
```csharp
|
||||
TemperatureCelsius tempC1 = new TemperatureCelsius(0);
|
||||
TemperatureFahrenheit tempF1 = (TemperatureFahrenheit) Convert.ChangeType(tempC1, typeof(TemperatureFahrenheit), null);
|
||||
Console.WriteLine("{0} equals {1}.", tempC1, tempF1);
|
||||
TemperatureCelsius tempC2 = (TemperatureCelsius) Convert.ChangeType(tempC1, typeof(TemperatureCelsius), null);
|
||||
Console.WriteLine("{0} equals {1}.", tempC1, tempC2);
|
||||
TemperatureFahrenheit tempF2 = new TemperatureFahrenheit(212);
|
||||
TemperatureCelsius tempC3 = (TemperatureCelsius) Convert.ChangeType(tempF2, typeof(TemperatureCelsius), null);
|
||||
Console.WriteLine("{0} equals {1}.", tempF2, tempC3);
|
||||
TemperatureFahrenheit tempF3 = (TemperatureFahrenheit) Convert.ChangeType(tempF2, typeof(TemperatureFahrenheit), null);
|
||||
Console.WriteLine("{0} equals {1}.", tempF2, tempF3);
|
||||
// The example displays the following output:
|
||||
// 0°C equals 32°F.
|
||||
// 0°C equals 0°C.
|
||||
// 212°F equals 100°C.
|
||||
// 212°F equals 212°F.
|
||||
```
|
||||
|
||||
## The TypeConverter Class
|
||||
|
||||
.NET Core also allows you to define a type converter for a custom type by extending the [System.ComponentModel.TypeConverter](https://docs.microsoft.com/dotnet/core/api/System.ComponentModel.TypeConverter) class and associating the type converter with the type through a [System.ComponentModel.TypeConverterAttribute](https://docs.microsoft.com/dotnet/core/api/System.ComponentModel.TypeConverterAttribute) attribute. The following table highlights the differences between this approach and implementing the [IConvertible](https://docs.microsoft.com/dotnet/core/api/System.IConvertible) interface for a custom type.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> Design-time support can be provided for a custom type only if it has a type converter defined for it.
|
||||
|
||||
Conversion using TypeConverter | Conversion using IConvertible
|
||||
------------------------------ | -----------------------------
|
||||
Is implemented for a custom type by deriving a separate class from [TypeConverter](https://docs.microsoft.com/dotnet/core/api/System.ComponentModel.TypeConverter). This derived class is associated with the custom type by applying a [TypeConverterAttribute](https://docs.microsoft.com/dotnet/core/api/System.ComponentModel.TypeConverterAttribute) attribute. | Is implemented by a custom type to perform conversion. A user of the type invokes an [IConvertible](https://docs.microsoft.com/dotnet/core/api/System.IConvertible) conversion method on the type.
|
||||
Can be used both at design time and at run time. | Can be used only at run time.
|
||||
Uses reflection; therefore, is slower than conversion enabled by [IConvertible](https://docs.microsoft.com/dotnet/core/api/System.IConvertible). | Does not use reflection.
|
||||
Allows two-way type conversions from the custom type to other data types, and from other data types to the custom type. For example, a [TypeConverter](https://docs.microsoft.com/dotnet/core/api/System.ComponentModel.TypeConverter) defined for `MyType` allows conversions from `MyType` to [String](https://docs.microsoft.com/dotnet/core/api/System.String), and from [String](https://docs.microsoft.com/dotnet/core/api/System.String) to `MyType`. | Allows conversion from a custom type to other data types, but not from other data types to the custom type.
|
||||
|
||||
For more information about using type converters to perform conversions, see [System.ComponentModel.TypeConverter](https://docs.microsoft.com/dotnet/core/api/System.ComponentModel.TypeConverter).
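
As a rough illustration of the `TypeConverter` approach described in the table, the following sketch defines a converter for a hypothetical `Celsius` type and associates it through `TypeConverterAttribute`. The `Celsius`, `CelsiusConverter`, and `CelsiusConverterDemo` names are assumptions made for this example only; the string format it accepts (a number optionally followed by `°C`) is likewise an arbitrary choice.

```csharp
using System;
using System.ComponentModel;
using System.Globalization;

// Hypothetical custom type, associated with its converter through the attribute.
[TypeConverter(typeof(CelsiusConverter))]
public class Celsius
{
    public Celsius(decimal degrees) { Degrees = degrees; }
    public decimal Degrees { get; private set; }
}

// Hypothetical converter that supports string-to-Celsius and Celsius-to-string.
public class CelsiusConverter : TypeConverter
{
    public override bool CanConvertFrom(ITypeDescriptorContext context, Type sourceType)
    {
        return sourceType == typeof(string) || base.CanConvertFrom(context, sourceType);
    }

    public override object ConvertFrom(ITypeDescriptorContext context, CultureInfo culture, object value)
    {
        string text = value as string;
        if (text != null)
        {
            // Strip the optional unit suffix before parsing the number.
            string number = text.Replace("°C", "").Trim();
            return new Celsius(Decimal.Parse(number, culture ?? CultureInfo.CurrentCulture));
        }
        return base.ConvertFrom(context, culture, value);
    }

    public override object ConvertTo(ITypeDescriptorContext context, CultureInfo culture, object value, Type destinationType)
    {
        Celsius celsius = value as Celsius;
        if (destinationType == typeof(string) && celsius != null)
        {
            return celsius.Degrees.ToString(culture ?? CultureInfo.CurrentCulture) + "°C";
        }
        return base.ConvertTo(context, culture, value, destinationType);
    }
}

public static class CelsiusConverterDemo
{
    // Looks up the converter through TypeDescriptor and round-trips a string
    // (string -> Celsius -> string) using the invariant culture.
    public static string RoundTrip(string text)
    {
        TypeConverter converter = TypeDescriptor.GetConverter(typeof(Celsius));
        Celsius value = (Celsius) converter.ConvertFromInvariantString(text);
        return converter.ConvertToInvariantString(value);
    }
}
```

Unlike the `IConvertible` example, the conversion logic lives outside the type itself, and the same converter handles both directions.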
|
||||
|
||||
## See Also
|
||||
|
||||
[System.Convert](https://docs.microsoft.com/dotnet/core/api/System.Convert)
|
||||
|
||||
[IConvertible](https://docs.microsoft.com/dotnet/core/api/System.IConvertible)
|
||||
|
||||
[Type Conversion Tables](conversio/conversiontables.md)
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
|
@ -1,37 +0,0 @@
|
|||
---
|
||||
title: How to: Access the Predefined UTC and Local Time Zone Objects
|
||||
description: How to: Access the Predefined UTC and Local Time Zone Objects
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 6c521dda-9c27-49cb-97f4-f440f02e38f6
|
||||
---
|
||||
|
||||
# How to: Access the Predefined UTC and Local Time Zone Objects
|
||||
|
||||
The [System.TimeZoneInfo](https://docs.microsoft.com/dotnet/core/api/System.TimeZoneInfo) class provides two properties, `Utc` and `Local`, that give your code access to predefined time zone objects. This topic discusses how to access the `TimeZoneInfo` objects returned by those properties.
|
||||
|
||||
## To access the Coordinated Universal Time (UTC) TimeZoneInfo object
|
||||
|
||||
1. Use the **static** `TimeZoneInfo.Utc` property to access Coordinated Universal Time.
|
||||
|
||||
2. Rather than assigning the `TimeZoneInfo` object returned by the property to an object variable, continue to access Coordinated Universal Time through the `TimeZoneInfo.Utc` property.
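
The two steps above can be sketched as follows. The `UtcZoneExample` class and its `ToUtc` method are hypothetical names introduced for illustration; the point is only that the code reads `TimeZoneInfo.Utc` at the point of use instead of storing it in a variable.

```csharp
using System;

public static class UtcZoneExample
{
    // Converts a DateTimeOffset to UTC, accessing the predefined UTC zone
    // directly through the static TimeZoneInfo.Utc property.
    public static DateTimeOffset ToUtc(DateTimeOffset value)
    {
        return TimeZoneInfo.ConvertTime(value, TimeZoneInfo.Utc);
    }
}
```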
|
||||
|
||||
|
||||
## To access the local time zone
|
||||
|
||||
1. Use the **static** `TimeZoneInfo.Local` property to access the local system time zone.
|
||||
|
||||
2. Rather than assigning the `TimeZoneInfo` object returned by the property to an object variable, continue to access the local time zone through the `TimeZoneInfo.Local` property.
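
A minimal sketch of the same pattern for the local time zone follows. The `LocalZoneExample` class and its method names are assumptions made for this example; each method goes through `TimeZoneInfo.Local` every time it is called.

```csharp
using System;

public static class LocalZoneExample
{
    // Returns the local zone's current offset from UTC, reading
    // TimeZoneInfo.Local each time rather than caching the object.
    public static TimeSpan CurrentOffset()
    {
        return TimeZoneInfo.Local.GetUtcOffset(DateTime.Now);
    }

    // Indicates whether the given local time falls in daylight saving time.
    public static bool IsDaylightTime(DateTime localTime)
    {
        return TimeZoneInfo.Local.IsDaylightSavingTime(localTime);
    }
}
```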
|
||||
|
||||
|
||||
## See Also
|
||||
|
||||
[Dates, Times, and Time Zones](index.md)
|
||||
|
||||
[Finding the Time Zones Defined on a Local System](finding-the-time-zones-on-local-system.md)
|
|
@ -1,226 +0,0 @@
|
|||
---
|
||||
title: Choosing Between DateTime, DateTimeOffset, TimeSpan, and TimeZoneInfo
|
||||
description: Choosing Between DateTime, DateTimeOffset, TimeSpan, and TimeZoneInfo
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 6c521dda-9c27-49cb-97f4-f440f02e38f6
|
||||
---
|
||||
|
||||
# Choosing Between DateTime, DateTimeOffset, TimeSpan, and TimeZoneInfo
|
||||
|
||||
.NET Core applications that use date and time information are very diverse and can use that information in several ways. The more common uses of date and time information include one or more of the following:
|
||||
|
||||
* To reflect a date only, so that time information is not important.
|
||||
|
||||
* To reflect a time only, so that date information is not important.
|
||||
|
||||
* To reflect an abstract date and time that is not tied to a specific time and place (for example, most stores in an international chain open on weekdays at 9:00 A.M.).
|
||||
|
||||
* To retrieve date and time information from sources outside of the .NET Core application, typically where date and time information is stored in a simple data type.
|
||||
|
||||
* To uniquely and unambiguously identify a single point in time. Some applications require that a date and time be unambiguous only on the host system; others require that it be unambiguous across systems (that is, a date serialized on one system can be meaningfully deserialized and used on another system anywhere in the world).
|
||||
|
||||
* To preserve multiple related times (such as the requestor's local time and the server's time of receipt for a Web request).
|
||||
|
||||
* To perform date and time arithmetic, possibly with a result that uniquely and unambiguously identifies a single point in time.
|
||||
|
||||
.NET Core includes the [System.DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime), [System.DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset), [System.TimeSpan](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan), and [System.TimeZoneInfo](https://docs.microsoft.com/dotnet/core/api/System.TimeZoneInfo) types, all of which can be used to build applications that work with dates and times.
|
||||
|
||||
## The DateTime structure
|
||||
|
||||
A [System.DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) value defines a particular date and time. It includes a `Kind` property that provides limited information about the time zone to which that date and time belongs. The [DateTimeKind](https://docs.microsoft.com/dotnet/core/api/System.DateTimeKind) value returned by the `Kind` property indicates whether the `DateTime` value represents the local time (`DateTimeKind.Local`), Coordinated Universal Time (UTC) (`DateTimeKind.Utc`), or an unspecified time (`DateTimeKind.Unspecified`).
|
||||
|
||||
The `DateTime` structure is suitable for applications that do the following:
|
||||
|
||||
* Work with dates only.
|
||||
|
||||
* Work with times only.
|
||||
|
||||
* Work with abstract dates and times.
|
||||
|
||||
* Work with dates and times for which time zone information is missing.
|
||||
|
||||
* Work with UTC dates and times only.
|
||||
|
||||
* Retrieve date and time information from sources outside .NET Core, such as SQL databases. Typically, these sources store date and time information in a simple format that is compatible with the `DateTime` structure.
|
||||
|
||||
* Perform date and time arithmetic, but are concerned with general results. For example, in an addition operation that adds six months to a particular date and time, it is often not important whether the result is adjusted for daylight saving time.
|
||||
|
||||
Unless a particular `DateTime` value represents UTC, that date and time value is often ambiguous or limited in its portability. For example, if a `DateTime` value represents the local time, it is portable within that local time zone (that is, if the value is deserialized on another system in the same time zone, that value still unambiguously identifies a single point in time). Outside the local time zone, that `DateTime` value can have multiple interpretations. If the value's `Kind` property is `DateTimeKind.Unspecified`, it is even less portable: it is now ambiguous within the same time zone and possibly even on the same system on which it was first serialized. Only if a `DateTime` value represents UTC does that value unambiguously identify a single point in time regardless of the system or time zone in which the value is used.
|
||||
|
||||
> **Important**
|
||||
>
|
||||
> When saving or sharing `DateTime` data, UTC should be used and the `DateTime` value's `Kind` property should be set to `DateTimeKind.Utc`.
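
One way to follow this recommendation is sketched below. The `UtcStorage` class and its method names are hypothetical; the sketch assumes the value is stored as a string in the round-trip (`"o"`) format, which is only one of several reasonable storage choices.

```csharp
using System;
using System.Globalization;

public static class UtcStorage
{
    // Converts the value to UTC and formats it with the round-trip ("o")
    // specifier. Note that ToUniversalTime treats a value whose Kind is
    // Unspecified as if it were a local time.
    public static string Serialize(DateTime value)
    {
        DateTime utc = value.ToUniversalTime();
        return utc.ToString("o", CultureInfo.InvariantCulture);
    }

    // Parses the stored string; RoundtripKind preserves DateTimeKind.Utc.
    public static DateTime Deserialize(string text)
    {
        return DateTime.Parse(text, CultureInfo.InvariantCulture, DateTimeStyles.RoundtripKind);
    }
}
```

Because the serialized string carries the UTC designator, the value read back on another system has its `Kind` property set to `DateTimeKind.Utc` and identifies the same point in time.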
|
||||
|
||||
## The DateTimeOffset structure
|
||||
|
||||
The [System.DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) structure represents a date and time value, together with an offset that indicates how much that value differs from UTC. Thus, the value always unambiguously identifies a single point in time.
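
To make the "single point in time" idea concrete, the following sketch compares two `DateTimeOffset` values that have different clock times and offsets but refer to the same instant. The `InstantComparison` class and its method names are assumptions introduced for this example.

```csharp
using System;

public static class InstantComparison
{
    // True when both values identify the same point in time,
    // regardless of the offsets they were created with.
    public static bool SameInstant(DateTimeOffset first, DateTimeOffset second)
    {
        return first.Equals(second);
    }

    // True only when both the point in time and the offset match.
    public static bool SameInstantAndOffset(DateTimeOffset first, DateTimeOffset second)
    {
        return first.EqualsExact(second);
    }
}
```

For example, midnight on 6/10/2007 at an offset of -07:00 and 7:00 A.M. on the same date at an offset of +00:00 are the same instant, so `SameInstant` returns `true` for them while `SameInstantAndOffset` returns `false`.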
|
||||
|
||||
The `DateTimeOffset` type includes all of the functionality of the `DateTime` type along with time zone awareness. This makes it suitable for applications that do the following:
|
||||
|
||||
* Uniquely and unambiguously identify a single point in time. The `DateTimeOffset` type can be used to unambiguously define the meaning of "now", to log transaction times, to log the times of system or application events, and to record file creation and modification times.
|
||||
|
||||
* Perform general date and time arithmetic.
|
||||
|
||||
* Preserve multiple related times, as long as those times are stored as two separate values or as two members of a structure.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> These uses for `DateTimeOffset` values are much more common than those for `DateTime` values. As a result, `DateTimeOffset` should be considered the default date and time type for application development.
|
||||
|
||||
A `DateTimeOffset` value is not tied to a particular time zone, but can originate from any of a variety of time zones. To illustrate this, the following example lists the time zones to which a number of `DateTimeOffset` values (including a local Pacific Standard Time) can belong.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
using System.Collections.ObjectModel;
|
||||
|
||||
public class TimeOffsets
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
DateTime thisDate = new DateTime(2007, 3, 10, 0, 0, 0);
|
||||
DateTime dstDate = new DateTime(2007, 6, 10, 0, 0, 0);
|
||||
DateTimeOffset thisTime;
|
||||
|
||||
thisTime = new DateTimeOffset(dstDate, new TimeSpan(-7, 0, 0));
|
||||
ShowPossibleTimeZones(thisTime);
|
||||
|
||||
thisTime = new DateTimeOffset(thisDate, new TimeSpan(-6, 0, 0));
|
||||
ShowPossibleTimeZones(thisTime);
|
||||
|
||||
thisTime = new DateTimeOffset(thisDate, new TimeSpan(+1, 0, 0));
|
||||
ShowPossibleTimeZones(thisTime);
|
||||
}
|
||||
|
||||
private static void ShowPossibleTimeZones(DateTimeOffset offsetTime)
|
||||
{
|
||||
TimeSpan offset = offsetTime.Offset;
|
||||
ReadOnlyCollection<TimeZoneInfo> timeZones;
|
||||
|
||||
Console.WriteLine("{0} could belong to the following time zones:",
|
||||
offsetTime.ToString());
|
||||
// Get all time zones defined on local system
|
||||
timeZones = TimeZoneInfo.GetSystemTimeZones();
|
||||
// Iterate time zones
|
||||
foreach (TimeZoneInfo timeZone in timeZones)
|
||||
{
|
||||
// Compare offset with offset for that date in that time zone
|
||||
if (timeZone.GetUtcOffset(offsetTime.DateTime).Equals(offset))
|
||||
Console.WriteLine(" {0}", timeZone.DisplayName);
|
||||
}
|
||||
Console.WriteLine();
|
||||
}
|
||||
}
|
||||
// This example displays the following output to the console:
|
||||
// 6/10/2007 12:00:00 AM -07:00 could belong to the following time zones:
|
||||
// (GMT-07:00) Arizona
|
||||
// (GMT-08:00) Pacific Time (US & Canada)
|
||||
// (GMT-08:00) Tijuana, Baja California
|
||||
//
|
||||
// 3/10/2007 12:00:00 AM -06:00 could belong to the following time zones:
|
||||
// (GMT-06:00) Central America
|
||||
// (GMT-06:00) Central Time (US & Canada)
|
||||
// (GMT-06:00) Guadalajara, Mexico City, Monterrey - New
|
||||
// (GMT-06:00) Guadalajara, Mexico City, Monterrey - Old
|
||||
// (GMT-06:00) Saskatchewan
|
||||
//
|
||||
// 3/10/2007 12:00:00 AM +01:00 could belong to the following time zones:
|
||||
// (GMT+01:00) Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna
|
||||
// (GMT+01:00) Belgrade, Bratislava, Budapest, Ljubljana, Prague
|
||||
// (GMT+01:00) Brussels, Copenhagen, Madrid, Paris
|
||||
// (GMT+01:00) Sarajevo, Skopje, Warsaw, Zagreb
|
||||
// (GMT+01:00) West Central Africa
|
||||
```
|
||||
|
||||
The output shows that each date and time value in this example can belong to at least three different time zones. The `DateTimeOffset` value of 6/10/2007 shows that if a date and time value represents a daylight saving time, its offset from UTC does not even necessarily correspond to the originating time zone's base UTC offset or to the offset from UTC found in its display name. This means that, because a single `DateTimeOffset` value is not tightly coupled with its time zone, it cannot reflect a time zone's transition to and from daylight saving time. This can be particularly problematic when date and time arithmetic is used to manipulate a `DateTimeOffset` value. For a discussion of how to perform date and time arithmetic in a way that takes account of a time zone's adjustment rules, see [Performing Arithmetic Operations with Dates and Times](performing-arithmetic-operations.md).
|
||||
|
||||
## The TimeSpan structure
|
||||
|
||||
The [System.TimeSpan](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan) structure represents a time interval. Its two typical uses are:
|
||||
|
||||
* Reflecting the time interval between two date and time values. For example, subtracting one `DateTime` value from another returns a `TimeSpan` value.
|
||||
|
||||
* Measuring elapsed time. For example, the `Stopwatch.Elapsed` property returns a `TimeSpan` value that reflects the time interval that has elapsed since the call to one of the [System.Diagnostics.Stopwatch](https://docs.microsoft.com/dotnet/core/api/System.Diagnostics.Stopwatch) methods that begins to measure elapsed time.
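
Both uses listed above can be sketched in a few lines. The `IntervalExamples` class and its method names are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Diagnostics;

public static class IntervalExamples
{
    // Subtracting one DateTime from another yields a TimeSpan.
    public static TimeSpan Between(DateTime start, DateTime end)
    {
        return end - start;
    }

    // Measures how long the supplied action takes, using Stopwatch.Elapsed.
    public static TimeSpan Measure(Action work)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        work();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```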
|
||||
|
||||
A `TimeSpan` value can also be used as a replacement for a `DateTime` value when that value reflects a time without reference to a particular day. This usage is similar to the `DateTime.TimeOfDay` and `DateTimeOffset.TimeOfDay` properties, which return a `TimeSpan` value that represents the time without reference to a date. For example, the `TimeSpan` structure can be used to reflect a store's daily opening or closing time, or it can be used to represent the time at which any regular event occurs.
|
||||
|
||||
The following example defines a `StoreInfo` structure that includes `TimeSpan` objects for store opening and closing times, as well as a `TimeZoneInfo` object that represents the store's time zone. The structure also includes two methods, `IsOpenNow` and `IsOpenAt`, that indicate whether the store is open at a time specified by the user, who is assumed to be in the local time zone.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
|
||||
public struct StoreInfo
|
||||
{
|
||||
public String store;
|
||||
public TimeZoneInfo tz;
|
||||
public TimeSpan open;
|
||||
public TimeSpan close;
|
||||
|
||||
public bool IsOpenNow()
|
||||
{
|
||||
return IsOpenAt(DateTime.Now.TimeOfDay);
|
||||
}
|
||||
|
||||
public bool IsOpenAt(TimeSpan time)
|
||||
{
|
||||
TimeZoneInfo local = TimeZoneInfo.Local;
|
||||
TimeSpan offset = TimeZoneInfo.Local.BaseUtcOffset;
|
||||
|
||||
// Is the store in the same time zone?
|
||||
if (tz.Equals(local)) {
|
||||
return time >= open & time <= close;
|
||||
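}
else {
// One possible approach (a sketch): convert today's date at the given
// local time to the store's time zone, then compare the times of day.
DateTime storeTime = TimeZoneInfo.ConvertTime(DateTime.Today + time, local, tz);
return storeTime.TimeOfDay >= open & storeTime.TimeOfDay <= close;
}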
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
The `StoreInfo` structure can then be used by client code like the following.
|
||||
|
||||
```csharp
|
||||
public class Example
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
// Instantiate a StoreInfo object.
|
||||
var store103 = new StoreInfo();
|
||||
store103.store = "Store #103";
|
||||
store103.tz = TimeZoneInfo.FindSystemTimeZoneById("Eastern Standard Time");
|
||||
// Store opens at 8:00.
|
||||
store103.open = new TimeSpan(8, 0, 0);
|
||||
// Store closes at 21:30 (9:30 P.M.).
|
||||
store103.close = new TimeSpan(21, 30, 0);
|
||||
|
||||
Console.WriteLine("Store is open now at {0}: {1}",
|
||||
DateTime.Now.TimeOfDay, store103.IsOpenNow());
|
||||
TimeSpan[] times = { new TimeSpan(8, 0, 0), new TimeSpan(21, 0, 0),
|
||||
new TimeSpan(4, 59, 0), new TimeSpan(18, 31, 0) };
|
||||
foreach (var time in times)
|
||||
Console.WriteLine("Store is open at {0}: {1}",
|
||||
time, store103.IsOpenAt(time));
|
||||
}
|
||||
}
|
||||
// The example displays the following output:
|
||||
// Store is open now at 15:29:01.6129911: True
|
||||
// Store is open at 08:00:00: True
|
||||
// Store is open at 21:00:00: False
|
||||
// Store is open at 04:59:00: False
|
||||
// Store is open at 18:31:00: False
|
||||
```
|
||||
|
||||
## The TimeZoneInfo class
|
||||
|
||||
The [System.TimeZoneInfo](https://docs.microsoft.com/dotnet/core/api/System.TimeZoneInfo) class represents any of the earth's time zones, and enables the conversion of any date and time in one time zone to its equivalent in another time zone. The `TimeZoneInfo` class makes it possible to work with dates and times so that any date and time value unambiguously identifies a single point in time.
|
||||
|
||||
In some cases, taking full advantage of the `TimeZoneInfo` class may require further development work. Date and time values are not tightly coupled with the time zones to which they belong. As a result, unless your application provides some mechanism for linking a date and time with its associated time zone, it is easy for a particular date and time value to become disassociated from its time zone. One method of linking this information is to define a class or structure that contains both the date and time value and its associated time zone object.
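
A minimal sketch of such a structure follows. The `TimeZoneTime` name and its members are assumptions introduced for this example, not part of .NET Core; the structure simply keeps a `DateTimeOffset` value together with the `TimeZoneInfo` object it belongs to.

```csharp
using System;

// Hypothetical structure that links a date and time with its time zone.
public struct TimeZoneTime
{
    private readonly TimeZoneInfo timeZone;
    private readonly DateTimeOffset time;

    public TimeZoneTime(TimeZoneInfo timeZone, DateTimeOffset time)
    {
        if (timeZone == null)
            throw new ArgumentNullException("timeZone");

        this.timeZone = timeZone;
        // Express the instant in the given zone so the offset always matches it.
        this.time = TimeZoneInfo.ConvertTime(time, timeZone);
    }

    public TimeZoneInfo TimeZone { get { return timeZone; } }

    public DateTimeOffset Time { get { return time; } }

    // Adds an interval and re-converts, so the offset reflects the zone's
    // rules (for example, daylight saving time) at the resulting instant.
    public TimeZoneTime Add(TimeSpan interval)
    {
        return new TimeZoneTime(timeZone, time.Add(interval));
    }
}
```

Because the time zone travels with the value, arithmetic such as `Add` can re-apply the zone's adjustment rules instead of carrying a fixed offset forward.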
|
||||
|
||||
Taking advantage of time zone support in .NET Core is possible only if the time zone to which a date and time value belongs is known when that date and time object is instantiated. This is often not the case, particularly in Web or network applications.
|
||||
|
||||
## See Also
|
||||
|
||||
[Dates, Times, and Time Zones](index.md)
|
|
@ -1,314 +0,0 @@
|
|||
---
|
||||
title: Converting Between DateTime and DateTimeOffset
|
||||
description: Converting Between DateTime and DateTimeOffset
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 08957e66-441a-4153-b57b-cc6a3c7b02f8
|
||||
---
|
||||
|
||||
# Converting Between DateTime and DateTimeOffset
|
||||
|
||||
Although the [System.DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) structure provides a greater degree of time zone awareness than the [System.DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) structure, `DateTime` parameters are used more commonly in method calls. Because of this, the ability to convert `DateTimeOffset` values to `DateTime` values and vice versa is particularly important. This article shows how to perform these conversions in a way that preserves as much time zone information as possible.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> Both the `DateTime` and the `DateTimeOffset` types have some limitations when representing times in time zones. With its `Kind` property, `DateTime` is able to reflect only Coordinated Universal Time (UTC) and the system's local time zone. `DateTimeOffset` reflects a time's offset from UTC, but it does not reflect the actual time zone to which that offset belongs. For details about time values and support for time zones, see [Choosing Between DateTime, DateTimeOffset, TimeSpan, and TimeZoneInfo](choosing-between-datetime.md).
|
||||
|
||||
## Conversions from DateTime to DateTimeOffset
|
||||
|
||||
The `DateTimeOffset` structure provides two equivalent ways to perform `DateTime` to `DateTimeOffset` conversion that are suitable for most conversions:
|
||||
|
||||
* The [DateTimeOffset(DateTime)](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#System_DateTimeOffset__ctor_System_DateTime_) constructor, which creates a new `DateTimeOffset` object based on a `DateTime` value.
|
||||
|
||||
* The implicit conversion operator, which allows you to assign a `DateTime` value to a `DateTimeOffset` object.
|
||||
|
||||
For UTC and local `DateTime` values, the [DateTimeOffset.Offset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#System_DateTimeOffset_Offset) property of the resulting `DateTimeOffset` value accurately reflects the UTC or local time zone offset. For example, the following code converts a UTC time to its equivalent `DateTimeOffset` value.
|
||||
|
||||
```csharp
|
||||
DateTime utcTime1 = new DateTime(2008, 6, 19, 7, 0, 0);
|
||||
utcTime1 = DateTime.SpecifyKind(utcTime1, DateTimeKind.Utc);
|
||||
DateTimeOffset utcTime2 = utcTime1;
|
||||
Console.WriteLine("Converted {0} {1} to a DateTimeOffset value of {2}",
|
||||
utcTime1,
|
||||
utcTime1.Kind.ToString(),
|
||||
utcTime2);
|
||||
// This example displays the following output to the console:
|
||||
// Converted 6/19/2008 7:00:00 AM Utc to a DateTimeOffset value of 6/19/2008 7:00:00 AM +00:00
|
||||
```
|
||||
|
||||
In this case, the offset of the `utcTime2` variable is 00:00. Similarly, the following code converts a local time to its equivalent `DateTimeOffset` value.
|
||||
|
||||
```csharp
|
||||
DateTime localTime1 = new DateTime(2008, 6, 19, 7, 0, 0);
|
||||
localTime1 = DateTime.SpecifyKind(localTime1, DateTimeKind.Local);
|
||||
DateTimeOffset localTime2 = localTime1;
|
||||
Console.WriteLine("Converted {0} {1} to a DateTimeOffset value of {2}",
|
||||
localTime1,
|
||||
localTime1.Kind.ToString(),
|
||||
localTime2);
|
||||
// This example displays the following output to the console:
|
||||
// Converted 6/19/2008 7:00:00 AM Local to a DateTimeOffset value of 6/19/2008 7:00:00 AM -07:00
|
||||
```
|
||||
|
||||
However, for `DateTime` values whose `Kind` property is [DateTimeKind.Unspecified](https://docs.microsoft.com/dotnet/core/api/System.DateTimeKind#System_DateTimeKind_Unspecified), these two conversion methods produce a `DateTimeOffset` value whose offset is that of the local time zone. This is shown in the following example, which is run in the U.S. Pacific Standard Time zone.
|
||||
|
||||
```csharp
|
||||
DateTime time1 = new DateTime(2008, 6, 19, 7, 0, 0); // Kind is DateTimeKind.Unspecified
|
||||
DateTimeOffset time2 = time1;
|
||||
Console.WriteLine("Converted {0} {1} to a DateTimeOffset value of {2}",
|
||||
time1,
|
||||
time1.Kind.ToString(),
|
||||
time2);
|
||||
// This example displays the following output to the console:
|
||||
// Converted 6/19/2008 7:00:00 AM Unspecified to a DateTimeOffset value of 6/19/2008 7:00:00 AM -07:00
|
||||
```
|
||||
|
||||
If the `DateTime` value reflects the date and time in something other than the local time zone or UTC, you can convert it to a `DateTimeOffset` value and preserve its time zone information by calling the overloaded [DateTimeOffset(DateTime, TimeSpan)](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#System_DateTimeOffset__ctor_System_DateTime_System_TimeSpan_) constructor. For example, the following code instantiates a `DateTimeOffset` object that reflects Central Standard Time.
|
||||
|
||||
```csharp
|
||||
DateTime time1 = new DateTime(2008, 6, 19, 7, 0, 0); // Kind is DateTimeKind.Unspecified
|
||||
try
|
||||
{
|
||||
DateTimeOffset time2 = new DateTimeOffset(time1,
|
||||
TimeZoneInfo.FindSystemTimeZoneById("Central Standard Time").GetUtcOffset(time1));
|
||||
Console.WriteLine("Converted {0} {1} to a DateTimeOffset value of {2}",
|
||||
time1,
|
||||
time1.Kind.ToString(),
|
||||
time2);
|
||||
}
|
||||
// Handle exception if time zone is not defined in registry
|
||||
catch (TimeZoneNotFoundException)
|
||||
{
|
||||
Console.WriteLine("Unable to identify target time zone for conversion.");
|
||||
}
|
||||
// This example displays the following output to the console:
|
||||
// Converted 6/19/2008 7:00:00 AM Unspecified to a DateTimeOffset value of 6/19/2008 7:00:00 AM -05:00
|
||||
```
|
||||
|
||||
The second parameter to this constructor overload, a [System.TimeSpan]( https://docs.microsoft.com/dotnet/core/api/System.TimeSpan) object that represents the time's offset from UTC, should be retrieved by calling the [TimeZoneInfo.GetUtcOffset(DateTime)]( https://docs.microsoft.com/dotnet/core/api/System.TimeZoneInfo#System_TimeZoneInfo_GetUtcOffset_System_DateTime_) method of the time's corresponding time zone. The method's single parameter is the `DateTime` value that represents the date and time to be converted. If the time zone supports daylight saving time, this parameter allows the method to determine the appropriate offset for that particular date and time.
|
||||
|
||||
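To make the effect of daylight saving time concrete, the following sketch builds `DateTimeOffset` values for a winter date and a summer date in the same zone. The `WithZoneOffset` helper is a hypothetical name, not part of .NET, and the sketch uses `TimeZoneInfo.Local` only so that it does not depend on a particular time zone identifier being installed. The offsets it prints depend on the machine's time zone.

```csharp
using System;

static class OffsetForDateExample
{
    // Hypothetical helper: pairs a date and time with the UTC offset
    // that is in effect in the given zone at that date and time.
    public static DateTimeOffset WithZoneOffset(DateTime time, TimeZoneInfo zone)
    {
        // Force Kind to Unspecified; for Utc or Local values the constructor
        // checks the offset against UTC or the local zone and may throw.
        DateTime unspecified = DateTime.SpecifyKind(time, DateTimeKind.Unspecified);
        return new DateTimeOffset(unspecified, zone.GetUtcOffset(unspecified));
    }

    static void Main()
    {
        TimeZoneInfo zone = TimeZoneInfo.Local;
        Console.WriteLine(WithZoneOffset(new DateTime(2008, 1, 15, 7, 0, 0), zone));
        Console.WriteLine(WithZoneOffset(new DateTime(2008, 7, 15, 7, 0, 0), zone));
    }
}
```

In a zone that observes daylight saving time, the two lines show different offsets for the same wall-clock time.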
## Conversions from DateTimeOffset to DateTime
|
||||
|
||||
The `DateTime` property is most commonly used to perform `DateTimeOffset` to `DateTime` conversion. However, it returns a `DateTime` value whose `Kind` property is [DateTimeKind.Unspecified]( https://docs.microsoft.com/dotnet/core/api/System.DateTimeKind#System_DateTimeKind_Unspecified), as the following example illustrates.
|
||||
|
||||
```csharp
|
||||
DateTime baseTime = new DateTime(2008, 6, 19, 7, 0, 0);
|
||||
DateTimeOffset sourceTime;
|
||||
DateTime targetTime;
|
||||
|
||||
// Convert UTC to DateTime value
|
||||
sourceTime = new DateTimeOffset(baseTime, TimeSpan.Zero);
|
||||
targetTime = sourceTime.DateTime;
|
||||
Console.WriteLine("{0} converts to {1} {2}",
|
||||
sourceTime,
|
||||
targetTime,
|
||||
targetTime.Kind.ToString());
|
||||
|
||||
// Convert local time to DateTime value
|
||||
sourceTime = new DateTimeOffset(baseTime,
|
||||
TimeZoneInfo.Local.GetUtcOffset(baseTime));
|
||||
targetTime = sourceTime.DateTime;
|
||||
Console.WriteLine("{0} converts to {1} {2}",
|
||||
sourceTime,
|
||||
targetTime,
|
||||
targetTime.Kind.ToString());
|
||||
|
||||
// Convert Central Standard Time to a DateTime value
|
||||
try
|
||||
{
|
||||
TimeSpan offset = TimeZoneInfo.FindSystemTimeZoneById("Central Standard Time").GetUtcOffset(baseTime);
|
||||
sourceTime = new DateTimeOffset(baseTime, offset);
|
||||
targetTime = sourceTime.DateTime;
|
||||
Console.WriteLine("{0} converts to {1} {2}",
|
||||
sourceTime,
|
||||
targetTime,
|
||||
targetTime.Kind.ToString());
|
||||
}
|
||||
catch (TimeZoneNotFoundException)
|
||||
{
|
||||
Console.WriteLine("Unable to create DateTimeOffset based on U.S. Central Standard Time.");
|
||||
}
|
||||
// This example displays the following output to the console:
|
||||
// 6/19/2008 7:00:00 AM +00:00 converts to 6/19/2008 7:00:00 AM Unspecified
|
||||
// 6/19/2008 7:00:00 AM -07:00 converts to 6/19/2008 7:00:00 AM Unspecified
|
||||
// 6/19/2008 7:00:00 AM -05:00 converts to 6/19/2008 7:00:00 AM Unspecified
|
||||
```
|
||||
|
||||
This means that any information about the `DateTimeOffset` value's relationship to UTC is lost by the conversion when the `DateTime` property is used. This affects `DateTimeOffset` values that correspond to UTC time or to the system's local time because the `DateTime` structure reflects only those two time zones in its `Kind` property.
|
||||
|
||||
To preserve as much time zone information as possible when converting a `DateTimeOffset` to a `DateTime` value, you can use the [DateTimeOffset.UtcDateTime]( https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#System_DateTimeOffset_UtcDateTime) and [DateTimeOffset.LocalDateTime]( https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#System_DateTimeOffset_LocalDateTime) properties.
|
||||
|
||||
## Converting a UTC Time
|
||||
|
||||
To indicate that a converted `DateTime` value is the UTC time, you can retrieve the value of the [DateTimeOffset.UtcDateTime]( https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#System_DateTimeOffset_UtcDateTime) property. It differs from the [DateTimeOffset.DateTime]( https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#System_DateTimeOffset_DateTime) property in two ways:
|
||||
|
||||
* It returns a `DateTime` value whose `Kind` property is [DateTimeKind.Utc]( https://docs.microsoft.com/dotnet/core/api/System.DateTimeKind#System_DateTimeKind_Utc).
|
||||
|
||||
* If the [DateTimeOffset.Offset]( https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#System_DateTimeOffset_Offset) property value does not equal [TimeSpan.Zero]( https://docs.microsoft.com/dotnet/core/api/System.TimeSpan#System_TimeSpan_Zero), it converts the time to UTC.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> If your application requires that converted `DateTime` values unambiguously identify a single point in time, you should consider using the `DateTimeOffset.UtcDateTime` property to handle all `DateTimeOffset` to `DateTime` conversions.
|
||||
|
||||
The following code uses the `UtcDateTime` property to convert a `DateTimeOffset` value whose offset equals `TimeSpan.Zero` to a `DateTime` value.
|
||||
|
||||
```csharp
|
||||
DateTimeOffset utcTime1 = new DateTimeOffset(2008, 6, 19, 7, 0, 0, TimeSpan.Zero);
|
||||
DateTime utcTime2 = utcTime1.UtcDateTime;
|
||||
Console.WriteLine("{0} converted to {1} {2}",
|
||||
utcTime1,
|
||||
utcTime2,
|
||||
utcTime2.Kind.ToString());
|
||||
// The example displays the following output to the console:
|
||||
// 6/19/2008 7:00:00 AM +00:00 converted to 6/19/2008 7:00:00 AM Utc
|
||||
```
|
||||
|
||||
The following code uses the `UtcDateTime` property to perform both a time zone conversion and a type conversion on a `DateTimeOffset` value.
|
||||
|
||||
```csharp
|
||||
DateTimeOffset originalTime = new DateTimeOffset(2008, 6, 19, 7, 0, 0, new TimeSpan(5, 0, 0));
|
||||
DateTime utcTime = originalTime.UtcDateTime;
|
||||
Console.WriteLine("{0} converted to {1} {2}",
|
||||
originalTime,
|
||||
utcTime,
|
||||
utcTime.Kind.ToString());
|
||||
// The example displays the following output to the console:
|
||||
// 6/19/2008 7:00:00 AM +05:00 converted to 6/19/2008 2:00:00 AM Utc
|
||||
```
|
||||
|
||||
## Converting a Local Time
|
||||
|
||||
To indicate that a `DateTimeOffset` value represents the local time, you can pass the `DateTime` value returned by the [DateTimeOffset.DateTime]( https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#System_DateTimeOffset_DateTime) property to the static [DateTime.SpecifyKind(DateTime, DateTimeKind)]( https://docs.microsoft.com/dotnet/core/api/System.DateTime#methods) method. The method returns the date and time passed to it as its first parameter, but sets the `Kind` property to the value specified by its second parameter. The following code uses the `SpecifyKind` method when converting a `DateTimeOffset` value whose offset corresponds to that of the local time zone.
|
||||
|
||||
```csharp
|
||||
DateTime sourceDate = new DateTime(2008, 6, 19, 7, 0, 0);
|
||||
DateTimeOffset utcTime1 = new DateTimeOffset(sourceDate,
|
||||
TimeZoneInfo.Local.GetUtcOffset(sourceDate));
|
||||
DateTime utcTime2 = utcTime1.DateTime;
|
||||
if (utcTime1.Offset.Equals(TimeZoneInfo.Local.GetUtcOffset(utcTime1.DateTime)))
|
||||
utcTime2 = DateTime.SpecifyKind(utcTime2, DateTimeKind.Local);
|
||||
|
||||
Console.WriteLine("{0} converted to {1} {2}",
|
||||
utcTime1,
|
||||
utcTime2,
|
||||
utcTime2.Kind.ToString());
|
||||
// The example displays the following output to the console:
|
||||
// 6/19/2008 7:00:00 AM -07:00 converted to 6/19/2008 7:00:00 AM Local
|
||||
```
|
||||
|
||||
You can also use the `DateTimeOffset.LocalDateTime` property to convert a `DateTimeOffset` value to a local `DateTime` value. The `Kind` property of the returned `DateTime` value is [DateTimeKind.Local]( https://docs.microsoft.com/dotnet/core/api/System.DateTimeKind#System_DateTimeKind_Local). The following code uses the `DateTimeOffset.LocalDateTime` property when converting a `DateTimeOffset` value whose offset corresponds to that of the local time zone.
|
||||
|
||||
```csharp
|
||||
DateTime sourceDate = new DateTime(2008, 6, 19, 7, 0, 0);
|
||||
DateTimeOffset localTime1 = new DateTimeOffset(sourceDate,
|
||||
TimeZoneInfo.Local.GetUtcOffset(sourceDate));
|
||||
DateTime localTime2 = localTime1.LocalDateTime;
|
||||
|
||||
Console.WriteLine("{0} converted to {1} {2}",
|
||||
localTime1,
|
||||
localTime2,
|
||||
localTime2.Kind.ToString());
|
||||
// The example displays the following output to the console:
|
||||
// 6/19/2008 7:00:00 AM -07:00 converted to 6/19/2008 7:00:00 AM Local
|
||||
```
|
||||
|
||||
When you retrieve a `DateTime` value using the `DateTimeOffset.LocalDateTime` property, the property's `get` accessor first converts the `DateTimeOffset` value to UTC, then converts it to local time by calling the [DateTimeOffset.ToLocalTime]( https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#methods) method. This means that you can retrieve a value from the `DateTimeOffset.LocalDateTime` property to perform a time zone conversion at the same time that you perform a type conversion. It also means that the local time zone's adjustment rules are applied in performing the conversion. The following code illustrates the use of the `DateTimeOffset.LocalDateTime` property to perform both a type and a time zone conversion.
|
||||
|
||||
```csharp
|
||||
DateTimeOffset originalDate;
|
||||
DateTime localDate;
|
||||
|
||||
// Convert time originating in a different time zone
|
||||
originalDate = new DateTimeOffset(2008, 6, 18, 7, 0, 0,
|
||||
new TimeSpan(-5, 0, 0));
|
||||
localDate = originalDate.LocalDateTime;
|
||||
Console.WriteLine("{0} converted to {1} {2}",
|
||||
originalDate,
|
||||
localDate,
|
||||
localDate.Kind.ToString());
|
||||
// Convert time originating in a different time zone
|
||||
// so local time zone's adjustment rules are applied
|
||||
originalDate = new DateTimeOffset(2007, 11, 4, 4, 0, 0,
|
||||
new TimeSpan(-5, 0, 0));
|
||||
localDate = originalDate.LocalDateTime;
|
||||
Console.WriteLine("{0} converted to {1} {2}",
|
||||
originalDate,
|
||||
localDate,
|
||||
localDate.Kind.ToString());
|
||||
// The example displays the following output to the console:
|
||||
// 6/18/2008 7:00:00 AM -05:00 converted to 6/18/2008 5:00:00 AM Local
|
||||
// 11/4/2007 4:00:00 AM -05:00 converted to 11/4/2007 1:00:00 AM Local
|
||||
```
|
||||
|
||||
## A General-Purpose Conversion Method
|
||||
|
||||
The following example defines a method named `ConvertFromDateTimeOffset` that converts `DateTimeOffset` values to `DateTime` values. Based on its offset, it determines whether the `DateTimeOffset` value is a UTC time, a local time, or some other time, and defines the returned date and time value's `Kind` property accordingly.
|
||||
|
||||
```csharp
|
||||
static DateTime ConvertFromDateTimeOffset(DateTimeOffset dateTime)
|
||||
{
|
||||
if (dateTime.Offset.Equals(TimeSpan.Zero))
|
||||
return dateTime.UtcDateTime;
|
||||
else if (dateTime.Offset.Equals(TimeZoneInfo.Local.GetUtcOffset(dateTime.DateTime)))
|
||||
return DateTime.SpecifyKind(dateTime.DateTime, DateTimeKind.Local);
|
||||
else
|
||||
return dateTime.DateTime;
|
||||
}
|
||||
```
|
||||
|
||||
The following example calls the `ConvertFromDateTimeOffset` method to convert `DateTimeOffset` values that represent a UTC time, a local time, and a time in the U.S. Central Standard Time zone.
|
||||
|
||||
```csharp
|
||||
DateTime timeComponent = new DateTime(2008, 6, 19, 7, 0, 0);
|
||||
DateTime returnedDate;
|
||||
|
||||
// Convert UTC time
|
||||
DateTimeOffset utcTime = new DateTimeOffset(timeComponent, TimeSpan.Zero);
|
||||
returnedDate = ConvertFromDateTimeOffset(utcTime);
|
||||
Console.WriteLine("{0} converted to {1} {2}",
|
||||
utcTime,
|
||||
returnedDate,
|
||||
returnedDate.Kind.ToString());
|
||||
|
||||
// Convert local time
|
||||
DateTimeOffset localTime = new DateTimeOffset(timeComponent,
|
||||
TimeZoneInfo.Local.GetUtcOffset(timeComponent));
|
||||
returnedDate = ConvertFromDateTimeOffset(localTime);
|
||||
Console.WriteLine("{0} converted to {1} {2}",
|
||||
localTime,
|
||||
returnedDate,
|
||||
returnedDate.Kind.ToString());
|
||||
|
||||
// Convert Central Standard Time
|
||||
DateTimeOffset cstTime = new DateTimeOffset(timeComponent,
|
||||
TimeZoneInfo.FindSystemTimeZoneById("Central Standard Time").GetUtcOffset(timeComponent));
|
||||
returnedDate = ConvertFromDateTimeOffset(cstTime);
|
||||
Console.WriteLine("{0} converted to {1} {2}",
|
||||
cstTime,
|
||||
returnedDate,
|
||||
returnedDate.Kind.ToString());
|
||||
// The example displays the following output to the console:
|
||||
// 6/19/2008 7:00:00 AM +00:00 converted to 6/19/2008 7:00:00 AM Utc
|
||||
// 6/19/2008 7:00:00 AM -07:00 converted to 6/19/2008 7:00:00 AM Local
|
||||
// 6/19/2008 7:00:00 AM -05:00 converted to 6/19/2008 7:00:00 AM Unspecified
|
||||
```
|
||||
|
||||
Note that this code makes two assumptions that, depending on the application and the source of its date and time values, may not always be valid:
|
||||
|
||||
* It assumes that a date and time value whose offset is `TimeSpan.Zero` represents UTC. In fact, UTC is not a time in a particular time zone, but the time in relation to which the times in the world's time zones are standardized. Time zones can also have an offset of `Zero`.
|
||||
|
||||
* It assumes that a date and time whose offset equals that of the local time zone represents the local time zone. Because date and time values are disassociated from their original time zone, this may not be the case; the date and time can have originated in another time zone with the same offset.
|
||||
|
||||
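When the originating time zone is actually known, one way to avoid both assumptions is to pass it in explicitly instead of inferring it from the offset. The following is only a sketch of that idea; `ConvertWithKnownZone` and its `sourceZone` parameter are hypothetical names that do not appear in the example above.

```csharp
using System;

static class KnownZoneConversion
{
    // Hypothetical variant of ConvertFromDateTimeOffset: the caller states
    // which zone the value came from rather than guessing from the offset.
    public static DateTime ConvertWithKnownZone(DateTimeOffset value, TimeZoneInfo sourceZone)
    {
        // Reject values whose offset is not the one the zone uses at that instant.
        if (sourceZone.GetUtcOffset(value) != value.Offset)
            throw new ArgumentException("Offset does not match the supplied time zone.", nameof(value));

        if (sourceZone.Equals(TimeZoneInfo.Utc))
            return value.UtcDateTime;                                        // Kind is Utc
        if (sourceZone.Equals(TimeZoneInfo.Local))
            return DateTime.SpecifyKind(value.DateTime, DateTimeKind.Local); // Kind is Local
        return value.DateTime;                                               // Kind is Unspecified
    }

    static void Main()
    {
        DateTimeOffset utcValue = new DateTimeOffset(2008, 6, 19, 7, 0, 0, TimeSpan.Zero);
        DateTime result = ConvertWithKnownZone(utcValue, TimeZoneInfo.Utc);
        Console.WriteLine("{0} converted to {1} {2}", utcValue, result, result.Kind);
    }
}
```

The trade-off is that the caller must carry the time zone alongside the value, because a `DateTimeOffset` does not record it.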
## See Also
|
||||
|
||||
[Dates, Times, and Time Zones](index.md)
|
||||
|
||||
|
||||
|
||||
|
|
@ -1,113 +0,0 @@
|
|||
---
|
||||
title: Converting Times Between Time Zones
|
||||
description: Converting Times Between Time Zones
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 33fd3334-352d-4a7f-9396-a07ea9991aeb
|
||||
---
|
||||
|
||||
# Converting Times Between Time Zones
|
||||
|
||||
It is becoming increasingly important for any application that works with dates and times to handle differences between time zones. An application can no longer assume that all times can be expressed in the local time, which is the time available from the [System.DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) structure. For example, a Web page that displays the current time in the eastern part of the United States will lack credibility to a customer in eastern Asia. This topic explains how to convert times from one time zone to another, as well as how to convert `DateTimeOffset` values that have limited time zone awareness.
|
||||
|
||||
## Converting UTC to Local Time
|
||||
|
||||
To convert UTC to local time, call the [DateTime.ToLocalTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime#methods) method of the `DateTime` object whose time you want to convert. The exact behavior of the method depends on the value of the object’s `Kind` property, as the following table shows.
|
||||
|
||||
DateTime.Kind property | Conversion
|
||||
---------------------- | ----------
|
||||
`DateTimeKind.Local` | Returns the `DateTime` value unchanged.
|
||||
`DateTimeKind.Unspecified` | Assumes that the `DateTime` value is UTC and converts the UTC to local time.
|
||||
`DateTimeKind.Utc` | Converts the `DateTime` value to local time.
|
||||
|
||||
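The three rows of the table can be observed with a short sketch such as the following. The `ConvertWithKind` helper is a hypothetical name introduced for illustration, and the times printed depend on the machine's local time zone.

```csharp
using System;

static class ToLocalTimeByKind
{
    // Stamps the given Kind on a date and time, then calls ToLocalTime.
    public static DateTime ConvertWithKind(DateTime value, DateTimeKind kind)
    {
        return DateTime.SpecifyKind(value, kind).ToLocalTime();
    }

    static void Main()
    {
        DateTime baseTime = new DateTime(2008, 6, 19, 7, 0, 0);
        foreach (DateTimeKind kind in new[] { DateTimeKind.Local, DateTimeKind.Unspecified, DateTimeKind.Utc })
        {
            DateTime result = ConvertWithKind(baseTime, kind);
            Console.WriteLine("{0,-11} -> {1} {2}", kind, result, result.Kind);
        }
    }
}
```

The `Local` row prints the original time, while the `Unspecified` and `Utc` rows print the same converted time as each other.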
## Converting Between Any Two Time Zones
|
||||
|
||||
You can convert between any two time zones by using the static [TimeZoneInfo.ConvertTime](https://docs.microsoft.com/dotnet/core/api/System.TimeZoneInfo#System_TimeZoneInfo_ConvertTime_System_DateTime_System_TimeZoneInfo_) method. This method's parameters are the `DateTime` value to convert, a `TimeZoneInfo` object that represents the time zone of the date and time value, and a `TimeZoneInfo` object that represents the time zone to convert the date and time value to.
|
||||
|
||||
The method requires that the `Kind` property of the date and time value to convert and the `TimeZoneInfo` object or time zone identifier that represents its time zone correspond to one another. Otherwise, an [ArgumentException](https://docs.microsoft.com/dotnet/core/api/System.ArgumentException) is thrown. For example, if the `Kind` property of the date and time value is `DateTimeKind.Local`, an exception is thrown if the `TimeZoneInfo` object passed as a parameter to the method is not equal to `TimeZoneInfo.Local`. An exception is also thrown if the identifier passed as a parameter to the method is not equal to `TimeZoneInfo.Local.Id`.
|
||||
|
||||
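The `Kind` check can be seen in isolation with a sketch like the one below. `ThrowsOnMismatch` is a hypothetical helper introduced for illustration; it reports whether `ConvertTime` rejects a given pairing of date and time value and source time zone.

```csharp
using System;

static class ConvertTimeKindCheck
{
    // Returns true if ConvertTime rejects the value/zone pairing.
    public static bool ThrowsOnMismatch(DateTime value, TimeZoneInfo sourceZone)
    {
        try
        {
            TimeZoneInfo.ConvertTime(value, sourceZone, TimeZoneInfo.Utc);
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }

    static void Main()
    {
        DateTime localTime = DateTime.SpecifyKind(new DateTime(2007, 2, 1, 8, 0, 0), DateTimeKind.Local);
        DateTime utcTime = DateTime.SpecifyKind(localTime, DateTimeKind.Utc);

        // Kind is Local but the source zone is UTC: the pairing is rejected.
        Console.WriteLine(ThrowsOnMismatch(localTime, TimeZoneInfo.Utc));
        // Kind is Utc and the source zone is UTC: the pairing is accepted.
        Console.WriteLine(ThrowsOnMismatch(utcTime, TimeZoneInfo.Utc));
    }
}
```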
The following example uses the `ConvertTime` method to convert from Hawaiian Standard Time to local time.
|
||||
|
||||
```csharp
|
||||
DateTime hwTime = new DateTime(2007, 02, 01, 08, 00, 00);
|
||||
try
|
||||
{
|
||||
TimeZoneInfo hwZone = TimeZoneInfo.FindSystemTimeZoneById("Hawaiian Standard Time");
|
||||
Console.WriteLine("{0} {1} is {2} local time.",
|
||||
hwTime,
|
||||
hwZone.IsDaylightSavingTime(hwTime) ? hwZone.DaylightName : hwZone.StandardName,
|
||||
TimeZoneInfo.ConvertTime(hwTime, hwZone, TimeZoneInfo.Local));
|
||||
}
|
||||
catch (TimeZoneNotFoundException)
|
||||
{
|
||||
Console.WriteLine("The registry does not define the Hawaiian Standard Time zone.");
|
||||
}
|
||||
catch (InvalidTimeZoneException)
|
||||
{
|
||||
Console.WriteLine("Registry data on the Hawaiian Standard Time zone has been corrupted.");
|
||||
}
|
||||
```
|
||||
|
||||
## Converting DateTimeOffset Values
|
||||
|
||||
Date and time values represented by [System.DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) objects are not fully time-zone aware because the object is disassociated from its time zone at the time it is instantiated. However, in many cases an application simply needs to convert a date and time based on two different offsets from UTC rather than on the time in particular time zones. To perform this conversion, you can call the current instance's `ToOffset` method. The method's single parameter is a `TimeSpan` value that represents the offset of the new date and time value that the method is to return.
|
||||
|
||||
For example, if the date and time of a user request for a Web page is known and is serialized as a string in the format `M/d/yyyy H:m:s zzz`, the following `ReturnTimeOnServer` method converts this date and time value to the date and time on the Web server.
|
||||
|
||||
```csharp
|
||||
public DateTimeOffset ReturnTimeOnServer(string clientString)
|
||||
{
|
||||
string format = @"M/d/yyyy H:m:s zzz";
|
||||
TimeSpan serverOffset = TimeZoneInfo.Local.GetUtcOffset(DateTimeOffset.Now);
|
||||
|
||||
try
|
||||
{
|
||||
DateTimeOffset clientTime = DateTimeOffset.ParseExact(clientString, format, CultureInfo.InvariantCulture);
|
||||
DateTimeOffset serverTime = clientTime.ToOffset(serverOffset);
|
||||
return serverTime;
|
||||
}
|
||||
catch (FormatException)
|
||||
{
|
||||
return DateTimeOffset.MinValue;
|
||||
}
|
||||
}
|
||||
```
|
||||
If the method is passed the string "9/1/2007 5:32:07 -05:00", which represents the date and time in a time zone five hours earlier than UTC, it returns 9/1/2007 3:32:07 AM -07:00 for a server located in the U.S. Pacific Standard Time zone.
|
||||
|
||||
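The arithmetic in that case can be checked without parsing or relying on the server's time zone. The sketch below hard-codes both offsets; `ToServerOffset` is a hypothetical wrapper around `ToOffset` added only for illustration.

```csharp
using System;

static class ToOffsetExample
{
    // Re-expresses the same instant with a different UTC offset.
    public static DateTimeOffset ToServerOffset(DateTimeOffset clientTime, TimeSpan serverOffset)
    {
        return clientTime.ToOffset(serverOffset);
    }

    static void Main()
    {
        DateTimeOffset clientTime = new DateTimeOffset(2007, 9, 1, 5, 32, 7, TimeSpan.FromHours(-5));
        DateTimeOffset serverTime = ToServerOffset(clientTime, TimeSpan.FromHours(-7));

        Console.WriteLine(serverTime.DateTime.TimeOfDay);                     // 03:32:07
        Console.WriteLine(serverTime.Offset);                                 // -07:00:00
        Console.WriteLine(serverTime.UtcDateTime == clientTime.UtcDateTime);  // True
    }
}
```

The clock time and offset change, but both values identify the same instant.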
The `TimeZoneInfo` class also includes an overloaded [TimeZoneInfo.ConvertTime(DateTimeOffset, TimeZoneInfo)](https://docs.microsoft.com/dotnet/core/api/System.TimeZoneInfo#System_TimeZoneInfo_ConvertTime_System_DateTimeOffset_System_TimeZoneInfo_) method that performs time zone conversions with `DateTimeOffset` values. The method's parameters are a `DateTimeOffset` value and a reference to the time zone to which the time is to be converted. The method call returns a `DateTimeOffset` value. For example, the `ReturnTimeOnServer` method in the previous example could be rewritten as follows to call the `ConvertTime(DateTimeOffset, TimeZoneInfo)` method.
|
||||
|
||||
```csharp
|
||||
public DateTimeOffset ReturnTimeOnServer(string clientString)
|
||||
{
|
||||
string format = @"M/d/yyyy H:m:s zzz";
|
||||
|
||||
try
|
||||
{
|
||||
DateTimeOffset clientTime = DateTimeOffset.ParseExact(clientString, format,
|
||||
CultureInfo.InvariantCulture);
|
||||
DateTimeOffset serverTime = TimeZoneInfo.ConvertTime(clientTime,
|
||||
TimeZoneInfo.Local);
|
||||
return serverTime;
|
||||
}
|
||||
catch (FormatException)
|
||||
{
|
||||
return DateTimeOffset.MinValue;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## See Also
|
||||
|
||||
[TimeZoneInfo](https://docs.microsoft.com/dotnet/core/api/System.TimeZoneInfo)
|
||||
|
||||
[Dates, Times, and Time Zones](index.md)
|
||||
|
||||
[Finding the Time Zones Defined on a Local System](finding-the-time-zones-on-local-system.md)
|
||||
|
||||
|
|
@ -1,39 +0,0 @@
|
|||
---
|
||||
title: How to: Enumerate Time Zones Present on a Computer
|
||||
description: How to: Enumerate Time Zones Present on a Computer
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 5bff08fc-dedd-4b11-a709-3682141af556
|
||||
---
|
||||
|
||||
# How to: Enumerate Time Zones Present on a Computer
|
||||
|
||||
Successfully working with a designated time zone requires that information about that time zone be available to the system. For example, the Windows operating system stores this information in the registry. However, although the total number of time zones that exist throughout the world is large, the registry contains information about only a subset of them. In addition, the registry itself is a dynamic structure whose contents are subject to both deliberate and accidental change. As a result, an application cannot always assume that a particular time zone is defined and available on a system. The first step for many applications that use time zone information is to determine whether required time zones are available on the local system, or to give the user a list of time zones from which to select. This requires that an application enumerate the time zones defined on a local system.
|
||||
|
||||
## To enumerate the time zones present on the local system
|
||||
|
||||
1. Call the `TimeZoneInfo.GetSystemTimeZones` method. The method returns a generic [ReadOnlyCollection<T>](https://docs.microsoft.com/dotnet/core/api/System.Collections.ObjectModel.ReadOnlyCollection%601) collection of `TimeZoneInfo` objects. The entries in the collection are sorted by their `DisplayName` property. For example:
|
||||
|
||||
```csharp
|
||||
ReadOnlyCollection<TimeZoneInfo> tzCollection;
|
||||
tzCollection = TimeZoneInfo.GetSystemTimeZones();
|
||||
```
|
||||
|
||||
2. Enumerate the individual `TimeZoneInfo` objects in the collection by using a `foreach` loop, and perform any necessary processing on each object. For example, the following code enumerates the `ReadOnlyCollection<T>` collection of `TimeZoneInfo` objects returned in step 1 and lists the display name of each time zone on the console.
|
||||
|
||||
```csharp
|
||||
foreach (TimeZoneInfo timeZone in tzCollection)
|
||||
Console.WriteLine(" {0}: {1}", timeZone.Id, timeZone.DisplayName);
|
||||
```
|
||||
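Put together, the two steps might look like the following complete program. This is a minimal sketch; the `GetZones` method is a hypothetical wrapper added so that the two steps are visibly separate.

```csharp
using System;
using System.Collections.ObjectModel;

static class EnumerateTimeZones
{
    // Step 1: retrieve the time zones known to the local system.
    public static ReadOnlyCollection<TimeZoneInfo> GetZones()
    {
        return TimeZoneInfo.GetSystemTimeZones();
    }

    static void Main()
    {
        // Step 2: list the identifier and display name of each zone.
        foreach (TimeZoneInfo timeZone in GetZones())
            Console.WriteLine("   {0}: {1}", timeZone.Id, timeZone.DisplayName);
    }
}
```

The identifiers and the number of zones listed vary from system to system.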
## See Also
|
||||
|
||||
[Dates, Times, and Time Zones](index.md)
|
||||
|
||||
[Finding the Time Zones Defined on a Local System](finding-the-time-zones-on-local-system.md)
|
||||
|
|
@ -1,39 +0,0 @@
|
|||
---
|
||||
title: Finding the Time Zones Defined on a Local System
|
||||
description: Finding the Time Zones Defined on a Local System
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: d081ed64-c390-4d2c-ac37-b0646c698eb7
|
||||
---
|
||||
|
||||
# Finding the Time Zones Defined on a Local System
|
||||
|
||||
The [System.TimeZoneInfo](https://docs.microsoft.com/dotnet/core/api/System.TimeZoneInfo) class does not expose a public constructor. As a result, the `new` keyword cannot be used to create a new `TimeZoneInfo` object. Instead, `TimeZoneInfo` objects are instantiated by retrieving information on predefined time zones from the operating system. This topic discusses instantiating a time zone from data stored by the operating system. In addition, static properties of the `TimeZoneInfo` class provide access to Coordinated Universal Time (UTC) and the local time zone.
|
||||
|
||||
## Accessing Individual Time Zones
|
||||
|
||||
The `TimeZoneInfo` class provides two predefined time zone objects that represent the UTC time and the local time zone. They are available from the `Utc` and `Local` properties, respectively. For instructions on accessing the UTC or local time zones, see [How to: Access the Predefined UTC and Local Time Zone Objects](access-utc-and-local.md).
|
||||
|
||||
You can also instantiate a `TimeZoneInfo` object that represents any time zone defined by the operating system. For instructions on instantiating a specific time zone object, see [How to: Instantiate a TimeZoneInfo Object](instantiate-time-zone-info.md).
|
||||
|
||||
## Time Zone Identifiers
|
||||
|
||||
The time zone identifier is a key field that uniquely identifies the time zone. While most keys are relatively short, the time zone identifier is comparatively long. In most cases, its value corresponds to the `TimeZoneInfo.StandardName` property, which is used to provide the name of the time zone's standard time. However, there are exceptions. The best way to make sure that you supply a valid identifier is to enumerate the time zones available on your system and note their associated identifiers. For instructions on enumerating time zones, see [How to: Enumerate Time Zones Present on a Computer](enumerate-time-zones.md).
|
||||
|
||||
## See Also
|
||||
|
||||
[Dates, Times, and Time Zones](index.md)
|
||||
|
||||
[How to: Access the Predefined UTC and Local Time Zone Objects](access-utc-and-local.md)
|
||||
|
||||
[How to: Instantiate a TimeZoneInfo Object](instantiate-time-zone-info.md)
|
||||
|
||||
[How to: Enumerate Time Zones Present on a Computer](enumerate-time-zones.md)
|
||||
|
||||
[Converting Times Between Time Zones](converting-between-time-zones.md)
|
|
@ -1,65 +0,0 @@
|
|||
---
|
||||
title: Dates, Times, and Time Zones
|
||||
description: Dates, Times, and Time Zones
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: f7c9a2db-ba7e-417a-bd47-6490bdfec7d2
|
||||
---
|
||||
|
||||
# Dates, Times, and Time Zones
|
||||
|
||||
In addition to the basic [System.DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) structure, .NET Core provides the following classes that support working with time zones:
|
||||
|
||||
* [System.TimeZoneInfo](https://docs.microsoft.com/dotnet/core/api/System.TimeZoneInfo)
|
||||
|
||||
Use this class to work with the system's local time zone and the Coordinated Universal Time (UTC) zone.
|
||||
|
||||
* [System.DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset)
|
||||
|
||||
Use this structure to work with dates and times whose offset (or difference) from UTC is known. The `DateTimeOffset` structure combines a date and time value with that time's offset from UTC. Because of its relationship to UTC, an individual date and time value unambiguously identifies a single point in time. This makes a `DateTimeOffset` value more portable from one computer to another than a `DateTime` value.
|
||||
|
||||
This section of the documentation provides the information that you need to work with time zones and to create time zone-aware applications that can convert dates and times from one time zone to another.
|
||||
|
||||
## In This Section
|
||||
|
||||
[Time Zone Overview](time-zone-overview.md) - Discusses the terminology, concepts, and issues involved in creating time zone-aware applications.
|
||||
|
||||
[Choosing Between DateTime, DateTimeOffset, TimeSpan, and TimeZoneInfo](choosing-between-datetime.md) - Discusses when to use the `DateTime`, `DateTimeOffset`, and `TimeZoneInfo` types when working with date and time data.
|
||||
|
||||
[Finding the Time Zones Defined on a Local System](finding-the-time-zones-on-local-system.md) - Describes how to enumerate the time zones found on a local system.
|
||||
|
||||
[Instantiating a DateTimeOffset Object](instantiating-a-datetimeoffset-object.md) - Discusses the ways in which a `DateTimeOffset` object can be instantiated, and the ways in which a `DateTime` value can be converted to a `DateTimeOffset` value.
|
||||
|
||||
[Performing Arithmetic Operations with Dates and Times](performing-arithmetic-operations.md) - Discusses the issues involved in adding, subtracting, and comparing `DateTime` and `DateTimeOffset` values.
|
||||
|
||||
[Converting Between DateTime and DateTimeOffset](converting-between-datetime-and-offset.md) - Describes how to convert between `DateTime` and `DateTimeOffset` values.
|
||||
|
||||
[Converting Times Between Time Zones](converting-between-time-zones.md) - Describes how to convert times from one time zone to another.
|
||||
|
||||
## Related Topics
|
||||
|
||||
[How to: Enumerate Time Zones Present on a Computer](enumerate-time-zones.md) - Provides examples that enumerate the time zones defined in a computer's registry and that let users select a predefined time zone from a list.
|
||||
|
||||
[How to: Access the Predefined UTC and Local Time Zone Objects](access-utc-and-local.md) - Describes how to access Coordinated Universal Time and the local time zone.
|
||||
|
||||
[How to: Instantiate a TimeZoneInfo Object](instantiate-time-zone-info.md) - Describes how to instantiate a `TimeZoneInfo` object from the local system registry.
|
||||
|
||||
[How to: Use Time Zones in Date and Time Arithmetic](use-time-zones-in-arithmetic.md) - Discusses how to perform date and time arithmetic that reflects a time zone's adjustment rules.
|
||||
|
||||
[How to: Resolve Ambiguous Times](resolve-ambiguous-times.md) - Describes how to resolve an ambiguous time by mapping it to the time zone's standard time.
|
||||
|
||||
[How to: Let Users Resolve Ambiguous Times](let-users-resolve-ambiguous-times.md) - Describes how to let a user determine the mapping between an ambiguous local time and Coordinated Universal Time.
|
||||
|
||||
## Reference
|
||||
|
||||
[System.TimeZoneInfo](https://docs.microsoft.com/dotnet/core/api/System.TimeZoneInfo)
|
||||
|
||||
[System.DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset)
|
||||
|
||||
[System.DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime)
|
|
@ -1,31 +0,0 @@
|
|||
---
|
||||
title: How to: Instantiate a TimeZoneInfo Object
|
||||
description: How to: Instantiate a TimeZoneInfo Object
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 08fe7f7b-774f-427b-8d4b-2635fed58166
|
||||
---
|
||||
|
||||
# How to: Instantiate a TimeZoneInfo Object
|
||||
|
||||
The most common way to instantiate a `TimeZoneInfo` object is to retrieve information about it from the operating system. This topic discusses how to instantiate a `TimeZoneInfo` object from the local system.
|
||||
|
||||
## To instantiate a TimeZoneInfo Object
|
||||
|
||||
1. Declare a `TimeZoneInfo` object.
|
||||
|
||||
2. Call the static `TimeZoneInfo.FindSystemTimeZoneById` method.
|
||||
|
||||
3. Handle any exceptions thrown by the method, particularly the `System.TimeZoneNotFoundException` that is thrown if the time zone is not defined on the local system (on Windows, in the registry).
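
The steps above can be sketched as follows. This is a minimal illustration rather than a definitive implementation: the `TimeZoneLookup` class and its `TryFindTimeZone` helper are hypothetical names introduced here, and the identifier `Central Standard Time` is only an assumed sample value. The identifiers that are available depend on the operating system (Windows uses names such as `Central Standard Time`, while Linux and macOS typically use IANA names such as `America/Chicago`).

```csharp
using System;

public static class TimeZoneLookup
{
    // Hypothetical helper: returns true and the time zone if the identifier
    // is defined on the local system; otherwise returns false.
    public static bool TryFindTimeZone(string id, out TimeZoneInfo zone)
    {
        try
        {
            // Step 2: call the static FindSystemTimeZoneById method.
            zone = TimeZoneInfo.FindSystemTimeZoneById(id);
            return true;
        }
        catch (TimeZoneNotFoundException)
        {
            // Step 3: the identifier is not defined on this system.
            zone = null;
            return false;
        }
        catch (InvalidTimeZoneException)
        {
            // Step 3: the identifier was found, but its data is corrupted.
            zone = null;
            return false;
        }
    }

    public static void Main()
    {
        // Step 1: declare a TimeZoneInfo object.
        TimeZoneInfo zone;

        // Assumed sample identifier; substitute one defined on your system.
        if (TryFindTimeZone("Central Standard Time", out zone))
            Console.WriteLine("Found time zone: {0}", zone.DisplayName);
        else
            Console.WriteLine("The time zone is not defined on this system.");
    }
}
```

Wrapping the lookup in a `Try...` style helper keeps the exception handling in one place, so calling code can branch on a Boolean result instead of repeating the `try`/`catch` block.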
|
||||
|
||||
## See Also
|
||||
|
||||
[Dates, Times, and Time Zones](index.md)
|
||||
|
||||
[Finding the Time Zones Defined on a Local System](finding-the-time-zones-on-local-system.md)
|
|
@ -1,270 +0,0 @@
|
|||
---
|
||||
title: Instantiating a DateTimeOffset Object
|
||||
description: Instantiating a DateTimeOffset Object
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: bac8c875-da82-43f0-b92f-588077191e3c
|
||||
---
|
||||
|
||||
# Instantiating a DateTimeOffset Object
|
||||
|
||||
The [System.DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) structure offers a number of ways to create new `DateTimeOffset` values. Many of them correspond directly to the methods available for instantiating new [System.DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) values, with enhancements that allow you to specify the date and time value's offset from Coordinated Universal Time (UTC). In particular, you can instantiate a `DateTimeOffset` value in the following ways:
|
||||
|
||||
* By calling a `DateTimeOffset` constructor.
|
||||
|
||||
* By implicitly converting a `DateTime` value to a `DateTimeOffset` value.
|
||||
|
||||
* By parsing the string representation of a date and time.
|
||||
|
||||
## DateTimeOffset Constructors
|
||||
|
||||
The [System.DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) type defines five constructors. Three of them correspond directly to `DateTime` constructors, with an additional parameter of type [System.TimeSpan](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan) that defines the date and time's offset from UTC. These allow you to define a `DateTimeOffset` value based on the value of its individual date and time components. For example, the following code uses these three constructors to instantiate `DateTimeOffset` objects that all display as 5/1/2008 8:06:32 AM +01:00 (the second call also specifies a milliseconds component).
|
||||
|
||||
```csharp
|
||||
DateTimeOffset dateAndTime;
|
||||
|
||||
// Instantiate date and time using years, months, days,
|
||||
// hours, minutes, and seconds
|
||||
dateAndTime = new DateTimeOffset(2008, 5, 1, 8, 6, 32,
|
||||
new TimeSpan(1, 0, 0));
|
||||
Console.WriteLine(dateAndTime);
|
||||
// Instantiate date and time using years, months, days,
|
||||
// hours, minutes, seconds, and milliseconds
|
||||
dateAndTime = new DateTimeOffset(2008, 5, 1, 8, 6, 32, 545,
|
||||
new TimeSpan(1, 0, 0));
|
||||
Console.WriteLine("{0} {1}", dateAndTime.ToString("G"),
|
||||
dateAndTime.ToString("zzz"));
|
||||
|
||||
// Instantiate date and time using number of ticks
|
||||
// 05/01/2008 8:06:32 AM is 633,452,259,920,000,000 ticks
|
||||
dateAndTime = new DateTimeOffset(633452259920000000, new TimeSpan(1, 0, 0));
|
||||
Console.WriteLine(dateAndTime);
|
||||
// The example displays the following output to the console:
|
||||
// 5/1/2008 8:06:32 AM +01:00
|
||||
// 5/1/2008 8:06:32 AM +01:00
|
||||
// 5/1/2008 8:06:32 AM +01:00
|
||||
```
|
||||
|
||||
The other two constructors create a `DateTimeOffset` object from a `DateTime` value. The first of these has a single parameter, the `DateTime` value to convert to a `DateTimeOffset` value. The offset of the resulting `DateTimeOffset` value depends on the `Kind` property of the constructor's single `DateTime` parameter. If its value is [DateTimeKind.Utc](https://docs.microsoft.com/dotnet/core/api/System.DateTimeKind#System_DateTimeKind_Utc), the offset is set equal to [TimeSpan.Zero](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan#System_TimeSpan_Zero). Otherwise, its offset is set equal to that of the local time zone. The following example illustrates the use of this constructor to instantiate `DateTimeOffset` objects representing UTC and the local time zone:
|
||||
|
||||
```csharp
|
||||
// Declare date; Kind property is DateTimeKind.Unspecified
|
||||
DateTime sourceDate = new DateTime(2008, 5, 1, 8, 30, 0);
|
||||
DateTimeOffset targetTime;
|
||||
|
||||
// Instantiate a DateTimeOffset value from a UTC time
|
||||
DateTime utcTime = DateTime.SpecifyKind(sourceDate, DateTimeKind.Utc);
|
||||
targetTime = new DateTimeOffset(utcTime);
|
||||
Console.WriteLine(targetTime);
|
||||
// Displays 5/1/2008 8:30:00 AM +00:00
|
||||
// Because the Kind property is DateTimeKind.Utc,
|
||||
// the offset is TimeSpan.Zero.
|
||||
|
||||
// Instantiate a DateTimeOffset value from a UTC time with a zero offset
|
||||
targetTime = new DateTimeOffset(utcTime, TimeSpan.Zero);
|
||||
Console.WriteLine(targetTime);
|
||||
// Displays 5/1/2008 8:30:00 AM +00:00
|
||||
// Because the Kind property is DateTimeKind.Utc,
|
||||
// the call to the constructor succeeds
|
||||
|
||||
// Instantiate a DateTimeOffset value from a UTC time with a negative offset
|
||||
try
|
||||
{
|
||||
targetTime = new DateTimeOffset(utcTime, new TimeSpan(-2, 0, 0));
|
||||
Console.WriteLine(targetTime);
|
||||
}
|
||||
catch (ArgumentException)
|
||||
{
|
||||
Console.WriteLine("Attempt to create DateTimeOffset value from {0} failed.",
|
||||
targetTime);
|
||||
}
|
||||
// Throws exception and displays the following to the console:
|
||||
// Attempt to create DateTimeOffset value from 5/1/2008 8:30:00 AM +00:00 failed.
|
||||
|
||||
// Instantiate a DateTimeOffset value from a local time
|
||||
DateTime localTime = DateTime.SpecifyKind(sourceDate, DateTimeKind.Local);
|
||||
targetTime = new DateTimeOffset(localTime);
|
||||
Console.WriteLine(targetTime);
|
||||
// Displays 5/1/2008 8:30:00 AM -07:00
|
||||
// Because the Kind property is DateTimeKind.Local,
|
||||
// the offset is that of the local time zone.
|
||||
|
||||
// Instantiate a DateTimeOffset value from an unspecified time
|
||||
targetTime = new DateTimeOffset(sourceDate);
|
||||
Console.WriteLine(targetTime);
|
||||
// Displays 5/1/2008 8:30:00 AM -07:00
|
||||
// Because the Kind property is DateTimeKind.Unspecified,
|
||||
// the offset is that of the local time zone.
|
||||
```
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> Calling the overload of the `DateTimeOffset` constructor that has a single `DateTime` parameter is equivalent to performing an implicit conversion of a `DateTime` value to a `DateTimeOffset` value.
|
||||
|
||||
The second constructor that creates a `DateTimeOffset` object from a `DateTime` value has two parameters: the `DateTime` value to convert, and a `TimeSpan` value representing the date and time's offset from UTC. This offset value must correspond to the `Kind` property of the constructor's first parameter or an [System.ArgumentException](https://docs.microsoft.com/dotnet/core/api/System.ArgumentException) is thrown. If the `Kind` property of the first parameter is [DateTimeKind.Utc](https://docs.microsoft.com/dotnet/core/api/System.DateTimeKind#System_DateTimeKind_Utc), the value of the second parameter must be `TimeSpan.Zero`. If the `Kind` property of the first parameter is [DateTimeKind.Local](https://docs.microsoft.com/dotnet/core/api/System.DateTimeKind#System_DateTimeKind_Local), the value of the second parameter must be the offset of the local system's time zone. If the `Kind` property of the first parameter is [DateTimeKind.Unspecified](https://docs.microsoft.com/dotnet/core/api/System.DateTimeKind#System_DateTimeKind_Unspecified), the offset can be any valid value. The following code illustrates calls to this constructor to convert `DateTime` to `DateTimeOffset` values.
|
||||
|
||||
```csharp
|
||||
DateTime sourceDate = new DateTime(2008, 5, 1, 8, 30, 0);
|
||||
DateTimeOffset targetTime;
|
||||
|
||||
// Instantiate a DateTimeOffset value from a UTC time with a zero offset.
|
||||
DateTime utcTime = DateTime.SpecifyKind(sourceDate, DateTimeKind.Utc);
|
||||
targetTime = new DateTimeOffset(utcTime, TimeSpan.Zero);
|
||||
Console.WriteLine(targetTime);
|
||||
// Displays 5/1/2008 8:30:00 AM +00:00
|
||||
// Because the Kind property is DateTimeKind.Utc,
|
||||
// the call to the constructor succeeds
|
||||
|
||||
// Instantiate a DateTimeOffset value from a UTC time with a non-zero offset.
|
||||
try
|
||||
{
|
||||
targetTime = new DateTimeOffset(utcTime, new TimeSpan(-2, 0, 0));
|
||||
Console.WriteLine(targetTime);
|
||||
}
|
||||
catch (ArgumentException)
|
||||
{
|
||||
Console.WriteLine("Attempt to create DateTimeOffset value from {0} failed.",
|
||||
utcTime);
|
||||
}
|
||||
// Throws exception and displays the following to the console:
|
||||
// Attempt to create DateTimeOffset value from 5/1/2008 8:30:00 AM failed.
|
||||
|
||||
// Instantiate a DateTimeOffset value from a local time with
|
||||
// the offset of the local time zone
|
||||
DateTime localTime = DateTime.SpecifyKind(sourceDate, DateTimeKind.Local);
|
||||
targetTime = new DateTimeOffset(localTime,
|
||||
TimeZoneInfo.Local.GetUtcOffset(localTime));
|
||||
Console.WriteLine(targetTime);
|
||||
// Displays 5/1/2008 8:30:00 AM -07:00
|
||||
// Because the Kind property is DateTimeKind.Local and the offset matches
|
||||
// that of the local time zone, the call to the constructor succeeds.
|
||||
|
||||
// Instantiate a DateTimeOffset value from a local time with a zero offset.
|
||||
try
|
||||
{
|
||||
targetTime = new DateTimeOffset(localTime, TimeSpan.Zero);
|
||||
Console.WriteLine(targetTime);
|
||||
}
|
||||
catch (ArgumentException)
|
||||
{
|
||||
Console.WriteLine("Attempt to create DateTimeOffset value from {0} failed.",
|
||||
localTime);
|
||||
}
|
||||
// Throws exception and displays the following to the console:
|
||||
// Attempt to create DateTimeOffset value from 5/1/2008 8:30:00 AM failed.
|
||||
|
||||
// Instantiate a DateTimeOffset value with an arbitrary time zone.
|
||||
string timeZoneName = "Central Standard Time";
|
||||
TimeSpan offset = TimeZoneInfo.FindSystemTimeZoneById(timeZoneName).
|
||||
GetUtcOffset(sourceDate);
|
||||
targetTime = new DateTimeOffset(sourceDate, offset);
|
||||
Console.WriteLine(targetTime);
|
||||
// Displays 5/1/2008 8:30:00 AM -05:00
|
||||
```
|
||||
|
||||
## Implicit Type Conversion
|
||||
|
||||
The [System.DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) type supports one implicit type conversion: from a [System.DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) value to a `DateTimeOffset` value. (An implicit type conversion is a conversion from one type to another that does not require an explicit cast and that does not lose information.) It makes code like the following possible.
|
||||
|
||||
```csharp
|
||||
DateTimeOffset targetTime;
|
||||
|
||||
// The Kind property of sourceDate is DateTimeKind.Unspecified
|
||||
DateTime sourceDate = new DateTime(2008, 5, 1, 8, 30, 0);
|
||||
targetTime = sourceDate;
|
||||
Console.WriteLine(targetTime);
|
||||
// Displays 5/1/2008 8:30:00 AM -07:00
|
||||
|
||||
// define a UTC time (Kind property is DateTimeKind.Utc)
|
||||
DateTime utcTime = DateTime.SpecifyKind(sourceDate, DateTimeKind.Utc);
|
||||
targetTime = utcTime;
|
||||
Console.WriteLine(targetTime);
|
||||
// Displays 5/1/2008 8:30:00 AM +00:00
|
||||
|
||||
// Define a local time (Kind property is DateTimeKind.Local)
|
||||
DateTime localTime = DateTime.SpecifyKind(sourceDate, DateTimeKind.Local);
|
||||
targetTime = localTime;
|
||||
Console.WriteLine(targetTime);
|
||||
// Displays 5/1/2008 8:30:00 AM -07:00
|
||||
```
|
||||
|
||||
The offset of the resulting `DateTimeOffset` value depends on the `DateTime.Kind` property value. If its value is [DateTimeKind.Utc](https://docs.microsoft.com/dotnet/core/api/System.DateTimeKind#System_DateTimeKind_Utc), the offset is set equal to [TimeSpan.Zero](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan#System_TimeSpan_Zero). If its value is either [DateTimeKind.Local](https://docs.microsoft.com/dotnet/core/api/System.DateTimeKind#System_DateTimeKind_Local) or [DateTimeKind.Unspecified](https://docs.microsoft.com/dotnet/core/api/System.DateTimeKind#System_DateTimeKind_Unspecified), the offset is set equal to that of the local time zone.
|
||||
|
||||
## Parsing the String Representation of a Date and Time
|
||||
|
||||
The [System.DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) type supports four methods that allow you to convert the string representation of a date and time into a `DateTimeOffset` value:
|
||||
|
||||
* [Parse](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#System_DateTimeOffset_Parse_System_String_), which tries to convert the string representation of a date and time to a `DateTimeOffset` value and throws an exception if the conversion fails.
|
||||
|
||||
* [TryParse](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#methods), which tries to convert the string representation of a date and time to a `DateTimeOffset` value and returns `false` if the conversion fails.
|
||||
|
||||
* [ParseExact](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#System_DateTimeOffset_ParseExact_System_String_System_String_System_IFormatProvider_), which tries to convert the string representation of a date and time in a specified format to a `DateTimeOffset` value. The method throws an exception if the conversion fails.
|
||||
|
||||
* [TryParseExact](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#methods), which tries to convert the string representation of a date and time in a specified format to a `DateTimeOffset` value. The method returns `false` if the conversion fails.
|
||||
|
||||
The following example illustrates calls to each of these four string conversion methods to instantiate a `DateTimeOffset` value.
|
||||
|
||||
```csharp
|
||||
string timeString;
|
||||
DateTimeOffset targetTime;
|
||||
|
||||
timeString = "05/01/2008 8:30 AM +01:00";
|
||||
try
|
||||
{
|
||||
targetTime = DateTimeOffset.Parse(timeString);
|
||||
Console.WriteLine(targetTime);
|
||||
}
|
||||
catch (FormatException)
|
||||
{
|
||||
Console.WriteLine("Unable to parse {0}.", timeString);
|
||||
}
|
||||
|
||||
timeString = "05/01/2008 8:30 AM";
|
||||
if (DateTimeOffset.TryParse(timeString, out targetTime))
|
||||
Console.WriteLine(targetTime);
|
||||
else
|
||||
Console.WriteLine("Unable to parse {0}.", timeString);
|
||||
|
||||
timeString = "Thursday, 01 May 2008 08:30";
|
||||
try
|
||||
{
|
||||
targetTime = DateTimeOffset.ParseExact(timeString, "f",
|
||||
CultureInfo.InvariantCulture);
|
||||
Console.WriteLine(targetTime);
|
||||
}
|
||||
catch (FormatException)
|
||||
{
|
||||
Console.WriteLine("Unable to parse {0}.", timeString);
|
||||
}
|
||||
|
||||
timeString = "Thursday, 01 May 2008 08:30 +02:00";
|
||||
string formatString;
|
||||
formatString = CultureInfo.InvariantCulture.DateTimeFormat.LongDatePattern +
|
||||
" " +
|
||||
CultureInfo.InvariantCulture.DateTimeFormat.ShortTimePattern +
|
||||
" zzz";
|
||||
if (DateTimeOffset.TryParseExact(timeString,
|
||||
formatString,
|
||||
CultureInfo.InvariantCulture,
|
||||
DateTimeStyles.AllowLeadingWhite,
|
||||
out targetTime))
|
||||
Console.WriteLine(targetTime);
|
||||
else
|
||||
Console.WriteLine("Unable to parse {0}.", timeString);
|
||||
// The example displays the following output to the console:
|
||||
// 5/1/2008 8:30:00 AM +01:00
|
||||
// 5/1/2008 8:30:00 AM -07:00
|
||||
// 5/1/2008 8:30:00 AM -07:00
|
||||
// 5/1/2008 8:30:00 AM +02:00
|
||||
```
|
||||
|
||||
## See Also
|
||||
|
||||
[Dates, Times, and Time Zones](index.md)
|
||||
|
|
@ -1,105 +0,0 @@
|
|||
---
|
||||
title: How to: Let Users Resolve Ambiguous Times
|
||||
description: How to: Let Users Resolve Ambiguous Times
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: a9425349-8887-4a0d-bc7c-5286d4e134e5
|
||||
---
|
||||
|
||||
# How to: Let Users Resolve Ambiguous Times
|
||||
|
||||
An ambiguous time is a time that maps to more than one Coordinated Universal Time (UTC). It occurs when the clock time is adjusted back in time, such as during the transition from a time zone's daylight saving time to its standard time. When handling an ambiguous time, you can do one of the following:
|
||||
|
||||
* If the ambiguous time is an item of data entered by the user, you can leave it to the user to resolve the ambiguity.
|
||||
|
||||
* Make an assumption about how the time maps to UTC. For example, you can assume that an ambiguous time is always expressed in the time zone's standard time.
|
||||
|
||||
This article shows how to let a user resolve an ambiguous time.
|
||||
|
||||
## To let a user resolve an ambiguous time
|
||||
|
||||
1. Get the date and time input by the user.
|
||||
|
||||
2. Call the [IsAmbiguousTime(DateTime)](https://docs.microsoft.com/dotnet/core/api/System.TimeZoneInfo#System_TimeZoneInfo_IsAmbiguousTime_System_DateTime_) or [IsAmbiguousTime(DateTimeOffset)](https://docs.microsoft.com/dotnet/core/api/System.TimeZoneInfo#System_TimeZoneInfo_IsAmbiguousTime_System_DateTimeOffset_) method to determine whether the time is ambiguous.
|
||||
|
||||
3. Let the user select the desired offset.
|
||||
|
||||
4. Get the UTC date and time by subtracting the offset selected by the user from the local time.
|
||||
|
||||
5. Call the static `SpecifyKind` method to set the UTC date and time value's `Kind` property to `DateTimeKind.Utc`.
|
||||
|
||||
## Example
|
||||
|
||||
The following example prompts the user to enter a date and time and, if it is ambiguous, lets the user select the UTC time that the ambiguous time maps to. The example uses a `DateTime` object; you can substitute a `DateTimeOffset` object if desired.
|
||||
|
||||
```csharp
|
||||
private void GetUserDateInput()
|
||||
{
|
||||
// Get date and time from user
|
||||
DateTime inputDate = GetUserDateTime();
|
||||
DateTime utcDate;
|
||||
|
||||
// Exit if date has no significant value
|
||||
if (inputDate == DateTime.MinValue) return;
|
||||
|
||||
if (TimeZoneInfo.Local.IsAmbiguousTime(inputDate))
|
||||
{
|
||||
Console.WriteLine("The date you've entered is ambiguous.");
|
||||
Console.WriteLine("Please select the correct offset from Coordinated Universal Time:");
|
||||
TimeSpan[] offsets = TimeZoneInfo.Local.GetAmbiguousTimeOffsets(inputDate);
|
||||
for (int ctr = 0; ctr < offsets.Length; ctr++)
|
||||
{
|
||||
Console.WriteLine("{0}.) {1} hours, {2} minutes", ctr, offsets[ctr].Hours, offsets[ctr].Minutes);
|
||||
}
|
||||
Console.Write("> ");
|
||||
int selection = Convert.ToInt32(Console.ReadLine());
|
||||
|
||||
// Convert local time to UTC, and set Kind property to DateTimeKind.Utc
|
||||
utcDate = DateTime.SpecifyKind(inputDate - offsets[selection], DateTimeKind.Utc);
|
||||
|
||||
Console.WriteLine("{0} local time corresponds to {1} {2}.", inputDate, utcDate, utcDate.Kind.ToString());
|
||||
}
|
||||
else
|
||||
{
|
||||
utcDate = inputDate.ToUniversalTime();
|
||||
Console.WriteLine("{0} local time corresponds to {1} {2}.", inputDate, utcDate, utcDate.Kind.ToString());
|
||||
}
|
||||
}
|
||||
|
||||
private DateTime GetUserDateTime()
|
||||
{
|
||||
bool exitFlag = false; // flag to exit loop if date is valid
|
||||
string dateString;
|
||||
DateTime inputDate = DateTime.MinValue;
|
||||
|
||||
Console.Write("Enter a local date and time: ");
|
||||
while (! exitFlag)
|
||||
{
|
||||
dateString = Console.ReadLine();
|
||||
if (dateString.ToUpper() == "E")
|
||||
exitFlag = true;
|
||||
|
||||
else if (DateTime.TryParse(dateString, out inputDate))
|
||||
exitFlag = true;
|
||||
else
|
||||
Console.Write("Enter a valid date and time, or enter 'e' to exit: ");
|
||||
}
|
||||
|
||||
return inputDate;
|
||||
}
|
||||
```
|
||||
|
||||
The core of the example code uses an array of `TimeSpan` objects to indicate the possible offsets of the ambiguous time from UTC. However, these offsets are unlikely to be meaningful to the user on their own. To clarify their meaning, the code can also note whether an offset represents the local time zone's standard time or its daylight saving time. To determine which is which, compare the offset with the value of the `BaseUtcOffset` property, which indicates the difference between UTC and the time zone's standard time.
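
The example above prints only the raw offsets. The following is a minimal sketch of how each offset could be labeled by comparing it with `BaseUtcOffset`. The `OffsetLabels` class and its `Describe` helper are hypothetical names introduced for illustration, and the sample date is an assumption: it is ambiguous in U.S. time zones that observe daylight saving time, but may not be ambiguous in your local time zone.

```csharp
using System;

public static class OffsetLabels
{
    // Hypothetical helper: labels an offset as standard or daylight saving
    // time by comparing it with the time zone's BaseUtcOffset.
    public static string Describe(TimeZoneInfo zone, TimeSpan offset)
    {
        return offset.Equals(zone.BaseUtcOffset)
            ? "standard time"
            : "daylight saving time";
    }

    public static void Main()
    {
        TimeZoneInfo local = TimeZoneInfo.Local;

        // Assumed sample input; substitute a time that is ambiguous in your zone.
        DateTime inputDate = new DateTime(2008, 11, 2, 1, 30, 0);

        if (local.IsAmbiguousTime(inputDate))
        {
            // List each possible offset together with its label.
            TimeSpan[] offsets = local.GetAmbiguousTimeOffsets(inputDate);
            for (int ctr = 0; ctr < offsets.Length; ctr++)
            {
                Console.WriteLine("{0}.) {1} hours, {2} minutes ({3})",
                                  ctr, offsets[ctr].Hours, offsets[ctr].Minutes,
                                  Describe(local, offsets[ctr]));
            }
        }
        else
        {
            Console.WriteLine("{0} is not ambiguous in the local time zone.", inputDate);
        }
    }
}
```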
|
||||
|
||||
## See Also
|
||||
|
||||
[Dates, Times, and Time Zones](index.md)
|
||||
|
||||
[How to: Resolve Ambiguous Times](resolve-ambiguous-times.md)
|
||||
|
|
@ -1,200 +0,0 @@
|
|||
---
|
||||
title: Performing Arithmetic Operations with Dates and Times
|
||||
description: Performing Arithmetic Operations with Dates and Times
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 72beac2f-5296-4727-a176-b2fcc74a0725
|
||||
---
|
||||
|
||||
# Performing Arithmetic Operations with Dates and Times
|
||||
|
||||
Although both the [System.DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) and the [System.DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) structures provide members that perform arithmetic operations on their values, the results of arithmetic operations are very different. This article examines those differences, relates them to degrees of time zone awareness in date and time data, and discusses how to perform fully time zone aware operations using date and time data.
|
||||
|
||||
## Comparisons and Arithmetic Operations with DateTime Values
|
||||
|
||||
[System.DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) values possess a limited degree of time zone awareness. The [DateTime.Kind](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_Kind) property allows a [System.DateTimeKind](https://docs.microsoft.com/dotnet/core/api/System.DateTimeKind) value to be assigned to the date and time to indicate whether it represents local time, Coordinated Universal Time (UTC), or the time in an unspecified time zone. However, this limited time zone information is ignored when comparing or performing date and time arithmetic on `DateTime` values. The following example, which compares the current local time with the current UTC time, illustrates this.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
|
||||
public enum TimeComparison
|
||||
{
|
||||
EarlierThan = -1,
|
||||
TheSameAs = 0,
|
||||
LaterThan = 1
|
||||
}
|
||||
|
||||
public class DateManipulation
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
DateTime localTime = DateTime.Now;
|
||||
DateTime utcTime = DateTime.UtcNow;
|
||||
|
||||
Console.WriteLine("Difference between {0} and {1} time: {2}:{3} hours",
|
||||
localTime.Kind.ToString(),
|
||||
utcTime.Kind.ToString(),
|
||||
(localTime - utcTime).Hours,
|
||||
(localTime - utcTime).Minutes);
|
||||
Console.WriteLine("The {0} time is {1} the {2} time.",
|
||||
localTime.Kind.ToString(),
|
||||
Enum.GetName(typeof(TimeComparison), localTime.CompareTo(utcTime)),
|
||||
utcTime.Kind.ToString());
|
||||
}
|
||||
}
|
||||
// If run in the U.S. Pacific Standard Time zone, the example displays
|
||||
// the following output to the console:
|
||||
// Difference between Local and Utc time: -7:0 hours
|
||||
// The Local time is EarlierThan the Utc time.
|
||||
```
|
||||
|
||||
The [DateTime.CompareTo(DateTime)](https://docs.microsoft.com/dotnet/core/api/System.DateTime#System_DateTime_CompareTo_System_DateTime_) method reports that the local time is earlier than (or less than) the UTC time, and the subtraction operation indicates that the difference between UTC and the local time for a system in the U.S. Pacific Standard Time zone is seven hours. But because these two values provide different representations of a single point in time, it is clear in this case that this time interval is completely attributable to the local time zone's offset from UTC.
|
||||
|
||||
More generally, the `DateTime.Kind` property does not affect the results returned by `DateTime` comparison and arithmetic methods (as the comparison of two identical points in time indicates), although it can affect the interpretation of those results. For example:
|
||||
|
||||
* The result of any arithmetic operation performed on two date and time values whose `DateTime.Kind` properties both equal [DateTimeKind.Utc](https://docs.microsoft.com/dotnet/core/api/System.DateTimeKind#System_DateTimeKind_Utc) reflects the actual time interval between the two values. Similarly, the comparison of two such date and time values accurately reflects the relationship between times.
|
||||
|
||||
* The result of any arithmetic or comparison operation performed on two date and time values whose `DateTime.Kind` properties both equal [DateTimeKind.Local](https://docs.microsoft.com/dotnet/core/api/System.DateTimeKind#System_DateTimeKind_Local) or on two date and time values with different `DateTime.Kind` property values reflects the difference in clock time between the two values.
|
||||
|
||||
* Arithmetic or comparison operations on local date and time values do not consider whether a particular value is ambiguous or invalid, nor do they take account of the effect of any adjustment rules that result from the local time zone's transition to or from daylight saving time.
|
||||
|
||||
* Any operation that compares or calculates the difference between UTC and a local time includes a time interval equal to the local time zone's offset from UTC in the result.
|
||||
|
||||
* Any operation that compares or calculates the difference between an unspecified time and either UTC or the local time reflects simple clock time. Time zone differences are not considered, and the result does not reflect the application of time zone adjustment rules.
|
||||
|
||||
* Any operation that compares or calculates the difference between two unspecified times may include an unknown interval that reflects the difference between the time in two different time zones.
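
The points above can be illustrated with a short sketch, offered under stated assumptions rather than as a definitive implementation. The `KindAwareDifference` class and its two helpers are hypothetical names. The first helper subtracts two `DateTime` values as-is, so the `Kind` property is ignored; the second converts both values to UTC first. Note that `ToUniversalTime` treats a value whose `Kind` is `DateTimeKind.Unspecified` as a local time.

```csharp
using System;

public static class KindAwareDifference
{
    // Subtracts the values as-is. The Kind property is ignored, so the
    // result reflects the difference in clock time only.
    public static TimeSpan ClockDifference(DateTime first, DateTime second)
    {
        return first - second;
    }

    // Hypothetical helper: converts both values to UTC before subtracting,
    // so the result reflects the elapsed time between the two instants.
    // Values whose Kind is Unspecified are assumed to be local times.
    public static TimeSpan ElapsedDifference(DateTime first, DateTime second)
    {
        return first.ToUniversalTime() - second.ToUniversalTime();
    }

    public static void Main()
    {
        // Two representations of the same point in time.
        DateTime utcTime = new DateTime(2008, 5, 1, 8, 30, 0, DateTimeKind.Utc);
        DateTime localTime = utcTime.ToLocalTime();

        // Equals the local time zone's offset from UTC at that instant.
        Console.WriteLine("Clock difference: {0}",
                          ClockDifference(localTime, utcTime));

        // Zero, because both values refer to the same instant.
        Console.WriteLine("Elapsed difference: {0}",
                          ElapsedDifference(localTime, utcTime));
    }
}
```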
|
||||
|
||||
There are many scenarios in which time zone differences do not affect date and time calculations or in which the context of the date and time data defines the meaning of comparison or arithmetic operations. For a discussion of some of these, see [Choosing Between DateTime, DateTimeOffset, TimeSpan, and TimeZoneInfo](choosing-between-datetime.md).
|
||||
|
||||
## Comparisons and Arithmetic Operations with DateTimeOffset Values
|
||||
|
||||
A [System.DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) value includes not only a date and time, but also an offset that unambiguously defines that date and time relative to UTC. This makes it possible to define equality somewhat differently than for [System.DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) values. Whereas `DateTime` values are equal if they have the same date and time value, `DateTimeOffset` values are equal if they both refer to the same point in time. This makes a `DateTimeOffset` value more accurate and less in need of interpretation when used in comparisons and in most arithmetic operations that determine the interval between two dates and times. The following example, which is the `DateTimeOffset` equivalent to the previous example that compared local and UTC `DateTime` values, illustrates this difference in behavior.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
|
||||
public enum TimeComparison
|
||||
{
|
||||
EarlierThan = -1,
|
||||
TheSameAs = 0,
|
||||
LaterThan = 1
|
||||
}
|
||||
|
||||
public class DateTimeOffsetManipulation
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
DateTimeOffset localTime = DateTimeOffset.Now;
|
||||
DateTimeOffset utcTime = DateTimeOffset.UtcNow;
|
||||
|
||||
Console.WriteLine("Difference between local time and UTC: {0}:{1:D2} hours",
|
||||
(localTime - utcTime).Hours,
|
||||
(localTime - utcTime).Minutes);
|
||||
Console.WriteLine("The local time is {0} UTC.",
|
||||
Enum.GetName(typeof(TimeComparison), localTime.CompareTo(utcTime)));
|
||||
}
|
||||
}
|
||||
// Regardless of the local time zone, the example displays
|
||||
// the following output to the console:
|
||||
// Difference between local time and UTC: 0:00 hours
|
||||
// The local time is TheSameAs UTC.
|
||||
```
|
||||
|
||||
In this example, the [DateTimeOffset.CompareTo](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset#System_DateTimeOffset_CompareTo_System_DateTimeOffset_) method indicates that the current local time and the current UTC time are equal, and subtraction of `DateTimeOffset` values indicates that the difference between the two times is [TimeSpan.Zero](https://docs.microsoft.com/dotnet/core/api/System.TimeSpan#System_TimeSpan_Zero).
|
||||
|
||||
The chief limitation of using `DateTimeOffset` values in date and time arithmetic is that although `DateTimeOffset` values have some time zone awareness, they are not fully time zone aware. Although the `DateTimeOffset` value's offset reflects a time zone's offset from UTC when a `DateTimeOffset` variable is first assigned a value, it becomes disassociated from the time zone thereafter. Because it is no longer directly associated with an identifiable time zone, the addition and subtraction of date and time intervals does not consider a time zone's adjustment rules.
|
||||
|
||||
To illustrate, the transition to daylight saving time in the U.S. Central Standard Time zone occurs at 2:00 A.M. on March 9, 2008. This means that adding a two-and-a-half-hour interval to a Central Standard time of 1:30 A.M. on March 9, 2008, should produce a date and time of 5:00 A.M. on March 9, 2008. However, as the following example shows, the result of the addition is 4:00 A.M. on March 9, 2008. Note that the result of this operation does represent the correct point in time, although it is not the time in the time zone in which we are interested (that is, it does not have the expected time zone offset).
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
|
||||
public class IntervalArithmetic
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
DateTime generalTime = new DateTime(2008, 3, 9, 1, 30, 0);
|
||||
const string tzName = "Central Standard Time";
|
||||
TimeSpan twoAndAHalfHours = new TimeSpan(2, 30, 0);
|
||||
|
||||
// Instantiate DateTimeOffset value to have correct CST offset
|
||||
try
|
||||
{
|
||||
DateTimeOffset centralTime1 = new DateTimeOffset(generalTime,
|
||||
TimeZoneInfo.FindSystemTimeZoneById(tzName).GetUtcOffset(generalTime));
|
||||
|
||||
// Add two and a half hours
|
||||
DateTimeOffset centralTime2 = centralTime1.Add(twoAndAHalfHours);
|
||||
// Display result
|
||||
Console.WriteLine("{0} + {1} hours = {2}", centralTime1,
|
||||
twoAndAHalfHours.ToString(),
|
||||
centralTime2);
|
||||
}
|
||||
catch (TimeZoneNotFoundException)
|
||||
{
|
||||
Console.WriteLine("Unable to retrieve Central Standard Time zone information.");
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output to the console:
|
||||
// 3/9/2008 1:30:00 AM -06:00 + 02:30:00 hours = 3/9/2008 4:00:00 AM -06:00
|
||||
```
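The difference between "the correct point in time" and "the expected offset" can be checked directly. The following is a minimal sketch that hard-codes the two offsets (-06:00 and -05:00) instead of looking up a time zone, so it does not depend on the time zones installed on the machine. The class and method names are illustrative assumptions, not part of the original example.

```csharp
using System;

public static class OffsetComparison
{
    // Adds an interval without consulting any time zone;
    // the original offset is carried over unchanged.
    public static DateTimeOffset AddIgnoringZone(DateTimeOffset start, TimeSpan interval)
    {
        return start.Add(interval);
    }

    public static void Main()
    {
        var start = new DateTimeOffset(2008, 3, 9, 1, 30, 0, TimeSpan.FromHours(-6));
        DateTimeOffset naive = AddIgnoringZone(start, new TimeSpan(2, 30, 0));
        var expected = new DateTimeOffset(2008, 3, 9, 5, 0, 0, TimeSpan.FromHours(-5));

        // Equals compares the UTC instants, so the two values are equal.
        Console.WriteLine(naive.Equals(expected));       // True
        // EqualsExact also compares the offsets, so it reports the mismatch.
        Console.WriteLine(naive.EqualsExact(expected));  // False
    }
}
```

`Equals` treats 4:00 A.M. -06:00 and 5:00 A.M. -05:00 as the same instant, while `EqualsExact` exposes that the offset was never adjusted.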
|
||||
|
||||
## Arithmetic Operations with Times in Time Zones
|
||||
|
||||
The [System.TimeZoneInfo](https://docs.microsoft.com/dotnet/core/api/System.TimeZoneInfo) class does not provide any methods that automatically apply adjustment rules when you perform date and time arithmetic. However, you can do this by converting the time in a time zone to UTC, performing the arithmetic operation, and then converting from UTC back to the time in the time zone. For details, see [How to: Use Time Zones in Date and Time Arithmetic](use-time-zones-in-arithmetic.md).
|
||||
|
||||
For example, the following code is similar to the previous code that added two and a half hours to 1:30 A.M. on March 9, 2008. However, because it converts a Central Standard time to UTC before it performs date and time arithmetic, and then converts the result from UTC back to Central Standard time, the resulting time reflects the Central Standard Time zone's transition to daylight saving time.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
|
||||
public class TimeZoneAwareArithmetic
|
||||
{
|
||||
public static void Main()
|
||||
{
|
||||
const string tzName = "Central Standard Time";
|
||||
|
||||
DateTime generalTime = new DateTime(2008, 3, 9, 1, 30, 0);
|
||||
TimeSpan twoAndAHalfHours = new TimeSpan(2, 30, 0);

// Look up the time zone inside the try block so that a missing zone is caught,
// then instantiate a DateTimeOffset value that has the correct CST offset
try
{
TimeZoneInfo cst = TimeZoneInfo.FindSystemTimeZoneById(tzName);
DateTimeOffset centralTime1 = new DateTimeOffset(generalTime,
cst.GetUtcOffset(generalTime));
|
||||
|
||||
// Add two and a half hours
|
||||
DateTimeOffset utcTime = centralTime1.ToUniversalTime();
|
||||
utcTime += twoAndAHalfHours;
|
||||
|
||||
DateTimeOffset centralTime2 = TimeZoneInfo.ConvertTime(utcTime, cst);
|
||||
// Display result
|
||||
Console.WriteLine("{0} + {1} hours = {2}", centralTime1,
|
||||
twoAndAHalfHours.ToString(),
|
||||
centralTime2);
|
||||
}
|
||||
catch (TimeZoneNotFoundException)
|
||||
{
|
||||
Console.WriteLine("Unable to retrieve Central Standard Time zone information.");
|
||||
}
|
||||
}
|
||||
}
|
||||
// The example displays the following output to the console:
|
||||
// 3/9/2008 1:30:00 AM -06:00 + 02:30:00 hours = 3/9/2008 5:00:00 AM -05:00
|
||||
```
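The convert, add, convert-back sequence can also be wrapped in a small helper. The sketch below is an assumption-laden variant of the example: instead of looking up `"Central Standard Time"`, it builds a custom zone with the hypothetical ID `"Demo Central"` and the post-2007 U.S. daylight saving rule, so that the result does not depend on which time zone IDs the operating system provides. `CreateDemoCentralZone` and `AddInZone` are illustrative names.

```csharp
using System;

public static class ZoneArithmetic
{
    // Builds a stand-in for U.S. Central time: UTC-06:00, with daylight saving time
    // from the second Sunday of March to the first Sunday of November (2:00 A.M.).
    public static TimeZoneInfo CreateDemoCentralZone()
    {
        var twoAm = new DateTime(1, 1, 1, 2, 0, 0);
        var dstStart = TimeZoneInfo.TransitionTime.CreateFloatingDateRule(twoAm, 3, 2, DayOfWeek.Sunday);
        var dstEnd = TimeZoneInfo.TransitionTime.CreateFloatingDateRule(twoAm, 11, 1, DayOfWeek.Sunday);
        var rule = TimeZoneInfo.AdjustmentRule.CreateAdjustmentRule(
            new DateTime(2007, 1, 1), DateTime.MaxValue.Date, TimeSpan.FromHours(1), dstStart, dstEnd);
        return TimeZoneInfo.CreateCustomTimeZone("Demo Central", TimeSpan.FromHours(-6),
            "Demo Central", "Demo Central Standard", "Demo Central Daylight", new[] { rule });
    }

    // Performs the arithmetic in UTC, then expresses the result in the given zone.
    public static DateTimeOffset AddInZone(DateTimeOffset time, TimeSpan interval, TimeZoneInfo zone)
    {
        DateTimeOffset utc = time.ToUniversalTime().Add(interval);
        return TimeZoneInfo.ConvertTime(utc, zone);
    }

    public static void Main()
    {
        TimeZoneInfo zone = CreateDemoCentralZone();
        var start = new DateTimeOffset(2008, 3, 9, 1, 30, 0, TimeSpan.FromHours(-6));
        DateTimeOffset result = AddInZone(start, new TimeSpan(2, 30, 0), zone);
        Console.WriteLine("{0} -> {1}", start, result);
    }
}
```

With this rule set, the result should carry the -05:00 daylight offset, matching the 5:00 A.M. output shown above.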
|
||||
|
||||
## See Also
|
||||
|
||||
[Dates, Times, and Time Zones](index.md)
|
||||
|
||||
[How to: Use Time Zones in Date and Time Arithmetic](use-time-zones-in-arithmetic.md)
|
||||
|
||||
|
|
@ -1,66 +0,0 @@
|
|||
---
|
||||
title: How to: Resolve Ambiguous Times
|
||||
description: How to: Resolve Ambiguous Times
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 072444dc-84d0-4b2b-96a2-5f995a485a3a
|
||||
---
|
||||
|
||||
# How to: Resolve Ambiguous Times
|
||||
|
||||
An ambiguous time is a time that maps to more than one Coordinated Universal Time (UTC). It occurs when the clock time is adjusted back in time, such as during the transition from a time zone's daylight saving time to its standard time. When handling an ambiguous time, you can do one of the following:
|
||||
|
||||
* Make an assumption about how the time maps to UTC. For example, you can assume that an ambiguous time is always expressed in the time zone's standard time.
|
||||
|
||||
* If the ambiguous time is an item of data entered by the user, you can leave it to the user to resolve the ambiguity.
|
||||
|
||||
This article shows how to resolve an ambiguous time by assuming that it represents the time zone's standard time.
|
||||
|
||||
## To map an ambiguous time to a time zone's standard time
|
||||
|
||||
1. Call the [System.TimeZoneInfo.IsAmbiguousTime(DateTime)](https://docs.microsoft.com/dotnet/core/api/System.TimeZoneInfo#System_TimeZoneInfo_IsAmbiguousTime_System_DateTime_) or [System.TimeZoneInfo.IsAmbiguousTime(DateTimeOffset)](https://docs.microsoft.com/dotnet/core/api/System.TimeZoneInfo#System_TimeZoneInfo_IsAmbiguousTime_System_DateTimeOffset_) method to determine whether the time is ambiguous.
|
||||
|
||||
2. If the time is ambiguous, subtract the `TimeSpan` value returned by the time zone's `BaseUtcOffset` property from the time.
|
||||
|
||||
3. Call the static `SpecifyKind` method to set the UTC date and time value's `Kind` property to `DateTimeKind.Utc`.
|
||||
|
||||
## Example
|
||||
|
||||
The following example illustrates how to convert an ambiguous `DateTime` to UTC by assuming that it represents the local time zone's standard time.
|
||||
|
||||
```csharp
|
||||
private DateTime ResolveAmbiguousTime(DateTime ambiguousTime)
|
||||
{
|
||||
// Time is not ambiguous
|
||||
if (! TimeZoneInfo.Local.IsAmbiguousTime(ambiguousTime))
|
||||
{
|
||||
return ambiguousTime;
|
||||
}
|
||||
// Time is ambiguous
|
||||
else
|
||||
{
|
||||
DateTime utcTime = DateTime.SpecifyKind(ambiguousTime - TimeZoneInfo.Local.BaseUtcOffset,
|
||||
DateTimeKind.Utc);
|
||||
Console.WriteLine("{0} local time corresponds to {1} {2}.",
|
||||
ambiguousTime, utcTime, utcTime.Kind.ToString());
|
||||
return utcTime;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
The example consists of a method named `ResolveAmbiguousTime` that determines whether the `DateTime` value passed to it is ambiguous. If the value is ambiguous, the method returns a `DateTime` value that represents the corresponding UTC time. The method handles this conversion by subtracting the value of the local time zone's `BaseUtcOffset` property from the local time.
|
||||
|
||||
Ordinarily, an ambiguous time is handled by calling the `GetAmbiguousTimeOffsets` method to retrieve an array of `TimeSpan` objects that contain the ambiguous time's possible UTC offsets. However, this example makes the arbitrary assumption that an ambiguous time should always be mapped to the time zone's standard time. The `BaseUtcOffset` property returns the offset between UTC and a time zone's standard time.
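For comparison, the `GetAmbiguousTimeOffsets` approach mentioned above might look like the following sketch. It is not the article's method: it uses a custom zone with the hypothetical ID `"Demo Central"` (U.S. Central offsets and the post-2007 U.S. daylight saving rule) so that the ambiguous hour is known in advance, and `CandidateOffsets` is an illustrative helper name.

```csharp
using System;

public static class AmbiguousOffsets
{
    // Stand-in zone: UTC-06:00, daylight saving time from the second Sunday
    // of March to the first Sunday of November, both at 2:00 A.M.
    public static TimeZoneInfo CreateDemoCentralZone()
    {
        var twoAm = new DateTime(1, 1, 1, 2, 0, 0);
        var dstStart = TimeZoneInfo.TransitionTime.CreateFloatingDateRule(twoAm, 3, 2, DayOfWeek.Sunday);
        var dstEnd = TimeZoneInfo.TransitionTime.CreateFloatingDateRule(twoAm, 11, 1, DayOfWeek.Sunday);
        var rule = TimeZoneInfo.AdjustmentRule.CreateAdjustmentRule(
            new DateTime(2007, 1, 1), DateTime.MaxValue.Date, TimeSpan.FromHours(1), dstStart, dstEnd);
        return TimeZoneInfo.CreateCustomTimeZone("Demo Central", TimeSpan.FromHours(-6),
            "Demo Central", "Demo Central Standard", "Demo Central Daylight", new[] { rule });
    }

    // Returns every UTC offset the wall-clock time could have in the zone:
    // several if the time is ambiguous, otherwise exactly one.
    public static TimeSpan[] CandidateOffsets(TimeZoneInfo zone, DateTime wallClock)
    {
        return zone.IsAmbiguousTime(wallClock)
            ? zone.GetAmbiguousTimeOffsets(wallClock)
            : new[] { zone.GetUtcOffset(wallClock) };
    }

    public static void Main()
    {
        TimeZoneInfo zone = CreateDemoCentralZone();
        // 1:30 A.M. on November 2, 2008 occurs twice in this zone.
        var wallClock = new DateTime(2008, 11, 2, 1, 30, 0);
        foreach (TimeSpan offset in CandidateOffsets(zone, wallClock))
        {
            var candidate = new DateTimeOffset(wallClock, offset);
            Console.WriteLine("{0} corresponds to {1:u}", candidate, candidate.UtcDateTime);
        }
    }
}
```

Each candidate offset yields a different UTC instant; an application can then pick one by policy (as this article does with standard time) or ask the user.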
|
||||
|
||||
## See Also
|
||||
|
||||
[Dates, Times, and Time Zones](index.md)
|
||||
|
||||
[How to: Let Users Resolve Ambiguous Times](let-users-resolve-ambiguous-times.md)
|
||||
|
|
@ -1,58 +0,0 @@
|
|||
---
|
||||
title: Time Zone Overview
|
||||
description: Time Zone Overview
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: d01e8de1-7893-4dcf-bdae-50505a39e719
|
||||
---
|
||||
|
||||
# Time Zone Overview
|
||||
|
||||
The [System.TimeZoneInfo](https://docs.microsoft.com/dotnet/core/api/System.TimeZoneInfo) class simplifies the creation of time zone-aware applications. The `TimeZoneInfo` class supports working with the local time zone and Coordinated Universal Time (UTC), as well as any time zone about which information is predefined by the operating system. You can also use `TimeZoneInfo` to define custom time zones that the system has no information about.
|
||||
|
||||
## Time Zone Essentials
|
||||
|
||||
A time zone is a geographical region in which the same time is used. Typically, but not always, adjacent time zones are one hour apart. The time in any of the world's time zones can be expressed as an offset from Coordinated Universal Time (UTC).
|
||||
|
||||
Many of the world's time zones support daylight saving time. Daylight saving time tries to maximize daylight hours by advancing the time forward by one hour in the spring or early summer, and returning to the normal (or standard) time in the late summer or fall. These changes to and from standard time are known as adjustment rules.
|
||||
|
||||
The transition to and from daylight saving time in a particular time zone can be defined either by a fixed or a floating adjustment rule. A fixed adjustment rule sets a particular date on which the transition to or from daylight saving time occurs each year. For example, a transition from daylight saving time to standard time that occurs each year on October 25 follows a fixed adjustment rule. Much more common are floating adjustment rules, which set a particular day of a particular week of a particular month for the transition to or from daylight saving time. For example, a transition from standard time to daylight saving time that occurs on the third Sunday of March follows a floating adjustment rule.
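In code, the two kinds of rule correspond to the two factory methods on `TimeZoneInfo.TransitionTime`. The sketch below expresses the two examples from the paragraph above; the 2:00 A.M. time of day is an assumption made for illustration, because the text does not specify one, and the class and method names are hypothetical.

```csharp
using System;

public static class TransitionRules
{
    // Fixed rule: the transition happens on October 25 every year.
    public static TimeZoneInfo.TransitionTime FixedOctober25()
    {
        return TimeZoneInfo.TransitionTime.CreateFixedDateRule(
            new DateTime(1, 1, 1, 2, 0, 0), 10, 25);
    }

    // Floating rule: the transition happens on the third Sunday of March.
    public static TimeZoneInfo.TransitionTime ThirdSundayOfMarch()
    {
        return TimeZoneInfo.TransitionTime.CreateFloatingDateRule(
            new DateTime(1, 1, 1, 2, 0, 0), 3, 3, DayOfWeek.Sunday);
    }

    public static void Main()
    {
        TimeZoneInfo.TransitionTime fixedRule = FixedOctober25();
        TimeZoneInfo.TransitionTime floatingRule = ThirdSundayOfMarch();

        Console.WriteLine("Fixed: month {0}, day {1}, fixed = {2}",
            fixedRule.Month, fixedRule.Day, fixedRule.IsFixedDateRule);
        Console.WriteLine("Floating: month {0}, week {1}, {2}, fixed = {3}",
            floatingRule.Month, floatingRule.Week, floatingRule.DayOfWeek, floatingRule.IsFixedDateRule);
    }
}
```

The time-of-day argument must use the date January 1, 0001, which is why the `DateTime` values are built with year, month, and day all set to 1.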
|
||||
|
||||
For time zones that support adjustment rules, the transition to and from daylight saving time creates two kinds of anomalous times: invalid times and ambiguous times. An invalid time is a nonexistent time created by the transition from standard time to daylight saving time. For example, if this transition occurs on a particular day at 2:00 A.M. and causes the time to change to 3:00 A.M., each time from 2:00 A.M. through 2:59:59 A.M. is invalid. An ambiguous time is a time that can be mapped to two different times in a single time zone. It is created by the transition from daylight saving time to standard time. For example, if this transition occurs on a particular day at 2:00 A.M. and causes the time to change to 1:00 A.M., each time from 1:00 A.M. through 1:59:59 A.M. can be interpreted as either a standard time or a daylight saving time.
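Both kinds of anomalous time can be detected with `TimeZoneInfo.IsInvalidTime` and `TimeZoneInfo.IsAmbiguousTime`. The following sketch classifies three wall-clock times; it assumes a custom zone with the hypothetical ID `"Demo Central"` that follows the post-2007 U.S. rule (forward at 2:00 A.M. on the second Sunday of March, back at 2:00 A.M. on the first Sunday of November), and `Classify` is an illustrative helper.

```csharp
using System;

public static class AnomalousTimes
{
    // Stand-in zone with a one-hour daylight saving adjustment.
    public static TimeZoneInfo CreateDemoCentralZone()
    {
        var twoAm = new DateTime(1, 1, 1, 2, 0, 0);
        var dstStart = TimeZoneInfo.TransitionTime.CreateFloatingDateRule(twoAm, 3, 2, DayOfWeek.Sunday);
        var dstEnd = TimeZoneInfo.TransitionTime.CreateFloatingDateRule(twoAm, 11, 1, DayOfWeek.Sunday);
        var rule = TimeZoneInfo.AdjustmentRule.CreateAdjustmentRule(
            new DateTime(2007, 1, 1), DateTime.MaxValue.Date, TimeSpan.FromHours(1), dstStart, dstEnd);
        return TimeZoneInfo.CreateCustomTimeZone("Demo Central", TimeSpan.FromHours(-6),
            "Demo Central", "Demo Central Standard", "Demo Central Daylight", new[] { rule });
    }

    // Labels a wall-clock time as invalid, ambiguous, or normal in the given zone.
    public static string Classify(TimeZoneInfo zone, DateTime wallClock)
    {
        if (zone.IsInvalidTime(wallClock)) return "invalid";
        if (zone.IsAmbiguousTime(wallClock)) return "ambiguous";
        return "normal";
    }

    public static void Main()
    {
        TimeZoneInfo zone = CreateDemoCentralZone();
        Console.WriteLine(Classify(zone, new DateTime(2008, 3, 9, 2, 30, 0)));   // skipped hour
        Console.WriteLine(Classify(zone, new DateTime(2008, 11, 2, 1, 30, 0)));  // repeated hour
        Console.WriteLine(Classify(zone, new DateTime(2008, 6, 1, 12, 0, 0)));   // ordinary time
    }
}
```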
|
||||
|
||||
## Time Zone Terminology
|
||||
|
||||
The following table defines terms commonly used when working with time zones and developing time zone-aware applications.
|
||||
|
||||
Term | Definition
|
||||
---- | ----------
|
||||
Adjustment rule | A rule that defines when the transition from standard time to daylight saving time and back from daylight saving time to standard time occurs. Each adjustment rule has a start and end date that defines when the rule is in place (for example, the adjustment rule is in place from January 1, 1986, to December 31, 2020), a delta (the amount of time by which the standard time changes as a result of the application of the adjustment rule), and information about the specific date and time that the transitions are to occur during the adjustment period. Transitions can follow either a fixed rule or a floating rule.
|
||||
Ambiguous time | A time that can be mapped to two different times in a single time zone. It occurs when the clock time is adjusted back in time, such as during the transition from a time zone's daylight saving time to its standard time. For example, if this transition occurs on a particular day at 2:00 A.M. and causes the time to change to 1:00 A.M., each time from 1:00 A.M. through 1:59:59 A.M. can be interpreted as either a standard time or a daylight saving time.
|
||||
Fixed rule | An adjustment rule that sets a particular date for the transition to or from daylight saving time. For example, a transition from daylight saving time to standard time that occurs each year on October 25 follows a fixed adjustment rule.
|
||||
Floating rule | An adjustment rule that sets a particular day of a particular week of a particular month for the transition to or from daylight saving time. For example, a transition from standard time to daylight saving time that occurs on the third Sunday of March follows a floating adjustment rule.
|
||||
Invalid time | A nonexistent time that is an artifact of the transition from standard time to daylight saving time. It occurs when the clock time is adjusted forward in time, such as during the transition from a time zone's standard time to its daylight saving time. For example, if this transition occurs on a particular day at 2:00 A.M. and causes the time to change to 3:00 A.M., each time from 2:00 A.M. through 2:59:59 A.M. is invalid.
|
||||
Transition time | Information about a specific time change, such as the change from daylight saving time to standard time or vice versa, in a particular time zone.
|
||||
|
||||
## Time Zones and the TimeZoneInfo Class
|
||||
|
||||
In .NET Core, a [System.TimeZoneInfo](https://docs.microsoft.com/dotnet/core/api/System.TimeZoneInfo) object represents a time zone, based on information provided by the operating system. The dependence of the `TimeZoneInfo` class on the operating system means that a time zone-aware application cannot be certain that a particular time zone is defined on all operating systems. As a result, any attempt to instantiate a specific time zone (other than the local time zone or the time zone that represents UTC) should use exception handling. It should also provide some way for the application to continue if a required `TimeZoneInfo` object cannot be instantiated.
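One way to apply this advice is to try several candidate identifiers and fall back to a default when none is available. The sketch below is only an illustration: `FindFirstAvailable` is a hypothetical helper, and the choice of `"America/Chicago"` as the alternative to the Windows identifier `"Central Standard Time"` assumes a system that uses IANA time zone names, as Linux and macOS typically do.

```csharp
using System;

public static class TimeZoneLookup
{
    // Tries each candidate ID in order and returns the first zone the
    // operating system knows about, or null if none can be loaded.
    public static TimeZoneInfo FindFirstAvailable(params string[] candidateIds)
    {
        foreach (string id in candidateIds)
        {
            try
            {
                return TimeZoneInfo.FindSystemTimeZoneById(id);
            }
            catch (TimeZoneNotFoundException)
            {
                // The ID is not defined on this system; try the next one.
            }
            catch (InvalidTimeZoneException)
            {
                // The zone was found but its data is unusable; try the next one.
            }
        }
        return null;
    }

    public static void Main()
    {
        // Fall back to UTC so the application can continue either way.
        TimeZoneInfo zone = FindFirstAvailable("Central Standard Time", "America/Chicago")
                            ?? TimeZoneInfo.Utc;
        Console.WriteLine("Using time zone: {0}", zone.Id);
    }
}
```

Falling back to UTC is just one policy; an application could equally report the problem to the user or use a custom zone.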
|
||||
|
||||
Because each time zone is characterized by a base offset from UTC, as well as by an offset from UTC that reflects any existing adjustment rules, a time in one time zone can be easily converted to the time in another time zone. For this purpose, the `TimeZoneInfo` object includes several conversion methods, including:
|
||||
|
||||
* `ConvertTime(DateTime, TimeZoneInfo)`, which converts a [System.DateTime](https://docs.microsoft.com/dotnet/core/api/System.DateTime) to the time in a particular time zone.
|
||||
|
||||
* `ConvertTime(DateTime, TimeZoneInfo, TimeZoneInfo)`, which converts a `DateTime` from one time zone to another.
|
||||
|
||||
* `ConvertTime(DateTimeOffset, TimeZoneInfo)`, which converts a [System.DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) to the time in a particular time zone.
|
||||
|
||||
For details on converting times between time zones, see [Converting Times Between Time Zones](ConvertingBetweenTimeZones.md).
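As a brief illustration of the three overloads listed above, the following sketch converts a UTC value to a custom zone and back. The zone, with the hypothetical ID `"Demo +05:30"`, is a fixed-offset zone created only for this example so that the results do not depend on the time zones installed on the machine.

```csharp
using System;

public static class ConversionOverloads
{
    // Hypothetical fixed-offset zone: UTC+05:30, no daylight saving time.
    public static TimeZoneInfo CreateFixedZone()
    {
        return TimeZoneInfo.CreateCustomTimeZone(
            "Demo +05:30", new TimeSpan(5, 30, 0), "Demo +05:30", "Demo +05:30 Standard");
    }

    public static void Main()
    {
        TimeZoneInfo zone = CreateFixedZone();
        var utcNoon = new DateTime(2016, 6, 20, 12, 0, 0, DateTimeKind.Utc);

        // ConvertTime(DateTime, TimeZoneInfo): the source zone is inferred from DateTime.Kind.
        DateTime inZone = TimeZoneInfo.ConvertTime(utcNoon, zone);

        // ConvertTime(DateTime, TimeZoneInfo, TimeZoneInfo): the source zone is explicit.
        DateTime backToUtc = TimeZoneInfo.ConvertTime(inZone, zone, TimeZoneInfo.Utc);

        // ConvertTime(DateTimeOffset, TimeZoneInfo): the result carries the target zone's offset.
        DateTimeOffset withOffset = TimeZoneInfo.ConvertTime(new DateTimeOffset(utcNoon), zone);

        Console.WriteLine(inZone);      // 5:30 P.M. in the demo zone
        Console.WriteLine(backToUtc);   // 12:00 P.M. UTC again
        Console.WriteLine(withOffset);  // 5:30 P.M. with a +05:30 offset
    }
}
```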
|
||||
|
||||
## See Also
|
||||
|
||||
[Dates, Times, and Time Zones](index.md)
|
|
@ -1,111 +0,0 @@
|
|||
---
|
||||
title: How to: Use Time Zones in Date and Time Arithmetic
|
||||
description: How to: Use Time Zones in Date and Time Arithmetic
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: d01e8de1-7893-4dcf-bdae-50505a39e719
|
||||
---
|
||||
|
||||
# How to: Use Time Zones in Date and Time Arithmetic
|
||||
|
||||
Ordinarily, when you perform date and time arithmetic using [System.DateTimeOffset](https://docs.microsoft.com/dotnet/core/api/System.DateTimeOffset) values, the result does not reflect any time zone adjustment rules. This is true even when the time zone of the date and time value is clearly identifiable. This article shows how to perform arithmetic operations on date and time values that belong to a particular time zone. The results of the arithmetic operations will reflect the time zone's adjustment rules.
|
||||
|
||||
## To apply adjustment rules to date and time arithmetic
|
||||
|
||||
1. Implement some method of closely coupling a date and time value with the time zone to which it belongs. For example, declare a structure that includes both the date and time value and its time zone. The following example uses this approach to link a `DateTimeOffset` value with its time zone.
|
||||
|
||||
```csharp
|
||||
// Define a structure that pairs a DateTimeOffset value with its time zone; for internal use only
|
||||
internal struct TimeWithTimeZone
|
||||
{
|
||||
TimeZoneInfo TimeZone;
|
||||
DateTimeOffset Time;
|
||||
}
|
||||
```
|
||||
|
||||
2. Convert a time to Coordinated Universal Time (UTC) by calling the [TimeZoneInfo.ConvertTime(DateTime, TimeZoneInfo)](https://docs.microsoft.com/dotnet/core/api/System.TimeZoneInfo#System_TimeZoneInfo_ConvertTime_System_DateTime_System_TimeZoneInfo_) method.
|
||||
|
||||
3. Perform the arithmetic operation on the UTC time.
|
||||
|
||||
4. Convert the time from UTC to the original time's associated time zone by calling the `TimeZoneInfo.ConvertTime(DateTime, TimeZoneInfo)` method.
|
||||
|
||||
## Example
|
||||
|
||||
The following example adds two hours and thirty minutes to March 9, 2008, at 1:30 A.M. Central Standard Time. The time zone's transition to daylight saving time occurs thirty minutes later, at 2:00 A.M. on March 9, 2008. Because the example follows the four steps listed in the previous section, it correctly reports the resulting time as 5:00 A.M. on March 9, 2008.
|
||||
|
||||
```csharp
|
||||
using System;
|
||||
|
||||
public struct TimeZoneTime
|
||||
{
|
||||
public TimeZoneInfo TimeZone;
|
||||
public DateTimeOffset Time;
|
||||
|
||||
public TimeZoneTime(TimeZoneInfo tz, DateTimeOffset time)
|
||||
{
|
||||
if (tz == null)
|
||||
throw new ArgumentNullException(nameof(tz), "The time zone cannot be a null reference.");
|
||||
|
||||
this.TimeZone = tz;
|
||||
this.Time = time;
|
||||
}
|
||||
|
||||
public TimeZoneTime AddTime(TimeSpan interval)
|
||||
{
|
||||
// Convert time to UTC
|
||||
DateTimeOffset utcTime = TimeZoneInfo.ConvertTime(this.Time, TimeZoneInfo.Utc);
|
||||
// Add time interval to time
|
||||
utcTime = utcTime.Add(interval);
|
||||
// Convert time back to time in time zone
|
||||
return new TimeZoneTime(this.TimeZone, TimeZoneInfo.ConvertTime(utcTime, this.TimeZone));
|
||||
}
|
||||
}
|
||||
|
||||
public class TimeArithmetic
|
||||
{
|
||||
public const string tzName = "Central Standard Time";
|
||||
|
||||
public static void Main()
|
||||
{
|
||||
try
|
||||
{
|
||||
TimeZoneTime cstTime1, cstTime2;
|
||||
|
||||
TimeZoneInfo cst = TimeZoneInfo.FindSystemTimeZoneById(tzName);
|
||||
DateTime time1 = new DateTime(2008, 3, 9, 1, 30, 0);
|
||||
TimeSpan twoAndAHalfHours = new TimeSpan(2, 30, 0);
|
||||
|
||||
cstTime1 = new TimeZoneTime(cst,
|
||||
new DateTimeOffset(time1, cst.GetUtcOffset(time1)));
|
||||
cstTime2 = cstTime1.AddTime(twoAndAHalfHours);
|
||||
Console.WriteLine("{0} + {1} hours = {2}", cstTime1.Time,
|
||||
twoAndAHalfHours.ToString(),
|
||||
cstTime2.Time);
|
||||
}
|
||||
catch (TimeZoneNotFoundException)
|
||||
{
|
||||
Console.WriteLine("Unable to find {0}.", tzName);
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Note that if this addition is simply performed on the `DateTimeOffset` value without first converting it to UTC, the result reflects the correct point in time but its offset does not reflect that of the designated time zone for that time.
|
||||
|
||||
`DateTimeOffset` values are disassociated from any time zone to which they might belong. To perform date and time arithmetic in a way that automatically applies a time zone's adjustment rules, the time zone to which any date and time value belongs must be immediately identifiable. This means that a date and time and its associated time zone must be tightly coupled. There are several ways to do this, which include the following:
|
||||
|
||||
* Assume that all times used in an application belong to a particular time zone. Although appropriate in some cases, this approach offers limited flexibility and possibly limited portability.
|
||||
|
||||
* Define a type that tightly couples a date and time with its associated time zone by including both as fields of the type. This approach is used in the code example, which defines a structure to store the date and time and the time zone in two member fields.
|
||||
|
||||
## See Also
|
||||
|
||||
[Dates, Times, and Time Zones](index.md)
|
||||
|
||||
[Performing Arithmetic Operations with Dates and Times](performing-arithmetic-operations.md)
|
|
@ -1,169 +0,0 @@
|
|||
---
|
||||
title: Fundamentals of Garbage Collection
|
||||
description: Fundamentals of Garbage Collection
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 8f2b52e1-a0af-4659-8d4e-03dd3f386cd0
|
||||
---
|
||||
|
||||
# Fundamentals of Garbage Collection
|
||||
|
||||
In the Common Language Runtime (CLR), the garbage collector serves as an automatic memory manager. It provides the following benefits:
|
||||
|
||||
* Enables you to develop your application without having to free memory.
|
||||
|
||||
* Allocates objects on the managed heap efficiently.
|
||||
|
||||
* Reclaims objects that are no longer being used, clears their memory, and keeps the memory available for future allocations. Managed objects automatically get clean content to start with, so their constructors do not have to initialize every data field.
|
||||
|
||||
* Provides memory safety by making sure that an object cannot use the content of another object.
|
||||
|
||||
|
||||
This topic describes the core concepts of garbage collection. It contains the following sections:
|
||||
|
||||
* [Fundamentals of memory](#Fundamentals-of-memory)
|
||||
|
||||
* [Conditions for a garbage collection](#Conditions-for-a-garbage-collection)
|
||||
|
||||
* [The managed heap](#The-managed-heap)
|
||||
|
||||
* [Generations](#Generations)
|
||||
|
||||
* [What happens during a garbage collection](#What-happens-during-a-garbage-collection)
|
||||
|
||||
* [Manipulating unmanaged resources](#Manipulating-unmanaged-resources)
|
||||
|
||||
## Fundamentals of memory
|
||||
|
||||
The following list summarizes important CLR memory concepts.
|
||||
|
||||
* Each process has its own, separate virtual address space. All processes on the same computer share the same physical memory, and share the page file if there is one.
|
||||
|
||||
* By default, on 32-bit computers, each process has a 2-GB user-mode virtual address space.
|
||||
|
||||
* As an application developer, you work only with virtual address space and never manipulate physical memory directly. The garbage collector allocates and frees virtual memory for you on the managed heap.
|
||||
|
||||
* Virtual memory can be in three states:
|
||||
|
||||
* Free. The block of memory has no references to it and is available for allocation.
|
||||
|
||||
* Reserved. The block of memory is available for your use and cannot be used for any other allocation request. However, you cannot store data to this memory block until it is committed.
|
||||
|
||||
* Committed. The block of memory is assigned to physical storage.
|
||||
|
||||
* Virtual address space can get fragmented. This means that there are free blocks, also known as holes, in the address space. When a virtual memory allocation is requested, the virtual memory manager has to find a single free block that is large enough to satisfy that allocation request. Even if you have 2 GB of free space, the allocation that requires 2 GB will be unsuccessful unless all of that space is in a single address block.
|
||||
|
||||
* You can run out of memory if you run out of virtual address space to reserve or physical space to commit.
|
||||
|
||||
Your page file is used even if physical memory pressure (that is, demand for physical memory) is low. The first time your physical memory pressure is high, the operating system must make room in physical memory to store data, and it backs up some of the data that is in physical memory to the page file. That data is not paged until it is needed, so it is possible to encounter paging in situations where the physical memory pressure is very low.
|
||||
|
||||
## Conditions for a garbage collection
|
||||
|
||||
Garbage collection occurs when one of the following conditions is true:
|
||||
|
||||
* The system has low physical memory.
|
||||
|
||||
* The memory that is used by allocated objects on the managed heap surpasses an acceptable threshold. This threshold is continuously adjusted as the process runs.
|
||||
|
||||
* The [GC.Collect](https://docs.microsoft.com/dotnet/core/api/System.GC#System_GC_Collect) method is called. In almost all cases, you do not have to call this method, because the garbage collector runs continuously. This method is primarily used for unique situations and testing.
|
||||
|
||||
## The managed heap
|
||||
|
||||
After the garbage collector is initialized by the CLR, it allocates a segment of memory to store and manage objects. This memory is called the managed heap, as opposed to a native heap in the operating system.
|
||||
|
||||
There is a managed heap for each managed process. All threads in the process allocate memory for objects on the same heap.
|
||||
|
||||
> **Important**
|
||||
>
|
||||
> The size of segments allocated by the garbage collector is implementation-specific and is subject to change at any time, including in periodic updates. Your app should never make assumptions about or depend on a particular segment size, nor should it attempt to configure the amount of memory available for segment allocations.
|
||||
|
||||
The fewer objects allocated on the heap, the less work the garbage collector has to do. When you allocate objects, do not use rounded-up values that exceed your needs, such as allocating an array of 32 bytes when you need only 15 bytes.
|
||||
|
||||
When a garbage collection is triggered, the garbage collector reclaims the memory that is occupied by dead objects. The reclaiming process compacts live objects so that they are moved together, and the dead space is removed, thereby making the heap smaller. This ensures that objects that are allocated together stay together on the managed heap, to preserve their locality.
|
||||
|
||||
The intrusiveness (frequency and duration) of garbage collections is the result of the volume of allocations and the amount of survived memory on the managed heap.
|
||||
|
||||
The heap can be considered as the accumulation of two heaps: the large object heap and the small object heap.
|
||||
|
||||
The large object heap contains very large objects that are 85,000 bytes and larger. The objects on the large object heap are usually arrays. It is rare for an instance object to be extremely large.
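The effect of the size threshold can be observed with `GC.GetGeneration`, which reports large object heap objects as belonging to the oldest generation. The following is a rough sketch, assuming the default 85,000-byte threshold described above; `GenerationOfNewArray` is an illustrative helper, and the generation reported for the small array can vary if a collection happens to run in between.

```csharp
using System;

public static class LargeObjectDemo
{
    // Allocates a byte array of the given length and reports
    // the generation the garbage collector assigns to it.
    public static int GenerationOfNewArray(int lengthInBytes)
    {
        byte[] buffer = new byte[lengthInBytes];
        int generation = GC.GetGeneration(buffer);
        GC.KeepAlive(buffer); // keep the array reachable until it has been inspected
        return generation;
    }

    public static void Main()
    {
        Console.WriteLine("1,000-byte array:  generation {0}", GenerationOfNewArray(1000));
        Console.WriteLine("85,000-byte array: generation {0}", GenerationOfNewArray(85000));
        Console.WriteLine("Oldest generation: {0}", GC.MaxGeneration);
    }
}
```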
|
||||
|
||||
## Generations
|
||||
|
||||
The heap is organized into generations so it can handle long-lived and short-lived objects. Garbage collection primarily occurs with the reclamation of short-lived objects that typically occupy only a small part of the heap. There are three generations of objects on the heap:
|
||||
|
||||
* **Generation 0.** This is the youngest generation and contains short-lived objects. An example of a short-lived object is a temporary variable. Garbage collection occurs most frequently in this generation.
|
||||
|
||||
Newly allocated objects form a new generation of objects and are implicitly placed in generation 0, unless they are large objects, in which case they go on the large object heap, which is collected as part of a generation 2 collection.
|
||||
|
||||
Most objects are reclaimed for garbage collection in generation 0 and do not survive to the next generation.
|
||||
|
||||
* **Generation 1.** This generation contains short-lived objects and serves as a buffer between short-lived objects and long-lived objects.
|
||||
|
||||
* **Generation 2.** This generation contains long-lived objects. An example of a long-lived object is an object in a server application that contains static data that is live for the duration of the process.
|
||||
|
||||
Garbage collections occur on specific generations as conditions warrant. Collecting a generation means collecting objects in that generation and all its younger generations. A generation 2 garbage collection is also known as a full garbage collection, because it reclaims all objects in all generations (that is, all objects in the managed heap).
|
||||
|
||||
### Survival and promotions
|
||||
|
||||
Objects that are not reclaimed in a garbage collection are known as survivors, and are promoted to the next generation. Objects that survive a generation 0 garbage collection are promoted to generation 1; objects that survive a generation 1 garbage collection are promoted to generation 2; and objects that survive a generation 2 garbage collection remain in generation 2.
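Promotion can be watched from code by forcing collections and querying `GC.GetGeneration`. The sketch below is for experimentation only, since production code rarely needs to call `GC.Collect`; `ObserveGenerations` is a hypothetical helper, and the exact generations reported depend on the runtime and its configuration.

```csharp
using System;

public static class PromotionDemo
{
    // Records the generation of one surviving object before any collection
    // and after each of two forced collections.
    public static int[] ObserveGenerations()
    {
        object survivor = new object();
        int[] seen = new int[3];

        seen[0] = GC.GetGeneration(survivor);
        GC.Collect();
        seen[1] = GC.GetGeneration(survivor);
        GC.Collect();
        seen[2] = GC.GetGeneration(survivor);

        GC.KeepAlive(survivor); // the object must stay reachable to survive
        return seen;
    }

    public static void Main()
    {
        int[] seen = ObserveGenerations();
        Console.WriteLine("Before collection:       generation {0}", seen[0]);
        Console.WriteLine("After first collection:  generation {0}", seen[1]);
        Console.WriteLine("After second collection: generation {0}", seen[2]);
    }
}
```

On a typical run the object is expected to move from generation 0 to 1 and then to 2, where it remains.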
|
||||
|
||||
When the garbage collector detects that the survival rate is high in a generation, it increases the threshold of allocations for that generation, so the next collection gets a substantial size of reclaimed memory. The CLR continually balances two priorities: not letting an application's working set get too big and not letting the garbage collection take too much time.
|
||||
|
||||
### Ephemeral generations and segments
|
||||
|
||||
Because objects in generations 0 and 1 are short-lived, these generations are known as the ephemeral generations.
|
||||
|
||||
Ephemeral generations must be allocated in the memory segment that is known as the ephemeral segment. Each new segment acquired by the garbage collector becomes the new ephemeral segment and contains the objects that survived a generation 0 garbage collection. The old ephemeral segment becomes the new generation 2 segment.
|
||||
|
||||
|
||||
The ephemeral segment can include generation 2 objects. Generation 2 objects can use multiple segments (as many as your process requires and memory allows for).
|
||||
|
||||
The amount of freed memory from an ephemeral garbage collection is limited to the size of the ephemeral segment. The amount of memory that is freed is proportional to the space that was occupied by the dead objects.
|
||||
|
||||
## What happens during a garbage collection
|
||||
|
||||
A garbage collection has the following phases:
|
||||
|
||||
* A marking phase that finds and creates a list of all live objects.
|
||||
|
||||
* A relocating phase that updates the references to the objects that will be compacted.
|
||||
|
||||
* A compacting phase that reclaims the space occupied by the dead objects and compacts the surviving objects. The compacting phase moves objects that have survived a garbage collection toward the older end of the segment.
|
||||
|
||||
Because generation 2 collections can occupy multiple segments, objects that are promoted into generation 2 can be moved into an older segment. Both generation 1 and generation 2 survivors can be moved to a different segment, because they are promoted to generation 2.
|
||||
|
||||
Ordinarily, the large object heap is not compacted, because copying large objects imposes a performance penalty. However, you can use the [GCSettings.LargeObjectHeapCompactionMode](https://docs.microsoft.com/dotnet/core/api/GCSettings#System_Runtime_GCSettings_LargeObjectHeapCompactionMode) property to compact the large object heap on demand.
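A minimal sketch of requesting on-demand compaction follows. It assumes a runtime where `GCSettings.LargeObjectHeapCompactionMode` is available; the setting is expected to take effect during the next blocking generation 2 collection and then revert to its default. `RequestCompaction` is an illustrative name.

```csharp
using System;
using System.Runtime;

public static class LohCompaction
{
    // Asks the garbage collector to compact the large object heap
    // during the next blocking full collection.
    public static void RequestCompaction()
    {
        GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
    }

    public static void Main()
    {
        RequestCompaction();
        Console.WriteLine("Requested mode: {0}", GCSettings.LargeObjectHeapCompactionMode);

        GC.Collect(); // a full, blocking collection gives the request a chance to run
        Console.WriteLine("Mode after collection: {0}", GCSettings.LargeObjectHeapCompactionMode);
    }
}
```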
|
||||
|
||||
The garbage collector uses the following information to determine whether objects are live:
|
||||
|
||||
* **Stack roots.** Stack variables provided by the just-in-time (JIT) compiler and stack walker.
|
||||
|
||||
* **Garbage collection handles.** Handles that point to managed objects and that can be allocated by user code or by the Common Language Runtime.
|
||||
|
||||
* **Static data.** Static objects in application domains that could be referencing other objects. Each application domain keeps track of its static objects.
|
||||
|
||||
Before a garbage collection starts, all managed threads are suspended except for the thread that triggered the garbage collection.
|
||||
|
||||
The following illustration shows a thread that triggers a garbage collection and causes the other threads to be suspended.
|
||||
|
||||
![When a thread triggers a garbage collection](../images/IC393001.png)
|
||||
|
||||
Thread that triggers a garbage collection
|
||||
|
||||
## Manipulating unmanaged resources
|
||||
|
||||
If your managed objects reference unmanaged objects by using their native file handles, you have to explicitly free the unmanaged objects, because the garbage collector tracks memory only on the managed heap.
|
||||
|
||||
Users of your managed object may not dispose the native resources used by the object. To perform the cleanup, you can make your managed object finalizable. Finalization consists of cleanup actions that you execute when the object is no longer in use. When your managed object dies, it performs cleanup actions that are specified in its finalizer method.
|
||||
|
||||
When a finalizable object is discovered to be dead, its finalizer is put in a queue so that its cleanup actions are executed, but the object itself is promoted to the next generation. Therefore, you have to wait until the next garbage collection that occurs on that generation (which is not necessarily the next garbage collection) to determine whether the object has been reclaimed.
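
The sequence described above can be sketched as follows. The `Finalizable` and `FinalizationDemo` types are hypothetical, and the exact timing of finalization is up to the runtime, so treat this as an illustration rather than a guarantee of when the counter changes.

```cs
using System;
using System.Runtime.CompilerServices;
using System.Threading;

class Finalizable
{
    public static int FinalizedCount;

    // Finalizer: stands in for the cleanup of a native resource.
    ~Finalizable()
    {
        Interlocked.Increment(ref FinalizedCount);
    }
}

static class FinalizationDemo
{
    // Creates an object that becomes unreachable as soon as the method returns.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static void CreateGarbage()
    {
        new Finalizable();
    }

    public static int Run()
    {
        CreateGarbage();
        GC.Collect();                  // finds the dead object and queues its finalizer
        GC.WaitForPendingFinalizers(); // blocks until the finalizer queue is drained
        GC.Collect();                  // a later collection can reclaim the memory
        return Volatile.Read(ref Finalizable.FinalizedCount);
    }
}
```

The second call to `GC.Collect` reflects the point made above: the memory of a finalizable object is not reclaimed by the collection that discovers it.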
|
||||
|
||||
## See Also
|
||||
|
||||
[Garbage Collection](index.md)
|
|
@ -1,47 +0,0 @@
|
|||
---
|
||||
title: Garbage Collection
|
||||
description: Garbage Collection
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: fc93a2f8-58ff-4648-a534-33911b3efadb
|
||||
|
||||
---
|
||||
|
||||
# Garbage Collection
|
||||
|
||||
The .NET garbage collector manages the allocation and release of memory for your application. Each time you create a new object, the Common Language Runtime allocates memory for the object from the managed heap. As long as address space is available in the managed heap, the runtime continues to allocate space for new objects. However, memory is not infinite. Eventually the garbage collector must perform a collection in order to free some memory. The garbage collector's optimizing engine determines the best time to perform a collection, based upon the allocations being made. When the garbage collector performs a collection, it checks for objects in the managed heap that are no longer being used by the application and performs the necessary operations to reclaim their memory.
|
||||
|
||||
## Related Topics
|
||||
|
||||
Title | Description
|
||||
----- | -----------
|
||||
[Fundamentals of Garbage Collection](fundamentals.md) | Describes how garbage collection works, how objects are allocated on the managed heap, and other core concepts.
|
||||
[Induced Collections](induced.md) | Describes how to make a garbage collection occur.
|
||||
[Latency Modes](latency.md) | Describes the modes that determine the intrusiveness of garbage collection.
|
||||
[Weak References](weak-references.md) | Describes features that permit the garbage collector to collect an object while still allowing the application to access that object.
|
||||
|
||||
## Reference
|
||||
|
||||
[System.GC](https://docs.microsoft.com/dotnet/core/api/System.GC)
|
||||
|
||||
[System.GCCollectionMode](https://docs.microsoft.com/dotnet/core/api/System.GCCollectionMode)
|
||||
|
||||
[System.Runtime.GCLatencyMode](https://docs.microsoft.com/dotnet/core/api/System.Runtime.GCLatencyMode)
|
||||
|
||||
[System.Runtime.GCSettings](https://docs.microsoft.com/dotnet/core/api/System.Runtime.GCSettings)
|
||||
|
||||
[GCSettings.LargeObjectHeapCompactionMode](https://docs.microsoft.com/dotnet/core/api/System.Runtime.GCSettingsGCSettings.LargeObjectHeapCompactionMode)
|
||||
|
||||
[Object.Finalize](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_Finalize)
|
||||
|
||||
[System.IDisposable](https://docs.microsoft.com/dotnet/core/api/System.IDisposable)
|
||||
|
||||
## See Also
|
||||
|
||||
[Cleaning Up Unmanaged Resources](unmanaged.md)
|
|
@ -1,446 +0,0 @@
|
|||
---
|
||||
title: Implementing a Dispose Method
|
||||
description: Implementing a Dispose Method
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 194213c3-d188-4b5d-961f-2f104202a215
|
||||
---
|
||||
|
||||
# Implementing a Dispose Method
|
||||
|
||||
You implement a [Dispose](https://docs.microsoft.com/dotnet/core/api/System.IDisposable#System_IDisposable_Dispose) method to release unmanaged resources used by your application. The .NET garbage collector does not allocate or release unmanaged memory.
|
||||
|
||||
The pattern for disposing an object, referred to as a dispose pattern, imposes order on the lifetime of an object. The dispose pattern is used only for objects that access unmanaged resources, such as file and pipe handles, registry handles, wait handles, or pointers to blocks of unmanaged memory. This is because the garbage collector is very efficient at reclaiming unused managed objects, but it is unable to reclaim unmanaged objects.
|
||||
|
||||
The dispose pattern has two variations:
|
||||
|
||||
* You wrap each unmanaged resource that a type uses in a safe handle (that is, in a class derived from [System.Runtime.InteropServices.SafeHandle](https://docs.microsoft.com/dotnet/core/api/System.Runtime.InteropServices.SafeHandle)). In this case, you implement the [IDisposable](https://docs.microsoft.com/dotnet/core/api/System.IDisposable) interface and an additional `Dispose(Boolean)` method. This is the recommended variation and doesn't require overriding the [Object.Finalize](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_Finalize) method.
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> The [Microsoft.Win32.SafeHandles](https://docs.microsoft.com/dotnet/core/api/Microsoft.Win32.SafeHandles) namespace provides a set of classes derived from [SafeHandle](https://docs.microsoft.com/dotnet/core/api/System.Runtime.InteropServices.SafeHandle), which are listed in the [Using safe handles](#Using-safe-handles) section. If you can't find a class that is suitable for releasing your unmanaged resource, you can implement your own subclass of [SafeHandle](https://docs.microsoft.com/dotnet/core/api/System.Runtime.InteropServices.SafeHandle).
|
||||
|
||||
* You implement the [IDisposable](https://docs.microsoft.com/dotnet/core/api/System.IDisposable) interface and an additional `Dispose(Boolean)` method, and you also override the [Object.Finalize](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_Finalize) method. You must override [Finalize](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_Finalize) to ensure that unmanaged resources are disposed of if your [IDisposable.Dispose](https://docs.microsoft.com/dotnet/core/api/System.IDisposable#System_IDisposable_Dispose) implementation is not called by a consumer of your type. If you use the recommended technique discussed in the previous bullet, the [System.Runtime.InteropServices.SafeHandle](https://docs.microsoft.com/dotnet/core/api/System.Runtime.InteropServices.SafeHandle) class does this on your behalf.
|
||||
|
||||
To help ensure that resources are always cleaned up appropriately, a [Dispose](https://docs.microsoft.com/dotnet/core/api/System.IDisposable#System_IDisposable_Dispose) method should be callable multiple times without throwing an exception.
|
||||
|
||||
The code example provided for the [GC.KeepAlive](https://docs.microsoft.com/dotnet/core/api/System.GC#System_GC_KeepAlive_System_Object_) method shows how aggressive garbage collection can cause a finalizer to run while a member of the reclaimed object is still executing. It is a good idea to call the [KeepAlive](https://docs.microsoft.com/dotnet/core/api/System.GC#System_GC_KeepAlive_System_Object_) method at the end of a lengthy Dispose method.
|
||||
|
||||
## Dispose() and Dispose(Boolean)
|
||||
|
||||
The [IDisposable](https://docs.microsoft.com/dotnet/core/api/System.IDisposable) interface requires the implementation of a single parameterless method, [Dispose](https://docs.microsoft.com/dotnet/core/api/System.IDisposable#System_IDisposable_Dispose). However, the dispose pattern requires two `Dispose` methods to be implemented:
|
||||
|
||||
* A public non-virtual [IDisposable.Dispose](https://docs.microsoft.com/dotnet/core/api/System.IDisposable#System_IDisposable_Dispose) implementation that has no parameters.
|
||||
|
||||
* A protected virtual `Dispose` method whose signature is:
|
||||
|
||||
```cs
|
||||
protected virtual void Dispose(bool disposing)
|
||||
```
|
||||
|
||||
### The Dispose() overload
|
||||
|
||||
Because the public, non-virtual, parameterless `Dispose` method is called by a consumer of the type, its purpose is to free unmanaged resources and to indicate that the finalizer, if one is present, doesn't have to run. Because of this, it has a standard implementation:
|
||||
|
||||
```cs
|
||||
public void Dispose()
|
||||
{
|
||||
// Dispose of unmanaged resources.
|
||||
Dispose(true);
|
||||
// Suppress finalization.
|
||||
GC.SuppressFinalize(this);
|
||||
}
|
||||
```
|
||||
|
||||
The `Dispose` method performs all object cleanup, so the garbage collector no longer needs to call the object's [Object.Finalize](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_Finalize) override. Therefore, the call to the [GC.SuppressFinalize](https://docs.microsoft.com/dotnet/core/api/System.GC.System_GC_SuppressFinalize_System_Object_) method prevents the garbage collector from running the finalizer. If the type has no finalizer, the call to [SuppressFinalize](https://docs.microsoft.com/dotnet/core/api/System.GC.System_GC_SuppressFinalize_System_Object_) has no effect. Note that the actual work of releasing unmanaged resources is performed by the second overload of the `Dispose` method.
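
From the consumer's side, the parameterless overload is usually reached through a `using` statement, which calls `Dispose` when the block exits, even if an exception is thrown. The sketch below uses a hypothetical `TrackedResource` type that is `sealed` and holds no unmanaged resources, so it is simpler than the full pattern; it only shows the call site.

```cs
using System;

// Hypothetical type for illustration only.
sealed class TrackedResource : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose()
    {
        IsDisposed = true;
        // Has no effect here because the type declares no finalizer.
        GC.SuppressFinalize(this);
    }
}

static class ConsumerDemo
{
    public static bool UseAndRelease()
    {
        var resource = new TrackedResource();
        using (resource)
        {
            // Work with the resource here.
        }
        // Dispose has already run by the time the using block exits.
        return resource.IsDisposed;
    }
}
```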
|
||||
|
||||
### The Dispose(Boolean) overload
|
||||
|
||||
In the second overload, the *disposing* parameter is a [Boolean](https://docs.microsoft.com/dotnet/core/api/System.Boolean) that indicates whether the method call comes from a [Dispose](https://docs.microsoft.com/dotnet/core/api/System.IDisposable#System_IDisposable_Dispose) method (its value is `true`) or from a finalizer (its value is `false`).
|
||||
|
||||
The body of the method consists of two blocks of code:
|
||||
|
||||
* A block that frees unmanaged resources. This block executes regardless of the value of the *disposing* parameter.
|
||||
|
||||
* A conditional block that frees managed resources. This block executes if the value of *disposing* is `true`. The managed resources that it frees can include:
|
||||
|
||||
**Managed objects that implement IDisposable**. The conditional block can be used to call their [Dispose](https://docs.microsoft.com/dotnet/core/api/System.IDisposable#System_IDisposable_Dispose) implementation. If you have used a safe handle to wrap your unmanaged resource, you should call the [SafeHandle.Dispose(Boolean)](https://docs.microsoft.com/dotnet/core/api/System.Runtime.InteropServices.SafeHandle.html#System_Runtime_InteropServices_SafeHandle_Dispose_System_Boolean_) implementation here.
|
||||
|
||||
**Managed objects that consume large amounts of memory or consume scarce resources.** Freeing these objects explicitly in the `Dispose` method releases them faster than if they were reclaimed non-deterministically by the garbage collector.
|
||||
|
||||
|
||||
If the method call comes from a finalizer (that is, if *disposing* is `false`), only the code that frees unmanaged resources executes. Because the order in which the garbage collector destroys managed objects during finalization is not defined, calling this `Dispose` overload with a value of `false` prevents the finalizer from trying to release managed resources that may have already been reclaimed.
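
A concrete sketch of the two blocks might look like the following. The `NativeBufferOwner` class is hypothetical: it owns one managed resource (a `MemoryStream`) and one unmanaged resource (a block obtained from `Marshal.AllocHGlobal`), chosen only to make the distinction visible.

```cs
using System;
using System.IO;
using System.Runtime.InteropServices;

// Hypothetical type for illustration only.
class NativeBufferOwner : IDisposable
{
    private IntPtr buffer = Marshal.AllocHGlobal(256); // unmanaged resource
    private MemoryStream log = new MemoryStream();     // managed resource
    private bool disposed;

    public bool IsDisposed { get { return disposed; } }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposed)
            return; // makes repeated calls harmless

        if (disposing)
        {
            // Conditional block: managed objects are touched only when
            // the call comes from Dispose(), never from the finalizer.
            log.Dispose();
        }

        // Unconditional block: unmanaged memory is freed on both paths.
        Marshal.FreeHGlobal(buffer);
        buffer = IntPtr.Zero;
        disposed = true;
    }

    ~NativeBufferOwner()
    {
        Dispose(false);
    }
}
```

The `disposed` flag is what allows `Dispose` to be called more than once without throwing or freeing the buffer twice.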
|
||||
|
||||
## Implementing the dispose pattern for a base class
|
||||
|
||||
If you implement the dispose pattern for a base class, you must provide the following:
|
||||
|
||||
> **Important**
|
||||
>
|
||||
> You should implement this pattern for all base classes that implement [IDisposable](https://docs.microsoft.com/dotnet/core/api/System.IDisposable) and are not `sealed`.
|
||||
|
||||
* A [Dispose](https://docs.microsoft.com/dotnet/core/api/System.IDisposable#System_IDisposable_Dispose) implementation that calls the `Dispose(Boolean)` method.
|
||||
|
||||
* A `Dispose(Boolean)` method that performs the actual work of releasing resources.
|
||||
|
||||
* Either a class derived from [SafeHandle](https://docs.microsoft.com/dotnet/core/api/System.Runtime.InteropServices.SafeHandle) that wraps your unmanaged resource (recommended), or an override to the [Object.Finalize](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_Finalize) method. The [SafeHandle](https://docs.microsoft.com/dotnet/core/api/System.Runtime.InteropServices.SafeHandle) class provides a finalizer that frees you from having to code one.
|
||||
|
||||
Here's the general pattern for implementing the dispose pattern for a base class that uses a safe handle.
|
||||
|
||||
```cs
|
||||
using Microsoft.Win32.SafeHandles;
|
||||
using System;
|
||||
using System.Runtime.InteropServices;
|
||||
|
||||
class BaseClass : IDisposable
|
||||
{
|
||||
// Flag: Has Dispose already been called?
|
||||
bool disposed = false;
|
||||
// Instantiate a SafeHandle instance.
|
||||
SafeHandle handle = new SafeFileHandle(IntPtr.Zero, true);
|
||||
|
||||
// Public implementation of Dispose pattern callable by consumers.
|
||||
public void Dispose()
|
||||
{
|
||||
Dispose(true);
|
||||
GC.SuppressFinalize(this);
|
||||
}
|
||||
|
||||
// Protected implementation of Dispose pattern.
|
||||
protected virtual void Dispose(bool disposing)
|
||||
{
|
||||
if (disposed)
|
||||
return;
|
||||
|
||||
if (disposing) {
|
||||
handle.Dispose();
|
||||
// Free any other managed objects here.
|
||||
//
|
||||
}
|
||||
|
||||
// Free any unmanaged objects here.
|
||||
//
|
||||
disposed = true;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> The previous example uses a [SafeFileHandle](https://docs.microsoft.com/dotnet/core/api/Microsoft.Win32.SafeHandles.SafeFileHandle) object to illustrate the pattern; any object derived from [SafeHandle](https://docs.microsoft.com/dotnet/core/api/System.Runtime.InteropServices.SafeHandle) could be used instead. Note that the example does not properly instantiate its [SafeFileHandle](https://docs.microsoft.com/dotnet/core/api/Microsoft.Win32.SafeHandles.SafeFileHandle) object.
|
||||
|
||||
Here's the general pattern for implementing the dispose pattern for a base class that overrides [Object.Finalize](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_Finalize).
|
||||
|
||||
```cs
|
||||
using System;
|
||||
|
||||
class BaseClass : IDisposable
|
||||
{
|
||||
// Flag: Has Dispose already been called?
|
||||
bool disposed = false;
|
||||
|
||||
// Public implementation of Dispose pattern callable by consumers.
|
||||
public void Dispose()
|
||||
{
|
||||
Dispose(true);
|
||||
GC.SuppressFinalize(this);
|
||||
}
|
||||
|
||||
// Protected implementation of Dispose pattern.
|
||||
protected virtual void Dispose(bool disposing)
|
||||
{
|
||||
if (disposed)
|
||||
return;
|
||||
|
||||
if (disposing) {
|
||||
// Free any other managed objects here.
|
||||
//
|
||||
}
|
||||
|
||||
// Free any unmanaged objects here.
|
||||
//
|
||||
disposed = true;
|
||||
}
|
||||
|
||||
~BaseClass()
|
||||
{
|
||||
Dispose(false);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Implementing the dispose pattern for a derived class
|
||||
|
||||
A class derived from a class that implements the [IDisposable](https://docs.microsoft.com/dotnet/core/api/System.IDisposable) interface shouldn't implement [IDisposable](https://docs.microsoft.com/dotnet/core/api/System.IDisposable), because the base class implementation of [IDisposable.Dispose](https://docs.microsoft.com/dotnet/core/api/System.IDisposable#System_IDisposable_Dispose) is inherited by its derived classes. Instead, to implement the dispose pattern for a derived class, you provide the following:
|
||||
|
||||
* A `protected Dispose(Boolean)` method that overrides the base class method and performs the actual work of releasing the resources of the derived class. This method should also call the `Dispose(Boolean)` method of the base class and pass it the value of its own *disposing* argument.
|
||||
|
||||
* Either a class derived from [SafeHandle](https://docs.microsoft.com/dotnet/core/api/System.Runtime.InteropServices.SafeHandle) that wraps your unmanaged resource (recommended), or an override to the [Object.Finalize](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_Finalize) method. The [SafeHandle](https://docs.microsoft.com/dotnet/core/api/System.Runtime.InteropServices.SafeHandle) class provides a finalizer that frees you from having to code one. If you do provide a finalizer, it should call the `Dispose(Boolean)` overload with a *disposing* argument of `false`.
|
||||
|
||||
Here's the general pattern for implementing the dispose pattern for a derived class that uses a safe handle:
|
||||
|
||||
```cs
|
||||
using Microsoft.Win32.SafeHandles;
|
||||
using System;
|
||||
using System.Runtime.InteropServices;
|
||||
|
||||
class DerivedClass : BaseClass
|
||||
{
|
||||
// Flag: Has Dispose already been called?
|
||||
bool disposed = false;
|
||||
// Instantiate a SafeHandle instance.
|
||||
SafeHandle handle = new SafeFileHandle(IntPtr.Zero, true);
|
||||
|
||||
// Protected implementation of Dispose pattern.
|
||||
protected override void Dispose(bool disposing)
|
||||
{
|
||||
if (disposed)
|
||||
return;
|
||||
|
||||
if (disposing) {
|
||||
handle.Dispose();
|
||||
// Free any other managed objects here.
|
||||
//
|
||||
}
|
||||
|
||||
// Free any unmanaged objects here.
|
||||
//
|
||||
|
||||
disposed = true;
|
||||
// Call base class implementation.
|
||||
base.Dispose(disposing);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
> **Note**
|
||||
>
|
||||
> The previous example uses a [SafeFileHandle](https://docs.microsoft.com/dotnet/core/api/Microsoft.Win32.SafeHandles.SafeFileHandle) object to illustrate the pattern; any object derived from [SafeHandle](https://docs.microsoft.com/dotnet/core/api/System.Runtime.InteropServices.SafeHandle) could be used instead. Note that the example does not properly instantiate its [SafeFileHandle](https://docs.microsoft.com/dotnet/core/api/Microsoft.Win32.SafeHandles.SafeFileHandle) object.
|
||||
|
||||
Here's the general pattern for implementing the dispose pattern for a derived class that overrides [Object.Finalize](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_Finalize):
|
||||
|
||||
```cs
|
||||
using System;
|
||||
|
||||
class DerivedClass : BaseClass
|
||||
{
|
||||
// Flag: Has Dispose already been called?
|
||||
bool disposed = false;
|
||||
|
||||
// Protected implementation of Dispose pattern.
|
||||
protected override void Dispose(bool disposing)
|
||||
{
|
||||
if (disposed)
|
||||
return;
|
||||
|
||||
if (disposing) {
|
||||
// Free any other managed objects here.
|
||||
//
|
||||
}
|
||||
|
||||
// Free any unmanaged objects here.
|
||||
//
|
||||
disposed = true;
|
||||
|
||||
// Call the base class implementation.
|
||||
base.Dispose(disposing);
|
||||
}
|
||||
|
||||
~DerivedClass()
|
||||
{
|
||||
Dispose(false);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Using safe handles
|
||||
|
||||
Writing code for an object's finalizer is a complex task that can cause problems if not done correctly. Therefore, we recommend that you construct [System.Runtime.InteropServices.SafeHandle](https://docs.microsoft.com/dotnet/core/api/System.Runtime.InteropServices.SafeHandle) objects instead of implementing a finalizer.
|
||||
|
||||
Classes derived from the [System.Runtime.InteropServices.SafeHandle](https://docs.microsoft.com/dotnet/core/api/System.Runtime.InteropServices.SafeHandle) class simplify object lifetime issues by assigning and releasing handles without interruption. They contain a critical finalizer that is guaranteed to run while an application domain is unloading. The following derived classes in the [Microsoft.Win32.SafeHandles](https://docs.microsoft.com/dotnet/core/api/Microsoft.Win32.SafeHandles) namespace provide safe handles:
|
||||
|
||||
* The [SafeFileHandle](https://docs.microsoft.com/dotnet/core/api/Microsoft.Win32.SafeHandles.SafeFileHandle), [SafeMemoryMappedFileHandle](https://docs.microsoft.com/dotnet/core/api/Microsoft.Win32.SafeHandles.SafeMemoryMappedFileHandle), and [SafePipeHandle](https://docs.microsoft.com/dotnet/core/api/Microsoft.Win32.SafeHandles.SafePipeHandle) classes, for files, memory-mapped files, and pipes.
|
||||
|
||||
* The [SafeMemoryMappedViewHandle](https://docs.microsoft.com/dotnet/core/api/Microsoft.Win32.SafeHandles.SafeMemoryMappedViewHandle) class, for memory views.
|
||||
|
||||
* The [SafeNCryptKeyHandle](https://docs.microsoft.com/dotnet/core/api/Microsoft.Win32.SafeHandles.SafeNCryptKeyHandle), [SafeNCryptProviderHandle](https://docs.microsoft.com/dotnet/core/api/Microsoft.Win32.SafeHandles.SafeNCryptProviderHandle), and [SafeNCryptSecretHandle](https://docs.microsoft.com/dotnet/core/api/Microsoft.Win32.SafeHandles.SafeNCryptSecretHandle) classes, for cryptography constructs.
|
||||
|
||||
* The [SafeRegistryHandle](https://docs.microsoft.com/dotnet/core/api/Microsoft.Win32.SafeHandles.SafeRegistryHandle) class, for registry keys.
|
||||
|
||||
* The [SafeWaitHandle](https://docs.microsoft.com/dotnet/core/api/Microsoft.Win32.SafeHandles.SafeWaitHandle) class, for wait handles.
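
If none of these classes fits your resource, you can derive your own class from `SafeHandle`. The following is a minimal sketch under the assumption that the resource is a block of memory from `Marshal.AllocHGlobal`; the `SafeHGlobalHandle` name is hypothetical.

```cs
using System;
using System.Runtime.InteropServices;

// Hypothetical safe handle that owns a block of unmanaged memory.
sealed class SafeHGlobalHandle : SafeHandle
{
    public SafeHGlobalHandle(int size)
        : base(IntPtr.Zero, true) // IntPtr.Zero marks an invalid handle; we own the handle
    {
        SetHandle(Marshal.AllocHGlobal(size));
    }

    public override bool IsInvalid
    {
        get { return handle == IntPtr.Zero; }
    }

    // Called by the SafeHandle infrastructure, from Dispose or from its
    // finalizer, and only when the handle is valid.
    protected override bool ReleaseHandle()
    {
        Marshal.FreeHGlobal(handle);
        return true;
    }
}
```

A type that holds a `SafeHGlobalHandle` field only needs to call the handle's `Dispose` method from its own `Dispose(Boolean)` method and does not need a finalizer.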
|
||||
|
||||
## Using a safe handle to implement the dispose pattern for a base class
|
||||
|
||||
The following example illustrates the dispose pattern for a base class, `DisposableStreamResource`, that uses a safe handle to encapsulate unmanaged resources. The `DisposableStreamResource` class uses a [SafeFileHandle](https://docs.microsoft.com/dotnet/core/api/Microsoft.Win32.SafeHandles.SafeFileHandle) to wrap the handle of an open file. The class also includes a single property, `Size`, that returns the total number of bytes in the file.
|
||||
|
||||
```cs
|
||||
using Microsoft.Win32.SafeHandles;
|
||||
using System;
|
||||
using System.IO;
|
||||
using System.Runtime.InteropServices;
|
||||
|
||||
public class DisposableStreamResource : IDisposable
|
||||
{
|
||||
// Define constants.
|
||||
protected const uint GENERIC_READ = 0x80000000;
|
||||
protected const uint FILE_SHARE_READ = 0x00000001;
|
||||
protected const uint OPEN_EXISTING = 3;
|
||||
protected const uint FILE_ATTRIBUTE_NORMAL = 0x80;
|
||||
protected IntPtr INVALID_HANDLE_VALUE = new IntPtr(-1);
|
||||
private const int INVALID_FILE_SIZE = unchecked((int) 0xFFFFFFFF);
|
||||
|
||||
// Define Windows APIs.
|
||||
[DllImport("kernel32.dll", EntryPoint = "CreateFileW", CharSet = CharSet.Unicode)]
|
||||
protected static extern IntPtr CreateFile (
|
||||
string lpFileName, uint dwDesiredAccess,
|
||||
uint dwShareMode, IntPtr lpSecurityAttributes,
|
||||
uint dwCreationDisposition, uint dwFlagsAndAttributes,
|
||||
IntPtr hTemplateFile);
|
||||
|
||||
[DllImport("kernel32.dll")]
|
||||
private static extern int GetFileSize(SafeFileHandle hFile, out int lpFileSizeHigh);
|
||||
|
||||
// Define locals.
|
||||
private bool disposed = false;
|
||||
private SafeFileHandle safeHandle;
|
||||
private long bufferSize;
|
||||
private int upperWord;
|
||||
|
||||
public DisposableStreamResource(string filename)
|
||||
{
|
||||
if (filename == null)
|
||||
throw new ArgumentNullException("filename", "The filename cannot be null.");
|
||||
else if (filename == "")
|
||||
throw new ArgumentException("The filename cannot be an empty string.");
|
||||
|
||||
IntPtr handle = CreateFile(filename, GENERIC_READ, FILE_SHARE_READ,
|
||||
IntPtr.Zero, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL,
|
||||
IntPtr.Zero);
|
||||
if (handle != INVALID_HANDLE_VALUE)
|
||||
safeHandle = new SafeFileHandle(handle, true);
|
||||
else
|
||||
throw new FileNotFoundException(String.Format("Cannot open '{0}'", filename));
|
||||
|
||||
// Get file size.
|
||||
bufferSize = GetFileSize(safeHandle, out upperWord);
|
||||
if (bufferSize == INVALID_FILE_SIZE)
|
||||
bufferSize = -1;
|
||||
else if (upperWord > 0)
|
||||
bufferSize = (((long)upperWord) << 32) + bufferSize;
|
||||
}
|
||||
|
||||
public long Size
|
||||
{ get { return bufferSize; } }
|
||||
|
||||
public void Dispose()
|
||||
{
|
||||
Dispose(true);
|
||||
GC.SuppressFinalize(this);
|
||||
}
|
||||
|
||||
protected virtual void Dispose(bool disposing)
|
||||
{
|
||||
if (disposed) return;
|
||||
|
||||
// Dispose of managed resources here.
|
||||
if (disposing)
|
||||
safeHandle.Dispose();
|
||||
|
||||
// Dispose of any unmanaged resources not wrapped in safe handles.
|
||||
|
||||
disposed = true;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Using a safe handle to implement the dispose pattern for a derived class
|
||||
|
||||
The following example illustrates the dispose pattern for a derived class, `DisposableStreamResource2`, that inherits from the `DisposableStreamResource` class presented in the previous example. The class adds an additional method, `WriteFileInfo`, and uses a [SafeFileHandle](https://docs.microsoft.com/dotnet/core/api/Microsoft.Win32.SafeHandles.SafeFileHandle) object to wrap the handle of the writable file.
|
||||
|
||||
```cs
|
||||
using Microsoft.Win32.SafeHandles;
|
||||
using System;
|
||||
using System.IO;
|
||||
using System.Runtime.InteropServices;
|
||||
using System.Threading;
|
||||
|
||||
public class DisposableStreamResource2 : DisposableStreamResource
|
||||
{
|
||||
// Define additional constants.
|
||||
protected const uint GENERIC_WRITE = 0x40000000;
|
||||
protected const uint OPEN_ALWAYS = 4;
|
||||
|
||||
// Define additional APIs.
|
||||
[DllImport("kernel32.dll")]
|
||||
protected static extern bool WriteFile(
|
||||
SafeFileHandle safeHandle, string lpBuffer,
|
||||
int nNumberOfBytesToWrite, out int lpNumberOfBytesWritten,
|
||||
IntPtr lpOverlapped);
|
||||
|
||||
// Define locals.
|
||||
private bool disposed = false;
|
||||
private string filename;
|
||||
private bool created = false;
|
||||
private SafeFileHandle safeHandle;
|
||||
|
||||
public DisposableStreamResource2(string filename) : base(filename)
|
||||
{
|
||||
this.filename = filename;
|
||||
}
|
||||
|
||||
public void WriteFileInfo()
|
||||
{
|
||||
if (! created) {
|
||||
IntPtr hFile = CreateFile(@".\FileInfo.txt", GENERIC_WRITE, 0,
|
||||
IntPtr.Zero, OPEN_ALWAYS,
|
||||
FILE_ATTRIBUTE_NORMAL, IntPtr.Zero);
|
||||
if (hFile != INVALID_HANDLE_VALUE)
|
||||
safeHandle = new SafeFileHandle(hFile, true);
|
||||
else
|
||||
throw new IOException("Unable to create output file.");
|
||||
|
||||
created = true;
|
||||
}
|
||||
|
||||
string output = String.Format("{0}: {1:N0} bytes\n", filename, Size);
|
||||
int bytesWritten;
|
||||
bool result = WriteFile(safeHandle, output, output.Length, out bytesWritten, IntPtr.Zero);
|
||||
}
|
||||
|
||||
protected override void Dispose(bool disposing)
|
||||
{
|
||||
if (disposed) return;
|
||||
|
||||
// Release any managed resources here.
|
||||
if (disposing)
|
||||
safeHandle?.Dispose(); // null if WriteFileInfo was never called
|
||||
|
||||
disposed = true;
|
||||
|
||||
// Release any unmanaged resources not wrapped by safe handles here.
|
||||
|
||||
// Call the base class implementation.
|
||||
base.Dispose(disposing);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## See Also
|
||||
|
||||
[SuppressFinalize](https://docs.microsoft.com/dotnet/core/api/System.GC#System_GC_SuppressFinalize_System_Object_)
|
||||
|
||||
[IDisposable](https://docs.microsoft.com/dotnet/core/api/System.IDisposable)
|
||||
|
||||
[IDisposable.Dispose](https://docs.microsoft.com/dotnet/core/api/System.IDisposable#System_IDisposable_Dispose)
|
||||
|
||||
[Microsoft.Win32.SafeHandles](https://docs.microsoft.com/dotnet/core/api/Microsoft.Win32.SafeHandles)
|
||||
|
||||
[System.Runtime.InteropServices.SafeHandle](https://docs.microsoft.com/dotnet/core/api/System.Runtime.InteropServices.SafeHandle)
|
||||
|
||||
[IDisposable.Dispose](https://docs.microsoft.com/dotnet/core/api/System.IDisposable#System_IDisposable_Dispose)
|
|
@ -1,54 +0,0 @@
|
|||
---
|
||||
title: Automatic Memory Management and Garbage Collection
|
||||
description: Automatic Memory Management and Garbage Collection
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 88e72321-b44c-4627-bdb0-163d14c152a4
|
||||
---
|
||||
|
||||
# Automatic Memory Management and Garbage Collection
|
||||
|
||||
Automatic memory management is one of the services that the Common Language Runtime provides during managed execution. The Common Language Runtime's garbage collector manages the allocation and release of memory for an application. For developers, this means that you do not have to write code to perform memory management tasks when you develop managed applications. Automatic memory management can eliminate common problems, such as forgetting to free an object and causing a memory leak, or attempting to access memory for an object that has already been freed. This section describes how the garbage collector allocates and releases memory.
|
||||
|
||||
## Allocating Memory
|
||||
|
||||
When you initialize a new process, the runtime reserves a contiguous region of address space for the process. This reserved address space is called the managed heap. The managed heap maintains a pointer to the address where the next object in the heap will be allocated. Initially, this pointer is set to the managed heap's base address. All reference types are allocated on the managed heap. When an application creates the first reference type, memory is allocated for the type at the base address of the managed heap. When the application creates the next object, the garbage collector allocates memory for it in the address space immediately following the first object. As long as address space is available, the garbage collector continues to allocate space for new objects in this manner.
|
||||
|
||||
Allocating memory from the managed heap is faster than unmanaged memory allocation. Because the runtime allocates memory for an object by adding a value to a pointer, it is almost as fast as allocating memory from the stack. In addition, because new objects that are allocated consecutively are stored contiguously in the managed heap, an application can access the objects very quickly.
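
The pointer-increment idea can be sketched with a toy model. The `BumpAllocator` class below is hypothetical and is not how the runtime is implemented; it only shows why this style of allocation is cheap and why consecutive allocations end up next to each other.

```cs
using System;

// Toy model for illustration only; not the runtime's allocator.
class BumpAllocator
{
    private readonly int capacity;
    private int next; // offset where the next object will be placed

    public BumpAllocator(int capacity)
    {
        this.capacity = capacity;
    }

    // Returns the offset of the new block, or -1 when no space is left,
    // which is the point at which a real collector would run a collection.
    public int Allocate(int size)
    {
        if (size < 0 || size > capacity - next)
            return -1;

        int address = next;
        next += size; // allocation is a single pointer increment
        return address;
    }
}
```

For example, with a capacity of 100, two requests of 40 bytes are placed at offsets 0 and 40, and a third request of 40 bytes fails because only 20 bytes remain.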
|
||||
|
||||
## Releasing Memory
|
||||
|
||||
The garbage collector's optimizing engine determines the best time to perform a collection based on the allocations being made. When the garbage collector performs a collection, it releases the memory for objects that are no longer being used by the application. It determines which objects are no longer being used by examining the application's roots. Every application has a set of roots. Each root either refers to an object on the managed heap or is set to null. An application's roots include static fields, local variables and parameters on a thread's stack, and CPU registers. The garbage collector has access to the list of active roots that the just-in-time (JIT) compiler and the runtime maintain. Using this list, it examines an application's roots, and in the process creates a graph that contains all the objects that are reachable from the roots.
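
The graph-building step can be sketched as a simple traversal. The `Node` and `Reachability` types are hypothetical stand-ins for managed objects and the collector's marking logic; the real collector works on raw memory, not on a class like this.

```cs
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a managed object and its outgoing references.
class Node
{
    public readonly string Name;
    public readonly List<Node> References = new List<Node>();

    public Node(string name)
    {
        Name = name;
    }
}

static class Reachability
{
    // Returns every node that can be reached from the given roots.
    public static HashSet<Node> Mark(IEnumerable<Node> roots)
    {
        var reachable = new HashSet<Node>();
        var pending = new Stack<Node>();

        foreach (Node root in roots)
        {
            if (root != null) // a root may be set to null
                pending.Push(root);
        }

        while (pending.Count > 0)
        {
            Node current = pending.Pop();
            if (!reachable.Add(current))
                continue; // already visited, which also handles cycles

            foreach (Node child in current.References)
                pending.Push(child);
        }

        return reachable;
    }
}
```

Any node that is absent from the returned set corresponds to an object that the collector would treat as garbage.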
|
||||
|
||||
Objects that are not in the graph are unreachable from the application's roots. The garbage collector considers unreachable objects garbage and will release the memory allocated for them. During a collection, the garbage collector examines the managed heap, looking for the blocks of address space occupied by unreachable objects. As it discovers each unreachable object, it uses a memory-copying function to compact the reachable objects in memory, freeing up the blocks of address space allocated to unreachable objects. Once the memory for the reachable objects has been compacted, the garbage collector makes the necessary pointer corrections so that the application's roots point to the objects in their new locations. It also positions the managed heap's pointer after the last reachable object. Note that memory is compacted only if a collection discovers a significant number of unreachable objects. If all the objects in the managed heap survive a collection, then there is no need for memory compaction.
|
||||
|
||||
To improve performance, the runtime allocates memory for large objects in a separate heap. The garbage collector automatically releases the memory for large objects. However, to avoid moving large objects in memory, this memory is not compacted.
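As a rough illustration, the following sketch asks the runtime which generation it reports for a small array and for a large one. It assumes the commonly cited default threshold of roughly 85,000 bytes for the large object heap, which is not stated in this topic and may differ by runtime or configuration. The array sizes are arbitrary choices.

```cs
using System;

public static class LargeObjectExample
{
    public static void Main()
    {
        // Assumption: objects of about 85,000 bytes or more are placed
        // on the large object heap by default.
        byte[] small = new byte[1000];
        byte[] large = new byte[200000];

        Console.WriteLine("Small array generation: {0}", GC.GetGeneration(small));
        Console.WriteLine("Large array generation: {0}", GC.GetGeneration(large));
        Console.WriteLine("Highest generation:     {0}", GC.MaxGeneration);

        // Keep both arrays reachable until the generations have been printed.
        GC.KeepAlive(small);
        GC.KeepAlive(large);
    }
}
```

Large objects are typically reported as belonging to the highest generation, because they are collected together with it.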
|
||||
|
||||
## Generations and Performance
|
||||
|
||||
To optimize the performance of the garbage collector, the managed heap is divided into three generations: 0, 1, and 2. The runtime's garbage collection algorithm is based on several generalizations that the computer software industry has discovered to be true by experimenting with garbage collection schemes. First, it is faster to compact the memory for a portion of the managed heap than for the entire managed heap. Second, newer objects tend to have shorter lifetimes and older objects tend to have longer lifetimes. Finally, newer objects tend to be related to each other and accessed by the application around the same time.
|
||||
|
||||
The runtime's garbage collector stores new objects in generation 0. Objects created early in the application's lifetime that survive collections are promoted and stored in generations 1 and 2. The process of object promotion is described later in this topic. Because it is faster to compact a portion of the managed heap than the entire heap, this scheme allows the garbage collector to release the memory in a specific generation rather than release the memory for the entire managed heap each time it performs a collection.
|
||||
|
||||
In reality, the garbage collector performs a collection when generation 0 is full. If an application attempts to create a new object when generation 0 is full, the garbage collector discovers that there is no address space remaining in generation 0 to allocate for the object. The garbage collector performs a collection in an attempt to free address space in generation 0 for the object. The garbage collector starts by examining the objects in generation 0 rather than all objects in the managed heap. This is the most efficient approach, because new objects tend to have short lifetimes, and it is expected that many of the objects in generation 0 will no longer be in use by the application when a collection is performed. In addition, a collection of generation 0 alone often reclaims enough memory to allow the application to continue creating new objects.
|
||||
|
||||
After the garbage collector performs a collection of generation 0, it compacts the memory for the reachable objects as explained in [Releasing Memory](#releasing-memory) earlier in this topic. The garbage collector then promotes these objects and considers this portion of the managed heap generation 1. Because objects that survive collections tend to have longer lifetimes, it makes sense to promote them to a higher generation. As a result, the garbage collector does not have to reexamine the objects in generations 1 and 2 each time it performs a collection of generation 0.
|
||||
|
||||
After the garbage collector performs its first collection of generation 0 and promotes the reachable objects to generation 1, it considers the remainder of the managed heap generation 0. It continues to allocate memory for new objects in generation 0 until generation 0 is full and it is necessary to perform another collection. At this point, the garbage collector's optimizing engine determines whether it is necessary to examine the objects in older generations. For example, if a collection of generation 0 does not reclaim enough memory for the application to successfully complete its attempt to create a new object, the garbage collector can perform a collection of generation 1, then generation 2. If this does not reclaim enough memory, the garbage collector can perform a collection of generations 2, 1, and 0. After each collection, the garbage collector compacts the reachable objects in generation 0 and promotes them to generation 1. Objects in generation 1 that survive collections are promoted to generation 2. Because the garbage collector supports only three generations, objects in generation 2 that survive a collection remain in generation 2 until they are determined to be unreachable in a future collection.
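Promotion can be observed with [GC.GetGeneration](https://docs.microsoft.com/dotnet/core/api/System.GC). The following is a minimal sketch, assuming a hypothetical `TrackGenerations` helper; the collections are induced only to make promotion visible, and the generations actually reported depend on the runtime and its configuration.

```cs
using System;

public static class PromotionExample
{
    // Returns the generation reported for one long-lived object before any
    // induced collection and after each of the requested collections.
    public static int[] TrackGenerations(int collections)
    {
        object survivor = new object();
        int[] generations = new int[collections + 1];
        generations[0] = GC.GetGeneration(survivor);

        for (int i = 1; i <= collections; i++)
        {
            GC.Collect();   // induced only to make promotion observable
            generations[i] = GC.GetGeneration(survivor);
        }

        GC.KeepAlive(survivor);   // keep the object reachable until here
        return generations;
    }

    public static void Main()
    {
        int[] generations = TrackGenerations(3);
        for (int i = 0; i < generations.Length; i++)
        {
            Console.WriteLine("After {0} collection(s): generation {1}",
                              i, generations[i]);
        }
        Console.WriteLine("Highest generation: {0}", GC.MaxGeneration);
    }
}
```

An object that survives typically moves up one generation per collection until it reaches the highest generation, where it stays.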
|
||||
|
||||
## Releasing Memory for Unmanaged Resources
|
||||
|
||||
For the majority of the objects that your application creates, you can rely on the garbage collector to automatically perform the necessary memory management tasks. However, unmanaged resources require explicit cleanup. The most common type of unmanaged resource is an object that wraps an operating system resource, such as a file handle, window handle, or network connection. Although the garbage collector is able to track the lifetime of a managed object that encapsulates an unmanaged resource, it does not have specific knowledge about how to clean up the resource. When you create an object that encapsulates an unmanaged resource, it is recommended that you provide the necessary code to clean up the unmanaged resource in a public `Dispose` method. By providing a `Dispose` method, you enable users of your object to explicitly free its memory when they are finished with the object. When you use an object that encapsulates an unmanaged resource, you should be aware of `Dispose` and call it as necessary. For more information about cleaning up unmanaged resources and an example of a design pattern for implementing `Dispose`, see [Garbage Collection](garbage-collection.md).
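The following is a minimal sketch of a public `Dispose` method, assuming a hypothetical `NativeBuffer` type that owns a block of unmanaged memory. It deliberately omits a finalizer or safe handle, so the memory leaks if a caller never calls `Dispose`; it is meant only to show the explicit cleanup described above.

```cs
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper around a block of unmanaged memory.
public sealed class NativeBuffer : IDisposable
{
    private IntPtr _memory;

    public NativeBuffer(int size)
    {
        _memory = Marshal.AllocHGlobal(size);
    }

    public bool IsDisposed
    {
        get { return _memory == IntPtr.Zero; }
    }

    public void Dispose()
    {
        if (_memory != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_memory);
            _memory = IntPtr.Zero;   // makes repeated Dispose calls harmless
        }
    }
}

public static class NativeBufferExample
{
    public static void Main()
    {
        NativeBuffer buffer = new NativeBuffer(1024);
        try
        {
            Console.WriteLine("Disposed before cleanup: {0}", buffer.IsDisposed);
        }
        finally
        {
            buffer.Dispose();
        }
        Console.WriteLine("Disposed after cleanup: {0}", buffer.IsDisposed);
    }
}
```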
|
||||
|
||||
## See Also
|
||||
|
||||
[System.GC](https://docs.microsoft.com/dotnet/core/api/System.GC)
|
||||
|
||||
[Garbage Collection](garbage-collection.md)
|
||||
|
|
@ -1,44 +0,0 @@
|
|||
---
|
||||
title: Induced Collections
|
||||
description: Induced Collections
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 29b8b5d0-7431-4fca-ab37-680ab36508fc
|
||||
---
|
||||
|
||||
# Induced Collections
|
||||
|
||||
In most cases, the garbage collector can determine the best time to perform a collection, and you should let it run independently. There are rare situations when a forced collection might improve your application's performance. In these cases, you can induce garbage collection by using the [GC.Collect](https://docs.microsoft.com/dotnet/core/api/System.GC#System_GC_Collect) method to force a garbage collection.
|
||||
|
||||
Use the [Collect](https://docs.microsoft.com/dotnet/core/api/System.GC#System_GC_Collect) method when there is a significant reduction in the amount of memory being used at a specific point in your application's code. For example, if your application uses a complex dialog box that has several controls, calling [Collect](https://docs.microsoft.com/dotnet/core/api/System.GC#System_GC_Collect) when the dialog box is closed could improve performance by immediately reclaiming the memory used by the dialog box. Be sure that your application is not inducing garbage collection too frequently, because that can decrease performance if the garbage collector is trying to reclaim objects at non-optimal times. You can supply a [GCCollectionMode.Optimized](https://docs.microsoft.com/dotnet/core/api/System.GCCollectionMode#System_GCCollectionMode_Optimized) enumeration value to the [Collect](https://docs.microsoft.com/dotnet/core/api/System.GC#System_GC_Collect) method to collect only when collection would be productive, as discussed in the next section.
|
||||
|
||||
## GC collection mode
|
||||
|
||||
You can use one of the [GC.Collect](https://docs.microsoft.com/dotnet/core/api/System.GC#System_GC_Collect) method overloads that includes a [GCCollectionMode](https://docs.microsoft.com/dotnet/core/api/System.GCCollectionMode) value to specify the behavior for a forced collection as follows.
|
||||
|
||||
GCCollectionMode value | Description
|
||||
---------------------- | -----------
|
||||
[Default](https://docs.microsoft.com/dotnet/core/api/System.GCCollectionMode#System_GCCollectionMode_Default) | Uses the default garbage collection setting for the running version of .NET.
|
||||
[Forced](https://docs.microsoft.com/dotnet/core/api/System.GCCollectionMode#System_GCCollectionMode_Forced) | Forces garbage collection to occur immediately. This is equivalent to calling the [GC.Collect()](https://docs.microsoft.com/dotnet/core/api/System.GC#System_GC_Collect) overload. It results in a full blocking collection of all generations. You can also compact the large object heap by setting the [GCSettings.LargeObjectHeapCompactionMode](https://docs.microsoft.com/dotnet/core/api/GCSettingsSystem_Runtime_GCSettings_LargeObjectHeapCompactionMode) property to [GCLargeObjectHeapCompactionMode.CompactOnce](https://docs.microsoft.com/dotnet/core/api/GCLargeObjectHeapCompactionMode#System_Runtime_GCLargeObjectHeapCompactionMode_CompactOnce) before forcing an immediate full blocking garbage collection.
|
||||
[Optimized](https://docs.microsoft.com/dotnet/core/api/System.GCCollectionMode#System_GCCollectionMode_Optimized) | Enables the garbage collector to determine whether the current time is optimal to reclaim objects. The garbage collector could determine that a collection would not be productive enough to be justified, in which case it will return without reclaiming objects.
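
The modes in the table above can be sketched as follows. This is an illustrative example rather than a recommendation; the choice of `GC.MaxGeneration` as the generation argument is an assumption made for the sketch.

```cs
using System;
using System.Runtime;

public static class CollectionModeExample
{
    public static void Main()
    {
        // Optimized: the collector decides whether collecting now is worthwhile.
        GC.Collect(GC.MaxGeneration, GCCollectionMode.Optimized);

        // Request a one-time compaction of the large object heap, then
        // force a full blocking collection of all generations.
        GCSettings.LargeObjectHeapCompactionMode =
            GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);

        Console.WriteLine("LOH compaction mode after the collection: {0}",
                          GCSettings.LargeObjectHeapCompactionMode);
        Console.WriteLine("Full collections so far: {0}",
                          GC.CollectionCount(GC.MaxGeneration));
    }
}
```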
|
||||
|
||||
## Background or blocking collections
|
||||
|
||||
You can call the [GC.Collect(Int32, GCCollectionMode, Boolean)](https://docs.microsoft.com/dotnet/core/api/System.GC#System_GC_Collect_System_Int32_System_GCCollectionMode_System_Boolean_) method overload to specify whether an induced collection is blocking or not. The type of collection performed depends on a combination of the method's *mode* and *blocking* parameters. *mode* is a member of the [GCCollectionMode](https://docs.microsoft.com/dotnet/core/api/System.GCCollectionMode) enumeration, and *blocking* is a [Boolean](https://docs.microsoft.com/dotnet/core/api/System.Boolean) value. The following table summarizes the interaction of the mode and blocking arguments.
|
||||
|
||||
*mode* | *blocking* = true | *blocking* = false
|
||||
------ | ----------------- | ------------------
|
||||
[Forced](https://docs.microsoft.com/dotnet/core/api/System.GCCollectionMode#System_GCCollectionMode_Forced) or [Default](https://docs.microsoft.com/dotnet/core/api/System.GCCollectionMode#System_GCCollectionMode_Default) | A blocking collection is performed as soon as possible. If a background collection is in progress and generation is 0 or 1, the [Collect(Int32, GCCollectionMode, Boolean)](https://docs.microsoft.com/dotnet/core/api/System.GC#System_GC_Collect_System_Int32_System_GCCollectionMode_System_Boolean_) method immediately triggers a blocking collection and returns when the collection is finished. If a background collection is in progress and the generation parameter is 2, the method waits until the background collection is finished, triggers a blocking generation 2 collection, and then returns. | A collection is performed as soon as possible. The [Collect(Int32, GCCollectionMode, Boolean)](https://docs.microsoft.com/dotnet/core/api/System.GC#System_GC_Collect_System_Int32_System_GCCollectionMode_System_Boolean_) method requests a background collection, but this is not guaranteed; depending on the circumstances, a blocking collection may still be performed. If a background collection is already in progress, the method returns immediately.
|
||||
[Optimized](https://docs.microsoft.com/dotnet/core/api/System.GCCollectionMode#System_GCCollectionMode_Optimized) | A blocking collection may be performed, depending on the state of the garbage collector and the generation parameter. The garbage collector tries to provide optimal performance. | A collection may be performed, depending on the state of the garbage collector. The [Collect(Int32, GCCollectionMode, Boolean)](https://docs.microsoft.com/dotnet/core/api/System.GC#System_GC_Collect_System_Int32_System_GCCollectionMode_System_Boolean_) method requests a background collection, but this is not guaranteed; depending on the circumstances, a blocking collection may still be performed. The garbage collector tries to provide optimal performance. If a background collection is already in progress, the method returns immediately.
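
The two ends of the table above can be sketched as follows. The `ForceBlockingCollection` helper is a hypothetical name; it counts how many full collections were recorded while a forced, blocking collection ran. The non-blocking call is only a request, so the sketch makes no claim about what the collector does with it.

```cs
using System;

public static class BlockingCollectionExample
{
    // Forces a blocking collection of all generations and returns how many
    // full collections were recorded while the call was in progress.
    public static int ForceBlockingCollection()
    {
        int before = GC.CollectionCount(GC.MaxGeneration);
        GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced, true);
        return GC.CollectionCount(GC.MaxGeneration) - before;
    }

    public static void Main()
    {
        Console.WriteLine("Full collections during the blocking call: {0}",
                          ForceBlockingCollection());

        // Optimized and non-blocking: the collector may start a background
        // collection, perform a blocking one, or decide not to collect.
        GC.Collect(GC.MaxGeneration, GCCollectionMode.Optimized, false);
        Console.WriteLine("Non-blocking request issued.");
    }
}
```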
|
||||
|
||||
## See Also
|
||||
|
||||
[Latency Modes](latency.md)
|
||||
|
||||
[Garbage Collection](garbage-collection.md)
|
|
@ -1,60 +0,0 @@
|
|||
---
|
||||
title: Latency Modes
|
||||
description: Latency Modes
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: d5224f62-ebe1-49d8-9c56-4f37ac42b2f8
|
||||
---
|
||||
|
||||
# Latency Modes
|
||||
|
||||
To reclaim objects, the garbage collector must stop all the executing threads in an application. In some situations, such as when an application retrieves data or displays content, a full garbage collection can occur at a critical time and impede performance. You can adjust the intrusiveness of the garbage collector by setting the [GCSettings.LatencyMode](https://docs.microsoft.com/dotnet/core/api/GCSettings#System_Runtime_GCSettings_LatencyMode) property to one of the [System.Runtime.GCLatencyMode](https://docs.microsoft.com/dotnet/core/api/System.Runtime.GCLatencyMode) values.
|
||||
|
||||
Latency refers to the time that the garbage collector intrudes in your application. During low latency periods, the garbage collector is more conservative and less intrusive in reclaiming objects. The [System.Runtime.GCLatencyMode](https://docs.microsoft.com/dotnet/core/api/System.Runtime.GCLatencyMode) enumeration provides two low latency settings:
|
||||
|
||||
* [LowLatency](https://docs.microsoft.com/dotnet/core/api/System.Runtime.GCLatencyMode#System_Runtime_GCLatencyMode_LowLatency) suppresses generation 2 collections and performs only generation 0 and 1 collections. It can be used only for short periods of time. Over longer periods, if the system is under memory pressure, the garbage collector will trigger a collection, which can briefly pause the application and disrupt a time-critical operation.
|
||||
|
||||
* [SustainedLowLatency](https://docs.microsoft.com/dotnet/core/api/System.Runtime.GCLatencyMode#System_Runtime_GCLatencyMode_SustainedLowLatency) suppresses foreground generation 2 collections and performs only generation 0, 1, and background generation 2 collections. It can be used for longer periods of time.
|
||||
|
||||
During low latency periods, generation 2 collections are suppressed unless the following occurs:
|
||||
|
||||
* The system receives a low memory notification from the operating system.
|
||||
|
||||
* Your application code induces a collection by calling the [GC.Collect](https://docs.microsoft.com/dotnet/core/api/System.GC#System_GC_Collect_System_Int32_) method and specifying 2 for the generation parameter.
|
||||
|
||||
The following table lists the application scenarios for using the [GCLatencyMode](https://docs.microsoft.com/dotnet/core/api/System.Runtime.GCLatencyMode) values.
|
||||
|
||||
Latency mode | Application scenarios
|
||||
------------ | ---------------------
|
||||
[Batch](https://docs.microsoft.com/dotnet/core/api/System.Runtime.GCLatencyMode#System_Runtime_GCLatencyMode_Batch) | For applications that have no UI or server-side operations.
|
||||
[Interactive](https://docs.microsoft.com/dotnet/core/api/System.Runtime.GCLatencyMode#System_Runtime_GCLatencyMode_Interactive) | For most applications that have a UI.
|
||||
[LowLatency](https://docs.microsoft.com/dotnet/core/api/System.Runtime.GCLatencyMode#System_Runtime_GCLatencyMode_LowLatency) | For applications that have short-term, time-sensitive operations during which interruptions from the garbage collector could be disruptive. For example, applications that do animation rendering or data acquisition functions.
|
||||
[SustainedLowLatency](https://docs.microsoft.com/dotnet/core/api/System.Runtime.GCLatencyMode#System_Runtime_GCLatencyMode_SustainedLowLatency) | For applications that have time-sensitive operations for a contained but potentially longer duration of time during which interruptions from the garbage collector could be disruptive. For example, applications that need quick response times as market data changes during trading hours. This mode results in a larger managed heap size than other modes. Because it does not compact the managed heap, higher fragmentation is possible. Ensure that sufficient memory is available.
|
||||
|
||||
## Guidelines for Using Low Latency
|
||||
|
||||
When you use [LowLatency](https://docs.microsoft.com/dotnet/core/api/System.Runtime.GCLatencyMode#System_Runtime_GCLatencyMode_LowLatency) mode, consider the following guidelines:
|
||||
|
||||
* Keep the period of time in low latency as short as possible.
|
||||
|
||||
* Avoid allocating large amounts of memory during low latency periods. Low memory notifications can occur because garbage collection reclaims fewer objects.
|
||||
|
||||
* While in the low latency mode, minimize the number of allocations you make, in particular allocations onto the Large Object Heap and pinned objects.
|
||||
|
||||
* Be aware of threads that could be allocating. Because the [LatencyMode](https://docs.microsoft.com/dotnet/core/api/GCSettings#System_Runtime_GCSettings_LatencyMode) property setting is process-wide, you could generate an [OutOfMemoryException](https://docs.microsoft.com/dotnet/core/api/System.OutOfMemoryException) on any thread that may be allocating.
|
||||
|
||||
* You can force generation 2 collections during a low latency period by calling the [GC.Collect(Int32, GCCollectionMode)](https://docs.microsoft.com/dotnet/core/api/System.GC#System_GC_Collect_System_Int32_System_GCCollectionMode_) method.
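
One way to keep the low latency period short is to set the mode immediately before the time-sensitive work and restore the previous mode in a `finally` block. The following is a minimal sketch; `RunWithLatencyMode` is a hypothetical helper, and the runtime may ignore a requested mode that is not available in the current garbage collection configuration.

```cs
using System;
using System.Runtime;

public static class LatencyExample
{
    // Runs the given work with the requested latency mode and restores
    // the previous mode afterwards, even if the work throws.
    public static void RunWithLatencyMode(GCLatencyMode mode, Action work)
    {
        GCLatencyMode previous = GCSettings.LatencyMode;
        try
        {
            GCSettings.LatencyMode = mode;
            work();
        }
        finally
        {
            GCSettings.LatencyMode = previous;
        }
    }

    public static void Main()
    {
        Console.WriteLine("Mode before: {0}", GCSettings.LatencyMode);
        RunWithLatencyMode(GCLatencyMode.SustainedLowLatency, () =>
        {
            // Time-sensitive work goes here; keep allocations small.
            Console.WriteLine("Mode during: {0}", GCSettings.LatencyMode);
        });
        Console.WriteLine("Mode after:  {0}", GCSettings.LatencyMode);
    }
}
```

Because the setting is process-wide, the restore step matters: other threads run under the same mode for as long as it is in effect.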
|
||||
|
||||
## See Also
|
||||
|
||||
[System.GC](https://docs.microsoft.com/dotnet/core/api/System.GC)
|
||||
|
||||
[Induced Collections](induced.md)
|
||||
|
||||
[Garbage Collection](garbage-collection.md)
|
|
@ -1,45 +0,0 @@
|
|||
---
|
||||
title: Cleaning Up Unmanaged Resources
|
||||
description: Cleaning Up Unmanaged Resources
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: ac685b07-c5a8-4693-b27c-1549cc8ab3e6
|
||||
---
|
||||
|
||||
# Cleaning Up Unmanaged Resources
|
||||
|
||||
For the majority of the objects that your app creates, you can rely on the .NET garbage collector to handle memory management. However, when you create objects that include unmanaged resources, you must explicitly release those resources when you finish using them in your app. The most common types of unmanaged resource are objects that wrap operating system resources, such as files, windows, network connections, or database connections. Although the garbage collector is able to track the lifetime of an object that encapsulates an unmanaged resource, it doesn't know how to release and clean up the unmanaged resource.
|
||||
|
||||
If your types use unmanaged resources, you should do the following:
|
||||
|
||||
* Implement the dispose pattern. This requires that you provide an [IDisposable.Dispose](https://docs.microsoft.com/dotnet/core/api/System.IDisposable#System_IDisposable_Dispose) implementation to enable the deterministic release of unmanaged resources. A consumer of your type calls [Dispose](https://docs.microsoft.com/dotnet/core/api/System.IDisposable#System_IDisposable_Dispose) when the object (and the resources it uses) is no longer needed. The [Dispose](https://docs.microsoft.com/dotnet/core/api/System.IDisposable#System_IDisposable_Dispose) method immediately releases the unmanaged resources.
|
||||
|
||||
* Provide for your unmanaged resources to be released in the event that a consumer of your type forgets to call [Dispose](https://docs.microsoft.com/dotnet/core/api/System.IDisposable#System_IDisposable_Dispose). There are two ways to do this:
|
||||
|
||||
* Use a safe handle to wrap your unmanaged resource. This is the recommended technique. Safe handles are derived from the [System.Runtime.InteropServices.SafeHandle](https://docs.microsoft.com/dotnet/core/api/System.Runtime.InteropServices.SafeHandle) class and include a robust [Finalize](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_Finalize) method. When you use a safe handle, you simply implement the [IDisposable](https://docs.microsoft.com/dotnet/core/api/System.IDisposable) interface and call your safe handle's [Dispose](https://docs.microsoft.com/dotnet/core/api/System.IDisposable#System_IDisposable_Dispose) method in your [IDisposable.Dispose](https://docs.microsoft.com/dotnet/core/api/System.IDisposable#System_IDisposable_Dispose) implementation. The safe handle's finalizer is called automatically by the garbage collector if its [Dispose](https://docs.microsoft.com/dotnet/core/api/System.IDisposable#System_IDisposable_Dispose) method is not called.
|
||||
|
||||
—or—
|
||||
|
||||
* Override the [Object.Finalize](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_Finalize) method. Finalization enables the non-deterministic release of unmanaged resources when the consumer of a type fails to call [IDisposable.Dispose](https://docs.microsoft.com/dotnet/core/api/System.IDisposable#System_IDisposable_Dispose) to dispose of them deterministically. However, because object finalization can be a complex and error-prone operation, we recommend that you use a safe handle instead of providing your own finalizer.
|
||||
|
||||
Consumers of your type can then call your [IDisposable.Dispose](https://docs.microsoft.com/dotnet/core/api/System.IDisposable#System_IDisposable_Dispose) implementation directly to free memory used by unmanaged resources. When you properly implement a [Dispose](https://docs.microsoft.com/dotnet/core/api/System.IDisposable#System_IDisposable_Dispose) method, either your safe handle's [Finalize](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_Finalize) method or your own override of the [Object.Finalize](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_Finalize) method becomes a safeguard to clean up resources in the event that the [Dispose](https://docs.microsoft.com/dotnet/core/api/System.IDisposable#System_IDisposable_Dispose) method is not called.
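The recommended safe handle approach can be sketched as follows. Both `UnmanagedMemoryHandle` and `ScratchBuffer` are hypothetical types written for this illustration, and unmanaged memory stands in for whatever operating system resource a real type would wrap.

```cs
using System;
using System.Runtime.InteropServices;

// Hypothetical safe handle that owns a block of unmanaged memory.
public sealed class UnmanagedMemoryHandle : SafeHandle
{
    public UnmanagedMemoryHandle(int size)
        : base(IntPtr.Zero, true)
    {
        SetHandle(Marshal.AllocHGlobal(size));
    }

    public override bool IsInvalid
    {
        get { return handle == IntPtr.Zero; }
    }

    // Runs once, either from Dispose or from the SafeHandle finalizer.
    protected override bool ReleaseHandle()
    {
        Marshal.FreeHGlobal(handle);
        return true;
    }
}

// Hypothetical type that owns the safe handle and implements IDisposable.
public sealed class ScratchBuffer : IDisposable
{
    private readonly UnmanagedMemoryHandle _handle;

    public ScratchBuffer(int size)
    {
        _handle = new UnmanagedMemoryHandle(size);
    }

    public bool IsDisposed
    {
        get { return _handle.IsClosed; }
    }

    public void Dispose()
    {
        _handle.Dispose();   // this class needs no finalizer of its own
    }
}

public static class ScratchBufferExample
{
    public static void Main()
    {
        using (ScratchBuffer buffer = new ScratchBuffer(256))
        {
            Console.WriteLine("In use, disposed: {0}", buffer.IsDisposed);
        }
    }
}
```

If a consumer forgets to call `Dispose`, the safe handle's finalizer still releases the memory, which is the safeguard described above.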
|
||||
|
||||
## In This Section
|
||||
|
||||
[Implementing a Dispose Method](implementing-dispose.md) - Describes how to implement the dispose pattern for releasing unmanaged resources.
|
||||
|
||||
[Using Objects That Implement IDisposable](using-objects.md) - Describes how consumers of a type ensure that its Dispose implementation is called. We recommend using the C# using statement or the Visual Basic Using statement to do this.
|
||||
|
||||
## Reference
|
||||
|
||||
[System.IDisposable](https://docs.microsoft.com/dotnet/core/api/System.IDisposable) - Defines the `Dispose` method for releasing unmanaged resources.
|
||||
|
||||
[Object.Finalize](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_Finalize) - Provides for object finalization if unmanaged resources are not released by the `Dispose` method.
|
||||
|
||||
[GC.SuppressFinalize](https://docs.microsoft.com/dotnet/core/api/System.GC#System_GC_SuppressFinalize_System_Object_) - Suppresses finalization. This method is customarily called from a `Dispose` method to prevent a finalizer from executing.
|
|
@ -1,163 +0,0 @@
|
|||
---
|
||||
title: Using Objects That Implement IDisposable
|
||||
description: Using Objects That Implement IDisposable
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: cf2c349e-cb30-4976-b9aa-80ce9742df12
|
||||
---
|
||||
|
||||
# Using Objects That Implement IDisposable
|
||||
|
||||
The common language runtime's garbage collector reclaims the memory used by managed objects, but types that use unmanaged resources implement the [IDisposable](https://docs.microsoft.com/dotnet/core/api/System.IDisposable) interface to allow this unmanaged memory to be reclaimed. When you finish using an object that implements [IDisposable](https://docs.microsoft.com/dotnet/core/api/System.IDisposable), you should call the object's [IDisposable.Dispose](https://docs.microsoft.com/dotnet/core/api/System.IDisposable#System_IDisposable_Dispose) implementation. You can do this in one of two ways:
|
||||
|
||||
* With the `using` statement.
|
||||
|
||||
* By implementing a `try/finally` block.
|
||||
|
||||
## The using statement
|
||||
|
||||
The `using` statement simplifies the code that you must write to create and clean up an object. The `using` statement obtains one or more resources, executes the statements that you specify, and automatically disposes of the resources. However, the `using` statement is useful only for objects that are used within the scope of the method in which they are constructed.
|
||||
|
||||
The following example uses the `using` statement to create and release a [System.IO.StreamReader](https://docs.microsoft.com/dotnet/core/api/System.IO.StreamReader) object.
|
||||
|
||||
```cs
using System;
using System.IO;

public class Example
{
   public static void Main()
   {
      Char[] buffer = new Char[50];
      using (StreamReader s = new StreamReader("File1.txt")) {
         int charsRead = 0;
         while (s.Peek() != -1) {
            charsRead = s.Read(buffer, 0, buffer.Length);
            //
            // Process characters read.
            //
         }
         s.Close();
      }
   }
}
```
|
||||
|
||||
Note that although the [StreamReader](https://docs.microsoft.com/dotnet/core/api/System.IO.StreamReader) class implements the [IDisposable](https://docs.microsoft.com/dotnet/core/api/System.IDisposable) interface, which indicates that it uses an unmanaged resource, the example doesn't explicitly call the [StreamReader.Dispose](https://docs.microsoft.com/dotnet/core/api/System.IO.StreamReader#System_IO_StreamReader_Dispose_System_Boolean_) method. When the C# compiler encounters the `using` statement, it emits intermediate language (IL) that is equivalent to the following code that explicitly contains a `try/finally` block.
|
||||
|
||||
```cs
using System;
using System.IO;

public class Example
{
   public static void Main()
   {
      Char[] buffer = new Char[50];
      {
         StreamReader s = new StreamReader("File1.txt");
         try {
            int charsRead = 0;
            while (s.Peek() != -1) {
               charsRead = s.Read(buffer, 0, buffer.Length);
               //
               // Process characters read.
               //
            }
            s.Close();
         }
         finally {
            if (s != null)
               ((IDisposable)s).Dispose();
         }
      }
   }
}
```
|
||||
|
||||
The `using` statement also allows you to acquire multiple resources in a single statement, which is internally equivalent to nested using statements. The following example instantiates two [StreamReader](https://docs.microsoft.com/dotnet/core/api/System.IO.StreamReader) objects to read the contents of two different files.
|
||||
|
||||
```cs
using System;
using System.IO;

public class Example
{
   public static void Main()
   {
      Char[] buffer1 = new Char[50], buffer2 = new Char[50];

      using (StreamReader version1 = new StreamReader("file1.txt"),
                          version2 = new StreamReader("file2.txt")) {
         int charsRead1, charsRead2 = 0;
         while (version1.Peek() != -1 && version2.Peek() != -1) {
            charsRead1 = version1.Read(buffer1, 0, buffer1.Length);
            charsRead2 = version2.Read(buffer2, 0, buffer2.Length);
            //
            // Process characters read.
            //
         }
         version1.Close();
         version2.Close();
      }
   }
}
```
|
||||
|
||||
## Try/finally block
|
||||
|
||||
Instead of wrapping a `try/finally` block in a `using` statement, you may choose to implement the `try/finally` block directly. This may be your personal coding style, or you might want to do this for one of the following reasons:
|
||||
|
||||
* To include a `catch` block to handle any exceptions thrown in the `try` block. Otherwise, any exceptions thrown by the `using` statement are unhandled, as are any exceptions thrown within the `using` block if a `try/catch` block isn't present.
|
||||
|
||||
* To instantiate an object that implements [IDisposable](https://docs.microsoft.com/dotnet/core/api/System.IDisposable) whose scope is not local to the block within which it is declared.
|
||||
|
||||
The following example is similar to the previous example, except that it uses a `try/catch/finally` block to instantiate, use, and dispose of a [StreamReader](https://docs.microsoft.com/dotnet/core/api/System.IO.StreamReader) object, and to handle any exceptions thrown by the [StreamReader](https://docs.microsoft.com/dotnet/core/api/System.IO.StreamReader) constructor and its [ReadToEnd](https://docs.microsoft.com/dotnet/core/api/System.IO.StreamReader#System_IO_StreamReader_ReadToEnd) method. Note that the code in the `finally` block checks that the object that implements [IDisposable](https://docs.microsoft.com/dotnet/core/api/System.IDisposable) isn't `null` before it calls the [Dispose](https://docs.microsoft.com/dotnet/core/api/System.IDisposable#System_IDisposable_Dispose) method. Failure to do this can result in a [NullReferenceException](https://docs.microsoft.com/dotnet/core/api/System.NullReferenceException) exception at run time.
|
||||
|
||||
```cs
using System;
using System.Globalization;
using System.IO;

public class Example
{
   public static void Main()
   {
      StreamReader sr = null;
      try {
         sr = new StreamReader("file1.txt");
         String contents = sr.ReadToEnd();
         sr.Close();
         Console.WriteLine("The file has {0} text elements.",
                           new StringInfo(contents).LengthInTextElements);
      }
      catch (FileNotFoundException) {
         Console.WriteLine("The file cannot be found.");
      }
      catch (IOException) {
         Console.WriteLine("An I/O error has occurred.");
      }
      catch (OutOfMemoryException) {
         Console.WriteLine("There is insufficient memory to read the file.");
      }
      finally {
         if (sr != null) sr.Dispose();
      }
   }
}
```
|
||||
|
||||
You can follow this basic pattern if you choose to implement a `try/finally` block, or if you must implement one because your programming language doesn't support a `using` statement but does allow direct calls to the [Dispose](https://docs.microsoft.com/dotnet/core/api/System.IDisposable#System_IDisposable_Dispose) method.
|
||||
|
||||
## See Also
|
||||
|
||||
[Cleaning Up Unmanaged Resources](unmanaged.md)
|
||||
|
||||
|
|
@ -1,55 +0,0 @@
|
|||
---
|
||||
title: Weak References
|
||||
description: Weak References
|
||||
keywords: .NET, .NET Core
|
||||
author: shoag
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 5c530594-d44e-401f-a7ce-3eaf4dbb1dd5
|
||||
---
|
||||
|
||||
# Weak References
|
||||
|
||||
The garbage collector cannot collect an object in use by an application while the application's code can reach that object. The application is said to have a strong reference to the object.
|
||||
|
||||
A weak reference permits the garbage collector to collect the object while still allowing the application to access the object. A weak reference is valid only during the indeterminate amount of time until the object is collected when no strong references exist. When you use a weak reference, the application can still obtain a strong reference to the object, which prevents it from being collected. However, there is always the risk that the garbage collector will get to the object first before a strong reference is reestablished.
|
||||
|
||||
Weak references are useful for objects that use a lot of memory, but can be recreated easily if they are reclaimed by garbage collection.
|
||||
|
||||
Suppose a tree view displays a complex hierarchical choice of options to the user. If the underlying data is large, keeping the tree in memory is inefficient when the user is involved with something else in the application.
|
||||
|
||||
When the user switches away to another part of the application, you can use the [WeakReference](https://docs.microsoft.com/dotnet/core/api/System.WeakReference) or [WeakReference<T>](https://docs.microsoft.com/dotnet/core/api/System.WeakReference%601) class to create a weak reference to the tree and destroy all strong references. When the user switches back to the tree, the application attempts to obtain a strong reference to the tree and, if successful, avoids reconstructing the tree.
|
||||
|
||||
To establish a weak reference with an object, you create a `WeakReference` using the instance of the object to be tracked. You then set the [Target](https://docs.microsoft.com/dotnet/core/api/System.WeakReference.html#System_WeakReference_Target) property to that object and set the original reference to the object to null.
|
||||
|
||||
## Short and Long Weak References
|
||||
|
||||
You can create a short weak reference or a long weak reference:
|
||||
|
||||
* Short
|
||||
|
||||
The target of a short weak reference becomes `null` when the object is reclaimed by garbage collection. The weak reference is itself a managed object, and is subject to garbage collection just like any other managed object. The single-argument `WeakReference` constructor creates a short weak reference by default.
|
||||
|
||||
* Long
|
||||
|
||||
A long weak reference is retained after the object's [Finalize](https://docs.microsoft.com/dotnet/core/api/System.Object#System_Object_Finalize) method has been called. This allows the object to be recreated, but the state of the object remains unpredictable. To use a long reference, specify `true` in the `WeakReference` constructor.
|
||||
|
||||
If the object's type does not have a `Finalize` method, the short weak reference functionality applies and the weak reference is valid only until the target is collected, which can occur anytime after the finalizer is run.
|
||||
|
||||
To establish a strong reference and use the object again, cast the `Target` property of a `WeakReference` to the type of the object. If the `Target` property returns `null`, the object was collected; otherwise, you can continue to use the object because the application has regained a strong reference to it.
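
As a rough illustration of the pattern described above, the following sketch wraps the `Target` check in a helper. The class name `WeakCacheSketch`, the method `GetOrRebuild`, and the `rebuild` delegate are hypothetical names introduced only for this example; `WeakReference` and its `Target` property are the only parts taken from the text.

```cs
using System;

public static class WeakCacheSketch
{
    // Hypothetical helper: returns the tracked array if it is still alive,
    // otherwise rebuilds it and starts tracking the new instance.
    public static byte[] GetOrRebuild(WeakReference cache, Func<byte[]> rebuild)
    {
        // Casting Target to the object's type re-establishes a strong reference.
        var data = cache.Target as byte[];
        if (data == null)
        {
            // The object was collected (or never set), so recreate it.
            data = rebuild();
            cache.Target = data;
        }
        return data;
    }
}
```

The caller holds a strong reference for as long as it keeps the returned array in a local variable or field, so the object cannot be collected while it is in use.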
|
||||
|
||||
## Guidelines for Using Weak References
|
||||
|
||||
Use long weak references only when necessary as the state of the object is unpredictable after finalization.
|
||||
|
||||
Avoid using weak references to small objects because the pointer itself may be as large or larger.
|
||||
|
||||
Avoid using weak references as an automatic solution to memory management problems. Instead, develop an effective caching policy for handling your application's objects.
|
||||
|
||||
## See Also
|
||||
|
||||
[Garbage Collection](garbage-collection.md)
|
|
@ -29,4 +29,4 @@ The GC has has an additional heap for large objects called the Large Object Heap
|
|||
|
||||
Generation 2 and LOH collections can take noticeable time for programs that have run for a long time or operate over large amounts of data. Large server programs are known to have heaps in the 10s of GBs. The GC employs a variety of techniques to reduce the amount of time that it blocks program execution. The primary approach is to do as much garbage collection work as possible on a background thread in a way that does not interfere with program execution. The GC also exposes a few ways for developers to influence its behavior, which can be quite useful to improve performance.
|
||||
|
||||
For more information, see [Automatic Memory Management and Garbage Collection](garbagecollection/index.md).
|
||||
For more information, see [Garbage Collection](http://msdn.microsoft.com/library/0xy59wtx.aspx) on MSDN.
|
||||
|
|
File diff not shown because it is too large
|
@ -0,0 +1,134 @@
|
|||
# [Welcome](welcome.md)
|
||||
# [About .NET](about/index.md)
|
||||
## [.NET Products](about/products.md)
|
||||
|
||||
# [Learn C#](csharp/index.md)
|
||||
## [Tutorials](tutorials/index.md)
|
||||
### [Console Application](tutorials/getting-started-with-csharp/console-teleprompter.md)
|
||||
### [REST client](tutorials/getting-started-with-csharp/console-webapiclient.md)
|
||||
### [Working with LINQ](tutorials/getting-started-with-csharp/working-with-linq.md)
|
||||
### [Microservices hosted in Docker](tutorials/getting-started-with-csharp/microservices.md)
|
||||
## [🔧 Tour of C#](csharp/features.md)
|
||||
### What's new in C# 6
|
||||
<!-- This page, or its parent should point to the Roslyn repo for folks that want
|
||||
to know what's next -->
|
||||
## [🔧 C# Concepts](csharp/concepts.md)
|
||||
### [🔧 C# Type system](csharp/type-system.md)
|
||||
### [Properties](csharp/properties.md)
|
||||
### [Indexers](csharp/indexers.md)
|
||||
### [🔧 Generics](csharp/generics.md)
|
||||
### [Iterators](csharp/iterators.md)
|
||||
### [🔧 Language Integrated Query (LINQ)](csharp/linq.md)
|
||||
### [Delegates & events](csharp/delegates-events.md)
|
||||
#### [Introduction to Delegates](csharp/delegates-overview.md)
|
||||
#### [System.Delegate and the delegate keyword](csharp/delegate-class.md)
|
||||
#### [Strongly Typed Delegates](csharp/delegates-strongly-typed.md)
|
||||
#### [Common Patterns for Delegates](csharp/delegates-patterns.md)
|
||||
#### [Introduction to Events](csharp/events-overview.md)
|
||||
#### [The .NET Event Pattern](csharp/event-pattern.md)
|
||||
#### [The Updated .NET Event Pattern](csharp/modern-events.md)
|
||||
#### [Distinguishing Delegates and Events](csharp/distinguish-delegates-events.md)
|
||||
### [🔧 Parallel programming](csharp/parallel.md)
|
||||
### [Asynchronous programming](csharp/async.md)
|
||||
### [🔧 Lambda Expressions](csharp/lambda-expressions.md)
|
||||
<!-- This is a sidebar the delegates topics. I don't think it
|
||||
needs to be linked into the TOC, but I wanted to leave it
|
||||
to get your thoughts. If it does belong in the TOC,
|
||||
this is the location:
|
||||
#### [Implicitly Typed Lambda Expressions](csharp/implicitly-typed-lambda-expressions.md)
|
||||
-->
|
||||
### [Expression Trees](csharp/expression-trees.md)
|
||||
#### [Expression Trees Explained](csharp/expression-trees-explained.md)
|
||||
#### [Framework Types Supporting Expression Trees](csharp/expression-classes.md)
|
||||
#### [Executing Expressions](csharp/expression-trees-execution.md)
|
||||
#### [Interpreting Expressions](csharp/expression-trees-interpreting.md)
|
||||
#### [Building Expressions](csharp/expression-trees-building.md)
|
||||
#### [Translating Expressions](csharp/expression-trees-translating.md)
|
||||
#### [Summary](csharp/expression-trees-summary.md)
|
||||
### [🔧 Native interoperability](csharp/interop.md)
|
||||
### [🔧 Reflection & code generation](csharp/reflection.md)
|
||||
### [🔧 Documenting your code](csharp/codedoc.md)
|
||||
## [🔧 Syntax Reference](csharp/syntax.md)
|
||||
|
||||
<!-- Note to self: update languages/csharp/index.md to match this file
|
||||
once this is approved. -->
|
||||
|
||||
<!-- marker for the end of edits -->
|
||||
# [F# Guide](fsharp/index.md)
|
||||
## [F# Learning Resources](http://fsharp.org/learn.html)
|
||||
## [F# Language Reference](https://msdn.microsoft.com/en-us/visualfsharpdocs/conceptual/fsharp-language-reference)
|
||||
## [Visual F# Development Portal](https://msdn.microsoft.com/en-us/visualfsharpdocs/conceptual/visual-fsharp-development-portal)
|
||||
## [Asynchronous programming](fsharp/async.md)
|
||||
|
||||
# [.NET Standard](standard/index.md)
|
||||
## [.NET Standard Library](standard/library.md)
|
||||
## [Frameworks](standard/frameworks.md)
|
||||
## [What is "managed code"?](standard/managed-code.md)
|
||||
## [Common Language Runtime (CLR)](standard/clr.md)
|
||||
## [Framework Libraries](standard/framework-libraries.md)
|
||||
## [.NET Class libraries](standard/class-libraries.md)
|
||||
## [Handling and throwing exceptions](standard/exceptions.md)
|
||||
## [.NET Assembly File Format](standard/assembly-format.md)
|
||||
## [Garbage Collection](standard/gc-overview.md)
|
||||
## [Generic types](standard/generics.md)
|
||||
## [Delegates and lambdas](standard/delegates-lambdas.md)
|
||||
## [LINQ](standard/using-linq.md)
|
||||
## [Common Type System & Common Language Specification](standard/common-type-system.md)
|
||||
## [Asynchronous programming](standard/async.md)
|
||||
### [Asynchronous programming in depth](standard/async-in-depth.md)
|
||||
## [Native interoperability](standard/native-interop.md)
|
||||
## [Collections and Data Structures](standard/collections/index.md)
|
||||
### [Selecting a Collection Class](standard/collections/selecting-a-collection-class.md)
|
||||
### [Commonly Used Collection Types](standard/collections/commonly-used-collection-types.md)
|
||||
### [When to Use Generic Collections](standard/collections/when-to-use-generic-collections.md)
|
||||
### [Comparisons and Sorts Within Collections](standard/collections/comparisons-and-sorts-within-collections.md)
|
||||
### [Sorted Collection Types](standard/collections/sorted-collection-types.md)
|
||||
### [Hashtable and Dictionary Collection Types](standard/collections/hashtable-and-dictionary-collection-types.md)
|
||||
### [Thread-Safe Collections](standard/collections/threadsafe/index.md)
|
||||
#### [BlockingCollection Overview](standard/collections/threadsafe/blockingcollection-overview.md)
|
||||
#### [When to Use a Thread-Safe Collection](standard/collections/threadsafe/when-to-use-a-thread-safe-collection.md)
|
||||
#### [How to: Add and Remove Items from a ConcurrentDictionary](standard/collections/threadsafe/how-to-add-and-remove-items.md)
|
||||
#### [How to: Add and Take Items Individually from a BlockingCollection](standard/collections/threadsafe/how-to-add-and-take-items.md)
|
||||
#### [How to: Add Bounding and Blocking Functionality to a Collection](standard/collections/threadsafe/how-to-add-bounding-and-blocking.md)
|
||||
#### [How to: Use ForEach to Remove Items in a BlockingCollection](standard/collections/threadsafe/how-to-use-foreach-to-remove.md)
|
||||
#### [How to: Use Arrays of Blocking Collections in a Pipeline](standard/collections/threadsafe/how-to-use-arrays-of-blockingcollections.md)
|
||||
#### [How to: Create an Object Pool by Using a ConcurrentBag](standard/collections/threadsafe/how-to-create-an-object-pool.md)
|
||||
## [Numerics in .NET Core](standard/numerics.md)
|
||||
|
||||
# [.NET Core Guide](core/index.md)
|
||||
## [Tutorials](core/tutorials/index.md)
|
||||
### [Getting started with .NET Core on Windows](core/tutorials/using-on-windows.md)
|
||||
### [Getting started with .NET Core on macOS](core/tutorials/using-on-macos.md)
|
||||
### [Getting started with .NET Core on Windows/Linux/macOS using the command line](core/tutorials/using-with-xplat-cli.md)
|
||||
### [Developing Libraries with Cross Platform Tools](core/tutorials/libraries.md)
|
||||
### [Developing ASP.NET Core applications](core/tutorials/aspnet-core.md)
|
||||
### [How to Manage Package Dependency Versions for .NET Core 1.0](core/tutorials/managing-package-dependency-versions.md)
|
||||
## [Deploying](core/deploying/index.md)
|
||||
### [🔧 Deploying Applications](core/deploying/applications.md)
|
||||
### [Creating a NuGet Package with Cross Platform Tools](core/deploying/creating-nuget-packages.md)
|
||||
### [Reducing Package Dependencies with project.json](core/deploying/reducing-dependencies.md)
|
||||
## [Unit Testing](core/testing/index.md)
|
||||
### [Unit Testing with dotnet test](core/testing/unit-testing-with-dotnet-test.md)
|
||||
## [Releases](core/versions/index.md)
|
||||
### [Servicing](core/versions/servicing.md)
|
||||
## [Runtime IDentifier catalog](core/rid-catalog.md)
|
||||
## [.NET Core Tools](core/tools/index.md)
|
||||
### [Extensibility Model](core/tools/extensibility.md)
|
||||
### [Test communication protocol](core/tools/test-protocol.md)
|
||||
### [Continuous Integration](core/tools/using-ci-with-cli.md)
|
||||
### [dotnet](core/tools/dotnet.md)
|
||||
### [dotnet-new](core/tools/dotnet-new.md)
|
||||
### [dotnet-restore](core/tools/dotnet-restore.md)
|
||||
### [dotnet-run](core/tools/dotnet-run.md)
|
||||
### [dotnet-build](core/tools/dotnet-build.md)
|
||||
### [dotnet-test](core/tools/dotnet-test.md)
|
||||
### [dotnet-pack](core/tools/dotnet-pack.md)
|
||||
### [dotnet-publish](core/tools/dotnet-publish.md)
|
||||
### [dotnet-install-script](core/tools/dotnet-install-script.md)
|
||||
### [project.json](core/tools/project-json.md)
|
||||
### [global.json](core/tools/global-json.md)
|
||||
## [Porting from .NET Framework](core/porting/index.md)
|
||||
### [Analyzing third-party dependencies](core/porting/third-party-deps.md)
|
||||
### [🔧 NuGet packages](core/porting/nuget-packages.md)
|
||||
## [Migrating from DNX](core/migrating-from-dnx.md)
|
||||
# [Samples and Tutorials](samples-and-tutorials/index.md)
|
|
@ -0,0 +1,447 @@
|
|||
---
|
||||
title: Console Application
|
||||
description: Console Application
|
||||
keywords: .NET, .NET Core
|
||||
author: BillWagner
|
||||
manager: wpickett
|
||||
ms.date: 06/20/2016
|
||||
ms.topic: article
|
||||
ms.prod: .net-core
|
||||
ms.technology: .net-core-technologies
|
||||
ms.devlang: dotnet
|
||||
ms.assetid: 883cd93d-50ce-4144-b7c9-2df28d9c11a0
|
||||
---
|
||||
|
||||
# Console Application
|
||||
|
||||
## Introduction
|
||||
This tutorial teaches you a number of features in .NET Core and the C# language. You’ll learn:
|
||||
* The basics of the .NET Core Command Line Interface (CLI).
|
||||
* The structure of a C# Console Application.
|
||||
* Console I/O.
|
||||
* The basics of File I/O APIs in .NET Core.
|
||||
* The basics of the Task Asynchronous Programming Model in .NET Core.
|
||||
|
||||
You’ll build an application that reads a text file, and echoes the
|
||||
contents of that text file to the console. The output to the console will
|
||||
be paced to match reading it aloud. You can speed up or slow down the pace
|
||||
by pressing the ‘<’ or ‘>’ keys.
|
||||
|
||||
There are a lot of features in this tutorial. Let’s build them one by one.
|
||||
## Prerequisites
|
||||
You’ll need to set up your machine to run .NET Core. You can find the
|
||||
installation instructions on the [.NET Core](https://www.microsoft.com/net/core)
|
||||
page. You can run this
|
||||
application on Windows, Linux, macOS or in a Docker container.
|
||||
You’ll need to install your favorite code editor.
|
||||
## Create the Application
|
||||
The first step is to create a new application. Open a command prompt and
|
||||
create a new directory for your application. Make that the current
|
||||
directory. Type the command "dotnet new" at the command prompt. This
|
||||
creates the starter files for a basic “Hello World” application.
|
||||
|
||||
Before you start making modifications, let’s go through the steps to run
|
||||
the simple Hello World application. After creating the application, type
|
||||
"dotnet restore" at the command prompt. This command runs the NuGet
|
||||
package restore process. NuGet is a .NET package manager. This command
|
||||
downloads any of the missing dependencies for your project. As this is a
|
||||
new project, none of the dependencies are in place, so the first run will
|
||||
download the .NET Core framework. After this initial step, you will only
|
||||
need to run dotnet restore when you add new dependent packages, or update
|
||||
the versions of any of your dependencies. This process also creates the
|
||||
project lock file (project.lock.json) in your project directory. This file
|
||||
helps to manage the project dependencies. It contains the local location
|
||||
of all the project dependencies. You do not need to put the file in source
|
||||
control; it will be generated when you run “dotnet restore”.
|
||||
|
||||
After restoring packages, you run “dotnet build”. This executes the build
|
||||
engine and creates your application executable. Finally, you execute “dotnet run” to
|
||||
run your application.
|
||||
|
||||
The simple Hello World application code is all in Program.cs. Open that
|
||||
file with your favorite text editor. We’re about to make our first changes.
|
||||
At the top of the file, you’ll see a `using` statement:
|
||||
|
||||
```cs
|
||||
using System;
|
||||
```
|
||||
|
||||
This statement tells the compiler that any types from the System namespace
|
||||
are in scope. Like other Object Oriented languages you may have used, C#
|
||||
uses namespaces to organize types. This hello world program is no
|
||||
different. You can see that the program is enclosed in the
|
||||
`ConsoleApplication` namespace. That’s not a very descriptive name, so
|
||||
change it to `TeleprompterConsole`.
|
||||
|
||||
```cs
|
||||
namespace TeleprompterConsole
|
||||
```
|
||||
|
||||
## Reading and Echoing the File
|
||||
The first feature to add is to read a text file, and display all that text
|
||||
to the console. First, let’s add a text file. Copy the
|
||||
[sampleQuotes.txt](https://github.com/dotnet/core-docs/blob/master/samples/csharp-language/console-teleprompter/sampleQuotes.txt)
|
||||
file from the GitHub repository for this [sample](https://github.com/dotnet/core-docs/tree/master/samples/csharp-language/console-teleprompter) into your project directory.
|
||||
This will serve as the script for your
|
||||
application.
|
||||
|
||||
Next, add the following method in your Program class (right below the Main
|
||||
method):
|
||||
|
||||
```cs
|
||||
static IEnumerable<string> ReadFrom(string file)
{
    string line;
    using (var reader = File.OpenText(file))
    {
        while ((line = reader.ReadLine()) != null)
        {
            yield return line;
        }
    }
}
|
||||
```
|
||||
|
||||
This method uses types from two new namespaces. For this to compile you’ll
|
||||
need to add the following two lines to the top of the file:
|
||||
|
||||
```cs
|
||||
using System.Collections.Generic;
|
||||
using System.IO;
|
||||
```
|
||||
|
||||
The `IEnumerable<T>` interface is defined in the
|
||||
`System.Collections.Generic` namespace. The File class is defined in the
|
||||
`System.IO` namespace.
|
||||
|
||||
This method is a special type of C# method called an *iterator method*.
Iterator methods return sequences that are evaluated lazily. That means
each item in the sequence is generated as it is requested by the code
consuming the sequence. Iterator methods are methods that contain one or
|
||||
more `yield return` statements. The object returned by the `ReadFrom()`
|
||||
method contains the code to generate each item in the sequence. In this
|
||||
example, that involves reading the next line of text from the source file,
|
||||
and returning that string. Each time the calling code requests the next
|
||||
item from the sequence, the code reads the next line of text from the file
|
||||
and returns it. When the file has been completely read, the sequence
|
||||
indicates that there are no more items.
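
To make the lazy evaluation concrete, here is a minimal sketch that is separate from the teleprompter code. The class `LazyDemo`, its `Count` method, and the `Produced` counter are hypothetical names used only for illustration. The counter shows that no item is generated until the consuming code asks for it.

```cs
using System.Collections.Generic;

public static class LazyDemo
{
    // Counts how many items the iterator has generated so far.
    public static int Produced;

    public static IEnumerable<int> Count(int limit)
    {
        for (var i = 1; i <= limit; i++)
        {
            Produced++;      // runs only when the consumer requests the next item
            yield return i;
        }
    }
}
```

Calling `LazyDemo.Count(3)` by itself runs none of the loop body. The body starts executing only when a `foreach` loop (or an explicit call to `MoveNext()`) requests the first item.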
|
||||
|
||||
There are two other C# syntax elements that may be new to you. The `using`
|
||||
statement in this method manages resource cleanup. The variable that is
|
||||
initialized in the using statement (`reader`, in this example) must
|
||||
implement the `IDisposable` interface. The `IDisposable` interface
|
||||
defines a single method, `Dispose()`, that should be called when the
|
||||
resource should be released. The compiler generates that call when
|
||||
execution reaches the closing brace of the `using` statement. The
|
||||
compiler-generated code ensures that the resource is released even if an
|
||||
exception is thrown from the code in the block defined by the using
|
||||
statement.
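
The following sketch shows roughly what the compiler does with a `using` statement, written out by hand as a `try/finally` block. The `Probe` class and the two method names are hypothetical and exist only to make disposal observable; the exact code the compiler emits may differ in detail.

```cs
using System;

// Hypothetical resource used only to make disposal observable.
public class Probe : IDisposable
{
    public bool Disposed { get; private set; }
    public void Dispose() { Disposed = true; }
}

public static class UsingExpansion
{
    // Written with a using statement.
    public static void WithUsing(Probe probe, Action body)
    {
        using (probe)
        {
            body();
        }
    }

    // Approximately equivalent hand-written form.
    public static void WithTryFinally(Probe probe, Action body)
    {
        try
        {
            body();
        }
        finally
        {
            // Runs whether body() returns normally or throws.
            if (probe != null) probe.Dispose();
        }
    }
}
```

In both versions the resource is disposed even when `body` throws, which is the guarantee the paragraph above describes.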
|
||||
|
||||
The reader variable is defined using the `var` keyword. `var` defines an
|
||||
*implicitly typed local variable*. That means the type of the variable is
|
||||
determined by the compile-time type of the object assigned to the
|
||||
variable. Here, that is the return value from `File.OpenText()`, which is
|
||||
a `StreamReader` object.
|
||||
|
||||
Now, let’s fill in the code to read the file in the Main method:
|
||||
|
||||
```cs
|
||||
var lines = ReadFrom("sampleQuotes.txt");
|
||||
foreach (var line in lines)
|
||||
{
|
||||
Console.WriteLine(line);
|
||||
}
|
||||
```
|
||||
|
||||
Run the program (using "dotnet run") and you can see every line printed out
|
||||
to the console.
|
||||
|
||||
## Adding Delays and Formatting Output
|
||||
What you have is being displayed far too fast to read aloud. Now you need
|
||||
to add the delays in the output. As you start, you’ll be building some of
|
||||
the core code that enables asynchronous processing. However, these first
|
||||
steps will follow a few anti-patterns. The anti-patterns are pointed out
|
||||
in comments as you add the code, and the code will be updated in later
|
||||
steps.
|
||||
|
||||
There are two steps to this section. First, you’ll update the iterator
|
||||
method to return single words instead of entire lines. That’s done with
|
||||
these modifications. Replace the `yield return line;` statement with the
|
||||
following code:
|
||||
|
||||
```cs
|
||||
var words = line.Split(' ');
|
||||
foreach (var word in words)
|
||||
{
|
||||
yield return word + " ";
|
||||
}
|
||||
yield return Environment.NewLine;
|
||||
```
|
||||
|
||||
Next, you need to modify how you consume the lines of the file, and add a
|
||||
delay after writing each word. Replace the `Console.WriteLine()` statement
|
||||
in the `Main` method with the following block:
|
||||
|
||||
```cs
|
||||
{
|
||||
Console.Write(line);
|
||||
if (!string.IsNullOrWhiteSpace(line))
|
||||
{
|
||||
var pause = Task.Delay(200);
|
||||
// Synchronously waiting on a task is an
|
||||
// anti-pattern. This will get fixed in later
|
||||
// steps.
|
||||
pause.Wait();
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
The `Task` class is in the `System.Threading.Tasks` namespace, so you need
|
||||
to add that `using` statement at the top of the file:
|
||||
|
||||
```cs
|
||||
using System.Threading.Tasks;
|
||||
```
|
||||
|
||||
> Note: In RC2, you need to run the application using a different
|
||||
> command to see the correct output. This is due to an issue
|
||||
> in the CLI that [has been filed](https://github.com/dotnet/cli/issues/2976).
|
||||
> To run the application, instead of `dotnet run` use
|
||||
> `dotnet .\bin\Debug\netcoreapp1.0\console-teleprompter.dll`
|
||||
> substituting the correct path to your output DLL.
|
||||
|
||||
Run the sample, and check the output. Now, each single word is printed,
|
||||
followed by a 200 ms delay. However, the displayed output shows some
|
||||
issues because the source text file has several lines that have more than
|
||||
80 characters without a line break. That can be hard to read while it's
|
||||
scrolling by. That’s easy to fix. You’ll just keep track of the length of
|
||||
each line, and generate a new line whenever the line length reaches a
|
||||
certain threshold. Declare a local variable after the declaration of
|
||||
`words` that holds the line length:
|
||||
|
||||
```cs
|
||||
var lineLength = 0;
|
||||
```
|
||||
|
||||
Then, add the following code after the `yield return word + " ";` statement
|
||||
(before the closing brace):
|
||||
|
||||
```cs
|
||||
lineLength += word.Length + 1;
|
||||
if (lineLength > 70)
|
||||
{
|
||||
yield return Environment.NewLine;
|
||||
lineLength = 0;
|
||||
}
|
||||
```
|
||||
|
||||
Run the sample, and you’ll be able to read aloud at its pre-configured
|
||||
pace.
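
For reference, the word-splitting and line-wrapping steps above can be sketched together in one iterator. This version is an assumption-laden variant, not the tutorial's `ReadFrom` method: it takes an in-memory sequence of lines instead of a file name so that it can be tried without a file, and the names `WordSplitter` and `WordsFrom` are hypothetical.

```cs
using System;
using System.Collections.Generic;

public static class WordSplitter
{
    // Yields each word followed by a space, inserting a line break
    // once more than 70 characters have been written on the current line.
    public static IEnumerable<string> WordsFrom(IEnumerable<string> lines)
    {
        foreach (var line in lines)
        {
            var words = line.Split(' ');
            var lineLength = 0;
            foreach (var word in words)
            {
                yield return word + " ";
                lineLength += word.Length + 1;
                if (lineLength > 70)
                {
                    yield return Environment.NewLine;
                    lineLength = 0;
                }
            }
            // End of the source line.
            yield return Environment.NewLine;
        }
    }
}
```

The logic inside the outer loop mirrors what the tutorial places inside the `while` loop of `ReadFrom`.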
|
||||
|
||||
## Async Tasks
|
||||
In this final step, you’ll add the code to write the output asynchronously
|
||||
in one task, while also running another task to read input from the user
|
||||
if they want to speed up or slow down the text display. This has a few
|
||||
steps in it and by the end, you’ll have all the updates that you need.
|
||||
The first step is to create an asynchronous `Task` returning method that
|
||||
represents the code you’ve created so far to read and display the file.
|
||||
|
||||
Add this method to your `Program` class (it’s taken from the body of your
`Main` method):
|
||||
|
||||
```cs
|
||||
private static async Task ShowTeleprompter()
|
||||
{
|
||||
var words = ReadFrom("sampleQuotes.txt");
|
||||
foreach (var line in words)
|
||||
{
|
||||
Console.Write(line);
|
||||
if (!string.IsNullOrWhiteSpace(line))
|
||||
{
|
||||
await Task.Delay(200);
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
You’ll notice two changes. First, in the body of the method, instead of
|
||||
calling `Wait()` to synchronously wait for a task to finish, this version
|
||||
uses the `await` keyword. In order to do that, you need to add the `async`
|
||||
modifier to the method signature. This method returns a `Task`. Notice that
|
||||
there are no return statements that return a Task object. Instead, that
|
||||
`Task` object is created by code the compiler generates when you use the
|
||||
`await` operator. You can imagine that this method returns when it reaches
|
||||
an `await`. The returned Task indicates that the work has not completed.
|
||||
The method resumes when the awaited task completes. When it has executed
|
||||
to completion, the returned `Task` indicates that it is complete.
|
||||
Calling code can
|
||||
monitor that returned task to determine when it has completed.
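
The behavior described above can be observed with a small sketch that is independent of the teleprompter. The names `AwaitDemo`, `WaitFor`, and `Finished` are hypothetical. The `gate` task stands in for any awaited operation, such as `Task.Delay`.

```cs
using System.Threading.Tasks;

public static class AwaitDemo
{
    // Set once the code after the await has run.
    public static bool Finished;

    public static async Task WaitFor(Task gate)
    {
        // If gate has not completed, control returns to the caller here
        // and the caller receives a Task that is not yet complete.
        await gate;

        // Execution resumes here after gate completes.
        Finished = true;
    }
}
```

A caller could pass the `Task` from a `TaskCompletionSource<bool>`, check that the returned task's `IsCompleted` property is `false`, then complete the source and wait for the returned task to finish.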
|
||||
|
||||
You can call this new method in your Main program:
|
||||
|
||||
```cs
|
||||
ShowTeleprompter().Wait();
|
||||
```
|
||||
|
||||
Here, in `Main()`, the code does synchronously wait. You should use the
|
||||
`await` operator instead of synchronously waiting whenever possible. But,
|
||||
in a console application’s `Main` method, you cannot use the `await`
|
||||
operator. That would result in the application exiting before all tasks
|
||||
have completed.
|
||||
|
||||
Next, you need to write the second asynchronous method to read from the
|
||||
Console and watch for the ‘<’ and ‘>’ keys. Here’s the method you add for
|
||||
that task:
|
||||
|
||||
```cs
|
||||
private static async Task GetInput()
|
||||
{
|
||||
var delay = 200;
|
||||
Action work = () =>
|
||||
{
|
||||
do {
|
||||
var key = Console.ReadKey(true);
|
||||
if (key.KeyChar == '>')
|
||||
{
|
||||
delay -= 10;
|
||||
}
|
||||
else if (key.KeyChar == '<')
|
||||
{
|
||||
delay += 10;
|
||||
}
|
||||
} while (true);
|
||||
};
|
||||
await Task.Run(work);
|
||||
}
|
||||
```
|
||||
|
||||
This creates a lambda expression to represent an `Action` that reads a key
|
||||
from the Console and modifies a local variable representing the delay when
|
||||
the user presses the ‘<’ or ‘>’ keys. This method uses `Console.ReadKey()`
|
||||
to block and wait for the user to press a key.
|
||||
|
||||
To finish this feature, you need to create a new `async` `Task`-returning
method that starts both of these tasks (`GetInput()` and
`ShowTeleprompter()`) and also manages the shared data between these two
tasks.
|
||||
|
||||
It’s time to create a class that can handle the shared data between these
|
||||
two tasks. This class contains two public properties: the delay, and a
|
||||
flag to indicate that the file has been completely read:
|
||||
|
||||
```cs
|
||||
namespace TeleprompterConsole
|
||||
{
|
||||
internal class TelePrompterConfig
|
||||
{
|
||||
private object lockHandle = new object();
|
||||
public int DelayInMilliseconds { get; private set; } = 200;
|
||||
|
||||
        public void UpdateDelay(int increment) // negative to speed up
        {
            // Read and write inside the lock so that concurrent updates are not lost.
            lock (lockHandle)
            {
                var newDelay = Min(DelayInMilliseconds + increment, 1000);
                newDelay = Max(newDelay, 20);
                DelayInMilliseconds = newDelay;
            }
        }
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Put that class in a new file, and enclose that class in the
`TeleprompterConsole` namespace as shown above. You’ll also need to add a `using static`
statement so that you can reference the `Min` and `Max` methods without the
enclosing class or namespace names. A `using static` statement imports the
methods from one class. This is in contrast with the `using` statements used
up to this point that have imported all classes from a namespace.
|
||||
|
||||
```cs
|
||||
using static System.Math;
|
||||
```
|
||||
|
||||
The other language feature that’s new is the `lock` statement. This
|
||||
statement ensures that only a single thread can be in that code at any
|
||||
given time. If one thread is in the locked section, other threads must
|
||||
wait for the first thread to exit that section. The lock statement uses an
|
||||
object that guards the lock section. This class follows a standard idiom
|
||||
to lock a private object in the class.
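
Here is a minimal sketch of the same idiom outside the teleprompter, assuming a hypothetical `Counter` class that several threads increment at once. Both the read and the write happen inside the `lock` block, so no increment is lost.

```cs
public class Counter
{
    // Private object used only as the lock target.
    private readonly object lockHandle = new object();
    private int count;

    public int Count
    {
        get { lock (lockHandle) { return count; } }
    }

    public void Increment()
    {
        // Only one thread at a time can execute this block.
        lock (lockHandle)
        {
            count++;
        }
    }
}
```

Locking on a private object, rather than on `this` or a public object, keeps code outside the class from taking the same lock by accident.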
|
||||
|
||||
Next, you need to update the `ShowTeleprompter` and `GetInput` methods to
|
||||
use the new config object. Write one final Task returning async method to
|
||||
start both tasks and exit when the first task finishes:
|
||||
|
||||
```cs
|
||||
private static async Task RunTeleprompter()
|
||||
{
|
||||
var config = new TelePrompterConfig();
|
||||
var displayTask = ShowTeleprompter(config);
|
||||
|
||||
var speedTask = GetInput(config);
|
||||
await Task.WhenAny(displayTask, speedTask);
|
||||
}
|
||||
```
|
||||
|
||||
The one new method here is the `Task.WhenAny()` call. That creates a Task
|
||||
that finishes as soon as any of the tasks in its argument list completes.
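
The following sketch isolates `Task.WhenAny()` from the rest of the application. The class `WhenAnyDemo` and the method `FirstResult` are hypothetical names. The task returned by `Task.WhenAny()` produces the task that finished first, which is why the result is awaited a second time.

```cs
using System.Threading.Tasks;

public static class WhenAnyDemo
{
    public static async Task<string> FirstResult(Task<string> first, Task<string> second)
    {
        // Completes as soon as either argument completes.
        Task<string> winner = await Task.WhenAny(first, second);

        // winner is already complete, so this await does not block.
        return await winner;
    }
}
```

Note that `Task.WhenAny()` does not cancel the task that did not finish first; it continues to run unless the code stops it some other way.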
|
||||
|
||||
Next, you need to update both the ShowTeleprompter and GetInput methods to
|
||||
use the config object for the delay:
|
||||
|
||||
```cs
|
||||
private static async Task ShowTeleprompter(TelePrompterConfig config)
|
||||
{
|
||||
var words = ReadFrom("sampleQuotes.txt");
|
||||
foreach (var line in words)
|
||||
{
|
||||
Console.Write(line);
|
||||
if (!string.IsNullOrWhiteSpace(line))
|
||||
{
|
||||
await Task.Delay(config.DelayInMilliseconds);
|
||||
}
|
||||
}
|
||||
config.SetDone();
|
||||
}
|
||||
|
||||
private static async Task GetInput(TelePrompterConfig config)
|
||||
{
|
||||
|
||||
Action work = () =>
|
||||
{
|
||||
do {
|
||||
var key = Console.ReadKey(true);
|
||||
if (key.KeyChar == '>')
|
||||
config.UpdateDelay(-10);
|
||||
else if (key.KeyChar == '<')
|
||||
config.UpdateDelay(10);
|
||||
} while (!config.Done);
|
||||
};
|
||||
await Task.Run(work);
|
||||
}
|
||||
```
|
||||
|
||||
This new version of `ShowTeleprompter` calls a new method in the
|
||||
`TelePrompterConfig` class. To finish, you'll need to add the
|
||||
`SetDone` method, and the `Done` property to the `TelePrompterConfig` class:
|
||||
|
||||
```cs
|
||||
public bool Done => done;
|
||||
|
||||
private bool done;
|
||||
|
||||
public void SetDone()
|
||||
{
|
||||
done = true;
|
||||
}
|
||||
```
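
Putting the pieces from this section together, the finished configuration class might look like the sketch below. It is assembled from the fragments shown in this tutorial, with one choice made here rather than taken from the text: the current delay is read inside the `lock` block so that the read and the write happen as a single step.

```cs
using static System.Math;

namespace TeleprompterConsole
{
    internal class TelePrompterConfig
    {
        private object lockHandle = new object();
        private bool done;

        public int DelayInMilliseconds { get; private set; } = 200;

        public bool Done => done;

        public void UpdateDelay(int increment) // negative to speed up
        {
            lock (lockHandle)
            {
                // Clamp the delay to the range 20 ms to 1000 ms.
                var newDelay = Min(DelayInMilliseconds + increment, 1000);
                newDelay = Max(newDelay, 20);
                DelayInMilliseconds = newDelay;
            }
        }

        public void SetDone()
        {
            done = true;
        }
    }
}
```

With this class, pressing ‘>’ repeatedly can never reduce the delay below 20 ms, and pressing ‘<’ can never raise it above 1000 ms.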
|
||||
|
||||
## Conclusion
|
||||
This tutorial showed you a number of the features around the C# language
|
||||
and the .NET Core libraries related to working in Console applications.
|
||||
You can build on this knowledge to explore more about the language, and
|
||||
the classes introduced here. You’ve seen the basics of File and Console
|
||||
I/O, blocking and non-blocking use of the Task-based Asynchronous
|
||||
programming model, a tour of the C# language and how C# programs are
|
||||
organized and the .NET Core Command Line Interface and tools.
|
||||
|