---
title: "Basic Concepts"
isMarkdown: false
---
<h1 id="basic-concepts">Basic Concepts</h1>
<p>This page describes some basic concepts that are important to understand
when working with RichTextKit.</p>
<h2 id="text-blocks">Text Blocks</h2>
<p>RichTextKit primarily works with text blocks. A text block represents
a single block of text - essentially a paragraph.</p>
<p>RichTextKit doesn't provide higher-level constructs like list items,
block quotes, tables, table cells, divs, etc. This project is strictly
about laying out a single block of text with rich (aka attributed) text
and then providing various functions for working with that text block.</p>
<p>RichTextKit doesn't have a concept of a DOM - that's a higher-level concern
that RichTextKit is not attempting to address.</p>
<p>Although RichTextKit works with individual text blocks, they can contain
line feed characters (<code>\n</code>) to force line breaks. These should be considered
in-paragraph "soft returns", as opposed to hard paragraph terminators.</p>
<p>Currently RichTextKit doesn't support non-text inline elements, although this
may be added in the future.</p>
<p>Text blocks are represented by the <a href="./ref/Topten.RichTextKit.TextBlock">TextBlock</a> class.</p>
<h2 id="styles">Styles</h2>
<p>Styles are used to describe the display attributes of text in a text block. A
text block is composed of one or more runs of text, each tagged with a
single style.</p>
<p>RichTextKit defines an interface, <a href="./ref/Topten.RichTextKit.IStyle">IStyle</a>, that provides the style of a
run of text. You can either provide your own implementation of this
interface (e.g. if you have your own DOM that computes styles), or
you can use the built-in <a href="./ref/Topten.RichTextKit.Style">Style</a> class, which provides a simple implementation
(but doesn't have any concept of style inheritance or cascading).</p>
<p>Style runs are represented by the <a href="./ref/Topten.RichTextKit.StyleRun">StyleRun</a> class.</p>
<h2 id="font-fallback">Font Fallback</h2>
<p>Font fallback is the process of switching to a different font when the specified
font doesn't contain the required glyphs.</p>
<p>Font fallback is often required to display emoji characters (since most fonts don't
include emoji glyphs), most complex scripts (e.g. Arabic) and Asian scripts.</p>
<p>RichTextKit uses SkiaSharp's <code>MatchCharacter</code> function to resolve the typeface to
use for font fallback.</p>
<h2 id="font-shaping">Font Shaping</h2>
<p>For most Latin-based languages, rendering text is simply a matter of placing one glyph
after another in a left-to-right manner.</p>
<p>For many other languages, however, the process is much more complicated and often involves
drawing multiple glyphs for a single character. This process is called "font shaping".</p>
<p>By default Skia (and therefore SkiaSharp) doesn't do font shaping and often displays
such text incorrectly. RichTextKit uses HarfBuzzSharp for text shaping.</p>
<h2 id="font-runs">Font Runs</h2>
<p>A font run represents a single run of text that uses the same font and other
style attributes for every character in the run. Font runs are derived by
splitting style runs into smaller segments when text is wrapped onto a
new line, or when font fallback is required.</p>
<p>Font runs are represented by the <a href="./ref/Topten.RichTextKit.FontRun">FontRun</a> class.</p>
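<p>As a rough illustration of how font fallback can split one style run into several
font runs, the following Python sketch uses a toy model: a hypothetical primary font that
covers only printable ASCII, and a hypothetical fallback font for everything else. The function
name, font names and coverage set are all assumptions made for this example. RichTextKit itself
resolves fallback typefaces through SkiaSharp and also splits runs at line breaks, which this
sketch ignores.</p>

```python
def split_by_font(code_points, primary, fallback, primary_coverage):
    """Split code points into (font_name, [code points]) runs.

    A code point is assigned to `primary` if it is in `primary_coverage`,
    otherwise to `fallback`. Adjacent code points with the same font are
    merged into a single run.
    """
    runs = []
    for cp in code_points:
        font = primary if cp in primary_coverage else fallback
        if runs and runs[-1][0] == font:
            runs[-1][1].append(cp)      # extend the current run
        else:
            runs.append((font, [cp]))   # font changed: start a new run
    return runs


# Hypothetical fonts: the "Latin Font" only covers printable ASCII.
ascii_coverage = set(range(0x20, 0x7F))
text = "Hi \U0001F600!"
runs = split_by_font([ord(c) for c in text], "Latin Font", "Emoji Font", ascii_coverage)

for font, cps in runs:
    print(font, repr("".join(map(chr, cps))))
```

<p>In this toy model the single style run is split into three font runs: the text before
the emoji, the emoji itself (using the fallback font), and the text after it.</p>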
<h2 id="style-runs-vs-font-runs">Style Runs vs Font Runs</h2>
<p>Style runs should be considered the client's view of a run of text, i.e. the text as it
was added to the text block.</p>
<p>Font runs describe how a text block is broken down after the block has been laid
out, and can be considered the internal view of the text block.</p>
<h2 id="lines">Lines</h2>
<p>Laying out a text block produces a set of lines, each
consisting of one or more font runs.</p>
<p>Lines are represented by the <a href="./ref/Topten.RichTextKit.TextLine">TextLine</a> class.</p>
<h2 id="bi-directional-ltr-and-rtl-text">Bi-Directional, LTR and RTL Text</h2>
<p>Bi-directional text support is the process of correctly displaying text that mixes
left-to-right and right-to-left languages.</p>
<p>RichTextKit includes an implementation of the Unicode Bi-directional Text
Algorithm (UAX #9), and each text block has a "base direction" that controls
the default layout order of text in that paragraph.</p>
<p>Embedded control characters (as specified by UAX #9) can be used to control
bi-directional text formatting.</p>
<p>Specifying the text direction of a text run through styles is not currently
supported, so embedded control characters must be used instead.</p>
<h2 id="utf-16-utf-32-code-points-and-clusters">UTF-16, UTF-32, Code Points and Clusters</h2>
<p>Internally, RichTextKit works with UTF-32 encoded text. Specifically, it uses the
<code>int</code> type to represent a character rather than the <code>char</code> type.</p>
<p>To avoid confusion, the API and code base use the term "code point" to
refer to a single UTF-32 character.</p>
<p>Since C# strings are encoded as UTF-16, when a string is added to a text
block it is automatically converted to UTF-32. It's important to note that
indices into the converted string may no longer match indices into the original
C# string (although RichTextKit does provide functions to map between them).</p>
<p>RichTextKit uses the term <code>CodePointIndex</code> to refer to the index of a code point
in a UTF-32 buffer.</p>
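<p>To see why the two kinds of index can differ, here is a small language-neutral sketch in
Python. It is not RichTextKit's mapping API; the helper name <code>utf16_to_code_point_map</code>
is made up for this illustration. It builds a table that gives, for each UTF-16 code unit
offset, the index of the code point that unit belongs to.</p>

```python
def utf16_to_code_point_map(text: str) -> list:
    """For each UTF-16 code unit offset, return the index of its code point."""
    mapping = []
    for cp_index, ch in enumerate(text):
        # Code points above U+FFFF need two UTF-16 code units (a surrogate pair).
        units = 2 if ord(ch) > 0xFFFF else 1
        mapping.extend([cp_index] * units)
    return mapping


# 'a', an emoji outside the Basic Multilingual Plane, then 'b'
text = "a\U0001F600b"
mapping = utf16_to_code_point_map(text)

print(mapping)        # [0, 1, 1, 2]
print(len(mapping))   # 4 UTF-16 code units
print(len(text))      # 3 code points
```

<p>Here the letter <code>b</code> sits at offset 3 in the UTF-16 string but at code point index 2,
which is the kind of mismatch described above.</p>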
<p>Also note that line endings <code>\r</code> and <code>\r\n</code> are automatically normalized to <code>\n</code>.<br />
This conversion can also affect the mapping of code point indices to character
indices in the original string.</p>
<p><em>(Note: <code>\n\r</code> style line endings aren't supported.)</em></p>
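<p>The effect of line-ending normalization on indices can be sketched in a few lines of Python.
This is an illustration of the idea only, under the assumption that <code>\r\n</code> is
replaced as a single unit; the function name is hypothetical and this is not RichTextKit's code.</p>

```python
import re


def normalize_line_endings(text: str) -> str:
    # Match "\r\n" first so it becomes one "\n", then any remaining lone "\r".
    return re.sub(r"\r\n|\r", "\n", text)


original = "one\r\ntwo\rthree"
normalized = normalize_line_endings(original)

print(repr(normalized))          # 'one\ntwo\nthree'
print(original.index("two"))     # 5
print(normalized.index("two"))   # 4
```

<p>Each <code>\r\n</code> pair collapses to a single character, so every index after it shifts
down by one relative to the original string.</p>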
<p>Some complex scripts require more than one code point to describe a single
user-perceived character. These groupings of code points are called "clusters". This is
the same concept as used by HarfBuzz.</p>
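<p>The difference between code points and user-perceived characters can be shown with a
deliberately simplified Python sketch. It attaches combining marks to the preceding base
character and nothing more. Real cluster detection (as done by a shaping engine such as HarfBuzz)
handles many more cases, so treat the <code>simple_clusters</code> helper as a hypothetical
illustration rather than the actual algorithm.</p>

```python
import unicodedata


def simple_clusters(text: str) -> list:
    """Group each base character with the combining marks that follow it."""
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch      # combining mark: attach to the previous cluster
        else:
            clusters.append(ch)     # base character: start a new cluster
    return clusters


word = "cafe\u0301"   # "café" written as 'e' followed by U+0301 COMBINING ACUTE ACCENT

print(len(word))                    # 5 code points
print(len(simple_clusters(word)))   # 4 clusters
```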
<h2 id="measurement-and-size-units">Measurement and Size Units</h2>
<p>All measurements and sizes used by RichTextKit are logical. The final size of the
rendered text depends on the scaling configured on the target canvas, and
it's up to you to determine the appropriate system of measurement to use.</p>