RichTextKit/doc/basics.page

94 строки
5.8 KiB
Plaintext
Исходник Обычный вид История

2019-08-03 18:02:45 +03:00
---
title: "Basic Concepts"
isMarkdown: false
---
<h1 id="basic-concepts">Basic Concepts</h1>
<p>The page describes some basic concepts that are important to understand
when working with RichTextKit.</p>
<h2 id="text-blocks">Text Blocks</h2>
2019-08-04 09:40:57 +03:00
<p>RichTextKit primarily works with individual blocks of text where each is
essentially a paragraph.</p>
<p>RichTextKit doesn't provide document level constructs like list items,
2019-08-03 18:02:45 +03:00
block quotes, tables, table cells, divs etc... This project is strictly
about laying out a single block of text with rich (aka attributed) text
and then providing various functions for working with that text block.</p>
<p>RichTextKit doesn't have a concept of a DOM - that's a higher level concept
that RichTextKit is not attempting to solve.</p>
2019-08-04 09:40:57 +03:00
<p>Text blocks can contain forced line breaks with a newline character (<code>\n</code>).
These should be considered as in-paragraph &quot;soft returns&quot;, as opposed to
hard paragraph terminators.</p>
2019-08-03 18:02:45 +03:00
<p>Currently RichTextKit doesn't support non-text inline elements, although this
may be added in the future.</p>
<p>Text blocks are represented by the <a href="./ref/Topten.RichTextKit.TextBlock">TextBlock</a> class.</p>
<h2 id="styles">Styles</h2>
<p>Styles are used to describe the display attributes of text in a text block. A
text block is comprised of one of more runs of text each tagged with a
2019-08-04 09:40:57 +03:00
single style and these text runs are referred to a &quot;style runs&quot;.</p>
2019-08-03 18:02:45 +03:00
<p>Style runs are represented by the <a href="./ref/Topten.RichTextKit.StyleRun">StyleRun</a> class.</p>
<h2 id="font-fallback">Font Fallback</h2>
<p>Font fallback is the process of switching to a different font when the specified
font doesn't contain the required glyphs.</p>
<p>Font fallback is often required to display emoji characters (since most fonts don't
include emoji glyphs), for most complex scripts (eg: Arabic) and Asian scripts.</p>
<p>RichTextKit uses SkiaSharp's <code>MatchCharacter</code> function to resolve the typeface to
use for font fallback.</p>
<h2 id="font-shaping">Font Shaping</h2>
<p>For most Latin based languages, rendering text is simply a matter of placing one glyph
2019-08-04 09:40:57 +03:00
after another in a left-to-right manner. Other languages however require a more
complicated process that often involves drawing multiple glyphs for a single character.
This process is called &quot;font shaping&quot;.</p>
<p>By default Skia (and therefore ShiaSharp) doesn't do font shaping and often displays
text incorrectly. RichTextKit uses <a href="https://www.freedesktop.org/wiki/Software/HarfBuzz/">HarfBuzz</a>
for text shaping.</p>
2019-08-03 18:02:45 +03:00
<h2 id="font-runs">Font Runs</h2>
2019-08-04 09:40:57 +03:00
<p>Font run are derived by splitting style runs into smaller segments when text
is wrapped onto a new line, or when a font fallback is required.</p>
<p>Each font run represents a single run of text that uses the same font and other
style attributes for every character in the run.</p>
2019-08-03 18:02:45 +03:00
<p>Font runs are represented by the <a href="./ref/Topten.RichTextKit.FontRun">FontRun</a> class.</p>
<h2 id="style-runs-vs-font-runs">Style Runs vs Font Runs</h2>
2019-08-04 09:40:57 +03:00
<p>Style runs and font runs are similar concepts but serve two different purposes:</p>
<ul>
<li>Style runs describe the logical view of a text block (before layout)</li>
<li>Font runs describe the physical view of a text block (after layout)</li>
</ul>
2019-08-03 18:02:45 +03:00
<h2 id="lines">Lines</h2>
<p>After a text block has been laid out it results in a set of lines each
consisting of one or more font runs.</p>
<p>Lines are represented by the <a href="./ref/Topten.RichTextKit.TextLine">TextLine</a> class.</p>
<h2 id="bi-directional-ltr-and-rtl-text">Bi-Directional, LTR and RTL Text</h2>
<p>Bi-directional text is the process of correctly displaying text for mixed
left-to-right and right-to-left based languages.</p>
<p>RichTextKit includes an implementation of the Unicode Bi-directional Text
2019-08-04 09:40:57 +03:00
Algorithm (<a href="http://www.unicode.org/reports/tr9/">UAX #9</a>) and each text block
has a &quot;base direction&quot; that controls the default layout order of text in that
paragraph.</p>
2019-08-03 18:02:45 +03:00
<p>Embedded control characters (as specifed by UAX #9) can be used to control
2019-08-04 09:40:57 +03:00
bi-directional text formatting. Specifying the text direction of a text run
through styles is currently not supported and embedded control characters
must be used.</p>
2019-08-03 18:02:45 +03:00
<h2 id="utf-16-utf-32-code-points-and-clusters">UTF-16, UTF-32, Code Points and Clusters</h2>
<p>Internally, RichTextKit works with UTF-32 encoded text. Specifically it uses the
<code>int</code> type to represent a character rather than the <code>char</code> type.</p>
<p>To avoid confusion, the API and code base use the term &quot;code point&quot; to
refer to a single UTF-32 character.</p>
<p>Since C# strings are encoded as UTF-16, when a string is added to a text
block it will be automatically converted to UTF-32. It's important to note that
indexes into the converted string may no longer match indicies into the original
C# string. (although RichTextKit does provide functions to map between them).</p>
<p>RichTextKit uses the term <code>CodePointIndex</code> to refer to the index of a code point
in a UTF-32 buffer.</p>
2019-08-04 09:40:57 +03:00
<p>Also note that line endings <code>\r</code> and <code>\r\n</code> are automatically normalized to <code>\n</code>.
2019-08-03 18:02:45 +03:00
This conversion can also affect the mapping of code point indicies to character
indicies in the original string.</p>
<p><em>(Note: <code>\n\r</code> style line endings aren't supported)</em></p>
<p>Some complex scripts require more that one code point to describe a single user
perceived character. These groupings of code points are called &quot;clusters&quot;. This is
the same concept as used by HarfBuzz.</p>
<h2 id="measurement-and-size-units">Measurement and Size Units</h2>
2019-08-04 09:40:57 +03:00
<p>All measurements and sizes used by RichTextKit are logical.</p>
<p>The final size of rendered text will depend on the configuration of the target
Canvas scaling and it's up to you to determine the appropriate system of measurement
to use.</p>
2019-08-03 18:02:45 +03:00