CommonMark
==========

CommonMark is a rationalized version of Markdown syntax,
with a [spec][the spec] and BSD3-licensed reference
implementations in C and JavaScript.

[Try it now!](http://spec.commonmark.org/dingus.html)

The implementations
-------------------

The C implementation provides both a shared library (`libcmark`) and a
standalone program `cmark` that converts CommonMark to HTML. It is
written in standard C99 and has no library dependencies. The parser is
very fast (see [benchmarks](benchmarks.md)).

It is easy to use `libcmark` in Python or Ruby code: see `wrapper.py`
and `wrapper.rb` in the repository for simple examples.

The JavaScript implementation is a single JavaScript file, with
no dependencies, that can be linked to in an HTML page. Here
is a simple usage example:

``` javascript
var reader = new commonmark.DocParser();
var writer = new commonmark.HtmlRenderer();
var parsed = reader.parse("Hello *world*"); // parsed is a document tree
var result = writer.render(parsed);         // result is a string of HTML
```

A node package is also available; it includes a command-line tool called
`commonmark`.

**A note on security:**
Neither implementation attempts to sanitize link attributes or
raw HTML. If you use these libraries in applications that accept
untrusted user input, you must run the output through an HTML
sanitizer to protect against
[XSS attacks](http://en.wikipedia.org/wiki/Cross-site_scripting).

Installing (C)
--------------

Building the C program (`cmark`) and shared library (`libcmark`)
requires [cmake] and [re2c], which is used to generate `scanners.c` from
`scanners.re`. (Note that [re2c] is only a build dependency for
developers, since `scanners.c` can be provided in a released source
tarball.)

If you have GNU make, you can simply `make`, `make test`, and `make
install`. This calls [cmake] to create a `Makefile` in the `build`
directory, then uses that `Makefile` to create the executable and
library.

For a more portable method, you can use [cmake] manually. [cmake] knows
how to create build environments for many build systems. For example,
on FreeBSD:

    mkdir build
    cd build
    cmake ..  # optionally: -DCMAKE_INSTALL_PREFIX=path
    make      # executable will be created as build/src/cmark
    make test
    make install

Or, to create Xcode project files on OS X:

    mkdir build
    cd build
    cmake -G Xcode ..
    make
    make test
    make install

Tests can also be run manually on any executable `$PROG` using:

    python runtests.py --program $PROG

If you want to extract the raw test data from the spec without
actually running the tests, you can do:

    python runtests.py --dump-tests

and you'll get all the tests in JSON format.
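
As a rough sketch of how the dumped tests might be consumed, the function
below runs a converter over a list of test cases and collects the
mismatches. It assumes each test case is an object with `markdown` and
`html` fields; those field names, and the names `checkTests` and `convert`,
are assumptions made for illustration and are not specified in this README.

```javascript
// Hypothetical helper: run `convert` (a function from Markdown text to
// HTML text) over an array of test cases and report the mismatches.
// Assumes each test case looks like { markdown: "...", html: "..." }.
function checkTests(tests, convert) {
  var failures = [];
  tests.forEach(function (test, index) {
    var actual = convert(test.markdown);
    if (actual !== test.html) {
      failures.push({ index: index, expected: test.html, actual: actual });
    }
  });
  return { passed: tests.length - failures.length, failures: failures };
}

// Usage with a stand-in converter that only wraps its input in <p> tags.
var sampleTests = [
  { markdown: "hello\n", html: "<p>hello</p>\n" },
  { markdown: "*hi*\n", html: "<p><em>hi</em></p>\n" }
];
var report = checkTests(sampleTests, function (markdown) {
  return "<p>" + markdown.trim() + "</p>\n";
});
console.log(report.passed + " passed, " + report.failures.length + " failed");
```

In practice `convert` would wrap a real implementation. Exact string
comparison is the strictest possible check; a real harness may prefer to
normalize insignificant differences in the HTML before comparing.
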

The GNU Makefile also provides a few other targets for developers.
To run a "fuzz test" against ten long randomly generated inputs:

    make fuzztest

To run a test for memory leaks using valgrind:

    make leakcheck

To make a release tarball and zip archive:

    make archive

To test the archives:

    make testarchive

Compiling for Windows
---------------------

You can cross-compile a Windows binary and DLL on Linux if you have the
`mingw32` compiler:

    make mingw

The binaries will be in `build-mingw/windows/bin`.

Installing (JavaScript)
-----------------------

The JavaScript library can be installed through `npm`:

    npm install commonmark

To build the JavaScript library as a single standalone file:

    browserify --standalone commonmark js/lib/index.js -o js/commonmark.js

Or fetch a pre-built copy from
<http://spec.commonmark.org/js/commonmark.js>.

To run tests for the JavaScript library:

    make testjs

or

    node js/test.js

The spec
--------

[The spec] contains over 500 embedded examples which serve as conformance
tests. To run the tests for `cmark`, do `make test`. To run them for
another Markdown program, say `myprog`, do `make test PROG=myprog`. To
run the tests for `commonmark.js`, do `make testjs`.

[The spec]: http://jgm.github.io/CommonMark/spec.html

The source of [the spec] is `spec.txt`. This is basically a Markdown
file, with code examples written in a shorthand form:

    .
    Markdown source
    .
    expected HTML output
    .
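
The shorthand above can be read mechanically. The following is a minimal
sketch, not the repository's own test extractor: it treats a line consisting
of a single `.` as a delimiter and collects the Markdown and HTML parts that
follow. The function name `extractExamples` and the shape of the returned
objects are assumptions made for illustration.

```javascript
// Hypothetical helper: pull (markdown, html) pairs out of text that uses
// the shorthand: ".", Markdown source, ".", expected HTML output, ".".
function extractExamples(text) {
  var examples = [];
  var state = 0;      // 0 = outside an example, 1 = in Markdown, 2 = in HTML
  var markdown = "";
  var html = "";
  text.split("\n").forEach(function (line) {
    // Ignore trailing whitespace (such as a stray "\r") on delimiter lines.
    if (line.replace(/\s+$/, "") === ".") {
      if (state === 2) {
        // Third dot: the example is complete.
        examples.push({ markdown: markdown, html: html });
        markdown = "";
        html = "";
      }
      state = (state + 1) % 3;
    } else if (state === 1) {
      markdown += line + "\n";
    } else if (state === 2) {
      html += line + "\n";
    }
  });
  return examples;
}

// Usage on a small made-up document.
var sample = [
  "Some prose.",
  "",
  ".",
  "Hello *world*",
  ".",
  "<p>Hello <em>world</em></p>",
  ".",
  "",
  "More prose."
].join("\n");
var examples = extractExamples(sample);
console.log(JSON.stringify(examples));
```

This sketch assumes that a lone `.` never appears on a line of its own
outside an example; a more careful reader would need to handle that case.
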

To build an HTML version of the spec, do `make spec.html`. To build a
PDF version, do `make spec.pdf`. Both these commands require that
[pandoc] is installed, and creating a PDF requires a LaTeX installation.

The spec is written from the point of view of the human writer, not
the computer reader. It is not an algorithm (an English translation of
a computer program) but a declarative description of what counts as a block
quote, a code block, and each of the other structural elements that can
make up a Markdown document.

Because John Gruber's [canonical syntax
description](http://daringfireball.net/projects/markdown/syntax) leaves
many aspects of the syntax undetermined, writing a precise spec requires
making a large number of decisions, many of them somewhat arbitrary.
In making them, we have appealed to existing conventions and
considerations of simplicity, readability, expressive power, and
consistency. We have tried to ensure that "normal" documents in the many
incompatible existing implementations of Markdown will render, as far as
possible, as their authors intended. And we have tried to make the rules
for different elements work together harmoniously. In places where
different decisions could have been made (for example, the rules
governing list indentation), we have explained the rationale for
our choices. In a few cases, we have departed slightly from the canonical
syntax description, in ways that we think further the goals of Markdown
as stated in that description.

For the most part, we have limited ourselves to the basic elements
described in Gruber's canonical syntax description, eschewing extensions
like footnotes and definition lists. It is important to get the core
right before considering such things. However, we have included a visible
syntax for line breaks and fenced code blocks.

Differences from original Markdown
----------------------------------

There are only a few places where this spec says things that contradict
the canonical syntax description:

- It [allows all punctuation symbols to be
  backslash-escaped](http://jgm.github.io/CommonMark/spec.html#backslash-escapes),
  not just the symbols with special meanings in Markdown. We found
  that it was just too hard to remember which symbols could be
  escaped.

- It introduces an [alternative syntax for hard line
  breaks](http://jgm.github.io/CommonMark/spec.html#hard-line-breaks), a
  backslash at the end of the line, supplementing the
  two-spaces-at-the-end-of-line rule. This is motivated by persistent
  complaints about the “invisible” nature of the two-space rule.

- Link syntax has been made a bit more predictable (in a
  backwards-compatible way). For example, `Markdown.pl` allows single
  quotes around a title in inline links, but not in reference links.
  This kind of difference is really hard for users to remember, so the
  spec [allows single quotes in both
  contexts](http://jgm.github.io/CommonMark/spec.html#links).

- The rule for HTML blocks differs, though in most real cases it
  shouldn't make a difference. (See
  [here](http://jgm.github.io/CommonMark/spec.html#html-blocks) for
  details.) The spec's proposal makes it easy to include Markdown
  inside HTML block-level tags, if you want to, but also allows you to
  exclude this. It also makes parsing much easier, avoiding
  expensive backtracking.

- It does not collapse adjacent bird-track blocks into a single
  blockquote:

      > this is two

      > blockquotes

      > this is a single
      >
      > blockquote with two paragraphs

- Rules for content in lists differ in a few respects, though (as with
  HTML blocks), most lists in existing documents should render as
  intended. There is some discussion of the choice points and
  differences [here](http://jgm.github.io/CommonMark/spec.html#motivation).
  We think that the spec's proposal does better than any existing
  implementation in rendering lists the way a human writer or reader
  would intuitively understand them. (We could give numerous examples
  of perfectly natural looking lists that nearly every existing
  implementation flubs up.)

- The spec stipulates that two blank lines break out of all list
  contexts. This is an attempt to deal with issues that often come up
  when someone wants to have two adjacent lists, or a list followed by
  an indented code block.

- Changing bullet characters, or changing from bullets to numbers or
  vice versa, starts a new list. We think that is almost always going
  to be the writer's intent.

- The number that begins an ordered list item may be followed by
  either `.` or `)`. Changing the delimiter style starts a new
  list.

- The start number of an ordered list is significant.

- [Fenced code blocks](http://jgm.github.io/CommonMark/spec.html#fenced-code-blocks) are supported, delimited by either
  backticks (```` ``` ````) or tildes (` ~~~ `).
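
To make the list-marker rules in the items above concrete, here is a small
sketch of the decision "does this marker continue the current list or start
a new one?". It is an illustration under simplifying assumptions, not code
from either reference implementation: the helper names `classifyMarker` and
`startsNewList` are hypothetical, markers are passed in as already-isolated
strings, and the two-blank-lines rule is not modeled.

```javascript
// Hypothetical helper: classify a list marker such as "-", "*", "3." or "3)".
function classifyMarker(marker) {
  var ordered = /^(\d+)([.)])$/.exec(marker);
  if (ordered) {
    return {
      type: "ordered",
      delimiter: ordered[2],
      number: parseInt(ordered[1], 10)
    };
  }
  if (/^[-+*]$/.test(marker)) {
    // For bullets, the bullet character itself plays the delimiter role.
    return { type: "bullet", delimiter: marker };
  }
  return null;
}

// A new list starts when the list type changes (bullets vs. numbers) or
// when the bullet character or ordered-list delimiter changes.
function startsNewList(previousMarker, nextMarker) {
  var previous = classifyMarker(previousMarker);
  var next = classifyMarker(nextMarker);
  if (!previous || !next) {
    throw new Error("unrecognized list marker");
  }
  return previous.type !== next.type || previous.delimiter !== next.delimiter;
}

console.log(startsNewList("-", "-"));   // false: same bullet character
console.log(startsNewList("-", "*"));   // true: bullet character changed
console.log(startsNewList("1.", "2.")); // false: same delimiter style
console.log(startsNewList("1.", "2)")); // true: delimiter style changed
console.log(startsNewList("*", "1."));  // true: bullets to numbers
```

The `number` field is kept because the start number of an ordered list is
significant; only the first item's number would matter for that purpose.
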

Contributing
------------

There is a [forum for discussing
CommonMark](http://talk.commonmark.org); you should use it instead of
GitHub issues for questions and possibly open-ended discussions.
Use the [GitHub issue tracker](http://github.com/jgm/CommonMark/issues)
only for simple, clear, actionable issues.

Authors
-------

The spec was written by John MacFarlane, drawing on

- his experience writing and maintaining Markdown implementations in several
  languages, including the first Markdown parser not based on regular
  expression substitutions ([pandoc](http://github.com/jgm/pandoc)) and
  the first Markdown parsers based on PEG grammars
  ([peg-markdown](http://github.com/jgm/peg-markdown),
  [lunamark](http://github.com/jgm/lunamark)),
- a detailed examination of the differences between existing Markdown
  implementations using [BabelMark 2](http://johnmacfarlane.net/babelmark2/),
  and
- extensive discussions with David Greenspan, Jeff Atwood, Vicent
  Marti, Neil Williams, and Benjamin Dumke-von der Ehe.

John MacFarlane was also responsible for the original versions of the
C and JavaScript implementations. The block parsing algorithm was
worked out together with David Greenspan. Vicent Marti
optimized the C implementation for performance, increasing its speed
tenfold. Kārlis Gaņģis helped work out a better parsing algorithm
for links and emphasis, eliminating several worst-case performance
issues. Nick Wellnhofer contributed many improvements, including
most of the C library's API and its test harness.

[cmake]: http://www.cmake.org/download/
[pandoc]: http://johnmacfarlane.net/pandoc/
[re2c]: http://re2c.org