Features of Clang

This page outlines the main goals of Clang and gives some compelling reasons to consider using it. The Goals section below is a brief, bulleted overview of the goals and features we are striving for in the development of Clang. The Key Features section gives a more detailed presentation of what we believe are the key drawing points of the Clang front-end.

If you are new to the Clang front-end and want a reason to consider using or working on it, be sure to check out the Key Features section.

Goals

Real-world, production quality compiler
Unified parser for C-based languages

We are only focusing on the C family of languages (C, C++, ObjC); however, if someone wants to work on another language, they are free to take charge of such a project.

Language conformance with C, ObjC, C++, including dialects (C99, C90, ObjC2, etc.)
GCC compatibility (GCC extensions and 'quirks', where it makes sense)
Library based architecture with finely crafted C++ APIs

Makes Clang easier to work with and more flexible.

(more details on this in the "Key Features" section)

Easy to extend

(because of the library based architecture)

Multipurpose

Can be used for: Indexing, static analysis, code generation, source to source tools, refactoring

(because of library based architecture)

High performance

Extremely fast (much faster than GCC), low memory footprint, uses lazy evaluation and multithreading

(more details in the "Key Features" section)

Better integration with IDEs

(more details in the "Key Features" section)

Expressive diagnostics

Error reporting and diagnostic messages are more detailed and accurate than GCC's.

(more details in the "Key Features" section)

BSD License

Fewer restrictions on developers; allows for use in commercial products.

Key Features

There are several key features which we believe make Clang an exciting front-end. These features are designed to make things easier for both the compiler developer (people working on Clang and derivative products) and the application developer (those who use Clang/LLVM).

Library based architecture

A major design concept for the LLVM front-end is its library based architecture. The various parts of the front-end are cleanly divided into separate libraries, which can then be mixed and matched for different needs and uses. The library based approach also makes it much easier for new developers to get involved and extend LLVM to do new and unique things. In the words of Chris,

"The world needs better compiler tools, tools which are built as libraries. This design point allows reuse of the tools in new and novel ways. However, building the tools as libraries isn't enough: they must have clean APIs, be as decoupled from each other as possible, and be easy to modify/extend. This requires clean layering, decent design, and keeping the libraries independent of any specific client."

Currently, the LLVM front-end is divided into the following libraries:

libsupport - Basic support library, reused from LLVM.
libsystem - System abstraction library, reused from LLVM.
libbasic - Diagnostics, SourceLocations, SourceBuffer abstraction, file system caching for input source files. (depends on above libraries)
libast - Provides classes to represent the C AST, the C type system, builtin functions, and various helpers for analyzing and manipulating the AST (visitors, pretty printers, etc). (depends on above libraries)
liblex - C/C++/ObjC lexing and preprocessing, identifier hash table, pragma handling, tokens, and macros. (depends on above libraries)
libparse - Parsing and local semantic analysis. This library invokes coarse-grained 'Actions' provided by the client (e.g. libsema's actions build ASTs). (depends on above libraries)
libsema - Provides a set of parser actions to build a standardized AST for programs. ASTs are 'streamed' out one top-level declaration at a time, allowing clients to use decl-at-a-time processing, build up entire translation units, or even build 'whole program' ASTs depending on how they use the APIs. (depends on libast and libparse)
libcodegen - Lowers the AST to LLVM IR for optimization and codegen. (depends on libast)
librewrite - Editing of text buffers. (depends on libast)
libanalysis - Static analysis support. (depends on libast)
clang - An example driver, client of the libraries at various levels. (depends on above libraries, and LLVM VMCore)

As an example of the power of this library based design: if you want to build a preprocessor, you take the Basic and Lexer libraries. If you want an indexer, you take those two and add the Parser library and some actions for indexing. If you want a refactoring, static analysis, or source-to-source compiler tool, you then add the AST building and semantic analyzer libraries. In the end, LLVM's library based design will provide developers with many more possibilities.
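To make the layering concrete, here is a minimal C++ sketch of how a parser can hand its results to client-supplied actions. It is a toy illustration rather than Clang's actual API: Actions, ToyParser, IndexerActions, AstBuilderActions, and the one-rule "type name;" grammar are all hypothetical names invented for this example.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical callback interface: the parser reports what it recognizes
// and the client decides what to do with it.
struct Actions {
    virtual ~Actions() = default;
    virtual void onVarDecl(const std::string &type, const std::string &name) = 0;
};

// Toy parser for a sequence of "<type> <name>;" declarations.
// It has no error handling and knows nothing about its client.
class ToyParser {
public:
    explicit ToyParser(Actions &actions) : actions_(actions) {}

    // Returns the number of declarations reported to the client.
    int parse(const std::string &source) {
        std::istringstream in(source);
        std::string stmt;
        int count = 0;
        while (std::getline(in, stmt, ';')) {   // split on ';'
            std::istringstream fields(stmt);
            std::string type, name;
            if (fields >> type >> name) {        // skip empty or partial statements
                actions_.onVarDecl(type, name);
                ++count;
            }
        }
        return count;
    }

private:
    Actions &actions_;
};

// Client 1: an "indexer" that only records declared names.
struct IndexerActions : Actions {
    std::vector<std::string> names;
    void onVarDecl(const std::string &, const std::string &name) override {
        names.push_back(name);
    }
};

// Client 2: an "AST builder" that keeps one node per declaration.
struct VarDeclNode {
    std::string type;
    std::string name;
};

struct AstBuilderActions : Actions {
    std::vector<VarDeclNode> decls;
    void onVarDecl(const std::string &type, const std::string &name) override {
        decls.push_back(VarDeclNode{type, name});
    }
};

// Usage sketch: one parser implementation, two very different clients.
void runBothClients(const std::string &source) {
    IndexerActions indexer;
    AstBuilderActions builder;
    ToyParser indexingParser(indexer);
    ToyParser buildingParser(builder);
    int indexed = indexingParser.parse(source);
    int built = buildingParser.parse(source);
    assert(indexed == built);  // both clients saw the same declarations
}
```

The same ToyParser drives both clients: the indexer keeps only names, while the AST builder keeps a node per declaration. This is roughly the relationship the library list describes between libparse and libsema, reduced to a few lines.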

Speed and Memory

Another major focus of LLVM's frontend is speed (for all libraries). Even at this early stage, the clang front-end is quicker than gcc and uses less memory.

Memory:

This test was run using Mac OS X's Carbon.h header, which is 12.3MB spread across 558 files! Although large headers are one of the worst-case scenarios for GCC, they are very common, and this test shows how clang's implementation is significantly more memory efficient.

Performance:

Even at this early stage, the C parser for Clang is able to achieve significantly better performance. Many optimizations are still possible of course.

Performance:

By using very trivial file-system caching, clang can significantly speed up preprocessing-bound applications like distcc.
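This page does not spell out the caching scheme, so the following is only a sketch of one simple possibility: memoizing file-existence checks so that repeated probes for the same header path reach the file system only once. StatCache and its injected probe callback are hypothetical and are not part of Clang.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical cache of "does this path exist?" answers.
// The real lookup is injected so the sketch stays independent of any OS API.
class StatCache {
public:
    using Probe = std::function<bool(const std::string &)>;

    explicit StatCache(Probe probe) : probe_(std::move(probe)) {
        assert(probe_ && "a callable probe is required");
    }

    // Returns the cached answer if present; otherwise asks the probe once
    // and remembers the result (including negative results).
    bool exists(const std::string &path) {
        auto it = cache_.find(path);
        if (it != cache_.end())
            return it->second;
        ++probes_;
        bool found = probe_(path);
        cache_.emplace(path, found);
        return found;
    }

    // Number of lookups that actually reached the probe.
    int probeCount() const { return probes_; }

private:
    Probe probe_;
    std::unordered_map<std::string, bool> cache_;
    int probes_ = 0;
};
```

Caching negative results matters as much as caching hits here, since resolving an include typically means probing several search directories that do not contain the file. A real implementation would also need an invalidation policy, which this sketch deliberately leaves out.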

Expressive Diagnostics

Clang is designed to efficiently capture range information for expressions and statements, which allows it to emit very detailed diagnostic information when a problem is detected.

Clang vs GCC:

There are several things to take note of in this example:

The error messages from Clang are more detailed.
Clang shows you exactly where the error is, plus the range it has a problem with.

Note: The first results are from clang; the second results are from gcc.
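As a rough illustration of why captured ranges matter, the sketch below renders a diagnostic from a caret position plus a column range, marking the caret with '^' and the rest of the range with '~'. ColumnRange and renderDiagnostic are hypothetical helpers written for this example; they are not Clang's diagnostic API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical half-open column range [begin, end) within a single source line.
struct ColumnRange {
    std::size_t begin;
    std::size_t end;
};

// Builds a three-line diagnostic: the message, the source line, and a marker
// line that underlines the range with '~' and points at the caret with '^'.
std::string renderDiagnostic(const std::string &sourceLine, std::size_t caret,
                             ColumnRange range, const std::string &message) {
    assert(range.begin <= range.end && range.end <= sourceLine.size());
    assert(caret < sourceLine.size());

    std::string markers(sourceLine.size(), ' ');
    for (std::size_t i = range.begin; i < range.end; ++i)
        markers[i] = '~';
    markers[caret] = '^';  // the caret wins over the range marker

    // Drop trailing spaces so the marker line ends at the last marker.
    markers.erase(markers.find_last_not_of(' ') + 1);

    return "error: " + message + "\n" + sourceLine + "\n" + markers + "\n";
}
```

For the line "x = a + b;" with the caret on the '+' (column 6) and the range covering "a + b" (columns 4 up to 9), the marker line is four spaces followed by ~~^~~. Without the range, only the single caret column could be shown.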

Better Integration with IDEs

Another design goal of Clang is to integrate extremely well with IDEs. IDEs have very different requirements from code generation and often need information that a codegen-only front-end can throw away. Clang is specifically designed and built to capture this information.