From 0b4b41099b435344006148d2a55c429582d11853 Mon Sep 17 00:00:00 2001
From: Chris Lattner
Date: Sun, 13 Sep 2009 17:23:57 +0000
Subject: [PATCH] rewrite readme.

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@81678 91177308-0d34-0410-b5e6-96231b3b80d8
---
 README.txt | 186 +++++------------------------------------------------
 1 file changed, 17 insertions(+), 169 deletions(-)

diff --git a/README.txt b/README.txt
index 611dc9d2c0..924ecc416c 100644
--- a/README.txt
+++ b/README.txt
@@ -1,178 +1,26 @@
 //===----------------------------------------------------------------------===//
 // C Language Family Front-end
 //===----------------------------------------------------------------------===//
- Chris Lattner
 
-I. Introduction:
-
- clang: noun
- 1. A loud, resonant, metallic sound.
- 2. The strident call of a crane or goose.
- 3. C-language family front-end toolkit.
+Welcome to Clang. This is a compiler front-end for the C family of languages
+(C, C++, Objective-C, and Objective-C++) which is built as part of the LLVM
+compiler intrastructure project.
 
- The world needs better compiler tools, tools which are built as libraries. This
- design point allows reuse of the tools in new and novel ways. However, building
- the tools as libraries isn't enough: they must have clean APIs, be as
- decoupled from each other as possible, and be easy to modify/extend. This
- requires clean layering, decent design, and avoiding tying the libraries to a
- specific use. Oh yeah, did I mention that we want the resultant libraries to
- be as fast as possible? :)
+Unlike many other compiler frontends, Clang is useful for a number of things
+beyond just compiling code: we intend for Clang to be host to a number of
+different source level tools. One example of this is the Clang Static Analyzer.
 
- This front-end is built as a component of the LLVM toolkit that can be used
- with the LLVM backend or independently of it. In this spirit, the API has been
- carefully designed as the following components:
-
- libsupport - Basic support library, reused from LLVM.
+If you're interested in more (including how to build Clang) it is best to read
+the relevant web sites. Here are some pointers:
 
- libsystem - System abstraction library, reused from LLVM.
-
- libbasic - Diagnostics, SourceLocations, SourceBuffer abstraction,
- file system caching for input source files. This depends on
- libsupport and libsystem.
+Information on Clang: http://clang.llvm.org/
+Building and using Clang: http://clang.llvm.org/get_started.html
+Clang Static Analyzer: http://clang-analyzer.llvm.org/
+Information on the LLVM project: http://llvm.org/
 
- libast - Provides classes to represent the C AST, the C type system,
- builtin functions, and various helpers for analyzing and
- manipulating the AST (visitors, pretty printers, etc). This
- library depends on libbasic.
-
-
- liblex - C/C++/ObjC lexing and preprocessing, identifier hash table,
- pragma handling, tokens, and macros. This depends on libbasic.
-
- libparse - C (for now) parsing and local semantic analysis. This library
- invokes coarse-grained 'Actions' provided by the client to do
- stuff (e.g. libsema builds ASTs). This depends on liblex.
-
- libsema - Provides a set of parser actions to build a standardized AST
- for programs. AST's are 'streamed' out a top-level declaration
- at a time, allowing clients to use decl-at-a-time processing,
- build up entire translation units, or even build 'whole
- program' ASTs depending on how they use the APIs. This depends
- on libast and libparse.
-
- librewrite - Fast, scalable rewriting of source code. This operates on
- the raw syntactic text of source code, allowing a client
- to insert and delete text in very large source files using
- the same source location information embedded in ASTs. This
- is intended to be a low-level API that is useful for
- higher-level clients and libraries such as code refactoring.
-
- libanalysis - Source-level dataflow analysis useful for performing analyses
- such as computing live variables. It also includes a
- path-sensitive "graph-reachability" engine for writing
- analyses that reason about different possible paths of
- execution through source code. This is currently being
- employed to write a set of checks for finding bugs in software.
-
- libcodegen - Lower the AST to LLVM IR for optimization & codegen. Depends
- on libast.
-
- clang - An example driver, client of the libraries at various levels.
- This depends on all these libraries, and on LLVM VMCore.
-
- This front-end has been intentionally built as a DAG of libraries, making it
- easy to reuse individual parts or replace pieces if desired. For example, to
- build a preprocessor, you take the Basic and Lexer libraries. If you want an
- indexer, you take those plus the Parser library and provide some actions for
- indexing. If you want a refactoring, static analysis, or source-to-source
- compiler tool, it makes sense to take those plus the AST building and semantic
- analyzer library. Finally, if you want to use this with the LLVM backend,
- you'd take these components plus the AST to LLVM lowering code.
-
- In the future I hope this toolkit will grow to include new and interesting
- components, including a C++ front-end, ObjC support, and a whole lot of other
- things.
-
- Finally, it should be pointed out that the goal here is to build something that
- is high-quality and industrial-strength: all the obnoxious features of the C
- family must be correctly supported (trigraphs, preprocessor arcana, K&R-style
- prototypes, GCC/MS extensions, etc). It cannot be used if it is not 'real'.
-
-
-II. Usage of clang driver:
-
- * Basic Command-Line Options:
- - Help: clang --help
- - Standard GCC options accepted: -E, -I*, -i*, -pedantic, -std=c90, etc.
- - To make diagnostics more gcc-like: -fno-caret-diagnostics -fno-show-column
- - Enable metric printing: -stats
-
- * -fsyntax-only is currently the default mode.
-
- * -E mode works the same way as GCC.
-
- * -Eonly mode does all preprocessing, but does not print the output,
- useful for timing the preprocessor.
-
- * -fsyntax-only is currently partially implemented, lacking some
- semantic analysis (some errors and warnings are not produced).
-
- * -parse-noop parses code without building an AST. This is useful
- for timing the cost of the parser without including AST building
- time.
-
- * -parse-ast builds ASTs, but doesn't print them. This is most
- useful for timing AST building vs -parse-noop.
-
- * -parse-ast-print pretty prints most expression and statements nodes.
-
- * -parse-ast-check checks that diagnostic messages that are expected
- are reported and that those which are reported are expected.
-
- * -dump-cfg builds ASTs and then CFGs. CFGs are then pretty-printed.
-
- * -view-cfg builds ASTs and then CFGs. CFGs are then visualized by
- invoking Graphviz.
-
- For more information on getting Graphviz to work with clang/LLVM,
- see: http://llvm.org/docs/ProgrammersManual.html#ViewGraph
-
-
-III. Current advantages over GCC:
-
- * Column numbers are fully tracked (no 256 col limit, no GCC-style pruning).
- * All diagnostics have column numbers, includes 'caret diagnostics', and they
- highlight regions of interesting code (e.g. the LHS and RHS of a binop).
- * Full diagnostic customization by client (can format diagnostics however they
- like, e.g. in an IDE or refactoring tool) through DiagnosticClient interface.
- * Built as a framework, can be reused by multiple tools.
- * All languages supported linked into same library (no cc1,cc1obj, ...).
- * mmap's code in read-only, does not dirty the pages like GCC (mem footprint).
- * LLVM License, can be linked into non-GPL projects.
- * Full diagnostic control, per diagnostic. Diagnostics are identified by ID.
- * Significantly faster than GCC at semantic analysis, parsing, preprocessing
- and lexing.
- * Defers exposing platform-specific stuff to as late as possible, tracks use of
- platform-specific features (e.g. #ifdef PPC) to allow 'portable bytecodes'.
- * The lexer doesn't rely on the "lexer hack": it has no notion of scope and
- does not categorize identifiers as types or variables -- this is up to the
- parser to decide.
-
-Potential Future Features:
-
- * Fine grained diag control within the source (#pragma enable/disable warning).
- * Better token tracking within macros? (Token came from this line, which is
- a macro argument instantiated here, recursively instantiated here).
- * Fast #import with a module system.
- * Dependency tracking: change to header file doesn't recompile every function
- that texually depends on it: recompile only those functions that need it.
- This is aka 'incremental parsing'.
-
-
-IV. Missing Functionality / Improvements
-
-Lexer:
- * Source character mapping. GCC supports ASCII and UTF-8.
- See GCC options: -ftarget-charset and -ftarget-wide-charset.
- * Universal character support. Experimental in GCC, enabled with
- -fextended-identifiers.
- * -fpreprocessed mode.
-
-Preprocessor:
- * #assert/#unassert
- * MSExtension: "L#param" stringizes to a wide string literal.
- * Add support for -M*
-
-Traditional Preprocessor:
- * Currently, we have none. :)
+If you have questions or comments about Clang, a great place to discuss them is
+on the Clang development mailing list:
+ http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
 
+If you find a bug in Clang, please file it in the LLVM bug tracker:
+ http://llvm.org/bugs/