254 строки
12 KiB
ReStructuredText
254 строки
12 KiB
ReStructuredText
|
============
|
||
|
HLSL Changes
|
||
|
============
|
||
|
|
||
|
Introduction
|
||
|
============
|
||
|
|
||
|
This document is meant to describe the changes that have been made to the LLVM
|
||
|
and Clang codebases to support HLSL, the High-Level Shading Language. The
|
||
|
focus is on design changes and general approach rather than specific changes
|
||
|
made to support the language and runtime.
|
||
|
|
||
|
Driving Forces
|
||
|
==============
|
||
|
|
||
|
This section outlines the considerations that have prompted multiple changes.
|
||
|
|
||
|
* Provide a successor to the fxc.exe and d3dcompiler_47.dll components. This
|
||
|
means providing a DLL that can be used in a variety of situations, as well
|
||
|
as a command-line program to access its functionality.
|
||
|
|
||
|
* The requirement to act as a proper DLL means that it shouldn't rely on
|
||
|
process-wide state, as this prevents callers from using it from more than
|
||
|
one thread concurrently. This includes changing the current directory, using
|
||
|
stdin/stdout/stderr, or deciding to terminate the process (except for fatal
|
||
|
conditions).
|
||
|
|
||
|
* There are scenarios in which applications may choose to redistribute the
|
||
|
built DLL, for example an IDE for writing shaders, tools to instrument and
|
||
|
debug them, or a game that chooses to emit shaders at runtime. This means
|
||
|
that compilation time and DLL/program size are important considerations.
|
||
|
|
||
|
* Because the new command-line program should be a replacement of specific
|
||
|
components, it's desirable to keep the interface similar to the prior
|
||
|
versions (API in case of the DLL, command-line format in the case of the
|
||
|
program).
|
||
|
|
||
|
Forking LLVM and Clang
|
||
|
======================
|
||
|
|
||
|
This section describes why and how the HLSL on LLVM project has forked LLVM
|
||
|
and Clang.
|
||
|
|
||
|
LLVM and Clang provide an excellent starting point for building a compiler,
|
||
|
especially one that will exist in an ecosystem of organizations contributing
|
||
|
to the pipeline, whether it be in the form of tooling and abstractions by
|
||
|
middleware and tool authors, or backend compilers by hardware vendors.
|
||
|
|
||
|
The decision was made to fork LLVM and Clang rather than work directly and
|
||
|
upstream all changes for the following reasons:
|
||
|
|
||
|
* While HLSL started out as a C derivative, over time is has drifted away, not
|
||
|
to the extent that C++ and Objective-C have but certainly a fair
|
||
|
bit. Furthermore, some of the behavior was never compatible to begin with,
|
||
|
and so there are significant differences, especially in the type system,
|
||
|
that make upstreaming difficult.
|
||
|
|
||
|
* HLSL is expected to evolve over the next few years to shed some of the
|
||
|
incompatibilities that exist for historical reasons (as opposed to those
|
||
|
that provide actual developer benefit). It's entirely possible that a future
|
||
|
version will be much more closely aligned to C/C++ semantics and could be
|
||
|
more easily adapted. This version of the codebase isn't it.
|
||
|
|
||
|
* The changes to LLVM are meaningless without an execution model, which is
|
||
|
currently being worked on. Changes to LLVM are more likely to get
|
||
|
upstreamed, however we have chosen to not consider this until we have
|
||
|
multiple implementations of DXIL backends.
|
||
|
|
||
|
We have already done a 3.4 to 3.7 upgrade in the sources, and it's entirely
|
||
|
possible that this will happen again. Therefore, all changes introduces are
|
||
|
done in entirely new files, or marked with an 'HLSL Change', or marked with a
|
||
|
pair of 'HLSL Change Starts' / 'HLSL Change Ends'. This makes integrations
|
||
|
easier (or rather less difficult).
|
||
|
|
||
|
Dependency Injection
|
||
|
====================
|
||
|
|
||
|
One of the goals of dxcompiler is to be a reusable component for a variety of
|
||
|
applications. While Clang and LLVM have support for being hosted as a
|
||
|
dynamically loaded library, there is still a number of assumptions made that
|
||
|
make usage problematic, specifically:
|
||
|
|
||
|
- Usage of process-wide handles (stdin/stdout/stderr).
|
||
|
|
||
|
- Reliance on file system access (clang has some level of virtual file system
|
||
|
support in the later versions), including temporary files.
|
||
|
|
||
|
- Direct usage of memory allocation mechanisms.
|
||
|
|
||
|
- Reliance on other process-wide constructs like environment variables.
|
||
|
|
||
|
Redesigning a number of these mechanisms would require changes throughout the
|
||
|
codebase, and it's hard to whether any regressions are introduced when a large
|
||
|
number of changes are integrated.
|
||
|
|
||
|
The solution we have implemented relies on having a thread-local component
|
||
|
that can service I/O requests as well as other OS-implemented or process-wide
|
||
|
constructs. The library is then meant to be used through specific API points
|
||
|
that will set and tear down this component as appropriate. The MSFileSystem
|
||
|
class in include/llvm/Support/MSFileSystem.h provides the access point
|
||
|
(although the API should likely be renamed, as it's broader than a pur file
|
||
|
system abstraction, and 'MS' monikers are being changed to 'HLSL').
|
||
|
|
||
|
To guard against regressions, we can simply verify that no libraries include
|
||
|
APIs that are virtualized, such as calls to CreateFile. If a case is found, a
|
||
|
drop-in replacement function call can be made in its place.
|
||
|
|
||
|
Some of the tasks that are simplified include:
|
||
|
|
||
|
- Host in-process in an IDE to provide tooling services.
|
||
|
|
||
|
- Guarantee that no process execution takes place, or that the host
|
||
|
environment influences the tools without an explicit usage.
|
||
|
|
||
|
- Provide virtualized I/O for all kinds of file access, including redirecting
|
||
|
to in-memory buffers.
|
||
|
|
||
|
- Override memory allocation to constrain memory consumption.
|
||
|
|
||
|
At the moment, memory allocation is still not redirected, but I/O has been
|
||
|
cleaned up.
|
||
|
|
||
|
|
||
|
Error Handling
|
||
|
==============
|
||
|
|
||
|
Clang and LLVM already provide a number of error handling mechanisms:
|
||
|
returning a bool flag to indicate success or failure, returning a structured
|
||
|
object that flags the result along with other information, using STL system
|
||
|
errors for certain errors, or using errno for some C standard library calls.
|
||
|
|
||
|
There are two other kinds of error handling mechanisms introduced by HLSL
|
||
|
on LLVM.
|
||
|
|
||
|
* C++ Exceptions. The primary use case for these is to handle out-of-memory
|
||
|
exceptions. A second case is to handle cases where LLVM and Clang today
|
||
|
attempt to terminate the process even though they are in a recoverable state
|
||
|
(for example, when a '--help' switch is found in command-line options).
|
||
|
|
||
|
* HRESULT. The APIs are designed to be familiar to Windows developers and to
|
||
|
provide a simple transition for d3dcompiler_47.dll users. As such, error
|
||
|
codes are typically returned as HRESULT values that can be tested with the
|
||
|
SUCCEEDED/FAILED macros.
|
||
|
|
||
|
Some of the code that is written to interface with the rest of the system can
|
||
|
also make use of these error handling strategies. A number of macros to handle
|
||
|
them, or to convert one into the other, are available in
|
||
|
include/dxc/Support/Global.h.
|
||
|
|
||
|
Removing Unused Functionality
|
||
|
=============================
|
||
|
|
||
|
Removing unused functionality can help reduce the binary size, improve
|
||
|
performance, and speed up compilation time for the project. However, this has
|
||
|
to be traded off against changing the behavior of LLVM and Clang (and the
|
||
|
cost of understanding this change for developers who are familiar with those
|
||
|
projects), as well as the future maintenance for integrations.
|
||
|
|
||
|
The recommendations is to avoid removing small bits of functionality, and only
|
||
|
do so for significant subsystems that can be "sliced off" cleanly (for
|
||
|
example, the interpreter component or target support).
|
||
|
|
||
|
Component Design
|
||
|
================
|
||
|
|
||
|
The dxcompiler DLL is designed to export a number of components that can be
|
||
|
reused in different contexts. The API is exposed as a lightweight form of the
|
||
|
Microsoft Component Object Model (COM); a similar approach can be seen in the
|
||
|
design of the xmllite library.
|
||
|
|
||
|
The functionality of the library is encapsulated in discrete compmonents, each
|
||
|
of which is embodied in an object that implements one or more
|
||
|
interfaces. Interfaces are derived from IUnknown as in COM and are responsible
|
||
|
for interface discovery and lifetime management. Object construction is done
|
||
|
via the single exported API, DxcCreateObject, which acts much like
|
||
|
DllCreateObject would in a COM library.
|
||
|
|
||
|
Interfaces are mostly COM-compatible and have been designed to be easy to use
|
||
|
from other languages that can consume COM libraries, such as the .NET runtime
|
||
|
or C++ applications. Importantly, memory allocated in the library that should
|
||
|
be freed by the consumer is allocated using the COM allocation, through
|
||
|
CoTaskMemAlloc or the use of IMalloc via CoGetMalloc().
|
||
|
|
||
|
Note that this lightweight COM support implies that some features are missing:
|
||
|
|
||
|
- There is no support for marshalling across COM apartments.
|
||
|
|
||
|
- There is (at the moment at least) no management of library references based
|
||
|
on outstanding objects (a typical bug that would arise from this would be,
|
||
|
for example, unloading dxcompiler while outstanding objects exist, at which
|
||
|
point even releasing them would lead to an access violation).
|
||
|
|
||
|
Text and Buffer Management
|
||
|
==========================
|
||
|
|
||
|
Tradionally, the D3D compilers have used an ID3DBlob interface to encapsulate
|
||
|
a buffer. The HLSL compiler avoids pulling in DirectX headers and defines an
|
||
|
IDxcBlob interface that has the same layout and interface identifier (IID).
|
||
|
|
||
|
Buffers are often used to hold text, for example shader sources or compilation
|
||
|
logs. IDxcBlobEncoding inherits from IDxcBlob and has functionality to declare
|
||
|
the encoding for the buffer.
|
||
|
|
||
|
The design principle for using a character pointer or an IDxcBlobEncoding is
|
||
|
as follows: for internal dxcompiler text, UTF-8 and char* are used; for API
|
||
|
parameters that are "short" such as file names or command line parameters,
|
||
|
UTF-16 and wchar_t* are used; for longer text such as source files or error
|
||
|
logs, IDxcBlobEncoding is used.
|
||
|
|
||
|
The DLL provides a "library" component that provides utility functions to
|
||
|
create and transform blobs and strings.
|
||
|
|
||
|
Specification Database
|
||
|
======================
|
||
|
|
||
|
In the utils\hct directory, an hctdb.py file can be found that initialized a
|
||
|
number of Python object instances describing different aspects of the DXIL
|
||
|
specification. These act as a queryable, programmable repository of
|
||
|
information which feed into other tasks, such as generating documentation,
|
||
|
generating code or performing compatibility checks across versions.
|
||
|
|
||
|
We require that the database be kept up-to-date for the concepts embedded
|
||
|
there to drive a number of code-generation tasks, which can be found in other
|
||
|
.py files in that same directory.
|
||
|
|
||
|
HLSL Modules
|
||
|
============
|
||
|
|
||
|
llvm::Module is the type that represents a shader program. It includes
|
||
|
metadata nodes to provide details around the ABI, flags, etc. However,
|
||
|
manipulation of all this information in terms of metadata is not very
|
||
|
efficient or convenient.
|
||
|
|
||
|
As part of the work with HLSL, we introduce two modules that are attached
|
||
|
in-memory to an llvm::Module: a high-level HLModule, and a low-level
|
||
|
DxilModule. The high-level module is used in the early passes to deal with
|
||
|
HLSL-as-a-language concepts, such as intrinsics, matrices and vectors; the
|
||
|
low-level module is used to deal with concepts as they exist in the DXIL
|
||
|
specification. Only one of these additional modules ever exists at one point;
|
||
|
the DxilGenerationPass that does the translation destroys the high-level
|
||
|
representation and creates a low-level one as part of its work.
|
||
|
|
||
|
To preserve many of the benefits of LLVM's modular pipeline, it is useful to
|
||
|
serialize and deserialize shaders at different stages of processing, and so
|
||
|
both HLModule and DxilModule provide support for these. The expectation for a
|
||
|
wholesale compilation from source, however, is that this information lives
|
||
|
only in memory until it's ready to be serialized out in final DIXL form. As
|
||
|
such, various passes along the way may need to do update to these modules to
|
||
|
maintain consistency (for example, if global DCE removes a variable, the
|
||
|
corresponding resource mapping that reflects the shader ABI should be cleaned
|
||
|
up as well).
|
||
|
|
||
|
|