ALSA: compress offload API documentation
The patch adds the documentation file explaining the API.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Vinod Koul <vinod.koul@linux.intel.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

Parent: 68d5339322
Commit: 57bd9b8ddd

@@ -0,0 +1,188 @@
compress_offload.txt
=====================
Pierre-Louis.Bossart <pierre-louis.bossart@linux.intel.com>
Vinod Koul <vinod.koul@linux.intel.com>

Overview

Since its early days, the ALSA API was defined with PCM support or
constant-bitrate payloads such as IEC61937 in mind. Arguments and
returned values in frames are the norm, making it a challenge to
extend the existing API to compressed data streams.

In recent years, audio digital signal processors (DSP) were integrated
in system-on-chip designs, and DSPs are also integrated in audio
codecs. Processing compressed data on such DSPs results in a dramatic
reduction of power consumption compared to host-based
processing. Support for such hardware has not been very good in Linux,
mostly because of a lack of a generic API available in the mainline
kernel.

Rather than requiring a compatibility break with an API change of the
ALSA PCM interface, a new 'Compressed Data' API is introduced to
provide a control and data-streaming interface for audio DSPs.

The design of this API was inspired by the 2-year experience with the
Intel Moorestown SOC, with many corrections required to upstream the
API in the mainline kernel instead of the staging tree and make it
usable by others.

Requirements

The main requirements are:

- Separation between byte counts and time. Compressed formats may have
  a header per file, per frame, or no header at all. The payload size
  may vary from frame to frame. As a result, it is not possible to
  estimate reliably the duration of audio buffers when handling
  compressed data. Dedicated mechanisms are required to allow for
  reliable audio-video synchronization, which requires precise
  reporting of the number of samples rendered at any given time.

- Handling of multiple formats. PCM data only requires a specification
  of the sampling rate, number of channels and bits per sample. In
  contrast, compressed data comes in a variety of formats. Audio DSPs
  may also provide support for a limited number of audio encoders and
  decoders embedded in firmware, or may support more choices through
  dynamic download of libraries.

- Focus on main formats. This API provides support for the most
  popular formats used for audio and video capture and playback. It is
  likely that as audio compression technology advances, new formats
  will be added.

- Handling of multiple configurations. Even for a given format like
  AAC, some implementations may support AAC multichannel but HE-AAC
  stereo. Likewise WMA10 level M3 may require too much memory and CPU
  cycles. The new API needs to provide a generic way of listing these
  formats.

- Rendering/Grabbing only. This API does not provide any means of
  hardware acceleration, where PCM samples are provided back to
  user-space for additional processing. This API focuses instead on
  streaming compressed data to a DSP, with the assumption that the
  decoded samples are routed to a physical output or logical back-end.

- Complexity hiding. Existing user-space multimedia frameworks all
  have existing enums/structures for each compressed format. This new
  API assumes the existence of a platform-specific compatibility layer
  to expose, translate and make use of the capabilities of the audio
  DSP, e.g. Android HAL or PulseAudio sinks. By construction, regular
  applications are not supposed to make use of this API.


Design

The new API shares a number of concepts with the PCM API for flow
control. Start, pause, resume, drain and stop commands have the same
semantics no matter what the content is.

The concept of a memory ring buffer divided into a set of fragments is
borrowed from the ALSA PCM API. However, only sizes in bytes can be
specified.
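
As an illustration of a ring buffer that is sized and tracked purely in
bytes, here is a minimal user-space sketch. The struct byte_ring layout
and the ring_* helpers are hypothetical names invented for this example;
they are not the kernel's data structures. The sketch assumes a non-zero
fragment size and fragment count, and that 'consumed' is advanced by
whatever models the DSP.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a byte-based ring buffer (not the kernel layout). */
struct byte_ring {
	uint8_t *data;		/* storage of fragment_size * fragments bytes */
	size_t fragment_size;	/* size of one fragment, in bytes */
	size_t fragments;	/* number of fragments in the ring */
	uint64_t written;	/* cumulative bytes committed by the application */
	uint64_t consumed;	/* cumulative bytes consumed by the DSP */
};

/* Total ring size; bytes are the only unit, no frames or durations. */
static size_t ring_size(const struct byte_ring *r)
{
	return r->fragment_size * r->fragments;
}

/* Bytes that can still be committed before the ring is full. */
static size_t ring_free(const struct byte_ring *r)
{
	return ring_size(r) - (size_t)(r->written - r->consumed);
}

/*
 * Copy up to len bytes into the ring, wrapping at the end of the storage.
 * Returns the number of bytes accepted. The write position only moves
 * forward, so committed data is never invalidated.
 */
static size_t ring_write(struct byte_ring *r, const uint8_t *src, size_t len)
{
	size_t size = ring_size(r);
	size_t offset = (size_t)(r->written % size);
	size_t first;

	if (len > ring_free(r))
		len = ring_free(r);

	/* First chunk runs to the end of the storage, the rest wraps around. */
	first = len < size - offset ? len : size - offset;
	memcpy(r->data + offset, src, first);
	memcpy(r->data, src + first, len - first);

	r->written += len;
	return len;
}
```

Keeping cumulative byte counters instead of wrapped offsets makes the
free-space computation a single subtraction; this is a design choice of
the sketch, not something the text above mandates.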

Seeks/trick modes are assumed to be handled by the host.

The notion of rewinds/forwards is not supported. Data committed to the
ring buffer cannot be invalidated, except when dropping all buffers.

The Compressed Data API does not make any assumptions on how the data
is transmitted to the audio DSP. DMA transfers from main memory to an
embedded audio cluster or to a SPI interface for external DSPs are
possible. As in the ALSA PCM case, a core set of routines is exposed;
each driver implementer will have to write support for a set of
mandatory routines and possibly make use of optional ones.

The main additions are:

- get_caps
This routine returns the list of audio formats supported. Querying the
codecs on a capture stream will return encoders; decoders will be
listed for playback streams.

- get_codec_caps
For each codec, this routine returns a list of
capabilities. The intent is to make sure all the capabilities
correspond to valid settings, and to minimize the risks of
configuration failures. For example, for a complex codec such as AAC,
the number of channels supported may depend on a specific profile. If
the capabilities were exposed with a single descriptor, it may happen
that a specific combination of profiles/channels/formats may not be
supported. Likewise, since embedded DSPs have limited memory and CPU
cycles, it is likely that some implementations make the list of
capabilities dynamic and dependent on existing workloads. In addition
to codec settings, this routine returns the minimum buffer size handled
by the implementation. This information can be a function of the DMA
buffer sizes, the number of bytes required to synchronize, etc., and can
be used by userspace to define how much needs to be written in the ring
buffer before playback can start.

- set_params
This routine sets the configuration chosen for a specific codec. The
most important field in the parameters is the codec type; in most
cases decoders will ignore other fields, while encoders will strictly
comply with the settings.

- get_params
This routine returns the actual settings used by the DSP. Changes to
the settings should remain the exception.

- get_timestamp
The timestamp becomes a multiple-field structure. It lists the number
of bytes transferred, the number of samples processed and the number
of samples rendered/grabbed. All these values can be used to determine
the average bitrate, figure out if the ring buffer needs to be
refilled, or estimate the delay due to decoding/encoding/IO on the DSP.
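
The way the timestamp fields combine can be sketched as follows. The
struct offload_tstamp below is a hypothetical stand-in with the three
counters named in the text plus an assumed sampling_rate field; it is
not the structure exposed by the kernel headers, and the helper names
are made up for this example.

```c
#include <stdint.h>

/* Hypothetical timestamp model; field names are assumptions. */
struct offload_tstamp {
	uint64_t bytes_transferred;	/* compressed bytes handed to the DSP */
	uint64_t samples_processed;	/* samples decoded/encoded so far */
	uint64_t samples_rendered;	/* samples actually rendered/grabbed */
	uint32_t sampling_rate;		/* in Hz */
};

/* Playback position in milliseconds, usable for audio-video sync. */
static uint64_t rendered_ms(const struct offload_tstamp *t)
{
	if (!t->sampling_rate)
		return 0;
	return t->samples_rendered * 1000 / t->sampling_rate;
}

/*
 * Approximate average bitrate in bits per second. This is only an
 * estimate: bytes already transferred but not yet decoded are counted.
 */
static uint64_t average_bitrate(const struct offload_tstamp *t)
{
	if (!t->samples_processed)
		return 0;
	return t->bytes_transferred * 8 * t->sampling_rate /
	       t->samples_processed;
}

/* Samples decoded but not yet rendered, i.e. the delay inside the DSP. */
static uint64_t dsp_delay_samples(const struct offload_tstamp *t)
{
	return t->samples_processed - t->samples_rendered;
}
```

With 160000 bytes transferred, 441000 samples processed and 436590
samples rendered at 44100 Hz, these helpers give a position of 9900 ms,
an average bitrate of 128000 bit/s and a DSP delay of 4410 samples.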

Note that the list of codecs/profiles/modes was derived from the
OpenMAX AL specification instead of reinventing the wheel.
Modifications include:
- Addition of FLAC and IEC formats
- Merge of encoder/decoder capabilities
- Profiles/modes listed as bitmasks to make descriptors more compact
- Addition of set_params for decoders (missing in OpenMAX AL)
- Addition of AMR/AMR-WB encoding modes (missing in OpenMAX AL)
- Addition of format information for WMA
- Addition of encoding options when required (derived from OpenMAX IL)
- Addition of rateControlSupported (missing in OpenMAX AL)
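
To illustrate why profiles are listed as bitmasks and why a codec may
need several descriptors, here is a small sketch. All names (the
EX_PROFILE_* bits, struct ex_codec_desc, ex_config_supported) are
hypothetical and chosen for this example only; they do not match the
identifiers in the kernel headers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical profile bits; one bit per profile keeps descriptors compact. */
#define EX_PROFILE_AAC_LC	(1u << 0)
#define EX_PROFILE_HE_AAC	(1u << 1)

/* Hypothetical capability descriptor: one valid combination of settings. */
struct ex_codec_desc {
	uint32_t max_channels;	/* maximum channel count for these profiles */
	uint32_t profiles;	/* bitmask of EX_PROFILE_* values */
};

/*
 * Return true if at least one descriptor covers the requested profile
 * with the requested channel count. 'profile' is a single EX_PROFILE_* bit.
 */
static bool ex_config_supported(const struct ex_codec_desc *descs, size_t n,
				uint32_t profile, uint32_t channels)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if ((descs[i].profiles & profile) &&
		    channels <= descs[i].max_channels)
			return true;
	}
	return false;
}
```

Two descriptors, one for AAC-LC with up to 6 channels and one for HE-AAC
with up to 2 channels, describe the "AAC multichannel but HE-AAC stereo"
case from the requirements without advertising the unsupported
combination of HE-AAC with 6 channels.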

Not supported:

- Support for VoIP/circuit-switched calls is not the target of this
  API. Support for dynamic bit-rate changes would require a tight
  coupling between the DSP and the host stack, limiting power savings.

- Packet-loss concealment is not supported. This would require an
  additional interface to let the decoder synthesize data when frames
  are lost during transmission. This may be added in the future.

- Volume control/routing is not handled by this API. Devices exposing a
  compressed data interface will be considered as regular ALSA devices;
  volume changes and routing information will be provided with regular
  ALSA kcontrols.

- Embedded audio effects. Such effects should be enabled in the same
  manner, no matter if the input was PCM or compressed.

- Multichannel IEC encoding. Unclear if this is required.

- Encoding/decoding acceleration is not supported as mentioned
  above. It is possible to route the output of a decoder to a capture
  stream, or even implement transcoding capabilities. This routing
  would be enabled with ALSA kcontrols.

- Audio policy/resource management. This API does not provide any
  hooks to query the utilization of the audio DSP, nor any preemption
  mechanisms.

- No notion of underrun/overrun. Since the bytes written are compressed
  in nature and data written/read doesn't translate directly to
  rendered output in time, this API does not deal with underrun/overrun;
  these may be dealt with in a user-space library.

Credits:
- Mark Brown and Liam Girdwood for discussions on the need for this API
- Harsha Priya for her work on intel_sst compressed API
- Rakesh Ughreja for valuable feedback
- Sing Nallasellan, Sikkandar Madar and Prasanna Samaga for
  demonstrating and quantifying the benefits of audio offload on a
  real platform.