===========================
Linux for S/390 and zSeries
===========================

Common Device Support (CDS)
Device Driver I/O Support Routines

Authors:
  - Ingo Adlung
  - Cornelia Huck

Copyright, IBM Corp. 1999-2002

Introduction
============

This document describes the common device support routines for Linux/390.
Unlike other hardware architectures, ESA/390 defines a unified
I/O access method. This relieves the device drivers of having to
deal with different bus types, polling versus interrupt
processing, shared versus non-shared interrupt processing, DMA versus port
I/O (PIO), and other hardware features. However, this implies that
either every single device driver needs to implement the hardware I/O
attachment functionality itself, or the operating system provides a
unified method to access the hardware, providing all the functionality that
every single device driver would otherwise have to provide itself.

This document does not intend to explain the ESA/390 hardware architecture in
every detail. This information can be obtained from the ESA/390 Principles of
Operation manual (IBM Form. No. SA22-7201).

In order to build common device support for ESA/390 I/O interfaces, a
functional layer was introduced that provides generic I/O access methods to
the hardware.

The common device support layer comprises the I/O support routines defined
below. Some of them implement common Linux device driver interfaces, while
some of them are ESA/390 platform specific.

Note:
  In order to write a driver for S/390, you also need to look into the interface
  described in Documentation/s390/driver-model.rst.

Note for porting drivers from 2.4:

The major changes are:

* The functions use a ccw_device instead of an irq (subchannel).
* All drivers must define a ccw_driver (see driver-model.rst) and the associated
  functions.
* request_irq() and free_irq() are no longer done by the driver.
* The oper_handler is (kind of) replaced by the probe() and set_online() functions
  of the ccw_driver.
* The not_oper_handler is (kind of) replaced by the remove() and set_offline()
  functions of the ccw_driver.
* The channel device layer is gone.
* The interrupt handlers must be adapted to use a ccw_device as argument.
  Moreover, they are passed an irb instead of a devstat.
* Before initiating an I/O, the options must be set via ccw_device_set_options().
* Instead of calling read_dev_chars()/read_conf_data(), the driver issues
  the channel program and handles the interrupt itself.

ccw_device_get_ciw()
   get commands from extended sense data.

ccw_device_start(), ccw_device_start_timeout(), ccw_device_start_key(), ccw_device_start_key_timeout()
   initiate an I/O request.

ccw_device_resume()
   resume channel program execution.

ccw_device_halt()
   terminate the current I/O request processed on the device.

do_IRQ()
   generic interrupt routine. This function is called by the interrupt entry
   routine whenever an I/O interrupt is presented to the system. The do_IRQ()
   routine determines the interrupt status and calls the device specific
   interrupt handler according to the rules (flags) defined during I/O request
   initiation with ccw_device_start().

The next chapters describe the functions other than do_IRQ() in more detail.
The do_IRQ() interface is not described, as it is called from the Linux/390
first level interrupt handler only and does not comprise a device driver
callable interface. Instead, the functional description of ccw_device_start()
also describes the input to the device specific interrupt handler.

Note:
  All explanations apply also to the 64 bit architecture s390x.


Common Device Support (CDS) for Linux/390 Device Drivers
========================================================

General Information
-------------------

The following chapters describe the I/O related interface routines the
Linux/390 common device support (CDS) provides to allow for device specific
driver implementations on the IBM ESA/390 hardware platform. Those interfaces
are intended to provide the functionality required by every device driver
implementation to drive a specific hardware device on the ESA/390
platform. Some of the interface routines are specific to Linux/390 and some
of them can be found on other Linux platform implementations too.
Miscellaneous function prototypes, data declarations, and macro definitions
can be found in the architecture specific C header file
linux/arch/s390/include/asm/irq.h.

Overview of CDS interface concepts
----------------------------------

Unlike other hardware platforms, the ESA/390 architecture doesn't define
interrupt lines managed by a specific interrupt controller and bus systems
that may or may not allow for shared interrupts, DMA processing, etc. Instead,
the ESA/390 architecture implements a so-called channel subsystem that
provides a unified view of the devices physically attached to the system.
Though the ESA/390 hardware platform knows about a huge variety of different
peripheral attachments like disk devices (aka DASDs), tapes, communication
controllers, etc., they can all be accessed by a well defined access method and
they present I/O completion in a unified way: I/O interruptions. Every
single device is uniquely identified to the system by a so-called subchannel,
where the ESA/390 architecture allows for 64k devices to be attached.

Linux, however, was first built on the Intel PC architecture, with its two
cascaded 8259 programmable interrupt controllers (PICs), which allow for a
maximum of 15 different interrupt lines. All devices attached to such a system
share those 15 interrupt levels. Devices attached to the ISA bus system must
not share interrupt levels (aka IRQs), as the ISA bus is based on edge triggered
interrupts. MCA, EISA, PCI and other bus systems are based on level triggered
interrupts, and therefore allow for shared IRQs. However, if multiple devices
present their hardware status by the same (shared) IRQ, the operating system
has to call every single device driver registered on this IRQ in order to
determine the device driver owning the device that raised the interrupt.

Up to kernel 2.4, Linux/390 used to provide interfaces via the IRQ (subchannel).
For internal use of the common I/O layer, these are still there. However,
device drivers should use the new calling interface via the ccw_device only.

During its startup the Linux/390 system checks for peripheral devices. Each
of those devices is uniquely defined by a so-called subchannel by the ESA/390
channel subsystem. While the subchannel numbers are system generated, each
subchannel also takes a user defined attribute, the so-called device number.
Both subchannel number and device number cannot exceed 65535. During sysfs
initialisation, the information about control unit type and device types that
imply specific I/O commands (channel command words - CCWs) in order to operate
the device is gathered. Device drivers can retrieve this set of hardware
information during their initialization step to recognize the devices they
support using the information saved in the struct ccw_device given to them.
This method implies that Linux/390 doesn't need to probe for free (not
armed) interrupt request lines (IRQs) to drive its devices with. Where
applicable, the device drivers can issue the READ DEVICE CHARACTERISTICS
ccw to retrieve device characteristics in their online routine.

In order to allow for easy I/O initiation the CDS layer provides a
ccw_device_start() interface that takes a device specific channel program (one
or more CCWs) as input, sets up the required architecture specific control
blocks and initiates an I/O request on behalf of the device driver. The
ccw_device_start() routine allows the caller to specify whether it expects the
CDS layer to notify the device driver for every interrupt it observes, or with
final status only. See ccw_device_start() for more details. A device driver
must never issue ESA/390 I/O commands itself, but must use the Linux/390 CDS
interfaces instead.

For long running I/O requests to be canceled, the CDS layer provides the
ccw_device_halt() function. Some devices require a HALT SUBCHANNEL (HSCH)
command to be issued initially without having pending I/O requests. This
function is also covered by ccw_device_halt().


ccw_device_get_ciw() - get command information word

This call enables a device driver to get information about supported commands
from the extended SenseID data.

::

  struct ciw *
  ccw_device_get_ciw(struct ccw_device *cdev, __u32 cmd);

====  ========================================================
cdev  The ccw_device for which the command is to be retrieved.
cmd   The command type to be retrieved.
====  ========================================================

ccw_device_get_ciw() returns:

=====  ================================================================
NULL   No extended data available, invalid device or command not found.
!NULL  The command requested.
=====  ================================================================

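The return contract above can be sketched in user space. This is only a
minimal illustration: ``struct mock_ciw``, ``struct mock_senseid`` and
``mock_get_ciw()`` are hypothetical stand-ins invented for this example, not
the kernel's data structures or the real ccw_device_get_ciw().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_MAX_CIWS 4

/* Hypothetical, simplified stand-in for the kernel's struct ciw. */
struct mock_ciw {
	uint8_t  ct;    /* command type */
	uint8_t  cmd;   /* command code to use for that type */
	uint16_t count; /* byte count for the command */
};

/* Hypothetical stand-in for the extended SenseID data kept per device. */
struct mock_senseid {
	int             has_ext_data; /* 0: no extended data available */
	size_t          nr_ciws;      /* number of valid entries in ciw[] */
	struct mock_ciw ciw[MOCK_MAX_CIWS];
};

/*
 * Follows the documented contract: NULL for an invalid device, missing
 * extended data or an unknown command type, otherwise the matching entry.
 */
static const struct mock_ciw *
mock_get_ciw(const struct mock_senseid *id, uint8_t ct)
{
	size_t i;

	if (!id || !id->has_ext_data)
		return NULL;
	assert(id->nr_ciws <= MOCK_MAX_CIWS);
	for (i = 0; i < id->nr_ciws; i++)
		if (id->ciw[i].ct == ct)
			return &id->ciw[i];
	return NULL;
}
```

A caller has to check the result for NULL before reading the command code or
the byte count from it.
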
ccw_device_start() - Initiate I/O Request

The ccw_device_start() routine is the I/O request front-end processor. All
device driver I/O requests must be issued using this routine. A device driver
must not issue ESA/390 I/O commands itself. Instead the ccw_device_start()
routine provides all interfaces required to drive arbitrary devices.

This description also covers the status information passed to the device
driver's interrupt handler as this is related to the rules (flags) defined
with the associated I/O request when calling ccw_device_start().

::

  int ccw_device_start(struct ccw_device *cdev,
                       struct ccw1 *cpa,
                       unsigned long intparm,
                       __u8 lpm,
                       unsigned long flags);
  int ccw_device_start_timeout(struct ccw_device *cdev,
                               struct ccw1 *cpa,
                               unsigned long intparm,
                               __u8 lpm,
                               unsigned long flags,
                               int expires);
  int ccw_device_start_key(struct ccw_device *cdev,
                           struct ccw1 *cpa,
                           unsigned long intparm,
                           __u8 lpm,
                           __u8 key,
                           unsigned long flags);
  int ccw_device_start_key_timeout(struct ccw_device *cdev,
                                   struct ccw1 *cpa,
                                   unsigned long intparm,
                                   __u8 lpm,
                                   __u8 key,
                                   unsigned long flags,
                                   int expires);

=============  =============================================================
cdev           ccw_device the I/O is destined for
cpa            logical start address of channel program
intparm        user specific interrupt information; will be presented
               back to the device driver's interrupt handler. Allows a
               device driver to associate the interrupt with a
               particular I/O request.
lpm            defines the channel path to be used for a specific I/O
               request. A value of 0 will make cio use the opm.
key            the storage key to use for the I/O (useful for operating on a
               storage with a storage key != default key)
flags          defines the action to be performed for I/O processing
expires        timeout value in jiffies. The common I/O layer will terminate
               the running program after this and call the interrupt handler
               with ERR_PTR(-ETIMEDOUT) as irb.
=============  =============================================================

Possible flag values are:

=========================  =============================================
DOIO_ALLOW_SUSPEND         channel program may become suspended
DOIO_DENY_PREFETCH         don't allow for CCW prefetch; usually
                           this implies the channel program might
                           become modified
DOIO_SUPPRESS_INTER        don't call the handler on intermediate status
=========================  =============================================

The cpa parameter points to the first format 1 CCW of a channel program::

  struct ccw1 {
        __u8  cmd_code; /* command code */
        __u8  flags;    /* flags, like IDA addressing, etc. */
        __u16 count;    /* byte count */
        __u32 cda;      /* data address */
  } __attribute__ ((packed,aligned(8)));

with the following CCW flags values defined:

===================  =========================
CCW_FLAG_DC          data chaining
CCW_FLAG_CC          command chaining
CCW_FLAG_SLI         suppress incorrect length
CCW_FLAG_SKIP        skip
CCW_FLAG_PCI         PCI
CCW_FLAG_IDA         indirect addressing
CCW_FLAG_SUSPEND     suspend
===================  =========================


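To illustrate how the CCW layout and the chaining flag fit together, here is a
minimal user-space sketch that fills in a two-CCW channel program. The struct
mirrors the layout shown above with fixed-width types. The flag values are
assumed to match arch/s390/include/asm/cio.h, and the command codes
``EX_CMD_WRITE`` and ``EX_CMD_READ`` as well as the plain integer data
addresses are placeholders chosen for this example only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* User-space mirror of the format 1 CCW layout shown above. */
struct ccw1 {
	uint8_t  cmd_code; /* command code */
	uint8_t  flags;    /* flags, like IDA addressing, etc. */
	uint16_t count;    /* byte count */
	uint32_t cda;      /* data address */
} __attribute__ ((packed, aligned(8)));

/* Assumed to match the values in arch/s390/include/asm/cio.h. */
#define CCW_FLAG_CC  0x40 /* command chaining */
#define CCW_FLAG_SLI 0x20 /* suppress incorrect length */

/* Placeholder command codes, for illustration only. */
#define EX_CMD_WRITE 0x01
#define EX_CMD_READ  0x02

/*
 * Build "write, then read" as one channel program. Command chaining is set
 * on every CCW except the last one, which ends the program.
 */
static void build_write_then_read(struct ccw1 cp[2],
				  uint32_t wr_addr, uint16_t wr_len,
				  uint32_t rd_addr, uint16_t rd_len)
{
	assert(wr_len > 0 && rd_len > 0);
	memset(cp, 0, 2 * sizeof(*cp));

	cp[0].cmd_code = EX_CMD_WRITE;
	cp[0].flags    = CCW_FLAG_CC;
	cp[0].count    = wr_len;
	cp[0].cda      = wr_addr;

	cp[1].cmd_code = EX_CMD_READ;
	cp[1].flags    = CCW_FLAG_SLI; /* last CCW: no chaining flag */
	cp[1].count    = rd_len;
	cp[1].cda      = rd_addr;
}
```

In a real driver the address of the first CCW would be passed as cpa to
ccw_device_start(), and the data addresses would have to refer to memory the
channel subsystem can actually address.
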
Via ccw_device_set_options(), the device driver may specify the following
options for the device:

=========================  ======================================
DOIO_EARLY_NOTIFICATION    allow for early interrupt notification
DOIO_REPORT_ALL            report all interrupt conditions
=========================  ======================================


The ccw_device_start() function returns:

========  ======================================================================
0         successful completion or request successfully initiated
-EBUSY    The device is currently processing a previous I/O request, or there is
          a status pending at the device.
-ENODEV   cdev is invalid, the device is not operational or the ccw_device is
          not online.
========  ======================================================================

When the I/O request completes, the CDS first level interrupt handler will
accumulate the status in a struct irb and then call the device interrupt handler.
The intparm field will contain the value the device driver has associated with a
particular I/O request. If a pending device status was recognized,
intparm will be set to 0 (zero). This may happen during I/O initiation or delayed
by an alert status notification. In any case this status is not related to the
current (last) I/O request. In case of a delayed status notification no special
interrupt will be presented to indicate I/O completion as the I/O request was
never started, even though ccw_device_start() returned with successful completion.

The irb may contain an error value, and the device driver should check for this
first:

==========  =================================================================
-ETIMEDOUT  the common I/O layer terminated the request after the specified
            timeout value
-EIO        the common I/O layer terminated the request due to an error state
==========  =================================================================

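The "check for an error value first" rule can be sketched as follows. The
``ERR_PTR()``, ``PTR_ERR()`` and ``IS_ERR()`` helpers below are simplified
user-space re-implementations written for this example, and
``classify_irb()`` with its ``enum irb_verdict`` is a hypothetical helper, not
part of the CDS interface.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified user-space stand-ins for the kernel's error pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct irb; /* contents are not needed for the error check */

/* Hypothetical result type for this sketch. */
enum irb_verdict { IRB_VALID, IRB_TIMED_OUT, IRB_IO_ERROR, IRB_OTHER_ERROR };

/* Decide whether irb is a real status block or an encoded error value. */
static enum irb_verdict classify_irb(const struct irb *irb)
{
	assert(irb != NULL);
	if (!IS_ERR(irb))
		return IRB_VALID; /* safe to look at scsw, esw, ... */

	switch (PTR_ERR(irb)) {
	case -ETIMEDOUT:
		return IRB_TIMED_OUT;
	case -EIO:
		return IRB_IO_ERROR;
	default:
		return IRB_OTHER_ERROR;
	}
}
```

Only when the verdict is ``IRB_VALID`` may a handler go on to dereference the
irb and inspect the status fields described below.
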
If the concurrent sense flag in the extended status word (esw) in the irb is
set, the field erw.scnt in the esw describes the number of device specific
sense bytes available in the extended control word irb->ecw[]. No device
sensing by the device driver itself is required.

The device interrupt handler can use the following definitions to investigate
the primary unit check source coded in sense byte 0:

=======================  ====
SNS0_CMD_REJECT          0x80
SNS0_INTERVENTION_REQ    0x40
SNS0_BUS_OUT_CHECK       0x20
SNS0_EQUIPMENT_CHECK     0x10
SNS0_DATA_CHECK          0x08
SNS0_OVERRUN             0x04
SNS0_INCOMPL_DOMAIN      0x01
=======================  ====

Depending on the device status, several of those values may be set together.
Please refer to the device specific documentation for details.

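Because several bits may be set at once, a handler typically tests each bit
separately. The following sketch decodes sense byte 0 into a readable list
using the values from the table above. ``sns0_decode()`` and the lower-case
names it prints are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Values taken from the table above. */
#define SNS0_CMD_REJECT       0x80
#define SNS0_INTERVENTION_REQ 0x40
#define SNS0_BUS_OUT_CHECK    0x20
#define SNS0_EQUIPMENT_CHECK  0x10
#define SNS0_DATA_CHECK       0x08
#define SNS0_OVERRUN          0x04
#define SNS0_INCOMPL_DOMAIN   0x01

static const struct {
	uint8_t     bit;
	const char *name;
} sns0_names[] = {
	{ SNS0_CMD_REJECT,       "cmd_reject" },
	{ SNS0_INTERVENTION_REQ, "intervention_req" },
	{ SNS0_BUS_OUT_CHECK,    "bus_out_check" },
	{ SNS0_EQUIPMENT_CHECK,  "equipment_check" },
	{ SNS0_DATA_CHECK,       "data_check" },
	{ SNS0_OVERRUN,          "overrun" },
	{ SNS0_INCOMPL_DOMAIN,   "incompl_domain" },
};

/*
 * Write a comma-separated list of the conditions set in sns0 to buf and
 * return how many were written. Stops early if buf is too small.
 */
static int sns0_decode(uint8_t sns0, char *buf, size_t len)
{
	size_t used = 0, i;
	int n = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(sns0_names) / sizeof(sns0_names[0]); i++) {
		int w;

		if (!(sns0 & sns0_names[i].bit))
			continue;
		w = snprintf(buf + used, len - used, "%s%s",
			     n ? "," : "", sns0_names[i].name);
		if (w < 0 || (size_t)w >= len - used) {
			buf[used] = '\0'; /* drop the truncated entry */
			break;
		}
		used += (size_t)w;
		n++;
	}
	return n;
}
```
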
The irb->scsw.cstat field provides the (accumulated) subchannel status:

=========================  ============================
SCHN_STAT_PCI              program controlled interrupt
SCHN_STAT_INCORR_LEN       incorrect length
SCHN_STAT_PROG_CHECK       program check
SCHN_STAT_PROT_CHECK       protection check
SCHN_STAT_CHN_DATA_CHK     channel data check
SCHN_STAT_CHN_CTRL_CHK     channel control check
SCHN_STAT_INTF_CTRL_CHK    interface control check
SCHN_STAT_CHAIN_CHECK      chaining check
=========================  ============================

The irb->scsw.dstat field provides the (accumulated) device status:

=====================  =================
DEV_STAT_ATTENTION     attention
DEV_STAT_STAT_MOD      status modifier
DEV_STAT_CU_END        control unit end
DEV_STAT_BUSY          busy
DEV_STAT_CHN_END       channel end
DEV_STAT_DEV_END       device end
DEV_STAT_UNIT_CHECK    unit check
DEV_STAT_UNIT_EXCEP    unit exception
=====================  =================

Please see the ESA/390 Principles of Operation manual for details on the
individual flag meanings.

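As a rough illustration of how an interrupt handler might combine the two
status fields, the sketch below classifies a status pair. The numeric values
are assumed to match arch/s390/include/asm/cio.h. ``struct mock_scsw``,
``enum io_state`` and the policy of treating every subchannel status bit other
than PCI as an error are assumptions made for this example; a real driver has
to follow its device's documentation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to match the values in arch/s390/include/asm/cio.h. */
#define SCHN_STAT_PCI        0x80
#define SCHN_STAT_PROG_CHECK 0x20
#define DEV_STAT_CHN_END     0x08
#define DEV_STAT_DEV_END     0x04
#define DEV_STAT_UNIT_CHECK  0x02

#define DEV_STAT_FINAL (DEV_STAT_CHN_END | DEV_STAT_DEV_END)

/* Hypothetical cut-down view of the status fields discussed above. */
struct mock_scsw {
	uint8_t dstat; /* accumulated device status */
	uint8_t cstat; /* accumulated subchannel status */
};

enum io_state { IO_FINAL_OK, IO_INTERMEDIATE, IO_UNIT_CHECK, IO_CHANNEL_ERROR };

static enum io_state classify_status(const struct mock_scsw *scsw)
{
	assert(scsw != NULL);

	/* Example policy: anything but a PCI in cstat counts as an error. */
	if (scsw->cstat & ~SCHN_STAT_PCI)
		return IO_CHANNEL_ERROR;
	/* Unit check: the sense data has to be examined next. */
	if (scsw->dstat & DEV_STAT_UNIT_CHECK)
		return IO_UNIT_CHECK;
	/* Channel end and device end together mean final status. */
	if ((scsw->dstat & DEV_STAT_FINAL) == DEV_STAT_FINAL)
		return IO_FINAL_OK;
	return IO_INTERMEDIATE;
}
```
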
Usage Notes:

ccw_device_start() must be called with interrupts disabled and with the ccw
device lock held.

The device driver is allowed to issue the next ccw_device_start() call from
within its interrupt handler already. It is not required to schedule a
bottom-half, unless a non-deterministically long running error recovery procedure
or similar needs to be scheduled. During I/O processing the Linux/390 generic
I/O device driver support has already obtained the IRQ lock, i.e. the handler
must not try to obtain it again when calling ccw_device_start() or we end in a
deadlock situation!

If a device driver relies on an I/O request being completed before starting the
next, it can reduce I/O processing overhead by chaining a NoOp I/O command
CCW_CMD_NOOP to the end of the submitted CCW chain. This will force Channel-End
and Device-End status to be presented together, with a single interrupt.
However, this should be used with care as it implies the channel will remain
busy, not being able to process I/O requests for other devices on the same
channel. Therefore e.g. read commands should never use this technique, as the
result will be presented by a single interrupt anyway.

In order to minimize I/O overhead, a device driver should use the
DOIO_REPORT_ALL option only if the device can report intermediate interrupt
information prior to device-end that the device driver urgently relies on. In
this case all I/O interruptions are presented to the device driver until final
status is recognized.

If a device is able to recover from asynchronously presented I/O errors, it can
perform overlapping I/O using the DOIO_EARLY_NOTIFICATION flag. While some
devices always report channel-end and device-end together, with a single
interrupt, others present primary status (channel-end) when the channel is
ready for the next I/O request and secondary status (device-end) when the data
transmission has been completed at the device.

The above flag allows drivers to exploit this feature, e.g. for communication
devices that can handle lost data on the network to allow for enhanced I/O
processing.

Unless the channel subsystem at any time presents a secondary status interrupt,
exploiting this feature will cause only primary status interrupts to be
presented to the device driver while overlapping I/O is performed. When a
secondary status without error (alert status) is presented, this indicates
successful completion for all overlapping ccw_device_start() requests that have
been issued since the last secondary (final) status.

Channel programs that intend to set the suspend flag on a channel command word
(CCW) must start the I/O operation with the DOIO_ALLOW_SUSPEND option or the
suspend flag will cause a channel program check. At the time the channel program
becomes suspended an intermediate interrupt will be generated by the channel
subsystem.

ccw_device_resume() - Resume Channel Program Execution

If a device driver chooses to suspend the current channel program execution by
setting the CCW suspend flag on a particular CCW, the channel program execution
is suspended. In order to resume channel program execution the CIO layer
provides the ccw_device_resume() routine.

::

  int ccw_device_resume(struct ccw_device *cdev);

====  ================================================
cdev  ccw_device the resume operation is requested for
====  ================================================

The ccw_device_resume() function returns:

=========  ==============================================
0          suspended channel program is resumed
-EBUSY     status pending
-ENODEV    cdev invalid or not-operational subchannel
-EINVAL    resume function not applicable
-ENOTCONN  there is no I/O request pending for completion
=========  ==============================================

Usage Notes:

Please have a look at the ccw_device_start() usage notes for more details on
suspended channel programs.

ccw_device_halt() - Halt I/O Request Processing

Sometimes a device driver might need to stop the processing of
a long-running channel program, or the device might require a
halt subchannel (HSCH) I/O command to be issued initially. For those purposes
the ccw_device_halt() command is provided.

ccw_device_halt() must be called with interrupts disabled and with the ccw
device lock held.

::

  int ccw_device_halt(struct ccw_device *cdev,
                      unsigned long intparm);

=======  =====================================================
cdev     ccw_device the halt operation is requested for
intparm  interruption parameter; value is only used if no I/O
         is outstanding, otherwise the intparm associated with
         the I/O request is returned
=======  =====================================================

The ccw_device_halt() function returns:

=======  ==============================================================
0        request successfully initiated
-EBUSY   the device is currently busy, or status pending.
-ENODEV  cdev invalid.
-EINVAL  The device is not operational or the ccw device is not online.
=======  ==============================================================

Usage Notes:

A device driver may write a never-ending channel program by writing a channel
program that at its end loops back to its beginning by means of a transfer in
channel (TIC) command (CCW_CMD_TIC). Usually this is performed by network
device drivers by setting the PCI CCW flag (CCW_FLAG_PCI). Once this CCW is
executed a program controlled interrupt (PCI) is generated. The device driver
can then perform an appropriate action. Before interrupting an outstanding
read to a network device (with or without PCI flag), a ccw_device_halt()
is required to end the pending operation.

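The looping structure described in the usage note can be pictured with a small
user-space model. It is a sketch only: the flag and command values are assumed
to match arch/s390/include/asm/cio.h, ``EX_CMD_READ`` is a placeholder, the
TIC's cda holds an array index instead of a real address, and
``walk_program()`` is a toy stand-in for the channel that merely counts how
many PCIs would be raised.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* User-space mirror of the format 1 CCW layout. */
struct ccw1 {
	uint8_t  cmd_code;
	uint8_t  flags;
	uint16_t count;
	uint32_t cda;
} __attribute__ ((packed, aligned(8)));

/* Assumed to match the values in arch/s390/include/asm/cio.h. */
#define CCW_CMD_TIC  0x08
#define CCW_FLAG_CC  0x40
#define CCW_FLAG_PCI 0x08

#define EX_CMD_READ  0x02 /* placeholder command code */

/*
 * Fill cp[0..nr_reads-1] with chained reads that each request a PCI and
 * close the ring with a TIC back to the first CCW. cp must have room for
 * nr_reads + 1 entries. Buffer addresses are left out of this sketch.
 */
static void build_read_ring(struct ccw1 *cp, unsigned int nr_reads)
{
	unsigned int i;

	assert(cp != NULL && nr_reads > 0);
	memset(cp, 0, (nr_reads + 1) * sizeof(*cp));
	for (i = 0; i < nr_reads; i++) {
		cp[i].cmd_code = EX_CMD_READ;
		cp[i].flags    = CCW_FLAG_CC | CCW_FLAG_PCI;
	}
	cp[nr_reads].cmd_code = CCW_CMD_TIC;
	cp[nr_reads].cda      = 0; /* model only: index of the first CCW */
}

/* Toy model of the channel: visit max_steps CCWs, return the PCI count. */
static unsigned int walk_program(const struct ccw1 *cp, unsigned int max_steps)
{
	unsigned int idx = 0, pci = 0;

	while (max_steps--) {
		if (cp[idx].cmd_code == CCW_CMD_TIC) {
			idx = cp[idx].cda; /* jump, never terminates itself */
			continue;
		}
		if (cp[idx].flags & CCW_FLAG_PCI)
			pci++;
		if (!(cp[idx].flags & CCW_FLAG_CC))
			break; /* end of the channel program */
		idx++;
	}
	return pci;
}
```

The model only stops because of the step limit, which is the point of the
usage note: on real hardware such a program keeps running until the driver
ends it, for example with ccw_device_halt().
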
ccw_device_clear() - Terminate I/O Request Processing

In order to terminate all I/O processing at the subchannel, the clear subchannel
(CSCH) command is used. It can be issued via ccw_device_clear().

ccw_device_clear() must be called with interrupts disabled and with the ccw
device lock held.

::

  int ccw_device_clear(struct ccw_device *cdev, unsigned long intparm);

=======  ===============================================
cdev     ccw_device the clear operation is requested for
intparm  interruption parameter (see ccw_device_halt())
=======  ===============================================

The ccw_device_clear() function returns:

=======  ==============================================================
0        request successfully initiated
-ENODEV  cdev invalid
-EINVAL  The device is not operational or the ccw device is not online.
=======  ==============================================================

Miscellaneous Support Routines
------------------------------

This chapter describes various routines to be used in a Linux/390 device
driver programming environment.

get_ccwdev_lock()

Get the address of the device specific lock. This is then used in
spin_lock() / spin_unlock() calls.

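The calling discipline (take the device lock, start the I/O, release the lock)
can be modelled in user space as below. Everything prefixed with ``mock_`` is
a hypothetical stand-in written for this example: the lock is a plain flag
rather than a spinlock, and ``mock_ccw_device_start()`` only imitates the
documented 0 and -EBUSY return values.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins; the real lock is a spinlock inside the kernel. */
struct mock_lock {
	int held;
};

struct mock_ccw_device {
	struct mock_lock lock;
	int              busy; /* an I/O request is already in flight */
};

static struct mock_lock *mock_get_ccwdev_lock(struct mock_ccw_device *cdev)
{
	return &cdev->lock;
}

/* Models taking the lock with interrupts disabled. */
static void mock_lock_irqsave(struct mock_lock *lock)
{
	assert(!lock->held); /* taking it twice would deadlock */
	lock->held = 1;
}

static void mock_unlock_irqrestore(struct mock_lock *lock)
{
	assert(lock->held);
	lock->held = 0;
}

/* Imitates the documented contract: caller must hold the device lock. */
static int mock_ccw_device_start(struct mock_ccw_device *cdev)
{
	assert(cdev->lock.held);
	if (cdev->busy)
		return -EBUSY;
	cdev->busy = 1;
	return 0;
}

/* Submit one request from process context. */
static int submit_io(struct mock_ccw_device *cdev)
{
	int rc;

	mock_lock_irqsave(mock_get_ccwdev_lock(cdev));
	rc = mock_ccw_device_start(cdev);
	mock_unlock_irqrestore(mock_get_ccwdev_lock(cdev));
	return rc;
}
```

An interrupt handler would skip the locking in ``submit_io()``, since the
lock is already held while the handler runs, as noted in the
ccw_device_start() usage notes.
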
::

  __u8 ccw_device_get_path_mask(struct ccw_device *cdev);

Get the mask of the path currently available for cdev.