This technical memo is a cautionary note on using the IO timeout and
interrupt features of the Netscape Portable Runtime (NSPR) on Windows NT
3.51 and 4.0. Due to a limitation of the present implementation of NSPR
IO on NT, programs must observe the following guideline:

If a thread calls an NSPR IO function on a file descriptor and the IO
function fails with ``PR_IO_TIMEOUT_ERROR`` or
``PR_PENDING_INTERRUPT_ERROR``, the file descriptor must be closed
before the thread exits.

In this memo we explain the problem this guideline works around and
discuss its limitations.

.. _NSPR_IO_on_NT:

NSPR IO on NT
-------------

The IO model of NSPR 2.0 is synchronous and blocking. A thread calling
an IO function is blocked until the IO operation finishes, either
because it completes successfully or because it fails with an error. If
the IO operation cannot complete before the specified timeout, the IO
function fails with ``PR_IO_TIMEOUT_ERROR``. If the thread is
interrupted by another thread's ``PR_Interrupt()`` call, the IO function
fails with ``PR_PENDING_INTERRUPT_ERROR``.

On Windows NT, NSPR IO is implemented using NT's *overlapped* (also
called *asynchronous*) *IO*. When a thread calls an IO function, the
thread issues an overlapped IO request using the overlapped buffer in
its ``PRThread`` structure. Then the thread is put to sleep. In the
meantime, dedicated internal threads (called the *idle threads*) monitor
the IO completion port for completed IO requests. If a completed IO
request appears at the IO completion port, an idle thread fetches it and
wakes up the thread that issued the IO request. This is the normal way
the thread is awakened.

.. _IO_Timeout_and_Interrupt:

IO Timeout and Interrupt
------------------------

However, NSPR may wake up the thread in two other situations:

- if the overlapped IO request is not completed before the specified
  timeout. (Note that we can't specify a timeout on overlapped IO
  requests, so the timeouts are all handled at the NSPR level.) In this
  case, the error is ``PR_IO_TIMEOUT_ERROR``.
- if the thread is interrupted by another thread's ``PR_Interrupt()``
  call. In this case, the error is ``PR_PENDING_INTERRUPT_ERROR``.

These two errors are generated by the NSPR layer, so the OS is unaware
of them and the overlapped IO request is still in progress. The OS
still holds a pointer to the overlapped buffer in the thread's
``PRThread`` structure. If the thread subsequently exits and its
``PRThread`` structure is deleted, that pointer refers to freed memory,
which the OS may still write to when the request eventually completes.

.. _Canceling_Overlapped_IO_by_Closing_the_File_Descriptor:

Canceling Overlapped IO by Closing the File Descriptor
------------------------------------------------------

Therefore, we need to cancel the outstanding overlapped IO request
before the thread exits. NT's ``CancelIo()`` function would be ideal for
this purpose. Unfortunately, ``CancelIo()`` is not available on NT 3.51,
so we can't go this route as long as we support NT 3.51. The only
reliable way to cancel outstanding overlapped IO requests that works on
both NT 3.51 and 4.0 is to close the file descriptor, hence the rule of
thumb stated at the beginning of this memo.

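The rule of thumb can be sketched in C as follows. This is an
illustration only, not NSPR code: the ``sketch_*`` names, the error
enum, and the fake descriptor are hypothetical stand-ins so that the
example compiles without NSPR. A real program would include ``nspr.h``
and use an NSPR IO function such as ``PR_Recv()`` together with
``PR_GetError()`` and ``PR_Close()`` in their place.

.. code-block:: c

   #include <assert.h>
   #include <stddef.h>

   /* Hypothetical stand-ins for NSPR error codes (not the real values). */
   typedef enum {
       SKETCH_OK = 0,
       SKETCH_IO_TIMEOUT_ERROR,        /* models PR_IO_TIMEOUT_ERROR */
       SKETCH_PENDING_INTERRUPT_ERROR, /* models PR_PENDING_INTERRUPT_ERROR */
       SKETCH_OTHER_ERROR
   } sketch_error;

   /* Hypothetical stand-in for a file descriptor. */
   typedef struct {
       int is_open;            /* 1 until sketch_close() is called */
       sketch_error next_err;  /* error the next IO call will report */
   } sketch_fd;

   static sketch_error last_error = SKETCH_OK;

   /* Models a blocking IO call: returns -1 and records an error on failure. */
   static int sketch_recv(sketch_fd *fd)
   {
       assert(fd != NULL && fd->is_open);
       last_error = fd->next_err;
       return last_error == SKETCH_OK ? 0 : -1;
   }

   /* Models closing the descriptor, which cancels the outstanding request. */
   static void sketch_close(sketch_fd *fd)
   {
       assert(fd != NULL && fd->is_open);
       fd->is_open = 0;
   }

   /* Performs one IO call. After a timeout or interrupt error the
    * descriptor is closed right away, so the thread cannot exit with an
    * overlapped request still outstanding.
    * Returns 1 if fd is still usable, 0 if it had to be closed. */
   int read_following_guideline(sketch_fd *fd)
   {
       if (sketch_recv(fd) < 0 &&
           (last_error == SKETCH_IO_TIMEOUT_ERROR ||
            last_error == SKETCH_PENDING_INTERRUPT_ERROR)) {
           sketch_close(fd);
           return 0;
       }
       return 1;
   }

Closing immediately at the point of failure is one way to satisfy the
guideline; the guideline itself only requires that the close happen
before the thread exits.
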
.. _Limitations:

Limitations
-----------

This seemingly harsh way to force the completion of outstanding
overlapped IO requests has the following limitations:

- It is difficult for threads to share a file descriptor. For example,
  suppose thread A and thread B call ``PR_Accept()`` on the same
  socket, and they time out at the same time. Following the rule of
  thumb, both threads would close the socket. The first ``PR_Close()``
  would succeed, but the second ``PR_Close()`` would free memory that
  has already been freed. A solution that may work is to use a lock to
  ensure that only one thread can be using that socket at any time.
- Once there is a timeout or interrupt error, the file descriptor is no
  longer usable. If the file descriptor is intended to be used for the
  lifetime of the process (a log file, for example), this is really not
  acceptable. A possible solution is to add a ``PR_DisableInterrupt()``
  function to turn off interrupts when accessing such file descriptors.

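The locking idea from the first limitation might look roughly like the
following sketch. It is not NSPR code and has not been tried against
the NT implementation: ``shared_socket``, ``shared_socket_accept()``,
the ``ACCEPT_*`` values, and the two ``accept_*`` callbacks are
hypothetical names, and a POSIX mutex stands in for whatever lock the
program uses (an NSPR program would more likely use ``PRLock``).

.. code-block:: c

   #include <assert.h>
   #include <pthread.h>
   #include <stddef.h>

   /* Hypothetical result codes for the sketch. */
   enum { ACCEPT_OK = 0, ACCEPT_TIMED_OUT = -1, ACCEPT_ALREADY_CLOSED = -2 };

   /* Stand-in for the real accept call (PR_Accept() in an NSPR program). */
   typedef int (*accept_fn)(void);

   typedef struct {
       pthread_mutex_t lock;
       int is_open;      /* cleared by the one thread that closes the socket */
       int close_calls;  /* counts how often the underlying close ran */
   } shared_socket;

   #define SHARED_SOCKET_INIT { PTHREAD_MUTEX_INITIALIZER, 1, 0 }

   /* Example callbacks that model a timed-out and a successful accept. */
   int accept_times_out(void) { return ACCEPT_TIMED_OUT; }
   int accept_succeeds(void)  { return ACCEPT_OK; }

   /* Holds the lock for the whole accept, so only one thread uses the
    * socket at a time. On timeout the socket is closed exactly once;
    * later callers see ACCEPT_ALREADY_CLOSED instead of closing again. */
   int shared_socket_accept(shared_socket *s, accept_fn do_accept)
   {
       int rv;

       assert(s != NULL && do_accept != NULL);
       pthread_mutex_lock(&s->lock);
       if (!s->is_open) {
           rv = ACCEPT_ALREADY_CLOSED;
       } else {
           rv = do_accept();
           if (rv == ACCEPT_TIMED_OUT) {
               s->is_open = 0;    /* real code would call PR_Close() here */
               s->close_calls++;
           }
       }
       pthread_mutex_unlock(&s->lock);
       return rv;
   }

The cost of this design is that accepts on the shared socket are fully
serialized: a second thread waits for the lock while the first is
blocked in the accept call.
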
..

   *A related known bug is that timeout and interrupt don't work for*
   ``PR_Connect()`` *on NT. This bug is due to a different limitation
   in our NT implementation.*

.. _Conclusions:

Conclusions
-----------

As long as we need to support NT 3.51, we need to program under the
guideline that after an IO timeout or interrupt error, the thread must
make sure the file descriptor is closed before it exits. Programs should
also take care when sharing file descriptors and when using IO timeout
or interrupt on files that need to stay open for the lifetime of the
process.

When we stop supporting NT 3.51, we can look into using NT 4's
``CancelIo()`` function to cancel outstanding overlapped IO requests
when we get IO timeout or interrupt errors. If ``CancelIo()`` really
works as advertised, that should fundamentally solve this problem.

If these limitations of IO timeout and interrupt are unacceptable for
your programs, you can consider using the Win95 version of NSPR. The
Win95 version runs without trouble on NT, but you would lose the better
performance provided by NT fibers and asynchronous IO.

.. _Original_Document_Information:

Original Document Information
-----------------------------

- Author: larryh@netscape.com
- Last Updated Date: December 1, 2004