mail-archives/mono-devel-list/2013-June/040509.html

339 строки
16 KiB
HTML

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<HTML>
<HEAD>
<TITLE> [Mono-dev] mono WebResponse caching
</TITLE>
<LINK REL="Index" HREF="index.html" >
<LINK REL="made" HREF="mailto:mono-devel-list%40lists.ximian.com?Subject=Re%3A%20%5BMono-dev%5D%20mono%20WebResponse%20caching&In-Reply-To=%3CCAB1r_%2BXfEHXtixb%3D-pALu9%2BCmUm_pvwJRmb9tfNbkUb5RMeTGQ%40mail.gmail.com%3E">
<META NAME="robots" CONTENT="index,nofollow">
<style type="text/css">
pre {
white-space: pre-wrap; /* css-2.1, curent FF, Opera, Safari */
}
</style>
<META http-equiv="Content-Type" content="text/html; charset=us-ascii">
<LINK REL="Previous" HREF="040497.html">
<LINK REL="Next" HREF="040513.html">
</HEAD>
<BODY BGCOLOR="#ffffff">
<H1>[Mono-dev] mono WebResponse caching</H1>
<B>Daniel Lo Nigro</B>
<A HREF="mailto:mono-devel-list%40lists.ximian.com?Subject=Re%3A%20%5BMono-dev%5D%20mono%20WebResponse%20caching&In-Reply-To=%3CCAB1r_%2BXfEHXtixb%3D-pALu9%2BCmUm_pvwJRmb9tfNbkUb5RMeTGQ%40mail.gmail.com%3E"
TITLE="[Mono-dev] mono WebResponse caching">lists at dan.cx
</A><BR>
<I>Sun Jun 9 12:36:24 UTC 2013</I>
<P><UL>
<LI>Previous message: <A HREF="040497.html">[Mono-dev] mono WebResponse caching
</A></li>
<LI>Next message: <A HREF="040513.html">[Mono-dev] mono WebResponse caching
</A></li>
<LI> <B>Messages sorted by:</B>
<a href="date.html#40509">[ date ]</a>
<a href="thread.html#40509">[ thread ]</a>
<a href="subject.html#40509">[ subject ]</a>
<a href="author.html#40509">[ author ]</a>
</LI>
</UL>
<HR>
<!--beginarticle-->
<PRE>There'd be some code for output caching in Mono's ASP.NET implementation, I
wonder if any of it could be reused here. They're different concepts (
ASP.NET is using output caching at the server-side whereas this is request
caching at the client-side) but maybe some of the code could be reused,
especially around cache header parsing.
A caching implementation would definitely have to take into account, at the
minimum:
- Cacheability (public / private)
- Max age and expiry dates
And revalidation using ETags and Last-Modified dates would be a
nice-to-have (doing the request with the old ETag and Last-Modified date
and getting a 304 Not Modified back if the page hasn't been modified) but
isn't entirely necessary with a basic implementation (don't both
revalidating, just clear the cache when the max age is met)
I'd personally cache the results in memory using
System.Runtime.Caching&lt;<A HREF="http://msdn.microsoft.com/en-us/library/system.runtime.caching.aspx">http://msdn.microsoft.com/en-us/library/system.runtime.caching.aspx</A>&gt;rather
than on disk, as on-disk caching can get slow. I haven't checked if
Mono supports this, though.
On Wed, Jun 5, 2013 at 11:32 AM, Mark Lintner &lt;<A HREF="http://lists.ximian.com/mailman/listinfo/mono-devel-list">mlintner at sinenomine.net</A>&gt;wrote:
&gt;<i> Hi Greg,
</I>&gt;<i>
</I>&gt;<i>
</I>&gt;<i>
</I>&gt;<i> Great input. I have not gotten that far yet. Also the cache level is very
</I>&gt;<i> na&#239;ve. Everything caches except Bypass cache which doesn't cache. To check
</I>&gt;<i> the headers it would check the WebResponse Headers property coming
</I>&gt;<i> into TryInsert in DirectoryBackedResponseCache.cs . I would combine the
</I>&gt;<i> result of checking the header with the conditional ShouldCache which also
</I>&gt;<i> needs work. What we are looking for is comments like this, about what you
</I>&gt;<i> think needs to be in a usable implementation, gotchas, things to look out
</I>&gt;<i> for. Also if we are on the wrong track doing it this way I don't want to
</I>&gt;<i> waste time doing something which will not be useful. Assuming I'm not on
</I>&gt;<i> the wrong path, I will add the header check next. This is the kind of
</I>&gt;<i> feedback we need. I only tried to make a prototype of how caching could
</I>&gt;<i> work and then did some performance measurements. Obviously didn't add any
</I>&gt;<i> of the details yet. If this is a viable approach I will continue to fill
</I>&gt;<i> out the implementation guided by research and suggestions. Probably want to
</I>&gt;<i> limit it to what we really need to have. Any other suggestions, comments,
</I>&gt;<i> criticism would be greatly appreciated.
</I>&gt;<i>
</I>&gt;<i>
</I>&gt;<i>
</I>&gt;<i> Thanks,
</I>&gt;<i>
</I>&gt;<i> Mark
</I>&gt;<i> ------------------------------
</I>&gt;<i> *From:* Greg Young [<A HREF="http://lists.ximian.com/mailman/listinfo/mono-devel-list">gregoryyoung1 at gmail.com</A>]
</I>&gt;<i> *Sent:* Tuesday, June 04, 2013 6:49 PM
</I>&gt;<i> *To:* Mark Lintner
</I>&gt;<i> *Cc:* <A HREF="http://lists.ximian.com/mailman/listinfo/mono-devel-list">mono-devel-list at lists.ximian.com</A>
</I>&gt;<i> *Subject:* Re: [Mono-dev] mono WebResponse caching
</I>&gt;<i>
</I>&gt;<i> I would expect this to look at http caching headers in response when
</I>&gt;<i> caching such as expiration, whether something is cachable (public/private) etc
</I>&gt;<i> but don't see any of that code (or maybe I missed it). Can you point to
</I>&gt;<i> where it is in your implementation?
</I>&gt;<i>
</I>&gt;<i> On Tuesday, June 4, 2013, Mark Lintner wrote:
</I>&gt;<i>
</I>&gt;&gt;<i> We have noticed that parts of the Mono Web functionality are not yet
</I>&gt;&gt;<i> implemented. We are hoping to be able to add some of these missing pieces.
</I>&gt;&gt;<i> Caching of
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> HttpWebResponses is the first one we took a look at. It would appear that
</I>&gt;&gt;<i> Microsoft takes advantage of the internet explorer caching. Since it would
</I>&gt;&gt;<i> appear
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> that an HttpWebResponse is fully populated when it is returned from the
</I>&gt;&gt;<i> cache and the only content in the urls that are cached are the bytes that
</I>&gt;&gt;<i> come in on
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> the wire, Microsoft must return things from the cache at a very low level
</I>&gt;&gt;<i> and and let the response populate the normal way. This did not seem
</I>&gt;&gt;<i> possible to
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> emulate when I looked at mono. I also wanted to cache to a directory so
</I>&gt;&gt;<i> that the urls can be inspected or deleted.
</I>&gt;&gt;<i>
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> After looking at the code for awhile I realized that I could save
</I>&gt;&gt;<i> additional information from the response, ie: headercollection,
</I>&gt;&gt;<i> cookiecontainer, Status
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> code etc in the same file as the responsestream. A response can be
</I>&gt;&gt;<i> serialized but not all of it, most importantly the responsestream is not
</I>&gt;&gt;<i> serialized. I
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> wrote a prototype with some helper functions, the only change to existing
</I>&gt;&gt;<i> mono code would be in HttpWebRequest.GetResponse. Before calling
</I>&gt;&gt;<i> BeginGetResponse
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> I call a function that tries to retrieve the url from the cache. If it is
</I>&gt;&gt;<i> not there, BeginGetResponse is called as normal, the response is grabbed
</I>&gt;&gt;<i> from
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> EndGetResponse and passed to a method to insert into the cache. This
</I>&gt;&gt;<i> insert method will open a filename in a directory designated as mono_cache
</I>&gt;&gt;<i> with the name
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> produced by calling HttpUtility.Encode so it is a legal filename. Also
</I>&gt;&gt;<i> you can only read from the responsestream once, which means that if read
</I>&gt;&gt;<i> from it, save
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> it to a file, then the responsestream is no longer accessible from the
</I>&gt;&gt;<i> WebResponse when it is returned to the calling code. If sream copy worked
</I>&gt;&gt;<i> it could be
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> used here. One way to fix the problem is to call GetResponseStream(),
</I>&gt;&gt;<i> read it to the end, which produces a string, then finally convert the
</I>&gt;&gt;<i> string to bytes.
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> Then you can serialize the bytes to the open file, serialize the
</I>&gt;&gt;<i> headercollection and ultimately individually add any field you want in the
</I>&gt;&gt;<i> response when it
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> is retrieved from the cache, then close the file. Then create a new
</I>&gt;&gt;<i> WebConnectionData instance and populate it with headers and all the
</I>&gt;&gt;<i> relevent fields from
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> the old response, finally passing the bytes of data to a memory stream
</I>&gt;&gt;<i> constructor and assign it to the WebConnectionData's stream member. Then I
</I>&gt;&gt;<i> construct a
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> new HttpWebResponse, passing the WebConnectionData and other arguments
</I>&gt;&gt;<i> which are taken from the existing request and the recently retrieved
</I>&gt;&gt;<i> response. The
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> response is then returned to getResponse then finally to the calling
</I>&gt;&gt;<i> code, good as new. Next time the url is requested the cached file is
</I>&gt;&gt;<i> deserialized, one
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> member at a time then a WebConnectionData is setup and finally, a
</I>&gt;&gt;<i> HttpWebResponse is constructed with the info read from the file, the
</I>&gt;&gt;<i> WebConnectionData and
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> from the members of the new request. I concentrated on prototyping
</I>&gt;&gt;<i> caching of HttpWebResponses. Only a simple cache algorithm, no Ftp, no File
</I>&gt;&gt;<i> etc. No cache
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> aging etc. There are many other things that would have to be done for
</I>&gt;&gt;<i> this to be usable.
</I>&gt;&gt;<i> Using the prototype I requested www.google.com and it took .1601419
</I>&gt;&gt;<i> sec to retrieve and cache it, including all the mechanics described above.
</I>&gt;&gt;<i> The
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> second time returning it from the cache, building the HttpWebResponse
</I>&gt;&gt;<i> took .0007671. This is a 288 times speedup. That kind of performance
</I>&gt;&gt;<i> increase, from a
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> prototype that is not yet optimized, is compelling. I must add that I
</I>&gt;&gt;<i> measured the same experiment with google under the .Net 4.5 implementation
</I>&gt;&gt;<i> and the
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> speedup is 769x. Obviously thats something to work toward. Even falling
</I>&gt;&gt;<i> short of that level of speedup, some implementation of response caching for
</I>&gt;&gt;<i> mono
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> would greatly benefit the users who need better performance of services,
</I>&gt;&gt;<i> rest etc.
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> So far this is the only mono code that would change to support caching,
</I>&gt;&gt;<i>
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> from HttpWebRequest.cs:
</I>&gt;&gt;<i>
</I>&gt;&gt;<i>
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> public override WebResponse GetResponse()
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> {
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> WebResponse response = ResponseCache.TryRetrieve (this);
</I>&gt;&gt;<i>
</I>&gt;&gt;<i>
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> if (response == null)
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> {
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> WebAsyncResult result = (WebAsyncResult) BeginGetResponse (null,
</I>&gt;&gt;<i> null);
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> response = EndGetResponse (result);
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> response.IsFromCache = false;
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> HttpWebResponse response2 = ResponseCache.TryInsert
</I>&gt;&gt;<i> ((HttpWebRequest)this, (HttpWebResponse)response);
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> response.Close ();
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> return response2;
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> }
</I>&gt;&gt;<i>
</I>&gt;&gt;<i>
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> return response;
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> }
</I>&gt;&gt;<i>
</I>&gt;&gt;<i>
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> The code can stand some improvement. It is a prototype and it shows. I
</I>&gt;&gt;<i> have included the file ResponseCache.cs which the above
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> code hooks into. We were wondering whether anyone has tried to implement
</I>&gt;&gt;<i> response caching before and whether there is any intent to in the future.
</I>&gt;&gt;<i> What are
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> your thoughts about the approach described above and shown in the
</I>&gt;&gt;<i> ResponseCache.cs file and the 3 other files I included. Bear in mind it
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> would be refactored so that it fit into the mono architecture a bit
</I>&gt;&gt;<i> better, it needs to be generalized, different strategies for different
</I>&gt;&gt;<i> concrete response
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> types need to be plugged in, configuration. and a plethora of things I
</I>&gt;&gt;<i> missed. If you feel this approach has merit I can detail all that
</I>&gt;&gt;<i> seperatly. If you
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> feel a different approach would work better, please let us know, in the
</I>&gt;&gt;<i> end were looking to make mono more robust in some of the ways that are
</I>&gt;&gt;<i> expected but
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> not yet implemented.
</I>&gt;&gt;<i> The main thing I want to show here is how it is possible, to wedge a
</I>&gt;&gt;<i> cache strategy into mono's request pipeline without changing or
</I>&gt;&gt;<i> destabilizing
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> mono itself. The cache implementation could change also, which is one way
</I>&gt;&gt;<i> that this is code falls short is it totally violates open-closed. Is this
</I>&gt;&gt;<i> a
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> viable approach? There would of course be much more to do, that I have
</I>&gt;&gt;<i> not even mentioned here. Your comments would be appreciated.
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> The files can be found here.
</I>&gt;&gt;<i>
</I>&gt;&gt;<i>
</I>&gt;&gt;<i> <A HREF="http://pastebin.com/FCefsiDx">http://pastebin.com/FCefsiDx</A>
</I>&gt;&gt;<i> <A HREF="http://pastebin.com/8qBh8mMW">http://pastebin.com/8qBh8mMW</A>
</I>&gt;&gt;<i> <A HREF="http://pastebin.com/xVKLaidG">http://pastebin.com/xVKLaidG</A>
</I>&gt;&gt;<i> <A HREF="http://pastebin.com/NMVNv30T">http://pastebin.com/NMVNv30T</A>
</I>&gt;&gt;<i>
</I>&gt;&gt;<i>
</I>&gt;&gt;<i>
</I>&gt;&gt;<i>
</I>&gt;&gt;<i>
</I>&gt;<i>
</I>&gt;<i>
</I>&gt;<i> --
</I>&gt;<i> Le doute n'est pas une condition agr&#233;able, mais la certitude est absurde.
</I>&gt;<i>
</I>&gt;<i> _______________________________________________
</I>&gt;<i> Mono-devel-list mailing list
</I>&gt;<i> <A HREF="http://lists.ximian.com/mailman/listinfo/mono-devel-list">Mono-devel-list at lists.ximian.com</A>
</I>&gt;<i> <A HREF="http://lists.ximian.com/mailman/listinfo/mono-devel-list">http://lists.ximian.com/mailman/listinfo/mono-devel-list</A>
</I>&gt;<i>
</I>&gt;<i>
</I>-------------- next part --------------
An HTML attachment was scrubbed...
URL: &lt;<A HREF="http://lists.ximian.com/pipermail/mono-devel-list/attachments/20130609/7303c7df/attachment-0001.html">http://lists.ximian.com/pipermail/mono-devel-list/attachments/20130609/7303c7df/attachment-0001.html</A>&gt;
</PRE>
<!--endarticle-->
<HR>
<P><UL>
<!--threads-->
<LI>Previous message: <A HREF="040497.html">[Mono-dev] mono WebResponse caching
</A></li>
<LI>Next message: <A HREF="040513.html">[Mono-dev] mono WebResponse caching
</A></li>
<LI> <B>Messages sorted by:</B>
<a href="date.html#40509">[ date ]</a>
<a href="thread.html#40509">[ thread ]</a>
<a href="subject.html#40509">[ subject ]</a>
<a href="author.html#40509">[ author ]</a>
</LI>
</UL>
<hr>
<a href="http://lists.ximian.com/mailman/listinfo/mono-devel-list">More information about the Mono-devel-list
mailing list</a><br>
</body></html>