App_offline.htm – Page Not Found

September 5, 2009 at 6:11 PM by Ben

Since version 2.0, ASP.NET has had the built-in capability to easily take your entire application offline when you need to make an update or perform some maintenance on your site.

Simply put a file named app_offline.htm in the root directory of your site, and ASP.NET will serve that file instead of any requested page.

I recently employed this feature for probably the first time.  I put the app_offline.htm file on the site, and pulled up my site in Firefox.  The contents of app_offline.htm displayed as expected.

However, pulling up the site in Chrome or IE produced a Page Not Found error that made it appear as though my entire site did not exist.

App_offline.htm result in IE8:

[Screenshot: App_offline.htm in IE]

App_offline.htm result in Chrome:

[Screenshot: App_offline.htm in Chrome]

As mentioned above, in Firefox, the contents of App_offline.htm would display as expected.

The problem is that when ASP.NET serves the app_offline.htm file, it responds with an HTTP status code of 404.  Chrome displays the generic page shown above for 404 errors.  In IE, you can actually avoid the generic error page by turning off the "Show friendly HTTP error messages" option.

But I obviously cannot expect IE visitors to my site to have HTTP Friendly errors turned off.

The way ASP.NET implements app_offline.htm, passing out a 404 HTTP status code, is not well designed in my opinion.  A much better implementation would be for ASP.NET to return a normal 200 HTTP status code.

To accomplish this for this site, I created a simple HTTP Module that processes the beginning of each request.  It checks an “offline” appSetting in web.config to see if the application should be offline.  If the setting is turned on, the module does a server transfer to my own app-offline HTML file.

One thing I found on an IIS7 server is that requests for items such as JPG, GIF and CSS files also go through this HTTP module.  Normally this is a great benefit of IIS7’s integrated mode pipeline.  However, if the application-offline HTML file includes an IMG tag for an image on the same site, or a link to a CSS file on the same site, the HTTP module will do a server transfer for those requests too.  The result is the image not displaying on the application-offline page, the CSS file not loading in the browser, and so on.

A simple filter in the HTTP module to only do a server transfer for actual pages is all that is required.  The fairly simple HTTP Module I ended up creating is below.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Configuration;
using System.IO;

public class AppOffline : IHttpModule
{
    public void Dispose()
    {
    }

    public void Init(HttpApplication context)
    {
        context.BeginRequest += new EventHandler(context_BeginRequest);
    }

    void context_BeginRequest(object sender, EventArgs e)
    {
        HttpContext context = ((HttpApplication)sender).Context;

        if (ConfigurationManager.AppSettings["offline"] == "true")
        {
            string extension = Path.GetExtension(context.Request.Path);

            // Only do a server transfer for actual page requests
            // (.aspx, .ashx, .asmx).  Static files like .jpg and .css that
            // the offline page itself references must be served normally,
            // or they won't load in the browser.
            string targetedExtensions = ".aspx.ashx.asmx";
            if (targetedExtensions.IndexOf(extension, StringComparison.OrdinalIgnoreCase) == -1)
                return;
            
            context.Server.Transfer("~/application_offline.html");
        }
    }
}

It’s a simple but effective HTTP module.  When the server transfer to my own application-offline HTML file happens, the HTTP status code returned to the client is 200.  No more Page Not Found problems in browsers like IE and Chrome.
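To wire the module up, register it in web.config along with the “offline” appSetting it reads.  Here’s a minimal sketch, assuming the AppOffline class lives in App_Code (the system.webServer section is for IIS7’s integrated mode; classic mode uses system.web/httpModules instead):

<configuration>
  <appSettings>
    <!-- Flip to "true" to take the application offline -->
    <add key="offline" value="false" />
  </appSettings>
  <system.webServer>
    <modules>
      <add name="AppOffline" type="AppOffline" />
    </modules>
  </system.webServer>
</configuration>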

Posted in: Development


Performance: Compiled vs. Interpreted Regular Expressions

August 6, 2009 at 10:33 PM by Ben

When a regular expression in .NET will be used multiple times, it’s common to create the Regex with the RegexOptions.Compiled flag.  Compiled regexes take a bit more time to create initially, but run faster than regexes created without the Compiled flag.  At least, that’s what the documentation states!

Without the Compiled flag, your regex is interpreted.  There are also “precompiled” regular expressions, which you compile into an assembly before runtime.  This can be a good option for constant regexes that never change; if your regexes are subject to change, precompiled is not a good option.  These three types of regexes (interpreted, compiled and precompiled) are explained with a few more technical details in this somewhat dated MS blog article.
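For completeness, here’s a minimal sketch of pre-compiling a regex into an assembly; the pattern, class name and assembly name below are just placeholders:

using System.Reflection;
using System.Text.RegularExpressions;

// Describe the regex to generate; the class/namespace names are arbitrary.
RegexCompilationInfo info = new RegexCompilationInfo(
    @"^[a-zA-Z0-9]+[a-zA-Z0-9._%-]*@example\.com$",  // placeholder pattern
    RegexOptions.IgnoreCase,
    "ExampleEmailRegex",   // name of the generated class
    "MyApp.Regexes",       // namespace of the generated class
    true);                 // generate the class as public

// Writes MyApp.Regexes.dll to the current directory.
Regex.CompileToAssembly(
    new RegexCompilationInfo[] { info },
    new AssemblyName("MyApp.Regexes"));

// In the consuming project (after referencing MyApp.Regexes.dll):
// Regex emailRegex = new MyApp.Regexes.ExampleEmailRegex();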

Theory is great, but real benchmarks are more meaningful.  I’ve assembled some code that benchmarks the time it takes to create and run 5,000 regular expressions.  There’s actually a big difference between the time a compiled regular expression takes to run the first time and on subsequent runs, so the results shown here include the first run time as well as subsequent run times.

Here’s some code to get us started:

    private static List<Regex> _expressions;
    private static object _SyncRoot = new object();

    private static List<Regex> GetExpressions()
    {
        if (_expressions != null)
            return _expressions;

        lock (_SyncRoot)
        {
            if (_expressions == null)
            {
                DateTime startTime = DateTime.Now;

                List<Regex> tempExpressions = new List<Regex>();
                string regExPattern =
                    @"^[a-zA-Z0-9]+[a-zA-Z0-9._%-]*@{0}$";

                for (int i = 0; i < 5000; i++)
                {
                    tempExpressions.Add(new Regex(
                        string.Format(regExPattern,
                        Regex.Escape("domain" + i.ToString() +
                        (i % 3 == 0 ? ".com" : ".net"))),
                        RegexOptions.IgnoreCase | RegexOptions.Compiled));
                }

                _expressions = new List<Regex>(tempExpressions);
                DateTime endTime = DateTime.Now;
                double msTaken = endTime.Subtract(startTime).TotalMilliseconds;
            }
        }

        return _expressions;
    }

We’re storing 5,000 regular expressions in a static list.  Notice the RegexOptions.Compiled flag is being used.  The regexes just look for email addresses with specific domain names – domain1.net, domain2.net, domain3.com, etc.  Not very useful, but I wanted the regexes to vary.  You can see we’re also recording the number of milliseconds taken to create the regular expressions.  Now here’s the code that calls GetExpressions() and actually invokes the IsMatch() function on each regex.

    private static void CheckForMatches(string text)
    {
        List<Regex> expressions = GetExpressions();
        DateTime startTime = DateTime.Now;

        foreach (Regex e in expressions)
        {
            bool isMatch = e.IsMatch(text);
        }

        DateTime endTime = DateTime.Now;
        double msTaken = endTime.Subtract(startTime).TotalMilliseconds;
    }

And we call CheckForMatches like so:

    CheckForMatches("some random text with email address, [email protected]");

How much time does it take to create and run these 5,000 compiled expressions?  Here’s what I get:

Compiled Regular Expressions
CREATION TIME: 1662 ms
FIRST RUN TIME: 25137 ms
SUBSEQUENT RUN TIMES: 41 ms

Subsequent runs of all 5,000 expressions are very fast.  However, look how much time it takes the first time these 5,000 expressions are run in CheckForMatches() – 25 seconds!

Let’s make ONE change.  Remove the RegexOptions.Compiled flag.  By doing this, our regular expressions will be interpreted.  Here’s what we get:

Interpreted Regular Expressions
CREATION TIME: 493 ms
FIRST RUN TIME: 22 ms
SUBSEQUENT RUN TIMES: 20 ms

Interpreted regexes beat compiled in every category!  Running these tests several times produces similar results.  The BIG difference here is obviously the first run time: 25 seconds versus 0.022 seconds.

I’ve seen some benchmarks showing static regexes performing a little slower than instance regexes.  I ran the same tests without the static modifier on the fields and methods above.  Same results – with the Compiled flag, the regular expressions take around 25 seconds to run the first time; without it, they run in hundredths of a second.

Clearly, interpreted regexes are the winner here.  Granted, if you’re only dealing with a small number of regular expressions and you use the Compiled flag, the first run time isn’t going to be anywhere near what I’ve shown here with 5,000 regexes.  However, even with just a few regular expressions in .NET, you’ll see me sticking with interpreted ones!

Multiple Forms in Master Page Site

March 30, 2009 at 11:14 PM by Ben

When integrating a website with 3rd party services such as search providers or payment processors, it's not an uncommon need to have a submit button that posts to that 3rd party's site.

ASP.NET 2.0 introduced the PostBackUrl property for controls that can initiate a post -- Buttons, ImageButtons, LinkButtons.  With this property, you can have your page post to any URL.  It's a somewhat decent solution to this problem, but the major downside is that PostBackUrl requires the visitor to have JavaScript enabled in their browser.  If JavaScript is disabled, you end up with a normal postback to your own page.  To me, that makes PostBackUrl a poor choice to turn to.

The other day, I needed to send a visitor to a payment processor's site.  My site was using master pages.  I was collecting some preliminary information from the visitor on a Content page.  Once collected, I was going to show them a confirmation of what they would be paying for and give them a button to move onto the payment processor.  I decided to use the PostBackUrl property on this button.

The payment processor needed a few hidden fields included in the form submission.  I put those values into non-server hidden input fields, since I didn't want the names or IDs of the fields mangled by ASP.NET.  The form looked good, and clicking the button took me to the payment processor's website -- but once I got there, the only thing on the page was an error message complaining about the incoming data (a very non-specific error message).

I knew their site and this particular landing page worked, since I had tested posting data to it in a non-ASP.NET environment.  Through Fiddler, I confirmed the correct hidden input fields were being sent in the POST, but still no dice.  The only explanation I could come up with is that PostBackUrl passes not only my custom hidden fields to the other site, but also all the standard ASP.NET hidden fields -- ViewState, EventValidation, etc.  The payment processor's site was likely not expecting these extra form values.

So I decided to ditch PostBackUrl ... which was fine, since I wasn't a big fan of it to begin with.  I came up with a pretty simple solution for having multiple forms in this master page environment.  The typical master page setup has a <form runat="server"> tag in the master page surrounding the ContentPlaceHolder.

<body>
    <form id="form1" runat="server">
        <asp:ContentPlaceHolder id="ContentPlaceHolder1" runat="server">
        </asp:ContentPlaceHolder>
    </form>
</body>

Now if you put your own <form> tag in the Content page, you'll end up with a form nested in another form.  Nested forms are not valid.  Here's what I did to avoid nested forms, but still end up with multiple forms.

<body>    
    <form id="form1" runat="server">
        <asp:ContentPlaceHolder id="cphForm" runat="server">
        </asp:ContentPlaceHolder>
    </form>
    
    <asp:ContentPlaceHolder id="cphNoForm" runat="server">
    </asp:ContentPlaceHolder>
</body>

The master page now has two content place holders.  One is wrapped in a <form runat="server"> tag, and the other isn't.  The types of controls allowed outside a <form runat="server"> tag are limited.  For instance, you cannot have ASP.NET Buttons, TextBoxes and a number of other controls outside a <form runat="server"> tag.  You can, however, use Literals, PlaceHolders and basically any HTML controls with a runat="server" attribute.  Here's the content page markup.

<asp:Content ID="cntForm" ContentPlaceHolderID="cphForm" Runat="Server">
    
    <asp:PlaceHolder ID="phForm" runat="server">
    
        <h2>Select an Item to Purchase</h2>
        <asp:RadioButtonList ID="rblMenu" runat="server">
            <asp:ListItem Text="Item 1" Value="item1" Selected="True"></asp:ListItem>
            <asp:ListItem Text="Item 2" Value="item2"></asp:ListItem>
            <asp:ListItem Text="Item 3" Value="item3"></asp:ListItem>
        </asp:RadioButtonList>
        <asp:Button ID="btnContinue" runat="server"
                    Text="Continue" OnClick="continuePurchase" />

    </asp:PlaceHolder>
    
</asp:Content>

<asp:Content ID="cntNoForm" ContentPlaceHolderID="cphNoForm" Runat="Server">

    <asp:PlaceHolder ID="phConfirmation" runat="server" Visible="false">
    
        <h2>
           Your Selected Item:
           <asp:Literal ID="litSelectedItem" runat="server"></asp:Literal></h2>
    
        <form method="post" action="http://www.example.com/">
            <input type="hidden" name="myId" value="someID" />
            <input type="hidden" name="itemCode" value="<%= itemCode %>" />
            <input type="hidden" name="itemAmount" value="<%= itemAmt %>" />
            <input type="submit" name="payNow" value="Pay Now" />
        </form>
    
    </asp:PlaceHolder>

</asp:Content>

The content page is making use of both content place holders.  The top Content control contains a place holder, which contains a simple order form.  The second Content control contains a place holder with its visibility set to false.  So when the page is first pulled up, nothing in the second Content control is yet visible.
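The Continue button's click handler does the swapping between the two placeholders.  Here's a minimal sketch of the code-behind -- itemCode and itemAmt are protected fields backing the <%= %> expressions above, and the amount is just a placeholder value:

using System;

public partial class Content1 : System.Web.UI.Page
{
    protected string itemCode;
    protected string itemAmt;

    protected void continuePurchase(object sender, EventArgs e)
    {
        // Hide the order form; show the confirmation with the outgoing form.
        phForm.Visible = false;
        phConfirmation.Visible = true;

        itemCode = rblMenu.SelectedValue;
        litSelectedItem.Text = rblMenu.SelectedItem.Text;
        itemAmt = "30"; // placeholder -- look up the real amount for itemCode
    }
}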

Once an item is selected and the Continue button is clicked, a postback occurs.  During the postback, the placeholder in the first Content control is made invisible, and the placeholder in the second Content control is made visible.  Because the second Content control (and all of its contents) sits outside the <form runat="server"> tag, we've achieved two different, non-nested <form> tags on the same page.  Here's what the rendered HTML looks like after the person has selected their item and is ready to be sent to the payment processor:

<body>    
   <div>
      <form name="aspnetForm" method="post" action="Content1.aspx" id="aspnetForm">
         <div>
            <input type="hidden" name="__VIEWSTATE"
                   id="__VIEWSTATE" value="some long value" />
         </div>
      </form>
     
      <h2>Your Selected Item: Item 2</h2>
    
      <form method="post" action="http://www.example.com/">
         <input type="hidden" name="myId" value="someID" />
         <input type="hidden" name="itemCode" value="item2" />
         <input type="hidden" name="itemAmount" value="30" />
         <input type="submit" name="payNow" value="Pay Now" />
      </form>
   </div>    
</body>

With this approach, I was able to post the form to the payment processor, and the form contained only the hidden input fields relevant to the processor.  This approach will work in most master page situations.  It may be difficult to implement if your master page has server side controls requiring a <form runat="server"> tag outside of the main content place holder.  Even then, it may still be possible to implement this two-ContentPlaceHolder approach with some juggling of your controls and/or layout.  In a lot of situations, though, this is a practical way to achieve multiple form tags in an ASP.NET site.

TimeZoneInfo - A Small .NET 3.5 Gem

December 7, 2008 at 4:10 PM by Ben

One handy new class introduced in the .NET 3.5 framework is the TimeZoneInfo class.  This class allows you to get date and time information for any time zone in the world.  Prior to .NET 3.5, the framework exposed methods to get date and time information only for the time zone the server was set to.

I've used this class to find the date/time of a place other than where the server is located.  If you're lucky, the server your website is running on is in the same time zone you want to save and display dates and times for.  Even a different time zone can be easy: say you're in New York and your server is in California; you can just add 3 hours to the server time to get New York time.

It isn't always that easy living in Arizona, where daylight saving time (DST) is not observed.  In the summer, Arizona keeps the same clock time as California (PDT), three hours behind the east coast; in the winter, Arizona is on MST, two hours behind the east coast.  Your server may be in Arizona when you want west coast time for your application, or your server may be in California when you want Arizona time.

If the server is in Arizona, and you want to know what time it is in California, you can use the TimeZoneInfo class to find this out.

TimeZoneInfo tzi = TimeZoneInfo.FindSystemTimeZoneById("Pacific Standard Time");
DateTime PstDateTime = TimeZoneInfo.ConvertTime(DateTime.Now, tzi);


PstDateTime now contains the current date/time in California.  This code works year round, as daylight saving time and any other adjustment rules are taken into account by the static ConvertTime() function.  So in the winter, PstDateTime will be one hour earlier than Arizona time, and in the summer, PstDateTime will be the same as Arizona time.  This makes life very convenient, since you don't need to worry about calculating when DST starts and ends each year.

You may even want to just know if daylight saving time is in effect for any given date/time value.

bool IsDaylightSavingTime = tzi.IsDaylightSavingTime(PstDateTime);


This tells you whether daylight saving time is in effect for PstDateTime.  Let's check the status of DST for two different dates (the U.S. DST rules changed in 2007, which is why April 1 falls outside DST in 2000 but inside it in 2008):

// false
bool IsDaylightSavingTimeAprilFirst2000 = tzi.IsDaylightSavingTime(DateTime.Parse("4/1/2000"));
// true
bool IsDaylightSavingTimeAprilFirst2008 = tzi.IsDaylightSavingTime(DateTime.Parse("4/1/2008"));


There's actually a handful of other static and instance members of the TimeZoneInfo class that can come in handy depending on your needs.  Members such as GetUtcOffset(), ConvertTimeToUtc() and DisplayName provide a wide range of built-in capability.  The list of available time zone Ids you can pass into the FindSystemTimeZoneById() method can be found via the GetSystemTimeZones() method.
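For example, to list the Id values that FindSystemTimeZoneById() accepts on your machine:

foreach (TimeZoneInfo zone in TimeZoneInfo.GetSystemTimeZones())
{
    // e.g. "US Mountain Standard Time -- (GMT-07:00) Arizona"
    Console.WriteLine(zone.Id + " -- " + zone.DisplayName);
}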

Posted in: Development


Navigate to default document of current directory

November 4, 2008 at 8:15 PM by Ben

I was working on a web page where I wanted to add a link to take the visitor to the default document of the folder this particular page is in.  This page (e.g. page.aspx) was in a folder (e.g. folder1).  I actually knew the default document in this folder was index.aspx and could have just set the link's HREF to "index.aspx", but I wanted to make this a little more generic so it didn't matter what the default document was.  Just to be clear, the location of the page I was putting this link on looked like:

http://www.example.com/folder1/page.aspx

Initially, I set the link's HREF to:

"../folder1/"

This worked well.  It would navigate the visitor to:

http://www.example.com/folder1/

But then I needed to copy page.aspx to another folder (e.g. folder2).  So the page was going to exist in both folder1 and folder2.  Once I copied page.aspx to folder2, I could have manually edited the link to:

"../folder2/"

This didn't have a very generic feel to it and I would always need to remember to change the HREF if I needed to copy the page again to other folders.  So I decided to make this link a server side HyperLink control and create some fairly simple .NET code that would determine what the current folder the page was in by parsing the URL and setting the link's HREF so it would take the visitor to the default document of the directory the page was in.  Once the .NET code determined the folder the page was in, it ended up setting the HyperLink's NavigateUrl to something similar to:

HyperLink1.NavigateUrl = "~/folder1/";
       - or even -
HyperLink1.NavigateUrl = "~/folder2/";

This worked well, but I was surprised when I looked at the HTML source to see what the HREF resolved to.  It essentially looked like:

<a id="HyperLink1" href="./">Go to the root of this folder</a>

As you can see, an HREF of "./" is the default document (the root) of the current folder!  Instead of running the .NET code I created to parse the URL and determine the current folder name, all that's needed is to statically set the HREF to "./".  This gave me flashbacks to the old DOS days, where running a simple DIR command (in a directory other than the root) always resulted in the first two lines being:

<DIR> .
<DIR> ..

The "." DIR refers to the current folder and the ".." DIR refers to the parent folder.  You can still see this today by opening up a command prompt and running DIR.

It's also worth mentioning that you can use "./" when redirecting a visitor in server side code:

Response.Redirect("./");

I'm sure I'll now start finding lots of places to sprinkle "./" HREFs in!

Posted in: Development


Disable ViewState - reap performance gains

October 19, 2008 at 11:17 AM by Ben

Turning off viewstate for an entire page, or just for certain controls, can greatly reduce bandwidth.  This is not news for most developers, but at least for me, it's something I often forget to take into account when developing ASP.NET pages.

Just the other day, I found two server side controls that had enough content in them to make viewstate much larger than it should have been.  After disabling viewstate on those two controls, the viewstate hidden field sent to the client went from about 6,500 bytes down to 500 bytes!  That's 6 KB of unneeded data that was being sent to each client.  And because viewstate is sent to the client in a hidden input field, if there's a postback, all that data gets sent back up to the server.  Most people don't have a very fast upload speed, so the post hurts performance even more than sending the data down to the client.

Especially for pages that don't do any postbacks, there's no reason I can think of to have viewstate turned on at all -- disable it at the page level.  It's scary to imagine the number of ASP.NET pages out there that output static reports in large tables (GridView, DataGrid, ListView, etc.) with viewstate unnecessarily enabled.  The bandwidth savings on such pages could be in the tens or even hundreds of kilobytes.
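Disabling viewstate is a one-attribute change at either level (the control here is just an example):

<%-- Page level: --%>
<%@ Page Language="C#" EnableViewState="false" %>

<%-- Control level: --%>
<asp:GridView ID="gvReport" runat="server" EnableViewState="false" />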

There are reasons to leave viewstate enabled for server controls on pages that do postbacks, however.  If repopulating a viewstate-disabled control during a postback is expensive -- for instance, it requires a database call or a web service call across the network -- leaving viewstate enabled may be the better choice.  The same goes for most standard form controls (e.g. input, select): there's typically not enough of a performance gain to justify turning off viewstate and losing its perks while a page goes back and forth between the client and server.  The big performance gains come from server controls that output large blocks of HTML not editable by the user.

Posted in: Development


Simplifying prevention of undesired Css/Js file caching

September 26, 2008 at 8:12 PM by Ben

It's an unfortunate reality that after updating an external CSS or JavaScript file referenced from a web page, not all browsers that have visited your site before will detect that the file they previously cached has changed.  When this happens, browsers run outdated JS and apply old styles to your page elements.

One common workaround for this situation that I've found quite effective is to append a value to the query string of the external file's URL.  So instead of the typical link (to a CSS file) in the Head section of your document,

<link href="styles.css" type="text/css" rel="stylesheet" /> 


You instead use a link with an arbitrary value following the actual file name:

<link href="styles.css?v=1" type="text/css" rel="stylesheet" />


Browsers cache the CSS file under the key "styles.css?v=1".  If you update your CSS file, you change "v=1" to "v=2" in your HTML page, and the browser treats styles.css?v=2 as a different file from styles.css?v=1.  Since styles.css?v=2 isn't already cached, the browser fetches the latest copy of styles.css from your web server.  Modifying (and trying to remember to modify) the query string value whenever I change a CSS or JS file has always been a manual task for me.  Recently, however, I created a mechanism to automate the process.

The automated process appends the timestamp of the external file (CSS or JS) to the query string following the file name sent to the browser.  The file timestamp is stored in the .NET cache with a cache dependency on the actual file on the web server.  Whenever I update the CSS or JS file, the timestamp is automatically evicted from the cache; the next time a visitor arrives at a page needing the timestamp, it is re-read and cached again.  The purpose of the cache, of course, is to reduce the file I/O and overall time required to get the page to the browser.  I've fallen in love with this process, as it's made my life easier!

The download link at the bottom of this post contains the code for a static GetFileWithVersion() function.  The same function can be used for any external file that you want to prevent browsers from mistakenly hanging onto after you've updated it.  GetFileWithVersion() returns a string containing the file name and the appended query string value; it's the caller's responsibility to add the necessary link or script tag to the head section of the page.
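In essence, GetFileWithVersion() can be implemented along these lines.  This is a minimal sketch -- the names match the usage below, but the body here is a reconstruction, so grab the download for the real thing:

using System;
using System.IO;
using System.Web;
using System.Web.Caching;
using System.Web.Hosting;

public static class utils
{
    // Returns e.g. "~/styles.css?v=633590472000000000", where the version
    // is the file's last-write timestamp in ticks.
    public static string GetFileWithVersion(string cacheKey, string virtualPath)
    {
        string version = HttpRuntime.Cache[cacheKey] as string;

        if (version == null)
        {
            string physicalPath = HostingEnvironment.MapPath(virtualPath);
            version = File.GetLastWriteTimeUtc(physicalPath).Ticks.ToString();

            // The cache dependency evicts this entry when the file changes.
            HttpRuntime.Cache.Insert(cacheKey, version,
                new CacheDependency(physicalPath));
        }

        return virtualPath + "?v=" + version;
    }
}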

Example usage of the GetFileWithVersion() function with a CSS file:

string CssFileWithVersion = utils.GetFileWithVersion("MainCss", "~/styles.css");
 
System.Web.UI.HtmlControls.HtmlLink cssLink = new System.Web.UI.HtmlControls.HtmlLink();
cssLink.Href = CssFileWithVersion;
cssLink.Attributes.Add("type", "text/css");
cssLink.Attributes.Add("rel", "stylesheet");
this.Header.Controls.Add(cssLink);


Example usage of the GetFileWithVersion() function with a JavaScript file:

string JsKey = "MainJs";
if (!ClientScript.IsClientScriptIncludeRegistered(this.GetType(), JsKey))
{
    string JsFileWithVersion = utils.GetFileWithVersion(JsKey, "~/tools.js");
    ClientScript.RegisterClientScriptInclude(this.GetType(), JsKey, ResolveClientUrl(JsFileWithVersion));
}


I've been using the GetFileWithVersion() function from master pages and standalone pages for a few weeks now.  Putting this type of mechanism in place is definitely a time saver, and since implementing it, I haven't seen or heard of any issues with old CSS or JavaScript showing up.

Code Download (1.74 kb)

Posted in: Development
