Monday, June 1, 2009

Script Bloat

The Problem

Rich client web applications tend to use a lot of JavaScript. Not only do they rely on extensive libraries to support client-side functionality, but in my opinion good modular code design tends to lead to lots of JavaScript files (especially since the Visual Studio JavaScript editor lacks code folding or regions, which makes working with large files tiresome, to say the least). Typically I find it easiest to place each client-side class in its own file, much like we do with our server-side code. Thus, for a typical page consisting of a list, a filter, and one or two editor controls, I might have the following script includes:
  • JSON library
  • jQuery / jQuery UI
  • Five or six standard jQuery plug-ins (blockui, timer, dimensions, hoverintent, flot, bgiframe, etc.) to provide extended functionality beyond that of the base libraries
  • Our own library of base controls that the domain specific controls inherit from to obtain our form, list, and filter functionality
  • A control for each editor and list on the page, with these controls often having sub controls
Looking over a sample page from a recent application, I found 18 scripts included! This can cause significant performance issues for clients, especially those with limited bandwidth, for two reasons:
  • Total script size: We are downloading source code, which can be quite large. Source code contains white space and comments, and is formatted to aid maintainability, not to produce svelte downloads.
  • Number of Requests: Since there may be dependencies between the scripts (almost everything we use has some dependency on jQuery for instance), browsers will only download and evaluate one script at a time. Given the overhead in fetching and evaluating each script, this time can add up.
There are several established techniques to address these issues:
  • Compression: Most modern browsers support gzip compression. Using gzip can reduce the size of the scripts significantly.
  • Concatenation: By combining all of your scripts into a single file, the client only needs to make one request. Order is still important, so you need to concatenate the scripts in the correct order of their dependencies.
  • Remove whitespace: Tools like JSMIN and packer can remove whitespace and comments, greatly reducing the download size.
  • Caching: By enabling caching for libraries and other scripts that are unlikely to change frequently, the browser will only download the script file once.

While these techniques can make an enormous difference to the end user, they can make life very hard for the developer. Compression and caching require you to configure and maintain settings in IIS, which adds deployment and maintenance concerns. And combining and minifying your scripts makes them very difficult to maintain. Imagine having all your developers work on a single, uncommented JavaScript file with no whitespace!

The Solution

The solution is to apply these techniques at run time, rather than at development or build time. There are a variety of implementations for various frameworks, but none quite suited my needs. Specifically, I wanted something that would:
  • Work with minimal IIS configuration and work in both IIS6 and IIS7
  • Allow for a "debug" mode that delivered readable scripts to aid in debugging
  • Require minimal configuration and ideally use a centralized configuration file
  • Be flexible enough to allow for varying cache intervals and to compensate for the fact that some library scripts I've used seem to dislike being minified
  • Work well with .NET, but not require the client side AJAX.NET libraries (no ScriptManager-based implementations)
  • Be simple

Overview

My solution is based on a custom ASP.NET HttpHandler that allows web pages to request script groups instead of just single scripts. The handler dynamically concatenates and minifies the scripts and emits caching and gzip compression headers as appropriate.

A request for a script group containing all of my base jQuery files would look like:


    <script src="/ScriptOptimizer.pragmatix?groups=jQuery" type="text/javascript"></script>

Multiple groups can be combined into a single request by passing them as a comma-separated list:


    <script src="/ScriptOptimizer.pragmatix?groups=jQuery,PragmatixBase" type="text/javascript"></script>


I define my script group mappings in the web.config inside a custom configuration section. I've seen a few folks online try to auto discover the scripts referenced by a page to avoid a configuration file, but I felt the overhead in terms of code wasn't worth the work.


    <ScriptOptimizer>
      <ScriptGroup Name="jQuery" Compress="true" AllowCache="true" CacheLengthInDays="7">
        <Script Path="/scripts/jquery-1.3.2.min.js" Enabled="true" Minify="false"/>
        <Script Path="/scripts/jquery-ui-1.7.1.custom.min.js" Enabled="true" Minify="false"/>
      </ScriptGroup>
      <ScriptGroup Name="PragmatixBase" Compress="true" AllowCache="false" CacheLengthInDays="1">
        <Script Path="/scripts/V2/PragmatixControl.js" Enabled="true" Minify="true"/>
        <Script Path="/scripts/V2/PragmatixModalEditor.js" Enabled="true" Minify="true"/>
      </ScriptGroup>
    </ScriptOptimizer>



As you can see, the caching and compression options apply to an entire script group, while enabling and minifying are controlled per script. If you combine groups, the lowest caching interval applies to all included scripts.
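The "lowest caching interval wins" rule can be sketched roughly as below. This is only an illustration, not the actual GetCachingSettings method (which isn't shown in this post): the GroupSettings and GroupSettingsCombiner names are made up, and treating Compress and AllowCache as "every group must agree" is my assumption about the most sensible behaviour.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the group-level attributes of a <ScriptGroup>.
public class GroupSettings
{
    public string Name;
    public bool Compress;
    public bool AllowCache;
    public int CacheLengthInDays;

    public GroupSettings(string name, bool compress, bool allowCache, int cacheLengthInDays)
    {
        Name = name;
        Compress = compress;
        AllowCache = allowCache;
        CacheLengthInDays = cacheLengthInDays;
    }
}

public static class GroupSettingsCombiner
{
    // Folds the settings of all requested groups into one, most restrictive, set:
    // the shortest cache interval wins, and (assumption) compression and caching
    // are only enabled when every requested group enables them.
    public static GroupSettings Combine(IEnumerable<GroupSettings> groups)
    {
        GroupSettings result = null;
        foreach (GroupSettings g in groups)
        {
            if (result == null)
            {
                result = new GroupSettings("combined", g.Compress, g.AllowCache, g.CacheLengthInDays);
                continue;
            }
            result.Compress = result.Compress && g.Compress;
            result.AllowCache = result.AllowCache && g.AllowCache;
            result.CacheLengthInDays = Math.Min(result.CacheLengthInDays, g.CacheLengthInDays);
        }

        if (result == null)
            throw new ArgumentException("At least one script group is required.", "groups");
        return result;
    }
}
```

With the two groups from the configuration above, the combined request would not be cacheable (PragmatixBase has AllowCache="false") and would carry the one-day interval.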

The Configuration

In order to keep the configuration info inside the web.config and avoid yet another config file, I implemented my own configuration section. I just found out that my previous way of doing this, using the IConfigurationSectionHandler interface, is deprecated, but I chose to ignore this for now instead of using the newer ConfigurationSection class. This allowed me to stick with the same XmlSerializer loading and saving procedures that I use for other stuff. Plus I didn't feel like implementing a custom class for all my collections (although using generics is a pretty solid workaround for that, see: http://utahdnug.org/blogs/josh/archive/2007/08/21/generic-configurationelementcollection.aspx).

The configuration section is pretty simple. The configuration data is loaded up as an object by deserializing the contents of the config section.


    public object Create(object parent, object configContext, XmlNode section)
    {
        // deserialize the <ScriptOptimizer> element straight into our config object
        XmlSerializer s = new XmlSerializer(typeof(ScriptOptimizerConfig));
        using (XmlNodeReader reader = new XmlNodeReader(section))
        {
            return (ScriptOptimizerConfig)s.Deserialize(reader);
        }
    }




The ScriptOptimizerConfig class contains a collection of script groups, each of which contains a collection of script definitions. The mapping to XML is declared with serialization attributes like this:


    [XmlRoot(ElementName = "ScriptOptimizer")]
    public class ScriptOptimizerConfig
    {
        #region Properties

        // each <ScriptGroup> child element becomes one entry in this list
        [XmlElement("ScriptGroup")]
        public List<ScriptOptimizerScriptGroupConfig> ScriptGroups
        {
            get { return m_ScriptGroups; }
            set { m_ScriptGroups = value; }
        }
        private List<ScriptOptimizerScriptGroupConfig> m_ScriptGroups;

        #endregion

        // (remaining members omitted)
    }




I really like using .NET serialization; it's possible to whip up a quick and nicely structured configuration file in just a few minutes. Plus, if the project scope expands, I can reuse the configuration objects with NHibernate persistence without significant code changes.
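The group and script classes aren't listed in this post, so here is a rough, self-contained sketch of how the whole hierarchy could map onto the XML shown earlier. All the class names (OptimizerConfigSketch, GroupConfigSketch, ScriptConfigSketch, ConfigLoaderSketch) are hypothetical, and the members are inferred from the attributes in the config file rather than copied from the real classes. Public fields are used instead of properties purely to keep it short.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical root: maps to the <ScriptOptimizer> element.
[XmlRoot(ElementName = "ScriptOptimizer")]
public class OptimizerConfigSketch
{
    [XmlElement("ScriptGroup")]
    public List<GroupConfigSketch> ScriptGroups = new List<GroupConfigSketch>();
}

// Hypothetical group: XML attributes hold the group-level options.
public class GroupConfigSketch
{
    [XmlAttribute("Name")] public string Name;
    [XmlAttribute("Compress")] public bool Compress;
    [XmlAttribute("AllowCache")] public bool AllowCache;
    [XmlAttribute("CacheLengthInDays")] public int CacheLengthInDays;

    [XmlElement("Script")]
    public List<ScriptConfigSketch> Scripts = new List<ScriptConfigSketch>();
}

// Hypothetical script entry: one <Script .../> element.
public class ScriptConfigSketch
{
    [XmlAttribute("Path")] public string Path;
    [XmlAttribute("Enabled")] public bool Enabled;
    [XmlAttribute("Minify")] public bool Minify;
}

public static class ConfigLoaderSketch
{
    // Deserializes a <ScriptOptimizer> fragment held in a string.
    public static OptimizerConfigSketch Load(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(OptimizerConfigSketch));
        using (StringReader reader = new StringReader(xml))
        {
            return (OptimizerConfigSketch)serializer.Deserialize(reader);
        }
    }
}
```

Plain bool and int fields are used for the attributes because XmlSerializer does not accept nullable value types on members marked with XmlAttribute.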


The Code

The actual work of script optimization is done inside the ScriptOptimizerHttpHandler class. The entry point is the ProcessRequest method.


    public void ProcessRequest(HttpContext context)
    {
        // load our configuration section
        ScriptOptimizerConfig Config = (ScriptOptimizerConfig)ConfigurationManager.GetSection("ScriptOptimizer");
        string[] groups = context.Request.QueryString["groups"].Split(new char[] { ',', ';', ':' });

        // determine the combined settings when multiple groups are requested
        ResultantScriptGroupSetting CombinedGroupSettings = Config.GetCachingSettings(groups);

        // set up GZIP compression if configured for such and the client allows it
        if (CombinedGroupSettings.Compress.Value && IsGZipSupported(context))
        {
            context.Response.AppendHeader("Content-Encoding", "gzip");
            ICSharpCode.SharpZipLib.GZip.GZipOutputStream OutputGZIPStream;
            OutputGZIPStream = new ICSharpCode.SharpZipLib.GZip.GZipOutputStream(context.Response.Filter);
            OutputGZIPStream.SetLevel(ICSharpCode.SharpZipLib.Zip.Compression.Deflater.BEST_COMPRESSION);
            context.Response.Filter = OutputGZIPStream;
        }

        // ready the response for writing out the scripts
        context.Response.Clear();
        context.Response.ContentType = "application/x-javascript";

        // write caching headers
        // if we are combining groups, we need to generate a combined set of caching headers
        // that makes sense. In this implementation, the lowest caching interval wins
        if (CombinedGroupSettings.AllowCache.Value)
        {
            context.Response.Cache.SetCacheability(HttpCacheability.Public);
            context.Response.Cache.SetExpires(DateTime.Now.AddDays(CombinedGroupSettings.CacheLengthInDays.Value));
        }
        else
        {
            context.Response.Cache.SetCacheability(HttpCacheability.NoCache);
        }

        // append the scripts
        AppendScripts(context, Config, groups, CombinedGroupSettings);
    }




First I load the configuration section, parse out my requested groups from the query string, and calculate some of the group-level settings if more than one group is requested. Next we emit our compression and caching headers. Finally we call the AppendScripts function to return each of the requested scripts.
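The IsGZipSupported helper isn't shown in this post. A minimal sketch of that kind of check, assuming it simply inspects the request's Accept-Encoding header, might look like the following. The EncodingSupport class and AcceptsGZip method are hypothetical names; it takes the raw header value so it can be tried outside of ASP.NET, and it deliberately ignores the "*" wildcard.

```csharp
using System;
using System.Globalization;

public static class EncodingSupport
{
    // Returns true when the Accept-Encoding header value lists "gzip"
    // and does not explicitly refuse it with a quality value of zero.
    public static bool AcceptsGZip(string acceptEncoding)
    {
        if (string.IsNullOrEmpty(acceptEncoding))
            return false;

        foreach (string entry in acceptEncoding.Split(','))
        {
            // each entry looks like "gzip" or "gzip;q=0.8"
            string[] parts = entry.Trim().Split(';');
            if (!parts[0].Trim().Equals("gzip", StringComparison.OrdinalIgnoreCase))
                continue;

            for (int i = 1; i < parts.Length; i++)
            {
                string parameter = parts[i].Replace(" ", "");
                double q;
                if (parameter.StartsWith("q=", StringComparison.OrdinalIgnoreCase)
                    && double.TryParse(parameter.Substring(2), NumberStyles.Float, CultureInfo.InvariantCulture, out q)
                    && q == 0.0)
                {
                    return false; // "gzip;q=0" means the client refuses gzip
                }
            }
            return true;
        }
        return false;
    }
}
```

Inside the handler this would be called with something like EncodingSupport.AcceptsGZip(context.Request.Headers["Accept-Encoding"]).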


    public void AppendScripts(System.Web.HttpContext context, ScriptOptimizerConfig Config, string[] groups, ResultantScriptGroupSetting CombinedGroupSettings)
    {
        foreach (string groupName in groups)
        {
            ScriptOptimizerScriptGroupConfig groupconfig = Config.GetGroup(groupName);
            // loop over each script, minify if necessary, and append to the output stream
            foreach (ScriptOptimizerScriptConfig scriptconfig in groupconfig.Scripts)
            {
                if (scriptconfig.Enabled)
                {
                    string FullScriptPath = context.Server.MapPath(scriptconfig.Path);

                    // we can choose to exclude scripts from the minification process:
                    // some libraries we don't need to debug and some scripts react poorly
                    if (scriptconfig.Minify)
                    {
                        MemoryStream FileJavaScript = new MemoryStream(Encoding.ASCII.GetBytes(File.ReadAllText(FullScriptPath)));
                        JavaScriptMinifier min = new JavaScriptMinifier();
                        min.Minify(FileJavaScript, context.Response.OutputStream);
                    }
                    else
                    {
                        context.Response.WriteFile(FullScriptPath);
                    }

                    // this helps correct for scripts that aren't properly terminated;
                    // that is fine when they are individual files, but causes issues when concatenated
                    context.Response.Write(";;");
                }
            }
        }
    }



I am using a modified version of the JSMIN C# class provided by Douglas Crockford. My only modification was to rework some of the input and output functions to make it easier to interface with the library.



Installation

First, you need to register the configuration section in your web.config (inside <configSections>):


    <section name="ScriptOptimizer" type="Pragmatix.ScriptOptimizer.ScriptOptimizerConfigurationHandler, CDXLibrary"/>



Next, register the HttpHandler under the <httpHandlers> section like so:


    <add verb="*" path="ScriptOptimizer.pragmatix" type="Pragmatix.ScriptOptimizer.ScriptOptimizerHttpHandler, CDXLibrary"/>



You can choose any path you like; my choice was pretty arbitrary. One thing to remember is that, while this will work fine in Visual Studio (using the built-in Cassini web server), it will fail in IIS 6 unless you register whatever path you chose to be handled by the ASP.NET ISAPI DLL. IIS 7 doesn't have this problem if you use integrated mode, since all extensions are routed through .NET.





Once this is configured, define your script groups, add the references in your pages, and you should be good to go.

The Results

The Firebug windows below show the results of adding the script optimizer to a page. The number of blocking script requests is greatly reduced, shortening the "JavaScript load ladder" and reducing page load time by almost two thirds. (NB: This is running locally on a test server with Firebug and Fiddler running. 9.91 seconds is not a good production page time!)

Before: [Firebug screenshot]

After: [Firebug screenshot]

Future Plans

This same technique could be applied to combine and compress CSS, and possibly HTML fragments if you load HTML dynamically on the client side. Eventually I plan to add a server control to replace the script reference tag, so that in debug mode I can return multiple script references for easier searching through code in Firebug. I would also like to do some server-side caching of the resultant output, thus reducing the overhead of running JSMIN and reading all the files from disk.
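As a rough idea of what the server-side caching could look like, here is a small in-memory sketch. None of this exists in the current code: CombinedScriptCache and GetOrBuild are hypothetical names, the transform delegate stands in for the minifier, and rebuilding whenever a source file's last-write time changes is just one possible invalidation strategy.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public class CombinedScriptCache
{
    private class Entry
    {
        public string Stamp;    // paths plus last-write times the output was built from
        public string Content;  // combined (and transformed) script text
    }

    private readonly Dictionary<string, Entry> m_Entries = new Dictionary<string, Entry>();
    private readonly object m_Lock = new object();

    // Returns the cached output for cacheKey, rebuilding it when any of the
    // source files has a different last-write time than when it was built.
    public string GetOrBuild(string cacheKey, IList<string> scriptPaths, Func<string, string> transform)
    {
        StringBuilder stampBuilder = new StringBuilder();
        foreach (string path in scriptPaths)
        {
            stampBuilder.Append(path).Append('|')
                        .Append(File.GetLastWriteTimeUtc(path).Ticks).Append(';');
        }
        string stamp = stampBuilder.ToString();

        lock (m_Lock)
        {
            Entry entry;
            if (m_Entries.TryGetValue(cacheKey, out entry) && entry.Stamp == stamp)
                return entry.Content;

            StringBuilder combined = new StringBuilder();
            foreach (string path in scriptPaths)
            {
                combined.Append(transform(File.ReadAllText(path)));
                combined.Append(";;"); // same terminator the handler writes between scripts
            }

            entry = new Entry();
            entry.Stamp = stamp;
            entry.Content = combined.ToString();
            m_Entries[cacheKey] = entry;
            return entry.Content;
        }
    }
}
```

The cache key would most likely be the requested group list, so "jQuery" and "jQuery,PragmatixBase" are cached separately. Gzipped output would need its own entry (or be compressed after the cache lookup).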

The Files


If you want to try this out for yourself, here is the source code.
