Cache Busting via Gulp.js

Have you ever thought about how many HTTP requests your app is wasting? Many developers think the native caching mechanisms of browsers are sufficient. However, did you know every time a page is loaded, an HTTP request is still typically made for every single resource to confirm that the server doesn’t have a new version? The web server will return a 304 Not Modified if the resource hasn’t changed. You may not think this is a big deal, but every HTTP request has a cost, and they add up. As you can see, when you load a blog post here on my site, many files return a 304 if you’ve already visited:

A List of HTTP 304 Not Modified Responses

Now you may think this doesn’t matter much today in the age of high bandwidth, but all these 304s aren’t costly because of their size. They’re costly because of latency. You see, bandwidth has skyrocketed over the years, but latency was already approaching its theoretical limit back in the days of modems, and it has improved very little since. As Paul Irish outlined in Delivering the Goods in 1000ms, latency is an issue for all connections:

From https://www.youtube.com/watch?v=E5lZ12Z889k

The pain of latency is really magnified with mobile networks:

Wireless Latency from https://www.youtube.com/watch?v=E5lZ12Z889k

Factor in the pain and complexity of TCP slow start, and the verdict is clear: Worrying about page size isn’t enough. For the ultimate performance, we must minimize the number of HTTP requests our apps make.

Declaring War on Latency

So, it’s settled. If you want to provide users the ultimate in performance, you need to eliminate as many HTTP requests as possible. That means setting far-future Expires headers. This basically tells your visitors’ browsers “Never ask for this file again. Ever. I mean it.” The details vary by web server (here’s how in IIS and Apache), but you basically set an expiration date for your assets in the distant future so the browser won’t check the server for those resources again. Of course, eventually you’re going to change the file, so how do you tell everyone’s browsers that you didn’t mean it? That’s where cache busting techniques come into play. And while there are various ways to pull this off, I prefer using Gulp.
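
For example, if you happen to serve static assets from Node with Express (purely for illustration; IIS and Apache have equivalent settings), a far-future max-age looks something like this:

var express = require('express');
var app = express();

// Tell browsers to cache anything under /dist for a year without asking again.
// This is only safe because the cache busting described below changes the
// filename whenever the contents change.
app.use('/dist', express.static('dist', { maxAge: '365d' }));

app.listen(3000);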

Gulp makes it easy to quickly create powerful build scripts that read much like English. Piping streams together feels really natural for anyone familiar with Linux. And processing files in memory not only gives Gulp a significant performance advantage over Grunt, it also means the configuration is lighter and much easier to read.

Both Gulp and Grunt offer a massive list of plugins, so in either case the hardest part is deciding which mix of plugins best solves your problem. I recently created a cache-busting process in Gulp and tried two different methods. Let’s explore both approaches and consider their merits.

Option 1: Dynamically Inject Script Tag on the Server

This option utilizes the gulp-rev plugin. Gulp-rev versions your assets by appending a content hash to each filename. So, for example, if you run script.js through gulp-rev, it’ll spit out something like script-098f6bcd.js. Handy.

The big problem, of course, is how do you ensure that any files which referenced script.js now reference script-098f6bcd.js instead? Handily, the gulp-rev plugin can also generate a manifest.json file that maps each source filename to its new, dynamically generated filename. Using the above example, the manifest.json file would look like this:

{
 "script.js": "script-098f6bcd.js"
}
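
For reference, here’s a minimal gulpfile sketch (the paths and task name are just placeholders) that would produce both the hashed bundle and a manifest like the one above:

var gulp = require('gulp');
var rev = require('gulp-rev');

gulp.task('rev', function () {
    return gulp.src('scripts/script.js')
        .pipe(rev())                         // script.js -> script-098f6bcd.js (hash reflects the contents)
        .pipe(gulp.dest('dist'))             // write the hashed file
        .pipe(rev.manifest('manifest.json')) // write the mapping of original -> hashed filenames
        .pipe(gulp.dest('dist'));
});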

You can probably guess why this is handy. Now all you need to do is read this file to get the new, valid filename. Assuming you’re doing server-side rendering, you can simply use your server-side language to open this file, grab the value, and set the src attribute of the corresponding <script> tag accordingly. There are other approaches provided in the docs, but they all rely on the manifest.json mapping file in one way or another. I wasn’t in love with any of these approaches, so I created my own below.
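
Here’s a rough sketch of the idea, shown with Node/Express purely for illustration (the same pattern applies to any server-side stack, and the paths are hypothetical):

var fs = require('fs');
var express = require('express');
var app = express();

app.get('/', function (req, res) {
    // Look up the current hashed filename from gulp-rev's manifest.
    // (Read per request here for simplicity; you could cache it instead.)
    var manifest = JSON.parse(fs.readFileSync('dist/manifest.json', 'utf8'));
    res.send('<script src="/dist/' + manifest['script.js'] + '"></script>');
});

app.listen(3000);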

Option 2: Replace Script References via Regex

This approach is similar to the one above, but Gulp sets the <script> tag’s src attribute directly at build time instead. To pull this off, I dropped the gulp-rev plugin and simply created my own suffix using today’s date. I like the clarity of being able to see when the file was last built. But here’s the real win: instead of generating the <script> tag dynamically on the server like we did in option #1, we use gulp-replace to update the relevant src attribute immediately upon build.

Here’s an example gulpfile:

var gulp = require('gulp');
var concat = require('gulp-concat');
var replace = require('gulp-replace');
 
gulp.task('js', function () {
    var filename = 'script-' + getDate() + '.js';
    gulp.src('scripts/**/*.js')
        .pipe(concat(filename))
        .pipe(gulp.dest(paths.build));
 
    return gulp.src('Admin/Default.aspx', { base: './' }) //must define base so I can overwrite the src file below. Per http://stackoverflow.com/questions/22418799/can-gulp-overwrite-all-src-files
        .pipe(replace(/<script id="bundle".*><\/script>/g, '<script id="bundle" src="/dist/' + filename + '"></script>'))  //find the script tag with an id of bundle and replace it, src included.
        .pipe(gulp.dest('./')); //Write the file back to the same spot.
});

Let’s dissect this file.

  1. Lines 1-3 pull in Gulp and the plugins we need, and line 5 defines a task called js.
  2. Line 6 defines the new filename we want to use. The filename will end up being script-1-24-2015.js, for example, assuming that’s the current date. I left the guts of the getDate() function out for brevity (a possible implementation is sketched after this list), but it simply returns a string in the above format.
  3. Line 7 specifies a glob that will retrieve all .js files under the scripts directory. The resulting stream of files is piped through the following two lines.
  4. Line 8 concatenates all the files into a single file.
  5. Line 9 writes the concatenated file to the specified destination. Again, I left the initialization of paths.build out of this example for brevity.
  6. Line 11 starts a second stream that needs our attention. This is the file that needs to reference the dynamically named bundle we just generated. We specify the path to the file and also include a second parameter, base. This parameter is necessary so we can overwrite the file in place in the final step.
  7. Line 12 is where the magic happens. We use a regex to find the script tag with an id of bundle and replace it with one whose src points at the filename created on line 6. This way our page references the new filename.
  8. Line 13 writes the updated file back to disk, overwriting the original.
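
For reference, here’s one possible getDate() implementation that returns a string in that format:

// Returns today's date as a string like "1-24-2015" (month-day-year, no zero padding).
function getDate() {
    var now = new Date();
    return (now.getMonth() + 1) + '-' + now.getDate() + '-' + now.getFullYear();
}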

I prefer approach #2 for two reasons:

  1. It occurs at build time instead of at load time. In my opinion, a build concern shouldn’t add overhead to every request (as is the case with approach #1 above, since the path is resolved dynamically on each request).
  2. It’s more discoverable. With option #2, all the code that manipulates your application for the build resides within your gulpfile.

And now with this wired up, I can save my users from making HTTP requests just to see if resources have changed. Bandwidth and time are saved on every page load. When the time comes to update the file, I have a simple build task to generate a new filename and update the corresponding file(s) that reference it.

One final note: If you prefer for the filename to change only when the file contents have changed, you can continue to use the gulp-rev plugin and simply read the hashed filename it assigns from the manifest file (since the hash only changes when the file contents change). In my use case, the large bundled file nearly always changes with each release, so I preferred the simplicity of a date-based filename. Another common approach is to simply append a cache-busting querystring to a reference to a static filename, but that can cause issues with some proxies, so I recommend truly changing the filename to bust the cache as outlined above.

Have another approach to cache busting that you prefer? Please chime in via the comments.

6 Quick Tips for Presenting Code in Visual Studio

As a frequent conference presenter and attendee, I often see live code walk-throughs. There are a variety of tweaks you can make to optimize the Visual Studio experience for presenting code to others. It’s important to minimize visual distractions, size text properly, and practice your demo in a low resolution.

To that end, here are 6 quick tips for presenting in Visual Studio.

1. Run Full Screen

Hit Alt+Shift+Enter to toggle full screen mode. This hides the Windows taskbar, title bar, and many items in the Visual Studio UI such as Solution Explorer. This is especially helpful when presenting on low resolution projectors. As a side note, this mode is also handy for everyday coding, since hiding everything else on screen helps avoid distractions.

2. Configure Full Screen Mode

Visual Studio’s full screen mode hides all explorer windows by default, but you may want to show certain items to assist with the demo. Simply re-enable the portions of the UI you’d like to see while in full screen mode, and Visual Studio will remember those choices as a separate set of full screen settings from then on. Handy. (Also, remember you can open files by hitting Ctrl+, and typing the name, so you likely don’t need to display the Solution Explorer.)

3. Enable Presenter Mode

Presenter Mode is a simple feature in Productivity Power Tools. Once you’ve installed this extension, all you have to do is hit Ctrl+Q to place the cursor in the quick launch input, then type the word present. Select Present On/Present Off to toggle. You’ll find font sizes throughout the UI are increased.

Enable Visual Studio Presentation Mode via Quick Launch

4. Resize Further as Needed

In larger rooms with smaller projection screens, it may still be difficult for people in the back to read code, even with presenter mode enabled. In this case you can increase the font size of individual files by holding Ctrl and scrolling or via the zoom dropdown in the bottom left-hand corner of Visual Studio. When using the above techniques, this is rarely necessary, which is good because this setting doesn’t “stick” between files. If you need a more global solution, you can increase DPI scaling in Windows, though you may notice sizing quirks in any apps that haven’t been updated to support DPI scaling.

5. Zoom It

Zoom It is a free tool that allows you to zoom in on and annotate specific portions of the screen. Simply hit Ctrl+1 to zoom in. This can help draw attention to an area and is also useful when a piece of UI outside of Visual Studio can’t be scaled by other means. I’ve even seen some presenters use Zoom It throughout an entire presentation, though I find this a bit disorienting.

6. Use the Light Theme

The standard light theme of dark text on a white background is typically easiest to read on projectors. The lower contrast alternatives look great on a nice LCD, but often lack sufficient contrast for projector use. (Credit to @craigber)

Bonus tip

Consider creating multiple Visual Studio settings files and switching between them as desired.

Have other tips for presenting code? Please chime in below.

Two Quick TFS Performance Tips

Team Foundation Server (TFS) continues to improve, but one area I’ve struggled with recently is performance. I work in a very large codebase that knocks up against the 100,000 file limit with a single branch (yes, that’s a smell of bigger issues). Anyway, here are two quick TFS performance tips that may help you be more productive.

1) Create Separate Workspaces for Each Branch

Annoyed that TFS Source Control Explorer is slow? The culprit may be too many files in a single workspace. The solution is simple: create a separate workspace for each branch you’re working with, and name each workspace descriptively so you can easily switch between them. Mapping multiple branches in the same workspace will make Source Control Explorer extremely slow once you exceed 100,000 files (we hit that cap with a single branch). You can manage and rename your workspaces in Visual Studio here:

TFS Workspaces

2) Consider Not Creating a Feature Branch at All

If you’re working by yourself on a feature, you can simply work off main and create a shelveset each day to save your work until you’re ready to commit to the main line. This avoids the time-consuming overhead of branching. Just think about the overhead you take on with a feature branch:

  • Create branch
  • Get the entire repo again
  • Burn a ton more space on your hard drive due to duplicated physical files
  • Keep the branch updated via merges from main
  • Merge your changes back to main later
  • Switch between your branch and main so you can fix bugs in the main line

That’s a lot of pain that may not be necessary. Assuming you’re working by yourself, the only potential downside I see to using shelvesets off of main is this: bug fixes in main may be tricky if your new feature changes touch code related to a bug reported in a previous version. That’s a minor downside I find worth accepting, given all the time saved by avoiding the overhead listed above.

Shadow DOM vs iframes

I’m really excited about the new HTML5 Web Components Standard. The Shadow DOM is particularly interesting, as it finally gives us encapsulated markup and styling. This should radically decrease the complexity of our CSS and help us finally design and deliver reusable components that don’t conflict with one another. We now have the tools to design our own custom HTML elements that feel native and offer even greater power than today’s HTML elements.

However, at first I had to wonder: why do we need the Shadow DOM when we already have a way to encapsulate markup and styles using iframes? It’s an interesting point. Today iframes are commonly used to ensure separate scope and styling. Examples include Google’s embedded maps and YouTube videos.

However, iframes are designed to embed a complete, separate document within an HTML document. This means accessing the values of a given DOM element inside an iframe from the parent document is a hassle by design. The DOM elements live in a completely separate context, so you need to traverse the iframe’s DOM to access the values you’re looking for.

Contrast this with HTML5 web components, which offer an elegant way to expose a clean API for accessing the values of custom elements. Well-written web components that utilize the Shadow DOM are as easy to access and manipulate as any native HTML element. Try accessing the value of an input that lives in an iframe from the parent document; it’s a painful, clunky, and brittle process in comparison.
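
As a quick illustration (the element name and property below are made up, using the attachShadow/customElements syntax), a web component can keep its markup and styles encapsulated while still exposing a simple property to the page:

// A hypothetical <user-card> element: its markup and styles live in a shadow root,
// but the parent page talks to it through a plain property.
class UserCard extends HTMLElement {
    constructor() {
        super();
        var shadow = this.attachShadow({ mode: 'open' });
        shadow.innerHTML =
            '<style>p { color: steelblue; }</style>' +  // style stays scoped to this component
            '<p class="name"></p>';
    }
    get name() {
        return this.shadowRoot.querySelector('.name').textContent;
    }
    set name(value) {
        this.shadowRoot.querySelector('.name').textContent = value;
    }
}
customElements.define('user-card', UserCard);

// From the parent document, no iframe traversal required:
// document.querySelector('user-card').name = 'Cory';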

Imagine creating a page using a set of 5 iframes that each contain one component. Each component would need a separate URL to host the iframe’s content. The resulting markup would be littered with iframe tags, yielding markup with low semantic meaning that is also clunky to read and manage. Oh, and you’d also have to deal with the pain of properly sizing each iframe.

In contrast, web components support declaring rich semantic tags for each component. These tags operate as first class citizens in HTML. This aids the reader (in other words, the maintenance developer).

So while both iframes and the shadow DOM provide encapsulation, only the shadow DOM was designed for use with web components, and thus avoids the excessive separation, setup overhead, and clunky markup that occurs with iframes.

Excited about web components? Chrome and Opera already offer full support so get started!

Knockout Bindings are Evaluated Left to Right

I just resolved an odd behavior that tripped me up. Have you ever attached a click handler to a checkbox or radio button with Knockout and wondered why the old value is received in the click handler? Well, here’s the solution: in Knockout, bindings fire left to right.

So if you put your bindings in this order:

<input type="checkbox" data-bind="checked: value, click: funcToCall">

Then funcToCall will see the updated state for the checkbox as expected. However, if you reverse the binding order to this:

<input type="checkbox" data-bind="click: funcToCall, checked: value">

Then the click binding will fire before the checked binding, which means the checkbox’s new state won’t yet be reflected in your view model. So be sure to declare your bindings in a logical order!
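
For completeness, here’s roughly what the view model behind that binding might look like (the names are just for illustration):

function ViewModel() {
    var self = this;
    self.value = ko.observable(false);
    self.funcToCall = function () {
        // With checked declared before click, the observable already holds
        // the checkbox's new state by the time this handler runs.
        console.log('Checkbox is now', self.value());
        return true; // allow the default click action so the checkbox actually toggles
    };
}

ko.applyBindings(new ViewModel());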

An Epic Week of Development in Norway

tldr; NDC was amazing! I was a guest on .NET Rocks! My recorded sessions from NDC in Oslo, Norway are below. And I finally got to meet Uncle Bob!

I just had an amazing experience at my first ever international conference. I’m back from attending the Norwegian Developer Conference in Oslo, Norway. It was an exceptionally well-run conference with a few features I’ve never seen before. One standout was the overflow room, which let people watch 8 concurrent sessions at once! Great for those times when you can’t find a seat or you’re not sure which session to pick.

And I couldn’t have been more excited to finally meet one of my programming heroes, Uncle Bob Martin. We had a wonderful chat after his session and I thoroughly enjoyed getting to talk shop 1:1 with an author I’ve looked up to for so long. Thanks Bob!

I also got to see Scott Hanselman speak in person for the first time. Absolutely superb edutainment! And I got to meet Douglas Crockford and attend his excellent sessions as well!

.NET Rocks!

To top it all off, Carl and Richard from .NET Rocks invited me over for my first guest appearance on the show! I’ve enjoyed the show for years and was absolutely flattered to finally be invited to be a guest. They’re a lot of fun and two of the nicest guys one could hope to meet. The show was on Single Page Application Development and we dove into the unique challenges of SPA development in the automotive industry.

Give the .NET Rocks show a listen here!

Oh Yeah, I Spoke Too.

Finally, my sessions at NDC Oslo went great! It was standing room only in both sessions. A full room just makes speaking that much more fun by adding that extra spark of energy.

Crowd

Videos from both sessions I presented at NDC are now up on Vimeo.

This is a subset of my Pluralsight course. If you’ve seen the Becoming an Outlier course, you’ll find this contains some unique content, especially at the beginning. If you haven’t seen the course, this is really just a preview since it’s less than half the full course content.

And here’s the session we chatted about on .NET Rocks. This is a case study on the largest single-page application of my career. Many lessons were learned along the way!

In Summary…

There were some touchy points along the way…

But I’m not sure I’ve ever learned more at a conference. I’ll save my gushing on the amazing sessions I attended for a separate post. I feel so lucky to have been a part of it all!

The TDD Divide: Everyone is Right

I’ve been enjoying the back and forth regarding the Death of TDD on the interwebs. The intellectual volleying between “legalists” like Robert C. Martin (Uncle Bob) and “pragmatists” like David Heinemeier Hansson (DHH) is nothing short of fascinating. I have a tremendous amount of respect for both these gentlemen for different reasons, and I can see wisdom in each of their views.

DHH argues that an excessive fixation on unit testing has added indirection, abstraction, and conceptual overhead.

Don’t pervert your architecture in order to prematurely optimize for the performance characteristics of the mid-nineties. Embrace the awesome power of modern computers, and revel in the clarity of a code base unharmed by test-induced design damage.

Unit testing indeed isn’t free and, as DHH argues, provides questionable benefit to offset this cost. Yet he’s fixated on test speed being the reason that we code to an interface. I agree that fast integration tests that hit the database are much more practical in the age of SSDs and fast processors. However, speed isn’t the only reason coding to an interface has merit.

We also code to an interface so that:

  1. We can easily switch out the implementation behind the scenes.
  2. We can agree on the interface and have two separate teams handle each side of the interaction independently and concurrently.
  3. We can abstract away an ugly DB schema or an unreliable third party.
  4. We can write tests first and use them to help drive the design. Uncle Bob and Kent Beck see this as a core benefit. DHH sees it as test-induced design damage.

I again see the wisdom of both sides. TDD has been shown to reduce bugs and improve design. Yet every abstraction has a cost and must be justified. The T in TDD is *more code*. Be pragmatic. The fact is, in many kinds of software, occasional bugs are an acceptable risk. When they occur, fix them and move on.

Uncle Bob is fixated on craftsmanship, perfection, and centralized control. His brand is cleanliness and professionalism. The idea that the era of unit testing could feasibly be replaced by automated integration testing is unsurprisingly viewed as illogical heresy to existing thought leaders in the space.

“It is difficult to get a man to understand something, when his salary depends on his not understanding it.” – Upton Sinclair

Of course, this quote cuts both ways. Some have accused DHH of declaring TDD dead merely because unit testing is hard to do in the very framework he created: Rails.

Bottom line

One important thing to keep in mind: Uncle Bob sells consulting. DHH sells software. It’s a common divide. Software “coaches” like Uncle Bob believe strongly in TDD and software craftsmanship because that’s their business. Software salespeople like Joel Spolsky, Jeff Atwood, and DHH believe in pragmatism and “good enough” because their goal isn’t perfection. It’s profit. So if you want to build the most beautiful, reliable, scalable software, listen to the consultant. If you want to build a profitable product, listen to the salespeople too.

The world is a messy place. Deadlines loom, team skills vary widely, and the impact of bugs varies greatly by industry. Ultimately, we write software to make money and solve problems. Tests are a tool that help us do both. Consider the context to determine which testing style fits for your project.

Uncle Bob is right. Quality matters. Separation of concerns and unit testing help assure the utmost quality, speed, and flexibility.

DHH is right. Sometimes the cost of unit tests exceeds their benefit. Some of the benefits of automated testing can be achieved through automated integration testing instead.

My take: Search for the wisdom in both of these viewpoints so you can determine where unit testing has merit on your project.


What’s your take? Chime in via the comments below or on Reddit Programming. Like this article? Submit it to Hacker News