An API for Browser Screenshots

What do the following screenshots all have in common?

From Amazon’s Cloud Reader Installation:

From the University of Virginia’s guide to setting proxy settings in Firefox:

From HootSuite’s TwitterBar acquisition announcement:

From Tecca’s Guide to Internet Explorer:

That’s right: they all include portions of browser chrome. (Chrome 13, Firefox 3, Firefox 4 for Windows, and Internet Explorer 9, I believe.)

What else do these screenshots have in common? They will all one day be out of date (if they aren’t already). As soon as Google modifies their extension installation dialog, or Mozilla changes their proxy settings tab, or the Firefox address bar gets a new background color, these screenshots will no longer accurately represent the interaction through which they’re meant to guide the user.

A Modest Proposal

I propose that this problem of stale browser screenshots could be alleviated by the creation of a Web service that exists solely to serve semi-dynamic screenshots of browser chrome. Allow me to explain with examples.

The Amazon screenshot above could be replaced with a call like this:

<img src="http://browsers.foo/addons/installation?highlight=confirm&w=460&h=60" />

Or the TwitterBar image could use this URL instead:

<img src="http://browsers.foo/toolbar/?include=url-bar,icon&icon=http://foo.com/hoot.png&highlight=icon" />

(Note the idea of being able to merge existing images into the screenshots.)

The IE add-ons dialog screenshot could just as easily call this URL:

<img src="http://browsers.foo/addons/tracking-protection?browser=ie&version=9&highlight=easy-list" />
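
To make the URL scheme a little more concrete, here is a rough Python sketch of a helper that assembles these URLs. The `browsers.foo` host and the parameter names come from the examples above; the `screenshot_url` function itself is purely hypothetical, since no such service exists.

```python
from urllib.parse import urlencode

# Placeholder host taken from the example URLs; not a real service.
BASE_URL = "http://browsers.foo"


def screenshot_url(path, **params):
    """Build a URL for the hypothetical screenshot service."""
    base = f"{BASE_URL}/{path.strip('/')}"
    if not params:
        return base
    # Leave commas, colons, and slashes unescaped so values such as
    # include=url-bar,icon and icon=http://... stay readable.
    return f"{base}?{urlencode(params, safe=',:/')}"


print(screenshot_url("addons/installation", highlight="confirm", w=460, h=60))
print(screenshot_url("addons/tracking-protection",
                     browser="ie", version=9, highlight="easy-list"))
```

The two calls print the same query strings as the Amazon and IE examples.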

The API would automatically use the user’s user-agent to determine what browser, version, and platform to show in the screenshot (although these could also be specified manually, as seen in the IE example). If images from the exact current version aren’t available, the most recent version could be used instead.
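
As a minimal sketch of that lookup, assuming a made-up catalog of which versions have screenshots and a deliberately naive set of user-agent patterns, the server side might look something like this in Python. The `AVAILABLE` table, `detect`, and `resolve` are all illustrative names, and real user-agent parsing would need to cover far more cases.

```python
import re

# Hypothetical catalog: screenshot versions on hand per (browser, platform).
AVAILABLE = {
    ("firefox", "windows"): [3, 4],
    ("chrome", "windows"): [13],
    ("ie", "windows"): [9],
}

# Naive patterns; Chrome is checked first because its UA also mentions Safari.
PATTERNS = [
    ("chrome", re.compile(r"Chrome/(\d+)")),
    ("firefox", re.compile(r"Firefox/(\d+)")),
    ("ie", re.compile(r"MSIE (\d+)")),
]


def detect(user_agent):
    """Guess (browser, major version, platform) from a user-agent string."""
    if "Windows" in user_agent:
        platform = "windows"
    elif "Mac OS X" in user_agent:
        platform = "mac"
    elif "Linux" in user_agent:
        platform = "linux"
    else:
        platform = None
    for browser, pattern in PATTERNS:
        match = pattern.search(user_agent)
        if match:
            return browser, int(match.group(1)), platform
    return None, None, platform


def resolve(user_agent, browser=None, version=None):
    """Pick the screenshot set to serve, honoring manual overrides."""
    detected_browser, detected_version, platform = detect(user_agent)
    if browser is None:
        browser = detected_browser
        if version is None:
            version = detected_version
    versions = AVAILABLE.get((browser, platform), [])
    if not versions:
        return None  # nothing on file for this browser/platform
    if version not in versions:
        version = max(versions)  # fall back to the most recent version
    return browser, version, platform
```

With this catalog, a Firefox 6 visitor on Windows would be shown the Firefox 4 screenshots, since those are the most recent ones available.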

I think that a couple dozen high-resolution, high-quality screenshots of the various windows and dialogs in each major browser version on each major platform, combined with metadata defining the position of key elements in those screenshots (e.g., the home button, the address bar, the History menu), would be enough for calls to this service to replace 90%+ of the browser-specific screenshots on the Web.
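
Here is one way the position metadata could be put to work, sketched in Python. The element names, pixel coordinates, and the `crop_around` helper are all invented for illustration; the idea is just that a request for a `w` by `h` image highlighting one element can be answered by cropping the master screenshot around that element's box.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    left: int
    top: int
    width: int
    height: int


# Hypothetical metadata for one toolbar screenshot (pixel coordinates).
TOOLBAR_ELEMENTS = {
    "home-button": Box(110, 8, 28, 28),
    "url-bar": Box(150, 8, 500, 28),
}


def crop_around(element, out_w, out_h, image_w, image_h):
    """Return a crop box of at most out_w x out_h centered on the element,
    shifted as needed so it stays inside the image."""
    center_x = element.left + element.width // 2
    center_y = element.top + element.height // 2
    left = min(max(center_x - out_w // 2, 0), max(image_w - out_w, 0))
    top = min(max(center_y - out_h // 2, 0), max(image_h - out_h, 0))
    return Box(left, top, min(out_w, image_w), min(out_h, image_h))


# A 460x60 request against an 800x44 toolbar image, centered on the URL bar.
print(crop_around(TOOLBAR_ELEMENTS["url-bar"], 460, 60, 800, 44))
```

The same box data could also drive the highlight overlay, so each master screenshot would only need to be annotated once.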

What do you think?

Is this a solution in search of a problem, or is it a legitimately useful idea? I think it would be worth its development costs just for organizations like Mozilla or Google to use in order to populate their help documents with screenshots that would always be up to date. Tell me what you think in the comments below.

13 comments on “An API for Browser Screenshots”

There is a gtk-vector-screenshot tool to make SVG, PDF or PS screenshots of GTK-3:
http://www.joachim-breitner.de/projects#gtk-vector-screenshot

Haven’t looked at it much yet (GTK-3).

August 26, 2011 at 7:32 am

I’ve lost count of how many times I’ve seen documenters working with a browser application and they’ve purchased a license for some special screenshot application and have been assigned weeks or months to create a document, almost all of which is subsumed by screenshots.

Anecdotally, I’m confident there’s a lot of merit in this idea. It’s a solution for a problem that people have been addressing with an alternate technique that is thoroughly inefficient.

August 26, 2011 at 7:59 am

This seems interesting; however, I see ways that this feature could be used to create security attacks that trick the user into giving up secure info. For example, take your dynamic screenshot, overlay it with some form element, and phish away. This could result in the user giving up info such as their sync username, password, and key.

August 26, 2011 at 9:09 am

Christopher Finke says:

Good insight, Kevin. Definitely something to consider.

August 26, 2011 at 11:00 am

Let me pull out my shoebox with all my memories.
I understand the appeal of better, up-to-date information. There is a trade-off in making things live, which is the loss of memory. A live system without its temporal archiving counterpart is not always good.

You could have a system à la

//example.org/mywonderfuldoc/last

with a temporary redirect to the dated version

//example.org/mywonderfuldoc/20110826

etc.

The automatic “screenshot” is dependent on many things and specifically the solidification of a universal UI. It is likely that a few people will customize their UI and everything will fall apart.

It has security issues too.

User agent sniffing is likely to break and it is not easy to maintain. Too many devices/browsers. More than 1000 for Opera alone.

What could be done, maybe, is to have an API that is an abstraction layer over how we name the different parts of the browser; with that, we would be able to show the relevant part of an element to the user. The issue is that I’m pretty sure the chrome is highly dependent on the platform. You enter a giant maze of difficulties.

Even if not perfect, the simple screenshot seems to have a lower cost if combined with good information management. Photo shoeboxes ;)

August 26, 2011 at 12:01 pm

I think the problem I see here is that if the UI has changed enough that the user is likely to be confused by an outdated screenshot, then it will probably have changed enough that anything you’re layering on top is no longer correctly placed or even relevant.

August 26, 2011 at 12:15 pm

I see a more useful API for screencasts that people watch specifically to learn how to do something. Seeing their own UI in that would reduce confusion without being a security risk, and instead of a video screencast it could be SVG SMIL animated, so the same animation would run just with different styles for UI, synced with an track.

August 26, 2011 at 3:42 pm

That last bit was meant to be “synced with an audio track”, lost due to me typing it as html tag. This would also allow choosing both an audio track and UI in the language detected by the useragent, or manually of course.

August 26, 2011 at 3:45 pm

In practice, for use for something like help docs, I’d be fine with the automation, but instead of automatically updating, I’d want it flagged for approval in a queue so I can see what’s going up before it does.

Should something go awry such that the automated shot is inaccurate, inappropriate or embarrassing, I’d rather have the slightly older browser shot up than have the wrong one up (for who knows how long).

August 26, 2011 at 9:47 pm

This is an intriguing idea. One of the issues about Firefox’s new rapid release schedule for enterprise support is the (unpredictable) cost of updating documentation if there are UI changes. However, as Caspy7 alludes, doc maintainers would want approval before updates go live. If the screenshot matches the current software, but no longer matches the accompanying text, that could cause as much user confusion as an outdated screenshot (with matching text). I could see this as a subscription service, where the author gets notified, either through email or through an integration in their CMS or help authoring tool, that an updated screenshot is available. This also has the advantage of automatically informing the author of *which* screenshots have changed and which haven’t, thereby reducing some of the uncertainty associated with version updates.

August 28, 2011 at 10:48 am

Definitely a solution in search of a problem. Because a screenshot isn’t just a screenshot – it’s an image within the context of some documentation. And if the screenshot becomes obsolete, some human is going to have to review the whole thing anyway, to judge whether the process being documented has changed.

Less Talk, More Do

Christopher Finke writes about things he has done: software, woodworking, and other creative endeavors.
