Adapted from “Better DITA Graphics for a Multi-Screen World”, Best Practices Volume 16, Issue 5, October 2014. While the article is addressed to users of DITA XML, the principles can be applied to any structured content environment.
Good visual communication is essential. But, as our image libraries grow, making graphics work well in an increasing range of screen sizes, devices, and contexts becomes a challenge. We realize that we can’t keep manually tweaking different versions for different outputs. We need a new approach to get our graphics looking good across multiple outputs and screens. Image processing tools are smart enough to handle this automatically; we just need to set up the rules. But rather than starting with the technology, the key to success is to first apply good information architecture principles: the same needs analysis, planning, and content type definitions that we’d apply to any structured content solution. This article is intended for information architects and others involved in planning a graphics strategy.
Visual communication has always been valuable in technical communication. It illustrates relationships, demonstrates subtle concepts, and removes ambiguity. It is true that graphics have sometimes been overused, culminating in the style of software manual that featured a UI screenshot alongside each painstaking hand-holding step. However, as the balance has tilted back towards providing context and supporting decisions, most graphics are once again serving a useful purpose.
The changing nature of business communications means that visual communication becomes ever more important. Organizations are realizing the true value of brand — not just in the sense of corporate fonts and colors, but also in the fact that all customer touchpoints, including product content, reflect the personality of the organization. Content needs to be visually appealing as well as pragmatically effective.
Another factor is the dominance of the web and mobile devices in customer information-seeking and corporate communication. There is an increasing expectation that web content should work well on whatever device the user chooses to view it with. As Karen McGrane says in Content Strategy for Mobile:
You don’t get to decide which platform or device your customers use to access your content: they do.
With more users choosing to bring mobile devices to work — or do work at home — it is crucial that all of our content works well on smaller screens, including graphics. Often, this means ensuring that our images work well in a responsive web design. Even where content is not available on the web, it may be delivered via a mobile application, with similar needs for mobile-ready images.
The challenges of multiple screens
Unlike HTML-based text, which is reflowable, images present a number of challenges when they need to work across various displays. The most obvious challenge is that images should generally fit the width of the screen, to avoid the need for horizontal scrolling. A simplistic and common approach is to set the CSS max-width property of images to 100 percent, so that they never overflow the boundaries of the parent block element (typically a column or the full viewport width). However, when an image is taller than it is wide, for example a portrait aspect screenshot, this approach breaks down, as the image can take up most of a smartphone or tablet display. Cropped images can also be problematic, as they can appear grossly out of proportion. There needs to be some way of identifying the appropriate scale for each image: some metadata about the nature of the image content that goes beyond merely calculating the dimensions of the image itself.
Another recent challenge is that high-resolution or “Retina” displays used on many devices work best with correspondingly high-resolution graphics. However, these graphics can increase the bandwidth and storage space needed, so it is often not advisable to use the same resolution for every device. It may be necessary to have two versions of each graphic available: one for high-resolution displays and one for lower resolutions. In HTML5, the selection of an appropriate image is now possible with the srcset attribute. However, it is still necessary to create the images in the first place. (Note that wholly vector-based images do not suffer from this problem, as they can be resized without losing quality. Nevertheless, as soon as you need to use a raster image, even if embedded into a vector file format, the same concerns apply.)
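As a rough illustration of how two renditions might be referenced once they exist, the following Python sketch builds an img element with a srcset attribute offering 1x and 2x candidates. The function name and the "@2x" filename convention are assumptions made for this example, not something prescribed by the article.

```python
from html import escape


def img_tag(regular_src, retina_src, alt):
    """Build an <img> element whose srcset offers 1x and 2x candidates."""
    # Hypothetical helper: the browser picks the candidate that matches
    # the device pixel ratio, falling back to src if srcset is unsupported.
    srcset = f"{regular_src} 1x, {retina_src} 2x"
    return (
        f'<img src="{escape(regular_src, quote=True)}" '
        f'srcset="{escape(srcset, quote=True)}" '
        f'alt="{escape(alt, quote=True)}">'
    )


print(img_tag("save-icon.png", "save-icon@2x.png", "Save"))
```

The markup only selects between files; both renditions still have to be generated beforehand.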
Finally, even when an image fits into the viewport and takes an appropriate amount of bandwidth and storage space, the details of the image may not be clear on a small screen. The lack of clarity is a particular problem with diagrams and images containing text. It may be necessary to use multiple images with different degrees of cropping or arrangements of the information for different display sizes.
DITA and adaptive images
While DITA solutions have brought huge efficiency and cost savings to many organizations, it seems that our approach to graphics is a relic of the Desktop Publishing (DTP) era. Just as we used to tweak font kerning, insert hair spaces, and adjust margins to get our document designs just right, so we still take an artisanal approach towards working with graphics. While it is possible to scale, resize, and crop images manually, this is not suitable for many DITA solutions where a large quantity of content needs to be processed in a standardized, automated way. The time spent tweaking can quickly mount up. If we imagine a conservative case where each image version requires 2 minutes’ work, and there are 50 images per publication, 3 outputs, and 10 languages (assuming images are localized), the time spent on each publication could be as much as 50 hours!
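The arithmetic behind that estimate can be checked in a few lines. The variable names below are just labels for the figures quoted above.

```python
minutes_per_version = 2
images_per_publication = 50
outputs = 3
languages = 10

# Every image needs one version per output per language.
total_versions = images_per_publication * outputs * languages
total_minutes = total_versions * minutes_per_version
total_hours = total_minutes / 60

print(total_versions, total_minutes, total_hours)  # 1500 3000 50.0
```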
There is certainly a will to use graphics in better ways, but the challenges can seem insurmountable. In a recent informal survey with 61 respondents, fewer than half of respondents (47 percent) catered for multiple screen sizes, and more than half (61 percent) indicated that it was difficult to make graphics work across different outputs and screen sizes. This perceived difficulty seems to be having a real effect on the content that DITA users produce, as nearly half of respondents (49 percent) said they would use more graphics if it were easier to make them work across various outputs and screen sizes. One respondent even commented:
We exclude most graphics from HTML output.
What is needed is a new approach to handling graphics — one that applies the knowledge gained from DITA’s semantic tagging and separation of content from presentation.
An IA approach to graphics rendition
We tag pieces of textual content semantically, not presentationally. That often means using tags to indicate the role that these elements play in the surrounding context. For example, we use a <shortdesc> tag to indicate that the contents succinctly describe the purpose or scope of the overall topic. It doesn’t indicate that the contents should be formatted as a separate paragraph. Inline formatting can sometimes be more appropriate, especially when the shortdesc is contained within a longer element.
In the same way, we can attach metadata to graphics that indicates the role they play rather than the specific size or crop needed for any one presentational context. When used in an automated graphics rendition solution, this metadata then enables rule-based processing that generates appropriate renditions for each output. For example, a designer can set metadata on an icon to indicate that it is for use inline. When our toolchain encounters this metadata, it can then apply rules to control the display height of the icon appropriately, so it neither appears too tall, forcing text lines apart, nor too short and hence hard to see. These rules could be:
- For web output, create a “regular” version 25 pixels high and a “retina” version 50 pixels high, naming them appropriately so they can be referenced within an HTML5 srcset attribute, enabling responsive images.
- For PDF output, control the display height using the nominal PPI metadata on the generated image. Divide the height in pixels by the required display height in inches for inline icons, and set the PPI property to the resulting value. The PDF formatter then bases the display size on this value.
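The two rules above can be sketched in Python as follows. The function names are hypothetical, and the 0.125 inch display height is an assumed figure chosen only to make the arithmetic visible. The 25 and 50 pixel heights come from the web rule.

```python
def web_heights(regular_px=25):
    """Pixel heights for the regular and retina web renditions."""
    return {"regular": regular_px, "retina": regular_px * 2}


def nominal_ppi(height_px, display_height_in):
    """PPI value that makes an image height_px tall print at display_height_in inches."""
    if display_height_in <= 0:
        raise ValueError("display height must be positive")
    return height_px / display_height_in


print(web_heights())           # {'regular': 25, 'retina': 50}
print(nominal_ppi(50, 0.125))  # 400.0
```

Because the PDF rule changes only metadata, the pixel data of the source image is left untouched.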
While developing such a solution may seem dauntingly technical, in fact the hardest part is defining the requirements clearly. Once that is done, it is a relatively straightforward task for a developer to script graphics processing tools and integrate the scripts into a publishing toolchain. The following sections break down the process of specifying requirements into six clear, achievable steps.
1. Audit your graphics
The first step is to build a detailed picture of your graphics usage. Using a spreadsheet or any other convenient tool, gather as many examples as you can of the graphics you are using. Then, go through these examples with key stakeholders to determine the role that each graphic plays.
This review is an excellent chance to focus on effective image usage. While graphics can be incredibly useful, they are expensive to produce and may not always be effective. Weed out:
- Images that are purely decorative.
- Images that may impede learning instead of facilitating it (for example, where a product diagram or screenshot may divert users’ attention from interacting with the product itself).
- Images that may be ambiguous or even misleading. In a presentation at DITA Europe 2011, Marie-Louise Flacke gave the example of a poisonous substance container which featured a skull and crossbones symbol to indicate the danger. However, some children misinterpreted this to mean “pirate food”, with potentially very serious consequences.
Once you have weeded out unnecessary images, loosely categorize the remainder according to the visual role they play in the document.
|Inline icon||Needs to fit well into inline text|
|Definition table icon||Larger than the inline icon, as the extra space in a table allows for more detail to be shown.|
|Screenshot||Can take up to the whole column / container width, but care needed so it doesn’t look out of proportion with the text.|
|System diagram||Most of these have significant detail. Care needed to preserve this detail on a small screen.|
Note that this is not semantic tagging as such, but rather a way of abstracting the visual role of the image from the specifics of size, display parameters, and graphics file format.
2. Define content types and required outcomes per output
Now is the time to decide how each image should appear in each presentational context. Looking at the loose categorizations from the previous step and with examples of each presentational context in front of you, consider the following:
- Do you need to use graphics inline within text? If so, what is the ideal display height so as not to force lines apart or reduce legibility?
- For larger, block-level graphics, what scale or proportion requirements do you have? Do you have portrait-aspect images or cropped screenshots that would look odd if scaled to the full container width? Do you need to keep consistent proportions across a whole docset or site, even if source images come in a variety of sizes?
- Is there a danger that information-rich diagrams or photos will be illegible if scaled down? Would cropping improve these images, and if so, can you find some consistency in the requirements (for example, do the focal point and the important information in an image tend to be in one location, for one content type at least)? Alternatively, is there potential to automate the highlighting or otherwise manipulate the images to bring out the important information?
Once these and similar considerations have been addressed, record the requirements. A format that works well is a table with presentational contexts in the columns and graphics content types in the rows.
|Source image types||Output image type and corresponding rendition actions|
|Description||File type||PDF||Web (low resolution)||Web (high resolution / retina)|
|Inline icon||PNG||Control display height. Inline icons should be as large as possible without overlapping or overly pushing out surrounding lines of text. Table icons should be bigger but still a set height. Needs further research as to exact sizes.|
|Table definition icon|
|Smartphone screenshot||PNG||Control display size. Size should be based on 1.8 inches wide for an uncropped, portrait aspect screenshot. Preserve fidelity of original image.||Control display size. Size should be based on 280 pixels wide for an uncropped, portrait aspect screenshot.||Control display size. Size should be based on 560 pixels wide for an uncropped, portrait aspect screenshot.|
|Diagram||SVG||Use full container width. In a later phase, consider whether highlighting can be scripted in illustration tool.|
3. Research the transformations
Once you have established how your images should appear in outputs, you need to research tools and the commands necessary to generate those results. The basic requirement is a graphics processing tool that can be run via the command line and hence can be scripted.
A commonly used open source tool for automated graphics transformations is ImageMagick, available from Imagemagick.org. This tool offers a wide variety of commands, particularly for use with raster graphics. For vector graphics, particularly transforming vector graphics to high-quality raster images, another useful tool is Batik from Apache. If you have identified a need for more advanced vector graphics manipulation such as highlighting certain areas or rearranging diagrams, illustration tools such as Adobe Illustrator or Inkscape should be considered.
Once you have arrived at the commands and parameters you will be using, record them in a similar format.
|Source image types||Output image type and corresponding rendition actions|
|Description||File type||PDF||Web (low resolution)||Web (high resolution / retina)|
|Inline icon||PNG||To get required PPI, divide height in pixels by specified display size for that content type. Then set PPI.||Set height, based on content type. Use ImageMagick to set height.|
|Table definition icon|
|Smartphone screenshot||PNG||Set PPI only, based on content type. Use ImageMagick to set PPI.||Resize and convert to Q90 JPG, based on content type. Resize by percentage according to content type, limited by max width.|
|Diagram||SVG||[Set width with custom XSL]||Convert to Q90 JPG. Output size equates to maximum possible container size. Convert using Batik rasterizer.|
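The exact commands depend on your tool versions and requirements, so the following Python sketch only shows how the actions in the table might be assembled as command lines. It assumes ImageMagick's classic convert entry point (ImageMagick 7 uses magick instead) and a Batik rasterizer jar at a hypothetical path. The function names, file names, and numeric values are illustrative, and nothing is executed here.

```python
import shlex

BATIK_JAR = "batik-rasterizer.jar"  # hypothetical path to the Batik rasterizer jar


def set_height_cmd(src, dst, height_px):
    # An "x<height>" geometry resizes to that height and keeps the aspect ratio.
    return ["convert", src, "-resize", f"x{height_px}", dst]


def set_ppi_cmd(src, dst, ppi):
    # Changes only the resolution metadata; the pixel data is not resampled.
    return ["convert", src, "-units", "PixelsPerInch", "-density", str(ppi), dst]


def resize_to_jpg_cmd(src, dst, percent, max_width_px):
    # Scale by percentage, then shrink further only if still wider than the cap.
    return ["convert", src, "-resize", f"{percent}%",
            "-resize", f"{max_width_px}x>", "-quality", "90", dst]


def rasterize_svg_cmd(src, dst_dir, width_px):
    # Batik rasterizer: -m output MIME type, -q JPEG quality, -w width, -d destination.
    return ["java", "-jar", BATIK_JAR, "-m", "image/jpeg", "-q", "0.9",
            "-w", str(width_px), "-d", dst_dir, src]


for cmd in (
    set_height_cmd("save_icon.png", "out/save_icon.png", 25),
    set_ppi_cmd("login_phone.png", "out/login_phone.png", 400),
    resize_to_jpg_cmd("login_phone.png", "out/login_phone.jpg", 50, 280),
    rasterize_svg_cmd("network_diag.svg", "out", 960),
):
    print(shlex.join(cmd))
```

Building the commands as argument lists rather than shell strings avoids quoting problems with characters such as ">" in the geometry argument. The lists can be passed to subprocess.run once the parameters have been agreed.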
4. Decide on a naming scheme
For an automated solution to work, each image needs to have attached metadata that indicates the content type. Image formats have various ways of attaching metadata, and there is also the possibility of using an external manifest file. However, a simple, reliable, and universal way of attaching metadata is using suffixes on the image filenames themselves. Every major high-level language is capable of processing this information without any use of additional libraries. There is also less danger of the metadata becoming detached from the image.
You need to decide on the separator character to use (a character that you won’t use in the descriptive part of the filename, for instance), and the actual suffix you’ll use for each content type.
|Image type||Filename suffix|
|Table definition icon||
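To show how little code is needed to read this kind of metadata, here is a minimal Python sketch. The underscore separator and the suffix names (icon, ticon, phone, diag) are hypothetical choices made for this example. Substitute whatever scheme you settle on.

```python
from pathlib import PurePath

SEPARATOR = "_"  # assumed separator; must not appear in the descriptive part
SUFFIX_TO_TYPE = {  # hypothetical suffixes, one per content type
    "icon": "inline-icon",
    "ticon": "table-definition-icon",
    "phone": "smartphone-screenshot",
    "diag": "diagram",
}


def content_type(filename):
    """Return the content type encoded in the filename suffix, or None."""
    stem = PurePath(filename).stem  # drop directories and the extension
    _, sep, suffix = stem.rpartition(SEPARATOR)
    if not sep:
        return None  # no separator, so no content type metadata
    return SUFFIX_TO_TYPE.get(suffix)


print(content_type("save-button_icon.png"))    # inline-icon
print(content_type("images/login_phone.png"))  # smartphone-screenshot
print(content_type("logo.png"))                # None
```

Returning None for unrecognized names lets the toolchain fall back to a default rendition or flag the file for review.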
5. Decide on the overall solution architecture
There are two major approaches to automated graphics rendition in DITA. One is to render graphics on the fly at publishing time.
This approach has the advantage of flexibility: any changes to your graphics renditions will take effect the next time you publish. However, one disadvantage could be slower publishing, depending on the specific transformations. A way to alleviate slower publishing would be caching: the first time a graphic is requested for any output, it is rendered on the fly and then cached. It would only be re-rendered if the rendition settings or the image itself changed.
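One way to implement such a cache is to derive the cache key from both the image bytes and the rendition settings, so that a change to either forces a fresh rendition. The Python sketch below makes that assumption. The function names are hypothetical, and the render callback is a stand-in for the real graphics processing step.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def cache_key(image_bytes, settings):
    """Hash the image content together with its rendition settings."""
    digest = hashlib.sha256()
    digest.update(image_bytes)
    digest.update(json.dumps(settings, sort_keys=True).encode("utf-8"))
    return digest.hexdigest()


def get_rendition(image_bytes, settings, cache_dir, render):
    """Return a cached rendition, rendering and storing it on a cache miss."""
    cached = Path(cache_dir) / cache_key(image_bytes, settings)
    if cached.exists():
        return cached.read_bytes()
    rendered = render(image_bytes, settings)
    cached.parent.mkdir(parents=True, exist_ok=True)
    cached.write_bytes(rendered)
    return rendered


calls = []


def fake_render(image_bytes, settings):
    # Stand-in renderer that records each call instead of processing pixels.
    calls.append(settings)
    return image_bytes[::-1]


with tempfile.TemporaryDirectory() as tmp:
    get_rendition(b"pixels", {"height_px": 25}, tmp, fake_render)
    get_rendition(b"pixels", {"height_px": 25}, tmp, fake_render)  # cache hit
    get_rendition(b"pixels", {"height_px": 50}, tmp, fake_render)  # settings changed

print(len(calls))  # 2
```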
The other approach, which is possible if you have a CMS that can store multiple resolutions for each image, involves rendering graphics on import to the system, as shown in Figure 3.
This approach is less flexible, as changes to the rendition parameters only take effect on existing graphics once they have been re-imported. However, it has two advantages. One is performance: it is very quick to retrieve the appropriate rendition for each graphic at publishing time. The other is that some CMSs make it straightforward to define one resolution as your preview or thumbnail resolution, which can then be used by authoring tools when previewing or editing topics.
Make sure to use a configuration file to define the actual rendition values (for example the degree of scaling for a particular content type), rather than hardcoding values.
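As a sketch of what such a configuration might look like, the following Python example reads rendition values from JSON. The key names and the JSON layout are assumptions. The numbers are the inline icon and smartphone screenshot figures used earlier in this article. In practice the text would live in a separate file rather than in a string.

```python
import json

# Hypothetical configuration layout, shown inline so the example is self-contained.
CONFIG_TEXT = """
{
  "inline-icon": {"web_height_px": 25, "retina_factor": 2},
  "smartphone-screenshot": {"pdf_width_in": 1.8, "web_width_px": 280, "retina_factor": 2}
}
"""


def web_sizes(config, content_type):
    """Return the regular and retina pixel sizes for a content type."""
    settings = config[content_type]
    key = "web_height_px" if "web_height_px" in settings else "web_width_px"
    regular = settings[key]
    return {"regular": regular, "retina": regular * settings["retina_factor"]}


config = json.loads(CONFIG_TEXT)
print(web_sizes(config, "inline-icon"))            # {'regular': 25, 'retina': 50}
print(web_sizes(config, "smartphone-screenshot"))  # {'regular': 280, 'retina': 560}
```

Keeping the values outside the scripts means a designer can adjust a size without touching the code.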
6. After the solution is built, test everything
I have assumed that readers can either use the services of a developer or are able to build an automated graphics solution based on the requirements gathered in steps 1-5. However, as with all software development, it is crucial to test the results thoroughly. Each combination of graphics type with output should be checked, preferably by designers or other content creators, who will quickly notice issues, rather than someone who is only involved from a technical standpoint.
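A simple way to make sure no combination is missed is to generate the review checklist rather than write it by hand. This Python sketch assumes the four content types and three outputs used as examples in this article. The field names are hypothetical.

```python
from itertools import product

CONTENT_TYPES = ["inline icon", "table definition icon",
                 "smartphone screenshot", "diagram"]
OUTPUTS = ["PDF", "web (low resolution)", "web (high resolution)"]

# One checklist entry per combination of content type and output.
checklist = [
    {"content_type": content_type, "output": output, "checked": False}
    for content_type, output in product(CONTENT_TYPES, OUTPUTS)
]

print(len(checklist))  # 12
for entry in checklist:
    print(f"[ ] {entry['content_type']} in {entry['output']}")
```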
By applying information architecture techniques to graphics transformations, we can get a pushbutton solution that saves a great deal of time and effort. While I have given only one example, these techniques can be applied to a wide variety of requirements and could be extended to passing through graphics type information to the HTML markup to use in browser-based scaling and even the cropping approach described in this article: juliankussman.com/scaling/.
Of course, as with all automation, the impact on team dynamics cannot be ignored. Often when teams move from a DTP approach to content production to structured, modular content management, there are team members who feel that their position is threatened and they cannot have the same degree of creative freedom as before. In the same way, when aspects of graphics production are automated, content producers may believe that this restricts their degree of expression. In this case, the benefit to communicate clearly is the reduction in tedious, mechanical work that in fact allows more time for true creative input. Ultimately, this benefits customers, providing them with all the information they need — including the crucial information contained in graphics — in whatever context they need it.
I presented on this theme at the CMS/DITA North America 2014 conference.