The Need for Readability
Many people in the world now use mobile devices to browse the web. It's fast, convenient, and absolutely indispensable when you absolutely have to prove that you are correct in an argument. However, early on in the mobile browsing craze, my father asked me (and he wasn't the only one), "Why would anyone choose to use the internet on such a tiny screen?" Probably the most obvious reason is the convenience of being able to look up restaurants, stores, reviews, or virtually anything, while you're trying to find a place and you're disconnected from a traditional desktop/monitor setup. Or, for some people, being able to play Angry Birds or watch a TV show while outside sitting in a hammock is enough reason to want the web on your mobile device.
However, small text is difficult to read. If you are unable to read text, surfing the web really isn't that much fun. It's a lot of frustration and irritation - a commonly enjoyable task now becomes a disliked chore. How can we get around this? Well, zooming is a common way to alleviate something that is too small - effectively including a magnifying glass in the software you write. So, if something is too small on the screen - fonts, images, smileys, shortcuts, etc. - you simply increase the number of pixels these objects take up, effectively making them larger. This works reasonably well, assuming, of course, you have enough data. But zooming has its own issues. If we zoom in on a piece of text, we're making that text larger, along with all the surrounding text. This is what we want, but what if by zooming in on the text to make it readable, we now have a smaller amount of text on the screen? That is, we have to scroll after having read only a small amount of information. If you have to scroll after having read only two words, it gets tiresome very quickly. Another issue is having to scroll in multiple directions. We're fairly content with scrolling vertically (or horizontally, if that's the only direction), but having to scroll horizontally AND vertically is a pain (if you don't believe me, try reading a PDF paper on your mobile device with the default reader - after the first half page, I get annoyed at having to scroll horizontally AND vertically).
Enter Font Inflation, Stage Left
Font Inflation1 is the term for the Gecko implementation of an adjustment made to text during Reflow2. The basic idea is that we divide up our set of frames into some which are text containers, and then increase the size of the fonts used for text within these containers. Font Inflation, as a concept, is not actually unique to Gecko. A form of this concept, called text size adjustment, is also implemented in WebKit, specifically for iOS devices3. The tricky parts are determining what constitutes a container for font size inflation, and enlarging text without drastically changing the page layout.
Non-technical readers (or technically savvy readers who aren't interested in the explicit details) might want to skip this part, as it goes over how font inflation works within Gecko.
David Baron developed the basic font inflation algorithm and processing in Gecko4. There are really two basic stages in which font inflation plays a role: Frame Construction and Reflow. During the phase of Frame Construction, we mark some frames as font inflation containers, and a subset of these as font inflation flow roots. A font inflation container is a frame that is an ancestor of frames containing text, with the added condition that font inflation containers are never line participants (e.g. inline frames such as <b>, and line breaks), because we want font inflation to be consistent within a line (i.e. we don't want font inflation to adjust a line like this). You can think of font inflation containers as block frames containing some amount of text to be inflated. If there is more than one frame that represents a particular node in the content tree, only the outermost frame is a font inflation container. Font inflation containers happen to be the smallest unit of text for which we can disable font inflation entirely.
A font inflation flow root, on the other hand, is a slightly different beast. It's a frame that is a font inflation container, where we want to start aggregation of font inflation data. In other words, it's a bit of an artificial construct designed to fine-tune the heuristics we use to factor out things like copyright notices at the bottom of pages, and other areas of text where font inflation isn't desired. In order to be a font inflation flow root, a frame must satisfy the following conditions:
- It must be a font inflation container.
- It must establish a new block formatting context5.
- It must be either6:
  - Absolutely positioned
  - or, floating
  - or, a table cell
  - or, the root frame
If these conditions are met, then during frame construction, the frame is set as a font inflation flow root. This essentially means that this is the beginning of an area of "text flow" that we want to inflate. (Ideally, the content within the frames underneath a given flow root should be "connected" in the user's mind. An example might be different paragraphs of a single section of a single article. Unfortunately, we can't determine semantic connections, at least not without assistance from a more robust language than HTML 4.01.) In other words, this is a subtree of the frame tree for which we want a separate font inflation statistical-aggregate data structure. It isn't uncommon for the root frame to be the only font inflation flow root in a document.
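The conditions above can be written down as a small predicate. This is only a sketch in Python, not Gecko's actual C++ code: the Frame class and all of its field names are hypothetical stand-ins invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """Hypothetical stand-in for a layout frame (not Gecko's real frame class)."""
    is_inflation_container: bool = False
    establishes_bfc: bool = False          # new block formatting context
    is_absolutely_positioned: bool = False
    is_floating: bool = False
    is_table_cell: bool = False
    is_root: bool = False


def is_font_inflation_flow_root(frame: Frame) -> bool:
    """Check the three listed conditions in order."""
    if not frame.is_inflation_container:
        return False
    if not frame.establishes_bfc:
        return False
    return (frame.is_absolutely_positioned or frame.is_floating
            or frame.is_table_cell or frame.is_root)


root = Frame(is_inflation_container=True, establishes_bfc=True, is_root=True)
paragraph = Frame(is_inflation_container=True)  # a container, but not a flow root

print(is_font_inflation_flow_root(root))
print(is_font_inflation_flow_root(paragraph))
```

The sketch deliberately leaves out the SVG and XUL special cases mentioned in the notes.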
Since I'm a visual person, I thought I'd give a visual example of what we're talking about here. You can take a look at the example at link borked. If you're currently on Firefox for Android, then you should see the font inflated page. If you're on desktop Firefox, open a new tab, type about:config, search for inflation, and set
200. You'll then need to reload the example page. (A quick warning: this will inflate any pages you visit, so you'll want to turn it off again after you've reloaded the example).
If you're not interested in testing the font inflation for yourself on this example, here's an idea of what it looks like after these inflation configuration settings have been enabled:
So why weren't the sidebar and footer inflated? Shouldn't they have been inflated as much as possible, just like the main article text? The answer to these questions lies in the work that was done to specify regions of layout that collect separate font inflation data (the flow roots mentioned above). This example is similar in structure to the layout of the New York Times website, which we had issues with regarding font inflation abnormally inflating the footer of the page (see "Footer Text", in the Exceptions section, below, for context as to why this works the way it does). In order to better understand how this process works, it's useful to look at the example alongside a representation of the frame tree that is built within Gecko at the time of page layout. A non-inflated version of this page looks something like this:
This shows the example page rendered without font inflation. The background colors indicate which frame the particular element corresponds to in the frame tree image (below).
I've color-coded the different frames so that it's easier to coordinate them with the frame tree, which looks something like the following:
This shows the frame tree for the example web page, condensed for brevity and clarity.
As you can see from the frame tree diagram, the root frame and the two block frames with id "main" and "sidebar" are our font inflation flow roots. This means that font inflation data is aggregated, starting at each of these frames, and not including the other font inflation flow root frames, or their respective subtrees. The font inflation containers are the block frames "body", "page", and "footer". Since the descendants of the "footer" frame are the only content within the "root frame" font inflation flow root, and the text isn't sufficient (the meaning of this might not be clear until the lineThreshold setting is discussed below), it isn't inflated. For the same reason, the text underneath the "sidebar" flow root isn't inflated. The text underneath the "main" frame, however, is sufficient, and thus is inflated as expected.
In other words, you can think of the font inflation flow roots as areas of the page that collect statistical data regarding the size (i.e. text amount) of their children, in total. This data is then used to determine whether or not the font inflation functionality should take effect for the child frames beneath these roots. They are also used to determine the maximum width (and thus the minimum text size), if font inflation is enabled for a given frame beneath the flow root. Font inflation containers are used to disable font inflation entirely for a smaller chunk of text than the flow roots.
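To make the aggregation concrete, here is a rough Python sketch of how text might be tallied per flow root without crossing into nested flow roots. The Node class, the exact nesting of the frames, and the character counts are all assumptions made up for this illustration; they are not taken from Gecko or from the real example page.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical frame-tree node; names and fields are illustrative only."""
    name: str
    text_chars: int = 0            # text directly inside this frame
    is_flow_root: bool = False
    children: List["Node"] = field(default_factory=list)


def aggregate_text(flow_root: Node, totals: Dict[str, int]) -> None:
    """Record the text under flow_root, excluding nested flow roots."""
    total = 0
    stack = [flow_root]
    while stack:
        node = stack.pop()
        total += node.text_chars
        for child in node.children:
            if child.is_flow_root:
                aggregate_text(child, totals)  # nested root gets its own bucket
            else:
                stack.append(child)
    totals[flow_root.name] = total


# Assumed layout: root > body > page > {main, sidebar, footer}
main = Node("main", text_chars=3000, is_flow_root=True)
sidebar = Node("sidebar", text_chars=120, is_flow_root=True)
footer = Node("footer", text_chars=60)
page = Node("page", children=[main, sidebar, footer])
body = Node("body", children=[page])
root = Node("root", is_flow_root=True, children=[body])

totals: Dict[str, int] = {}
aggregate_text(root, totals)
print(totals)
```

With these made-up numbers, only the footer's text is counted toward the root flow root, which mirrors why the footer in the example ends up with too little text to be inflated.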
Once we've initialized our frames to have the appropriate state bits and such, we perform a font size inflation calculation during reflow. First, given a particular target frame for which we want to inflate fonts, we find its font size inflation flow root, and compute the width. In order to compute the width that descendant frames will use from a given flow root, we find the nearest common ancestor of the first and last pieces of inflatable text within a given flow root. We then use the width of this ancestor frame, which might not be the flow root itself. Then, given the two preferences for font size inflation (discussed below), we compute the minimum font size that will satisfy these parameters and will fit within the given container width. Using this minimum font size, we map all font sizes in the target frame within the range (0-150%] of the minimum font size to the range [100-150%] of the minimum font size7.
With a minimum font size of 20.0 px, 12.0 px gets mapped to 23.33 px.
Why not simply map everything under the minimum font size to the minimum font size itself? Because we want to be able to preserve differentiation between fonts at a size less than the minimum. So, for example, if we have text that is 12pt, with headers that are 16pt and 14pt, we want to make sure that all of these fonts, once adjusted, can be distinguished from one another.
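Here is a small Python sketch of that range mapping. It assumes the map is linear, which the text above does not actually state, so treat the exact numbers as illustrative; they may not match values quoted elsewhere in this post. The function name inflated_size is made up for this example.

```python
def inflated_size(font_size: float, min_font_size: float) -> float:
    """Map sizes in (0, 1.5 * min] onto [min, 1.5 * min].

    ASSUMPTION: a linear map. Sizes above 150% of the minimum are
    considered big enough already and are returned unchanged.
    """
    if min_font_size <= 0 or font_size >= 1.5 * min_font_size:
        return font_size
    # A size near 0 lands at min; a size of 1.5 * min lands at 1.5 * min.
    return min_font_size + font_size / 3.0


sizes = [12.0, 14.0, 16.0]          # body text and two header sizes
inflated = [inflated_size(s, 20.0) for s in sizes]
for before, after in zip(sizes, inflated):
    print(f"{before:.1f} -> {after:.2f}")
```

All three sizes end up at or above the minimum, yet they remain distinct from one another, which is the differentiation the paragraph above is after.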
One thing I glossed over here is that we only perform a font size inflation calculation during reflow if font inflation is enabled for the frame in question. This actually ends up being an important point, because there are certain situations where we want to disable font inflation for certain frames. The exceptional cases, where we want to detect a certain (somewhat general) category of web page layout and disable font inflation, either for individual frames or entire pages, are detailed in the next section.
The basic algorithm works really well, but there are still some situations where we want to limit or disable font inflation logic. Together, these situations comprise the bulk of the heuristics Gecko abides by when determining font inflation settings for a given set of frames. Development of the font inflation feature is essentially the refinement of these heuristics. Putting them together, and verifying which ones work well without too drastically impacting the performance of the rendering engine, as well as testing for correctness and general usability has been the goal of the font inflation project for the past 6+ months. Unfortunately, these don't all fit very easily into a single area of the code, so it's difficult to tell someone to look at the logic in X class on line Y if he/she wants to learn about how the font inflation heuristics work.
The following sections briefly summarize the most pressing issues that have confronted font inflation over the last few months. I try to explain these situations as clearly and concisely as possible. In addition, I try to give code references and documentation links that you can look to for more information if you'd like to see how something was implemented, or perhaps would like to try to fix something that you think is broken with the font inflation algorithm as a whole. That said, keep in mind that only as a whole do these heuristics work to bring you the font inflation feature that is new in Firefox 14.0 and the all-new Firefox for Android (Fennec), so any individual piece may be only one part of the process, or even nonsensical, when taken by itself.
Small Text Bits
The most notable (perhaps notorious is a better descriptor) bug we encountered when fine-tuning the heuristics is in what we've come to refer to as "the New York Times footer case" (Bug 706193). In this case, the New York Times site was inflated in certain areas (conforming as expected to the basic algorithm), but the algorithm was over-inflating the footer text (the text at the bottom of the page with links to different sections of the New York Times), causing it to wrap.
The difference in footer text size versus the article text size was incredibly noticeable, and quite distracting for users.
This applies to situations where small bits of text are in the document, and where the layout of the page is dependent on these small bits of text. Usually, site authors may not have explicitly defined a height on these bits, because there wasn't previously a possibility that the text could wrap. Once we inflate the size of the text, however, this assumption is no longer valid.
To fix this, we utilize a technique where a threshold value is specified. This threshold is an amount of text for which font inflation is disabled if the threshold is not met. However, we couldn't just use a specific amount of text within a block directly. If this were the case, then adjacent blocks within a document that had differing amounts of text would have differing font inflation settings, resulting in possibly strange rendering of text sizes. To better adjust for this, a line threshold was added as part of the fix for this bug. The line threshold works like this - for each block formatting context, we construct a set of data by scanning the items contained within that block formatting context (BFC). As we scan, we accumulate the text in each of the frames contained in the BFC - if we find sufficient text that has the same font size for inflation, then we set a bit in the font inflation data structure that enables font inflation for that size font.
An astute reader will realize this sounds a lot like what was described in the "Basic Algorithm" section above. In fact, this is the case. Since it's such an important aspect of how David rewrote the font inflation code, I consider it part of the "Basic Algorithm," and describe it twice. One thing that I didn't explain in detail above, though, was what makes the text amount "sufficient." The preference that was added for controlling this is somewhat unintuitive. The name of the preference is font.size.inflation.lineThreshold, and it controls "the percentage of a number of lines of text we'd need to trigger inflation"8. What that means is that if each of our characters has a width of 1em (most characters don't - they are usually narrower than this), then a value of 100 for this preference means we'd need 1 line of 1em-width text to trigger font inflation. Since we know the width and the size of the font for the text, we can determine how much text will fit on a line, based on the em-size of the text. By default (at the time of this writing), the value for this threshold is 400, which means we need approximately 4 lines of text (under the assumption that all characters are square, which means that we'll actually only need slightly under 2 lines of normal text) to trigger font inflation.
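The arithmetic behind the threshold can be sketched in a few lines of Python. This follows the "every character is 1em wide" simplification described above. The function name, its parameters, and the choice of >= for the comparison are assumptions for illustration, not Gecko's implementation.

```python
def has_sufficient_text(char_count: int,
                        container_width_px: float,
                        font_size_px: float,
                        line_threshold: int = 400) -> bool:
    """Decide whether there is enough text to trigger inflation.

    ASSUMPTION: every character is treated as a 1em-wide square, so one
    line holds container_width / font_size characters.
    """
    chars_per_line = container_width_px / font_size_px
    required_chars = (line_threshold / 100.0) * chars_per_line
    return char_count >= required_chars


# A 480px-wide block of 12px text holds 40 "square" characters per line,
# so the default threshold of 400 asks for 4 lines, i.e. 160 characters.
print(has_sufficient_text(200, 480.0, 12.0))   # a paragraph of article text
print(has_sufficient_text(60, 480.0, 12.0))    # a short footer link list
```

Because real characters are narrower than 1em, the same 160 characters would occupy fewer than 4 rendered lines, which is why the effective requirement is lower in practice.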
One of the first caveats to the font inflation algorithm that was recognized by David Baron was that block-level elements that have a constrained height cannot perform font inflation in the same way as blocks with unconstrained heights. The point of font inflation in the first place is to increase readability while discouraging the use of horizontal scrolling. This second part is important in this case, because if we want to avoid horizontal scrolling, then at some point, we're always going to have constrained width. This means that as text gets larger, there will be more lines of text, since wrapping will happen more frequently. This is, in large part, due to the fact that the ultimate upper bound on the width, the screen size, remains constant. Thus, since we have more lines of text, we're going to grow in the block direction - i.e. the height of our blocks will grow as font inflation deviates more and more from the original font size.
This issue was addressed in the original font inflation bug with a new frame state bit, NS_FRAME_IN_CONSTRAINED_HEIGHT. If the font inflation algorithm detects that a frame has this bit set, it does not enable font inflation for that frame. Interestingly, we'd also need to handle this case if we were to implement reflow-on-zoom (see the bottom part of note 1 for more details about reflow-on-zoom), so this is a problem that plagues both of the major readability enhancement solutions.
A related, but slightly different, bug had to do with constrained sizes with form controls. This was actually the symptom of a more general case, wherein if a frame was to be inflated, but between that frame and its nearest ancestor font inflation container on the frame tree there was a frame representing a non-inline element with constrained height or width (because these can't wrap), then font inflation should not happen on that frame. This sounds complex, but, in reality, it's a pretty simple case of determining whether we can actually inflate, based on size restrictions.
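The general case can be pictured as a walk up the frame tree. The following Python sketch uses a hypothetical Frame class and only checks the frames strictly between the target frame and its nearest container ancestor, as the description above puts it; the real code in Gecko may differ in where the walk starts and in what counts as constrained.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Frame:
    """Hypothetical frame; field names are illustrative, not Gecko's."""
    name: str
    parent: Optional["Frame"] = None
    is_inflation_container: bool = False
    is_inline: bool = False
    has_constrained_size: bool = False   # fixed height or width


def inflation_allowed(frame: Frame) -> bool:
    """Return False if a constrained, non-inline frame sits between
    frame and its nearest font inflation container ancestor."""
    ancestor = frame.parent
    while ancestor is not None and not ancestor.is_inflation_container:
        if not ancestor.is_inline and ancestor.has_constrained_size:
            return False
        ancestor = ancestor.parent
    return True


block = Frame("block", is_inflation_container=True)
control = Frame("select", parent=block, has_constrained_size=True)
option_text = Frame("text", parent=control)
bold = Frame("b", parent=block, is_inline=True)
bold_text = Frame("text", parent=bold)

print(inflation_allowed(option_text))   # blocked by the fixed-size control
print(inflation_allowed(bold_text))     # only an inline frame in between
```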
Sites that are already optimized for mobile (e.g. m.twitter.com) likely don't need additional inflation, since the web developer already has adjusted the font size to fit on a phone-sized screen. These sites can often be detected by looking for the mobile-specific tag <meta name="viewport"> in the header of the document. However, this was a bit tricky for us, because the code that detects the <meta name="viewport"> tag was in the front-end code of Firefox, rather than in the platform, Gecko. As part of Bug 706198, we pulled the processing of the <meta name="viewport"> element into Gecko and added two conditions that disable font inflation in what we consider to be "mobile" cases.
The first case depends on the default zoom attribute, which corresponds to the initial-scale attribute of the viewport metadata element. What this indicates is how "zoomed in" the viewport is initially upon page load. Typically, this value is inferred from other settings (such as when width or height are set) in the viewport metadata, so it's not usually specified directly, unless a developer wants to use a specific zoom setting9. If this default zoom attribute is greater than or equal to 1.0, then we assume that this was set explicitly, or inferred from the width/height being set, which are typically done on mobile-optimized websites. Thus, in this case, we disable font inflation. Similarly, we disable font inflation on sites where the width or height attributes of the viewport meta tag are set to device-width.
There's also something somewhat subtle about the logic for this condition - if the doctype string contains the text "WML", "WAP", or "Mobile", or if the meta tag <meta name="handheldFriendly"> has its content equal to "true", we return early from our viewport metadata processing. Since the defaultZoom in this case is initially set to 1.0, font inflation will be disabled under these conditions as well. The former of these indicates that the page is using either the XHTML Mobile Profile or the Wireless Markup Language10, and the latter indicates that the site is optimized for mobile using the old AvantGo standard for Palm devices (being phased out now, I believe).
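Putting the "mobile" conditions together, the decision might look roughly like the Python sketch below. The function and parameter names are hypothetical, and treating a default zoom of 0.0 as "not specified" is an assumption made for this example; it is not how Gecko's viewport code is actually structured.

```python
MOBILE_DOCTYPE_MARKERS = ("WML", "WAP", "Mobile")


def inflation_disabled_for_mobile(doctype: str = "",
                                  handheld_friendly: str = "",
                                  default_zoom: float = 0.0,
                                  width: str = "",
                                  height: str = "") -> bool:
    """Return True when the page looks mobile-optimized.

    ASSUMPTION: default_zoom == 0.0 stands for "no zoom specified".
    """
    looks_like_mobile_doctype = any(m in doctype for m in MOBILE_DOCTYPE_MARKERS)
    if looks_like_mobile_doctype or handheld_friendly == "true":
        # Early-return path: the zoom keeps its initial value of 1.0.
        default_zoom = 1.0
    if default_zoom >= 1.0:
        return True
    return "device-width" in (width, height)


print(inflation_disabled_for_mobile(doctype="-//WAPFORUM//DTD XHTML Mobile 1.0//EN"))
print(inflation_disabled_for_mobile(width="device-width"))
print(inflation_disabled_for_mobile(default_zoom=0.5, width="980"))
```

The first two calls describe pages that are treated as mobile-optimized, while the last one describes a desktop-style page that would still be eligible for inflation.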
Bug 758079 shows another, more difficult bug to deal with. Bullets and ordered list numerals are rendered in the space given to either the left (for left to right text) or right (for right to left text) of the list item. Specifically, this ends up being placed in the margins of the block element in which the list is embedded. Unfortunately, this isn't associated in the code directly with the list item itself. That means that if the list item is inflated, the space to the left or right (where the bullet/list numeral is rendered) is not inflated. By default, the area we have to render a list numeral or bullet icon is 40px. This normally works fine, but once we start playing with the font size in the rendering engine, we end up with numerals that might be larger than the amount of space we have in which to render the text. In this case, the text gets clipped somewhere before the start of the block element's boundary.
Ideally, we'd solve this problem by linking the thing that constructs the indentation of the list item with the list item itself within the platform. This could be a quite massive change, though, and would require that we significantly reconstruct how we deal with bullets and lists. Instead, we opted for a bit more pragmatic approach: increase the amount of margin space in the event of font inflation. We increase the margin by multiplying the default margin size by the same ratio as the text in the bullet is inflated.
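As a quick worked example of that fix, here is the margin scaling in Python. The 40px default comes from the text above; the function name and the sample font sizes are made up for illustration.

```python
DEFAULT_LIST_MARGIN_PX = 40.0   # default space available for the bullet


def inflated_list_margin(original_font_px: float,
                         inflated_font_px: float,
                         margin_px: float = DEFAULT_LIST_MARGIN_PX) -> float:
    """Scale the margin by the same ratio as the bullet's text."""
    ratio = inflated_font_px / original_font_px
    return margin_px * ratio


# If a 12px list item is inflated to 24px, the margin doubles as well.
print(inflated_list_margin(12.0, 24.0))
```

Scaling the margin with the text keeps a wide numeral such as "100." from being clipped once it is rendered at the larger size.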
Incorrect Flow Roots
As a final exception to the basic algorithm, I'm going to discuss a problem that we don't yet have a solution for: incorrect flow root assignment. Sometimes, we actually don't want a table cell or float to be a flow root. It can cause issues like what happens at ycombinator.com when reading comments (Bug 707195):
Because some sites utilize tables for layout (indentation in the case of ycombinator.com) rather than for actually displaying tabular data, there exists a problem where the same type of syntactic node is used for two vastly different things. When indentation is controlled by tables by using nested tables, the width of each individual cell gets smaller with each nesting. This causes comments that come later in the thread (and thus are indented more) to be inflated less, and comments that come earlier in the thread to be inflated more, simply due to width restrictions. Because each table cell is an individual font inflation flow root, we can't accumulate data across table cells to consistently apply font inflation to the entire set of table cells.
Somehow, we need to add heuristics to determine when a table cell is used for layout, and thus should not be a font inflation flow root, and when it should be a flow root. This isn't unique to tables, either. Reddit.com also experiences a similar problem:
The all-new Firefox for Android, nicknamed "Fennec" or, sometimes, "Fennec Native" has shipped. The new readability enhancements described above are included in this version of the software. This is fantastic, and it makes readability for the web on the Android platform better than ever. But, the work isn't done yet. Font inflation is a great feature, but it's not the only aspect of readability that we want to include in future versions of Firefox. Crisper, clearer fonts are also on the docket, along with an investigation of how reflow on zoom might be incorporated into our product. Another area of current development is Reader Mode11. Reader Mode takes the text content of an article, strips away aspects of the page that might not be relevant to a user while trying to read the article, and places the resulting text onto a more readable background, with an easier-to-read font.
Let's face it - much of what we do on the web is reading. We read news articles to learn about current events. We read recipes in order to prepare that fantastic salmon dinner your boyfriend/girlfriend wants. We read through instructions on how to fix that misfiring cylinder in our car. Almost every web page you visit has some text on it - it's integral to the way we communicate and live. We want to make that experience not just better, but better than any other browser on the market. We literally (no pun intended) want everyone -- from your 102-year-old grandmother to your 4-year-old nephew who just learned to read -- to be able to interpret clear text in Firefox as easily as if they were reading a piece of paper. Our goal is to develop the features that will make you want to use Firefox as your daily browser because of how it renders text.
Notes and References
A special thanks to Daniel Holbert and David Baron for proofreading this post and pointing out issues they saw. Guys, I really appreciate your help and guidance!
This is sometimes called "text size adjustment," or just "text size adjust." Font inflation is the name given to the Gecko implementation of this feature. Note that this is actually different than font-size-adjust, a CSS property that is used to preserve the aspect value (relative height of lowercase letters as compared to uppercase equivalents). It's quite confusing, since both the text-size-adjust and font-size-adjust properties exist, and are different. It can also be confused with Reflow on zoom, another technique for increasing readability, which triggers the reflow process when a user double-taps to zoom. The key to this approach is that the user has effectively given us more information about what he/she is interested in, and thus we are able to make that text larger at the expense of the rest of the page (which is outside the viewport). Unfortunately, this approach doesn't play nicely with panning after zooming, nor does it work quite right with pinch-to-zoom interfaces. Opera and the Android stock browser use the latter technique for readability, whereas the former is used by Google Chrome, Firefox, and Safari. ↩
Reflow is the process by which frames representing rectangular areas containing the content of a webpage, are laid out (i.e. given a width and height, and x and y locations on the screen) by the rendering engine of a web browser in preparation for display on the screen. Rendering a webpage happens essentially in a set of phases: 1) construction of a data structure representing content (element) tree, 2) parsing and construction of the style data structure, 3) combination of the style data structure and content tree to construct frame (or render) tree, 4) reflow, which includes placing and sizing frames, and 5) Painting/Compositing, which is the process of utilizing the frame tree, content tree, and style structures to produce individually-colored pixels, which are then drawn on the screen. Much of what we call "layout" code in Gecko is specifically targeted toward algorithms that create the frame tree and place content within frames during the reflow process. You can see an example of how reflow works (visually) by watching the following video: Gecko Reflow Visualization. ↩
Adjusting the Text Size (2011). Safari Web Content Guide, iOS Developer Library. Retrieved 28 June 2012, from http://developer.apple.com/library/ios/#DOCUMENTATION/AppleApplications/Reference/SafariWebContent/AdjustingtheTextSize/AdjustingtheTextSize.html ↩
Font inflation as a feature in Gecko was originally developed under Bug 627842: Allow minimum font size based on size of frame, which landed on mozilla-central on 23 November, 2011. It wasn't included in a release until the new Firefox for Android, which happens to be Firefox 14.0. ↩
A container that establishes a new Block Formatting Context is a container inside which individual block frames are laid out vertically (i.e. in the "block" direction). A new block formatting context is created with an absolutely positioned frame, a float, a block container that's not a block box, and a block box with an overflow setting other than 'visible'. All of the items in condition 3 establish a new block formatting context, so this condition is somewhat redundant, and serves only so that this can be described in more detail. The visual formatting section of the CSS v2 specification has more information on block formatting contexts at http://www.w3.org/TR/CSS2/visuren.html#block-formatting. ↩
There are a couple of other situations where a font inflation flow root is established. Namely, nsOuterSVGFrame (frames that bound the outermost <svg> element in an SVG document), nsSVGForeignObjectFrame (frames within an SVG document containing SVG Foreign Objects), nsBoxFrame (frames containing XUL boxes), and nsLeafBoxFrame (a generic class containing logic that is shared by the frames displaying XUL trees, XUL images, and XUL text boxes), but these are glossed over since they are reasonably special cases, and aren't probably of interest to the reader. ↩
This algorithm might change in the future. Currently, the algorithm has a somewhat flat curve in terms of font differentiation, and we'd like it to be a bit steeper, so that different font sizes are more readily distinguishable after inflation. See Bug 777089 for more information. ↩
Bovens, Andreas (2011). An Introduction to meta viewport and @viewport. Retrieved June 28, 2012, from http://dev.opera.com/articles/view/an-introduction-to-meta-viewport-and-viewport/. ↩
Rocha, Lucas (2012). Reader Mode in Firefox Mobile. Retrieved June 29, 2012, from http://lucasr.org/2012/06/21/reader-mode-in-firefox-mobile/. ↩