The Need for Readability
Many people in the world now use mobile devices to browse the web. It's fast, convenient, and absolutely indispensable when you absolutely have to prove that you are correct in an argument. However, early on in the mobile browsing craze, my father asked me (and he wasn't the only one), "Why would anyone choose to use the internet on such a tiny screen?" Probably the most obvious reason is the convenience of being able to look up restaurants, stores, reviews, or virtually anything, while you're trying to find a place and you're disconnected from a traditional desktop/monitor setup. Or, for some people, being able to play Angry Birds or watch a TV show while outside sitting in a hammock is enough reason to want the web on your mobile device.
However, small text is difficult to read. If you are unable to read text, surfing the web really isn't that much fun. It's a lot of frustration and irritation - a commonly enjoyable task now becomes a disliked chore. How can we get around this? Well, zooming is a common way to alleviate something that is too small - effectively including a magnifying glass in the software you write. So, if something is too small on the screen - fonts, images, smileys, shortcuts, etc. - you simply increase the number of pixels these objects take up, effectively making them larger. This works reasonably well, assuming, of course, you have enough data. But zooming has its own issues. If we zoom in on a piece of text, we're making that text larger, along with all the surrounding text. This is what we want, but what if by zooming in on the text to make it readable, we now have a smaller amount of text on the screen? That is, we have to scroll after having read only a small amount of information. If you have to scroll after having read only two words, it gets tiresome very quickly. Another issue is having to scroll in multiple directions. We're fairly content with scrolling vertically (or horizontally, if that's the only direction), but having to scroll horizontally AND vertically is a pain (if you don't believe me, try reading a PDF paper on your mobile device with the default reader - after the first half page, I get annoyed at having to scroll horizontally AND vertically).
Enter Font Inflation, Stage Left
Font Inflation is the term for the Gecko implementation of an adjustment made to text during Reflow[1:1]. The basic idea is that we divide up our set of frames into some which are text containers, and then increase the size of the fonts used for text within these containers. Font Inflation, as a concept, is not actually unique to Gecko. A form of this concept, called text size adjustment, is also implemented in WebKit, specifically for iOS devices[1:2]. The tricky parts are determining what constitutes a container for font size inflation, and enlarging text without drastically changing the page layout.
Non-technical readers (or technically savvy readers who aren't interested in the explicit details) might want to skip this part, as it goes over how font inflation works within Gecko.
David Baron developed the basic font inflation algorithm and processing in Gecko[1:3]. There are really two basic stages in which font inflation plays a role: Frame Construction and Reflow. During the phase of Frame Construction, we mark some frames as font inflation containers, and a subset of these as font inflation flow roots. A font inflation container is a frame that is an ancestor of frames containing text, with the added condition that font inflation containers are never line participants (e.g. inline frames such as <b>, and line breaks), because we want font inflation to be consistent within a line (i.e. we don't want font inflation to adjust a line like this). You can think of font inflation containers as block frames containing some amount of text to be inflated. If there is more than one frame that represents a particular node in the content tree, only the outermost frame is a font inflation container. Font inflation containers happen to be the smallest unit of text for which we can disable font inflation entirely.
A font inflation flow root, on the other hand, is a slightly different beast. It's a frame that is a font inflation container, where we want to start aggregation of font inflation data. In other words, it's a bit of an artificial construct designed to fine-tune the heuristics we use to factor out things like copyright notices at the bottom of pages, and other areas of text where font inflation isn't desired. In order to be a font inflation flow root, a frame must satisfy the following conditions:
- It must be a font inflation container.
- It must establish a new block formatting context[1:4]
- It must be either[1:5]:
- Absolutely positioned
- or, floating
- or, a table cell
- or, it's the root frame
If these conditions are met, then during frame construction, the frame is set as a font inflation flow root. This essentially means that this is the beginning of an area of "text flow" that we want to inflate. (Ideally, the content within the frames underneath a given flow root should be "connected" in the user's mind. An example might be different paragraphs of a single section of a single article. Unfortunately, we can't determine semantic connections, at least not without assistance from a more robust language than HTML 4.01.) In other words, this is a subtree of the frame tree for which we want a separate font inflation statistical-aggregate data structure. It isn't uncommon for the root frame to be the only font inflation flow root in a document.
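The conditions listed above can be read as a simple predicate. Here is a minimal sketch in Python, assuming a made-up `Frame` record with boolean fields. None of these names come from Gecko, and the real check lives in C++ frame construction code that surely looks different:

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """Hypothetical stand-in for a frame; the field names are illustrative only."""
    is_inflation_container: bool = False
    establishes_bfc: bool = False      # establishes a new block formatting context
    is_abs_positioned: bool = False
    is_floating: bool = False
    is_table_cell: bool = False
    is_root: bool = False


def is_inflation_flow_root(frame: Frame) -> bool:
    """Apply the three conditions from the list above, in order."""
    if not frame.is_inflation_container:
        return False
    if not frame.establishes_bfc:
        return False
    # Third condition: at least one of the four frame kinds must match.
    return (frame.is_abs_positioned or frame.is_floating
            or frame.is_table_cell or frame.is_root)
```

For instance, a floating container that establishes a block formatting context qualifies, while an ordinary block container with no positioning, float, table-cell, or root status does not.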
Since I'm a visual person, I thought I'd give a visual example of what we're talking about here. You can take a look at the example at link borked. If you're currently on Firefox for Android, then you should see the font inflated page. If you're on desktop Firefox, open a new tab, type about:config, search for inflation, and set
200. You'll then need to reload the example page. (A quick warning: this will inflate any pages you visit, so you'll want to turn it off again after you've reloaded the example).
If you're not interested in testing the font inflation for yourself on this example, here's an idea of what it looks like after these inflation configuration settings have been enabled:
So why weren't the sidebar and footer inflated? Shouldn't they have been inflated as much as possible, just like the main article text? The answer to these questions lies in the work that was done to specify regions of layout that collect separate font inflation data (the flow roots mentioned above). This example is similar in structure to the layout of the New York Times website, which we had issues with regarding font inflation abnormally inflating the footer of the page (see "Footer Text", in the Exceptions section, below, for context as to why this works the way it does). In order to better understand how this process works, it's useful to look at the example alongside a representation of the frame tree that is built within Gecko at the time of page layout. A non-inflated version of this page looks something like this:
This shows the example page rendered without font inflation. The background colors indicate which frame the particular element corresponds to in the frame tree image (below).
I've color-coded the different frames so that it's easier to coordinate them with the frame tree, which looks something like the following:
This shows the frame tree for the example web page, condensed for brevity and clarity.
As you can see from the frame tree diagram, the root frame and the two block frames with id "main" and "sidebar" are our font inflation flow roots. This means that font inflation data is aggregated, starting at each of these frames, and not including the other font inflation flow root frames, or their respective subtrees. The font inflation containers are the block frames "body", "page", and "footer". Since the descendants of the "footer" frame are the only content within the "root frame" font inflation flow root, and the text isn't sufficient (the meaning of this might not be clear until the lineThreshold setting is discussed below), it isn't inflated. For the same reason, the text underneath the "sidebar" flow root isn't inflated. The text underneath the "main" frame, however, is sufficient, and thus is inflated as expected.
In other words, you can think of the font inflation flow roots as areas of the page that collect statistical data regarding the size (i.e. text amount) of their children, in total. This data is then used to determine whether or not the font inflation functionality should take effect for the child frames beneath these roots. They are also used to determine the maximum width (and thus the minimum text size), if font inflation is enabled for a given frame beneath the flow root. Font inflation containers are used to disable font inflation entirely for a smaller chunk of text than the flow roots.
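To make the aggregation concrete, here is a small Python sketch that mirrors the example frame tree. The `Node` class, the function name, and the character counts are all invented for illustration. The only thing taken from the description above is the rule that each flow root collects text from its subtree without descending into other flow roots:

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical frame-tree node; not a Gecko type."""
    name: str
    text_chars: int = 0            # characters of text directly inside this frame
    is_flow_root: bool = False
    children: List["Node"] = field(default_factory=list)


def aggregate_text(root: Node) -> Dict[str, int]:
    """Sum text per flow root; nested flow roots keep their own totals."""
    totals: Dict[str, int] = {root.name: 0}   # the root frame is always a flow root

    def visit(node: Node, owner: str) -> None:
        if node.is_flow_root:
            owner = node.name
            totals.setdefault(owner, 0)
        totals[owner] += node.text_chars
        for child in node.children:
            visit(child, owner)

    visit(root, root.name)
    return totals


# Same shape as the example page; the text amounts are made up.
tree = Node("root", is_flow_root=True, children=[
    Node("body", children=[
        Node("page", children=[
            Node("main", text_chars=2000, is_flow_root=True),
            Node("sidebar", text_chars=80, is_flow_root=True),
            Node("footer", text_chars=60),
        ]),
    ]),
])
totals = aggregate_text(tree)
```

With these numbers, the footer's text is credited to the root frame, while "main" and "sidebar" each keep their own totals. Whether a given total counts as "sufficient" is a separate decision.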
Once we've initialized our frames to have the appropriate state bits and such, we perform a font size inflation calculation during reflow. First, given a particular target frame for which we want to inflate fonts, we find its font size inflation flow root, and compute the width. In order to compute the width that descendant frames will use from a given flow root, we find the nearest common ancestor of the first and last pieces of inflatable text within a given flow root. We then use the width of this ancestor frame, which might not be the flow root itself. Then, given the two preferences for font size inflation (discussed below), we compute the minimum font size that will satisfy these parameters and will fit within the given container width. Using this minimum font size, we map all font sizes in the target frame within the range (0-150%] of the minimum font size to the range [100-150%] of the minimum font size[1:6].
With a minimum font size of 20.0 px, 12.0 px gets mapped to 23.33 px.
Why not simply map everything under the minimum font size to the minimum font size itself? Because we want to be able to preserve differentiation between fonts at a size less than the minimum. So, for example, if we have text that is 12pt, with headers that are 16pt and 14pt, we want to make sure that all of these fonts, once adjusted, can be distinguished from one another.
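One way to read the mapping described above is as a straight line from (0, 150%] of the minimum font size onto [100%, 150%] of it. The sketch below assumes exactly that linear reading. The function name is mine, and the formula Gecko actually uses may differ in its details, so treat the exact numbers as illustrative:

```python
def inflate_font_size(size: float, min_size: float) -> float:
    """Map a font size using a linear reading of the (0-150%] -> [100-150%] rule.

    Assumption: the mapping is linear, so a size at ratio r of the minimum
    (r = size / min_size, 0 < r <= 1.5) becomes min_size * (1 + r / 3).
    """
    if size <= 0 or min_size <= 0:
        return size                     # nothing sensible to inflate
    ratio = size / min_size
    if ratio >= 1.5:
        return size                     # already at least 150% of the minimum
    return min_size * (1.0 + ratio / 3.0)


# The three sizes from the paragraph above, with a minimum font size of 20.
inflated = [inflate_font_size(s, 20.0) for s in (12.0, 14.0, 16.0)]
```

Under this reading, 12, 14, and 16 all end up above the minimum of 20 and stay in their original order, which is the differentiation the paragraph above is after. Sizes at or above 150% of the minimum are left alone.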
One thing I glossed over here is that we only perform a font size inflation calculation during reflow if font inflation is enabled for the frame in question. This actually ends up being an important point, because there are certain situations where we want to disable font inflation for certain frames. The exceptional cases where we want to detect a certain (somewhat general) category of web page layout, and disable font inflation, either for individual frames or entire pages, are detailed in the next section.
The basic algorithm works really well, but there are still some situations where we want to limit or disable font inflation logic. Together, these situations comprise the bulk of the heuristics Gecko abides by when determining font inflation settings for a given set of frames. Development of the font inflation feature is essentially the refinement of these heuristics. Putting them together, and verifying which ones work well without too drastically impacting the performance of the rendering engine, as well as testing for correctness and general usability has been the goal of the font inflation project for the past 6+ months. Unfortunately, these don't all fit very easily into a single area of the code, so it's difficult to tell someone to look at the logic in X class on line Y if he/she wants to learn about how the font inflation heuristics work.
The following sections briefly summarize the most pressing issues that have confronted font inflation over the last few months. I try to explain these situations as clearly and concisely as possible. In addition, I try to give code references and documentation links that you can look to for more information if you'd like to see how something was implemented, or perhaps would like to try to fix something that you think is broken with the font inflation algorithm as a whole. That said, keep in mind that only as a whole do these heuristics work to bring you the font inflation feature that is new in Firefox 14.0 and the all-new Firefox for Android (Fennec), so any individual piece may be only one part of the process, or even nonsensical, when taken by itself.
Small Text Bits
The most notable (perhaps notorious is a better descriptor) bug we encountered when fine-tuning the heuristics is what we've come to refer to as "the New York Times footer case" (Bug 706193). In this case, the New York Times site was inflated in certain areas (conforming as expected to the basic algorithm), but the algorithm was over-inflating the footer text (the text at the bottom of the page with links to different sections of the New York Times), causing that text to wrap.
The difference in footer text size versus the article text size was incredibly noticeable, and quite distracting for users.
This applies to situations where small bits of text are in the document, and where the layout of the page is dependent on these small bits of text. Usually, site authors may not have explicitly defined a height on these bits, because there wasn't previously a possibility that the text could wrap. Once we inflate the size of the text, however, this assumption is no longer valid.
To fix this, we use a threshold value. The threshold is a minimum amount of text: if it is not met, font inflation is disabled. However, we couldn't just use a specific amount of text within a block directly. If this were the case, then adjacent blocks within a document that had differing amounts of text would have differing font inflation settings, resulting in possibly strange rendering of text sizes. To better adjust for this, a line threshold was added as part of the fix for this bug. The line threshold works like this - for each block formatting context, we construct a set of data by scanning the items contained within that block formatting context (BFC). As we scan, we accumulate the text in each of the frames contained in the BFC - if we find sufficient text that has the same font size for inflation, then we set a bit in the font inflation data structure that enables font inflation for that size font.
An astute reader will realize this sounds a lot like what was described in the "Basic Algorithm" section above. In fact, this is the case. Since it's such an important aspect of how David rewrote the font inflation code, I consider it part of the "Basic Algorithm," and describe it twice. One thing that I didn't explain in detail above, though, was what makes the text amount "sufficient." The preference that was added for controlling this is somewhat unintuitive. The name of the preference is font.size.inflation.lineThreshold, and it controls "the percentage of a number of lines of text we'd need to trigger inflation"[1:7]. What that means is that if each of our characters has a width equivalent to 1em (most characters don't - they are usually narrower than this), then a value of 100 for this preference means we'd need 1 line of 1em-width text to trigger font inflation. Since we know the width and the size of the font for the text, we can determine how much text will fit on a line, based on the em-size of the text. By default (at the time of this writing), the value for this threshold is 400, which means we need approximately 4 lines of text (under the assumption that all characters are square, which means that we'll actually only need slightly under 2 lines of normal text) to trigger font inflation.
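The arithmetic behind the threshold can be sketched in a few lines of Python. This is only a model of the description above: it assumes every character is exactly 1em wide, and the function and parameter names are mine rather than anything in Gecko. Only the default of 400 comes from the text:

```python
def has_sufficient_text(num_chars: int,
                        font_size_px: float,
                        container_width_px: float,
                        line_threshold: int = 400) -> bool:
    """Decide whether there is enough text to trigger inflation.

    Assumption: each character occupies a square of 1em, so the text is
    num_chars * font_size_px wide when laid out on a single line.
    line_threshold is a percentage of one full line (400 means 4 lines).
    """
    text_width_px = num_chars * font_size_px
    required_px = container_width_px * line_threshold / 100.0
    return text_width_px >= required_px
```

For example, in a 320 px wide container with 16 px text, one line holds 20 square characters, so the default threshold of 400 asks for 80 characters, while a threshold of 100 asks for only 20.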
One of the first caveats to the font inflation algorithm that was recognized by David Baron was that block-level elements that have a constrained height cannot perform font inflation in the same way as blocks with unconstrained heights. The point of font inflation in the first place is to increase readability while discouraging the use of horizontal scrolling. This second part is important in this case, because if we want to avoid horizontal scrolling, then at some point, we're always going to have constrained width. This means that as text gets larger, there will be more lines of text, since wrapping will happen more frequently. This is, in large part, due to the fact that the ultimate upper bound on the width, the screen size, remains constant. Thus, since we have more lines of text, we're going to grow in the block direction - i.e. the height of our blocks will grow as font inflation deviates more and more from the original font size.
This issue was addressed in the original font inflation bug with a new frame state bit, NS_FRAME_IN_CONSTRAINED_HEIGHT. If the font inflation algorithm detects that a frame has this bit set, it does not enable font inflation for that frame. Interestingly, we'd also need to handle this case if we were to implement reflow-on-zoom (see the bottom part of note 1 for more details about reflow-on-zoom), so this is a problem that plagues both of the major readability enhancement solutions.
A related, but slightly different, bug had to do with constrained sizes with form controls. This was actually the symptom of a more general case, wherein if a frame was to be inflated, but between that frame and its nearest ancestor font inflation container on the frame tree there was a frame representing a non-inline element with constrained height or width (because these can't wrap), then font inflation should not happen on that frame. This sounds complex, but, in reality, it's a pretty simple case of determining whether we can actually inflate, based on size restrictions.
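Both checks can be sketched as a short walk up the frame tree. The Python below is a rough model, not Gecko code: the `Frame` class and its fields are hypothetical, and `in_constrained_height` merely stands in for the NS_FRAME_IN_CONSTRAINED_HEIGHT state bit. I've also assumed that only frames strictly between the target and its container are examined:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Frame:
    """Hypothetical stand-in for a frame; the field names are illustrative."""
    name: str
    parent: Optional["Frame"] = None
    is_inflation_container: bool = False
    is_inline: bool = False
    has_constrained_size: bool = False    # fixed height or width
    in_constrained_height: bool = False   # models NS_FRAME_IN_CONSTRAINED_HEIGHT


def can_inflate(frame: Frame) -> bool:
    """Return False when a size constraint rules out inflating this frame."""
    if frame.in_constrained_height:
        return False
    # Walk up to (but not including) the nearest font inflation container.
    ancestor = frame.parent
    while ancestor is not None and not ancestor.is_inflation_container:
        if not ancestor.is_inline and ancestor.has_constrained_size:
            return False                  # e.g. a fixed-size form control
        ancestor = ancestor.parent
    return True


container = Frame("block", is_inflation_container=True)
button = Frame("button", parent=container, has_constrained_size=True)
label = Frame("button text", parent=button)
span = Frame("span", parent=container, is_inline=True)
plain = Frame("span text", parent=span)
```

Here the text inside the fixed-size button is not inflated, while the text inside the inline span is.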
Sites that are already optimized for mobile (e.g. m.twitter.com) likely don't need additional inflation, since the web developer already has adjusted the font size to fit on a phone-sized screen. These sites can often be detected by looking for the mobile-specific tag <meta name="viewport"> in the header of the document. However, this was a bit tricky for us, because the code that detects the <meta name="viewport"> tag was in the front-end code of Firefox, rather than in the platform, Gecko. As part of Bug 706198, we pulled the processing of the <meta name="viewport"> element into Gecko and added two conditions that disable font inflation in what we consider to be "mobile" cases.
The first case depends on the default zoom attribute, which corresponds to the initial-scale attribute of the viewport metadata element. What this indicates is how "zoomed in" the viewport is initially upon page load. Typically, this value is inferred from other settings (such as when width or height are set) in the viewport metadata, so it's not usually specified directly, unless a developer wants to use a specific zoom setting[1:8]. If this default zoom attribute is greater than or equal to 1.0, then we assume that this was set explicitly, or inferred from the width/height being set, which is typically done on mobile-optimized websites. Thus, in this case, we disable font inflation. Similarly, we disable font inflation on sites where the width or height attributes of the viewport meta tag are set to device-width.
There's also something somewhat subtle about the logic for this condition - if the doctype string contains the text "WML", "WAP", or "Mobile", or if the meta tag <meta name="handheldFriendly"> has its content equal to "true", we return early from our viewport metadata processing. Since the defaultZoom in this case is initially set to 1.0, font inflation will be disabled under these conditions as well. The former of these indicates that the page is using either the XHTML Mobile Profile or the Wireless Markup Language[1:9], and the latter indicates that the site is optimized for mobile using the old AvantGo standard for Palm devices (being phased out now, I believe).
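Put together, the "mobile" checks amount to a handful of tests on the page's metadata. The following Python sketch restates them. The function and its parameters are hypothetical (Gecko's viewport processing is C++ and structured differently), and I've assumed the substring matches are case-sensitive, which the description above doesn't settle:

```python
from typing import Optional


def inflation_disabled_for_mobile(default_zoom: Optional[float] = None,
                                  width: Optional[str] = None,
                                  height: Optional[str] = None,
                                  doctype: str = "",
                                  handheld_friendly: Optional[str] = None) -> bool:
    """Return True when the page looks mobile-optimized per the rules above."""
    # Early-return cases: default zoom stays at its initial 1.0, which
    # disables inflation just like an explicit zoom >= 1.0 would.
    if any(marker in doctype for marker in ("WML", "WAP", "Mobile")):
        return True
    if handheld_friendly == "true":
        return True
    # Default zoom (initial-scale) of 1.0 or more.
    if default_zoom is not None and default_zoom >= 1.0:
        return True
    # Viewport width or height set to device-width.
    if width == "device-width" or height == "device-width":
        return True
    return False
```

A page declaring `width=device-width`, or one with an XHTML Mobile Profile doctype, would come out as mobile-optimized here, while a page with a default zoom of 0.5 and a fixed 980 px viewport would still be inflated.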
Bug 758079 shows another, more difficult bug to deal with. Bullets and ordered list numerals are rendered in the space given to either the left (for left to right text) or right (for right to left text) of the list item. Specifically, this ends up being placed in the margins of the block element in which the list is embedded. Unfortunately, this isn't associated in the code directly with the list item itself. That means that if the list item is inflated, the space to the left or right (where the bullet/list numeral is rendered) is not inflated. By default, the area we have to render a list numeral or bullet icon is 40px. This normally works fine, but once we start playing with the font size in the rendering engine, we end up with numerals that might be larger than the amount of space we have in which to render the text. In this case, the text gets clipped somewhere before the start of the block element's boundary.
Ideally, we'd solve this problem by linking the thing that constructs the indentation of the list item with the list item itself within the platform. This could be a quite massive change, though, and would require that we significantly reconstruct how we deal with bullets and lists. Instead, we opted for a bit more pragmatic approach: increase the amount of margin space in the event of font inflation. We increase the margin by multiplying the default margin size by the same ratio as the text in the bullet is inflated.
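As a quick illustration of that scaling, here is the arithmetic in Python. The function name is hypothetical. The 40px default and the idea of multiplying by the text's inflation ratio come from the description above:

```python
DEFAULT_LIST_MARGIN_PX = 40.0   # default space for the bullet or numeral


def inflated_list_margin(original_font_px: float,
                         inflated_font_px: float,
                         default_margin_px: float = DEFAULT_LIST_MARGIN_PX) -> float:
    """Scale the list margin by the same ratio the bullet text was inflated."""
    if original_font_px <= 0:
        return default_margin_px        # no meaningful ratio; keep the default
    ratio = inflated_font_px / original_font_px
    return default_margin_px * ratio
```

So if a 12 px numeral is inflated to 24 px, the margin grows from 40 px to 80 px, and a numeral that isn't inflated keeps the default margin.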
Incorrect Flow Roots
As a final exception to the basic algorithm, I'm going to discuss a problem that we don't yet have a solution for: incorrect flow root assignment. Sometimes, we actually don't want a table cell or float to be a flow root. It can cause issues like what happens at ycombinator.com when reading comments (Bug 707195):
Because some sites utilize tables for layout (indentation in the case of ycombinator.com) rather than for actually displaying tabular data, there exists a problem where the same type of syntactic node is used for two vastly different things. When indentation is controlled by tables by using nested tables, the width of each individual cell gets smaller with each nesting. This causes comments that come later in the thread (and thus are indented more) to be inflated less, and comments that come earlier in the thread to be inflated more, simply due to width restrictions. Because each table cell is an individual font inflation flow root, we can't accumulate data across table cells to consistently apply font inflation to the entire set of table cells.
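To see why nesting produces uneven inflation, consider this toy Python model. It assumes the minimum font size is simply the container width divided by a target number of ems per line. That formula, the `em_per_line` value of 15, and the widths below are all assumptions made for illustration, not values taken from Gecko:

```python
from typing import List


def min_font_size_for_width(container_width_px: float,
                            em_per_line: float = 15.0) -> float:
    """Assumed relationship: narrower containers get a smaller minimum size."""
    return container_width_px / em_per_line


def nested_cell_min_sizes(outer_width_px: float,
                          indent_px: float,
                          depth: int,
                          em_per_line: float = 15.0) -> List[float]:
    """Minimum font size for each nesting level of a table-indented thread."""
    sizes = []
    for level in range(depth):
        # Each nested table cell loses one indent's worth of width.
        width = max(outer_width_px - indent_px * level, 0.0)
        sizes.append(min_font_size_for_width(width, em_per_line))
    return sizes


# A 600 px wide thread where each reply is indented another 40 px.
sizes = nested_cell_min_sizes(600.0, 40.0, depth=4)
```

Because every cell is its own flow root with its own width, each level of the thread gets a slightly smaller minimum font size than the one above it, which matches the uneven inflation described above.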
Somehow, we need to add heuristics to determine when a table cell is used for layout, and thus should not be a font inflation flow root, and when it should be a flow root. This isn't unique to tables, either. Reddit.com also experiences a similar problem:
The all-new Firefox for Android, nicknamed "Fennec" or, sometimes, "Fennec Native" has shipped. The new readability enhancements described above are included in this version of the software. This is fantastic, and it makes readability for the web on the Android platform better than ever. But, the work isn't done yet. Font inflation is a great feature, but it's not the only aspect of readability that we want to include in future versions of Firefox. Crisper, clearer fonts are also on the docket, along with an investigation of how reflow on zoom might be incorporated into our product. Another area of current development is Reader Mode[1:10]. Reader Mode takes the text content of an article, strips away aspects of the page that might not be relevant to a user while trying to read the article, and places the resulting text onto a more readable background, with an easier-to-read font.
Let's face it - much of what we do on the web is reading. We read news articles to learn about current events. We read recipes in order to prepare that fantastic salmon dinner your boyfriend/girlfriend wants. We read through instructions on how to fix that misfiring cylinder in our car. Almost every web page you visit has some text on it - it's integral to the way we communicate and live. We want to make that experience not just better, but better than any other browser on the market. We literally (no pun intended) want everyone -- from your 102-year-old grandmother to your 4-year-old nephew who just learned to read -- to be able to interpret clear text in Firefox as easily as if they were reading a piece of paper. Our goal is to develop the features that will make you want to use Firefox as your daily browser because of how it renders text.
Notes and References
A special thanks to Daniel Holbert and David Baron for proofreading this post and pointing out issues they saw. Guys, I really appreciate your help and guidance!