Last revised: 4 Dec 2002 $Revision: 1.5 $
This page captures the specific requirements for extensions that have been identified, either through community discussion or by the existence of extensions in existing FO implementations.
This section captures requirements that are not specific to any particular delivery medium or tool.
A number of use cases require the ability to correlate elements of the original XML input with artifacts of the paginated document they are rendered onto, including the numbers of the pages on which they occur and the values of markers on those pages. These use cases include:
The most general statement of this requirement is:
There are a number of use cases in which the same element produces different layout results depending on that element's positional relationship to other elements in the FO instance. These use cases include:
Some financial documents, such as invoices or bank statements, use a special header or footer when they span more than one page: a "carried forward" summary containing partial information up to the page break, such as a partial sum or a partial count of items. In general this requires some kind of correlation between the source XML tree and the formatter. It can, however, be reduced to a requirement to retrieve markers from within the table footer, header, or caption, provided the stylesheet author prepares the partial information for every possible break as markers.
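The "partial information for every possible break" idea can be sketched outside of FO. The following is a minimal illustration in Python, not part of any FO implementation: the function name `carried_forward` and the sample amounts are hypothetical, and in practice the same computation would be done in the stylesheet that emits one marker per table row.

```python
from decimal import Decimal
from itertools import accumulate


def carried_forward(amounts):
    """Return (item_count, partial_sum) as it stands after each row.

    Each pair is the content a marker attached to that row would carry,
    so that a page break after the row can show the right summary.
    """
    partial_sums = accumulate(Decimal(a) for a in amounts)
    return [(count, total) for count, total in enumerate(partial_sums, start=1)]


# Hypothetical invoice lines, given as strings to avoid float rounding.
rows = ["10.00", "5.50", "4.25"]
for count, subtotal in carried_forward(rows):
    print(f"after row {count}: {count} item(s), carried forward {subtotal}")
```

Whichever row ends up last on a page, its pair is the one the header or footer would need to retrieve, which is why the retrieval has to work from within the table header, footer, or caption.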
These requirements are specific to the functions provided by Adobe PDF. PDF is a proprietary format for final-form delivery of documents. However, PDF is in such wide use that it is essentially a standard. Adobe publishes the PDF specification, and there are a number of open-source solutions for both viewing PDF documents and accessing PDF documents programmatically. PDF is the primary rendition target for most FO implementations.
PDF provides a number of useful features for online document delivery, including hyperlinks, bookmarks, interactive forms, and so on. There is a general requirement to be able to take advantage of these features of PDF using FO-based composition systems. Some of these features have a direct mapping from generic FO constructs (e.g., basic-link maps to PDF links) but many do not. Most of the PDF-producing FO implementations, certainly all the commercial ones, have recognized this and already provide proprietary extensions for many of these requirements.
PDF provides a generic "annotation" mechanism, by which different types of annotations can be applied to documents, including highlighting of text, notes overlaid on a page, and arbitrary pen-line marks.
It would be handy if annotations could be created as part of the PDF generation process. For example, given an online review system in which annotations were created in some generic form, documents could be rendered to PDF with the annotations reflected as PDF annotations rather than simply as base components of the rendered pages.
While it is possible to add annotations to PDF documents, either using the Adobe Acrobat product or programmatically, it would be difficult to do this for annotations associated only with the original XML document without some mapping between the original input XML elements and the pages those elements were eventually rendered to (see general requirements).
This section captures requirements for layout functionality not currently provided by the FO specification. Many of these requirements are already booked by the XSL Working Group and will likely be addressed in future versions of the XSL FO specification. At such time as the XSL Working Group takes up any of the requirements in this section, their discussion here shall be retired. Any implementations of these requirements should be considered as being in the service of gaining implementation experience in advance of formal standardization.
When tables span multiple pages, there is often a requirement to create either a separate "table continued" message above or below the table or to add something like "Continued" to the table caption. This requirement can be met in a weak way using markers in the static content, but the result is not completely satisfactory, especially for messages placed at the bottom of the table (e.g., "Table continued on next page"): because the vertical extent of a continued table cannot be absolutely controlled, there is no way to reliably place the message immediately below the table. Satisfying this requirement therefore requires an extension to FO that provides a way either to retrieve markers within the table caption, table header, or table footer, or to define an additional component of the table caption to hold text for use on pages after the first. This requirement could be partially satisfied by using repeating before floats, as provided by Epic editor 4.3's FO implementation.
XSLT provides all the grouping and sorting functions needed to generate back-of-the-book indexes from index entry markup in an XML document (see the DocBook XSL Stylesheets project for an example of how to generate indexes with XSLT). However, an index generated this way will necessarily contain entries (in the composed index) with multiple references to the same page, for the simple reason that there is no way to know, at XSLT processing time, what page a given index marker in the document flow will resolve to.
Thus, to be able to create proper indexes, there must be a way to eliminate duplicate page numbers from the list of page numbers associated with a given index entry.
This requirement can be met using a clever post-processing mechanism developed by Ken Holman (Ken--need a pointer to your write up on this). However, this process requires human intervention and so is not generally useful in lights-out production environments.
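Once page numbers have been resolved, the elimination step itself is simple. The sketch below, in Python, only illustrates the desired result. It is not the post-processing mechanism mentioned above, and the names `unique_pages` and `format_entry`, along with the sample data, are hypothetical.

```python
def unique_pages(pages):
    """Drop repeated page numbers, keeping the order of first occurrence."""
    # dict preserves insertion order, so this removes duplicates stably.
    return list(dict.fromkeys(pages))


def format_entry(term, pages):
    """Render one index entry with each page number listed once."""
    return f"{term}, " + ", ".join(str(p) for p in unique_pages(pages))


# Six index markers for the same term resolved to only three distinct pages.
print(format_entry("markers", [12, 12, 13, 27, 27, 27]))  # markers, 12, 13, 27
```

The difficulty described in this section is not the de-duplication but its timing: the list of resolved page numbers exists only after formatting, so the step has to run inside the formatter or as a pass over its output.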