RSS 2.0
Sign In
# Friday, 23 April 2010

There are some complications with streamed tree that we have implemented in saxon. They are due to the fact that only a view of input data is available at any time. Whenever you access some element that's is not available you're getting an exception.

Consider an example. We have a log created with java logging. It looks like this:

<log>
  <record>
    <date>...</date>
    <millis>...</millis>
    <sequence>...</sequence>
    <logger>...</logger>
    <level>INFO</level>
    <class>...</class>
    <method>...</method>
    <thread>...</thread>
    <message>...</message>
  </record>
  <record>
    ...
  </record>
  ...

We would like to write an xslt that returns a page of log as html:

<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:t="http://www.nesterovsky-bros.com/xslt/this"
  xmlns="http://www.w3.org/1999/xhtml"
  exclude-result-prefixes="xs t">

  <xsl:param name="start-page" as="xs:integer" select="1"/>
  <xsl:param name="page-size" as="xs:integer" select="50"/>

  <xsl:output method="xhtml" byte-order-mark="yes" indent="yes"/>

  <!-- Entry point. -->
  <xsl:template match="/log">
    <xsl:variable name="start" as="xs:integer"
      select="($start-page - 1) * $page-size + 1"/>
    <xsl:variable name="records" as="element()*"
      select="subsequence(record, $start, $page-size)"/>

    <html>
      <head>
        <title>
          <xsl:text>A log file. Page: </xsl:text>
          <xsl:value-of select="$start-page"/>
        </title>
      </head>
      <body>
        <table border="1">
          <thead>
            <tr>
              <th>Level</th>
              <th>Message</th>
            </tr>
          </thead>
          <tbody>
            <xsl:apply-templates mode="t:record" select="$records"/>
          </tbody>
        </table>
      </body>
    </html>
  </xsl:template>

  <xsl:template mode="t:record" match="record">
    <!-- Make a copy of record to avoid streaming access problems. -->
    <xsl:variable name="log">
      <xsl:copy-of select="."/>
    </xsl:variable>

    <xsl:variable name="level" as="xs:string"
      select="$log/record/level"/>
    <xsl:variable name="message" as="xs:string"
      select="$log/record/message"/>

    <tr>
      <td>
        <xsl:value-of select="$level"/>
      </td>
      <td>
        <xsl:value-of select="$message"/>
      </td>
    </tr>
  </xsl:template>

</xsl:stylesheet>

This code does not work. Guess why? Yes, it's subsequence(), which is too greedy. It always wants to know what's the next node, so it naturally skips a content of the current node. Algorithmically, such saxon code could be rewritten, and could possibly work better also in modes other than streaming.

A viable workaround, which does not use subsequence, looks rather untrivial:

<!-- Entry point. -->
<xsl:template match="/log">
  <xsl:variable name="start" as="xs:integer"
    select="($start-page - 1) * $page-size + 1"/>
  <xsl:variable name="end" as="xs:integer"
    select="$start + $page-size"/>

  <html>
    <head>
      <title>
        <xsl:text>A log file. Page: </xsl:text>
        <xsl:value-of select="$start-page"/>
      </title>
    </head>
    <body>
      <table border="1">
        <thead>
          <tr>
            <th>Level</th>
            <th>Message</th>
          </tr>
        </thead>
        <tbody>
          <xsl:sequence select="
            t:generate-records(record, $start, $end, ())"/>
        </tbody>
      </table>
    </body>
  </html>
</xsl:template>

<xsl:function name="t:generate-records" as="element()*">
  <xsl:param name="records" as="element()*"/>
  <xsl:param name="start" as="xs:integer"/>
  <xsl:param name="end" as="xs:integer?"/>
  <xsl:param name="result" as="element()*"/>

  <xsl:variable name="record" as="element()?" select="$records[$start]"/>

  <xsl:choose>
    <xsl:when test="(exists($end) and ($start > $end)) or empty($record)">
      <xsl:sequence select="$result"/>
    </xsl:when>
    <xsl:otherwise>
      <!-- Make a copy of record to avoid streaming access problems. -->
      <xsl:variable name="log">
        <xsl:copy-of select="$record"/>
      </xsl:variable>

      <xsl:variable name="level" as="xs:string"
        select="$log/record/level"/>
      <xsl:variable name="message" as="xs:string"
        select="$log/record/message"/>

      <xsl:variable name="next-result" as="element()*">
        <tr>
          <td>
            <xsl:value-of select="$level"/>
          </td>
          <td>
            <xsl:value-of select="$message"/>
          </td>
        </tr>
      </xsl:variable>

      <xsl:sequence select="
        t:generate-records
        (
          $records,
          $start + 1,
          $end,
          ($result, $next-result)
        )"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:function>

Here we observed the greediness of saxon, which too early tried to consume more input than it's required. In the other cases we have seen that it may defer actual data access to the point when there is no data anymore.

So, without tuning internal saxon logic it's possible but not easy to write stylesheets that exploit streaming features.

P.S. Updated sources are at streamedtree.zip

Friday, 23 April 2010 10:12:38 UTC  #    Comments [0] -
Thinking aloud | xslt
All comments require the approval of the site owner before being displayed.
Name
E-mail
Home page

Comment (Some html is allowed: a@href@title, b, blockquote@cite, em, i, strike, strong, sub, super, u) where the @ means "attribute." For example, you can use <a href="" title=""> or <blockquote cite="Scott">.  

[Captcha]Enter the code shown (prevents robots):

Live Comment Preview
Archive
<2024 October>
SunMonTueWedThuFriSat
293012345
6789101112
13141516171819
20212223242526
272829303112
3456789
Statistics
Total Posts: 387
This Year: 3
This Month: 0
This Week: 0
Comments: 1749
Locations of visitors to this page
Disclaimer
The opinions expressed herein are our own personal opinions and do not represent our employer's view in anyway.

© 2024, Nesterovsky bros
All Content © 2024, Nesterovsky bros
DasBlog theme 'Business' created by Christoph De Baene (delarou)