Lately we do not program in XSLT too often but rather in Java, C#, SQL and JavaScript, though from time to time we still have tasks in XSLT.
People claim that those languages are too different and use this argument to explain why XSLT is only a niche language. We, on the other hand, often spot similarities between them.
So, what is it in other languages that XSLT implements as tunnel parameters?
To get an answer we went over how they work in XSLT. You:
- define a template with parameters marked as tunnel="yes";
- use these parameters the same way as regular parameters;
- pass template parameters down to other templates, marking them as tunnel="yes".
The important difference between regular and tunnel parameters is that tunnel parameters are implicitly passed down the call chain of templates. This means that you:
- define your API that is expected to receive some parameter;
- pass these parameters somewhere high in the stack, or override them later in the call chain;
- do not bother to propagate them (you might not even know all of the tunnel parameters passed, so encapsulation is in action).
As a result we have a template where some parameters are passed explicitly, while others receive values from somewhere else, usually not from the direct caller. One can say that these tunnel parameters are injected into a template call. This closely resembles injection APIs in other languages, where you configure some parameters to be prepared for you by a container rather than by the direct caller.
Now that we have expressed this idea it seems obvious, but until we thought of it we did not realize that tunnel parameters in XSLT and Dependency Injection in other languages are the same thing.
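To make the analogy more tangible, here is a minimal Java sketch of tunnel-like parameters. Java has no implicit parameter passing, so a hypothetical TunnelContext object stands in for the XSLT call chain; all names here (TunnelContext, TunnelDemo, the "lang" parameter) are our assumptions made for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the set of tunnel parameters visible at some point of the call chain. */
final class TunnelContext {
    private final TunnelContext parent;
    private final Map<String, Object> values = new HashMap<>();

    TunnelContext(TunnelContext parent) {
        this.parent = parent;
    }

    /** Returns a child context that adds or overrides one parameter. */
    TunnelContext with(String name, Object value) {
        TunnelContext child = new TunnelContext(this);
        child.values.put(name, value);
        return child;
    }

    /** Looks a parameter up here first, then in the enclosing contexts. */
    Object get(String name) {
        for (TunnelContext c = this; c != null; c = c.parent) {
            if (c.values.containsKey(name)) {
                return c.values.get(name);
            }
        }
        return null;
    }
}

final class TunnelDemo {
    // "Top" template: sets the tunnel parameter once.
    static String renderDocument(TunnelContext context) {
        return renderSection(context.with("lang", "en"));
    }

    // Intermediate template: neither knows about "lang" nor forwards it by name.
    static String renderSection(TunnelContext context) {
        return "[" + renderTitle(context) + "]";
    }

    // Leaf template: asks for the parameter it needs and receives it.
    static String renderTitle(TunnelContext context) {
        return "title:" + context.get("lang");
    }
}
```

The intermediate renderSection() plays the role of a template that is unaware of the tunnel parameter, while a later with() call overrides the value for everything further down the chain.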
Although the golden age of IE8 has passed and Microsoft has stopped supporting it, this browser still occupies about 3% of the world desktop browser market. Moreover, many big organisations still use it for enterprise web applications. We can confirm this, since we deal with such organisations around the world. Companies try to get rid of IE8, but this often requires a Windows upgrade and resources to re-test all their web applications. If a company has many web terminals running Windows 7 or even XP, this task becomes rather expensive, so the process advances rather slowly. Meanwhile, these organisations don't stop developing new web applications, which must work both in new HTML5 browsers and in old IE8.
A year ago we developed the UIUpload AngularJS directive and service, which simplify file uploading in web applications with an AngularJS client. It works as expected in all HTML5 browsers.
A few days ago, however, we were asked to help with file uploading from an AngularJS web application that has to work in IE8. We spent a few hours investigating existing third-party AngularJS directives and components.
All of these directives degrade on IE8 to a <form> and an <iframe> and then track the upload progress. They do not, however, let the user select files in old browsers. Our aim was an AngularJS directive that both selects a file and uploads it, and that works in IE8 as well as in new browsers.
Since IE8 supports neither FormData nor the File API, the directive must work with DOM elements directly. To open the file selection dialog we hide an <input type="file"/> element and route the client-side event to it. When a file is selected, it is sent to the server as a multipart/form-data message. The server's response is caught by a hidden <iframe> element and passed to the directive's controller.
After a few attempts we implemented the desired directive. A small VS2015 solution that demonstrates the directive and a server-side file handler can be downloaded here.
The key feature of this directive is its emulation of the replace and template properties of a directive definition:
var directive =
{
  restrict: "AE",
  scope:
  {
    id: "@",
    serverUrl: "@",
    accept: "@?",
    onSuccess: "&",
    onError: "&?"
  },
  link: function (scope, element, attrs, controller)
  {
    var id = scope.id || ("fileUpload" + scope.$id);
    // The concatenation is parenthesized so that replace() applies to the
    // whole template and not to the last string literal only.
    var template = ("<iframe name='%id%$iframe' id='%id%$iframe' style='display: none;'>" +
      "</iframe><form name='%id%$form' enctype='multipart/form-data' " +
      "method='post' action='%action%' target='%id%$iframe'>" +
      "<span style='position:relative;display:inline-block;overflow:hidden;padding:0;'>" +
      "%html%<input type='file' name='%id%$file' id='%id%$file' " +
      "style='position:absolute;height:100%;width:100%;left:-10px;top:-1px;z-index:100;" +
      "font-size:50px;opacity:0;filter:alpha(opacity=0);'/></span></form>").
      replace("%action%", scope.serverUrl).
      replace("%html%", element.html()).
      replace(/%id%/g, id);

    element.replaceWith(template);
    ...
  }
};
We used such emulation since each directive instance (an element) must have a unique name and ID in order to work properly. On the one hand, a template returned by a function must have a root element when you use replace. On the other hand, IE8 doesn't like such a root element (e.g. we did not succeed in dispatching the click JavaScript event to the <input> element).
The usage of the directive looks like our previous example (see UIUpload):
<a file-upload=""
class="btn btn-primary"
accept=".*"
server-url="api/upload"
on-success="controller.uploadSucceed(data, fileName)"
on-error="controller.uploadFailed(e)">Click here to upload file</a>
Where:
- accept - a comma-separated list of acceptable file extensions.
- server-url - the server URL to upload the selected file to. When there is no "server-url" attribute, the content of the selected file is passed to the success handler as a data URI.
- on-success - a "success" handler, which is called when the upload finishes successfully.
- on-error - an "error" handler, which is called when the upload fails.
We hope this simple directive helps those of you who are forced to deal with IE8 and more advanced browsers at the same time to keep calm.
Recently we found and fixed a bug in the unreachable statement optimization in jxom.
The latest version of the stylesheets can be found at github.com: languages-xom.
Our genuine love is C++. Unfortunately, clients don't always share our preferences, so we are mostly occupied with C#, Java and JavaScript. Nevertheless, we closely watch the evolution of C++. It has become more mature in the latest specs.
Recently, we wondered how we would deal with dependency injection in C++.
What we found only strengthened our commitment to C++.
Parameter packs introduced in C++11 allow a trivial implementation of constructor injection, while std::type_index, std::type_info and std::any give us service containers.
In fact there are many DI implementations out there. The one we refer to here is Boost.DI. It's not standard, nor can we claim it's the best, but it's a good example of how this concept can be implemented.
So, consider their example as it looks in Java with CDI, in C# with .NET Core injection, and in C++:
Java:
@Dependent
public class Renderer
{
  @Inject @Device
  private int device;
}

@Dependent
public class View
{
  @Inject @Title
  private String title;

  @Inject
  private Renderer renderer;
}

@Dependent
public class Model {}

@Dependent
public class Controller
{
  @Inject
  private Model model;

  @Inject
  private View view;
}

@Dependent
public class User {}

@Dependent
public class App
{
  @Inject
  private Controller controller;

  @Inject
  private User user;
}
...
Provider<App> provider = ...
App app = provider.get();
C#:
public class RendererOptions
{
  public int Device { get; set; }
}

public class ViewOptions
{
  public string Title { get; set; }
}

public class Renderer
{
  public Renderer(IOptions<RendererOptions> options)
  {
    // IOptions<T> exposes the configured instance through its Value property.
    Device = options.Value.Device;
  }

  public int Device { get; set; }
}

public class View
{
  public View(IOptions<ViewOptions> options, Renderer renderer)
  {
    Title = options.Value.Title;
    Renderer = renderer;
  }

  public string Title { get; set; }
  public Renderer Renderer { get; set; }
}

public class Model {}

public class Controller
{
  public Controller(Model model, View view)
  {
    Model = model;
    View = view;
  }

  public Model Model { get; set; }
  public View View { get; set; }
}

public class User {}

public class App
{
  public App(Controller controller, User user)
  {
    Controller = controller;
    User = user;
  }

  public Controller Controller { get; set; }
  public User User { get; set; }
}
...
IServiceProvider serviceProvider = ...
serviceProvider.GetService<App>();
C++:
#include <string>
#include <boost/di.hpp>

namespace di = boost::di;

struct renderer
{
  int device;
};

class view
{
public:
  view(std::string title, const renderer&) {}
};

class model {};

class controller
{
public:
  controller(model&, view&) {}
};

class user {};

class app
{
public:
  app(controller&, user&) {}
};

int main()
{
  /**
   * renderer renderer_;
   * view view_{"", renderer_};
   * model model_;
   * controller controller_{model_, view_};
   * user user_;
   * app app_{controller_, user_};
   */
  auto injector = di::make_injector();
  injector.create<app>();
}
What is the difference between these DI flavors?
Not much from the perspective of the final task achieved.
In Java we used member injection, with qualifiers to inject scalars.
In C# we used constructor injection, with the Options pattern to inject scalars.
In C++ we used constructor injection, with constants injected directly.
All the technologies have their own API to initialize the DI container but, again, while the API is different, the idea is the same.
So, the expressiveness of C++ matches that of Java and C#.
Deeper analysis shows that Java's CDI is more feature rich than the DI of C# and C++ but, personally, we consider it an advantage of C# and C++ that they have such a light DI.
At the same time there is an important difference between C++ on one side and Java and C# on the other.
While DI in both Java and C# relies on reflection (C# could, in theory, use code generation on the fly to avoid it), C++'s DI constructs and injects services natively.
What does it mean for the user?
Well, a lot! In both Java and C# you would not want to use DI in a performance-critical part of the code (e.g. in a tight loop), while it's OK in C++ due to the near-zero performance impact of DI. This may result in more modular and performant code in C++.
While reading about ASP.NET Core Session and analyzing how it differs from the previous version of ASP.NET, we bumped into a problem...
In Managing Application State they note:
Session is non-locking, so if two requests both attempt to modify the contents of session, the last one will win. Further, Session is implemented as a coherent session, which means that all of the contents are stored together. This means that if two requests are modifying different parts of the session (different keys), they may still impact each other.
This is different from previous versions of ASP.NET, where session access was blocking: if you had multiple concurrent requests to the session, all of them were synchronized. So, you could keep a consistent state.
In ASP.NET Core you have no built-in means to keep the state of the session consistent. Even the assurance that the session is coherent does not help in any way.
Your options are:
- build your own synchronization to deal with this problem (e.g. around the database);
- decree that your application cannot handle concurrent requests to the same session, so client should not attempt it, otherwise behaviour is undefined.
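The first option can be sketched as a lock keyed by session id. The sketch below is written in Java only to illustrate the idea; it is not an ASP.NET Core API, the names SessionLocks and SessionLocksDemo are hypothetical, and it assumes that all requests of a session are served by a single process (a farm of servers would need a shared lock, e.g. around the database).

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

/** Hypothetical helper that serializes work belonging to the same session id. */
final class SessionLocks {
    private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    /** Runs the action while holding the lock of the given session. */
    <T> T withSession(String sessionId, Supplier<T> action) {
        ReentrantLock lock = locks.computeIfAbsent(sessionId, id -> new ReentrantLock());
        lock.lock();
        try {
            return action.get();
        } finally {
            lock.unlock();
        }
    }
}

final class SessionLocksDemo {
    /** Increments a per-session counter from several threads and returns the final value. */
    static int run(int threads, int incrementsPerThread) {
        SessionLocks sessionLocks = new SessionLocks();
        int[] counter = { 0 }; // stands in for a value kept in the session
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    // read-modify-write of the session state under the session lock
                    sessionLocks.withSession("session-1", () -> ++counter[0]);
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter[0];
    }
}
```

Note that this sketch never removes locks of expired sessions; a real implementation would have to evict them together with the session.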
Angular 2 is already available, though there is still a lot of code and libraries written in Angular 1.x.
Here we outline how to write AngularJS 1.x in modern JavaScript.
Prerequisites: ECMAScript 2015, JavaScript decorators, AngularJS 1.x. No knowledge of Angular 2.0 is required.
Please note that the decorators we have introduced, while resembling those from Angular 2, do not match them exactly.
The sample uses Node.js, npm and gulp as a build pipeline. In addition we have added a Visual Studio Node.js project and a Maven project.
The build pipeline uses Babel with ES 2015 and decorator plugins to transpile sources into JavaScript that today's browsers support. Babel can be replaced or augmented with the TypeScript compiler to support Microsoft's JavaScript extensions. Sources are combined and optionally minified into one or more JavaScript bundles. In addition, HTML template files are transformed into JavaScript modules that export the content of the HTML body as string literals. In general, all sources are in the src folder and the build's output is assembled in the dist folder. Details of the build process are in gulpfile.js.
So, let's introduce the API we have defined in the angular-decorators.js module:
- Class decorators:
Component(name, options?) - a decorator to register an angular component.
Controller(name) - a decorator to register an angular controller.
Directive(name, options?) - a decorator to register an angular directive.
Injectable(name) - a decorator to register an angular service.
Module(name, ...require) - a decorator to declare an angular module.
Pipe(name, pure?) - a decorator to register an angular filter.
The options of Component and Directive are the same object that is passed into the Module.component() and Module.directive() calls, except that options.bindings, options.scope and options.require are not specified.
Instead, @Attribute(), @Input(), @Output(), @TwoWay(), @Collection() and @Optional() are used to describe options.bindings, and @Host(), @Self(), @SkipSelf() and @Optional() are used to describe options.require.
Every decorated class can use the @Inject() member decorator to inject a service.
- Member decorators:
Attribute(name?) - a decorator that binds an attribute to the property.
BindThis() - a decorator that binds "this" of the function to the class instance.
Collection() - a decorator that binds a collection property to an expression in attribute in two directions.
Host(name?) - a decorator that binds a property to a host controller of a directive found on the element or its ancestors.
HostListener(name?) - a decorator that binds method to a host event.
Inject(name?) - an injection member decorator.
Input(name?) - a decorator that binds a property to an expression in attribute.
Optional() - a decorator that optionally binds a property.
Output(name?) - a decorator that provides a way to execute an expression in the context of the parent scope.
Self(name?) - a decorator that binds a property to a host controller of a directive found on the element.
SkipSelf(name?) - a decorator that binds a property to a host controller of a directive found on the ancestors of the element.
TwoWay() - a decorator that binds a property to an expression in attribute in two directions.
If the optional name is omitted in a member decorator, the property name is used as the name parameter.
@Host(), @Self() and @SkipSelf() accept a class decorated with @Component() or @Directive() as the name parameter.
@Inject() accepts a class decorated with @Injectable() or @Pipe() as the name parameter.
- Other:
modules(...require) - converts an array of modules, possibly referred to by module classes, to an array of module names.
Now we can start with samples. Please note that we used samples scattered here and there on the Angular site.
- @Component(), @SkipSelf(), @Attribute()
In Angular's component development guide there are sample myTabs and myPane components.
Here is their rewritten form.
components/myTabs.js:
import { Component } from "../angular-decorators"; // Import decorators
import template from "../templates/my-tabs.html"; // Import template for my-tabs component

@Component("myTabs", { template, transclude: true }) // Decorate class as a component
export class MyTabs // Controller class for the component
{
  panes = []; // List of active panes

  select(pane) // Selects an active pane
  {
    this.panes.forEach(function(pane) { pane.selected = false; });
    pane.selected = true;
  }

  addPane(pane) // Adds a new pane
  {
    if (this.panes.length === 0)
    {
      this.select(pane);
    }

    this.panes.push(pane);
  }
}
components/myPane.js:
import { Component, Attribute, SkipSelf } from "../angular-decorators"; // Import decorators
import { MyTabs } from "./myTabs"; // Import container's directive.
import template from "../templates/my-pane.html"; // Import template.

@Component("myPane", { template, transclude: true }) // Decorate class as a component
export class MyPane // Controller class for the component
{
  @SkipSelf(MyTabs) tabsCtrl; // Inject ancestor MyTabs controller.
  @Attribute() title; // Attribute "@" binding.

  $onInit() // Angular's $onInit life-cycle hook.
  {
    this.tabsCtrl.addPane(this);
    console.log(this);
  }
}
- @Component(), @Input(), @Output()
In Angular's component development guide there is a sample heroDetail component.
Here is its rewritten form.
components/heroDetail.js:
import { Component, Input, Output } from "../angular-decorators";
import template from "../templates/heroDetail.html";

@Component("heroDetail", { template }) // Decorate class as a component
export class HeroDetail // Controller class for the component
{
  @Input() hero; // One way binding "<"
  @Output() onDelete; // Bind expression in the context of the parent scope "&"
  @Output() onUpdate; // Bind expression in the context of the parent scope "&"

  delete()
  {
    this.onDelete({ hero: this.hero });
  }

  update(prop, value)
  {
    this.onUpdate({ hero: this.hero, prop, value });
  }
}
- @Directive(), @Inject(), @Input(), @BindThis()
import { Directive, Inject, Input, BindThis } from "../angular-decorators"; // Import decorators

@Directive("myCurrentTime") // Decorate MyCurrentTime class as a directive
export class MyCurrentTime // Controller class for the directive
{
  @Inject() $interval; // "$interval" service is injected into $interval property
  @Inject() dateFilter; // "date" filter service is injected into dateFilter property
  @Inject() $element; // "$element" instance is injected into $element property.
  @Input() myCurrentTime; // Input one way "<" property.

  timeoutId;

  // updateTime is adapted as following in the constructor:
  // this.updateTime = this.updateTime.bind(this);
  @BindThis() updateTime()
  {
    this.$element.text(this.dateFilter(new Date(), this.myCurrentTime));
  }

  $onInit() // Angular's $onInit life-cycle hook.
  {
    this.timeoutId = this.$interval(this.updateTime, 1000);
  }

  $onDestroy() // Angular's $onDestroy life-cycle hook.
  {
    this.$interval.cancel(this.timeoutId);
  }

  $onChanges(changes) // Angular's $onChanges life-cycle hook.
  {
    this.updateTime();
  }
}
- @Directive(), @Inject(), @HostListener(), @BindThis()
In Angular's directive development guide there is a sample myDraggable directive.
Here is its rewritten form. directives/myDraggable.js:
import { Directive, Inject, HostListener, BindThis } from "../angular-decorators"; // Import decorators

@Directive("myDraggable") // Decorate class as a directive
export class MyDraggable // Controller class for the directive
{
  @Inject() $document; // "$document" instance is injected into $document property.
  @Inject() $element; // "$element" instance is injected into $element property.

  startX = 0;
  startY = 0;
  x = 0;
  y = 0;

  // Listen mousedown event over $element.
  @HostListener() mousedown(event)
  {
    // Prevent default dragging of selected content
    event.preventDefault();
    this.startX = event.pageX - this.x;
    this.startY = event.pageY - this.y;
    this.$document.on('mousemove', this.mousemove);
    this.$document.on('mouseup', this.mouseup);
  }

  @BindThis() mousemove(event) // bind mousemove() function to "this" instance.
  {
    this.y = event.pageY - this.startY;
    this.x = event.pageX - this.startX;
    this.$element.css({
      top: this.y + 'px',
      left: this.x + 'px'
    });
  }

  @BindThis() mouseup() // bind mouseup() function to "this" instance.
  {
    this.$document.off('mousemove', this.mousemove);
    this.$document.off('mouseup', this.mouseup);
  }

  $onInit() // Angular's $onInit life-cycle hook.
  {
    this.$element.css(
    {
      position: 'relative',
      border: '1px solid red',
      backgroundColor: 'lightgrey',
      cursor: 'pointer'
    });
  }
}
- @Injectable(), @Inject()
In Angular's providers development guide there is a sample notifier service.
Here is its rewritten form. services/notify.js:
import { Inject, Injectable } from "../angular-decorators"; // Import decorators

@Injectable("notifier") // Decorate class as a service
export class NotifierService
{
  @Inject() $window; // Inject "$window" instance into the property

  msgs = [];

  notify(msg)
  {
    this.msgs.push(msg);

    if (this.msgs.length === 3)
    {
      this.$window.alert(this.msgs.join('\n'));
      this.msgs = [];
    }
  }
}
- @Pipe()
In Angular's filters development guide there is a sample reverse custom filter.
Here is its rewritten form. filters/reverse.js:
import { Pipe } from "../angular-decorators"; // Import decorators

@Pipe("reverse") // Decorate class as a filter
export class ReverseFilter
{
  transform(input, uppercase) // filter function.
  {
    input = input || '';

    var out = '';

    for(var i = 0; i < input.length; i++)
    {
      out = input.charAt(i) + out;
    }

    // conditional based on optional argument
    if (uppercase)
    {
      out = out.toUpperCase();
    }

    return out;
  }
}
- @Module(), modules(), angular.bootstrap()
Here is an example of a class representing an angular module, and of manual angular bootstrap:
import { angular, modules, Module } from "../angular-decorators"; // Import decorators
import { MyController } from "./controllers/myController"; // Import components.
import { HeroList } from "./components/heroList";
import { HeroDetail } from "./components/heroDetail";
import { EditableField } from "./components/editableField";
import { NotifierService } from "./services/notify";
import { MyTabs } from "./components/myTabs";
import { MyPane } from "./components/myPane";
import { ReverseFilter } from "./filters/reverse";
import { MyCurrentTime } from "./directives/myCurrentTime";
import { MyDraggable } from "./directives/myDraggable";

@Module( // Decorator to register angular module, and refer to other modules or module components.
  "my-app",
  [
    MyController,
    NotifierService,
    HeroList,
    HeroDetail,
    EditableField,
    MyTabs,
    MyPane,
    ReverseFilter,
    MyCurrentTime,
    MyDraggable
  ])
class MyApp { }

// Manual bootstrap, with modules() converting module classes into an array of module names.
angular.bootstrap(document, modules(MyApp));
Please see angular-decorators.js to get detailed help on decorators.
Good, bad and good news.
- Good: recently a new version of the Saxon XSLT processor was published:
12 May 2016
Saxon 9.7.0.5 maintenance release for Java and .NET.
- Bad: we ran that release on our code base and found a bug:
See Internal error in Saxon-HE-9.7.0-5
- Good: Michael Kay has confirmed the problem and even fixed it:
See Bug #2770
- The only missing ingredient is when the patch will be available to the public:
"We tend to do a new maintenance release every 4-6 weeks. Can't commit to firm dates."
Our recent task required us to find all sets of non-intersecting rectangles for a list of rectangles.
At first glance it did not look like a trivial task. Just consider that for a list of N rectangles you can form 2^N different subsets. So even the result list can, theoretically, be enormous.
Fortunately, we knew that our result would be manageable in size. Nevertheless, suppose you have a list of a couple of hundred rectangles: how would you enumerate all the different sets of rectangles?
By the way, this task sounds like one of Google's interview questions. So you may try to solve it yourself before checking our solution.
We didn't even dare to think of a brute-force solution: to enumerate all sets and then check whether each one fits our needs.
Instead we used induction:
- Suppose S(N) is a solution of our task for N rectangles R(1) ... R(N), where S(N) is a set of sets of rectangles;
- Then S(N+1) contains the whole of S(N), the set {R(N+1)} consisting of a single rectangle, and those sets from S(N) that, combined with R(N+1), still fit the condition;
- S(0) is an empty set.
The algorithm was implemented in Java, and at first it used streams and recursion.
Then we figured out that we could use Stream.reduce or Stream.collect to implement the same algorithm. That second implementation was a little longer but probably faster, and besides it used standard idioms.
But then, as the last step, we reformulated the algorithm in terms of Collections.
Though the final implementation is the least similar to the original induction algorithm, it's straightforward and definitely the fastest among all the implementations we tried.
So, here is the code:
/**
 * For a sequence of items builds a list of matching groups.
 * @param identity an identity instance used for the group.
 * @param items original sequence of items.
 * @param matcher a group matcher of item against a group.
 * @param combiner creates a new group from a group (optional) and an item.
 * @return a list of matching groups.
 */
public static <T, G> List<G> matchingGroups(
  G identity,
  Iterable<T> items,
  BiPredicate<G, T> matcher,
  BiFunction<G, T, G> combiner)
{
  ArrayList<G> result = new ArrayList<>();

  for(T item: items)
  {
    int size = result.size();

    result.add(combiner.apply(identity, item));

    for(int i = 0; i < size; ++i)
    {
      G group = result.get(i);

      if (matcher.test(group, item))
      {
        result.add(combiner.apply(group, item));
      }
    }
  }

  return result;
}
The sample project on GitHub contains the implementation and tests of this algorithm.
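As an illustration, here is a sketch of how matchingGroups() might be applied to rectangles. It is not the code from the sample project: the Rect class, its intersects() test and the nonIntersectingSets() helper are our assumptions, and the algorithm itself is repeated only to keep the sketch self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.BiPredicate;

final class RectanglesDemo {
    /** Hypothetical axis-aligned rectangle covering [x1, x2) x [y1, y2). */
    static final class Rect {
        final int x1, y1, x2, y2;

        Rect(int x1, int y1, int x2, int y2) {
            this.x1 = x1; this.y1 = y1; this.x2 = x2; this.y2 = y2;
        }

        boolean intersects(Rect other) {
            return x1 < other.x2 && other.x1 < x2 && y1 < other.y2 && other.y1 < y2;
        }
    }

    /** The same algorithm as above, repeated to keep the sketch self-contained. */
    static <T, G> List<G> matchingGroups(
            G identity,
            Iterable<T> items,
            BiPredicate<G, T> matcher,
            BiFunction<G, T, G> combiner) {
        ArrayList<G> result = new ArrayList<>();
        for (T item : items) {
            int size = result.size();
            result.add(combiner.apply(identity, item));
            for (int i = 0; i < size; ++i) {
                G group = result.get(i);
                if (matcher.test(group, item)) {
                    result.add(combiner.apply(group, item));
                }
            }
        }
        return result;
    }

    /** Enumerates all non-empty sets of pairwise non-intersecting rectangles. */
    static List<List<Rect>> nonIntersectingSets(List<Rect> rects) {
        List<Rect> empty = new ArrayList<>();
        return matchingGroups(
            empty,
            rects,
            // a rectangle matches a group if it intersects none of its members
            (group, rect) -> group.stream().noneMatch(rect::intersects),
            // a new group is a copy of the old one plus the rectangle
            (group, rect) -> {
                List<Rect> next = new ArrayList<>(group);
                next.add(rect);
                return next;
            });
    }
}
```

A group here is simply a list of rectangles; copying the list in the combiner keeps the sketch short, while a persistent (linked) list would avoid the copying.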
8 Ways to Become a Better Coder is a good article. Read it and apply it to yourself, whatever your occupation is: just replace "coder" with your profession. It suits everybody who wants to be the best.
The Visitor pattern is often used to separate an operation from the object graph it operates on. Here we assume that the reader is familiar with the subject.
The idea is like this:
- The operation over the object graph is implemented as a type called Visitor. Visitor defines a method for each type of object in the graph, which is called while traversing the graph.
- Traversal of the graph is implemented by a type called Traverser, by the Visitor, or by each object type in the graph.
The implementation should collect, aggregate or perform other actions while visiting the objects in the graph, so that by the end of the visit the purpose of the operation is achieved.
Such an implementation is push-like: you create an operation object and call a method that takes the object graph as input and returns the operation result as output.
In the past we often dealt with big graphs (usually virtual graphs backed by a database or a file system).
Also, having strong experience in XSLT, we see that the visitor pattern in OOP maps directly onto the xsl:template and xsl:apply-templates technique.
Another thought was that in XML processing there are two camps:
- SAX (push-like) - those who process XML in callbacks, which is very similar to the visitor pattern; and
- XML Reader (pull-like) - those who pull XML components from a source, and then iterate and process them.
As with SAX vs XML Reader or, more generally, push vs pull processing models, neither is the best one. One or the other is preferable in particular circumstances. E.g. a pull-like component fits into a transformation pipeline where one pull component has another as its source; another example is when one needs to process two sources at once, which is nontrivial with the push-like model. On the other hand, push processing fits better into the Reduce part of the MapReduce pattern, where you need to accumulate results from a source.
So, our idea was to complement the classic push-like visitor pattern with an example of a pull-like implementation.
For the demonstration we have selected the Java language and the simplest boolean expression calculator.
Please follow GitHub nesterovsky-bros/VisitorPattern to see the detailed explanation.
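To give a feeling of the difference, here is a minimal sketch of both styles for a tiny boolean expression tree. It is not the code from the GitHub project: all types (Expr, Const, Not, And, PushEvaluator, PostOrderIterator, PullEvaluator) are hypothetical and written only for this illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.NoSuchElementException;

interface Visitor<R> {
    R visitConst(Const e);
    R visitNot(Not e);
    R visitAnd(And e);
}

interface Expr {
    <R> R accept(Visitor<R> visitor);
}

final class Const implements Expr {
    final boolean value;
    Const(boolean value) { this.value = value; }
    public <R> R accept(Visitor<R> visitor) { return visitor.visitConst(this); }
}

final class Not implements Expr {
    final Expr arg;
    Not(Expr arg) { this.arg = arg; }
    public <R> R accept(Visitor<R> visitor) { return visitor.visitNot(this); }
}

final class And implements Expr {
    final Expr left, right;
    And(Expr left, Expr right) { this.left = left; this.right = right; }
    public <R> R accept(Visitor<R> visitor) { return visitor.visitAnd(this); }
}

/** Push-like: the traversal calls back into the operation. */
final class PushEvaluator implements Visitor<Boolean> {
    public Boolean visitConst(Const e) { return e.value; }
    public Boolean visitNot(Not e) { return !e.arg.accept(this); }
    public Boolean visitAnd(And e) { return e.left.accept(this) && e.right.accept(this); }
}

/** Pull-like source: a lazy post-order iterator over the expression tree. */
final class PostOrderIterator implements Iterator<Expr> {
    private static final class Frame {
        final Expr node;
        boolean expanded;
        Frame(Expr node) { this.node = node; }
    }

    private final Deque<Frame> stack = new ArrayDeque<>();

    PostOrderIterator(Expr root) { stack.push(new Frame(root)); }

    public boolean hasNext() { return !stack.isEmpty(); }

    public Expr next() {
        if (stack.isEmpty()) throw new NoSuchElementException();
        while (true) {
            Frame top = stack.peek();
            if (top.expanded || top.node instanceof Const) {
                stack.pop();
                return top.node;
            }
            top.expanded = true; // children go first, the node itself afterwards
            if (top.node instanceof Not) {
                stack.push(new Frame(((Not) top.node).arg));
            } else {
                And and = (And) top.node;
                stack.push(new Frame(and.right));
                stack.push(new Frame(and.left));
            }
        }
    }
}

/** Pull-like consumer: asks the source for the next node and keeps its own value stack. */
final class PullEvaluator {
    static boolean evaluate(Expr root) {
        Deque<Boolean> values = new ArrayDeque<>();
        Iterator<Expr> nodes = new PostOrderIterator(root);
        while (nodes.hasNext()) {
            Expr node = nodes.next();
            if (node instanceof Const) {
                values.push(((Const) node).value);
            } else if (node instanceof Not) {
                values.push(!values.pop());
            } else {
                boolean right = values.pop();
                boolean left = values.pop();
                values.push(left && right);
            }
        }
        return values.pop();
    }
}
```

In the push variant the control flow belongs to the traversal, while in the pull variant the consumer decides when to take the next node, which is what makes it easy to chain such components or to read two sources at once.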
Until now we were aware of two excellent artificial intelligence frameworks written in C#: AForge.NET and its successor Accord.NET. Both include many algorithms for solving a wide range of tasks.
Yesterday we discovered that Microsoft has published its Computational Network Toolkit as an open-source project, in order to speed up advances in artificial intelligence and to make it available to a broader group of developers.
The sources are written in C++; the toolkit scales well and uses GPUs. This gives a competitive advantage to CNTK; see more details here.
Although the main aim of this development was speech recognition, CNTK contains a neural network framework that may be used for other artificial intelligence tasks.
It's a very old theme...
Many years ago we defined a .NET wrapper around the Windows Uniscribe API.
The Uniscribe API is used to render bidirectional languages like Hebrew, so it's important mainly here in Israel.
Once in a while we get requests from people for that API, so we have published it on GitHub at https://github.com/nesterovsky-bros/BidiVisualConverter.
You're welcome to use it!
The essence of the problem (see Error during transformation in Saxon 9.7, a thread on the forum):
- An XPath engine may arbitrarily reorder predicates whose expressions do not depend on the context position.
- While the XPath expression $N[@x castable as xs:date][xs:date(@x) gt xs:date("2000-01-01")] cannot raise an error if it is evaluated from left to right, an expression with reordered predicates, $N[xs:date(@x) gt xs:date("2000-01-01")][@x castable as xs:date], may raise an error when @x is not an xs:date.
To avoid a potential problem one should rewrite the expression like this: $N[if (@x castable as xs:date) then xs:date(@x) gt xs:date("2000-01-01") else false()].
Please note that the following rewrite will not work: $N[(@x castable as xs:date) and (xs:date(@x) gt xs:date("2000-01-01"))], as the arguments of an and expression can be evaluated in any order, and an error that occurs during the evaluation of any argument may be propagated.
With these facts in hand, we faced the task of checking our code base and fixing possible problems.
A search found ~450 instances of XPath expressions that use two or more consecutive predicates. Careful analysis narrowed this down to ~20 instances that should be rewritten. But then, all of a sudden, we decided to run an experiment: what if we split an XPath expression into two subexpressions? Can the error still resurface?
Consider:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xsl:variable name="elements" as="element()+"><a/><b value="c"/></xsl:variable>
<xsl:template match="/">
<xsl:variable name="a" as="element()*" select="$elements[self::d or self::e]"/>
<xsl:variable name="b" as="element()*" select="$a[xs:integer(@value) = 1]"/>
<xsl:sequence select="$b"/>
</xsl:template>
</xsl:stylesheet>
As we expected, Saxon 9.7 internally assembles a final XPath with two predicates and reorders them. As a result we get an error:
Error at char 20 in xsl:variable/@select on line 8 column 81 of Saxon9.7-filter_speculation.xslt:
FORG0001: Cannot convert string "c" to an integer
This turn of events greatly complicates the code review we have to carry out.
Michael Kay's answer to this example:
I think your argument that the reordering is inappropriate when the expression is written using variables is very powerful. I shall raise the question with my WG colleagues.
In fact we think that either reordering of predicates is inappropriate or (a weaker option that still allows reordering) an error during the evaluation of a predicate expression should be treated as false(). This is what is done in XSLT patterns. Other solutions make XPath less intuitive.
In other words, we should use XPath (the language) to express ideas, and the engine should implement them correctly and efficiently. We should not be forced to rewrite an expression to please the implementation.
On December 30 we opened a thread in the Saxon help forum that shows a stylesheet generating an error. This is the stylesheet:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xsl:variable name="elements" as="element()+"><a/><b value="c"/></xsl:variable>
<xsl:template match="/">
<xsl:sequence select="$elements[self::d or self::e][xs:integer(@value) = 1]"/>
</xsl:template>
</xsl:stylesheet>
We get an error:
Error at char 47 in xsl:sequence/@select on line 7 column 83 of Saxon9.7-filter_speculation.xslt:
FORG0001: Cannot convert string "c" to an integer
Exception in thread "main" ; SystemID: .../Saxon9.7-filter_speculation.xslt; Line#: 7; Column#: 47
ValidationException: Cannot convert string "c" to an integer
at
...
It's interesting that the error happens in Saxon 9.7 but not in earlier versions.
The answer we got was expected but disheartening:
The XPath specification (section 2.3.4, Errors and Optimization) explicitly allows the predicates of a filter expression to be reordered by an optimizer. See this example, which is very similar to yours:
The expression in the following example cannot raise a casting error if it is evaluated exactly as written (i.e., left to right). Since neither predicate depends on the context position, an implementation might choose to reorder the predicates to achieve better performance (for example, by taking advantage of an index). This reordering could cause the expression to raise an error.
$N[@x castable as xs:date][xs:date(@x) gt xs:date("2000-01-01")]
Following the spec, Michael Kay advises us to rewrite the XPath:
$elements[self::d or self::e][xs:integer(@value) = 1]
like this:
$elements[if (self::d or self::e) then xs:integer(@value) = 1 else false()]
Such subtleties make it hard to reason about XPath and to teach it. We doubt many people will spot the difference immediately.
We think that if this optimization was so important to the spec writers, they should have changed the filter rules to treat failed predicates as false(). This would avoid any obscure differences between these two otherwise equal expressions. In fact, something similar already exists with templates, where a failed evaluation of a pattern is treated as a non-match.
A colleague approached us with a question about how the Akinator engine might work.
To our shame, we had never heard of this amazing game before. To fill the gap we immediately started to play it, and identified it as a troubleshooting solver.
It took us a couple of minutes to come up with a brilliant solution: "We just need to google and find the engine on the internet".
Unfortunately, this led nowhere, as Akinator itself is not open source, and no other good-quality open source solutions are available.
After another hour we had two more ideas:
- The task should fit into SQL;
- The task is a good candidate for a neural network.
In fact, the first might be required to train the second, so we decided to formalize the problem in terms of SQL, while still keeping a neural network in mind.
With this goal we created a GitHub project. Please see the algorithm and its implementation at github.com/nesterovsky-bros/KB.