Yesterday, I've installed IE8.
Looks better here and there.
Today, I'm shocked!
I've reopened my web mail and it remembered the session. It keeps session cookies after closing IE8 instance!
I did not believe to myself and logged into an another web application, and then opened another IE8 instance. What do you think? - It shares the session between instances!
That is a serious security problem.
It prevents me from opening two sessions of a web application on my computer.
P.S. we have found that this problem was already discussed. See IE8 handles sessions/cookies different than IE7 - big trouble for - ...
Someone needs a brain surgery...
Quick solution: run IE8 with -nomerge command line option.
We'd like to return to the binary tree algorithms and spell what you cannot
do with generics in C#. Well, you can do many things, however with generalization
penalty.
Consider a binary tree node: Node(Parent, Left, Right) . RB, AVL, and
others algorithms attach some private information to this node to perform
balancing.
You can express this idea methematically (and in C++), you cannot implement it efficiently in C#.
More focused example. Consider RB tree: Node(Parent, Left, Right, Color) .
There are a number of ways you may implement the internal structure of the tree.
Algorithms themselves stay the same.
Straightforward implementation:
class Node
{
Node Parent;
Node Left;
Node Right;
bool Color;
}
This implementation allocates nodes in the heap and each node refers to other
nodes.
Node navigator implementation:
class Node
{
Node Left;
Node Right;
bool Color;
}
struct NodeNavigator
{
Node[] nodes;
int index;
}
Node does not refer to the parent. This reduces the memory consumption and
simplifies object graph, which is good for GC. Tree is walked using a node
navigator, which stores ancestors of the node.
Node as a structure:
struct Node
{
int Parent;
int Left;
int Right;
bool Color; // This might be integrated as highest bit of parent.
}
Tree is stored as an array of nodes. This is compact and GC efficient
implementation.
Node as a structure, and with node navigator:
struct Node
{
int Left;
int Right;
bool Color; // This might be integrated as highest bit of left.
}
struct NodeNavigator
{
Tree tree;
int[] nodes;
int index;
}
Tree is stored as an array of nodes, and a navigator is used to walk it. This is the most compact implementation.
Each implementation has its virtues. The common between implementations is that
they share the same balancing and navigation algorithms. Storage
differences prevent a single C# implementation. To the contrast, C++ allows to
define a concept "tree" and to define specializations of this concept, allowing
a unified algorithms; all this is done without performance penalty.
P.S. java in this regard, is almost alternativeless...
Recently, we have started looking into a task of creating an interactive parser. A generic one.
Yes, we know there are plenty of them all around, however the goals we have
defined made us to construct the new implementation.
The goals:
- Parser must be incremental.
You should direct what to parse, and when to stop. This virtually demands rather "pull" than conventional "push"
implementation.
- Parser must be able to synchronize a tree with text.
Whenever the underlying text is changed, a limited part of a tree should
to be updated.
- Parser should be able to recover from errors, and continue parsing.
- Parser should be manageable.
This is a goal of every program, really.
- Parser must be fast.
- A low memory footprint is desired.
What's implemented (VS2008, C#) and put at SourceForge, is called an
Incremental Parser.
These are parser's insights:
- Bookmarks are objects to track text points. We use a binary tree (see
Bare binary tree algorithms)
to adjust positions of bookmarks when text is changed.
- Ranges define parsed tree elements. Each range is defined by two
bookmarks, and a grammar annotation.
- There are grammar primitives, which can be composed into a grammar graph.
- A grammar graph along with ranges form a state machine.
- Grammar chains are cached, turning
parsing into a series of probes of literal tokens and transitions between
grammar chains. This caching is done on demand, which results in warming-up effect.
- Parser itself includes a random access tokenizer, and a queue of ranges
pending to be parsed.
- Parsing is conducted as a cycle of pulling and parsing of pending ranges.
- Whenever text is changed a closest range is queued for the reparsing.
- A balance between amount of parsing and memory consumption can be achieved
through a detalization of grammar annotation for a range. An active text
fragment can be fully annotated, while for other text parts a coarse range can
be stored.
We have defined
xpath like grammar to test our ideas. See printed
parsed trees to get understanding of what information can be seen from
ranges.
Do you agree that binary trees and algorithms that keep trees reasonably balanced
are important?
Our answer is yes!
It's interesting enough, however, that you won't easily find these algorithms
publicly available.
Though red-black,
AVL and other algorithms
described in the wikipedia are defined in terms of tree manipulation, all
implementations we have seen, deal with trees annotated with keys and values.
These implementations really use tree balancing algorithms behind the schene,
and expose a commonplace set or map containers to a client. Even
C++ Standard
Library suffers from this disease.
We think that binary trees are valuable independent concepts, and they worth to
be implemented separately, at least because there are other algorithms, except
sets and maps, using trees.
And well, we did it in C#! See
RedBlackTree.cs.
Consider an example - a simple scheduler,
ScheduleBookmark.cs, with operations:
- schedule an action;
- remove an action from the schedule;
- enumerate actions;
- find a date, an action is scheduled for;
- find an action (or at least closest one) for a specified date;
- postpone actions due to delays;
A balanced binary tree allows efficient implementation of such a scheduler. Tree
node stores an action, and a time span between parent node and this node.
This way:
Operation |
Steps |
schedule an action |
find place + link node + rebalance tree |
remove an action from the schedule |
unlink node + rebalance tree |
enumerate actions |
navigate tree |
find a date, an action is scheduled for |
find node in tree |
find an action for a specified date |
cumulate time spans up to the tree root |
postpone actions due to delays |
fixup time spans from a node up to the tree root |
Compare operation complexities between tree, array, list and map:
Operation |
Tree |
Array |
List |
Map |
schedule an action |
O(ln(N)) |
O(N) |
O(N) |
O(ln(N)) |
remove an action from the schedule |
O(ln(N)) |
O(N) |
O(1) |
O(ln(N)) |
enumerate actions |
O(ln(N)) |
O(1) |
O(1) |
O(ln(N)) |
find a date, an action is scheduled for |
O(ln(N)) |
O(1) |
O(1) |
O(1) |
find an action for a specified date |
O(ln(N)) |
O(ln(N)) |
O(N) |
O(ln(N)) |
postpone actions due to delays |
O(ln(N)) |
O(N) |
O(N) |
O(N*ln(N)) |
Complexity of each operation for the tree is O(ln(N)). No arrays, lists, or maps achieve similar worst case guaranty.
Finally, the test program is
Program.cs,
and a whole project (VS2008) is
Tree.zip
Could you think of a C# method accepting an ancestor, and
forbidding a descendant of a class at compile time?
The answer to this probably is: why do you need such a reptile.
Well, I don't. I didn't meant to create such a method, but generics help a lot!
public class BinaryTreeNode<Node>
where Node: BinaryTreeNode<Node>
{
public Node parent;
public Node left;
public Node right;
}
public class MyNode: BinaryTreeNode<MyNode>
{
public int key;
}
public class MyRoot: MyNode
{
}
public class Test
{
public void test()
{
MyRoot root = new MyRoot();
// print((MyNode)root); // This works.
print(root); // This does not work.
}
private static void print<T>(T node)
where T: BinaryTreeNode<T>
{
Console.WriteLine("print me");
}
}
By the way, BinaryTreeNode is an "abstract" class, as you cannot instantiate it but inherit only.
A simple demand nowdays - a good IDE.
Almost a ten years have passed since xslt has appeared but still, we're not pleased with IDEs
claiming xslt support. Our expectaions are not too high. There are things however, which must be present in such
an IDE.
- A notion of project, and possibly a group of projects. You may think of
it as a main xslt including other xslts participationg in the project.
- A code completion. A feature providing typing hints for language constructs, includes,
prefixes, namespaces, functions, templates, modes, variables, parameters, schema
elements, and other (all this should work in a context of the project).
- A code refactoring. A means to move parts of code between (or inside) files and projects,
rename things (functions, templates, parameters, variables, prefixes,
namespaces, and other).
- Code validation and run.
- Optional debug feature.
We would be grateful if someone had pointed to any such IDE.
Once upon a time, we created a function mimicking
decapitalize() method defined in java in java.beans.Introspector. Nothing
special, indeed. See the source:
/**
* Utility method to take a string and convert it to normal Java variable
* name capitalization. This normally means converting the first
* character from upper case to lower case, but in the (unusual) special
* case when there is more than one character and both the first and
* second characters are upper case, we leave it alone.
* <p>
* Thus "FooBah" becomes "fooBah" and "X" becomes "x", but "URL" stays
* as "URL".
*
* @param name The string to be decapitalized.
* @return The decapitalized version of the string.
*/
public static String decapitalize(String name) {
if (name == null || name.length() == 0) {
return name;
}
if (name.length() > 1 && Character.isUpperCase(name.charAt(1)) &&
Character.isUpperCase(name.charAt(0))){
return name;
}
char chars[] = name.toCharArray();
chars[0] = Character.toLowerCase(chars[0]);
return new String(chars);
}
We typed implementation immediately:
<xsl:function name="t:decapitalize" as="xs:string">
<xsl:param name="value" as="xs:string?"/>
<xsl:variable name="c" as="xs:string"
select="substring($value, 2, 1)"/>
<xsl:sequence select="
if ($c = upper-case($c)) then
$value
else
concat
(
lower-case(substring($value, 1, 1)),
substring($value, 2)
)"/>
</xsl:function>
It worked, alright, until recently, when it has fallen to work, as the output was
different from java's counterpart.
The input was W9Identifier. Function naturally returned the same value, while
java returned w9Identifier. We has fallen with the assumption that
$c = upper-case($c) returns true when character is an upper case letter. That's
not correct for numbers. Correct way is:
<xsl:function name="t:decapitalize" as="xs:string">
<xsl:param name="value" as="xs:string?"/>
<xsl:variable name="c" as="xs:string"
select="substring($value, 2, 1)"/>
<xsl:sequence select="
if ($c != lower-case($c)) then
$value
else
concat
(
lower-case(substring($value, 1, 1)),
substring($value, 2)
)"/>
</xsl:function>
100% Agree.
> Basically we work for clients and if they ask that we need
> this output from this input then we don't thing about it and
> get the result.
I guess the definition of professionalism is that if you're a professional,
you advise the client when he gets the requirements wrong, and if you're
not, you build whatever rubbish he asks for.
Michael Kay
http://www.saxonica.com/
The last year we were working on a project, which in essence dealt with
transformation of graphs. Our experience with xslt 1.0, and other available information
was promising - xslt 2.0 is a perfect match.
We were right, xslt 2.0 fitted very well to the problem.
It's easy to learn xslt 2.0/xquery: be acquainted with xml schema; read through
a syntax, which is rather concise; look at examples, and start coding. API you
will learn incrementally.
The same as other languages, xslt 2.0 is only a media to express algorithms. As such
it fills its role rather good, as good as SQL:2003 and its variations do, and
sometimes even better than other programming languages like C++ do.
Compare expressions "get data satisfying to a specific criteria" and "for each
data part check a specific condition, and if it true, add it to the result".
These often represent the same idea from two perspectives: human (or math)
thinkning; and thinking in terms of execution procedure.
Both kinds of expressions have their use, however it has happened so that we're
the human beings and perceive more easily natural language notions like:
subjects, objects, predicates, deduction, induction and so on. I think the
reason is that a human's (not positronic) brain grasps ideas, conceptions,
images as something static, while execution procedure demands a notion of time
(or at least notions of a sequence and an order) for the comprehension. ("Are you serious?", "Joke!" )
There is the other side to this story.
We have made the project design in relatively short terms. A good scalable
design. We needed people who know xslt 2.0 to implement it. It has turned out, this was a strong objection against xslt!
Our fellow, xslt guru, Oleg
Tkachenko has left our company to make his career at Microsoft, and to our disbelief it
was impossible to find a person who was interested in a project involvong 85% of
xslt and 15% of other technologies including java. Even in java world people prefer routine projects, like
standard swing or web application, to a project demanding creativeness.
Possibly, it was our mistake, to allow to our company to look for developers the standard
way: some secretary was looking through her sources, and inevitably was finding
so-so java + poor xml + almost zero xslt knowledge graduates. We had to make
appeals on xslt forums especially since the project could be easily developed
with a distributed group.
Finally, we have designed and implemented the project by ourselves but to the
present day our managers are calling and suggesting java developers
for our project. What a bad joke!
Just for fun I've created exslt2.xslt and exslt2-test.xslt to model concepts
discussed at EXSLT 2.0
forum. I did nothing special but used tuple as reference, and also I've
defined f:call() to make function call indirectly.
<?xml version="1.0" encoding="utf-8"?>
<!--
exslt 2 sketches.
-->
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:f="http://exslt.org/v2"
xmlns:t="this"
xmlns:p="private"
exclude-result-prefixes="xs t f">
<xsl:include href="exslt2.xslt"/>
<xsl:template match="/" name="main">
<root>
<xsl:variable name="refs" as="item()*" select="
for $i in 1 to 20 return
f:ref(1 to $i)"/>
<total-items>
<xsl:sequence select="
sum
(
for $ref in $refs return
count(f:deref($ref))
)"/>
</total-items>
<sums-per-ref>
<xsl:for-each select="$refs">
<xsl:variable name="index" as="xs:integer" select="position()"/>
<sum
index="{$index}"
value="{sum(f:deref(.))}"/>
</xsl:for-each>
</sums-per-ref>
<add>
<xsl:text>1 + 2 = </xsl:text>
<xsl:sequence select="f:call(xs:QName('t:add'), (1, 2))"/>
</add>
</root>
</xsl:template>
<xsl:function name="t:add" as="xs:integer">
<xsl:param name="arguments" as="xs:integer+"/>
<xsl:variable name="first" as="xs:integer" select="$arguments[1]"/>
<xsl:variable name="second" as="xs:integer" select="$arguments[2]"/>
<xsl:sequence select="$first + $second"/>
</xsl:function>
</xsl:stylesheet>
Code can be found at saxon.extensions.9.1.zip.
We have created
Java Xml Object Model purely for purposes of our project. In fact jxom at
present has siblings: xml models for sql dialects. There are also different APIs
like name normalizations, refactorings, compile time evaluation.
It turns out that jxom is also good enough for other developers.
The drawback of jxom, however, is rather complex xml schema. It takes time to understand it. To simplify things
we have created (and planning to create more) a couple of examples allowing to feel
how jxom xml looks like.
The latest version can be loaded from
jxom.zip
We would be pleased to see more comments on the subject.
Although in last our projects we're using more Java and XSLT, we always compare Java and .NET features. It's not a secret that in most applications we may find cache solutions used to improve performance. Unlike .NET providing a robust cache solution Java doesn't provide anything standard. Of course Java's adept may find a lot of caching frameworks or just to say: "use HashMap (ArrayList etc.) instead", but this is not the same.
Think about options for Java:
1. Caching frameworks (caching systems). Yes, they do their work. Do it perfectly. Some of them are brought to the state of the art, but there are drawbacks. The crucial one is that for simple data caching one should use a whole framework. This option requires too many efforts to solve a simple problem.
2. Collection classes (HashMap , ArrayList etc.) for caching data. This is very straightforward solution, and very productive. Everyone knows these classes, nothing to configure. One should declare an instance of such class, take care of data access synchronization and everything starts working immediately. An admirable caching solution but for "toy applications", since it solves one problem and introduces another one. If an application works for hours and there are a lot of data
to cache, the amount of data grows only and never reduces, so this is the reason why such caching is very quickly surrounded with all sort of rules that somehow reduce its size at run-time. The solution very quickly lost its shine and become not portable, but it's still applicable for some applications.
3. Using Java reference objects for caching data. The most appropriate for cache solution is a java.util.WeekHashMap class. WeakHashMap works exactly like a hash table but uses weak references internally. In practice, entries in the WeakHashMap are reclaimed at any time if they are not refered outside of map. This caching strategy
depends on GC's whims and is not entirely reliable, may increase a number of cache misses.
We've decided to create our simple cache with sliding expiration of data.
One may create many cache instances but there is only one global service that tracks expired objects among these instances:
private Cache<String, Object> cache = new Cache<String, Object>();
There is a constructor that specifies an expiration interval in milliseconds for all cached objects:
private Cache<String, Object> cache = new Cache<String, Object>(15 * 60 * 1000)
Access is similar to HashMap :
instance = cache.get("key"); and cache.put("key", instance);
That's all one should know to start use it. Click here to download the Java source of this class. Feel free to use it in your applications.
Recently, working on completely different thing, I've realized that one may create a
"generator", function returning different values per each call. I was somewhat
puzzled with this conclusion, as I thought xslt functions have no side effects,
and for the same arguments xslt function returns the same result.
I've confirmed the conclusion at the forum. See
Scope of uniqueness of generate-id().
In short:
- each node has an unique identity;
- function in the course of work creates a temporary node and produces a result
depending on identity of that node.
Example:
<xsl:stylesheet version="2.0"
xmlns:f="data:,f"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<xsl:message select="
for $i in 1 to 8 return
f:fun()"/>
</xsl:template>
<xsl:function name="f:fun" as="xs:string">
<xsl:variable name="x">!</xsl:variable>
<xsl:sequence select="generate-id($x)"/>
</xsl:function>
</xsl:stylesheet>
The next thought was that if you may create a generator then it's easy to create
a good random number generator (that's a trivial math task).
Hey gurus, take a chance!
Yesterday I've read of a new Garbage Collection implementation
G1.
To be honest I was not impressed.
I think Garbage Collection is an evil, or at least its present implementations.
I do not believe in algorithms that in their very core assume a centralized
execution.
On the other hand it's clear it's not in my power to change the status quo. My
lot is to give advices mostly incompetent and ignorable.
I'm waiting for the time when someone will reach the idea to bring some parts of
GC logic out of runtime scope. This will require more VM intelligence,
however will bear its fruits.
JIT or compiler during a static analysis may prove that some objects being
collected may make some of their referring objects unreachable, provided it can
prove that referring objects are not reachable through the other means (e.g.
private field which is not stored in other places). This is close to the ideas
expressed in
Muse on value types in java. It's possible to prepare a garbage graph in
advance before runtime.
In many cases it's also possible to prove that when method's variable goes out
of scope it's not reachable through the other means and may be collected. This
allows to implement a stage of automatic garbage collection when objects that
are proven to be a garbage be immedeately added to a free memory set.
As an example I'm thinking of java's ArrayList object which stores private
array. When ArrayList is reclaimed or resized a reference to the private array
is getting lost and memory can be added to the free set immediately.
This mechanics being integrated as the first stage of GC will make it less
centralized, as I believe many objects will be collected this way.
Suppose you have constructed a sequence of attributes.
How do you access a value of attribute "a"?
Simple, isn't it? It has taken a couple of minutes to find a solution!
<xsl:variable name="attributes" as="attribute()*">
<xsl:apply-templates mode="t:generate-attributes" select="."/>
</xsl:variable>
<xsl:variable name="value" as="xs:string?"
select="$attributes[self::attribute(a)]"/>
|