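If you have a string variable $value as xs:string, and want to know whether it starts with a digit, what's the best way to do it in XPath?
Our answer is: ($value ge '0') and ($value lt ':'). It works because ':' is the character that immediately follows '9' in code point order, so, as long as the collation in effect compares by code points, only strings that begin with 0 to 9 fall into this range.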
Looks a little funny (and disturbing).
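To see why the range test works, here is a minimal Java sketch of the same comparison. It assumes plain code unit ordering (which is what String.compareTo gives), standing in for an XPath collation that compares by code points; the class and method names are made up for illustration.

```java
final class StartsWithDigit {
    // Mirrors ($value ge '0') and ($value lt ':') with code unit ordering:
    // ':' (U+003A) directly follows '9' (U+0039).
    static boolean startsWithDigit(String value) {
        return value.compareTo("0") >= 0 && value.compareTo(":") < 0;
    }

    public static void main(String[] args) {
        String[] samples = {"", "0", "42abc", "9", "abc", ":", "/1"};
        for (String s : samples) {
            System.out.println("'" + s + "' -> " + startsWithDigit(s));
        }
    }
}
```

The empty string sorts before '0', so it is rejected without a separate length check.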
In our project we generate a lot of xml files, which are subject to manual
changes and to repeated generation (often with slightly different generation
options). The life cycle of such an xml file can be described as follows:
- generate original xml (version 1)
- manual changes (version 2)
- next generation (version 3)
- manual changes integrated into the new generation (version 4)
If these were regular text files, we could use the diff utility to prepare
a patch between versions 1 and 2, and apply it with the patch utility to
version 3. Unfortunately, xml has additional semantics compared to plain text. What's an
invariant or a simple modification in xml is often a drastic change in text.
diff/patch does not work well for us. We need an xml diff
and patch.
The first idea is to google it! Not so simple.
We have failed to find a tool or an API that can be used from ant. There are a
lot of GUIs that show xml differences and help with a manual merge, and tools that do
things similar to, but different from, what we need
(like MS's xmldiffpatch).
Please point us to such a program!
In the meantime, we need to proceed. We don't believe that such a tool can be
knocked together in a hurry, as it's a task that is both heuristic and mathematical,
requiring a careful design and good statistics on the use cases. Our idea
is to exploit
diff/patch. To achieve this we're going to
normalize the xml files before diff to remove redundant
invariants, and to normalize the result after patch to return it to a readable form.
This includes:
- ordering attributes by their names;
- replacing insignificant whitespace with line breaks;
- inserting line breaks after element names and before attributes, after
an attribute name and before its value, and after an attribute value.
This way we expect to get files that react to modifications similarly to text
files.
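The normalization before diff might be sketched in Java roughly as follows. This is only an illustration under stated assumptions, not the tool we use: the class name XmlLineNormalizer is hypothetical, leading and trailing whitespace of every text node is treated as insignificant, and comments and processing instructions are simply dropped.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class XmlLineNormalizer {
    /** Returns the document with one token per line and attributes sorted by name. */
    static String normalize(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setCoalescing(true); // merge CDATA sections into text nodes
            Element root = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
            StringBuilder out = new StringBuilder();
            write(root, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse xml", e);
        }
    }

    private static void write(Node node, StringBuilder out) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            // Assumption: surrounding whitespace is insignificant.
            String text = node.getNodeValue().trim();
            if (!text.isEmpty()) {
                out.append(escape(text)).append('\n');
            }
            return;
        }
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return; // comments and processing instructions are skipped in this sketch
        }
        Element element = (Element) node;
        out.append('<').append(element.getTagName()).append('\n');
        // A TreeMap keeps the attributes ordered by name.
        Map<String, String> attributes = new TreeMap<String, String>();
        NamedNodeMap map = element.getAttributes();
        for (int i = 0; i < map.getLength(); i++) {
            attributes.put(map.item(i).getNodeName(), map.item(i).getNodeValue());
        }
        for (Map.Entry<String, String> attribute : attributes.entrySet()) {
            out.append(attribute.getKey()).append("=\n");
            out.append('"').append(escape(attribute.getValue())).append("\"\n");
        }
        out.append(">\n");
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            write(child, out);
        }
        out.append("</").append(element.getTagName()).append(">\n");
    }

    private static String escape(String value) {
        return value.replace("&", "&amp;").replace("<", "&lt;").replace("\"", "&quot;");
    }
}
```

Two files that differ only in attribute order or indentation normalize to the same text, so diff reports a changed attribute value as a single changed line. The reverse step (back to a readable layout after patch) is not shown.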
Ladies and gentlemen of the jury, exhibit number one is what the seraphs, the misinformed, simple, noble-winged seraphs, envied. Look at this most ancient program:
See origin
At present the C# serializer knows how to print comments and do some formatting (we had to create a micro xml serializer within xslt to serialize xml comments). The C# formatting is not as advanced as the java one, but it does not need to be, as C# text tends to be neater thanks to properties and events. Compare:
Java: instance.getItems().get(10).setValue(value);
vs
C#: instance.Items[10].Value = value;
TODO: implement API existing in jxom and missing in C# xom. This includes:
- name normalization - rewriting the tree to make names unique (duplicate names often appear during generation from code templates);
- namespace normalization - rewriting the tree to elevate type namespaces (during generation, types are usually fully qualified);
- unreachable code detection - optional feature (in java it's required, as unreachable code is an error, while in C# it's only a warning);
- compile time expression evaluation - optional feature used in code optimization and in reachability checks;
- state machine refactoring - not sure, as C# has the yield statement that does a similar thing.
Update can be found at: jxom/C# xom.
June 24 update: name and namespace normalizations are implemented.
Writing a language serializer is as easy as riding a bicycle. Once you have learned it, creating a new one no longer takes any mental effort.
It still takes considerable mechanical effort to write and test things.
Well, this is the first draft of the C# xslt serializer. The archive contains both C# xom and jxom.
Note: comments are not supported yet; nothing is done to format code except line wrapping.
Today my son dragged an old book out into the light of day. It immediately fell open at this verse:
Any trifle can become the main business of your life.
You just need to believe firmly that nothing more important can be achieved. And then nothing will prevent you, gasping with delight, from engaging in this nonsense.
Unfortunately, too often these facetious verses of Gregory Oster come true.
I've read some popular science about DNA, RNA, proteins, cells,
prokaryotes and eukaryotes: their structures, roles, operational principles,
evolution.
All the computer technologies and robotics seem like childish babbling
compared to microbiology and molecular biology.
I wish I were so open minded, and had a brain so capable, with an infinite capacity
for work (and lived a life long enough), as to push and break through the borders of human knowledge.
Ah, I envy the Renaissance people who were capable of holding and driving both
science and art; this contrasts so much with contemporary specialization.
Well, it's no longer just jxom but also csharpxom!
Project needs required us to create a C# 3.0 xml schema.
Shortly we expect to create an xslt that serializes an xml document in this schema into text. Thanks to the original design we can reuse the java streamer almost without changes.
A fact: the C# schema is more than twice as big as the java one.
Today I found a C++0x FAQ by Bjarne Stroustrup reviewing most of the new features that we shall see in the next version.
A good insight for those who don't track the WG progress.
But what caught my attention is a passage:
Sounds rather pessimistic to my taste.
There is a nice
ServiceLoader API in java 6 implementing the service provider idiom. It's a good
(good because it's standard) class that resolves interface implementations using
the META-INF/services location.
Unfortunately, there is not even a JSR implementation of this class for java 5. This
makes it impossible for us to use it.
What a nuisance!
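For those who can run on java 6 or later, the idiom looks roughly like the sketch below. The Greeter interface and the class name are hypothetical and exist only for illustration; ServiceLoader.load is the standard call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

final class ServiceLoaderSketch {
    /** Hypothetical service interface, for illustration only. */
    public interface Greeter {
        String greet(String name);
    }

    /** Collects every provider of the service visible to the context class loader. */
    static <T> List<T> providers(Class<T> service) {
        List<T> result = new ArrayList<T>();
        for (T provider : ServiceLoader.load(service)) {
            result.add(provider);
        }
        return result;
    }

    public static void main(String[] args) {
        // Providers are listed, one class name per line, in a resource named
        // META-INF/services/<binary name of the service interface>.
        // Without such a resource on the class path the list is empty.
        System.out.println("providers found: " + providers(Greeter.class).size());
    }
}
```

An implementation is registered by packaging it in a jar together with that resource file; the calling code never names the implementation class.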
We honour the memory of our grandfathers and grandmothers who fought in that
cruel war. One of our grandfathers fell in that war; the other grandfather and our grandmothers survived and
lived long lives.
Time is relentless: they have left this world, but we shall keep them and
their deeds in memory.
Yesterday we found an article,
"Repackaging Saxon". It's about a decision to move away from the Saxon-B/Saxon-SA
packaging to a more conventional product line: Home/Professional/Enterprise
Editions.
The good news is that Saxon stays open source. That's most important, as the
open community spirit will be preserved. On the other hand, the Professional and
Enterprise Editions will not be free.
In this regard the most interesting comments are:
John Cowan> I suspect that providing packaging only for $$ (or pounds or euros) won't actually work, because someone else will step in and provide that packaging for free, as the licensing permits.
and response:
Michael Kay> This will be interesting to see. I'm relying partly on the idea that there's a fair degree of trust, and expectation of support, associated with Saxonica's reputation, and that the people who are risking their business on the product might be hesitant to rely on third parties, who won't necessarily be prompt in issuing maintenance releases etc; at the same time, such third parties may serve the needs of the hobbyists who are the real market for the open source version.
and also:
Michael Kay> ...I haven't been able to make a model based on paid services plus free software work for me. It's hard enough to get the services business; when you do get it, it's hard to get enough revenue from it to fund the time spent on developing and supporting the software. Personally, I think the culture of free software has gone too far, and it is now leading to a lack of investment in new software...
Sunny> Look what I have found! Consider this C#:
public class T
{
  public T free;
}
public void NewTest()
{
  T cache = new T();
  Stopwatch timer = new Stopwatch();
  timer.Reset();
  timer.Start();
  for(int i = 0; i < 10000000; ++i)
  {
    // Get from cache.
    T t;
    if (cache.free == null)
    {
      cache.free = new T();
    }
    t = cache.free;
    // Release
    cache.free = t;
    t = null;
  }
  timer.Stop();
  long cacheTicks = timer.ElapsedTicks;
  timer.Reset();
  timer.Start();
  for(int i = 0; i < 10000000; ++i)
  {
    new T();
  }
  timer.Stop();
  long newTicks = timer.ElapsedTicks;
  Console.WriteLine("cache: {0}, new: {1}", cacheTicks, newTicks);
}
Gloomy> And?
Sunny> Tests show that new T() is almost as fast as
caching! GC's "new" probably has a fast path, where it atomically shifts the
free memory border, so an allocation takes just several cycles.
Gloomy> Well, you're probably right, there is a fast path. I, however,
have a different opinion. To track references, a generational garbage collector
implements a reference field assignment as a call rather than a mov.
Besides the move itself, this routine marks the touched memory page in a special card
table (who said GC is cheap?); thus, I think, a reference field setter is
almost as slow as the "new" call.
.Net is known for its array covariance. That means that an array of a derived
reference type can be cast to an array of its base type:
public class T: B
{
}
T[] tlist = ...
B[] blist = tlist;
This feature comes at a cost:
B b = ...
T t = ...
blist[0] = b; // This effectively is: blist[0] = (T)b;
tlist[0] = t; // This is the same: tlist[0] = (T)t;
We pay the cost of an additional cast for nothing. Let this dubious design decision weigh on the conscience of the .Net/Java inventors.
You can eliminate the cast. Just use an array of structs:
struct S<T>
{
  public T t;
}
S<T>[] slist = ...
slist[0].t = t; // Works without cast.
Measurements show that S<T>[] is ~35% faster than T[] on write, and slower (JIT could do better) on read.
Well, an ugly workaround for an ugly design.
P.S. In java there is no relief...
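The java side of the same story can be shown with a small sketch (the class and method names are made up for illustration). Every store into a reference array is checked against the array's real element type, and a mismatch surfaces as an ArrayStoreException:

```java
final class ArrayStoreCheck {
    /** Returns true if the runtime type check rejected the store. */
    static boolean storeIsRejected(Object[] array, Object value) {
        try {
            array[0] = value; // checked against the array's actual element type
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Object[] objects = new String[1]; // covariance: a String[] is an Object[]
        System.out.println(storeIsRejected(objects, "text"));
        System.out.println(storeIsRejected(objects, Integer.valueOf(1)));
    }
}
```

The first store succeeds and the second is rejected, which is exactly the check the struct wrapper avoids in .Net.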