# Wednesday, 24 June 2009

In our project we generate a lot of XML files, which are subject to manual changes and to repeated regeneration (often with slightly different generation options). The life cycle of such an XML file can be described as follows:

  1. generate original xml (version 1)
  2. manual changes (version 2)
  3. next generation (version 3)
  4. manual changes integrated into the new generation (version 4)

If these were regular text files, we could use the diff utility to prepare a patch between versions 1 and 2, and apply it with the patch utility to version 3. Unfortunately, XML has additional semantics compared to plain text. What is an invariant or a simple modification in XML is often a drastic change in the text. diff/patch does not work well for us. We need an XML diff and patch.
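To illustrate the mismatch, here is a small Python sketch (not part of our tooling; the sample elements and the helper `same_xml` are made up for this example). Two versions of an element differ only in attribute order, so they carry the same XML content, yet a line-based diff reports the whole line as changed.

```python
import difflib
import xml.etree.ElementTree as ET

# Two hypothetical versions of the same element: only the attribute order differs.
v1 = '<item id="1" name="a"/>'
v2 = '<item name="a" id="1"/>'

# A plain text diff sees the line as removed and re-added.
text_diff = list(difflib.unified_diff([v1], [v2], lineterm=""))


def same_xml(a, b):
    """Compare two elements, ignoring attribute order and surrounding whitespace."""
    return (
        a.tag == b.tag
        and a.attrib == b.attrib  # dict comparison is order-insensitive
        and (a.text or "").strip() == (b.text or "").strip()
        and len(a) == len(b)
        and all(same_xml(x, y) for x, y in zip(a, b))
    )


xml_equal = same_xml(ET.fromstring(v1), ET.fromstring(v2))

print("\n".join(text_diff))
print("same XML content:", xml_equal)
```

The text diff is non-empty while the element comparison reports no difference, which is exactly the kind of noise that gets in the way of diff/patch.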

The first impulse is to google it. Not so simple: we have failed to find a tool or an API that can be used from ant. There are a lot of GUIs that show XML differences and support a manual merge, and tools that do something similar to, but different from, what we need (like MS's xmldiffpatch).

Please point us to such a program!

In the meantime, we need to proceed. We don't believe that such a tool can be knocked together in a hurry, as the task is heuristic and mathematical at the same time, and requires careful design and good statistics on the use cases. Our idea is to exploit diff/patch. To achieve this, we're going to normalize the XML files before the diff, to remove textual variations that are invariants in XML, and to normalize the result after the patch to return it to a readable form. This includes:

  • ordering attributes by their names;
  • replacing insignificant whitespace with line breaks;
  • inserting line breaks after element names and before attributes, after an attribute name and before its value, and after an attribute value.
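The steps above could be sketched roughly as follows. This is only an illustration in Python, not the tool we ended up with; the names `normalize_xml` and `_emit` are made up. It assumes documents without namespaces, comments or processing instructions (the parser used here drops the latter two), and it treats whitespace around text as insignificant, which is a guess that will not hold for every document.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def _emit(elem, out):
    """Append the tokens of elem and its subtree to out, one token per line."""
    out.append("<" + elem.tag)            # line break after the element name
    for name in sorted(elem.attrib):      # attributes ordered by name
        out.append(name + "=")            # line break after the attribute name
        out.append(quoteattr(elem.attrib[name]))  # value on its own line
    text = (elem.text or "").strip()      # assumption: outer whitespace is insignificant
    if len(elem) == 0 and not text:
        out.append("/>")
        return
    out.append(">")
    if text:
        out.append(escape(text))
    for child in elem:
        _emit(child, out)
        tail = (child.tail or "").strip()
        if tail:
            out.append(escape(tail))
    out.append("</" + elem.tag + ">")


def normalize_xml(source):
    """Return a diff-friendly form of the XML text in source."""
    lines = []
    _emit(ET.fromstring(source), lines)
    return "\n".join(lines) + "\n"


a = normalize_xml('<a x="1" b="2"><c/></a>')
b = normalize_xml('<a b="2"   x="1">\n  <c />\n</a>')
print(a)
print("identical after normalization:", a == b)
```

Both inputs normalize to the same nine lines (`<a`, `b=`, `"2"`, `x=`, `"1"`, `>`, `<c`, `/>`, `</a>`). The output is still well-formed XML, since line breaks are allowed inside tags, so a change to one attribute value touches exactly one line.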

This way we expect to get files that react to modifications similarly to text files.
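Returning a patched file to a readable form can be as simple as re-parsing and re-indenting it, provided the one-token-per-line form is still well-formed XML. A minimal Python sketch under that assumption follows; `to_readable` and the sample input are hypothetical, `ET.indent` needs Python 3.9 or later, and whitespace around text is again treated as insignificant.

```python
import xml.etree.ElementTree as ET


def to_readable(normalized):
    """Re-indent XML that was spread over one token per line."""
    root = ET.fromstring(normalized)
    for elem in root.iter():
        # Assumption: whitespace around text carries no meaning, so drop it.
        if elem.text is not None:
            elem.text = elem.text.strip() or None
        if elem.tail is not None:
            elem.tail = elem.tail.strip() or None
    ET.indent(root, space="  ")  # available since Python 3.9
    return ET.tostring(root, encoding="unicode")


# Hypothetical patched file in the one-token-per-line form.
patched = '<a\nb=\n"2"\nx=\n"1"\n>\n<c\n/>\n</a>\n'
readable = to_readable(patched)
print(readable)
```

The attributes stay in sorted order here because the parser keeps them in document order, so repeated normalization and re-indentation should not introduce new differences of their own.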

Wednesday, 24 June 2009 11:40:32 UTC
Tips and tricks | xslt
© 2025, Nesterovsky bros