As you probably know, we have implemented a custom protocol handler for Windows Search.
It's called .xml-gz, and its goal is to index compressed XML files and to return search results with subtree precision. So, for XML such as:
<data> <item>...</item> <item>...</item> ... </data>
search finds results within an item and returns the XML file's URL and the stream offset of that item. Using the ZLIB API we can compress data with stream bookmarks, so fast random access to the data is possible.
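To illustrate the idea of stream bookmarks, here is a minimal sketch in Python rather than the code of our handler. It assumes a "bookmark" is simply the byte offset right after a full flush (Z_FULL_FLUSH) of a raw deflate stream, so decompression can restart at that offset. The function names compress_with_bookmarks and read_item are hypothetical.

```python
import zlib


def compress_with_bookmarks(items):
    """Compress items into one raw deflate stream.

    Returns (blob, bookmarks), where bookmarks[i] is the byte offset
    in blob at which item i starts. (Assumed meaning of "bookmark".)
    """
    # wbits=-15 selects raw deflate: no zlib header or trailing checksum,
    # which keeps restarting in the middle of the stream simple.
    comp = zlib.compressobj(6, zlib.DEFLATED, -15)
    out = bytearray()
    bookmarks = []
    for item in items:
        bookmarks.append(len(out))
        out += comp.compress(item)
        # A full flush byte-aligns the output and resets the compression
        # state, so the next item does not depend on earlier data.
        out += comp.flush(zlib.Z_FULL_FLUSH)
    out += comp.flush()  # finish the stream
    return bytes(out), bookmarks


def read_item(blob, start, end):
    """Decompress only the bytes between two bookmarks."""
    return zlib.decompressobj(-15).decompress(blob[start:end])


items = [b"<item>first</item>", b"<item>second</item>", b"<item>third</item>"]
blob, marks = compress_with_bookmarks(items)
# Read the second item without touching the first one.
second = read_item(blob, marks[1], marks[2])
```

The price of each full flush is a slightly worse compression ratio, so in practice bookmarks would be placed per item or per group of items depending on item size.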
The only problem we have is with notifying Windows Search about changes (create, delete, update) to such files.
The spec describes several techniques (none of them has worked for us):
1. Call catalogManager.ReindexMatchingURLs() - it just returns without any effect.
2. Call changeSink.OnItemsChanged() - it returns an error.
3. Implement an IFilter for .xml-gz and call IGatherNotifyInline (see "have your .zip urls indexed when they are created or modified") - that's a mystery, as:
4. Implement a root URL of the form .xml-gz:/// and run a Windows Search query:
SELECT System.ItemUrl, System.DateModified FROM SystemIndex WHERE System.FileExtension='.xml-gz'
to find all .xml-gz sources. This is not reliable, as your protocol handler can be (and is) called before the files are indexed.
So, the only reliable way to index your data is to (re-)add the indexing rule for the protocol handler, which in most cases reindexes everything.
The only bearable solution we found is to define the indexing rule in the form .xml-gz://file:d:/data/... and to use the IShellFolder(2) interfaces to discover sub-items and their modification times. This technique keeps the data scan minimal when you (re-)add the indexing rule.
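The "minimal scan" part can be sketched outside of COM. The Python below is only an illustration under stated assumptions: it walks a plain directory with os.walk as a stand-in for IShellFolder enumeration, and compares modification times against a previously saved snapshot to decide what was created, updated or deleted. The names scan_sources and diff_snapshots are hypothetical and not part of any Windows Search API.

```python
import os


def scan_sources(root, suffix=".xml-gz"):
    """Return {path: mtime_ns} for every file under root ending with suffix."""
    found = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(suffix):
                path = os.path.join(dirpath, name)
                found[path] = os.stat(path).st_mtime_ns
    return found


def diff_snapshots(previous, current):
    """Compare two snapshots and return (created, updated, deleted) paths."""
    created = sorted(p for p in current if p not in previous)
    deleted = sorted(p for p in previous if p not in current)
    updated = sorted(
        p for p in current if p in previous and current[p] != previous[p]
    )
    return created, updated, deleted
```

Only the paths reported as created or updated need to be decompressed and re-indexed. Everything else can be answered from the stored modification times.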