RSS 2.0
Sign In
# Tuesday, 26 April 2011

Earlier, we have described an approach to call Windows Search from SQL Server 2008. But it has turned out that our problem is more complicated...

All has started from the initial task:

  • to allow free text search in a store of huge xml files;
  • files should be compressed, so these are *.xml.gz;
  • search results should be addressable to a fragment within xml.

Later we shall describe how we have solved this task, and now it's enough to say that we have implemented a Protocol Handler for Windows Search named '.xml-gz:'. This way original file stored say at 'file:///c:/store/data.xml-gz' is seen as a container by the Windows Search:

  • .xml-gz:///file:c:/store/data.xml-gz/id1.xml
  • .xml-gz:///file:c:/store/data.xml-gz/id2.xml
  • ...

This way search xml should be like this:

select System.ItemUrl from SystemIndex where scope='.xml-gz:' and contains(...)

Everything has worked during test: we have succeeded to issue Windows Search selects from SQL Server and join results with other sql queries.

But later on when we considered a runtime environment we have seen that our design won't work. The reason is simple. Windows Search will work on a computer different from those where SQL Servers run. So, the search query should look like this:

select System.ItemUrl from Computer.SystemIndex where scope='.xml-gz:' and contains(...)

Here we have realized the limitation of current (Windows Search 4) implementation: remote search works for shared folders only, thus query may only look like:

select System.ItemUrl from Computer.SystemIndex where scope='file://Computer/share/' and contains(...)

Notice that search restricts the scope to a file protocol, this way remoter search will never return our results. The only way to search in our scope is to perform a local search.

We have considered following approaches to resolve the issue.

The simplest one would be to access Search protocol on remote computer using a connection string: "Provider=Search.CollatorDSO;Data Source=Computer" and use local queries. This does not work, as provider simply disregards Data Source parameter.

The other try was to use MS Remote OLEDB provider. We tried hard to configure it but it always returns obscure error, and more than that it's deprecated (Microsoft claims to remove it in future).

So, we decided to forward request manually:

  • SQL Server calls a web service (through a CLR function);
  • Web service queries Windows Search locally.

Here we considered WCF Data Services and a custom web service.

The advantage of WCF Data Services is that it's a technology that has ambitions of a standard but it's rather complex task to create implementation that will talk with Windows Search SQL dialect, so we have decided to build a primitive http handler to get query parameter. That's trivial and also has a virtue of simple implementation and high streamability.

So, that's our http handler (WindowsSearch.ashx):

<%@ WebHandler Language="C#" Class="WindowsSearch" %>

using System;
using System.Web;
using System.Xml;
using System.Text;
using System.Data.OleDb;

/// <summary>
/// A Windows Search request handler.
/// </summary>
public class WindowsSearch: IHttpHandler
{
  /// <summary>
  /// Handles the request.
  /// </summary>
  /// <param name="context">A request context.</param>
  public void ProcessRequest(HttpContext context)
  {
    var request = context.Request;
    var query = request.Params["query"];
    var response = context.Response;

    response.ContentType = "text/xml";
    response.ContentEncoding = Encoding.UTF8;

    var writer = XmlWriter.Create(response.Output);

    writer.WriteStartDocument();
    writer.WriteStartElement("resultset");

    if (!string.IsNullOrEmpty(query))
    {
      using(var connection = new OleDbConnection(provider))
      using(var command = new OleDbCommand(query, connection))
      {
        connection.Open();

        using(var reader = command.ExecuteReader())
        {
          string[] names = null;

          while(reader.Read())
          {
            if (names == null)
            {
              names = new string[reader.FieldCount];

              for (int i = 0; i < names.Length; ++i)
              {
                names[i] = XmlConvert.EncodeLocalName(reader.GetName(i));
              }
            }

            writer.WriteStartElement("row");

            for(int i = 0; i < names.Length; ++i)
            {
              writer.WriteElementString(
                names[i],
                Convert.ToString(reader[i]));
            }

            writer.WriteEndElement();
          }
        }
      }
    }

    writer.WriteEndElement();
    writer.WriteEndDocument();

    writer.Flush();
  }

  /// <summary>
  /// Indicates that a handler is reusable.
  /// </summary>
  public bool IsReusable { get { return true; } }

  /// <summary>
  /// A connection string.
  /// </summary>
  private const string provider =
    "Provider=Search.CollatorDSO;" +
    "Extended Properties='Application=Windows';" +
    "OLE DB Services=-4";
}

And a SQL CLR function looks like this:

using System;
using System.Collections;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
using System.Net;
using System.IO;
using System.Xml;

/// <summary>
/// A user defined function.
/// </summary>
public class UserDefinedFunctions
{
  /// <summary>
  /// A Windows Search returning result as xml strings.
  /// </summary>
  /// <param name="url">A search url.</param>
  /// <param name="userName">A user name for a web request.</param>
  /// <param name="password">A password for a web request.</param>
  /// <param name="query">A Windows Search SQL.</param>
  /// <returns>A result rows.</returns>
  [SqlFunction(
    IsDeterministic = false,
    Name = "WindowsSearch",
    FillRowMethodName = "FillWindowsSearch",
    TableDefinition = "value nvarchar(max)")]
  public static IEnumerable Search(
    string url,
    string userName,
    string password,
    string query)
  {
    return SearchEnumerator(url, userName, password, query);
  }

  /// <summary>
  /// A filler of WindowsSearch function.
  /// </summary>
  /// <param name="value">A value returned from the enumerator.</param>
  /// <param name="row">An output value.</param>
  public static void FillWindowsSearch(object value, out string row)
  {
    row = (string)value;
  }

  /// <summary>
  /// Gets a search row enumerator.
  /// </summary>
  /// <param name="url">A search url.</param>
  /// <param name="userName">A user name for a web request.</param>
  /// <param name="password">A password for a web request.</param>
  /// <param name="query">A Windows Search SQL.</param>
  /// <returns>A result rows.</returns>
  private static IEnumerable<string> SearchEnumerator(
    string url,
    string userName,
    string password,
    string query)
  {
    if (string.IsNullOrEmpty(url))
    {
      throw new ArgumentException("url");
    }

    if (string.IsNullOrEmpty(query))
    {
      throw new ArgumentException("query");
    }

    var requestUrl = url + "?query=" + Uri.EscapeDataString(query);

    var request = WebRequest.Create(requestUrl);

    request.Credentials = string.IsNullOrEmpty(userName) ?
      CredentialCache.DefaultCredentials :
      new NetworkCredential(userName, password);

    using(var response = request.GetResponse())
    using(var stream = response.GetResponseStream())
    using(var reader = XmlReader.Create(stream))
    {
      bool read = true;

      while(!read || reader.Read())
      {
        if ((reader.Depth == 1) && reader.IsStartElement())
        {
          // Note that ReadInnerXml() advances the reader similar to Read().
          yield return reader.ReadInnerXml();

          read = false;
        }
        else
        {
          read = true;
        }
      }
    }
  }
}

And, finally, when you call this service from SQL Server you write query like this:

with search as
(
  select
    cast(value as xml) value
  from
    dbo.WindowsSearch
    (
      N'http://machine/WindowsSearchService/WindowsSearch.ashx',
      null,
      null,
      N'
        select
          "System.ItemUrl"
        from
          SystemIndex
        where
          scope=''.xml-gz:'' and contains(''...'')'
    )
)
select
  value.value('/System.ItemUrl[1]', 'nvarchar(max)')
from
  search

Design is not trivial but it works somehow.

After dealing with all these problems some questions remain unanswered:

  • Why SQL Server does not allow to query Windows Search directly?
  • Why Windows Search OLEDB provider does not support "Data Source" parameter?
  • Why Windows Search does not support custom protocols during remote search?
  • Why SQL Server does not support web request/web services natively?
Tuesday, 26 April 2011 08:26:10 UTC  #    Comments [0] -
SQL Server puzzle | Thinking aloud | Tips and tricks | Window Search
# Thursday, 14 April 2011

Hello everybody! You might think that we had died, since there were no articles in our blog for a too long time, but no, we’re still alive…

A month or so we were busy with Windows Search and stuff around it. Custom protocol handlers, support of different file formats and data storages are very interesting tasks, but this article discusses another issue.

The issue is how to compile and install native code written in C++, which was built under Visual Studio 2008 (SP1), on a clean computer.

The problem is that native dlls now resolve the problem known as a DLL hell using assembly manifests. This should help to discover and load the right DLL. The problem is that there are many versions of CRT, MFC, ATL and other dlls, and it's not trivial to create correct setup for a clean computer.

In order to avoid annoying dll binding problems at run-time, please define BIND_TO_CURRENT_CRT_VERSION and/or (_BIND_TO_CURRENT_ATL_VERSION, _BIND_TO_CURRENT_MFC_VERSION). Don’t forget to make the same definitions for all configurations/target platforms you intend to use. Build the project and check the resulting manifest file (just in case). It should contain something like that:

<?xml version='1.0' encoding='UTF-8' standalone='yes'?>
<assembly xmlns='urn:schemas-microsoft-com:asm.v1' manifestVersion='1.0'>
  <trustInfo xmlns="urn:schemas-microsoft-com:asm.v3">
    <security>
      <requestedPrivileges>
        <requestedExecutionLevel level='asInvoker' uiAccess='false' />
      </requestedPrivileges>
    </security>
  </trustInfo>
  <dependency>
    <dependentAssembly>
      <assemblyIdentity type='win32' name='Microsoft.VC90.DebugCRT'
        version='9.0.30729.4148' processorArchitecture='x86'
        publicKeyToken='1fc8b3b9a1e18e3b' />
    </dependentAssembly>
  </dependency>
</assembly>

The version of dependent assembly gives you a clue what a native run-time version(s) requires your application. The same thing you have to do for all your satellite projects.

The next step is to create a proper setup project using VS wizard.

Right click on setup project and select “Add->Merge Module…”. Select “Microsoft_VC90_CRT_x86.msm” or/and (“Microsoft_VC90_DebugCRT_x86.msm”, “Microsoft_VC90_ATL_x86.msm”, “Microsoft_VC90_MFC_x86.msm”…) for installing of corresponding run-time libraries and “policy_9_0_Microsoft_VC90_CRT_x86.msm” etc. for route calls of old version run-time libraries to the newest versions. Now you're ready to build your setup project.

You may also include “Visual C++ Runtime Libraries” to a setup prerequisites.

As result, you'll get 2 files (setup.exe and Setup.msi) and an optional folder (vcredist_x86) with C++ run-time redistributable libraries.

Note: only setup.exe installs those C++ run-time libraries.

More info concerning this theme:
Thursday, 14 April 2011 09:50:23 UTC  #    Comments [0] -
Tips and tricks
# Monday, 07 March 2011

Let's assume you're loading data into a table using BULK INSERT from tab separated file. Among others you have some varchar field, which may contain any character. Content of such field is escaped with usual scheme:

  • '\' as '\\';
  • char(13) as '\n';
  • char(10) as '\r';
  • char(9) as '\t';

But now, after loading, you want to unescape content back. How would you do it?

Notice that:

  • '\t' should be converted to a char(9);
  • '\\t' should be converted to a '\t';
  • '\\\t' should be converted to a '\' + char(9);

It might be that you're smart and you will immediately think of correct algorithm, but for us it took a while to come up with a neat solution:

declare @value varchar(max);

set @value = ...

-- This unescapes the value
set @value =
  replace
  (
    replace
    (
      replace
      (
        replace
        (
          replace(@value, '\\', '\ '),
          '\n',
          char(10)
        ),
        '\r',
        char(13)
      ),
      '\t',
      char(9)
    ),
    '\ ',
    '\'
  );

 

Do you know a better way?

Monday, 07 March 2011 21:01:24 UTC  #    Comments [0] -
SQL Server puzzle | Tips and tricks
# Friday, 04 March 2011

We were trying to query Windows Search from an SQL Server 2008.

Documentation states that Windows Search is exposed as OLE DB datasource. This meant that we could just query result like this:

SELECT
  *
FROM
  OPENROWSET(
    'Search.CollatorDSO.1',
    'Application=Windows',
    'SELECT "System.ItemName", "System.FileName" FROM SystemIndex');

But no, such select never works. Instead it returns obscure error messages:

OLE DB provider "Search.CollatorDSO.1" for linked server "(null)" returned message "Command was not prepared.".
Msg 7399, Level 16, State 1, Line 1
The OLE DB provider "Search.CollatorDSO.1" for linked server "(null)" reported an error. Command was not prepared.
Msg 7350, Level 16, State 2, Line 1
Cannot get the column information from OLE DB provider "Search.CollatorDSO.1" for linked server "(null)".

Microsoft is silent about reasons of such behaviour. People came to a conclusion that the problem is in the SQL Server, as one can query search results through OleDbConnection without problems.

This is very unfortunate, as it bans many use cases.

As a workaround we have defined a CLR function wrapping Windows Search call and returning rows as xml fragments. So now the query looks like this:

select
  value.value('System.ItemName[1]', 'nvarchar(max)') ItemName,
  value.value('System.FileName[1]', 'nvarchar(max)') FileName
from
  dbo.WindowsSearch('SELECT "System.ItemName", "System.FileName" FROM SystemIndex')

Notice how we decompose xml fragment back to fields with the value() function.

The C# function looks like this:

using System;
using System.Collections;
using System.IO;
using System.Xml;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using System.Data.OleDb;

using Microsoft.SqlServer.Server;

public class UserDefinedFunctions
{
  [SqlFunction(
    FillRowMethodName = "FillSearch",
    TableDefinition="value xml")]
  public static IEnumerator WindowsSearch(SqlString query)
  {
    const string provider =
      "Provider=Search.CollatorDSO;" +
      "Extended Properties='Application=Windows';" +
      "OLE DB Services=-4";

    var settings = new XmlWriterSettings
    {
      Indent = false,
      CloseOutput = false,
      ConformanceLevel = ConformanceLevel.Fragment,
      OmitXmlDeclaration = true
    };

    string[] names = null;

    using(var connection = new OleDbConnection(provider))
    using(var command = new OleDbCommand(query.Value, connection))
    {
      connection.Open();

      using(var reader = command.ExecuteReader())
      {
        while(reader.Read())
        {
          if (names == null)
          {
            names = new string[reader.FieldCount];

            for (int i = 0; i < names.Length; ++i)
            {
              names[i] = XmlConvert.EncodeLocalName(reader.GetName(i));
            }
          }

          var stream = new MemoryStream();
          var writer = XmlWriter.Create(stream, settings);

          for(int i = 0; i < names.Length; ++i)
          {
            writer.WriteElementString(names[i], Convert.ToString(reader[i]));
          }

          writer.Close();

          yield return new SqlXml(stream);
        }
      }
    }
  }

  public static void FillSearch(object value, out SqlXml row)
  {
    row = (SqlXml)value;
  }
}

Notes:

  •  Notice the use of "OLE DB Services=-4" in provider string to avoid transaction enlistment (required in SQL Server 2008).
  • Permission level of the project that defines this extension function should be set to unsafe (see Project Properties/Database in Visual Studio) otherwise it does not allow the use OLE DB.
  • SQL Server should be configured to allow CLR functions, see Server/Facets/Surface Area Configuration/ClrIntegrationEnabled in Microsoft SQL Server Management Studio
  • Assembly should either be signed or a database should be marked as trustworthy, see Database/Facets/Trustworthy in Microsoft SQL Server Management Studio.
Friday, 04 March 2011 09:22:49 UTC  #    Comments [0] -
SQL Server puzzle | Thinking aloud | Tips and tricks | Window Search
# Wednesday, 23 February 2011

A search "java web service session object" has reached our site.

Unfortunately, we cannot help to the original searcher but a next one might find this info usefull.

To get http session in the web service you should add a field to your class that will be populated with request context.

@WebService
public class MyService
{
  @WebMethod
  public int method(String value)
  {
    MessageContext messageContext = context.getMessageContext();
    HttpServletRequest request =
      (HttpServletRequest)messageContext.get(MessageContext.SERVLET_REQUEST);
    HttpSession session = request.getSession();

    // go ahead.
  }

  // A web service context.
  @Resource
  private WebServiceContext context;

}

Wednesday, 23 February 2011 11:33:37 UTC  #    Comments [0] -
Java | Tips and tricks
# Sunday, 20 February 2011

Last few days we were testing Java web-applications that expose web-services. During these tests we've found few interesting features.

The first feature allows to retrieve info about all endpoints supported by the web-application on GET request. The feature works at least for Metro that implements JAX-WS API v2.x. In order to get such info, a client sends any endpoint's URL to the server. The result is an HTML page with a table. Each row of such table contains an endpoint's data for each supported web-service method. This feature may be used as a web-services discovery mechanism.

The second feature is bad rather than good. JAX-WS API supposes that a developer annotates classes and methods that he/she wants to expose as web-services. Then, an implementation generates additional layer-bridge between developer's code and API that does all routine work behind the scene. May be that was a good idea, but Metro's implementation is imperfect. Metro dynamically generates such classes at run-time when a web-application starts. Moreover, Metro does such generation for all classes at once. So, in our case, when the generated web-based application contains dozens or even hundreds of web-services, the application's startup takes a lot of time.

Probably, Metro developers didn't want to deal with implementation of lazy algorithms, when a web-service is generated and cached on demand. We hope this issue will be solved in next releases.

Sunday, 20 February 2011 10:20:12 UTC  #    Comments [0] -
Java | Thinking aloud | Tips and tricks
# Tuesday, 08 February 2011

A while ago we have created a simple cache for Java application. It was modelled like a Map<K, V>: it cached values for keys.

Use cases were:

Cache<String, Object> cache = new Cache<String, Object>();
...
instance = cache.get("key");
cache.put("key", instance);

But now we thought of different implementation like a WeakReference<V> and with map access as additional utility methods.

 Consider an examples:

1. Free standing CachedReference<V> instance.

CachedReference<Data> ref = new CachedReference<Data>(1000, true);
...
ref.set(data);
...
data = ref.get();

2. Map of CachedReference<V> instances.

ConcurrentHashMap<String, CachedReference<Data>> cache =
  new ConcurrentHashMap<String, CachedReference<Data>>();

CachedReference.put(cache, "key", data, 1000, true);
...
data = CachedReference.get(cache, "key");

The first case is faster than original Cache<K, V> as it does not use any hash map at all. The later case provides the same performance as Cache<K, V> but gives a better control over the storage. Incidentally, CachedReference<V> is more compact than Cache<K, V>.

The new implementation is CachedReference.java, the old one Cache.java.

Tuesday, 08 February 2011 15:20:29 UTC  #    Comments [0] -
Java | Thinking aloud
# Saturday, 05 February 2011

We have updated @Yield annotation processor to support better debug info.

Annotation processor can be downloaded from Yield.zip or Yield.jar.

We also decided to consider jxom's state machine refactoring as obsolete as @Yield annotation allows to achieve the same effect but with more clear code.

JXOM can be downloaded from languages-xom.zip

See also:

Yield return feature in java
Why @Yield iterator should be Closeable
What you can do with jxom.

Saturday, 05 February 2011 21:12:05 UTC  #    Comments [3] -
Announce | Java | xslt
# Thursday, 03 February 2011

michaelhkay: Just posted a new internal draft of XSLT 3.0. Moving forward on maps, nested sequences, and JSON support.

Hope they will finally appear there!

See also: Tuples and maps - next try, Tuples and maps - Status: CLOSED, WONTFIX, Tuples and maps in Saxon and other blog posts on our site about immutable maps.

Thursday, 03 February 2011 11:07:49 UTC  #    Comments [0] -
Thinking aloud | xslt
# Thursday, 27 January 2011

A method pattern we have suggested to use along with @Yield annotation brought funny questions like: "why should I mark my method with @Yield annotation at all?"

Well, in many cases you may live with ArrayList populated with data, and then to perform iteration. But in some cases this approach is not practical either due to amount of data or due to the time required to get first item.

In later case you usually want to build an iterator that calculates items on demand. The @Yield annotation is designed as a marker of such methods. They are refactored into state machines at compilation time, where each addition to a result list is transformed into a new item yielded by the iterator.

So, if you have decided to use @Yield annotation then at some point you will ask yourself what happens with resources acquired during iteration. Will resources be released if iteration is interrupted in the middle due to exception or a break statement?

To address the problem yield iterator implements Closeable interface.

This way when you call close() before iteration reached the end, the state machine works as if break statement of the method body is injected after the yield point. Thus all finally blocks of the original method are executed and resources are released.

Consider an example of data iterator:

@Yield
public Iterable<Data> getData(final Connection connection)
  throws Exception
{
  ArrayList<Data> result = new ArrayList<Data>();

  PreparedStatement statement =
    connection.prepareStatement("select key, value from table");

  try
  {
    ResultSet resultSet = statement.executeQuery();

    try
    {
      while(resultSet.next())
      {
        Data data = new Data();

        data.key = resultSet.getInt(1);
        data.value = resultSet.getString(2);

        result.add(data); // yield point
      }
    }
    finally
    {
      resultSet.close();
    }
  }
  finally
  {
    statement.close();
  }

  return result;
}

private static void close(Object value)
  throws IOException
{
  if (value instanceof Closeable)
  {
    Closeable closeable = (Closeable)value;

    closeable.close();
  }
}

public void daoAction(Connection connection)
  throws Exception
{
  Iterable<Data> items = getData(connection);

  try
  {
    for(Data data: items)
    {
      // do something that potentially throws exception.
    }
  }
  finally
  {
    close(items);
  }
}

getData() iterates over sql data. During the lifecycle it creates and releases PreparedStatement and ResultSet.

daoAction() iterates over results provided by getData() and performs some actions that potentially throw an exception. The goal of close() is to release opened sql resources in case of such an exception.

Here you can inspect how state machine is implemented for such a method:

@Yield()
public static Iterable<Data> getData(final Connection connection)
  throws Exception
{
  assert (java.util.ArrayList<Data>)(ArrayList<Data>)null == null;

  class $state implements java.lang.Iterable<Data>, java.util.Iterator<Data>, java.io.Closeable
  {
    public java.util.Iterator<Data> iterator() {
      if ($state$id == 0) {
        $state$id = 1;

        return this;
      } else return new $state();
    }

    public boolean hasNext() {
      if (!$state$nextDefined) {
        $state$hasNext = $state$next();
        $state$nextDefined = true;
      }

      return $state$hasNext;
    }

    public Data next() {
      if (!hasNext()) throw new java.util.NoSuchElementException();

      $state$nextDefined = false;

      return $state$next;
    }

    public void remove() {
      throw new java.lang.UnsupportedOperationException();
    }

    public void close() {
      do switch ($state$id) {
      case 3:
        $state$id2 = 8;
        $state$id = 5;

        continue;
      default:
        $state$id = 8;

        continue;
      } while ($state$next());
    }

    private boolean $state$next() {
      java.lang.Throwable $state$exception;

      while (true) {
        try {
          switch ($state$id) {
          case 0:
            $state$id = 1;
          case 1:
            statement = connection.prepareStatement("select key, value from table");
            $state$exception1 = null;
            $state$id1 = 8;
            $state$id = 2;
          case 2:
            resultSet = statement.executeQuery();
            $state$exception2 = null;
            $state$id2 = 6;
            $state$id = 3;
          case 3:
            if (!resultSet.next()) {
              $state$id = 4;

              continue;
            }

            data = new Data();
            data.key = resultSet.getInt(1);
            data.value = resultSet.getString(2);
            $state$next = data;
            $state$id = 3;

            return true;
          case 4:
            $state$id = 5;
          case 5:
            {
              resultSet.close();
            }

            if ($state$exception2 != null) {
              $state$exception = $state$exception2;

              break;
            }

            if ($state$id2 > 7) {
              $state$id1 = $state$id2;
              $state$id = 7;
            } else $state$id = $state$id2;

            continue;
          case 6:
            $state$id = 7;
          case 7:
            {
              statement.close();
            }

            if ($state$exception1 != null) {
              $state$exception = $state$exception1;

              break;
            }

            $state$id = $state$id1;

            continue;
          case 8:
          default:
            return false;
          }
        } catch (java.lang.Throwable e) {
          $state$exception = e;
        }

        switch ($state$id) {
        case 3:
        case 4:
          $state$exception2 = $state$exception;
          $state$id = 5;

          continue;
        case 2:
        case 5:
        case 6:
          $state$exception1 = $state$exception;
          $state$id = 7;

          continue;
        default:
          $state$id = 8;

          java.util.ConcurrentModificationException ce = new java.util.ConcurrentModificationException();

          ce.initCause($state$exception);

          throw ce;
        }
      }
    }

    private PreparedStatement statement;
    private ResultSet resultSet;
    private Data data;
    private int $state$id;
    private boolean $state$hasNext;
    private boolean $state$nextDefined;
    private Data $state$next;
    private java.lang.Throwable $state$exception1;
    private int $state$id1;
    private java.lang.Throwable $state$exception2;
    private int $state$id2;
  }

  return new $state();
}

Now, you can estimate for what it worth to write an algorithm as a sound state machine comparing to the conventional implementation.

Yield annotation processor can be downloaded from Yield.zip or Yield.jar

See also Yield return feature in java.

Thursday, 27 January 2011 10:33:54 UTC  #    Comments [0] -
Java | Thinking aloud | Tips and tricks
# Monday, 24 January 2011

We're happy to announce that we have implemented @Yield annotation both in javac and in eclipse compilers.

This way you get built-in IDE support for the feature!

To download yield annotation processor please use the following link: Yield.zip

It contains both yield annotation processor, and a test project.

If you do not want to compile the sources, you can download Yield.jar

We would like to reiterate on how @Yield annotation works:

  1. A developer defines a method that returns either Iterator<T> or Iterable<T> instance and marks it with @Yield annotation.
  2. A developer implements iteration logic following the pattern:
    • declare a variable to accumulate results:
        ArrayList<T> items = new ArrayList<T>();
    • use the following statement to add item to result:
        items.add(...);
    • use
        return items;
      or
        return items.iterator();
      to return result;
    • mark method's params, if any, as final.
  3. A devoloper ensures that yield annotation processor is available during compilation (see details below).
  4. YieldProcessor rewrites method into a state machine at compilation time.

The following is an example of such a method:

@Yield
public static Iterable<Integer> generate(final int from, final int to)
{
  ArrayList<Integer> items = new ArrayList<Integer>();

  for(int i = from; i < to; ++i)
  {
    items.add(i);
  }

  return items;
}

The use is like this:

for(int value: generate(7, 20))
{
  System.out.println("generator: " + value);
}

Notice that method's implementation still will be correct in absence of YieldProcessor.

Other important feature is that the state machine returned after the yield processor is closeable.

This means that if you're breaking the iteration before the end is reached you can release resources acquired during the iteration.

Consider the example where break exits iteration:

@Yield
public static Iterable<String> resourceIteration()
{
  ArrayList<String> items = new ArrayList<String>();

  acquire();

  try
  {
    for(int i = 0; i < 100; ++i)
    {
      items.add(String.valueOf(i));
    }
  }
  finally
  {
    release();
  }

  return items;
}

and the use

int i = 0;
Iterable<String> iterator = resourceIteration();

try
{
  for(String item: iterator)
  {
    System.out.println("item " + i + ":" + item);

    if (i++ > 30)
    {
      break;
    }
  }
}
finally
{
  close(iterator);
}

...

private static <T> void close(T value)
  throws IOException
{
  if (value instanceof Closeable)
  {
    Closeable closeable = (Closeable)value;

    closeable.close();
  }
}

Close will execute all required finally blocks. This way resources will be released.

To configure yield processor a developer needs to refer Yield.jar in build path, as it contains @Yield annotation. For javac it's enough, as compiler will find annotation processor automatically.

Eclipse users need to open project properties and:

  • go to the "Java Compiler"/"Annotation Processing"
  • mark "Enable project specific settings"
  • select "Java Compiler"/"Annotation Processing"/"Factory Path"
  • mark "Enable project specific settings"
  • add Yield.jar to the list of "plug-ins and JARs that contain annotation processors".

At the end we want to point that @Yield annotation is a syntactic suggar, but it's important the way the foreach statement is important, as it helps to write concise and an error free code.

See also
  Yield feature in java implemented!
  Yield feature in java

Monday, 24 January 2011 10:23:53 UTC  #    Comments [2] -
Announce | Java | Thinking aloud | Tips and tricks
# Friday, 14 January 2011

For some reason we never knew about instance initializer in java; on the other hand static initializer is well known.

class A
{
  int x;
  static int y;

  // This is an instance initializer.
  {
    x = 1;
  }

  // This is a static initializer.
  static
  {
    y = 2;
  }
}

Worse, we have missed it in the java grammar when we were building jxom. This way jxom was missing the feature.

Today we fix the miss and introduce a schema element:

<class-initializer static="boolean">
  <block>
    ...
  </block>
</class-initializer>

It superseeds:

<static>
  <block>
    ...
  </block>
</static>

 that supported static initializers alone.

Please update languages-xom xslt stylesheets.

P.S. Out of curiosity, did you ever see any use of instance initializers?

Friday, 14 January 2011 21:29:04 UTC  #    Comments [0] -
Announce | Java | xslt
# Tuesday, 11 January 2011

We could not stand the temptation to implement the @Yield annotation that we described earlier.

Idea is rather clear but people are saying that it's not an easy task to update the sources.

They were right!

Implementation has its price, as we were forced to access JDK's classes of javac compiler. As result, at present, we don't support other compilers such as EclipseCompiler. We shall look later what can be done in this area.

At present, annotation processor works perfectly when you run javac either from the command line, from ant, or from other build tool.

Here is an example of how method is refactored:

@Yield
public static Iterable<Long> fibonachi()
{
  ArrayList<Long> items = new ArrayList<Long>();

  long Ti = 0;
  long Ti1 = 1;

  while(true)
  {
    items.add(Ti);

    long value = Ti + Ti1;

    Ti = Ti1;
    Ti1 = value;
  }
}

And that's how we transform it:

@Yield()
public static Iterable<Long> fibonachi() {
  assert (java.util.ArrayList<Long>)(ArrayList<Long>)null == null : null;

  class $state$ implements java.lang.Iterable<Long>, java.util.Iterator<Long>, java.io.Closeable {

    public java.util.Iterator<Long> iterator() {
      if ($state$id == 0) {
        $state$id = 1;
        return this;
      } else return new $state$();
    }

    public boolean hasNext() {
      if (!$state$nextDefined) {
        $state$hasNext = $state$next();
        $state$nextDefined = true;
      }

      return $state$hasNext;
    }

    public Long next() {
      if (!hasNext()) throw new java.util.NoSuchElementException();

      $state$nextDefined = false;

      return $state$next;
    }

    public void remove() {
      throw new java.lang.UnsupportedOperationException();
    }

    public void close() {
      $state$id = 5;
    }

    private boolean $state$next() {
      while (true) switch ($state$id) {
      case 0:
        $state$id = 1;
      case 1:
        Ti = 0;
        Ti1 = 1;
      case 2:
        if (!true) {
          $state$id = 4;
          break;
        }

        $state$next = Ti;
        $state$id = 3;

        return true;
      case 3:
        value = Ti + Ti1;
        Ti = Ti1;
        Ti1 = value;
        $state$id = 2;

        break;
      case 4:
      case 5:
      default:
        $state$id = 5;

        return false;
      }
    }

    private long Ti;
    private long Ti1;
    private long value;
    private int $state$id;
    private boolean $state$hasNext;
    private boolean $state$nextDefined;
    private Long $state$next;
  }

  return new $state$();
}

Formatting is automatic, sorry, but anyway it's for diagnostics only. You will never see this code.

It's iteresting to say that this implementation is very precisely mimics xslt state machine implementation we have done back in 2008.

You can download YieldProcessor here. We hope that someone will find our solution very interesting.

Tuesday, 11 January 2011 16:08:41 UTC  #    Comments [0] -
Announce | Thinking aloud | Tips and tricks | xslt | Java
# Wednesday, 22 December 2010

You might be interested in the following article that was written in form of a little guide. It can educate about new ways to learn SQL and hopefully may help someone to get a job. See "How to get MS SQL certification" that was written by Michele Rouse.

Wednesday, 22 December 2010 13:16:49 UTC  #    Comments [0] -
Announce
# Monday, 20 December 2010

Several times we have already wished to see yield feature in java and all the time came to the same implementation: infomancers-collections. And every time with dissatisfaction turned away, and continued with regular iterators.

Why? Well, in spite of the fact it's the best implementation of the feature we have seen, it's still too heavy, as it's playing with java byte code at run-time.

We never grasped the idea why it's done this way, while there is post-compile time annotation processing in java.

If we would implemented the yeild feature in java we would created a @Yield annotation and would demanded to implement some well defined code pattern like this:

@Yield
Iteratable<String> iterator()
{
  // This is part of pattern.
  ArrayList<String> list = new ArrayList<String>();

  for(int i = 0; i < 10; ++i)
  {
    // list.add() plays the role of yield return.
    list.add(String.valueOf(i));
  }

  // This is part of pattern.
  return list;
}

or

@Yield
Iterator<String> iterator()
{
  // This is part of pattern.
  ArrayList<String> list = new ArrayList<String>();

  for(int i = 0; i < 10; ++i)
  {
    // list.add() plays the role of yield return.
    list.add(String.valueOf(i));
  }

  // This is part of pattern.
  return list.iterator();
}

Note that the code will work correctly even, if by mischance, post-compile-time processing will not take place.

At post comile time we would do all required refactoring to turn these implementations into a state machines thus runtime would not contain any third party components.

It's iteresting to recall that we have also implemented similar refactoring in pure xslt.

See What you can do with jxom.

Update: implementation can be found at Yield.zip

Monday, 20 December 2010 16:28:35 UTC  #    Comments [0] -
Java | Thinking aloud | Tips and tricks | xslt
Archive
<2011 April>
SunMonTueWedThuFriSat
272829303112
3456789
10111213141516
17181920212223
24252627282930
1234567
Statistics
Total Posts: 387
This Year: 0
This Month: 0
This Week: 0
Comments: 2562
Locations of visitors to this page
Disclaimer
The opinions expressed herein are our own personal opinions and do not represent our employer's view in anyway.

© 2025, Nesterovsky bros
All Content © 2025, Nesterovsky bros
DasBlog theme 'Business' created by Christoph De Baene (delarou)