It's now time to explore a CLR implementation of the Numbers and Split functions in SQL Server.
I've created a simple C# assembly that defines two table-valued functions, Numbers_CLR and Split_CLR. Note that I had to fix the auto-generated SQL function declaration in order to replace nvarchar(4000) with nvarchar(max):
using System;
using System.Collections;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
using System.Diagnostics;

public class UserDefinedFunctions
{
  [SqlFunction]
  public static long GetTimestamp()
  {
    return Stopwatch.GetTimestamp();
  }

  [SqlFunction]
  public static long GetFrequency()
  {
    return Stopwatch.Frequency;
  }

  [SqlFunction(
    Name = "Numbers_CLR",
    FillRowMethodName = "NumbersFillRow",
    IsPrecise = true,
    IsDeterministic = true,
    DataAccess = DataAccessKind.None,
    TableDefinition = "value int")]
  public static IEnumerator NumbersInit(int count)
  {
    for(int i = 0; i < count; i++)
    {
      yield return i;
    }
  }

  public static void NumbersFillRow(object obj, out int value)
  {
    value = (int)obj;
  }

  [SqlFunction(
    Name = "Split_CLR",
    FillRowMethodName = "SplitFillRow",
    IsPrecise = true,
    IsDeterministic = true,
    DataAccess = DataAccessKind.None,
    TableDefinition = "value nvarchar(max)")]
  public static IEnumerator SplitInit(string value, string splitter)
  {
    if (string.IsNullOrEmpty(value))
    {
      yield break;
    }

    if (string.IsNullOrEmpty(splitter))
    {
      splitter = ",";
    }

    for(int i = 0; i < value.Length;)
    {
      int next = value.IndexOf(splitter, i);

      if (next == -1)
      {
        yield return value.Substring(i);

        break;
      }
      else
      {
        yield return value.Substring(i, next - i);

        i = next + splitter.Length;
      }
    }
  }

  public static void SplitFillRow(object obj, out string value)
  {
    value = (string)obj;
  }
}
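For reference, the hand-fixed declaration for Split_CLR might look like this (a sketch; the assembly name ClrFunctions is an assumption):

-- Assumes the assembly is registered as ClrFunctions.
create function dbo.Split_CLR
(
  @value nvarchar(max),
  @splitter nvarchar(max)
)
returns table(value nvarchar(max))
as external name ClrFunctions.UserDefinedFunctions.SplitInit;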
These are the test results for different variants of the numbers function, for different numbers of rows to return (length):

i description length duration msPerNumber
---- -------------- -------- ---------- -----------
0 Numbers 1 0.0964 0.0964
0 Numbers_CTE 1 0.2319 0.2319
0 Numbers_Table 1 0.1710 0.1710
0 Numbers_CLR 1 0.1729 0.1729
1 Numbers 2 0.0615 0.0307
1 Numbers_CTE 2 0.1327 0.0663
1 Numbers_Table 2 0.0816 0.0408
1 Numbers_CLR 2 0.1078 0.0539
2 Numbers 4 0.0598 0.0149
2 Numbers_CTE 4 0.1609 0.0402
2 Numbers_Table 4 0.0810 0.0203
2 Numbers_CLR 4 0.1092 0.0273
3 Numbers 8 0.0598 0.0075
3 Numbers_CTE 8 0.2308 0.0288
3 Numbers_Table 8 0.0813 0.0102
3 Numbers_CLR 8 0.1129 0.0141
4 Numbers 16 0.0598 0.0037
4 Numbers_CTE 16 0.3724 0.0233
4 Numbers_Table 16 0.0827 0.0052
4 Numbers_CLR 16 0.1198 0.0075
5 Numbers 32 0.0606 0.0019
5 Numbers_CTE 32 0.6473 0.0202
5 Numbers_Table 32 0.0852 0.0027
5 Numbers_CLR 32 0.1347 0.0042
6 Numbers 64 0.0615 0.0010
6 Numbers_CTE 64 1.1926 0.0186
6 Numbers_Table 64 0.0886 0.0014
6 Numbers_CLR 64 0.1648 0.0026
7 Numbers 128 0.0637 0.0005
7 Numbers_CTE 128 2.2886 0.0179
7 Numbers_Table 128 0.0978 0.0008
7 Numbers_CLR 128 0.2204 0.0017
8 Numbers 256 0.0679 0.0003
8 Numbers_CTE 256 4.9774 0.0194
8 Numbers_Table 256 0.1243 0.0005
8 Numbers_CLR 256 0.3486 0.0014
9 Numbers 512 0.0785 0.0002
9 Numbers_CTE 512 8.8983 0.0174
9 Numbers_Table 512 0.1523 0.0003
9 Numbers_CLR 512 0.5635 0.0011
10 Numbers 1024 0.0958 0.0001
10 Numbers_CTE 1024 17.8679 0.0174
10 Numbers_Table 1024 0.2453 0.0002
10 Numbers_CLR 1024 1.0504 0.0010
11 Numbers 2048 0.1324 0.0001
11 Numbers_CTE 2048 35.8185 0.0175
11 Numbers_Table 2048 0.3811 0.0002
11 Numbers_CLR 2048 1.9206 0.0009
12 Numbers 4096 0.1992 0.0000
12 Numbers_CTE 4096 70.9478 0.0173
12 Numbers_Table 4096 0.6772 0.0002
12 Numbers_CLR 4096 3.6921 0.0009
13 Numbers 8192 0.3361 0.0000
13 Numbers_CTE 8192 143.3364 0.0175
13 Numbers_Table 8192 1.2809 0.0002
13 Numbers_CLR 8192 7.3931 0.0009
14 Numbers 16384 0.6099 0.0000
14 Numbers_CTE 16384 286.7471 0.0175
14 Numbers_Table 16384 2.4579 0.0002
14 Numbers_CLR 16384 14.4731 0.0009
15 Numbers 32768 1.1546 0.0000
15 Numbers_CTE 32768 573.6626 0.0175
15 Numbers_Table 32768 4.7919 0.0001
15 Numbers_CLR 32768 29.0313 0.0009
16 Numbers 65536 2.3103 0.0000
16 Numbers_CTE 65536 1144.4052 0.0175
16 Numbers_Table 65536 9.5132 0.0001
16 Numbers_CLR 65536 57.7154 0.0009
17 Numbers 131072 4.4265 0.0000
17 Numbers_CTE 131072 2314.5917 0.0177
17 Numbers_Table 131072 18.9130 0.0001
17 Numbers_CLR 131072 116.4268 0.0009
18 Numbers 262144 8.7860 0.0000
18 Numbers_CTE 262144 4662.7233 0.0178
18 Numbers_Table 262144 38.3024 0.0001
18 Numbers_CLR 262144 230.1522 0.0009
19 Numbers 524288 18.4638 0.0000
19 Numbers_CTE 524288 9182.8146 0.0175
19 Numbers_Table 524288 83.4575 0.0002
19 Numbers_CLR 524288 468.0195 0.0009
These are the test results for different variants of the split function, for different string lengths (strLength):

i description strLength duration msPerChar
---- -------------- --------- ---------- ----------
0 Split 1 0.1442 0.1442
0 Split_CTE 1 0.2665 0.2665
0 Split_Table 1 0.2090 0.2090
0 Split_CLR 1 0.1964 0.1964
1 Split 2 0.0902 0.0451
1 Split_CTE 2 0.1788 0.0894
1 Split_Table 2 0.1087 0.0543
1 Split_CLR 2 0.1056 0.0528
2 Split 4 0.0933 0.0233
2 Split_CTE 4 0.2618 0.0654
2 Split_Table 4 0.1162 0.0291
2 Split_CLR 4 0.1143 0.0286
3 Split 8 0.1092 0.0137
3 Split_CTE 8 0.4408 0.0551
3 Split_Table 8 0.1344 0.0168
3 Split_CLR 8 0.1324 0.0166
4 Split 16 0.1422 0.0089
4 Split_CTE 16 0.7990 0.0499
4 Split_Table 16 0.1715 0.0107
4 Split_CLR 16 0.1687 0.0105
5 Split 32 0.2090 0.0065
5 Split_CTE 32 1.4924 0.0466
5 Split_Table 32 0.2458 0.0077
5 Split_CLR 32 0.4582 0.0143
6 Split 64 0.3464 0.0054
6 Split_CTE 64 2.9129 0.0455
6 Split_Table 64 0.3947 0.0062
6 Split_CLR 64 0.3880 0.0061
7 Split 128 0.6101 0.0048
7 Split_CTE 128 5.7348 0.0448
7 Split_Table 128 0.6898 0.0054
7 Split_CLR 128 0.6825 0.0053
8 Split 256 1.1504 0.0045
8 Split_CTE 256 11.5610 0.0452
8 Split_Table 256 1.3044 0.0051
8 Split_CLR 256 1.2901 0.0050
9 Split 512 2.2430 0.0044
9 Split_CTE 512 23.3854 0.0457
9 Split_Table 512 2.4992 0.0049
9 Split_CLR 512 2.4838 0.0049
10 Split 1024 4.5048 0.0044
10 Split_CTE 1024 45.7030 0.0446
10 Split_Table 1024 4.8886 0.0048
10 Split_CLR 1024 4.8601 0.0047
11 Split 2048 8.8229 0.0043
11 Split_CTE 2048 92.6160 0.0452
11 Split_Table 2048 9.7381 0.0048
11 Split_CLR 2048 9.8848 0.0048
12 Split 4096 17.6285 0.0043
12 Split_CTE 4096 184.3265 0.0450
12 Split_Table 4096 19.4092 0.0047
12 Split_CLR 4096 19.3849 0.0047
13 Split 8192 36.5924 0.0045
13 Split_CTE 8192 393.8663 0.0481
13 Split_Table 8192 39.3296 0.0048
13 Split_CLR 8192 38.9569 0.0048
14 Split 16384 70.7693 0.0043
14 Split_CTE 16384 740.2636 0.0452
14 Split_Table 16384 77.6300 0.0047
14 Split_CLR 16384 77.6878 0.0047
15 Split 32768 141.4202 0.0043
15 Split_CTE 32768 1481.5788 0.0452
15 Split_Table 32768 155.0163 0.0047
15 Split_CLR 32768 155.5904 0.0047
16 Split 65536 282.8597 0.0043
16 Split_CTE 65536 3098.3636 0.0473
16 Split_Table 65536 315.7588 0.0048
16 Split_CLR 65536 316.1782 0.0048
17 Split 131072 574.3652 0.0044
17 Split_CTE 131072 6021.9827 0.0459
17 Split_Table 131072 630.6880 0.0048
17 Split_CLR 131072 650.8676 0.0050
18 Split 262144 5526.9491 0.0211
18 Split_CTE 262144 17645.2219 0.0673
18 Split_Table 262144 5807.3244 0.0222
18 Split_CLR 262144 5759.6946 0.0220
19 Split 524288 11006.3019 0.0210
19 Split_CTE 524288 35093.2482 0.0669
19 Split_Table 524288 11585.3233 0.0221
19 Split_CLR 524288 11550.8323 0.0220
The results are:
- The recursive common table expression shows the worst timing.
- Split_CLR is on a par with Split_Table; however, Numbers_Table is better than Numbers_CLR.
- Split and Numbers based on unrolled recursion show the best timing (most of the time).
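Durations like the above can be collected with the GetTimestamp/GetFrequency helpers from the assembly; a minimal sketch of one measurement (the dbo registration names are assumptions):

declare @start bigint;

set @start = dbo.GetTimestamp();

-- The measured query.
select count(*) from dbo.Numbers_CLR(65536);

-- Stopwatch ticks to milliseconds.
select (dbo.GetTimestamp() - @start) * 1000.0 / dbo.GetFrequency() duration_ms;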
The End.
Well, several days have passed, but for some reason I've started to feel uncomfortable about the Numbers function. It's all because of the poor recursive CTE implementation. I have decided to unroll the cycle. The new version, however, isn't beautiful, but it provides far superior performance compared with the previous implementation:
/*
  Returns a numbers table. The table has the following structure:
  table(value int not null); value contains numbers from 1 to a specified count.
*/
create function dbo.Numbers
(
  /* Number of rows to return. */
  @count int
)
returns table
as
return
  with
    Number4(Value) as -- 2^4 = 16 rows
    (
      select 0 union all select 0 union all select 0 union all select 0 union all
      select 0 union all select 0 union all select 0 union all select 0 union all
      select 0 union all select 0 union all select 0 union all select 0 union all
      select 0 union all select 0 union all select 0 union all select 0
    ),
    Number8(Value) as -- 2^8 = 256 rows
    (
      select 0 from Number4 union all select 0 from Number4 union all
      select 0 from Number4 union all select 0 from Number4 union all
      select 0 from Number4 union all select 0 from Number4 union all
      select 0 from Number4 union all select 0 from Number4 union all
      select 0 from Number4 union all select 0 from Number4 union all
      select 0 from Number4 union all select 0 from Number4 union all
      select 0 from Number4 union all select 0 from Number4 union all
      select 0 from Number4 union all select 0 from Number4
    ),
    Number32(Value) as -- 2^32 rows
    (
      select 0 from Number8 N1, Number8 N2, Number8 N3, Number8 N4
    )
  select top(@count) row_number() over(order by Value) Value
  from Number32;
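Usage stays the same; for example, this returns the numbers 1 through 10:

select Value from dbo.Numbers(10);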
The performance achieved is on a par with a numbers table. The estimated number of rows is precise whenever we pass a constant as the parameter.
What is the moral? - There is room for enhancement in the recursive CTE.
Next day
Guess what? - Yes! There is also the CLR, which allows us to create one more implementation of the numbers and split functions. In the next entry I'll show it, along with a performance comparison of the different approaches.
This task has already been discussed many times. SQL Server 2005 allows us to create an inline function that splits such a string. The logic of such a function is self-explanatory, which also hints that the SQL syntax has become better:
/*
  Returns a numbers table. The table has the following structure:
  table(value int not null); value contains numbers from 1 to a specified count.
*/
create function dbo.Numbers
(
  /* Number of rows to return. */
  @count int
)
returns table
as
return
  with numbers(value) as
  (
    select 0
    union all
    select value * 2 + 1 from numbers where value < @count / 2
    union all
    select value * 2 + 2 from numbers where value < (@count - 1) / 2
  )
  select row_number() over(order by U.v) value
  from numbers
  cross apply (select 0 v) U;
/*
  Splits a string using a split character.
  Returns a table that contains split positions and split values:
  table(Pos, Value)
*/
create function dbo.Split
(
  /* A string to split. */
  @value nvarchar(max),
  /* An optional split character. */
  @splitChar nvarchar(max) = N','
)
returns table
as
return
  with
    Bound(Pos) as
    (
      select Value
      from dbo.Numbers(len(@value))
      where
        (Value = 1) or
        (substring(@value, Value - 1, len(@splitChar)) = @splitChar)
    ),
    Word(Pos, Value) as
    (
      select
        Bound.Pos,
        substring
        (
          @value,
          Bound.Pos,
          case when Splitter.Pos > 0 then Splitter.Pos else len(@value) + 1 end -
            Bound.Pos
        )
      from Bound
      cross apply (select charindex(@splitChar, @value, Pos) Pos) Splitter
    )
  select Pos, Value from Word;
Test:
declare @s nvarchar(max);
set @s = N'ALFKI,BONAP,CACTU,FRANK';
select Value from dbo.Split(@s, default) order by Pos;
See also: Arrays and Lists in SQL Server, Numbers table in SQL Server 2005, Parade of numbers
SQL Server 2005 has built-in partitions. As a result, I have been given a task to port a database from SQL Server 2000 to 2005 and to replace the old-style partitions with new ones. It seemed reasonable, but before modifying the production database, which is about 5TB in size, I've tested a small one.
Switching the data is the easy part. I also need to test all related stored procedures. At this point I've found shortcomings, which are tightly related to the nature of the partitions.
In a select statement SQL Server 2005 iterates over partitions; in contrast, SQL Server 2000 rolls out the partition view and embeds the partition tables into the execution plan. The performance difference can be dramatic (as in the case I'm dealing with).
Suppose you are to get the 'top N' rows of an ordered set of data from several partitions. SQL Server 2000 can perform operations on the partitions (to get an ordered result per partition), then merge them and return the 'top N' rows. However, if the execution plan just iterates the partitions and applies the same operations to each partition sequentially, the result will be only semi-ordered. To get the 'top N' rows a sort operator is required. This is the case with SQL Server 2005.
The problem is that SQL Server 2005 never uses a merge operator to combine the results!
To illustrate the problem let's define two partitioned tables:
create partition function [test](smalldatetime)
as range left for values (N'2007-01-01', N'2007-02-01')
go

create partition scheme [testScheme]
as partition [test] to ([primary], [primary], [primary])
go
CREATE TABLE [dbo].[Test2000_12]
(
  [A] [smalldatetime] NOT NULL,
  [B] [int] NOT NULL,
  [C] [nvarchar](50) NULL,
  CONSTRAINT [PK_Test2000_12] PRIMARY KEY CLUSTERED([A] ASC, [B] ASC)
)
GO

CREATE NONCLUSTERED INDEX [IX_Test2000_12] ON [dbo].[Test2000_12]([B] ASC, [A] ASC)
GO

CREATE TABLE [dbo].[Test2000_01]
(
  [A] [smalldatetime] NOT NULL,
  [B] [int] NOT NULL,
  [C] [nvarchar](50) NULL,
  CONSTRAINT [PK_Test2000_01] PRIMARY KEY CLUSTERED([A] ASC, [B] ASC)
)
GO

CREATE NONCLUSTERED INDEX [IX_Test2000_01] ON [dbo].[Test2000_01]([B] ASC, [A] ASC)
GO

CREATE TABLE [dbo].[Test2000_02]
(
  [A] [smalldatetime] NOT NULL,
  [B] [int] NOT NULL,
  [C] [nvarchar](50) NULL,
  CONSTRAINT [PK_Test2000_02] PRIMARY KEY CLUSTERED([A] ASC, [B] ASC)
)
GO

CREATE NONCLUSTERED INDEX [IX_Test2000_02] ON [dbo].[Test2000_02]([B] ASC, [A] ASC)
GO

CREATE TABLE [dbo].[Test2005]
(
  [A] [smalldatetime] NOT NULL,
  [B] [int] NOT NULL,
  [C] [nvarchar](50) NULL,
  CONSTRAINT [PK_Test2005] PRIMARY KEY CLUSTERED([A] ASC, [B] ASC)
) ON [testScheme]([A])
GO

CREATE NONCLUSTERED INDEX [IX_Test2005] ON [dbo].[Test2005]([B] ASC, [A] ASC) ON [testScheme]([A])
GO

ALTER TABLE [dbo].[Test2000_01] WITH CHECK ADD CONSTRAINT [CK_Test2000_01]
  CHECK (([A]>='2007-01-01' AND [A]<'2007-02-01'))
GO
ALTER TABLE [dbo].[Test2000_01] CHECK CONSTRAINT [CK_Test2000_01]
GO

ALTER TABLE [dbo].[Test2000_02] WITH CHECK ADD CONSTRAINT [CK_Test2000_02]
  CHECK (([A]>='2007-02-01'))
GO
ALTER TABLE [dbo].[Test2000_02] CHECK CONSTRAINT [CK_Test2000_02]
GO

ALTER TABLE [dbo].[Test2000_12] WITH CHECK ADD CONSTRAINT [CK_Test2000_12]
  CHECK (([A]<'2007-01-01'))
GO
ALTER TABLE [dbo].[Test2000_12] CHECK CONSTRAINT [CK_Test2000_12]
GO

create view [dbo].[test2000]
as
select * from dbo.test2000_12
union all
select * from dbo.test2000_01
union all
select * from dbo.test2000_02
go
/*
  Returns a numbers table. The table has the following structure:
  table(value int not null); value contains numbers from 1 to a specified count.
*/
create function dbo.Numbers
(
  /* Number of rows to return. */
  @count int
)
returns table
as
return
  with numbers(value) as
  (
    select 0
    union all
    select value * 2 + 1 from numbers where value < @count / 2
    union all
    select value * 2 + 2 from numbers where value < (@count - 1) / 2
  )
  select row_number() over(order by U.v) value
  from numbers
  cross apply (select 0 v) U
Populate the tables:

insert into dbo.Test2005
select
  cast(N'2006-01-01' as smalldatetime) + 0.001 * N.Value,
  N.Value,
  N'Value' + cast(N.Value as nvarchar(16))
from
  dbo.Numbers(500000) N
go

insert into dbo.Test2000
select
  cast(N'2006-01-01' as smalldatetime) + 0.001 * N.Value,
  N.Value,
  N'Value' + cast(N.Value as nvarchar(16))
from
  dbo.Numbers(500000) N
go
Perform a test:
select top 20 A, B
from dbo.Test2005
--where
--  (A between '2006-01-10' and '2007-01-10')
order by B

select top 20 A, B
from dbo.Test2000
--where
--  (A between '2006-01-10' and '2007-01-10')
order by B
--option(merge union)
The difference is obvious if you open the execution plans. In the first case the estimated subtree cost is 17.4099; in the second, it is 0.0455385.
SQL Server cannot efficiently use the index on columns (B, A). The problem presented here can appear in any select that occasionally accesses two partitions but regularly uses only one, provided it uses a secondary index. In fact this covers about 30% of all selects in my database.
Next day
I've meditated a little bit more and devised a centaur: I can define a partitioned view over the partitioned table. Thus I can use either the view or the table, depending on what I'm trying to achieve: iterate partitions or roll them out.
create view [dbo].[Test2005_View]
as
select * from dbo.Test2005 where $partition.test(A) = 1
union all
select * from dbo.Test2005 where $partition.test(A) = 2
union all
select * from dbo.Test2005 where $partition.test(A) = 3
The following select runs the same way as with SQL Server 2000 partitions:

select top 20 A, B
from dbo.Test2005_View -- dbo.Test2005
order by B
In one of our latest projects (GUI on .NET 2.0) we've felt all the power of .NET globalization, but an annoying thing happened too...
In our case the annoying thing was the sharing of UI culture info between the main (UI) thread and all auxiliary threads (threads from the ThreadPool, manually created threads, etc.). It seems we've fallen into a .NET globalization pitfall.
We assumed that the main thread's UI culture info is used, at least, for all asynchronous delegate calls. This is a common mistake, and what's more annoying, there is not a single line in the MSDN documentation about this issue.
Let's look closer at this issue. Our application starts on a computer with English regional settings ("en-US"), and during application startup we change the UI culture info to the one specified in the configuration file:

// set the culture from the config file
try
{
Thread.CurrentThread.CurrentUICulture =
new CultureInfo(Settings.Default.CultureName);
}
catch
{
// use the default UI culture info
}
Thus, all the screens of this GUI application will be displayed according to the specified culture. There are also localized strings stored in resource files that are used for log and exception messages, etc., which can be displayed from within different threads (e.g. asynchronous delegate calls).
So, when the application is running, and even though all screens are displayed according to the specified culture, all the exceptions from auxiliary threads are still in English. This happens because threads for asynchronous calls are pulled from the ThreadPool, and all these threads were created using the default culture.
Conclusion: take care of CurrentUICulture in different threads yourself, and be careful - there are still pitfalls along this way...
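For example, one way to take care of it is to wrap each work item so that it re-applies the configured culture; a minimal sketch (CultureHelper and the "he-IL" culture are illustrative assumptions, using .NET 2.0 anonymous delegate syntax):

using System.Globalization;
using System.Threading;

public static class CultureHelper
{
  // The UI culture chosen at startup; pool threads do not inherit it.
  public static CultureInfo UICulture = new CultureInfo("he-IL");

  // Wraps a work item so that the pool thread runs under the configured UI culture.
  public static WaitCallback WithUICulture(WaitCallback work)
  {
    return delegate(object state)
    {
      Thread.CurrentThread.CurrentUICulture = UICulture;
      work(state);
    };
  }
}

// Usage:
// ThreadPool.QueueUserWorkItem(CultureHelper.WithUICulture(DoWork));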
I need to log actions into a log table in my stored procedure, which is called in the context of some transaction. I need the records in the log table no matter what happens (in fact, it's even more important to get them there if the operation fails).

begin transaction
...
execute some_proc
...
if (...)
  commit transaction
else
  rollback transaction
some_proc:
...
insert into log...
insert ... update ...
insert into log...
...
How to do this?
November 25
I've found two approaches:
- table variables, which do not participate in transactions;
- remote queries, which do not participate in local transactions.
The second way is more reliable, though not the fastest one. The idea is to execute the query on the same server as if it were a linked server.
Suppose you have a log table:
create table System.Log
(
  ID int identity(1,1) not null,
  Date datetime not null default getdate(),
  Type int null,
  Value nvarchar(max) null
);
To add a log record you define a stored procedure:

create procedure System.WriteLog
(
  @type int,
  @message nvarchar(max)
)
as
begin
  set nocount on;

  execute
  (
    'insert into dbname.System.Log(Type, Value) values(?, ?)',
    @type,
    @message
  )
  as user = 'user_name' at same_server_name;
end
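This requires a loopback linked server to be defined; a sketch of such a definition (the server, user, and password names here are placeholders):

-- A loopback linked server pointing back at this very server.
exec sp_addlinkedserver
  @server = N'same_server_name',
  @srvproduct = N'',
  @provider = N'SQLNCLI',
  @datasrc = N'actual_server_name';

-- Allow remote procedure calls through this definition.
exec sp_serveroption N'same_server_name', N'rpc out', N'true';

-- Map callers to the user referenced in the "as user" clause.
exec sp_addlinkedsrvlogin
  @rmtsrvname = N'same_server_name',
  @useself = N'false',
  @rmtuser = N'user_name',
  @rmtpassword = N'password';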
Whenever you're calling System.WriteLog in the context of a local transaction, the records are inserted into the System.Log table in a separate transaction.
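For comparison, the first approach relies on the fact that table variables are not affected by a rollback; a minimal sketch:

declare @log table(Type int, Value nvarchar(max));

begin transaction;

insert into @log(Type, Value) values(1, N'before work');
-- ... do the work ...

rollback transaction;

-- The rows survive the rollback; copy them into the persistent log.
insert into System.Log(Type, Value)
select Type, Value from @log;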
Reading him for the first time, as a student, you see him as a prophesying oracle. Rereading him, you notice that he is to a large extent a critic of the past and the present.
What an irony - after all, he warned about the harm of history for a creator! Who has changed - me or Nietzsche?
My next SQL puzzle (thanks to the fabulous XQuery support in SQL Server 2005) is how to reconstruct xml from a hierarchy table. This is the reverse of "Load xml into the table".
Suppose you have: select Parent, Node, Name from Data
where (Parent, Node) defines the xml hierarchy, and Name is the xml element name.
How would you restore original xml?
November 8, 2006 To my anonymous reader:
declare @content nvarchar(max);
set @content = '';
with
  Tree(Node, Parent, Name) as
  (
    /* Source tree */
    select Node, Parent, Name from Data
  ),
  Leaf(Node) as
  (
    select Node from Tree
    except
    select Parent from Tree
  ),
  NodeHeir(Node, Ancestor) as
  (
    select Node, Parent from Tree
    union all
    select H.Node, T.Parent
    from Tree T
    inner join NodeHeir H on H.Ancestor = T.Node
  ),
  ParentDescendants(Node, Descendants) as
  (
    select Ancestor, count(Ancestor)
    from NodeHeir
    where Ancestor > 0
    group by Ancestor
  ),
  Line(Row, Node, Text) as
  (
    select O.Row, T.Node, O.Text
    from ParentDescendants D
    inner join Tree T on D.Node = T.Node
    cross apply
    (
      select D.Node * 2 - 1 Row, '<' + T.Name + '>' Text
      union all
      select (D.Node + D.Descendants) * 2, '</' + T.Name + '>'
    ) O
    union all
    select D.Node * 2 - 1, T.Node, '<' + T.Name + '/>'
    from Leaf D
    inner join Tree T on D.Node = T.Node
  )
select top(cast(0x7fffffff as int)) @content = @content + Text
from Line
order by Row asc, Node desc
option(maxrecursion 128);
select cast(@content as xml);
Well, I like the DasBlog Engine; however, it does not allow adding new comments to our blog. This is unfortunate.
In the activity log I regularly see errors related to the CAPTCHA component. For now I have switched it off. I believe we'll start getting comments, at least one per year.
Say you need to load a table from an xml document, and this xml defines some hierarchy. Believe it or not, this is not the case where it's better to store the xml in the table.
Let's presume the table has:
- Node - document node id;
- Parent - parent node id;
- Name - node name.
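For illustration, such a table could be defined like this (a sketch; the exact column types are assumptions):

create table dbo.Data
(
  Node int not null,
  Parent int not null,
  Name nvarchar(max) not null
);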
The following defines a sample xml document we shall work with:

declare @content xml;
set @content = '
<document>
<header/>
<activity>
<title/>
<row/>
<row/>
<row/>
<row/>
<total/>
</activity>
<activity>
<title/>
<row/>
<total/>
</activity>
<activity>
<title/>
<row/>
<total/>
</activity>
<activity>
<title/>
<row/>
<row/>
<row/>
<total/>
</activity>
</document>';
How would you solve this task?
I spent a whole day building an acceptable solution. This is probably because I'm not an SQL guru. I found answers using cursors, openxml, pure xquery, and finally a hybrid of xquery and sql ranking functions.
The last one is fast, and its run time depends linearly on the xml size.

with NodeGroup(ParentGroup, Node, Name) as
(
select
dense_rank() over(order by P.Node),
row_number() over(order by N.Node),
N.Node.value('local-name(.)', 'nvarchar(max)')
from
@content.nodes('//*') N(Node)
cross apply
N.Node.nodes('..') P(Node)
),
Node(Parent, Node, Name) as
(
select
min(Node) over(partition by ParentGroup) - 1, Node, Name
from
NodeGroup
)
select * from Node order by Node;
Is there a better way? Anyone?
Return a table of numbers from 0 up to some value. I face this recurring task once every several years. Such periodicity induces me to invent the solution once again, each time using contemporary features.
November 18:
This time I have succeeded in solving the task in one select:
declare @count int;
set @count = 1000;
with numbers(value) as
(
  select 0
  union all
  select value * 2 + 1 from numbers where value < @count / 2
  union all
  select value * 2 + 2 from numbers where value < (@count - 1) / 2
)
select row_number() over(order by U.V) value
from numbers
cross apply (select 1 V) U;
Do you have a better solution?
Do you think they are different? I think not that much.
Language, for a creative programmer, is a means to express his virtues; and a hammer that brings a good salary, for a skilled labourer.
Each new generation of programmers tries to prove itself. But how?
Well, C++ programmers invent their own strings and smart pointers, and Java adepts (as they cannot create strings) design their springs and rubies.
It's probably all right - there is no dominance - but when I'm reading the docs of someone's latest pearl, I experience deja vu. On the whole, it all looks like chaotic movement.
At other times I think: how interesting it would be to build something when you don't have to design the brick, even if you know the bricks aren't perfect.
I've been given a task to fix several xsls (in fact many big xsls) that worked with msxml and stopped working with .NET. At first I thought it would be easy stuff; indeed, both implementations are compatible, as both implement http://www.w3.org/1999/XSL/Transform.
Well, I was wrong. After 10 minutes I was abusing the ignoramus who had written those xsls. Moreover, I was wondering how msxml could accept that shit.
So, come to the point. I had the following xsl:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0"
  xmlns:msxsl="urn:schemas-microsoft-com:xslt"
  xmlns:user="http://mycompany.com/mynamespace">

<xsl:template match="/">
  <HTML dir="rtl">
  ...
  <BODY dir="rtl">
  ...
  <TD width="68" dir="ltr" align="middle">
    <FONT size="1" face="David">
      <xsl:variable name="DegB" select="//*[@ZihuyMuzar='27171']"/>
    </FONT>
  </TD>
  ...
  <TD height="19" dir="ltr" align="middle">
    <FONT size="2" face="David">
      <xsl:value-of select="$DegB/*/Degem[1]/@*[3]"/> %
    </FONT>
  </TD>
  ...
I don't want to talk about the "virtues" of the "html" produced by this alleged "xsl", but about the xsl itself. To my amazement, msxml sees $DegB, which is declared and dies in a different scope. At first I thought I was wrong: "Maybe scope is defined differently than I thought?", but no. OK, I said to myself, I can fix that. I've created another xsl that elevates xsl:variable declarations to a scope where they are visible to xsl:value-of.
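Schematically, the fix moves the declaration up to where both fragments can see it:

<xsl:template match="/">
  <!-- The variable is elevated to the template scope... -->
  <xsl:variable name="DegB" select="//*[@ZihuyMuzar='27171']"/>
  ...
  <!-- ...so this reference further down becomes legal: -->
  <xsl:value-of select="$DegB/*/Degem[1]/@*[3]"/> %
  ...
</xsl:template>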
But that wasn't my main headache. Some genius had decided to use the third, fourth, and so on attribute. What does this mean, for God's sake? How could one rely on this? I'll kill him if I find him! There were thousands of such @*[3]. Even if I see the original xml, how can I be sure that msxml and .NET handle attribute collections in the same order?
There was no other way but to check this assumption. I've verified that both implementations store attributes in xml source order. This eased my pains.
To clarify the implementation details I dug into the XmlDocument/XPathDocument implementations in .NET 1.1 and 2.0. I was curious how they store a set of attributes. It's interesting to know that they decided to keep only an ordered list of attributes in either implementation. This means that ordered attribute access is fast, while named access leads to a list scan. In my opinion it's a dubious solution. Probably the idea behind it is that there are, on average, only a few attributes to scan when one uses named access. In my case there were up to 100 attributes per element.
Conclusion? 1. Don't let ignoramuses come near xsl. 2. Don't design xmls that use many attributes when you're planning to use .NET xslt.
Recently we were creating a BizTalk 2006 project. A map was used to normalize input data, where numbers were stored with group separators like "15,000,000.00" and text (Hebrew in our case) was stored visually like "ןודנול סלפ קנב סדיולל".
We need to store the output data the xml way, meaning numbers as "15000000.00" and Hebrew text in logical form "ללוידס בנק פלס לונדון". Well, it's understandable that there are no standard functoids that deal with bidi, as not too many people know about the problem in the first place. However, we thought that at least there would be no problems with removing the "," from numbers.
BizTalk 2006 does not provide functoids to solve either of these tasks! To answer our needs we have designed two custom functoids.
"Replace string": Returns a string with text Replaced using a regular expression or search string. First parameter is a string where to Replace. Second parameter is a string or regular expression pattern in the format /pattern/flags to Replace. Third parameter is a string or regular expression pattern that Replaces all found matches.
"Logical to visual converter": Converts an input "logical" string into a "visual" string. First parameter is a string to convert. Optional second parameter is a start embedding level (LTR or RTL).
Download sample code.
In our recent .NET 2.0 GUI project our client ingenuously asked us to implement an undo and redo facility. Nothing unusual nowadays; however, it's still not the easiest thing in the world to implement.
Naturally you want to have this feature for free. You do not want to invest too much time to support it. We didn't have much time to implement this "sugar" either. I know, I know, this is important for the user; however, when you're facing a big project with a lot of logic to be implemented in a short time, you start to think it would be nice to have undo and redo logic that works independently (or at least almost independently) of the business logic.
Thus, where is the place we could plug this service in? - Exactly! - It's the data binding layer.
When you're binding your data to controls, the "Type Descriptor Architecture" is used to retrieve and update the data. Fortunately this architecture allows us to create a data wrapper (ICustomTypeDescriptor). Such a wrapper should track property modifications of the data object, thus providing the undo and redo service. In short, that's all; the rest are technical details.
Let's look at how the undo and redo service goes into action. Instead of:

bindingSource.DataSource = data;

you have to write:

bindingSource.DataSource = Create-UndoRedo-Wrapper(data);

There should also be a class to collect and track actions. The user creates an instance of this class to implement the simplest form of code with undo and redo support:

// Create UndoRedoManager.
undoRedoManager = new UndoRedoManager();

// Create undo and redo wrapper around the data object, and bind controls.
dataBindingSource.DataSource = new UndoRedoTypeDescriptor(data, undoRedoManager);
Now let's turn our attention to the implementation of the undo and redo mechanism. There are two types at the core: UndoRedoManager and IAction. The former tracks actions; the latter defines undo and redo actions. UndoRedoManager performs either "Do/Redo" or "Undo" operations over IAction instances. We have provided two useful implementations of the IAction interface: UndoRedoTypeDescriptor - a wrapper around an object that tracks property changes, and UndoRedoList - a wrapper around an IList that tracks collection modifications. Users may create their own implementations of IAction to handle other undo and redo activities.
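For illustration, the IAction contract can be as small as this sketch (the actual interface in the sample may differ):

public interface IAction
{
  // Applies the change for the first time, or re-applies it on redo.
  void Do();

  // Reverts the change on undo.
  void Undo();
}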
We have created a sample application to show undo and redo in action. You can download it from here.