These are initial positions for this writing:
- SQL Server allows to execute dynamic SQL.
- Dynamic SQL is useful and often unavoidable, e.g. when you have to filter or order data in a way that you cannot code efficiently in advance.
- Dynamic SQL has proven to be a dangerous area, as with improper use it can open hole in a security.
In general nothing stops you from building and then excuting of SQL string. Our goal, however, is to define rules that make work with dynamic SQL is more managable and verifiable.
Here we outline these rules, and then give some examples and tips.
Rule #1. Isolate dynamic SQL
Put all logic related to building of dynamic SQL into a separate function.
We usually define a separate scheme Dynamic
, and define functions like Dynamic.GetSQL_XXX(params)
.
This makes it simple to perform code review.
Rule #2. Xml as parameters
Use xml
type to pass parameters to a function that builds dynamic SQL.
In many cases dynamic SQL depends on variable number of parameters (like a list of values to check against).
Xml fits here to represent structured information.
On a client (e.g. in C# or java) you can define a class with all parameters, populate an instance and serialize it to an xml.
Rule #3. XQuery as template language
Use XQuery to define SQL template and to generate SQL tree from the input parameters.
Here is an example of such XQuery:
@data.query('
<sql>
select
T.*
from
Data.Ticket T
where
{
for $ticketID in data/ticketID return
<sql>(T.TicketID = <int>{$ticketID}</int>) and </sql>
}
(1 = 1)
</sql>')
You can see that output is an xml
with sql
element to represent literal SQL, and int
element to represent integer literal.
In fact whole output schema can be defined like this:
<xs:schema elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="sql"/>
<xs:element name="name"/>
<xs:element name="string" nillable="true"/>
<xs:element name="int" nillable="true"/>
<xs:element name="decimal" nillable="true"/>
<xs:element name="date" nillable="true"/>
<xs:element name="time" nillable="true"/>
<xs:element name="datetime" nillable="true"/>
</xs:schema>
where sql
is to represent literal content, name
to represent a name, and other elements to represent different literal values.
Rule #4. Escape literals
Use function Dynamic.ToSQL(@template)
to build final SQL text.
Here we quote the definition:
-- Builds a text of SQL function for an sql template.
create function Dynamic.ToSQL
(
-- SQL template.
@template xml
)
returns nvarchar(max)
with returns null on null input
as
begin
return
(
select
case
when N.Node.exist('*[xs:boolean(@xsi:nil)]') = 1 then
'null'
when N.Node.exist('self::int') = 1 then
isnull(N.Node.value('xs:int(.)', 'nvarchar(max)'), '# int #')
when N.Node.exist('self::string') = 1 then
'N''' +
replace
(
N.Node.value('.', 'nvarchar(max)'),
'''',
''''''
) +
''''
when N.Node.exist('self::name') = 1 then
isnull
(
quotename(N.Node.value('.', 'nvarchar(128)'), '['),
'# name #'
)
when N.Node.exist('self::datetime') = 1 then
isnull
(
'convert(datetime2, ''' +
N.Node.value('xs:dateTime(.)', 'nvarchar(128)') +
''', 126)',
'# datetime #'
)
when N.Node.exist('self::date') = 1 then
isnull
(
'convert(date, ''' +
N.Node.value('xs:date(.)', 'nvarchar(128)') +
''', 126)',
'# date #'
)
when N.Node.exist('self::time') = 1 then
isnull
(
'convert(time, ''' +
N.Node.value('xs:time(.)', 'nvarchar(128)') +
''', 114)',
'# time #'
)
when N.Node.exist('self::decimal') = 1 then
isnull
(
N.Node.value('xs:decimal(.)', 'nvarchar(128)'),
'# decimal #'
)
when N.Node.exist('self::*') = 1 then
'# invalid template #'
else
N.Node.value('.', 'nvarchar(max)')
end
from
@template.nodes('//sql/node()[not(self::sql)]') N(Node)
for xml path(''), type
).value('.', 'nvarchar(max)');
end;
Now, we want to stress that this function plays an important role in prevention of the SQL injection, as it escapes literals from the SQL tree.
Rule #5 (optional). Collect data
Use SQL to collect additional data required to build dynamic SQL. Here is an example of how we get a Ticket
by StatusID
, while on input we receive a StatusName
:
create function Dynamic.GetSQL_GetTicketByStatus(@data xml)
returns nvarchar(max)
as
begin
set @data =
(
select
@data,
(
select
T.StatusID
from
@data.nodes('/data/status') N(Node)
inner join
Metadata.Status T
on
T.StatusName = Node.value('.', 'nvarchar(128)')
for xml auto, type, elements
)
for xml path('')
);
return Dynamic.ToSQL
(
@data.query
('
<sql>
select
T.*
from
Data.Ticket T
where
T.Status in ({ for $status in /T/StatusID return <sql><int>{$status}</int>,</sql> } null)
</sql>
')
);
end;
Notice code in red that collects some more data before calling XQuery.
Rule #6. Execute
The final step is to call dynamic SQL.
This is done like this:
-- build
declare @sql nvarchar(max) = Dynamic.GetSQL_GetTicket(@data);
-- execute
execute sp_executesql
@sql
-- {, N'@parameter_name data_type [ OUT | OUTPUT ][ ,...n ]' }
-- { , [ @param1 = ] 'value1' [ ,...n ] }
with result sets
(
(
TicketID int not null,
CreatedAt datetime2 not null,
Summary nvarchar(256) null,
Status int,
Severity int,
DeadLineAt datetime2 null
)
);
Notice that the use of dynamic SQL does not prevent static parameters.
Notice also that with result sets
clause is used to specify output.
Example. Tickets system
Let's assume you're dealing with a tickets system (like Bugzilla), and you have a table Data.Ticket
to describe tickets. Assume that DDL for this table is like this:
create table Data.Ticket
(
TicketID bigint not null primary key,
CreatedAt datetime2 not null,
Summary nvarchar(128) null,
Status int not null,
UpdatedAt datetime2(7) not null
)
Suppose you have to build C# code to search different tickets, where Entity Framework is used to access the database.
Search should be done by a range of CreatedAt
, a range of UpdatedAt
, Summary
, or by different Status
values. It should be possible to order results in different ways.
We start out solution from the C# and define classes for a request:
public enum Direction
{
Asc,
Desc
}
public struct Order
{
public string Field { get; set; }
public Direction Direction {get; set; }
}
public class DateRange
{
public DateTime? From { get; set; }
// This property is to omit From element if value is null.
// See rules for xml serialization.
public bool FromSpecified
{
get { return From != null; }
}
public DateTime? To { get; set; }
public bool ToSpecified
{
get { return To != null; }
}
}
public class TicketsRequest
{
public DateRange CreatedAt { get; set; }
public string Summary { get; set; }
public DateRange UpdatedAt { get; set; }
[XmlElement]
public Order[] Order { get; set; }
[XmlElement]
public int[] Status { get; set; }
}
Notice that we're going to use XmlSerializer
to convert request to xml and then to pass parameter into EF's model. Here is utility method to perform such conversion:
public static string ToXmlString<T>(T value)
{
if (value == null)
{
return null;
}
var serializer = new XmlSerializer(typeof(T));
var builder = new StringBuilder();
var writer = XmlWriter.Create(
builder,
new XmlWriterSettings
{
OmitXmlDeclaration = true,
Indent = false
});
serializer.Serialize(writer, value);
writer.Flush();
return builder.ToString();
}
Now we proceed to the database and define a procedure that runs the search:
-- Gets tickets.
create procedure Data.GetTickets
(
-- A query parameters.
@params xml
)
as
begin
set nocount on;
-- This is for EF to guess type of result.
if (1 = 0)
begin
select
TicketID,
CreatedAt,
Summary,
Status,
UpdatedAt
from
Data.Ticket;
end;
declare @sql nvarchar(max) = Dynamic.GetSQL_GetTickets(@params);
execute sp_executesql @sql
with result sets
(
(
TicketID int not null,
CreatedAt datetime2 not null,
Summary nvarchar(256) null,
Status int,
UpdatedAt datetime2 null
)
);
end;
Switch back to C#, import the Data.GetTickets
into the EF model, and create a search method:
public IEnumerable<Ticket> GetTickets(TicketsRequest request)
{
var model = new Model();
return model.GetTickets(ToXmlString(request));
}
The last ingredient is Dynamic.GetSQL_GetTickets()
function.
create function Dynamic.GetSQL_GetTickets(@data xml)
returns nvarchar(max)
as
begin
return Dynamic.ToSQL
(
@data.query('
<sql>
select
T.TicketID,
T.CreatedAt,
T.Summary,
T.Status,
T.UpdatedAt
from
Data.Ticket T
where
{
for $range in */CreatedAt return
(
for $date in $range/From return
<sql>
(T.CreatedAt >= <datetime>{$date}</datetime>) and
</sql>,
for $date in $range/To return
<sql>
(<datetime>{$date}</datetime> > T.CreatedAt) and
</sql>
),
for $range in */UpdatedAt return
(
for $date in $range/From return
<sql>
(T.UpdatedAt >= <datetime>{$date}</datetime>) and
</sql>,
for $date in $range/To return
<sql>
(<datetime>{$date}</datetime> > T.UpdatedAt) and
</sql>
),
for $summary in */Summary return
<sql>
(T.Summary like <string>{$summary}</string>) and
</sql>,
if (*/Status) then
<sql>
T.Status in
({
for $status in */Status return
<sql><int>{$status}</int>, </sql>
} null) and
</sql>
else ()
}
(1 = 1)
order by
{
for $order in
*/Order
[
Field = ("TicketID", "CreatedAt", "Summary", "UpdatedAt", "Status")
]
return
<sql>
<name>{$order/Field}</name>
{" desc"[$order[Direction = "Desc"]]},
</sql>
}
(select null)
</sql>
')
);
end;
SQL text from Dynamic.GetSQL_GetTickets()
Consider now SQL text produced by this function. For an input:
<TicketsRequest>
<CreatedAt>
<From>2014-01-01T00:00:00</From>
</CreatedAt>
<Summary>hello%</Summary>
<Order>
<Field>Status</Field>
<Direction>Desc</Direction>
</Order>
<Status>1</Status>
<Status>3</Status>
</TicketsRequest>
the output is:
select
T.TicketID,
T.CreatedAt,
T.Summary,
T.Status,
T.UpdatedAt
from
Data.Ticket T
where
(T.CreatedAt >= convert(datetime2, '2014-01-01T00:00:00', 126)) and
(T.Summary like N'hello%') and
T.Status in
(1, 3, null) and
(1 = 1)
order by
[Status] desc,
(select null)
Though the text is not formatted as we would like, it's perfectly valid SQL.
Tips for building XQuery templates
What is called XQuery in SQL Server is in fact a very limited subset of XQuery 1.0. Microsoft clearly states this fact. What is trivial in XQuery is often impossible or ugly in XQuery of SQL Server.
Nevertheless XQuery in SQL Server works rather well as SQL template language. To make it most efficient, however, you should learn several tips.
Tip #1. Where clause
In template you might want to build a where
clause:
<sql>
select
...
where
{
if (...) then
<sql>...</sql>
else ()
}
</sql>
and it might happen that for a certain input a condition under where
might collapse, and you will be left with where
keyword without a real condition, which is wrong. A simple work around is to always add some true condition under ther where
like this:
<sql>
select
...
where
{
if (...) then
<sql>... and </sql>
else ()
} (1 = 1)
</sql>
Tip #2. "in" expression
If you want to generate "in" expression like this:
value in (item1, item2,...)
then you might find that it's much easier generate equivalent a code like this:
value in (item1, item2,..., null)
.
Here is a XQuery to generate such template:
value in
({
for $item in ... return
<sql><int>{$item}</int>, </sql>
} null) and
Tip #3. Order by
You can conclude an order by
clause built from a data with a dummy expression like this:
order by
{
for $item in ... return
<sql>
<name>{$item/Field}</name>
{" desc"[$item/Direction = "Desc"]},
</sql>
} (select null)
Alternatively you can use first column from a clustered index.
Tip #4. Group by
In a group by
clause we cannot introduce terminator expression as it was with order by
, so a code is a less trivial:
{
let $items := ... return
if ($items) then
<sql>
group by <name>{$items[1]}</name>
{
for $item in $items[position() > 1] return
<sql>, <name>{$item}</name></sql>
}
</sql>
else ()
}
In fact similar logic may work with order by
.
Tip #5. Escape literals
It's crusial not to introduce SQL injection while building SQL. Thus use:
<int>{...}</int>
- for literal int;
<decimal>{...}</decimal>
- for literal decimal;
<string>{...}</string>
- for literal string;
<datetime>{...}</datetime>
- for literal datetime2;
<date>{...}</date>
- for literal date;
<time>{...}</time>
- for literal time;
<name>{...}</name>
- for a name to quote.
Note that you can use xsi:nil
, so
<int xsi:nil="true"/>
means null
.
If you generate a field name from an input data then it worth to validate it against a list of available names.
Tip #6. Validate input.
It worth to define xml schema for an input xml, and to validate parameters against it.
This makes code more secure, and also adds a documentation.
Tip #7. Don't abuse dynamic SQL
There are not too many cases when you need a dynamic SQL. Usually SQL engine knows how to build a good execution plan. If your query contains optional conditions then you can write it a way that SQL Server can optimize, e.g.:
select
*
from
T
where
((@name is null) or (Name = @name)) and
((@date is null) or (Date = @date))
option(recompile)