Introduction
We migrate code from the mainframe (MF) to Azure.
The tool we use produces plain, functionally equivalent C# code.
But it turns out that's not enough!
So, what's the problem?
The converted code is very slow, especially for batch processing: where the MF completes a job in, say, 30 minutes, the converted code finishes in 8 hours.
At this point someone usually appears and whispers in your ear:
"Look, those old technologies are time-proven. It's worth sticking to good old COBOL, or better yet to Assembler, if you want to do the real thing."
We're curious though: why is there a difference?
It turns out the issue lies in the difference in network topology between the MF and Azure solutions.
On the MF, all programs, the database, and file storage virtually sit in a single box, so network latency is negligible.
It's quite common to see chatty SQL programs on the MF that issue a lot of small SQL queries.
In Azure, programs, the database, and file storage are separate services, almost certainly sitting in different physical boxes.
You should be thankful if they are even co-located in a single datacenter.
So, network latency immediately becomes a factor.
Even if it adds just 1 millisecond per SQL roundtrip, it adds up in loops and turns into a showstopper.
There is no easy workaround on the hardware level.
People advise writing programs differently: "Tune applications and databases for performance in Azure SQL Database".
That's good advice for new development, but discouraging for a migration done by a tool.
So, what is the way forward?
Well, there is one. While accepting Azure's weak sides, we can exploit its strong sides.
Parallel refactoring
Before continuing, let's consider code that demonstrates the problem:
public void CreateReport(StringWriter writer)
{
  var index = 0;

  foreach(var transaction in dataService.
    GetTransactions().
    OrderBy(item => (item.At, item.SourceAccountId)))
  {
    var sourceAccount = dataService.GetAccount(transaction.SourceAccountId);
    var targetAccount = transaction.TargetAccountId != null ?
      dataService.GetAccount(transaction.TargetAccountId) : null;

    ++index;

    if (index % 100 == 0)
    {
      Console.WriteLine(index);
    }

    writer.WriteLine(
      $"{index},{transaction.Id},{transaction.At},{transaction.Type}," +
      $"{transaction.Amount},{transaction.SourceAccountId},{sourceAccount?.Name}," +
      $"{transaction.TargetAccountId},{targetAccount?.Name}");
  }
}
This loop queries transactions and issues two more small queries per transaction to get the source and target accounts. The results are printed into a report.
If we assume a query latency of just 1 millisecond and run such code for 100K transactions, the two extra roundtrips per transaction alone add up to 200+ seconds of execution.
Reality turns out to be much worse: the program spends most of its lifetime waiting for database results, and an iteration does not advance until all the work of the previous iterations is complete.
We could do better even without trying to rewrite all code!
Let's articulate our goals:
- To make code fast.
- To leave code recognizable.
The idea is to form two processing pipelines:
- (a) one that processes data in parallel, out of order;
- (b) another that processes data serially, in the original order.
Each pipeline may post sub-tasks to the other, so (a) runs its tasks in parallel and unordered, while (b) runs its tasks as if everything were running serially.
So, the parallel plan is like this:
- Queue a parallel sub-task (a) for each transaction.
- Each parallel sub-task (a) reads the source and target accounts and queues a serial sub-task (b), passing the transaction and accounts.
- Each serial sub-task (b) increments the index and writes a report record.
- Wait for all tasks to complete.
To reduce the burden of managing task pipelines we use Dataflow (Task Parallel Library) and encapsulate everything in a small wrapper.
Consider the refactored code:
public void CreateReport(StringWriter writer)
{
  using var parallel = new Parallel(options.Value.Parallelism);
  var index = 0;

  parallel.ForEachAsync(
    dataService.
      GetTransactions().
      OrderBy(item => (item.At, item.SourceAccountId)),
    transaction =>
    {
      var sourceAccount = dataService.GetAccount(transaction.SourceAccountId);
      var targetAccount = transaction.TargetAccountId != null ?
        dataService.GetAccount(transaction.TargetAccountId) : null;

      parallel.PostSync(
        (transaction, sourceAccount, targetAccount),
        data =>
        {
          var (transaction, sourceAccount, targetAccount) = data;

          ++index;

          if (index % 100 == 0)
          {
            Console.WriteLine(index);
          }

          writer.WriteLine(
            $"{index},{transaction.Id},{transaction.At},{transaction.Type}," +
            $"{transaction.Amount},{transaction.SourceAccountId},{sourceAccount?.Name}," +
            $"{transaction.TargetAccountId},{targetAccount?.Name}");
        });
    });
}
Consider the following points:
- We create the Parallel utility class, passing the requested degree of parallelism.
- We iterate transactions using parallel.ForEachAsync(), which queues a parallel sub-task for each transaction and then waits until all tasks are complete.
- Each parallel sub-task receives a transaction. It may be called from a different thread.
- Having received the required accounts, we queue a sub-task for synchronous execution using parallel.PostSync(), and
- pass it the data collected in the parallel sub-task: the transaction and the accounts.
- We deconstruct the passed data into variables and then proceed with the serial logic.
What we achieve with this refactoring:
- The top-level query that brings the transactions is executed and iterated serially.
- But each iteration body runs in parallel. By default we allow up to 100 parallel executions. The parallel sub-tasks do not wait on each other, so their wait times do not add up.
- Sync sub-tasks are queued and executed in the order of their serial appearance, so the increments and report records are subject neither to race conditions nor to reordering of output records.
We think the refactored code is still recognizable.
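To give an idea of what hides behind the wrapper, here is a minimal sketch of an ordered parallel pipeline built on plain tasks. It is not the project's Parallel class (which is built on TPL Dataflow and exposes ForEachAsync()/PostSync()), but it captures the same idea: run iteration bodies in parallel with bounded concurrency, and apply their serial continuations in the original order.

using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class OrderedPipeline
{
  // Runs parallelBody for each item with bounded parallelism; each body
  // returns a serial continuation, which is executed one at a time, in
  // enumeration order, so serial state needs no locking.
  public static async Task ForEachAsync<T>(
    IEnumerable<T> source,
    int parallelism,
    Func<T, Task<Action>> parallelBody)
  {
    using var throttle = new SemaphoreSlim(parallelism);
    var pending = new Queue<Task<Action>>();

    async Task<Action> Run(T item)
    {
      try
      {
        return await parallelBody(item);
      }
      finally
      {
        throttle.Release();
      }
    }

    foreach(var item in source)
    {
      await throttle.WaitAsync();
      pending.Enqueue(Run(item));

      // Drain already completed heads in order, to bound memory.
      while((pending.Count > 0) && pending.Peek().IsCompleted)
      {
        (await pending.Dequeue())();
      }
    }

    // Run the remaining serial continuations in the original order.
    while(pending.Count > 0)
    {
      (await pending.Dequeue())();
    }
  }
}

With such a helper, the parallel part of an iteration would load the accounts, and the returned continuation would increment the index and call writer.WriteLine().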
As for performance, this is what the log shows:
Serial test
100
...
Execution time: 00:01:33.8152540
Parallel test
100
...
Execution time: 00:00:05.8705468
Reference
Please take a look at the project to understand the implementation details, and in particular at the
Parallel class, which implements the API to post parallel and serial tasks, run loops, and more.
Please continue reading on GitHub.
While migrating a big XSLT 3 project to plain C#, we ran into a case that was not obvious to resolve.
The documents we process range from tiny to moderate size. Stored as XML, they take from virtually zero to, say, 10-20 MB.
In C# we can rewrite XSLT code virtually one-to-one using standard features like XDocument, LINQ, regular classes, built-in collections, and so on. Clearly C# has a richer repertoire, so the task is easily solved; the only difficulty is choosing among the multiple ways to solve it.
The simplest solution is to use the XDocument API to represent data at runtime and LINQ to query it. Features like XSLT keys, templates, functions, XPath sequences, arrays, maps, and primitive types all map naturally to the C# language and its APIs.
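As an illustration, here is a minimal sketch of how an XSLT key might be expressed in the XDocument approach; the document shape and the key name are made up for the example:

using System.Linq;
using System.Xml.Linq;

var document = XDocument.Load("input.xml");

// Rough equivalent of <xsl:key name="account" match="account" use="@id"/>:
// build a lookup once, query it many times.
var accountKey = document.
  Descendants("account").
  ToLookup(account => (string)account.Attribute("id"));

// Rough equivalent of key('account', 'A-1').
var accounts = accountKey["A-1"];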
Taking several XSLT transformations, we could see that an XSLT-to-C# rewrite is rather straightforward and produces recognizable functional programs whose C# source size is close to the original XSLT. As a bonus, C# lets you write asynchronous code, so C# wins in runtime scalability and in design-time support.
But can you do better in C#, especially when some data has well-defined XML schemas?
The natural step, in our opinion, would be to produce a plain C# object model from the XML schema and use it for runtime processing.
Fortunately, .NET has XML serialization attributes and tools to produce classes from XML schemas. With little effort we created a relevant class hierarchy for a rather big XML schema. XmlSerializer is used to convert the object model to and from XML through XmlReader and XmlWriter. So, we get a typed replacement for the generic XDocument that still supports the same LINQ API over collections of objects and takes less memory at runtime.
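Here is a minimal sketch of that setup, assuming a hypothetical Document/Item model in place of our real schema-generated classes:

using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical schema-generated types; the real hierarchy was produced
// from the xml schema.
[XmlRoot("document")]
public class Document
{
  [XmlElement("item")]
  public Item[] Items { get; set; }
}

public class Item
{
  [XmlAttribute("id")]
  public string Id { get; set; }

  [XmlText]
  public string Value { get; set; }
}

public static class DocumentIO
{
  private static readonly XmlSerializer serializer =
    new XmlSerializer(typeof(Document));

  public static Document Read(Stream stream)
  {
    using var reader = XmlReader.Create(stream);

    return (Document)serializer.Deserialize(reader);
  }

  public static void Write(Stream stream, Document document)
  {
    using var writer = XmlWriter.Create(stream);

    serializer.Serialize(writer, document);
  }
}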
The next step was to run a simple test:
- read the object model;
- transform it;
- write it back.
We created such tests both for the XDocument and for the object model case, and compared the results from different perspectives.
Both solutions produce very similar code, which is also similar to the original XSLT in style and size.
The object model has static typing, which is much easier to maintain.
But the most unexpected outcome was that the object model was up to 20% slower, due to serialization and deserialization, even with pregenerated XmlSerializer assemblies. The difference in transformation performance and memory consumption was so small that it can be neglected. These results were confirmed by multiple tests and multiple cycles, including warm-up cycles.
Here we ran into a case where static typing hurts more than it helps. Because our processing pipeline is an offline batch, this difference can translate into tens of minutes or even more.
Thus, in this particular case, we decided to stay with runtime typing as the more performant way of processing in C#.
A couple of days ago, while integrating with someone's C# library, we had to debug it, as something went wrong.
The code is big and obscure, but for integration purposes it's rather simple: you just create and call a class, that's all.
Yet, something just did not work. We had to prove it was not our fault, as the other side was uncooperative and would not run a common debug session to resolve the problem.
To simplify the matter as much as possible, here is the case:
var input = ...
var x = new X();
var output = x.Execute(input);
You pass correct input and get correct output. Simple, right? But it did not work!
So, we delved into the foreign code, and this is what we saw:
class X: Y
{
  public Output Execute(Input input)
  {
    return Perform(input);
  }

  protected override Output Run(Input input)
  {
    ...
    return output;
  }
}

class Y: Z
{
  ...
}

class Z
{
  protected Output Perform(Input input)
  {
    return Run(input);
  }

  protected virtual Output Run(Input input)
  {
    return null;
  }
}
Do you see? The flow is still simple, right?
We call X.Execute(), it calls Z.Perform(), which in turn calls the overridden X.Run() that returns the result.
But to our puzzlement we got null as the output, as if Z.Run() had been called!
We stepped through the code in the debugger and confirmed that Z.Perform() calls Z.Run(), even though the "this" instance is of type X.
How can that be? It's nonsense! Yet, the overridden method was never called.
No matter how much scrutiny we applied to the sources of X and Z, it just did not work.
We verified that the signature of X.Run() matches the signature of Z.Run(), so it should override the method.
Then what do we see here?
And then enlightenment came! Yes, X.Run() overrides a method, but which method?
We looked closely at class Y, and bingo, we saw the following:
class Y: Z
{
  ...

  protected virtual Output Run(Input input)
  {
    return null;
  }

  ...
}
So, X.Run() overrides Y.Run(), not Z.Run()!
Per .NET, Y.Run() and Z.Run() are two independent virtual methods, and Y.Run() additionally hides Z.Run().
The IDE even issued a warning that it's better to declare Y.Run() as:
protected new virtual Output Run(Input input)
{
  return null;
}
So, someone's code was plainly wrong: Y.Run() had to use override rather than virtual.
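For clarity, this is what the intended declaration in Y would look like; with it, Z.Perform() dispatches all the way down to X.Run():

class Y: Z
{
  // Overrides Z.Run() instead of introducing a new virtual slot that hides it.
  protected override Output Run(Input input)
  {
    return null;
  }
}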
We won, right?
Well, it's hard to call it a win.
We spent an hour looking at someone's ugly code just to prove we were still sane.
So, what is the conclusion of this story?
We think it is this:
- be cautious when looking at someone else's code;
- look at IDE warnings; don't disregard them, and try to resolve all of them in your code base;
- try to write simple code.
Recently we found that the BinaryFormatter.Serialize and BinaryFormatter.Deserialize methods are marked as obsolete in .NET 5.0 and are declared dangerous:
The BinaryFormatter type is dangerous and is not recommended for data processing. Applications should stop using BinaryFormatter as soon as possible, even if they believe the data they're processing to be trustworthy. BinaryFormatter is insecure and can't be made secure.
See BinaryFormatter security guide for more details.
That guide, along with its links, expands on what problems BinaryFormatter poses. The scheme of the dangerous use cases we have seen so far is this:
- two different sides communicate with each other;
- one side supplies input in BinaryFormatter's format;
- the other side reads the input using BinaryFormatter and instantiates classes.
A threat arises when the two sides cannot trust each other or cannot establish a trusted communication channel. In these cases malicious input can be supplied to the side reading the data, which might lead to unexpected code execution, denial of service, data exposure, and other bad consequences.
Reasoning like this, today's maintainers of .NET concluded that it's better to tear BinaryFormatter and similar APIs out of the framework.
Note that they don't point to BinaryFormatter itself, or the Reflection API it uses, as the core of the problem. They blame the communication.
Spelling out the concerns clearly could help everyone better understand how to address them.
In the area of communication security there are a lot of ready solutions, like:
- use signatures to prevent tampering with the data;
- use encryption to prevent eavesdropping on the data;
- use access rights to prevent access to the data in the first place;
- use secure communication channels.
We can safely state that without these measures no other serialization format is reliable either, and it is subject to the same vulnerabilities.
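For example, here is a minimal sketch of the first item: protecting a serialized payload with an HMAC, so that tampered input is rejected before it ever reaches a deserializer (key distribution is out of scope here):

using System;
using System.Security.Cryptography;

public static class SignedPayload
{
  // Prepends an HMAC-SHA256 signature to the payload.
  public static byte[] Sign(byte[] payload, byte[] key)
  {
    using var hmac = new HMACSHA256(key);

    var signature = hmac.ComputeHash(payload);
    var result = new byte[signature.Length + payload.Length];

    signature.CopyTo(result, 0);
    payload.CopyTo(result, signature.Length);

    return result;
  }

  // Verifies the signature and returns the payload, or throws.
  public static byte[] Verify(byte[] signed, byte[] key)
  {
    using var hmac = new HMACSHA256(key);

    var size = hmac.HashSize / 8;
    var signature = new ReadOnlySpan<byte>(signed, 0, size);
    var payload = signed.AsSpan(size).ToArray();

    if (!CryptographicOperations.FixedTimeEquals(
      signature,
      hmac.ComputeHash(payload)))
    {
      throw new InvalidOperationException("Payload signature mismatch.");
    }

    return payload;
  }
}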
All in all, it looks like an attempt to throw the baby out with the bathwater. The good news is that, thanks to the now modular structure of the .NET runtime, binary serialization is still accessible as a library, which is (and will be) available in the NuGet repositories. So, erasing this useful API is a futile effort.
Eventually we started to deal with tasks that required machine learning. Thus, a good tutorial for ML.NET was needed, and we found this one, which comes with good, simple code samples. Thanks to Jeff Prosise. We hope it may be helpful to you too.
Although the golden age of IE8 has already passed and Microsoft has stopped supporting it, this browser still holds about 3% of the world desktop browser market. Despite this, many big organisations still use it for enterprise web applications. We can confirm this, since we deal with such organisations around the world. Companies try to get rid of IE8, but this often requires a Windows upgrade and the resources to re-test all their web applications. If a company has many web terminals with Windows 7 or even XP, this task becomes rather expensive, so the process advances rather slowly. Meanwhile, these organizations don't stop developing new web applications that must work both on new HTML5 browsers and on old IE8.
A year ago we developed the UIUpload AngularJS directive and service that simplifies file uploading in web applications with an AngularJS client. It works as expected on all HTML5 browsers.
But a few days ago we were asked to help with file uploading from an AngularJS web application that must work in IE8. We spent a few hours investigating existing third-party AngularJS directives and components. Here are a few of them:
On IE8, all of these directives degrade to <form> and <iframe> and then track the upload progress. These solutions don't allow selecting files in old browsers. Our aim, however, was to implement an AngularJS directive that allows selecting a file and performing the upload, and that works both in IE8 and in new browsers.
Since IE8 supports neither FormData nor the File API, the directive must work with DOM elements directly. In order to open the file selection dialog we hide an <input type="file"/> element and then route the client-side event to it. When a file is selected it is sent to the server as a multipart/form-data message. The server's result is caught by a hidden <iframe> element and passed to the directive's controller.
After a few attempts we implemented the desired directive. You may download here a small VS2015 solution that demonstrates this directive along with a server-side file handler.
The key feature of this directive is the emulation of the replace and template directive definition properties:
var directive =
{
restrict: "AE",
scope:
{
id: "@",
serverUrl: "@",
accept: "@?",
onSuccess: "&",
onError: "&?",
},
link: function (scope, element, attrs, controller)
{
var id = scope.id || ("fileUpload" + scope.$id);
var template = "<iframe name='%id%$iframe' id='%id%$iframe' style='display: none;'>" +
"</iframe><form name='%id%$form' enctype='multipart/form-data' " +
"method='post' action='%action%' target='%id%$iframe'>" +
"<span style='position:relative;display:inline-block;overflow:hidden;padding:0;'>" +
"%html%<input type='file' name='%id%$file' id='%id%$file' " +
"style='position:absolute;height:100%;width:100%;left:-10px;top:-1px;z-index:100;" +
"font-size:50px;opacity:0;filter:alpha(opacity=0);'/></span></form>".
replace("%action%", scope.serverUrl).
replace("%html%", element.html()).
replace(/%id%/g, id);
element.replaceWith(template);
...
}
}
We used such emulation since each directive instance (an element) must have a unique name and ID in order to work properly. On the one hand, a template returned by a function should have a root element when you use replace. On the other hand, IE8 doesn't like such a root element (e.g. we did not succeed in dispatching the click javascript event to the <input> element).
The usage of the directive looks like our previous example (see UIUpload):
<a file-upload=""
class="btn btn-primary"
accept=".*"
server-url="api/upload"
on-success="controller.uploadSucceed(data, fileName)"
on-error="controller.uploadFailed(e)">Click here to upload file</a>
Where:
- accept - a comma-separated list of acceptable file extensions.
- server-url - the server URL to upload the selected file to. When there is no "server-url" attribute, the content of the selected file is passed to the success handler as a data URI.
- on-success - a "success" handler, called when the upload finishes successfully.
- on-error - an "error" handler, called when the upload fails.
We hope this simple directive may help those of you who are forced to deal with IE8 and more advanced browsers at the same time to keep calm.
Our genuine love is C++. Unfortunately, clients don't always share our preferences, so we are mostly occupied with C#, Java, and JavaScript. Nevertheless, we're closely watching the evolution of C++. It has become more mature in the latest specs.
Recently, we wondered how we would deal with dependency injection in C++.
What we found only strengthened our commitment to C++.
Parameter packs, introduced in C++11, allow a trivial implementation of constructor injection, while std::type_index, std::type_info and std::any enable service containers.
In fact, there are many DI implementations out there. The one we refer to here is Boost.DI. It's not standard, nor can we claim it's the best, but it's a good example of how this concept can be implemented.
So, consider their example as seen in Java with CDI, in C# with .NET Core injection, and in C++:
Java:
@Dependent
public class Renderer
{
@Inject @Device
private int device;
};
@Dependent
public class View
{
@Inject @Title
private String title;
@Inject
private Renderer renderer;
};
@Dependent
public class Model {};
@Dependent
public class Controller
{
@Inject
private Model model;
@Inject
private View view;
};
@Dependent
public class User {};
@Dependent
public class App
{
@Inject
private Controller controller;
@Inject
private User user;
};
...
Provider<App> provider = ...
App app = provider.get();
C#:
public class RendererOptions
{
public int Device { get; set; }
}
public class ViewOptions
{
public string Title { get; set; }
}
public class Renderer
{
public Renderer(IOptions<RendererOptions> options)
{
Device = options.Value.Device;
}
public int Device { get; set; }
}
public class View
{
public View(IOptions<ViewOptions> options, Renderer renderer)
{
Title = options.Value.Title;
Renderer = renderer;
}
public string Title { get; set; }
public Renderer Renderer { get; set; }
}
public class Model {}
public class Controller
{
public Controller(Model model, View view)
{
Model = model;
View = view;
}
public Model Model { get; set; }
public View View { get; set; }
};
public class User {};
public class App
{
public App(Controller controller, User user)
{
Controller = controller;
User = user;
}
public Controller Controller { get; set; }
public User User { get; set; }
};
...
IServiceProvider serviceProvider = ...
serviceProvider.GetService<App>();
C++:
#include <boost/di.hpp>
namespace di = boost::di;
struct renderer
{
int device;
};
class view
{
public:
view(std::string title, const renderer&) {}
};
class model {};
class controller
{
public:
controller(model&, view&) {}
};
class user {};
class app
{
public:
app(controller&, user&) {}
};
int main()
{
/**
* renderer renderer_;
* view view_{"", renderer_};
* model model_;
* controller controller_{model_, view_};
* user user_;
* app app_{controller_, user_};
*/
auto injector = di::make_injector();
injector.create<app>();
}
What is different between these DI flavors?
Not too much from the perspective of the final task achieved.
In java we used member injection, with qualifiers to inject scalars.
In C# we used constructor injection with Options pattern to inject scalars.
In C++ we used constructor injection with direct constants injected.
All technologies have their own API to initialize the DI container, but, again, while the API differs, the idea is the same.
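For instance, here is a minimal sketch of container initialization for the C# flavor above, assuming Microsoft.Extensions.DependencyInjection; the registrations and option values are illustrative only:

using System;
using Microsoft.Extensions.DependencyInjection;

public static class Program
{
  public static void Main()
  {
    var services = new ServiceCollection();

    // Options pattern: scalar values go through IOptions<T>.
    services.Configure<RendererOptions>(options => options.Device = 1);
    services.Configure<ViewOptions>(options => options.Title = "title");

    services.AddTransient<Renderer>();
    services.AddTransient<View>();
    services.AddTransient<Model>();
    services.AddTransient<Controller>();
    services.AddTransient<User>();
    services.AddTransient<App>();

    IServiceProvider serviceProvider = services.BuildServiceProvider();
    var app = serviceProvider.GetService<App>();
  }
}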
So, the expressiveness of C++ matches that of Java and C#.
Deeper analysis shows that Java's CDI is more feature-rich than the DI of C# and C++, but, personally, we consider it an advantage of C# and C++ that they have such a lightweight DI.
At the same time, there is an important difference between C++ on one side and Java and C# on the other.
While both Java and C# are bound to use reflection (C# could, in theory, use on-the-fly code generation to avoid it), C++'s DI natively constructs and injects services.
What does it mean for the user?
Well, a lot! In both Java and C# you would not want to use DI in a performance-critical part of the code (e.g. in a tight loop), while it's OK in C++ due to the near-zero performance impact of DI. This may result in more modular and performant code in C++.
While reading about ASP.NET Core Session and analyzing the difference from the previous version of ASP.NET, we bumped into a problem...
At Managing Application State
they note:
Session is non-locking, so if two requests both attempt to modify the contents of session, the last one will win. Further, Session is implemented as a coherent session, which means that all of the contents are stored together. This means that if two requests are modifying different parts of the session (different keys), they may still impact each other.
This is different from previous versions of ASP.NET, where the session was locking: multiple concurrent requests to the same session were synchronized, so you could keep a consistent state.
In ASP.NET Core you have no built-in means to keep a consistent session state. Even the assurance that the session is coherent does not help in any way.
Your options are:
- build your own synchronization to deal with this problem, e.g. around the database (see the sketch after this list);
- decree that your application cannot handle concurrent requests to the same session, so the client should not attempt it; otherwise the behaviour is undefined.
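As an illustration of the first option, here is a minimal sketch that serializes requests sharing a session within a single server process. It assumes the default ".AspNetCore.Session" cookie name and does not coordinate across servers; for that, a distributed lock (e.g. in the database) would be needed:

using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;

public class SessionLockMiddleware
{
  // One semaphore per session id; entries are never evicted in this sketch.
  private static readonly ConcurrentDictionary<string, SemaphoreSlim> locks =
    new ConcurrentDictionary<string, SemaphoreSlim>();

  private readonly RequestDelegate next;

  public SessionLockMiddleware(RequestDelegate next)
  {
    this.next = next;
  }

  public async Task InvokeAsync(HttpContext context)
  {
    if (!context.Request.Cookies.TryGetValue(".AspNetCore.Session", out var sessionId))
    {
      await next(context);

      return;
    }

    var semaphore = locks.GetOrAdd(sessionId, _ => new SemaphoreSlim(1, 1));

    await semaphore.WaitAsync(context.RequestAborted);

    try
    {
      await next(context);
    }
    finally
    {
      semaphore.Release();
    }
  }
}

Such a middleware would be registered with app.UseMiddleware<SessionLockMiddleware>() before UseSession().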
Angular 2 is already available, though there is a lot of code, and many libraries, still in Angular 1.x.
Here we outline how to write AngularJS 1.x code in modern javascript.
Prerequisites: ECMAScript 2015, javascript decorators, AngularJS 1.x. No knowledge of Angular 2.0 is required.
Please note that the decorators we have introduced, while they resemble those from Angular 2, do not match them exactly.
The sample uses nodejs, npm and gulp as a build pipeline. In addition we have added a Visual Studio Nodejs project and a maven project.
The build pipeline uses Babel with ES2015 and decorator plugins to transpile the sources into javascript that today's browsers support. Babel can be replaced or augmented with the Typescript compiler to support Microsoft's javascript extensions. Sources are combined and optionally minified into one or more javascript bundles. In addition, html template files are transformed into javascript modules that export the content of the html body as string literals. In general, all sources are in the src folder and the build output is assembled in the dist folder. Details of the build process are in gulpfile.js.
So, let's introduce the API we have defined in the angular-decorators.js module:
- Class decorators:
  - Component(name, options?) - a decorator to register an angular component.
  - Controller(name) - a decorator to register an angular controller.
  - Directive(name, options?) - a decorator to register an angular directive.
  - Injectable(name) - a decorator to register an angular service.
  - Module(name, ...require) - a decorator to declare an angular module.
  - Pipe(name, pure?) - a decorator to register an angular filter.

  Component's and Directive's options is the same object that is passed into the Module.component() and Module.directive() calls, with the difference that no options.bindings, options.scope, or options.require is specified. Instead, @Attribute(), @Input(), @Output(), @TwoWay(), @Collection(), @Optional() are used to describe options.bindings, and @Host(), @Self(), @SkipSelf(), @Optional() are used to describe options.require.

  Every decorated class can use the @Inject() member decorator to inject a service.
- Member decorators:
  - Attribute(name?) - a decorator that binds an attribute to the property.
  - BindThis() - a decorator that binds the "this" of a function to the class instance.
  - Collection() - a decorator that binds a collection property to an expression in an attribute in two directions.
  - Host(name?) - a decorator that binds a property to a host controller of a directive found on the element or its ancestors.
  - HostListener(name?) - a decorator that binds a method to a host event.
  - Inject(name?) - an injection member decorator.
  - Input(name?) - a decorator that binds a property to an expression in an attribute.
  - Optional() - a decorator that optionally binds a property.
  - Output(name?) - a decorator that provides a way to execute an expression in the context of the parent scope.
  - Self(name?) - a decorator that binds a property to a host controller of a directive found on the element.
  - SkipSelf(name?) - a decorator that binds a property to a host controller of a directive found on the ancestors of the element.
  - TwoWay() - a decorator that binds a property to an expression in an attribute in two directions.

  If the optional name is omitted in a member decorator then the property name is used as the name parameter.
  @Host(), @Self(), @SkipSelf() accept a class decorated with @Component() or @Directive() as the name parameter.
  @Inject() accepts a class decorated with @Injectable() or @Pipe() as the name parameter.
- Other:
  - modules(...require) - converts an array of modules, possibly referred to by module classes, into an array of module names.
Now we can start with the samples. Please note that we used samples scattered here and there on the Angular site.
@Component(), @SkipSelf(), @Attribute()
- In the Angular component development guide there is a sample with myTabs and myPane components. Here is its rewritten form, components/myTabs.js:
import { Component } from "../angular-decorators"; // Import decorators
import template from "../templates/my-tabs.html"; // Import template for my-tabs component
@Component("myTabs", { template, transclude: true }) // Decorate class as a component
export class MyTabs // Controller class for the component
{
panes = []; // List of active panes
select(pane) // Selects an active pane
{
this.panes.forEach(function(pane) { pane.selected = false; });
pane.selected = true;
}
addPane(pane) // Adds a new pane
{
if (this.panes.length === 0)
{
this.select(pane);
}
this.panes.push(pane);
}
}
components/myPane.js:
import { Component, Attribute, SkipSelf } from "../angular-decorators"; // Import decorators
import { MyTabs } from "./myTabs"; // Import container's directive.
import template from "../templates/my-pane.html"; // Import template.
@Component("myPane", { template, transclude: true }) // Decorate class as a component
export class MyPane // Controller class for the component
{
@SkipSelf(MyTabs) tabsCtrl; //Inject ancestor MyTabs controller.
@Attribute() title; // Attribute "@" binding.
$onInit() // Angular's $onInit life-cycle hook.
{
this.tabsCtrl.addPane(this);
console.log(this);
};
}
- @Component(), @Input(), @Output()
- In the Angular component development guide there is a sample heroDetail component. Here is its rewritten form, components/heroDetail.js:
import { Component, Input, Output } from "../angular-decorators";
import template from "../templates/heroDetail.html";
@Component("heroDetail", { template }) // Decorate class as a component
export class HeroDetail // Controller class for the component
{
@Input() hero; // One way binding "<"
@Output() onDelete; // Bind expression in the context of the parent scope "&"
@Output() onUpdate; // Bind expression in the context of the parent scope "&"
delete()
{
this.onDelete({ hero: this.hero });
};
update(prop, value)
{
this.onUpdate({ hero: this.hero, prop, value });
};
}
@Directive(), @Inject(), @Input(), @BindThis()
- Here is the rewritten myCurrentTime directive, directives/myCurrentTime.js:
import { Directive, Inject, Input, BindThis } from "../angular-decorators"; // Import decorators
@Directive("myCurrentTime") // Decorate MyCurrentTime class as a directive
export class MyCurrentTime // Controller class for the directive
{
@Inject() $interval; // "$interval" service is injected into $interval property
@Inject() dateFilter; // "date" filter service is injected into dateFilter property
@Inject() $element; // "$element" instance is injected into $element property.
@Input() myCurrentTime; // Input one way "<" property.
timeoutId;
// updateTime is adapted as following in the constructor:
// this.updateTime = this.updateTime.bind(this);
@BindThis() updateTime()
{
this.$element.text(this.dateFilter(new Date(), this.myCurrentTime));
}
$onInit() // Angular's $onInit life-cycle hook.
{
this.timeoutId = this.$interval(this.updateTime, 1000);
}
$onDestroy() // Angular's $onDestroys life-cycle hook.
{
this.$interval.cancel(this.timeoutId);
}
$onChanges(changes) // Angular's $onChanges life-cycle hook.
{
this.updateTime();
}
}
@Directive(), @Inject(), @HostListener(), @BindThis()
- In the Angular directive development guide there is a sample myDraggable directive. Here is its rewritten form, directives/myDraggable.js:
import { Directive, Inject, HostListener, BindThis } from "../angular-decorators"; // Import decorators
@Directive("myDraggable") // Decorate class as a directive
export class MyDraggable // Controller class for the directive
{
@Inject() $document; // "$document" instance is injected into $document property.
@Inject() $element;// "$element" instance is injected into $element property.
startX = 0;
startY = 0;
x = 0;
y = 0;
// Listen mousedown event over $element.
@HostListener() mousedown(event)
{
// Prevent default dragging of selected content
event.preventDefault();
this.startX = event.pageX - this.x;
this.startY = event.pageY - this.y;
this.$document.on('mousemove', this.mousemove);
this.$document.on('mouseup', this.mouseup);
}
@BindThis() mousemove(event) // bind mousemove() function to "this" instance.
{
this.y = event.pageY - this.startY;
this.x = event.pageX - this.startX;
this.$element.css({
top: this.y + 'px',
left: this.x + 'px'
});
}
@BindThis() mouseup() // bind mouseup() function to "this" instance.
{
this.$document.off('mousemove', this.mousemove);
this.$document.off('mouseup', this.mouseup);
}
$onInit() // Angular's $onInit life-cycle hook.
{
this.$element.css(
{
position: 'relative',
border: '1px solid red',
backgroundColor: 'lightgrey',
cursor: 'pointer'
});
}
}
@Injectable(), @Inject()
- In the Angular providers development guide there is a sample notifier service. Here is its rewritten form, services/notify.js:
import { Inject, Injectable } from "../angular-decorators"; // Import decorators
@Injectable("notifier") // Decorate class as a service
export class NotifierService
{
@Inject() $window; // Inject "$window" instance into the property
msgs = [];
notify(msg)
{
this.msgs.push(msg);
if (this.msgs.length === 3)
{
this.$window.alert(this.msgs.join('\n'));
this.msgs = [];
}
}
}
@Pipe()
- In the Angular filters development guide there is a sample reverse custom filter. Here is its rewritten form, filters/reverse.js:
import { Pipe } from "../angular-decorators"; // Import decorators
@Pipe("reverse") // Decorate class as a filter
export class ReverseFilter
{
transform(input, uppercase) // filter function.
{
input = input || '';
var out = '';
for(var i = 0; i < input.length; i++)
{
out = input.charAt(i) + out;
}
// conditional based on optional argument
if (uppercase)
{
out = out.toUpperCase();
}
return out;
}
}
- Module(), modules(), angular.bootstrap()
- Here is an example of a class representing an angular module, and a manual angular bootstrap:
import { angular, modules, Module } from "../angular-decorators"; // Import decorators
import { MyController } from "./controllers/myController"; // Import components.
import { HeroList } from "./components/heroList";
import { HeroDetail } from "./components/heroDetail";
import { EditableField } from "./components/editableField";
import { NotifierService } from "./services/notify";
import { MyTabs } from "./components/myTabs";
import { MyPane } from "./components/myPane";
import { ReverseFilter } from "./filters/reverse";
import { MyCurrentTime } from "./directives/myCurrentTime";
import { MyDraggable } from "./directives/myDraggable";
@Module( // Decorator to register angular module, and refer to other modules or module components.
"my-app",
[
MyController,
NotifierService,
HeroList,
HeroDetail,
EditableField,
MyTabs,
MyPane,
ReverseFilter,
MyCurrentTime,
MyDraggable
])
class MyApp { }
// Manual bootstrap, with modules() converting module classes into an array of module names.
angular.bootstrap(document, modules(MyApp));
Please see angular-decorators.js to get detailed help on decorators.
It's a very old theme...
Many years ago we defined a .NET wrapper around the Windows Uniscribe API.
The Uniscribe API is used to render bidirectional languages like Hebrew, so it's mainly important here in Israel.
Once in a while we get requests from people for that API, so we published it on GitHub at https://github.com/nesterovsky-bros/BidiVisualConverter.
You're welcome to use it!
It's time to align csharpxom with the latest version of C#. The article New Language Features in C# 6 sums up what's being added.
Sources can be found at nesterovsky-bros/languages-xom, and the C# model is in the csharp folder.
In general, we are hostile to any new feature until it proves that it brings added value. So, here is our list of new features, from most to least useless:
String interpolation
var s = $"{p.Name} is {p.Age} year{(p.Age == 1 ? "" : "s")} old";
This is useless, as it does not account for resource localization.
Null-conditional operators
int? first = customers?[0].Orders?.Count();
They are claimed to reduce the clutter of null checks, but in our opinion it looks like the opposite. It's better to get a NullReferenceException if the arguments are wrong.
Exception filters
private static bool Log(Exception e) { /* log it */ ; return false; }
…
try { … } catch (Exception e) when (Log(e)) {}
"It is also a common and accepted form of “abuse” to use exception filters for side effects; e.g. logging."
Designing a feature for abuse just does not taste good.
Expression-bodied function and property members.
public Point Move(int dx, int dy) => new Point(x + dx, y + dy);
public string Name => First + " " + Last;
Not sure it's that useful.
Though ADO.NET and other ORM frameworks like EntityFramework and Dapper support the async pattern, you should remember that database drivers (at least all we know about) do not support concurrent db commands running against a single connection.
To see what we mean, consider a bug we recently identified. Consider this code:
await Task.WhenAll(
newImages.
Select(
async image =>
{
// Load data from url.
image.Content = await HttpUtils.ReadData(image.Url);
// Insert image into the database.
image.ImageID = await context.InsertImage(image);
}));
The code runs multiple tasks to read images and write them into a database.
The framework decides to run all these tasks in parallel. HttpUtils.ReadData() has no problem with parallel execution, while context.InsertImage() does not run well in parallel and is subject to race conditions.
To work around the problem we had to use an async variant of a critical section. The fixed code looks like this:
using(var semaphore = new SemaphoreSlim(1))
{
await Task.WhenAll(
newImages.
Select(
async image =>
{
// Load data from url.
image.Content = await HttpUtils.ReadData(image.Url);
await semaphore.WaitAsync();
try
{
// Insert image into the database.
image.ImageID = await context.InsertImage(image);
}
finally
{
semaphore.Release();
}
}));
}
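Depending on the workload, one can also avoid the lock altogether by splitting the stages: download everything in parallel first, then run the database inserts sequentially. A sketch:

// Load data from urls in parallel.
await Task.WhenAll(
  newImages.Select(
    async image =>
    {
      image.Content = await HttpUtils.ReadData(image.Url);
    }));

// Insert images sequentially: one db command at a time per connection.
foreach(var image in newImages)
{
  image.ImageID = await context.InsertImage(image);
}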
So, in the async world we should still care about race conditions.
In one of our recent projects we were dealing with audio: capture audio in the browser, store it on the server, and then return it on request and replay it in the browser.
Though audio capturing is by itself a rather interesting and challenging task, it's addressed by HTML5; for example, take a look at this article. Here we share our findings about another problem, namely audio conversion.
You might think that if you have already captured audio in the browser then you will be able to play it back, so no additional audio conversion is required.
In practice we are limited by the support of various audio formats in browsers. Browsers can capture audio in WAV format, but this format is rather heavy for storing and streaming back. Moreover, not all browsers support this format for playback; see wikipedia for details. There are only two audio formats that are more or less widely supported by mainstream browsers: MP3 and AAC. So, you have to convert WAV either to MP3 or to AAC.
The obvious choice is WAV to MP3 conversion; the benefit is that there are many libraries and APIs for such a conversion. But in this case you risk falling into a trap with MP3 licensing, especially if you deal with interactive software products.
Eventually, you will come to the only possible solution (at least at the moment of writing): converting WAV to AAC.
The native solution is to use the NAudio library, which behind the scenes uses Media Foundation Transforms. You'll quickly get a working example. Actually, the core of the solution contains only a few lines:
var source = Path.Combine(root, "audio.wav");
var target = Path.Combine(root, "audio.m4a");
using(var reader = new NAudio.Wave.WaveFileReader(source))
{
MediaFoundationEncoder.EncodeToAac(reader, target);
}
Everything is great. You deploy your code on a server (by the way, the server must be Windows Server 2008 R2 or higher), and at this point you may find that your code fails. The problem is that the Media Foundation API is not preinstalled as part of a Windows Server installation and must be installed separately. If you own the server then everything is all right, but if you use a public web hosting server then you won't be able to install the Media Foundation API, and your application will never work properly. That's what happened to us...
After some research we came to the conclusion that another possible solution is a wrapper around an open source video/audio converter - FFMPEG. There were two issues with this solution:
- how to execute ffmpeg.exe on the server asynchronously;
- how to limit the maximum number of parallel requests to the conversion service.
These issues were successfully resolved in our prototype conversion service that you may see here, with the source published on github. The solution is a Web API based REST service with a simple client that uploads audio files to the server using AJAX requests and plays them back. As a bonus, this solution allows us to perform not only WAV to AAC conversions, but conversions from many other formats to AAC without additional effort.
Let's take a closer look at the crucial details of this solution. The core is the FFMpegWrapper class, which allows running ffmpeg.exe asynchronously:
/// <summary>
/// A ffmpeg.exe open source utility wrapper.
/// </summary>
public class FFMpegWrapper
{
/// <summary>
/// Creates a wrapper for ffmpeg utility.
/// </summary>
/// <param name="ffmpegexe">a real path to ffmpeg.exe</param>
public FFMpegWrapper(string ffmpegexe)
{
if (!string.IsNullOrEmpty(ffmpegexe) && File.Exists(ffmpegexe))
{
this.ffmpegexe = ffmpegexe;
}
}
/// <summary>
/// Runs ffmpeg asynchronously.
/// </summary>
/// <param name="args">determines command line arguments for ffmpeg.exe</param>
/// <returns>
/// asynchronous result with ProcessResults instance that contains
/// stdout, stderr and process exit code.
/// </returns>
public Task<ProcessResults> Run(string args)
{
if (string.IsNullOrEmpty(ffmpegexe))
{
throw new InvalidOperationException("Cannot find FFMPEG.exe");
}
//create a process info object so we can run our app
var info = new ProcessStartInfo
{
FileName = ffmpegexe,
Arguments = args,
CreateNoWindow = true
};
return ProcessEx.RunAsync(info);
}
private string ffmpegexe;
}
Running a process asynchronously became possible thanks to James Manning and his ProcessEx class.
Another useful part is the semaphore declaration in Global.asax.cs:
public class WebApiApplication : HttpApplication
{
protected void Application_Start()
{
GlobalConfiguration.Configure(WebApiConfig.Register);
}
/// <summary>
/// Gets application level semaphore that controls number of running
/// in parallel FFMPEG utilities.
/// </summary>
public static SemaphoreSlim Semaphore
{
get { return semaphore; }
}
private static SemaphoreSlim semaphore;
static WebApiApplication()
{
var value =
ConfigurationManager.AppSettings["NumberOfConcurentFFMpegProcesses"];
int intValue = 10;
if (!string.IsNullOrEmpty(value))
{
try
{
intValue = System.Convert.ToInt32(value);
}
catch
{
// use the default value
}
}
semaphore = new SemaphoreSlim(intValue, intValue);
}
}
And the last piece is the entry point, which was implemented as a REST controller:
/// <summary>
/// A controller to convert audio.
/// </summary>
public class AudioConverterController : ApiController
{
/// <summary>
/// Gets ffmpeg utility wrapper.
/// </summary>
public FFMpegWrapper FFMpeg
{
get
{
if (ffmpeg == null)
{
ffmpeg = new FFMpegWrapper(
HttpContext.Current.Server.MapPath("~/lib/ffmpeg.exe"));
}
return ffmpeg;
}
}
/// <summary>
/// Converts an audio in WAV, OGG, MP3 or other formats
/// to AAC format (MP4 audio).
/// </summary>
/// <returns>A data URI as a string.</returns>
[HttpPost]
public async Task<string> ConvertAudio([FromBody]string audio)
{
if (string.IsNullOrEmpty(audio))
{
throw new ArgumentException(
"Invalid audio stream (probably the input audio is too big).");
}
var tmp = Path.GetTempFileName();
var root = tmp + ".dir";
Directory.CreateDirectory(root);
File.Delete(tmp);
try
{
var start = audio.IndexOf(':');
var end = audio.IndexOf(';');
var mimeType = audio.Substring(start + 1, end - start - 1);
var ext = mimeType.Substring(mimeType.IndexOf('/') + 1);
var source = Path.Combine(root, "audio." + ext);
var target = Path.Combine(root, "audio.m4a");
await WriteToFileAsync(audio, source);
switch (ext)
{
case "mpeg":
case "mp3":
case "wav":
case "wma":
case "ogg":
case "3gp":
case "amr":
case "aif":
case "mid":
case "au":
{
await WebApiApplication.Semaphore.WaitAsync();
var result = await FFMpeg.Run(
string.Format(
"-i {0} -c:a libvo_aacenc -b:a 96k {1}",
source,
target));
WebApiApplication.Semaphore.Release();
if (result.Process.ExitCode != 0)
{
throw new InvalidDataException(
"Cannot convert this audio file to audio/mp4.");
}
break;
}
default:
{
throw new InvalidDataException(
"Mime type: '" + mimeType + "' is not supported.");
}
}
var buffer = await ReadAllBytes(target);
var response = "data:audio/mp4;base64," + System.Convert.ToBase64String(buffer);
return response;
}
finally
{
Directory.Delete(root, true);
}
}
For those who'd like to read more about audio conversion, we suggest reading this article.
Earlier this year Mike Wasson published a post, "Dependency Injection in ASP.NET Web API 2", that describes Web API's approach to the Dependency Injection design pattern.
In short it goes like this:
- Web API provides a primary integration point through the HttpConfiguration.DependencyResolver property, and tries to obtain many services through this resolver;
- Web API suggests using your favorite Dependency Injection library through that integration point. The author lists the following libraries: Unity (by Microsoft), Castle Windsor, Spring.Net, Autofac, Ninject, and StructureMap.
The Unity Container (Unity) is a lightweight, extensible dependency injection container. There are NuGet packages both for the Unity library and for Web API integration.
Now to the point of this post.
Unity defines a hierarchy of injection scopes. In Web API they are usually mapped to application and request scopes. This way a developer can inject application singletons, request-level objects, or transient objects.
Everything looks reasonable. The only problem we have found is that there is no way to inject Web API objects like HttpConfiguration, HttpControllerContext or the request's CancellationToken, as they are never registered for injection.
To work around this we have created a small class called UnityControllerActivator that performs the required registration:
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;
using System.Web.Http.Controllers;
using System.Web.Http.Dispatcher;
using Microsoft.Practices.Unity;
/// <summary>
/// Unity controller activator.
/// </summary>
public class UnityControllerActivator: IHttpControllerActivator
{
/// <summary>
/// Creates an UnityControllerActivator instance.
/// </summary>
/// <param name="activator">Base activator.</param>
public UnityControllerActivator(IHttpControllerActivator activator)
{
if (activator == null)
{
throw new ArgumentException("activator");
}
this.activator = activator;
}
/// <summary>
/// Creates a controller wrapper.
/// </summary>
/// <param name="request">A http request.</param>
/// <param name="controllerDescriptor">Controller descriptor.</param>
/// <param name="controllerType">Controller type.</param>
/// <returns>A controller wrapper.</returns>
public IHttpController Create(
HttpRequestMessage request,
HttpControllerDescriptor controllerDescriptor,
Type controllerType)
{
return new Controller
{
activator = activator,
controllerType = controllerType
};
}
/// <summary>
/// Base controller activator.
/// </summary>
private readonly IHttpControllerActivator activator;
/// <summary>
/// A controller wrapper.
/// </summary>
private class Controller: IHttpController, IDisposable
{
/// <summary>
/// Base controller activator.
/// </summary>
public IHttpControllerActivator activator;
/// <summary>
/// Controller type.
/// </summary>
public Type controllerType;
/// <summary>
/// A controller instance.
/// </summary>
public IHttpController controller;
/// <summary>
/// Disposes controller.
/// </summary>
public void Dispose()
{
var disposable = controller as IDisposable;
if (disposable != null)
{
disposable.Dispose();
}
}
/// <summary>
/// Executes an action.
/// </summary>
/// <param name="controllerContext">Controller context.</param>
/// <param name="cancellationToken">Cancellation token.</param>
/// <returns>Response message.</returns>
public Task<HttpResponseMessage> ExecuteAsync(
HttpControllerContext controllerContext,
CancellationToken cancellationToken)
{
if (controller == null)
{
var request = controllerContext.Request;
var container = request.GetDependencyScope().
GetService(typeof(IUnityContainer)) as IUnityContainer;
if (container != null)
{
container.RegisterInstance<HttpControllerContext>(controllerContext);
container.RegisterInstance<HttpRequestMessage>(request);
container.RegisterInstance<CancellationToken>(cancellationToken);
}
controller = activator.Create(
request,
controllerContext.ControllerDescriptor,
controllerType);
}
controllerContext.Controller = controller;
return controller.ExecuteAsync(controllerContext, cancellationToken);
}
}
}
A note on how it works:
- IHttpControllerActivator is a controller factory, which Web API uses to create new controller instances through IHttpControllerActivator.Create(). Later, the controller's IHttpController.ExecuteAsync() is called to run the logic.
- UnityControllerActivator replaces the original controller activator with a wrapper that delays the creation (injection) of the real controller until the request objects are registered in the scope.
To register this class, one needs to update the code in UnityWebApiActivator.cs (a file added with the Unity.AspNet.WebApi nuget):
public static class UnityWebApiActivator
{
/// <summary>Integrates Unity when the application starts.</summary>
public static void Start()
{
var config = GlobalConfiguration.Configuration;
var container = UnityConfig.GetConfiguredContainer();
container.RegisterInstance<HttpConfiguration>(config);
container.RegisterInstance<IHttpControllerActivator>(
new UnityControllerActivator(config.Services.GetHttpControllerActivator()));
config.DependencyResolver = new UnityHierarchicalDependencyResolver(container);
}
...
}
With this addition we have simplified the boring problem of passing a CancellationToken all around the code, as a controller (and other classes) just declares a property to inject:
public class MyController: ApiController
{
[Dependency]
public CancellationToken CancellationToken { get; set; }
[Dependency]
public IModelContext Model { get; set; }
public async Task<IEnumerable<Products>> GetProducts(...)
{
...
}
public async Task<IEnumerable<Customer>> GetCustomer(...)
{
...
}
...
}
...
public class ModelContext: IModelContext
{
[Dependency]
public CancellationToken CancellationToken { get; set; }
...
}
And finally, to unit test controllers with Dependency Injection you can use code like this:
using System.Threading;
using System.Threading.Tasks;
using System.Web.Http;
using System.Web.Http.Controllers;
using System.Web.Http.Dependencies;
using System.Net.Http;
using Microsoft.Practices.Unity;
using Microsoft.Practices.Unity.WebApi;
using Microsoft.VisualStudio.TestTools.UnitTesting;
[TestClass]
public class MyControllerTest
{
[ClassInitialize]
public static void Initialize(TestContext context)
{
config = new HttpConfiguration();
Register(config);
}
[ClassCleanup]
public static void Cleanup()
{
config.Dispose();
}
[TestMethod]
public async Task GetProducts()
{
var controller = CreateController<MyController>();
//...
}
public static T CreateController<T>(HttpRequestMessage request = null)
where T: ApiController
{
if (request == null)
{
request = new HttpRequestMessage();
}
request.SetConfiguration(config);
var controllerContext = new HttpControllerContext()
{
Configuration = config,
Request = request
};
var scope = request.GetDependencyScope();
var container = scope.GetService(typeof(IUnityContainer))
as IUnityContainer;
if (container != null)
{
container.RegisterInstance<HttpControllerContext>(controllerContext);
container.RegisterInstance<HttpRequestMessage>(request);
container.RegisterInstance<CancellationToken>(CancellationToken.None);
}
T controller = scope.GetService(typeof(T)) as T;
controller.Configuration = config;
controller.Request = request;
controller.ControllerContext = controllerContext;
return controller;
}
public static void Register(HttpConfiguration config)
{
config.DependencyResolver = CreateDependencyResolver(config);
}
public static IDependencyResolver CreateDependencyResolver(HttpConfiguration config)
{
var container = new UnityContainer();
container.RegisterInstance<HttpConfiguration>(config);
// TODO: configure Unity container.
return new UnityHierarchicalDependencyResolver(container);
}
public static HttpConfiguration config;
}
P.S. To those who think Dependency Injection is a universal tool, please read the article: Dependency Injection is Evil.
Farewell Entity Framework
and hello Dapper!
For many years we used Entity Framework. It's still very popular and is Microsoft's primary Object-Relational Mapper library.
Clearly, the decision is subjective, but here are our arguments.
We know and love SQL, and think that in its domain it holds a strong position. What SQL leaves out of scope is a bridge between itself and other languages. That's where an ORM should help.
We strongly believe that no ORM library should try to hide SQL behind the object language itself. We believe in a separation of roles in development: database design and the Data Access Layer should be separated from the client logic. Thus we strive, if possible, to encapsulate data access in SQL functions and stored procedures.
Entity Framework, in contrast, tries to factor SQL out, presenting the client with an object graph perspective. Initially it looks promising, but in the end a developer must remember that any object query is mapped back to SQL. Without keeping this in mind, either the query won't compile, or the performance will be poor.
E.g. this query will probably fail to build SQL, as Regex cannot be mapped to SQL:
var result = context.Content.
Where(data => Regex.IsMatch(data.Content, pattern)).
ToArray();
This query might be slow if no suitable SQL index is defined:
var result = context.Content.
Where(data => data.Field == value).
ToArray();
Thus no EF goal is achieved completely: SQL power is limited, and the Data Access Layer is often fused into the rest of the client logic.
We think that Entity Framework is an over-engineered library that tries to be more than an ORM. Its generality often bumps into the limits of SQL support in EF: SQL dialects, types, operators, functions, and so on. One can observe that people have appealed for years to introduce support for xml, hierarchyid, geometry/geography types, full text search, and so on. This state of affairs cannot be different, as EF will never be able, and does not aim, to support all SQL features.
EF has both a design-time and a runtime part. Each database vendor has to implement an EF adapter for EF to play well with that database. This cooperation is not always smooth. E.g. see Database first create entity framework 6.1.1 model using system.data.sqlite 1.0.93.
At some point the cost of dealing with EF became too high for us, so we started to look into alternatives: from plain ADO.NET to lighter ORM libraries.
To our delight we immediately found Dapper - a simple object mapper for .NET. It provides simple extensions to the IDbConnection interface that map query parameters to object properties and query results to plain types. Here are some examples:
// Get Customer
var customer = connection.
  Query<Customer>("select * from Customers where CustomerId = @id", new { id = customerID }).
  Single();
// Insert a value
connection.Execute("insert into MyTable(A, B) values(@a, @b)", new { a = 2, b = 3 });
So, Dapper leaves you with plain SQL, which we consider an advantage.
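And since we prefer to encapsulate data access in stored procedures, Dapper fits naturally there too; here is a sketch with a hypothetical GetCustomerOrders procedure (CommandType comes from System.Data):

// Query logic lives in a stored procedure; Dapper only maps parameters and results.
var orders = connection.
  Query<Order>(
    "GetCustomerOrders",
    new { CustomerID = customerID, From = fromDate, To = toDate },
    commandType: CommandType.StoredProcedure).
  ToList();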
Besides being minimalistic compared to EF, Dapper claims performance close to pure hand-written ADO.NET. Indeed, it builds dynamic methods to populate parameters and to create row instances, so reflection is used during the warm-up period only.
Looking at Guava's Cache, we think its API is more convenient than .NET's Cache API.
Just consider:
- .NET has getters and setters of objects by string keys. You have to provide a caching policy with each setter.
- Guava's cache operates on typed storage of Key to Value, and takes a value factory and a caching policy in advance, at cache construction.
Guava's advantage is based on the idea that homogeneous storage assumes a uniform way of creating values and a uniform caching policy. Thus a great part of the logic is factored out into the cache initialization.
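To give a feel of the API we were aiming for, here is a hypothetical usage sketch of the adapter shown below (the User type and LoadUserFromDatabase() are placeholders):

// A typed cache of users by id with a 10 minute sliding expiration.
Cache<int, User> userCache = new Cache<int, User>.Builder
{
  Expiration = TimeSpan.FromMinutes(10),
  Sliding = true,
  Factory = id => LoadUserFromDatabase(id)
};

var user = userCache[42]; // created on first access, cached afterwards
var sameUser = await userCache.GetAsync(42); // async variant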
We have decided to create a simple adapter of the MemoryCache to achieve the same goal. Here is a result of such an experiment:
public class Cache<K, V>
where V: class
{
/// <summary>
/// A cache builder.
/// </summary>
public struct Builder
{
/// <summary>
/// A memory cache. If not specified then MemoryCache.Default is used.
/// </summary>
public MemoryCache MemoryCache;
/// <summary>
/// An expiration value.
/// Alternatively CachePolicyFunc can be used.
/// </summary>
public TimeSpan Expiration;
/// <summary>
/// Indicates whether to use sliding (true), or absolute (false)
/// expiration.
/// Alternatively CachePolicyFunc can be used.
/// </summary>
public bool Sliding;
/// <summary>
/// Optional function to get caching policy.
/// Alternatively Expiration and Sliding property can be used.
/// </summary>
public Func<V, CacheItemPolicy> CachePolicyFunc;
/// <summary>
/// Optional value validator.
/// </summary>
public Func<V, bool> Validator;
/// <summary>
/// A value factory.
/// Alternatively FactoryAsync can be used.
/// </summary>
public Func<K, V> Factory;
/// <summary>
/// Async value factory.
/// Alternatively Factory can be used.
/// </summary>
public Func<K, Task<V>> FactoryAsync;
/// <summary>
/// A key to string converter.
/// </summary>
public Func<K, string> KeyFunc;
/// <summary>
/// Converts builder to a Cache<K, V> instance.
/// </summary>
/// <param name="builder">A builder to convert.</param>
/// <returns>A Cache<K, V> instance.</returns>
public static implicit operator Cache<K, V>(Builder builder)
{
return new Cache<K, V>(builder);
}
}
/// <summary>
/// Creates a cache from a cache builder.
/// </summary>
/// <param name="builder">A cache builder instance.</param>
public Cache(Builder builder)
{
if ((builder.Factory == null) && (builder.FactoryAsync == null))
{
throw new ArgumentException("builder.Factory");
}
if (builder.MemoryCache == null)
{
builder.MemoryCache = MemoryCache.Default;
}
this.builder = builder;
}
/// <summary>
/// Cached value by key.
/// </summary>
/// <param name="key">A key.</param>
/// <returns>A cached value.</returns>
public V this[K key]
{
get { return Get(key); }
set { Set(key, value); }
}
/// <summary>
/// Sets a value for a key.
/// </summary>
/// <param name="key">A key to set.</param>
/// <param name="value">A value to set.</param>
public void Set(K key, V value)
{
SetImpl(GetKey(key), IsValid(value) ? value : null);
}
/// <summary>
/// Gets a value for a key.
/// </summary>
/// <param name="key">A key to get value for.</param>
/// <returns>A value instance.</returns>
public V Get(K key)
{
var keyValue = GetKey(key);
var value = builder.MemoryCache.Get(keyValue) as V;
if (!IsValid(value))
{
value = CreateValue(key);
SetImpl(keyValue, value);
}
return value;
}
/// <summary>
/// Gets a task to return an async value.
/// </summary>
/// <param name="key">A key.</param>
/// <returns>A cached value.</returns>
public async Task<V> GetAsync(K key)
{
var keyValue = GetKey(key);
var value = builder.MemoryCache.Get(keyValue) as V;
if (!IsValid(value))
{
value = await CreateValueAsync(key);
SetImpl(keyValue, value);
}
return value;
}
/// <summary>
/// Gets string key value for a key.
/// </summary>
/// <param name="key">A key.</param>
/// <returns>A string key value.</returns>
protected string GetKey(K key)
{
return builder.KeyFunc != null ? builder.KeyFunc(key) :
key == null ? null : key.ToString();
}
/// <summary>
/// Creates a value for a key.
/// </summary>
/// <param name="key">A key to create value for.</param>
/// <returns>A value instance.</returns>
protected V CreateValue(K key)
{
return builder.Factory != null ? builder.Factory(key) :
builder.FactoryAsync(key).Result;
}
/// <summary>
/// Creates a task for value for a key.
/// </summary>
/// <param name="key">A key to create value for.</param>
/// <returns>A task for a value instance.</returns>
protected Task<V> CreateValueAsync(K key)
{
return builder.FactoryAsync != null ? builder.FactoryAsync(key) :
Task.FromResult(builder.Factory(key));
}
/// <summary>
/// Validates the value.
/// </summary>
/// <param name="value">A value to validate.</param>
/// <returns>
/// true if value is valid for a cache, and false otherise.
/// </returns>
protected bool IsValid(V value)
{
return (value != null) &&
((builder.Validator == null) || builder.Validator(value));
}
/// <summary>
/// Set implementation.
/// </summary>
/// <param name="key">A key to set value for.</param>
/// <param name="value">A value to set.</param>
/// <returns>A set value.</returns>
private V SetImpl(string key, V value)
{
if (value == null)
{
builder.MemoryCache.Remove(key);
}
else
{
builder.MemoryCache.Set(
key,
value,
builder.CachePolicyFunc != null ? builder.CachePolicyFunc(value) :
builder.Sliding ?
new CacheItemPolicy { SlidingExpiration = builder.Expiration } :
new CacheItemPolicy
{
AbsoluteExpiration = DateTime.Now + builder.Expiration
});
}
return value;
}
/// <summary>
/// Cache builder.
/// </summary>
private Builder builder;
}
Usage consists of an initialization:
Cache<MyKey, MyValue> MyValues =
new Cache<MyKey, MyValue>.Builder
{
KeyFunc = key => ...key to string value...,
Factory = key => ...create a value for a key...,
Expiration = new TimeSpan(0, 3, 0),
Sliding = true
};
and a trivial cache access:
var value = MyValues[key];
This contrasts with MemoryCache coding pattern:
MemoryCache cache = MemoryCache.Default;
...
var keyAsString = ...key to string value...
var value = cache.Get(keyAsString) as MyValue;
if (value == null)
{
value = ...create a value for a key...
cache.Set(keyAsString, value, ...caching policy...);
}
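For async value factories the same cache can be used through GetAsync; a sketch, where LoadValueAsync is a hypothetical async loader:
Cache<MyKey, MyValue> MyValuesAsync =
  new Cache<MyKey, MyValue>.Builder
  {
    KeyFunc = key => ...key to string value...,
    FactoryAsync = key => LoadValueAsync(key),
    Expiration = new TimeSpan(0, 3, 0),
    Sliding = true
  };
...
var value = await MyValuesAsync.GetAsync(key);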
The Enumerable class contains many overloads with an IEqualityComparer<T> argument. The most notable methods are:
- Contains;
- Distinct;
- Except;
- GroupBy;
- Intersect;
- Join;
- ToDictionary;
- ToLookup;
- Union.
Recently we dealt with a simple case:
source.
Select(
item =>
new Word
{
Text = ...,
LangID = ...,
Properties = ...
...
}).
Distinct(equality comparer by Text and LangID);
In other words, how do you produce an enumeration of distinct words from an enumeration of words, where two words qualify as equal if their Text and LangID are equal?
It turns out it's cumbersome to implement the IEqualityComparer<T> interface (as with any other interface in C#); at least it's nothing close to the conciseness of lambda functions.
Here we've decided to step into the framework space and introduce an API to define simple equality comparers for a class.
We start from the use case:
var wordComparer =
KeyEqualityComparer.Null<Word>().
ThenBy(item => item.Text).
ThenBy(item => item.LangID);
...
source.Select(...).Distinct(wordComparer);
And then proceed to the API:
namespace NesterovskyBros.Linq
{
using System;
using System.Collections;
using System.Collections.Generic;
/// <summary>
/// Equality comparer extensions.
/// </summary>
public static class KeyEqualityComparer
{
/// <summary>
/// Gets null as equality comparer for a type.
/// </summary>
/// <typeparam name="T">A type.</typeparam>
/// <returns>
/// null as equality comparer for a type.
/// </returns>
public static IEqualityComparer<T> Null<T>()
{
return null;
}
/// <summary>
/// Creates an equality comparer for an enumeration item.
/// </summary>
/// <typeparam name="T">A type.</typeparam>
/// <param name="source">A source items.</param>
/// <param name="keyFunc">A key function.</param>
/// <returns>
/// An equality comparer for the enumeration item type.
/// </returns>
public static IEqualityComparer<T> EqualityComparerBy<T, K>(
this IEnumerable<T> source,
Func<T, K> keyFunc)
{
return new KeyEqualityComparer<T, K>(keyFunc);
}
/// <summary>
/// Creates an equality comparer that uses this comparer as a base.
/// </summary>
/// <typeparam name="T">A type.</typeparam>
/// <typeparam name="K">A key type.</typeparam>
/// <param name="equalityComparer">A base equality comparer.</param>
/// <param name="keyFunc">A key function.</param>
/// <returns>
/// An equality comparer that uses this comparer as a base.
/// </returns>
public static KeyEqualityComparer<T, K> ThenBy<T, K>(
this IEqualityComparer<T> equalityComparer,
Func<T, K> keyFunc)
{
return new KeyEqualityComparer<T, K>(keyFunc, equalityComparer);
}
}
/// <summary>
/// Equality comparer that uses a function to extract a comparision key.
/// </summary>
/// <typeparam name="T">A type.</typeparam>
/// <typeparam name="K">A key type.</typeparam>
public struct KeyEqualityComparer<T, K>: IEqualityComparer<T>
{
/// <summary>
/// Creates an equality comparer.
/// </summary>
/// <param name="keyFunc">A key function.</param>
/// <param name="equalityComparer">A base equality comparer.</param>
public KeyEqualityComparer(
Func<T, K> keyFunc,
IEqualityComparer<T> equalityComparer = null)
{
KeyFunc = keyFunc;
EqualityComparer = equalityComparer;
}
/// <summary>
/// Determines whether the specified objects are equal.
/// </summary>
/// <param name="x">The first object of type T to compare.</param>
/// <param name="y">The second object of type T to compare.</param>
/// <returns>
/// true if the specified objects are equal; otherwise, false.
/// </returns>
public bool Equals(T x, T y)
{
return ((EqualityComparer == null) || EqualityComparer.Equals(x, y)) &&
EqualityComparer<K>.Default.Equals(KeyFunc(x), KeyFunc(y));
}
/// <summary>
/// Returns a hash code for the specified object.
/// </summary>
/// <param name="obj">
/// The value for which a hash code is to be returned.
/// </param>
/// <returns>A hash code for the specified object.</returns>
public int GetHashCode(T obj)
{
var hash = EqualityComparer<K>.Default.GetHashCode(KeyFunc(obj));
if (EqualityComparer != null)
{
var hash2 = EqualityComparer.GetHashCode(obj);
hash ^= (hash2 << 5) + hash2;
}
return hash;
}
/// <summary>
/// A key function.
/// </summary>
public readonly Func<T, K> KeyFunc;
/// <summary>
/// Optional base equality comparer.
/// </summary>
public readonly IEqualityComparer<T> EqualityComparer;
}
}
So, now you can easily build simple equality comparers, either to cache them or to instantiate them on the fly. These comparers usually compare property values, or some function of the source values.
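For instance, EqualityComparerBy() lets the compiler infer the types from the source sequence, so an anonymous-type key gives the same comparer without declaring anything; a sketch, where words stands for an IEnumerable<Word>:
var distinctWords = words.
  Distinct(words.EqualityComparerBy(item => new { item.Text, item.LangID })).
  ToArray();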
See also LINQ extensions
This is a small post about refactoring lock statements in async methods.
Before refactoring we had a code like this:
lock(sync)
{
result = methodToRefactorIntoAsync();
}
...
private object sync = new object();
A lock is bound to a thread, thus there is no way to use it in async code. As an alternative you may use the SemaphoreSlim class:
await sync.WaitAsync(cancellationToken);
try
{
result = await methodAsync(cancellationToken);
}
finally
{
sync.Release();
}
...
private SemaphoreSlim sync = new SemaphoreSlim(1, 1);
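To reduce the ceremony one may wrap the semaphore into a disposable scope. This is just a sketch of such a hypothetical helper (we call it AsyncLock here), not a framework class:
public sealed class AsyncLock
{
  private readonly SemaphoreSlim semaphore = new SemaphoreSlim(1, 1);

  // Waits for the semaphore and returns a scope that releases it on Dispose().
  public async Task<IDisposable> EnterAsync(
    CancellationToken cancellationToken = default(CancellationToken))
  {
    await semaphore.WaitAsync(cancellationToken);

    return new Releaser(semaphore);
  }

  private sealed class Releaser: IDisposable
  {
    private readonly SemaphoreSlim semaphore;

    public Releaser(SemaphoreSlim semaphore)
    {
      this.semaphore = semaphore;
    }

    public void Dispose()
    {
      semaphore.Release();
    }
  }
}
With it the refactored code reads almost like the original lock:
using(await asyncLock.EnterAsync(cancellationToken))
{
  result = await methodAsync(cancellationToken);
}
...
private AsyncLock asyncLock = new AsyncLock();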
What will you do if you have an async Web API method that runs on the server for some time, but your client is dropped?
There are two solutions:
- Run the method to the end and allow the framework to deal with the disconnect;
- Try to be notified about the client's drop and break early.
The first approach is simplest but might result in some overconsumption of server resources.
The other method requires you to check client status from time to time.
Fortunately, ASP.NET provides the HttpResponse.ClientDisconnectedToken property, which is limited to IIS 7.5+ in integrated mode, but still fits our needs.
So, you should request the ClientDisconnectedToken, if any, and implement your async code using that token.
The following extension function gets that token:
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;
using System.Threading;
using System.Web;
public static class HttpApiExtensions
{
public static CancellationToken GetCancellationToken(
this HttpRequestMessage request)
{
var cancellationToken = CancellationToken.None;
object value;
var key = typeof(HttpApiExtensions).Namespace + ":CancellationToken";
if (request.Properties.TryGetValue(key, out value))
{
return (CancellationToken)value;
}
var httpContext = HttpContext.Current;
if (httpContext != null)
{
var httpResponse = httpContext.Response;
if (httpResponse != null)
{
try
{
cancellationToken = httpResponse.ClientDisconnectedToken;
}
catch
{
// Do not support cancellation.
}
}
}
request.Properties[key] = cancellationToken;
return cancellationToken;
}
}
And here is a Web API WordCount service described in the previous post:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;
public class ValuesController: ApiController
{
public async Task<int> GetWordCount([FromUri(Name = "url")] string[] urls)
{
var cancellationToken = Request.GetCancellationToken();
using(var client = new HttpClient())
{
return (await Task.WhenAll(
urls.Select(url => WordCountAsync(client, url, cancellationToken)))).Sum();
}
}
public static async Task<int> WordCountAsync(
HttpClient client,
string url,
CancellationToken cancellationToken)
{
string content = await (await client.GetAsync(url, cancellationToken)).
Content.ReadAsStringAsync();
return WordCount(content);
}
private static int WordCount(string text)
{
var count = 0;
var space = true;
for (var i = 0; i < text.Length; ++i)
{
if (space != char.IsWhiteSpace(text[i]))
{
space = !space;
if (!space)
{
++count;
}
}
}
return count;
}
}
Though it is simple, there is a nuisance: you should pass the cancellation token here and there, which adds to the pollution caused by async.
Though parallel and async algorithms solve different tasks, they converge in some cases, and it's not always immediately clear which one is best.
Consider the following task: get the total word count contained in a given set of urls.
At first we solved it as a parallel task: indeed, this fits the MapReduce pattern, where you get the urls' contents and count the words in parallel (Map), and then sum the word counts per url to get the final result (Reduce).
But then we decided that the very same MapReduce algorithm can be implemented with async .
This is a parallel word count:
public static int ParallelWordCount(IEnumerable<string> urls)
{
var result = 0;
Parallel.ForEach(
urls,
url =>
{
string content;
using(var client = new WebClient())
{
content = client.DownloadString(url);
}
var count = WordCount(content);
Interlocked.Add(ref result, count);
});
return result;
}
Here is async word count:
public static async Task<int> WordCountAsync(IEnumerable<string> urls)
{
return (await Task.WhenAll(urls.Select(url => WordCountAsync(url)))).Sum();
}
public static async Task<int> WordCountAsync(string url)
{
string content;
using(var client = new WebClient())
{
content = await client.DownloadStringTaskAsync(url);
}
return WordCount(content);
}
And this is an implementation of word count for a text (it's less important for this discussion):
public static int WordCount(string text)
{
var count = 0;
var space = true;
for(var i = 0; i < text.Length; ++i)
{
if (space != char.IsWhiteSpace(text[i]))
{
space = !space;
if (!space)
{
++count;
}
}
}
return count;
}
Our impressions are:
The parallel version is contained in one method, while the async one is implemented with two methods.
This is due to the fact that the C# compiler fails to generate an async lambda function. We attribute this to Microsoft, who leads and implements the C# spec. Features should be composable: if one can implement a method as a lambda function, and one can implement a method as async, then one should be able to implement a method as an async lambda function.
Both parallel and async versions are using thread pool to run their logic.
While both implementations follow the MapReduce pattern, we can see that the async version is much more scalable. This is because the parallel threads stay blocked while waiting for an http response, while async tasks are not bound to any thread and simply are not running while waiting for I/O.
This sample helped us to answer the question of when to use parallel and when to use async. The simple answer goes like this:
- if your logic is only CPU bound then use the parallel API;
- otherwise use the async API (this accounts for I/O waits).
Not long ago C# introduced special language constructs to simplify asynchronous programming. It seems C++1x will follow the async trend. But only recently, when frameworks like ASP.NET Web API and Entity Framework started to catch up, have we felt what it's like to program with the async and await keywords.
At first glance it seems it's a pure pleasure to write async methods:
private async Task SumPageSizesAsync()
{
// To use the HttpClient type in desktop apps, you must include a using directive and add a
// reference for the System.Net.Http namespace.
HttpClient client = new HttpClient();
// . . .
byte[] urlContents = await client.GetByteArrayAsync(url);
// . . .
}
To dereference a Task<T> into T you just write await task_t_expression, mark your method with the async specifier, and adjust the return type (if not void) to Task or Task<Result>. The compiler applies its magic to convert your code into an asynchronous state machine.
We liked this feature and immediately started to use it. But, as we said, async/await has shined in full only when frameworks made it a core element, and at that point we started to see that while async/await solve the task, they do not abstract the developer from implementation details, as the code gets considerably polluted.
Consider a method where the async-related parts constitute the pollution:
public static async Task<UserAuthorization> GetAuthorizationAsync(string accessToken)
{
var key = "oauth2:" + accessToken;
var authorization = cache.Get<UserAuthorization>(key);
if (authorization != null)
{
return authorization;
}
using(var model = new ModelContainer())
{
authorization =
(await model.UserAuthorizations.
Where(item => item.AccessToken == accessToken).
ToListAsync()).
FirstOrDefault();
}
if (authorization == null)
{
authorization = await ValidateAsync(accessToken);
}
cache.Set(key, cache.ShortDelay, authorization);
return authorization;
}
The more you use async, the more pollution you will see in your code. Ideally we would like to see the quoted method without any of those async-related parts.
We needed to have oauth2 authorization in an angularjs project.
An Internet search on the subject immediately brings a large number of solutions like:
But unfortunately:
- provider-specific libraries have too different sets of APIs, which requires yet another umbrella library to allow an application to accept several providers;
- angular-oauth supports Google only, and does not work in IE 11 with default security settings;
- oauth.io looks attractive but adds an additional level of indirection (their server), and is free for the basic plan only.
However there is a problem with all those approaches.
Let's assume that you have properly implemented client-side authorization, and finally have gotten an access_token.
Now you access your server with that access_token. What is your first step on the server?
Right! You should validate it against the oauth2 provider.
So, while client-side authorization, among other things, included a validation of your token, you have to perform the validation on the server once more.
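For example, for Google the server-side check may boil down to a call to the provider's tokeninfo endpoint; this is only a hedged sketch (endpoint per Google's OAuth2 documentation, error handling and the audience check reduced to a comment):
public static async Task<bool> ValidateGoogleTokenAsync(
  HttpClient client,
  string accessToken)
{
  // Ask the provider whether the token is valid; a real implementation
  // should also verify that the token was issued for our client_id (audience).
  var response = await client.GetAsync(
    "https://www.googleapis.com/oauth2/v3/tokeninfo?access_token=" +
    Uri.EscapeDataString(accessToken));

  return response.IsSuccessStatusCode;
}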
At this point we felt that we needed to implement our own oauth2 API.
It can be found at nesterovsky-bros/oauth2.
This is the readme from that project:
Here we implement oauth2 authorization within angularjs.
Authentication is done as follows:
- Open oauth2 provider login/grant screen.
- Redirect to the oauth2 callback screen with access token.
- Verify the access token against the provider.
- Get some basic profile.
A base javascript class OAuth2 implements these steps.
There are the following implementations that authorize against specific providers:
OAuth2Server - implements authorization through known providers, but calls server side to validate access token. This way, the server side can establish a user's session.
The file Config.json contains endpoints and request parameters per supported provider.
Note: You should register a client_id for each provider.
Note: user_id and access_tokens are unique only in a scope of access provider, thus a session is identified by Provider + access_token, and a user is identified by Provider + user_id.
The use case can be found in test.js. E.g. authorization against OAuth2Server is done like this:
var login = new OAuth2Server(provider);
token = login.authorize();
token.$promise.then(
function()
{
// token contains populated data.
},
function(error)
{
if (error)
{
// handle an error
}
});
Authorization token contains:
- a promise to handle authorization outcome.
- cancelToken (a Deferred object) to cancel authorization in progress.
The whole sample is implemented as a VS project. All scripts are built with app.tt, which combines the content of Scripts/app into app.js.
Server side is implemented with ASP.NET Web API. Authorization controllers are:
In the article "Error handling in WCF based web applications" we've shown a custom error handler for a RESTful service based on WCF. This time we shall do the same for a Web API 2.1 service.
Web API 2.1 provides an elegant way to implement custom error handlers/loggers, see the following article. Web API permits many error loggers followed by a single error handler for all uncaught exceptions. The default error handler knows how to output an error in both XML and JSON formats, depending on the requested MIME type.
In our projects we use unique error reference IDs. This feature allows an end-user to refer to any error that has happened during the application lifetime and to pass such an error ID to technical support for further investigation. Thus, the error details passed to the client side contain an ErrorID field. The error logger generates the ErrorID and passes it over to the error handler for serialization.
Let's look at our error handling implementation for a Web API application.
The first part is an implementation of the IExceptionLogger interface. It assigns an ErrorID and logs all errors:
/// Defines a global logger for unhandled exceptions.
public class GlobalExceptionLogger : ExceptionLogger
{
/// Writes log record to the database synchronously.
public override void Log(ExceptionLoggerContext context)
{
try
{
var request = context.Request;
var exception = context.Exception;
var id = LogError(
request.RequestUri.ToString(),
context.RequestContext == null ?
null : context.RequestContext.Principal.Identity.Name,
request.ToString(),
exception.Message,
exception.StackTrace);
// associates retrieved error ID with the current exception
exception.Data["NesterovskyBros:id"] = id;
}
catch
{
// logger shouldn't throw an exception!!!
}
}
// in the real life this method may store all relevant info into a database.
private long LogError(
string address,
string userid,
string request,
string message,
string stackTrace)
{
...
}
}
The second part is the implementation of IExceptionHandler :
/// Defines a global handler for unhandled exceptions.
public class GlobalExceptionHandler : ExceptionHandler
{
/// This core method should implement custom error handling, if any.
/// It determines how an exception will be serialized for client-side processing.
public override void Handle(ExceptionHandlerContext context)
{
var requestContext = context.RequestContext;
var config = requestContext.Configuration;
context.Result = new ErrorResult(
context.Exception,
requestContext == null ? false : requestContext.IncludeErrorDetail,
config.Services.GetContentNegotiator(),
context.Request,
config.Formatters);
}
/// An implementation of IHttpActionResult interface.
private class ErrorResult : ExceptionResult
{
public ErrorResult(
Exception exception,
bool includeErrorDetail,
IContentNegotiator negotiator,
HttpRequestMessage request,
IEnumerable<MediaTypeFormatter> formatters) :
base(exception, includeErrorDetail, negotiator, request, formatters)
{
}
/// Creates an HttpResponseMessage instance asynchronously.
/// This method determines how a HttpResponseMessage content will look like.
public override Task<HttpResponseMessage> ExecuteAsync(CancellationToken cancellationToken)
{
var content = new HttpError(Exception, IncludeErrorDetail);
// define an additional content field with name "ErrorID"
content.Add("ErrorID", Exception.Data["NesterovskyBros:id"] as long?);
var result =
ContentNegotiator.Negotiate(typeof(HttpError), Request, Formatters);
var message = new HttpResponseMessage
{
RequestMessage = Request,
StatusCode = result == null ?
HttpStatusCode.NotAcceptable : HttpStatusCode.InternalServerError
};
if (result != null)
{
try
{
// serializes the HttpError instance either to JSON or to XML
// depend on requested by the client MIME type.
message.Content = new ObjectContent<HttpError>(
content,
result.Formatter,
result.MediaType);
}
catch
{
message.Dispose();
throw;
}
}
return Task.FromResult(message);
}
}
}
The last, but not least, part of this solution is the registration and configuration of the error logger/handler:
/// WebApi configuration.
public static class WebApiConfig
{
public static void Register(HttpConfiguration config)
{
...
// register the exception logger and handler
config.Services.Add(typeof(IExceptionLogger), new GlobalExceptionLogger());
config.Services.Replace(typeof(IExceptionHandler), new GlobalExceptionHandler());
// set error detail policy according with value from Web.config
var customErrors =
(CustomErrorsSection)ConfigurationManager.GetSection("system.web/customErrors");
if (customErrors != null)
{
switch (customErrors.Mode)
{
case CustomErrorsMode.RemoteOnly:
{
config.IncludeErrorDetailPolicy = IncludeErrorDetailPolicy.LocalOnly;
break;
}
case CustomErrorsMode.On:
{
config.IncludeErrorDetailPolicy = IncludeErrorDetailPolicy.Never;
break;
}
case CustomErrorsMode.Off:
{
config.IncludeErrorDetailPolicy = IncludeErrorDetailPolicy.Always;
break;
}
default:
{
config.IncludeErrorDetailPolicy = IncludeErrorDetailPolicy.Default;
break;
}
}
}
}
}
The client-side error handler remains almost untouched. You may find the implementation details in the /Scripts/api/api.js and /Scripts/controls/error.js files.
You may download the demo project here.
Feel free to use this solution in your .NET projects.
From time to time we run into tasks that we would like to solve in LINQ style but unfortunately it either cannot be done or a solution is not efficient.
Note that by LINQ style we do not mean C# query expressions (we have a strong distaste for that syntax) but extension methods defined in System.Linq.Enumerable and other classes.
Here we quote several extension methods that are good for a general use:
1. Select with a predicate. This is a shorthand for items.Where(...).Select(...):
/// <summary>
/// Projects each element of a sequence into a new form.
/// </summary>
/// <typeparam name="T">A type of elements of source sequence.</typeparam>
/// <typeparam name="R">A type of elements of target sequence.</typeparam>
/// <param name="source">A source sequence.</param>
/// <param name="where">A predicate to filter elements.</param>
/// <param name="selector">A result element selector.</param>
/// <returns>A target sequence.</returns>
public static IEnumerable<R> Select<T, R>(
this IEnumerable<T> source,
Func<T, bool> where,
Func<T, R> selector)
{
return source.Where(where).Select(selector);
}
2. Select with predicate with source element index passed both into the predicate and into the selector. This one you cannot trivially implement in LINQ:
/// <summary>
/// Projects each element of a sequence into a new form.
/// </summary>
/// <typeparam name="T">A type of elements of source sequence.</typeparam>
/// <typeparam name="R">A type of elements of target sequence.</typeparam>
/// <param name="source">A source sequence.</param>
/// <param name="where">A predicate to filter elements.</param>
/// <param name="selector">A result element selector.</param>
/// <returns>A target sequence.</returns>
public static IEnumerable<R> Select<T, R>(
this IEnumerable<T> source,
Func<T, int, bool> where,
Func<T, int, R> selector)
{
var index = 0;
foreach(var value in source)
{
if (where(value, index))
{
yield return selector(value, index);
}
++index;
}
}
3. A function whose output elements are projections of a window of input elements. Such a function can be used to get a finite difference (an operation opposite to a cumulative sum).
/// <summary>
/// Projects a window of source elements in a source sequence into target sequence.
/// Thus
/// target[i] =
/// selector(source[i], source[i - 1], ... source[i - window + 1])
/// </summary>
/// <typeparam name="T">A type of elements of source sequence.</typeparam>
/// <typeparam name="R">A type of elements of target sequence.</typeparam>
/// <param name="source">A source sequence.</param>
/// <param name="window">A size of window.</param>
/// <param name="lookbehind">
/// Indicates whether to produce a target element if the number of source elements
/// preceding the current one is less than the window size.
/// </param>
/// <param name="lookahead">
/// Indicates whether to produce a target element if the number of source elements
/// following the current one is less than the window size.
/// </param>
/// <param name="selector">
/// A selector that derives target element.
/// On input it receives:
/// an array of source elements stored in round-robin fashion;
/// an index of the first element;
/// a number of elements in the array to count.
/// </param>
/// <returns>Returns a sequence of target elements.</returns>
public static IEnumerable<R> Window<T, R>(
this IEnumerable<T> source,
int window,
bool lookbehind,
bool lookahead,
Func<T[], int, int, R> selector)
{
var buffer = new T[window];
var index = 0;
var count = 0;
foreach(var value in source)
{
if (count < window)
{
buffer[count++] = value;
if (lookbehind || (count == window))
{
yield return selector(buffer, 0, count);
}
}
else
{
buffer[index] = value;
index = index + 1 == window ? 0 : index + 1;
yield return selector(buffer, index, count);
}
}
if (lookahead)
{
while(--count > 0)
{
index = index + 1 == window ? 0 : index + 1;
yield return selector(buffer, index, count);
}
}
}
This way a finite difference looks like this:
var diff = input.Window(
2,
false,
false,
(buffer, index, count) => buffer[index ^ 1] - buffer[index]);
4. A specialization of the Window method that returns an enumeration of windows:
/// <summary>
/// Projects a window of source elements in a source sequence into a
/// sequence of window arrays.
/// </summary>
/// <typeparam name="T">A type of elements of source sequence.</typeparam>
/// <typeparam name="R">A type of elements of target sequence.</typeparam>
/// <param name="source">A source sequence.</param>
/// <param name="window">A size of window.</param>
/// <param name="lookbehind">
/// Indicates whether to produce a target element if the number of source elements
/// preceding the current one is less than the window size.
/// </param>
/// <param name="lookahead">
/// Indicates whether to produce a target element if the number of source elements
/// following the current one is less than the window size.
/// </param>
/// <returns>Returns a sequence of windows.</returns>
public static IEnumerable<T[]> Window<T>(
this IEnumerable<T> source,
int window,
bool lookbehind,
bool lookahead)
{
return source.Window(
window,
lookbehind,
lookahead,
(buffer, index, count) =>
{
var result = new T[count];
for(var i = 0; i < count; ++i)
{
result[i] = buffer[index];
index = index + 1 == buffer.Length ? 0 : index + 1;
}
return result;
});
}
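For example, a three-element moving average can be expressed with this overload; a small sketch:
var input = new[] { 1.0, 2.0, 4.0, 8.0, 16.0 };

// Only full windows are produced, as lookbehind and lookahead are false.
var movingAverage = input.
  Window(3, false, false).
  Select(window => window.Average()).
  ToArray();
// Yields the averages of { 1, 2, 4 }, { 2, 4, 8 }, { 4, 8, 16 }.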
Consider: how would you implement the Style object in the HTML DOM?
These are some characteristics of that object:
- It has a long list of properties, e.g. in IE 11 there are more than 300 properties over a style object.
- Any specific instance usually has only several properties assigned.
- Reads of properties are much more frequent than writes. In fact, a style often stays unchanged after initialization.
- The DOM contains many style instances (often thousands).
- The number of distinct instances in terms of values of properties is moderate (usually dozens).
Here is how we would approach such an object.
1. Styles are sparse objects, thus there is no point in implementing a plain class with all those properties, as it's wasteful.
We would rather use two techniques to keep style's state:
- A dictionary of properties with their values;
- An aggregation of objects, where all properties are grouped into families, each group is defined by a separate type, and a style's state is an aggregation of those groups.
The current style of an element is an aggregation of the styles of its ancestor elements. It can either be dynamic or be fused into a single style instance.
2. Make style's state immutable, and share all these states among all style instances.
In this implementation a property write turns into a state transition operation: state = set(state, property, value). Thus no state is modified; instead it is replaced with another state that corresponds to the required change.
If the state is seen as a dictionary then the API may look like this:
public class State<K, V>
{
// Gets shared dictionary for an input dictionary.
public IDictionary<K, V> Get(IDictionary<K, V> dictionary);
// Gets a shared dictionary for an input dictionary with key set to a value.
public IDictionary<K, V> Set(IDictionary<K, V> dictionary, K key, V value);
// Gets a shared dictionary for an input dictionary.
public IDictionary<K, V> Remove(IDictionary<K, V> dictionary, K key);
// Gets typed value.
public T Get<T>(IDictionary<K, V> dictionary, K key)
where T: V
{
V value;
if ((dictionary == null) || !dictionary.TryGetValue(key, out value))
{
return default(T);
}
return (T)value;
}
// Sets or removes a typed value.
// dictionary can be null.
// null returned if output dictionary would be empty.
public IDictionary<K, V> Set<T>(IDictionary<K, V> dictionary,
K key,
T value)
where T : V
{
return value == null ? Remove(dictionary, key) :
Set(dictionary, key, (V)value);
}
}
States can be cached. Provided the cache keeps states in a weak way, no unused state will be stored for a long time.
We may use a weak table from dictionary to dictionary, WeakTable<Dictionary<K, V>, Dictionary<K, V>>, as a storage for such a cache. All the required API is described in the WeakTable and Hash Code of Dictionary posts.
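To make the idea concrete, here is a minimal sketch of such a cache that shares states by content; it uses the DictionaryEqualityComparer<K, V> described in a later post, and for brevity it keeps states strongly rather than in a weak table:
public class SharedStates<K, V>
{
  // Returns the shared instance that has the same content as the input.
  public IDictionary<K, V> Get(IDictionary<K, V> dictionary)
  {
    if ((dictionary == null) || (dictionary.Count == 0))
    {
      return null;
    }

    IDictionary<K, V> shared;

    if (!cache.TryGetValue(dictionary, out shared))
    {
      shared = new Dictionary<K, V>(dictionary);
      cache.Add(shared, shared);
    }

    return shared;
  }

  // A state transition: returns a shared state with the key set to a value.
  public IDictionary<K, V> Set(IDictionary<K, V> dictionary, K key, V value)
  {
    var copy = dictionary == null ?
      new Dictionary<K, V>() : new Dictionary<K, V>(dictionary);

    copy[key] = value;

    return Get(copy);
  }

  private readonly Dictionary<IDictionary<K, V>, IDictionary<K, V>> cache =
    new Dictionary<IDictionary<K, V>, IDictionary<K, V>>(
      new DictionaryEqualityComparer<K, V>());
}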
3. Style can be implemented as a structure with a shared state as its storage. Here is a sketch:
[Serializable]
public struct Style
{
// All properties.
public enum Property
{
Background,
BorderColor,
BorderStyle,
Color,
FontFamily,
FontSize,
// ...
}
public int? Background
{
get { return states.Get<int?>(state, Property.Background); }
set { state = states.Set(state, Property.Background, value); }
}
public int? BorderColor
{
get { return states.Get<int?>(state, Property.BorderColor); }
set { state = states.Set(state, Property.BorderColor, value); }
}
public string BorderStyle
{
get { return states.Get<string>(state, Property.BorderStyle); }
set { state = states.Set(state, Property.BorderStyle, value); }
}
public int? Color
{
get { return states.Get<int?>(state, Property.Color); }
set { state = states.Set(state, Property.Color, value); }
}
public string FontFamily
{
get { return states.Get<string>(state, Property.FontFamily); }
set { state = states.Set(state, Property.FontFamily, value); }
}
public double? FontSize
{
get { return states.Get<double?>(state, Property.FontSize); }
set { state = states.Set(state, Property.FontSize, value); }
}
// ...
[OnDeserialized]
private void OnDeserialized(StreamingContext context)
{
state = states.Get(state);
}
// A state.
private IDictionary<Property, object> state;
// A states cache.
private static readonly State<Property, object> states =
new State<Property, object>();
}
Note that:
- the default state is a null dictionary;
- states are shared application-wide.
The following link is our implementation of State<K, V> class: State.cs.
Here we have outlined the idea of a shared state object, and how it can be applied to sparse, mostly immutable objects. We used the HTML style as an example of such an object. A shared state object may work in many other areas, but for it to shine, the use case should fit the task.
Dealing recently with some task (the same that inspired us to implement WeakTable), we were in a position to use a dictionary as a key in another dictionary.
What are the rules for a class to be used as a key?
- the key should be immutable;
- the key should implement a GetHashCode() method;
- the key should implement an Equals() method.
The first requirement is usually implemented as a documentation contract like this:
As long as an object is used as a key in the Dictionary<TKey, TValue> , it must not change in any way that affects its hash value.
The third requirement about equals is trivially implemented as a method:
public bool Equals(IDictionary<K, V> x, IDictionary<K, V> y)
{
if (x == y)
{
return true;
}
if ((x == null) || (y == null) || (x.Count != y.Count))
{
return false;
}
foreach(var entry in x)
{
V value;
if (!y.TryGetValue(entry.Key, out value) ||
!valueComparer.Equals(entry.Value, value))
{
return false;
}
}
return true;
}
But how would you implement hash code?
We argued like this.
1. Let's consider the dictionary as a sparse array of values, where the only populated items are those that correspond to key hash codes.
2. The hash code is constructed using some fair algorithm, e.g. like the one used in java to calculate a string's hash code:

h(s) = SUM[i = 0 .. n-1] (s[i] * p^(n-1-i)) mod m, where m = 2^31
In our case:
- n can be an arbitrarily large int value, so in fact it's 2^32;
- items are enumerated in an unknown order;
- there is only a limited set of items, so most s[i] are zeros.
As a result we cannot use a recurrent function to calculate a power p^k mod m. Fortunately, one can build a fast exponentiation arguing like this: write k with base-2^s digits k[i], so that

p^k = p^(SUM[i = 0 .. 32/s - 1] 2^(s*i) * k[i]) mod m, where s is some int: 1, 2, 4, 8, 16, or 32.

Thus

p^k = PRODUCT[i = 0 .. 32/s - 1] (p^(2^(s*i)))^k[i] mod m
If s = 1 then k[i] is either 1 or 0 (a bit), and there are 32 different p^(2^i) mod m values, which can be precalculated.
On the other hand, if we select s = 8 we can write the formula as:
p^k = p^k[0] * (p^(2^8))^k[1] * (p^(2^16))^k[2] * (p^(2^24))^k[3] mod m
where k[i] is an 8-bit value (a byte).
Precalculating all values p^n, (p^(2^8))^n, (p^(2^16))^n, (p^(2^24))^n for n in 0 to 255, we reach a formula with 4 multiplications and 1024 precalculated values.
Here is the whole utility to calculate hash factors:
/// <summary>
/// Hash utilities.
/// </summary>
public class Hash
{
/// <summary>
/// Returns a P^value mod 2^31, where P is hash base.
/// </summary>
/// <param name="value">A value to get hash factor for.</param>
/// <returns>A hash factor value.</returns>
public static int GetHashFactor(int value)
{
return factors[(uint)value & 0xff] *
factors[(((uint)value >> 8) & 0xff) | 0x100] *
factors[(((uint)value >> 16) & 0xff) | 0x200] *
factors[(((uint)value >> 24) & 0xff) | 0x300];
}
/// <summary>
/// Initializes hash factors.
/// </summary>
static Hash()
{
var values = new int[4 * 256];
var value = P;
var current = 1;
var i = 0;
do
{
values[i++] = current;
current *= value;
}
while(i < 256);
value = current;
current = 1;
do
{
values[i++] = current;
current *= value;
}
while(i < 512);
value = current;
current = 1;
do
{
values[i++] = current;
current *= value;
}
while(i < 768);
value = current;
current = 1;
do
{
values[i++] = current;
current *= value;
}
while(i < 1024);
factors = values;
}
/// <summary>
/// A base to calculate hash factors.
/// </summary>
public const int P = 1103515245;
/// <summary>
/// Hash factors.
/// </summary>
private static readonly int[] factors;
}
With this API hash code for a dictionary is a trivial operation:
public int GetHashCode(IDictionary<K, V> dictionary)
{
if (dictionary == null)
{
return 0;
}
var result = 0;
foreach(var entry in dictionary)
{
if ((entry.Key == null) || (entry.Value == null))
{
continue;
}
result += Hash.GetHashFactor(keyComparer.GetHashCode(entry.Key)) *
valueComparer.GetHashCode(entry.Value);
}
return result;
}
And finally, here is a reference to a class DictionaryEqualityComparer<K, V>: IEqualityComparer<IDictionary<K, V>> that allows a dictionary to be a key in another dictionary.
Update
We have committed some tests, and have found that with a sufficiently "good" implementation of GetHashCode() of the key or value, we achieve results of almost the same quality as the algorithm we have outlined above with a much simpler and more straightforward algorithm like this:
public int GetHashCode(IDictionary<K, V> dictionary)
{
if (dictionary == null)
{
return 0;
}
var result = 0;
foreach(var entry in dictionary)
{
if ((entry.Key == null) || (entry.Value == null))
{
continue;
}
var k = entry.Key.GetHashCode();
var v = entry.Value.GetHashCode();
k = (k << 5) + k;
v = (v << (k >> 3)) + v;
result += k ^ v;
//result += Hash.GetHashFactor(keyComparer.GetHashCode(entry.Key)) *
// valueComparer.GetHashCode(entry.Value);
}
return result;
}
It was worth blogging about this just to find out that we had outwitted ourselves, and finally to arrive at a trivial hash code implementation for the dictionary.
Dealing recently with some task, we were in a position to use a weak dictionary in .NET. Instinctively we assumed that it should exist somewhere in the standard library. We definitely knew that there is a WeakReference class for a single instance. We also knew that there is a WeakHashMap in java, and that it's based on java's WeakReference.
So, we were surprised to find that there is no such thing out of the box in .NET.
We have found that java's and .NET's weak references are different. In java, weak references whose targets are GCed can be automatically put into a queue, which can be used to build clean-up logic that removes dead keys from the weak hash map. There is nothing similar in .NET, where a weak reference just silently loses its value.
The Internet is full of custom implementations of weak dictionaries in .NET.
.NET 4.0 finally defines a class ConditionalWeakTable<TKey, TValue>, which solves the problem in the case when you need to match keys by instance identity.
Unfortunately, in our case we needed to match keys using the key's GetHashCode() and Equals(). So ConditionalWeakTable<TKey, TValue> did not work directly, but then we found a way to make it work for us.
Here is a quote from the definition:
A ConditionalWeakTable<TKey, TValue> object is a dictionary that binds a managed object, which is represented by a key, to its attached property, which is represented by a value. The object's keys are the individual instances of the TKey class to which the property is attached, and its values are the property values that are assigned to the corresponding objects.
...in the ConditionalWeakTable<TKey, TValue> class, adding a key/value pair to the table does not ensure that the key will persist, even if it can be reached directly from a value stored in the table... Instead, ConditionalWeakTable<TKey, TValue> automatically removes the key/value entry as soon as no other references to a key exist outside the table.
This property of ConditionalWeakTable<TKey, TValue> has helped us to build a way to get a notification when the key is being finalized, which is the missed ingredient in .NET's weak references.
Assume you have an instance key of type Key. To get a notification you should define a class Finalizer that will call some handler when it's finalized, and you should bind the key and a finalizer instance using the weak table.
The code looks like this:
public class Finalizer<K>
where K: class
{
public static void Bind(K key, Action<K> handler)
{
var finalizer = table.GetValue(key, k => new Finalizer<K> { key = k });
finalizer.Handler += handler;
}
public static void Unbind(K key, Action<K> handler)
{
Finalizer<K> finalizer;
if (table.TryGetValue(key, out finalizer))
{
finalizer.Handler -= handler;
}
}
~Finalizer()
{
var handler = Handler;
if (handler != null)
{
handler(key);
}
}
private event Action<K> Handler;
private K key;
private static readonly ConditionalWeakTable<K, Finalizer<K>> table =
new ConditionalWeakTable<K, Finalizer<K>>();
}
The use is like this:
Key key = ...
Finalizer<Key>.Bind(key, k => { /* clean up. */ });
Using this approach we have created a class WeakTable<K, V> modeled after ConditionalWeakTable<TKey, TValue>.
So, this is our take in the problem: WeakTable.cs.
Oftentimes we deal with Hebrew in .NET.
The task we face again and again is to convert a Hebrew text from visual to logical representation.
The latest demand for such a task came when we processed content extracted from PDF. It turned out that PDF stores content as graphic primitives, and as a result text is stored visually (often each letter is kept separately).
We solved the task more than a decade ago by calling the Uniscribe API.
The function itself is a small wrapper around that API, so in .NET 1.0 we were using Managed C++, and several years later we switched to C++/CLI.
But now, after many .NET releases, and with 32/64-bit versions, we can see that C++ is only a guest in the .NET world.
To run C++ in .NET you have to install VC runtime libraries adjusted to a specific .NET version. This turns C++ support in .NET into a non-trivial task.
So, we have finally decided to define C# interop for the Uniscribe API, and recreate that function in pure C#:
namespace NesterovskyBros.Bidi
{
/// <summary>
/// A utility to convert a visual string to logical.
/// </summary>
public static class BidiConverter
{
/// <summary>
/// Converts visual string to logical.
/// </summary>
/// <param name="value">A value to convert.</param>
/// <param name="rtl">A base direction.</param>
/// <param name="direction">
/// true for visual to logical, and false for logical to visual.
/// </param>
/// <returns>Converted string.</returns>
public static string Convert(string value, bool rtl, bool direction);
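A hypothetical usage, given the signature above (visualText stands for the string extracted from the PDF):
// Convert a visually ordered Hebrew string into logical order,
// with a right-to-left base direction.
var logicalText = BidiConverter.Convert(visualText, rtl: true, direction: true);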
You can download this project from BidiVisualConverter.zip.
Although a WCF REST service + JSON is outdated compared to Web API, there are still a lot of such solutions (and new ones will probably appear) that use this "old" technology.
One of the crucial points of any web application is an error handler that gracefully resolves server-side exceptions and routes them as JSON objects to the client for further processing. There are dozens of approaches on the Internet that solve this issue (e.g. http://blog.manglar.com/how-to-provide-custom-json-exceptions-from-as-wcf-service/), but none that demonstrates error handling on the client side. We realize that it's impossible to write something general that suits every web application, but we'd like to show a client-side error handler that utilizes JSON and KendoUI.
In our opinion, a successful error handler must display an understandable error message on one hand, and on the other hand it has to provide technical info for developers in order to investigate the reason for the exception (and to fix it, if needed):
You may download the demo project here. It contains three crucial parts:
- A server-side error handler that catches all exceptions and serializes them as JSON objects (see /Code/JsonErrorHandler.cs and /Code/JsonWebHttpBehaviour.cs).
- An error dialog that's based on the user control defined in previous articles (see /scripts/controls/error.js, /scripts/controls/error.resources.js and /scripts/templates/error.tmpl.html).
- A client-side error handler that displays errors in a user-friendly manner (see /scripts/api/api.js, method defaultErrorHandler()).
Of course this is only a draft solution, but it defines a direction for further customizations in your web applications.
Kendo UI Docs contains an article "How To: Load Templates from External Files", where the authors review two ways of dealing with Kendo UI templates.
While using Kendo UI we have found our own answer to: where will the Kendo
UI templates be defined and maintained?
In our .NET project we have decided to keep templates separately and to store them under the "templates" folder. Those templates in fact include html, head, and stylesheet links. This is to help us present those templates in the design view.
In our scripts folder we have defined a small text transformation template, "templates.tt", which produces a "templates.js" file. This template takes the body contents of each "*.tmpl.html" file from the "templates" folder and builds a string of the form:
document.write('<script id="footer-template" type="text/x-kendo-template">...</script><script id="row-template" type="text/x-kendo-template">...</script>');
In our page that uses templates, we include "templates.js":
<!DOCTYPE html>
<html>
<head>
<script
src="scripts/templates.js"></script>
...
Thus, we have:
- clean separation of templates and page content;
- automatically generated templates include file.
WebTemplates.zip contains a web project demonstrating our technique. "templates.tt" is the text template transformation used in the project.
See also: Compile KendoUI templates.
Our goal is to generate reports in streaming mode.
At some point we need to deal with data streams (e.g. xml streams for xslt transformations). Often the nature of a report demands several passes through the data.
To increase performance we have defined a class named StreamResource.
This class encapsulates input data, reads it once and caches it into a temp file; thus the data can be traversed many times. StreamResource can read data lazily or in an eager way, thus releasing resources early.
This class can be used as a variation of PipeStream that never blocks, as if the buffer size were unlimited, and that can be read many times.
The API
looks like this:
public class StreamResource: IDisposable
{
/// <summary>
/// Creates a StreamSource instance.
/// </summary>
/// <param name="source">
/// A function that returns source as an input stream.
/// </param>
/// <param name="settings">Optional settings.</param>
public StreamResource(Func<Stream> source, Settings settings = null);
/// <summary>
/// Creates a StreamSource instance.
/// </summary>
/// <param name="source">
/// A function that writes source data into an output stream.
/// </param>
/// <param name="settings">Optional settings.</param>
public StreamResource(Action<Stream> source, Settings settings = null);
/// <summary>
/// Gets an input stream.
/// </summary>
/// <param name="shared">
/// Indicates that this StreamResouce should be disposed when returned
/// stream is closed and there are no more currently opened cache streams.
/// </param>
/// <returns>A input stream.</returns>
public Stream GetStream(bool shared = false);
}
The use pattern is as follows:
// Acquire resource.
using(var resource = new StreamResource(() =>
CallService(params...)))
{
// Read stream.
using(var stream = resource.GetStream())
{
...
}
...
// Read stream again.
using(var stream = resource.GetStream())
{
...
}
}
StreamResource is efficient even if you need to process the content only once, as it monitors the timing of reading the source data and compares it with the timing of data consumption. If the difference exceeds some threshold then StreamResource caches the source greedily, otherwise the source is pulled lazily. Thus, input resources can be released promptly. This is important, for example, when the source depends on a database connection.
The use pattern is as follows:
// Acquire resource and get shared stream.
using(var stream = new StreamResource(() =>
CallService(params...)).GetStream(true))
{
...
}
Finally, StreamResource allows processing data in a pipe stream mode. This is when you have a generator function Action<Stream> that can write to a stream, and you want to read that data. The advantage of StreamResource over a real pipe stream is that it can work without blocking the generator, thus releasing resources early.
The use pattern is similar to the previous one:
using(var stream = new StreamResource(output =>
Generate(output, params...)).GetStream(true))
{
...
}
The source of the class can be found at
Streaming.zip.
Two months ago we started a process of changing a column type from smallint to int in a big database.
This was split into two phases:
- Change tables and internal stored procedures and functions.
- Change interface API and update all clients.
The first part took almost two months to complete. Please read the earlier post about the technique we selected for the implementation. In total we have transferred about 15 billion rows. During this time the database was online.
The second part was short, but the problem was that we did not control all clients, so we could not arbitrarily change the types of parameters and result columns.
All our clients use Entity Framework 4 to access the database. All access is done through stored procedures. So suppose there was a procedure:
create procedure Data.GetReports(@type smallint) as
begin
select Type, ... from Data.Report where Type = @type;
end;
where column "Type" was of type smallint . Now
we were going to change it to:
create procedure Data.GetReports(@type int) as
begin
select Type, ... from Data.Report where Type = @type;
end;
where "Type" column became of type int .
Our tests have shown that EF tolerates a change of types of input parameters, but throws exceptions when a column type has been changed, even when the value fits the range. The reason is that EF uses the method SqlDataReader.GetInt16 to access the column value. This method has a remark: "No conversions are performed; therefore, the data retrieved must already be a 16-bit signed integer."
Fortunately, we have found that EF allows additional columns in the result set. This helped us to formulate the solution.
We have updated the procedure definition like this:
create procedure Data.GetReports(@type int) as
begin
select
cast(Type as smallint) Type, -- deprecated
Type TypeEx, ...
from
Data.Report
where
Type = @type;
end;
This way:
- the result column "Type" is declared as deprecated;
- old clients still work;
- all clients should be updated to use the "TypeEx" column;
- after all clients are updated we shall remove the "Type" column from the result set.
So there is a clear migration process.
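Just to illustrate the reads behind the scenes, here is a hedged sketch with plain ADO.NET (reader is a SqlDataReader over the procedure's result set; the column names are as in the procedure above):
// This is what EF effectively does for the old result column, and what throws
// when the underlying column becomes int without the smallint cast:
// short type = reader.GetInt16(reader.GetOrdinal("Type"));

// Updated clients read the new, wider column instead:
int type = reader.GetInt32(reader.GetOrdinal("TypeEx"));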
P.S. we don't understand why SqlDataReader doesn't support value
conversion.
If you deal with web applications you have probably already dealt with exporting data to Excel.
There are several options to prepare data for Excel:
- generate CSV;
- generate HTML that excel understands;
- generate XML in Spreadsheet 2003 format;
- generate data using Open XML SDK or some other 3rd party libraries;
- generate data in XLSX format, according to Open XML specification.
You may find a good article with the pros and cons of each solution here. We, in our turn, would like to share our experience in this field. Let's start with the requirements:
- Often we have to export huge data-sets.
- We should be able to format, parametrize and to apply different styles to the exported data.
- There are cases when exported data may contain more than one table per sheet or
even more than one sheet.
- Some exported data have to be illustrated with charts.
All these requirements led us to a solution based on XSLT processing of streamed data.
The advantage of this solution is that the result is forwarded to the client as soon as XSLT starts to generate output. Such an approach is much more productive than generating XLSX using the Open XML SDK or any other third-party library, since it avoids keeping huge data-sets in memory on the server side.
Another advantage is simple maintenance, as we achieve a clear separation of the data and presentation layers. On each request to change formatting or to apply another style to a cell, you just have to modify the xslt file(s) that generate the variable parts of the XLSX.
As a result, our clients get XLSX files conforming to the Open XML specification.
See the details of our implementation in our next posts.
Earlier we have shown how to build a streaming xml reader from business data and have reminded about ForwardXPathNavigator, which helps to create a streaming xslt transformation. Now we want to show how to stream content produced with xslt out of a WCF service.
To achieve streaming in WCF one needs:
1. To configure the service to use streaming. A description of how to do this can be found on the internet. See the web.config of the Streaming.zip sample for the details.
2. Create a service with a method returning Stream :
[ServiceContract(Namespace = "http://www.nesterovsky-bros.com")]
[AspNetCompatibilityRequirements(RequirementsMode = AspNetCompatibilityRequirementsMode.Allowed)]
public class Service
{
[OperationContract]
[WebGet(RequestFormat = WebMessageFormat.Json)]
public Stream GetPeopleHtml(int count,
int seed)
{
...
}
}
3. Return a Stream from the xsl transformation.
Unfortunately (we mentioned it already), XslCompiledTransform generates its output into an XmlWriter (or into an output Stream) rather than exposing the result as an XmlReader, while WCF gets an input stream and passes it to a client.
We could generate the xslt output into a file or a memory Stream and then return that content as an input Stream, but this would defeat the goal of streaming, as the client would start to get data no earlier than the xslt completed its work. What we need instead is a pipe from the xslt output Stream to an input Stream returned from WCF.
.NET implements pipe streams, so our task is trivial.
We have defined a utility method that creates an input Stream from a generator
populating an output Stream :
public static Stream GetPipedStream(Action<Stream> generator)
{
var output = new AnonymousPipeServerStream();
var input = new AnonymousPipeClientStream(
output.GetClientHandleAsString());
Task.Factory.StartNew(
() =>
{
using(output)
{
generator(output);
output.WaitForPipeDrain();
}
},
TaskCreationOptions.LongRunning);
return input;
}
We wrapped xsl transformation as such a generator:
[OperationContract]
[WebGet(RequestFormat = WebMessageFormat.Json)]
public Stream GetPeopleHtml(int count, int seed)
{
var context = WebOperationContext.Current;
context.OutgoingResponse.ContentType = "text/html";
context.OutgoingResponse.Headers["Content-Disposition"] =
"attachment;filename=reports.html";
var cache = HttpRuntime.Cache;
var path = HttpContext.Current.Server.MapPath("~/People.xslt");
var transform = cache[path] as XslCompiledTransform;
if (transform == null)
{
transform = new XslCompiledTransform();
transform.Load(path);
cache.Insert(path, transform, new CacheDependency(path));
}
return Extensions.GetPipedStream(
output =>
{
// We have a streamed business data.
var people = Data.CreateRandomData(count, seed, 0, count);
// We want to see it as streamed xml data.
using(var stream =
people.ToXmlStream("people", "http://www.nesterovsky-bros.com"))
using(var reader = XmlReader.Create(stream))
{
// XPath forward navigator is used as an input source.
transform.Transform(
new ForwardXPathNavigator(reader),
new XsltArgumentList(),
output);
}
});
}
This way we have built code that streams data directly from the business data to a
client in the form of a report. A set of utility functions and classes helped us to
overcome .NET's limitations and to build simple code that one can easily
support.
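To observe the streaming on the client side one can read the response progressively. Here is a minimal sketch; the service URL and the parameter values are our assumptions and have to be adjusted to the actual address of the deployed sample:
public class Client
{
  public static void Main()
  {
    // The URL is hypothetical; point it to the deployed sample service.
    var request = WebRequest.Create(
      "http://localhost/Streaming/Service.svc/GetPeopleHtml?count=100000&seed=0");
    using(var response = request.GetResponse())
    using(var content = response.GetResponseStream())
    using(var file = File.Create("reports.html"))
    {
      // Bytes are copied as they arrive, long before the server
      // completes the whole transformation.
      content.CopyTo(file);
    }
  }
}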
The sources can be found at
Streaming.zip.
In the previous
post about streaming we have stopped at the point where we have an XmlReader
in hand, which continuously gets data from an IEnumerable<Person>
source.
Now we shall remind about ForwardXPathNavigator - a class we built
back in 2002, which adds streaming transformations to .NET's xslt processor.
While XslCompiledTransform is hopelessly obsolete and no upgrade
will probably ever follow, it is still among the fastest xslt 1.0 processors. With
ForwardXPathNavigator we add to this processor the ability to transform input data of arbitrary size.
We find it interesting that
xslt 3.0 Working Draft defines streaming processing in a way that closely
matches rules for ForwardXPathNavigator :
Streaming achieves two important objectives: it allows large documents to be transformed
without requiring correspondingly large amounts of memory; and it allows the processor
to start producing output before it has finished receiving its input, thus reducing
latency.
The rules for streamability, which are defined in detail in 19.3 Streamability
Analysis, impose two main constraints:
-
The only nodes reachable from the node that is currently being processed are its
attributes and namespaces, its ancestors and their attributes and namespaces, and
its descendants and their attributes and namespaces. The siblings of the node, and
the siblings of its ancestors, are not reachable in the tree, and any attempt to
use their values is a static error. However, constructs (for example, simple forms
of xsl:number , and simple positional patterns) that require knowledge
of the number of preceding elements by name are permitted.
-
When processing a given node in the tree, each descendant node can only be visited
once. Essentially this allows two styles of processing: either visit each of the
children once, and then process that child with the same restrictions applied; or
process all the descendants in a single pass, in which case it is not possible while
processing a descendant to make any further downward selection.
The only significant difference between ForwardXPathNavigator and
xslt 3.0 streaming is that we report violations of the streamability rules
at runtime, while xslt 3.0 attempts to perform this analysis at compile time.
Here is the C# code for the streamed xslt transformation:
var transform = new XslCompiledTransform();
transform.Load("People.xslt");
// We have a streamed business data.
var people = Data.CreateRandomData(10000, 0, 0, 10000);
// We want to see it as streamed xml data.
using(var stream =
people.ToXmlStream("people", "http://www.nesterovsky-bros.com"))
using(var reader = XmlReader.Create(stream))
using(var output = File.Create("people.html"))
{
// XPath forward navigator is used as an input source.
transform.Transform(
new ForwardXPathNavigator(reader),
new XsltArgumentList(),
output);
}
Notice how XmlReader is wrapped into ForwardXPathNavigator .
To complete the picture we need xslt that follows the streaming rules:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt"
xmlns:d="http://www.nesterovsky-bros.com"
exclude-result-prefixes="msxsl d">
<xsl:output method="html" indent="yes"/>
<!-- Root template processed in the streaming mode. -->
<xsl:template match="/d:people">
<html>
<head>
<title>List of persons</title>
<style type="text/css">
.even
{
}
.odd
{
background: #d0d0d0;
}
</style>
</head>
<body>
<table border="1">
<tr>
<th>ID</th>
<th>First name</th>
<th>Last name</th>
<th>City</th>
<th>Title</th>
<th>Age</th>
</tr>
<xsl:for-each select="d:person">
<!--
Get element snapshot.
A
snapshot allows arbitrary access to the element's content.
-->
<xsl:variable name="person">
<xsl:copy-of select="."/>
</xsl:variable>
<xsl:variable name="position" select="position()"/>
<xsl:apply-templates mode="snapshot" select="msxsl:node-set($person)/d:person">
<xsl:with-param name="position" select="$position"/>
</xsl:apply-templates>
</xsl:for-each>
</table>
</body>
</html>
</xsl:template>
<xsl:template mode="snapshot" match="d:person">
<xsl:param name="position"/>
<tr>
<xsl:attribute name="class">
<xsl:choose>
<xsl:when test="$position mod 2 = 1">
<xsl:text>odd</xsl:text>
</xsl:when>
<xsl:otherwise>
<xsl:text>even</xsl:text>
</xsl:otherwise>
</xsl:choose>
</xsl:attribute>
<td>
<xsl:value-of select="d:Id"/>
</td>
<td>
<xsl:value-of select="d:FirstName"/>
</td>
<td>
<xsl:value-of select="d:LastName"/>
</td>
<td>
<xsl:value-of select="d:City"/>
</td>
<td>
<xsl:value-of select="d:Title"/>
</td>
<td>
<xsl:value-of select="d:Age"/>
</td>
</tr>
</xsl:template>
</xsl:stylesheet>
So, we have started with streamed entity data, proceeded to a streamed
XmlReader and arrived at a streamed xslt transformation.
In the final post about streaming we shall show a simple way of building a
WCF service that returns an html stream produced by our xslt transformation.
The sources can be found at
Streaming.zip.
If you're using .NET's IDictionary<K, V> you have probably found
its access API too tedious. Indeed, at each access point you have to write code
like this:
MyValueType value;
var hasValue = dictionary.TryGetValue(key, out value);
...
In many, if not most, cases the value is of a reference type, and you do not
usually store null values, so it would be fine if the dictionary
returned null when no value exists for the key.
To deal with this small nuisance we have declared a couple of accessor
extension methods:
public static class Extensions
{
public static V Get<K, V>(this IDictionary<K, V> dictionary, K key)
where V: class
{
V value;
if (key == null)
{
value = null;
}
else
{
dictionary.TryGetValue(key, out value);
}
return value;
}
public static V Get<K, V>(this IDictionary<K, V> dictionary, K? key)
where V: class
where K: struct
{
V value;
if (key == null)
{
value = null;
}
else
{
dictionary.TryGetValue(key.GetValueOrDefault(), out value);
}
return value;
}
}
These methods simplify dictionary access to:
var value = dictionary.Get(key);
...
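The second overload is handy when the key itself is nullable, for example when it comes from an optional foreign key field. A small sketch (the dictionary content and names are illustrative):
var names = new Dictionary<int, string> { { 1, "John" }, { 2, "Jane" } };
int? key = null;
// No exception and no TryGetValue boilerplate:
// null is returned both for a null key and for a missing key.
var name = names.Get(key);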
For some reason neither .NET's XmlSerializer nor DataContractSerializer allows
reading serialized data through an XmlReader. These APIs work the other way round, writing data
into an XmlWriter. To get data through an XmlReader one has to write it to some
destination like a file or a memory stream, and then read it back with an XmlReader.
This complicates streaming design considerably.
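For illustration, the naive non-streaming route looks like this (a sketch only; Person and GetPeople() are the data class and access method shown below, and the whole object graph is buffered before any reading can start):
var serializer = new DataContractSerializer(typeof(List<Person>));
var buffer = new MemoryStream();
// The entire data set is serialized first...
serializer.WriteObject(buffer, GetPeople().ToList());
buffer.Position = 0;
// ...and only then can it be read back through an XmlReader.
using(var reader = XmlReader.Create(buffer))
{
  // consume the reader here
}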
In fact the very same happens with other .NET APIs.
We think the reason why .NET designers preferred XmlWriter to XmlReader in
those APIs is that an XmlReader implementation is a state machine, while an
XmlWriter implementation looks like a regular procedure. It's much harder to
manually write and maintain correct state machine logic
than a procedure.
Had history gone a slightly
different way, and had yield return, lambdas, and the Enumerable API appeared before
XmlReader and XmlWriter, then, we think, both of these classes would look different.
An xml source would have been described with an IEnumerable<XmlEvent> instead of an
XmlReader, and XmlWriter would have looked like a function receiving an
IEnumerable<XmlEvent>. Implementing an XmlReader would have meant creating an
enumerator, and yield return and the Enumerable API would have helped to implement it in
a procedural way.
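To make the idea concrete, here is a minimal sketch of what such an API might have looked like; XmlEventType, XmlEvent and WriteNames are purely hypothetical names, not part of .NET:
public enum XmlEventType { StartElement, EndElement, Text }
public class XmlEvent
{
  public XmlEventType Type { get; set; }
  public string Name { get; set; }
  public string Value { get; set; }
}
// An xml "writer" becomes a plain generator of events,
// implemented procedurally with yield return.
public static IEnumerable<XmlEvent> WriteNames(IEnumerable<string> names)
{
  yield return new XmlEvent { Type = XmlEventType.StartElement, Name = "names" };
  foreach(var name in names)
  {
    yield return new XmlEvent { Type = XmlEventType.StartElement, Name = "name" };
    yield return new XmlEvent { Type = XmlEventType.Text, Value = name };
    yield return new XmlEvent { Type = XmlEventType.EndElement, Name = "name" };
  }
  yield return new XmlEvent { Type = XmlEventType.EndElement, Name = "names" };
}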
But in the present we have to deal with the fact that DataContractSerializer
writes its data into an XmlWriter. So, let's assume we have a project that
uses Entity Framework to access the database, with a data class
Person and a data access method GetPeople():
[DataContract(Name = "person", Namespace = "http://www.nesterovsky-bros.com")]
public class Person
{
[DataMember] public int Id { get; set; }
[DataMember] public string FirstName { get; set; }
[DataMember] public string LastName { get; set; }
[DataMember] public string City { get; set; }
[DataMember] public string Title { get; set; }
[DataMember] public DateTime BirthDate { get; set; }
[DataMember] public int Age { get; set; }
}
public static IEnumerable<Person> GetPeople() { ... }
And your goal is to expose the result of GetPeople() as an XmlReader.
We achieve this with three simple steps:
- Define
JoinedStream - an input Stream implementation that
reads data from an enumeration of streams (IEnumerable<Stream>).
- Build xml parts in the form of
IEnumerable<Stream>.
- Combine the parts into the final xml stream.
The code is rather simple, so here we quote its essential part:
public static class Extensions
{
public static Stream JoinStreams(this IEnumerable<Stream> streams, bool closeStreams = true)
{
return new JoinedStream(streams, closeStreams);
}
public static Stream ToXmlStream<T>(
this IEnumerable<T> items,
string rootName = null,
string rootNamespace = null)
{
return items.ToXmlStreamParts<T>(rootName, rootNamespace).
JoinStreams(false);
}
private static IEnumerable<Stream> ToXmlStreamParts<T>(
this IEnumerable<T> items,
string rootName = null,
string rootNamespace = null)
{
if (rootName == null)
{
rootName = "ArrayOfItems";
}
if (rootNamespace == null)
{
rootNamespace = "";
}
var serializer = new DataContractSerializer(typeof(T));
var stream = new MemoryStream();
var writer = XmlDictionaryWriter.CreateTextWriter(stream);
writer.WriteStartDocument();
writer.WriteStartElement(rootName, rootNamespace);
writer.WriteXmlnsAttribute("s", XmlSchema.Namespace);
writer.WriteXmlnsAttribute("i", XmlSchema.InstanceNamespace);
foreach(var item in items)
{
serializer.WriteObject(writer, item);
writer.WriteString(" ");
writer.Flush();
stream.Position = 0;
yield return stream;
stream.Position = 0;
stream.SetLength(0);
}
writer.WriteEndElement();
writer.WriteEndDocument();
writer.Flush();
stream.Position = 0;
yield return stream;
}
private class JoinedStream: Stream
{
public JoinedStream(IEnumerable<Stream> streams, bool closeStreams = true)
...
}
}
The use is even simpler:
// We have a streamed business data.
var people = GetPeople();
// We want to see it as streamed xml data.
using(var stream = people.ToXmlStream("persons", "http://www.nesterovsky-bros.com"))
using(var reader = XmlReader.Create(stream))
{
...
}
We have packed the sample into the project
Streaming.zip.
In the next post we're going to remind about streaming processing in xslt.
Several days ago we came across the blog post "Recursive
lambda expressions". There, the author asks how to write a lambda expression
that calculates a factorial (only expression statements are allowed).
The problem by itself is rather artificial, but at times one feels an intellectual
pleasure in solving such tasks. So, putting the original blog post aside, we
devised our own answers. The shortest one goes like this:
- As a C# lambda expression cannot refer to itself, it has to receive itself as
a parameter, so:
factorial(factorial, n) = n <= 1 ? 1 : n * factorial(factorial, n - 1);
- To define such a lambda expression we have to declare a delegate type that receives
a delegate of the same type:
delegate int Impl(Impl impl, int n);
Fortunately, C# allows this, but a workaround could be used even if it were not
possible.
- To simplify the reasoning we've defined a two-expression version:
Impl impl = (f, n) => n <= 1 ? 1 : n * f(f, n - 1);
Func<int, int> factorial = i => impl(impl, i);
- Finally, we've written out a one-expression version:
Func<int, int> factorial = i => ((Func<Impl,
int>)(f => f(f, i)))((f, n) => n <= 1 ? 1 : n * f(f, n - 1));
- The use is:
var f = factorial(10);
After that exercise we returned to the original blog and compared
solutions.
We can see that the author appeals to set theory, but for some reason his answer is
more complex than necessary; the comments, however, contain variants analogous to our
answer.
This time we have
updated csharpxom for the async support introduced with .NET 4.5 (C# 5).
The additions are the async modifier and the
await operator.
They are used to simplify asynchronous programming.
The following example from
MSDN:
private async Task<byte[]> GetURLContentsAsync(string url)
{
var content = new MemoryStream();
var request = (HttpWebRequest)WebRequest.Create(url);
using(var response = await request.GetResponseAsync())
using(var responseStream = response.GetResponseStream())
{
await responseStream.CopyToAsync(content);
}
return content.ToArray();
}
looks like this in csharpxom:
<method name="GetURLContentsAsync" access="private" async="true">
<returns>
<type name="Task" namespace="System.Threading.Tasks">
<type-arguments>
<type name="byte" rank="1"/>
</type-arguments>
</type>
</returns>
<parameters>
<parameter name="url">
<type name="string"/>
</parameter>
</parameters>
<block>
<var name="content">
<initialize>
<new-object>
<type name="MemoryStream" namespace="System.IO"/>
</new-object>
</initialize>
</var>
<var name="request">
<initialize>
<cast>
<invoke>
<static-method-ref name="Create">
<type name="WebRequest" namespace="System.Net"/>
</static-method-ref>
<arguments>
<var-ref name="url"/>
</arguments>
</invoke>
<type name="HttpWebRequest" namespace="System.Net"/>
</cast>
</initialize>
</var>
<using>
<resource>
<var name="response">
<initialize>
<await>
<invoke>
<method-ref name="GetResponseAsync">
<var-ref name="request"/>
</method-ref>
</invoke>
</await>
</initialize>
</var>
</resource>
<using>
<resource>
<var name="responseStream">
<initialize>
<invoke>
<method-ref name="GetResponseStream">
<var-ref name="response"/>
</method-ref>
</invoke>
</initialize>
</var>
</resource>
<expression>
<await>
<invoke>
<method-ref name="CopyToAsync">
<var-ref name="responseStream"/>
</method-ref>
<arguments>
<var-ref name="content"/>
</arguments>
</invoke>
</await>
</expression>
</using>
</using>
<return>
<invoke>
<method-ref name="ToArray">
<var-ref name="content"/>
</method-ref>
</invoke>
</return>
</block>
</method>
For a long time we were developing web applications with ASP.NET and JSF. At
present we prefer rich clients and a server with page templates and RESTful web
services.
This transition brings technical questions. Consider this one.
Browsers allow session state to be stored entirely on the client, so should we
maintain a session on the server?
Since the server is just a set of web services, we may supply all required
arguments on each call.
At first glance we can assume that no session is required on the server.
However, looking further we see that we still have to deal with data validation
(security) on the server.
Think about a classic ASP.NET application, where a user can select a value from
a dropdown. Either ASP.NET itself or your program (against a list from the
session) verifies that the value received is valid for the user. That list of
values, and possibly other parameters, constitutes a user profile, which we stored
in the session. The user profile played an important role (often indirectly) in the
validation of input data.
When the server is just a set of web services, we have to validate all
parameters manually. There are two sources that we can rely on: (a)
a session, (b)
a user principal.
Case (a) is very similar to the classic ASP.NET application, except that with
EnableEventValidation="true" the runtime did it for us most of the time.
Case (b) requires reconstructing the user profile for a user principal,
after which we proceed with validation of the parameters.
We may cache the user profile in the session, in which case we reduce (b) to (a); on the
other hand we may cache the user profile in the
Cache, which is also similar to (a) but might be lighter than (at least not
heavier than) the solution with the session.
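As an illustration of case (b) reduced to (a), a server-side sketch might look like this; UserProfile, LoadUserProfile, ReportData, GetReport and the cache key format are our assumptions, not an established API:
public UserProfile GetProfile(IPrincipal principal)
{
  var key = "profile:" + principal.Identity.Name;
  var cache = HttpRuntime.Cache;
  var profile = cache[key] as UserProfile;
  if (profile == null)
  {
    // Hypothetical reconstruction of the profile from the database.
    profile = LoadUserProfile(principal.Identity.Name);
    cache.Insert(
      key,
      profile,
      null,
      DateTime.Now.AddMinutes(20),
      Cache.NoSlidingExpiration);
  }
  return profile;
}
// A web service method validates its arguments against the profile.
public ReportData GetReport(string department)
{
  var profile = GetProfile(HttpContext.Current.User);
  if (!profile.Departments.Contains(department))
  {
    throw new ArgumentException("Invalid department.", "department");
  }
  // proceed with the actual work
}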
What we see is that the client session does not free us from server session (or
its alternative).
We were dealing with a datasource of (int? id, string
value) pairs in LINQ. The data originated from a database where id is a unique field.
In the program this datasource had to be seen as a dictionary, so we had
written code like this:
var dictionary =
CreateIDValuePairs().ToDictionary(item => item.ID, item => item.Value);
That was too simple-minded. This code compiles but crashes at runtime when an id == null is encountered, as Dictionary does not allow null keys.
The documentation warns about this behaviour, but that does not make the pain any easier.
In our opinion this restriction is not justified and just complicates the use
of Dictionary.
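A simple way around this, assuming records with a null id may be skipped, is to filter them out before building the dictionary:
var dictionary = CreateIDValuePairs().
  Where(item => item.ID != null).
  ToDictionary(item => item.ID.Value, item => item.Value);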
A bit of history: the first release of this solution was about 9.5 years ago...
Today we've run into a strange situation. One of our clients asked us about automatic conversion of data from mainframe (defined as COBOL copybooks) into XML or Java/.NET objects. On our suggestion to use eXperanto, which is well known to him, he stated that he wouldn't like to use a tool of a company that no longer exists...
The situation, in our opinion, becomes even stranger when you consider the following:
- eXperanto (the design-time tool and run-time libraries for Java and .NET) was developed, well tested, and delivered by us to production already several years ago.
- the client bought this set (the tool and libraries).
- the set is also in production in another big company, and is used from time to time by our company in different migration projects.
- the client talks with the developers of this tool and run-time libraries, and he knows about this fact.
- the client widely uses open source solutions even without dedicated vendors or support warranties.
We're not big fans of
Entity Framework, as we don't directly expose the database structure to
the client program but rather through stored procedures and functions. So, EF for
us is a tool to expose those stored procedures as .NET wrappers. This limited use
of EF still greatly automates the data access code.
But what we have lately found is that EF has a problem with char parameters. Namely,
if you import a procedure, say MyProc, that accepts char(1),
and then call it through the generated wrapper, you will see in SQL Profiler
that the char(1) parameter is passed with many trailing spaces, as if it
were char(8000). Needless to say, this is highly
inefficient.
We can see that the problem happens in the VS 2010 designer rather than in the EF runtime,
as the SP's parameters are not attributed with a length; see the model xml (*.edmx):
<Function Name="MyProc" Schema="Data">
...
<Parameter Name="recipientType" Type="char" Mode="In"
/>
...
</Function>
while if we set:
<Parameter Name="recipientType" Type="char" MaxLength="1"
Mode="In" />
the runtime starts working as expected. So the workaround is to fix the model file manually.
See also:
Stored Proc and Char parm