Though parallel and async algorithms solve different tasks, they converge in some cases, and it's not always immediately clear which is the better choice.
Consider the following task: get the total number of words contained in a given set of URLs.
At first we solved it as a parallel task: indeed, it fits the MapReduce pattern, where you fetch the URLs' contents and count the words of each in parallel (Map), and then sum the per-URL word counts to get the final result (Reduce). But then we realized that the very same MapReduce algorithm can be implemented with async.
This is a parallel word count:
public static int ParallelWordCount(IEnumerable<string> urls)
{
    var result = 0;

    Parallel.ForEach(
        urls,
        url =>
        {
            string content;

            // Map: download the page; the worker thread blocks until the response arrives.
            using(var client = new WebClient())
            {
                content = client.DownloadString(url);
            }

            var count = WordCount(content);

            // Reduce: add this page's count to the shared total atomically.
            Interlocked.Add(ref result, count);
        });

    return result;
}
Here is async word count:
public static async Task<int> WordCountAsync(IEnumerable<string> urls)
{
    // Map: start one download-and-count task per url; Reduce: sum the results.
    return (await Task.WhenAll(urls.Select(url => WordCountAsync(url)))).Sum();
}

public static async Task<int> WordCountAsync(string url)
{
    string content;

    // No thread is blocked while the download is in flight.
    using(var client = new WebClient())
    {
        content = await client.DownloadStringTaskAsync(url);
    }

    return WordCount(content);
}
And this is an implementation of word count for a text (it's less important for this discussion):
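public static int WordCount(string text)
{
    var count = 0;
    var space = true; // true while the previous character was whitespace

    for(var i = 0; i < text.Length; ++i)
    {
        if (space != char.IsWhiteSpace(text[i]))
        {
            space = !space;

            // A switch from whitespace to non-whitespace starts a new word.
            if (!space)
            {
                ++count;
            }
        }
    }

    return count;
}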
Our impressions are:
The parallel version is contained in one method, while the async one is implemented with two methods.
This is because, in our experiments, the C# compiler failed to generate an async lambda function. We attribute this to Microsoft, which leads and implements the C# spec. Features should be composable: if one can implement a method as a lambda function, and one can implement a method as async, then one should be able to implement a method as an async lambda function.
Both the parallel and the async versions use the thread pool to run their logic.
While both implementations follow the MapReduce pattern, the async version is much more scalable. The reason is that parallel threads stay blocked while waiting for an HTTP response, whereas async tasks are not bound to any thread and simply do not run while waiting for I/O.
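To make the scalability point more tangible, here is a minimal sketch that is not part of our original code. It replaces the HTTP download with two hypothetical stand-ins: BlockingFetch, which holds its thread with Thread.Sleep, and AwaitingFetch, which releases its thread with Task.Delay. The class and method names, and the idea of simulating I/O latency with a delay, are assumptions made only for illustration.

```csharp
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class BlockingVersusAwaiting
{
    // Hypothetical stand-in for a synchronous download:
    // the calling thread is held for the whole simulated latency.
    public static int BlockingFetch(int delayMs)
    {
        Thread.Sleep(delayMs);
        return 1;
    }

    // Hypothetical stand-in for an asynchronous download:
    // no thread is held while the simulated latency elapses.
    public static async Task<int> AwaitingFetch(int delayMs)
    {
        await Task.Delay(delayMs);
        return 1;
    }

    // Parallel variant: each in-flight request occupies a worker thread.
    public static int RunParallel(int requests, int delayMs)
    {
        var total = 0;

        Parallel.For(0, requests, i =>
        {
            Interlocked.Add(ref total, BlockingFetch(delayMs));
        });

        return total;
    }

    // Async variant: all requests are started up front and awaited together.
    public static async Task<int> RunAsync(int requests, int delayMs)
    {
        var results = await Task.WhenAll(
            Enumerable.Range(0, requests).Select(i => AwaitingFetch(delayMs)));

        return results.Sum();
    }
}
```

Timing both methods with a Stopwatch for a large number of requests is one way to observe the difference. We would expect the parallel variant to be limited by how many threads are available to block, while the async variant should finish in roughly one delay period, though the exact numbers depend on the machine and the thread pool settings.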
This sample helped us to answer the question as to when to use parallel and when async. The simple answer goes like this: