
Performance and Design Guidelines for Data Access Layers

Many of the problems you will face actually amount to building a data access layer, sometimes thinly disguised, sometimes in your face. It’s one of the broad patterns that you see in computer science; as the cliché says, it keeps rearing its ugly head.

Despite this, the same sorts of mistakes tend to be made in the design of such systems so I’d like to offer a bit of hard-won advice on how to approach a data access problem. Mostly this is going to be in the form of patterns/anti-patterns but nonetheless I hope it will be useful.

As always, in the interest of not writing a book, this advice is only approximately correct.

The main thing that you should remember is that access to the data will take two general shapes. In database parlance you might say some of the work will be OLTP-ish (online transaction processing) and some of the work will be OLAP-ish (online analytical processing). Put simply, there’s how you update pieces of your data and how you read chunks of it. And they have different needs.

At present it seems to me that people feel a strong temptation to put an OO interface around the data and expose that to customers.  This can be ok as part of the solution if you avoid some pitfalls, so I suggest you follow this advice:

1. Consider the unit of work carefully

There are likely to be several typical types of updates. Make sure that you fetch enough data so that the typical cases do one batch of reads for the necessary data, modify the data locally, and then write that data back in one batch. If you read too much data you incur needless transfer costs; if you read too little, you make too many round trips to do the whole job.
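The read-modify-write shape described above can be sketched in a few lines. This is only an illustration in Python, assuming a hypothetical in-memory `CountingStore` that stands in for the remote store and counts round trips; `rename_album` and the key names are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class CountingStore:
    """Hypothetical in-memory stand-in for a remote store; counts round trips."""
    rows: dict = field(default_factory=dict)
    round_trips: int = 0

    def read_batch(self, keys):
        # One round trip returns a local copy of every requested row.
        self.round_trips += 1
        return {key: dict(self.rows[key]) for key in keys}

    def write_batch(self, changes):
        # One round trip applies every changed row together.
        self.round_trips += 1
        for key, row in changes.items():
            self.rows[key] = dict(row)


def rename_album(store, album_id, photo_ids, new_name):
    """One unit of work: batch read, local edits, batch write."""
    local = store.read_batch([album_id, *photo_ids])  # round trip 1
    local[album_id]["name"] = new_name                # local edit, no traffic
    for photo_id in photo_ids:
        local[photo_id]["album_name"] = new_name
    store.write_batch(local)                          # round trip 2


store = CountingStore(rows={
    "album:1": {"name": "Trip"},
    "photo:1": {"album_name": "Trip"},
    "photo:2": {"album_name": "Trip"},
})
rename_album(store, "album:1", ["photo:1", "photo:2"], "Holiday")
print(store.round_trips)  # 2
```

The cost stays at two round trips no matter how many photos the album holds, as long as the batch was sized to fit the typical update.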

You may have noticed that I began with a model where you fetch some data, change it locally, and write it back. This is a fairly obvious thing to do given that you are probably going to want to do the write-back in a single transaction, but it’s important to do this even if you aren’t working in a transacted system. Consider an alternative: if you were to provide some kind of proxy to the data to each client and then RPC each property change back to the server, you would be in a world of hurt. The number of round trips would be very high and, furthermore, it would be impossible to write correct code because two people could be changing the very same object at the same time in partial, uncoordinated ways.

This may seem like a silly thing to do, but if the authoritative store isn’t a database it’s all too common for people to forget that the database rules exist for a reason and that they probably apply to any kind of store at all. Even if you’re using (e.g.) the registry or some other repository, you still want to think about unit-of-work and make it so that each normal kind of update is a single operation.

Whatever you do, don’t create an API where each field read/write is remoted to get the value. Besides the performance disaster this creates, it’s impossible to understand what will happen if several people are doing something like Provider.SomeValue += 1;
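To see why Provider.SomeValue += 1 is so hard to reason about, it helps to spell out what a remoted field does. The sketch below is in Python and assumes a hypothetical `RemoteCounter` whose `get` and `set` each cost one round trip; the two clients are interleaved by hand so the race is deterministic.

```python
class RemoteCounter:
    """Hypothetical server-side value reached one field access at a time."""

    def __init__(self):
        self.value = 0
        self.round_trips = 0

    def get(self):
        self.round_trips += 1
        return self.value

    def set(self, new_value):
        self.round_trips += 1
        self.value = new_value


server = RemoteCounter()

# "+= 1" on a remoted field is really a remote read followed by a remote write.
a_seen = server.get()    # client A reads 0
b_seen = server.get()    # client B reads 0
server.set(a_seen + 1)   # client A writes 1
server.set(b_seen + 1)   # client B also writes 1; A's increment is lost

print(server.value, server.round_trips)  # 1 4
```

Two increments cost four round trips and still leave the value at 1, and neither client is told that anything went wrong.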

2. Consider your locking strategy

Implicit in the discussion above is some notion of accepting or rejecting writes because the data has changed out from under you.  This is a normal situation and making it clear that it can and does happen and should be handled makes everyone’s life simpler.  This is another reason why an API like Provider.SomeValue = 1 to do the writes is a disaster.  How does it report failure?  And if it failed, how much failed?

You can choose an optimistic locking strategy or something else but you’ll need one.  A sure sign that you have it right is that the failure mode is obvious, and the recovery is equally obvious. 
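One common way to get an obvious failure mode is a version check on write. The following is a minimal sketch in Python, not a description of any particular product; `VersionedStore`, `ConflictError`, and `increment` are hypothetical names invented for the illustration.

```python
class ConflictError(Exception):
    """Raised when the row changed after the caller read it."""


class VersionedStore:
    """Hypothetical store that keeps a version number next to each value."""

    def __init__(self):
        self._rows = {}  # key -> (version, value)

    def read(self, key):
        return self._rows.get(key, (0, None))

    def write(self, key, expected_version, value):
        current_version, _ = self._rows.get(key, (0, None))
        if current_version != expected_version:
            # Reject the whole write; nothing is partially applied.
            raise ConflictError(
                f"{key}: expected v{expected_version}, found v{current_version}"
            )
        self._rows[key] = (current_version + 1, value)


def increment(store, key, retries=3):
    """Recovery is the obvious one: re-read, recompute, try again."""
    for _ in range(retries):
        version, value = store.read(key)
        try:
            store.write(key, version, (value or 0) + 1)
            return
        except ConflictError:
            continue
    raise ConflictError(f"{key}: gave up after {retries} attempts")


store = VersionedStore()
increment(store, "hits")
stale_version, stale_value = store.read("hits")  # this copy is about to go stale
increment(store, "hits")                         # another writer gets in first
try:
    store.write("hits", stale_version, stale_value + 1)
except ConflictError as err:
    print("rejected:", err)
print(store.read("hits"))  # (2, 2)
```

The stale write fails as a unit with an explicit error, so the caller knows exactly how much failed (all of it) and what to do next (read again and retry).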

I once had a conversation with Jim Gray where I told him how ironic it was to me that the only reason transactions could ever succeed at all in a hot system was that they had the option of failing.  Delicious irony that.

Remember, even data from a proxy isn’t really live. It’s an illusion. The moment you say “var x = provider.X;” your ‘x’ is already potentially stale by the time it’s assigned. Potentially stale data is the norm; it’s just a question of how stale it is and how you recover. That means some kind of isolation and locking choice is mandatory.

3. Don’t forget the queries

Even if you did everything else completely correctly, you’ve still only built half the system if all you can do is read and modify entities. The OLAP part of your system will want to do bulk reads like “find me all the photos for this week”. When doing these types of accesses it is vital to take advantage of the fact that they are read-only. Do not create transactable objects; just bring back the necessary data in a raw form. Simple arrays of primitives are the best choice; they have the smallest overheads. Do not require multiple round trips to get commonly required data, or the fixed cost of the round trips will end up dwarfing the actual work you’re trying to do.

These queries are supposed to represent a snapshot in time according to whatever isolation model your data has (which comes back to the requirements created by your use cases and your unit of work).  If you force people to use your object interface to read raw data you will suffer horrible performance and you will likely have logically inconsistent views of your data.  Don’t do that.
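As a rough illustration of a bulk read that returns raw data, here is a sketch using Python’s built-in sqlite3 module. The `photos` table, its columns, and the `photos_between` helper are assumptions made up for the example; the point is only that one query returns plain tuples rather than tracked entity objects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE photos (id INTEGER PRIMARY KEY, title TEXT, taken TEXT)")
conn.executemany(
    "INSERT INTO photos (id, title, taken) VALUES (?, ?, ?)",
    [
        (1, "beach", "2012-01-02"),
        (2, "harbor", "2012-01-05"),
        (3, "skyline", "2012-01-20"),
    ],
)
conn.commit()


def photos_between(conn, start, end):
    """Return plain (id, title) tuples for one date range in a single query."""
    cursor = conn.execute(
        "SELECT id, title FROM photos WHERE taken >= ? AND taken < ? ORDER BY id",
        (start, end),
    )
    return cursor.fetchall()  # list of tuples, no tracked entity objects


week = photos_between(conn, "2012-01-02", "2012-01-09")
print(week)  # [(1, 'beach'), (2, 'harbor')]
```

Because the whole result comes from a single statement, it reflects one consistent view of the table instead of a series of per-object reads taken at different moments.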

One of the reasons that systems like Linq to SQL were as successful as they were (from various perspectives I think) is that they obeyed these general guidelines:

- you can get a small amount of data or a large amount of data
- you can get objects or just data
- you can write back data chunks in units of your choice
- the failure mode for read/writes is clear, easy to deal with, and in-your-face (yes, reads can fail, too)

Other data layers, while less general no doubt, would do well to follow the same set of rules.

10 Jan 2012, 15:47:00   Source: Performance and Design Guidelines for Data Access Layers   Tags: Other

Windows with C++: Thread Pool Synchronization

Blocking operations are bad news for concurrency. You need a way for the thread pool to wait on your behalf without affecting its concurrency limits. It can then queue a callback once the resource is available or the time has elapsed. Along with work objects, the thread pool API provides a number of other callback-generating objects. Here, Kenny Kerr shows how to use wait objects.
25 Oct 2011, 19:00:00   Source: Windows with C++: Thread Pool Synchronization   Tags: Other

Universal Type Extender

Emulate extension properties by extending any reference type with any other types.
18 Oct 2011, 00:54:00   Source: Universal Type Extender   Tags: Other

Multiple face detection and recognition in real time

Face detection and recognition with support for multiple faces in the same scene, plus other interesting features, using C# and EmguCV
14 Aug 2011, 23:30:00   Source: Multiple face detection and recognition in real time   Tags: Other

Working with Audio in Windows Phone 7

Smart phones are constantly evolving to fit your mobile lifestyle. Most modern phones function as full-featured music and video players. Windows Phone 7 follows the path blazed by other smart phones, but adds its own twist. Your musical life on this device revolves around the Music + Videos hub. This article contains details on how to interact with the Music + Videos hub from your application.

8 Aug 2011, 19:00:00   Source: Working with Audio in Windows Phone 7   Tags: Other

An MFC-CListCtrl derived class that allows other ‘controls’ to be inserted into a particular cell

A class derived from CListCtrl that allows edit controls, combo boxes, check boxes, date pickers, and color pickers to be inserted into or removed from particular cells extremely easily. The inserted 'controls' are not CWnd-derived.
2 Aug 2011, 16:01:00   Source: An MFC-CListCtrl derived class that allows other...   Tags: Other

Introduction to Ruby on Rails

Ruby on Rails is an open source web development stack with a large developer base. Last year, Ruby on Rails reached a critical milestone with the release of Ruby on Rails version 3.0. For more information about Ruby on Rails I recommend checking out www.rubyonrails.org. You can find installation information, documentation and links to other resources on this site. This article will demonstrate how to get up and running with Ruby on Rails with help from the RailsInstaller.

30 Jun 2011, 19:00:00   Source: Introduction to Ruby on Rails   Tags: Other

7 Tips for Loading JavaScript Rich Web 2.0-like Sites Significantly Faster

Learn the principle behind Microsoft's new Doloto and 6 other cool techniques that I did in Pageflakes to load large amounts of JavaScript without compromising performance
11 Jun 2011, 10:54:00   Source: 7 Tips for Loading JavaScript Rich Web 2.0-like Sites...   Tags: Other

Cloud Cache: Introducing the Windows Azure AppFabric Caching Service

Windows Azure AppFabric Caching service provides an easy-to-use cache in the cloud that you can employ for application data, maintaining session state, and other tasks. We'll show you how to start using the Cache service in your apps today.
31 Mar 2011, 19:00:00   Source: Cloud Cache: Introducing the Windows Azure AppFabric...   Tags: Other

Use MvcContrib Grid to Display a Grid of Data in ASP.NET MVC

The past six articles in this series have looked at how to display a grid of data in an ASP.NET MVC application and how to implement features like sorting, paging, and filtering. In each of these past six tutorials we were responsible for generating the rendered markup for the grid. Our Views included the <table> tags, the <th> elements for the header row, and a foreach loop that emitted a series of <td> elements for each row to display in the grid. While this approach certainly works, it does lead to a bit of repetition and inflates the size of our Views.

The ASP.NET MVC framework includes an HtmlHelper class that adds support for rendering HTML elements in a View. An instance of this class is available through the Html object, and is often used in a View to create action links (Html.ActionLink), textboxes (Html.TextBoxFor), and other HTML content. Such content could certainly be created by writing the markup by hand in the View; however, the HtmlHelper makes things easier by offering methods that emit common markup patterns. You can even create your own custom HTML Helpers by adding extension methods to the HtmlHelper class.

MvcContrib is a popular, open source project that adds various functionality to the ASP.NET MVC framework. This includes a very versatile Grid HTML Helper that provides a strongly-typed way to construct a grid in your Views. Using MvcContrib's Grid HTML Helper you can ditch the <table>, <tr>, and <td> markup, and instead use syntax like Html.Grid(...). This article looks at using the MvcContrib Grid to display a grid of data in an ASP.NET MVC application. A future installment will show how to configure the MvcContrib Grid to support both sorting and paging.

15 Mar 2011, 19:00:00   Source: Use MvcContrib Grid to Display a Grid of Data in ASP.NET MVC   Tags: Other