Friday, January 16, 2009

Improving Performance of LINQ Queries

Part of the challenge facing television broadcasters is reconciling asrun logs against the playlists they are supposed to represent. Every once in a while content is changed on the fly so what is actually broadcasted is different from what the original playlist stated. Part of this reconciliation involves the generation of recon keys, which involve comparing event information of an asrun log against the playlist it was generated for. In order to compare the original playlists and the resulting asrun logs, I turned to LINQ to get the information.

One of the LINQ queries I used for generating the recon keys was:

var rOriginalEvent = from orgEvents in db.BIP_Playlist_Events
where orgEvents.eventid.Equals(searchStr)
orderby orgEvents.Last_Updated descending
select orgEvents;

if (rOriginalEvent.Count() > 0)
{
hasMatchingEvents = true;
dlstartDateTime = rOriginalEvent.First().dlstart;
orgDlhnumber = rOriginalEvent.First().dlhnumber;
orgEventid = rOriginalEvent.First().eventid;

The purpose of this query was to search archives of the original playlists from which the D-class asrun log was generated to compare the original event id, start time and dlhnumber against the asrun log’s information. These playlists were archived in a SQL Server 2005 database table, so this LINQ query was using it as the source. I timed the query and it ran a total of 0s according to my calculations. However, when I timed the following rOriginalEvent.First() calls, they took a total of .45 seconds. This does not seem to be a significant amount of time, but when repeated 1500 times for a single transaction, it meant the transaction would last almost 20 minutes.

Since LINQ automatically generates the most efficient query in the background, I looked for ways to extract the record I wanted directly from the var query. I then turned to the following LINQ query:

var rOriginalEvent = (from orgEvents in db.BIP_Playlist_Events
where orgEvents.eventid.Equals(searchStr)
orderby orgEvents.Last_Updated descending
select orgEvents).FirstOrDefault();

This would move all the processing into the LINQ SQL connector and extract either the record I wanted or a null object. The important thing is that I no longer needed the First() call when getting the information:

dlstartDateTime = rOriginalEvent.dlstart;
orgDlhnumber = rOriginalEvent.dlhnumber;
orgEventid = rOriginalEvent.eventid;

By moving the costly First() call into the LINQ query itself I reduced the time of the total operation from .45 seconds to .1 seconds, thus a performance improvement of 4.5 times faster than before.

In conclusion, I recommend that if you are after a single record for a specific set of information, use the above LINQ expression. The FirstOrDefault() method returns the first element in the collection or the default value if the collection is empty, so you don’t have to worry about dealing with a null object.

No comments: