Die Inhalte auf dieser Seite stehen unter diesem Creative Commens License Dingsbums. Also fröhlich kopieren und weiterverteilen. Bitte abweichende Copyright-Hinweise in anderen Blogs respektiveren.

Montag, 29. März 2010

2010-03-29-Vb20xx-Bottlenecks and traps using Linq with expensive select-expressions

VersionDescription
2010-03-29 T15:51:00Corrected formula for number of calls to new.

Description:

Implement Lazy-Load pattern for IEnumerator.Item[Get].

Problem:

If you are a linq-beginner there is something you should really be aware of before using it. Otherwise some day you may wonder about strange bugs or bad performance of your linq application.

Starting with linq some months ago I meanwhile like it very much and went deeply in using Linq-querys for various use cases. I used the select expression also to instantiate objects as shown in the below code snippet. Doing so i walked in a trap because i was not aware, that the select expression is evaluated each time the object set is enumerated. Also complex queries may be evaluated very often which increases runtime costs when the result tree is not copied.

Consider the following Vb.Net code:

 Dim aCount As Integer = (From aNum In New Integer() {1,2,3} Select New Object).Count

This query seems ok, but the problem is, that new objects are instantiated each time the list is iterated and even if one uses the Count-extension-method. This results in a number of calls to new of n * c + n * i where n is the number of items in the list, c is the number of requests to the count method and i is the number of iterations. The required number of calls to new is more less equal to n.

I now use following extension method to solve this problem and get the desired number of n calls to new:


 Module MyExtensions
  <System.Runtime.CompilerServices.Extension()>_
  Friend Function EvaluateItems(Of T)(ByVal aItems As IEnumerable(Of Func(Of T))) As IEnumerable(Of T)
   Dim aArray As T()
   Dim aCount As Integer = aItems.Count
   ReDim aArray(aCount - 1)
   Dim aIndex As Integer
   Dim aItem As Func(Of T)
   For Each aItem In aItems
    aArray(aIndex) = aItem()
    aIndex = aIndex + 1
    Next
   Return aArray
  End Function
 End Module

 Dim aQuery As IEnumerable(Of Func(Of Object)) = (From aNum In New Integer() {1,2,3} Select Function() New Object)  Dim aCount1 As Integer = aQuery.Count
 Dim aObjects As IEnumerable(Of Object) = aQuery.EvaluateItems
 Dim aCount2 As Integer = aObjects.Count

With the obove code only n constructors calls occur where n is the number of objects in the list.

However, i have one suggestions here.

Of course it is necessarry that the select expression is evaluated each time the item is requested. However, this does not apply if the list is only enumerated using IEnumerator.MoveNext without using IEnumerator.Item[Get]. The Interface design is suitable for implementing a lazy load here but (linq)-implementations of IEnumerable do not follow this.

I also wrote a extension method to (deep)-copy any IEnumerable tree for optimizing complex queries where updating is handled by the owner of the query result. The implementation quite tricky and not not so good so i don't post it here yet. Maybe Microsoft could provide a default extension method or a code snippet to do this.

Beside of this i'd like to suggest linq to implement a lazy load in IEnumerator.Item.Get to allow iteration without evaluating the select expression. For example this is done when counting items or when requesting an item at the end of the object set using ElementAt. For some use cases (for example implementing paging) this behaviour is required.

Solution:

1a. Implement lazy load in certain IEnumerator.Item[Get] implementations. (For example Linq)

Or

1b. provide extension method to create a ghost-list.

2. Provide a IEnumerable-Extension method which deep-copys items in an ienumerable (to an array) to avoid unneccesarry evaluations of the select expression.

TestCase:


 Class C
  Public Sub New()
   InstanceCount = InstanceCount + 1
  End Sub
  Public Shared InstanceCount As Integer
 End Class

 Module Test

  Function MyCount(ByVal a As IEnumerable) As Integer
   Dim aCount As Integer
   Dim aEnumerator As IEnumerator = a.GetEnumerator
   While aEnumerator.MoveNext
    aCount = aCount + 1
   End While
   Return aCount
  End Function

  Sub Test
   Debug.Assert(3 = MyCount((From aNum In New Integer() {1,2,3} Select New C).CreateGhostList)
   Dim aSuggestionImplemented As Boolean = True
   If aSuggestionImplemented Then
    ' This will fail with Vb.Net 2010
    Debug.Assert(0 = C.InstanceCount)
   Else
    ' In Vb.Net 2010 using IEnumerator.MoveNext evaluates the Select expression of a linq-query.
    Debug.Assert(3 = C.InstanceCount)
   End If
  End Sub

  End Module

Keine Kommentare:

Kommentar veröffentlichen