How choice of Data Structures and Algorithms affect Performance

Posted by

 

The main data structure issues that
you need to consider are as under:

Table
1

Issues

Implications

Choosing a collection without evaluating your needs
(size, adding, deleting, updating)

Reduced efficiency; overly complex code.

Using the wrong collection for a given task

Reduced efficiency; overly complex code.

Excessive type conversion

Passing value type to reference type causing boxing and
unboxing overhead, causing performance hit.

Inefficient lookups

Complete scan of all the content in the data structure,
resulting in slow performance.

Not measuring the cost of your data structures or
algorithms in your actual scenarios

Undetected bottlenecks due to inefficient code,

 

What are
Collections

There are two basic
types of collections: lists and dictionaries. Lists are index-based.
Dictionaries are key-based, which means you store a key with the value. Table 2
summarizes the various list and dictionary types provided by the .NET Framework
class libraries.

Collections are types that implement
the IEnumerable, ICollection, or IList interfaces. If the types implement
IDictionary separately or in
addition to these three interfaces, they are referred to as Dictionaries. Table 2 lists the guidelines
for each of these collection types.

Table
2

Type

Description

ArrayList

This is a dynamically sizable array. It is useful when
you do not know the required array size at design time.

Hashtable

This is a collection of key/value pairs that are
organized based on the hash code of the key. It is appropriate when you need to
search but not sort.

HybridDictionary

This uses a ListDictionary when the collection is
small, and switches to Hashtable
when the collection gets large.

ListDictionary

This is useful for storing 10 or less key/value pairs.

NameValueCollection

This is a sorted collection of associated String keys and String values that can be accessed either
with the key or with the index.

Queue

This is a first-in, first-out collection that implements
ICollection.

SortedList

This is a collection of key/value pairs that are sorted
by the keys and are accessible by key and by index.

Stack

This is a simple last-in, first-out collection of
objects.

StringCollection

This is a strongly typed array list for strings.

StringDictionary

This is a hash table with the key strongly typed to be a
string rather than an object.

 

The correct use of
data structures and algorithms plays a significant role in building
high-performance applications. Your choices in these areas can significantly
affect memory consumption and CPU loading.

The following guidelines help you to
use efficient data structures and algorithms:

●    Choose an appropriate data structure.

●    Pre-assign size for large dynamic growth data
types
.

●    Use value and reference types
appropriately
.

 

Choose an
Appropriate Data Structure

Before choosing the
collection type for your scenarios, you should spend time analyzing your
specific requirements by using the following common criteria:

●    Data storage. Consider how much data will
be stored. Will you store a few records or a few thousand records? Do you know
the amount of data to be stored ahead of time instead of at run time? How do you
need to store the data? Does it need to be stored in order or randomly?

●    Type. What type of data do you need to
store? Is it strongly typed data? Do you store variant objects or value types?

●    Growth. How will your data grow? What size
of growth? What frequency?

●    Access. Do you need indexed access? Do you
need to access data by using a key-value pair? Do you need sorting in addition
to searching?

●    Concurrency. Does access to the data need
to be synchronized? If the data is regularly updated, you need synchronized
access. You may not need synchronization if the data is read-only.

●    Marshaling. Do you need to marshal your
data structure across boundaries? For example, do you need to store your data in
a cache or a session store? If so, you need to make sure that the data structure
supports serialization in an efficient way.

 

Pre-Assign Size
for Large Dynamic Growth Data Types

If you know that
you need to add a lot of data to a dynamic data type, assign an approximate size
up front wherever you can. This helps avoid unnecessary memory re-allocations.

 

Use Value and
Reference Types Appropriately

Value types are
stack-based and are passed by value, while reference types are heap-based and
are passed by reference. Use the following guidelines when choosing between
pass-by-value and pass-by-reference semantics:

●         
Avoid passing large value types by value to
local methods
. If the target method is in the same process or
application domain, the data is copied onto the stack. You can improve
performance by passing a reference to a large structure through a method
parameter, rather than passing the structure by value.

●         
Consider passing reference types by value
across process boundaries
. If you pass an object reference across a
process boundary, a callback to the client process is required each time the
objects’ fields or methods are accessed. By passing the object by value, you
avoid this overhead. If you pass a set of objects or a set of connected objects,
make sure all of them can be passed by value.

●         
Consider passing a reference type when the
size of the object is very large or the state is relevant only within the
current process boundaries
. For example, objects that maintain
handles to local server resources, such as files.

 

 

Leave a Reply

Your email address will not be published. Required fields are marked *