The main data structure issues that
you need to consider are as under:
Table
1
Issues |
Implications |
Choosing a collection without evaluating your needs |
Reduced efficiency; overly complex code. |
Using the wrong collection for a given task |
Reduced efficiency; overly complex code. |
Excessive type conversion |
Passing value type to reference type causing boxing and |
Inefficient lookups |
Complete scan of all the content in the data structure, |
Not measuring the cost of your data structures or |
Undetected bottlenecks due to inefficient code, |
What are
Collections
There are two basic
types of collections: lists and dictionaries. Lists are index-based.
Dictionaries are key-based, which means you store a key with the value. Table 2
summarizes the various list and dictionary types provided by the .NET Framework
class libraries.
Collections are types that implement
the IEnumerable, ICollection, or IList interfaces. If the types implement
IDictionary separately or in
addition to these three interfaces, they are referred to as Dictionaries. Table 2 lists the guidelines
for each of these collection types.
Table
2
Type |
Description |
ArrayList |
This is a dynamically sizable array. It is useful when |
Hashtable |
This is a collection of key/value pairs that are |
HybridDictionary |
This uses a ListDictionary when the collection is |
ListDictionary |
This is useful for storing 10 or less key/value pairs. |
NameValueCollection |
This is a sorted collection of associated String keys and String values that can be accessed either |
Queue |
This is a first-in, first-out collection that implements |
SortedList |
This is a collection of key/value pairs that are sorted |
Stack |
This is a simple last-in, first-out collection of |
StringCollection |
This is a strongly typed array list for strings. |
StringDictionary |
This is a hash table with the key strongly typed to be a |
The correct use of
data structures and algorithms plays a significant role in building
high-performance applications. Your choices in these areas can significantly
affect memory consumption and CPU loading.
The following guidelines help you to
use efficient data structures and algorithms:
● Choose an appropriate data structure.
● Pre-assign size for large dynamic growth data
types.
● Use value and reference types
appropriately.
Choose an
Appropriate Data Structure
Before choosing the
collection type for your scenarios, you should spend time analyzing your
specific requirements by using the following common criteria:
● Data storage. Consider how much data will
be stored. Will you store a few records or a few thousand records? Do you know
the amount of data to be stored ahead of time instead of at run time? How do you
need to store the data? Does it need to be stored in order or randomly?
● Type. What type of data do you need to
store? Is it strongly typed data? Do you store variant objects or value types?
● Growth. How will your data grow? What size
of growth? What frequency?
● Access. Do you need indexed access? Do you
need to access data by using a key-value pair? Do you need sorting in addition
to searching?
● Concurrency. Does access to the data need
to be synchronized? If the data is regularly updated, you need synchronized
access. You may not need synchronization if the data is read-only.
● Marshaling. Do you need to marshal your
data structure across boundaries? For example, do you need to store your data in
a cache or a session store? If so, you need to make sure that the data structure
supports serialization in an efficient way.
Pre-Assign Size
for Large Dynamic Growth Data Types
If you know that
you need to add a lot of data to a dynamic data type, assign an approximate size
up front wherever you can. This helps avoid unnecessary memory re-allocations.
Use Value and
Reference Types Appropriately
Value types are
stack-based and are passed by value, while reference types are heap-based and
are passed by reference. Use the following guidelines when choosing between
pass-by-value and pass-by-reference semantics:
●
Avoid passing large value types by value to
local methods. If the target method is in the same process or
application domain, the data is copied onto the stack. You can improve
performance by passing a reference to a large structure through a method
parameter, rather than passing the structure by value.
●
Consider passing reference types by value
across process boundaries. If you pass an object reference across a
process boundary, a callback to the client process is required each time the
objects’ fields or methods are accessed. By passing the object by value, you
avoid this overhead. If you pass a set of objects or a set of connected objects,
make sure all of them can be passed by value.
●
Consider passing a reference type when the
size of the object is very large or the state is relevant only within the
current process boundaries. For example, objects that maintain
handles to local server resources, such as files.