affiliate_link

Sunday, July 9, 2017

Important C# Concepts Part 2


1. What are generics in C#?

Generics is a technique by which we can declare a class without specifying the data type that the class works.


Generics Problem Statement

Code block 1. An object based stack 
Shows the full implementation of the Object-based stack. Because Object is the canonical .NET base type, you can use the Object-based stack to hold any type of items, such as integers:

Stack stack = new Stack();
stack.Push(1);
stack.Push(2);
int number = (int)stack.Pop();

public class Stack
{
   readonly int m_Size; 
   int m_StackPointer = 0;
   object[] m_Items; 
   public Stack():this(100)
   {}   
   public Stack(int size)
   {
      m_Size = size;
      m_Items = new object[m_Size];
   }
   public void Push(object item)
   {
      if(m_StackPointer >= m_Size) 
         throw new StackOverflowException();       
      m_Items[m_StackPointer] = item;
      m_StackPointer++;
   }
   public object Pop()
   {
      m_StackPointer--;
      if(m_StackPointer >= 0)
      {
         return m_Items[m_StackPointer];
      }
      else
      {
         m_StackPointer = 0;
         throw new InvalidOperationException("Cannot pop an empty stack");
      }
   }
}

Two Problems with Object Based Solution

1. The first issue is performance. When using value types, you have to box them in order to 
push and store them, and unbox the value types when popping them off the stack. Boxing 
and unboxing incurs a significant performance penalty in their own right, but it also increases 
the pressure on the managed heap, resulting in more garbage collections, which is not great 
for performance either. Even when using reference types instead of value types, there is still 
a performance penalty because you have to cast from an Object to the actual type you 
interact with and incur the casting cost:

Stack stack = new Stack();
stack.Push("1");
string number = (string)stack.Pop();


2. The second (and often more severe) problem with the Object-based solution is type 
safety. Because, the compiler lets you cast anything to and from Object, you lose 
compile-time type safety. For example, the following code compiles fine, but raises an 
invalid cast exception at run time:

Stack stack = new Stack();
stack.Push(1);
//This compiles, but is not type safe, and will throw an exception: 
string number = (string)stack.Pop();

You can overcome these two problems by providing a type-specific (and hence, type-safe) performant stack. For integers you can implement and use the IntStack:
public class IntStack
{
   int[] m_Items; 
   public void Push(int item){...}
   public int Pop(){...}
} 
IntStack stack = new IntStack();
stack.Push(1);
int number = stack.Pop(); 

And so on. Unfortunately, solving the performance and type-safety problems this way 
introduces a third, and just as serious problem—productivity impact
Why Generics?
Generics allows you create type-safe classes without comprising type safety, performance 
or productivity.

Stack<int> stack = new Stack<int>();
class Stack<t>
{
    int m_StackPointer = 0;
    T[] m_Items;

    public void Push(T item)
    {
        m_Items[m_StackPointer] = item;
    }

    public T Pop()
    {
        return m_Items[m_StackPointer];
    }
}

Serialization and how it works

Serialization is the process of converting an object type to a stream of bytes in
order to store the object or transfer it to memory, database or file. Its main purpose is 
to save the state of the object in order to recreate it when needed. The reverse process
is called deserialization

How Serialization Works

This illustration shows the overall process of serialization.
Serialization Graphic
The object is serialized to a stream, which carries not just the data, but information about the object's type, such as its version, culture, and assembly name. From that stream, it can be stored in a database, a file, or memory.

Making an Object Serializable

To serialize an object, you need the object to be serialized, a stream to contain the serialized object, and a FormatterSystem.Runtime.Serialization contains the classes necessary for serializing and deserializing objects.
Apply the SerializableAttribute attribute to a type to indicate that instances of this type can be serialized. A SerializationException exception is thrown if you attempt to serialize but the type does not have the SerializableAttribute attribute.
If you do not want a field within your class to be serializable, apply the NonSerializedAttribute attribute. If a field of a serializable type contains a pointer, a handle, or some other data structure that is specific to a particular environment, and the field cannot be meaningfully reconstituted in a different environment, then you may want to make it nonserializable.
If a serialized class contains references to objects of other classes that are marked SerializableAttribute, those objects will also be serialized.

Binary and XML Serialization
Either binary or XML serialization can be used. In binary serialization, all members, even those that are read-only, are serialized, and performance is enhanced. XML serialization provides more readable code, as well as greater flexibility of object sharing and usage for interoperability purposes.

Basic and Custom Serialization

Serialization can be performed in two ways, basic and custom. Basic serialization uses the .NET Framework to automatically serialize the object.

Basic Serialization

The only requirement in basic serialization is that the object has the SerializableAttribute attribute applied. The NonSerializedAttribute can be used to keep specific fields from being serialized.
When you use basic serialization, the versioning of objects may create problems, in which case custom serialization may be preferable. Basic serialization is the easiest way to perform serialization, but it does not provide much control over the process.+

Custom Serialization

In custom serialization, you can specify exactly which objects will be serialized and how it will be done. The class must be marked SerializableAttribute and implement the ISerializable interface.
If you want your object to be deserialized in a custom manner as well, you must use a custom constructor.

What is Tuple?
This msdn article explains it very well with examples, "A tuple is a data structure that has a specific number and sequence of elements".
Tuples are commonly used in four ways:
  1. To represent a single set of data. For example, a tuple can represent a database record, and its components can represent individual fields of the record.
  2. To provide easy access to, and manipulation of, a data set.
  3. To return multiple values from a method without using out parameters (in C#) or ByRefparameters (in Visual Basic).
  4. To pass multiple values to a method through a single parameter. For example, the Thread.Start(Object) method has a single parameter that lets you supply one value to the method that the thread executes at startup time. If you supply a Tuple<T1, T2, T3> object as the method argument, you can supply the thread’s startup routine with three items of data.

// Create a 7-tuple.
var population = new Tuple<string, int, int, int, int, int, int>(
                           "New York", 7891957, 7781984, 
                           7894862, 7071639, 7322564, 8008278);
// Display the first and last elements.
Console.WriteLine("Population of {0} in 2000: {1:N0}",
                  population.Item1, population.Item7);
// The example displays the following output: 
// Population of New York in 2000: 8,008,278

Creating the same tuple object by using a helper method is more straightforward, as the following example shows.

// Create a 7-tuple.
var population = Tuple.Create("New York", 7891957, 7781984, 
7894862, 7071639, 7322564, 8008278);
// Display the first and last elements.
Console.WriteLine("Population of {0} in 2000: {1:N0}",
                  population.Item1, population.Item7);
// The example displays the following output:
//       Population of New York in 2000: 8,008,278

Thread vs TPL


Thread

Thread represents an actual OS-level thread, with its own stack and kernel resources. (technically, a CLR implementation could use fibers instead, but no existing CLR does this) Thread allows the highest degree of control; you can Abort() or Suspend() or Resume() a thread (though this is a very bad idea), you can observe its state, and you can set thread-level properties like the stack size, apartment state, or culture.
The problem with Thread is that OS threads are costly. Each thread you have consumes a non-trivial amount of memory for its stack, and adds additional CPU overhead as the processor context-switch between threads. Instead, it is better to have a small pool of threads execute your code as work becomes available.
There are times when there is no alternative Thread. If you need to specify the name (for debugging purposes) or the apartment state (to show a UI), you must create your own Thread (note that having multiple UI threads is generally a bad idea). Also, if you want to maintain an object that is owned by a single thread and can only be used by that thread, it is much easier to explicitly create a Thread instance for it so you can easily check whether code trying to use it is running on the correct thread.

ThreadPool

ThreadPool is a wrapper around a pool of threads maintained by the CLR. ThreadPool gives you no control at all; you can submit work to execute at some point, and you can control the size of the pool, but you can’t set anything else. You can’t even tell when the pool will start running the work you submit to it.
Using ThreadPool avoids the overhead of creating too many threads. However, if you submit too many long-running tasks to the threadpool, it can get full, and later work that you submit can end up waiting for the earlier long-running items to finish. In addition, the ThreadPool offers no way to find out when a work item has been completed (unlike Thread.Join()), nor a way to get the result. Therefore, ThreadPool is best used for short operations where the caller does not need the result.

Task

Finally, the Task class from the Task Parallel Library offers the best of both worlds. Like the ThreadPool, a task does not create its own OS thread. Instead, tasks are executed by a TaskScheduler; the default scheduler simply runs on the ThreadPool.
Unlike the ThreadPool, Task also allows you to find out when it finishes, and (via the generic Task<T>) to return a result. You can call ContinueWith() on an existing Task to make it run more code once the task finishes (if it’s already finished, it will run the callback immediately). If the task is generic, ContinueWith() will pass you the task’s result, allowing you to run more code that uses it.
You can also synchronously wait for a task to finish by calling Wait() (or, for a generic task, by getting the Result property). Like Thread.Join(), this will block the calling thread until the task finishes. Synchronously waiting for a task is usually bad idea; it prevents the calling thread from doing any other work, and can also lead to deadlocks if the task ends up waiting (even asynchronously) for the current thread.
Since tasks still run on the ThreadPool, they should not be used for long-running operations, since they can still fill up the thread pool and block new work. Instead, Task provides a LongRunning option, which will tell the TaskScheduler to spin up a new thread rather than running on the ThreadPool.
All newer high-level concurrency APIs, including the Parallel.For*() methods, PLINQ, C# 5 await, and modern async methods in the BCL, are all built on Task.

Conclusion

The bottom line is that Task is almost always the best option; it provides a much more powerful API and avoids wasting OS threads.
The only reasons to explicitly create your own Threads in modern code are setting per-thread options, or maintaining a persistent thread that needs to maintain its own identity.

Delegate and Multicast Delegate

No comments: