Tuesday, November 25, 2008

Dotnet Framework Fundamentals

Dotnet Framework

Features
* Interoperability
As interaction between new and older applications is commonly required, the .NET Framework provides means to access functionality that is implemented in programs that execute outside the .NET environment. Access to COM components is provided in the System.Runtime.InteropServices and System.EnterpriseServices namespaces of the framework; access to other functionality is provided using the P/Invoke feature.

* Common Runtime Engine
The Common Language Runtime (CLR) is the virtual machine component of the .NET framework. All .NET programs execute under the supervision of the CLR, guaranteeing certain properties and behaviors in the areas of memory management, security, and exception handling.

* Base Class Library
The Base Class Library (BCL), part of the Framework Class Library (FCL), is a library of functionality available to all languages using the .NET Framework. The BCL provides classes which encapsulate a number of common functions, including file reading and writing, graphic rendering, database interaction and XML document manipulation.

* Simplified Deployment
Installation of computer software must be carefully managed to ensure that it does not interfere with previously installed software, and that it conforms to security requirements. The .NET framework includes design features and tools that help address these requirements.

* Security
The design is meant to address some of the vulnerabilities, such as buffer overflows, that have been exploited by malicious software. Additionally, .NET provides a common security model for all applications.

* Portability
The design of the .NET Framework allows it, in theory, to be platform agnostic and thus cross-platform compatible. That is, a program written to use the framework should run without change on any type of system for which the framework is implemented. Microsoft's commercial implementations of the framework cover Windows, Windows CE, and the Xbox 360. In addition, Microsoft submits the specifications for the Common Language Infrastructure (which includes the core class libraries, Common Type System, and the Common Intermediate Language), the C# language, and the C++/CLI language to both ECMA and the ISO, making them available as open standards. This makes it possible for third parties to create compatible implementations of the framework and its languages on other platforms.

CLR

The CLR is the .NET equivalent of the Java Virtual Machine (JVM). It is the runtime that converts MSIL code into the machine code of the host, which is then executed.
Features

The CLR is the execution engine for .NET Framework applications. It provides a number of services, including:

• Code management (loading and execution)
• Application memory isolation
• Verification of type safety
• Conversion of IL to native code
• Access to metadata (enhanced type information)
• Managing memory for managed objects
• Enforcement of code access security
• Exception handling, including cross-language exceptions
• Interoperation between managed code, COM objects, and pre-existing DLLs (unmanaged code and data)
• Automation of object layout
• Support for developer services (profiling, debugging, and so on)

Compilers

When a .NET program is compiled, the compiler's output is not native machine code but a file containing a special type of code called MSIL (Microsoft Intermediate Language). MSIL defines a set of portable instructions that are independent of any CPU. The main task of the JIT compiler is to convert this MSIL code into executable native code.

Working of JIT Compiler

Microsoft Intermediate Language (MSIL) code is converted into executable code by the just-in-time (JIT) compiler. When the program is executed, the Common Language Runtime (CLR) activates the JIT compiler, which in turn converts the MSIL code to native (executable) code. Note that the JIT compiler does not convert all of the available MSIL code to native code at once; it converts each part of the program on demand, as the CLR requests it.

Types of JIT Compiler
Pre-JIT Compiler
The Pre-JIT compiler compiles the complete MSIL code to native code in a single compilation pass.

Econo-JIT Compiler
This compiler compiles only the MSIL code of those methods that are called at run time.

Normal JIT Compiler
This compiler compiles only the MSIL code of those methods that are called at run time, and stores the resulting native code in a cache. When these methods are called again, the native code is retrieved from the cache instead of being compiled again, which saves considerable execution time.
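The compile-on-first-call behaviour of the normal JIT compiler can be illustrated with a toy model. The sketch below is written in Python purely for illustration and is not how the CLR is implemented: the class name ToyJit is made up, "IL" is modelled as Python source text, and "compiling" is modelled with exec.

```python
class ToyJit:
    """Toy model of a 'normal' JIT: compile a method on first call, then reuse it."""

    def __init__(self, il_methods):
        # il_methods maps a method name to its "IL", modelled here as Python source.
        self.il_methods = il_methods
        self.native_cache = {}   # method name -> compiled callable
        self.compile_count = 0   # how many times the "compiler" actually ran

    def call(self, name, *args):
        if name not in self.native_cache:
            # First call: "compile" only this method, not the whole program.
            namespace = {}
            exec(self.il_methods[name], namespace)
            self.native_cache[name] = namespace[name]
            self.compile_count += 1
        # Later calls go straight to the cached code.
        return self.native_cache[name](*args)


il = {
    "add": "def add(a, b):\n    return a + b\n",
    "square": "def square(x):\n    return x * x\n",
}
jit = ToyJit(il)
first = jit.call("add", 2, 3)     # compiles "add", then runs it
second = jit.call("add", 10, 20)  # served from the cache
```

After the two calls, compile_count is 1 and "square" has never been compiled, because nothing called it.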
Dotnet Assemblies

Assemblies are the fundamental building blocks of a .NET Framework application. They contain the types and resources that make up an application and describe those contained types to the common language runtime. Assemblies enable code reuse, version control, security, and deployment.
Put simply, an assembly is a project that compiles to an EXE or a DLL file. Although .NET EXE and DLL files resemble their predecessors externally, the internal structure of an assembly is quite different from that of an EXE or DLL created with earlier development tools.
Parts of an Assembly

The assembly manifest or metadata
This contains the identity, file list, and dependency information that the common language runtime uses to locate and load the assembly.

The type metadata
This exposes information about the types contained within the assembly.

Intermediate Code
The intermediate language code for your assembly.

Resource files
These are non-executable bits of data, such as strings or images for a specific culture.

Assembly Execution
The assembly manifest contains the metadata that describes the assembly to the common language runtime. The common language runtime then uses the information in the assembly manifest to make decisions about the assembly's execution. An assembly manifest contains the following information:
Identity
It contains the name and version number of the assembly, and can contain optional information such as locale and signature information.
Types and resources
It contains a list of all the types that will be exposed to the common language runtime as well as information about how those types can be accessed.

Files
It contains a list of all files in the assembly as well as dependency information for those files.

Security permissions
The manifest describes the security permissions required by the assembly. If the permissions required conflict with the local security policy, the assembly will fail to execute.

For the most part, the developer does not have to be concerned with the contents of the assembly manifest. It is compiled and presented to the common language runtime automatically.
The developer does, however, need to explicitly set the metadata that describes the identity of the assembly. The identity of the assembly is contained in the AssemblyInfo.vb or .cs file for your project. You can set identity information for your assembly by right-clicking the AssemblyInfo icon and choosing View Code from the drop-down menu. The code window will open to the AssemblyInfo code page, which contains default null values for several assembly identity attributes.
Assembly Types

Class Library Assemblies
You will frequently want to create class library assemblies. These represent sets of types that can be referenced and used in other assemblies. For example, you might have a custom control that you want to use in several applications or a component that exposes higher math functions. Such an assembly is not executable itself, but rather must be referenced by an executable application to be used. You can create class library assemblies and control library assemblies by using the templates provided by Microsoft Visual Studio .NET.
The class library template is designed to help you create an assembly of types that can be exposed to other applications, and the Microsoft Windows control library template is provided to assist you in building assemblies of custom controls.
Resource Files
The .NET Framework includes a sample application called ResEditor that can be used for creating text and image resource files. The ResEditor application is not integrated with Visual Studio .NET; it must be run separately. In fact, it is supplied as source code files and must be compiled before it can be used.

Embedding Resources
Once you have created resource files, you can embed them in your assembly. This allows you to package resources into the same assembly as the code files, thus increasing the portability of your code and reducing its dependence on additional files. To embed an externally created resource into your assembly, all you have to do is add the file to your project. When the project is built, the resource file will be compiled into the assembly.
Creating Resource Assemblies
You can create assemblies that only contain resources. You might find this useful in situations where you expect to have to update the data contained in resource files, but do not want to have to recompile your application to update it.

Satellite Assemblies
When creating international applications, you might want to provide different sets of resources for different cultures. Satellite assemblies allow different sets of resources to automatically be loaded based on the CurrentUICulture setting of the thread.
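The idea of choosing a resource set by culture can be sketched as a lookup that tries the most specific culture first. This is a toy model in Python, not the runtime's actual probing code. The function names and the sample data are made up, and the fallback order (specific culture, then its parent, then the default resources) is my understanding of the usual behaviour rather than something stated above.

```python
def fallback_chain(culture):
    """Return the cultures to try, most specific first, ending with the default."""
    chain = []
    while culture:
        chain.append(culture)
        # Drop the last "-part": "fr-CA" -> "fr" -> "".
        culture = culture.rpartition("-")[0]
    chain.append("")  # "" stands for the default resources in the main assembly
    return chain


def lookup(resources, culture, key):
    """Find key in the most specific resource set that defines it."""
    for name in fallback_chain(culture):
        table = resources.get(name, {})
        if key in table:
            return table[key]
    raise KeyError(key)


# Hypothetical resource sets: "" models the main assembly, the others model satellites.
resources = {
    "": {"greeting": "Hello", "farewell": "Goodbye"},
    "fr": {"greeting": "Bonjour"},
    "fr-CA": {"greeting": "Salut"},
}
```

With this data, a thread whose UI culture is "fr-CA" gets "Salut" for the greeting but falls back to the default "Goodbye" for the farewell, since no French resource set defines it.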
Retrieving Resources at Run Time
At run time, you can use the ResourceManager class to retrieve embedded resources. A ResourceManager, as the name implies, manages access and retrieval of resources embedded in assemblies. Each instance of a ResourceManager is associated with an assembly that contains resources.
You can create a ResourceManager by specifying two parameters: the base name of the embedded resource file and the assembly in which that file is found. The new ResourceManager will be dedicated to the embedded resource file that you specify. The base name is the name of the namespace that contains the file, followed by the file name without any extensions.
The assembly parameter refers to the assembly that the resource file is located in. If the assembly that contains the resources is the same assembly that contains the object that is creating the ResourceManager, you can get a reference to the assembly from the type object of your object.
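The base name rule described above can be shown with a small helper. This is only an illustration in Python of how the string is formed; the function name and the sample namespace and file names are made up, and it is not the ResourceManager API itself.

```python
import os


def resource_base_name(namespace, file_name):
    """Return "<namespace>.<file name with all extensions removed>"."""
    stem = file_name
    ext = "."
    while ext:
        # splitext peels off one trailing extension per call.
        stem, ext = os.path.splitext(stem)
    return namespace + "." + stem


base = resource_base_name("MyApp", "Strings.resx")
```

For a hypothetical file Strings.resx in the namespace MyApp, the helper yields "MyApp.Strings", which is the kind of string passed as the first parameter.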

Private and Shared Assemblies
Most of the assemblies you create will be private assemblies. Private assemblies are the most trouble free for developers and are the kind of assembly created by default. A private assembly is an assembly that can be used by only one application. It is an integral part of the application, is packaged with the application, and is only available to that application. Because private assemblies are used by one application only, they do not have versioning or identity issues. Up to this point, you have only created private assemblies. When you add a reference to a private assembly to your project, Visual Studio .NET creates a copy of the DLL containing that assembly and writes it to your project folder. Thus, multiple projects can reference the same DLL and use the types it contains, but each project has its own copy of the DLL and therefore has its own private assembly.
Only one copy of a shared assembly, on the other hand, is present per machine. Multiple applications can reference and use a shared assembly. You can share an assembly by installing it to the Global Assembly Cache.
Why Share Assemblies

There are several reasons why you might want to install your assembly to the Global Assembly Cache. For example:

Shared location
If multiple applications need to access the same copy of an assembly, it should be shared.

Security
The Global Assembly Cache is located in the C:\WINDOWS\assembly (Microsoft Windows XP) folder, which is given the highest level of security by default.

Side-by-side versioning
You can install multiple versions of the same assembly to the Global Assembly Cache, and applications can locate and use the appropriate version.
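Side-by-side versioning amounts to keying the cache on more than the assembly name. The following toy model in Python uses a made-up class and made-up assembly names, and it keys only on name and version; a real assembly identity carries more information than that.

```python
class ToyAssemblyCache:
    """Toy model of a machine-wide cache keyed by (name, version)."""

    def __init__(self):
        self._store = {}

    def install(self, name, version, payload):
        # Different versions of the same name get different keys, so they coexist.
        self._store[(name, version)] = payload

    def resolve(self, name, version):
        # An application asks for the version it was built against.
        return self._store[(name, version)]


cache = ToyAssemblyCache()
cache.install("MathLib", (1, 0, 0, 0), "MathLib v1 code")
cache.install("MathLib", (2, 0, 0, 0), "MathLib v2 code")

old_app_gets = cache.resolve("MathLib", (1, 0, 0, 0))
new_app_gets = cache.resolve("MathLib", (2, 0, 0, 0))
```

Installing version 2 does not overwrite version 1, so an older application keeps resolving to the code it was tested with.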

For the most part, however, assemblies that you create should be private. You should only share an assembly when there is a valid reason to do so. Sharing an assembly and installing it to the Global Assembly Cache requires that your assembly be signed with a strong name.
Garbage Collector

The garbage collector (GC) of .NET completely absolves the developer from tracking memory usage and knowing when to free memory.
The garbage collector performs a collection in order to free memory. The garbage collector's optimizing engine determines the best time to perform a collection based upon the allocations being made (the exact criteria are not published by Microsoft). When the garbage collector performs a collection, it checks for objects in the managed heap that are no longer being used by the application and performs the necessary operations to reclaim their memory.

Garbage Collection Algorithm
Application Roots
Every application has a set of roots. Roots identify storage locations that either refer to objects on the managed heap or are set to null.

For example:
• All the global and static object pointers in an application.
• Any local variable/parameter object pointers on a thread's stack.
• Any CPU registers containing pointers to objects in the managed heap.
• Pointers to the objects from the freachable queue.

The list of active roots is maintained by the just-in-time (JIT) compiler and the common language runtime, and is made accessible to the garbage collector's algorithm.
Implementation
Garbage collection in .NET is a tracing collection; specifically, the CLR implements a Mark/Compact collector.
Phase I: Mark
When the garbage collector starts running, it makes the assumption that all objects in the heap are garbage. In other words, it assumes that none of the application's roots refer to any objects in the heap. It then walks the roots, adding to a graph every object that a root refers to and, in turn, every object that those objects refer to. Once all the roots have been checked, the garbage collector's graph contains the set of all objects that are somehow reachable from the application's roots; any objects that are not in the graph are not accessible by the application, and are therefore considered garbage.
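The mark phase is a graph traversal that starts at the roots. Here is a minimal sketch in Python, assuming the heap is given as a dictionary from object name to the names it points to; the function name and the sample heap are made up for illustration.

```python
def mark(roots, references):
    """Return the set of objects reachable from the roots.

    references maps each object name to the list of objects it points to.
    """
    reachable = set()
    pending = [r for r in roots if r is not None]  # null roots refer to nothing
    while pending:
        obj = pending.pop()
        if obj in reachable:
            continue  # already visited; this also keeps cycles from looping forever
        reachable.add(obj)
        pending.extend(references.get(obj, []))
    return reachable


# A, B and C form a cycle reachable from a root; D and E only point at each other.
heap = {"A": ["B"], "B": ["C"], "C": ["A"], "D": ["E"], "E": [], "F": []}
roots = ["A", None, "F"]

live = mark(roots, heap)
garbage = set(heap) - live
```

Here D and E end up as garbage even though D still points to E, because no root leads to either of them.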
Phase II: Compact
The garbage collector moves all the live objects to the bottom of the heap, leaving free space at the top. After all the garbage has been identified, all the non-garbage has been compacted, and all the non-garbage pointers have been fixed up, a pointer is positioned just after the last non-garbage object to indicate the position where the next object can be added.
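Compaction and pointer fix-up can be sketched with a forwarding table that maps each live object's old address to its new one. This is a toy model in Python, not the CLR's algorithm: the heap is a list of (name, size, refs) tuples in address order, and all names and sizes are invented.

```python
def compact(heap, live):
    """Slide live objects to the bottom of the heap and fix up their pointers.

    heap is a list of (name, size, refs) in address order, where refs holds the
    old addresses this object points to. Returns (new_heap, next_free_address).
    """
    # Pass 1: compute each live object's new address.
    old_addr, new_addr = 0, 0
    forwarding = {}
    for name, size, _ in heap:
        if name in live:
            forwarding[old_addr] = new_addr
            new_addr += size
        old_addr += size

    # Pass 2: keep only live objects and rewrite pointers via the forwarding table.
    new_heap = [
        (name, size, [forwarding[r] for r in refs])
        for name, size, refs in heap
        if name in live
    ]
    return new_heap, new_addr


# A at address 0 (size 8) points to C; B at 8 (size 16) is garbage;
# C at 24 (size 8) points back to A.
heap = [("A", 8, [24]), ("B", 16, []), ("C", 8, [0])]
new_heap, next_free = compact(heap, live={"A", "C"})
```

After compaction C has moved from address 24 to 8, A's pointer has been updated to match, and next_free marks where the next allocation goes. The sketch assumes live objects only point at live objects, which the mark phase guarantees.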
Finalization

The .NET Framework's garbage collection implicitly keeps track of the lifetime of the objects that an application creates, but it cannot do so for the unmanaged resources (e.g. a file, a window or a network connection) that objects encapsulate.
The unmanaged resources must be explicitly released once the application has finished using them. The .NET Framework provides the Object.Finalize method: a method that the garbage collector must run on the object to clean up its unmanaged resources, prior to reclaiming the memory used up by the object. Since the Finalize method does nothing by default, it must be overridden if explicit cleanup is required.
It would not be surprising if you considered Finalize just another name for destructors in C++. Though both have been assigned the responsibility of freeing the resources used by objects, they have very different semantics. In C++, destructors are executed immediately when the object goes out of scope, whereas a Finalize method is called once, when garbage collection gets around to cleaning up the object.
The potential existence of finalizers complicates the job of garbage collection in .NET by adding some extra steps before freeing an object.
Whenever a new object that has a Finalize method is allocated on the heap, a pointer to the object is placed in an internal data structure called the finalization queue. When an object is not reachable, the garbage collector considers the object garbage. The garbage collector scans the finalization queue looking for pointers to these objects. When a pointer is found, it is removed from the finalization queue and appended to another internal data structure called the freachable queue, so that the object is no longer considered garbage. At this point, the garbage collector has finished identifying garbage. The garbage collector compacts the reclaimable memory, and a special runtime thread empties the freachable queue, executing each object's Finalize method.
The next time the garbage collector is invoked, it sees that the finalized objects are truly garbage, and the memory for those objects is then simply freed.
Thus, when an object requires finalization, it dies, then lives again (is resurrected), and finally dies again. It is recommended to avoid using the Finalize method unless it is required. Finalize methods increase memory pressure because the memory and resources used by the object are not released until two garbage collections have run. Since you have no control over the order in which Finalize methods are executed, relying on them may lead to unpredictable results.
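The two-collection lifetime of a finalizable object can be traced with a toy model of the two queues. This Python sketch is a simplification with made-up names: objects are just strings, there is no object graph, and the finalizer thread is an ordinary method call.

```python
class ToyHeap:
    """Toy model of finalization: objects with finalizers need two collections."""

    def __init__(self):
        self.objects = set()     # everything currently occupying heap memory
        self.finalization = []   # objects that have a Finalize method
        self.freachable = []     # garbage that is waiting for its Finalize call
        self.finalized = []      # log of Finalize calls, in order

    def allocate(self, name, has_finalizer=False):
        self.objects.add(name)
        if has_finalizer:
            self.finalization.append(name)

    def collect(self, roots):
        # Freachable entries count as roots, so those objects are kept alive.
        live = set(roots) | set(self.freachable)
        garbage = self.objects - live
        # Move unreachable finalizable objects from one queue to the other.
        for name in [n for n in self.finalization if n in garbage]:
            self.finalization.remove(name)
            self.freachable.append(name)
            garbage.discard(name)  # resurrected until its finalizer has run
        self.objects -= garbage    # everything else is reclaimed right away

    def run_finalizer_thread(self):
        # Stands in for the dedicated thread that drains the freachable queue.
        while self.freachable:
            self.finalized.append(self.freachable.pop(0))


heap = ToyHeap()
heap.allocate("plain")
heap.allocate("file_wrapper", has_finalizer=True)

heap.collect(roots=[])             # first collection
after_first = set(heap.objects)
heap.run_finalizer_thread()
heap.collect(roots=[])             # second collection
after_second = set(heap.objects)
```

The plain object is reclaimed by the first collection, while the finalizable one survives it and is only reclaimed by the second, after its finalizer has run.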
Garbage Collection Performance Optimizations

Weak References
When an object has only a weak reference to it, the object can be collected if memory is needed and the garbage collector runs; when the application later attempts to access the object, the access will fail. To access a weakly referenced object, the application must first obtain a strong reference to it. If the application obtains this strong reference before the garbage collector collects the object, then the GC cannot collect the object, because a strong reference to it exists.
The managed heap contains two internal data structures whose sole purpose is to manage weak references: the short weak reference table and the long weak reference table.
Weak references are of two types:
• A short weak reference doesn't track resurrection,
i.e. the weak reference is cleared as soon as the object becomes unreachable, before its Finalize method has run.
• A long weak reference tracks resurrection,
i.e. the weak reference is cleared only once the garbage collector has determined that the object's storage is reclaimable: if the object has a Finalize method, that method has already been called and the object was not resurrected.
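The basic contract of a weak reference (promote it to a strong reference before use, and expect nothing back once the object is gone) can be tried out with Python's weakref module. This is an analogy in a different runtime, not the .NET WeakReference class, and it does not model the short and long variants; the class name Cache is a placeholder.

```python
import gc
import weakref


class Cache:
    """Placeholder for an object that is expensive to rebuild."""


target = Cache()
weak = weakref.ref(target)       # does not keep target alive

strong = weak()                  # promote to a strong reference before use
promoted_ok = strong is target   # holds while any strong reference exists

del target, strong               # drop every strong reference
gc.collect()                     # ask the collector to run
gone = weak() is None            # the weak reference now yields nothing
```

While strong is held the object cannot be collected; once every strong reference is dropped, calling the weak reference returns None and the application has to rebuild the object.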
Generations
One feature of the garbage collector that exists purely to improve performance is called generations. A generational garbage collector takes into account two facts that have been empirically observed in most programs in a variety of languages:
• Newly created objects tend to have short lives.
• The older an object is, the longer it will survive.
Generational collectors group objects by age and collect younger objects more often than older objects. When initialized, the managed heap contains no objects. All new objects added to the heap are said to be in generation 0 until the heap fills up, which invokes garbage collection. As most objects are short-lived, only a small percentage of young objects are likely to survive their first collection. Once an object survives its first garbage collection, it is promoted to generation 1. Objects created after that collection are again placed in generation 0. The garbage collector is next invoked only when the sub-heap of generation 0 fills up. All objects in generation 1 that survive are compacted and promoted to generation 2. All survivors in generation 0 are also compacted and promoted to generation 1. Generation 0 then contains no objects, and all objects created after the collection go into generation 0.
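The promotion rules can be followed step by step with a toy model. The Python sketch below assumes three generations and takes the set of live objects as an input instead of computing it; the class and method names are made up, and it ignores compaction.

```python
class GenerationalHeap:
    """Toy model of three generations with promotion on survival."""

    def __init__(self):
        self.generations = [set(), set(), set()]  # gen 0, gen 1, gen 2

    def allocate(self, name):
        self.generations[0].add(name)  # new objects always start in generation 0

    def collect(self, live, max_gen=0):
        """Collect generations 0..max_gen; survivors move up one generation."""
        # Walk from the oldest collected generation down so that nothing is
        # promoted twice in a single collection.
        for gen in range(max_gen, -1, -1):
            survivors = self.generations[gen] & live
            self.generations[gen] = set()
            target = min(gen + 1, 2)  # generation 2 is the oldest and keeps its survivors
            self.generations[target] |= survivors


heap = GenerationalHeap()
heap.allocate("a")
heap.allocate("b")
heap.collect(live={"a"})                  # "b" dies, "a" is promoted to gen 1
heap.allocate("c")
heap.collect(live={"a", "c"}, max_gen=1)  # "a" moves to gen 2, "c" to gen 1
```

After the second collection generation 0 is empty again, "c" sits in generation 1 and "a" in generation 2.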
Wrappers
A COM component is a reusable binary object that exposes its functionality to other components. When a client object asks for instances of a server object, the server instantiates those objects and hands out references to the client. A COM component can therefore act as a binary contract between caller and callee. This binary contract is defined in a document known as a type library. The type library describes to a potential client the services available from a particular server. Each COM component exposes a set of interfaces through which communication between COM components occurs.
The following diagram shows the communication between a client and a COM object.

Fig.1 Communication between client and a COM object
In the above figure, IUnknown and IDispatch are the interfaces, and QueryInterface, AddRef, Release, etc., are the methods exposed by those interfaces.
Communication between .NET objects occurs through objects; there are no such interfaces for communication. So .NET components have no type libraries; instead they deal with assemblies. An assembly is a collection of types and resources that are built to work together and form a logical unit of functionality. All the information related to the assembly is held in the assembly metadata. Unlike the communication between COM components, the communication between .NET components is object based.
RCW
Calling COM components from .NET Client
Generally, COM components expose interfaces to communicate with other objects. A .NET client cannot directly communicate with a COM component because the interfaces exposed by a COM component may not be readable by the .NET application. So, to communicate with a COM component, the COM component should be wrapped in such a way that the .NET client application can understand it. This wrapper is known as a Runtime Callable Wrapper (RCW).
The .NET SDK provides the Runtime Callable Wrapper (RCW), which wraps the COM component and exposes it to the .NET client application.

Fig.2 calling a COM component from .NET client
To communicate with a COM component, there must be a Runtime Callable Wrapper (RCW). An RCW can be generated by using VS.NET or by using the TlbImp.exe utility. Both approaches read the type library and use the System.Runtime.InteropServices.TypeLibConverter class to generate the RCW. This class reads the type library and converts its descriptions into a wrapper (RCW). After generating the RCW, the .NET client should import its namespace. The client application can then call the RCW object as if it were a native object.
When a client calls a function, the call is transferred to the RCW. The RCW internally calls the native COM function CoCreateInstance, thereby creating the COM object that it wraps. The RCW converts each call to the COM calling convention. Once the object has been created successfully, the .NET client application can access the COM object just like a native object.
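The role of the wrapper (create the wrapped object when it is first needed, then translate every call and result) is an instance of the adapter pattern. The sketch below shows the pattern in Python. Both classes are invented for illustration and involve no real COM; the "foreign" convention is modelled as a method that takes and returns strings.

```python
class LegacyCalculator:
    """Stands in for a COM object; its method takes and returns strings."""

    def Add(self, a, b):
        return str(int(a) + int(b))


class CalculatorWrapper:
    """Toy wrapper: creates the wrapped object lazily and converts each call."""

    def __init__(self, factory):
        self._factory = factory  # plays the role of the activation call
        self._target = None

    def add(self, a, b):
        if self._target is None:
            self._target = self._factory()  # the wrapped object is created on first use
        # Convert arguments to the callee's convention, then convert the result back.
        return int(self._target.Add(str(a), str(b)))


wrapper = CalculatorWrapper(LegacyCalculator)
result = wrapper.add(2, 3)
```

The client only ever sees add with integer arguments and an integer result; the string-based convention of the wrapped object stays hidden behind the wrapper.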
CCW
Calling .NET components from COM Client
When a COM client requests a server, it first looks up the server's registry entry, and then the communication starts. Calling a .NET component from a COM component is not a trivial exercise. .NET objects communicate through objects, but this object-based communication may not be recognized by COM clients. So, to communicate with a .NET component from a COM component, the .NET component should be wrapped in such a way that the COM client can identify it. This wrapper is known as a COM Callable Wrapper (CCW). The CCW wraps the .NET component and interacts with the COM clients.
The CCW is set up with the .NET utility RegAsm.exe, which reads the metadata of the .NET component and makes a registry entry for it.

Fig.3 calling a .NET component from COM client
Generally, a COM client instantiates objects through its native method CoCreateInstance. When interacting with .NET objects, the COM client creates them by calling CoCreateInstance through the CCW.
Internally, when CoCreateInstance is called, the call is redirected to the registry entry, and the registry redirects the call to the registered server, mscoree.dll. mscoree.dll inspects the requested CLSID, reads the registry to find the .NET class and the assembly that contains it, and rolls a CCW on that .NET class.
When a client makes a call to the .NET object, the call first goes to the CCW. The CCW converts all the native COM types to their .NET equivalents and also converts the results back from .NET to COM.



References

Framework
http://en.wikipedia.org/wiki/.NET_Framework
CLR
http://www.geekinterview.com/question_details/1047
http://www.codeproject.com/KB/dotnet/An_Overview_of_P_Invoke.aspx
Compiler
http://www.dotnet-guide.com/jit.html
http://www.dotnetspider.com/resources/1888-Just-In-Time-compiler-JIT--working-its-typ.aspx
Assemblies
http://www.dotnetpoint.net/2008/09/assemblies-in-dot-net_09.html
Garbage Collector
http://www.codeproject.com/KB/dotnet/garbagecollection.aspx
Wrappers
http://www.codeproject.com/KB/COM/nettocom.aspx
http://www.codeproject.com/KB/COM/COM_DOTNET_INTEROP.aspx
