Note: This post is a work in progress... not yet completed...
Overview
When Visual Studio 2008 and the .NET Framework 3.5 shipped in late 2007, the feature that got by far the most attention was Language Integrated Query (LINQ). This article provides a quick overview of the LINQ technologies, and then focuses on LINQ to XML, which is a subset of the LINQ technologies.
This article is structured as follows:
- The next section will provide a short overview of LINQ in general. We will take a look at the main drivers behind the creation of LINQ, and we will start looking at some of the specifics of LINQ to XML.
- Once we have established a basic understanding of LINQ to XML, we will take a look at its object model. LINQ to XML introduces an easy-to-use, flattened object model, especially when compared to the XML DOM object model.
- Next, we will take a look at what is in my opinion one of the most important features of LINQ to XML, and that is Functional Construction. Functional construction enables us to create an XML document in a very straightforward way. No longer do we need to use the elaborate XML DOM object model and write 100 lines of complex code, just to create a simple XML document that is just a few lines long.
- After we have an understanding of how to create an XML document, we will take a look at how we can use the features of the new LINQ to XML object model to query, iterate over, manipulate, and validate an XML document.
- To illustrate how easy it is to leverage the LINQ to XML features in a real-world application, we will take a look at an example that performs two-way binding of a rather complex XML document to a WPF tree control.
- To round out our discussion, we will take a look at some practical tips and tricks that enable us to take full advantage of LINQ to XML. We will also mention some common pitfalls, so that you can save time by not making the same mistakes as yours truly ;-)!
Prerequisites
If you want to run the code samples that are associated with this article, you will need the following:
- Since we will be using LINQ, you obviously need to have a version of Visual Studio 2008 installed. If you install Visual Studio, you will automatically also install .NET 3.5 and the C# 3.0 compiler.
- We will be showing how you can use LINQ to SQL to create XML documents, so you will need to have SQL Server 2005 or the SQL Server 2008 beta installed. If you have the Express edition installed, I recommend you also download SQL Server Management Studio Express.
- We will be using the AdventureWorks sample database to illustrate how we can create an XML document from a SQL Server database, so you should have the latest version of the AdventureWorks database installed. If you currently don't have this database installed, you can navigate to http://www.codeplex.com/MSFTDBProdSamples/Release/ProjectReleases.aspx?ReleaseId=4004, download the appropriate version of the AdventureWorks*.msi file, and follow the instructions to install the database on your SQL Server instance.
LINQ Overview
Background - Problem Description
Most of us write business applications to earn our living. Every business application that I have ever worked with dealt with some type of data, and quite a number of them involved multiple heterogeneous data sources.
In the early days (1980-1990) all of the data for an application was centralized in one file, and the data access capabilities were built directly into the language (I know that at this point, my friend Pete Miller is thinking back nostalgically about his FoxPro and dBASE days... ;-). The upside of this was that data querying and data manipulation were a core part of the programming experience, but the downside was that every platform (dBASE, FoxPro, FileMaker) would integrate these data access features in a completely different way, tailored to the capabilities of the underlying data access tool. As a result, programmers faced a steep learning curve when moving from one platform to another. Another drawback of these first-generation data access systems was scalability. All platforms were file-based, and therefore had a hard time scaling to multiple users and large data sizes.
In response to these and other issues, relational database management systems (RDBMS) came into widespread use (1990s through early 2000s). These databases used a query and data manipulation language called SQL to access the data, and database programs migrated from being strictly tied to the dialect of one particular data access system to using industry-standard SQL. The query and data manipulation features once again became external to the core programming languages, and were typically made available through a set of external libraries (DB-Library, anyone?). This process was accelerated by the creation of ODBC, which provides a standard means of accessing any database for which an ODBC driver is available.
In recent years, we have seen the emergence of Object-Relational Mapping (ORM) tools. The main idea behind the ORM movement is that designers and programmers work with objects to represent their data, but each time they have to persist these objects to or load them from a data store, they have to make a paradigm shift back to SQL and the relational model, and to the specific details of the database access libraries that are being used. An ORM tool, such as NHibernate, takes over the responsibility of persisting objects to and loading them from the database in a transparent fashion.
While SQL and its associated data access technologies (such as ADO.NET) and the newer ORM tools have simplified things significantly, a number of challenges remain:
- Most real-world applications deal with other types of data besides relational data. Indeed, in this connected world, we access a variety of XML data sources, connect to RSS feeds, leverage Web Services using the SOAP or REST protocols, access data in Active Directory or some other LDAP-based data store, and so on. Each of these data sources has its own data access paradigm.
- When working with "plain old objects", we have to use a very different API to sort, filter, group, or otherwise manipulate our objects, as compared to relational data. Wouldn't it be nice if we could have one standard API for these common tasks?
- Often we have to perform complex data transformations and/or data shaping. The way in which these transformations are performed often depends on the type of the data. For XML we use XSLT, for relational data we use views or complex joins, for objects we use hand-written code, and so on. Again, it would be nice to have access to one standard approach for performing transformations.
- What if we want to access data from a dynamic language such as Ruby, IronPython, or PowerShell, from a functional language such as F#, or from any of the future DLR-based languages? Will these languages be able to use the same data access tool as the statically-typed .NET languages such as C# or VB.NET?
Microsoft's Answer: LINQ
At the core, LINQ is really a set of constructs, built into the language, which allow us to work with any type of data, be it relational, XML or plain old objects. LINQ is supported in both C# 3.0 and VB 9.0, whose compilers shipped with Visual Studio 2008. In a way, LINQ brings us "back to the future", making querying and manipulating data a core programming concept again. The main difference from the "old school" languages such as FoxPro is that LINQ is fully independent of the type of data that is being accessed (objects, relational data, XML, etc.), and of the specific implementation of the data source (SQL Server, Oracle, etc.).
The LINQ features in the C# 3.0 compiler are built on top of a number of other language enhancements such as:
- Anonymous types
- Anonymous methods
- Type inference
- Lambda expressions
- Expression trees
- Extension methods
- Object and collection initializers
- Partial methods
A number of these features were addressed in some of my previous posts. Please refer to the blog archives for more information.
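To make the list above a little more concrete, here is a minimal sketch of how several of these features combine in a single LINQ to Objects query. The Customer type and its sample data are hypothetical and exist only for this illustration. The query-syntax and method-syntax versions are intended to be equivalent.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample type, used only for this illustration.
public class Customer
{
    public string Name { get; set; }
    public string City { get; set; }
    public decimal Balance { get; set; }
}

public static class LanguageFeaturesDemo
{
    // Object and collection initializers build the sample data in one expression.
    public static List<Customer> SampleCustomers()
    {
        return new List<Customer>
        {
            new Customer { Name = "Ann", City = "Seattle", Balance = 250m },
            new Customer { Name = "Bob", City = "Boston", Balance = 75m },
            new Customer { Name = "Cid", City = "Seattle", Balance = 120m }
        };
    }

    // Query syntax. "var" relies on type inference, because the projected
    // anonymous type has no name that we could write down.
    public static List<string> SeattleNamesQuerySyntax(IEnumerable<Customer> customers)
    {
        var query = from c in customers
                    where c.City == "Seattle"
                    orderby c.Balance descending
                    select new { c.Name, c.Balance };   // anonymous type

        return query.Select(x => x.Name).ToList();
    }

    // Method syntax. Where, OrderByDescending and Select are extension methods
    // on IEnumerable<T>, and each one takes a lambda expression.
    public static List<string> SeattleNamesMethodSyntax(IEnumerable<Customer> customers)
    {
        return customers
            .Where(c => c.City == "Seattle")
            .OrderByDescending(c => c.Balance)
            .Select(c => c.Name)
            .ToList();
    }

    public static void Main()
    {
        List<Customer> customers = SampleCustomers();
        foreach (string name in SeattleNamesQuerySyntax(customers))
        {
            Console.WriteLine(name);   // prints Ann, then Cid
        }
    }
}
```

Expression trees do not show up in this sketch; they come into play when the same query shape is handed to a provider such as LINQ to SQL, which inspects the lambdas instead of executing them directly.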
The .NET 3.5 Framework also provides LINQ support through a number of types, available in namespaces such as System.Linq, System.Xml.Linq, and System.Data.Linq. These types provide additional support on top of the compiler features.
A graphical overview of LINQ is shown below:

LINQ to XML
LINQ to XML is the portion of LINQ that allows us to:
- Construct
- Traverse
- Manipulate
- Query
- Search
XML documents and fragments, using the standard LINQ API.
One of the main goals of LINQ to XML was to address the main shortcomings in the W3C XML DOM API, as implemented in the System.Xml.* .NET 2.0 namespaces, with a focus on the following areas:
- Simplify XML tree construction with functional construction.
- Eliminate document centricity in favor of element centricity.
- Simplify naming by eliminating prefixes from the API.
- Simplify node value extraction.
These topics have always been hard to deal with for any programmer working with XML documents. XML DOM code is unnecessarily complex and bloated, and often unintentionally obfuscated. For example, it is not easy to infer the structure of the resulting XML document when reading XML DOM document creation code.
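As a rough illustration of the first and last goals in the list above, the sketch below builds the same small fragment twice: once with the DOM classes in System.Xml, and once with LINQ to XML. The element names and values are made up for this example. Note how the nesting of the XElement constructor calls mirrors the shape of the XML, and how values are read back with a simple cast.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

public static class ConstructionDemo
{
    // W3C DOM style: every node is created through the document,
    // filled in, and then attached to its parent in a separate step.
    public static string BuildWithDom()
    {
        XmlDocument doc = new XmlDocument();

        XmlElement contact = doc.CreateElement("contact");
        contact.SetAttribute("id", "42");

        XmlElement name = doc.CreateElement("name");
        name.InnerText = "Ann Smith";
        contact.AppendChild(name);

        XmlElement phone = doc.CreateElement("phone");
        phone.InnerText = "555-0100";
        contact.AppendChild(phone);

        doc.AppendChild(contact);
        return doc.OuterXml;
    }

    // LINQ to XML style: the tree is built in a single expression,
    // and no document object is required.
    public static XElement BuildWithLinqToXml()
    {
        return new XElement("contact",
            new XAttribute("id", 42),
            new XElement("name", "Ann Smith"),
            new XElement("phone", "555-0100"));
    }

    public static void Main()
    {
        XElement contact = BuildWithLinqToXml();

        // Value extraction: explicit conversions replace InnerText plus parsing.
        int id = (int)contact.Attribute("id");
        string name = (string)contact.Element("name");
        Console.WriteLine(id + ": " + name);   // prints 42: Ann Smith

        Console.WriteLine(BuildWithDom());
        Console.WriteLine(contact.ToString(SaveOptions.DisableFormatting));
    }
}
```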
Another issue that Microsoft wanted to address with LINQ to XML was to let a developer quickly move data extracted from a relational model into an XML representation, or from an object graph into an XML document. The current DOM API does not support constructs such as XQuery-style projections; LINQ to XML provides an elegant solution to this problem.
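A minimal sketch of such a projection is shown here. To keep it self-contained, an in-memory list of a hypothetical Product type stands in for rows that would normally come from a relational source; the type, its properties and the sample values are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical type standing in for a row from a relational table.
public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
    public decimal ListPrice { get; set; }
}

public static class ProjectionDemo
{
    public static List<Product> SampleProducts()
    {
        return new List<Product>
        {
            new Product { Id = 1, Name = "Road Frame", ListPrice = 1431.50m },
            new Product { Id = 2, Name = "Water Bottle", ListPrice = 4.99m },
            new Product { Id = 3, Name = "Helmet", ListPrice = 34.99m }
        };
    }

    // The query is passed straight into the XElement constructor. Each
    // product that passes the filter is projected into a <product> element.
    public static XElement ToXml(IEnumerable<Product> products, decimal minPrice)
    {
        return new XElement("products",
            from p in products
            where p.ListPrice >= minPrice
            orderby p.Name
            select new XElement("product",
                new XAttribute("id", p.Id),
                new XElement("name", p.Name),
                new XElement("listPrice", p.ListPrice)));
    }

    public static void Main()
    {
        Console.WriteLine(ToXml(SampleProducts(), 10m));
    }
}
```

With a LINQ to SQL data source, the same shape should apply, with the in-memory list replaced by a table exposed on a data context.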
In the .NET 3.5 Framework, LINQ to XML is implemented in the System.Xml.Linq.dll assembly, and exposed through the System.Xml.Linq namespace.
The dependencies of the System.Xml.Linq.dll assembly are shown in the figure below:

LINQ to XML Object Model
LINQ to XML was developed with Language-Integrated Query over XML in mind from the outset. It takes advantage of the standard query operators and adds query extensions specific to XML. Just as significant as the query capabilities of LINQ to XML itself is the fact that LINQ offers a consistent query experience across all LINQ-enabled APIs and allows us to combine XML queries with queries over other data sources. So, with one query, you can access data from:
- Local objects in memory
- An XML Data Source
- One or more SQL Server data sources
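The sketch below joins an XML fragment with a list of in-memory objects in one query. The Account type, the XML layout and the sample values are all made up for this illustration, and the SQL Server side is left out to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical in-memory type, used only for this illustration.
public class Account
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class CombinedQueryDemo
{
    public static List<Account> SampleAccounts()
    {
        return new List<Account>
        {
            new Account { Id = 1, Name = "Ann" },
            new Account { Id = 2, Name = "Bob" }
        };
    }

    public static XElement SampleOrders()
    {
        return XElement.Parse(
            @"<orders>
                <order id='1001' accountId='1'><total>250.00</total></order>
                <order id='1002' accountId='2'><total>40.00</total></order>
                <order id='1003' accountId='1'><total>120.50</total></order>
              </orders>");
    }

    // One query over two kinds of data: <order> elements are joined to
    // Account objects on the account id, then filtered on the order total.
    public static List<string> LargeOrders(
        XElement orders, IEnumerable<Account> accounts, decimal threshold)
    {
        var query = from o in orders.Elements("order")
                    join a in accounts
                        on (int)o.Attribute("accountId") equals a.Id
                    where (decimal)o.Element("total") > threshold
                    select a.Name + ": order " + (string)o.Attribute("id");

        return query.ToList();
    }

    public static void Main()
    {
        foreach (string line in LargeOrders(SampleOrders(), SampleAccounts(), 100m))
        {
            Console.WriteLine(line);
        }
    }
}
```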
The core classes that make up the LINQ to XML object model are listed below:

As you can conclude from the object model above, the number of classes involved has been dramatically reduced, resulting in a reduced learning curve. Actually I think that the biggest challenge in working with LINQ to XML is to unlearn some of the bad practices that we had to burn into our brain to make the W3C XML DOM work for us.
Some key issues regarding this object model are listed below:
- You can now work in a "document free mode" if you would like to do so. In some scenarios, you simply want to create or load some XML, manipulate and query it, and save it back. With the W3C DOM, you would be forced to create an XML document. In LINQ to XML, this is no longer the case. To perform the task listed, you could simply:
  - Create XElements directly (without having an XDocument involved at all).
  - Manipulate the XElements or XAttributes directly.
  - Save the resulting XML tree directly to a writer.
- XML names have been greatly simplified. LINQ to XML goes out of its way to make XML names as straightforward as possible. One can say that the complexity of XML names does not originate in namespaces, but in XML prefixes. XML prefixes can be used to reduce the keystrokes required when typing XML or to make XML easier to read; however, prefixes are just shortcuts for the full XML namespace. On input, LINQ to XML resolves all prefixes to their corresponding XML namespace, and prefixes are not exposed at all in the programming API. In LINQ to XML, an XName represents a full XML name consisting of an XNamespace and the local name. Developers will usually find it more convenient to use the XNamespace rather than the namespace URI string.
- An attribute (modeled by means of the XAttribute class) is no longer a subclass of the node class. It is now simply an XName-value pair, which is what it always should have been.
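The three points above can be sketched in a few lines. The namespace URI and the element names below are hypothetical. The example creates an element without any XDocument, changes it in place, saves it to a writer, and then inspects its name and attribute.

```csharp
using System;
using System.IO;
using System.Xml.Linq;

public static class DocumentFreeDemo
{
    // Hypothetical namespace URI, used only for this illustration.
    private static readonly XNamespace Ns = "http://example.com/contacts";

    // Creates and manipulates an element without any XDocument.
    public static XElement BuildAndModify()
    {
        // "Ns + localName" yields an XName; no prefix is involved.
        XElement contact = new XElement(Ns + "contact",
            new XAttribute("id", 1),
            new XElement(Ns + "name", "Ann Smith"));

        contact.Add(new XElement(Ns + "phone", "555-0100"));
        contact.SetAttributeValue("id", 2);
        contact.Element(Ns + "name").Value = "Ann Jones";
        return contact;
    }

    // Saves the element tree directly to a writer.
    public static string Save(XElement element)
    {
        using (StringWriter writer = new StringWriter())
        {
            element.Save(writer);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        XElement contact = BuildAndModify();
        Console.WriteLine(Save(contact));

        // An XName is a namespace plus a local name.
        XName name = contact.Name;
        Console.WriteLine(name.NamespaceName);
        Console.WriteLine(name.LocalName);

        // An attribute is a name/value pair rather than a node in the tree.
        XAttribute id = contact.Attribute("id");
        Console.WriteLine(id.Name + " = " + id.Value);

        // No document was ever created for this element.
        Console.WriteLine(contact.Document == null);
    }
}
```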
Functional Construction
Querying
WPF Data Binding
Tips and Tricks
Conclusion
Notes: Make sure to show how to construct an XML document from a LINQ database query