Story of Modularity: Loading XML Files

In the previous post I talked about achieving modularity in a web application, and I mentioned using XML files as configuration scripts for our modules. This time, I’m going to explain an easy way to load an external XML file into your program. This method works not only in web applications, but in any .NET application.

Where to Start?

I usually start developing a custom XML structure by typing it directly into an XML file, as a kind of prototype of the structure. In this case, I am going to develop an XML file that contains the configuration information for my modular theming system. The XML file will describe the theme and its structure, which is a list of the ASPX pages inside that theme. Each page in the theme should have a type, a title, an identifier, and the ASPX file name of the page.

From that requirement, I came up with a theme file that looks something like this:


[xml]
<?xml version="1.0" encoding="utf-8" ?>
<theme>
  <title>Heliosky</title>
  <pages>
    <page title="Home Page" type="Home" identifier="home" page="Home.aspx" />
    <page title="Default Content" type="Default" identifier="default" page="Default.aspx" />
  </pages>
</theme>
[/xml]

.NET Class to Represent XML

Once the XML structure is defined, you need to create classes in your program to represent it. Add the System.Xml.Serialization namespace to your using directives. It contains the attributes required to map your classes and properties to the XML structure, and the .NET Framework then handles the deserialization of the XML file automatically. The attributes we are going to use are:

  1. XmlRoot: Marks the class as the representation of the XML root element.
  2. XmlElement: Marks a property or field as the representation of an XML element. An XML element is a tag (<data></data>, <single />). It may be a complex type with its own attributes and child elements, or a simple type such as a string or an integer, depending on the data contained inside the element.
  3. XmlAttribute: Marks a property or field as the representation of an attribute of an XML element. An XML attribute is a value tied to a certain XML element (<data attribute1="value 1"></data>, <single attribute2="value 2" />). An XML attribute is always a simple type: a string, an integer, or possibly an enum.
  4. XmlArray: Marks a property or field of an array type (a collection such as List<T> also works) as the representation of an XML array (<numbers><number>1</number><number>2</number></numbers>). In the simple case used here, all child elements of an XML array are of a single XML element type.
  5. XmlArrayItem: Configures the items of the XML array: the name of the child tag and the .NET type that represents each child element.

Most of these attributes have a constructor that accepts the name of the XML tag or attribute being represented. You have to pass the correct name so that the deserializer can map the XML file to the correct class, field, or property. If you omit it, the serializer falls back to the name of the class or member, and XML names are case-sensitive. The XmlArrayItem attribute has an additional constructor parameter that accepts the .NET type that represents the child element.
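To see the attributes from the list working together before applying them to the theme file, here is a minimal sketch. The SampleData class, its XML layout, and the SampleDataDemo helper are hypothetical names made up for this illustration; they are not part of the theming system.

[csharp]
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical class, used only to illustrate the attributes
[XmlRoot("data")]
public class SampleData
{
    // <data name="...">
    [XmlAttribute("name")]
    public string Name { get; set; }

    // <description>...</description>
    [XmlElement("description")]
    public string Description { get; set; }

    // <numbers><number>1</number><number>2</number></numbers>
    [XmlArray("numbers")]
    [XmlArrayItem("number", typeof(int))]
    public int[] Numbers { get; set; }
}

public static class SampleDataDemo
{
    // Deserialize from a string instead of a file to keep the sketch small
    public static SampleData Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(SampleData));
        using (var reader = new StringReader(xml))
        {
            return (SampleData)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        string xml =
            "<data name=\"sample\">" +
            "<description>Two numbers</description>" +
            "<numbers><number>1</number><number>2</number></numbers>" +
            "</data>";

        SampleData data = Parse(xml);
        Console.WriteLine("{0}: {1} ({2} numbers)", data.Name, data.Description, data.Numbers.Length);
    }
}
[/csharp]

Each name passed to an attribute constructor matches a tag or attribute name in the XML string exactly, including its casing.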

Looking at the XML structure defined earlier, we can identify several XML elements:

  1. theme: The root element of the XML file
  2. title: A simple XML element containing a string, which is the title of the theme
  3. pages: An XML element that acts as an XML array, containing child elements of type page
  4. page: An XML element that is a child of pages and has several XML attributes tied to its tag

So we need to create at least 2 classes to represent the XML file:

  1. The XML root class to represent the XML root tag theme. I named this class ThemeConfiguration
  2. A class to represent the XML element page. I named this class ThemePage

So here’s the basic design of the classes:

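[csharp]
using System.Xml.Serialization;

// Maps the root <theme> element. The name must be given explicitly,
// otherwise the serializer expects <ThemeConfiguration> as the root tag.
[XmlRoot("theme")]
public class ThemeConfiguration
{
    [XmlElement("title")]
    public string Title { get; set; }

    // <pages> wraps a list of <page> elements
    [XmlArray("pages")]
    [XmlArrayItem("page", typeof(ThemePage))]
    public ThemePage[] Pages { get; set; }
}

// Maps a single <page> element; every value is an XML attribute
public class ThemePage
{
    [XmlAttribute("title")]
    public string Title { get; set; }

    [XmlAttribute("identifier")]
    public string Identifier { get; set; }

    [XmlAttribute("type")]
    public string Type { get; set; }

    [XmlAttribute("page")]
    public string Page { get; set; }
}
[/csharp]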

Load the XML

Loading the XML is very straightforward. You only need the XmlSerializer class to load the XML file from an input stream, and you only have to specify the class that represents the XML root element, in this case ThemeConfiguration. Check out the following implementation:

[csharp]
// Requires: using System; using System.IO; using System.Web; using System.Xml.Serialization;
public static ThemeConfiguration LoadTheme(string themeName)
{
    // Build the path <application folder>\Themes\<themeName>\Theme.xml
    string themeXmlFile = String.Format("{0}Themes\\{1}\\Theme.xml", HttpRuntime.AppDomainAppPath, themeName);

    using (var fileStream = new FileStream(themeXmlFile, FileMode.Open, FileAccess.Read))
    {
        // Deserialize the file into a ThemeConfiguration instance
        XmlSerializer serializer = new XmlSerializer(typeof(ThemeConfiguration));
        var retTheme = (ThemeConfiguration)serializer.Deserialize(fileStream);

        return retTheme;
    }
}
[/csharp]

Because I wrote this function to run in a web application, and I want to load the XML file dynamically from my website folder, the function loads the XML file from the Themes folder inside my web application folder. The Deserialize method of the XmlSerializer instance returns an Object, which you have to cast to the actual type, in this case ThemeConfiguration.
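Since the loading technique is not tied to a web application, the same idea can be tried out in a console program. The sketch below is a variant under stated assumptions, not the loader used in my web application: LoadThemeFrom and ThemeLoaderDemo are hypothetical names, the root folder is passed in as a parameter instead of being read from HttpRuntime, and the classes are trimmed copies of the ones above so the block can stand alone.

[csharp]
using System;
using System.IO;
using System.Xml.Serialization;

// Trimmed copies of the theme classes, so this sketch compiles on its own
[XmlRoot("theme")]
public class ThemeConfiguration
{
    [XmlElement("title")]
    public string Title { get; set; }

    [XmlArray("pages")]
    [XmlArrayItem("page", typeof(ThemePage))]
    public ThemePage[] Pages { get; set; }
}

public class ThemePage
{
    [XmlAttribute("identifier")]
    public string Identifier { get; set; }

    [XmlAttribute("page")]
    public string Page { get; set; }
}

public static class ThemeLoaderDemo
{
    // Hypothetical variant of LoadTheme: the root folder is a parameter
    public static ThemeConfiguration LoadThemeFrom(string rootFolder, string themeName)
    {
        string themeXmlFile = Path.Combine(rootFolder, "Themes", themeName, "Theme.xml");

        using (var fileStream = new FileStream(themeXmlFile, FileMode.Open, FileAccess.Read))
        {
            var serializer = new XmlSerializer(typeof(ThemeConfiguration));
            // Deserialize returns Object, so cast it to the root class
            return (ThemeConfiguration)serializer.Deserialize(fileStream);
        }
    }

    public static void Main()
    {
        // Write a small theme file into a temporary folder, then load it back
        string root = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N"));
        string themeFolder = Path.Combine(root, "Themes", "Heliosky");
        Directory.CreateDirectory(themeFolder);
        try
        {
            File.WriteAllText(
                Path.Combine(themeFolder, "Theme.xml"),
                "<theme><title>Heliosky</title><pages>" +
                "<page identifier=\"home\" page=\"Home.aspx\" />" +
                "</pages></theme>");

            ThemeConfiguration theme = LoadThemeFrom(root, "Heliosky");
            Console.WriteLine("{0} has {1} page(s)", theme.Title, theme.Pages.Length);
        }
        finally
        {
            Directory.Delete(root, true);
        }
    }
}
[/csharp]

Passing the root folder as a parameter keeps the loader independent of System.Web, which makes it easier to try out and test outside of IIS.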

Additional Knowledge

There are other ways to load an XML file into your program besides XmlSerializer, such as LINQ to XML and DataContractSerializer. But XmlSerializer is the more straightforward way to load an XML file into .NET classes. In LINQ to XML, every XML element and attribute is converted into a generic type from the System.Xml.Linq namespace, such as XElement and XAttribute. LINQ to XML is better for cases where the XML file is a data store, so you can easily manipulate or query the XML data. DataContractSerializer works similarly to XmlSerializer, but it doesn’t support XML attributes. So XmlSerializer is typically the better choice for our case, where the XML file can be loaded directly into specific entity classes defined in our program.
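For comparison, here is a rough sketch of how the LINQ to XML approach would read the same theme file. LinqToXmlDemo and GetPageFiles are hypothetical names for this illustration. Notice that you work with generic XElement and XAttribute objects and pull the values out yourself, instead of getting a ThemeConfiguration instance.

[csharp]
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class LinqToXmlDemo
{
    // Returns the ASPX file names of all pages with the given type
    public static List<string> GetPageFiles(string xml, string type)
    {
        XDocument document = XDocument.Parse(xml);

        return document
            .Descendants("page")
            // The (string) cast yields null when the attribute is missing
            .Where(p => (string)p.Attribute("type") == type)
            .Select(p => (string)p.Attribute("page"))
            .ToList();
    }

    public static void Main()
    {
        string xml =
            "<theme><title>Heliosky</title><pages>" +
            "<page title=\"Home Page\" type=\"Home\" identifier=\"home\" page=\"Home.aspx\" />" +
            "<page title=\"Default Content\" type=\"Default\" identifier=\"default\" page=\"Default.aspx\" />" +
            "</pages></theme>";

        foreach (string file in GetPageFiles(xml, "Home"))
        {
            Console.WriteLine(file);
        }
    }
}
[/csharp]

This style is handy for ad hoc queries, but every attribute name is a plain string in the query, so there is no entity class that documents the structure of the file.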

With the XmlSerializer approach, we can extend our class by adding more methods and fields to perform specific operations. In my case, where the class represents a theme configuration, I added one method that gives the actual path of an ASPX file specified by the theme. Take a look at the implementation below:

[csharp]
// Requires: using System; using System.Collections.Generic; using System.IO;
//           using System.Web; using System.Xml.Serialization;
[XmlRoot("theme")]
public class ThemeConfiguration
{
    [XmlElement("title")]
    public string Title { get; set; }

    // Raw array filled by the deserializer
    [XmlArray("pages")]
    [XmlArrayItem("page", typeof(ThemePage))]
    public ThemePage[] PagesRaw { get; set; }

    // Pages indexed by identifier; built after deserialization
    [XmlIgnore]
    public Dictionary<string, ThemePage> Pages { get; set; }

    [XmlIgnore]
    public string ThemeFolder { get; private set; }

    public string GetPageUri(string identifier)
    {
        // Validate that the theme actually defines the requested page
        ThemePage page;
        if (!Pages.TryGetValue(identifier, out page))
        {
            throw new ArgumentException("Unknown page identifier: " + identifier, "identifier");
        }

        string retUri = String.Format("/Themes/{0}/{1}", this.ThemeFolder, page.Page);
        return retUri;
    }

    private void RebuildThemeDictionary()
    {
        Pages = new Dictionary<string, ThemePage>();

        // Guard against a theme file without any pages
        if (PagesRaw == null)
        {
            return;
        }

        foreach (ThemePage page in PagesRaw)
        {
            Pages.Add(page.Identifier, page);
        }
    }

    // Static Members

    private static string _themeRootFolder;

    static ThemeConfiguration()
    {
        _themeRootFolder = String.Format("{0}Themes\\", HttpRuntime.AppDomainAppPath);
    }

    public static ThemeConfiguration LoadTheme(string themeName)
    {
        string themeXmlFile = String.Format("{0}{1}\\Theme.xml", _themeRootFolder, themeName);

        using (var fileStream = new FileStream(themeXmlFile, FileMode.Open, FileAccess.Read))
        {
            // Deserialize the file into a ThemeConfiguration instance
            XmlSerializer serializer = new XmlSerializer(typeof(ThemeConfiguration));
            var retTheme = (ThemeConfiguration)serializer.Deserialize(fileStream);

            // Set the theme folder name and index the pages by identifier
            retTheme.ThemeFolder = themeName;
            retTheme.RebuildThemeDictionary();

            return retTheme;
        }
    }
}
[/csharp]

I added a static method that loads the theme XML directly into the ThemeConfiguration class. I also added a property that stores the theme pages in a dictionary, so I can look a page up by its identifier. After deserializing the XML into a ThemeConfiguration instance, the loader rebuilds the dictionary of theme pages. The XmlIgnore attribute marks a field or property as not meant to be (de)serialized, so the serializer skips any member that carries it.
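The effect of XmlIgnore is easiest to see in a round trip. This is only a small sketch: IgnoreSample, CachedPath, and XmlIgnoreDemo are hypothetical names made up for the illustration.

[csharp]
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical class, used only to illustrate XmlIgnore
[XmlRoot("item")]
public class IgnoreSample
{
    [XmlAttribute("name")]
    public string Name { get; set; }

    // Runtime-only value: never written to or read from the XML
    [XmlIgnore]
    public string CachedPath { get; set; }
}

public static class XmlIgnoreDemo
{
    public static string ToXml(IgnoreSample sample)
    {
        var serializer = new XmlSerializer(typeof(IgnoreSample));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, sample);
            return writer.ToString();
        }
    }

    public static IgnoreSample FromXml(string xml)
    {
        var serializer = new XmlSerializer(typeof(IgnoreSample));
        using (var reader = new StringReader(xml))
        {
            return (IgnoreSample)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        var sample = new IgnoreSample { Name = "home", CachedPath = "/Themes/Heliosky/Home.aspx" };

        // The serialized XML carries the name attribute but not CachedPath
        string xml = ToXml(sample);
        Console.WriteLine(xml);

        // After the round trip the ignored property is back to its default (null)
        IgnoreSample copy = FromXml(xml);
        Console.WriteLine(copy.CachedPath == null);
    }
}
[/csharp]

This is why the loader has to set ThemeFolder and rebuild the dictionary by hand after deserialization: the serializer never touches those members.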

Good Programming Practice You Have To Apply: Enum

Another good practice when designing an XML format is to use an enum type for any attribute whose value comes from a specific set. In my case, I want the type attribute of the page element to accept only specific values: Unknown, Default, Home, ObjectPage, and GroupPage. We use the XmlEnum attribute to map each XML value to its respective enum member. Check below:

[csharp]
public enum ThemePageType
{
    [XmlEnum("Unknown")]
    Unknown = 0,

    [XmlEnum("Default")]
    Default,

    [XmlEnum("Home")]
    Home,

    [XmlEnum("ObjectPage")]
    ObjectPage,

    [XmlEnum("GroupPage")]
    GroupPage
}

public class ThemePage
{
    [XmlAttribute("title")]
    public string Title { get; set; }

    [XmlAttribute("identifier")]
    public string Identifier { get; set; }

    [XmlAttribute("type")]
    public ThemePageType Type { get; set; }

    [XmlAttribute("page")]
    public string Page { get; set; }
}
[/csharp]

Note that I changed the type of the Type property in the ThemePage class from string to ThemePageType. XmlSerializer will automatically convert the string value read from the XML file into the respective enum member. Using an enum is a very good way to create a variable with a specific list of values, rather than using a plain string. Comparing enum values is more type-safe than comparing strings. See below:

[csharp]
// Assume we get a ThemePage instance from somewhere over the rainbow
ThemePage themePage = GetThemePage();

// Using the string approach (when Type is declared as string)
if (themePage.Type.Equals("Home"))
{
    // Do something
}

// Using the enum approach (when Type is declared as ThemePageType)
if (themePage.Type == ThemePageType.Home)
{
    // Do something
}
[/csharp]

With an enum, the compiler can check whether you are specifying a correct value. With a string, such an error will not be detected by the compiler and can cause a logic error. See the example below:

[csharp]
// Assume we get a ThemePage instance from somewhere over the rainbow
ThemePage themePage = GetThemePage();

// Using the string approach
if (themePage.Type.Equals("Homey"))
{
    // The compiler does not care about this, because it is a valid
    // statement, even though in reality a ThemePage will never
    // have the value 'Homey'
}

// Using the enum approach
if (themePage.Type == ThemePageType.Homey)
{
    // The compiler raises a compile error because Homey is not
    // defined in the enum declaration
}
[/csharp]

Furthermore, when you use an enum, XmlSerializer also checks the validity of the value specified in the XML file. For example, suppose the XML file contains an invalid type attribute like the one below:

[xml highlight="5"]
<?xml version="1.0" encoding="utf-8" ?>
<theme>
  <title>Heliosky</title>
  <pages>
    <page title="Home Page" type="Homey" identifier="home" page="Home.aspx" />
    <page title="Default Content" type="Default" identifier="default" page="Default.aspx" />
  </pages>
</theme>
[/xml]

The XmlSerializer will throw an InvalidOperationException reporting an error in the XML document, because the file contains a value that is not defined in the enum. This ensures that the XML file you load also follows the format you defined.
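A loader can catch that exception and turn it into a friendlier result. The sketch below assumes a simplified setup: the ThemePage class is reduced to the type attribute and marked with XmlRoot("page") so that a single page element can be parsed on its own, and TryParsePage and EnumValidationDemo are hypothetical names.

[csharp]
using System;
using System.IO;
using System.Xml.Serialization;

public enum ThemePageType
{
    [XmlEnum("Unknown")]
    Unknown = 0,

    [XmlEnum("Default")]
    Default,

    [XmlEnum("Home")]
    Home
}

// Reduced version of ThemePage; XmlRoot is added only so this sketch
// can deserialize a lone <page> element
[XmlRoot("page")]
public class ThemePage
{
    [XmlAttribute("type")]
    public ThemePageType Type { get; set; }
}

public static class EnumValidationDemo
{
    // Returns false instead of throwing when the XML cannot be mapped
    public static bool TryParsePage(string xml, out ThemePage page)
    {
        page = null;
        var serializer = new XmlSerializer(typeof(ThemePage));
        try
        {
            using (var reader = new StringReader(xml))
            {
                page = (ThemePage)serializer.Deserialize(reader);
            }
            return true;
        }
        catch (InvalidOperationException)
        {
            // Thrown for values that do not match any XmlEnum name
            return false;
        }
    }

    public static void Main()
    {
        ThemePage page;
        Console.WriteLine(TryParsePage("<page type=\"Home\" />", out page));
        Console.WriteLine(TryParsePage("<page type=\"Homey\" />", out page));
    }
}
[/csharp]

In a real loader you would probably log the exception, including its InnerException, before returning, since that is where the serializer reports the details of what went wrong.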

Verdict

So basically you can get creative and design classes that suit your needs. This approach is also more object-oriented, which makes an application more structured and robust.

And a little nagging: I really hate it when programmers use strings to represent a value from a fixed list. Strings should only store free text, not list values. Using an enum, or an agreed specification of integer values, makes the program less prone to logic errors. I have met programmers who ignore this practice, and yes, it causes a lot of trouble in the end: wrong strings stored in the data, invalid string comparisons, case-sensitivity problems, and so on. So please be a good programmer and follow best practices.
