FlashRSS Reader Pg.6

source: http://www.thegoldenmean.com

6 — Storing Data in Arrays

Version One: Using Recursion and Arrays

Restating The Objective:

We want to present data using a TextArea component in a Flash Movie, formatted with HTML tags and structured like this:

<headline><a href="LINK">TITLE</a></headline><p>DESCRIPTION</p>

(where the words in all caps above are the actual content from an RSS document). There might be three or thirty entries. The script shouldn’t care. We simply want to target the <item> nodes and from them extract the text data contained in their <title>, <link> and <description> child nodes.

The First Pass

Long ago I read some code posted by Peter Hall for dealing with XML. It was well beyond me at the time, but I tucked it away for the day when it would make sense. This project finally forced me to sit down and study the code until it did make sense. It is a great example of using a recursive function to walk the XML node hierarchy. Mr. Hall’s original example was an ActionScript 1.0 prototype; I modified it to work as an ActionScript 2.0 method. As the headline suggests, we make one pass through the XML document to harvest the <item> nodes and push them into an array. The method is as follows:

   private function getNodes(node:XMLNode, name:String):Array {
      var nodes:Array = new Array();
      var c:XMLNode = node.firstChild;
      while (c) {
         if (c.nodeType != 3) {
            if (c.nodeName == name) {
               nodes.push(c);
            }
            nodes = nodes.concat(getNodes(c, name));
         }
         c = c.nextSibling;
      }
      return nodes;
   }

There is a lot of power in that brief block of code! We will examine it line by line in a moment, but the overall picture of what it does is this: it travels the length of the document’s first node branch to its very end, collecting and storing any nodes that match “name” as it goes. When it reaches the end of one branch, it starts again with the next and so on until there are no more branches to explore. Fortunately most RSS documents have a relatively simple structure, but this recursive approach is capable of scouring even the most convoluted structure.

Notice that the method expects to have two arguments passed to it: an XML or an XMLNode Object to search in, and a string to search for. Let’s say we store our XML Object in a variable we call “_xml”, and we want to find all of the “item” nodes. We would invoke this method by writing:
   getNodes(_xml, "item");
That says: look in "_xml" for matches to a node with the name “item”.
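To see the recursion at work outside of Flash, here is a minimal sketch of the same harvesting technique in modern JavaScript (not the article’s ActionScript 2.0). Flash’s XMLNode is modeled as plain objects carrying nodeName, nodeType, firstChild and nextSibling properties; the makeNode helper and the sample feed tree are assumptions for the demo, not part of the original project.

```javascript
// Helper (an assumption for this demo): build a node whose children are
// linked together via firstChild/nextSibling, like Flash's XMLNode.
function makeNode(name, children) {
  var node = { nodeName: name, nodeType: 1, firstChild: null, nextSibling: null };
  var prev = null;
  (children || []).forEach(function (child) {
    if (prev) { prev.nextSibling = child; } else { node.firstChild = child; }
    prev = child;
  });
  return node;
}

// The same recursive harvest: walk every branch, collecting matches.
function getNodes(node, name) {
  var nodes = [];
  var c = node.firstChild;
  while (c) {
    if (c.nodeType !== 3) {                      // skip text nodes
      if (c.nodeName === name) {
        nodes.push(c);                           // collect a match
      }
      nodes = nodes.concat(getNodes(c, name));   // descend into this branch
    }
    c = c.nextSibling;
  }
  return nodes;
}

// Both <item> nodes are found even though they sit a level down:
var root = makeNode("rss", [
  makeNode("channel", [makeNode("item"), makeNode("item")])
]);
var items = getNodes(root, "item");  // items.length is 2
```

Because the function concatenates the result of calling itself on every non-text child, it does not matter how deeply the matching nodes are buried.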

Let’s examine what it does, line by line:

- An empty array, “nodes”, is created to hold any matches.
- “c” is set to the first child of the node passed in.
- The while loop visits each sibling in turn, ending when “c” runs off the end of the child list.
- Text nodes (nodeType 3) are ignored; they cannot contain the element nodes we are after.
- If a child’s name matches “name”, that child is pushed into the array.
- Match or not, the method then calls itself on that child, concatenating whatever the deeper search turns up.
- When the siblings are exhausted, the accumulated array is returned.

I just think that is pretty amazing. With one small block of code we end up with an array consisting of all the nodes named “item”. We are far from done at this point, however. Now we have to make a second pass to extract the content from the title, link and description nodes. The method for this task is very similar to the one we just examined.

The Second Pass

The first function has found all the <item> nodes and inserted them into an array, but each of those array elements is itself a fairly complex XML construct containing the three child nodes we are actually interested in, plus quite a bit more. To present the information we still need to pull the text data out of those nodes. For this purpose I wrote another method, “extractContent()”. It is a modified version of getNodes(): like getNodes(), it searches the source node recursively for a match to "name", but when it finds a match it returns the text content of that node rather than the node itself. I don’t feel the need to go line by line since it is so similar to the previous method. Here it is:

  private function extractContent(source:XMLNode, name:String):String {
    var nodeTxt:String = "";
    var c:XMLNode = source.firstChild;
    while (c) {
      if (c.nodeType != 3) {
        if (c.nodeName == name) {
          nodeTxt = c.firstChild.nodeValue;
        }
        nodeTxt += extractContent(c, name);
      }
      c = c.nextSibling;
    }
    return nodeTxt;
  }

To use this, we define three new arrays to hold the specific content. The extractContent() method is invoked three times on each <item> element: once for each thing we want to get. (Since we want title, link and description, we call it three times.) Note that in this method we get the actual text rather than a node (using c.firstChild.nodeValue).
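Here is the same second-pass technique sketched in modern JavaScript (the article’s code is ActionScript 2.0), reusing the plain-object stand-in for XMLNode. Text children are nodeType 3 with a nodeValue, matching the Flash model; the sample nodes below are assumptions for the demo.

```javascript
// Recursively search "source" for a node named "name" and return its text.
function extractContent(source, name) {
  var nodeTxt = "";
  var c = source.firstChild;
  while (c) {
    if (c.nodeType !== 3) {
      if (c.nodeName === name) {
        nodeTxt = c.firstChild.nodeValue;   // the text, not the node
      }
      nodeTxt += extractContent(c, name);   // keep searching deeper
    }
    c = c.nextSibling;
  }
  return nodeTxt;
}

// One <item> holding <title>Hello</title>, built by hand for the demo:
var text  = { nodeType: 3, nodeValue: "Hello", firstChild: null, nextSibling: null };
var title = { nodeName: "title", nodeType: 1, firstChild: text, nextSibling: null };
var item  = { nodeName: "item",  nodeType: 1, firstChild: title, nextSibling: null };
var result = extractContent(item, "title");  // "Hello"
```

When the recursion descends into the matched <title> node itself, its only child is a text node and is skipped, so the deeper call contributes nothing and the extracted text comes back unchanged.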


If your head is swimming at this point, here is a summary of how this approach attacks the problem:

There are a total of four arrays: one to hold the <item> nodes, and one each to hold the text contents of the <title>, <link> and <description> child nodes. In the first pass, getNodes() travels the entire document searching for <item> nodes and adding them to an array. In the second pass extractContent() is called on each of those <item> Objects as many times as there are things we are interested in, adding that extracted text content to the other arrays.

Once that second pass has completed it is easy to build the output string which is passed on to the TextArea component for your site’s visitor to read.
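The assembly step can be sketched as follows in modern JavaScript, assuming the second pass has left us with three parallel arrays. The function and array names here are illustrative, not taken from the article’s project files.

```javascript
// Build the HTML string for the TextArea from three parallel arrays
// (one entry per <item>), using the article's target format:
// <headline><a href="LINK">TITLE</a></headline><p>DESCRIPTION</p>
function buildOutput(titles, links, descriptions) {
  var output = "";
  for (var i = 0; i < titles.length; i++) {
    output += '<headline><a href="' + links[i] + '">' + titles[i] +
              '</a></headline><p>' + descriptions[i] + '</p>';
  }
  return output;
}

// Hypothetical sample data standing in for a parsed feed:
var html = buildOutput(
  ["Sample Headline"],
  ["http://example.com/story"],
  ["A short summary."]
);
```

The loop runs once per entry, so three items or thirty items make no difference, just as the objective demanded.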

Of course there is more to making a Flash movie than this parsing engine. The project files download contains a fully functioning RSS reader based on this code, which should make perfect sense to you now.

This approach works, and works quite well. It certainly works far better than my first inflexible, rules-based approach. We could stop here and be pleased with ourselves, but it bothered me that the data had to be evaluated multiple times in order to get at what we want, and it bothered me that four arrays were left over, storing redundant information. Was there a way to get the content we want in one pass, without storing it in arrays along the way? The next page shows a way to do just that.
