Google's treasure hunt - task 2

I don't normally bother with (or have time for) the various programming competitions on the net. This one caught my eye, however, because the Tango implementation turned out fairly short and elegant. (The solution that prompted my attention was in F# and wasn't particularly readable IMO, but then I'm not really into functional programming.)

The task (see here to get your instance) is to process a directory tree (provided by Google in a zip file) and sum up the values on certain lines of certain files.

My instance told me to sum the values on line 5 in files with extension ".js" and with "BCD" somewhere in the path, and to multiply that by the sum of the values on line 1 in files with extension ".txt" and with "zzz" somewhere in the path. Empty lines should not be counted (I probably misunderstood something, as an empty line would only have contributed zero to the sum in any case, but I got the correct result according to Google).


module googlehunt;

import tango.io.vfs.ZipFolder;
import tango.text.Util;
import tango.io.Stdout;
import tango.text.stream.LineIterator;
import tango.text.convert.Integer;

void main(char[][] args)
{
    if (args.length < 2)
        return;

    // Mount the zip archive as a VFS folder and grab its complete subtree.
    auto archive = new ZipFolder(args[1]);
    auto info = archive.tree;

    // Sums the value on line ln (zero-based) of every matching file.
    uint countLines(char[] ext, char[] pattern, int ln) {
        // Accept files whose path or name contains pattern and whose
        // name ends with ext.
        bool googleFilter(VfsInfo entry) {
            if (entry.path.containsPattern(pattern) ||
                entry.name.containsPattern(pattern))
                // The length check keeps the slice in bounds for short names.
                if (entry.name.length >= ext.length &&
                    entry.name[$-ext.length..$] == ext)
                    return true;
            return false;
        }

        uint sum;
        foreach (file; info.catalog(&googleFilter)) {
            foreach (idx, line; new LineIterator!(char) (file.input))
                if (idx == ln)
                    if (line.length > 0)
                        sum += toInt(line);
        }
        return sum;
    }

    uint sum1, sum2;
    sum1 = countLines(".js", "BCD", 4);   // line 5
    sum2 = countLines(".txt", "zzz", 0);  // line 1
    Stdout ("Sum 1 = ")(sum1)(", sum 2 = ")(sum2).newline;
    Stdout ("Result is: ")(sum1 * sum2).newline;
}
 

Ok, analysis. I couldn't be bothered to unzip the archive, so I just mounted it in the Tango VFS (that is the ZipFolder part) and then extracted the information for the complete subtree.

Inside the countLines function, I foreach over the entries in the tree, filtered by the nested googleFilter function (note that I'm using the generic VFS interfaces here, even though this particular folder is a zip file). Then I take the input stream of each file matching the filter and put it through the line iterator, summing the contents of the relevant lines. The line index is zero-based, which is why lines 5 and 1 are passed as 4 and 0.
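To make the filter rule a bit more tangible, here is a minimal standalone sketch of the same test that googleFilter performs, working on plain strings instead of VFS entries. The helper names hasExtension and matches, as well as the sample paths, are made up for illustration and are not part of Tango or of the program above; the only Tango calls assumed are containsPattern from tango.text.Util and Stdout.formatln.

```d
module filterdemo;

import tango.io.Stdout;
import tango.text.Util;

// Hypothetical helper: true when name ends with ext.
// The length check keeps the slice in bounds for names shorter than ext.
bool hasExtension(char[] name, char[] ext)
{
    return name.length >= ext.length && name[$ - ext.length .. $] == ext;
}

// Hypothetical helper mirroring the filter rule: the pattern must occur
// in the path or in the name, and the name must end with ext.
bool matches(char[] path, char[] name, char[] ext, char[] pattern)
{
    return (path.containsPattern(pattern) || name.containsPattern(pattern))
        && hasExtension(name, ext);
}

void main()
{
    // Made-up sample entries, just to exercise the rule.
    Stdout.formatln("{}", matches("foo/BCD/bar", "a.js", ".js", "BCD"));
    Stdout.formatln("{}", matches("foo/bar", "aBCDb.js", ".js", "BCD"));
    Stdout.formatln("{}", matches("foo/BCD/bar", "a.txt", ".js", "BCD"));
    Stdout.formatln("{}", matches("foo/BCD/bar", "js", ".js", "BCD"));
}
```

The last call is the interesting one: the name is shorter than the extension, so without the length check the slice would start before the beginning of the array.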

Ok, thanks for your patience - with some luck the VFS will be extended with FTP in the not so distant future.

For reference, here is the blog entry that prompted my attempt
