Google's treasure hunt - task 2

I don't normally bother with (or have time for) the various programming competitions on the net. This one caught my eye, however, as I find the Tango implementation fairly short and elegant (the one that prompted my attention was in F# and wasn't particularly readable IMO, but then I'm not really into functional programming).

The task (see here to get your instance) is to process a directory tree (provided by Google in a zip file) and sum up values on certain lines of certain files.

My instance told me to sum the values on line 5 in files with extension ".js" and with "BCD" somewhere in the path, and multiply that by the sum of the values on line 1 in files with extension ".txt" and with "zzz" somewhere in the path. Empty lines should not be counted (I probably misunderstood something, as an empty line would only have contributed zero to the sum in any case, but I got the correct result according to Google).


module googlehunt;

import tango.io.vfs.ZipFolder;
import tango.text.Util;
import tango.io.Stdout;
import tango.text.stream.LineIterator;
import tango.text.convert.Integer;

void main(char[][] args)
{
    // Expects the path to the zip file as the first argument.
    if (args.length < 2)
        return;

    // Mount the zip as a VFS folder and grab its complete subtree.
    auto archive = new ZipFolder(args[1]);
    auto info = archive.tree;

    // Despite the name, this sums the value found on (zero-based) line ln
    // of every file that has extension ext and pattern in its path or name.
    uint countLines(char[] ext, char[] pattern, int ln) {

        bool googleFilter(VfsInfo entry) {
            if (entry.path.containsPattern(pattern) ||
                entry.name.containsPattern(pattern))
                // The length check keeps the slice in bounds for short names.
                if (entry.name.length >= ext.length &&
                    entry.name[$-ext.length..$] == ext)
                    return true;

            return false;
        }

        uint sum;
        foreach (file; info.catalog(&googleFilter)) {
            foreach (idx, line; new LineIterator!(char) (file.input))
                if (idx == ln)
                    if (line.length > 0)
                        sum += toInt(line);
        }

        return sum;
    }

    uint sum1, sum2;
    sum1 = countLines(".js", "BCD", 4);   // line 5
    sum2 = countLines(".txt", "zzz", 0);  // line 1

    Stdout ("Sum 1 = ")(sum1)(", sum 2 = ")(sum2).newline;
    Stdout ("Result is: ")(sum1 * sum2).newline;
}

Ok, analysis. I couldn't be bothered to unzip the archive, so I just mounted it in the Tango VFS (that is the ZipFolder part) and then extracted the information for the complete subtree.

Inside the countLines function, I foreach over the entries in the tree, filtered by the nested googleFilter function (note that I'm using the generic VFS interfaces here, even if this particular folder is a zip file). Then I take the input stream from each file matching the filter and put it through the line iterator, summing the contents of the relevant lines. The line numbers passed in are zero-based, so line 5 becomes 4 and line 1 becomes 0.
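To make the two rules easier to poke at without a zip file, here is a minimal standalone sketch of them. Note the assumptions: it is written against D2 and Phobos rather than Tango, the names matchesEntry and lineValue are made up for this illustration, and it works on a plain path string and on file contents already held in memory instead of on VFS entries.

```d
import std.algorithm.searching : canFind, endsWith;
import std.conv : to;
import std.string : splitLines, strip;

/// Hypothetical helper: true when the entry ends in `ext` and `pattern`
/// occurs anywhere in the full path (directories plus file name).
bool matchesEntry(string fullPath, string ext, string pattern)
{
    return fullPath.endsWith(ext) && fullPath.canFind(pattern);
}

/// Hypothetical helper: numeric value on zero-based line `ln` of `text`.
/// A missing or empty line contributes 0.
uint lineValue(string text, size_t ln)
{
    auto lines = text.splitLines();
    if (ln >= lines.length)
        return 0;
    auto line = lines[ln].strip();
    return line.length > 0 ? line.to!uint : 0;
}

void main()
{
    // The pattern may sit in a directory name or in the file name itself.
    assert(matchesEntry("foo/BCD/bar/a.js", ".js", "BCD"));
    assert(matchesEntry("foo/xBCDy.js", ".js", "BCD"));
    // Wrong extension, or a name shorter than the extension: no match.
    assert(!matchesEntry("foo/BCD/a.txt", ".js", "BCD"));
    assert(!matchesEntry("js", ".js", "BCD"));

    assert(lineValue("7\n\n42\n", 2) == 42); // third line
    assert(lineValue("7\n\n42\n", 1) == 0);  // empty line adds nothing
    assert(lineValue("7\n", 4) == 0);        // file too short
}
```

Checking the whole path rather than just the file name is the point that is easy to miss: a file deep inside a directory called BCD qualifies even when its own name does not contain the pattern.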

Ok, thanks for your patience - with some luck the VFS will be extended with FTP in the not so distant future.

For reference, here is the blog entry that prompted my attempt

path... ookey

Ookey, I was too busy trying to solve the puzzle, so I only searched for files containing the string in the file name, not in the path...
Thanks for the note, now I could solve it easily... ;)
Cheers,

What am I doing wrong?

Hey,
Here is how I understand the question: http://img233.imageshack.us/img233/6962/googletreasurehuntziphowy2.png
It is obviously wrong, but where? I can't figure out the answer... :(
Cheers,

If that is a complete

If that is a complete "example", then you are probably missing files that have the matching text earlier in the path (note: path or name).