2011-02-15

Fast Unique Files filter for EnCase

Related to my Fast Hash Matching post from November and Lance's original post, here's the code for an EnCase filter for the Entries view that shows only the first occurrence of each file by hash value:

include "GSI_Basic"

class MainClass {
  NameListClass  HashList;
  BinaryTreeClass Tree;

  MainClass() :
    HashList(),
    Tree(HashList, new NodeStringCompareClass())
  {
    if (SystemClass::CANCEL == SystemClass::Message(
     SystemClass::ICONINFORMATION |
     SystemClass::MBOKCANCEL, "Unique Files By Hash",
     "Note:\nFiles must be hashed prior to running this filter."))
    {
      SystemClass::Exit();
    }
  }

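  // Called once per entry; returning true keeps the entry in the view.
  bool Main(EntryClass entry) {
    HashClass hash = entry.HashValue();
    if (Tree.FastFind(hash)) {
      // This hash is already in the tree, so this entry is a duplicate: hide it.
      return false;
    }
    else {
      // First time this hash has been seen: record it and show the entry.
      Tree.FastInsert(new NameListClass(null, hash), hash);
      return true;
    }
  }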
}

The code in the comments on Lance's blog is close but not quite correct, possibly because the comment form mangled it. You need to hash all files before you run this filter. As discussed in my earlier post, this is by no means the fastest possible way to do this, but someone recently pinged me needing exactly this, so it makes sense to put it up for everyone.
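
For anyone who wants to see the logic outside of EnCase, here is a minimal sketch of the same first-occurrence-by-hash idea in Python rather than EnScript. It is only an illustration under assumptions: a Python set stands in for BinaryTreeClass, the name make_unique_filter is hypothetical, and the hash strings are made-up placeholders rather than real hash values.

```python
def make_unique_filter():
    """Return a predicate that is True only the first time a hash is seen."""
    seen = set()  # stands in for the BinaryTreeClass lookup structure

    def keep(hash_value):
        if hash_value in seen:
            # Duplicate: an entry with this hash was already shown.
            return False
        # First occurrence: remember the hash and show the entry.
        seen.add(hash_value)
        return True

    return keep


# Placeholder (name, hash) pairs; the hashes are illustrative only.
entries = [
    ("a.jpg", "aa"),
    ("b.jpg", "bb"),
    ("copy_of_a.jpg", "aa"),
]

keep = make_unique_filter()
unique_names = [name for name, hash_value in entries if keep(hash_value)]
```

As with the filter, which copy counts as "first" depends purely on the order in which entries are visited.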

This filter is utterly dependent on BinaryTreeClass in GSI_Basic.EnScript, a support file that comes with EnCase and can be found at Program Files\EnCase 6\EnScript\Include\GSI_Basic.EnScript.

I have also put up an ini file of the filter that you can import directly into EnCase (right-click in the filter tree, choose Import...), available here on Google Docs.

1 comment:

  1. Hi there, is it possible to have this work for the bookmark view? I have a total of 7000 pictures, I've run the EXIF data finder and it has found 1500 pictures with exif data in them. Now I have a total of 8500 pictures. I want to be able to remove the duplicate pictures that didn't have the exif data in them so that my total images returns to 7000, where 5500 of them do not have exif data.
