Feature Article


The Five Golden W's of Investigation


         By Johannes C. Scholtes, Chairman and CSO, ZyLAB

In every investigation, whether it is a criminal or legal investigation, there are five golden Ws that the investigator must answer in order to be successful. These are:

  1. WHO is it about?
  2. WHAT happened?
  3. WHEN did it take place?
  4. WHERE did it take place?
  5. WHY did it happen?

Some might even consider two other pertinent questions:

  6. HOW did it happen?
  7. HOW MUCH or HOW OFTEN?

How do you sift through the vast volume of data to find these answers? Manually analyzing the data is very time-consuming and requires a lot of resources, because you do not know exactly what you are looking for. What words should you search for to find the “smoking gun”? How do you find patterns among the words? Do you highlight the words as you manually review each document, then go back and see how they are connected? It’s not that simple. Criminals use aliases, and transfers may be made by unknown offshore companies or through unknown bank accounts. All of this complicates and slows down the investigation.


In addition, the volume and complexity of the electronic data that needs to be investigated continue to grow, exacerbating the problem. Of course, there are technologies that can help expedite the investigation. Computers can analyze large data sets for specific patterns at tremendous speed. In combination with other technological advances, including text mining, computational linguistics, statistics, machine learning and even artificial intelligence, it becomes much easier to focus the analysis on finding the five Golden W's.


Modern text mining and content analytics can search at a higher level than just keywords. For example, text mining can identify linguistic patterns such as ‘someone pays someone else’ or ‘someone meets someone else at a certain location and at a certain time’ without knowing the exact names or amounts in advance. By combining such extracted patterns with simple statistics, one can more easily identify previously unknown persons, companies and bank account numbers, and also spot code names and aliases.
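To make the idea of pattern-based search concrete, here is a minimal sketch in Python. It is only an illustration and not how any particular product works: the regular expression is a crude stand-in for real linguistic analysis, and the sample sentences, the names in them, and the helper functions `extract_payments` and `count_parties` are all hypothetical. It looks for the pattern "someone pays someone else" without being told any names up front, then uses a simple count to show which parties turn up most often.

```python
import re
from collections import Counter

# Assumption: a "name" is one or more capitalized words. Real text mining
# would rely on linguistic analysis rather than this crude heuristic.
NAME = r"[A-Z][a-z]+(?: [A-Z][a-z]+)*"

# Pattern "someone pays someone else": payer, payment verb, amount, payee.
PAYMENT = re.compile(
    rf"(?P<payer>{NAME}) (?:pays|paid|wired|transferred) "
    rf"(?P<amount>\$?\d[\d,]*) to (?P<payee>{NAME})"
)

# Hypothetical sample documents, invented for this illustration.
DOCS = [
    "Falcon wired $25,000 to Orion Holdings on Monday.",
    "Peter Jansen paid $4,000 to Falcon last week.",
    "Falcon transferred $90,000 to Orion Holdings in March.",
]


def extract_payments(texts):
    """Return (payer, amount, payee) for every payment pattern found."""
    found = []
    for text in texts:
        for m in PAYMENT.finditer(text):
            found.append((m.group("payer"), m.group("amount"), m.group("payee")))
    return found


def count_parties(payments):
    """Count how often each party appears as payer or payee."""
    counts = Counter()
    for payer, _amount, payee in payments:
        counts[payer] += 1
        counts[payee] += 1
    return counts


if __name__ == "__main__":
    payments = extract_payments(DOCS)
    for payer, amount, payee in payments:
        print(f"{payer} -> {payee}: {amount}")
    print(count_parties(payments).most_common())
```

In this toy data set, a name that appears in many payment patterns but nowhere else in the case file would be a natural candidate for closer review as a possible alias or code name.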

Criminals will try to cover up illegal activities by hiding information in non-searchable file formats or by embedding different types of electronic objects within complex compound files, where the most relevant information is often hidden in the deepest layers. Your solution needs to find information even in those deepest layers and be able to search seemingly unsearchable formats such as bitmaps, images, non-searchable PDFs, audio files or even video. By combining text mining with advanced analytics, relevant information can be identified far faster and more efficiently than humans could manage on their own. Investigators can then validate what was found, which helps prevent so-called tunnel vision and exposes invalid evidence or unproductive lines of investigation.

Over the years, I have seen many real-life cases where this hybrid man-machine approach has identified twice the amount of relevant information with half the resources in half the time! This is a great example where Big Data analytics can lead to Big Savings!



Johannes C. Scholtes, Ph.D. is chairman and chief strategy officer of ZyLAB. Scholtes, who was the company’s president and CEO from 1989 to 2009, shaped ZyLAB as an information management powerhouse in countries across the globe. His leadership and vision led to ZyLAB being selected for historic and high-profile engagements including the United Nations War Crime Tribunals, FBI-Enron investigations, and the United States White House Executive Office of the President.

Before joining ZyLAB, Scholtes was a lieutenant in the intelligence department of the Royal Dutch Navy. Scholtes holds a Master of Science degree in Computer Science from Delft University of Technology and a Ph.D. in Computational Linguistics from the University of Amsterdam. Since 2008, he has held the Extraordinary Chair in Text Mining in the Department of Knowledge Engineering at the University of Maastricht. In January 2010, Scholtes joined the board of directors of the Association for Information and Image Management (AIIM), the worldwide leading authority for standards and education in Enterprise Information Management.

