
Data Collection 101


Data collection and correlation can be a fraught undertaking. What metadata should you extract? What tools and technologies do you have in place to collect and report on your data? How do you organize and categorize the data? Where do you start grouping data into collections? Here’s the rundown.

Sensitive unstructured data doesn’t care where it lives, and neither do the hackers or insider threats trying to find and use it. When it comes to securing data, organizations tend to focus on the big network-attached devices, file servers or document management systems such as SharePoint, yet ignore desktops, laptops, print servers, or even critical Windows application servers. Your most sensitive, critical data can live on any of those devices. You can be assured that the folks attempting to steal PII and other sensitive data are not ignoring these devices, and neither should you.

Scan every hard drive. Regardless of location, device type or size, every hard drive should be scanned, analyzed and, if required, remediated. You simply don’t have the luxury of assuming a specific device will not hold data that could pose a threat if it were stolen or exposed. Case in point: the fact that an application server has little storage does not make it secure. A single Excel file or PDF containing thousands of your customers’ Social Security numbers can be only a few kilobytes in size. The same goes for the C: drive. If you add up the disk space of all your desktops, it could come to 10 times the total storage in your data centers. You must collect data from all your unstructured data platforms.
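To make the scanning step concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular product works. The names `SSN_PATTERN`, `count_ssn_like` and `scan_tree` are hypothetical, the pattern only catches numbers written as 123-45-6789, and the code reads files as plain text, so real Excel or PDF files would need a proper parser first.

```python
import os
import re

# Assumption: US Social Security numbers written with dashes (123-45-6789).
# A real scanner would use far more patterns plus validation rules.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def count_ssn_like(text):
    """Return how many SSN-like strings appear in text."""
    return len(SSN_PATTERN.findall(text))


def scan_tree(root, max_chars=5_000_000):
    """Walk every folder under root and return {path: match_count}.

    Only files with at least one match are reported. Files that
    cannot be opened are skipped rather than stopping the scan.
    """
    findings = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Read as text and ignore undecodable bytes (an assumption
                # that keeps the sketch short; binary formats need parsers).
                with open(path, "r", encoding="utf-8", errors="ignore") as fh:
                    text = fh.read(max_chars)
            except OSError:
                continue
            hits = count_ssn_like(text)
            if hits:
                findings[path] = hits
    return findings
```

Calling `scan_tree` on a desktop’s user folder and on a file share would return the same kind of result, which is the point: the device type does not matter, only what is in the files.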

Logically organize your data. Once you know what data you have, you need to find a way to organize it. This is where data tools take the spotlight. You’ll need to figure out what you have at your disposal and what you need to build or outsource. Data governance tools help you collect and analyze relevant information from a variety of sources that gives context to the data. Your initial set of reports should provide a high-level overview of what exists within the in-scope data set. With reporting in hand, you’ll be able to organize your data into meaningful collections, assign ownership to these collections and slice and dice the data into categories that mirror how your organization manages it, e.g. by region or department.
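As a rough sketch of what grouping into collections and assigning owners could look like, the Python below takes a list of scan findings and rolls them up by a chosen category. Everything here is assumed for illustration: the function names, the record fields (`department`, `hits`) and the owner mapping are hypothetical, not the format of any specific governance tool.

```python
from collections import defaultdict


def build_collections(findings, key):
    """Group finding records by a category such as 'department' or 'region'.

    Records missing the category land in an 'unassigned' collection so
    nothing silently drops out of the report.
    """
    collections = defaultdict(list)
    for item in findings:
        collections[item.get(key, "unassigned")].append(item)
    return dict(collections)


def summarize(collections, owners):
    """Build a high-level report: one row per collection, with its owner."""
    report = []
    for name, items in sorted(collections.items()):
        report.append({
            "collection": name,
            "owner": owners.get(name, "unassigned"),
            "files": len(items),
            "sensitive_hits": sum(i["hits"] for i in items),
        })
    return report
```

Passing `"region"` instead of `"department"` as the key slices the same findings a different way, without rescanning anything.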

Integrate to correlate. Another key facet of data collection is ingesting and normalizing data from disparate systems. That means having tools in place that can integrate with other systems through third-party connectors: think of correlating information from your CMDBs and HR systems and pulling in contextual data from your DLP and SIEM platforms. You want to add value to the software solutions you have already invested in. Integration capabilities provide increased visibility into the information needed to decrease risk and manage resources.
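Here is one way the normalize-and-correlate idea might be sketched in Python. It assumes the CMDB and HR exports are already loaded as lists of dictionaries, and the field names (`hostname`, `assigned_user`, `username`, `department`) are made up for the example. Real connectors and schemas will differ.

```python
def normalize_host(name):
    """Lowercase a hostname and drop any domain suffix so records match."""
    return name.strip().lower().split(".")[0]


def correlate(findings, cmdb, hr):
    """Enrich each scan finding with asset and department context.

    findings: records with a 'host' field (hypothetical scan output)
    cmdb:     records with 'hostname', 'type', 'assigned_user' (assumed)
    hr:       records with 'username', 'department' (assumed)
    """
    cmdb_by_host = {normalize_host(r["hostname"]): r for r in cmdb}
    hr_by_user = {r["username"].lower(): r for r in hr}

    enriched = []
    for f in findings:
        asset = cmdb_by_host.get(normalize_host(f["host"]), {})
        user = asset.get("assigned_user", "").lower()
        person = hr_by_user.get(user, {})
        enriched.append({
            **f,
            "asset_type": asset.get("type", "unknown"),
            "user": user or "unknown",
            "department": person.get("department", "unknown"),
        })
    return enriched
```

Findings that come back with `unknown` values are worth a look in their own right, since they point to devices or users that your systems of record do not know about.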

If you’re not finding all the sensitive data in your environment and protecting it, you can be pretty sure the bad guys will find it and exploit it.
