Punit Bhatia

Create Data Maps That Cover Structured and Unstructured Data


Perspective


Structured and unstructured data are at the heart of any online business. Internet giants pay large sums to have their important data analyzed so that they can use it to strengthen their marketing and branding and win more customers. From a user’s point of view, this poses a threat to privacy. To ensure data is used ethically and lawfully, it’s important to have a data governance program. And a data governance program can’t exist without data maps.

So, what is a data map?


Data, in technology terms, is a collection of facts and statistics gathered for reference or analysis. When you analyze data, you can gain useful insights, and almost every company now uses data-driven insights to expand its operations. However, for any data analysis to produce reliable results, the data must first be correctly migrated and mapped. This is where the concept of data mapping comes into play.


Data mapping is the process of connecting a data field in one source to a data field in another source. Data sets that describe the same things in different ways can be linked so that the final product is accurate and usable. This reduces the potential for errors, helps standardize your data, and makes it easier to understand.
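As a minimal sketch of this idea, the snippet below renames fields from one source to the names used by another. The field names, the `FIELD_MAP` dictionary, and the `map_record` helper are all hypothetical and chosen only for illustration.

```python
# Hypothetical field map: source field name -> target field name.
FIELD_MAP = {
    "cust_name": "full_name",
    "e_mail": "email",
    "dob": "date_of_birth",
}


def map_record(source_record, field_map):
    """Rename source fields to target fields.

    Fields without a mapping are returned separately so they can be
    reviewed instead of being silently dropped.
    """
    mapped, unmapped = {}, {}
    for field, value in source_record.items():
        if field in field_map:
            mapped[field_map[field]] = value
        else:
            unmapped[field] = value
    return mapped, unmapped


# Illustrative record from a made-up CRM export.
crm_row = {"cust_name": "Ada Lovelace", "e_mail": "ada@example.com", "fax": "n/a"}
mapped, unmapped = map_record(crm_row, FIELD_MAP)
print(mapped)
print(unmapped)
```

Keeping the unmapped fields visible is a design choice: in a privacy context, a field nobody has mapped yet is usually something to investigate, not something to discard.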


What Is the Importance of Data Mapping?


Data mapping enables you to create a single source of truth for business-critical personal and sensitive client data. It can also show you where that knowledge comes from: what actual data records you hold about your customers, which systems hold them, and how those records are related and connected. You can gain a deeper understanding of individual preferences and behaviors by delving into this data at a detailed level.


What’s a Data Governance Program?


The data governance program is the entire foundation for a corporate program's implementation. It establishes communication, implementation, and monitoring processes, as well as a structure for ensuring that policies and best practices are followed. In other words, oversight and control ensure that the program's aims and objectives are consistent with the organization's overall goals and objectives.


Dan Clarke says, “Users are protected by privacy laws because, under a privacy law, I would absolutely have a right to request deletion of my account, under GDPR, under CCPA, or any of them, and this for me highlighted why we need these laws in place in the first place.” Dan has a joint venture between his company, IntraEdge, and Intel: an automation tool and platform for privacy compliance.


When asked about GDPR, Dan explained, “Transparency would be the number one word, because if you think about how many teeth GDPR has in enforcement: yes, we've seen some enforcement actions; yes, we've seen maybe a touch of private action. The fundamental to enabling transparency is having a good governance program. One of the fundamental elements in a privacy program is to establish a data inventory or a record of processing.”


Why should data maps be a core component of any privacy governance program?


Dan explained that data governance and a privacy policy are required to ensure transparency. He further explains, “Basically data governance or privacy governance, and maybe both hand in hand. And then one of the fundamental elements in a privacy program is to establish a data inventory or a record of the processing activity, and with that goes what we call data maps: that is, what data is being collected, where is it flowing, and what's the life cycle of it.”


He added, “The data map is the core and is the most important element of any governance program and certainly a privacy program. We quickly realized that without a good data map you just can't have an effective governance program.”


The data map is the most important piece of a privacy governance program because you have to know where the data was collected, where it went, who you sent it to, which vendors, third parties, and systems are leveraging it, and how sensitive it is. You can only find that out from your data map.
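One way to picture a data map entry is as a small record that answers those questions for each data element. The sketch below is an illustration only: the `DataMapEntry` class, its fields, the three-level sensitivity scale, and the sample entries are assumptions, not a standard or a format described in the conversation.

```python
from dataclasses import dataclass, field


@dataclass
class DataMapEntry:
    """One row of a hypothetical data map."""
    data_element: str                                   # what the data is
    collected_from: str                                 # where it was collected
    systems: list = field(default_factory=list)         # internal systems using it
    third_parties: list = field(default_factory=list)   # vendors it is sent to
    sensitivity: str = "low"                            # illustrative scale: low / medium / high


def shared_sensitive(entries):
    """Return names of high-sensitivity elements that are sent to third parties."""
    return [
        e.data_element
        for e in entries
        if e.sensitivity == "high" and e.third_parties
    ]


# Made-up entries for illustration.
entries = [
    DataMapEntry("email", "signup form", ["CRM"], ["email vendor"], "medium"),
    DataMapEntry("health notes", "support chat", ["ticketing"], ["outsourced support"], "high"),
    DataMapEntry("page views", "website", ["analytics"], [], "low"),
]
print(shared_sensitive(entries))
```

With entries in this shape, questions such as "which sensitive data leaves the organization?" become simple filters over the map.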


Through the data mapping exercise, you can build an independent data map. From that perspective, you can even link it to the marketing, analytics, or digital teams that want to leverage data. They say they want to be a data-oriented company, but to become a data-powered company, you have to know your data.


Understanding Structured And Unstructured Data


What is Structured Data?


Structured data is information that has been predefined and prepared to a specific structure before being stored, a process known as schema-on-write. The relational database is the best example of structured data: the data has been formatted into precisely specified fields, such as credit card numbers or addresses so that SQL can simply query it.


Examples of structured data include any data that is stored in a database: for instance, data stored in a customer relationship management (CRM) system, hotel and ticket reservation data (e.g., dates, prices, destinations), and accounting data.


Structured data is composed of clearly defined data types with patterns that make them easily searchable.
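To make schema-on-write concrete, here is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database. The `customers` table, its columns, and the sample rows are invented for illustration.

```python
import sqlite3

# In-memory database; the schema is fixed before any data is written.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO customers (name, city) VALUES (?, ?)",
    [("Ada", "London"), ("Grace", "New York"), ("Alan", "London")],
)

# Because the fields are precisely defined, SQL can query them directly.
rows = conn.execute(
    "SELECT name FROM customers WHERE city = ? ORDER BY name", ("London",)
).fetchall()
london_names = [row[0] for row in rows]
print(london_names)

# A row that violates the schema (missing name) is rejected at write time.
try:
    conn.execute("INSERT INTO customers (name, city) VALUES (?, ?)", (None, "Paris"))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

The rejected insert is the point of schema-on-write: data that does not fit the predefined structure never reaches storage.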


What is Unstructured Data?


Unstructured data is information that does not follow a data model and has no immediately identifiable structure, making it difficult for a computer program to use. Because unstructured data is not organized in a predetermined fashion and lacks a predetermined data model, it is not a good fit for a traditional relational database. Examples of unstructured data include emails, SharePoint content, and files stored on a network.


Sources of unstructured data include web pages, audio and video files, user comments on blogs and social media sites, written memos and reports, documents, transcripts of customer service calls, and logs.


Storing unstructured data raises several problems. It requires a lot of storage space. Videos, images, and audio are difficult to store. Because the structure is unclear, operations like update, delete, and search are very difficult. Storage cost is high compared to structured data. Indexing unstructured data is difficult.


Automation around that data inventory and data mapping


Besides the obvious difference between storing in a relational database and storing outside of one, the biggest difference between structured and unstructured data is the ease of analysis. This creates a huge challenge in privacy situations such as requests to exercise the right to delete data.


To solve this, it is important to analyze data and create data maps. Data mapping automation enables enterprises to move from manual to automated data mapping processes. It automates data collection, discovers new data, updates records on the fly, and supports an AI-powered Privacy Ops solution.
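As a rough illustration of automated discovery in unstructured text, the sketch below scans free-form documents for email addresses. The regular expression is deliberately simple and will miss some valid addresses; the `find_emails` helper and the sample documents are hypothetical, and this is not how any particular privacy tool works.

```python
import re

# Simplified pattern for illustration; real-world email matching is messier.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_emails(documents):
    """Map each document name to the sorted, de-duplicated emails found in it."""
    hits = {}
    for name, text in documents.items():
        found = sorted(set(EMAIL_RE.findall(text)))
        if found:
            hits[name] = found
    return hits


# Made-up documents standing in for memos, notes, and call transcripts.
docs = {
    "memo.txt": "Please contact ada@example.com or grace@example.org about the refund.",
    "notes.txt": "No personal data here.",
    "call_log.txt": "Caller gave ada@example.com twice: ada@example.com.",
}
hits = find_emails(docs)
print(hits)
```

A result like this could feed a data map: it shows which files hold a given person's data, which is exactly what a deletion request needs to locate.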


A privacy policy gets implemented first, and then automation is put in place. You need a privacy approach and a policy, and then the tool kicks in to help you manage that policy and that structure. You have to know how your company and its regions are structured: are they in a federated structure or a global structure? Based on that, you choose your privacy approach. Is it going to be federated, centralized, or hybrid?


You have to have your strategy and policy combined with the tools, so it's not either/or; it's both.


I think that, more than automation, this needs changes in the law or in the way cookies are managed. At the moment they are a nuisance: everywhere you go there is a pop-up, and the pop-ups are not implemented correctly. There is too much variety. Some are so transparent that they become difficult for customers, and some are hardly transparent at all.


Data Map or a Governance Program: Which One Would You Choose First?


Before making a governance program to protect users’ data on the internet, it’s important to start with data mapping.


Dan explains, “I would choose the data map. Because I feel like governance needs to know where the data starts, and you could argue that you start with the strategy. But I feel like you have to start with that data map. It's practical and it typically reveals a lot of things the organization didn't understand and might actually modify the strategy, so I always recommend starting with the data map.”


He adds, “Data map is mapping the data, but before that, there's the data discovery. You’re getting to know your data, that is, having a data inventory: what data do I have? Where do I collect it? What do I do? Then mapping it.”


In response to a question about choosing between making a data inventory and a data map, Dan points out, “You have to do the inventory first because, for most people, I don't think you know how to map your data without first doing an inventory.”
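The inventory-then-map order can be sketched in a few lines. Everything here is an assumption made for illustration: the inventory format, the flow records, and the `build_data_map` helper are hypothetical.

```python
# Step 1: inventory. What data do we have, and where do we collect it?
inventory = [
    {"element": "email", "collected_at": "signup form"},
    {"element": "order history", "collected_at": "checkout"},
]

# Step 2: observed flows (element, destination). Made-up examples.
flows = [
    ("email", "CRM"),
    ("email", "newsletter vendor"),
    ("order history", "data warehouse"),
    ("phone", "call center"),  # flows, but was never inventoried
]


def build_data_map(inventory, flows):
    """Attach flows to inventoried elements; report flows with no inventory entry."""
    data_map = {
        item["element"]: {"collected_at": item["collected_at"], "flows_to": []}
        for item in inventory
    }
    unknown = []
    for element, destination in flows:
        if element in data_map:
            data_map[element]["flows_to"].append(destination)
        else:
            unknown.append(element)
    return data_map, unknown


data_map, unknown = build_data_map(inventory, flows)
print(data_map)
print(unknown)
```

The `unknown` list shows why the inventory has to come first: a flow for an element that was never inventoried cannot be placed on the map, and it signals a gap in the discovery step.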


Conclusion

Data mapping is critical for any organization looking to take advantage of big data. Data maps help you better understand and navigate your existing structured and unstructured data, enabling you to create predictive models that can drive future business decisions. Creating effective data maps requires analyzing both your structured and unstructured data. So, you must make sure that data mapping is a key step, in fact one of the first steps, in your privacy and data governance program.


About the host


Punit Bhatia works with business and privacy leaders to create an organizational culture with high privacy awareness and compliance as a business priority. Punit is the author of various privacy books including “Be Ready for GDPR” and "AI & Privacy". Punit has been a speaker at over 50 global events. Punit is the creator and host of the FIT4PRIVACY Podcast.


Who is Dan Clarke?


Dan Clarke has 30 years of experience combining technology with media, retail, and business leadership, has held executive leadership roles at Intel, is an experienced data privacy advisor, and is a 9-time CEO. Dan has deep expertise in the privacy landscape and speaks frequently at public venues on the topic. He is also actively involved in Arizona, Texas, and federal privacy legislation.

Listen to the top EU GDPR based privacy podcast...

Stay connected with the views of leading data privacy professionals and business leaders in today's world on a broad range of topics like setting global privacy programs for private sector companies, the role of the Data Protection Officer (DPO), the EU Representative role, Data Protection Impact Assessments (DPIA), Records of Processing Activity (ROPA), security of personal information, data security, personal security, privacy and security overlaps, prevention of personal data breaches, reporting a data breach, securing data transfers, Privacy Shield invalidation, new Standard Contractual Clauses (SCCs), guidelines from the European Commission and other bodies like the European Data Protection Board (EDPB), implementing regulations and laws (like CCPA, EU GDPR, Chinese Data Protection Bill, India's Personal Data Protection Bill), different types of solutions, even new laws and legal framework(s) to comply with a privacy law, and much more.