By Dr. Fabian Ibel
In the context of increasing digitization and growing data volumes, companies are well advised to engage in inventorying, organizing and categorizing their data collections. There are two main reasons for this. Firstly, implementing such measures is driven by regulatory and contractual requirements, including the GDPR, the Trade Secrets Act (Geschäftsgeheimnisschutzgesetz), and Non-Disclosure Agreements (NDAs). Secondly, beyond legal compliance, data classification serves a corporation’s best interest – particularly if it aims to expand its use of Artificial Intelligence (AI). Drawing exclusively on German and European law, this article highlights the growing importance of data categorization.
For the purpose of this article, categorization refers to, among other things:
- Maintaining clean and logical folder and data structures
- Ensuring consistent and meaningful data naming conventions
- Gaining reliable knowledge of where data is stored, for example:
  - Basic folder structures managed by Active Directory
  - Relational databases (e.g. SAP)
  - Cloud storage, servers, and similar platforms
- Assigning data to relevant projects and authorized personnel
- Classifying data according to appropriate categories, such as form, priority, sensitivity, and more
Data Categorizing as a Legal Obligation
(1) “Classified Material” under German Security Legislation
One of the most well-known methods of classifying data is the categorization of classified material (Verschlusssachen – VS) into levels such as “For Official Use Only,” “Secret,” and “Top Secret.” This system originates from Germany’s Security Clearance Act (Sicherheitsüberprüfungsgesetz – SÜG), which governs security screenings for individuals handling sensitive information.
However, it’s not only government agencies that must follow these rules. Private companies working with government contracts, intelligence services, or the military may also be required to comply with strict security protocols when dealing with classified data.
In the United States, where the Defense Industrial Base is even larger, these distinctions go even further. There, classified information, controlled unclassified information (CUI), and federal contract information are critical categories in security procurement, ensuring that sensitive data stays in the right hands. As the author outlines, such categorization has a significant impact on the IT security architecture of contracting parties, with standards and requirements such as Cybersecurity Maturity Model Certification based on NIST SP 800-171 playing a key role.
(2) Obligations under Non-Disclosure Agreements
In addition, categorizing data based on its sensitivity becomes especially important when research- and technology-driven companies and institutions seek to collaborate or engage in joint research projects that involve the exchange of confidential information. These collaborations are typically governed by Non-Disclosure Agreements (NDAs).
Most NDAs include clauses requiring the parties to implement safeguards against unauthorized access, clearly separate confidential data from other information, label data as confidential, and return it upon completion of the cooperation. A robust data management system is essential for meeting these obligations under an NDA.
(3) Complementary Safeguards under the German Trade Secrets Act
Such a system is equally vital in the context of the Trade Secrets Act (Geschäftsgeheimnisschutzgesetz), which not only complements but can also go beyond the confidentiality safeguards provided by an NDA. The Trade Secrets Act imposes penalties and fines for violations, making it a key legal instrument for protecting sensitive business information. However, claims under this act can only be enforced if the affected party has already implemented appropriate measures to prevent unauthorized access to or misappropriation of their trade secrets.
(4) Legal Requirements under the GDPR
Certainly, a strong impetus for entities to engage in data categorization comes from data protection legislation. Data controllers must identify special categories of personal data as defined in Article 9 of the GDPR – such as data revealing racial or ethnic origin, political opinions, or religious or philosophical beliefs – including, for example, HR-related data. These categories require a higher degree of protection and awareness.
The processing of not only special categories of personal data but all personal data must adhere to the fundamental principles outlined in Article 5 of the GDPR.
Of particular relevance are the principles of data minimization (Article 5(1)(c) GDPR) and storage limitation (Article 5(1)(e) GDPR), both of which are also reflected in the right to erasure (Article 17 GDPR), often referred to as the "right to be forgotten."
Under the GDPR, data controllers are generally required to delete personal data unless there is a legal justification for retaining it (Article 17 GDPR). Common exceptions include legal retention obligations – such as the requirement under German law to retain business correspondence (e.g., commercial letters), typically for six (6) years – or the need to retain data in anticipation of potential legal disputes, in line with applicable statutes of limitations.
In recent years, data protection authorities have increasingly imposed fines for non-compliance with data deletion obligations. Violations have included the absence of sufficient deletion policies, lack of a legal basis for continued storage, excessive retention periods, and breaches of Article 5(1)(e) GDPR. Notable cases include a €900,000 fine imposed on Credit Collection Service in 2024 for deletion violations, and a €250,000 fine issued by the French data protection authority against a provider of virtual communication services.
Companies are strongly advised to implement a comprehensive deletion policy – and, crucially, to put it into practice.
A fundamental first step is to conduct a comprehensive data inventory, addressing the essential question: What data is retained, and where is it stored? This process enables effective data classification and the definition and assignment of appropriate retention periods.
Regarding deletion practices, a recommended best practice is to designate at least one specific date per year for reviewing and executing data deletion in accordance with the established policy and a structured deletion protocol.
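The inventory and the annual review date described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record fields, the `RETENTION_YEARS` table, and the function names are hypothetical, the only retention period taken from this article is the six years for commercial letters, and the period is counted from the creation date for simplicity, whereas statutory periods may start at a different point in time.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class InventoryRecord:
    """One entry of a (hypothetical) data inventory."""
    name: str
    location: str             # where the data is stored
    category: str             # result of the classification step
    created: date
    legal_hold: bool = False  # e.g. retained for a pending dispute


# Assumed retention table; only the six-year value comes from the article.
RETENTION_YEARS = {"commercial_letter": 6}


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years, mapping 29 February to 28 February."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def due_for_deletion(records, review_date: date):
    """Return records whose retention period has expired on the review date.

    Records under legal hold or without a known category are skipped;
    the latter need to be classified before a period can be assigned.
    """
    due = []
    for record in records:
        years = RETENTION_YEARS.get(record.category)
        if years is None or record.legal_hold:
            continue
        if add_years(record.created, years) <= review_date:
            due.append(record)
    return due


records = [
    InventoryRecord("offer-2015.pdf", "fileserver", "commercial_letter", date(2015, 3, 1)),
    InventoryRecord("offer-2023.pdf", "fileserver", "commercial_letter", date(2023, 3, 1)),
    InventoryRecord("claim-2014.pdf", "cloud", "commercial_letter", date(2014, 1, 1), legal_hold=True),
]
print([r.name for r in due_for_deletion(records, date(2025, 1, 15))])
# prints ['offer-2015.pdf']
```

The point of the sketch is that a deletion run presupposes the inventory: a record without a category cannot be matched to a retention period, and an explicit legal-hold flag keeps the documented exceptions visible rather than implicit.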
Many company managers will likely realize that establishing a culture of data deletion can be a significant challenge and requires continuous effort. To succeed, they should work closely with internal or external data protection officers and ensure strong support from top management for this important initiative.
When it comes to deletion obligations, one worst-case scenario must be avoided: a company suffers a cyberattack that compromises and extracts sensitive data, only to then report the incident to authorities under Article 33 GDPR and to affected individuals under Article 34 – while acknowledging that the compromised data should have already been deleted and should no longer have been stored on the server.
(5) Data Organization under the NIS 2 Directive
Finally, data organization plays a crucial role in the NIS 2 Directive (Directive (EU) 2022/2555), which has not yet been transposed into German national law. This is particularly relevant in areas such as risk management, access control, and incident response. To effectively implement Role-Based Access Control (RBAC) and deploy backup strategies and disaster recovery plans that ensure data integrity and availability, organizations must classify data according to its sensitivity and criticality.
Categorizing Data as a Prerequisite to Using AI
Beyond the legal obligations outlined above, there is another strong incentive to implement a professional data strategy: the adoption of advanced AI technologies. Without a structured approach to data categorization, organizations will struggle to leverage AI effectively at a high level.
At a basic level, employees should be aware of key do’s and don’ts when using browser-based large language models (LLMs), especially free or publicly available ones. Recognizing these risks, a growing number of companies have introduced AI policies that establish essential guidelines for chatbot usage. These typically include prohibitions against using private accounts, entering personal data or business secrets, and, whenever possible, disabling chat history to enhance data security.
Taking AI usage to the next level involves leveraging integrated tools like Microsoft 365 Copilot, which – depending on its configuration – can access data across the entire Microsoft environment. With Copilot, Microsoft can access all data stored within a tenant and use it for analysis. A tenant refers to a dedicated, isolated instance within Microsoft cloud services such as Azure, Office 365, or Microsoft 365, assigned to a specific organization or customer.
When a company implements Copilot across its entire workforce without full transparency, it must ensure that sensitive data – such as HR records, trade secrets, management board resolutions, and shareholder decisions – is adequately protected. This requires a robust authorization framework that ensures AI systems only process information accessible to authorized users.
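What such an authorization framework has to guarantee can be illustrated with a minimal Python sketch. It does not reproduce Copilot's or any other vendor's actual mechanism; the `Document` type, the role names, and `visible_to` are hypothetical. The idea is simply that the set of documents handed to an AI assistant is filtered by the requesting user's roles first, and that unlabeled documents are denied by default.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """A classified document (hypothetical structure)."""
    title: str
    sensitivity: str
    allowed_roles: frozenset = frozenset()  # empty means nobody may read it


def visible_to(user_roles, documents):
    """Return only the documents the user is authorized to read.

    An assistant acting on behalf of this user should receive nothing
    beyond this list. Documents without any allowed role stay hidden.
    """
    roles = frozenset(user_roles)
    return [doc for doc in documents if doc.allowed_roles & roles]


documents = [
    Document("Canteen menu", "public", frozenset({"employee", "hr", "board"})),
    Document("Salary list", "confidential", frozenset({"hr"})),
    Document("Board resolution", "strictly confidential", frozenset({"board"})),
    Document("Unlabeled scan", "unknown"),
]
print([d.title for d in visible_to({"employee"}, documents)])
# prints ['Canteen menu']
```

The filter is only as reliable as the labels it reads, which is why classification has to precede a company-wide rollout.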
Outlook
Germany and Europe are often criticized for focusing more on data protection than on sophisticated data utilization. Recently, Friedrich Merz referenced a conversation with Microsoft CEO Satya Nadella at the World Economic Forum in Davos, where Nadella highlighted that German SMEs hold an immense, untapped wealth of data within their organizations…
In raising this treasure, management, compliance officers, and data protection officers are called upon to establish effective data management while critically assessing technical solutions. Their challenge lies in navigating the fine line between innovation, future viability, and sustainability on one side and the protection of personal data and business secrets on the other. Inventorying, organizing and categorizing data is the backbone of this major undertaking.
The Author:
Dr. Fabian Ibel is the Chief Compliance Officer at Harmonic Drive SE and the Co-Founder of Truveo Compliance. He advises his corporate and private clients on all compliance related matters. Fabian is the author of several academic articles in the corporate and compliance law fields.
Editor:
Isabel Cagala, TLB Co-Editor-in-Chief