Data Security Considerations for AI
There is no denying the power and versatility that artificial intelligence software and technology have achieved over just the last few years.
In that short time, machine learning algorithms have evolved from what were once clunky chatbots into conversational generative AIs so convincing that one of Google's own engineers was sure one was sentient. Less than two years ago, open-source AI image generators like the first-generation DALL-E were released, yet seemed capable of producing only blurry, dream-like pictures. Compare that to the subsequent generations and algorithms that have emerged since, which can create photorealistic images in seconds, and it's clear that progress in the field of AI is accelerating rapidly. As the technology around artificial intelligence has quickly become more advanced, it is easy to understand why every industry imaginable has been scrambling to figure out how to put AI to use to stay ahead of the rapidly changing curve.
Many philosophical concerns have already been raised about the possible dangers of rapidly evolving artificial intelligence, such as the ethical question of where the line falls between "borrowing" and stealing the intellectual property needed to train these algorithms. Artists and writers have been at the forefront of ringing these particular alarms, and those grievances were at the heart of the writers' strike that played out this year. Even more dangerous possibilities have been flagged as law enforcement agencies begin using AI to identify and track suspects. All of those very serious concerns can seem almost trivial compared to the apocalyptic scenarios outlined in science fiction over the years, which have begun to look frighteningly more plausible. Recently, the Air Force even had to clarify officials' statements about a dangerous, rogue AI that turned out to come from a hypothetical training exercise.
While all of these concerns should be taken seriously, the momentum is clearly behind artificial intelligence, and we will continue to experiment with it in every sector moving forward. You may not share many of these apprehensions, or they may not be relevant to the work for which you plan to use AI. However, if you do decide to test artificial intelligence in your industry, you should be aware of an underlying concern that applies no matter what you are doing: we are using a technology that nobody fully understands, including the scientists and engineers who built it. When it comes to sensitive data and information, that is a problem.
If you've ever worked with healthcare or education data, personally identifiable information (PII) like Social Security numbers, financial data, or any other type of sensitive information, then you are aware of the rigid regulations and industry standards implemented to make sure that data is protected and does not fall into the wrong hands. You would not trust this information to anyone who is not properly authorized to view it, so can you really trust an algorithm with the same information when you don't fully understand how it works? These programs work by ingesting large amounts of data, analyzing trends, and then using those trends to make predictions. Many of them are constantly taking in new data and using it to fine-tune their predictions. They are literally learning as they are exposed to new information.
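The point that a model learns from every record it sees can be made concrete with a toy sketch. The class below is entirely hypothetical and vastly simpler than any real ML system, but it illustrates the core issue: once a sensitive record has been ingested, it permanently shapes the model's internal state, and through that state, its future outputs.

```python
class RunningMeanModel:
    """Toy "online learner": predicts the mean of every value it has ever seen."""

    def __init__(self):
        self.count = 0
        self.total = 0.0

    def ingest(self, value: float) -> None:
        # Each new record permanently shifts the model's internal state.
        self.count += 1
        self.total += value

    def predict(self) -> float:
        # The prediction is derived directly from everything ingested so far.
        return self.total / self.count if self.count else 0.0


model = RunningMeanModel()
for salary in [52000, 61000, 58000]:  # imagine these are sensitive records
    model.ingest(salary)

print(model.predict())  # 57000.0 -- the sensitive inputs shaped this output
```

A real neural network stores information far more diffusely than a running total, which is precisely why it is so much harder to answer where, and for how long, your data lives inside it.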
If you plan to use machine learning algorithms or artificial intelligence to clean, process, or analyze any amount of sensitive data, first make sure you understand how that particular program works. Does the algorithm store the information you provide it? If so, where is that data stored, and is it stored securely? Will the algorithm learn from the information you provide, and will it use your data for future processing? Could what the algorithm learns from your data surface externally? Could the sensitive data itself be regurgitated by the algorithm somewhere else? If the answer to any of these questions is "we aren't sure," then you need to exercise caution in how you use the tool. How would you react if an employee or contractor told you they couldn't remember where they left the sensitive document you gave them for analysis, or didn't know how the information on it ended up being used or disseminated? You probably have standards and precautions in place to regulate and track which human eyes, brains, and hands have access to your sensitive data. Take the time to understand and translate those standards and precautions so they also regulate and track how machine brains will use the same sensitive information.
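While you are still answering those questions about a tool, one practical precaution is to redact obvious PII before any text ever reaches it. The snippet below is a minimal, hypothetical sketch using a single regular expression for US Social Security numbers; a production system would need far more thorough detection (names, account numbers, dates of birth, and so on), and redaction is a supplement to vetting the tool, not a substitute for it.

```python
import re

# Hypothetical precaution: scrub anything shaped like a US Social Security
# number from free text BEFORE it is sent to a third-party AI tool.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact_ssns(text: str) -> str:
    """Replace SSN-shaped substrings with a placeholder."""
    return SSN_PATTERN.sub("[REDACTED-SSN]", text)


record = "Patient 4471, SSN 123-45-6789, reported improvement."
print(redact_ssns(record))
# Patient 4471, SSN [REDACTED-SSN], reported improvement.
```

The same pattern-based approach extends to other identifiers, but every pattern you add is an assumption about how the sensitive data is formatted, which is why redaction alone is never a complete answer.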
We'll be sharing more posts each month to support you with your Data Governance, Data Management, and Reporting questions! Sign up for our newsletter to receive the next post in your inbox!