The California Consumer Privacy Act (CCPA) took effect on January 1st. It lets any California resident request all the information a company holds on them as a consumer, including which data has been sold to or accessed by other companies. As for penalties, a company that is notified of being out of compliance (i.e., unable to provide all the data it holds on its consumers) has 30 days to comply or it will be fined per record. That “per record” component is important because it shows how quickly a fine could balloon into billions of dollars. What’s interesting is that a company that doesn’t comply also opens itself up to class action lawsuits from consumers.
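To get a feel for how a per-record penalty scales, here is a rough back-of-the-envelope sketch. The function name and the record counts are hypothetical. The $2,500 and $7,500 figures are the statute's civil penalty ceilings per violation and per intentional violation, and counting each record as one violation is an assumption made purely for illustration.

```python
# Back-of-the-envelope estimate of how per-record penalties scale.
# Assumption: each affected consumer record counts as one violation.
UNINTENTIONAL_CAP = 2_500  # USD, civil penalty ceiling per violation
INTENTIONAL_CAP = 7_500    # USD, civil penalty ceiling per intentional violation


def max_exposure(records: int, per_record: int) -> int:
    """Return the worst-case penalty in USD for a given number of records."""
    if records < 0 or per_record < 0:
        raise ValueError("records and per_record must be non-negative")
    return records * per_record


if __name__ == "__main__":
    for records in (10_000, 1_000_000):  # hypothetical record counts
        low = max_exposure(records, UNINTENTIONAL_CAP)
        high = max_exposure(records, INTENTIONAL_CAP)
        print(f"{records:>9,} records: ${low:,} to ${high:,}")
```

Under those assumptions, a million records already puts the ceiling at $2.5 billion.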
Most of the data the CCPA covers is collected before any type of AI is applied: key identifiers like name and date of birth, personal records like property or services obtained, geolocation, etc. Keyword: mostly. This is where it gets interesting. The CCPA also covers inferences drawn from that data. In other words, when a company builds a data profile for a consumer by connecting a group of data points, that profile will also need to be shared. These are considered derived data points: things like user behavior, perceived intelligence levels, preferences, psychological trends, etc. For major businesses with robust and mature data-centric strategies, derived data points make up most of a consumer’s data profile. For example, I recently spoke to a big bank that said it had millions of data points on every single customer, most of them derived.
So what does this mean for AI? The good news is that for training and inference, it’s mostly about what goes into a model (consumer identifier data points) and what comes out (derived data points). The gray area will be the more complex neural networks that rely on many intermediate derived data points (often hidden from humans) to produce an overarching derived data point. It remains to be seen whether companies will be on the hook for those hidden insights too. This presents something of a conundrum for companies leveraging more advanced AI, because a surprising number of businesses don’t have deep enough insight into how a conclusion was derived from a complex model or deep neural network. I think it will force those leveraging AI to prioritize explainability as a feature of their chosen AI platform, so that insights derived from AI can be explained to the point where a human can understand them.
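One way to picture the bookkeeping this implies is a profile that records, for every derived data point, which inputs it came from and how it was produced. The sketch below is only an illustration under that assumption. The class names, fields, and sample values are all hypothetical, and nothing here is prescribed by the CCPA or by any particular AI platform.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DataPoint:
    name: str
    value: object
    # Names of the data points this one was derived from; empty if collected directly.
    derived_from: list[str] = field(default_factory=list)
    # Human-readable note on how the value was produced (hypothetical field).
    explanation: str = ""

    @property
    def is_derived(self) -> bool:
        return bool(self.derived_from)


@dataclass
class ConsumerProfile:
    consumer_id: str
    points: dict[str, DataPoint] = field(default_factory=dict)

    def add(self, point: DataPoint) -> None:
        # Refuse derived points whose inputs are not already on the profile.
        missing = [n for n in point.derived_from if n not in self.points]
        if missing:
            raise KeyError(f"unknown source data points: {missing}")
        self.points[point.name] = point

    def access_report(self) -> dict:
        """Split the profile into collected and derived data for a consumer request."""
        collected = {p.name: p.value for p in self.points.values() if not p.is_derived}
        derived = {
            p.name: {
                "value": p.value,
                "derived_from": p.derived_from,
                "explanation": p.explanation,
            }
            for p in self.points.values()
            if p.is_derived
        }
        return {"consumer_id": self.consumer_id, "collected": collected, "derived": derived}

    def unexplained(self) -> list[str]:
        """Names of derived data points that have no recorded explanation."""
        return [p.name for p in self.points.values() if p.is_derived and not p.explanation]


if __name__ == "__main__":
    profile = ConsumerProfile("c-001")  # hypothetical consumer
    profile.add(DataPoint("zip_code", "94105"))
    profile.add(DataPoint("monthly_spend", 1200))
    profile.add(DataPoint("churn_risk", 0.18, derived_from=["zip_code", "monthly_spend"]))
    print(profile.access_report())
    print(profile.unexplained())
```

The `unexplained` check is the part that speaks to the gray area: a derived data point with no recorded explanation is exactly the kind of hidden insight a company would struggle to account for.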
The CCPA is the most thorough consumer privacy regulation in the US to date, and I believe it is just the start of a country-wide movement. I envision a future with individual state and federal mandates on digital footprint monitoring and transparency for all consumers. Before we get there, though, everyone will be paying close attention to the ramifications of the CCPA, in particular how easily (or not) companies manage to comply and whether the penalties for non-compliance turn out to be too severe or not severe enough.