As they assess the fallout of Facebook’s Cambridge Analytica difficulties and prepare for the European Union’s General Data Protection Regulation (GDPR), data analytics practitioners have shifted from looking for new ways to collect and analyze data to thinking about how long data should be retained.
Some providers of data analytics systems are getting involved by offering data retention control tools. One vendor that has taken this step is Google, which announced data retention controls for its venerable Google Analytics system.
Google’s new data retention features will go live May 25, the day GDPR goes into effect. The pending EU regulation is forcing all major players in the internet services market, from developers of social media platforms to vendors of analytics tools, to evaluate how their processes allow customers to have data removed.
Google Analytics Expands Options for Data Retention
Google explains that its data retention controls give users “the ability to set the amount of time before user-level and event-level data stored by Google Analytics is automatically deleted from Analytics’ servers.”
With the new data retention capabilities, users will be able to choose to have Google Analytics retain data for 14 months, 26 months, 38 months or 50 months before automatically deleting it. They will also be able to opt for no automatic expiration.
Aggregated data is not affected when the user data is deleted.
The choices affect user-level and event-level data associated with the identification elements used in analytics: cookies (the text files associated with browser activity), user identifiers (such as user IDs) and advertising identifiers (e.g., DoubleClick cookies, the Android advertising ID and Apple’s Identifier for Advertisers).
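To make the retention windows concrete, here is a minimal Python sketch that estimates when a piece of user-level data would fall outside a chosen window. The function names (`add_months`, `expiry_date`) are hypothetical, and the calendar-month arithmetic is an assumption for illustration only; the article does not specify exactly how or on what schedule Google performs the deletion.

```python
import calendar
from datetime import date
from typing import Optional

# Retention windows named in the article, in months.
# None stands for "no automatic expiration".
RETENTION_OPTIONS = (14, 26, 38, 50, None)


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`."""
    total = start.year * 12 + (start.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    # Clamp the day to the end of the target month (e.g. Dec 31 -> Feb 29).
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def expiry_date(collected_on: date, retention_months: Optional[int]) -> Optional[date]:
    """Estimate when data collected on `collected_on` leaves the retention window.

    Illustrative assumption only: the real deletion schedule is Google's.
    """
    if retention_months not in RETENTION_OPTIONS:
        raise ValueError(f"unsupported retention setting: {retention_months!r}")
    if retention_months is None:
        return None  # data is kept until deleted some other way
    return add_months(collected_on, retention_months)


if __name__ == "__main__":
    collected = date(2018, 5, 25)
    for option in RETENTION_OPTIONS:
        print(option, expiry_date(collected, option))
```

Under these assumptions, data collected on May 25, 2018 with the 14-month setting would leave the window on July 25, 2019, while the "no automatic expiration" setting returns no date at all.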
The data retention controls are in the Google Analytics admin panel. Users navigate to the property they want to adjust, then to Tracking Info in the Property column, where they will see the Data Retention option.
No More Data Hoarding
Google’s introduction of the data retention controls is a response to the overzealous data collection that can happen in analytics efforts today. Analytics tools have become much more sophisticated since the days when Google first introduced Google Analytics. At that time, data was generated from webpages loaded into browsers.
Today, data in an analytics solution can be extracted into a business intelligence dashboard (like Tableau or Google Data Studio), come from an open source language (like R or Python), or be sourced from a data import, a feature Google Analytics offers. Decreasing storage costs have accelerated the trend toward keeping data longer and thus having more data to parse.
The end result is that brands are likely accumulating more data than they actually need for accurate reporting and modeling. Keeping more data than necessary is risky and increases a brand’s liability in the event of a data breach. It is better to decrease risk by not collecting data in the first place, or by periodically deleting it.