Last week, a security researcher discovered that the workplace communications platform Slack trains its AI models using customer data. Worse still, all Slack users are opted in by default.
Naturally, the revelation led to uproar amongst the Slack community, with many people concerned about improper usage of their personal data.
However, not everything about this incident is as it seems. In reality, your data is far safer than the media headlines might lead you to believe.
Here’s everything you need to know.
In the spotlight: Slack’s AI training policy
The latest Slack controversy stems from a poorly worded passage in the company’s privacy principles, which have since been updated.
Initially, the privacy principles included the following sentence:
“To develop AI/ML models, our systems analyze Customer Data (e.g. messages, content, and files) submitted to Slack as well as other Information (including usage information) as defined in our privacy policy and in your customer agreement. To opt out, please have your org, workspace owners or primary owner contact our Customer Experience team at feedback@slack.com.”
A Slack user spotted this rather ambiguous wording over the weekend and spread the word across Slack community forums and the media, expressing frustration that their data was being used without explicit consent and that opting out wasn’t straightforward.
Slack is quick to respond
Following the media storm, Slack published a blog post clarifying what data is used to train its models and how customer data is handled. Slack stated:
“We do not build or train these models in such a way that they could learn, memorize, or be able to reproduce any customer data of any kind. While customers can opt out, these models enhance the product experience without the risk of their data being shared. Slack’s traditional ML models use de-identified, aggregate data and do not access message content in DMs, private channels, or public channels.”
In further comments, Slack explained that its training models analyze metadata, such as user behavior data related to messages, content, and files, but do not access the message content itself.
Additionally, Slack emphasized that customer data is not used to develop large language models (LLMs) or other generative models. Its add-on generative AI product, Slack AI, utilizes third-party LLMs.
In essence, Slack has attempted to clarify that it harnesses user data to enhance automated channel and emoji recommendations, a much more benign use of AI than the original wording of its privacy principles suggested.
In fact, since the incident, Slack has updated its privacy principles to state the following:
“To develop non-generative AI/ML models for features such as emoji and channel recommendations, our systems analyze Customer Data.”
Lessons learned
The incident surrounding Slack’s AI training practices brings to light several important lessons for both users and organizations. First and foremost is the critical importance of precise and unambiguous wording in privacy policies. Ambiguity can lead to confusion and mistrust among users, as seen in this case.
Another key takeaway is the necessity for organizations to prioritize user privacy from the outset. Privacy should not be an afterthought but a ‘by-design’ aspect of any service that handles user data.
Slack’s response to the controversy was a step in the right direction. The company clarified that its AI models use de-identified, aggregate data and do not access message content, aiming to enhance user experience without compromising privacy.
Even so, some critics argue that Slack places too much responsibility on users to opt out, rather than making privacy the default setting or requiring a clear opt-in declaration.
This debate highlights a broader issue in the tech industry: the balance between innovation and user privacy. As AI and machine learning technologies advance, companies are under increasing pressure to develop features that enhance the user experience while being cognizant of user privacy.
There’s no doubt that controversies like this one will continue to appear in the months to come, especially as more SaaS companies roll out generative AI and machine learning features.
For now, the best thing SaaS customers can do is review privacy policies carefully to ensure customer and employee data is being used ethically, and deploy SaaS data loss prevention (DLP) to safeguard sensitive information from leakage, whether by humans or by AI training models.
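To make the DLP recommendation concrete, here is a minimal sketch of what a basic scan over Slack messages could look like. It assumes the official slack_sdk Python package, a bot token with the channels:history scope, and a placeholder channel ID; the detection patterns are purely illustrative, and a real DLP product would be far more thorough.

```python
# Minimal DLP-style sketch: scan recent Slack channel messages for
# sensitive-looking strings. Token, channel ID, and patterns below are
# placeholders for illustration, not a production configuration.
import re

from slack_sdk import WebClient

# Illustrative patterns only; commercial DLP tools ship far broader detectors.
PATTERNS = {
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def scan_channel(client: WebClient, channel_id: str, limit: int = 200):
    """Flag recent messages in a channel that match a sensitive pattern."""
    findings = []
    response = client.conversations_history(channel=channel_id, limit=limit)
    for message in response.get("messages", []):
        text = message.get("text", "")
        for label, pattern in PATTERNS.items():
            if pattern.search(text):
                findings.append({
                    "ts": message.get("ts"),
                    "user": message.get("user"),
                    "type": label,
                })
    return findings


if __name__ == "__main__":
    client = WebClient(token="xoxb-your-token")  # placeholder bot token
    for finding in scan_channel(client, "C0123456789"):  # placeholder channel ID
        print(f"Possible {finding['type']} leak at ts={finding['ts']}")
```

In practice, a deployed DLP tool would cover every channel and file share, redact or quarantine matches rather than just report them, and keep pace with new data types, but even a simple scan like this shows how quickly sensitive content can surface in workspace messages.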