Highlights

  • As before, Google’s library, which is compatible with TensorFlow Privacy, can be used with data processing engines such as Spark and Beam, potentially offering more flexibility in deployment.
  • Google is one of several tech giants that have released differential privacy tools for AI in recent years.

On the occasion of Data Privacy Day, Google announced that it has expanded its existing differential privacy library to the Python programming language, in collaboration with OpenMined, an open-source community focused on privacy-preserving technologies. The company also unveiled a new differential privacy tool that, it says, will let practitioners visualize and better tune the parameters used to produce differentially private information, along with a paper sharing techniques for scaling differential privacy to large datasets. The move is intended to make differential privacy tools accessible to more people.
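
To make the parameter-tuning idea concrete: most differential privacy mechanisms add random noise calibrated to a privacy budget, usually denoted epsilon, where smaller values mean stronger privacy but noisier results. The sketch below is a generic Python illustration of this trade-off using the Laplace mechanism; it is not code from Google’s library, and the function name and values are purely illustrative.

```python
import numpy as np

def laplace_count(true_count: int, epsilon: float, sensitivity: float = 1.0) -> float:
    """Release a count with Laplace noise scaled to sensitivity / epsilon."""
    return true_count + np.random.laplace(loc=0.0, scale=sensitivity / epsilon)

# Smaller epsilon -> stronger privacy -> larger noise around the true value.
true_count = 10_000
for epsilon in (0.1, 1.0, 10.0):
    samples = [round(laplace_count(true_count, epsilon)) for _ in range(5)]
    print(f"epsilon={epsilon}: {samples}")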

Expanding differential privacy

Google’s announcement marks both the first year of its collaboration with OpenMined and Data Privacy Day, which commemorates the signing of Convention 108 in January 1981, the first legally binding international agreement on data protection. Google open-sourced its differential privacy library – which it says powers core products such as Google Maps – in September 2019, before launching an experimental module that tests the privacy of AI models.

Miguel Guevara, Google’s differential privacy product lead, wrote in a blog post, “In 2019, we launched our open-sourced version of our foundational differential privacy library in C++, Java, and Go. Our goal was to be transparent and allow researchers to inspect our code. We received a tremendous amount of interest from developers who wanted to use the library in their own applications, including startups like Arkhn, which enabled different hospitals to learn from medical data in a privacy-preserving way, and developers in Australia that have accelerated scientific discovery through provably private data. Since then, we have been working on various projects and new ways to make differential privacy more accessible and usable.”

The tech giant also said that the newfound support for Python in its differential privacy library has enabled organizations to start experimenting with novel use cases, such as aggregating and anonymizing the most visited web pages per country.
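
The announcement does not show the library’s API, but the per-country use case can be sketched in plain Python: count visits per page, add Laplace noise to each count, and suppress pages whose noisy count falls below a threshold. Everything below (function name, parameters, and the one-visit-per-user assumption) is an illustrative assumption, not Google’s implementation.

```python
import numpy as np
from collections import Counter, defaultdict

def dp_top_pages(visits, epsilon=1.0, threshold=20, top_k=10):
    """Differentially private most-visited pages per country (illustrative sketch).

    visits: iterable of (country, url) pairs, assumed to contain at most one
    visit per user per page so each count has sensitivity 1; production
    libraries bound user contributions explicitly rather than assuming this.
    """
    per_country = defaultdict(Counter)
    for country, url in visits:
        per_country[country][url] += 1

    released = {}
    for country, counts in per_country.items():
        # Add Laplace noise with scale 1/epsilon to each sensitivity-1 count.
        noisy = {url: c + np.random.laplace(scale=1.0 / epsilon)
                 for url, c in counts.items()}
        # Suppress pages whose noisy count falls below the threshold, so that
        # rarely visited (potentially identifying) pages are never released.
        kept = [(url, round(n)) for url, n in noisy.items() if n >= threshold]
        released[country] = sorted(kept, key=lambda kv: kv[1], reverse=True)[:top_k]
    return released
```

A real deployment would also bound each user’s total contributions and account for the combined privacy budget across all released counts, bookkeeping that purpose-built libraries such as Google’s handle automatically.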

As before, Google’s library, which is compatible with TensorFlow Privacy, can be used with data processing engines such as Spark and Beam, potentially offering more flexibility in deployment.

Growing support

Google is one of several tech giants that have released differential privacy tools for AI in recent years. Microsoft released SmartNoise, built in collaboration with researchers at Harvard University, in May 2020. Meta (formerly Facebook) also recently open-sourced a PyTorch library for differential privacy dubbed Opacus.

Studies have revealed an urgent need for techniques that conceal private data in the datasets used to train AI systems. Researchers have shown, for example, that even anonymized X-ray datasets can reveal patient identities. Large language models like OpenAI’s GPT-3 are known to leak names, phone numbers, addresses, and other personal details from their training data when prompted.