Privacy in the context of AI predominantly centers on our personal information: who has access to it, how it's used, and, to borrow a phrase from American lawyer Louis Brandeis, our 'right to be let alone'. For a longer read, take a look at Wired's "The Next Big Privacy Hurdle? Teaching AI to Forget".

On how major tech and media platforms handle consent and personal data, The New York Times has an excellent (and worrying) interactive article below:

We Read 150 Privacy Policies. They Were an Incomprehensible Disaster.
In the background here are several privacy policies from major tech and media platforms. Like most privacy policies, they’re verbose and full of legal jargon — and opaquely establish companies’ justifications for collecting and selling your data. The data market has become the engine of the internet, and these privacy policies we agree to but don't fully understand help fuel it.

Privacy-Related Concepts

Personally Identifiable Information, or PII, ranges from the obvious (e.g. name, contact details, address, phone number) to information we may not typically think of as being able to identify us (e.g. browsing history, device location, purchase history). How data collected from an individual may be used depends on the jurisdiction in which you (and/or the company collecting the data) reside. ZDNet has a short read on PII with a US lens below:

Personally identifiable information (PII): What it is, how it’s used, and how to protect it | ZDNet
Keep your data and personal information safe from hackers and prevent your devices from being used to spy on you with these security and privacy gadgets.

Given the sensitive nature of PII and regulations under which it must be protected, people working with data may need to ensure information about an individual is obfuscated in their datasets. A really simple definition of Pseudonymization and Anonymization (and how these data obfuscation methods relate to GDPR) is provided by enterprise data security experts Protegrity:

"Pseudonymization is a method to substitute identifiable data with a reversible, consistent value. Anonymization is the destruction of the identifiable data."
Pseudonymization vs. Anonymization and How They Help With GDPR
Pseudonymization and Anonymization are distinct yet often confused terms in data security. With GDPR, it is important to understand the difference.
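
To make the distinction in the quote above concrete, here is a minimal sketch in Python. The `Pseudonymizer` class, the `anonymize` helper, and the sample record are all hypothetical and exist only for illustration; a real system would keep the token mapping in a secured store rather than in memory.

```python
import secrets


class Pseudonymizer:
    """Toy token vault (hypothetical): each identifier gets one consistent token."""

    def __init__(self):
        self._forward = {}  # identifier -> token
        self._reverse = {}  # token -> identifier (this is what makes it reversible)

    def pseudonymize(self, value):
        # Reuse the existing token so the same input always maps to the same output.
        if value not in self._forward:
            token = "tok_" + secrets.token_hex(8)
            self._forward[value] = token
            self._reverse[token] = value
        return self._forward[value]

    def reidentify(self, token):
        # Only someone with access to the vault can reverse the substitution.
        return self._reverse[token]


def anonymize(record, identifying_fields):
    """Drop identifying fields outright; nothing is kept that could reverse this."""
    return {k: v for k, v in record.items() if k not in identifying_fields}


record = {"name": "Ada Example", "email": "ada@example.com", "plan": "premium"}

vault = Pseudonymizer()
pseudonymized = dict(record)
pseudonymized["name"] = vault.pseudonymize(record["name"])
pseudonymized["email"] = vault.pseudonymize(record["email"])

anonymized = anonymize(record, {"name", "email"})
```

The pseudonymized record can still be linked back to Ada by whoever holds the vault, which is why it is generally still treated as personal data. The anonymized record keeps only the `plan` field, although in practice removing names alone is often not enough to prevent re-identification.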

Differential privacy provides a way for aggregate data to be shared without compromising the privacy of the individuals on which that data is based. More precisely, it is a "strong, mathematical definition of privacy in the context of statistical and machine learning analysis", introduced by Cynthia Dwork and colleagues. For further understanding, it may help to watch the explainer video from the US National Institute of Standards and Technology (NIST).
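
As a rough illustration of the idea described above, the sketch below adds Laplace noise to a count, which is one common way of releasing an aggregate with a differential privacy guarantee. The function names, the sample ages, and the choice of `epsilon` are assumptions made for this example, and the naive floating-point sampling used here is not suitable for production use.

```python
import random


def laplace_noise(scale, rng):
    """Draw from Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(values, predicate, epsilon, rng):
    """Return a noisy count of the values matching predicate.

    Adding or removing one person changes a count by at most 1 (sensitivity 1),
    so the noise scale is 1 / epsilon. Smaller epsilon means more noise.
    """
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)


rng = random.Random(0)  # seeded only so the sketch is repeatable
ages = [23, 35, 41, 29, 52, 67, 38, 45]  # made-up data; the true count of 40+ is 4

noisy = private_count(ages, lambda age: age >= 40, epsilon=0.5, rng=rng)
print(f"noisy count of people aged 40+: {noisy:.2f}")
```

Any single released value may be noticeably off from the true count of 4, but the noise averages out over many queries or larger populations, while no individual's presence in the data can be confidently inferred from the output.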

Protected attributes or classes are typically characteristics of a person that are protected from discrimination under various acts and legislation in each country or jurisdiction. Examples are race, religion, sex, sexual orientation, age, and disability. The IBM Trusted AI research group has a series of tools for identifying and then reducing bias and discrimination in machine learning models. You can try out a demo of their AI Fairness 360 toolkit via the link below:

AI Fairness 360 - Demo
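
To give a feel for what checking a model for bias against a protected attribute can look like, here is a small plain-Python sketch of two widely used group fairness measures. It does not use the AI Fairness 360 library itself; the function names and the outcome lists are made up for illustration.

```python
def selection_rate(outcomes):
    """Fraction of favourable outcomes (1 = favourable, 0 = unfavourable)."""
    return sum(outcomes) / len(outcomes)


def disparate_impact(unprivileged, privileged):
    """Ratio of selection rates; 1.0 means both groups are selected equally often."""
    return selection_rate(unprivileged) / selection_rate(privileged)


def statistical_parity_difference(unprivileged, privileged):
    """Difference of selection rates; 0.0 means both groups are selected equally often."""
    return selection_rate(unprivileged) - selection_rate(privileged)


# Hypothetical model decisions, split by a protected attribute.
privileged_group = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]    # 7 of 10 favourable
unprivileged_group = [1, 0, 0, 1, 0, 0, 1, 0, 0, 1]  # 4 of 10 favourable

di = disparate_impact(unprivileged_group, privileged_group)
spd = statistical_parity_difference(unprivileged_group, privileged_group)
print(round(di, 2), round(spd, 2))  # prints 0.57 -0.3
```

A disparate impact well below 1.0, as here, suggests the unprivileged group receives favourable outcomes less often. What counts as an acceptable value depends on the context and the applicable legislation.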

Passive listening, typically enabled in smart assistants, is an area that, while enabling innovative services, provides ample opportunity for misuse and unethical practices. Smart assistants like Amazon's Alexa, Google Home, and similar devices are all collecting swathes of data as they wait for you to utter the right voice command. Global law firm Dentons shares an overview of the Italian data protection authority's recommendations pertaining to privacy and the use of smart assistants:

Smart assistants: a privacy-proof use
Smart assistants are programs that use artificial intelligence (AI) algorithms to understand and interpret ordinary language, and to perform certain actions such as conversing with humans.

Opt in/Opt out refers to the method by which an individual agrees to data being collected about them (including their actions and behaviours, particularly in an online setting). Whether opt in (ask beforehand) or opt out (don't ask, start collecting) is legally required depends on the jurisdiction of the user and/or the service they're accessing. GDPR, for instance, is very much in the opt in camp. Brian Barrett has this to say in an article on Wired:

"It’s a simple problem to explain. An “opt out” paradigm means that data collection happens automatically, and you have to actively seek out ways to stop it. Under “opt in,” you must affirmatively grant a company the right to access that data before it can do so. You’re in control from the start."
‘Opt Out’ Is Useless. Let People Opt In
It’s not so crazy to want Big Tech to ask for your data—and conversations with AI assistants—before they take it.
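
In code, the difference between the two paradigms often comes down to a single default value. The sketch below is a simplified illustration; the class names, the `allow_collection` flag, and the `collect` helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class OptOutUser:
    """Opt out: collection is on by default until the user switches it off."""
    name: str
    allow_collection: bool = True


@dataclass
class OptInUser:
    """Opt in: collection is off by default until the user switches it on."""
    name: str
    allow_collection: bool = False


def collect(user, event, store):
    """Record the event only if the user's current setting allows it."""
    if user.allow_collection:
        store.append((user.name, event))
        return True
    return False


store = []
sam = OptOutUser("sam")
kim = OptInUser("kim")

collect(sam, "page_view", store)  # recorded: sam never agreed, but never refused either
collect(kim, "page_view", store)  # skipped: kim has not granted access yet

kim.allow_collection = True       # kim affirmatively opts in
collect(kim, "page_view", store)  # now recorded
```

With the opt-out default, data about `sam` is gathered without any action on their part. With the opt-in default, nothing is gathered about `kim` until they explicitly agree.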

Informed consent involves ensuring the user is aware their data is being collected, who is collecting it and how it will be processed, what it's being used for, and that they are able to withdraw that consent at any time. GDPR provides some of the strictest requirements on obtaining informed consent, which you can read about here. Kalev Leetaru writes for Forbes below about the challenges individuals face when 'consent' implies we've read and understood the implications of a service's privacy policy and T&Cs:

What Does It Mean To “Consent” To The Use Of Our Data In The Facebook Era?
Facebook’s response to nearly every one of its privacy stories over the past year has been to argue that its users legally consented to its practices so they have nothing to complain about. All of this raises the question of just what it means to “consent” in the Facebook Era?
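
The elements of informed consent listed above (who collects, for what purpose, and the ability to withdraw) can be modelled as a simple record that is checked before any processing happens. This is only a sketch; the `ConsentRecord` fields and the `may_process` helper are assumptions for illustration and do not represent a legal compliance mechanism.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


def _now():
    return datetime.now(timezone.utc)


@dataclass
class ConsentRecord:
    user_id: str
    controller: str  # who is collecting the data
    purpose: str     # what the data will be used for
    granted_at: datetime = field(default_factory=_now)
    withdrawn_at: Optional[datetime] = None

    @property
    def active(self):
        return self.withdrawn_at is None

    def withdraw(self):
        """Consent can be withdrawn at any time; keep the timestamp for audit."""
        if self.withdrawn_at is None:
            self.withdrawn_at = _now()


def may_process(records, user_id, purpose):
    """Allow processing only if active consent exists for this exact purpose."""
    return any(
        r.user_id == user_id and r.purpose == purpose and r.active
        for r in records
    )


records = [ConsentRecord("u1", "Example Corp", "newsletter")]

can_email = may_process(records, "u1", "newsletter")    # True: consent was given
can_profile = may_process(records, "u1", "ad_profile")  # False: different purpose

records[0].withdraw()
can_email_after = may_process(records, "u1", "newsletter")  # False: withdrawn
```

Tying consent to a specific purpose means agreeing to a newsletter does not silently authorise other uses such as ad profiling.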

Facial recognition and its applications in police and state surveillance have become a hot topic in recent times. John Oliver's segment below provides a good introduction and covers some of the dangers inherent in its use:

De-identification is a term that can be used synonymously with anonymization. The goal is to remove any attributes that could identify an individual. Johns Hopkins provides a set of steps to de-identify data here, and the Finnish Social Science Data Archive provides a further definition in their Data Management Guidelines below:

Data Management Guidelines - Anonymisation and Personal Data | Data Archive
The Data Archive provides research data to researchers, teachers and students. All services are free of charge.
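
As a small worked example of removing identifying attributes, the sketch below drops direct identifiers and coarsens indirect ones (age and postcode), then checks how many people share each remaining combination. The field names, the generalization rules, and the sample records are assumptions for illustration and are not taken from either set of guidelines mentioned above.

```python
from collections import Counter

DIRECT_IDENTIFIERS = {"name", "email"}  # assumed field names


def generalize_age(age):
    """Replace an exact age with a ten-year band, e.g. 34 -> '30-39'."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"


def deidentify(record):
    """Drop direct identifiers and coarsen indirect ones."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["age"] = generalize_age(cleaned["age"])
    cleaned["postcode"] = cleaned["postcode"][:2] + "***"
    return cleaned


def smallest_group(records, fields):
    """Size of the smallest group of records sharing the same values for fields."""
    counts = Counter(tuple(r[f] for f in fields) for r in records)
    return min(counts.values())


people = [
    {"name": "A", "email": "a@example.com", "age": 34, "postcode": "90210", "diagnosis": "flu"},
    {"name": "B", "email": "b@example.com", "age": 37, "postcode": "90213", "diagnosis": "asthma"},
    {"name": "C", "email": "c@example.com", "age": 52, "postcode": "10001", "diagnosis": "flu"},
    {"name": "D", "email": "d@example.com", "age": 58, "postcode": "10027", "diagnosis": "diabetes"},
]

released = [deidentify(p) for p in people]

before = smallest_group(people, ["age", "postcode"])   # 1: every person is unique
after = smallest_group(released, ["age", "postcode"])  # 2: each person shares a group
```

A smallest group size of 1 means at least one person can be singled out by age and postcode alone, even with the name removed. Generalizing those fields makes each record indistinguishable from at least one other, at the cost of some detail.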

Privacy by Design is a set of design principles introduced by Canadian Dr. Ann Cavoukian, the former Information and Privacy Commissioner of Ontario. FutureLearn covers the 7 design principles of Privacy by Design in a module of their Understanding the GDPR course below:

Privacy by Design - Understanding the GDPR
An article about the Privacy-by-Design notion and its essence.

Finally, the politically independent ThinkDoTank defines data ethics as follows:

"Data ethics is about responsible and sustainable use of data. It is about doing the right thing for people and society. Data processes should be designed as sustainable solutions benefitting first and foremost humans."

The Deloitte Insights team has a quick read on data ethics 'in the age of big data' and 4 of the biggest issues driving discussion in this space:

The rise of data and AI ethics
As technology tracks huge amounts of personal data, data ethics can be tricky, with very little covered by existing law. Governments are at the center of the data ethics debate in two important ways.