Transcript

From Product To Teacher

SPEAKER_00 0:00

I think we've all heard the phrase: if the service is free, you're the product. When a service is free, your data is what's being sold. That was the deal. A little uncomfortable, but at least we understood it. Here's what nobody's told you about the new deal: you're not just the product anymore. You're the teacher. Your emails, your chats, your support tickets, your work decisions, your daily patterns. Companies are feeding all of it into AI systems right now. Systems that, in a lot of cases, are being built specifically to do what you do. GitHub is just the latest company to make it official. They're using your code, your prompts, your inputs and outputs to train their AI models. Opted in by default. Four menus deep in settings if you want out. That's just the one that got caught. We need to talk about the rest of it.

Today we're talking about something that affects every single person listening right now. Probably, most likely, 99% of you. Most of you. All of you, I'm pretty sure of it. And if you can't tell, I'm dealing with some kind of sickness, or I'm losing my voice, I'm not really sure. Apologies if my voice cracks, but the train keeps moving.

AI systems need data the way that factories need raw materials. The more, the better. And unlike oil, data is everywhere. It's being generated constantly by hundreds of millions of people just going about their day. For years, the model was simple: scrape the internet, use public data. But regulators and courts have started pushing back on all of that: copyright law, GDPR enforcement, platform access restrictions. So companies turn to the next best thing: the data they already have. Data their customers gave them. Data their employees generate every single day.

There's a legal term for what happens when data collected for one purpose gets quietly repurposed for something else: function creep. You signed up for a productivity app; your documents are now training a language model. You used a customer service chatbot; those transcripts are now teaching the next version of the chatbot how to sound more human. You uploaded photos to a cloud backup; those images are now part of an image recognition dataset. It just keeps going. The original consent covered the original purpose. It didn't cover this. And in most cases, nobody asked.

The most recent example is GitHub. Starting April 24th, 2026, Microsoft's code platform is going to use customer interaction data, specifically your inputs, outputs, code snippets, and context from Copilot, to train its AI models. The policy applies to Copilot Free, Pro, and Pro+ users. Enterprise and Business customers are exempt because their contracts specifically prohibit it, which tells you something important: the protection exists, they just didn't extend it to everyone.

You can opt out, GitHub says, but to do it you have to navigate to settings, find the Copilot section, find the privacy heading, and disable a toggle that says "Allow GitHub to use my data for AI model training." It's off by default in some regions because European law requires opt-in. In the US, it's on by default because the law doesn't require otherwise. How that's even a thing in the US boggles my mind to this day. But here we are, still trucking along. That isn't an accident. It's the architecture of consent in 2026. GitHub isn't alone. This is the playbook: a quiet policy update, a pre-selected toggle, an opt-out buried four menus deep.
By the time most users notice, their data has been feeding the model for months. The FTC has been clear, on paper: using customer data for undisclosed purposes, including training AI models, can be an unfair or deceptive practice. There's a lawsuit right now, Con v. Figma, that's testing exactly this theory. Figma allegedly used years of customers' design files to train generative AI tools after assuring users their content wouldn't be repurposed. The suit doesn't hinge on copyright; it hinges on consent. And that's what makes it dangerous for every company running the same playbook.

The customer data story is uncomfortable. The employee data story is something else entirely. Across industries right now, a pattern is emerging: companies are asking employees to document their workflows, label datasets, and record their decision-making processes. And workers are increasingly realizing that what they're building is the training set for the AI that replaces them. Customer service reps recording calls, financial analysts documenting how they evaluate risk, legal teams tagging contracts, all of it feeding into systems specifically designed to automate those functions. At Amazon, driver data was used to train autonomous routing algorithms. At Google, employees spent months training an internal AI called Gemini for Sales on client relationship workflows. Then the company cut hundreds of ad sales positions. Former employees described it to Business Insider as building your own coffin. JPMorgan Chase trained a contract analysis tool partly on the work of the legal teams and analysts whose roles it was designed to streamline.

Here's the legal reality: most workers have almost no protection. If your employer asks you to participate in AI training programs, you're likely required to comply. Employment lawyers say that documenting everything is the best defense. If your employer tells you the AI training won't affect your position and then lays you off six months later, that paper trail matters. The power imbalance in employment relationships is also why European regulators are skeptical of employee consent for AI training purposes. Consent isn't freely given when you can't say no without risking your job. And there's a compounding problem nobody talks about here: the bias baked in at the source. If the training data reflects existing workplace patterns, who got promoted, whose decisions got flagged, whose work got praised, the AI inherits all of it. Not as a bug, as a feature. Because that's what it learned from.

Here's the part that should unsettle you the most. Privacy law gives you the right to have your data deleted. Under GDPR, CCPA, and a growing list of state laws, you can ask a company to remove your information, in theory. In practice, AI training breaks all of that. When a company deletes your record from their database, the raw data is gone. But the model that learned from it, that's still running. Your patterns, your behavior, your decisions, they've been absorbed into the weights of a neural network. You can't delete yourself from a trained model without retraining the model from scratch. Most companies aren't doing that. Legal authorities are starting to say the model itself constitutes derived personal data when it was trained on individual records. That's an enormous problem for every AI product built on customer data without rigorous consent processes. And it means deletion rights, one of the most important protections consumers have, may be functionally unenforceable in the AI context right now.
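[Editor's note: a minimal sketch, not from the episode, of why a deletion request doesn't reach a model that was already trained on your record. It uses a toy linear model in plain NumPy; every name and number in it is illustrative, not any company's actual pipeline.]

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "customer database": 100 records, 3 features each, plus a target column.
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=100)

def train(features, targets):
    # Ordinary least squares: the "model" is nothing but its learned weights.
    weights, *_ = np.linalg.lstsq(features, targets, rcond=None)
    return weights

deployed_model = train(X, y)  # trained while record 42 was still in the dataset

# The person behind record 42 exercises a deletion right: the row leaves the database.
X_after = np.delete(X, 42, axis=0)
y_after = np.delete(y, 42)

# Deleting the raw record does nothing to the weights already running in production.
print(deployed_model)             # unchanged; still shaped by record 42
# Only retraining from scratch on the reduced data produces a (slightly) different model.
print(train(X_after, y_after))
```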
So what do you do with this? Audit your tools. Any app or platform you use regularly, especially ones with AI features, has a privacy settings page. Look for toggles related to "improving our services," personalization, or anything that mentions AI model training. Assume they're on by default. Turn them off. For GitHub specifically: Settings, Copilot, Privacy, then disable "Allow GitHub to use my data for AI model training." Do it before April 24th.

If your employer asks you to participate in AI training, document what you've been told about how the data will be used and whether it affects your role. Don't rely on verbal reassurances. Know your rights. If you're in California, Colorado, Virginia, or any of the 17 other states with comprehensive privacy laws, you have data access and deletion rights. Use them. Even if deletion from a trained model is complicated, the paper trail matters. And I challenge you to ask the question companies don't want you to ask. When you use any AI-powered tool, ask yourself: is my input being used to train this? If the answer is unclear, assume it's a yes.

The phrase function creep is almost too gentle for what's happening. Your data is being repurposed for a use you never agreed to, your work is being used to build systems designed to replace you, and the opt-out, when it exists at all, is buried deeply enough that most people will never find it. This isn't a future problem. GitHub's policy change takes effect on April 24th. That's a few weeks away. The question we're sitting with isn't whether companies should be allowed to do this. It's whether you ever actually agreed to it. And if the answer is no, what happens next?

I'm Cameron Ivey. This was Privacy Please, part of the Problem Lounge Network. If this hit close to home for you, and it should, share it with someone, please. Sharing it helps us get the show out to more people, which I think could help a lot of people right now. You can find all of our shows, episodes, and more at theproblemlounge.com. Subscribe, follow, and leave us a review if you're feeling generous; it would really help us reach more people, and that's the main goal here, just to try to spread good information that's helpful in this space. So thank you so much for tuning in if it's your first time, and if you've been with us for a long time, thank you so much for the support. Just know that I love doing these. Sorry for the voice this week, and we'll see you on the next one. Cameron Ivey, over and out.