
If there's one bad thing about the AI revolution, it's the privacy concerns that arise with training AI models. What data is being used to train these models, and where does it come from? Well, it seems some companies have been sneaky about sourcing their training data.
According to Wired (via 9to5Mac), tech companies including Apple and Nvidia have been training AI models on YouTube videos without consent. This was done by scraping the subtitle files of 173,536 videos from over 48,000 YouTube channels, including MKBHD (Marques Brownlee), MrBeast, PewDiePie, Stephen Colbert, and more.
Apple has sourced data for their AI from several companies
One of them scraped tons of data/transcripts from YouTube videos, including mine
Apple technically avoids "fault" here because they're not the ones scraping
But this is going to be an evolving problem for a long time https://t.co/U93riaeSlY
— Marques Brownlee (@MKBHD) July 16, 2024
For your info, YouTube's terms of service forbid the use of YouTube data to train AI models. In his response to the news on X, Brownlee commented that Apple technically avoids "fault" because it's not the company doing the data scraping. Regardless, many believe the use of data without consent will continue to be a problem in the AI era.
What do you think? Should tech companies be allowed to use such data without asking for permission? Please share your thoughts in the comments and stay tuned to TechNave for more news like this.






