Nvidia scraped videos from YouTube, Netflix, and other sources to train its AI products. This is according to leaked internal Slack chats, emails, and documents.

Employees involved in the scraping often questioned its ethics and legality but were silenced by managers, who said they had clearance from the highest levels of the company to use the content.

The videos were scraped mainly from YouTube, but content from sources like Netflix and GitHub has also been used.

In a Slack message, an Nvidia employee also suggested scraping movies, reasoning that “Movies are actually a good source of data to get gaming-like 3D consistency and fictional content but much higher quality.”

To this, Ming-Yu Liu, Vice President of Research at Nvidia, replied, “We need a volunteer to download all the movies.”

“We are finalizing the v1 data pipeline and securing the necessary computing resources to build a video data factory that can yield a human lifetime visual experience worth of training data per day,” Liu said in an email in May.

In Slack channels, employees also discussed which YouTube channels’ videos should be scraped for AI training. A research scientist posted several links to YouTube channels in a Slack channel and said: “If you are still open to suggestions about YouTube channels that we could download, here are a couple of channels that might be interesting to consider.”

The links pointed to the official YouTube channels of brands like Expedia and Architectural Digest, as well as individual content creators like Marques Brownlee (MKBHD). The scientist added a note saying: “Tech product reviews – super high quality,” next to MKBHD’s YouTube video link.

In July, Nvidia was also accused of using data from a third-party company to train its AI models. The third-party company in question had obtained that data by scraping YouTube videos from creators without permission.

Anurag Singh was a Tech Writer on Dexerto’s UK team, expertly covering laptops, smartphones, and wearables. He covers the biggest tech news from major brands such as Apple, Samsung, and Microsoft. He also has bylines at Android Police, Neowin, MakeTechEasier, and Gizmochina.