Meta Platforms used public Facebook and Instagram posts to train parts of its new Meta AI virtual assistant, but excluded private posts shared only with family and friends in an effort to protect users' privacy, the company's top policy officer told Reuters in an interview.
In addition, Meta took steps to remove private information from publicly available datasets used for training, according to Meta President of Global Affairs Nick Clegg, who was speaking this week on the sidelines of the company's annual Connect conference. Meta also avoided using private chats on its messaging services as training data for the model.
Clegg stated that "we've tried to exclude datasets that have a heavy preponderance of personal information," adding that the "vast majority" of the data utilized by Meta for training was publicly available.
He cited LinkedIn as an example of a website whose content Meta deliberately avoided using because of privacy concerns.
Clegg's remarks come as tech firms such as Meta, OpenAI, and Alphabet's Google face criticism for training their AI models on data allegedly scraped from the internet without permission. These models ingest enormous volumes of data in order to summarize information and generate images.
While contending with lawsuits from authors who claim the companies have infringed their copyrights, the firms are weighing how to handle the private or copyrighted material swept up in that process, which their AI systems may reproduce.
Meta said it built the assistant on its powerful Llama 2 large language model together with a new model called Emu, which generates images in response to text prompts.
The product will be able to generate text, audio, and images, and will have access to real-time information through a partnership with Microsoft's Bing search engine.