
Wikimedia Enterprise is now offering parts of Wikipedia's data to companies that want to use it to train artificial intelligence (AI) models. It has partnered with Kaggle, a subsidiary of Google, to offer selected datasets in English and French. So, what should you know about it?
The data has been optimised for model training: it is stripped of the links and text-formatting code found in Wikipedia's articles. The dataset is being offered because the site's traffic has been hit hard by bots scraping articles to train models without permission. Last month, Wikipedia said that traffic to its multimedia content grew by 50% last year because of bot activity.
Kaggle will pay Wikimedia Enterprise to use this data. At the same time, all data used will remain attributed and licensed under the Creative Commons Attribution-ShareAlike 4.0 licence and the GNU Free Documentation License (GFDL).
What are your thoughts on this news? Comment below, and stay tuned for more news like this at TechNave.




