Companies are using books to train AI software.

The Internet has shadow libraries of its own. That may sound unreal, but it is very much true, and the facts we are about to lay out may surprise you.

These are online platforms where books are hosted without permission. They are always available on the Internet, and their existence is something of an open secret. Most of them hold books, academic papers, and other people's work, all pirated and shared without permission.

Reportedly, these high-quality books are being used to train some of the biggest language models built to date. Because many of the books are the work of accomplished authors, they contain long and diverse writing on almost every aspect of life.

Tech companies are taking the emotional range and diversity of human writing and feeding it into the training of artificial intelligence, which helps the models learn how people think.

This trains AI systems to think and express themselves the way humans do. We all know how expensive the licenses and rights to these books can get. That is why the giant companies are not even bothering to go through the process.

In March 2025, The Atlantic released a tool that lets you search for books in Library Genesis (LibGen), one of the biggest shadow libraries on the Internet. This is how it came to light that considerable exploitation had been taking place, which was then disclosed to the public.

According to legal documents, Meta, the company that owns Facebook and Instagram, used LibGen to obtain books and train its language models on human writing. Not every book in Library Genesis was used, but it was enough to show that copyrighted books were used without permission.

This is disturbing. If even giant companies are exploiting the creative work of others without giving credit, what can you expect from the general public? Once one sheep goes, the others follow.

Meta is facing legal action over this controversy. In its defense, Meta argues that it is not reproducing the original books but using them in a transformative way: the AI is not going to reproduce the original works, but will take inspiration and patterns from them to generate new content of its own.

Meta and AI advocates at other companies make it seem as though this is being done for some greater good. Their argument is that because these systems bring a large number of benefits to the general public, the mistake should be excused.
