Big Data Sucks

Michael de Groot - Blog
6 min read · Apr 8, 2022


Image by Toorged

In case you didn’t already know, I’ve been on this 12+ month journey to reduce my spam/junk email and believe it or not some days I have ZERO spam/junk emails. Yeah I know, unbelievable right? It’s been hard work and I’ve learnt a massive amount. When you focus on something with great attention more and more information gathers from the ether and finds its way to you.

Incredibly, this week I received a ‘notice of data processing’ email from a company I had never heard of, but it has given me an insight into the rabbit hole of big data and the companies that are gathering huge amounts of company information with the singular goal of selling it on to interested spammers. They all claim that they gather this information on the basis of ‘legitimate interest’, the biggest loophole that has been invented by countries’ information commissioners. In my view, there’s never any ‘legitimate interest’ in sending unsolicited emails these days.

I predict one day this loophole will be closed off, but it will need a lot of lobbying by citizens towards government institutions to do so.

Let’s examine my rabbit hole. I have highlighted the company names that I have investigated and communicated with and what I discovered in the process.


The data processing email I received was from Cognism. I’d never heard of them, and what an incredible company name; it already sounds scam-like to me.

This is what their email said:

I used their opt-out link but also emailed them to ask specifically how they managed to get hold of my data and received the following email response:

They claim to have sourced my data from two companies, Coresignal and Pipl.


Coresignal was the first company to explore and this is their ‘about’ statement on their website:

“Coresignal was founded with the goal of making large amounts of up-to-date alternative data accessible to any company worldwide. Focusing on firmographics and publicly available professional profiles, we believe that data is key to creating value and opening up never before explored opportunities.”

Firstly, a new term for my brain: ‘firmographics’. That’s definitely a new one to me.

Anyway Cognism were kind enough to provide me with the opt-out link:


Upon further research, I also discovered that Cognism uses Mergent, so I emailed Mergent:

“Hello, I have become aware that a data company Cognism states on their website that Mergent is one of their data sources. I am contacting you to request if you hold data on me or my company Staying Alive UK Ltd. If you do hold this data, I would like it to be removed as soon as possible, please advise how I may be able to request that to be actioned?”

And, to my surprise and delight, they did respond!


Cognism also confirmed that they collected my phone number information via Pipl. Another catchy name; my dog is called Pip!

This one is a huge data collector. This is what they claim:

“Pipl includes more than 3 billion online identities cross-referenced from more than 25 billion individual records. Our search platform gathers data from sources such as public records, business listings, marketing lists, phone directories, and crowd-sourced information. Pipl also extracts facts and relevant information from web documents, personal profiles, blogs, news articles, and publications (see PiplBot). Data gathered from all sources is cleansed, merged, and clustered to create more than three billion detailed online identities.”

Thankfully, Cognism also provided me with Pipl’s removal request page:

I received the following email from Pipl and they kindly attached 15 pages of data and/or web locations where they have collected my data. Makes fascinating reading!

Pipl also show on their website which companies they specifically collect data from, so I downloaded that list of no less than 300 sites that are but a sample of the sites they crawl to collect data. I’m definitely on some of those websites, so I will be working my way down the list slowly to make sure there’s no personal data available to these awful crawlers. If you wish you can download the list HERE.


There’s one listed company in particular that I wanted to make sure hasn’t got my details, and that’s ZoomInfo. Thankfully they also have an option to remove yourself:

The next company wasn’t listed on Pipl, but whilst I was searching for my own email address using Google last year (2021), they came up, claiming to have my company information, and indeed they still do!


They are RocketReach, and I believed I had dealt with them back in November 2021, when I claimed my profile, and again in March 2022, but they still have my data and a suggested email address for me based on their algorithm.

This is their claim:
“Our service used by professionals to find other professionals. It is designed to open opportunities for you by connecting you with your next job, career opportunity or customer. To this end, we’ve created a professionally focused search index which you are a part of. This index is generated from publicly sourced data in a similar fashion to search engines like Google and Bing.”

They offer a way to claim and then remove your profile, but I recommend you email them directly as well, as that has now worked where the claim-your-profile route did not.

I sent this direct email to their customer support:

“Hello, I am trying to get all my data removed from Rocketreach’s database.……
I tried to complete the form, but it won’t accept the above link.
Please can you urgently remove all my details, including from Google Search.
I look forward to your urgent attention and action. Thank you.”

To my total amazement I heard back from them quite quickly and they have indeed removed my profile from their database! Email below:

So there you have it. Four companies, Coresignal, Pipl, ZoomInfo and RocketReach, and likely many more like them, just collecting, collecting, and selling our data to willing spammers. I am fascinated by how responsive they all were about removing my details when I contacted them. Of course, there’s nothing illegal about them collecting everyone’s data, is there?


Hopefully it will start to improve things even further in my spam / junk email department.

If you are also interested in stopping unwanted email, I highly recommend removing your company email and details from those sites for starters. It may make a difference to your junk mailbox.


PS. After publishing this I received a further email from Mergent by FTSE Russell, advising me that I needed to jump through a number of extra hoops in order to have my data removed properly. Just shows that they may be experts at collecting and selling my data, but have no clue how to have it removed. Their initial email and their confirmation after I jumped through the hoops successfully.

Originally published on April 8, 2022.