Researchers have uncovered yet another leaked database containing a vast set of personal data. The discovery was made by Bob Diachenko, a researcher at Security Discovery.
In a blog post, Diachenko says the data sat in an unsecured MongoDB database belonging to the email validation service provider Verifications.io.
To cross-check the database, Diachenko teamed up with Troy Hunt of Have I Been Pwned (HIBP) to determine whether the leaked data was an entirely new, unique set.
He concluded that this leak was not a compilation of previous breaches, unlike some recent leaks reported by Hunt.
The database, 150 GB in size, is said to contain 808,539,939 email records combined with additional information.
The information attached to the emails includes phone numbers, dates of birth, physical addresses, gender, IP addresses and employer details.
Diachenko has stated in his blog post that not all the emails contain detailed personally identifiable information, but unfortunately, a majority do.
More Data May Have Been Leaked
However, according to a report from cybersecurity firm DynaRisk, there are four leaked databases rather than one, as previously reported.
DynaRisk CEO Andrew Martin stated that the initial discovery was made from a database called mainEmailDatabase, which contained three folders: Emailrecords, emailWithPhone and businessLeads. These held 798,171,891 records, 4,150,600 records and 6,217,358 records respectively.
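As a quick arithmetic check, the three folder counts quoted above can be tallied in a few lines of Python. The variable names are illustrative only and do not come from DynaRisk's report. Note that the counts as quoted sum to 808,539,849, which differs by 90 from the 808,539,939 total cited earlier.

```python
# Record counts per folder, as quoted in the paragraph above.
folder_counts = {
    "Emailrecords": 798_171_891,
    "emailWithPhone": 4_150_600,
    "businessLeads": 6_217_358,
}

# Total number of email records reported for the database.
reported_total = 808_539_939

computed_total = sum(folder_counts.values())
gap = reported_total - computed_total

print(f"computed total: {computed_total:,}")
print(f"reported total: {reported_total:,}")
print(f"difference:     {gap:,}")
```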
Martin further stated that DynaRisk's discovery increased the number of exposed emails to 2 billion, while the size of the databases went up to 196 GB.
Their analysis of the four databases showed that they were all hosted on the same server somewhere in Miami.
However, after further investigation, DynaRisk updated its report to state that the combined number of leaked emails is 982,864,972, not 2 billion as previously reported.
The firm reported that after cleaning the combined data from all four databases, it found an additional 191 million records.
Regrettably, this meant that further information was exposed, including Facebook, LinkedIn and Instagram account details.
Additionally, mortgage data, including interest rates and credit scores, was also leaked.
Fortunately, the data doesn't contain any user passwords, credit card numbers or social security numbers; however, it does include passwords that Verifications.io used in its own infrastructure.
Findings from Researchers
Before this event, many people weren't aware of Verifications.io and the crucial role it plays in the marketing industry.
Diachenko teamed up with the owner of NightLion Security, Vinny Troia, to investigate further how Verifications.io performed its validations.
In his blog, Diachenko states that the company verifies email addresses by dispatching messages (effectively spamming).
An address is treated as valid if the message is confirmed as delivered, and as invalid if it bounces back.
Most companies outsource this process to keep their own infrastructure from being blacklisted by spam filters.
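The bounce-based check described above can be sketched roughly as follows. This is a minimal illustration in Python, not Verifications.io's actual code: the `Outcome` and `ProbeResult` types and the `classify` function are hypothetical names, the "no response" case is an added assumption, and the delivery outcomes are supplied by hand rather than gathered by sending real mail.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    """Hypothetical result of sending one test message to an address."""
    DELIVERED = "delivered"      # delivery was confirmed
    BOUNCED = "bounced"          # the message bounced back
    NO_RESPONSE = "no_response"  # assumption: neither signal was received


@dataclass(frozen=True)
class ProbeResult:
    address: str
    outcome: Outcome


def classify(results):
    """Split probed addresses into valid, invalid and undetermined lists."""
    valid, invalid, undetermined = [], [], []
    for result in results:
        if result.outcome is Outcome.DELIVERED:
            valid.append(result.address)
        elif result.outcome is Outcome.BOUNCED:
            invalid.append(result.address)
        else:
            undetermined.append(result.address)
    return valid, invalid, undetermined


# Hand-written sample outcomes; no mail is actually sent.
sample = [
    ProbeResult("alice@example.com", Outcome.DELIVERED),
    ProbeResult("nobody@example.com", Outcome.BOUNCED),
    ProbeResult("carol@example.com", Outcome.NO_RESPONSE),
]

valid, invalid, undetermined = classify(sample)
print("valid:", valid)
print("invalid:", invalid)
print("undetermined:", undetermined)
```

Keeping a separate undetermined bucket is a design choice of this sketch; the article only describes the delivered and bounced cases.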
The leaked database also exposes Verifications.io's internal tools, such as SMTP servers, email, spam traps, keywords to avoid and IP addresses to blacklist.
Diachenko further stated that he contacted the company about the leak. The company replied, claiming that the exposed data was public.
Since then, the company has removed the data and taken its website offline (though an archived version exists).
This has led to speculation that the exposed data may not have been public. Moreover, much remains unknown about the company and the leaked databases.
Researchers have been unable to contact the company since it took down its website.
Furthermore, tracking the company is tricky because its whereabouts are unclear. Some evidence suggests that the company is based in Boca Raton, Florida, while its assets are registered in California and Delaware.
Commenting on the leak, experts have stated that the databases represent a treasure trove for spammers and phishers, especially those operating in dark web hacking communities. However, there is so far no evidence that cybercriminals have obtained the data.
This data is now about 92% loaded into @haveibeenpwned and will go live in about 5 hours or so. If you're subscribed to notifications, chances are you're going to be getting an email, from me https://t.co/OfwPk6L9x7 https://t.co/b5qdZ7OjYz
- Troy Hunt (@troyhunt) March 8, 2019
This leads to the key question for people affected by the breach: how to mitigate the risks.
Troy Hunt says the data has now been loaded into Have I Been Pwned, so users can check whether their information was affected by the breach.
Staying alert to phishing and spam will limit your exposure to the risks.
You can also change your passwords and add an extra layer of protection such as two-factor authentication.
The Overall Data Industry
All in all, heightened vigilance is warranted since throughout 2018 the number of data breaches increased by 424 percent compared to the previous year.
The figures show that there were 12,449 verified incidents in 2018.
At the same time, the number of data leaks circulating on the dark web increased by 71 percent.
Of 14.9 billion identity records, around 3.6 billion were authentic and not duplicates.
All this is attributed to the big data industry and the way personal information is bought and sold between marketers and tech giants like Facebook.
Meanwhile, the GDPR hasn't been able to prevent this, even though it requires companies to obtain users' permission before exchanging information with third parties.
However, Verifications.io is not expected to adhere to these regulations since it offers data cleaning services.
In the end, consumers have little power over who uses their data or where it ends up.