Development Platform Builder.ai Exposed Over 1.2 TB of Data Containing More Than 3 Million Records

Last updated: December 19, 2024

Cybersecurity researcher Jeremiah Fowler discovered and reported to Website Planet a non-password-protected database that contained more than 3 million records belonging to Builder.ai, a London-based company offering AI software and app development solutions that require no technical knowledge or coding skills.

The publicly exposed database was not password-protected or encrypted. It contained 3,077,542 records totaling 1.29 TB. In a limited sampling of the exposed documents, I saw customer cost proposals, NDAs, invoices, tax documents, email correspondence screenshots, internal image files, and much more. Among the most concerning files were two documents containing access and configuration details for two separate cloud storage databases, including secret access keys. As an ethical security researcher, I never bypass authentication credentials, but it is hypothetically possible that those access keys could have revealed additional potentially sensitive data if they were to fall into the wrong hands.

The database contained 337,434 invoices (totaling 18 GB) and 32,810 files (4 GB) labeled Master service agreements. The latter included NDAs that contained names, emails, IP addresses, project cost summaries, and other project details.

The name of the database as well as the documents inside it indicated that the records belonged to Engineer.ai and Builder.ai. Further research indicates both names correspond to the same company, which went through a rebranding process in 2019. Builder.ai is a London-based technology company that provides human-assisted AI for building applications. It has multiple offices in the US, Asia, Europe, and the Middle East.

I immediately sent a responsible disclosure notice; however, the database remained publicly accessible for almost a month after my disclosure (from October 28 until November 27). In a follow-up email informing Builder that the files were still exposed, I was told “unfortunately it’s taking longer than we’d like due to some complexities with dependent systems”. It is not known if the database was owned and managed by Builder directly or via a third-party contractor. It is also not known how long the database was exposed before I discovered it or if anyone else gained access to it. Only an internal forensic audit could identify additional access or potentially suspicious activity.

According to a 2023 press release, Builder announced an investment of over $250 million in Series D funding led by Qatar Investment Authority (QIA). The release also stated that the total amount raised by the company was over $450 million. That year, Builder.ai ranked #3 on Fast Company’s annual list of the World’s Most Innovative Companies in the category of “Artificial Intelligence”.

Screenshot 1: An internal customer profile document that indicates payment information and other transaction details about the customer.

Screenshot 2: How change requests, status updates, and customer communications were stored in the database. These included email addresses, company names, and details of the projects.

Screenshot 3: A tax invoice that included reference number, invoice number, and Builder’s banking information.

Screenshot 4: How customer data was collected and stored inside the database. It is similar to the billing profile, but this also included financial contact details and more.

Screenshot 5: A portion of the document containing the access key ID and the secret access key in plain text.


Builder.ai is promoted as a human-assisted AI platform that allows businesses, schools, and a wide range of organizations to create custom software applications with no coding or technical expertise. Builder uses pre-built templates to simplify the app-creation process for web and mobile applications, which are adapted to the specific needs of each client.

There are numerous potential risks of exposing internal project documents and invoices. These can range from simple invoice fraud to targeted phishing attempts on customers using insider information. The most serious concern would be the database itself being targeted by a malicious competitor or ransomware.

Invoice fraud using internal documents poses serious potential risks because customers already have a relationship of trust with their service provider. Criminals can use insider knowledge and exposed original documents to create convincing fraudulent invoices that funnel funds to the criminal’s account. When criminals have access to sensitive details (like customer contracts, project scopes, or payment terms), they can use this information to impersonate vendors, customers, or employees and send invoices that appear legitimate. With high-value services and large amounts of money regularly paid for app development, it only takes one successful attempt to cause serious financial losses. Internal tax documents or billing records could hypothetically provide enough information to target high-value transactions or recurring payments while filtering out smaller or less valuable accounts.

According to a recent survey, 43% of finance executives admitted they had fallen victim to a deepfake scam; the average company losses were over $1M per year. My advice to organizations is to implement thorough verification processes for invoices and to manually confirm payment requests. This includes using only official communication channels and monitoring any changes in billing details. Companies that provide services should also educate and inform customers about acceptable billing practices and fraud-prevention strategies. I am not saying that Builder.ai, their customers, or partners are at risk of this or any type of fraud. I am only providing a real-world risk scenario to raise awareness and strengthen cyber defense for any organization facing a data exposure of internal documents.
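The advice to monitor changes in billing details can be partly automated. The sketch below is only an illustration: the `BillingDetails` structure, its field names, and the sample values are hypothetical and not taken from any real system. It compares the payment details on an incoming invoice with the record already on file and flags any difference for manual confirmation.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class BillingDetails:
    """Payment details for one vendor (hypothetical structure for illustration)."""
    payee_name: str
    account_number: str
    contact_email: str


def changed_fields(on_file, invoice):
    """Return the names of fields where the invoice differs from the record on file."""
    return [
        f.name
        for f in fields(on_file)
        if getattr(on_file, f.name) != getattr(invoice, f.name)
    ]


def needs_manual_confirmation(on_file, invoice):
    """Any change in billing details should be confirmed through an official channel."""
    return bool(changed_fields(on_file, invoice))


# Example: the invoice carries a different bank account than the one on file.
on_file = BillingDetails("Example Vendor Ltd", "GB00-0000-1111", "billing@vendor.example")
invoice = BillingDetails("Example Vendor Ltd", "GB00-9999-2222", "billing@vendor.example")
if needs_manual_confirmation(on_file, invoice):
    print("Hold payment, changed fields:", changed_fields(on_file, invoice))
```

A check like this does not detect fraud on its own. It only ensures that a person confirms the change with the vendor before any money moves.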

Another concern that organizations should consider is the reliance on complex system dependencies and how they can seriously delay the incident response and remediation process, potentially exacerbating the risk of unauthorized access. When an organization has a data breach or is the victim of a cyber attack, every second counts. Any delay in mitigation could increase the potential risk. When systems are tightly coupled, restricting access to a database can be a recipe for disaster, as it can cause service outages, trigger cascading failures, or make resources unavailable. The benefits of these dependencies are that they allow for automation, real-time data access, and interconnected workflows, which are critical for operations. However, they also complicate security management, as restrictions or changes in one component (like securing a database) can disrupt other dependent systems or services. Organizations should review the level of dependencies they have and put a backup plan in place to ensure that any security incident can be fixed without a catastrophic service disruption.
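One way to review dependencies ahead of an incident is to keep a simple map of which systems rely on which components and to query it before locking something down. The following is a minimal sketch under that assumption. The system names and the `depends_on` mapping are invented for illustration and say nothing about how any real environment is structured.

```python
from collections import deque


def affected_by(component, depends_on):
    """Return every system that directly or indirectly depends on `component`.

    `depends_on` maps each system name to the list of components it relies on.
    """
    # Invert the mapping: for each component, which systems rely on it?
    dependents = {}
    for system, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(system)

    # Breadth-first walk outward from the component being restricted.
    seen = set()
    queue = deque([component])
    while queue:
        current = queue.popleft()
        for system in dependents.get(current, ()):
            if system not in seen:
                seen.add(system)
                queue.append(system)

    seen.discard(component)  # a dependency cycle should not list the component itself
    return seen


# Hypothetical dependency map, for illustration only.
depends_on = {
    "billing-portal": ["document-db"],
    "reporting": ["billing-portal", "auth"],
    "marketing-site": [],
    "auth": [],
}
print(sorted(affected_by("document-db", depends_on)))
```

Knowing this list in advance lets a response team prepare fallbacks for the affected systems, so that securing an exposed database is not delayed by uncertainty about what will break.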

Storing documents and access keys (e.g., Key ID and Secret Access Key) in plain text within the same database could potentially create a critical security vulnerability. In the event of an accidental exposure or unauthorized access to the database, malicious actors could use the keys to access linked systems, cloud storage, or other sensitive resources without additional authentication. Administrative credentials and access keys should never be stored inside any database where they can be identified and exploited. I would recommend encrypting access keys, storing them in a dedicated and secure management system, and segregating them from other sensitive internal or customer data to minimize risk.
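Organizations can also scan their own documents for credentials that have ended up in plain text. The sketch below assumes AWS-style credentials, with key IDs that start with `AKIA` or `ASIA` followed by 16 characters, and 40-character secrets. The report does not name the cloud provider involved, so the patterns are an illustrative assumption, and the sample values are the placeholder keys used in AWS documentation.

```python
import re

# Assumed AWS-style formats; other providers use different patterns.
KEY_ID_PATTERN = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")
# A 40-character token appearing shortly after the word "secret" on the same line.
SECRET_PATTERN = re.compile(
    r"(?i)secret[^\n]{0,40}?([A-Za-z0-9/+=]{40})(?![A-Za-z0-9/+=])"
)


def find_plaintext_keys(text):
    """Return (line_number, kind) pairs for lines that look like they hold credentials.

    The matched values are deliberately not returned, so the report
    itself does not become another copy of the secret.
    """
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if KEY_ID_PATTERN.search(line):
            findings.append((line_number, "access key ID"))
        if SECRET_PATTERN.search(line):
            findings.append((line_number, "secret access key"))
    return findings


sample = """storage configuration
access_key_id = AKIAIOSFODNN7EXAMPLE
secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
region = example-region
"""
for line_number, kind in find_plaintext_keys(sample):
    print(f"line {line_number}: possible {kind}")
```

Pattern matching like this is a heuristic that can miss keys or raise false alarms. It complements, and does not replace, keeping credentials in a dedicated secrets management system.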

My general advice to any organization is to prepare for when, not if, a data incident will occur. The 4 steps below should be the core building blocks of their data protection strategy. It is easy to expand and add additional measures as needed, but implementing and investing in the basics is a good start. The goal is to ensure internal and customer data is secure and that there is a plan in place in the event of a security breach.

Access Controls: Authentication protocols are the first line of defense. There are numerous options for multi-factor authentication to prevent unauthorized access. It may seem inconvenient for you to take additional steps to log in to your own systems, but it is a necessity — the more roadblocks you can put between your data and unauthorized access, the better. Even if the login and password are known, it is far more difficult for criminals to bypass the additional steps.
Encrypt Data: I am still surprised when I discover a data breach and see sensitive information in plain text or unencrypted documents. Encrypting data (both at rest and in transit) can ensure that if that data is exposed, the chances of it being exploited are significantly lower. Companies should take every step available to protect and secure the data they collect and store.
Incident Response Planning: Having an incident response plan in place is critical to resolve security vulnerabilities. Knowing what to do, who will do what, and when they should do it can prevent reactionary chaos or mistakes during the mitigation or recovery process.
Regular Security Audits: Vulnerability scans, penetration tests, and general audits of both internal systems and third-party vendors are important steps to identify issues before they become critical.
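To make the access-control step more concrete, here is a minimal sketch of one common second factor, the time-based one-time password (TOTP) described in RFC 6238, built on the HOTP algorithm from RFC 4226. It uses only the Python standard library. The shared secret shown is the RFC test value, not a real credential, and a production system would normally rely on a vetted authentication library or service rather than code like this.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret, counter, digits=6):
    """HMAC-based one-time password (RFC 4226) using SHA-1."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret, at=None, step=30, digits=6):
    """Time-based one-time password (RFC 6238): HOTP over the current time step."""
    now = time.time() if at is None else at
    return hotp(secret, int(now // step), digits)


def verify(secret, submitted, at=None, step=30, window=1):
    """Accept a code from the current time step or `window` steps either side."""
    now = time.time() if at is None else at
    counter = int(now // step)
    return any(
        hmac.compare_digest(hotp(secret, counter + drift), submitted)
        for drift in range(-window, window + 1)
        if counter + drift >= 0
    )


# Shared secret from the RFC test vectors, used here purely as an example.
secret = b"12345678901234567890"
code = totp(secret)
print("current code accepted:", verify(secret, code))
```

Even if a login and password are known, an attacker would also need the shared secret, or the device that holds it, to produce a valid code for the current time window.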

I imply no wrongdoing by Builder.ai or Engineer.ai, and I do not claim that internal data or customer data was ever at imminent risk. The hypothetical data-risk scenarios I have presented in this report are exclusively for educational purposes and do not reflect any actual compromise of data integrity. They should not be construed as a reflection or assessment of any organization’s specific practices, systems, or security measures. As an ethical security researcher, I do not download the data I discover. I only take a limited number of screenshots solely for verification purposes. I do not conduct any activities beyond identifying the security vulnerability and notifying the relevant parties. I disclaim any and all responsibility for direct and indirect actions that may be taken as a result of this disclosure. I publish my findings to raise awareness of data security and privacy issues. My aim is to encourage organizations to proactively safeguard sensitive information against potential unauthorized access.