Show, don't tell.
What we've built.

No vague promises, just results.[1] Below is a selection of the systems, data pipelines and models we've designed and built.

[1] And some retro ASCII art, because we couldn't help ourselves.

01

High-Dimensional Market Clustering

The challenge

A client needed to find companies that operate the same way as their most successful assets, rather than relying on generic, often inaccurate industry codes.

The build

We built a similarity engine that analyzes actual operational data to map true functional overlap between businesses, ignoring static sector labels.
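To make the idea concrete, here is a minimal sketch of similarity ranking over operational feature vectors. It is an illustration only, not the production engine: the feature vectors, company names, and the choice of cosine similarity are all assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A company with no observable signal cannot be compared meaningfully.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_similar(reference, candidates):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [
        (name, cosine_similarity(reference, vector))
        for name, vector in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical, already-normalised operational features (not real client data).
reference = [0.9, 0.1, 0.7]
candidates = {
    "company_a": [0.8, 0.2, 0.6],
    "company_b": [0.1, 0.9, 0.2],
}
ranking = rank_similar(reference, candidates)
```

Note that the industry code never enters the comparison: two companies rank as similar only if their operational vectors point the same way.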

The outcome

A living, high-precision clustering engine powering sharper business insight and targeted commercial strategy.

Doesn't generalize to

Similarity degrades when companies have sparse observable data in the registry. Works best in well-populated national datasets with rich operational signals.

02

Automated Façade Intelligence

The challenge

Evaluating thousands of potential retail sites required an expensive, error-prone army of field scouts.

The build

We designed a computer vision pipeline that automatically extracts specific building features and storefront data directly from high-resolution street imagery.

The outcome

Field operations scaled with 80% less manual labour and significantly higher data consistency across all commercial properties.

Doesn't generalize to

Confidence drops in areas with limited street-level imagery coverage or outdated Google Street View data. Manual verification is recommended for sites with thin image history.

03

Precision Retail Risk Modeling

The challenge

Retailers were relying on gut feeling and fragmented local data to decide where to open their next stores.

The build

We engineered a quantitative scoring engine that synthesizes hundreds of disparate data streams to generate a single, standardized risk profile for every commercial property in the Netherlands.
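As a rough illustration of how many data streams can collapse into one comparable score, the sketch below standardises each feature to a z-score and sums them with weights. The feature names, values, and weights are hypothetical, and a weighted z-score sum is an assumed stand-in rather than the scoring method actually used.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardise a list of values to zero mean and unit variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature carries no information about relative risk.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def risk_scores(features, weights):
    """Combine per-feature z-scores into one weighted score per property.

    features: {feature_name: [value for each property]}
    weights:  {feature_name: weight}, positive if the feature raises risk
    """
    n = len(next(iter(features.values())))
    scores = [0.0] * n
    for name, values in features.items():
        for i, z in enumerate(zscores(values)):
            scores[i] += weights[name] * z
    return scores


# Hypothetical inputs for three properties (illustrative only).
features = {
    "vacancy_rate": [0.05, 0.20, 0.10],
    "footfall": [1200, 300, 800],
}
weights = {"vacancy_rate": 1.0, "footfall": -1.0}
scores = risk_scores(features, weights)
```

Standardising first is what makes the result comparable across properties: a vacancy rate and a footfall count live on very different scales until both are expressed in standard deviations.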

The outcome

Standardised risk assessment across the Dutch market, enabling data-driven capital allocation and precise site selection.

Doesn't generalize to

Calibrated for the Dutch market. Applying profiles to other countries requires retraining on local property and footfall datasets.

04

Predictive Asset Deployment

The challenge

Security infrastructure was being deployed reactively, often arriving too late to high-value construction sites.

The build

We built a predictive classification model that scans behavioral and environmental data to flag high-risk sites, automating the allocation of security assets before incidents occur.

The outcome

A reactive scouting process transformed into an automated, intelligence-led allocation model with significantly higher conversion rates.

Doesn't generalize to

Accuracy drops when too few historical deployment outcomes are available for calibration. Major urban development can disrupt the behavioral patterns the model was trained on.

05

Multi-Modal Traffic Fusion

The challenge

Media buyers were pricing out-of-home advertising based on static, outdated traffic estimates that were refreshed only periodically.

The build

We built a live data pipeline that merges sparse smartphone pings with high-precision infrared camera imagery to report actual, real-time footfall.
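One simple way to picture the fusion step is a ratio calibration: where a camera and smartphone pings observe the same location, the camera count tells us how many people each ping stands for, and that factor can then scale pings elsewhere. The sketch below assumes this approach purely for illustration; the function names and numbers are hypothetical and the real pipeline may work differently.

```python
def calibration_factor(paired_counts):
    """Estimate people-per-ping from locations observed by both sources.

    paired_counts: list of (ping_count, camera_count) tuples.
    """
    total_pings = sum(pings for pings, _ in paired_counts)
    total_camera = sum(camera for _, camera in paired_counts)
    if total_pings == 0:
        raise ValueError("no pings observed at calibration sites")
    return total_camera / total_pings


def estimate_footfall(ping_count, factor):
    """Scale a sparse ping count up to an estimated pedestrian count."""
    return ping_count * factor


# Hypothetical calibration sites with both pings and camera counts.
paired = [(10, 100), (20, 200), (10, 100)]
factor = calibration_factor(paired)

# A site with pings only: 7 pings scaled by the calibrated factor.
estimate = estimate_footfall(7, factor)
```

A single global factor is the crudest possible choice. The demographic and coverage biases noted under this case study are exactly why one factor per area or time window would be a more realistic design.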

The outcome

Live, hyper-local footfall monitoring powering real-time media value optimisation.

Doesn't generalize to

Smartphone pings systematically underrepresent certain demographic groups. Signal is thinner in car-dependent areas or where smartphone penetration is lower.

06

IRIS: Strategic Location Intelligence

The challenge

Building a unified, durable platform for location intelligence and revenue forecasting.

The build

Our core B2B infrastructure, designed for high-throughput data ingestion and rigorous financial forecasting at global scale.

The outcome

An enterprise-grade platform that translates massive location datasets into actionable revenue outcomes across 25+ markets.

07

Blink: Custom Spatial Embeddings

The challenge

Standard geographic models struggle to quantify the invisible spatial relationships, like transit flow and neighborhood dynamics, that actually drive commercial success.

The build

Blink is a proprietary deep-learning model that translates complex European urban environments into clean, usable data for site portfolio optimization and revenue forecasting.

The outcome

A high-precision modelling engine for applications including QSR revenue forecasting, parcel locker placement, and site portfolio rationalisation.

Doesn't generalize to

This model was trained on high-density European urban transit patterns. Performance drops in highly rural or car-dependent retail zones.

08

Text Anonymisation

The challenge

Municipal governments were sitting on millions of sensitive administrative records that they couldn't legally use for analysis due to GDPR restrictions.

The build

We engineered an automated, high-throughput pipeline that identifies and strips personally identifiable information (PII) from completely unstructured text, safely unlocking the data for research.
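For a feel of what stripping PII from free text involves, here is a deliberately simplified sketch using regular expressions. It only covers email addresses, Dutch-style postcodes, and Dutch-style phone numbers; the patterns and labels are assumptions for the example, and catching names or addresses in practice typically needs entity recognition on top of rules like these.

```python
import re

# Hypothetical, simplified patterns. Postcodes run before phone numbers so
# that each rule sees the text with earlier matches already replaced.
PATTERNS = [
    ("EMAIL", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")),
    ("POSTCODE", re.compile(r"\b[1-9]\d{3}\s?[A-Z]{2}\b")),
    ("PHONE", re.compile(r"(?:\+31|0)[\s-]?\d(?:[\s-]?\d){8}\b")),
]


def redact(text):
    """Replace every match of each pattern with a bracketed label."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[{label}]", text)
    return text


sample = "Mail jan@example.nl or call 030-1234567, postcode 3584 CB."
cleaned = redact(sample)
```

Replacing matches with a label, rather than deleting them, keeps sentences readable for later analysis while removing the identifying value itself.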

The outcome

Data-driven insights unlocked for multiple large municipalities by transforming restricted raw text into secure, privacy-compliant datasets ready for analysis.

Doesn't generalize to

Optimised for Dutch administrative text. Multilingual or highly unstructured documents require additional domain-specific training.

Something like this on your list?
Let's talk.

The Big Data Company B.V.
Princetonlaan 6 · 3584 CB Utrecht, Netherlands
+31 30 899 9477