Location Intelligence for Private Equity Due Diligences

Table of Contents:

  • Overview

  • Problem Statement

  • Methodology

    • Data Preparation

    • Defining Urbanicity of Postcodes

    • Target Store Catchment Analysis

    • Postcode Catchment Analysis

    • Attractiveness Scoring

    • Output Preparation

  • Key Learnings

    • Technical Skills

    • Soft Skills & Reflections

  • Keywords

Overview

Location intelligence (LI) is frequently used in private equity due diligences to assess market dynamics, the competitive landscape, and expansion opportunities using geospatial data. Given the recurring nature of these analyses, a Streamlit-based interface was developed to accelerate delivery by standardizing common workflows. The interface includes functionality for geocoding, shapefile fetching, spatial enrichment with POIs and demographics, catchment generation, and attractiveness scoring - reducing what used to take days of manual coding down to a few hours per client.

In this write-up, we highlight one example project: a whitespace and attractiveness analysis for a pharmacy chain in Italy. The analysis evaluated postcode-level attractiveness based on demographics, competitor density, and proximity to points of interest (POIs) such as grocery stores and commercial centers.

Problem Statement

The client was looking to identify high-potential areas for future expansion. The challenge was to create a robust attractiveness scoring framework that could distinguish between postcodes based on demand potential, competition, and ecosystem quality. This required integrating various data sources, aligning spatial definitions, and producing interpretable outputs for decision-makers.

Methodology

1. Data Preparation

The analysis began with collecting and cleaning multiple datasets, including postcode boundaries, pharmacy locations (both target and competitor), and various points of interest (POIs) such as commercial centers and grocery stores. All address-level datasets were geocoded to generate latitude and longitude coordinates, enabling spatial analysis across consistent coordinate systems.

2. Defining Urbanicity of Postcodes

Classifying Municipalities by Population:

Each municipality in Italy was classified as Urban, Suburban, or Rural based on total population thresholds. This classification used:

  • Urban: > 50,000

  • Suburban: 5,000 - 50,000

  • Rural: < 5,000
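
The thresholds above translate into a simple lookup; a minimal sketch (the function name and the handling of the 50,000 boundary are illustrative, not taken from the project code):

```python
def classify_municipality(population: int) -> str:
    """Map a municipality's total population to an urbanicity class."""
    if population > 50_000:
        return "Urban"
    if population >= 5_000:
        return "Suburban"
    return "Rural"

print(classify_municipality(120_000))  # Urban
print(classify_municipality(12_000))   # Suburban
print(classify_municipality(800))      # Rural
```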

Overlaying Postcodes and Municipalities:

Postcode geometries were overlaid with municipality boundaries to determine how urbanicity classifications map to postcode areas. To ensure accurate area calculations, both datasets were re-projected into an appropriate planar CRS.
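
As a minimal illustration of the overlap logic, assuming Shapely is available and using toy rectangles in place of real boundaries (coordinates assume a meter-based planar CRS, as in the re-projection step above):

```python
from shapely.geometry import Polygon

# Toy planar geometries (projected CRS, units = meters)
postcode = Polygon([(0, 0), (100, 0), (100, 100), (0, 100)])
municipalities = {
    "A": Polygon([(0, 0), (60, 0), (60, 100), (0, 100)]),      # covers 60%
    "B": Polygon([(60, 0), (100, 0), (100, 100), (60, 100)]),  # covers 40%
}

# Share of the postcode's area covered by each municipality
shares = {
    name: postcode.intersection(geom).area / postcode.area
    for name, geom in municipalities.items()
}
print(shares)
```

In the project this was done at scale with GeoPandas overlays rather than pairwise intersections.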

Assigning Urbanicity to Postcodes:

To avoid noise from marginal overlaps, any municipality intersecting less than 3% of a postcode’s area was excluded. For each postcode, the mode urbanicity classification was used. In case of a tie, the more urban classification was selected by default (e.g., Urban over Suburban).
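
The 3% exclusion and the tie-breaking rule can be sketched in plain Python (the function name, input format, and rank ordering are illustrative assumptions):

```python
from collections import Counter

# Lower rank = more urban; used to break ties in favor of the more urban class
URBAN_RANK = {"Urban": 0, "Suburban": 1, "Rural": 2}

def postcode_urbanicity(overlaps: list[tuple[str, float]]) -> str:
    """overlaps: (urbanicity, share of postcode area) per overlapping municipality."""
    # Drop marginal overlaps below the 3% area threshold
    kept = [cls for cls, share in overlaps if share >= 0.03]
    counts = Counter(kept)
    top = max(counts.values())
    tied = [cls for cls, n in counts.items() if n == top]
    # On a tie, the more urban classification wins
    return min(tied, key=URBAN_RANK.get)

# Rural overlap is below 3%, leaving an Urban/Suburban tie -> Urban
print(postcode_urbanicity([("Urban", 0.4), ("Suburban", 0.5), ("Rural", 0.02)]))
```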

3. Target Store Catchment Analysis

Assigning Urbanicity to Stores:

Each target store was tagged with the urbanicity of its corresponding postcode. In cases where a store’s postcode was not included in the shapefile fetched from ESRI, the urbanicity was assigned manually based on known location details.

Determining Catchment Size:

Catchment sizes were dynamically assigned based on urbanicity:

  • Urban: 300 meters

  • Suburban: 500 meters

  • Rural: 1,000 meters
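
Assuming a projected CRS with meter units, the radius-based catchments can be sketched with Shapely buffers (the helper name is illustrative):

```python
from shapely.geometry import Point

# Urbanicity-specific catchment radii in meters (from the table above)
RADIUS_M = {"Urban": 300, "Suburban": 500, "Rural": 1_000}

def store_catchment(x: float, y: float, urbanicity: str):
    """Circular buffer around a store location, in a meter-based projected CRS."""
    return Point(x, y).buffer(RADIUS_M[urbanicity])

catchment = store_catchment(0, 0, "Rural")
print(catchment.area)  # close to pi * 1000**2, slightly less due to segmentation
```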

Enriching Catchments with POIs:

Using spatial joins, each catchment was enriched with counts of nearby POIs - categorized by type (grocery stores, shopping centers, and competitor pharmacies) - to quantify the local ecosystem around each store.
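
At scale this enrichment used GeoPandas spatial joins; a minimal Shapely sketch of the underlying containment logic, with made-up POI coordinates:

```python
from shapely.geometry import Point

# Hypothetical POIs tagged by category, coordinates in a meter-based CRS
pois = [
    ("grocery", Point(100, 50)),
    ("grocery", Point(900, 900)),            # outside the catchment
    ("competitor_pharmacy", Point(-200, 0)),
]
catchment = Point(0, 0).buffer(300)  # Urban-sized catchment

# Count POIs of each category falling inside the catchment
counts = {}
for category, pt in pois:
    if catchment.contains(pt):
        counts[category] = counts.get(category, 0) + 1
print(counts)
```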

4. Postcode Catchment Analysis

Generating Catchments for All Postcodes:

Postcode centroids were extracted and used to create new catchments based on urbanicity-specific logic (radius, walking time, or driving time). This enabled consistent enrichment across all areas, not just existing store locations.
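
A toy Shapely example of the centroid-plus-buffer step for a radius-based postcode catchment (geometry and radius are illustrative; the walking- and drive-time variants used external APIs instead of a buffer):

```python
from shapely.geometry import Polygon

# Toy postcode polygon in a projected CRS (meters)
postcode = Polygon([(0, 0), (1_000, 0), (1_000, 1_000), (0, 1_000)])

# Use the centroid as the anchor point for the postcode's catchment
centroid = postcode.centroid
print(centroid.x, centroid.y)  # 500.0 500.0

# Suburban postcode -> 500 m radius catchment around the centroid
catchment = centroid.buffer(500)
```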

Feature Enrichment:

Each catchment area was enriched with both spatial and demographic data sourced from ESRI, including:

  • Total population and population density

  • Unemployment counts and percentages

  • Education attainment levels

  • Marital status distributions

  • Household spending metrics (e.g., maintenance, personal care, purchasing power)

  • POI metrics such as counts of retail parks, grocery stores, and competitors

  • Aggregated metrics like total shops, average shops, and competitive intensity

Correlation Review:

A correlation matrix was generated to identify multicollinearity between variables and reduce feature redundancy before scoring.
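
One common way to implement this with pandas is to scan the upper triangle of the absolute correlation matrix and flag one feature from each highly correlated pair; a sketch with synthetic data and an illustrative 0.9 threshold (the project's actual threshold is not documented here):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
pop = rng.normal(10_000, 2_000, 200)
df = pd.DataFrame({
    "population": pop,
    "pop_density": pop / 5 + rng.normal(0, 50, 200),  # nearly collinear with population
    "grocery_count": rng.poisson(3, 200),
})

corr = df.corr().abs()
# Keep only the upper triangle so each pair is checked once
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [col for col in upper.columns if (upper[col] > 0.9).any()]
print(to_drop)  # pop_density is flagged as redundant with population
```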

5. Attractiveness Scoring

Quantile binning was applied to standardize feature scales. In cases of skewed distributions, thresholds were set manually to ensure meaningful separation across score buckets. Features were then weighted based on stakeholder input and business logic, and aggregated into a composite attractiveness score per postcode.
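
A minimal pandas sketch of the binning-and-weighting step, using made-up values and illustrative weights (the real weights came from stakeholder input):

```python
import pandas as pd

df = pd.DataFrame({
    "population": [1_000, 5_000, 20_000, 80_000, 150_000],
    "competitor_count": [0, 2, 5, 9, 14],
})

# Quantile-bin each feature into 1-5 scores (5 = most attractive)
df["pop_score"] = pd.qcut(df["population"], 5, labels=[1, 2, 3, 4, 5]).astype(int)
# More competitors -> lower score, so reverse the labels
df["comp_score"] = pd.qcut(df["competitor_count"], 5, labels=[5, 4, 3, 2, 1]).astype(int)

# Weighted composite attractiveness score per row (weights are illustrative)
weights = {"pop_score": 0.6, "comp_score": 0.4}
df["attractiveness"] = sum(df[c] * w for c, w in weights.items())
print(df[["pop_score", "comp_score", "attractiveness"]])
```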

6. Output Preparation

The final dataset, containing postcode geometries, enriched attributes, and attractiveness scores, was exported in geospatial format and visualized using Tableau for client-facing delivery.

Key Learnings

Technical Skills:

  • Developed a deep understanding of Coordinate Reference Systems (CRS), including when and how to convert between projections to ensure spatial accuracy.

  • Learned how to geocode raw address data using ESRI or Google Maps APIs and integrate the output into geospatial dataframes.

  • Gained fluency in using GeoPandas for spatial joins, overlays, filtering, and merging - mirroring SQL-style logic for geographic data.

  • Adapted to various visualization tools based on client needs, including Folium for maps, Matplotlib and Seaborn for plots, and Tableau for final interactive dashboards.

  • Became proficient in using external data providers like ESRI, Google Maps, and OpenStreetMap to fetch shapes, enrich data, and build catchments - learning the tradeoffs between them in terms of cost, API behaviour, and data availability.

  • Created catchment areas using both static radius buffers and dynamic driving/walking times depending on urbanicity, leveraging APIs for accurate estimations.

Soft Skills & Reflections:

  • Always validate critical steps with examples. For any transformation involving spatial joins or overlays, it’s essential to manually test with a few known examples to ensure accuracy - this was particularly evident in the geocoding stage, where small mismatches can affect downstream results.

  • Design scoring systems collaboratively. Building the attractiveness score meant jointly shortlisting variables, defining thresholds, assigning weights, and ensuring interpretability with stakeholders.

  • Be proactive in defining and aligning key concepts. Terms like “urbanicity” or “catchment area” can be interpreted differently by different stakeholders. Establishing these definitions early prevents misalignment and costly rework.

  • Design with ambiguity in mind. Clients don’t always know exactly what they want at the start. Framing the problem space, suggesting methodological approaches, and clarifying assumptions are key to guiding the project in the right direction.

  • Balance technical work with communication. Clear documentation, structured deliverables, and consistent updates helped bridge the gap between data work and strategic decision-making - and protected the team from unnecessary stress or scope creep.

Keywords

Tools & Technologies: Python, GeoPandas, NumPy, Streamlit, Folium, Matplotlib, Seaborn, Tableau, ESRI, Google Maps API, OpenStreetMap (OSM), Docker, Git, GitHub, VS Code

Tags: Geospatial Analysis, Coordinate Reference Systems (CRS), Spatial Joins, Geocoding, Drive-Time Buffers, Demographic Enrichment, Quantile Binning, Feature Engineering, Correlation Analysis, Whitespace & Attractiveness Analysis, Retail Expansion Strategy, Private Equity Due Diligence, Location Intelligence
