Purpose of the Selection Bias Module
This module explains selection bias as it applies to residential listings in Johannesburg. Its purpose is to clarify how participation choices determine which residential properties become visible in listing-based datasets and why this visibility should not be interpreted as representative of the broader residential landscape.
Selection as a Participation Filter
Selection bias arises because residential listings enter the dataset only when property owners, managers, or intermediaries choose to publish through formal platforms. This choice acts as a participation filter that determines inclusion or exclusion from observation. Properties that do not pass through this filter remain invisible regardless of their prevalence or occupancy.
Drivers of Participation Decisions
Participation in listing platforms is influenced by factors such as ownership structure, management capacity, tenure arrangement, and alignment with formal brokerage practices. These drivers vary across districts and housing forms, producing systematic differences in which properties are selected into the observable dataset.
Interaction With Housing Form and Management
Multi-unit developments and centrally managed properties are more likely to be selected into listing systems because their marketing and turnover processes are standardized. Individually managed, owner-occupied, or informally arranged residences are selected less consistently. This interaction skews visibility toward certain housing formats.
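The skew described above can be illustrated with a small calculation. The sketch below is only an illustration under stated assumptions: the group names, unit counts, and listing rates are hypothetical and are not drawn from any Johannesburg dataset. It compares each group's share of the residential stock with its expected share of listings once the participation filter is applied.

```python
from dataclasses import dataclass


@dataclass
class HousingGroup:
    name: str
    units: int           # true number of residences (hypothetical figure)
    listing_rate: float  # share published on formal platforms (hypothetical figure)


def expected_listed_shares(groups):
    """Map each group name to (share of all residences, share of expected listings)."""
    total_units = sum(g.units for g in groups)
    listed = {g.name: g.units * g.listing_rate for g in groups}
    total_listed = sum(listed.values())
    return {
        g.name: (g.units / total_units, listed[g.name] / total_listed)
        for g in groups
    }


# All values below are assumptions chosen for illustration only.
groups = [
    HousingGroup("managed_multi_unit", units=2000, listing_rate=0.60),
    HousingGroup("individually_managed", units=5000, listing_rate=0.10),
    HousingGroup("informal_arrangement", units=3000, listing_rate=0.00),
]

shares = expected_listed_shares(groups)
for name, (population_share, listed_share) in shares.items():
    print(f"{name}: population {population_share:.0%}, listings {listed_share:.0%}")
```

Under these assumed figures, the managed multi-unit group holds 20% of residences but about 71% of expected listings, while the informally arranged group holds 30% of residences and does not appear in the listings at all.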
Compounding Effects With Other Biases
Selection bias compounds with rotation bias, spatial overrepresentation, and aggregation effects. Properties that are selected into listings and frequently relisted can dominate visibility, while large portions of the residential landscape remain unobserved. These compounding effects amplify distortion at both district and city scales.
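The compounding of selection with relisting can be sketched numerically. This is an illustration only: the group names, unit counts, listing rates, and relisting frequencies are hypothetical, and multiplying them together is a simplifying assumption rather than a method stated in this module. The function computes each group's share of total listing appearances.

```python
def appearance_shares(groups):
    """groups maps name -> (units, listing_rate, appearances_per_listed_unit).

    Returns each group's share of all listing appearances, assuming the
    three factors combine multiplicatively (a simplification).
    """
    appearances = {
        name: units * listing_rate * repeats
        for name, (units, listing_rate, repeats) in groups.items()
    }
    total = sum(appearances.values())
    return {name: count / total for name, count in appearances.items()}


# All values below are assumptions chosen for illustration only.
groups = {
    "managed_multi_unit": (2000, 0.60, 3.0),    # listed often and relisted often
    "individually_managed": (5000, 0.10, 1.0),  # listed rarely, seldom relisted
    "informal_arrangement": (3000, 0.00, 0.0),  # never enters the dataset
}

visibility = appearance_shares(groups)
for name, share in visibility.items():
    print(f"{name}: {share:.0%} of listing appearances")
```

With these assumed inputs, a group holding 20% of residences accounts for roughly 88% of listing appearances, which is a larger distortion than selection alone would produce with the same listing rates.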
Interpretation Boundaries Created by Selection Bias
Selection bias sets a limit on interpretation: listing data cannot be treated as representative or comprehensive. Observed residential patterns should be read as outcomes of participation behavior rather than as reflections of residential structure, distribution, or scale. Visibility in the dataset must therefore be interpreted within the limits that selection imposes.
