TheWebConf 2022

Linking Streets in OpenStreetMap to Persons in Wikidata

Paper as PDF | Code & data on GitHub
Daria Gurtovoy
Universität Bonn
Germany
 
Simon Gottschalk
L3S Research Center, Leibniz Universität Hannover
Germany
gottschalk@L3S.de | Website

Motivation

Streets are often named after famous or distinguished individuals who may have a direct connection to the specific location. While Wikidata has information about persons, OpenStreetMap has information about streets.

➔ Can we connect OpenStreetMap and Wikidata by linking streets in OpenStreetMap to persons in Wikidata?

Example

The Wilhelmstraße in Berlin could potentially be named after several different Wilhelms. Because of his popularity and because he was born in Berlin, we assume that the Wilhelmstraße was named after Friedrich Wilhelm I.

We can confirm our decision by checking Wilhelmstraße on Wikidata and its "named after" property or by checking Wilhelmstraße on OpenStreetMap and its "name:etymology:wikidata" key. However, these properties are rarely used and Wikidata does not cover all streets in OpenStreetMap.

Problem Statement

Given a street s and a set of persons P, create a street-to-person mapping function f: s ↦ p that identifies the person p ∈ P after whom the street is named.

Approach

Our method StreetToPerson is based on the following pipeline:

Overview of StreetToPerson. Yellow boxes show example values.

Knowledge Graph Preprocessing

We use four datasets created from Wikidata:

Friedrich Wilhelm I. in the person occupation index and the person location index.

Street Name Truncation

We remove affixes (i.e., prefixes and suffixes) from the street name that are not related to the names of persons (e.g., "street", "road", and "avenue"). Our list of suffixes is available on GitHub.
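The truncation step can be illustrated with a minimal sketch. The affix lists below are hypothetical examples chosen for German street names, not the list published on GitHub, and the function name `truncate_street_name` is an assumption made for this illustration.

```python
# Hypothetical affix lists for illustration only; the actual list is on GitHub.
SUFFIXES = ("straße", "strasse", "str.", "weg", "allee", "platz", "gasse")
PREFIXES = ("am ", "an der ", "zum ", "zur ")


def truncate_street_name(name: str) -> str:
    """Strip at most one known prefix and one known suffix from a street name."""
    result = name.strip()

    lowered = result.lower()
    for prefix in PREFIXES:
        if lowered.startswith(prefix):
            result = result[len(prefix):]
            break

    lowered = result.lower()
    for suffix in SUFFIXES:
        # Keep names that consist of the affix alone (e.g. "Weg").
        if lowered.endswith(suffix) and len(result) > len(suffix):
            result = result[: -len(suffix)]
            break

    # Remove separators left over after cutting, e.g. "Konrad-Adenauer-".
    return result.strip(" -")


print(truncate_street_name("Wilhelmstraße"))          # Wilhelm
print(truncate_street_name("Konrad-Adenauer-Allee"))  # Konrad-Adenauer
```

The remaining string ("Wilhelm") is what would be passed on to candidate retrieval.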

Candidate Retrieval

We query the person index with the truncated street name to retrieve a set of candidates.
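As a rough sketch of this lookup, the person index can be thought of as an inverted index from name tokens to persons. The index layout, the function names, and the person identifiers `p1` to `p3` are assumptions made for illustration; they are placeholders, not Wikidata identifiers, and the sketch does not claim to match the index used by StreetToPerson.

```python
from collections import defaultdict


def build_person_index(persons):
    """Map every lowercased name token to the IDs of persons carrying it."""
    index = defaultdict(set)
    for person_id, label in persons.items():
        for token in label.lower().replace("-", " ").split():
            index[token].add(person_id)
    return index


def retrieve_candidates(index, truncated_name):
    """Return persons whose name contains every token of the truncated street name."""
    tokens = truncated_name.lower().replace("-", " ").split()
    if not tokens:
        return set()
    candidate_sets = [index.get(token, set()) for token in tokens]
    return set.intersection(*candidate_sets)


# Placeholder persons with made-up IDs.
persons = {
    "p1": "Friedrich Wilhelm I",
    "p2": "Wilhelm II",
    "p3": "Konrad Adenauer",
}
index = build_person_index(persons)

print(sorted(retrieve_candidates(index, "Wilhelm")))          # ['p1', 'p2']
print(sorted(retrieve_candidates(index, "Konrad-Adenauer")))  # ['p3']
```

A short name such as "Wilhelm" yields several candidates, which is why a subsequent classification step is needed to pick the right person.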

Feature Extraction

We extract 30 features for a street-to-person pair (s,p):

Example of the spatial dependencies between the street "Wilhelmstraße" and the city Berlin. Arrows denote "located in" relations (e.g., Berlin is located in Germany).
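To make the spatial dependencies concrete, the following sketch derives two toy features from "located in" relations. The feature names `shares_location` and `hops_to_shared`, the relation dictionary, and the use of a single person location are hypothetical; they only illustrate the kind of signal such relations can provide and are not taken from the 30 features of StreetToPerson.

```python
# Toy "located in" relations (child -> parent), mirroring the Berlin example.
LOCATED_IN = {
    "Wilhelmstraße": "Berlin",
    "Berlin": "Germany",
    "Hannover": "Germany",
}


def located_in_chain(place, relations):
    """Follow "located in" arrows upwards and return the visited places in order."""
    chain = []
    seen = set()
    current = relations.get(place)
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = relations.get(current)
    return chain


def spatial_features(street, person_place, relations):
    """Check whether a person's location lies on the street's "located in" chain."""
    street_chain = located_in_chain(street, relations)
    person_chain = [person_place] + located_in_chain(person_place, relations)
    shared = [place for place in street_chain if place in person_chain]
    return {
        "shares_location": bool(shared),
        # Number of "located in" hops from the street to the closest shared place.
        "hops_to_shared": street_chain.index(shared[0]) + 1 if shared else -1,
    }


print(spatial_features("Wilhelmstraße", "Berlin", LOCATED_IN))
# {'shares_location': True, 'hops_to_shared': 1}
print(spatial_features("Wilhelmstraße", "Hannover", LOCATED_IN))
# {'shares_location': True, 'hops_to_shared': 2}
```

A person born in Berlin shares a closer location with the street than a person born in Hannover, who only shares the country.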

Street-to-person Classifier

Using the 30 features, we train a random forest model that classifies a street-to-person pair as positive or negative.
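A minimal sketch of such a classifier, assuming scikit-learn's `RandomForestClassifier`. The two features and all values below are made up for illustration (the actual model uses 30 features), and the hyperparameters are arbitrary choices, not those of StreetToPerson.

```python
from sklearn.ensemble import RandomForestClassifier

# Made-up feature vectors: [name_similarity, shares_location].
X_train = [
    [0.90, 1], [1.00, 1], [0.80, 1], [0.95, 1],  # positive street-to-person pairs
    [0.20, 0], [0.10, 0], [0.30, 0], [0.40, 0],  # negative street-to-person pairs
]
y_train = [1, 1, 1, 1, 0, 0, 0, 0]

# Fixed random_state so that repeated runs give the same forest.
clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X_train, y_train)

# Classify two unseen pairs: one resembling the positives, one the negatives.
predictions = clf.predict([[0.85, 1], [0.15, 0]])
print(list(predictions))
```

For a street with several candidates, each (s, p) pair would be classified separately.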

Evaluation

Using Wikidata's "named after" property, we extract 4,799 pairs of German streets and persons they were named after. We use these pairs together with negative examples as training and test datasets.

Baselines

Evaluation of the Classification on Wikidata

StreetToPerson clearly outperforms the baselines, achieving a precision of 0.95 and a recall of 0.91.

Method                     Precision  Recall  F1 Score
TagMe                      0.49       0.45    0.47
PopRank                    0.69       0.66    0.67
RelRank (all entities)     0.08       0.08    0.08
RelRank (person entities)  0.35       0.11    0.17
StreetToPerson             0.95       0.91    0.93
Evaluation of the classification for StreetToPerson and the selected baselines using 10-fold cross validation.
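The F1 scores follow from precision and recall as their harmonic mean, F1 = 2PR / (P + R). The short check below recomputes them from the rounded precision and recall values in the table; the helper name `f1_score` is our own.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) as given in the table.
reported = {
    "TagMe": (0.49, 0.45, 0.47),
    "PopRank": (0.69, 0.66, 0.67),
    "RelRank (all entities)": (0.08, 0.08, 0.08),
    "RelRank (person entities)": (0.35, 0.11, 0.17),
    "StreetToPerson": (0.95, 0.91, 0.93),
}

recomputed = {
    name: round(f1_score(p, r), 2) for name, (p, r, _) in reported.items()
}
for name, value in recomputed.items():
    print(f"{name}: {value:.2f}")
```

For StreetToPerson this gives 2 · 0.95 · 0.91 / 1.86 ≈ 0.93, in line with the reported value.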

Application of StreetToPerson on OpenStreetMap

We apply StreetToPerson to German streets in OpenStreetMap. For 669,304 streets, we find at least one candidate person. For 183,022 of those streets, one person is classified positively.

Number of                       Bremen  NRW        Germany
Streets                         6,733   219,768    1,321,464
Streets with candidate persons  2,504   110,968    669,304
Candidate persons               47,659  2,675,761  16,165,454
Street-to-person pairs          896     28,857     183,022

Bremen and NRW (Nordrhein-Westfalen) are German states. "Streets with candidate persons" counts the streets for which we find at least one candidate person in the person index, "Candidate persons" is the total number of candidate persons found for the streets, and "Street-to-person pairs" is the total number of street-to-person pairs returned by the classifier.
Number of street-to-person relations identified for OSM streets in Germany and two of its states.

Some streets in OpenStreetMap denote the person they were named after using the "name:etymology:wikidata" key. We use these streets for estimating precision and recall of StreetToPerson on OpenStreetMap.

StreetToPerson achieves a precision of more than 0.9 and a recall of more than 0.6 on OpenStreetMap.

Metric     Bremen  NRW
Precision  0.94    0.90
Recall     0.64    0.61
F1 Score   0.76    0.73
Evaluation of StreetToPerson on the OSM ground truth in two German states.
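How precision and recall can be estimated from such a partial ground truth is sketched below. The sketch assumes that predictions for streets without a "name:etymology:wikidata" value are ignored when computing precision, since they can be neither confirmed nor refuted; this is our assumption, not a documented detail of the evaluation. All street and person identifiers are made up.

```python
def evaluate(predicted, ground_truth):
    """Estimate precision and recall from dicts mapping street ID -> person ID."""
    true_positives = sum(
        1 for street, person in predicted.items()
        if ground_truth.get(street) == person
    )
    # Only predictions for streets with a known etymology can be judged.
    judged_predictions = sum(1 for street in predicted if street in ground_truth)

    precision = true_positives / judged_predictions if judged_predictions else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Made-up identifiers for illustration.
ground_truth = {"s1": "pA", "s2": "pB", "s3": "pC", "s4": "pD"}
predicted = {"s1": "pA", "s2": "pX", "s3": "pC", "s9": "pZ"}

precision, recall = evaluate(predicted, ground_truth)
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=0.67 recall=0.50
```

In this toy case, two of the three judged predictions are correct, and two of the four ground-truth streets are recovered.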

Citation

@inproceedings{gottschalk2018eventkgtl,
   title={{Linking Streets in OpenStreetMap to Persons in Wikidata}},
   author={Gurtovoy, Daria and Gottschalk, Simon},
   year={2022},
   booktitle={Proceedings of The Web Conference}
}

Acknowledgements

This work was partially funded by the Federal Ministry of Education and Research (BMBF), Germany under "Simple-ML" (01IS18054) and the DFG, German Research Foundation, under "WorldKG" (424985896).