Timnit Gebru of Stanford University and colleagues from several American academic institutions have developed machine learning algorithms that estimate, with unexpected accuracy, various sociological characteristics of an area, such as the average family income and the proportion of people with different levels of education. They described their work in an article published in PNAS.
Collecting such data by traditional methods can take years, but Google Maps offers a shortcut. For their study, the scientists gathered 50 million street-level photos of 200 American cities. They then used a pair of machine learning algorithms to determine the make, model and year of manufacture of 22 million cars in these images. (The algorithm classified the make and model with an accuracy of 52%.)
Other algorithms then used this data to learn which types of vehicles were more common in neighborhoods that, according to census and election data, can be considered more affluent or, say, more conservative. These algorithms turned out to be surprisingly accurate at estimating the average income per family in an area; the shares of white, Black and Asian residents; the proportion of people with different levels of education; and the results of the 2008 vote for Barack Obama or John McCain.
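As a rough illustration of the intermediate step described above, the sketch below turns a list of per-car detections into per-neighborhood shares of each vehicle type. It is not the researchers' code: the neighborhood labels, the body types, the toy detections and the function name `body_type_shares` are all invented for this example.

```python
from collections import Counter

# Hypothetical detections: (neighborhood, body type) pairs.
# These toy values are invented for illustration, not taken from the study.
detections = [
    ("A", "sedan"), ("A", "sedan"), ("A", "sedan"), ("A", "pickup"),
    ("B", "sedan"), ("B", "pickup"), ("B", "pickup"), ("B", "pickup"),
    ("C", "sedan"), ("C", "sedan"), ("C", "pickup"), ("C", "pickup"),
]


def body_type_shares(detections):
    """Return {neighborhood: {body_type: share of that neighborhood's cars}}."""
    counts = {}
    for neighborhood, body_type in detections:
        # One Counter per neighborhood tallies how often each type was seen.
        counts.setdefault(neighborhood, Counter())[body_type] += 1
    return {
        neighborhood: {
            body_type: count / sum(counter.values())
            for body_type, count in counter.items()
        }
        for neighborhood, counter in counts.items()
    }


shares = body_type_shares(detections)
for neighborhood, type_shares in sorted(shares.items()):
    print(neighborhood, type_shares)
```

Shares like these could then serve as inputs to whatever model is fitted against census or election figures; the article does not say which models the team used.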
Comparing the vehicle data with actual demographic data also revealed some interesting patterns. For example, 88% of precincts where sedans outnumbered pickup trucks voted for Obama, while 82% of those where pickups were in the majority voted for McCain.
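The sedan-versus-pickup pattern amounts to a one-line rule of thumb, which can be sketched as follows. The function name `predict_2008_lean` and the handling of ties are assumptions made for this example; the article only reports how often the pattern held.

```python
def predict_2008_lean(sedans: int, pickups: int) -> str:
    """Guess an area's 2008 lean from counts of sedans and pickup trucks.

    Encodes only the pattern reported in the article: more sedans than
    pickups pointed to Obama, more pickups than sedans to McCain.
    """
    if sedans > pickups:
        return "Obama"
    if pickups > sedans:
        return "McCain"
    # The article does not say what happens on a tie; this label is an assumption.
    return "unclear"


print(predict_2008_lean(sedans=120, pickups=80))
print(predict_2008_lean(sedans=40, pickups=95))
```

By the article's figures, such a rule would have been right for 88% of the sedan-majority areas and 82% of the pickup-majority ones, so it is a heuristic rather than a reliable classifier.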
The researchers noted that cameras on self-driving cars could in the future make data collection easier and more frequent, giving policymakers a near real-time demographic picture that should help them better understand the supply of labor and housing and allocate resources for building and maintaining roads and schools.
