Data collection is the process of gathering and measuring variables of interest in an established, systematic fashion that enables you to answer stated research questions, test hypotheses, and evaluate outcomes. The goal of all data collection is to capture quality evidence that translates into rich data analysis and allows you to build convincing and credible answers to the questions that have been posed.
However, much research and policy is based on already available information that does not necessarily fit the issue or question. For instance, the average income of residents in a district is estimated from the average market value of its homes. Suppose there are some big villas in a district, but the houses are rented to migrant workers who sleep twelve to a room. In reality, the district has poor inhabitants and impoverishment is lurking. A better measure of prosperity in a district could be the number of call shops or Travelex foreign exchange bureaus, or the percentage of windows with bed sheets as curtains or with no curtains at all.
Perhaps we could use data from the local supermarkets to assess a district's risk of impoverishment, looking not only at sales volume but also at the kinds of products consumers buy. Income groups differ in the food they buy, and supermarkets may offer more or less product variety depending on the population they serve.
This could lead to a Reverse Open Data movement, in which businesses make their data available to town halls for designing social and economic policies. Open Data is a movement built on the idea that government data should be open, so that third parties can leverage its potential by developing applications and services that address public and private demands. This information exchange could be made two-sided.
Perhaps we could measure the level of civilization of a nation by registering the percentage of cars that don’t stop for a pedestrian at a pedestrian crossing (a zebra crossing).
Another example is the Big Mac Index, a creative alternative way of assessing exchange rates, the rates at which one currency is exchanged for another. It is based on the idea that a Big Mac is highly standardized all over the world. Comparing the selling price of a Big Mac in country A with its selling price in country B gives an idea of the value of the two currencies relative to each other.
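The arithmetic behind such a comparison can be sketched in a few lines of Python. This is only an illustration, assuming the usual reading of the index: the ratio of the two local prices gives an implied exchange rate, which is then compared with the market rate. The function names and all the prices below are hypothetical and chosen for easy arithmetic; they are not real data.

```python
def implied_rate(price_a: float, price_b: float) -> float:
    """Units of currency A per one unit of currency B at which
    a Big Mac would cost the same in both countries."""
    if price_b <= 0:
        raise ValueError("price_b must be positive")
    return price_a / price_b


def valuation_vs_market(price_a: float, price_b: float, market_rate: float) -> float:
    """Relative valuation of currency A against currency B.

    market_rate is the market exchange rate in units of A per one unit of B.
    A negative result suggests A is undervalued, a positive one overvalued.
    """
    if market_rate <= 0:
        raise ValueError("market_rate must be positive")
    implied = implied_rate(price_a, price_b)
    return (implied - market_rate) / market_rate


if __name__ == "__main__":
    # Hypothetical figures: a Big Mac costs 20 units in country A and
    # 5 units in country B, while the market pays 5 units of A per unit of B.
    print(f"implied rate: {implied_rate(20.0, 5.0):.2f}")
    print(f"valuation of A: {valuation_vs_market(20.0, 5.0, 5.0):+.1%}")
```

With these made-up numbers the implied rate is 4 units of A per unit of B, lower than the market rate of 5, so by this measure currency A would look about 20% undervalued.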
Maybe we can say something about the tendency toward conformity by counting the people who wear clothes of a specific brand. Or we could determine the percentage of women involved in street sweeping.
It’s hard to get solid data on drug usage, because it’s traditionally gathered via questionnaires. Respondents can fudge their answers or forget details. Drug users also sometimes don’t know what they are really taking or whether other drugs are mixed in. Laboratory analysis of waste water, however, has the potential to deliver more accurate results more quickly: a recent study showed that cities’ sewer water reveals the use of cocaine, cannabis, meth and ecstasy.
It is interesting to roam around a town and come up with mini theories that explain what you observe there, compared with another town. For instance:
- There are substantially more gulls and torn-open garbage bags;
- Advertising is everywhere and very blatant;
- There is a lot of green space and many squares;
- All doors and window frames are painted in the same color;
- There are many pizzerias.
What does it mean? Or what does it explain? Can you quantify the phenomena and turn them into meaningful data?
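One possible first step toward quantifying such observations is to turn raw counts into rates, so that towns of different sizes can be compared. The sketch below assumes you have counted a few phenomena yourself and know each town's population. The function names, categories and numbers are all hypothetical and serve only to illustrate the idea.

```python
def per_10k(count: int, population: int) -> float:
    """Convert a raw count into a rate per 10,000 inhabitants."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 10_000 / population


def compare_towns(obs_a: dict, pop_a: int, obs_b: dict, pop_b: int) -> dict:
    """Return {phenomenon: (rate in town A, rate in town B)} for every
    phenomenon that was counted in both towns."""
    shared = sorted(obs_a.keys() & obs_b.keys())
    return {k: (per_10k(obs_a[k], pop_a), per_10k(obs_b[k], pop_b)) for k in shared}


if __name__ == "__main__":
    # Hypothetical field counts, not real data.
    town_a = {"pizzerias": 30, "torn garbage bags": 120}
    town_b = {"pizzerias": 12, "torn garbage bags": 15}
    for phenomenon, (rate_a, rate_b) in compare_towns(town_a, 60_000, town_b, 40_000).items():
        print(f"{phenomenon}: {rate_a:.1f} vs {rate_b:.1f} per 10,000 inhabitants")
```

A rate like this does not explain anything by itself, but it gives the mini theories something measurable to be tested against.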