As you all know, research methods and methodology are important for all disciplines in the social sciences. In most countries, however, there is far less research on methodology than on its applications. The Center for Survey Research (CSR), Research Center for Humanities and Social Sciences, Academia Sinica, with which I am affiliated, is one of the few research institutes dedicated to research on survey methodology, in addition to its tasks of survey implementation and data archive management. In early October 2018, CSR hosted a national conference on survey research methodology. Below I briefly describe the conference and share some thoughts.
The conference covered survey methodology broadly, including sampling and weighting, questionnaire design, web surveys, data quality, survey errors, analysis of social networks, and the use of administrative data. Two keynote speeches were delivered. One, on AI and its application in Taiwan, was given by Dr. Sheng-Wei Chen from the Institute of Information Science, Academia Sinica. Dr. Chen addressed the rapid change in information technology, AI, and the emergence of big data. How big data can be used and applied is surely an important issue all around the world, and not surprisingly it is so in Taiwan as well, as shown by the packed audience in the conference room. There are two major challenges for AI. First, AI is a technique for solving specific problems, which requires customized deep-learning models to meet diverse needs. Second, machine learning requires large amounts of training data, and AI will be of no use if such data are not available. Dr. Chen called for open source and open data to create future opportunities to use AI. It is also not realistic to expect that AI or big data will replace survey data, because both have their pros and cons.
The other keynote speech, on National Health Insurance data and sampling surveys, was given by Dr. Ching-Syang Yue from the Department of Statistics, National Chengchi University in Taiwan. Due to low response rates and concerns over the data quality of the national census, rolling sampling has been considered as an alternative. The 2010 census in Taiwan used a 16% sample survey to replace an all-population census, with linkage to administrative data to ensure the accuracy of estimates. The data quality, however, has been criticized for undermining inference about the de jure population. On the other hand, National Health Insurance (NHI) in Taiwan covers more than 99% of the population, including non-citizens. If an all-population census is not available in the future and access to administrative records remains difficult, NHI may be a reliable source for estimating the de jure population. Since NHI records contain work addresses rather than residential addresses, researchers need to be cautious about the limitations of such estimation and the conditions it requires. Despite the long period required to conduct it, an all-population census every 10 years is preferred among academics.
Survey researchers and practitioners have been concerned about the impact of big data on sample surveys. We now realize that big data and sample surveys can be complementary, a win-win situation. The choice between big data and survey data is not an either/or proposition, and we don’t want to miss out on either one, especially when both are available. We have recently seen the “collaboration” of big data and survey data in social sciences research, and expect to see more in the third wave of the AI revolution.
Administrative data, as a type of relatively big data, are another source that survey data users can draw on for such collaboration. In countries where administrative data are well utilized for research, laws that allow their use for research purposes may play an important role. In other countries, where governments are conservative about opening up either administrative data or government surveys, more effort is needed. In Taiwan we are lucky enough to have access to most government surveys for research purposes. However, linking administrative data to survey data, or linking administrative data across government agencies, remains rare. This has something to do with the Personal Information Protection Act (PIPA) in Taiwan, although its impact may not be comparable to that of the General Data Protection Regulation (GDPR). One possible way to justify the use of various types of administrative data for research purposes is to revise the related laws and acts. We cannot foresee when this will happen, but we remain hopeful.
*Photo taken at the 2017 Christmasland in New Taipei City.