IASC - Data Visualization and Exploratory Data Analysis with Matrix Visualization
Date | 25 Oct 2024 |
Time | 14:00 GMT+02:00 - 15:30 GMT+02:00 |
Level of instruction | Intermediate |
Instructor | Chun-houh Chen |
Registration fee | |
Statistics is the science of collecting, organizing, analyzing, modeling, inferring from, interpreting, and presenting data. Data visualization (statistical graphics) plays a pivotal role at different stages throughout the statistical analysis process. For a given dataset, the analyst first needs to understand the overall data structure in order to select appropriate follow-up analysis, modeling, and inference methods. The most efficient way to understand data is to “see” it through various visualization and graphics environments. In the middle stage, data visualization is employed not only passively, to help different modeling and inference tools arrive at the best analysis process, but also actively, to explore unknown information in the data. Finally, data visualization is used to present the analysis results and to convey the information obtained during the analysis to the data users.
There are thousands of tools for data visualization. Choosing an appropriate and correct way to explore data is the subject of an essential field in statistical science: Exploratory Data Analysis (EDA; Tukey 1977). The role of EDA is to obtain the information conveyed by the data by "seeing" the data, relying on simple arithmetic and easy-to-construct diagrams and tables. Through EDA, users can form a preliminary understanding and description of the patterns revealed in a graph or chart, and then use the human mind (together with modeling and inference) to analyze and judge the received information comprehensively and uncover what is hidden in the data. The emphasis is on exploratory analysis rather than rigorous confirmation of mathematical models. Modern data science and statistical analysis face the difficult challenge of high-dimensional, highly complex big data. It is the statistician's responsibility to develop compelling graphics and visualization environments that help data analysts process the data generated by advanced technology and complex experiments.
About the instructor
Dr. Chun-houh Chen received his bachelor's degree from the Department of Statistics at National Chung Hsing University, Taiwan, in 1984. He later earned his master's degree (1990) and Ph.D. (1992) in Mathematics from the University of California, Los Angeles (UCLA). His expertise spans a wide range of areas, including multivariate statistical methods, exploratory data analysis (EDA), data/information/matrix visualization, dimension reduction, machine/deep learning, biobanking, bioinformatics, and precision/intelligent health.
Dr. Chen is an internationally recognized statistician. He was an Assistant Professor in the Department of Statistics/Computer and Information Systems at George Washington University (1992-1993) and has since served as Assistant Research Fellow (1993-2002), Associate Research Fellow (2002-2011), and Research Fellow (2011-present) at the Institute of Statistical Science, Academia Sinica, Taiwan. He served as Chairperson of the Asian Regional Section (ARS) of the International Association for Statistical Computing (IASC) from 2013 to 2015, was a member of the International Statistical Institute (ISI) Council (2015-2019), and is currently President of the IASC (2023-2025). One of his most significant research contributions is the development of the Generalized Association Plots (GAP) series, a set of matrix visualization tools for EDA that can visualize a variety of data types, including continuous, binary, ordinal, categorical, symbolic, and cartographic data, even in high-dimensional or large-sample settings. Recently, Dr. Chen has shifted his focus to precision and intelligent health, leading research initiatives and practical applications within Academia Sinica and the medical community in Taiwan.