5 Conclusion and future plans

The three pieces of work assembled in this thesis share a common theme of advancing regression diagnostics, with a focus on improving the assessment of residual plots, challenging the limitations of conventional methods, and developing innovative solutions to automate diagnostic processes.

5.1 Contributions

The primary contributions of this research are threefold. Firstly, we provide empirical evidence for the effectiveness of the lineup protocol in diagnosing model fit issues through residual plots (Li et al. 2024). Secondly, we develop a computer vision model for automating the assessment of residual plots, which addresses the scalability limitations of the lineup protocol. Lastly, we share a user-focused R package and Shiny app, making the automated diagnostic tools accessible to a broad range of analysts and practitioners.

The aforementioned R package and its dependency are available on CRAN, with the latest development versions at the links below:

autovi (https://github.com/TengMCing/autovi), and
bandicoot (https://github.com/TengMCing/bandicoot).

The Shiny app for autovi is available at one of the mirror sites listed at https://autoviweb.netlify.app/ with the source code available at https://github.com/TengMCing/autovi_web.

Principles of transparency and reproducible research have guided this work, and all materials related to the thesis are available at https://github.com/TengMCing/PhD/Thesis. The thesis is written using Quarto (Allaire et al. 2024) and is available online at https://patrick-li-thesis.netlify.app. The R packages used throughout the thesis include tidyverse (Wickham et al. 2019), lmtest (Zeileis and Hothorn 2002), mpoly (Kahle 2013), ggmosaic (Jeppson et al. 2021), kableExtra (Zhu 2021), patchwork (Pedersen 2022), rcartocolor (Nowosad 2018), glue (Hester and Bryan 2022), ggpcp (Hofmann et al. 2022), here (Müller 2020), magick (Ooms 2023), yardstick (Kuhn et al. 2024), reticulate (Ushey et al. 2024) and knitr (Xie 2014).

5.2 Future work

There are several directions in which this work can be developed. These include improving the accuracy and effectiveness of residual plot assessments, exploring the use of alternative computer vision models, extending the automated visual diagnostics to different plot types and statistical models, and improving the front-end display and back-end computation for the web app.

The model in Chapter 4, trained on standard residual plots from linear regression, has certain limitations. It was very difficult to arrive at a final model to share; the main concern is that the current version may still be too sensitive, concluding that a model is misspecified even when the problems are minor. The current implementation relies on the basic VGG16 model (Simonyan and Zisserman 2014), developed a decade ago, and performance could be enhanced by exploring more advanced architectures such as ResNet50 (He et al. 2016) and DenseNet201 (Huang et al. 2017), as well as ensemble techniques. There is therefore room to improve the accuracy and effectiveness of residual plot assessments.

Residual plots from more complex models, such as hierarchical, temporal, or spatial regression, often exhibit distinct visual patterns that the current approach might not fully capture. To better address this, scaled residual plots using randomized quantile residuals (Dunn and Smyth 1996) offer a more flexible way of defining residuals across different regression models, though they may alter the original visual pattern. Building a computer vision model on this foundation can provide a stronger solution for assessing residual plots across a broader spectrum of regression models.
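To make the idea of randomized quantile residuals concrete, the following is a minimal sketch for a Poisson response, following the definition in Dunn and Smyth (1996): for a discrete response, a uniform value is drawn between the fitted cumulative probabilities just below and at the observed count, and is then mapped through the standard normal quantile function. The function names, the example counts, and the fitted means are all hypothetical and chosen for illustration only; this is not the implementation used in the thesis.

```python
import math
import random
from statistics import NormalDist


def poisson_cdf(k, mu):
    """Return P(Y <= k) for Y ~ Poisson(mu); 0.0 when k is negative."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)  # P(Y = 0)
    total = term
    for i in range(1, k + 1):
        term *= mu / i  # P(Y = i) from P(Y = i - 1)
        total += term
    return min(total, 1.0)


def randomized_quantile_residuals(y, mu, rng):
    """Randomized quantile residuals for Poisson counts y with fitted means mu."""
    std_normal = NormalDist()
    eps = 1e-12  # keeps u strictly inside (0, 1) so inv_cdf is defined
    residuals = []
    for y_i, mu_i in zip(y, mu):
        lower = poisson_cdf(y_i - 1, mu_i)  # F(y - 1)
        upper = poisson_cdf(y_i, mu_i)      # F(y)
        u = rng.uniform(lower, upper)       # randomize within the jump of F
        u = min(max(u, eps), 1.0 - eps)
        residuals.append(std_normal.inv_cdf(u))
    return residuals


# Hypothetical observed counts and fitted means, for illustration only.
rng = random.Random(2024)
counts = [0, 2, 1, 5, 3]
fitted = [0.8, 1.5, 1.2, 4.0, 2.6]
rqr = randomized_quantile_residuals(counts, fitted, rng)
```

If the fitted model is correct, these residuals are approximately standard normal regardless of the response distribution, which is what makes them a candidate common scale for residual plots across model families.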

Visual diagnostics are foundational in Bayesian modelling to assess model fit, convergence, and posterior distributions (see Gelman et al. 2013). Some common visual diagnostics include trace plots to assess convergence of Markov chain Monte Carlo (MCMC) chains, density plots to visualize posterior distributions, posterior predictive checks to assess model fit, and autocorrelation plots to assess dependence between samples in MCMC chains. These visual diagnostics help Bayesian modelers to evaluate the quality of their models and identify potential issues or areas for improvement. Automating the reading of these plots can help improve MCMC convergence diagnostics, facilitate model comparison and selection, and enhance uncertainty quantification.
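As a small illustration of the quantity displayed in an autocorrelation plot, the sketch below computes the sample autocorrelation of a single chain at lags 0 to 20. The chain is a simulated AR(1) series standing in for the output of a slowly mixing sampler; it is a hypothetical example, as are the function name and parameter values, and it is not drawn from any model in this thesis.

```python
import random


def autocorrelation(chain, max_lag):
    """Sample autocorrelation of a chain at lags 0, 1, ..., max_lag."""
    n = len(chain)
    mean = sum(chain) / n
    centered = [x - mean for x in chain]
    denom = sum(c * c for c in centered)  # lag-0 sum of squares
    return [
        sum(centered[t] * centered[t + k] for t in range(n - k)) / denom
        for k in range(max_lag + 1)
    ]


# Hypothetical chain: AR(1) draws with strong dependence between samples.
rng = random.Random(1)
phi = 0.9
chain = [0.0]
for _ in range(4999):
    chain.append(phi * chain[-1] + rng.gauss(0.0, 1.0))

acf = autocorrelation(chain, max_lag=20)
```

Plotting `acf` against the lag gives the familiar autocorrelation plot. Values that decay slowly towards zero indicate strong dependence between successive samples, which is the kind of visual pattern an automated reader of these plots would need to recognise.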

The development of a more comprehensive suite of automated visual diagnostics for statistical models can help to improve the quality of statistical analyses. Also important is the development of user-friendly interfaces for these diagnostics, such as web applications, to make them accessible to a wider audience of researchers and practitioners. The web app developed in this thesis is a step in this direction, but further work is needed to improve the user experience, add more features, and make the app more robust and scalable. Future work could also explore the use of interactive visualizations and dashboards to help users explore and interpret the results of automated visual diagnostics more effectively.

Allaire, J. J., Teague, C., Scheidegger, C., Xie, Y., and Dervieux, C. (2024), “Quarto.” https://doi.org/10.5281/zenodo.5960048.

Dunn, P. K., and Smyth, G. K. (1996), “Randomized quantile residuals,” Journal of Computational and Graphical Statistics, Taylor & Francis, 5, 236–244.

Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., and Rubin, D. B. (2013), Bayesian data analysis (3rd ed.), Chapman and Hall/CRC.

He, K., Zhang, X., Ren, S., and Sun, J. (2016), “Deep residual learning for image recognition,” in Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770–778.

Hester, J., and Bryan, J. (2022), Glue: Interpreted string literals.

Hofmann, H., VanderPlas, S., and Ge, Y. (2022), Ggpcp: Parallel coordinate plots in the ’ggplot2’ framework.

Huang, G., Liu, Z., Van Der Maaten, L., and Weinberger, K. Q. (2017), “Densely connected convolutional networks,” in Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 4700–4708.

Jeppson, H., Hofmann, H., and Cook, D. (2021), Ggmosaic: Mosaic plots in the ’ggplot2’ framework.

Kahle, D. (2013), “Mpoly: Multivariate polynomials in R,” The R Journal, 5, 162–170.

Kuhn, M., Vaughan, D., and Hvitfeldt, E. (2024), Yardstick: Tidy characterizations of model performance.

Li, W., Cook, D., Tanaka, E., and VanderPlas, S. (2024), “A plot is worth a thousand tests: Assessing residual diagnostics with the lineup protocol,” Journal of Computational and Graphical Statistics, Taylor & Francis, 1–19.

Müller, K. (2020), Here: A simpler way to find your files.

Nowosad, J. (2018), Rcartocolor: ’CARTOColors’ palettes.

Ooms, J. (2023), Magick: Advanced graphics and image-processing in R.

Pedersen, T. L. (2022), Patchwork: The composer of plots.

Simonyan, K., and Zisserman, A. (2014), “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556.

Ushey, K., Allaire, J., and Tang, Y. (2024), Reticulate: Interface to ’python’.

Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L. D., François, R., Grolemund, G., Hayes, A., Henry, L., Hester, J., Kuhn, M., Pedersen, T. L., Miller, E., Bache, S. M., Müller, K., Ooms, J., Robinson, D., Seidel, D. P., Spinu, V., Takahashi, K., Vaughan, D., Wilke, C., Woo, K., and Yutani, H. (2019), “Welcome to the tidyverse,” Journal of Open Source Software, 4, 1686. https://doi.org/10.21105/joss.01686.

Xie, Y. (2014), “Knitr: A comprehensive tool for reproducible research in R,” in Implementing reproducible computational research, eds. V. Stodden, F. Leisch, and R. D. Peng, Chapman and Hall/CRC.

Zeileis, A., and Hothorn, T. (2002), “Diagnostic checking in regression relationships,” R News, 2, 7–10.

Zhu, H. (2021), kableExtra: Construct complex table with kable and pipe syntax.