Analyzing the Impact of Various Criteria on Blood Pressure Through General Linear Model by Box-Cox Transformations

Section: Research Paper
Published: May 31, 2026
Pages: 88-103

Abstract

Blood pressure is a key indicator of cardiovascular health, reflecting the force exerted by blood against arterial walls. Data transformation tools are essential in statistical analysis because they help data satisfy the assumptions required by linear models, such as normality, linearity, and homoscedasticity, when those assumptions are violated. They are particularly useful for correcting distortions in data structure and enhancing the validity of General Linear Models (GLMs). In this study, we applied the Box-Cox transformation, a member of the family of power transformations, to improve a linear regression model. Our objective was to identify the most appropriate power transformation to enhance model performance and interpretability, using statistical criteria on blood test datasets collected from Azadi Hospital in Duhok. These evaluation criteria are essential for the accurate interpretation of data, as they assess the model's quality and reliability. The criteria used were adjusted R-squared, Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), F-statistic, Maximum Likelihood Estimation (MLE), Root Mean Square Error (RMSE), and the Shapiro–Wilk test. These measures also guided the selection of the most appropriate GLM. We conclude that different λ values optimize different aspects of model performance: one value balances good model fit (F-statistic), another gives the most accurate predictions (lowest error), and another improves the likelihood and residual normality. A computational algorithm was proposed to estimate the optimal power parameter, and the results of the criteria were discussed and compared. Based on the adjusted R-squared criterion, an optimal λ value was identified, indicating a strong model fit.
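The kind of search the abstract describes can be illustrated with a minimal sketch. It is not the authors' algorithm: the function names, the λ grid, the synthetic data, and the choice to compute MAE, RMSE, and MAPE on the original scale after back-transforming are all assumptions made for illustration. The sketch fits ordinary least squares to a Box-Cox transformed response for each candidate λ and records adjusted R-squared, the Box-Cox profile log-likelihood, and the error measures. The F-statistic and the Shapiro–Wilk test are omitted to keep the example dependent on NumPy only.

```python
import numpy as np


def box_cox(y, lam):
    """Box-Cox power transform of a strictly positive response."""
    y = np.asarray(y, dtype=float)
    if np.any(y <= 0):
        raise ValueError("Box-Cox requires strictly positive values")
    return np.log(y) if abs(lam) < 1e-12 else (y ** lam - 1.0) / lam


def inverse_box_cox(z, lam):
    """Map transformed values back to the original scale."""
    z = np.asarray(z, dtype=float)
    if abs(lam) < 1e-12:
        return np.exp(z)
    # Clip so that fitted values outside the transform's range do not give NaN.
    return np.power(np.maximum(lam * z + 1.0, 1e-12), 1.0 / lam)


def evaluate_lambda(X, y, lam):
    """Fit OLS to the transformed response and collect the criteria."""
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])  # design matrix with intercept
    z = box_cox(y, lam)
    beta, *_ = np.linalg.lstsq(A, z, rcond=None)
    fitted = A @ beta
    rss = float(np.sum((z - fitted) ** 2))
    tss = float(np.sum((z - z.mean()) ** 2))
    adj_r2 = 1.0 - (rss / (n - p - 1)) / (tss / (n - 1))
    # Profile log-likelihood of lambda, up to an additive constant.
    loglik = -0.5 * n * np.log(rss / n) + (lam - 1.0) * np.sum(np.log(y))
    # Assumption: prediction errors are measured on the original scale.
    err = y - inverse_box_cox(fitted, lam)
    return {
        "lambda": float(lam),
        "adj_r2": adj_r2,
        "loglik": float(loglik),
        "mae": float(np.mean(np.abs(err))),
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "mape": float(100.0 * np.mean(np.abs(err) / y)),
    }


def best_lambda(results, key, largest=True):
    """Return the lambda that optimizes one criterion."""
    pick = max if largest else min
    return pick(results, key=lambda r: r[key])["lambda"]


# Synthetic data (not the hospital dataset): the response is log-linear,
# so the likelihood should favor a lambda close to 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.exp(1.0 + 0.8 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.2, size=200))

lambdas = np.arange(-2.0, 2.0 + 1e-9, 0.25)  # hypothetical search grid
results = [evaluate_lambda(X, y, lam) for lam in lambdas]

for criterion, largest in [("adj_r2", True), ("loglik", True),
                           ("mae", False), ("rmse", False), ("mape", False)]:
    print(criterion, best_lambda(results, criterion, largest))
```

Reporting the best λ per criterion, rather than a single winner, mirrors the abstract's observation that different criteria can favor different λ values.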

How to Cite

Mustafa, Z. M., & Shareef, A. A. (2026). Analyzing the Impact of Various Criteria on Blood Pressure Through General Linear Model by Box-Cox Transformations. IRAQI JOURNAL OF STATISTICAL SCIENCES, 23(1), 88–103. https://doi.org/10.33899/iqjoss.v23i1.62125