An Opinion Regarding Equivalence Testing for Evaluating Measurement Agreement

Manolis Adamakis


The novel statistical approach of ‘equivalence testing’ has been proposed as a way to statistically examine agreement between different physical activity measures. Using this method, researchers have argued that it is possible to determine whether one measurement method is significantly equivalent to another. Recently, equivalence testing based on a 90% confidence interval obtained from a mixed ANOVA was put forward, which I believe is a more robust approach. This paper discusses the use of this method in comparison with a more well-established statistical analysis (i.e. mixed design ANOVA), as well as the limitations and arbitrary assumptions involved in performing it. The paper concludes with remarks and considerations for future use of similar approaches.
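The confidence-interval version of the test described above can be sketched roughly as follows. This is a minimal illustration with hypothetical data, not the author's actual analysis: equivalence is declared if the 90% confidence interval for the monitor's mean lies entirely within an equivalence zone around the criterion mean. The ±10% zone used here is an assumed (and, as the paper argues, arbitrary) choice.

```python
import numpy as np
from scipy import stats

def equivalent_90ci(criterion, monitor, zone=0.10):
    """Declare equivalence if the 90% CI for the monitor's mean lies
    entirely within +/- `zone` of the criterion mean.

    The +/-10% default zone is a hypothetical, arbitrary choice used
    only for illustration.
    """
    monitor = np.asarray(monitor, dtype=float)
    crit_mean = float(np.mean(criterion))
    # 90% t-based confidence interval for the monitor's mean
    lo, hi = stats.t.interval(
        0.90,
        df=len(monitor) - 1,
        loc=monitor.mean(),
        scale=stats.sem(monitor),
    )
    # Equivalent only if the whole CI falls inside the equivalence zone
    return crit_mean * (1 - zone) <= lo and hi <= crit_mean * (1 + zone)

# Hypothetical energy-expenditure values (criterion vs. wearable monitor)
criterion = [100.0] * 8
close_monitor = [98, 101, 99, 100, 102, 100, 99, 101]   # tracks criterion
biased_monitor = [80, 82, 78, 81, 79, 80, 81, 79]       # systematic underestimate

print(equivalent_90ci(criterion, close_monitor))   # True
print(equivalent_90ci(criterion, biased_monitor))  # False
```

Note that the burden of proof is reversed relative to a conventional null-hypothesis test: a narrow interval inside the zone supports equivalence, whereas a wide or off-centre interval does not, regardless of the p-value from a difference test.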

Mixed design ANOVA; p-value; confidence interval; methods’ comparison.

Article Details

How to Cite
Adamakis, M. (2019). An Opinion Regarding Equivalence Testing for Evaluating Measurement Agreement. Journal of Scientific Research and Reports, 24(5), 1-4.
Opinion Article

