I found this article very interesting, and I'm writing this email hoping you could shed some light on an analysis I'm performing with a GLM.

I am trying to analyze some data on animal behaviour and need some advice on which non-parametric test I should use.

The variables I have are:

- Response variable: continuous, with both positive and negative values

- Explanatory variable: a factor with 6 levels

- Random effect: the same animal performing the behavioural task was measured more than once

As I have a random effect, I chose a GLM. Then, when checking the normality and homoscedasticity assumptions, the Shapiro-Wilk test showed the data were not normal, while QQ plots revealed no patterns or outliers. So the question is: which non-parametric test would be optimal in this case, knowing that I would like to perform specific a posteriori comparisons (not all-against-all comparisons)?

My dataset has many zero responses in some conditions. I've read that for Student's t-tests, when normality fails because of many zeros, it is acceptable to turn a blind eye to the lack of normality (Srivastava, 1958; Sullivan & D'Agostino, 1992). Is there something similar for GLMs?

Thank you so much in advance for any advice you could provide.

Kind regards,

Yair Barnatan

Ph.D. Student – Physiology and Molecular Biology Department

Faculty of Science

University of Buenos Aires

@MikeMac: Method comparison is a complicated issue, and you shouldn't base decisions on a single t-test. You need information from several tests, covering both the mean difference (t or Wilcoxon tests) and correlation/concordance/agreement. I suggest performing a Bland-Altman analysis, which is straightforward. If your data are not normal, you can try some transformations, but inherently non-parametric methods are also available in some software packages.
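For readers unfamiliar with it, a Bland-Altman analysis summarises the differences between paired measurements from two methods by their mean (the bias) and the 95% limits of agreement. A minimal sketch in pure-stdlib Python, using made-up paired readings purely for illustration:

```python
import statistics

def bland_altman(a, b):
    """Bland-Altman statistics for two paired measurement methods.

    Returns the bias (mean of the pairwise differences) and the
    95% limits of agreement (bias +/- 1.96 * SD of the differences).
    """
    diffs = [x - y for x, y in zip(a, b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical paired readings from two measurement methods
method_a = [10.1, 9.8, 10.5, 11.0, 9.9, 10.3]
method_b = [10.0, 9.9, 10.4, 10.8, 10.1, 10.2]
bias, lower, upper = bland_altman(method_a, method_b)
```

In practice you would also plot the differences against the pairwise averages to check that the bias is constant across the measurement range.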

@Bravo: Yes, you can use both tests, and most likely you should. Just be aware that their interpretations differ. Make sure you understand each test's preconditions and how they apply to your data. On the other hand, if the departure from normality is not severe and your sample is reasonably large (100+), you can safely use t-tests with negligible side effects.

Many common tests like the t-test and ANOVA work on the SAMPLING DISTRIBUTION, not the distribution of your sample. No real-life data are exactly normal, and many are highly skewed, but that does NOT mean the sampling distribution is not normal. Many people do not know the difference, so here's a link to an animated explanation:

http://onlinestatbook.com/stat_sim/sampling_dist/

Thanks to the central limit theorem, the sampling distribution of the mean for ANY population distribution approaches normality given a "big enough sample size"; per the animated explanation above, N = 10-15 already seems to be enough. It is this normal sampling distribution that the t-test and ANOVA work on, NOT the distribution of your sample.

I have to explain this a lot, sometimes to very seasoned Six Sigma Black Belts. I hope this raises the statistical-knowledge bar for everyone else.

When I removed the outlier, the data changed from a non-normal to a normal distribution. Is it OK to remove those outliers?

What kind of height is this for? E.g., the height of a group of people, or the height of a manufactured industrial product (a pharmaceutical tablet)?

Regards

The information provided is apt.