I found this article very interesting, and I'm writing this email hoping you could shed some light on an analysis I'm performing with a GLM.

I am trying to analyze some data on animal behaviour and would appreciate some help or advice on which non-parametric test I should use.

The variables I have are:

-Response variable: a continuous one (with both positive and negative values)

-Explanatory variable: a factor with 6 levels

-Random effect: animal identity, since the same animal performing the behavioural task was measured more than once.

As I have a random effect, I chose a GLM. Then, when checking the normality and homoscedasticity assumptions, the Shapiro–Wilk test showed there was no normality, while Q–Q plots revealed neither patterns nor outliers in my data. So the question is: which non-parametric test would be optimal in this case, given that I would like to perform certain a posteriori comparisons (and not all-against-all comparisons)?

My data set has many zero responses in some conditions. I've read that for Student's t-tests, when the lack of normality is due to many zeros, it's OK to turn a blind eye to it (Srivastava, 1958; Sullivan & D'Agostino, 1992)… is there something similar for GLMs?

Thank you so much in advance for any advice you could provide.

Kind regards,

Yair Barnatan

Ph.D. Student – Physiology and Molecular Biology Department

Faculty of Science

University of Buenos Aires

A lot of common tests like the t-test and ANOVA work on the SAMPLING DISTRIBUTION, not the distribution of your sample. No real-life data are exactly normal, and many are highly skewed, but that does NOT mean their sampling distribution is not normal. Many people do not know the difference, so here's a link to an animated explanation:

http://onlinestatbook.com/stat_sim/sampling_dist/

Thanks to the Central Limit Theorem, the sampling distribution of the mean for ANY population distribution approaches normality given a "big enough sample size"; per the animated explanation above, N = 10-15 often already gets close. It is this normal sampling distribution that the t-test and ANOVA work with; they do NOT work on the distribution of your sample.
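The point above can be checked with a quick simulation. This is a minimal sketch (standard library only; the exponential population, the sample size N = 15, and the crude moment-based skewness estimate are my choices, not anything from the article): even when the population is strongly skewed, the distribution of sample means is far closer to symmetric.

```python
# Demo: the sampling distribution of the mean is much more symmetric
# than a skewed population, even at a modest sample size (N = 15).
import random
import statistics

random.seed(42)

N = 15        # size of each sample (the "big enough" figure quoted above)
DRAWS = 2000  # number of sample means to collect

# Skewed population: exponential with mean 1 (theoretical skewness = 2).
population_draws = [random.expovariate(1.0) for _ in range(50_000)]

# Sampling distribution of the mean: draw many samples of size N,
# and record each sample's mean.
sample_means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(N))
    for _ in range(DRAWS)
]

def skewness(xs):
    """Crude moment-based skewness estimate (0 for a symmetric shape)."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return statistics.fmean(((x - m) / s) ** 3 for x in xs)

print(f"population skewness:  {skewness(population_draws):+.2f}")
print(f"sample-mean skewness: {skewness(sample_means):+.2f}")
```

The population skewness comes out near 2, while the skewness of the sample means is already much smaller (CLT theory predicts roughly 2/√N ≈ 0.52 here), which is the whole argument for why the t-test and ANOVA can tolerate a skewed sample.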

I have to explain this a lot, sometimes to very seasoned Six Sigma black belts. I hope this raises the statistical knowledge bar for everyone else.

When I removed the outlier, the data changed from a non-normal to a normal distribution. Is it OK to remove those outliers?

What kind of height is this for? E.g., the height of a group of people, or the height of a manufactured industrial product (a pharmaceutical tablet)?

regards

The information provided is apt.