Monday 16 November 2015

Categorizing Numeric Variables -- a Cautionary Tale via @infomgmt

Three takeaways from Frank Harrell's lectures that I've always kept close are to be attentive to both non-linearity and interaction effects among independent variables, and to be wary of categorizing continuous, numeric attributes.

This is a good article from Information Management. Please read it carefully, as some links embedded in the article text lead to further information.

Turning a numeric variable into a categorical one is a risky thing to do and can change the direction of an analysis. While cleaning the data before the analysis, I would be tempted to change them all to character values just to make sure there are no errors or misuse later on.
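
To make the risk concrete, here is a minimal sketch in Python using simulated data. Nothing in it comes from the article: the variable names (age, outcome), the linear relationship, the noise level, and the median cutoff are all assumptions chosen for illustration. It compares how strongly the outcome tracks the original numeric variable versus a two-group version of it.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid extra dependencies."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Simulated data (an assumption for illustration, not from the article):
# a numeric predictor and an outcome that depends on it linearly plus noise.
rng = random.Random(0)
age = [rng.uniform(20, 80) for _ in range(2000)]
outcome = [0.5 * a + rng.gauss(0, 5) for a in age]

# Categorize the numeric variable with a median split into two groups.
cutoff = sorted(age)[len(age) // 2]
age_group = [1.0 if a >= cutoff else 0.0 for a in age]

r_continuous = pearson(age, outcome)
r_binned = pearson(age_group, outcome)

print(f"correlation with numeric age:   {r_continuous:.3f}")
print(f"correlation with two-group age: {r_binned:.3f}")
```

In this simulated setup the two-group version shows a weaker association than the numeric one, because everyone within a group is treated as identical. Real data may behave differently, and cutoffs other than the median can shift the result in either direction, which is part of the caution.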
