What are Label Encoder and One-Hot Encoder, and which one should you use?


Label Encoder

 Label Encoder makes it easy to convert a categorical value into a numerical value. If you use it to convert ['red', 'green', 'yellow'] into numbers, you will get [1, 0, 2], because the categories are sorted alphabetically and numbered from 0.
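
 To make the numbering concrete, here is a minimal sketch in plain Python. It assumes the encoder works the way scikit-learn's LabelEncoder does (sort the unique categories, then number them from 0). The helper name label_encode is hypothetical and only for illustration.

```python
def label_encode(values):
    """Map each category to its index in the sorted list of unique categories.

    Hypothetical helper that imitates the sorted, 0-based numbering
    described above; it is not a library function.
    """
    classes = sorted(set(values))
    index = {c: i for i, c in enumerate(classes)}
    return [index[v] for v in values], classes


codes, classes = label_encode(["red", "green", "yellow"])
print(classes)  # ['green', 'red', 'yellow']
print(codes)    # [1, 0, 2]
```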


 It is easy to use and doesn't create new columns, so you don't need to worry about slowing down your model. But there is also a disadvantage: Label Encoder assigns numbers in alphabetical order. For example, if you convert ['Great', 'Good', 'Bad', 'Worst'] into numerical values using Label Encoder, you will get [2, 1, 0, 3]. Intuitively, something is wrong: the numbers don't follow the actual ranking, and that can make a prediction less reliable. So why or when do we use Label Encoder?
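
 The mismatch is easier to see when the codes are sorted back into an order. This is a small sketch in plain Python, again assuming alphabetical, 0-based numbering; the variable names are only illustrative.

```python
ratings = ["Great", "Good", "Bad", "Worst"]  # listed from best to worst

# Alphabetical numbering: Bad=0, Good=1, Great=2, Worst=3
classes = sorted(set(ratings))
codes = [classes.index(r) for r in ratings]
print(codes)  # [2, 1, 0, 3]

# The order a model would infer from the numbers, smallest to largest
implied = [r for _, r in sorted(zip(codes, ratings))]
print(implied)  # ['Bad', 'Good', 'Great', 'Worst']
```

 According to the codes, 'Worst' is the largest value and sits right above 'Great', which is not what the ratings mean.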

Why and when?

 If you search for Label Encoder on Google or Stack Exchange, you will find a lot of posts comparing Label Encoder and One-Hot Encoder. The difference between the two encoders is a good hint as to when to use Label Encoder.

One Hot Encoder



 I think One-Hot Encoder is more intuitive than Label Encoder. OHE makes a new column for each color and fills it with binary values, i.e. you can see at a glance from color_1 whether that row's color is green. It is clearer than Label Encoder because it doesn't imply any ordinal relationship between the categories. Put simply, if the color is green, color_1 is 1; otherwise it is 0. But consider what happens if we have a hundred colors in the dataset and try to use One-Hot Encoder. We will get 100 columns, which for a dataset of 1,000 rows means 100,000 more entries. That can cause trouble for the execution time of your model.
I would like to say these encoders are complementary.
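
 As a rough illustration of both points (the 0/1 columns and the column growth), here is a plain-Python sketch. The column names such as color_green are an assumption chosen for readability, and the 1,000-row figure is only inferred from the 100 columns and 100,000 entries mentioned above.

```python
colors = ["red", "green", "yellow", "green"]

categories = sorted(set(colors))  # ['green', 'red', 'yellow']

# One dict per row, one 0/1 column per category
one_hot = [
    {f"color_{c}": int(value == c) for c in categories}
    for value in colors
]
for row in one_hot:
    print(row)
# first row: {'color_green': 0, 'color_red': 1, 'color_yellow': 0}

# Column growth: every distinct category adds one column to every row.
n_rows, n_categories = 1000, 100  # assumed sizes, for illustration only
print(n_rows * n_categories)      # 100000
```

 Each row has exactly one 1, so no category looks "bigger" than another, but the table gets wider with every new category.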

So which encoder to use?

 Frankly, I'm not completely sure. There are trade-offs, so neither is perfect. If you have a lot of categorical columns and that data doesn't have any ordinal relationship with the target, you'd want to use Label Encoder, since it keeps the number of columns down. On the other hand, if your categorical data has ordinal information, for example ['excellent', 'good', 'bad'], One-Hot Encoder would be your best bet, because Label Encoder's alphabetical numbering would not match that order.
