Category Encoders v1.2.4 Release

This post was originally published here

I've just cut a fresh release of the scikit-learn-contrib library category_encoders. This one includes a lot of contributions from the broader community, which has been great to see. A few selected features now available:

  • Leave-one-out encoding: a new encoder, based on a popular Kaggle post by Owen Zhang, detailed here and here. (proposal)
  • Maintenance fixes to keep up with upstream libraries (you should see fewer pandas warnings, issue)
  • Bugfix for calling fit repeatedly on the same encoder (issue)
  • Consistent category ordering (proposal)
  • Consistent output shape for datasets with inconsistent category appearances (issue)
  • Missing value and unknown category handling made consistent across all encoders.
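
For anyone who hasn't run into leave-one-out encoding before, here is a rough sketch of the general idea in plain Python. It is not the library's implementation: the function name, the toy data, and the choice to fall back to the global mean for categories that appear only once are all assumptions made for illustration. Each row's category is replaced by the mean target of the other rows that share that category, so a row never sees its own label.

```python
from collections import defaultdict


def leave_one_out_encode(categories, targets):
    """Encode each category as the mean target of the *other* rows sharing it.

    Hypothetical helper for illustration only; not part of category_encoders.
    """
    if len(categories) != len(targets):
        raise ValueError("categories and targets must have the same length")
    if not targets:
        return []

    # Per-category running totals of the target and the row counts.
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1

    global_mean = sum(targets) / len(targets)

    encoded = []
    for cat, y in zip(categories, targets):
        if counts[cat] > 1:
            # Remove this row's own target before averaging.
            encoded.append((sums[cat] - y) / (counts[cat] - 1))
        else:
            # Assumption: a category seen only once has no other rows to
            # average, so fall back to the overall target mean.
            encoded.append(global_mean)
    return encoded


# Toy data, made up for this example.
cities = ["NYC", "NYC", "NYC", "SF", "SF", "LA"]
clicked = [1, 0, 1, 1, 0, 1]
print(leave_one_out_encode(cities, clicked))
```

In this toy data the first NYC row is encoded from the other two NYC rows only, and the single LA row falls back to the overall mean of the target.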

Install or upgrade using the command:

pip install -U category_encoders

All in all, a fairly large release by our standards, and there are still some open issues to work on. So upgrade, try it out, and let me know what you think. If you'd like to get involved, find us on GitHub here.
