Category Encoders v1.2.4 Release

This post was originally published here

I've just cut a fresh release of category_encoders, the scikit-learn-contrib library. This one includes a lot of contributions from the broader community, which has been great to see. A few selected features now available:

  • Leave-one-out encoding: a new encoder, based on a popular Kaggle post by Owen Zhang, detailed here and here. (proposal)
  • Maintenance fixes in upstream libraries (should get fewer pandas warnings, issue)
  • Bugfix for calling fit on the same thing many times (issue)
  • Consistent category ordering (proposal)
  • Consistent output shape for datasets with inconsistent category appearances (issue)
  • Missing value and unknown category handling made consistent across all encoders.
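
For anyone who hasn't run into leave-one-out encoding before, the idea is to replace each category value with the mean of the target over the other rows that share that category, so a row never sees its own label. The snippet below is a minimal pure-Python sketch of that idea and not the code that ships in the library. The function name `leave_one_out_encode`, the toy data, and the fallback to the global mean for categories that appear only once are all assumptions made for illustration, and it skips the random noise some descriptions of the technique add during training.

```python
from collections import defaultdict


def leave_one_out_encode(categories, targets):
    """Encode each row as the mean target of the *other* rows in its category.

    Hypothetical helper for illustration only; not the library's API.
    """
    if len(categories) != len(targets):
        raise ValueError("categories and targets must have the same length")
    if not categories:
        raise ValueError("need at least one row")

    # Per-category running totals of the target and row counts.
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1

    # Assumed fallback for categories seen only once: the overall target mean.
    global_mean = sum(targets) / len(targets)

    encoded = []
    for cat, y in zip(categories, targets):
        if counts[cat] > 1:
            # Subtract this row's own target before averaging.
            encoded.append((sums[cat] - y) / (counts[cat] - 1))
        else:
            encoded.append(global_mean)
    return encoded


# Toy data, made up for this example.
colors = ["red", "red", "red", "blue", "blue", "green"]
clicked = [1, 0, 1, 1, 0, 1]
encoded = leave_one_out_encode(colors, clicked)
print(encoded)
```

Two rows with the same category can end up with different encoded values here, because each one leaves out a different target. That is the point of the technique: it limits how much the target leaks into the feature.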

Install or upgrade using the command:

pip install -U category_encoders

All in all, a fairly large release by our standards, and there are still some open issues to work on. So upgrade, try it out, and let me know what you think. If you'd like to get involved, find us on GitHub here.
