Data Science Things Roundup #3

Time for the third edition of the data science things roundup, where I share a few data science things I’ve come across recently. Check out previous editions here and here.

Self Organizing Maps with TensorFlow

Google’s open sourcing of TensorFlow late last year made a pretty big splash in the machine learning and data science communities, and since then a ton of tutorials, examples and projects have popped up around it. One such example from soon after its release was Sachin Joglekar’s tutorial on creating a self-organizing map (SOM). SOMs are interesting as one of the relatively few unsupervised neural network applications, and are a refreshing respite from image classification. Check it out here.
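If you haven’t run into SOMs before, here is a rough sketch of the core training loop in plain NumPy. To be clear, this is not the code from the tutorial (which uses TensorFlow); the function names, the linear decay schedules, and the default parameters are all my own assumptions, chosen just to show the idea: find the best matching unit for a sample, then pull it and its grid neighbors toward that sample.

```python
import numpy as np


def best_matching_unit(weights, x):
    """Return the (row, col) of the node whose weight vector is closest to x."""
    dists = np.linalg.norm(weights - x, axis=-1)
    return np.unravel_index(np.argmin(dists), dists.shape)


def train_som(data, grid_shape=(5, 5), n_iters=500, lr0=0.5, sigma0=None, seed=0):
    """Train a small SOM on data of shape (n_samples, dim).

    Hypothetical helper for illustration; the decay schedules are assumptions.
    """
    rng = np.random.default_rng(seed)
    rows, cols = grid_shape
    dim = data.shape[1]

    # One weight vector per grid node, randomly initialized in [0, 1).
    weights = rng.random((rows, cols, dim))

    # Grid coordinates of every node, shape (rows, cols, 2).
    coords = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )

    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0

    for t in range(n_iters):
        frac = t / n_iters
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = sigma0 * (1.0 - frac) + 1e-3     # neighborhood radius shrinks

        # Pick a random sample and find its best matching unit.
        x = data[rng.integers(len(data))]
        bmu = best_matching_unit(weights, x)

        # Gaussian neighborhood on the grid, centered on the BMU.
        grid_dist_sq = np.sum((coords - np.array(bmu)) ** 2, axis=-1)
        influence = np.exp(-grid_dist_sq / (2.0 * sigma ** 2))

        # Pull each node toward the sample, scaled by its influence.
        weights += lr * influence[..., None] * (x - weights)

    return weights


if __name__ == "__main__":
    # Toy data: random RGB-like vectors.
    colors = np.random.default_rng(1).random((200, 3))
    som = train_som(colors, grid_shape=(4, 6), n_iters=300)
    print(som.shape)
    print(best_matching_unit(som, colors[0]))
```

Once trained, nearby nodes on the grid end up with similar weight vectors, which is what makes the map handy for visualizing high-dimensional data in 2D.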

Density Based Clustering Toolbox

DeBaCl is a Python library for density-based clustering using level-set trees. This is particularly useful for datasets with differing clustering behavior at different scales. Included in the repo is an in-progress set of example Jupyter notebooks. Check it out here.

Protecting a Python Codebase

Python is taking over the data science and machine learning worlds, but once a project is done and it comes time to package/commercialize/distribute Python code, things can get messy. When SaaS doesn’t work out and you have to physically ship code to a customer, it can be hard to protect IP. In this blog post, Mattias Aguirre walks through some of the options available to Python programmers and the pros/cons of each. Check it out here.

The post Data Science Things Roundup #3 appeared first on Will’s Noise.