Data Science Things Roundup #10

This post was originally published here

Hey all, I haven't done one of these in quite a while, but thought I'd share a few more articles I've found interesting recently.

An analysis of Twitter influencers in the field of data science & big data

This is a pretty in-depth Medium article that goes through some of the concepts in network analysis, through the lens of Twitter data. It's not an area I know a ton about, but I found it approachable and really interesting. Check it out here.

StashPy

I am a pretty heavy user of the Elasticsearch ecosystem, and have found it to be a really powerful tool. I also, as you probably know if you read this blog, work a lot in Python. StashPy is a Python 3 project that does more or less the same thing as a minimal Logstash: it takes a config, listens on a TCP port, and pipes log data through a processing pipeline before indexing it into Elasticsearch. Super cool. Check it out here.
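To make the "config, pipeline, index" idea a bit more concrete, here is a minimal sketch of that flow in plain Python. To be clear, this is not StashPy's actual API or config format: the CONFIG keys, the function names, and the "logs-demo" index name are all made up for illustration, and I've left out the TCP listener and the Elasticsearch client entirely. The only real-world piece is the shape of the output, which follows the newline-delimited action/document pairs that the Elasticsearch bulk API accepts.

```python
import json
import re

# Hypothetical config (not StashPy's real format): a regex used to parse
# each incoming line, static fields to attach, and a target index name.
CONFIG = {
    "pattern": r"(?P<level>[A-Z]+) (?P<message>.*)",
    "add_fields": {"service": "demo"},
    "index": "logs-demo",
}


def parse_line(line, pattern):
    """Turn a raw log line into a dict of named fields."""
    text = line.strip()
    match = re.match(pattern, text)
    if match is None:
        # Keep unparseable lines instead of dropping them, and tag them.
        return {"message": text, "tags": ["_parsefailure"]}
    return match.groupdict()


def process(line, config):
    """Run one line through the (very small) pipeline: parse, then enrich."""
    event = parse_line(line, config["pattern"])
    event.update(config["add_fields"])
    return event


def to_bulk_action(event, index):
    """Format one event as an action line plus a document line (NDJSON)."""
    action = {"index": {"_index": index}}
    return json.dumps(action) + "\n" + json.dumps(event) + "\n"


if __name__ == "__main__":
    for raw in ["ERROR disk full\n", "something unstructured\n"]:
        print(to_bulk_action(process(raw, CONFIG), CONFIG["index"]), end="")
```

In a real setup the loop at the bottom would be fed by a socket server rather than a hard-coded list, and the formatted payload would be sent to Elasticsearch instead of printed.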

Bayesian Survival Analysis with Python and PyMC3

Survival analysis is a really powerful branch of statistics concerned with predicting the time until some event happens. It comes up a lot in the medical field in particular (predicting time to death for different cases, as an example). I've used it lightly in a past post to try to predict the time until a programmer's code would be replaced or deleted; you can check that out here. In this article, Austin walks through the math backing some of the more common algorithms, and then how to translate that into Python. Check it out here.
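If you haven't seen survival analysis before, here is a tiny sketch of the core idea. Note that this is not the Bayesian model from the article and it doesn't use PyMC3 at all; it's the much simpler classical Kaplan-Meier estimator, written in plain Python. The function name and the toy data are my own, made up for illustration. Each observation has a duration and a flag saying whether the event was actually observed (True) or the observation was censored (False), meaning we stopped watching before the event happened.

```python
def kaplan_meier(durations, observed):
    """Estimate the survival curve as a list of (time, probability) pairs.

    durations: how long each subject was followed.
    observed:  True if the event happened at that time, False if censored.
    """
    # The curve only steps down at times where an event was actually seen.
    event_times = sorted({t for t, e in zip(durations, observed) if e})

    survival = 1.0
    curve = []
    for t in event_times:
        # Subjects still being followed just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Events that happened exactly at time t.
        events = sum(1 for d, e in zip(durations, observed) if d == t and e)
        # Multiply in the fraction that made it past time t.
        survival *= 1.0 - events / at_risk
        curve.append((t, survival))
    return curve


if __name__ == "__main__":
    # Toy data: e.g. weeks until a chunk of code was replaced (made up).
    durations = [1, 2, 2, 3, 4]
    observed = [True, True, False, True, False]
    for time, prob in kaplan_meier(durations, observed):
        print(f"t={time}: S(t)={prob:.2f}")
```

The censored observations are the reason you can't just take an average of the durations: they still count toward the "at risk" group up to the point where they drop out, but they never count as events.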
