by Newswire | November 21, 2016
Yhat’s ScienceOps Solves Language Incompatibilities between Artificial Intelligence Algorithms and Digital Applications
Yhat, a software company working to bridge the technological divide between data scientists and engineers, announced today that Lumiata, the AI-powered predictive analytics company, has implemented Yhat’s machine learning deployment platform, ScienceOps. Yhat’s ScienceOps allows data scientists […]
Category Encoders accepted into scikit-learn-contrib
In the past I’ve posted a few times about a library I’m working on called category encoders. The idea of it is to provide a complete toolbox of scikit-learn compatible transformers for the encoding of categorical variables in different ways. If that sounds interesting, you can check out much more in-depth posts here and here. […]
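As a rough illustration of what "scikit-learn compatible transformers" for categorical data can look like, here is a minimal sketch of two encoders that follow the fit/transform convention. The class names `OrdinalSketch` and `OneHotSketch` are hypothetical and are not taken from the category encoders library; the code uses only the Python standard library and makes no claim about how the real package is implemented.

```python
class OrdinalSketch:
    """Toy ordinal encoder: maps each category to an integer code."""

    def fit(self, values):
        # Sort the distinct categories so the codes are stable across runs.
        self.categories_ = sorted(set(values))
        self.index_ = {c: i for i, c in enumerate(self.categories_)}
        return self

    def transform(self, values):
        # Categories not seen during fit get the code -1 (an assumption
        # made for this sketch).
        return [self.index_.get(v, -1) for v in values]

    def fit_transform(self, values):
        return self.fit(values).transform(values)


class OneHotSketch(OrdinalSketch):
    """Toy one-hot encoder: one 0/1 column per category seen during fit."""

    def transform(self, values):
        rows = []
        for v in values:
            row = [0] * len(self.categories_)
            if v in self.index_:
                row[self.index_[v]] = 1  # unseen categories stay all-zero
            rows.append(row)
        return rows


colours = ["red", "green", "red"]
print(OrdinalSketch().fit_transform(colours))
print(OneHotSketch().fit_transform(colours))
```

Both classes expose the same `fit`, `transform` and `fit_transform` methods, which is the property that lets different encodings be swapped for one another in a pipeline.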
Internet of Things: Trends and Challenges – A Data Science Perspective
Filed under: Internet
Installing Python on Windows
If you’ve done any work with Python on Windows, you may be cringing right now at the thought of trying to do any type of Python development work on the platform. Have no fear though…there is hope for Python developers on Windows, especially if you are only going to be using Python for data analysis, […]
The Hack of the Century
Fake news propagated by social media just before the US General Election; hacking of the Democratic Party by hackers of East European Russian origin/affiliation coordinating publication of leaked emails; leaked emails by hackers causing the FBI chief to make a fateful statement days before the election; recanting of the statement by the FBI chief further undermining Clintonian credibility. I really […]
GaianDB ET innovation is core component in upcoming IBM Beta Product
Gaian Database – also known as GaianDB – originated in 2006 from a patented idea in IBM Hursley Emerging Technology Services (ETS) that defines a connection strategy for building a scale-free network. This became a reality in 2007 when GaianDB design and development started, funded jointly by MoD/DoD under the International Technology Alliance (ITA) fundamental […]
The case for machine learning in network security analysis
Security tends to scale badly with complexity. As information, applications and systems become more sophisticated, so too do the challenges faced in assuring Confidentiality, Integrity and Availability. The role of machine assistance is emerging as one of the most important areas in data science, much of which is underpinned by techniques from machine learning and […]
Visualization libraries optimization developing rich dashboards
As you are going through the last analytic mile in your project to visualize and expose powerful insights to Line of Business users, you will undoubtedly need to leverage visualization libraries to display consumable and rich metrics, allowing your users to make the right decision, at the right time, within context. As it is often […]
Debugging the Conversational UI
Conversational UI has always been a reach goal for technologists. Its sheer presence in science-fiction movies alone is an indicator of how much we as a society value this mode of interaction. There are many reasons for this. From an early age, we’re taught how to interact with each other via conversation – wouldn’t it […]
SFN 2016 Presentation
I recently presented at the annual meeting of the Society for Neuroscience, so I wanted to do a quick post describing my findings. The reinforcement learning literature postulates that we go in and out of exploratory states in order to learn about our environments and maximize the reward we gain in these environments. For example, […]
Data Analytics & Python
So you want (or need) to analyze some data. You’ve got some data in an Excel spreadsheet or database somewhere and you’ve been asked to take that data and do something useful with it. Maybe it’s time for data analytics & Python? Maybe you’ve been asked to build some models for predictive analytics. Maybe you’ve […]
Native Python access to IPFS in Jupyter Notebooks
In a previous post, we discussed using the new InterPlanetary File System (IPFS) protocol as a way to load data into Jupyter Notebooks. After experimenting with our previous example code for using IPFS, we decided that using IPFS would be more organic in a Notebook if you could use it from Python as a […]
Random Forests in Python
Random forest is a highly versatile machine learning method with numerous applications ranging from marketing to healthcare and insurance. It can be used to model the impact of marketing on customer acquisition, retention, and churn or to predict disease risk and susceptibility in patients. Random forest is capable of regression and classification. It can handle […]
psutil 5.0.0 is around twice as fast
OK, this is a big one. Starting from psutil 5.0.0 you can query multiple pieces of Process information around twice as fast as with previous versions (see original ticket and updated doc). It took me 7 months, 108 commits and a massive refactoring of psutil internals (here is the big commit), and I can safely say this is one […]
PCA Tutorial
Principal Component Analysis (PCA) is an important method for dimensionality reduction and data cleaning. I have used PCA in the past on this blog for estimating the latent variables that underlie player statistics. For example, I might have two features: average number of offensive rebounds and average number of defensive rebounds. The two features are […]
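To make the two-feature example concrete, here is a minimal sketch of PCA for exactly two features, using only the Python standard library. The rebound numbers are invented for illustration and do not come from the post, and the helper name `pca_2d` is hypothetical. For a 2x2 covariance matrix the eigenvalues and the leading eigenvector have a closed form, so no linear algebra library is needed here.

```python
import math


def pca_2d(points):
    """Return (eigenvalues, first_component, scores) for two-feature data."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum(cx * cx for cx, _ in centered) / (n - 1)
    syy = sum(cy * cy for _, cy in centered) / (n - 1)
    sxy = sum(cx * cy for cx, cy in centered) / (n - 1)

    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    half_trace = (sxx + syy) / 2
    root = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    lam1, lam2 = half_trace + root, half_trace - root

    # Eigenvector belonging to the larger eigenvalue lam1.
    if sxy != 0:
        vx, vy = sxy, lam1 - sxx
    elif sxx >= syy:
        vx, vy = 1.0, 0.0
    else:
        vx, vy = 0.0, 1.0
    norm = math.hypot(vx, vy)
    component = (vx / norm, vy / norm)

    # Project each centered point onto the first principal component.
    scores = [cx * component[0] + cy * component[1] for cx, cy in centered]
    return (lam1, lam2), component, scores


# Made-up (offensive, defensive) rebounds per game for six players.
rebounds = [(0.5, 2.1), (1.0, 3.0), (1.8, 4.9), (2.5, 6.2), (3.1, 8.0), (4.0, 9.8)]
(lam1, lam2), component, scores = pca_2d(rebounds)
print("share of variance on first component:", lam1 / (lam1 + lam2))
print("first component direction:", component)
```

Because the two invented features rise and fall together, nearly all of the variance lands on the first component, so a single score per player could stand in for both rebound columns.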