Stock market forecasting with prophet

In a previous post, I used stock market data to show how prophet detects changepoints in a signal (http://pythondata.com/forecasting-time-series-data-prophet-trend-changepoints/). After publishing that article, I received a few questions asking how well (or poorly) prophet can forecast the stock market, so I wanted to provide a quick write-up on stock market forecasting with prophet. This […]

Importing data from csv file using PySpark

There are two ways to import the csv file: as an RDD, or as a Spark DataFrame (preferred).

    !pip install pyspark
    from pyspark import SparkContext, SparkConf
    sc = SparkContext()

A SparkContext represents the connection to a Spark cluster, and can be used to create RDDs and broadcast variables on that cluster. https://spark.apache.org/docs/latest/rdd-programming-guide.html#overview

To create a […]

Using pandas with large data

Tips for reducing memory usage by up to 90%

When working with small data in pandas (under 100 megabytes), performance is rarely a problem. When we move to larger data (100 megabytes to multiple gigabytes), performance issues can make run times much longer and cause code to fail entirely due to insufficient memory. While tools […]
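As a rough illustration of the kind of memory savings the excerpt refers to, here is a minimal sketch. The excerpt is cut off before it names any techniques, so the approach below (downcasting numeric columns and storing low-cardinality strings as categoricals) is an assumption, not necessarily what the full post does. The helper name `reduce_memory`, its parameters, and the sample data are all hypothetical.

```python
import numpy as np
import pandas as pd


def reduce_memory(df, max_category_ratio=0.5, downcast_floats=False):
    """Return a copy of df with smaller dtypes (hypothetical helper).

    Integers are downcast only when every value still fits.
    Floats become float32 only if downcast_floats is True, because
    that conversion rounds values to float32 precision.
    """
    out = df.copy()
    for col in out.columns:
        series = out[col]
        if pd.api.types.is_integer_dtype(series):
            # Picks the smallest signed integer type that holds all values.
            out[col] = pd.to_numeric(series, downcast="integer")
        elif pd.api.types.is_float_dtype(series):
            if downcast_floats:
                out[col] = series.astype("float32")
        elif pd.api.types.is_string_dtype(series) and not isinstance(
            series.dtype, pd.CategoricalDtype
        ):
            # Categoricals help only when few distinct values repeat often.
            if series.nunique() / max(len(series), 1) < max_category_ratio:
                out[col] = series.astype("category")
    return out


# Made-up sample data, purely for illustration.
n = 100_000
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {
        "user_id": np.arange(n, dtype="int64"),
        "score": rng.random(n),
        "country": rng.choice(["US", "DE", "JP"], size=n),
    }
)

small = reduce_memory(df, downcast_floats=True)

# deep=True counts the actual bytes held by string values.
before = int(df.memory_usage(deep=True).sum())
after = int(small.memory_usage(deep=True).sum())
print(f"before: {before} bytes, after: {after} bytes")
```

How much this saves depends entirely on the data: frames dominated by repeated strings tend to shrink the most, while frames of high-precision floats may not be safe to shrink at all.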