In the previous post we discussed how to connect jupyter notebook to pyspark. Going forward, in this post I will discuss how you can run python scripts, and analyze and build Machine Learning models, on top of data stored in
Written by Supun Setunga
Prerequisites
Install jupyter.
Download and uncompress the spark 1.6.2 binary.
Download pyrolite-4.13.jar.
Set Environment Variables
Open ~/.bashrc and add the following entries:
    export PYSPARK_DRIVER_PYTHON=ipython
    export PYSPARK_DRIVER_PYTHON_OPTS='notebook' pyspark
    export PYSPARK_PYTHON=/home/supun/Supun/Softwares/anaconda3/bin/python
    export SPARK_HOME="/home/supun/Supun/Softwares/spark-1.6.2-bin-hadoop2.6"
    export PATH="/home/supun/Supun/Softwares/spark-1.6.2-bin-hadoop2.6/bin:$PATH"
    export PYTHONPATH=$SPARK_HOME/python/lib/py4j-0.9-src.zip:$PYTHONPATH
    export PYTHONPATH=$SPARK_HOME/python:$PYTHONPATH
    export PYTHONPATH=$SPARK_HOME/python/lib:$PYTHONPATH
    export SPARK_CLASSPATH=/home/supun/Downloads/pyrolite-4.13.jar
If you are
Prerequisites:
Install python.
Install ipython notebook.
Create a directory as a workspace for the notebook, and navigate to it. Start jupyter by running: jupyter notebook
Create a new python notebook. To use a Pandas DataFrame in this notebook script, we first need to import
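As a minimal sketch of what a first notebook cell might look like, assuming pandas is installed in the notebook's python environment: the column names and values below are hypothetical sample data, not taken from the original post.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({"name": ["a", "b", "c"], "value": [1, 2, 3]})

# Inspect the shape (rows, columns) and aggregate one column.
print(df.shape)
total = int(df["value"].sum())
print(total)
```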
This post will discuss how to set up a fully distributed hbase cluster. Here we will not run zookeeper as a separate server, but will use the zookeeper instance that is embedded in hbase itself. Our setup will consist of 1 master
Here I will discuss how to set up a fully distributed hadoop cluster with 1 master and 2 slaves. The three nodes are set up on three different machines.
Updating Hostnames
To start things off, let's first give hostnames to the three nodes.
Heap Dump: jmap -dump:live,format=b,file=<filename>.hprof <PID>
Thread Dump: jstack <PID> > <filename>
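If these dumps need to be taken from a script, the two commands above could be wrapped roughly as follows. This is only a sketch: the function names are hypothetical, and it assumes jmap and jstack from the JDK are on the PATH.

```python
import subprocess
from typing import List


def heap_dump_command(pid: int, filename: str) -> List[str]:
    """Build the jmap argument list for a live-object heap dump."""
    return ["jmap", f"-dump:live,format=b,file={filename}.hprof", str(pid)]


def thread_dump_command(pid: int) -> List[str]:
    """Build the jstack argument list for a thread dump."""
    return ["jstack", str(pid)]


def write_thread_dump(pid: int, filename: str) -> None:
    """Run jstack and redirect its output to a file (like `jstack <PID> > <filename>`)."""
    with open(filename, "w") as out:
        subprocess.run(thread_dump_command(pid), stdout=out, check=True)


# Only prints the command; nothing is executed here.
print(heap_dump_command(12345, "heap"))
```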
Log in to mysql with your username and password. e.g.: mysql -u root -proot
Then execute the following command:
    SELECT table_schema "DB Name", ROUND(SUM(data_length + index_length)/1024/1024, 2) "Size in MBs" FROM information_schema.tables GROUP BY table_schema;
Here SUM(data_length + index_length) is in bytes. Hence we have
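The byte-to-megabyte arithmetic used in the query can be sketched in python as below. The helper name and the table sizes are hypothetical, chosen only to illustrate the division by 1024 twice.

```python
def bytes_to_megabytes(size_in_bytes: int) -> float:
    """Convert bytes to MB (1 MB = 1024 * 1024 bytes), rounded to 2 decimals."""
    return round(size_in_bytes / 1024 / 1024, 2)


# Hypothetical per-table sizes (data_length + index_length), in bytes.
table_sizes = [5_242_880, 1_048_576, 524_288]

# Sum first, then convert, mirroring SUM(...)/1024/1024 in the query.
total_mb = bytes_to_megabytes(sum(table_sizes))
print(total_mb)
```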
What is stacking? Stacking is one of the three widely used ensemble methods in Machine Learning and its applications. The overall idea of stacking is to train several models,
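To make the idea concrete, here is a deliberately small sketch of stacking in plain python. It is not the method from the original post: it assumes a simplified hold-out variant in which two base models are trained on one part of the data and a meta-model learns how to weight their predictions on the other part. All function names and the toy data are hypothetical.

```python
from statistics import mean


def fit_mean(xs, ys):
    """Base model 1: always predicts the mean of the training targets."""
    m = mean(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Base model 2: simple least-squares line y = a + b * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    return lambda x: a + b * x


def fit_meta(p1, p2, ys):
    """Meta-model: least-squares weights for y ~ w1 * p1 + w2 * p2 (2x2 normal equations)."""
    a11 = sum(p * p for p in p1)
    a12 = sum(p * q for p, q in zip(p1, p2))
    a22 = sum(q * q for q in p2)
    b1 = sum(p * y for p, y in zip(p1, ys))
    b2 = sum(q * y for q, y in zip(p2, ys))
    det = a11 * a22 - a12 * a12
    return (b1 * a22 - b2 * a12) / det, (a11 * b2 - a12 * b1) / det


# Toy data following y = 2x + 1, split into two parts.
xs = list(range(8))
ys = [2 * x + 1 for x in xs]
xs_a, ys_a, xs_b, ys_b = xs[:4], ys[:4], xs[4:], ys[4:]

# Step 1: train the base models on part A.
models = [fit_mean(xs_a, ys_a), fit_line(xs_a, ys_a)]

# Step 2: train the meta-model on the base models' predictions for part B.
preds_b = [[m(x) for x in xs_b] for m in models]
w1, w2 = fit_meta(preds_b[0], preds_b[1], ys_b)


def stacked_predict(x):
    """Final prediction: weighted combination of the base model outputs."""
    return w1 * models[0](x) + w2 * models[1](x)


print(stacked_predict(10))
```

On this toy data the meta-model learns to rely on the line model, since the mean model adds nothing on the held-out part.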
In Spark, a transformer is used to convert one Dataframe into another. But because Dataframes are immutable (i.e. existing values of a Dataframe cannot be changed), if we need to transform values in a column, we have to create a new
Java profiling can help you assess the performance of your program, improve your code, and identify defects such as memory leaks and high CPU usage. Here I will discuss how to profile your code using the java inbuilt utility
ESB Analytics Server is the analytics distribution for the WSO2 ESB, which is built on top of WSO2 Data Analytics Server (DAS). Analytics for ESB includes an inbuilt dashboard for statistics and tracing visualization for Proxy Services, APIs, Endpoints, Sequences and Mediators.
By default, access to mysql databases is restricted to the server that is running mysql itself. Hence, if we need to log in to the mysql console or use a database from a remote server, we need to enable remote access in the configs. Open
WSO2 Data Analytics Server (DAS) can be used to do various kinds of batch data analytics and to create dashboards out of that data. In this blog, I will discuss how you can create a simple dashboard using the data read from
Sample SOAP Message with WSSE Header: <soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:echo="http://echo.services.core.carbon.wso2.org">    <soapenv:Header>       <wsse:Security soapenv:mustUnderstand="1" xmlns:wsse="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd">          <wsu:Timestamp wsu:Id="Timestamp-13" xmlns:wsu="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd">             <wsu:Created>2015-05-21T04:17:56.541Z</wsu:Created>             <wsu:Expires>2015-09-21T04:22:56.541Z</wsu:Expires>          </wsu:Timestamp>
Seasonal time series data can be easily modeled with methods such as Seasonal-ARIMA, GARCH and HoltWinters. These are readily available in statistical packages like R and STATA. But if you want to model a seasonal time series using Java, there are only very