vzlatkin/Stocks2HBaseAndSolr

Visualize near-real-time stock price changes using Solr and Banana UI

The goal of this tutorial is to create a moving chart that shows the changes in price of a few stock symbols, similar to Google Finance or Yahoo Finance.

Summary of steps

  1. Download and install the HDP Sandbox
  2. Download and install the latest NiFi release
  3. Create a Solr dashboard to visualize the results
  4. Create a new NiFi flow to pull from Google Finance API, transform, and store in HBase and Solr

Step-by-step

1. Download and install the HDP Sandbox

Download the latest (2.3 as of this writing) HDP Sandbox here. Import it into VMware or VirtualBox, start the instance, and update the DNS entry on your host machine to point to the new instance’s IP.

On Mac, edit /etc/hosts; on Windows, edit %systemroot%\system32\drivers\etc\hosts as administrator. Add a line similar to the one below:

192.168.56.102  sandbox sandbox.hortonworks.com
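
On Mac or Linux, the entry can also be added from a script. The sketch below is one way to do that without creating duplicates. The helper name `add_host_entry` is hypothetical, and the IP is only the example value from above; yours will likely differ.

```shell
# Hypothetical helper (not part of this repository): append a hosts entry
# unless the file already contains a line mentioning the first hostname.
add_host_entry() {
  hosts_file="$1"   # path to the hosts file, e.g. /etc/hosts
  ip="$2"           # IP of the sandbox VM (assumed; check your own instance)
  names="$3"        # space-separated hostnames to map to that IP
  first_name="${names%% *}"
  if grep -qw -- "$first_name" "$hosts_file" 2>/dev/null; then
    echo "entry for $first_name already present in $hosts_file"
  else
    printf '%s  %s\n' "$ip" "$names" >> "$hosts_file"
  fi
}
```

From a root shell, this would be called as `add_host_entry /etc/hosts 192.168.56.102 "sandbox sandbox.hortonworks.com"`.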

2. Download and install the latest NiFi release

Follow the directions here. These are the steps I ran for version 0.4.1:

cd /tmp
wget http://apache.cs.utah.edu/nifi/0.4.1/nifi-0.4.1-bin.zip
cd /opt/
unzip  /tmp/nifi-0.4.1-bin.zip
useradd nifi
chown -R nifi:nifi /opt/nifi-0.4.1/
perl -pe 's/run.as=.*/run.as=nifi/' -i /opt/nifi-0.4.1/conf/bootstrap.conf
perl -pe 's/nifi.web.http.port=8080/nifi.web.http.port=9090/' -i /opt/nifi-0.4.1/conf/nifi.properties
/opt/nifi-0.4.1/bin/nifi.sh start
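
It can be worth confirming that the two perl substitutions actually changed the configuration files. The following is a small sketch, assuming the install path used above; `get_prop` is a hypothetical helper, not a NiFi tool.

```shell
# Hypothetical helper: print the value of a key from a Java-style
# properties file (first match only; comment lines are not matched because
# the pattern is anchored at the start of the line).
get_prop() {
  file="$1"
  key_re=$(printf '%s' "$2" | sed 's/\./\\./g')   # escape dots in the key
  sed -n "s/^${key_re}=//p" "$file" | head -n 1
}

# Example usage (paths assume the 0.4.1 layout from the steps above):
#   get_prop /opt/nifi-0.4.1/conf/bootstrap.conf run.as
#   get_prop /opt/nifi-0.4.1/conf/nifi.properties nifi.web.http.port
```

If the edits took effect, the two calls should print `nifi` and `9090`. Running `/opt/nifi-0.4.1/bin/nifi.sh status` then reports whether the instance is up.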

3. Create a Solr dashboard to visualize the results

Download a new Solr dashboard, start the service, and create a new collection to store stock price changes:

export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk.x86_64
wget https://raw.githubusercontent.com/vzlatkin/Stocks2HBaseAndSolr/master/Solr%20Dashboard.json -O /opt/lucidworks-hdpsearch/solr/server/solr-webapp/webapp/banana/app/dashboards/default.json
/opt/lucidworks-hdpsearch/solr/bin/solr start -c -z localhost:2181 
/opt/lucidworks-hdpsearch/solr/bin/solr create -c stocks -d data_driven_schema_configs -s 1 -rf 1
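
To check that the collection exists, and later that documents are arriving, you can query Solr's select handler. This sketch assumes Solr is listening on its default port 8983 on the sandbox; `solr_select_url` is a hypothetical helper that only builds the URL.

```shell
# Hypothetical helper: build a select URL that returns only the match count
# (rows=0) as JSON for the given collection.
solr_select_url() {
  host="$1"         # e.g. sandbox.hortonworks.com
  collection="$2"   # e.g. stocks
  port="${3:-8983}" # 8983 is Solr's default port; adjust if yours differs
  printf 'http://%s:%s/solr/%s/select?q=*:*&rows=0&wt=json\n' "$host" "$port" "$collection"
}

# Example usage (requires curl and a running Solr instance):
#   curl -s "$(solr_select_url sandbox.hortonworks.com stocks)"
```

The `numFound` value in the response should start at 0 and grow once the NiFi flow is running.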

4. Create a new NiFi flow to pull from Google Finance API, transform, and store in HBase and Solr

Solr is used for indexing the data, Banana UI is used for visualization, and HBase is used for future-proofing. HBase can be used to further analyze the data from Storm/Spark or to create a custom UI. To get the data into these tools, follow the steps below:

  • Start HBase via Ambari
  • Create a new table:
    hbase shell
    hbase(main):001:0> create 'stocks', 'cf'
    	
  • Then download this NiFi template to your host machine.
  • To import the template, open the NiFi UI
  • Open Templates manager:

  • Find the template on your local machine and import it:

  • Drag and drop to instantiate a new template:

  • Double click the new process group:

  • You'll need to enable the HBase shared controller service. To do so, right-click the "Send to HBase" processor, click "Configure", then "Properties", then the "Go to" arrow to access the controller service. Finally, click the "Enable" button.

  • Now start all of the processors. Hold down the Shift key and select all of the processors on the screen. Then click the start button:

You should see a flow that looks like the screenshot below.

There are so many processors because the response from the Google Finance API needs to be transformed. First, we remove the comment characters '//' from the response. Second, we split the array into individual JSON objects. Third, we extract the relevant attributes. Fourth, the timestamp is formatted as UTC but is actually in the EST time zone, so we correct it. Finally, we send the information to HBase, Solr, and the NiFi bulletin board for logging.
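
The first and fourth steps are simple text manipulations and can be sketched outside NiFi. This is only an illustration of the idea, not the processors the template uses. The function names and sample values are hypothetical, and the timestamp fix assumes a fixed EST offset of -05:00.

```shell
# Illustration only: function names and sample values are made up,
# and this is not how the NiFi template itself is implemented.

# Step 1: remove a leading '//' comment marker from each line of the response
# so that the remainder can be parsed as JSON.
strip_comment_prefix() {
  sed 's|^[[:space:]]*//[[:space:]]*||'
}

# Step 4: the timestamp ends in 'Z' (UTC) but the clock time is really
# Eastern. Relabel it with an explicit offset. Assumption: a fixed -05:00
# (EST); daylight saving time (-04:00) is not handled here.
relabel_as_est() {
  printf '%s\n' "${1%Z}-05:00"
}

# Example usage:
#   printf '// [{"t":"GOOG"}]\n' | strip_comment_prefix
#   relabel_as_est "2016-01-15T16:00:02Z"
```

With the relabeling, a value such as `2016-01-15T16:00:02-05:00` denotes the same instant as 21:00:02 UTC, which is what a downstream store expecting UTC would need.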

Conclusion

Now open the Banana UI. If you are doing this while the US stock markets are open (9:30am to 4pm Eastern Time), you should see a dashboard similar to the one below.
