Showing posts with label Analytics. Show all posts
Friday, May 8, 2015
Data Science with Python
At the last Tech Talk Tuesday we gave an overview of Python's data science packages.
The key packages for numerical computing are NumPy, SciPy and scikit-learn. The documentation for Python is great, and makes presentations like this easy. These packages are loaded with code samples, even for complex concepts like grid search and cross-validation. The machine learning package, scikit-learn, also has exercises below the code samples. Doing the exercises reinforces the concepts, and is great preparation for solving problems like the ones in Kaggle competitions.
We also demoed IPython Notebooks, a fantastic way to create live data analysis documents.
Labels:
Analytics,
data driven documents,
Data Science,
ipython notebook,
NumPy,
scikit-learn,
scipy
Thursday, January 2, 2014
Running OpenTSDB on Amazon EC2
Although there are cheaper alternatives for production systems, it's easy enough to get the Open Time Series Database (OpenTSDB) running on an Amazon Web Services EC2 instance.
- First you'll need to run HBase on EC2
- Make a data directory: mkdir hbase_data
- vi hbase-0.94.13/conf/hbase-site.xml
- Using vi, update the hbase.rootdir property value to: file:///home/ec2-user/hbase-0.94.13/hbase-${user.name}/hbase
- sudo yum install git
- git clone git://github.com/OpenTSDB/opentsdb.git
- sudo yum install automake
- sudo yum install gnuplot
- cd opentsdb
- ./build.sh
- env COMPRESSION=NONE HBASE_HOME=path/to/hbase-0.94.X ./src/create_table.sh
- tsdtmp=${TMPDIR-'/tmp'}/tsd
- mkdir -p "$tsdtmp"
- ./build/tsdb tsd --port=4242 --staticroot=build/staticroot --cachedir="$tsdtmp"
- In AWS, click on your EC2 instance, then click "Security Groups" at the bottom left. Click on the default group, then click the "Inbound" tab. There you can open port 4242 for inbound traffic.
Browsing to your instance's IP address on port 4242 will display the web UI for your instance of OpenTSDB.
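Once the TSD is listening, a quick way to check it is to push a data point from a script. The following is a minimal Python sketch, not part of the original walkthrough. It assumes the TSD accepts OpenTSDB's telnet-style put command on port 4242 and that the metric already exists (for example, created beforehand with ./build/tsdb mkmetric). The metric name, tag and host shown are hypothetical placeholders.

```python
import socket
import time


def format_put(metric, value, tags, timestamp=None):
    """Build one line of OpenTSDB's telnet-style protocol:
    put <metric> <timestamp> <value> <tagk=tagv> [<tagk=tagv> ...]
    """
    if not tags:
        # OpenTSDB expects at least one tag on every data point.
        raise ValueError("at least one tag is required")
    if timestamp is None:
        timestamp = int(time.time())  # seconds since the Unix epoch
    tag_text = " ".join(f"{key}={val}" for key, val in sorted(tags.items()))
    return f"put {metric} {timestamp} {value} {tag_text}\n"


def send_put(host, line, port=4242, timeout=5.0):
    """Open a TCP connection to the TSD and send a single put line."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(line.encode("ascii"))


# Hypothetical metric and tag, shown only to illustrate the line format.
example_line = format_put("sys.cpu.user", 42.5, {"host": "web01"},
                          timestamp=1388620800)
print(example_line, end="")
# To actually send it, replace the host with your instance's address:
# send_put("your-ec2-public-ip", example_line)
```

If the point arrives, the metric should become available to graph in the web UI.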
Tuesday, December 24, 2013
The Journal of Trading: Smart Technology for Big Data
Smart Technology for Big Data was published in the Winter edition of the Journal of Trading. You need to register to read it. Here's the abstract:
This article provides an underlying structure for managing the big data phenomenon. Innovations and tools fundamental to handling big data are highlighted, and we look at how these technologies are being implemented in the financial industry.
See more at: http://www.iijournals.com/doi/abs/10.3905/jot.2013.9.1.057
Wednesday, December 18, 2013
Institutional Investor Journals: Big Data Article
UPDATE: Smart Technology for Big Data was published in the Winter edition of Journal of Trading, so the links below no longer work. You can access the article here: Smart Technology for Big Data (You'll still need to register if you haven't)
My article Smart Technology for Big Data is published under advanced content at the Institutional Investor site. You'll need to complete the free registration to read it. Enjoy!
Friday, April 26, 2013
Predicting the Stock Market using Big Data
In the paper Quantifying Trading Behavior in Financial Markets Using Google Trends, researchers Tobias Preis, Helen Susannah Moat and H. Eugene Stanley have shown that an increase in Google Trends activity for certain search terms correlates with a decline in stock prices in the Dow Jones Industrial Average (DJIA). The authors then compared investment strategies to show that the search activity isn't just a correlation, but can be used as a valid predictor of market activity.
This graph shows the DJIA on the left, and color-codes three-week periods in the graph according to search frequencies of the word debt. Note that red weeks, like late October of 2008, correspond with declines in the DJIA. So when lots of people were searching for the word debt, the stock market went down.
The word debt was the best-performing term in the study. Notably, it performed better than terms like nyse, nasdaq, and dow jones.
The researchers then compared a Google Trends investment strategy to a basic Buy and Hold strategy and a random Dow Jones strategy.
The results were remarkable. The Google Trends strategy far exceeded the other strategies.
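To make the idea concrete, here is a minimal Python sketch of a search-volume trading rule of this kind. It is an illustration, not the authors' code. It assumes the rule works as follows: compare each week's search volume with the average of the previous three weeks, go short for the following week if volume rose, and go long if it fell. The function names and the toy numbers are hypothetical and are not data from the study.

```python
def trend_signals(search_volume, window=3):
    """Return one signal per week from index `window` onward:
    -1 (short) if volume is above the average of the prior `window`
    weeks, +1 (long) if below, 0 if unchanged."""
    signals = []
    for t in range(window, len(search_volume)):
        baseline = sum(search_volume[t - window:t]) / window
        delta = search_volume[t] - baseline
        if delta > 0:
            signals.append(-1)
        elif delta < 0:
            signals.append(1)
        else:
            signals.append(0)
    return signals


def strategy_growth(search_volume, prices, window=3):
    """Growth factor of a portfolio that acts on each signal during the
    week after it was observed (position opened at prices[s + 1] and
    closed at prices[s + 2], where s is the week of the signal)."""
    growth = 1.0
    for i, signal in enumerate(trend_signals(search_volume, window)):
        s = window + i
        if s + 2 >= len(prices):
            break  # not enough price history left to close the trade
        weekly = prices[s + 2] / prices[s + 1]
        # A long position gains `weekly`; a short position gains 1 / weekly.
        growth *= weekly ** signal
    return growth


def buy_and_hold_growth(prices):
    """Growth factor from buying at the first price and selling at the last."""
    return prices[-1] / prices[0]


# Toy weekly series, invented purely for illustration.
toy_volume = [10, 10, 10, 20, 5, 5]
toy_prices = [100, 100, 100, 100, 100, 90, 99]
print(strategy_growth(toy_volume, toy_prices))
print(buy_and_hold_growth(toy_prices))
```

In this toy series the spike in searches precedes a price drop, so the rule is short during the decline and long during the rebound, while buy and hold simply tracks the index.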
Wednesday, April 3, 2013
Wednesday, March 27, 2013
"Big Data" is so 1998
This 1998 Silicon Graphics ad from Black Enterprise magazine offers solutions for a "Big Data" world: 256GB of system memory on a server and 400 terabytes of storage. Not bad for the 20th century. Or for this century.

The "Big Data" buzzword almost caught on in 1998, but its sister buzzword, Data Mining, won out. The first chapter of "Predictive Data Mining: A Practical Guide" (also from 1998) is titled "Big Data". In it, the author Sholom M. Weiss asks, "Is data mining a revolutionary new concept? or can we benefit from the many years of research on data analysis?"

Weiss goes on to say, "While big data have the potential for better results, there is no guarantee that they are more predictive than small data." With all the hype around Big Data, it helps to get back to the origins of the term and realize that it's one of many interesting problems that experts in a variety of disciplines have been wrestling with for a long time.
Labels:
1998,
Analytics,
Big Data,
buzzwords,
Data Mining,
predictive analytics