September 27, 2009

Presenting at Inaugural CoPUG

Filed under: hadoop,python — jonEbird @ 8:34 pm

Tomorrow I will be presenting "Introduction to Hadoop: Driven by Python" at the inaugural meeting of the Central Ohio Python Users Group, or CoPUG for short.

I have high hopes for CoPUG. The organizer, Eric Floehr, appears to be a well-organized, competent individual, although we have only exchanged emails and have yet to meet in person. While in Atlanta last year for PyWorks, I learned of the very strong PyAtl group led by none other than the current editor of Python Magazine, Brandon Rhodes. I am not sure, but I wonder if their Python group has something to do with PyCon coming to Atlanta in 2010. Can I dream of PyCon someday coming to Columbus?

My "Introduction to Hadoop: Driven by Python" slides are provided under the Creative Commons Attribution 3.0 United States License.

August 10, 2009

Hadoop Elephant Makes a Big Splash

Filed under: blogging,hadoop,python — jonEbird @ 5:27 pm

Big news in the world of Hadoop today. My article, Running Large Python Tasks With Hadoop, is published in the July edition of Python Magazine. This marks my second article with the magazine, and I had a lot of fun writing it. My interest in the anti-rdbms will persist as I continue to find interesting ways to organize data in the enterprise.

While providing a gentle introduction to Hadoop, my article also introduces readers to my HadoopCalculator, which you can install in a couple of different ways. The first way is via git, where you can pull my HadoopUtils repo from github with:

git clone git://

That will bring a few more scripts than just my HadoopCalculator. The second way to install is to use the Python setuptools utility easy_install or pull down the source package from the Cheese Shop.

Thank you for reading this far. I lied. The big news today in the Hadoop world is Doug Cutting joining Cloudera. Had you going, didn’t I? Recently, while Doug was still with Yahoo!, the Microsoft and Yahoo! partnership had people wondering what impact it would have on the Hadoop ecosystem. Today, Yahoo! is the largest Hadoop user and, for obvious reasons, has contributed a lot to the community. Cloudera was already a well-known player in the Hadoop community, but their stock has risen immensely with the addition of Doug Cutting. If they were selling stock, I’d buy.