Python 2.x MySQLdb is a piece of s***

Sometimes it’s just ridiculous… the Python community in general goes by the “batteries included” motto. Yet, once again, the failure to support MySQLdb on newer Python versions is nothing more than a classic FAIL. I have been searching for a way to get MySQLdb working with Python 2.6, let alone having hopes for Python 3.0.

Maybe it’s a side effect of late-night hacking, but I am eager to find a solution.

posted 5 months ago

$10 Well Spent

If you are a technologist, a program/product manager or even an executive, take a look at Scrumban. It’s a no-BS book on software development principles. It covers everything from lean and agile to kanban principles and methods. Working at a startup for a year has made me realize how even small product decisions can have a huge impact. In this context, the book lays down general guidelines on how software development should be approached.

Traditional approaches have been waterfall- or prototype-based in nature. Most modern approaches like Agile are an improvement, but people still misinterpret them, and to some extent abuse them by following them to the letter. The writing style of this book is unstructured, which is fine because it still has a consistent flow to it.

I will write more about this book as I read it.

posted 5 months ago

Toying around with PHP

Since I already know Python a little bit, I think it makes sense to learn PHP. A couple of thoughts so far: PHP is less strict about syntax than Python is. At the same time, my rant about developing web-based applications in Python is that there are too many web frameworks out there and sadly none of them is standard.

In Python land, there are frameworks like Web.py, Pylons, Django and, more recently, Google App Engine. After a little more research, the issue with Google App Engine is that it is still going to take a while to become mainstream with current Python releases. As of now, Google’s take on it is simple: we will only support Google App Engine for Python 2.5.x. Web.py is a simplistic framework; I like it for REST API implementations. Pylons, I guess, has been in production at Reddit.com. Django, I guess, is what people still think is the best we have that stands up to RoR?

Coming back to PHP, it’s good! I like the standardization and its track record as one of the most scalable languages. People still think PHP is horizontally scalable, i.e. add more boxes and things will take care of themselves. I expect to write more about my PHP vs. Python experience here.

posted 10 months ago

Reality Check -- My thoughts

I finished reading “Reality Check” by Guy Kawasaki. A couple of comments: he uses some material from his earlier book “The Art of the Start” (which BTW is a good read too!). I am happy to see him expand on some of the topics covered in that book. He brings in great entrepreneurs from the industry, who share their experiences. Some of my personal favourite chapters from the book are 1, 17, 21 and 29. The book gives great insight into investors, engineers, lawyers and the like. The biggest takeaways that I have from this book are:
- Just Do it!
- Have some great people around you.
- The right idea and execution are two sides of the same coin

And perhaps the biggest lessons I have learned from reading a lot of similar books are:
- No amount of reading and training will prepare you for real life.
- Take such advice with a pinch of salt and not as a rule of thumb. Things are different and so are assumptions and people.

posted 10 months ago

Google App Engine first impressions

I was one of the fortunate developers to get a Google App Engine account, and I am trying to make use of it. I am developing a simple video sharing application, KaveMunn. It’s pretty basic: it allows you to quickly share a link.

The feature I like the most is its SDK. You develop offline and, once you are ready to go, just do an update and your app gets deployed. Thus there is no hassle of setting up a box in a datacenter. The whole app is written entirely in Python. It’s pretty neat!

The best parts of Google App Engine include:

  • Support for serving static files.
  • Uses Google’s Authentication API.
  • Has its own backend storage which uses BigTable.
  • Framework is MVCish.

Now users can focus on developing features as opposed to worrying about scalability.

posted 1 year ago

Shared Nothing Programming

So recently there has been a lot of buzz about shared-nothing architectures and databases. This is partly because people are realizing that thread programming is *hard*. With chip manufacturers racing to fit more cores on a chip, programmers are realizing that they need to keep up with this change.

Functional languages like Haskell, Scheme and Erlang favor single assignment to variables (Haskell and Erlang enforce it); any subsequent change needs a new binding. This is good in a way because it makes the programmer think about the flow of the program with a “shared nothing” philosophy in mind. Google realized the power of using the abstract semantics of the map and reduce functions available in functional programming long before MapReduce became famous. I think over the years MapReduce has become the de facto design pattern for people wanting to exploit the “shared nothing” paradigm.
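
To make the map and reduce idea a bit more concrete, here is a minimal word-count sketch in plain Python. It is only an illustration of the pattern, not how Google's MapReduce or Hadoop is actually implemented; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for this post. The point is that each function only looks at its own input, so nothing is shared between calls.

```python
from collections import defaultdict
from functools import reduce


def map_phase(line):
    """Emit (word, 1) pairs for one input line; touches no outside state."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Fold all values for one key into a single result."""
    return key, reduce(lambda a, b: a + b, values, 0)


def word_count(lines):
    """Run map, shuffle and reduce in sequence on a list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

Because `map_phase` and `reduce_phase` never touch anything outside their arguments, each call could in principle run on a different core or a different box.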

I think a large part of the programming community will still take time to realize the true power of this approach, and before that happens there needs to be a better understanding of it. However, the huge percentage of enterprise applications written or designed on legacy systems will challenge the new approach. So I guess we need better migration tools!

posted 1 year ago

Hadoop + MySQL == Killer Combo?

[Firstly, I would like to declare my ignorance of databases, data warehousing and data processing. Hopefully I will get better with time.]

So for the last couple of days I have been working a lot with Hadoop and MySQL. This is what I have been doing:

  1. Take the raw data and push it off to HDFS
  2. Clean the data using simple streaming jobs in Python
  3. Get the data to a local drive and ingest it into MySQL
  4. Run queries and get the data visualized
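
For the cleaning step, a streaming job in Python is really just a script that reads lines on stdin and writes lines to stdout. Below is a rough sketch of what such a mapper could look like. The record layout (three tab-separated fields: timestamp, user id, url) is purely an assumption for illustration, as are the names `clean_line` and `run_mapper`; my actual data and scripts look different.

```python
import sys

EXPECTED_FIELDS = 3  # assumed layout: timestamp, user id, url (hypothetical)


def clean_line(line):
    """Return a normalized tab-separated record, or None if the line is malformed."""
    fields = [f.strip() for f in line.rstrip("\n").split("\t")]
    if len(fields) != EXPECTED_FIELDS or not all(fields):
        return None
    timestamp, user_id, url = fields
    return "\t".join([timestamp, user_id, url.lower()])


def run_mapper(stdin=sys.stdin, stdout=sys.stdout):
    """Read raw lines from stdin and write only the cleaned ones to stdout."""
    for line in stdin:
        cleaned = clean_line(line)
        if cleaned is not None:
            stdout.write(cleaned + "\n")

# In a real mapper script you would call run_mapper() at the bottom of the file.
```

Keeping the per-line logic in `clean_line` means it can be tried out on a handful of sample lines locally before the job ever goes near the cluster.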

And I must tell you, I have been seeing a lot of patterns in the jobs that I have been running. These typically include:

  1. Filtering: Taking some dataset and filtering out data based on some predicate (on either the key or the value of key-value pairs).
  2. Joins: Taking dataset A and dataset B and merging them on some key they have in common.
  3. Aggregating: Given key, value pairs, emitting key, function(value1, value2, …, valueN)
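
The three patterns above can be sketched as small helpers over in-memory lists of (key, value) pairs. This is only a toy version to show the shape of such a library; the names `filter_pairs`, `join_pairs` and `aggregate_pairs` are hypothetical, the join is assumed to be an inner join, and the aggregate function here takes the values as one list rather than as separate arguments. Real Hadoop jobs would of course work on streams, not lists.

```python
from collections import defaultdict


def filter_pairs(pairs, predicate):
    """Filtering: keep the (key, value) pairs for which predicate(key, value) is true."""
    return [(k, v) for k, v in pairs if predicate(k, v)]


def join_pairs(left, right):
    """Joins: inner join of two datasets on their keys."""
    right_by_key = defaultdict(list)
    for k, v in right:
        right_by_key[k].append(v)
    return [(k, (lv, rv)) for k, lv in left for rv in right_by_key.get(k, [])]


def aggregate_pairs(pairs, function):
    """Aggregating: emit (key, function([value1, ..., valueN])) for each key."""
    grouped = defaultdict(list)
    for k, v in pairs:
        grouped[k].append(v)
    return [(k, function(values)) for k, values in sorted(grouped.items())]
```

For example, with `views = [("a", 3), ("b", 1), ("a", 4)]`, calling `aggregate_pairs(views, sum)` groups the values by key and sums each group.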

I know HBase and Pig might have better implementations of these functions, but it would be nice to have some library for Hadoop jobs which developers can reuse too.

But the key question is: should I use Hadoop just for data cleaning and do the rest in MySQL? If so, what kind of load can MySQL handle on the same number of boxes? If not, how quickly can Hadoop generate results for SQL-like queries? Or, even better, do Hadoop and MySQL complement each other?

Honestly, the answers to these questions are subject to the following constraints:

  • Sizes of datasets
  • Frequency and complexity of executing SQL queries vs. programming and running Hadoop jobs
  • Engineering time at hand

I will write more about my experiences soon. So stay tuned!

posted 1 year ago

Time series analysis

The temporal aspect of the internet is really interesting, especially now that we have “streaming” data easily accessible via REST APIs. The following are some interesting sources of data which could be exploited for extracting all sorts of knowledge:

  • Search query logs (if you are lucky enough to get access)
  • Feeds (RSS, FriendFeed-like APIs)
  • News 2.0-like sites (Digg, Reddit etc.)

There could be more, but I can’t think of any at this point in time. The more interesting question is: how does information evolve and spread over time, and can there be early indicators of events happening in real life?

That’s a really tough question for me to answer, but there is tons of academic work done in this field. Some of it can be found on Rosie Jones’s publication page.

Some of it also appears as Topic Tracking and News Analysis on Steve Skiena’s site.

posted 1 year ago

Hello World!

I have tried blogging a lot; in the past I have written longer posts. But let’s see if I can keep up with writing small, short posts.

BTW I just graduated from USC, and I am excited to join a small start-up, Playlist. It sure is going to be fun and exciting, especially with the diversity of people they have. So stay tuned!

posted 1 year ago
