Google's Outsourcing to You

The release of Google’s App Engine has caught the attention of many prospective startup founders [1]. While the feature list and comparisons to similar services are being discussed, Google’s intent is being overlooked.

Soon, any developer with a Google account will be able to develop and deploy an application that has access to Google’s resources. That is, a developer can build a Google application without needing to be employed by Google. You build it, grow it, then Google buys it.
___
[1] See Hacker News.

Stop Drinking the Cheap Stuff

A recent post on news.YC about the decline of content quality got me thinking about what happened to Reddit and what’s happening to Hacker News. I watched unique, quirky content at Reddit get replaced by clichés and pictures of cats. I have also watched news.YC transform from startup news to hacker news to prog.reddit++.

Overpopulation is a common diagnosis, usually with the argument that all crowds are stupid and so, it follows, is the content they produce. That may be true, but it alone is not the cause of the decline in quality. The news.YC site began life as Startup News, and at first the front page changed slowly. That is, stories remained on the front page longer, probably because there were fewer submissions. But then, as more users joined, the volatility increased and the quality of the front page decreased. The front page was still full of startup-related posts, but much of the content was repetitious or weak.

Startup News was becoming lame [1], so PG rebirthed the site as Hacker News. Startup-related content was still welcome, but the focus was now on “news interesting to hackers generally” [2]. The stated reason for the change was “we ourselves were getting a bit bored reading stories about nothing but startups” [3]. I propose that it wasn’t that users were getting bored with startups, but that we were running out of content to share and things to say.

The pool of hacker-related material on the net is larger than that of startup-related material, but the rate at which quality content is created is still a limit. Users struggle to find new content to share, and our measure of quality becomes relative to whatever happens to be available.

There are fixes, but the side effects may be undesirable. Here’s one, off the top of my head: let’s slow down. Slow acceleration and low gravity of post rank would make for a less dynamic front page but allow more time for quality content to be generated, found, and submitted. This may be enough to transform Hacker News from an increasingly vapid whirlpool to a concentrated and directed information flow, a mutating hacker journal.
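
To make “low gravity” concrete, here is a minimal sketch. It assumes a score of the form (points - 1) / (age + 2)^gravity, which is the shape usually attributed to the news.YC ranking; the constants, the function names, and the sample posts are my own assumptions, and “acceleration” is not modeled at all.

```python
from typing import List, Tuple

# (title, points, age in hours); made-up sample data for illustration only.
Post = Tuple[str, int, float]


def rank_score(points: int, age_hours: float, gravity: float) -> float:
    """Score a post; a larger gravity makes older posts sink faster.

    Assumed formula, not confirmed as the site's actual ranking code.
    """
    return (points - 1) / (age_hours + 2) ** gravity


def front_page(posts: List[Post], gravity: float) -> List[str]:
    """Return post titles ordered from highest to lowest score."""
    ranked = sorted(
        posts,
        key=lambda post: rank_score(post[1], post[2], gravity),
        reverse=True,
    )
    return [title for title, _, _ in ranked]


posts = [
    ("day-old essay", 100, 24.0),  # well regarded but a day old
    ("fresh link", 10, 1.0),       # few points but very recent
]

# High gravity favors the fresh link; low gravity lets the essay linger.
fast_page = front_page(posts, gravity=1.8)
slow_page = front_page(posts, gravity=0.5)
```

With the higher gravity the fresh link leads the page; with the lower one the day-old essay stays on top, which is the slower, less dynamic front page I have in mind.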

___
[1] This was originally “Startup News was dying”, which didn’t say much and sounded like FUD. I’m not happy with “lame”, but it conveys the subjectivity of my statement. The point is that the front page was full of smarmy business development articles, reports that people didn’t die after quitting their jobs, and accounts of random startups throwing parties or being purchased by Google.
[2] http://ycombinator.com/hackernews.html
[3] See [2].

Databases, Dimensions, and Change

While working on a pet project, I was forced to make a decision about data storage. Being somewhat conservative and lazy, I decided to (mis)use a row-oriented RDBMS for tagged data objects. That is, there is a table in the database where each row represents an object, and I use relationships defined in other tables to annotate each object with any number of tags.

Yuck.
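
For the curious, the layout looks roughly like the sketch below. It uses Python’s sqlite3 module with an in-memory database; the table names, column names, and helper functions are stand-ins I made up for illustration, not the project’s real schema.

```python
import sqlite3
from typing import Iterable, List


def build_db() -> sqlite3.Connection:
    """Create the three tables: objects, tags, and the join table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE objects (id INTEGER PRIMARY KEY, body TEXT NOT NULL);
        CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
        CREATE TABLE object_tags (
            object_id INTEGER NOT NULL REFERENCES objects(id),
            tag_id INTEGER NOT NULL REFERENCES tags(id),
            PRIMARY KEY (object_id, tag_id)
        );
        """
    )
    return conn


def add_object(conn: sqlite3.Connection, body: str, tag_names: Iterable[str]) -> int:
    """Insert one object row and link it to each of its tags."""
    object_id = conn.execute(
        "INSERT INTO objects (body) VALUES (?)", (body,)
    ).lastrowid
    for name in set(tag_names):
        conn.execute("INSERT OR IGNORE INTO tags (name) VALUES (?)", (name,))
        (tag_id,) = conn.execute(
            "SELECT id FROM tags WHERE name = ?", (name,)
        ).fetchone()
        conn.execute(
            "INSERT INTO object_tags (object_id, tag_id) VALUES (?, ?)",
            (object_id, tag_id),
        )
    return object_id


def objects_with_all_tags(conn: sqlite3.Connection, tag_names: Iterable[str]) -> List[int]:
    """Return ids of objects that carry every one of the given tags."""
    names = sorted(set(tag_names))
    if not names:
        return []  # choice made for this sketch: no tags means no results
    placeholders = ",".join("?" for _ in names)
    rows = conn.execute(
        f"""
        SELECT o.id
        FROM objects o
        JOIN object_tags ot ON ot.object_id = o.id
        JOIN tags t ON t.id = ot.tag_id
        WHERE t.name IN ({placeholders})
        GROUP BY o.id
        HAVING COUNT(DISTINCT t.id) = ?
        ORDER BY o.id
        """,
        (*names, len(names)),
    ).fetchall()
    return [row[0] for row in rows]
```

Every query on a combination of tags turns into two joins plus a GROUP BY and HAVING, which is a large part of the yuck.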

Google’s press release and paper on BigTable were where I first noticed column-oriented tables. Ever since, I’ve kept tabs on alternative database systems, matching each with hypothetical problems.

What I want is a system for storing structured data, tagging the data, and performing queries based on combinations of tags. Tags don’t have to be fluffy things like “kittens” — tags can be used to signal a particular trait, such as “all posts in the current month”.

At this point, there appears to be only one open source, column-oriented database that is still being actively developed and could be used behind a web application [1]. It’s unclear whether it supports bitmap indices [2] or is capable of handling heavy web traffic, but I’ll fight off NIH syndrome long enough to find out.
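
To show why bitmap indices appeal to me for tag queries, here is a toy sketch. It keeps one bitmap per tag in a plain Python integer, so a query on a combination of tags becomes a bitwise AND or OR. The class and method names are invented for illustration; this is not how MonetDB or any other real system implements the idea.

```python
from typing import Dict, Iterable, List


class TagBitmapIndex:
    """Toy bitmap index: bit i of a tag's bitmap is set if object i has the tag."""

    def __init__(self) -> None:
        self._bitmaps: Dict[str, int] = {}
        self._count = 0  # number of objects added so far

    def add(self, tags: Iterable[str]) -> int:
        """Register a new object with the given tags and return its id."""
        object_id = self._count
        self._count += 1
        for tag in tags:
            self._bitmaps[tag] = self._bitmaps.get(tag, 0) | (1 << object_id)
        return object_id

    def query_all(self, tags: Iterable[str]) -> List[int]:
        """Ids of objects that carry every given tag (bitwise AND)."""
        result = (1 << self._count) - 1  # start with every object selected
        for tag in tags:
            result &= self._bitmaps.get(tag, 0)
        return self._ids(result)

    def query_any(self, tags: Iterable[str]) -> List[int]:
        """Ids of objects that carry at least one given tag (bitwise OR)."""
        result = 0
        for tag in tags:
            result |= self._bitmaps.get(tag, 0)
        return self._ids(result)

    @staticmethod
    def _ids(bitmap: int) -> List[int]:
        """Expand a bitmap into the sorted list of set bit positions."""
        ids = []
        position = 0
        while bitmap:
            if bitmap & 1:
                ids.append(position)
            bitmap >>= 1
            position += 1
        return ids


index = TagBitmapIndex()
first = index.add(["kittens", "current-month"])
second = index.add(["kittens"])
third = index.add(["current-month"])
both = index.query_all(["kittens", "current-month"])
either = index.query_any(["kittens", "current-month"])
```

Here `both` holds only the first object and `either` holds all three. A real index would compress the bitmaps and persist them, which this sketch ignores.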

___
[1] MonetDB, http://www.monetdb.nl/
[2] bitmap indices, http://en.wikipedia.org/wiki/Bitmap_index

Arc Burnout

Arc was recently released as an abstraction layer on MzScheme. Much has been said about the design philosophy and the lack of innovation.

The most significant blunder in the release of Arc is the absence of a packaging system. People are excited to be a part of the new community and share code that could wind up being in a standard library. There’s a potential chain reaction that will now decay without easy code sharing.

Package management is an ugly problem, but there are many preexisting systems from which to learn. If the masses had to wait several years for the language axioms to be defined, what’s another 6 months for nifty package management?

Of course, the rebuttal goes something like, “PG didn’t want to, write it yourself.” Understood, and I partially agree, but one of the Arc founders should have implemented package management before the public release. Not because it’s wrong to punt, but because all the excitement from the initial release will have worn off by the time a proper mechanism for sharing code is available.