Author Archives: Josh

Wow, how did I miss the Mechanical Turk?

Amazon Mechanical Turk is an astonishing idea – an Artificial AI marketplace. Basically, there’s an API you can call to get humans to do tasks (oddly enough, they want to be paid). Currently, a big favourite among the tasks is transcribing podcasts. I can see that it would be a cheap way to produce ground truth for a set of training data for AI systems, like number plate detection / recognition.

An artist has used the Mechanical Turk to acquire 10,000 hand-drawn left-facing sheep and put them on a site for your viewing pleasure. There was also an exhibition of the collectable stamp sheets (you can buy them for only $20 a sheet). Given the images cost less than a cent each to acquire, he may be a bullshit artist.

The Turk is an example of what Wired calls the “Rise of Crowdsourcing”: “Remember outsourcing? Sending jobs to India and China is so 2003. The new pool of cheap labor: everyday people using their spare cycles to create content, solve problems, even do corporate R & D.” It’s about the markets, people. These are markets for micro-transactions: micro in their repeatability, or micro in their value.

Sucky factorial calculators

Search Google for “factorial calculator” and it takes a long time to find one that gives 100! without an ‘e’ in it, i.e. as an exact integer rather than a rounded value in scientific notation. If you’re going to write a dinky little app like that, be aware of its limitations and tell people. I’m not going to link to any of them; they’re all naughty applications that shouldn’t be allowed out in the real world. But Dima Stopel’s large number factorial calculator isn’t afraid to give you all the digits.
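To show where the ‘e’ comes from, here is a minimal Python sketch. It is not the code behind any of the calculators mentioned, and the function names are made up for illustration. It assumes the sucky calculators multiply in double-precision floats, and compares that against arbitrary-precision integers.

```python
import math


def exact_factorial(n: int) -> str:
    """Return n! as a decimal string containing every digit.

    Python integers are arbitrary precision, so nothing is rounded.
    """
    return str(math.factorial(n))


def float_factorial(n: int) -> str:
    """Return n! computed with double-precision floats (assumed to be
    what a typical small calculator does), as Python prints it."""
    result = 1.0
    for k in range(2, n + 1):
        result *= k
    return str(result)


digits = exact_factorial(100)
approx = float_factorial(100)

print(len(digits), "digits:", digits)  # the exact integer, no exponent
print("float version:", approx)        # rounded, shown in scientific notation
```

The float version keeps only about 16 significant digits, so past roughly 22! the answer is an approximation however it is displayed. A calculator that works that way should at least say so.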

Database war stories

Databases have long been part and parcel of web development, but it seems that some of the big Web 2.0 sites have a few things to say about them. Some love them, others hate them, and all are dealing with really big databases.

Second Life (database has grown and grown and split), Bloglines and memeorandum.com (lovers of flat files), Flickr (almost a TB of data – and we’re ignoring images here), NASA, Craigslist (dealing with masses of data), O’Reilly (doing interesting data mining / transformation), Google (not much gets said here), Findory and Amazon (Findory try to keep it all in RAM), and finally MySQL responds, saying “Flat files suck”.

The Gwigle Game

Test your googling skills by playing The Gwigle Game. You need to know a fair bit about pop culture though. Actually, it’s a rather good training aid. Maybe Google will give the guy a big pile of cash.

I got so far as the paintings before I got bored. What comes after that?