Archive for the ‘Programming’ Category

Screenshots to URLs

Sunday, June 28th, 2009

Whenever I take a screenshot, the next step is usually to paste it into an email or IM conversation. Here’s a modified version of a launchd script for OS X, originally from Avi, that automates that: whenever you take a screenshot, it uploads it to a server of your choice, places the URL on the clipboard, and beeps. Download and execute the script to install the service. It assumes you use key-based SSH authentication.

Where’d that code go?

Sunday, June 28th, 2009
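$ < ~/Binaries/git-find-removal
#!/bin/zsh
# Find the commit that removed <string>. <commit> must still contain it;
# the current HEAD must not.
if [ $# != 2 ] ; then
  echo "$0 <commit> <string>" >&2
  exit 1
fi
git bisect start
git bisect good "$1"
git bisect bad
# grep exits 0 ("good") while the string is present, 1 ("bad") once it is gone.
git bisect run grep -qRE -e "$2" .
$ git-find-removal HEAD~20 'def my_important_function'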
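
The script works because git bisect run treats exit status 0 as "good" and 1 as "bad", and grep -q exits 0 for as long as the string is still in the tree. A minimal Python sketch of the same binary search follows; it assumes a made-up in-memory history list in place of real commits, and the names first_bad and history are illustrative only, not part of git.

```python
def first_bad(commits, is_good):
    """Return the index of the first commit for which is_good() is False.

    Assumes commits[0] is good, commits[-1] is bad, and that once a commit
    is bad every later commit is bad too (the same assumption bisect makes).
    """
    lo, hi = 0, len(commits) - 1      # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            lo = mid                  # string still present: look later
        else:
            hi = mid                  # string gone: look earlier
    return hi


# Hypothetical snapshots of one file, oldest first.
history = [
    "def my_important_function(): ...",
    "def my_important_function(): ...  # tweaked",
    "def my_important_function(): ...  # tweaked again",
    "def replacement(): ...",
    "def replacement(): ...  # tweaked",
]

removed_at = first_bad(history, lambda src: "def my_important_function" in src)
print(removed_at)
```

Like bisect itself, this needs only about log2(n) checks rather than one per commit.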

App School

Friday, June 12th, 2009

App School just went live. It’s an iPhone development training course taught by my friend Daniel, and run by SQT, my Mum’s training company, and Mulley Communications. The courses will be based in Dublin—it should be a pretty cool chance for anyone Irish to quickly get up to speed with everything related to iPhone app development (including some tips on navigating the notorious app submission process…).

I’ve known Daniel since first year in secondary school; he convinced me to learn PHP, my first real (“real”?) programming language. He’s one of the best programmers I’ve ever worked with, and he just won an IBM Open Source Award for his work on Yakumo [PDF], an open-source library for using the iPhone as a game input device. Although he is only just graduating now, a couple of companies have already spotted his win-ness: he interned with the hardcore physics guys at Havok in San Francisco last summer.

App School is lucky to have him, and I think the courses are going to be killer.

See also: Damien, Web 2 Ireland, Irish Times.

Hacking for fun and profit with Mathematica and the Google Analytics API

Monday, April 27th, 2009

I’ve always been surprised by how much web analytics software sucks. Web sites produce reams of interesting statistical data, many sites have a strong interest in growing their traffic, and yet even the leaders in the field—Google Analytics and co.—do nothing to help: they make no attempt to analyze the data; they just display it in fairly pedestrian ways.

Every day, you log into your analytics dashboard, and you see basically the same picture that you saw the day before. It’s left up to you to find out what’s changed. Did your search referrals for some mid-ranking keyword rise hugely? Are you getting referrals from some site that never linked to you before?

Additionally, most packages don’t provide any mechanism for you to integrate your business-specific data with your web site data. Sure, you can sometimes hack this with things like tracking visits to special “checkout” pages, but fundamentally there’s no direct way of integrating off-site data (sales of an iPhone app, say).

(It looks like dshbrd may fix many of these complaints—see the end of this post for details.)

I wrote some Emacs-based software to do all of this last year (including charts in the REPL), and have been using it fairly happily. Recently, I’ve started using Mathematica more and more to process log data, and have been very happy with the results.

A few days ago, Google released an API for Google Analytics (largely invisible but enormously successful). I spent a while over the weekend writing a bridge to Mathematica, and playing with the data, particularly that related to my iPhone app (Encyclopedia, which stores a copy of Wikipedia on your phone for offline browsing).

We can start off by doing fairly standard things, like number of visits from each search term by date:

And we can, say, chart the result. (First, we remove the “(not set)” results that the API returns.)

What fraction of this is from search traffic related to the iPhone app? We can select the promising search terms, and compare them with overall search traffic (as far as I know, no boolean logic is possible in the Google Analytics web interface):

Looks like most of the Google traffic is from the app-related keywords. We can also pretty trivially look at the geographical distribution of these visitors. First, let’s load the data:

And group the result by country:

We can now define a simple WorldPlot function, and use it with our data:

(Brighter colours indicate more visits.) Okay, that’s somewhat interesting. But how many sales does the app get per visit? I track my sales in Dabble DB, so I’ll load the data from there. They provide a CSV file of country code to unit sales mappings:

We can pretty easily import this into Mathematica:

Using Mathematica’s CountryData[] function, which knows about country codes (as well as their shapes), we can easily integrate the keyword referrals and the sales data to generate a heat map of purchases per search referral:

The interesting outlier here is Mexico. Though they don’t visit the site much, they do buy the app a lot. (Hundreds of times to date.) I’m still not sure why. The two maps also show that, although Ireland sends quite a bit of traffic to the site, it converts quite poorly—perhaps because my .ie domain causes my search rank there to be artificially inflated.
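
For readers without Mathematica, the ratio behind a map like this can be sketched in a few lines of Python. This is not the code used for the post: the function name purchases_per_referral and all of the numbers below are invented for illustration, and the country codes are assumed to be the same two-letter codes in both datasets.

```python
def purchases_per_referral(visits_by_country, sales_by_country):
    """Map country code -> unit sales divided by search-referred visits.

    Countries with no recorded visits are skipped to avoid dividing by zero.
    """
    return {
        country: sales_by_country.get(country, 0) / visits
        for country, visits in visits_by_country.items()
        if visits > 0
    }


# Invented figures, purely to exercise the function.
visits = {"IE": 400, "MX": 20, "US": 1000}
sales = {"IE": 4, "MX": 30, "US": 100}

ratios = purchases_per_referral(visits, sales)
for country, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(country, round(ratio, 3))
```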

Okay, ignoring the app for a while, we might also be interested in questions like “what are my most important search terms?”. One way of measuring this is to look at the total time on your site driven by each search term. (Google Analytics tells you the average time on site for each term, but that’s not much help in telling you where you should direct your work.)

Let’s load the data:

And define a simple function that computes the total number of seconds on the site driven by each keyword:

We can compute the set of search terms that were used to get to the site:

And sort them by time on site:

Ignoring the first two (corresponding to “(not set)”, which we could of course filter out), it seems that “wiki”, “wikipedia” and “iphone” are the important terms.
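
The idea is simply total time = visits × average time on site, summed per keyword. A minimal Python sketch, assuming rows of (keyword, visits, average seconds on site); the function name and the sample rows are hypothetical and are not the post's data:

```python
from collections import defaultdict


def total_seconds_by_keyword(rows, ignore=("(not set)",)):
    """Return (keyword, total seconds) pairs, largest total first."""
    totals = defaultdict(float)
    for keyword, visits, avg_seconds in rows:
        if keyword in ignore:
            continue                      # drop the API's placeholder rows
        totals[keyword] += visits * avg_seconds
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Invented sample rows: (keyword, visits, average seconds on site).
rows = [
    ("(not set)", 900, 60.0),
    ("wikipedia", 120, 95.0),
    ("iphone", 300, 30.0),
    ("wiki", 50, 200.0),
]

ranking = total_seconds_by_keyword(rows)
for keyword, seconds in ranking:
    print(keyword, seconds)
```

Note how "iphone" has the most visits in this made-up sample but ranks last on total time, which is the distinction the average-time figure hides.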

We might also be interested to see how keyword usage changes over time. Let’s load two months of keyword data:

And define a simple function to count the fraction of referrals that are from a particular keyword in some given dataset:

We can test it with something like:

Okay, 4% of the site’s visitors came by searching for something containing the word “iPhone”.
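
A rough Python equivalent of such a function, assuming the dataset is a list of (search phrase, visits) pairs and that matching is a case-insensitive substring test (both are assumptions; the sample phrases and counts are invented so that the answer comes out at 4%):

```python
def keyword_fraction(rows, word):
    """Fraction of all search-referred visits whose phrase contains word."""
    total = sum(visits for _, visits in rows)
    if total == 0:
        return 0.0
    matching = sum(
        visits for phrase, visits in rows if word.lower() in phrase.lower()
    )
    return matching / total


# Invented sample: 100 visits in total, 4 of them mentioning "iphone".
rows = [
    ("offline wikipedia iphone", 3),
    ("iPhone default.png", 1),
    ("back to my mac", 6),
    ("emacs charts", 90),
]

fraction = keyword_fraction(rows, "iPhone")
print(fraction)
```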

Now let’s compare two months of data:

These are the search keywords that showed the biggest increase between October and January. And they correspond quite closely to what you’d expect—default.png files on the iPhone and Back To My Mac are both topics I wrote about in the intervening time period.
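
One plausible way to do this comparison, sketched in Python: convert each month's counts to shares of that month's search traffic, subtract, and sort. Whether the original notebook compared shares or raw counts is not stated, so treat this as an assumption; the function name biggest_risers and the figures are made up.

```python
def biggest_risers(before, after, top=3):
    """Return the keywords whose share of search visits grew the most."""

    def shares(counts):
        total = sum(counts.values()) or 1     # avoid dividing by zero
        return {k: v / total for k, v in counts.items()}

    b, a = shares(before), shares(after)
    delta = {k: a.get(k, 0.0) - b.get(k, 0.0) for k in set(a) | set(b)}
    return sorted(delta.items(), key=lambda kv: kv[1], reverse=True)[:top]


# Invented keyword -> visits tables for the two months.
october = {"wikipedia iphone": 50, "emacs": 50}
january = {
    "wikipedia iphone": 40,
    "emacs": 20,
    "default.png iphone": 25,
    "back to my mac": 15,
}

risers = biggest_risers(october, january, top=2)
for keyword, change in risers:
    print(keyword, round(change, 2))
```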

Keeping with the spirit of analysing changes, we can look at non-search referrals:

We can drop the direct referrals, group them by source, and select the first referral from each site:

And then we can do something like plot the first referral from each site against the amount of traffic it sent—a chart of buzz, basically:
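
The grouping step can be sketched in Python as follows. It assumes rows of (source, date, visits) and that direct traffic is reported with the source "(direct)"; the function name and the example.com and example.org sources are placeholders, not sites that actually linked to the blog.

```python
from datetime import date


def first_referrals(rows, direct="(direct)"):
    """Return (first referral date, source, total visits), earliest first."""
    first, traffic = {}, {}
    for source, day, visits in rows:
        if source == direct:
            continue                          # drop direct traffic
        traffic[source] = traffic.get(source, 0) + visits
        if source not in first or day < first[source]:
            first[source] = day
    return sorted((first[s], s, traffic[s]) for s in first)


# Invented rows: (source, date, visits).
rows = [
    ("(direct)", date(2009, 1, 3), 40),
    ("news.example.com", date(2009, 1, 10), 500),
    ("blog.example.org", date(2009, 1, 5), 12),
    ("news.example.com", date(2009, 1, 11), 300),
    ("blog.example.org", date(2009, 2, 1), 8),
]

buzz = first_referrals(rows)
for day, source, visits in buzz:
    print(day.isoformat(), source, visits)
```

Plotting the date against the visit count for each tuple gives the kind of chart described above.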

Next, we can look at how traffic flows around the site. First, we load (landing page, exit page) tuples:

And then hack together a function to generate a directed graph:

Yielding a pretty interesting result:
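
The data structure behind such a graph is small. A Python sketch, assuming one (landing page, exit page) tuple per visit and using the number of visits as the edge weight (the weighting, the function name and the sample paths are all assumptions):

```python
from collections import Counter


def flow_graph(pairs):
    """Build {landing page: {exit page: visit count}} from page tuples."""
    graph = {}
    for (landing, exit_page), count in Counter(pairs).items():
        graph.setdefault(landing, {})[exit_page] = count
    return graph


# Invented visits; a tuple with the same page twice is a single-page visit.
pairs = [
    ("/", "/wikipedia"),
    ("/", "/wikipedia"),
    ("/blog", "/"),
    ("/", "/"),
]

graph = flow_graph(pairs)
for landing, exits in graph.items():
    for exit_page, count in exits.items():
        print(f"{landing} -> {exit_page} ({count})")
```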

As a final example, we can, for no good reason, take advantage of the latitude and longitude metrics that the API provides and quickly create a video of visits by hour, showing traffic move across the globe. First, we load the data, and define two helper functions:

We can test it out on one of the hours:

Looks good. Let’s export an image for each hour, which we can then join with QuickTime:

The result is here. (You probably want to have it loop when you play it.)

Mathematica doesn’t get much attention from the programming community (largely because of Wolfram’s pricing, as far as I can tell). But its power is undeniable—I spent about 5 hours writing the Google Analytics interface and generating the above data. I wrote this blog post over lunch. If anyone else is interested in using the Mathematica/Google Analytics interface, let me know in the comments, and I’ll package it up and release it somewhere (it requires modified versions of a few libraries).

Lastly, over the past year, I spent a lot of time talking to my friend Avi about the state of web analytics. He and the guys at Dabble DB decided to do something about it, and it looks like dshbrd will launch soon. From what I’ve seen so far, it looks like it’ll be win.

tsocks: a nifty utility now working on OS X

Saturday, April 25th, 2009

tsocks is a cool Linux utility. Using LD_PRELOAD, it intercepts calls to the OS’s socket-related functions (connect() and co.), and transparently tunnels them through a SOCKS proxy. Example usage:

$ curl http://www.whatismyip.com/automation/n09230945.asp
89.141.232.202
$ tsocks curl http://www.whatismyip.com/automation/n09230945.asp
159.29.64.14

As it happens, curl supports SOCKS proxies, but tsocks allows you to add support to programs that know nothing about them (like, say, wget).
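
The interposition idea can be illustrated without C. The following Python sketch is only an analogy and is not how tsocks is implemented: it wraps socket.socket.connect so that every outgoing connection is sent to a stand-in "proxy" (a local listener), and it skips the SOCKS negotiation that a real tool performs after connecting. The names install_redirect and seen are hypothetical.

```python
import socket


def install_redirect(proxy_addr, log):
    """Wrap socket.socket.connect so every connection goes to proxy_addr.

    Returns a function that restores the original connect.
    """
    real_connect = socket.socket.connect

    def connect(self, address):
        log.append(address)                    # where the caller wanted to go
        return real_connect(self, proxy_addr)  # where the connection really goes

    socket.socket.connect = connect

    def restore():
        socket.socket.connect = real_connect

    return restore


# Stand-in for a SOCKS proxy: a local listener on an ephemeral port.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)

seen = []
restore = install_redirect(listener.getsockname(), seen)
try:
    client = socket.socket()
    client.connect(("example.invalid", 80))    # never resolved; redirected
    conn, _ = listener.accept()
    arrived = client.getsockname() == conn.getpeername()
    print(seen, arrived)
finally:
    restore()
    for s in (client, conn, listener):
        s.close()
```

The calling code is unchanged and unaware of the redirect, which is the property that lets tsocks add proxy support to programs that know nothing about SOCKS.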

Sadly, it’s no longer maintained.

Marc Abramowitz got it working on OS X (patch) back in 2006 by switching to DYLD_INSERT_LIBRARIES, among other things, but even this port has succumbed to bit-rot.

So I fixed it up, and the code now lives at github.

Using Mathematica to generate Web 2.0 company names

Friday, April 10th, 2009

Feel like calling your company something like Cashcoup, Feebany, Bunkapps, Morpone, Realance or Afative? Combining CrunchBase, Mathematica and stochastic matrices yields the Web 2.0 Company Name Generator:

[Mathematica notebook cells In[105], In[90], In[94], In[95], In[117], In[98], In[106], Out[106], In[121] and Out[121] were published here as images (mathematica-names_1.gif to mathematica-names_10.gif).]

My first attempt at automated name generation used a few Gutenberg books, which yielded appropriately Victorian-sounding names. CrunchBase seems to work better. If you want to experiment with the code, download the notebook.
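
For anyone without Mathematica, the general technique (a letter-to-letter stochastic matrix sampled as a Markov chain) can be sketched in Python. This is not a port of the notebook: the tiny hard-coded list stands in for the CrunchBase data, the "^" and "$" start and end markers are an implementation choice, and the function names are made up.

```python
import random
from collections import defaultdict

START, END = "^", "$"


def transition_matrix(names):
    """Return {letter: {next letter: probability}}; each row sums to 1."""
    counts = defaultdict(lambda: defaultdict(int))
    for name in names:
        chars = [START] + list(name.lower()) + [END]
        for a, b in zip(chars, chars[1:]):
            counts[a][b] += 1
    return {
        a: {b: n / sum(row.values()) for b, n in row.items()}
        for a, row in counts.items()
    }


def generate(matrix, rng, max_len=12):
    """Walk the chain from START until END is drawn or max_len is reached."""
    out, state = [], START
    while len(out) < max_len:
        row = matrix[state]
        state = rng.choices(list(row), weights=list(row.values()))[0]
        if state == END:
            break
        out.append(state)
    return "".join(out).capitalize()


# Stand-in training data; the post used company names from CrunchBase.
names = ["twitter", "flickr", "digg", "reddit", "tumblr", "zynga"]
matrix = transition_matrix(names)

rng = random.Random(2009)       # seeded so runs are repeatable
for _ in range(5):
    print(generate(matrix, rng))
```

A first-order chain like this one looks only one letter back; conditioning on the previous two or three letters generally gives more pronounceable names at the cost of needing more training data.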

Update: you can now use the name generator interactively.

Dynamic Default.png files on the iPhone

Saturday, November 8th, 2008

John Gruber writes:

I’ve seen third-party iPhone developers complaining that this trick is only available to Apple; they want to use it too. The technical reason why they can’t is that because application bundles are cryptographically signed, you can’t modify the contents of the application bundle (by, in this case, changing the default.png resource file) without breaking the digital signature. Apple could enable this feature for signed applications by providing for a way to specify a dynamic default.png that exists outside the application bundle, somewhere in the application’s private Library folder.

With a bit of hackery, it turns out that you can actually create dynamic Default.png files that don’t cause problems. Here’s a demo of it in action:





This is possible because OS X’s codesign binary (I’ve had far too many run-ins with it while writing the offline Wikipedia browser), used to sign and verify bundles, doesn’t traverse symlinks:

$ codesign -vv Rememberer.app
Rememberer.app: valid on disk
$ touch Rememberer.app/test
$ codesign -vv Rememberer.app
Rememberer.app: a sealed resource is missing or invalid
/Users/patrick/Projects/Rememberer/build/Debug-iphoneos/Rememberer.app/test: resource added
$ rm Rememberer.app/test
$ codesign -vv Rememberer.app
Rememberer.app: valid on disk
$ ls -l Rememberer.app/randomfile
lrwxr-xr-x 1 patrick staff 24 8 Nov 17:21 Rememberer.app/randomfile -> ../Documents/randomfile
$ dd if=/dev/random of=Documents/randomfile count=1
1+0 records in
1+0 records out
512 bytes transferred in 0.000095 secs (5382165 bytes/sec)
$ codesign -vv Rememberer.app
Rememberer.app: valid on disk

This is somewhat understandable; the symlink itself doesn’t change. But if “randomfile” is instead something like “Default.png”, the OS will happily load it from the default path in the application bundle—and follow the symlink—even though the file is actually stored in an area (Documents) that’s dynamically modifiable.
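
The effect can be modelled in a few lines of Python. This is an assumed toy model of the behaviour shown in the transcript, not a description of how codesign computes its seal: the hypothetical seal function hashes the link text for a symlink and the file contents otherwise, while an ordinary open() follows the link.

```python
import hashlib
import os
import tempfile


def seal(path):
    """Hash a bundle entry without following symlinks (toy model only)."""
    if os.path.islink(path):
        data = os.readlink(path).encode()      # the link text, not the target
    else:
        with open(path, "rb") as f:
            data = f.read()
    return hashlib.sha256(data).hexdigest()


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "Demo.app"))
    os.makedirs(os.path.join(root, "Documents"))
    target = os.path.join(root, "Documents", "Default.png")
    link = os.path.join(root, "Demo.app", "Default.png")

    with open(target, "wb") as f:
        f.write(b"first launch image")
    os.symlink("../Documents/Default.png", link)
    before = seal(link)

    with open(target, "wb") as f:              # rewrite the file outside the bundle
        f.write(b"second launch image")
    after = seal(link)

    with open(link, "rb") as f:                # a normal open follows the link
        loaded = f.read()

print(before == after, loaded)
```

The seal is unchanged even though a reader of the bundle path now gets different bytes.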

I’m guessing that Apple will consider this a bug, and fix it in some future version of the OS. If that happens, though, the downside will probably be nothing worse than losing your dynamic Default.png.

To get it to work in Xcode, you can just add a Run Script phase to the Target:

ln -sf ../Documents/Default.png "$TARGET_BUILD_DIR/$CONTENTS_FOLDER_PATH"

Here’s the Xcode project for the above demo. (Code is public domain.)

Update (Nov 19): TechCrunch pointed out some wider implications of this vulnerability. Although the article was met with some skepticism, they’re basically right. There’s a good summary of the situation on the McAfee Avert Labs Blog.

iPhone hackery: API Explorer

Wednesday, October 29th, 2008

I wrote the offline Wikipedia browser back before there was any official iPhone SDK documentation (or SDK, for that matter), and figuring out the APIs was a bit of a challenge. So in trying to get a handle on things, I wrote an API explorer for showing a rough outline of the system’s classes. It started out as a bare-bones script, and since then I’ve gradually bolted various bits on to it.

Unlike many compiled languages, Objective-C supports pretty powerful runtime introspection. The explorer uses this to present the implemented protocols, methods and instance variables of every loaded class. In addition, if the class responds to initWithFrame: (these are usually subclasses of UIView), you can draw and resize an instance, to get a basic feel for what it does.

It’s all more easily explained with a short screencast:





If you want to play around with it (it works in both the simulator and on the devices themselves), you can download the code.

Wikipedia iPhone redux

Sunday, October 19th, 2008

Back at the start of the year, I blogged about an app I wrote that allows you to store a complete copy of Wikipedia on an iPhone/iPod Touch.

The app got more attention than I expected, with tens of thousands of downloads in the first month, which I think made it one of the more popular apps for the jailbroken iPhone. (I hadn’t anticipated any of this, and the non-existent documentation and installer left many people confused, so someone made a YouTube installation tutorial that has over 57,000 views at time of writing. I’m not sure if that’s good or bad.)

I also released the app’s source code, and it’s been pretty fun to work with a lot of talented people in improving it. The OLPC crew took an interest in it, and thanks to some cool work from Chris Ball and Wade Brainerd, the iPhone application was ported to the XO laptop. Chris announced in June that:

We’re going to be shipping the result to Peru on tens of thousands of laptops in the near future, and it should go up to hundreds of thousands if the other South American countries with OLPC deployments decide to include it in their builds too.

When the iPhone 3G was announced, I didn’t originally intend to port the application to the new version of the OS. The original app was a short Christmas project, and now that I’m working at Live Current, I don’t have much spare time to hack. But after a few hundred emails enquiring about a new version, I eventually felt too guilty not to. So I spent a weekend porting it to iPhone OS 2.0, added a handful of new features, and I’m happy to say that the end result is now available in the App Store.

Common Lisp heresy: syntactic lambdas

Saturday, May 3rd, 2008
(defun bracket-reader (stream char)
  (declare (ignore char))
  (let* ((lst (read-delimited-list #\] stream t))
         (pos (position '|| lst)))
    (if pos
        `#'(lambda ,(mapcar #'intern (subseq lst 0 pos))
             ,@(nthcdr (1+ pos) lst))
        `#'(lambda (_) ,lst))))

(set-macro-character #\[ #'bracket-reader)
(set-macro-character #\] (get-macro-character #\)))

Closures are beautiful, but the heaviness of CL’s lambda syntax kept jumping out at me as fairly ugly after a few months of writing Smalltalk. The above snippet allows you to write things like:

(mapcar [+ _ 1] '(1 2 3))

and:

(maphash [k v || (print v)] tbl)

Which, to my eyes at least, is nicer than:

(mapcar #'(lambda (x) (+ x 1)) '(1 2 3))

and:

(maphash #'(lambda (k v) (print v)) tbl)

Of course, to add syntax to Lisp is to wade into failure-littered territory. But although no-one agrees how it should be done, I really don’t think it’s a bad idea.