Alkahest: my heroes have always died at the end

June 16, 2010

The Sounds of Science

Filed under: Personal,Technical — cec @ 10:07 pm

Last Friday, hsarik pointed out an interesting web site: Echo Nest.  They provide a web service that allows you to analyze and remix music.  The API can also provide information (metadata) about music, artists, songs, etc., and it has Python bindings.  If you’ve seen the “More Cowbell” website, where you can upload an mp3 and have more cowbell (and more Christopher Walken) added to it, that site uses Echo Nest; if you download the Python bindings for their API, you can see the script that adds the sounds.  Personally, I’m fond of “Ob-la-di, Ob-la-da” with 80% cowbell and 20% Christopher Walken.

I started playing with the API, and as a first cut I thought it would be neat to use the “get_similar” function: for each artist, you can get the top N similar artists.  Now where can I get a list of artists I like?  Well, I could type ’em in, but that sucks.  So I wrote a small program (sketched after the list) which:

  1. Opens the database on my iPod (or a directory of mp3 files)
  2. Finds each artist, either by reading the iPod db or by looking at the id3 tags in all of the files
  3. For each artist, adds a node to a graph, where the area of the node is proportional to the number of songs that artist has on the iPod (or in the music folder)
  4. For each artist, finds the top 50 similar artists
  5. For each similar artist already in my collection, adds a graph edge between the two nodes
  6. Plots the graph
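
If you’re curious, the core of it looks roughly like this.  This is a sketch rather than my actual script: the pyechonest calls (artist.search_artists and get_similar) are from memory of that era’s bindings, and song_counts is assumed to come from steps 1 and 2:

    import igraph
    from pyechonest import artist  # Echo Nest Python bindings

    def build_artist_graph(song_counts):
        """song_counts: dict of artist name -> number of songs (steps 1-2)."""
        names = sorted(song_counts)
        index = dict((n, i) for i, n in enumerate(names))
        g = igraph.Graph(len(names))
        g.vs["label"] = names
        # Node *area* proportional to song count, so diameter ~ sqrt(count).
        g.vs["size"] = [10 * song_counts[n] ** 0.5 for n in names]
        for name in names:
            try:
                results = artist.search_artists(name)      # look the artist up
                similar = results[0].get_similar(rows=50)  # top 50 similar
            except Exception:
                continue  # artist not found, or the API hiccuped
            for s in similar:
                if s.name in index and s.name != name:
                    g.add_edge(index[name], index[s.name])
        g.simplify()  # collapse duplicate edges from symmetric matches
        return g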

What can I say, I’ve been working on a fair amount of graph theory at work recently.  So after processing my iPod, I came up with the following graph of my current music (click to embiggen):

Okay, that’s pretty cool.  Almost completely illegible, but cool.  FWIW, the graph has 15 connected components; unfortunately, 13 of them are singletons (not connected to anything), and one is a pair (“Louis Armstrong” paired with “Louis Armstrong and Duke Ellington”).  Fortunately, the graphing tool I use (igraph) has built-in tools for community analysis (using the leading-eigenvector method), i.e., it can automatically find tightly coupled subgraphs.  A few examples from the 25 or so communities:
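
In python-igraph terms, the component and community analysis is just a couple of calls (assuming g is the artist graph from the sketch above):

    components = g.clusters()  # connected components: 15 in my case
    # Community detection via the leading-eigenvector method:
    communities = g.community_leading_eigenvector()
    for members in communities:
        print [g.vs[i]["label"] for i in members]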

which arguably correspond to “Indie,”  “Classic Rock,”  “Jam Bands,”  “Guitar Gods,” and “Alternative.”  If I processed my complete music database, I suspect we would wind up with several other communities, e.g., Blues.  But since Robert Johnson is the only blues I’ve got on there right now… he’s in a class by himself.

I suppose it goes w/o saying that my musical tastes aren’t everyone’s, and that if you don’t like them, you can keep it to yourself or go DIAF 🙂

So, what’s next?  I was talking with M from my office and we’ve come up with another interesting project for the Echo Nest API.  This one a) uses the audio analysis functions, and b) if we do it right might cause someone to send us a cease and desist.  So, win all the way around.

June 3, 2010

Photography workflow

Filed under: Photography — cec @ 4:47 pm

Four years ago, I made the switch to digital SLR photography.  The primary reason was the workflow.  When I shot slide film, I would have to get the film developed, look at each image, scan the ones I liked, correct the color balance and then manually remove the dust spots from the scanned images.

When I first got the digital camera, the workflow became: auto-correct the color balance using the Nikon’s color profile, then select the images I liked.  Great!

Unfortunately, over the years, my SLR has gotten dust on the sensor, because I was doing what Nikon said and not mucking with the sensor to try to clean it.  So, the first thing is that I should ignore Nikon and actually clean the sensor.  The second thing is that the dust has really screwed with my workflow.  Last year, after identifying the “good” images, I had to manually go through them and use the Heal tool in the GIMP to get rid of a few dust spots.  Well, dust is cumulative, and this year it was worse than ever.  In particular, the dust was more noticeable because I was shooting a lot of waterfalls… long exposures with a small aperture – dust city.  Take a look at the following:

To some extent or another, that’s on every single image I took while K and I were on vacation.

I could repeat my old workflow, but that would take days.  New idea: there is a tool in the GIMP called Smart Remove Selection.  It takes a selected bit of the image and replaces it with textures from the surrounding area; it’s comparable to Photoshop’s content-aware fill.  So, if I can select all of the visible dust, I can clean it all at once.  But that’s still slow.

Instead, I selected all of the dust from the image above, grew the selection by 10 pixels, converted it to a path, and then saved the path as an SVG file.  Since the dust is at the same location in each image, a single dust file works for all of my images.
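
From the GIMP Python-Fu console, that one-time export looks roughly like this (plug-in-sel2path and gimp-vectors-export-to-file are the relevant PDB calls; the output filename is just a placeholder):

    # Run from the Python-Fu console with the dust selected by hand.
    image = gimp.image_list()[0]        # the open reference image
    pdb.gimp_selection_grow(image, 10)  # grow the selection by 10 px
    pdb.plug_in_sel2path(image, image.active_drawable)  # selection -> path
    # Export the new path as SVG (filename is a placeholder):
    pdb.gimp_vectors_export_to_file(image, "/home/cec/dust.svg",
                                    image.vectors[0])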

Now all I have to do is open an image, import the path, convert the path to a selection, and apply the smart remove.  That’s a little better, but it still means that I have to touch each file manually.

Enter GIMP scripting.  Last night, I wrote a script that takes a file glob, converts it to a list of files, and for each file automatically removes the dust and color corrects the image.  It still takes about a minute per file, but it’s completely automated.  Unfortunately, the first version of the script only handled horizontal images.  But since I always turn the camera clockwise when I shoot vertically, I was able to modify it so that if the height of the image is greater than the width, it rotates the image appropriately, applies the dust removal, and then rotates the image back.
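
The script looks roughly like the following Python-Fu.  The Smart Remove procedure name is an assumption (it’s installed by the Resynthesizer plugin and worth checking in the Procedure Browser), and I’ve left the color-correction step as a stub:

    import glob
    from gimpfu import *

    DUST_SVG = "/home/cec/dust.svg"  # the saved dust path (placeholder)

    def remove_dust(pattern):
        for filename in glob.glob(pattern):
            image = pdb.gimp_file_load(filename, filename)
            # Vertical shots were taken with the camera turned clockwise:
            # rotate them back to the sensor orientation so the dust path
            # lines up, heal, then restore the original orientation.
            vertical = image.height > image.width
            if vertical:
                pdb.gimp_image_rotate(image, ROTATE_270)
            # Import the dust path and turn it into a selection.
            pdb.gimp_vectors_import_from_file(image, DUST_SVG, True, True)
            pdb.gimp_vectors_to_selection(image.vectors[0],
                                          CHANNEL_OP_REPLACE,
                                          True, False, 0, 0)
            drawable = pdb.gimp_image_get_active_drawable(image)
            # Resynthesizer's Smart Remove; the PDB name is an assumption.
            pdb.script_fu_smart_remove(image, drawable, 10)
            # (Color correction would go here.)
            pdb.gimp_selection_none(image)
            if vertical:
                pdb.gimp_image_rotate(image, ROTATE_90)
            pdb.gimp_image_flatten(image)
            drawable = pdb.gimp_image_get_active_drawable(image)
            pdb.gimp_file_save(image, drawable, filename, filename)
            pdb.gimp_image_delete(image)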

The results are pretty great:
