Shallow Thoughts : tags : python

Akkana's Musings on Open Source Computing, Science, and Nature.

Sun, 08 Jan 2012

Parsing HTML in Python

I've been having (mis)adventures learning about Python's various options for parsing HTML.

Up until now, I've avoided doing any HTMl parsing in my RSS reader FeedMe. I use regular expressions to find the places where content starts and ends, and to screen out content like advertising, and to rewrite links. Using regexps on HTML is generally considered to be a no-no, but it didn't seem worth parsing the whole document just for those modest goals.

But I've long wanted to add support for downloading images, so you could view the downloaded pages with their embedded images if you so chose. That means not only identifying img tags and extracting their src attributes, but also rewriting the img tag afterward to point to the locally stored image. It was time to learn how to parse HTML.

Since I'm forever seeing people flamed on the #python IRC channel for using regexps on HTML, I figured real HTML parsing must be straightforward. A quick web search led me to Python's built-in HTMLParser class. It comes with a nice example for how to use it: define a class that inherits from HTMLParser, then define some functions it can call for things like handle_starttag and handle_endtag; then call self.feed(). Something like this:

from HTMLParser import HTMLParser

class MyFancyHTMLParser(HTMLParser):
  def fetch_url(self, url) :
    request = urllib2.Request(url)
    response = urllib2.urlopen(request)
    link = response.geturl()
    html = response.read()
    response.close()
    self.feed(html)   # feed() starts the HTMLParser parsing

  def handle_starttag(self, tag, attrs):
    if tag == 'img' :
      # attrs is a list of tuples, (attribute, value)
      srcindex = self.has_attr('src', attrs)
      if srcindex < 0 :
        return   # img with no src tag? skip it
      src = attrs[srcindex][1]
      # Make relative URLs absolute
      src = self.make_absolute(src)
      attrs[srcindex] = (attrs[srcindex][0], src)

    print '<' + tag
    for attr in attrs :
      print ' ' + attr[0]
      if len(attr) > 1 and type(attr[1]) == 'str' :
        # make sure attr[1] doesn't have any embedded double-quotes
        val = attr[1].replace('"', '\"')
        print '="' + val + '"')
    print '>'

  def handle_endtag(self, tag):
    self.outfile.write('</' + tag.encode(self.encoding) + '>\n')

Easy, right? Of course there are a lot more details, but the basics are simple.

I coded it up and it didn't take long to get it downloading images and changing img tags to point to them. Woohoo! Whee!

The bad news about HTMLParser

Except ... after using it a few days, I was hitting some weird errors. In particular, this one:
HTMLParser.HTMLParseError: bad end tag: ''

It comes from sites that have illegal content. For instance, stories on Slate.com include Javascript lines like this one inside <script></script> tags:
document.write("<script type='text/javascript' src='whatever'></scr" + "ipt>");

This is technically illegal html -- but lots of sites do it, so protesting that it's technically illegal doesn't help if you're trying to read a real-world site.

Some discussions said setting self.CDATA_CONTENT_ELEMENTS = () would help, but it didn't.

HTMLParser's code is in Python, not C. So I took a look at where the errors are generated, thinking maybe I could override them. It was easy enough to redefine parse_endtag() to make it not throw an error (I had to duplicate some internal strings too). But then I hit another error, so I redefined unknown_decl() and _scan_name(). And then I hit another error. I'm sure you see where this was going. Pretty soon I had over 100 lines of duplicated code, and I was still getting errors and needed to redefine even more functions. This clearly wasn't the way to go.

Using lxml.html

I'd been trying to avoid adding dependencies to additional Python packages, but if you want to parse real-world HTML, you have to. There are two main options: Beautiful Soup and lxml.html. Beautiful Soup is popular for large projects, but the consensus seems to be that lxml.html is more error-tolerant and lighter weight.

Indeed, lxml.html is much more forgiving. You can't handle start and end tags as they pass through, like you can with HTMLParser. Instead you parse the HTML into an in-memory tree, like this:

  tree = lxml.html.fromstring(html)

How do you iterate over the tree? lxml.html is a good parser, but it has rather poor documentation, so it took some struggling to figure out what was inside the tree and how to iterate over it.

You can visit every element in the tree with

for e in tree.iter() :
  print e.tag

But that's not terribly useful if you need to know which tags are inside which other tags. Instead, define a function that iterates over the top level elements and calls itself recursively on each child.

The top of the tree itself is an element -- typically the <html></html> -- and each element has .tag and .attrib. If it contains text inside it (like a <p> tag), it also has .text. So to make something that works similarly to HTMLParser:

def crawl_tree(tree) :
  handle_starttag(tree.tag, tree.attrib)
  if tree.text :
    handle_data(tree.text)
  for node in tree :
    crawl_tree(node)
  handle_endtag(tree.tag)

But wait -- we're not quite all there. You need to handle two undocumented cases.

First, comment tags are special: their tag attribute, instead of being a string, is <built-in function Comment> so you have to handle that specially and not assume that tag is text that you can print or test against.

Second, what about cases like <p>Here is some <i>italicised</i> text.</p> ? in this case, you have the p tag, and its text is "Here is some ". Then the p has a child, the i tag, with text of "italicised". But what about the rest of the string, " text."?

That's called a tail -- and it's the tail of the adjacent i tag it follows, not the parent p tag that contains it. Confusing!

So our function becomes:

def crawl_tree(tree) :
  if type(tree.tag) is str :
    handle_starttag(tree.tag, tree.attrib)
    if tree.text :
      handle_data(tree.text)
    for node in tree :
      crawl_tree(node)
    handle_endtag(tree.tag)
  if tree.tail :
    handle_data(tree.tail)

See how it works? If it's a comment (tree.tag isn't a string), we'll skip everything -- except the tail. Even a comment might have a tail:
<p>Here is some <!-- this is a comment --> text we want to show.</p>
so even if we're skipping comment we need its tail.

I'm sure I'll find other gotchas I've missed, so I'm not releasing this version of feedme until it's had a lot more testing. But it looks like lxml.html is a reliable way to parse real-world pages. It even has a lot of convenience functions like link rewriting that you can use without iterating the tree at all. Definitely worth a look!

Tags: , ,
[ 14:04 Jan 08, 2012    More programming | permalink to this entry | comments ]

Thu, 29 Dec 2011

Plotting the Analemma

My SJAA planet-observing column for January is about the Analemma and the Equation of Time.

The analemma is that funny figure-eight you see on world globes in the middle of the Pacific Ocean. Its shape is the shape traced out by the sun in the sky, if you mark its position at precisely the same time of day over the course of an entire year.

The analemma has two components: the vertical component represents the sun's declination, how far north or south it is in our sky. The horizontal component represents the equation of time.

The equation of time describes how the sun moves relatively faster or slower at different times of year. It, too, has two components: it's the sum of two sine waves, one representing how the earth speeds up and slows down as it moves in its elliptical orbit, the other a function the tilt (or "obliquity") of the earth's axis compared to its orbital plane, the ecliptic.

[components of the Equation of time] The Wikipedia page for Equation of time includes a link to a lovely piece of R code by Thomas Steiner showing how the two components relate. It's labeled in German, but since the source is included, I was able to add English labels and use it for my article.

But if you look at photos of real analemmas in the sky, they're always tilted. Shouldn't they be vertical? Why are they tilted, and how does the tilt vary with location? To find out, I wanted a program to calculate the analemma.

Calculating analemmas in PyEphem

The very useful astronomy Python package PyEphem makes it easy to calculate the position of any astronomical object for a specific location. Install it with: easy_install pyephem for Python 2, or easy_install ephem for Python 3.

import ephem
observer = ephem.city('San Francisco')
sun = ephem.Sun()
sun.compute(observer)
print sun.alt, sun.az

The alt and az are the altitude and azimuth of the sun right now. They're printed as strings: 25:23:16.6 203:49:35.6 but they're actually type 'ephem.Angle', so float(sun.alt) will give you a number in radians that you can use for calculations.

Of course, you can specify any location, not just major cities. PyEphem doesn't know San Jose, so here's the approximate location of Houge Park where the San Jose Astronomical Association meets:

observer = ephem.Observer()
observer.name = "San Jose"
observer.lon = '-121:56.8'
observer.lat = '37:15.55'

You can also specify elevation, barometric pressure and other parameters.

So here's a simple analemma, calculating the sun's position at noon on the 15th of each month of 2011:

    for m in range(1, 13) :
        observer.date('2011/%d/15 12:00' % (m))
        sun.compute(observer)

I used a simple PyGTK window to plot sun.az and sun.alt, so once it was initialized, I drew the points like this:

    # Y scale is 45 degrees (PI/2), horizon to halfway to zenith:
    y = int(self.height - float(self.sun.alt) * self.height / math.pi)
    # So make X scale 45 degrees too, centered around due south.
    # Want az = PI to come out at x = width/2.
    x = int(float(self.sun.az) * self.width / math.pi / 2)
    # print self.sun.az, float(self.sun.az), float(self.sun.alt), x, y
    self.drawing_area.window.draw_arc(self.xgc, True, x, y, 4, 4, 0, 23040)

So now you just need to calculate the sun's position at the same time of day but different dates spread throughout the year.

[analemma in San Jose at noon clock time] And my 12-noon analemma came out almost vertical! Maybe the tilt I saw in analemma photos was just a function of taking the photo early in the morning or late in the afternoon? To find out, I calculated the analemma for 7:30am and 4:30pm, and sure enough, those were tilted.

But wait -- notice my noon analemma was almost vertical -- but it wasn't exactly vertical. Why was it skewed at all?

Time is always a problem

As always with astronomy programs, time zones turned out to be the hardest part of the project. I tried to add other locations to my program and immediately ran into a problem.

The ephem.Date class always uses UTC, and has no concept of converting to the observer's timezone. You can convert to the timezone of the person running the program with localtime, but that's not useful when you're trying to plot an analemma at local noon.

At first, I was only calculating analemmas for my own location. So I set time to '20:00', that being the UTC for my local noon. And I got the image at right. It's an analemma, all right, and it's almost vertical. Almost ... but not quite. What was up?

Well, I was calculating for 12 noon clock time -- but clock time isn't the same as mean solar time unless you're right in the middle of your time zone.

You can calculate what your real localtime is (regardless of what politicians say your time zone should be) by using your longitude rather than your official time zone:

    date = '2011/%d/12 12:00' % (m)
    adjtime = ephem.date(ephem.date(date) \
                    - float(self.observer.lon) * 12 / math.pi * ephem.hour)
    observer.date = adjtime

Maybe that needs a little explaining. I take the initial time string, like '2011/12/15 12:00', and convert it to an ephem.date. The number of hours I want to adjust is my longitude (in radians) times 12 divided by pi -- that's because if you go pi (180) degrees to the other side of the earth, you'll be 12 hours off. Finally, I have to multiply that by ephem.hour because ... um, because that's the way to add hours in PyEphem and they don't really document the internals of ephem.Date.

[analemma in San Jose at noon clock time] Set the observer date to this adjusted time before calculating your analemma, and you get the much more vertical figure you see here. This also explains why the morning and evening analemmas weren't symmetrical in the previous run.

This code is location independent, so now I can run my analemma program on a city name, or specify longitude and latitude.

PyEphem turned out to be a great tool for exploring analemmas. But to really understand analemma shapes, I had more exploring to do. I'll write about that, and post my complete analemma program, in the next article.

Tags: , , , , ,
[ 19:54 Dec 29, 2011    More science/astro | permalink to this entry | comments ]

Thu, 22 Dec 2011

Calculating the Solstice and shortest day

Today is the winter solstice -- the official beginning of winter.

The solstice is determined by the Earth's tilt on its axis, not anything to do with the shape of its orbit: the solstice is the point when the poles come closest to pointing toward or away from the sun. To us, standing on Earth, that means the winter solstice is the day when the sun's highest point in the sky is lowest.

You can calculate the exact time of the equinox using the handy Python package PyEphem. Install it with: easy_install pyephem for Python 2, or easy_install ephem for Python 3. Then ask it for the date of the next or previous equinox. You have to give it a starting date, so I'll pick a date in late summer that's nowhere near the solstice:

>>> ephem.next_solstice('2011/8/1')
2011/12/22 05:29:52
That agrees with my RASC Observer's Handbook: Dec 22, 5:30 UTC. (Whew!)

PyEphem gives all times in UTC, so, since I'm in California, I subtract 8 hours to find out that the solstice was actually last night at 9:30. If I'm lazy, I can get PyEphem to do the subtraction for me:

ephem.date(ephem.next_solstice('2011/8/1') - 8./24)
2011/12/21 21:29:52
I used 8./24 because PyEphem's dates are in decimal days, so in order to subtract 8 hours I have to convert that into a fraction of a 24-hour day. The decimal point after the 8 is to get Python to do the division in floating point, otherwise it'll do an integer division and subtract int(8/24) = 0.

The shortest day

The winter solstice also pretty much marks the shortest day of the year. But was the shortest day yesterday, or today? To check that, set up an "observer" at a specific place on Earth, since sunrise and sunset times vary depending on where you are. PyEphem doesn't know about San Jose, so I'll use San Francisco:

>>> import ephem
>>> observer = ephem.city("San Francisco")
>>> sun = ephem.Sun()
>>> for i in range(20,25) :
...   d = '2011/12/%i 20:00' % i
...   print d, (observer.next_setting(sun, d) - observer.previous_rising(sun, d)) * 24
2011/12/20 20:00 9.56007901422
2011/12/21 20:00 9.55920379754
2011/12/22 20:00 9.55932991847
2011/12/23 20:00 9.56045709446
2011/12/24 20:00 9.56258416496
I'm multiplying by 24 to get hours rather than decimal days.

So the shortest day, at least here in the bay area, was actually yesterday, 2011/12/21. Not too surprising, since the solstice wasn't that long after sunset yesterday.

If you look at the actual sunrise and sunset times, you'll find that the latest sunrise and earliest sunset don't correspond to the solstice or the shortest day. But that's all tied up with the equation of time and the analemma ... and I'll cover that in a separate article.

Tags: , , , ,
[ 10:28 Dec 22, 2011    More science/astro | permalink to this entry | comments ]

Wed, 16 Nov 2011

New trails, and new PyTopo 1.1 release

A new trail opened up above Alum Rock park! Actually a whole new open space preserve, called Sierra Vista -- with an extensive set of trails that go all sorts of interesting places.

Dave and I visit Alum Rock frequently -- we were married there -- so having so much new trail mileage is exciting. We tried to explore it on foot, but quickly realized the mileage was more suited to mountain bikes. Even with bikes, we'll be exploring this area for a while (mostly due to not having biked in far too long, so it'll take us a while to work up to that much riding ... a combination of health problems and family issues have conspired to keep us off the bikes).

Of course, part of the fun of discovering a new trail system is poring over maps trying to figure out where the trails will take us, then taking GPS track logs to study later to see where we actually went.

And as usual when uploading GPS track logs and viewing them in pytopo, I found some things that weren't working quite the way I wanted, so the session ended up being less about studying maps and more about hacking Python.

In the end, I fixed quite a few little bugs, improved some features, and got saved sites with saved zoom levels working far better.

Now, PyTopo 1.0 happened quite a while ago -- but there were two of us hacking madly on it at the time, and pinning down the exact time when it should be called 1.0 wasn't easy. In fact, we never actually did it. I know that sounds silly -- of all releases to not get around to, finally reaching 1.0? Nevertheless, that's what happened.

I thought about cheating and calling this one 1.0, but we've had 1.0 beta RPMs floating around for so long (and for a much earlier release) that that didn't seem right.

So I've called the new release PyTopo 1.1. It seems to be working pretty solidly. It's certainly been very helpful to me in exploring the new trails. It's great for cross-checking with Google Earth: the OpenCycleMap database has much better trail data than Google does, and pytopo has easy track log loading and will work offline, while Google has the 3-D projection aerial imagery that shows where trails and roads were historically (which may or may not correspond to where they decide to put the new trails). It's great to have both.

Anyway, here's the new PyTopo.

Tags: , ,
[ 19:59 Nov 16, 2011    More mapping | permalink to this entry | comments ]

Sun, 16 Oct 2011

Monitor an Arduino's serial output from Python

Debugging Arduino sensors can sometimes be tricky. While working on my Arduino sonar project, I found myself wanting to know what values the Arduino was reading from its analog port.

It's easy enough to print from the Arduino to its USB-serial line. First add some code like this in setup():

    Serial.begin(9600);
Then in loop(), if you just read the value "val":
    Serial.println(val);

Serial output from Python

That's all straightforward -- but then you need something that reads it on the PC side.

When you're using the Arduino Java development environment, you can set it up to display serial output in a few lines at the bottom of the window. But it's not terrifically easy to read there, and I don't want to be tied to the Java IDE -- I'm much happier doing my Arduino development from the command line. But then how do you read serial output when you're debugging? In general, you can use the screen program to talk to serial ports -- it's the tool of choice to log in to plug computers. For the Arduino, you can do something like this: screen /dev/ttyUSB0 9600

But I found that a bit fiddly for various reasons. And I discovered that it's easy to write something like this in Python, using the serial module.

You can start with something as simple as this:

import serial

ser = serial.Serial("/dev/ttyUSB0", 9600)
while True:
    print ser.readline()

Serial input as well as output

That worked great for debugging purposes. But I had another project (which I will write up separately) where I needed to be able to send commands to the Arduino as well as reading output it printed. How do you do both at once?

With the select module, you can monitor several file descriptors at once. If the user has typed something, send it over the serial line to the Arduino; if the Arduino has printed something, read it and display it for the user to see.

That loop looks like this:

while True :
    # Check whether the user has typed anything (timeout of .2 sec):
    inp, outp, err = select.select([sys.stdin, self.ser], [], [], .2)

    # If the user has typed anything, send it to the Arduino:
    if sys.stdin in inp :
        line = sys.stdin.readline()
        self.ser.write(line)

    # If the Arduino has printed anything, display it:
    if self.ser in inp :
line = self.ser.readline().strip()
print "Arduino:", line

Add in a loop to find the right serial port (the Arduino doesn't always show up on /dev/ttyUSB0) and a little error and exception handling, and I had a useful script that met all my Arduino communication needs: ardmonitor.

Tags: , , ,
[ 19:27 Oct 16, 2011    More hardware | permalink to this entry | comments ]

Tue, 27 Sep 2011

Banishing errant tooltips

Every now and then I have to run a program that doesn't manage its tooltips well. I mouse over some button to find out what it does, a tooltip pops up -- but then the tooltip won't go away. Even if I change desktops, the tooltip follows me and stays up on all desktops. Worse, it's set to stay on top of all other windows, so it blocks anything underneath it.

The places where I see this happen most often are XEphem (probably as an artifact of the broken Motif libraries we're stuck with on Linux); Adobe's acroread (Acrobat Reader), though perhaps that's gotten better since I last used it; and Wine.

I don't use Wine much, but lately I've had to use it for a medical imaging program that doesn't seem to have a Linux equivalent (viewing PETscan data). Every button has a tooltip, and once a tooltip pops up, it never goes aawy. Eventually I might have five of six of these little floating windows getting in the way of whatever I'm doing on other desktops, until I quit the wine program.

So how does one get rid of errant tooltips littering your screen? Could I write an Xlib program that could nuke them?

Finding window type

First we need to know what's special about tooltip windows, so the program can identify them. First I ran my wine program and produced some sticky tooltips.

Once they were up, I ran xwininfo and clicked on a tooltip. It gave me a bunch of information about the windows size and location, color depth, etc. ... but the useful part is this:

  Override Redirect State: yes

In X, override-redirect windows are windows that are immune to being controlled by the window manager. That's why they don't go away when you change desktops, or move when you move the parent window.

So what if I just find all override-redirect windows and unmap (hide) them? Or would that kill too many innocent victims?

Python-Xlib

I thought I'd have to write my little app in C, since it's doing low-level Xlib calls. But no -- there's a nice set of Python bindings, python-xlib. The documentation isn't great, but it was still pretty easy to whip something up.

The first thing I needed was a window list: I wanted to make sure I could find all the override-redirect windows. Here's how to do that:

from Xlib import display

dpy = display.Display()
screen = dpy.screen()
root = screen.root
tree = root.query_tree()

for w in tree.children :
    print w

w is a Window (documented here). I see in the documentation that I can get_attributes(). I'd also like to know which window is which -- calling get_wm_name() seems like a reasonable way to do that. Maybe if I print them, those will tell me how to find the override-redirect windows:

for w in tree.children :
    print w.get_wm_name(), w.get_attributes()

Window type, redux

Examining the list, I could see that override_redirect was one of the attributes. But there were quite a lot of override-redirect windows. It turns out many apps, such as Firefox, use them for things like menus. Most of the time they're not visible. But you can look at w.get_attributes().map_state to see that.

So that greatly reduced the number of windows I needed to examine:

for w in tree.children :
    att = w.get_attributes()
    if att.map_state and att.override_redirect :
        print w.get_wm_name(), att

I learned that tooltips from well-behaved programs like Firefox tended to set wm_name to the contents of the tooltip. Wine doesn't -- the wine tooltips had an empty string for wm_name. If I wanted to kill just the wine tooltips, that might be useful to know.

But I also noticed something more important: the tooltip windows were also "transient for" their parent windows. Transient for means a temporary window popped up on behalf of a parent window; it's kept on top of its parent window, and goes away when the parent does.

Now I had a reasonable set of attributes for the windows I wanted to unmap. I tried it:

for w in tree.children :
    att = w.get_attributes()
    if att.map_state and att.override_redirect and w.get_wm_transient_for():
        w.unmap()

It worked! At least in my first test: I ran the wine program, made a tooltip pop up, then ran my killtips program ... and the tooltip disappeared.

Multiple tooltips: flushing the display

But then I tried it with several tooltips showing (yes, wine will pop up new tooltips without hiding the old ones first) and the result wasn't so good. My program only hid the first tooltip. If I ran it again, it would hide the second, and again for the third. How odd!

I wondered if there might be a timing problem. Adding a time.sleep(1) after each w.unmap() fixed it, but sleeping surely wasn't the right solution.

But X is asynchronous: things don't necessarily happen right away. To force (well, at least encourage) X to deal with any queued events it might have stacked up, you can call dpy.flush().

I tried adding that after each w.unmap(), and it worked. But it turned out I only need one

dpy.flush()
at the end of the program, just exiting. Apparently if I don't do that, only the first unmap ever gets executed by the X server, and the rest are discarded. Sounds like flush() is a good idea as the last line of any python-xlib program.

killtips will hide tooltips from well-behaved programs too. If you have any tooltips showing in Firefox or any GTK programs, or any menus visible, killtips will unmap them. If I wanted to make sure the program only attacked the ones generated by wine, I could add an extra test on whether w.get_wm_name() == "".

But in practice, it doesn't seem to be a problem. Well-behaved programs handle having their tooltips unmapped just fine: the next time you call up a menu or a tooltip, the program will re-map it.

Not so in wine: once you dismiss one of those wine tooltips, it's gone forever, at least until you quit and restart the program. But that doesn't bother me much: once I've seen the tooltip for a button and found out what that button does, I'm probably not going to need to see it again for a while.

So I'm happy with killtips, and I think it will solve the problem. Here's the full script: killtips.

Tags: , , , ,
[ 10:36 Sep 27, 2011    More programming | permalink to this entry | comments ]

Fri, 09 Sep 2011

Count characters or words in the X selection from Python

This post is, above all, a lesson in doing a web search first. Even when what you're looking for is so obscure you're sure no one else has wanted it. But the script I got out of it might turn out to be useful.

It started with using Bitlbee for Twitter. I love bitlbee -- it turns a Twitter stream into just another IRC channel tab in the xchat I'm normally running anyway.

The only thing I didn't love about bitlbee is that, unlike the twitter app I'd previously used, I didn't have any way to keep track of when I neared the 140-character limit. There were various ways around that, mostly involving pasting the text into other apps before submitting it. But they were all too many steps.

It occurred to me that one way around this was to select-all, then run something that would show me the number of characters in the X selection. That sounded like an easy app to write.

Getting the X selection from Python

I was somewhat surprised to find that Python has no way of querying the X selection. It can do just about everything else -- even simulate X events. But there are several command-line applications that can print the selection, so it's easy enough to run xsel or xclip from Python and read the output.

I ended up writing a little app that brings up a dialog showing the current count, then hangs around until you dismiss it, querying the selection once a second and updating the count. It's called countsel.

Of course, if you don't want to write a Python script you can use commandline tools directly. Here are a couple of examples, using xclip instead of xsel: xterm -title 'lines words chars' -geometry 25x2 -e bash -c 'xclip -o | wc; read -n 1' pops up a terminal showing the "wc" counts of the selection once, and xterm -title 'lines words chars' -geometry 25x1 -e watch -t 'xclip -o | wc' loops over those counts printing them once a second.

Binding commands to a key is different for every window manager. In Openbox, I added this to rc.xml to call up my program whenever I type W-t (short for Twitter):

    <keybind key="W-t">
      <action name="Execute">
        <execute>/home/akkana/bin/countsel</execute>
      </action>
    </keybind>

Now, any time I needed to check my character count, I could triple-click or type Shift-Home, then hit W-t to call up the dialog and get a count. Then I could leave the dialog up, and whenever I wanted a new count, just Shift-Home or triple-click again, and the dialog updates automatically. Not perfect, but not bad.

Xchat plug-in for a much more elegant solution

Only after getting countsel working did it occur to me to wonder if anyone else had the same Bitlbee+xchat+twitter problem. And a web search found exactly what I needed: xchat-inputcount.pl, a wonderful xchat script that adds a character-counter next to the input box as you're typing. It's a teensy bit buggy, but still, it's far better than my solution. I had no idea you could add user-interface elements to xchat like that!

But that's okay. Countsel didn't take long to write. And I've added word counting to countsel, so I can use it for word counts on anything I'm writing.

Tags: , ,
[ 11:32 Sep 09, 2011    More programming | permalink to this entry | comments ]

Wed, 31 Aug 2011

Read Excel XLS spreadsheets with Python

Someone mailed out information to a club I'm in as an .XLS file. Another Excel spreadsheet. Sigh.

I do know one way to read them. Fire up OpenOffice, listen to my CPU fan spin as I wait forever for the app to start up, open the xls file, then click in one cell after another as I deal with the fact that spreadsheet programs only show you a tiny part of the text in each cell. I'm not against spreadsheets per se -- they're great for calculating tables of interconnected numbers -- but they're a terrible way to read tabular data.

Over the years, lots of open-source programs like word2x and catdoc have sprung up to read the text in MS Word .doc files. Surely by now there must be something like that for XLS files?

Well, I didn't find any ready-made programs, but I found something better: Python's xlrd module, as well as a nice clear example at ScienceOSS of how to Read Excel files from Python.

Following that example, in six lines I had a simple program to print the spreadsheet's contents:

import xlrd

for filename in sys.argv[1:] :
    wb = xlrd.open_workbook(filename)
    for sheetname in wb.sheet_names() :
        sh = wb.sheet_by_name(sheetname)
        for rownum in range(sh.nrows) :
            print sh.row_values(rownum)

Of course, having gotten that far, I wanted better formatting so I could compare the values in the spreadsheet. Didn't take long to write, and the whole thing still came out under 40 lines: xlsrd. And I was able to read that XLS file that was mailed to the club, easily and without hassle.

I'm forever amazed at all the wonderful, easy-to-use modules there are for Python.

Tags: ,
[ 09:58 Aug 31, 2011    More programming | permalink to this entry | comments ]

Thu, 25 Aug 2011

Deleting email from a mail server with Python

How do you delete email from a mail server without downloading or reading it all?

Why? Maybe you got a huge load of spam and you need to delete it. Maybe you have your laptop set up to keep a copy of your mail on the server so you can get it on your desktop later ... but after a while you realize it's not worth downloading all that mail again. In my case, I use an ISP that keeps copies of all mail forwarded from one alias to another, so I periodically need to clean out the copies.

There are quite a few reasons you might want to delete mail without reading it ... so I was surprised to find that there didn't seem to be any easy way to do so.

But POP3 is a fairly simple protocol. How hard could it be to write a Python script to do what I needed?

Not hard at all, in fact. The poplib package does most of the work for you, encapsulating both the networking and the POP3 protocol. It even does SSL, so you don't have to send your password in the clear.

Once you've authenticated, you can list() messages, which gives you a status and a list of message numbers and sizes, separated by a space. Just loop through them and delete each one.

Here's a skeleton program to delete messages:

server = "mail.example.com"
port = 995
user = "myname"
passwd = "seekrit"

pop = poplib.POP3_SSL(server, port)
pop.user(user)
pop.pass_(passwd)

poplist = pop.list()
if poplist[0].startswith('+OK') :
    msglist = poplist[1]
    for msgspec in msglist :
        # msgspec is something like "3 3941", 
        # msg number and size in octets
        msgnum = int(msgspec.split(' ')[0])
        print "Deleting msg %d\r" % msgnum,
        pop.dele(msgnum)
    else :
        print "No messages for", user
else :
    print "Couldn't list messages: status", poplist[0]
pop.quit()

Of course, you might want to add more error checking, loop through a list of users, etc. Here's the full script: deletemail.

Tags: , ,
[ 16:41 Aug 25, 2011    More programming | permalink to this entry | comments ]

Fri, 19 Aug 2011

Beginning Python: Sorting lists of objects

The Beginning Python class has pretty much died down -- although there are still a couple of interested students posting really great homework solutions, I think most people have fallen behind, and it's time to wrap up the course.

So today, I didn't post a formal lesson. But I did have something to share about how I used Python's object-oriented capabilities to solve a problem I had copying new podcast files onto my MP3 player. I used Python's built-in list sort() function, along with the easy way it lets me define operators like < and > for any object I define.

You can read all about it in my post to the Courses list describing how I sorted my list of podcast objects. Or just go straight to the final program, pods.

Tags: , ,
[ 18:48 Aug 19, 2011    More education | permalink to this entry | comments ]

Fri, 12 Aug 2011

Beginning Python, Lesson 9: More extras

Lesson 9 in my online Python course is up: Lesson 9: Extras (requested topics), including string operations, web development and GUI toolkits.

The web development and GUI toolkits are topics which were requested by students, while the string ops are things that just seemed too useful not to include.

Tags: , ,
[ 16:45 Aug 12, 2011    More education | permalink to this entry | comments ]

Fri, 05 Aug 2011

Beginning Python, Lesson 8: Extras

Lesson 8 in my online Python course is up: Lesson 8: Extras, including exception handling, optional arguments, and running system commands. A motley collection of fun and useful topics that didn't quite fit anywhere in the earlier formal lessons, but you'll find a lot of use for them in writing real-world Python scripts. In the homework, I have some examples of some of my scripts using these techniques; I'm sure the students will have lots of interesting problems of their own.

Tags: , ,
[ 13:56 Aug 05, 2011    More education | permalink to this entry | comments ]

Sat, 30 Jul 2011

Beginning Python, Lesson 7: Object-oriented programming

Lesson 7 in my online Python course is up: Lesson 7: Object-oriented programming.

This is the last formal lesson in the Beginning Python class. But I will be posting a few more "tips and tricks" lessons, little things that didn't fit in other lessons plus suggestions for useful Python packages students may want to check out as they continue their Python hacking.

Tags: , ,
[ 09:28 Jul 30, 2011    More education | permalink to this entry | comments ]

Fri, 22 Jul 2011

Beginning Python, Lesson 6: Functions and Dictionaries

Lesson 6 in my online Python course is up: Lesson 6: Functions and Dictionaries.

We're getting near the end of the course -- partly because I think students may be saturated, though I may post one more lesson. I'll post on the list and see what the students think about it.

This afternoon, though, is pretty much booked up trying to get my mother's new Nook Touch e-book reader working with Linux. Would be easy ... except that she wants to be able to check out books from her local public library, which of course uses proprietary software from Adobe and other companies to do DRM. It remains to be seen if this will be possible ... of course, I'll post the results once we know.

Tags: , ,
[ 16:49 Jul 22, 2011    More education | permalink to this entry | comments ]

Fri, 15 Jul 2011

Beginning Python, Lesson 5

Lesson 5 in my online Python course is up: Infinite loops, modulo, and random numbers.

It's a motley mix of topics, mostly because I wanted to have a fun homework project that actually did something interesting. I hope everyone enjoys it!

Tags: , ,
[ 15:44 Jul 15, 2011    More education | permalink to this entry | comments ]

Fri, 08 Jul 2011

Beginning Python, Lesson 4

Lesson 4 in my online Python course is up: Modules and command-line arguments.

This lesson is a little longer than previous lessons, but that's partly because of a couple of digressions at the beginning. Hope I didn't overdo it! The homework includes an optional debugging problem for folks who want to dive a little deeper into this stuff.

Tags: , ,
[ 19:20 Jul 08, 2011    More education | permalink to this entry | comments ]

Sun, 03 Jul 2011

Beginning Python, Lesson 3: Strings and Lists

Lesson 3 in my online Python course is up: Fun with Strings and Lists.

There may be some backlog on the mailing list -- my first attempt to post the lesson didn't show up at all, but my second try made it. Mail seems to be flowing now, but if you try to post something and it doesn't show up, let me know or tell us on irc.linuxchix.org, so we know if there's a continuing problem that needs to be fixed, not just a one-time glitch.

Meanwhile, I'm having some trouble getting new blog entries posted. Due to some network glitches, I had to migrate shallowsky.com to a different ISP, and it turns out the PyBlosxom 1.4 I'd been using doesn't work with more recent versions of Python; but none of my PyBlosxom plug-ins work in 1.5. Aren't software upgrades a joy? So I'm getting lots of practice debugging other people's Python code trying to get the plug-ins updated, and there probably won't be many blog entries until I've figured that out.

Once that's all straightened out, I should have a cool new PyTopo feature to report on, as well as some Arduino hacks I've had on the back burner for a while.

Tags: , ,
[ 10:57 Jul 03, 2011    More education | permalink to this entry | comments ]

Fri, 24 Jun 2011

Beginning Python, Lesson 2 posted

I've just posted Lesson 2 in my online Python course, covering loops, if statements, and beer! You can read it in the list archives: Lesson 2: Loops, if, and beer, or, better, subscribe to the list so you can join the discussion.

I hope everybody has fun writing loops!

Tags: , ,
[ 15:10 Jun 24, 2011    More education | permalink to this entry | comments ]

Thu, 16 Jun 2011

Beginning Programming in Python course starting

I'm about to start a new LinuxChix course: Beginning Programming in Python.

It will be held on the Linuxchix Courses mailing list: to follow the course, subscribe to the list. Lessons will be posted weekly, on Fridays, with the first lesson starting tomorrow, Friday, June 17.

This is intended a short course, probably only 4-5 weeks to start with, aimed mostly at people who are new to programming. Though of course anyone is welcome, even if you've programmed before. And experienced programmers are welcome to hang out, lurk and help answer questions. I might extended the course if people are still interested and having fun.

The course is free (just subscribe to the mailing list) and open to both women and men. Standard LinuxChix rules apply: Be polite, be helpful. And do the homework. :-)

Tags: , ,
[ 08:51 Jun 16, 2011    More education | permalink to this entry | comments ]

Fri, 20 May 2011

Packaging Python for MeeGo (or other RPM-based distros)

Writing Python scripts for MeeGo is easy. But how do you package a Python script in an RPM other MeeGo users can install?

It turned out to be far easier than I expected. Python and Ubuntu had all the tools I needed.

First you'll need a .desktop file describing your app, if you don't already have one. This gives window managers the information they need to show your icon and application name so the user can run it. Here's the one I wrote for PyTopo: pytopo.desktop.

Of course, you'll also want a desktop icon. Most other applications on MeeGo seemed to use 48x48 pixel PNG images, so that's what I made, though it seems to be quite flexible -- an SVG is ideal.

With your script, desktop file and an icon, you're ready to create a package.

Create a setup.py file describing your package, as in the distutils simple example or the more detailed distutils setup script page. For a sample standalone script with a desktop file and icon, you can take a look at my PyTopo setup.py.

Starting from the Python setup script, Python's distutils can generate RPM or even Windows packages -- assuming you have the appropriate tools installed on your machine.

I'm on an Ubuntu (Debian-based) machine, and all the docs imply you have to be on an RPM-based distro to make an RPM. Happily, that's not true: Ubuntu has RPM tools you can install.

$ sudo apt-get install rpm

Then let Python do its thing:

$ python setup.py bdist_rpm

Python generates the spec file and everything else needed and builds a multiarch RPM that's ready to install on MeeGo. You can install it by copying it to the MeeGo device with scp dist/PyTopo-1.0-1.noarch.rpm meego@address.of.device:/tmp/. Then, as root on the device, install it with rpm -i /tmp/PyTopo-1.0-1.noarch.rpm. You're done!

To see a working example, you can browse my latest PyTopo source (only what's in SVN; it needs a few more tweaks before it's ready for a formal release). Or try the RPM I made for MeeGo: PyTopo-1.0-1.noarch.rpm. I'd love to hear whether this works on other RPM-based distros.

What about Debian packages?

Curiously, making a Debian package on Debian/Ubuntu is much less straightforward even if you're starting on a Debian/Ubuntu machine. Distutils can't do it on its own. There's a Debian Python package recipe, but it begins with a caution that you shouldn't use it for a package you want to submit. For that, you probably have to wade through the Complete Ubuntu Packaging Guide. Clearly, that will need a separate article.

Tags: , , , ,
[ 17:44 May 20, 2011    More programming | permalink to this entry | comments ]

Fri, 13 May 2011

Children of the Code -- Derived Python projects

I got some fun email today -- two different people letting me know about new projects derived from my Python code.

One is M-Poker, originally based on a PyQt tutorial I wrote for Linux Planet. Ville Jyrkkä has taken that sketch and turned it into a real poker program. And it uses PySide now -- the new replacement for PyQt, and one I need to start using for MeeGo development. So I'll be taking a look at M-Poker myself and maybe learning things from it. There are some screenshots on the blog A Hacker's Life in Finland.

The other project is xkemu, a Python module for faking keypresses, grown out of pykey, a Python version of my Crikey keypress generation program. xkemu-server.py looks like a neat project -- you can run it and send it commands to generate key presses, rather than just running a script each time.

(Sniff) My children are going out into the world and joining other projects. I feel so proud. :-)

Tags: ,
[ 20:04 May 13, 2011    More programming | permalink to this entry | comments ]

Mon, 18 Apr 2011

A simple Python mixer (to solve a problem with sound in Natty)

I had to buy a new hard drive recently, and figured as long as I had a new install ahead of me, why not try the latest Ubuntu 11.04 beta, "Natty Narwhal"?

One of the things I noticed right away was that sound was really LOUD! -- and my usual volume keys weren't working to change that.

I have a simple setup under openbox: Meta-F7 and Meta-F8 call a shell script called "louder" and "softer" (two links to the same script), and depending on how it's invoked, the script calls aumix -v +4 or aumix -v -4.

Great, except it turns out aumix doesn't work -- at all -- under Natty (bug 684416). Rumor has it that Natty has dropped all support for OSS sound, though I don't know if that's actually true -- the bug has been sitting for four months without anyone commenting on it. (Ubuntu never seems terribly concerned about having programs in their repositories that completely fail to do anything; sadly, programs can persist that way for years.)

The command-line replacement for aumix seems to be amixer, but its documentation is sketchy at best. After a bit of experimentation, I found if I set the Master volume to 100% using alsamixergui, I could call amixer set PCM 4- or 4-. But I couldn't use amixer set Master 4+ -- sometimes it would work but most of the time it wouldn't.

That all seemed a bit too flaky for me -- surely there must be a better way? Some magic Python library? Sure enough, there's python-alsaaudio, and learning how to use it took a lot less time than I'd already wasted trying random amixer commands to see what worked. Here's the program:

#!/usr/bin/env python
# Set the volume louder or softer, depending on program name.

import alsaaudio, sys, os

increment = 4

# First find a mixer. Use the first one.
try :
    mixer = alsaaudio.Mixer('Master', 0)
except alsaaudio.ALSAAudioError :
    sys.stderr.write("No such mixer\n")
    sys.exit(1)

cur = mixer.getvolume()[0]
if os.path.basename(sys.argv[0]).startswith("louder") :
    mixer.setvolume(cur + increment, alsaaudio.MIXER_CHANNEL_ALL)
else :
    mixer.setvolume(cur - increment, alsaaudio.MIXER_CHANNEL_ALL)
print "Volume from", cur, "to", mixer.getvolume()[0]

Tags: , , ,
[ 20:13 Apr 18, 2011    More programming | permalink to this entry | comments ]

Fri, 18 Mar 2011

Finding Twitter references to you

Twitter is a bit frustrating when you try to have conversations there. You say something, then an hour later, someone replies to you (by making a tweet that includes your Twitter @handle). If you're away from your computer, or don't happen to be watching it with an eagle eye right then -- that's it, you'll never see it again. Some Twitter programs alert you to @ references even if they're old, but many programs don't.

Wouldn't it be nice if you could be notified regularly if anyone replied to your tweets, or mentioned you?

Happily, you can. The Twitter API is fairly simple; I wrote a Python function a while back to do searches in my Twitter app "twit", based on a code snippet I originally cribbed from Gwibber. But if you take out all the user interface code from twit and use just the simple JSON code, you get a nice short app. The full script is here: twitref, but the essence of it is this:

import sys, simplejson, urllib, urllib2

def get_search_data(query):
    s = simplejson.loads(urllib2.urlopen(
            urllib2.Request("http://search.twitter.com/search.json",
                            urllib.urlencode({"q": query}))).read())
    return s

def json_search(query):
    for data in get_search_data(query)["results"]:
        yield data

if __name__ == "__main__" :
    for searchterm in sys.argv[1:] :
        print "**** Tweets containing", searchterm
        statuses = json_search(searchterm)
        for st in statuses :
            print st['created_at']
            print "<%s> %s" % (st['from_user'], st['text'])
            print ""

You can run twitref @yourname from the commandline now and then. You can even call it as a cron job and mail yourself the output, if you want to make sure you see replies. Of course, you can use it to search for other patterns too, like twitref #vss or twitref #scale9x.

You'll need the simplejson Python library, which most distros offer as a package; on Ubuntu, install python-simplejson.

It's unclear how long any of this will continue to be supported, since Twitter recently announced that they disapprove of third-party apps using their API. Oh, well ... if Twitter stops allowing outside apps, I'm not sure how interested I'll be in continuing to use it.

On the other hand, their original announcement on Google Groups seems to have been removed -- I was going to link to it here and discovered it was no longer there. So maybe Twitter is listening to the outcry and re-thinking their position.

Tags: , , ,
[ 09:53 Mar 18, 2011    More programming | permalink to this entry | comments ]

Thu, 10 Mar 2011

On Linux Planet: Plotting mail logs with CairoPlot

[Pie chart showing origins of spam]

My latest LinuxPlanet article is on plotting pretty graphs from Python with CairoPlot.

Of course, to demonstrate a graphing package I needed some data. So I decided to plot some stats parsed from my Postfix mail log file. We bounce a lot of mail (mostly spam but some false positives from mis-configured email servers) that comes in with bogus HELO addresses. So I thought I'd take a graphical look at the geographical sources of those messages.

The majority were from IPs that weren't identifiable at all -- no reverse DNS info. But after that, the vast majority turned out to be, surprisingly, from .il (Israel) and .br (Brazil).

Surprised me! What fun to get useful and interesting data when I thought I was just looking for samples for an article.

Tags: , , ,
[ 14:08 Mar 10, 2011    More programming | permalink to this entry | comments ]

Tue, 22 Feb 2011

Python for (Cartalk) Puzzlers

Last week's Car Talk had a fun puzzler called "Three Pieces of Paper":

Three different numbers are chosen at random, and one is written on each of three slips of paper. The slips are then placed face down on the table. The objective is to choose the slip upon which is written the largest number.

Here are the rules: You can turn over any slip of paper and look at the amount written on it. If for any reason you think this is the largest, you're done; you keep it. Otherwise you discard it and turn over a second slip. Again, if you think this is the one with the biggest number, you keep that one and the game is over. If you don't, you discard that one too.

What are the odds of winning? The obvious answer is one in three, but you can do better than that. After thinking about it a little I figured out the strategy pretty quickly (I won't spoil it here; follow the link above to see the answer). But the question was: how often does the correct strategy give you the answer?

It made for a good "things to think about when trying to fall asleep" insomnia game. And I mostly convinced myself that the answer was 50%. But probability problems are tricky beasts (witness the Monty Hall Problem, which even professional mathematicians got wrong) and I wasn't confident about it. Even after hearing Click and Clack describe the answer on this week's show, asserting that the answer was 50%, I still wanted to prove it to myself.

Why not write a simple program? That way I could run lots of trials and see if the strategy wins 50% of the time.

So here's my silly Python program:

#! /usr/bin/env python

# Cartalk puzzler Feb 2011

import random, time

random.seed()

tot = 0
wins = 0

while True:
    # pick 3 numbers:
    n1 = random.randint(0, 100)
    n2 = random.randint(0, 100)
    n3 = random.randint(0, 100)

    # Always look at but discard the first number.
    # If the second number is greater than the first, stick with it;
    # otherwise choose the third number.
    if n2 > n1 :
        final = n2
    else :
        final = n3

    biggest = max(n1, n2, n3)
    win = (final == biggest)
    tot += 1
    if win :
        wins += 1
    print "%4d %4d %4d %10d %10s %6d/%-6d = %10d%%" % (n1, n2, n3, final,
                                                       str(win),
                                                       wins, tot,
                                                       int(wins*100/tot))
    if tot % 1000 == 0:
        print "(%d ...)" % tot
        time.sleep(1)

It chooses numbers between 0 and 100, for no particular reason; I could randomize that, but it wouldn't matter to the result. I made it print out all the outcomes, but pause for a second after every thousand trials ... otherwise the text scrolls too fast to read.

And indeed, the answer converges very rapidly to 50%. Hurray!

After I wrote the script, I checked Car Talk's website. They have a good breakdown of all the possible outcomes and how they map to a probability. Of course, I could have checked that first, before writing the program. But I was thinking about this in the car while driving home, with no access to the web ... and besides, isn't it always more fun to prove something to yourself than to take someone else's word for it?

Tags: , , ,
[ 20:17 Feb 22, 2011    More programming | permalink to this entry | comments ]

Fri, 18 Feb 2011

New GIMP Arrow Designer

[arrow] While writing a blog post on GIMP's confusing Auto button (to be posted soon), I needed some arrows, and discovered a bug in my Arrow Designer script when making arrows that are mostly vertical.

So I fixed it. You can get the new Arrow Designer 0.5 on my GIMP Arrow Designer page.

It's purely a coincidence that I discovered this a week before SCALE, where I'll be speaking on Writing GIMP Scripts and Plug-Ins. Arrow Designer is one of my showpieces for making interactive plug-ins with GIMP-Python, so I'm glad I noticed the bug when I did.

Tags: , ,
[ 20:28 Feb 18, 2011    More gimp | permalink to this entry | comments ]

Mon, 31 Jan 2011

Feedme 0.7

[FeedMe, Seymour!] I've been enjoying my Android tablet e-reader for a couple of months now ... and it's made me realize some of the shortcomings in FeedMe. So of course I've been making changes along the way -- quite a few of them, from handling multiple output file types (html, plucker, ePub or FictionBook) to smarter handling of start, end and skip patterns to a different format of the output directory.

It's been fairly solid for a few weeks now, so it's time to release ... FeedMe 0.7.

Tags: , , ,
[ 21:32 Jan 31, 2011    More programming | permalink to this entry | comments ]

Tue, 18 Jan 2011

X Terminal Colors (and dark and light backgrounds)

[Displaying colors in an xterm] At work, I'm testing some web programming on a server where we use a shared account -- everybody logs in as the same user. That wouldn't be a problem, except nearly all Linuxes are set up to use colors in programs like ls and vim that are only readable against a dark background. I prefer a light background (not white) for my terminal windows.

How, then, can I set things up so that both dark- and light-backgrounded people can use the account? I could set up a script that would set up a different set of aliases and configuration files, like when I changed my vim colors. Better, I could fix all of them at once by changing my terminal's idea of colors -- so when the remote machine thinks it's feeding me a light color, I see a dark one.

I use xterm, which has an easy way of setting colors: it has a list of 16 colors defined in X resources. So I can change them in ~/.Xdefaults.

That's all very well. But first I needed a way of seeing the existing colors, so I knew what needed changing, and of testing my changes.

Script to show all terminal colors

I thought I remembered once seeing a program to display terminal colors, but now that I needed one, I couldn't find it. Surely it should be trivial to write. Just find the escape sequences and write a script to substitute 0 through 15, right?

Except finding the escape sequences turned out to be harder than I expected. Sure, I found them -- lots of them, pages that conflicted with each other, most giving sequences that didn't do anything visible in my xterm.

Eventually I used script to capture output from a vim session to see what it used. It used <ESC>[38;5;Nm to set color N, and <ESC>[m to reset to the default color. This more or less agreed Wikipedia's ANSI escape code page, which says <ESC>[38;5; does "Set xterm-256 text coloor" with a note "Dubious - discuss". The discussion says this isn't very standard. That page also mentions the simpler sequence <ESC>[0;Nm to set the first 8 colors.

Okay, so why not write a script that shows both? Like this:

#! /usr/bin/env python

# Display the colors available in a terminal.

print "16-color mode:"
for color in range(0, 16) :
    for i in range(0, 3) :
        print "\033[0;%sm%02s\033[m" % (str(color + 30), str(color)),
    print

# Programs like ls and vim use the first 16 colors of the 256-color palette.
print "256-color mode:"
for color in range(0, 256) :
    for i in range(0, 3) :
        print "\033[38;5;%sm%03s\033[m" % (str(color), str(color)),
    print

Voilà! That shows the 8 colors I needed to see what vim and ls were doing, plus a lovely rainbow of other possible colors in case I ever want to do any serious ASCII graphics in my terminal.

Changing the X resources

The next step was to change the X resources. I started by looking for where the current resources were set, and found them in /etc/X11/app-defaults/XTerm-color:

$ grep color /etc/X11/app-defaults/XTerm-color
irrelevant stuff snipped
*VT100*color0: black
*VT100*color1: red3
*VT100*color2: green3
*VT100*color3: yellow3
*VT100*color4: blue2
*VT100*color5: magenta3
*VT100*color6: cyan3
*VT100*color7: gray90
*VT100*color8: gray50
*VT100*color9: red
*VT100*color10: green
*VT100*color11: yellow
*VT100*color12: rgb:5c/5c/ff
*VT100*color13: magenta
*VT100*color14: cyan
*VT100*color15: white
! Disclaimer: there are no standard colors used in terminal emulation.
! The choice for color4 and color12 is a tradeoff between contrast, depending
! on whether they are used for text or backgrounds.  Note that either color4 or
! color12 would be used for text, while only color4 would be used for a
! Originally color4/color12 were set to the names blue3/blue
!*VT100*color4: blue3
!*VT100*color12: blue
!*VT100*color4: DodgerBlue1
!*VT100*color12: SteelBlue1

So all I needed to do was take the ones that don't show up well -- yellow, green and so forth -- and change them to colors that work better, choosing from the color names in /etc/X11/rgb.txt or my own RGB values. So I added lines like this to my ~/.Xdefaults:

!! color2 was green3
*VT100*color2: green4
!! color8 was gray50
*VT100*color8: gray30
!! color10 was green
*VT100*color10: rgb:00/aa/00
!! color11 was yellow
*VT100*color11: dark orange
!! color14 was cyan
*VT100*color14: dark cyan
... and so on.

Now I can share accounts, and I no longer have to curse at those default ls and vim settings!

Update: Tip from Mikachu: ctlseqs.txt is an excellent reference on terminal control sequences.


Tags: , , , , ,
[ 09:56 Jan 18, 2011    More linux | permalink to this entry | comments ]

Tue, 04 Jan 2011

Fontasia v 0.5

[Fontasia, a font viewer and categorizer] I had a nice relaxing holiday season. A little too relaxing -- I didn't get much hacking done, and spent more time fighting with things that didn't work than making progress fixing things.

But I did spend quite a bit of time with my laptop, currently running Arch Linux, trying to get the fonts to work as well as they do in Ubuntu. I don't have a definite solution yet to my Arch font issues, but all the fiddling with fonts did lead me to realize that I needed an easier way to preview specific fonts in bold.

So I added Bold and Italic buttons to fontasia, and called it Fontasia 0.5. I'm finding it quite handy for previewing all my fixed-width fonts while trying to find one emacs can display.

Tags: , ,
[ 22:00 Jan 04, 2011    More programming | permalink to this entry | comments ]

Sat, 30 Oct 2010

New versions of mapping programs: Pytopo and Ellie

[pytopo logo] On our recent Mojave trip, as usual I spent some of the evenings reviewing maps and track logs from some of the neat places we explored.

There isn't really any existing open source program for offline mapping, something that works even when you don't have a network. So long ago, I wrote Pytopo, a little program that can take map tiles from a Windows program called Topo! (or tiles you generate yourself somehow) and let you navigate around in that map.

But in the last few years, a wonderful new source of map tiles has become available: OpenStreetMap. On my last desert trip, I whipped up some code to show OSM tiles, but a lot of the code was hacky and empirical because I couldn't find any documentation for details like the tile naming scheme.

Well, that's changed. Upon returning to civilization I discovered there's now a wonderful page explaining the Slippy map tilenames very clearly, with sample code and everything. And that was the missing piece -- from there, all the things I'd been missing in pytopo came together, and now it's a useful self-contained mapping script that can download its own tiles, and cache them so that when you lose net access, your maps don't disappear along with everything else.

Pytopo can show GPS track logs and waypoints, so you can see where you went as well as where you might want to go, and whether that road off to the right actually would have connected with where you thought you were heading.

It's all updated in svn and on the Pytopo page.

Ellie

[Ellie icon]

Most of the pytopo work came after returning from the desert, when I was able to google and find that OSM tile naming page. But while still out there and with no access to the web, I wanted to review the track logs from some of our hikes and see how much climbing we'd done. I have a simple package for plotting elevation from track logs, called Ellie. But when I ran it, I discovered that I'd never gotten around to installing the pylab Python plotting package (say that three times fast!) on this laptop.

No hope of installing the package without a net ... so instead, I tweaked Ellie so that so that without pylab you can still print out statistics like total climb. While I was at it I added total distance, time spent moving and time spent stopped. Not a big deal, but it gave me the numbers I wanted. It's available as ellie 0.3.

Tags: , ,
[ 18:24 Oct 30, 2010    More mapping | permalink to this entry | comments ]

Fri, 15 Oct 2010

Snakes on a Couch! Using Python with CouchDB

Part II of my CouchDB tutorial is out at Linux Planet. In it, I use Python and CouchDB to write a simple application that keeps track of which restaurants you've been to recently, and to suggest new places to eat where you haven't been.

Snakes on a Couch, Part 2: Where do you want to eat?

Tags: , , , ,
[ 20:00 Oct 15, 2010    More writing | permalink to this entry | comments ]

Thu, 23 Sep 2010

Snakes on a Couch! Using Python with CouchDB

I've been learning CouchDB, the hot NoSQL database, as part of my new job. It's interesting -- a very different mindset compared to classic databases like MySQL.

There's a fairly good Python package for it, python-couchdb ... but the documentation is somewhat incomplete and there's very little else written about it, and virtually no sample code to steal.

That makes it a perfect topic for a Linux Planet tutorial! So here it is, Part 1:

Snakes on a Couch! Using Python with CouchDB.

I have a rather fun application for the database I introduce in the article, but you'll have to wait until Part 2, two weeks from now, to see the details.

Tags: , , , ,
[ 10:55 Sep 23, 2010    More writing | permalink to this entry | comments ]

Fri, 03 Sep 2010

Fontasia v 0.3

A couple of weeks ago I posted about fontasia, my new font-chooser app. [Fontasia: font viewer/categorizer It's gone through a couple of revisions since then, and Mikael Magnusson contributed several excellent improvements, like being able to render each font in the font list.

I'd been holding off on posting 0.3, hoping to have time to do something about the font buttons -- they really need to be smaller, so there's space for more categories. But between a new job and several other commitments, I haven't had time to implement that. And the fancy font list is so cool it really ought to be shared.

So here it is: fontasia 0.3.

Tags: , ,
[ 09:31 Sep 03, 2010    More programming | permalink to this entry | comments ]

Tue, 17 Aug 2010

Fontasia: View and categorize your fonts

[Fontasia: font viewer/categorizer We were talking about fonts again on IRC, and how there really isn't any decent font viewer on Linux that lets you group fonts into categories.

Any time you need to choose a font -- perhaps you know you need one that's fixed-width, script, cartoony, western-themed -- you have to go through your entire font list, clicking one by one on hundreds of fonts and saving the relevant ones somehow so you can compare them later. If you have a lot of fonts installed, it can take an hour or more to choose the right font for a project.

There's a program called fontypython that does some font categorization, but it's hard to use: it doesn't operate on your installed fonts, only on fonts you copy into a special directory. I never quite understood that; I want to categorize the fonts I can actually use on my system.

I've been wanting to write a font categorizer for a long time, but I always trip up on finding documentation on getting Python to render fonts. But this time, when I googled, I found jan bodnar's ZetCode Pango tutorial, which gave me all I needed and I was off and running.

Fontasia is initially a font viewer. It shows all your fonts in a list on the left, with a preview on the right. But it also lets you add categories: just type the category name in the box and click Add category and a button for that category will appear, with the current font added to it. A font can be in multiple categories.

Once you've categorized your fonts, a menu at the top of the window lets you show just the fonts in a particular category. So if you're working on a project that needs a Western-style font, show that category and you'll see only relevant fonts.

You can also show only the fonts you've categorized -- that way you can exclude fonts you never use -- I don't speak Tamil or Urdu so I don't really need to see those fonts when I'm choosing a font. Or you can show only the uncategorized fonts: this is useful when you add some new fonts to your system and need to go through them and categorize them.

I'm excited about fontasia. It's only a few days old and already used it several times for real-world font selection problems.

If you want to try it, it's here: Fontasia: View and categorize fonts.

Tags: , ,
[ 11:20 Aug 17, 2010    More programming | permalink to this entry | comments ]

Sat, 10 Jul 2010

Interactive arrow design in GIMP

How many times have you wanted an easy way of making arrows in GIMP?

I need arrows all the time, for screenshots and diagrams. And there really isn't any easy way to do that in GIMP. There's a script-fu for making arrows in the Plug-in registry, but it's fiddly and always takes quite a few iterations to get it right. More often, I use a collection of arrow brushes I downloaded from somewhere -- I can't remember exactly where I got my collection, but there are lots of options if you google gimp arrow brushes -- then use the free rotate tool to rotate the arrow in the right direction.

[GIMP Arrow Designer] The topic of arrows came up again on #gimp yesterday, and Alexia Death mentioned her script-fu in GIMP Fx Foundary that "abuses the selection" to make shapes, like stars and polygons. She suggested that it would be easy to make arrows the same way, using the current selection as a guide to where the arrow should go.

And that got me thinking about Joao Bueno's neat Python plug-in demo that watches the size of the selection and updates a dialog every time the selection changes. Why not write an interactive Python script that monitors the selection and lets you change the arrow by changing the size of the selection, while fine-tuning the shape and size of the arrowhead interactively via a dialog?

Of course I had to write it. And it works great! I wish I'd written this five years ago.

This will also make a great demo for my OSCON 2010 talk on Writing GIMP Scripts and Plug-ins, Thursday July 22. I wish I'd had it for Libre Graphics Meeting last month.

It's here: GIMP Arrow Designer.

Tags: , , ,
[ 10:25 Jul 10, 2010    More gimp | permalink to this entry | comments ]

Fri, 16 Apr 2010

Tee in Python

I needed a way to send the output of a Python program to two places simultaneously: print it on-screen, and save it to a file.

Normally I'd use the Linux command tee for that: prog | tee prog.out saves a copy of the output to the file prog.out as well as printing it. That worked fine until I added something that needed to prompt the user for an answer. That doesn't work when you're piping through tee: the output gets buffered and doesn't show up when you need it to, even if you try to flush() it explicitly.

I investigated shell-based solutions: the output I need is on sterr, while Python's raw_input() user prompt uses stdout, so if I could get the shell to send stderr through tee without stdout, that would have worked. My preferred shell, tcsh, can't do this at all, but bash supposedly can. But the best examples I could find on the web, like the arcane prog 2>&1 >&3 3>&- | tee prog.out 3>&- didn't work.

I considered using /dev/tty or opening a pty, but those calls only work on Linux and Unix and the program is otherwise cross-platform.

What I really wanted was a class that acts like a standard Python file object, but when you write to it it writes to two places: the log file and stderr.

I found an example of someone trying to write a Python tee class, but it didn't work: it worked for write() but not for print >>

I am greatly indebted to KirkMcDonald of #python for finding the problem. In the Python source implementing >>, PyFile_WriteObject (line 2447) checks the object's type, and if it's subclassed from the built-in file object, it writes directly to the object's fd instead of calling write().

The solution is to use composition rather than inheritance. Don't make your file-like class inherit from file, but instead include a file object inside it. Like this:

import sys

class tee :
    def __init__(self, _fd1, _fd2) :
        self.fd1 = _fd1
        self.fd2 = _fd2

    def __del__(self) :
        if self.fd1 != sys.stdout and self.fd1 != sys.stderr :
            self.fd1.close()
        if self.fd2 != sys.stdout and self.fd2 != sys.stderr :
            self.fd2.close()

    def write(self, text) :
        self.fd1.write(text)
        self.fd2.write(text)

    def flush(self) :
        self.fd1.flush()
        self.fd2.flush()

stderrsav = sys.stderr
outputlog = open(logfilename, "w")
sys.stderr = tee(stderrsav, outputlog)

And it works! print >>sys.stderr, "Hello, world" now goes to the file as well as stderr, and raw_input still works to prompt the user for input.

In general, I'm told, it's not safe to inherit from Python's built-in objects like file, because they tend to make assumptions instead of making virtual calls to your overloaded methods. What happened here will happen for other objects too. So use composition instead when extending Python's built-in types.

Tags: ,
[ 08:48 Apr 16, 2010    More programming | permalink to this entry | comments ]

Fri, 08 Jan 2010

Python-GTK regression: How to catch mouse button release

We just had the second earthquake in two days, and I was chatting with someone about past earthquakes and wanted to measure the distance to some local landmarks. So I fired up PyTopo as the easiest way to do that. Click on one point, click on a second point and it prints distance and bearing from the first point to the second.

Except it didn't. In fact, clicks weren't working at all. And although I have hacked a bit on parts of pytopo (the most recent project was trying to get scaling working properly in tiles imported from OpenStreetMap), the click handling isn't something I've touched in quite a while.

It turned out that there's a regression in PyGTK: mouse button release events now need you to set an event mask for button presses as well as button releases. You need both, for some reason. So you now need code that looks like this:

drawing_area.connect("button-release-event", button_event)
drawing_area.set_events(gtk.gdk.EXPOSURE_MASK |
                        # next line wasn't needed before:
                        gtk.gdk.BUTTON_PRESS_MASK |
                        gtk.gdk.BUTTON_RELEASE_MASK )

An easy fix ... once you find it.

I filed bug 606453 to see whether the regression was intentional.

I've checked in the fix to the PyTopo svn repository on Google Code. It's so nice having a public source code repository like that! I'm planning to move Pho to Google Code soon.

Tags: , , ,
[ 13:20 Jan 08, 2010    More programming | permalink to this entry | comments ]

Wed, 25 Nov 2009

Character Sets and Encodings in Linux, part 2

Continuing the discussion of those funny characters you sometimes see in email or on web pages, today's Linux Planet article discusses how to convert and handle encoding errors, using Python or the command-line tool recode:

Mastering Characters Sets in Linux (Weird Characters, part 2).

Tags: , , , , , , ,
[ 14:06 Nov 25, 2009    More writing | permalink to this entry | comments ]

Wed, 11 Nov 2009

Building a Py-Webkit-GTK presentation tool

I almost always write my presentation slides using HTML. Usually I use Firefox to present them; it's the browser I normally run, so I know it's installd and the slides all work there. But there are several disadvantages to using Firefox:

Last year, when I was researching lightweight browsers, one of the ones that impressed me most was something I didn't expect: the demo app that comes with pywebkitgtk (package python-webkit on Ubuntu). In just a few lines of Python, you can create your own browser with any UI you like, with a fully functional content area. Their current demo even has tabs.

So why not use pywebkitgtk to create a simple fullscreen webkit-based presentation tool?

It was even simpler than I expected. Here's the code:

#!/usr/bin/env python
# python-gtk-webkit presentation program.
# Copyright (C) 2009 by Akkana Peck.
# Share and enjoy under the GPL v2 or later.

import sys
import gobject
import gtk
import webkit

class WebBrowser(gtk.Window):
    def __init__(self, url):
        gtk.Window.__init__(self)
        self.fullscreen()

        self._browser= webkit.WebView()
        self.add(self._browser)
        self.connect('destroy', gtk.main_quit)

        self._browser.open(url)
        self.show_all()

if __name__ == "__main__":
    if len(sys.argv) <= 1 :
        print "Usage:", sys.argv[0], "url"
        sys.exit(0)

    gobject.threads_init()
    webbrowser = WebBrowser(sys.argv[1])
    gtk.main()

That's all! No navigation needed, since the slides include javascript navigation to skip to the next slide, previous, beginning and end. It does need some way to quit (for now I kill it with ctrl-C) but that should be easy to add.

Webkit and image buffering

It works great. The only problem is that webkit's image loading turns out to be fairly poor compared to Firefox's. In a presentation where most slides are full-page images, webkit clears the browser screen to white, then loads the image, creating a noticable flash each time. Having the images in cache, by stepping through the slide show then starting from the beginning again, doesn't help much (these are local images on disk anyway, not loaded from the net). Firefox loads the same images with no flash and no perceptible delay.

I'm not sure if there's a solution. I asked some webkit developers and the only suggestion I got was to rewrite the javascript in the slides to do image preloading. I'd rather not do that -- it would complicate the slide code quite a bit solely for a problem that exists only in one library.

There might be some clever way to hack double-buffering in the app code. Perhaps something like catching the 'load-started' signal, switching to another gtk widget that's a static copy of the current page (if there's a way to do that), then switching back on 'load-finished'.

But that will be a separate article if I figure it out. Ideas welcome!

Update, years later: I've used this for quite a few real presentations now. Of course, I keep tweaking it: see my scripts page for the latest version.

Tags: , , , ,
[ 16:12 Nov 11, 2009    More programming | permalink to this entry | comments ]

Tue, 20 Oct 2009

Gathering RSS files for a Palm PDA: FeedMe

For years I've been reading daily news feeds on a series of PalmOS PDAs, using a program called Sitescooper that finds new pages on my list of sites, downloads them, then runs Plucker to translate them into Plucker's open Palm-compatible ebook format.

Sitescooper has an elaborate series of rules for trying to get around the complicated formatting in modern HTML web pages. It has an elaborate cache system to figure out what it's seen before. When sites change their design (which most news sites seem to do roughly monthly), it means going in and figuring out the new format and writing a new Sitescooper site file. And it doesn't understand RSS, so you can't use the simplified RSS that most sites offer. Finally, it's no longer maintained; in fact, I was the last maintainer, after the original author lost interest.

Several weeks ago, bma tweeted about a Python RSS reader he'd hacked up using the feedparser package. His reader targeted email, not Palm, but finding out about feedparser was enough to get me started. So I wrote FeedMe (Carla Schroder came up with the all-important name).

I've been using it for a couple of weeks now and I'm very happy with the results. It's still quite rough, of course, but it's already producing better files than Sitescooper did, and it seems more maintainable. Time will tell.

Of course it needs to be made more flexible, adjusted so that it can produce formats besides Plucker, and so on. I'll get to it.

And the only site I miss now, because it doesn't offer an RSS feed, is Linux Planet. Maybe I'll find a solution for that eventually.

Tags: , , , ,
[ 20:08 Oct 20, 2009    More programming | permalink to this entry | comments ]

Mon, 03 Aug 2009

Twit: Now with pattern searches

During OSCON a couple of weeks ago, I kept wishing I could do Twitter searches for a pattern like #oscon in a cleaner way than keeping a tab open in Firefox where I periodically hit Refresh.

Python-twitter doesn't support searches, alas, though it is part of the Twitter API. There's an experimental branch of python-twitter with searching, but I couldn't get it to work. But it turns out Gwibber is also written in Python, and I was able to lift some JSON code from Gwibber to implement a search. (Gwibber itself, alas, doesn't work for me: it bombs out looking for the Gnome keyring. Too bad, looks like it might be a decent client.)

I hacked up a "search for OSCON" program and used it a little during the week of the conference, then got home and absorbed in catching up and preparing for next week's GetSET summer camp, where I'm running an astronomy workshop and a Javascript workshop for high school girls. That's been keeping me frazzled, but I found a little time last night to clean up the search code and release Twit 0.3 with search and a few other new command-line arguments.

No big deal, but it was nice to take a hacking break from all this workshop coordinating. I'm definitely happier program than I am organizing events, that's for sure.

Tags: , ,
[ 17:23 Aug 03, 2009    More programming | permalink to this entry | comments ]

Thu, 09 Jul 2009

Twittering -- and writing Twitter clients

I finally dragged myself into 2009 and tried Twitter.

I'd been skeptical, but it's actually fairly interesting and not that much of a time sink. While it's true that some people tweet about every detail of their lives -- "I'm waiting for a bus" / "Oh, hooray, the bus is finally here" / "I got a good seat in the second row of the bus" / "The bus just passed Second St. and two kids got on" / "Here's a blurry photo from my phone of the Broadway Av. sign as we pass it" -- it's easy enough to identify those people and un-follow them.

And there are tons of people tweeting about interesting stuff. It's like a news ticker, but customizable -- news on the latest protests in Iran, the latest progress on freeing the Mars Spirit Rover, the latest interesting publication on dinosaur fossils, and what's going on at that interesting conference halfway around the world.

The trick is to figure out how you want the information delivered. I didn't want to have to leave a tab open in Firefox all the time. There was an xchat plug-in that sounded perfect -- I have an xchat window up most of the time I'm online -- but it turned out it works by picking one of the servers you're connected to, making a private channel and posting things there. That seemed abusive to the server -- what if everyone on Freenode did that?

So I wanted a separate client. Something lightweight and simple. Unfortunately, all the Twitter clients available for Linux either require that I install a lot of infrastructure first (either Adobe Air or Mono), or they just plain didn't work (a Twitter client where you can't click on links? Come on!)

But then I tried out the Python-Twitter bindings, and they were so easy to use I decided to write them up for my next Linux Planet article, which came out today: Write Your Own Linux Twitter Client In Less Time Than It Takes To Find One!.

The article shows how to use the bindings to write a bare-bones client. But of course, I've been hacking on the client all along, so the one I'm actually using has a lot more features like *ahem* letting you click on links. And letting you block threads, though I haven't actually tested that since I haven't seen any threads I wanted to block since my first day.

You can download the current version of Twit, and anyone who's interested can follow me on Twitter. I don't promise to be interesting -- that's up to you to decide -- but I do promise not to tweet about every block of my bus ride.

Tags: , , ,
[ 15:09 Jul 09, 2009    More writing | permalink to this entry | comments ]

Sat, 20 Jun 2009

Pytopo 0.8 released

On my last Mojave trip, I spent a lot of the evenings hacking on PyTopo.

I was going to try to stick to OpenStreetMap and other existing mapping applications like TangoGPS, a neat little smartphone app for downloading OpenStreetMap tiles that also runs on the desktop -- but really, there still isn't any mapping app that works well enough for exploring maps when you have no net connection.

In particular, uploading my GPS track logs after a day of mapping, I discovered that Tango really wasn't a good way of exploring them, and I already know Merkaartor, nice as it is for entering new OSM data, isn't very good at working offline. There I was, with PyTopo and a boring hotel room; I couldn't stop myself from tweaking a bit.

Adding tracklogs was gratifyingly easy. But other aspects of the code bother me, and when I started looking at what I might need to do to display those Tango/OSM tiles ... well, I've known for a while that some day I'd need to refactor PyTopo's code, and now was the time.

Surprisingly, I completed most of the refactoring on the trip. But even after the refactoring, displaying those OSM tiles turned out to be a lot harder than I'd hoped, because I couldn't find any reliable way of mapping a tile name to the coordinates of that tile. I haven't found any documentation on that anywhere, and Tango and several other programs all do it differently and get slightly different coordinates. That one problem was to occupy my spare time for weeks after I got home, and I still don't have it solved.

But meanwhile, the rest of the refactoring was done, nice features like track logs were working, and I've had to move on to other projects. I am going to finish the OSM tile MapCollection class, but why hold up a release with a lot of useful changes just for that?

So here's PyTopo 0.8, and the couple of known problems with the new features will have to wait for 0.9.

Tags: , , , ,
[ 19:49 Jun 20, 2009    More programming | permalink to this entry | comments ]

Fri, 19 Jun 2009

Python: show all methods in a given object or module

A silly little thing, but something that Python books mostly don't mention and I can never find via Google:

How do you find all the methods in a given class, object or module?

Ideally the documentation would tell you. Wouldn't that be nice? But in the real world, you can't count on that, and examining all of an object's available methods can often give you a good guess at how to do whatever you're trying to do.

Python objects keep their symbol table in a dictionary called __dict__ (that's two underscores on either end of the word). So just look at object.__dict__. If you just want the names of the functions, use object.__dict__.keys().

Thanks to JanC for suggesting dir(object) and help(object), which can be more helpful -- not all objects have a __dict__.

Tags: , ,
[ 11:44 Jun 19, 2009    More programming | permalink to this entry | comments ]

Sun, 14 Jun 2009

Programming With PyGTK, part 3: Key events and object oriented Python

Part 3 of "Graphical Python Programming With PyGTK" uses object-oriented Python to clean up the code from Part 2, and also adds handling of key events to get rid of that silly Quit button. PythonGTK Programming part 3: Screensaver, Objects, and User Input

Tags: , ,
[ 11:18 Jun 14, 2009    More writing | permalink to this entry | comments ]

Mon, 01 Jun 2009

A GPX file manager

Someone on the OSM newbies list asked how he could strip waypoints out of a GPX track file. Seems he has track logs of an interesting and mostly-unmapped place that he wants to add to openstreetmap, but there are some waypoints that shouldn't be included, and he wanted a good way of separating them out before uploading.

Most of the replies involved "just edit the XML." Sure, GPX files are pretty simple and readable XML -- but a user shouldn't ever have to do that! Gpsman and gpsbabel were also mentioned, but they're not terribly easy to use either.

That reminded me that I had another XML-parsing task I'd been wanting to write in Python: a way to split track files from my Garmin GPS.

Sometimes, after a day of mapping, I end up with several track segments in the same track log file. Maybe I mapped several different trails; maybe I didn't get a chance to upload one day's mapping before going out the next day. Invariably some of the segments are of zero length (I don't know why the Garmin does that, but it always does). Applications like merkaartor don't like this one bit, so I usually end up editing the XML file and splitting it into segments by hand. I'm comfortable with XML -- but it's still silly.

I already have some basic XML parsing as part of PyTopo and Ellie, so I know the parsing very easy to do. So, spurred on by the posting on OSM-newbies, I wrote a little GPX parser/splitter called gpxmgr. gpxmgr -l file.gpx can show you how many track logs are in the file; gpxmgr -w file.gpx can write new files for each non-zero track log. Add -p if you want to be prompted for each filename (otherwise it'll use the name of the track log, which might be something like "ACTIVE\ LOG\ #2").

How, you may wonder, does that help the original poster's need to separate out waypoints from track files? It doesn't. See, my GPS won't save tracklogs and waypoints in the same file, even if you want them that way; you have to use two separate gpsbabel commands to upload a track file and a waypoint file. So I don't actually know what a tracklog-plus-waypoint file looks like. If anyone wants to use gpxmgr to manage waypoints as well as tracks, send me a sample GPX file that combines them both.

Tags: , ,
[ 19:43 Jun 01, 2009    More mapping | permalink to this entry | comments ]

Thu, 28 May 2009

Programming With PyGTK, part 2: pretty screensaver-type graphics

Part 2 of Graphical Python Programming With PyGTK gets into how to do some cool Qix screensaver-style graphics, in: Graphical Python Programming With PyGTK, part 2: Write Your Own Screensaver.

There's also a digg link.

Tags: , ,
[ 17:09 May 28, 2009    More writing | permalink to this entry | comments ]

Thu, 14 May 2009

Graphical Python Programming With PyGTK

This week's Linux Planet article is another one on Python and graphical toolkits, but this time it's a little more advanced: Graphical Python Programming With PyGTK.

This one started out as a fun and whizzy screensaver sort of program that draws lots of pretty colors -- but I couldn't quite fit it all into one article, so that will have to wait for the sequel two weeks from now.

Tags: , ,
[ 18:53 May 14, 2009    More writing | permalink to this entry | comments ]

Thu, 23 Apr 2009

Linux Planet: GIMP Python Plugins, part II

Latest Linux Planet article: How to write a "blobify" GIMP plug-in in Python to make text look three-dimensional.

Creating a Fancy 3D-Effect GIMP Plugin in Python.

Tags: , ,
[ 10:46 Apr 23, 2009    More writing | permalink to this entry | comments ]

Thu, 09 Apr 2009

Linux Planet: Writing Plugins for GIMP in Python

Latest Linux Planet article: Part 1 of a two-parter on Writing GIMP scripts in Python. As usual, there's a Digg link too.

Tags: , ,
[ 21:21 Apr 09, 2009    More writing | permalink to this entry | comments ]

Thu, 26 Mar 2009

GUI Programming in Python For Beginners

Latest on Linux Planet: another introductory programming article, this time on Python's tkinter library: GUI Programming in Python For Beginners. (As usual, there's a Digg link and also a Reddit one.)

Tags: , ,
[ 15:36 Mar 26, 2009    More writing | permalink to this entry | comments ]

Tue, 03 Mar 2009

Ellie: Plot GPS elevation profiles

Ever since I got the GPS I've been wanting something that plots the elevation data it stores. There are lots of apps that will show me the track I followed in latitude and longitude, but I couldn't find anything that would plot elevations.

But GPX (the XML-based format commonly used to upload track logs) is very straightforward -- you can look at the file and read the elevations right out of it. I knew it wouldn't be hard to write a script to plot them in Python; it just needed a few quiet hours. Sounded like just the ticket for a rainy day stuck at home with a sore throat.

Sure enough, it was fairly easy. I used xml.dom.minidom to parse the file (I'd already had some experience with it in gimplabels for converting gLabels templates), and pylab from matplotlib for doing the plotting. Easy and nice looking.

I even threw in the nice "conditional main" code from Matt Harrison's SCALE7x Python talk, so it should be callable from other Python code.

Here's the page and a screenshot: Ellie: plot elevation from a GPS track.

Tags: , ,
[ 16:57 Mar 03, 2009    More programming | permalink to this entry | comments ]

Sat, 28 Feb 2009

langgrep: search only in scripts of a specified language

I was making a minor tweak to my garmin script that uses gpsbabel to read in tracklogs and waypoints from my GPS unit, and I needed to look up the syntax of how to do some little thing in sh script. (One of the hazards of switching languages a lot: you forget syntax details and have to look things up a lot, or at least I do.)

I have quite a collection of scripts in various languages in my ~/bin (plus, of course, all the scripts normally installed in /usr/bin on any Linux machine) so I knew I'd have lots of examples. But there are scripts of all languages sharing space in those directories; it's hard to find just sh examples. For about the two-hundredth time, I wished, "Wouldn't it be nice to have a command that can search for patterns only in files that are really sh scripts?"

And then, the inevitable followup ... "You know, that would be really easy to write."

So I did -- a little python hack called langgrep that takes a language, grep arguments and a file list, looks for a shebang line and only greps the files that have a shebang matching the specified language.

Of course, while writing langgrep I needed langgrep, to look up details of python syntax for things like string.find (I can never remember whether it's string.find(s, pat) or s.find(pat); the python libraries are usually nicely object-oriented but strings are an exception and it's the former, string.find). I experimented with various shell options -- this is Unix, so of course there are plenty of ways of doing this in the shell, without writing a script. For instance:

grep find `egrep -l '#\\!.*python' *`
grep find `file * | grep python | sed 's/:.*//'`
i in foo; file $i|grep python && grep find $i; done    # in sh/bash
These are all pretty straightforward, but when I try to make them into tcsh aliases things get a lot trickier. tcsh lets you make aliases that take arguments, so you can use !:1 to mean the first argument, !2-$ to mean all the arguments starting with the second one. That's all very well, but when you put them into a shell alias in a file like .cshrc that has to be parsed, characters like ! and $ can mean other things as well, so you have to escape them with \. So the second of those three lines above turns into something like
alias greplang "grep \!:2-$ `file * | grep \!:1 | sed 's/:.*//'`"
except that doesn't work either, so it probably needs more escaping somewhere. Anyway, I decided after a little alias hacking that figuring out the right collection of backslash escapes would probably take just as long as writing a python script to do the job, and writing the python script sounded more fun.

So here it is: my langgrep script. (Awful name, I know; better ideas welcome!) Use it like this (if python is the language you're looking for, find is the search pattern, and you want -w to find only "find" as a whole word):

langgrep python -w find ~/bin/*

Tags: , ,
[ 09:57 Feb 28, 2009    More programming | permalink to this entry | comments ]

Sat, 16 Aug 2008

Fast Pixel Ops in GIMP-Python

Last night Joao and I were on IRC helping someone who was learning to write gimp plug-ins. We got to talking about pixel operations and how to do them in Python. I offered my arclayer.py as an example of using pixel regions in gimp, but added that C is a lot faster for pixel operations. I wondered if reading directly from the tiles (then writing to a pixel region) might be faster.

But Joao knew a still faster way. As I understand it, one major reason Python is slow at pixel region operations compared to a C plug-in is that Python only writes to the region one pixel at a time, while C can write batches of pixels by row, column, etc. But it turns out you can grab a whole pixel region into a Python array, manipulate it as an array then write the whole array back to the region. He thought this would probably be quite a bit faster than writing to the pixel region for every pixel.

He showed me how to change the arclayer.py code to use arrays, and I tried it on a few test layers. Was it faster? I made a test I knew would take a long time in arclayer, a line of text about 1500 pixels wide. Tested it in the old arclayer; it took just over a minute to calculate the arc. Then I tried Joao's array version: timing with my wristwatch stopwatch, I call it about 1.7 seconds. Wow! That might be faster than the C version.

The updated, fast version (0.3) of arclayer.py is on my arclayer page.

If you just want the trick to using arrays, here it is:

from array import array

[ ... setting up ... ]
        # initialize the regions and get their contents into arrays:
        srcRgn = layer.get_pixel_rgn(0, 0, srcWidth, srcHeight,
                                     False, False)
        src_pixels = array("B", srcRgn[0:srcWidth, 0:srcHeight])

        dstRgn = destDrawable.get_pixel_rgn(0, 0, newWidth, newHeight,
                                            True, True)
        p_size = len(srcRgn[0,0])               
        dest_pixels = array("B", "\x00" * (newWidth * newHeight * p_size))

[ ... then inside the loop over x and y ... ]
                        src_pos = (x + srcWidth * y) * p_size
                        dest_pos = (newx + newWidth * newy) * p_size
                        
                        newval = src_pixels[src_pos: src_pos + p_size]
                        dest_pixels[dest_pos : dest_pos + p_size] = newval

[ ... when the loop is all finished ... ]
        # Copy the whole array back to the pixel region:
        dstRgn[0:newWidth, 0:newHeight] = dest_pixels.tostring() 

Good stuff!

Tags: , , ,
[ 21:02 Aug 16, 2008    More gimp | permalink to this entry | comments ]

Sun, 25 May 2008

Crikey in Python, and generating key events with XTest

A user on the One Laptop Per Child (OLPC, also known as the XO) platform wrote to ask me how to use crikey on that platform.

There are two stages to getting crikey running on a new platform:

  1. Build it, and
  2. Figure out how to make a key run a specific program.

The crikey page contains instructions I've collected for binding keys in various window managers, since that's usually the hard part. On normal Linux machines the first step is normally no problem. But apparently the OLPC comes with gcc but without make or the X header files. (Not too surprising: it's not a machine aimed at developers and I assume most people developing for the machine cross-compile from a more capable Linux box.)

We're still working on that (if my correspondant gets it working, I'll post the instructions), but while I was googling for information about the OLPC's X environment I stumbled upon a library I didn't know existed: python-xlib. It turns out it's possible to do most or all of what crikey does from Python. The OLPC is Python based; if I could write crikey in Python, it might solve the problem. So I whipped up a little key event generating script as a test.

Unfortunately, it didn't solve the OLPC problem (they don't include python-xlib on the machine either) but it was a fun exercises, and might be useful as an example of how to generate key events in python-xlib. It supports both event generating methods: the X Test extension and XSendEvent. Here's the script: /pykey-0.1.

But while I was debugging the X Test code, I had to solve a bug that I didn't remember ever solving in the C version of crikey. Sure enough, it needed the same fix I'd had to do in the python version. Two fixes, actually. First, when you send a fake key event through XTest, there's no way to specify a shift mask. So if you need a shifted character like A, you have to send KeyPress Shift, KeyPress a. But if that's all you send, XTest on some systems does exactly what the real key would do if held down and never released: it autorepeats. (But only for a little while, not forever. Go figure.)

So the real answer is to send KeyPress Shift, KeyPress a, KeyRelease a, KeyRelease Shift. Then everything works nicely. I've updated crikey accordingly and released version 0.7 (though since XTest isn't used by default, most users won't see any change from 0.6). In the XSendEvent case, crikey still doesn't send the KeyRelease event -- because some systems actually see it as another KeyPress. (Hey, what fun would computers be if they were consistent and always predictable, huh?)

Both C and Python versions are linked off the crikey page.

Tags: , , ,
[ 14:50 May 25, 2008    More programming | permalink to this entry | comments ]

Fri, 12 Oct 2007

PyTopo and PyGTK pixbuf memory leakage

On a recent Mojave desert trip, we tried to follow a minor dirt road that wasn't mapped correctly on any of the maps we had, and eventually had to retrace our steps. Back at the hotel, I fired up my trusty PyTopo on the East Mojave map set and tried to trace the road. But I found that as I scrolled along the road, things got slower and slower until it just wasn't usable any more.

PyTopo was taking up all of my poor laptop's memory. Why? Python is garbage collected -- you're not supposed to have to manage memory explicitly, like freeing pixbufs. I poked around in all the sample code and man pages I had available but couldn't find any pygtk examples that seemed to be doing any explicit freeing.

When we got back to civilization (read: internet access) I did some searching and found the key. It's even in the PyGTK Image FAQ, and there's also some discussion in a mailing list thread from 2003.

Turns out that although Python is supposed to handle its own garbage collection, the Python interpreter doesn't grok the size of a pixbuf object; in particular, it doesn't see the image bits as part of the object's size. So dereferencing lots of pixbuf objects doesn't trigger any "enough memory has been freed that it's time to run the garbage collector" actions.

The solution is easy enough: call gc.collect() explicitly after drawing a map (or any other time a bunch of pixbufs have been dereferenced).

So there's a new version of PyTopo, 0.6 that should run a lot better on small memory machines, plus a new collection format (yet another format from the packaged Topo! map sets) courtesy of Tom Trebisky.

Oh ... in case you're wondering, the ancient USGS maps from Topo! didn't show the road correctly either.

Tags: , , ,
[ 21:21 Oct 12, 2007    More programming | permalink to this entry | comments ]

Tue, 04 Sep 2007

Egg Timer in Python and TkInter

I left the water on too long in the garden again. I keep doing that: I'll set up something where I need to check back in five minutes or fifteen minutes, then I get involved in what I'm doing and 45 minutes later, the cornbread is burnt or the garden is flooded.

When I was growing up, my mom had a little mechanical egg timer. You twist the dial to 5 minutes or whatever, and it goes tick-tick-tick and then DING! I could probably find one of those to buy (they're probably all digital now and include clocks and USB plugs and bluetooth ports) but since the problem is always that I'm getting distracted by something on the computer, why not run an app there?

Of course, you can do this with shell commands. The simple solution is:

(sleep 300; zenity --info --text="Turn off the water!") &

But the zenity dialogs are small -- what if I don't notice it? -- and besides, I have to multiply by 60 to turn a minute delay into sleep seconds. I'm lazy -- I want the computer to do that for me!
Update: Ed Davies points out that "sleep 5m" also works.

A slightly more elaborate solution is at. Say something like: at now + 15 minutes and when it prompts for commands, type something like:

export DISPLAY=:0.0
zenity --info --text="Your cornbread is ready"
to pop up a window with a message. But that's too much typing and has the same problem of the small easily-ignored dialogs. I'd really rather have a great big red window that I can't possibly miss.

Surely, I thought, someone has already written a nice egg-timer application! I tried aptitude search timer and found several apps such as gtimer, which is much more complicated than I wanted (you can define named events and choose from a list of ... never mind, I stopped reading there). I tried googling, but didn't have much luck there either (lots of Windows and web apps, no Linux apps or cross-platform scripts).

Clearly just writing the damn thing was going to be easier than finding one. (Why is it that every time I want to do something simple on a computer, I have to write it? I feel so sorry for people who don't program.)

I wanted to do it in python, but what to use for the window that pops up? I've used python-gtk in the past, but I've been meaning to check out TkInter (the gui toolkit that's kinda-sorta part of Python) and this seemed like a nice opportunity since the goal was so simple.

The resulting script: eggtimer. Call it like this:

eggtimer 5 Turn off the water
and in five minutes, it will pop up a huge red window the size of the screen with your message in big letters. (Click it or hit a key to dismiss it.)

First Impressions of TkInter

It was good to have an excuse to try TkInter and compare it with python-gtk. TkInter has been recommended as something normally installed with Python, so the user doesn't have to install anything extra. This is apparently true on Windows (and maybe on Mac), but on Ubuntu it goes the other way: I already had pygtk, because GIMP uses it, but to use TkInter I had to install python-tk.

For developing I found TkInter irritating. Most of the irritation concerned the poor documentation: there are several tutorials demonstrating very basic uses, but not much detailed documentation for answering questions like "What class is the root Tk() window and what methods does it have?" (The best I found -- which never showed up in google, but was referenced from O'Reilly's Programming Python -- was here.) In contrast, python-gtk is very well documented.

Things I couldn't do (or, at least, couldn't figure out how to do, and googling found only postings from other people wanting to do the same thing):

I expect I'll be sticking with pygtk for future projects. It's just too hard figuring things out with no documentation. But it was fun having an excuse to try something new.

Tags: , ,
[ 13:35 Sep 04, 2007    More programming | permalink to this entry | comments ]

Fri, 25 Aug 2006

PyTopo 0.5

Belated release announcement: 0.5b2 of my little map viewer PyTopo has been working well, so I released 0.5 last week with only a few minor changes from the beta. I'm sure I'll immediately find six major bugs -- but hey, that's what point releases are for. I only did betas this time because of the changed configuration file format.

I also made a start on a documentation page for the .pytopo file (though it doesn't really have much that wasn't already written in comments inside the script).

Tags: , , , ,
[ 21:10 Aug 25, 2006    More programming | permalink to this entry | comments ]

Sat, 03 Jun 2006

Cleaner, More Flexible Python Map Viewing

A few months ago, someone contacted me who was trying to use my PyTopo map display script for a different set of map data, the Topo! National Parks series. We exchanged some email about the format the maps used.

I'd been wanting to make PyTopo more general anyway, and already had some hacky code in my local version to let it use a local geologic map that I'd chopped into segments. So, faced with an Actual User (always a good incentive!), I took the opportunity to clean up the code, use some of Python's support for classes, and introduce several classes of map data.

I called it 0.5 beta 1 since it wasn't well tested. But in the last few days, I had occasion to do some map exploring, cleaned up a few remaining bugs, and implemented a feature which I hadn't gotten around to implementing in the new framework (saving maps to a file).

I think it's ready to use now. I'm going to do some more testing: after visiting the USGS Open House today and watching Jim Lienkaemper's narrated Virtual Tour of the Hayward Fault, I'm all fired up about trying again to find more online geologic map data. But meanwhile, PyTopo is feature complete and has the known bugs fixed. The latest version is on the PyTopo page.

Tags: , , , ,
[ 17:25 Jun 03, 2006    More programming | permalink to this entry | comments ]

Tue, 21 Jun 2005

A Fast Volume Control App

I updated my Debian sid system yesterday, and discovered today that gnome-volume-control has changed their UI yet again. Now the window comes up with two tabs, Playback and Capture; the default tab, Playback, has only one slider in it, PCM, and all the important sliders, like Volume, are under Capture. (I'm told this is some interaction with how ALSA sees my sound chip.)

That's just silly. I've never liked the app anyway -- it takes forever to come up, so I end up missing too much of any clip that starts out quiet. All I need is a simple, fast window with a single slider controlling master volume. But nothing like that seems to exist, except panel applets that are tied to the panels of particular window managers.

So I wrote one, in PyGTK. vol is a simple script which shows a slider, and calls aumix under the hood to get and set the volume. It's horizontal by default; vol -h gives a vertical slider.

Aside: it's somewhat amazing that Python has no direct way to read an integer out of a string containing more than just that integer: for example, to read 70 out of "70,". I had to write a function to handle that. It's such a terrific no-nonsense language most of the time, yet so bad at a few things. (And when I asked about a general solution in the python channel at [large IRC network], I got a bunch of replies like "use int(str[0:2])" and "use int(str[0:-1])". Shock and bafflement ensued when I pointed out that 5, 100, and -27 are all integers too and wouldn't be handled by those approaches.)

Tags: , , ,
[ 14:54 Jun 21, 2005    More programming | permalink to this entry | comments ]

Wed, 13 Apr 2005

PyTopo 0.3

I needed to print some maps for one of my geology class field trips, so I added a "save current map" key to PyTopo (which saves to .gif, and then I print it with gimp-print). It calls montage from Image Magick.

Get yer PyTopo 0.3 here.

Tags: , , ,
[ 16:56 Apr 13, 2005    More programming | permalink to this entry | comments ]

Sat, 09 Apr 2005

Python Expose vs. Focus

A few days ago, I mentioned my woes regarding Python sending spurious expose events every time the drawing area gains or loses focus.

Since then, I've spoken with several gtk people, and investigated several workarounds, which I'm writing up here for the benefit of anyone else trying to solve this problem.

First, "it's a feature". What's happening is that the default focus in and out handlers for the drawing area (or perhaps its parent class) assume that any widget which gains keyboard focus needs to redraw its entire window (presumably because it's locate-highlighting and therefore changing color everywhere?) to indicate the focus change. Rather than let the widget decide that on its own, the focus handler forces the issue via this expose event. This may be a bad decision, and it doesn't agree with the gtk or pygtk documentation for what an expose event means, but it's been that way for long enough that I'm told it's unlikely to be changed now (people may be depending on the current behavior).

Especially if there are workarounds -- and there are.

I wrote that this happened only in pygtk and not C gtk, but I was wrong. The spurious expose events are only passed if the CAN_FOCUS flag is set. My C gtk test snippet did not need CAN_FOCUS, because the program from which it was taken, pho, already implements the simplest workaround: put the key-press handler on the window, rather than the drawing area. Window apparently does not have the focus/expose misbehavior.

I worry about this approach, though, because if there are any other UI elements in the window which need to respond to key events, they will never get the chance. I'd rather keep the events on the drawing area.

And that becomes possible by overriding the drawing area's default focus in/out handlers. Simply write a no-op handler which returns TRUE, and set it as the handler for both focus-in and focus-out. This is the solution I've taken (and I may change pho to do the same thing, though it's unlikely ever to be a problem in pho).

In C, there's a third workaround: query the default focus handlers, and disconnect() them. That is a little more efficient (you aren't calling your nop routines all the time) but it doesn't seem to be possible from pygtk: pygtk offers disconnect(), but there's no way to locate the default handlers in order to disconnect them.

But there's a fourth workaround which might work even in pygtk: derive a class from drawing area, and set the focus in and out handlers to null. I haven't actually tried this yet, but it may be the best approach for an app big enough that it needs its own UI classes.

One other thing: it was suggested that I should try using AccelGroups for my key bindings, instead of a key-press handler, and then I could even make the bindings user-configurable. Sounded great! AccelGroups turn out to be very easy to use, and a nice feature. But they also turn out to have undocumented limitations on what can and can't be an accelerator. In particular, the arrow keys can't be accelerators; which makes AccelGroup accelerators less than useful for a widget or app that needs to handle user-initiated scrolling or movement. Too bad!

Tags: , , ,
[ 20:52 Apr 09, 2005    More programming | permalink to this entry | comments ]

Wed, 06 Apr 2005

PyTopo is usable; pygtk is inefficient

While on vacation, I couldn't resist tweaking pytopo so that I could use it to explore some of the areas we were visiting.

It seems fairly usable now. You can scroll around, zoom in and out to change between the two different map series, and get the coordinates of a particular location by clicking. I celebrated by making a page for it, with a silly tux-peering-over-map icon.

One annoyance: it repaints every time it gets a focus in or out, which means, for people like me who use mouse focus, that it repaints twice for each time the mouse moves over the window. This isn't visible, but it would drag the CPU down a bit on a slow machine (which matters since mapping programs are particularly useful on laptops and handhelds).

It turns out this is a pygtk problem: any pygtk drawing area window gets spurious Expose events every time the focus changes (whether or not you've asked to track focus events), and it reports that the whole window needs to be repainted, and doesn't seem to be distinguishable in any way from a real Expose event. The regular gtk libraries (called from C) don't do this, nor do Xlib C programs; only pygtk.

I filed bug 172842 on pygtk; perhaps someone will come up with a workaround, though the couple of pygtk developers I found on #pygtk couldn't think of one (and said I shouldn't worry about it since most people don't use pointer focus ... sigh).

Tags: , , , ,
[ 16:26 Apr 06, 2005    More programming | permalink to this entry | comments ]

Sun, 27 Mar 2005

Python GTK Topographic Map Program

I couldn't stop myself -- I wrote up a little topo map viewer in PyGTK, so I can move around with arrow keys or by clicking near the edges. It makes it a lot easier to navigate the map directory if I don't know the exact starting coordinates.

It's called PyTopo, and it's in the same place as my earlier two topo scripts.

I think CoordsToFilename has some bugs; the data CD also has some holes, and some directories don't seem to exist in the expected place. I haven't figured that out yet.

Tags: , , , ,
[ 17:53 Mar 27, 2005    More programming | permalink to this entry | comments ]

Topographic Maps for Linux

I've long wished for something like those topographic map packages I keep seeing in stores. The USGS (US Geological Survey) sells digitized versions of their maps, but there's a hefty setup fee for setting up an order, so it's only reasonable when buying large collections all at once.

There are various Linux mapping applications which do things like download squillions of small map sections from online mapping sites, but they're all highly GPS oriented and I haven't had much luck getting them to work without one. I don't (yet?) have a GPS; but even if I had one, I usually want to make maps for places I've been or might go, not for where I am right now. (I don't generally carry a laptop along on hikes!)

The Topo! map/software packages sold in camping/hiking stores (sometimes under the aegis of National Geographic are very reasonably priced. But of course, the software is written for Windows (and maybe also Mac), not much help to Linux users, and the box gives no indication of the format of the data. Googling is no help; it seems no Linux user has ever tried buying one of these packages to see what's inside. The employees at my local outdoor equipment store (Mel Cotton's) were very nice without knowing the answer, and offered the sensible suggestion of calling the phone number on the box, which turns out to be a small local company, "Wildflower Productions", located in San Francisco.

Calling Wildflower, alas, results in an all too familiar runaround: a touchtone menu tree where no path results in the possibility of contact with a human. Sometimes I wonder why companies bother to list a phone number at all, when they obviously have no intention of letting anyone call in.

Concluding that the only way to find out was to buy one, I did so. A worthwhile experiment, as it turned out! The maps inside are simple GIF files, digitized from the USGS 7.5-minute series and, wonder of wonders, also from the discontinued but still useful 15-minute series. Each directory contains GIF files covering the area of one 7.5 minute map, in small .75-minute square pieces, including pieces of the 15-minute map covering the same area.

A few minutes of hacking with python and Image Magick resulted in a script to stitch together all images in one directory to make one full USGS 7.5 minute map; after a few hours of hacking, I can stitch a map of arbitrary size given start and end longitude and latitude. My initial scripts, such as they are.

Of course, I don't yet have nicities like a key, or an interactive scrolling window, or interpretation of the USGS digital elevation data. I expect I have more work to do. But for now, just being able to generate and print maps for a specific area is a huge boon, especially with all the mapping we're doing in Field Geology class. GIMP's "measure" tool will come in handy for measuring distances and angles!

Tags: , , ,
[ 11:13 Mar 27, 2005    More programming | permalink to this entry | comments ]

Syndicated on:
LinuxChix Live
Ubuntu Women
Women in Free Software
Graphics Planet
Ubuntu California
Planet Openbox
Planet LCA2009

Friends' Blogs:
Ups & Downs
DailyBBG
Long Live the Village Green
Dan Heller
Morris "Mojo" Jones
Jane Houston Jones

Other Blogs:
DevChix
Scott Adams
Dave Barry
BoingBoing (Cory Doctorow)
Young Female Scientist

Powered by PyBlosxom.