One of the closing lightning talks at PyCon this year concerned the answers to a list of Python programming puzzles given at some other point during the conference. I hadn't seen the questions (I'm still not sure where they are), but some of the problems looked fun.
One of them was: "What are the letters not used in Python keywords?"
I hadn't known about Python's keyword module, which could come in handy some day:
>>> import keyword
>>> keyword.kwlist
['and', 'as', 'assert', 'break', 'class', 'continue', 'def', 'del', 'elif', 'else', 'except', 'exec', 'finally', 'for', 'from', 'global', 'if', 'import', 'in', 'is', 'lambda', 'not', 'or', 'pass', 'print', 'raise', 'return', 'try', 'while', 'with', 'yield']
So, given the list of keywords, what's the best way to find the list of unique letters?
Any time you want a list of unique anything, you want a set:
>>> set([1, 2, 3, 2, 2, 4, 5, 1, 5])
set([1, 2, 3, 4, 5])
But first you need a list of letters so you can make a set out of it.
Split the list of words into a list of letters
My first idea was to use list comprehensions. You can split a single word into letters like this:
>>> [ x for x in 'hello' ]
['h', 'e', 'l', 'l', 'o']
It took a bit of fiddling to get the right syntax to apply that to every word in the list:
>>> [[c for c in w] for w in keyword.kwlist]
[['a', 'n', 'd'], ['a', 's'], ['a', 's', 's', 'e', 'r', 't'], ... ]
Update: Dave Foster points out that [list(w) for w in keyword.kwlist] is another way, simpler and cleaner than the double list comprehension.
That's a list of lists, so it needs to be flattened into a single list of letters before we can turn it into a set.
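Here's a quick check that the two spellings really do agree. It's only a sketch, written in Python 3 syntax (the sessions in this post are Python 2), and `words` is a short sample I made up to stand in for keyword.kwlist:

```python
# Short sample standing in for keyword.kwlist (an assumption for illustration).
words = ['and', 'as', 'assert']

# Double list comprehension: walk each word character by character.
nested = [[c for c in w] for w in words]

# Dave Foster's version: list() already splits a string into characters.
simpler = [list(w) for w in words]

print(nested)             # [['a', 'n', 'd'], ['a', 's'], ['a', 's', 's', 'e', 'r', 't']]
print(nested == simpler)  # True
```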
Flatten the list of lists
There are lots of ways to flatten a list of lists. Here are four of them:
[item for sublist in [[c for c in w] for w in keyword.kwlist] for item in sublist]
reduce(lambda x,y: x+y, [[c for c in w] for w in keyword.kwlist])
import itertools
list(itertools.chain.from_iterable([[c for c in w] for w in keyword.kwlist]))
sum([[c for c in w] for w in keyword.kwlist], [])
That last one, using sum(), makes use of the fact that Python uses + for list concatenation -- in other words, that [1, 2, 3] + [4, 5, 6] is [1, 2, 3, 4, 5, 6].
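To convince yourself that all four flatteners give the same answer, something like the following should work. It's a sketch in Python 3 syntax, where reduce has moved into functools, and the three sample words are arbitrary:

```python
from functools import reduce   # reduce is no longer a builtin in Python 3
import itertools

# A small list of lists to flatten (sample words chosen arbitrarily).
nested = [list(w) for w in ['and', 'as', 'del']]

flat_comprehension = [item for sublist in nested for item in sublist]
flat_reduce = reduce(lambda x, y: x + y, nested)
flat_chain = list(itertools.chain.from_iterable(nested))
flat_sum = sum(nested, [])     # the [] start value makes sum() concatenate lists

print(flat_comprehension)      # ['a', 'n', 'd', 'a', 's', 'd', 'e', 'l']
print(flat_comprehension == flat_reduce == flat_chain == flat_sum)  # True
```

Without the empty-list start value, sum() starts from 0 and fails trying to add an integer to a list.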
But the first method (item for sublist in) is faster: see the StackOverflow thread "Making a flat list out of list of lists in Python". Another StackOverflow thread shows how to plot speed vs. list size for various flatteners.
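If you'd rather measure than take anyone's word for it, the standard timeit module is enough for a rough comparison. This is my own minimal sketch, not the code from either thread; the function names and the test data size are made up, and the numbers will differ from machine to machine:

```python
import itertools
import timeit

# Hypothetical test data: 1000 short sublists (sizes chosen arbitrarily).
nested = [list('keyword') for _ in range(1000)]

def flatten_comprehension(lists):
    return [item for sublist in lists for item in sublist]

def flatten_chain(lists):
    return list(itertools.chain.from_iterable(lists))

def flatten_sum(lists):
    return sum(lists, [])

timings = {}
for func in (flatten_comprehension, flatten_chain, flatten_sum):
    # Total seconds for 20 calls; absolute numbers depend on the machine.
    timings[func.__name__] = timeit.timeit(lambda: func(nested), number=20)

# Print fastest first.
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print('%-22s %.4f s' % (name, seconds))
```

Try a few different list sizes before drawing conclusions, since the ranking can depend on how big the input is.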
A simpler way of making the set
But it turns out none of this list comprehension stuff is needed anyway.
set('word') splits words into letters already:
>>> set('bubble')
set(['e', 'b', 'u', 'l'])
Ignore the order -- elements of a set often end up displaying in some strange order. The important thing is that it has all the letters and no repeats.
Now we have an easy way of making a set containing the letters in one word. But how do we apply that to a list of words?
Again I initially tried using list comprehensions, then realized there's an easier way. Given a list of strings, it's trivial to join them into a single string using ''.join(). And that gives us our set of letters within keywords:
>>> set(''.join(keyword.kwlist))
set(['a', 'c', 'b', 'e', 'd', 'g', 'f', 'i', 'h', 'k', 'm', 'l', 'o', 'n', 'p', 's', 'r', 'u', 't', 'w', 'y', 'x'])
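The join trick isn't the only route to the same set. Here's a sketch, in Python 3 syntax with two sample words of my own, comparing it with set().union() and a set comprehension. Wrapping the result in sorted() sidesteps the strange display order:

```python
# Two sample words (an assumption for illustration, not the keyword list).
words = ['bubble', 'bobble']

via_join = set(''.join(words))                      # glue the words together, then split
via_union = set().union(*words)                     # union() accepts any iterables, strings included
via_comprehension = {c for w in words for c in w}   # set comprehension

print(sorted(via_join))                             # ['b', 'e', 'l', 'o', 'u']
print(via_join == via_union == via_comprehension)   # True
```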
What letters are not in the set?
Almost done! But the original problem was to find the letters not in keywords. We can do that by subtracting this set from the set of all letters from a to z. How do we get that? The string module will give us the letters as a string:
>>> import string
>>> string.lowercase
'abcdefghijklmnopqrstuvwxyz'
You could also use a list comprehension with ord and chr (range won't give you a range of letters directly):
>>> [chr(i) for i in range(ord('a'), ord('z')+1)]
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
It's a bit longer, but doesn't require an import.
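One portability note, with a small sketch to go with it: string.lowercase is locale-dependent in Python 2 and no longer exists in Python 3, while string.ascii_lowercase is available in both. The two ways of building the alphabet agree:

```python
import string

# ascii_lowercase exists in both Python 2 and Python 3.
from_module = string.ascii_lowercase

# The import-free version: character codes from 'a' through 'z' inclusive.
from_range = [chr(i) for i in range(ord('a'), ord('z') + 1)]

print(len(from_range))                      # 26
print(''.join(from_range) == from_module)   # True
```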
Now that you have your a-z set, just subtract the two sets:
>>> set(string.lowercase[:]) - set(''.join(keyword.kwlist))
set(['q', 'j', 'z', 'v'])
So the only letters not used in Python keywords are q, j, z and v.
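Here's the whole thing pulled together as a sketch that should run on Python 3. The helper name unused_letters is my own invention, and PY2_KEYWORDS is just the Python 2 list printed at the top of this post, copied by hand, since keyword.kwlist differs between Python versions:

```python
import keyword
import string

# The Python 2 keyword list exactly as printed earlier in this post.
PY2_KEYWORDS = ['and', 'as', 'assert', 'break', 'class', 'continue', 'def',
                'del', 'elif', 'else', 'except', 'exec', 'finally', 'for',
                'from', 'global', 'if', 'import', 'in', 'is', 'lambda', 'not',
                'or', 'pass', 'print', 'raise', 'return', 'try', 'while',
                'with', 'yield']

def unused_letters(words):
    """Return the lowercase ASCII letters that appear in none of the words."""
    # lower() matters on Python 3, whose keywords include True, False and None.
    used = set(''.join(words).lower())
    return set(string.ascii_lowercase) - used

print(sorted(unused_letters(PY2_KEYWORDS)))   # ['j', 'q', 'v', 'z']

# The live list for whatever interpreter runs this; it varies by version,
# so no expected output is shown.
print(sorted(unused_letters(keyword.kwlist)))
```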
Just a useless little ditty, really ... but I thought it was a fun exercise, so maybe you will too.
[ 12:36 Mar 19, 2013 ]