Importing Cookies from a Firefox Profile in Python (Shallow Thoughts)

Akkana's Musings on Open Source Computing and Technology, Science, and Nature.

Fri, 03 Dec 2021

Importing Cookies from a Firefox Profile in Python

I wrote at length about my explorations into selenium to fetch stories from the New York Times (as a subscriber). But I mentioned in Part III that there was a much easier way to fetch those stories, as long as the stories didn't need JavaScript.

That way is to use normal file fetching (using urllib or requests), but with a CookieJar object containing the cookies from a Firefox session where I'd logged in.

FeedMe was already using a CookieJar, since some sites die or go into infinite loops if they can't set cookies. The CookieJar started out empty and just let each site write cookies as it saw fit.

from http.cookiejar import CookieJar
import urllib.request, urllib.error, urllib.parse

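cookiejar = CookieJar()

# Every request made through this opener reads and updates cookiejar
opener = urllib.request.build_opener(
    urllib.request.HTTPCookieProcessor(cookiejar))
request = urllib.request.Request(url)
response = opener.open(request, timeout=100)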

FeedMe uses the built-in urllib rather than requests, because the code is old, and since urllib works fine, I've never gotten around to rewriting it. But it's even easier with requests:

response = requests.get(url, cookies=cookiejar)

That just left importing cookies from a Mozilla profile.

http.cookiejar includes a class called MozillaCookieJar. So it sounds like the functionality is already there, right?

Well, no. From the http.cookiejar documentation:

class http.cookiejar.MozillaCookieJar(filename, delayload=None, policy=None)

A FileCookieJar that can load from and save cookies to disk in the Mozilla cookies.txt file format (which is also used by the Lynx and Netscape browsers).
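To make the old format concrete, here is a minimal sketch of MozillaCookieJar loading a cookies.txt file. The cookie itself (domain, name, value, expiry) is made up for illustration, and load_cookies_txt is a hypothetical helper name, not something from FeedMe.

```python
import os
import tempfile
from http.cookiejar import MozillaCookieJar

# Hypothetical cookie, for illustration only. After the header line,
# each line is tab-separated:
# domain, domain_specified, path, secure, expiry, name, value
COOKIES_TXT = (
    "# Netscape HTTP Cookie File\n"
    ".example.com\tTRUE\t/\tFALSE\t4102444800\tsession\tabc123\n"
)


def load_cookies_txt(text):
    """Write text to a temporary cookies.txt and load it into a jar."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "cookies.txt")
        with open(path, "w") as fp:
            fp.write(text)
        jar = MozillaCookieJar(path)
        # Keep session cookies and expired cookies too
        jar.load(ignore_discard=True, ignore_expires=True)
    return jar


jar = load_cookies_txt(COOKIES_TXT)
for cookie in jar:
    print(cookie.domain, cookie.name, cookie.value)
```

The resulting jar can be handed to HTTPCookieProcessor or to requests just like the plain CookieJar.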

Firefox stopped using the cookies.txt format around 2008, as best I can determine, when they switched to using cookies.sqlite instead. There was a bug on MozillaCookieJar filed back then on the issue, with a patch, but the bug was rejected because the Python 2.6/3.0 release was about to happen, and the bug was closed at that time rather than merely being postponed. I filed a new bug hoping to re-raise the issue.

But meanwhile, the only way to use a MozillaCookieJar is to write code to read the sqlite file and translate it to the old cookies.txt format. The best code I've found for doing that comes from a 2009 blog post, "Reading Firefox 3.x cookies in Python", which I found via a StackOverflow thread, "Accessing Firefox 3 cookies in Python". The code is in both places, so I needn't repeat it here.

The method is a little squinchy, using a StringIO to emulate a cookies.txt file, but it works fine, at least until such time as someone sees fit to replace the MozillaCookieJar code, now almost 15 years out of date, with something that actually works.
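For readers who want the general shape of the StringIO approach, here is a rough sketch. It is not the code from the linked posts. It assumes the profile's cookies.sqlite has a moz_cookies table with host, path, isSecure, expiry, name and value columns, and that expiry is in seconds; check your own profile, since the schema can change between Firefox versions. The function name firefox_cookiejar is made up, and _really_load is a private method of MozillaCookieJar, so it is not a documented interface and could change.

```python
import sqlite3
from http.cookiejar import MozillaCookieJar
from io import StringIO


def firefox_cookiejar(sqlite_path):
    """Return a MozillaCookieJar filled from a Firefox cookies.sqlite file."""
    # Assumed schema: a moz_cookies table with these six columns.
    con = sqlite3.connect(sqlite_path)
    try:
        rows = con.execute(
            "SELECT host, path, isSecure, expiry, name, value FROM moz_cookies"
        ).fetchall()
    finally:
        con.close()

    # Build an in-memory cookies.txt: the magic header line, then one
    # tab-separated line per cookie.
    fake_file = StringIO()
    fake_file.write("# Netscape HTTP Cookie File\n")
    for host, path, is_secure, expiry, name, value in rows:
        fields = (
            host,
            "TRUE" if host.startswith(".") else "FALSE",  # domain_specified
            path,
            "TRUE" if is_secure else "FALSE",
            str(expiry),
            name,
            value,
        )
        fake_file.write("\t".join(fields) + "\n")
    fake_file.seek(0)

    jar = MozillaCookieJar()
    # Private method: parses the file-like object as cookies.txt.
    # The two True arguments are ignore_discard and ignore_expires.
    jar._really_load(fake_file, sqlite_path, True, True)
    return jar
```

Firefox may hold a lock on cookies.sqlite while it is running, so it is probably safest to point this at a copy of the file. The returned jar can then be used in place of the empty CookieJar, with either urllib or requests.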

