Wow the state-of-the-art for parsing HTML with the Python standard library is... pretty far back, huh

@xor I'd just use BeautifulSoup4 with the lxml backend

@divergentdave as a challenge I'm trying to stick to the standard library and hoo boy it is way less good

(I got it to work with regex but like...)
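
For reference, a minimal sketch of the non-regex standard-library route, using `html.parser.HTMLParser`. The thread doesn't say what was being extracted, so this assumes the goal is collecting link targets; `LinkCollector` and `extract_links` are made-up names for illustration, not anything from the thread.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "a":
            for name, value in attrs:
                # value is None for attributes written without a value.
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html_text):
    """Return the href of every <a> tag, in document order."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links
```

Worth noting that `html.parser` is an event-based tokenizer, not a tree builder following the HTML5 parsing algorithm, so it won't repair malformed markup the way a browser would.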


@xor (extremely xkcd voice) the HTML5 spec, with its standardization of parsing, postdates Python 3 by six years!
