You are looking at archived content from my "Bookworm" blog, an experiment that ran from 2014-2016. Not all content may work. For current posts, see here.

Posts with tag politics

Hansard Dec 14 2015

A first pass at understanding the potential of the Hansard corpus through a Bookworm browser.

I’ve split the native XML into individual speeches, using the speaker tags already present in the markup as the boundaries.
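For anyone curious what that kind of split looks like, here is a minimal sketch in Python. It is not the code behind this browser, and the element names (`debate`, `speaker`, `p`) are assumptions made for illustration; the real Hansard schema is different and richer. The idea is just that each speaker tag opens a new speech and whatever follows belongs to it until the next speaker tag.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup: element names are assumptions,
# not the actual Hansard schema.
SAMPLE = """<debate>
  <speaker>Mr. Smith</speaker>
  <p>I rise to ask a question.</p>
  <p>Will the minister respond?</p>
  <speaker>Ms. Jones</speaker>
  <p>Yes.</p>
</debate>"""


def split_speeches(xml_text, speaker_tag="speaker"):
    """Start a new speech at each speaker tag and attach the text that follows."""
    speeches = []
    current = None
    for element in ET.fromstring(xml_text):  # direct children, in document order
        if element.tag == speaker_tag:
            current = {"speaker": (element.text or "").strip(), "text": []}
            speeches.append(current)
        elif current is not None:
            # Collect all text inside the element, including nested markup.
            current["text"].append("".join(element.itertext()).strip())
    # Join each speech's paragraphs into a single string.
    return [
        {"speaker": s["speaker"], "text": " ".join(s["text"])}
        for s in speeches
    ]


speeches = split_speeches(SAMPLE)
for speech in speeches:
    print(speech["speaker"], "|", speech["text"])
```

Anything that appears before the first speaker tag is dropped in this sketch; a real pipeline would probably want to keep it as procedural text.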

A “speech” can be very short; on average, each one in the Hansard corpus is 225 words.
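A figure like that average can be computed along these lines. This is a sketch under my own assumptions: words are counted by splitting on whitespace, and the three sample speeches are made up for illustration, so the number it produces has nothing to do with the 225-word figure.

```python
from statistics import mean


def mean_words(speech_texts):
    """Average number of whitespace-separated words per speech."""
    return mean(len(text.split()) for text in speech_texts)


# Made-up speeches, only to show the calculation.
sample_speeches = [
    "I rise to ask a question.",
    "Yes.",
    "Order, order.",
]

average = mean_words(sample_speeches)
print(average)
```

A mean hides how skewed the lengths are, so looking at the median or a histogram alongside it would give a better sense of how many speeches are only a word or two long.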