Mining the Minds of Congress
August 3rd, 2006 by Josh
What does Congress really “care” about? Some political scientists tried to answer that question using some rather innovative means: they data-mined the text of the Congressional Record (over 70 million words in over 70,000 documents).
The Congressional Record contains verbatim transcripts of all speeches on the floor of the House and the Senate. The team tried to determine the hot topics in Congress, and how they change over time, based on how much they are talked about. What’s so special about that?
What’s exciting about this project and others like it is that computers are at last capable of unsupervised, dynamic analysis, and they can produce meaningful results with little or no intervention (humans will still be required to interpret the results, of course). The researchers in this project turned their software loose on 70 million words of Congressional debate without doing any initial topic coding.
The computer was able to group speeches into topics even when those speeches did not feature the usual keywords. In one example, a speech that contained the words “terrorism,” “medical,” and “psychological” was correctly lumped in with other education speeches, even though those terms are not normally found in an education speech (a traditional keyword search would fail this test). Once the computer had done its statistical analysis and grouped the speeches into topic clusters, the researchers looked at a few speeches from each cluster and assigned it a name (“education” or “terrorism,” for instance). Once that was done, interesting questions could be answered.
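To make the idea a bit more concrete, here is a toy sketch of unsupervised grouping in Python. To be clear, this is not the researchers' model (I don't know the details of their statistics); it swaps in a much simpler technique, word-count vectors compared by cosine similarity with a greedy one-pass clustering. The four "speeches," the `cluster` function, and the 0.3 threshold are all made up for illustration.

```python
import math
from collections import Counter

# Hypothetical mini "speeches"; real floor speeches are far longer.
SPEECHES = [
    "schools teachers students funding classroom",
    "terrorism security attacks homeland threat",
    "students teachers counseling medical psychological terrorism schools classroom",
    "homeland security threat border attacks",
]

def vectorize(text):
    """Turn a speech into a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def cluster(speeches, threshold=0.3):
    """Greedy one-pass clustering (an assumed stand-in technique).

    Each speech joins the existing cluster whose summed word counts it
    most resembles, or starts a new cluster if nothing reaches the
    (arbitrary) similarity threshold. No topic labels are supplied.
    """
    clusters = []  # each entry: {"centroid": Counter, "members": [indices]}
    for i, text in enumerate(speeches):
        vec = vectorize(text)
        best, best_sim = None, threshold
        for c in clusters:
            sim = cosine(vec, c["centroid"])
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append({"centroid": Counter(vec), "members": [i]})
        else:
            best["centroid"].update(vec)  # add this speech's counts to the cluster
            best["members"].append(i)
    return [c["members"] for c in clusters]

groups = cluster(SPEECHES)
print(groups)
```

In this toy data, the third speech mentions “terrorism” but shares most of its vocabulary with the first one, so it lands in the same group as the schools speech rather than with the security speeches. A human would still have to look at each group and decide what to call it.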
Among their findings: talk about “judicial nominations” increased steadily from 1997 to 2004, whereas attention to “abortion” decreased, going from about 5% of floor time in the 105th Congress to 1% in the 108th (we’re currently in the 109th).
I haven’t read the report, but it looks interesting. It’s definitely not light reading, though. . . much of the math is beyond what I can follow. Still, being the computer geek that I am, I think this is pretty cool.

