The Library of Congress Loves Every Kind of Tweet, but It Can’t Search-Optimize Every Tweet
by Susana Polo | 4:19 pm, January 4th, 2013
Back in 2010, the U.S. Library of Congress announced that it would be archiving every public tweet made since 2006. They’re back again today, to say “Yeah, so. You guys tweet, like, a lot.”
In the nearly three years since announcing the initiative, the Library of Congress has actually managed to create its archive of every public tweet between 2006 and 2010, and has developed a stable system for ingesting everything that comes out of Twitter and saving it to its servers. That means that these days, it’s taking in about half a billion tweets per day. So what’s the problem, you ask?
Well, the Library of Congress, being a place where people do research, doesn’t feel like the archive is really up to its standards yet. Primarily because performing one search query on it can take about twenty-four hours to complete. From their announcement:
The Library has assessed existing software and hardware solutions that divide and simultaneously search large data sets to reduce search time, so-called “distributed and parallel computing.” To achieve a significant reduction of search time, however, would require an extensive infrastructure of hundreds if not thousands of servers. This is cost-prohibitive and impractical for a public institution.
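For the curious, the basic idea behind “distributed and parallel” search is simple enough to sketch in a few lines of Python. This is a toy with three made-up “servers” holding a handful of fake tweets, not anything resembling the Library’s actual setup: each shard is scanned at the same time, and the hits are merged at the end.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical mini-archive: each "server" holds one shard of tweets.
shards = [
    ["the doctor regenerates", "end of time spoilers"],
    ["library of congress", "archiving every tweet"],
    ["half a billion tweets a day", "search takes 24 hours"],
]

def search_shard(shard, term):
    """Scan a single shard for tweets containing the term."""
    return [tweet for tweet in shard if term in tweet]

def parallel_search(term):
    """Fan the query out to every shard at once, then merge the hits."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        per_shard_hits = pool.map(search_shard, shards, [term] * len(shards))
    return [hit for hits in per_shard_hits for hit in hits]

print(parallel_search("tweet"))
# → ['archiving every tweet', 'half a billion tweets a day']
```

With three tiny lists the parallelism buys nothing, which is rather the Library’s point: to make this trick meaningfully faster on hundreds of billions of tweets, you need hundreds or thousands of real machines, each holding its own shard.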
The Library’s ultimate goal is to create an archive that offers “free, indexed, and searchable access” to legislative researchers and scholars, and the fact is that the technology simply isn’t there yet. But the Library is working on it, with folks from Twitter and Gnip, the company that collects their tweets for them, and with researchers themselves.
But until they work things out, all that embarrassing stuff you tweeted while watching “The End of Time” and hope no one ever finds again is safe.