Welcome to the Project Tango blog. At the moment the project is in its infant stages, so bear with us.

Tango began as a series of conversations between the NINES group and other parties around the issue of access to out-of-print scholarly works copyrighted before the advent of the internet, but also about the future of books in general. The name for the project came from a conversation between Jerome McGann and Madelyn Wessel, our resident copyright expert. Publishers and scholars must learn how to tango together, quipped Wessel, and the rest brings us here.

I joined NINES in the summer as one of their fellows, along with Annie Swafford and Michael Pickard, and we were immediately recruited to the Tango project. At the time, Jerome McGann, Andrew Stauffer and Dana Wheeles (@bluesaepe) were in the thick of brainstorming adequate solutions for these out-of-print scholarly works around the usual suspects: production, stewardship and copyright. In the absence of an umbrella institution that could coordinate these issues, the main question was how to resolve the problems in a way that would not depend on such an institution, but that would still revolve around a collectivity. What you see here is the result of our continued conversations, and we offer it to the public with a healthy dose of both skepticism and drive. We encourage you to join our conversation.

Storage and Stewardship: The idea is for these texts to be made available to anyone with internet access, free of charge and in their entirety (no need to replicate Google Books, after all). There are many possibilities here, and we are exploring most of them fearlessly. There’s HathiTrust, various university repositories, Open Library, an open subset of JSTOR, et al. This stage of the project we feel safe to punt on for now, because we must make sure that copyright and production can be taken care of before we go too far out to sea.

Copyright: To rescue these texts from the copyright limbo they inhabit, made evident by the Google Books affair, we would send a standardized letter to authors with a template attached meant for the publishers. Because contracts signed before the Internet had no proviso for digital publication, we figured a simple addendum in the form of a written agreement between author and publisher would suffice. The allure for the authors is immediate, since their often neglected works would once more reach their community of interest.

For publishers it may not be a bad deal either, since we would add value in the form of proofing and metadata, while reserving for them any profits to be made (from, say, a print-on-demand venture) by adding an exception to the typical Creative Commons attribution, share-alike, non-commercial license. Our target publishers would be small-to-medium-sized presses that need help transitioning to the web. In the beginning, the selection of works will necessarily have to come from the scholars themselves, prioritizing in a sense the rescue of works that are pertinent to scholarship today. Nevertheless, we imagine the possibility of a clearinghouse for pre-established lists. Because we are talking about atomized copyright retrieval and, as you will soon see, atomized production, in the end this would only work if the model we are offering spreads across the land, and if that monadic work is joined at the hip to a collectivity of the sort that NINES exemplifies. Which brings me to the most exciting and perhaps the most daring part of the project.


The product so far is simple and familiar: a PDF image with well-proofed, searchable text behind it and companion RDF. Mass production is not so simple. In order to generate enough PDFs of out-of-print works to make the model an efficient way to publish in the long run, the solution had to involve crowd-sourcing of some sort, and this is where McGann had one of those moments he has. We could kill two birds with one stone, he suggested, by using the model to teach undergraduate English majors about digital humanities and textual criticism. I must confess he had me at “teach undergraduate English majors the basics of digital humanities and textual criticism.” And to put a cherry on top, he added, the model should also involve a graduate student (or two) to run digital workshops, effectively involving all levels of the academic food chain in the larger project of digital humanities. While the professor can go about his business, teaching lit as best suits his fancy, the teaching assistant(s) in the workshops can focus on DH related to the materials. (Jerry’s class is pursuing interpretational method via a performative, recitation-based approach to the study of poetic form and meaning.) I like to think of the workshops as PHYS lab for lit-heads, minus the 1 extra credit-hour.

As we talked more and more about the pedagogical model, and in no small part thanks to the insight of Bethany Nowviskie (@nowviskie), we saw that the production of PDFs would have to make room for the digitization, mark-up and digital analysis of primary sources. With these added techniques the students could get a more complete picture of what it is that we do around here, while broadening their understanding of the readings. If done right, we are offering the next generation an early start on the ongoing mass migration of paper to bytes that another generation of scholars began in the 90s.

We are launching our alpha model in Prof. McGann’s ENAM 4500, English and American Poetry of the Nineteenth Century class this semester. About a week ago, Chris Forster (@cforster) joined the team, and together we will be running these workshops and writing this blog to serve as both documentation and testament to this adventure. You should be hearing from him soon enough. Together with the blog entries, we will post all materials and tutorials used for the class under a Creative Commons license, so feel free to use them as the spirit moves you. In the meantime, let me briefly describe the digital component of the class as it stands.

Digital workshops will be conducted outside of class. Students will be required to come to group workshops for 1 hour a week, which will alternate between instruction and discussion. For another 4 hours a week, the graduate assistants (Chris and I in this case) will provide tech support in the form of office hours at the Scholars Lab. We initially estimate that students will be working on the workshop an average of 3 hours outside of class every week. You can read the schedule for the digital workshop, which in essence lists the digital skills we will be working with, here.

The class will be divided into 2 student projects corresponding to the midterm and the final. Project alpha is the PDF of the out-of-print scholarly work. Here is where they learn the production of high-end, well-proofed PDFs of secondary sources. The texts for this exercise were chosen carefully to harmonize with the content of the seminar, and students will be reading from them as well as reproducing them. Because the proof will be on the proofing, this exercise also allows for the teaching of matters that have fallen under the purview of traditional textual criticism.

The second project is the production of a digital variorum of a small primary source, in this case of a couple of poems by Edgar Allan Poe. In this half, students will get their first introduction to mark-up in the form of a reduced version of TEI. You may have already seen the original TEI handout that I produced while Annie, Mike and I tested out the digitization model during the summer break. Since the original handout was written at a time when we were envisioning the students working only with secondary sources, we will adapt it for primary sources when the time comes.

After the different instantiations of the poems are marked up properly, the students will then be required to run them through the new version of Juxta, which is about to be released with spanking new TEI functionality. We expect to teach students not only how to use the software effectively but also how to draw stemmatic conclusions from the comparisons, once again linking the digital component to the practice of textual scholarship.
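For readers who have never seen a collation, here is a minimal sketch of the basic idea in Python. It is not how Juxta works internally, and it ignores TEI entirely: it simply uses `difflib` from the Python standard library to line up two plain-text witnesses word by word and report where they disagree. The function name `collate` and the two sample sentences are made up for illustration; they are placeholders, not readings from Poe.

```python
from difflib import SequenceMatcher


def collate(witness_a, witness_b):
    """Return (reading_a, reading_b) pairs wherever two witnesses differ.

    Hypothetical helper for illustration only: it compares whitespace-separated
    words and knows nothing about punctuation, lineation or mark-up.
    """
    words_a = witness_a.split()
    words_b = witness_b.split()
    # autojunk=False keeps difflib from discarding very frequent words
    matcher = SequenceMatcher(a=words_a, b=words_b, autojunk=False)
    variants = []
    for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if tag != "equal":  # "replace", "delete" or "insert"
            reading_a = " ".join(words_a[a_start:a_end])
            reading_b = " ".join(words_b[b_start:b_end])
            variants.append((reading_a, reading_b))
    return variants


# Placeholder witnesses (invented sentences, not real textual variants)
witness_a = "the quick brown fox jumps over the lazy dog"
witness_b = "the quick red fox leaps over the lazy dog"

for reading_a, reading_b in collate(witness_a, witness_b):
    print(f"A reads {reading_a!r} where B reads {reading_b!r}")
```

A list of variants like this is only the raw material. Deciding which witness descends from which is the stemmatic work the students will still have to do themselves.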

So here we are at the University of Virginia, at the beginning of the semester of Prof. McGann’s ENAM 4500, American and British poetry in the XIXth Century, and we are about to embark on a fascinating pedagogical/publishing experiment. A few days ago was the first day of class. We have a group of 8 students that we’ve divided into two teams of 4 each. Each team has a leader who is in charge of coordinating the group’s activities and who reports to us. Each team also has a flash drive to help them coordinate their files. As you can see, next week we begin with the ever-dreaded, but oh-so-necessary scanning. Please join us then, as we document the plight of 20-somethings as they battle against the 300dpi TIFF monster and the gargantuan scanners of the Scholars Lab.


Alex Gil is Digital Scholarship Coordinator at Columbia University for the Humanities and History division. His research revolves around otr-American literatures and culture, digital humanities and critical theory. His dissertation traced the miraculous trajectory of Aimé Césaire's "Et les chiens se taisaient" in the fields of legology.
