A little over a year and a half ago I posted something of an angry diatribe against the Harry Ransom Center and the academic community in general regarding the loss of my job and the consequent end of the Fragments Project. Now that I’ve had some time to ruminate on that experience from the vantage point of a public servant working outside the ivory tower, I want to offer a softer interpretation.

I do not hold the director of the Ransom Center, nor any of its staff, nor anyone else involved in mentoring or employing me during the 7 years in which I endeavored to find a permanent position in the library world, responsible for the course of my life or the trajectory of the Project. We are all struggling to do the best that we can, and too often those who are in leadership have to make difficult decisions for the sake of the future of an institution that can have tragic consequences for a few. Mentors offer advice that may have served them well when they were moving up in the world, without realizing that changing times present new and different challenges. The Fragments Project was fun, educational, and served to increase awareness and interest in a relatively obscure but important topic. I would hate for that legacy to be marred by the resentfulness of its creator.

So think big, go big, and don’t forget to watch out for the switchbacks. Godspeed to the crowdsourcers.

Why the Medieval Fragments Project No Longer Exists: and when crowdsourcing doesn’t work.

I regret to inform you that the Medieval Fragments Project (formerly known as the Ransom Fragments Project) has been officially terminated. In February of 2015, after four years of successful and dedicated labor in which I completed 6 major projects within the allotted time and budgetary constraints, the current director of the Harry Ransom Center chose not to renew my contract as an archivist with the Harry Ransom Center. During those four years of contract employment I took the personal initiative to direct the Medieval Fragments Project on a volunteer basis. My efforts included hundreds of hours of searching, transcribing, translating, writing, editing, coding, coordinating, and collaborating, not to mention traveling across the country to present my findings to the archives, rare book, and manuscript community and promoting the project via social media networks. I conducted this work while diligently and successfully completing my paid duties as assigned in my contracts (my paid duties included arranging and describing 20th-century literary collections).

To be “thrown out into the cold” by the very institution which praised and allegedly supported my efforts as a scholar and an archivist and to realize that the road to gainful employment at another institution would take months of fruitless labor, unemployment, and isolation was not only professionally humiliating but also financially and emotionally traumatic for me, my spouse, and my 3-year-old daughter. Based upon six years of post-graduate education, my experience with the Ransom Center, and my seven-year quest to find permanent employment in the cultural heritage “industry,” I have concluded that universities do not operate by a merit-based system, but rather by a form of elitist inbreeding wherein only a handful of privileged individuals are hand-picked through political intrigue and spoon-fed to glory by a handful of privileged individuals who were hand-picked through political intrigue and spoon-fed to glory.

Consequently, on April 13th I chose to abandon the cultural heritage profession (as it abandoned me) and am now working as a salaried, permanent employee of the Health and Human Services Commission of Texas. Because I can no longer maintain any curatorial control over the fate of the fragments and because I cannot afford to expend any more time and effort managing the Fragments Project, I made the painful decision to permanently deactivate the Flickr account containing the images and associated metadata. I extend my sincerest apologies to all of you who took time to assist in the identification of those medieval fragments. I would argue that the Harry Ransom Center owes all of you an apology as well. Lest anyone accuse me of selfish destruction, I am letting the public know here and now that all essential information regarding the project and the identified fragments can be found in a written report and catalog. Copies of this report and catalog are available at the Harry Ransom Center. I recommend contacting Olivia Primanis, head book conservator at the HRC, if you wish to conduct further research. A description of the project along with a catalog of the fragments will be published in the journal Manuscripta in 2016.

Advocates of crowdsourcing should learn the following things from my experience:

  • Crowdsourcing initiatives should have the full and official support of an established institution if they are to survive in the long run.
  • Crowdsourcing initiatives should never rely entirely on volunteer labor. The people who direct the project should have a financial stake in the enterprise.
  • If a crowdsourcing project is to exist entirely on volunteer labor, then the subject of the initiative should not be under the curatorship of a cultural heritage institution.

It is my sincerest desire that my readers will take a moment in their busy lives to carefully and compassionately reexamine what it means to be a scholar and a keeper of our cultural heritage. I hope that this reexamination will shed light upon the inequalities and exploitative practices that are deeply imbedded in the life, actions, and philosophy of the academic and cultural heritage community.

Crowdsourcing the Arcane: Utilizing Flickr (and Google) to Describe Medieval Manuscript Fragments

The Harry Ransom Center has been conducting a survey of medieval manuscript fragments as binder’s waste in their book collection since 2011. In 2012 we began crowdsourcing the identification of those fragments. Below is the transcript of a recent talk I gave on the project at the Society of Southwest Archivists Annual Meeting in Austin. Please note that some of the stats have changed since the talk was given.


“Reduce, Re-Use, Recycle. Any responsible modern citizen is familiar with this concept. The idea is not new.


As handwritten books of the Middle Ages became outdated, bookbinders of the early modern period would cut out their sturdy parchment leaves and recycle those leaves as covers, pastedowns, spine-linings or gathering reinforcements in new, “cutting edge” printed books. The practice of cutting up unwanted and outdated manuscripts in order to re-use their parchment leaves as binding waste lasted until around 1650 when the sources finally began to dry up. Today a large number of medieval manuscripts are dispersed throughout the world’s great libraries within the bindings of early printed books.


Historically speaking, medieval manuscript fragments bound into books have received less descriptive attention than complete manuscript codices. Catalogers may note them as “manuscript waste” in the MARC record for the book in which they are bound, but that is typically all. Institutions rarely have the time or resources to provide transcriptions, highly detailed descriptions, and other extensive metadata. But since many collections of complete medieval codices have now been digitized, scholars are increasingly turning their attention to what I like to call the “orphans of the manuscript world.”

A little over two years ago the Harry Ransom Center began a survey of medieval manuscript waste in the printed book collection.  Given my background in manuscript studies I was obviously excited about the project. But identifying the texts on these fragments can be challenging even for the most skilled medieval manuscript specialist and I had a number of other projects which took immediate priority over this one.


I was familiar with the successes of several institutional crowdsourced transcription projects and the collaborative identification of historic photographs on Flickr by amateur enthusiasts.

And it was the latter that got me wondering if something similar might be done with medieval binder’s waste. Although the work is highly specialized, we figured surely there were enough folks out there with some skill in Latin paleography that we might actually be able to identify most of these objects relatively quickly. Sure, we had the option of adopting vague descriptive language or simply not identifying any of the really difficult texts, but I was tantalized by the possibilities of crowdsourcing and collaborative description and this seemed like a good excuse to try it out.


Creation of Flickr account

Having received approval from the proper authorities, early in June of 2012, we began taking images with a point-and-shoot camera (and later a smartphone with an 8-megapixel camera) and posting them on a Flickr pro account.


Creation of Facebook account

We also created a Facebook account with the same name and a banner image of one of our more attractive fragments so that we could post notifications about new images, interesting problems, and project milestones for followers. We currently have 131 “likes” on Facebook. I’ll admit that at least a third of those are probably my friends. The rest tend to be from Europe.


Creation of Twitter account

I initially resisted Twitter since I was the one managing the accounts and it seemed a bit overwhelming, but after presenting our work at a conference in late October of 2012, I was convinced by several prominent manuscript scholars to go for it. In order to help unify our presence online, we adopted an icon of an illuminated initial to use across all accounts. We now have around 241 Twitter followers. The way we got our followers was by following libraries with large manuscript collections and medievalists with lots of followers! We have noticed a direct correlation between Twitter posts and increased views on our site. There’s no question that you will see increased traffic if you post photos regularly on Twitter and Facebook.

Initial exploratory phase

Once we had 34 images posted on Flickr we made an announcement on the rare-book list-serve Ex-Libris. Our Flickr site received 659 views and 4 potential identifications in the first couple of days. The first serious increase in traffic came when the Ransom Center ran a story about our project on its blog Cultural Compass late in July which brought us almost immediately up to 2,422 total views and around 16 different contributions. The next big jump in traffic came in mid-August 2012 after posting an announcement to the Early Book Society list-serve which brought us to 6,828 all-time views and several more contributions.


SLU conference

By the time we finished posting images of all 79 known fragments on October 5th, 2012, the site had received over 14,000 views and 21 of the fragments had been identified.

Intensification of search

After finding and posting images of all known fragments we conducted a shelf-search through our minimally cataloged collections of books printed before 1700 and managed to increase the total number of distinct fragments in bindings from 79 to 128.


Creation of sets

Using the “sets” feature on Flickr we arranged our photos into groupings of images of fragments from individual bindings. So although there are now 119 sets, some of those sets include fragments that originate from more than one manuscript.



Once we had all the sets created we arranged them by the call number of the books in which the fragments were bound. We also used the “Collections” feature, which allows you to arrange sets into groups, to create a few curated sets of photos organized by different types of binder’s waste. Really the possibilities are endless.


Anatomy of a Flickr page

Here’s an example of a typical page. Flickr underwent a major site redesign just this past Tuesday and the images are now displayed in their original resolution which is fantastic for viewing small scripts.

Each photo has a title with the call number of the book and the physical location of the fragment within the book.

Metadata for the photo is included just below the image and it includes just the basic information about the fragment so that viewers can get a sense of what we do and do not know.

We also utilize the tag feature and basically use uncontrolled vocabulary. More tags generally mean more views, though of course it’s no guarantee.

Below that is a statement about the project and a link to the MARC record for the book. We are also creating links within the MARC record to the Flickr image so that it’s a two-way street for traffic. I’m hoping that there’s a way for us to ultimately track traffic from Flickr to our Online Public Access Catalog via these links.

And finally, below this metadata field are viewer comments.



So to summarize our results: As of yesterday our Flickr site has received over 26,445 views since June of 2012. 122 out of 359 images have received comments.

Viewers have potentially identified 71 out of 128 distinct texts. 15 fragments are not really identifiable due to loss of text or lack of visibility. And I have identified 22 fragments myself thus far.


This means that (assuming the rest of the identifications are verified) 93 out of 113 identifiable fragments have been identified since June of 2012 and 62% of these have occurred through crowdsourcing.
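The arithmetic behind these totals can be checked with a short sketch. The counts are the ones quoted in the talk; the variable names are illustrative, and the sketch assumes that the 62% figure is the share of identifiable fragments identified by viewers (71 of 113, roughly 62.8%) rather than the share of all identifications made so far.

```python
# Counts quoted in the talk (they may have changed since it was given).
total_fragments = 128    # distinct fragments found in bindings
unidentifiable = 15      # text lost or not visible
crowd_identified = 71    # potential identifications made by viewers
self_identified = 22     # identifications made by the author

identifiable = total_fragments - unidentifiable
identified = crowd_identified + self_identified

# Assumption: the crowdsourced percentage is taken over identifiable fragments.
crowd_share_of_identifiable = crowd_identified / identifiable
# Alternative reading: the crowd's share of all identifications so far.
crowd_share_of_identified = crowd_identified / identified

print(f"{identified} of {identifiable} identifiable fragments identified")
print(f"Crowd share of identifiable fragments: {crowd_share_of_identifiable:.1%}")
print(f"Crowd share of all identifications:    {crowd_share_of_identified:.1%}")
```

Under the first reading the result is close to the figure quoted in the talk; under the second the crowd's share would be noticeably higher.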


Methods of searching

A closer look at the contributors confirms what others have learned from crowdsourcing projects, which is that a majority of contributions are made by a handful of “well-informed enthusiasts.” In our case, one lawyer and rare book enthusiast with a tremendous drive has made the majority of identifications. So far we have 21 contributors and out of those 21 this person has commented on 28 sets whereas the rest average around 3. We don’t know a whole lot about our contributors yet, since Flickr hides that information, but contributors with known credentials tend to provide information via e-mail rather than creating an account and posting directly on Flickr.


It’s important to note that most contributions involved identifications of fragments via text-string searches in Google Books. A couple things need to be said about this. First, it’s amazing what you can do with Google Books. When I started training in manuscript studies about 7 years ago, the primary method of identifying texts was to use good old-fashioned, off-the-shelf reference sources. This usually required an excellent memory, access to bibliographies, and a great library. Google Books now truncates quite a bit of this work. The main problem with some attributions is that just because a string of script on a fragment matches a string of text in Google Books doesn’t mean it is the exact same text, or that the digitized book represents the best edition. But I’ve found that in most cases this “Googling” method allows you to at least get a reference point.
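As a rough illustration of a text-string search, the sketch below turns a transcribed line into an exact-phrase query URL. It is only a sketch under assumptions: it targets the public Google Books API volumes endpoint rather than the Google Books website that contributors most likely used, the function name is hypothetical, and the sample phrase is a familiar Latin line chosen for illustration, not a reading from any Ransom Center fragment.

```python
from urllib.parse import urlencode

# Public Google Books API endpoint for volume searches.
BOOKS_API = "https://www.googleapis.com/books/v1/volumes"

def phrase_query_url(transcription: str) -> str:
    """Build an exact-phrase search URL from a transcribed line (hypothetical helper)."""
    # Collapse irregular spacing; abbreviations must already be expanded by the transcriber.
    phrase = " ".join(transcription.lower().split())
    # Wrapping the phrase in double quotes asks for an exact match.
    return BOOKS_API + "?" + urlencode({"q": f'"{phrase}"'})

# Illustrative phrase only, not taken from the fragments.
print(phrase_query_url("In principio  erat Verbum"))
```

A match found this way is only a reference point: as noted above, a matching string does not establish that the fragment and the digitized book contain the same text or that the book is the best edition.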

So far we’ve reviewed 89 of the 128 fragments and have only found 1 (possibly 2) misidentifications for fragments identified through crowdsourcing. I think this is pretty amazing. So far no one has attempted to do anything malicious (knock on wood) or made any wildly wrong identifications.

Database entry

Once we have finished verifying all contributions we will transfer our metadata from Flickr to our database of medieval and early modern manuscripts on the Ransom Center website. The images will stay on Flickr, but we haven’t decided what to do about the metadata since the authoritative information will be on our website. We would certainly like to keep the conversation going and encourage viewers to continue commenting and sharing insights.

Bigger picture–what’s next

So what exactly do our numbers tell us? Well, I’d like to think they show that you can indeed crowdsource the arcane. Although the ratio of comments to views has been fairly low, from the perspective of access, this project has been a success, especially given that this is an extremely niche subject. Whether or not the project is a success purely in terms of scholarship remains to be seen.


Flickr disclaimer

I want to assure you that this is not a wholesale endorsement of Flickr. It has some major drawbacks. For instance, there is no true zoom feature and large leaves with small dense script are virtually unworkable. Users can’t view and comment on multiple images at the same time and it’s also an awkward platform for creating transcriptions—the comments section is usually too far below the image to transcribe while looking at the text. Finally, Flickr should not be used as a repository for long term preservation of your digital photos since we just don’t know how long they’ll be around or how they will change.

One way I like to think of it is as a FREE collaborative brainstorming sketchpad or drawing board for your own future professional platform/website.


In conclusion, I do think it’s important to maintain the integrity of our profession by not wholesale farming-out descriptive work to just anybody, but I’d like to suggest that there is room for both the amateur and the expert in parallel forums. At the very least, the comments posted on our Flickr site create a record of interaction with our holdings from the outside that’s generally not possible with most institutional platforms.


The Ransom Center Fragments Flickr site no doubt will ultimately vanish into the mists of this digital dark age. But the project never set out to provide a monolithic body of inerrant data. All collaborative projects like this, in my opinion, are more about the process than the final product. There is enormous potential for discovery within and between different knowledge communities right now and it’s fair to say that open-ended crowdsourcing can exist alongside professionally vetted, sustainable institutional projects. Let’s go crowdsource the arcane.”

Postscript (things I would like to have included in the talk)

Below are a few of the more significant lessons learned in this project.

  1. Collaborative identification of texts doesn’t necessarily save you time (you still have to verify).
  2. Do utilize list-serves, they will likely be your largest source of viewers.
  3. Incorporate blog posts and try to get publicity through bigger websites/institutions.
  4. It’s a good idea to try and get your local user group behind you as quickly as possible.
  5. Make physical connections with your local user group as soon as possible.
  6. It’s probably best to try and facilitate comments by responding right away instead of just observing and then getting back to them later—don’t assume others will chime in.
  7. Some contributors provided extremely detailed paleographic assessments (sometimes of marginal inscriptions) without trying to identify the text—be prepared for unexpected observations.
  8. One additional advantage to photo-documentation is reduced handling when describing.
  9. I’m pretty sure some fragments were viewed simply because the photos looked interesting.










Crowdsourcing the Medieval Text: New Avenues for Examining Leaves and Fragments

I recently presented a paper at the 39th Saint Louis Conference on Manuscript Studies on crowdsourcing the description of medieval manuscript fragments. The paper draws upon my project at the Harry Ransom Center to survey medieval manuscript fragments used as binding waste in early modern books. A transcript of the presentation is provided below:

For a full slide presentation: SLU PowerPointMicah

“I would like to begin my talk with an anecdote which I hope will provide some context to my presentation. By now I’m sure most of you have been exposed to the term “crowdsourcing.” It is in danger of being overhyped, but is nonetheless an important movement which has been gaining speed in the digital humanities since it first achieved mass recognition around 2010.

Some of the more notable examples include Project Gutenberg, Papers of the War Department, and the Zooniverse/Citizen Science Alliance projects. I was personally exposed to crowdsourcing two years ago in my Information Studies program at the University of Texas while taking a course on digital curation.

What truly inspired me was not one of the larger institutional projects just mentioned but a remarkable exchange on Flickr, Yahoo’s image hosting site, among WWI enthusiasts. In 2005 a Flickr member, Jens-Olaf Walter, posted a photograph from WWI of German soldiers scrambling across railroad tracks somewhere in Finland. He accompanied the title with the simple phrase: “official army photo, German-Finnish Sign “Haltpunkt”?” It was not long before another user with the rather odd screen name “timonoko” posted a comment identifying the sign in the photograph and noting that it was near Helsinki but that he did not recognize the scenery.

Almost two and a half years later, another member took the challenge up by using Google Maps and a variety of clues in the photo to suggest a possible location. In response to this both Jens-Olaf and the other contributor posted photos of old Finnish maps, one of which indicated a change in the railway line through that particular town. Amazingly “timonoko” posted a video clip from a Finnish TV series “Memories of 1918” showing the exact same scene of the soldiers in the photograph crossing the railroad, except that it had been caught on film! (Sadly, I think the clip has now been removed).

The stream of comments and image posts does not end there. Multiple researchers began to chime in offering related photos about the train tracks, the nearby train station, and the German military unit in the photograph. The amount of collaborative research recorded for this single photo verges on the absurd. All this to say, examining the photograph and reading the comments was an eye-opening experience for me. It points to the power of sites like Flickr to facilitate the collaborative process of describing and identifying historical artifacts.

But what is perhaps most instructive about this exchange is that the image also resides in the Great War Archive, a digital repository of WWI images. And there it sits, where I first found it, cut off from its stream of comments.

So how do we build upon the successes of repositories of digitized medieval manuscripts? How do we build a platform where anyone with a genuine interest in medieval manuscript fragments and a basic grasp of medieval paleography and codicology can offer input?

When archives and special collections began digitizing in the late nineties, we produced what is referred to as shallow digitization. The basic institutional model has always been to make high quality scans, then put them in a repository online and hope that researchers take a look at them. One of the problems with this model is that the metadata is fairly limited. Yes, we’ve developed rigorous standards, but the descriptions still tend to be minimal and institutionally-focused. Most notably, there has been no time to provide transcriptions, or extensive keywords, and more often than not, the images reside in software platforms that are inaccessible to Google. This is where the dramatic growth in the use of online social media, image hosting sites, and blogs offers a whole new realm of possibilities for extracting and sharing information about historical artifacts.

Historically, medieval manuscript leaves, fragments, and binding waste have received considerably less attention from academics and librarians than bound codices. And yet, here in the United States especially, institutions that hold rare materials are more likely to have medieval leaves and binding waste in their collections than whole volumes. Despite the abundance of such artifacts, few have been extensively researched, surveyed, and described (the Ege leaves being one notable exception). This is at least one reason why leaves, fragments, and binding waste hold the most potential to benefit from a more collaborative digitization model.

While formal archive and special collection websites remain essential for highly professional and stable long-term projects, broader popular social media and photo-sharing sites such as Flickr and Facebook offer the potential to provide an easy, inexpensive, and more widely accessible platform for crowdsourcing the description of medieval manuscript fragments and binding waste.

Because I am the lone archivist with a master’s in medieval studies at the Harry Ransom Center, for the past year or so I have had the sole responsibility of surveying and describing the medieval and early modern leaf collection and manuscript binding waste in the Ransom Center’s book collection. The project has been challenging but thoroughly enjoyable. On occasion I have made totally unexpected discoveries, such as finding an impression of circa 16th-century leather-rimmed spectacles on the manuscript-waste endpapers of an early printed book.

Unfortunately, cataloging these objects is NOT outlined in my primary responsibilities as an archivist. This inconvenient fact combined with the unique challenges of describing fragments forced me to recognize that I needed the assistance of others in the rare book and manuscript community if I was going to make any substantive progress on the project. Inspired by the example of WWI enthusiasts on Flickr and other collaborative transcription projects, I suggested we try something similar with the Ransom fragments. Having been given the green light by the proper authorities, early in June of this year, I began posting digital images on Flickr and inviting members of the rare book community to examine and share insights.

Notifications about new images were posted on a related Facebook page.

The first week was promising. We made an initial announcement on ExLibris and Facebook and out of the 34 images posted had 659 views and 4 potential text identifications.

The real increase in traffic came when the Ransom Center ran a story about the project on its blog Cultural Compass late in July which brought us up to 2,422 views and around 16 total contributions. The next big jump in traffic came in mid-August after posting an announcement to the Early Book Society which brought us to nearly 7,000 all-time views and several new contributors. By the time I had finished posting images of all known fragments on October 5th we were well above 13,000 views. Early on I predicted that all fragments in the survey would be identified by the time I finished posting images of every item. Although this did not quite come true, the statistics are nevertheless encouraging.

As of October 10, 2012 the collection had been viewed over 14,000 times.

So what exactly do these numbers tell us? Well, for starters they tell us there is a healthy interest out there in images of medieval manuscript fragments from the Ransom Center. And that’s an encouraging thought. But this has to be tempered by the fact that over a four-month period, the ratio of contributions to views was quite low, although it is highly probable that viewers will continue to make contributions in the months and years to come.

My colleague Ben Brumfield, a programmer who developed the successful transcription software FromThePage, already considers Ransom Center Fragments a crowdsourcing success given the highly specialized knowledge required to even approach these objects. Whether or not the project is a success purely in terms of scholarship remains to be seen.

The survey comprises 78 books containing a little over 79 fragments. The items span circa eight centuries, at least 8 geographic regions, and include a diverse representation of bookhands and documentary scripts, along with a variety of texts. A diverse selection of binding styles is also represented, although the limp vellum structure and other forms of parchment binding comprise a majority. At least 7 of the 79 fragments are too heavily abraded for their texts to be identified under normal light.

All images have been viewed at least once and 73 out of 225 images have comments. Twenty-one of the texts on the fragments have been positively identified or at least attributed to print editions available online while another 13 fragments now include rough transcriptions or other relevant information. That means that within 4 months, contributors provided relevant information for 51% of all identifiable fragments.

A closer look at the contributions confirms what others have learned from crowdsourced projects: contributions follow what is called a power-law distribution, in which most of the contributions are made by a handful of “well-informed enthusiasts.” In our case there were around 10 total contributors and 22 of the 72 contributions were made by one rare book enthusiast from San Antonio. Similar outcomes seem to occur in both small and large projects.
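To make the skew concrete, the sketch below works out the top contributor's share from the figures just quoted. It is a back-of-the-envelope check, not a power-law fit: the talk gives only the totals, so the sketch assumes exactly 10 contributors (the talk says "around 10") and the variable names are illustrative.

```python
# Figures quoted in the talk for the first four months.
total_contributions = 72
top_contributor_count = 22
contributors = 10  # assumption: the talk says "around 10"

# Share of all contributions made by the single most active contributor.
top_share = top_contributor_count / total_contributions

# Average for everyone else, under the assumed contributor count.
others_average = (total_contributions - top_contributor_count) / (contributors - 1)

print(f"Top contributor: {top_share:.1%} of contributions")
print(f"Everyone else: about {others_average:.1f} contributions each")
```

Under these assumptions a single person accounts for nearly a third of all contributions, about four times the average of the remaining contributors.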

I think it’s important to note that most contributions involved identifications of fragments via text-string searches in Google Books. A couple of things should be said about this. First, it’s amazing what you can do with Google Books. When I received training in manuscript studies just six years ago, the primary method of identifying texts was to use good old-fashioned, off-the-shelf reference sources. This usually required a very good memory and access to excellent bibliographies. Google Books now truncates quite a bit of this work. The only serious drawback of using Google Books, or relying on others who do so, is that just because a few lines of script on a fragment match a string of text in a book online doesn’t mean they are manifestations of the same work, or that the online book represents the critical edition of the work. Regardless, it’s still a useful and immensely powerful tool which allows for some fascinating discoveries.

I hope that by now my audience is asking the question— so what’s next?  Well, from my perspective there are at least three avenues for going forward:

The first is to use Flickr as the primary content manager and access point for these objects.

The second would be to create or use a different content manager and software platform that participating institutions can use to upload images with an interface customized for fragments and binding waste and a more tightly controlled collaborative environment.

The third option would be to upload images to Digital Scriptorium or other individual institutional repositories.

Personally, I would like to see number 2 implemented while encouraging institutions to continue using Flickr as a broader and cheaper access point.

The folks at Integrating Digital Papyrology and their platform Papyrological Navigator at provide the closest approximation to what serious manuscript scholars might want. It’s an impressive collaboration between several institutions and senior scholars and represents probably the most granular and tightly controlled environment for collaborative work.

There are a few problems I see with this prototype. First, to my knowledge no institutions in the United States currently hold large databases of already digitized binding waste and fragments. Second, the papyri project is designed around small flat fragments of texts on papyrus. Binding waste is a far more complex 3-dimensional structure, often with multiple fragments in a single binding, sometimes pasted one on top of another. Third, the structures themselves are of interest to scholars and not just the text they contain. And finally, Integrating Digital Papyrology is designed by academics, for academics. Unless we offer images on public sites like Flickr in addition to something like this, we will not be contributing to lowering the wall between the broader public and cultural heritage institutions. Option 3 is a perfectly good route to take, but like the Flickr option, is not a total solution.

I want to be absolutely clear, I am not advertising for Flickr. Admittedly it does have some incredibly useful features and management capabilities and in many ways it is superior to other collaborative platforms in terms of binding waste specifically. What I envision as a useful interface for paleographers, codicologists, and manuscript scholars is very close indeed to what Flickr has to offer. But it has some major drawbacks. For instance, there is no detailed zoom feature which is completely standard in other professional platforms. Large chunks of small dense script are virtually unworkable. Users can’t easily link comments to multiple images and it’s also an awkward platform for creating transcriptions—the comments section is a little too far below the image for visitors to transcribe while looking at the text. Finally, Flickr is owned by Yahoo, and there’s no company that is too big to fail. If we trust our images to Flickr we still need to back them up ourselves (Digital Scriptorium is a good option)—unless of course our purpose is only access and not preservation.

In a recent e-mail exchange, a prominent medievalist and manuscript scholar (who shall remain anonymous) explained what was absolutely essential to the feasibility and success of an operation like the papyri project. And I quote:

“The papyrus folks have broad support both from institutions with major holdings and from senior scholars.  There is also a strong ethos of quality control which manifests itself in rigorous vetting of comments and the ability NOT to post things that are not well substantiated.  So there is expertise plus process plus the commitment of a core group of people who are known to be seasoned papyrologists.”

I think the underlying message of this statement is that a broad-based manuscript fragment database with a platform for user contributions should NOT be open and accessible to just anybody off the street. I would like to challenge this assumption by asking a question. Is it essential to the long-term integrity of our discipline that we provide resources to which only the very best may contribute? Perhaps. However, maybe there is room for both the amateur and the expert in a single forum. Or at the very least, in parallel forums.

The grand narrative of this age of information is that the lines are blurring between those who have privileged access to knowledge and those who do not. As manuscript scholars it is our responsibility to affirm valid contributions when they are made, no matter where they come from, while at the same time diligently exposing error when and where it occurs. We can do this without creating boundaries to those not inside the academy.

The Ransom Center Flickr project no doubt has user-contributed errors. But we never set out to provide a monolithic body of inerrant data. The Flickr site, and hopefully all collaborative projects, are as much about the process as the final product. Open-ended crowdsourced projects should be able to exist comfortably alongside enterprises like Digital Scriptorium.

At the risk of being overly pedantic, I would like to conclude with a quote from Plutarch’s Life of Alexander. Alexander, being the man of ambition that he was, found himself rather put out by Aristotle’s decision to publish certain doctrines which were traditionally passed on via oral communication to the initiated. According to Plutarch, Alexander states:

“You have not done well to publish your books of oral doctrine; for what is there now that we excel others in, if those things which we have been particularly instructed in be laid open to all?”

It is my hope that as we move forward we can rise above the temptations of exclusivity which afflicted Alexander the Great.

And one final note for all you catalogers out there: please include as much information as possible about manuscript binding waste in the notes field of MARC 21 records. This is currently the most efficient way to locate these items unless you want to make us slog through old auction catalogs or (heaven forbid) physically browse the closed stacks!”


Flickr page

Facebook page

Conference Presentation: Manuscript Leaf Collections

Last month I gave a presentation at the Society of American Archivists (Indiana University Chapter) on medieval manuscript leaf collections. The paper represented, at least in part, my reflections on what I had learned through cataloging work on manuscript leaves at Western Michigan University’s Special Collections and the Harry Ransom Center from 2009-2010. Below you will find that paper as well as its accompanying PowerPoint. My apologies if the style seems a bit “breezy”; it was intended to be read aloud, not submitted for publication!


In the late 1940s, the art historian and book-collector Otto F. Ege did something that would have a huge impact on medieval manuscript leaf collecting in the United States. He selected fifty medieval manuscripts from his personal collection and removed several dozen individual leaves from each one. He mounted these leaves, along with descriptive labels, onto large paper mats and placed each mounted leaf into a portfolio box. The final result was forty boxed sets containing fifty manuscript leaves each. Eventually these portfolios, entitled “Fifty Original Leaves from Medieval Manuscripts,” were offered for sale to university and public libraries around North America.[1] Although this act of biblioclasty (or, “book-breaking”) may seem shocking and even abhorrent to most librarians, archivists, and bibliophiles, his behavior was simply not extraordinary. While the removal of a whole medieval manuscript leaf for collectable reasons did not become popular until around the late 19th century, people have been cutting up manuscript books to re-use the parchment going all the way back to the origin of the codex. For at least a thousand years Europeans re-used vellum from older manuscripts as flyleaves, sewing guards, and wrappers, and to strengthen bindings. Manuscript cuttings have been pressed into service over the centuries for a surprising variety of domestic tasks such as jam jar covers, wallpaper, candlesticks, and lampshades, just to name a few.[2]

The removal of individual illuminations from manuscripts goes back at least to the fourteenth century, when some bookmakers would remove miniatures and decorated initials to ornament their new manuscripts. Antiquarians, like the famous Sir Robert Cotton, were cutting out fragments as specimens of ancient handwriting or decoration in the seventeenth century.[3] Such acts were generally not done for commercial reasons but rather out of the desire to collect mere curiosities or souvenirs. By the mid-19th century we see the beginnings of an interest in whole leaves, decorated or not.[4] The practice of creating “Leaf Books,” or removing an original leaf from a significant manuscript and publishing it with an essay written by a prominent author, can be traced to at least 1841.[5]

The interest in collecting single leaves began to increase dramatically in the early 1900s. Robert Forrer of Strassburg published a catalogue of his collection in 1913 containing 38 whole manuscript leaves, a large number for the time. But by 1956 collectors like Erik Von Scherling of Leiden issued catalogues containing nearly 2600 whole leaves, many of which he sold in the United States.[6]

Disdain for the cutting up of manuscripts is surely as old as the practice itself. Individuals expressed reservations towards such activity in the 1800s. For example, James Dennistoun, a Scottish antiquary and art collector, called Napoleon’s French troops “boors” for cutting up manuscripts, which he had subsequently “saved” in Italy around 1838.[7] In 1860 H. M. Lucien[8] supposedly became the first person to refer to collectors who cut up manuscripts as “vandals.”[9] The well-known eighth edition of ABC for Book Collectors, by John Carter and Nicolas Barker, states that biblioclasty should be discouraged even if done with good intentions. Despite such spirited dissension towards cutting up books, Ege’s deed was simply one in a very well established tradition of such behavior.

To be fair, his inclination was primarily an altruistic one. Otto Ege’s tenure at the library school of Western Reserve and the Cleveland Institute of Art, his devotion to teaching the book arts to the general public, and his passion for medieval book decoration led him to believe that such objects could act as a source of inspiration to modern-day bookmakers. He authored dozens of articles on this subject in art education journals and loaned materials to public book exhibits regularly. Therefore, his “Fifty Original Leaves from Medieval Manuscripts” portfolios were a logical part of his fervent commitment to populist art education in America. However, his secondary intention for making these sets was to profit financially from many years of book collecting. Although Ege died before he was able to sell the portfolios, his widow, Louise, continued with the plan and began dispersing the boxed sets to individuals and institutions for $750 each.[10] All told, Ege probably sold hundreds upon hundreds of modestly priced single manuscript leaves in his lifetime.[11]

Recently, thanks largely to the efforts of Greta Smith (Miami University) and Fred Porcheddu (Denison University), thirty of Ege’s forty specimen sets of medieval manuscript leaves have been located, nearly all of them in libraries in the U.S. and Canada.  Smith and Porcheddu have produced an excellent website which serves as a single location where images, descriptions, and other information about Otto Ege and his leaves can be gathered and shared. The ultimate goal is to transfer the information on the site to XSLT language so that the leaves and their transcriptions and translations will be even more searchable. This attempt to “virtually” reconstruct the component manuscripts of Ege’s portfolios is an admirable one and I hope that institutions and individuals will continue to participate in the ongoing project.

I begin with the story of Ege and his dismembered manuscripts partly because no one can talk about leaves and fragments of medieval books without at least mentioning him and partly because I believe the above work of Smith and Porcheddu and others on the Ege leaves is, at least for now, more of an anomaly than the standard. Hundreds if not thousands of manuscript leaves in archives and special collections have been forgotten at best, inadequately arranged and inconsistently described at worst. To date, I am not aware that anyone has tried to produce a census of leaves and fragments held in institutions in the United States. Some scholars have speculated about the probable number of leaf books and portfolios of leaves like Ege’s that are in existence[12] but we will likely never really know how many excised leaves are out there. Indeed those estimates don’t include the thousands of individual leaves and fragments that have been acquired over the years by various institutions and individuals. Furthermore, the sale of manuscript leaves seems to be going strong, despite criticism. Thanks to online sites such as eBay, the trade flourishes, with many dealers using the age-old justification that such items were already excised from manuscripts before coming into their possession. In 2003 individuals at the Institute for the Study of Illuminated Manuscripts in Denmark (CHD) began attempting to track and catalog manuscript leaves as they were sold through eBay. This was an admirable effort to be sure, but they have not updated their website since 2007.[13] Whatever the situation, I would argue that our real focus should continue to be on how to best digitize, arrange, and describe such objects in our archives or special collections regardless of how they came to be there.

One of the advantages that leaves have over full codices is their potential to serve as pedagogical tools. Although full manuscripts hold more information, physical access to them is often more restricted, especially for students and the beginning scholar. For example, the Harry Ransom Center at the University of Texas is now beginning to restrict physical access to bound codices as they are digitized. But any student can still personally handle and examine items from the leaf collection. Given the now fragile or deteriorated condition of many medieval codices, I imagine such restrictions will likewise increase at other institutions. Accordingly, if leaves and fragments are the primary way that many will be able to physically study medieval artistic and textual culture, it is all the more imperative that we put forth an effort to more usefully organize and describe them, and thus increase access to them.

The history of what we would call “modern” cataloging of medieval manuscripts in the English-speaking world dates back to the early 1600s with Thomas James’s catalog of medieval manuscripts at Oxford and Cambridge.[14] This early effort stimulated subsequent catalogs in Britain with increasing breadth and depth, culminating finally in the 19th century with the work of M.R. James[15] and then in the 20th with Neil R. Ker’s Medieval Manuscripts in British Libraries.[16]

But the first, and so far only, attempt to produce a printed union catalog of all the medieval and Renaissance manuscripts in the United States and Canada began with Seymour De Ricci’s Census of Medieval and Renaissance Manuscripts (published in 1937) [17] and Faye and Bond’s Supplement (published in 1962).[18] This ambitious project undoubtedly brought numerous unknown manuscripts to general attention. But in many ways De Ricci’s work discouraged more detailed cataloging at individual institutions and it recorded very few manuscript leaves or fragments.

Massive cataloging projects sponsored by the German Research Foundation (DFG)[19] in the 1950s, together with the example provided by Ker’s work in Britain, inspired renewed efforts in the 1980s to produce printed catalogs in the United States. The efforts of such major institutions as the Beinecke in 1984,[20] the Claremont in 1986,[21] and the Newberry Library[22] and Huntington in 1989[23] represented the “coming of age” of pre-modern manuscript cataloging in the U.S.[24] Most appropriated the methodology developed by N. R. Ker in Medieval Manuscripts in British Libraries. By organizing their catalogs along these lines, they provided a much higher level of detail than De Ricci’s Census and established the basis for subsequent efforts in electronic cataloging projects.

At a number of international conferences, beginning with one held in 1989 at Munich, scholars collectively debated how best to reduce medieval manuscripts to an electronic, machine-readable, and searchable form. There was particular disagreement concerning the adaptability of MARC 21 for the task.[25] At that time, if an institution wanted to try and catalog their medieval manuscripts using MARC 21 they would likely use Archives and Manuscript Control (AMC) format with Archives, Personal Papers, and Manuscripts (APPM) specialized rules.[26] Yet these separate format rules were essentially designed for collections of unpublished modern or early modern papers and documents.

For example, with “certain pre-1600 manuscripts” and for “book-like manuscripts,” APPM refers the reader back to AACR2R, chapter 4,[27] which provides a few extra tips for how to record certain special features. But because pre-modern manuscripts lack the usual identifying marks of authorship and publication that distinguish printed books, AACR2R lumps them together with all manuscripts as a single cataloging format according to the general principle that they are all “unpublished materials.”[28]

In 1996, the Hill Museum & Manuscript Library at Saint John’s University collaborated with the Vatican Film Library at Saint Louis University to begin a project entitled Electronic Access to Medieval Manuscripts (EAMMS). This effort was funded along with Digital Scriptorium by the Andrew W. Mellon Foundation. An international body of specialists came together to participate in the project with the goal of increasing access to medieval manuscript-related records and information.[29] In 2002, these guidelines were taken up, refined, and then approved by the Bibliographic Standards Committee of the Rare Books and Manuscripts Section of the Association of College and Research Libraries. The final product, released as a supplement to AACR2, was entitled Descriptive Cataloging of Ancient Medieval Renaissance and Early Modern Manuscripts or AMREMM. These new guidelines had a number of advantages over APPM, including a new treatment of the supplied title and the order of its elements, special guidelines for dealing with manuscripts containing multiple works, and new provisions for recording layout, musical works, and binding, to name a few.[30]

Interestingly, although AMREMM represents a major improvement over previous guidelines, many institutions are using discrete databases with localized standards. Some repositories like the Morgan Library and Museum cataloged their medieval and Renaissance manuscripts using MARC 21 before the development of AMREMM. The Head of Cataloging and Database Maintenance at the Morgan admits that their records do not conform to a number of AMREMM guidelines because, “unlike most library collections, the Morgan treats these items primarily as art objects with an emphasis on the artistic quality of their illustrations as opposed to the various texts in the manuscripts.”[31]

This raises the question of whether AMREMM is too “text-focused” in its guidelines. It does of course provide rules for dealing with decoration and illumination[32] along with six accompanying examples. But it avoids specifying any controlled vocabulary for their description, instead pointing to a variety of different works (many of which do not entirely agree with one another) for “technical terminology.”[33] And the examples of MARC 21 records provided in the appendix do seem to favor textual over artistic issues. As the medievalist Rowan Watson once aptly stated, “it is the role of the cataloger to recognize the various questions on which a fragment can provide evidence and not solely to mention its textual importance.”[34]

In my own experience, AMREMM seems generally well suited to cataloging full codices with multiple texts, but it provides rather less clear guidance for individual leaves and fragments. The guidelines for cataloging charters and papal bulls are also somewhat wanting, and the reference works it points to are old and overly specialized. There is no example of a MARC 21 record for a leaf in the appendix and it is unclear whether or not one should transcribe any part of a single leaf if it technically has no incipit or explicit.[35] I personally chose to transcribe the first and sometimes last words of text on a leaf regardless of whether they were technically incipits or explicits. This allows for the possibility of identifying what kind or what portion of a manuscript a leaf comes from. In cases where there may not be enough time or money to digitize leaves, this also opens the possibility for researchers to find sister leaves in the same or other institutions with conjugate texts.

It also came to my attention during research, somewhat obviously, that the bulk of medieval manuscript leaves and fragments in the United States originates from the 15th and 16th centuries. And although there are examples of fragments of rare and historically significant texts out there, the vast majority have been excised from rather more common liturgical books and Books of Hours. Accordingly, there is a need for newer related resources in English. One of the few comprehensive printed reference works for this very subject, Hughes’ Medieval Manuscripts for Mass and Office,[36] is so dense and complex that some scholars have called it more of a disservice than an aid. Recent works like Clemens and Graham’s Introduction to Manuscript Studies are a good place to start, but right now a couple of websites created by industrious scholars for researching Books of Hours and liturgical books are probably the best resources for the cataloger lacking specialized knowledge.[37]

One of the most obvious values of leaves and fragments is their service as specimens of medieval handwriting. In Europe some archives have used fragment collections to produce noteworthy paleographic albums illustrating the history of medieval scripts.[38] The proper identification of scripts is one of the best ways to localize and date individual leaves. But resources that many catalogers depend on for controlled vocabulary, like the Getty Art and Architecture Thesaurus, lack the necessary level of detail to provide sufficient analysis for late medieval scripts. Many different systems of nomenclature have been proposed and used over the years, but the recent work of scholar Albert Derolez[39] may be helpful in creating a more standardized nomenclature. Currently I am working on developing a resource based on his work for local use at the Harry Ransom Center. Possibly, with the assistance of other scholars, the ultimate goal would be to expand the controlled vocabularies for medieval scripts of the Getty Art and Architecture Thesaurus.


In this brief paper I have attempted to situate the cataloging of pre-modern manuscript leaves within the discourse of descriptive standards for medieval codices. The practice of cutting leaves and fragments out of bound codices has a long and venerable tradition, and despite opposition, continues to this day. Regardless of how we feel about this, archives and special collections need to make an effort to increase access to their own leaf collections. Because they are isolated from their larger context, leaves and fragments are often viewed as meager novelties, not useful for real scholarship. But such objects can and should be fully exploited as teaching tools within a variety of disciplines. Take the all too common example of a single decorated antiphonary leaf used as a wrapper for an early-modern printed book; such an object presents a number of opportunities for research and teaching, from codicological, paleographic, or musicological disciplines to the study of historical book structures. Manuscript leaves used as binding fragments provide a rich cultural context to their accompanying books and can serve as potential evidence in determining where and when a book was made or re-bound.

Fully bound medieval manuscripts justifiably receive the lion’s share of attention in description and digitization, but most institutions in the U.S. likely hold larger collections of medieval manuscript leaves and fragments than bound codices. One of the great potentials of the internet and current digital technologies is the possibility of reuniting excised and dis-bound objects online—as individuals like Smith and Porcheddu have done and continue to do with the Ege portfolios.  Although this may be complicated by the differing needs of user communities and the variety of goals of individual institutions, we can still strive to present such objects according to the most intellectually rigorous standards. As new descriptive guidelines are developed for cataloging pre-modern manuscripts in various formats we should keep in mind that the archivist or cataloger does not need to provide an exhaustive study of every fragment they come across, but they should attempt to provide enough information to at least help the researcher know if the object deserves a more in-depth examination.

We should also make an effort to connect our metadata in online-searchable databases to larger scholarly sources of information such as WorldCat. The MARC 21 format has its limitations and AMREMM is perhaps not as thorough as it should be, but by making manuscript leaf records available in a library’s on-line public access catalog we increase the “possibility of there one day existing an electronic union catalog for medieval manuscripts that would encompass the holdings of North American libraries and those of other countries as well.”[40] Finally, in situations where a library or archive lacks the specialized knowledge required to properly catalog medieval leaves, they should consider the possibilities presented by crowdsourcing the work through online communities and image hosting sites like Flickr.

Describing medieval manuscript leaves is a particularly specialized skill—one that ultimately depends in part on interpretation. The problem is aptly summarized by Gregory Pass, the editor of AMREMM, who states that the cataloger’s role in transcribing a manuscript text is “best conceived as interpretive” and the final representation is “necessarily a subjective matter.”[41] I would argue that this very interpretive process is something that any individual examining leaves and fragments can greatly benefit from. Such objects represent a treasure trove of information about the medieval world, even as isolated pages. It is my hope that archives and special collections will begin to more fully consider the range of opportunities for learning that this underappreciated resource presents—just as (heaven forbid!) Ege would have wanted.

Week 10: Sample Descriptions

To give you an idea of the work I have been doing on the manuscript collection, I have attached a pdf of my description of MS 148, which is a leaf, not a full manuscript book. When approaching my descriptions, I primarily followed the kind of format you will find for most printed catalogs of medieval manuscript collections at major institutions like the Beinecke, the British Library, and the Newberry Library. It is the cataloging practices of these institutions that influenced Descriptive Cataloging of Ancient, Medieval, Renaissance, and Early Modern Manuscripts (AMREMM). Basically, if you create a catalog description of a manuscript that follows the above institutional practices, you should be able to convert that information into a MARC record or a metadata record without having to consult the actual item.

The basic framework of a description of a medieval manuscript includes the following sections:

Institutional call number,

title, date, location,


physical description,



The contents section is usually further broken down by listing the various sections of the work and providing any citations to modern critical editions of that work. One also gives the physical span (pages or folios) of each of these sections along with the opening and closing words of those sections.

The physical description section can be very detailed or minimally detailed. I strove for a balance, especially with single leaves, to keep it fairly basic. The elements that I include are:

Paper or parchment

Number of folios

Size of book/document


Decoration (initials, colors, illuminations, drawings)


With the above elements, a cataloger can reproduce the record in MARC format, although I must say that after working on a few records with the rare book cataloger, it is not a straightforward process! And of course one of the big differences between a MARC record and a printed catalog record is subject analysis, a far from straightforward process that takes a solid grasp of the cultural and historical significance of the object.
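To make that conversion a little more concrete, here is a minimal Python sketch of how the description elements listed above might be carried over into MARC-style fields. Everything in it is an assumption for illustration: the `LeafDescription` class, the `to_marc_fields` helper, the placeholder values, and the simplified choice of tags (099 for a local call number, 245 for the title, 300 for the physical description, 500 for general notes). It is not the workflow the rare book cataloger actually used, nor a mapping prescribed by AMREMM.

```python
from dataclasses import dataclass


@dataclass
class LeafDescription:
    """Elements of a printed-catalog style description (hypothetical structure)."""
    call_number: str
    title: str
    date: str
    location: str
    support: str      # paper or parchment
    folios: int
    size_mm: str      # height x width, e.g. "300 x 210"
    decoration: str


def to_marc_fields(d: LeafDescription) -> list[tuple[str, str]]:
    """Return (tag, text) pairs using an assumed, simplified tag mapping."""
    leaves = "leaf" if d.folios == 1 else "leaves"
    return [
        ("099", d.call_number),
        ("245", f"{d.title}, {d.date}"),
        ("300", f"{d.folios} {leaves} : {d.support} ; {d.size_mm} mm"),
        ("500", f"Origin: {d.location}."),
        ("500", f"Decoration: {d.decoration}."),
    ]


# Placeholder values only; they do not describe any real item.
example = LeafDescription(
    call_number="MS 000",
    title="Leaf from a breviary",
    date="15th century",
    location="France",
    support="parchment",
    folios=1,
    size_mm="300 x 210",
    decoration="red and blue initials",
)

for tag, text in to_marc_fields(example):
    print(tag, text)
```

Subject headings are deliberately left out of the sketch, since that is the part of the record that depends on the cataloger's judgment rather than on the description itself.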

Below you will find a pdf of my description of MS 148 and a link to the MARC record for it on the WMU library website. Once the object has been digitized I’ll post the link to the item along with its Dublin Core metadata record, which I also produced.
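For readers who have not seen one, a simple Dublin Core record is just a handful of named elements. The Python sketch below builds one with the standard library; the `dublin_core_record` function, the `<record>` wrapper element, and all of the values are hypothetical placeholders, not the actual record for MS 148.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields: dict[str, str]) -> str:
    """Serialize Dublin Core elements under a hypothetical <record> wrapper."""
    root = ET.Element("record")
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Placeholder values only; not taken from any real catalog record.
xml_text = dublin_core_record({
    "identifier": "MS 000",
    "title": "Leaf from a breviary",
    "date": "15th century",
    "format": "1 leaf : parchment",
    "description": "Red and blue initials.",
})
print(xml_text)
```

A real repository would normally wrap these elements in whatever container its digital collection software expects, so the wrapper here should be read as a stand-in.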


Week 9/10 Final Week.

I’ve reached the end of my capstone project and I am happy to say that I have now completed all 49 descriptions for the manuscript collection. I’ll spend a few days next week looking over the work I’ve done with the rare book cataloger and answering any questions he might encounter when he begins using them to create MARC records of the MSS, but basically, I’m finished. I have tried to make sure my descriptions are compatible with AMREMM (Descriptive Cataloging of Ancient, Medieval, Renaissance, and Early Modern Manuscripts), so I doubt there will be many issues beyond interpreting the format of my descriptions. I’ll also look over them with the head rare book librarian and answer any questions she might have with regard to using them as data to accompany digital images of the objects in the collection. It may be a few years before all 49 items are digitized and ready to go online, but when they are, my descriptions will accompany them in some form.
As far as concluding thoughts for this project go, I have mostly good ones, but I was a bit disappointed that I ended up not doing anything with ContentDM or ARCHON. Having said that, this has been a great opportunity to sharpen my rare book cataloging skills, particularly with regard to medieval and early modern legal documents and how to interpret and describe them. Hopefully this will open up additional opportunities to continue this type of work as more and more institutions put their medieval manuscript collections online.

As far as my work in the archives goes, I learned a great deal about subject access and authority control in MARC cataloging of 19th- and 20th-century archival materials. While many repositories with archival collections are skipping the MARC phase and going straight to online databases with digitized images, I still think that it is valuable to put collection records on a library’s OPAC and thus enable a researcher who might be browsing their catalog for books on, say, prohibition to stumble upon a record for an archived collection of a local prohibition society. It’s also the easiest way for users to find primary source material throughout the country just by searching on WorldCat. I personally found that coming up with appropriate subject headings and subdivisions was one of the most interesting parts of this job, particularly when a collection comprised a large variety of often unrelated materials.
I’m very grateful to all the faculty and staff at Waldo Library’s Special Collections and the Western Michigan Univ. Regional Archives for allowing me to come and work with them on these great projects. I hope they benefited at least a fraction as much as I have from the experience.

**I plan to post a guide to interpreting my descriptions, along with an example record, on this blog in a few days, probably as a pdf.**


