New Tools for Stewardship: Q&A with JSTOR's Roger Schonfeld

Nearly all archival institutions, at every scale, hold a backlog of material awaiting processing. JSTOR’s recently created Digital Stewardship Services aims to address this situation with a next-gen service designed to help libraries and archives describe, preserve, manage, and share their collections using JSTOR’s AI-driven Seeklight tool (which recently won the Society of American Archivists’ C.F.W. Coker Award for Description) in conjunction with human expertise.

Roger Schonfeld, who was recently named Managing Director of JSTOR Stewardship, spoke to LJ about bringing machine knowledge to a human-centered workflow.


LJ: How does Digital Stewardship Services work?

Roger Schonfeld: It lets you basically take an archival box and scan the materials in it and upload them into Seeklight, and it will generate candidate metadata that a professional can then review.

Over time, we envision it adding all sorts of other collections processing features. We’re thinking beyond just metadata generation. On the other side of Digital Stewardship Services is the ability to put these libraries, archives, and distinctive collections seamlessly into JSTOR and interconnect them with the secondary literature, and make them discoverable and impactful, either in one’s own institution or on an open access basis, and then be able to use the Portico digital preservation technology to preserve it.

Who is currently using it?

We have several hundred institutions that have been participating, mostly in the digital asset management parts of the system. But we are also offering an opportunity to any libraries, archives, or museums that are interested to join the version of this that has Seeklight as part of it. We currently have several dozen institutions that have joined up to do that. We’re seeing everything from large research universities to community colleges: a lot of interest in the opportunities of learning together about how a technology product like this one could really transform the way that content is made available.

Given that number of institutions, how is the service scaled up and down?

What they largely have in common is a strong sense—today, I think, even more than in the past—of the importance of grounding not just our scholarship, but our public discourse in original sources and primary sources. We’ve seen a lot of interest in that across the sector. Some institutions have enormously large and complex distinctive collections. Some of them have comparatively simple or modest ones. But everybody wants to activate those collections and ensure that they have impact. Some smaller institutions may have never had a digital collection system before, and so for them, this is an opportunity to think differently about the opportunity to activate those materials digitally. Some larger institutions have a dozen different platforms and systems that all interconnect with one another and have specialized functionalities. But we’re seeing an interest equally in how to integrate and learn from the opportunities that new technologies can bring into this important part of the work.

We have the original sources of our culture, our history, our society, our ancestors, that bring a truth forward that is important wherever you are, whatever your points of view might be. And I see a real opportunity in this work to enhance the ability for both scholarship and the public discourse to be ever more grounded in those original sources. I feel like that’s essential.

Not everyone who processes collections is comfortable with using AI. How do you address that?

There are all sorts of reasons to come from different points of view here. What I can say about our vision is that we are trying to be grounded—not, Oh, there’s a new technology, let’s use it, but more like, What is the work of a library, of an archive? What is the need that an end user has today? And how can we build services that activate an array of different kinds of technologies and human expertise alongside one another, ideally integrated closely, to better serve those purposes? What we’ve tried to do with Seeklight is not develop a tool that can just build perfect metadata—who knows if that will ever be a possibility?—but build a tool that can accelerate the work of a human who is trying to build the metadata that’s needed for discovery and management of collections.

We’re building tools and workflows that empower that human to review and edit and manage the candidate metadata that’s coming out of Seeklight, and that’s all integrated very closely with the purpose of keeping the human expert in control. Now in some institutional contexts that will involve a lot more review of the materials, because that’s something that fits into their staffing capacity and their expectations. In other institutions, it may unlock so much opportunity to just move materials digitally, it’s really exciting. There are going to be all kinds of ways that these tools will be incorporated into an archival processing workflow, and we want to support whatever those might look like.

On the other side, on the JSTOR platform, we have an interactive research tool that gives a researcher or a learner an opportunity to interact with the content objects, with the journal article or the book chapter. And again, we didn’t want to just turn on a search bot. We have some incredibly thoughtful and caring product leaders in our organization who are really connected to the work of our user communities, of librarians. We really want to think about: how can we incorporate technologies in ways that actually serve the user’s purpose, the library’s purpose? I recognize that not every institution or individual is going to be equally enthusiastic. Any new technology engenders some change in how we do our work, but ultimately, JSTOR and Ithaka are committed to doing this in ways that are caring, responsible, ethical, and deeply aligned with the communities that we serve.

How do you incorporate best practices for such a wide range of users?

One of the things that I’m really committed to is that we’re working with each participant to hear how they’re doing it. Okay, you’re implementing it in this kind of way, this kind of staffing model, this kind of workflow, whatever those things might be. This other institution is doing it in a different kind of way. Is one of those better than the other? We’re at the point where there may not be a set of best practices, but we’re trying to develop some different cases of how this can be implemented in use.

The challenge, and it’s something that we think about a lot, is how to take preservation responsibilities as seriously as our organization does, while also trying to stay aligned with the changing dynamics in the environment and ecosystem that we work in. You don’t want to get entrenched. The way that I’ve been encouraged to think about this role is that I’m learning, and we’re learning along with our community, and we try to bear that in mind.

One of the things that I see as an exciting element of how we’ve approached this is we really are thinking about this as shared infrastructure for the collecting community. One of the things that is going to be important to navigate is that institutions want to have a sense of distinctiveness around their collections. But the infrastructure that supports that can be shared across the community. I think the way that we’re incorporating the JSTOR platform in this work, and the way that we’re seeing directions that Seeklight can develop into, we’re trying to strike the right balance between wanting institutions not to be locked into infrastructure. We want them to have flexibility and freedom. And we also want them to benefit from scale.

Lisa Peet

lpeet@mediasourceinc.com

Lisa Peet is Executive Editor for Library Journal.
