Digital Advances in the TfL Corporate Archives

Date & Time

Nov. 5, 2025, 8 a.m. - Nov. 5, 2025, 9 a.m.

Cost

$0

Location

Online




Description

Overview

We're starting to use data science skills to help us with some of our digital challenges. Join us for some demos and a Q&A.

The digital black hole is very real, but so too is the digital records mountain - and it increasingly seems unending. It seems inevitable that the rate of growth of digital records coming into the custody of archive repositories will only continue to increase. It also seems inevitable that these repositories will not find their staff numbers growing at a rate that can process and manage this growth effectively. Like it or not, data science skills and the use of AI will have to be explored and employed to some extent.

This is what we've been doing at TfL Corporate Archives for the past six years (with a three-year hiatus thanks to you-know-what!), and we'd like to share with you all the initial results of some of this work.

  • AI-generated descriptions of digital records, with standardised catalogue entry outputs and flagging of offensive terminology and personal data
  • Trained image captioning tool, tailored for TfL needs
  • In-house handwriting recognition tool
  • Auto-classification, or suggested classification, of digital records based on title and description

We'll share demos of the above, our experiences, and our plans going forwards, which include training.

Let's be clear: none of the Archives team have to date received any formal data science training, and it may well be said by some that we still don't really know what we're doing!! We're also fortunate to have been able to tap into the expertise of others. This means that we may not be able to answer any technical questions you ask us on the spot - but we will go away, find the answers, and get back to you! Hopefully our lack of technical expertise or training will also encourage you about what can be achieved despite these barriers.