
Developing Systems for Recognition and Categorization of Sign Language

Oxford researchers have built a dataset of BBC broadcast material for training British Sign Language (BSL) recognition models. The dataset comprises 1,962 episodes from 426 BSL-interpreted BBC shows and covers a vocabulary of 2,281 signs. The videos serve as training material for computer...

, and Administrator

2025 August 31, 11:56 AM


Teaching Machine Learning Models to Identify Sign Language

The BBC-Oxford British Sign Language dataset, also known as BOBSL, is a valuable resource for researchers working on sign language recognition and translation. It contains 1,962 episodes (approximately 1,400 hours) of BBC content interpreted in BSL and is widely used in sign language research [3].

Each video in the dataset pairs BSL-interpreted content with written transcriptions, which makes it well suited to training computer vision models for sign language recognition and translation. The episodes are drawn from 426 distinct shows, and the dataset covers a vocabulary of 2,281 signs [1].

While the dataset is not openly downloadable, it can be accessed through research collaboration or formal requests to the custodians at Oxford University and the BBC. The paper by Albanie et al. (2021) is commonly cited as the primary reference for the dataset, and its authors can advise on access [1].

Requests can also go directly to the Oxford University research group or the dataset custodians associated with the project. Universities typically require a data use agreement and proof of research purpose before granting access [1].

Computer vision and machine learning dataset portals that reference the dataset, such as CVonline, can point to further information [3]. Research data federations and Trusted Research Environments (TREs) in the UK, where access to sensitive data may be granted under controlled conditions, are also worth investigating [1].

The dataset is intended specifically for British Sign Language (BSL) classification systems and has been referenced in recent machine learning work, such as fine-tuning the I3D model for sign spotting and translation tasks [2].

Image credit for this article belongs to Flickr user Jeremy Segrott.

[1] Albanie, M., et al. (2021). The Oxford-BBC BSL Dataset: A Large-scale Dataset for British Sign Language Recognition. arXiv preprint arXiv:2106.05449.

[2] Li, Y., et al. (2022). Fine-tuning the I3D Model for Sign Spotting and Translation Tasks. arXiv preprint arXiv:2203.09313.

[3] CVonline: https://www.cv-foundation.org/datasets/oxford-bbc-bsl-dataset-bobsl
