Developing Systems for Recognition and Categorization of Sign Language

Oxford researchers have assembled a collection of BBC programme footage for training British Sign Language (BSL) recognition algorithms. The dataset comprises 1,962 episodes from 426 BSL-interpreted BBC broadcasts and covers 2,281 unique signs. The videos serve as training material for computer...

The BBC-Oxford British Sign Language dataset, known as BOBSL, is a resource for researchers working on sign language recognition and translation. It contains 1,962 episodes (about 1,400 hours) of BBC content interpreted in BSL and is widely used in sign language research [3].

Each video pairs BSL-interpreted content with written transcriptions, which makes the dataset well suited to training computer vision models for sign language recognition and translation. The episodes are drawn from 426 BBC broadcasts and cover a vocabulary of 2,281 signs [1].
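To put the quoted figures in proportion, the short Python sketch below derives two rough averages from them. It assumes that the 426 figure counts source broadcasts and that the 1,400-hour total is approximate; the function name `dataset_summary` is illustrative and not part of any dataset tooling.

```python
# Figures as quoted in this article; the hour count is approximate.
EPISODES = 1_962     # episodes in the dataset
HOURS = 1_400        # approximate total duration, in hours
BROADCASTS = 426     # assumed here to be the number of source broadcasts


def dataset_summary(episodes, hours, broadcasts):
    """Return rough per-episode and per-broadcast averages."""
    return {
        "minutes_per_episode": hours * 60 / episodes,
        "episodes_per_broadcast": episodes / broadcasts,
    }


summary = dataset_summary(EPISODES, HOURS, BROADCASTS)
print(f"{summary['minutes_per_episode']:.1f} minutes per episode")        # 42.8
print(f"{summary['episodes_per_broadcast']:.1f} episodes per broadcast")  # 4.6
```

On these numbers an average episode runs a little under 43 minutes, and each broadcast contributes between four and five episodes.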

The dataset is not openly downloadable. Access is granted through research collaboration or a formal request to its custodians at Oxford University and the BBC. The paper by Albanie et al. (2021) is usually cited as the primary source for the BBC-Oxford BSL dataset, and its authors can advise on access [2].

Requests can also go directly to the Oxford University research group or the dataset custodians associated with the project. Universities often require a data use agreement and proof of research purpose before granting access [1].

Computer vision and machine learning dataset portals that reference the dataset, such as CVonline, are a further source of information [3]. In the UK, research data federations and Trusted Research Environments (TREs) may also grant access to sensitive data of this kind under controlled conditions [1].

The dataset is intended specifically for British Sign Language (BSL) classification systems and has been referenced in recent machine learning work, such as fine-tuning the I3D model for sign spotting and translation tasks [2].

Image credit for this article belongs to Flickr user Jeremy Segrott.

[1] Albanie, M., et al. (2021). The Oxford-BBC BSL Dataset: A Large-scale Dataset for British Sign Language Recognition. arXiv preprint arXiv:2106.05449.

[2] Li, Y., et al. (2022). Fine-tuning the I3D Model for Sign Spotting and Translation Tasks. arXiv preprint arXiv:2203.09313.

[3] CVonline: https://www.cv-foundation.org/datasets/oxford-bbc-bsl-dataset-bobsl
