How2Sign Dataset
Continuous American Sign Language · Multiview videos · Depth data · 2D & 3D skeletons · Gloss annotations · English translations

First large-scale multimodal and multiview continuous American Sign Language dataset

Download Sample · Download Full Dataset

ABOUT

We introduce How2Sign, a multimodal and multiview continuous American Sign Language (ASL) dataset, consisting of a parallel corpus of more than 80 hours of sign language videos and a set of corresponding modalities including speech, English transcripts, and depth.
A three-hour subset was further recorded in the Panoptic studio, enabling detailed 3D pose estimation.
This dataset is publicly available for research purposes only.

Download

This section is under construction.
We will soon release the other modalities, along with scripts to download them easily!
The dataset is publicly available for research purposes only.

Download the videos, annotations and metadata separately

Green Screen RGB videos (frontal view)
Green Screen RGB videos (side view)
Green Screen RGB clips* (frontal view)
Green Screen RGB clips* (side view)
Body-Face-Hands (B-F-H) 2D Keypoints clips* (frontal view)
Body-Face-Hands (B-F-H) 2D Keypoints clips* (side view)
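The 2D keypoint clips cover the body, face, and hands of the signer. As a rough illustration of how such per-frame keypoint files could be loaded, here is a minimal sketch that assumes an OpenPose-style JSON layout (flat `[x, y, confidence, ...]` lists under a `people` entry); the field names and file layout in the actual How2Sign release may differ:

```python
import json
import numpy as np

# OpenPose-style keypoint groups: body, face, left hand, right hand.
KEYPOINT_GROUPS = (
    "pose_keypoints_2d",
    "face_keypoints_2d",
    "hand_left_keypoints_2d",
    "hand_right_keypoints_2d",
)

def load_keypoint_frame(path):
    """Parse one per-frame JSON file into arrays of (x, y, confidence) rows.

    Assumes the flat [x1, y1, c1, x2, y2, c2, ...] layout that OpenPose
    writes; adapt the keys to the released files if they differ.
    """
    with open(path) as f:
        frame = json.load(f)
    person = frame["people"][0]  # one signer per video
    keypoints = {}
    for group in KEYPOINT_GROUPS:
        flat = np.asarray(person[group], dtype=np.float32)
        keypoints[group] = flat.reshape(-1, 3)  # rows of (x, y, confidence)
    return keypoints
```

Low-confidence keypoints can then be filtered by thresholding the third column before feeding the skeletons to a model.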

Copyright

The dataset on this webpage is copyrighted by us and published under the Creative Commons Attribution-NonCommercial 4.0 International License. This means that you must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. You may NOT use the material for commercial purposes.

Disclaimer

The How2Sign dataset was collected as a tool for research. However, it is worth noting that the dataset may have unintended biases (including those of a societal, gender, or racial nature). For more information about the biases the dataset might present, please refer to the published paper.

Green Screen RGB clips*

The Green Screen RGB clips were segmented using the original timestamps from the How2 dataset. Each clip corresponds to one sentence of the English translation. Note that the alignment between the ASL video and the English translation may not be perfect, due to differences between the two languages.
A manual re-alignment of the English translation with the ASL videos will be released soon!
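The segmentation step described above can be reproduced for custom timestamps with ffmpeg. The sketch below builds and runs stream-copy cut commands; the CSV column names (`clip_id`, `start`, `end`, in seconds) are hypothetical and should be adapted to the released metadata files:

```python
import csv
import subprocess

def ffmpeg_cut_cmd(video_path, start, end, out_path):
    """Build an ffmpeg command that copies the [start, end] span (in
    seconds) of `video_path` into `out_path` without re-encoding."""
    return ["ffmpeg", "-y",
            "-ss", str(start), "-to", str(end),  # sentence boundaries
            "-i", video_path,
            "-c", "copy",                        # stream copy, no re-encode
            out_path]

def cut_sentence_clips(video_path, timestamps_csv, out_dir):
    """Cut one full video into per-sentence clips.

    Assumes `timestamps_csv` has columns clip_id, start, end; the real
    How2Sign metadata layout may differ.
    """
    with open(timestamps_csv, newline="") as f:
        for row in csv.DictReader(f):
            cmd = ffmpeg_cut_cmd(video_path, row["start"], row["end"],
                                 f"{out_dir}/{row['clip_id']}.mp4")
            subprocess.run(cmd, check=True)
```

Stream copy (`-c copy`) is fast because no re-encoding happens, but cuts land on the nearest keyframe; re-encoding gives frame-accurate boundaries at the cost of speed.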

How2 Data*

For copyright reasons, we are not allowed to redistribute the How2 data. Please refer to the original repository to request and download the full How2 dataset if needed.
Note that How2Sign follows the original train/validation/test splits of the How2 dataset.

Reporting Issues

If you have any problems with the data, please let us know by creating a New Issue on the GitHub repo.

Publication

How2Sign: A Large-scale Multimodal Dataset for Continuous American Sign Language

Amanda Duarte, Shruti Palaskar, Lucas Ventura, Deepti Ghadiyaram, Kenneth DeHaan, Florian Metze, Jordi Torres, and Xavier GirĂ³-i-Nieto
CVPR, 2021
[PDF] [1-minute video] [Poster]

When using the How2Sign Dataset please reference:

@inproceedings{Duarte_CVPR2021,
    title={{How2Sign: A Large-scale Multimodal Dataset for Continuous American Sign Language}},
    author={Duarte, Amanda and Palaskar, Shruti and Ventura, Lucas and Ghadiyaram, Deepti and DeHaan, Kenneth and
                   Metze, Florian and Torres, Jordi and Giro-i-Nieto, Xavier},
    booktitle={Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2021}
}
  

Video Summary



Developed by



Supported by