BanglaLekhaImageCaptions

Name: BanglaLekhaImageCaptions
Published: 2018-09-02
License: Unknown

Dataset Information

Modalities

Images, Texts

Languages

Bengali

Introduced

2018

License

Unknown

Homepage

Official Website

Contents

Overview
Associated Benchmarks
Recent Benchmark Submissions
Research Papers

Overview

This dataset consists of images paired with captions in Bengali. The images were annotated by two adult native Bengali speakers. Popular image captioning datasets are annotated in English and have a predominantly Western cultural bias. Training an image captioning system on such datasets assumes that a good English-to-target-language translation system exists and that the original dataset contains elements of the target culture. Neither assumption holds, which creates the need for a culturally relevant dataset in Bengali that can be used to generate appropriate captions for images relevant to the Bangladeshi and wider subcontinental context. The dataset consists of 9,154 images.

Variants: BanglaLekhaImageCaptions

Associated Benchmarks

This dataset is used in 1 benchmark:

Image Captioning - Metrics: BLEU-1, BLEU-2, BLEU-3, BLEU-4, CIDEr, METEOR, ROUGE-L, SPICE
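
To make the BLEU-1 to BLEU-4 entries above concrete, the following is a minimal sketch of unsmoothed sentence-level BLEU-N (clipped n-gram precision combined with a brevity penalty). It is not the benchmark's evaluation script. The whitespace tokenization, the function names, and the example captions are all assumptions made for illustration; the captions are invented and are not taken from the dataset.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of one candidate against several references."""
    cand_counts = ngrams(candidate, n)
    if not cand_counts:
        return 0.0
    # For each n-gram, keep the highest count seen in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    # A candidate n-gram is credited at most as often as it appears in a reference.
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def sentence_bleu(candidate, references, max_n):
    """Unsmoothed BLEU-N with uniform weights over n = 1..max_n."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # without smoothing, one zero precision zeroes the score
    c = len(candidate)
    # Reference length closest to the candidate length (ties go to the shorter).
    r = min((len(ref) for ref in references), key=lambda l: (abs(l - c), l))
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(sum(math.log(p) for p in precisions) / max_n)


# Invented example captions (assumption: simple whitespace tokenization).
candidate = "একটি ছেলে মাঠে খেলছে".split()
references = [
    "একটি ছেলে মাঠে ফুটবল খেলছে".split(),
    "একজন বালক মাঠে খেলছে".split(),
]

for n in range(1, 5):
    print(f"BLEU-{n}: {sentence_bleu(candidate, references, n):.3f}")
```

Published results normally use corpus-level BLEU from an established toolkit, often with a dedicated tokenizer, so scores from this sketch are not directly comparable to the benchmark numbers.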

Recent Benchmark Submissions

Task	Model	Paper	Date
Image Captioning	CNN + 1D CNN	Improved Bengali Image Captioning via …	2021-02-14

Research Papers

Recent papers with results on this dataset:

Improved Bengali Image Captioning via deep convolutional neural network based encoder-decoder model (2021)

External Links:

BanglaLekhaImageCaptions
