CCGbank

Dataset Information
Introduced: 2005

Overview

CCGbank is a translation of the Penn Treebank into a corpus of Combinatory Categorial Grammar (CCG) derivations. It pairs each syntactic derivation with a set of word-word dependencies that approximate the underlying predicate-argument structure.
The dataset covers 99.44% of the sentences in the Penn Treebank and corrects a number of inconsistencies and errors in the original annotation.

Source: CCGbank
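
To make "derivation" and "word-word dependencies" concrete, here is a minimal sketch of the two CCG application rules on a toy sentence. It is not CCGbank's file format or tooling: the helper names (`strip_outer`, `split_category`, `apply_rule`), the example sentence, and the simplified categories (no features such as `S[dcl]`, no composition or type-raising) are assumptions made purely for illustration.

```python
def strip_outer(cat):
    """Drop one pair of parentheses if it wraps the whole category."""
    if cat.startswith("(") and cat.endswith(")"):
        depth = 0
        for i, ch in enumerate(cat):
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth == 0 and i < len(cat) - 1:
                    return cat  # first "(" closes early: nothing wraps the whole string
        return cat[1:-1]
    return cat


def split_category(cat):
    """Split at the outermost slash: (result, slash, argument), or None if atomic."""
    cat = strip_outer(cat)
    depth = 0
    # Scan right to left so the last top-level slash is used.
    for i in range(len(cat) - 1, -1, -1):
        ch = cat[i]
        if ch == ")":
            depth += 1
        elif ch == "(":
            depth -= 1
        elif ch in "/\\" and depth == 0:
            return strip_outer(cat[:i]), ch, strip_outer(cat[i + 1:])
    return None


def apply_rule(left, right):
    """Combine two (head_word, category) items by forward or backward application.

    Returns ((head_word, category), (functor_word, argument_word)) or None.
    """
    (lw, lc), (rw, rc) = left, right
    parts = split_category(lc)
    if parts and parts[1] == "/" and parts[2] == strip_outer(rc):
        return (lw, parts[0]), (lw, rw)  # forward application: X/Y  Y  =>  X
    parts = split_category(rc)
    if parts and parts[1] == "\\" and parts[2] == strip_outer(lc):
        return (rw, parts[0]), (rw, lw)  # backward application: Y  X\Y  =>  X
    return None


# Toy derivation for "John likes Mary" (hypothetical example, not taken from the corpus).
vp, dep1 = apply_rule(("likes", "(S\\NP)/NP"), ("Mary", "NP"))
sentence, dep2 = apply_rule(("John", "NP"), vp)
dependencies = [dep1, dep2]
```

In this sketch each rule application records a (functor word, argument word) pair, which is one simple way to read predicate-argument dependencies off a derivation. The per-word categories such as `(S\NP)/NP` are the labels that the CCG supertagging task assigns.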

Associated Benchmarks

This dataset is used in 1 benchmark, CCG Supertagging.

Recent Benchmark Submissions

Task | Model | Paper | Date
CCG Supertagging | Heterogeneous Dynamic Convolutions | Geometry-Aware Supertagging with Heterogeneous Dynamic … | 2022-03-23
CCG Supertagging | NeST-CCG + BERT | Supertagging Combinatory Categorial Grammar with … | 2020-10-13
CCG Supertagging | BiLSTM-LAN | Hierarchically-Refined Label Attention Network for … | 2019-08-23
CCG Supertagging | CVT + Multi-task + Large | Semi-Supervised Sequence Modeling with Cross-View … | 2018-09-22

Research Papers

Recent papers with results on this dataset: