Dataset

All datasets used in our study can be downloaded from here. These include: (i) the manually annotated dataset of classified and linked code comments; (ii) the dataset used to pre-train the T5 model; and (iii) the large-scale dataset built automatically by running SALOON on 10k GitHub repositories, which was used to train STUNT.

Below we report concrete examples of manually classified and linked comments, all belonging to the code summary category (the focus of our paper). In each example, the comment we classified and linked is shown in red, while the statements it documents are highlighted in yellow.

ID	Examples of Manually Classified and Linked Comments