How Should Spoken Corpora Be Indexed? (Comment indexer les corpus oraux ?)

Pascal Cordereix

doi:10.1051/hel/2016380208

Issue: HEL Volume 38, Number 2, 2016, Constitution de corpus linguistiques et pérennisation des données
Pages: 101-113
DOI: https://doi.org/10.1051/hel/2016380208
Published online: 1 February 2017

Histoire Épistémologie Langage 38/2 (2016), p. 101-113

Pascal Cordereix

Bibliothèque nationale de France / Laboratoire Ligérien de Linguistique, with the kind review of Michel Jacobson, CNRS / LLL

Abstract

Becoming part of the cultural heritage (patrimonialisation) is now a stage in the life cycle of spoken corpora. The gesture of "setting apart" (Michel de Certeau) that characterizes any entry into an archive entails, in particular, a set of standardized descriptive operations (inventorying, cataloguing, and so on), which in turn make possible the consultation, dissemination, use, and long-term preservation of the corpus. In this article, we present some of the issues underlying the description of sound archives within a heritage institution. We place these issues in a historical perspective, from the first descriptive record sheets of the late nineteenth century to today's conceptual data models of the semantic web and linked data.

Keywords: sound archives / spoken corpora / speech corpora / digital archive / long-term preservation / Bibliothèque nationale de France / metadata / semantic web / linked data