Corpus Oral y Sonoro del Español Rural

From Dialectsyntax
Latest revision as of 13:45, 18 January 2012

Description

The Audible Corpus of Spoken Rural Spanish (COSER, after its Spanish abbreviation; http://www.lllf.uam.es:8888/coser/) is a dialect corpus based on interviews with the kind of informants who have traditionally been of interest to dialectology: elderly rural native speakers with little formal education. So far, 1,497 informants have been recorded, distributed by sex as follows:

Males 662 (44.2%) Females 835 (55.7%) Total: 1,497
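
The shares above follow directly from the raw counts. The short Python sketch below recomputes them; the function name `sex_distribution` is hypothetical and only illustrates the arithmetic. Note that the exact share of women is about 55.78%, so the 55.7% quoted in the text appears to be truncated rather than rounded.

```python
def sex_distribution(males: int, females: int) -> dict:
    """Return the total and the percentage share of each sex."""
    total = males + females
    return {
        "total": total,
        "males_pct": 100 * males / total,
        "females_pct": 100 * females / total,
    }


# Counts taken from the corpus description above.
dist = sex_distribution(males=662, females=835)
print(dist["total"])                   # 1497
print(round(dist["males_pct"], 2))     # 44.22
print(round(dist["females_pct"], 2))   # 55.78
```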

The informants' average age is 72.9 years. COSER targets informants who were born in the first third of the 20th century and who have received little formal schooling. On the whole, they attended a few years of elementary school, learning, in their own words, "to read and write, and the four rules [of elementary arithmetic]".

The COSER recordings have been collected regularly since 1990 in a series of survey campaigns. The fieldwork has been supported by several research projects and has been carried out as part of the optional subjects "Hispanic Dialectology" (1988-1996) and "The Spoken Spanish: Peninsular Variants" (1996-2011), which belong to the Degree in Hispanic Philology at the Autonomous University of Madrid (UAM).

To date, interviews have been conducted in 801 rural localities in the center and north of the Iberian Peninsula. The ultimate objective is to obtain recordings of the Spanish spoken in rural areas of the whole Iberian Peninsula. The localities surveyed so far appear on the map:


[Map: localities surveyed in COSER (COSER_localities.jpg)]


The audio materials currently cover the central band of the Iberian Peninsula. The density of the survey network is comparable to, or even greater than, that of the regional atlases.

Overall, COSER now comprises about 1,000 hours of recordings, all of which have been digitized. Some have also been transcribed as text files, thanks to the support of several research projects and to the participation of successive generations of UAM students, who transcribed the recordings they had collected themselves as part of their coursework. Forty hours of these recordings and their transcriptions are currently available at www.uam.es/coser; 150 hours are expected to be available in 2012.

Methodology

The methodology used in COSER consists of sociolinguistic interviews that the interviewers steer toward topics of traditional country life. Focusing the interview on these topics does not prevent it from moving, once the informant's confidence has been gained, to other subjects such as education, personal hopes and experiences, life or family, depending on how at ease and spontaneous the informant is. The decision to focus the interview on rural life "of former times" has much to do with the fact that, in order to agree to be interviewed, potential informants must believe that they have some knowledge of a way of life in decline. This knowledge is a product of their own experience and age, and it gives them informative "authority" in the eyes of the urban interviewer. Informants accept the interview when they realize that we are interested in their testimony about a way of life in decline, which very few people still remember and on which they know they are experts. We think that the informants' spontaneous cooperation would be much harder to obtain if they were asked from the outset about personal views or experiences, linguistic matters or other topics beyond rural life. The interviewing team's insistence on its specific interest in the strictly local tradition, as opposed to that of other rural enclaves, and on the informant's unique status as a bearer of that tradition, has on many occasions been decisive in getting the interview accepted.

Informants are always contacted at random, without prior arrangement, among the local inhabitants who meet the requirements mentioned above.
The unrewarding experience of some interviews with informants of low communicative ability (people reluctant to speak, who answered in very short sentences or in monosyllables) later led us to add loquacity ("that the informants like talking") as a condition in the informant selection protocol. As anyone who has carried out fieldwork knows, success is never assured, and interviews that start under the same conditions may turn out excellent or dreadful. Thus, not all interviews are equally suitable or informative: this depends on the informant's willingness, the interviewers' skill and the interaction between them. However, no testimony should be disregarded for that reason.

This methodology cannot avoid the problem of accommodation between informant and interviewer, nor the questionable representativeness of a randomly chosen informant. Nevertheless, we think that the quantity of data makes it possible to circumvent these potential problems, since the data always show geographical coherence and make it possible to discard informants who appear anomalous for their area.

Regarding the number of informants per enclave, COSER has generally preferred to interview a single person in depth, either a man or a woman. Nevertheless, recording conditions have not always made it possible to avoid interruptions from other individuals (generally family members or acquaintances who, drawn by such an extraordinary event as the interview, cannot resist the temptation to take part by giving their own testimony). Thus, although up to 1,497 informants have been recorded in COSER, in most cases only one informant per enclave has actually been surveyed in depth as desired (almost half).

The average duration of the recordings is one hour and fifteen minutes (75 minutes) per enclave, although it ranges from just half an hour to more than two and a half hours. The quality of the recorded data is not directly proportional to duration: some excellent and very informative half-hour recordings yield results comparable to those obtained in a longer session.
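
As a rough consistency check, the average duration can be combined with the 801 localities reported in the Description. The Python sketch below assumes, purely for illustration, one recording of average length per locality; the variable names are hypothetical. Under that assumption the estimate agrees with the figure of about 1,000 recording hours.

```python
# Figures taken from the text; the one-recording-per-locality
# assumption is ours and is used only for this rough estimate.
LOCALITIES_SURVEYED = 801
AVERAGE_MINUTES_PER_LOCALITY = 75

total_minutes = LOCALITIES_SURVEYED * AVERAGE_MINUTES_PER_LOCALITY
estimated_hours = total_minutes / 60

print(total_minutes)    # 60075
print(estimated_hours)  # 1001.25
```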

Utility and limitations

COSER is a corpus designed to measure the differences found in the speech of less-educated sociocultural groups in rural areas. It therefore complements both the linguistic atlases and the various corpora of educated and urban speech that have been compiled, or are planned, in the Spanish-speaking world. The uniform methodology makes it useful for measuring both the linguistic distance between different areas (physical distance) and the linguistic distance between this social group and others, such as speakers of a higher sociocultural level or younger speakers (social distance). Although the proportions of men and women interviewed are not identical (55.7% women vs. 44.2% men), the number of speakers of each gender is statistically representative and also makes it possible to investigate linguistic differences associated with gender.

Because most Spanish oral corpora draw on the media, COSER is somewhat unusual: the speakers interviewed for COSER are rarely recorded by the media. Comparing the data obtained in COSER with those of other corpora of spoken Spanish thus makes it possible to identify clear sociocultural differences. In this regard, COSER has proved especially useful because it enables the study of non-standard grammatical solutions, which are usually systematically avoided in written language and in the speech of more highly educated sociocultural groups. For that reason, Chambers (1995) has proposed, as a sociolinguistic universal, the qualitative character (presence/absence) of grammatical variables along the social scale, in contrast to the quantitative character of phonetic variables.

Research lines and publications

COSER materials have made it possible to investigate several aspects of grammatical variation in Spanish, and the results have been published periodically (http://www.lllf.uam.es/coser/contenido.php?es&publicaciones). The research topics have been the following: accusative/dative clitic alternation, mass neuter, clitic order, subjunctive/indicative variation, double determination, personal infinitives with 3rd person plural subjects, reflexive passives and reflexive impersonals, and analogical verb forms.

Research team

Inés Fernández-Ordóñez Hernández, Project Director and Main Researcher

Enrique Pato Maldonado, Researcher

Javier Rodríguez Molina, Postdoctoral Researcher

Bautista Horcajada Diezma, ICT Developer

Carlota de Benito Moreno, PhD student in Dialectology

Víctor Lara Bermejo, PhD student in Dialectology

Beatriz Martín Izquierdo, Research Assistant

Sara García Motilla, Research Assistant
