Introduction
Bilingualism is a gradient of experiences that varies significantly across individuals who speak more than one language (DeLuca et al., 2019). This inter-individual variation is evident along several axes between first- (L1) and second-acquired (L2) languages, including proficiency and daily usage, especially in unbalanced bilinguals. As the incidence of acquired brain injury (ABI), e.g., stroke, increases (Katan and Luft, 2018), leading to language impairment in aging bilingual populations, bilingual people with aphasia (BPWA) can be expected to comprise a greater share of caseloads in the coming years (Centeno et al., 2020).
Examining how bilingual language experience influences the performance of BPWA on standardized linguistic and cognitive assessments and on experimental tasks remains an important pursuit, both to address basic research questions about language in the bilingual brain post-ABI and to personalize care and predict individualized outcomes. However, understanding and quantifying language experience in bilingual individuals remains a highly complex undertaking (see Silva-Corvalán and Treffers-Daller, 2015; Köpke and Genevska-Hanke, 2018 for reviews), and it is further complicated in individuals with ABI, for whom language experiences may change following aphasia onset (Peñaloza et al., 2019). Several resources and assessments are often employed to characterize dimensions of language experience (e.g., use, confidence, and proficiency) and address important questions in bilingual research (Francis, 2021). However, there is no clear consensus in the field on how to survey and quantify language experience in healthy bilinguals (HB) and BPWA, suggesting a need for consistency when collecting and reporting these measures across studies (e.g., Kašćelan et al., 2022).
Finally, large-scale recruitment of BPWA is challenging given the resources required to process participants through studies. As Kašćelan et al. (2022) suggest, bilingualism is an intricate construct, and approaches to operationalizing it for research purposes therefore vary. Accordingly, transparency in methods and conceptualization of measures is relatively low, resulting in controversies such as the equivocal presence of a bilingual advantage in executive function (Marian and Hayakawa, 2021). While various datasets exist relating to aphasia (e.g., Mirman et al., 2010; MacWhinney et al., 2011) and multilingualism (e.g., de Bruin et al., 2017) separately, few large datasets represent both populations simultaneously. Furthermore, much of the presently available bilingual data lacks detailed information. For example, Surrain and Luk (2019) reviewed 186 studies published between 2005 and 2015 to examine which features of bilingual experience were reported. While 79% of studies reported general information about language use at home, only a minority (39%) reported it using proportions. de Bruin (2019), reviewing different language history questionnaires and measurement tools, also found a significant lack of definitional congruency for metrics like age of acquisition (AoA), proficiency, and language use, e.g., in what counts as “late” vs. “early” bilingualism.
To that end, the aim of this Data Report is to introduce LEX-BADAT: Language EXperience in Bilinguals with and without Aphasia DATaset. We used the language use questionnaire (LUQ; Kastenbaum et al., 2019), which uses continuous scales and has shown predictive value regarding lexical access in both HB (Kastenbaum et al., 2019) and BPWA (Peñaloza et al., 2019), and explicitly defined its metrics. We provide summaries of the two datasets included, one of 85 BPWA and one of 31 HB. Additionally, given that the data are multidimensional, we present the results of principal component analyses (PCA) on raw LUQ data, both for ease of use in statistical analyses (e.g., component scores for participants not included in the analyses can be generated using the provided loading matrices) and to detect latent variables summarizing bilingual language experience in these two groups. As a result, this is both (i) the largest dataset of language experience in Spanish-English BPWA, and (ii) the largest dataset that includes both HB and BPWA using a shared instrument with directly comparable scales. We believe these data meaningfully inform researchers about the structure of language experience in linguistically heterogeneous populations, primarily in the United States, and, perhaps more importantly, about the underexplored language experience of bilinguals whose access to language is impaired.
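As an illustration of how component scores might be generated for a participant who was not part of the original analyses, the following is a minimal sketch rather than the procedure used for this dataset. It assumes that LUQ items are standardized with the means and standard deviations of the original sample, and that the loading matrix holds component weights with one row per item and one column per component. If the matrix instead holds loadings scaled by the square roots of the eigenvalues, the scores would need to be rescaled accordingly. The function name, variable names, and all numeric values are hypothetical.

```python
import numpy as np

def component_scores(responses, item_means, item_sds, weights):
    """Project LUQ responses onto PCA components (illustrative sketch).

    responses  : (n_participants, n_items) raw item values
    item_means : (n_items,) means from the original sample (assumed available)
    item_sds   : (n_items,) standard deviations from the original sample
    weights    : (n_items, n_components) component weight matrix (assumed layout)
    """
    responses = np.atleast_2d(np.asarray(responses, dtype=float))
    # Standardize with the ORIGINAL sample's statistics, not the new data's.
    z = (responses - np.asarray(item_means)) / np.asarray(item_sds)
    # Each score is a weighted sum of the standardized items.
    return z @ np.asarray(weights)

# Hypothetical toy values for three items and two components.
item_means = np.array([0.5, 0.4, 0.6])
item_sds = np.array([0.2, 0.1, 0.2])
weights = np.array([[1.0, 0.0],
                    [0.0, 1.0],
                    [0.0, 0.0]])

new_participant = [0.7, 0.3, 0.6]
scores = component_scores(new_participant, item_means, item_sds, weights)
print(scores)
```

Standardizing with the original sample's statistics keeps a new participant's scores on the same scale as the published components, which is what makes scores comparable across studies.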