On this page, you can browse the MACAWS team's presentations at various conferences and academic events. To look up the presentations that we gave in a specific year, click on the links provided below.
Vinokurova, V. (2025, February). Different strategies for different proficiencies: Building communicative competence in Russian through pragmatics instruction [Roundtable]. AATSEEL, online.
In her roundtable presentation, Valentina showcased how MACAWS speech-based activities can be used for teaching pragmatics in beginner- and intermediate-level language classrooms.
Gorlova, A., & Vinokurova, V. (2024, February 15 – 17). Teachers’ Perceptions of Interactive DDL Using a Multilingual Learner Corpus. AATSEEL, Las Vegas, NV.
Data-driven learning (DDL) gives students access to authentic linguistic data via corpora and lets them formulate their own generalizations about meanings, rules, and forms, thus fostering inductive learning. In the DDL framework, the instructor's primary role is to support inductive learning while balancing independent analysis and guidance (Granger, 2002). According to prior research, teachers hold favorable attitudes toward DDL, though they also report challenges such as time constraints, technical proficiency, and limited training (Chen et al., 2018). Prior research was primarily conducted in the contexts of ESL and EFL; therefore, there is a need to explore teacher training needs and perceptions of DDL implementation in other languages.
This study addresses this gap by examining Russian language teacher perceptions of implementing DDL activities designed using the Russian section of the Multilingual Academic Corpus of Assignments – Writing and Speech (MACAWS) learner corpus (Staples et al., 2019-). We qualitatively analyze data from implementation surveys (N = 9), instructor workshop surveys (N = 10), and field notes. Initial findings suggest that workshop and survey participants lean towards utilizing pre-made materials over creating or modifying them. Detailed outcomes will explore the challenges in the implementation of activities. The results will be discussed in the context of corpus project evolution, future training needs, technical and pedagogical difficulties highlighted by participants, and strategies for overcoming these challenges, including training materials developed by the MACAWS team.
Vinokurova, V., & Novikov, A. (2023, February). Multiliteracies approach to task design using a learner corpus. AATSEEL 2023 (online).
This presentation demonstrated the task-text-task cycle using the concept of “Available Designs” borrowed from the multiliteracies approach (Cope & Kalantzis, 2009). Applying this framework to our context, the cycle of “Available Designs” starts with a teacher creating a task (which can be located in the MACAWS Repository). Next, learners engage in the process of “Designing”: they respond to the task by designing their texts. The “Redesigned,” or the products of learners’ work, can be located in the MACAWS Corpus and further used to create new tasks for learners. In this way, as per the multiliteracies framework, students’ “Redesigned” becomes someone else’s “Available Designs.” Crucially, by creating tasks on the basis of a learner corpus, teachers can quickly find level-appropriate texts for their students and create powerful and confidence-boosting activities, drawing on the experience of other students.
Staples, S. (2022, September). Expanding the impact of corpus linguistics in the classroom. Plenary talk for American Association of Corpus Linguistics, Northern Arizona University, Flagstaff, AZ.
This talk focuses on renewing and reinvigorating the discussion of the “corpus revolution” in language teaching (Conrad, 2000; Cortes, 2013) and expands the conversation beyond the (English) language teaching classroom to other educational settings. It also argues for increased involvement of in-service teachers and other stakeholders to envision new utilities for teaching with corpora.
We have seen many advances in the impact of corpus linguistics in the past 10 years, including the incorporation of corpus-based materials in new textbook series for English language classrooms, and a proliferation of studies that illustrate the positive impact of using data-driven learning in language teaching classrooms (Reppen and colleagues, 2019; Schmitt & Schmitt, 2011; Boulton & Cobb, 2017). We have also seen an increase in corpus-based materials and activities in graduate-level English for Academic Purposes courses (Charles, 2011, 2014; Cotos, 2014; Swales & Feak, 2012). However, there is still a large gap to fill for ESP courses outside of this setting, particularly for spoken discourse. In addition, most of the research on teaching with corpora has taken place in English. As we introduce corpus-based teaching into new domains, corpus linguists also need to take an active role in bridging the gap between research and practice. While recent work has suggested that DDL and other corpus-based approaches to teaching have been met positively by in-service teachers (Anthony et al., 2019; Schmidt, in press), it is still acknowledged that there is a heavy lift for teachers to incorporate corpus-based instruction into their pedagogy (Crosthwaite et al., 2021; Leńko-Szymańska, 2017; Ma et al., 2021). Using examples from my scholarship, both published and in progress, I illustrate how corpus linguistics can expand its impact in the classroom from where it is today. The talk will focus on incorporating corpus research into three specific teaching contexts: a pronunciation course for medical professionals, first-year undergraduate composition courses for both international and domestic students, and finally first- and second-year language courses for Russian and Portuguese learners. In each case, involving in-service teachers and other stakeholders in the expanded vision for corpus linguistics in classrooms is key.
Staples, S. (2022, July). Learner corpora and data-driven learning: moving toward an asset-based approach. Plenary for Teaching and Language Corpora, University of Limerick, Limerick, Ireland.
The last 20 years have seen an exciting increase in the use of corpora for data-driven learning (DDL), and we have established the benefits of using DDL with students at various levels of language learning and in various instructional settings (Boulton & Cobb, 2017; Crosthwaite, 2020). Among other affordances, corpora have provided learners with methods of autonomous learning and models of authentic language use across a variety of registers and speaker groups. Corpora have also contributed greatly to the larger shift in language teaching away from forms-focused rule-governed instruction to form-focused usage-based instruction that emphasizes functional language use that shifts across speakers and discourse contexts. However, most DDL studies use “native speaker” corpora, in part due to ease of access (see Cotos, 2014 and Lewandowska, 2013 for exceptions). Learner corpora have been used either for indirect DDL or have primarily been seen as a way to address errors in language use. This narrow focus is somewhat surprising given that the field of SLA has been moving towards multilingual models of language learning, drawing on learners’ languages as resources and emphasizing intelligibility and comprehensibility and use of English as a Lingua Franca over adherence to an imagined native speaker norm (Ortega, 2013). Many teachers are also embracing these principles of language teaching and are working to incorporate new approaches that support them into their pedagogy.
This talk will focus on the affordances of learner corpora, particularly to advance asset-oriented models of language learning that promote learner texts as models for other language learners (mentor texts), as sites for discussion of functional language use in and beyond the concordance line, and to address language choices in contexts relevant to learners. Examples will be drawn from the speaker’s work with teachers and learners in a variety of classroom contexts.
Sommer-Farias, B., Centanin Bertho, M., Vinokurova, V., & Staples, S. (2021, November). How to teach writing with learner corpus data. ACTFL 2021 (online).
Writing in a foreign language is considered one of the most difficult skills to teach. This challenge can be associated with two factors: 1) limited access to authentic texts that suit the topics and proficiency levels of FL learners, and 2) the need for more concise descriptions of the linguistic resources that represent content across genres focused on FL instruction. Texts produced by learners and made available through learner corpora are one way to access authentic texts on more relatable topics and at a more appropriate level of vocabulary and grammar. This presentation will share examples of activities created to teach writing to novice and intermediate students in Portuguese and Russian classes using one learner corpus. The presentation will contextualize corpus-based activities created for the topic "travel and tourism" focusing on motion verbs, accusative and prepositional phrases for Russian, and past perfect and expressions for recommendations for Portuguese.
Sommer-Farias, B., & Picoral, A. (2021, August). Teacher perceptions on Open Educational Resources: Interactive DDL & MACAWS. AILA World Congress of Applied Linguistics, Groningen, the Netherlands (online).
Picoral, A., Sommer-Farias, B., Novikov, A., & Staples, S. (2021, May). Interactive Data-Driven Learning (iDDL): A report on creating and using interactive corpus-based activities. CALICO, Seattle, WA, United States (online).
Novikov, A. (2021, March). Using a learner corpus as a medium between unfocused and focused tasks in task-based language teaching. American Association of Teachers of Slavic and East European Languages (AATSEEL) (online).
Novikov, A. (2021, March). Syntactic and morphological complexity measures as markers of L2 Russian development. American Association of Teachers of Slavic and East European Languages (AATSEEL) (online).
Centanin-Bertho, M. (2021, February). Working with learner corpus-data: An introduction to MACAWS. Hispanic and Lusophone Linguistics Working Group Colloquium, University of Arizona (online).
Staples, S. (2021, February). Using corpora for pedagogy and research. Invited talk at Michigan State University (online).
Picoral, A. (2020, July). Copula constructions in a multilingual learner corpus as evidence of L3 development. 14th Teaching and Language Corpora (TaLC), Perpignan, France (online).
Novikov, A. (2020, July). Lexico-grammatical development of Russian learners. Teaching and Language Corpora (TaLC), Perpignan, France (online).
Sommer-Farias, B., Picoral, A., Novikov, A., & Staples, S. (2020, July). Teachers’ Perceptions of Interactive DDL Using a Multilingual Learner Corpus. Teaching and Language Corpora (TaLC), Perpignan, France (online).
Novikov, A. (2020, February). MACAWS Russian: Corpus design and creation of usage-inspired pedagogical materials. American Association of Teachers of Slavic and East European Languages (AATSEEL), San Diego, CA, United States.
This presentation introduced MACAWS (Russian) and showed how it serves as a valuable resource for creating usage-inspired pedagogical materials. The rationale behind using texts from this corpus for creating pedagogical materials is rooted in the principles of ecological validity and situational context that guided the corpus design. More specifically, the texts in the corpus come from student assignments with information about tasks collected from naturally occurring pedagogical settings.
The situational context also proves useful in designing activities based on the premise that situational characteristics of texts are directly linked to their linguistic features (Biber, 1988; Biber et al., 2015). Although the functionality of linguistic features has been thoroughly investigated in research, its pedagogical application is still lacking. Thus, this presentation discussed using learner corpora as a resource for making meaningful connections between grammar and lexis, grammar and culture, and grammar and task. These connections were demonstrated through the presentation of three activities developed with the use of MACAWS.
Picoral, A., & Carvalho, A. (2019, October). The acquisition of preposition+article contractions in L3 Portuguese among different L1-speaking learners: A variationist approach. NWAV 2019, Eugene, OR, United States.
This paper sheds light on differential paths of third language (L3) acquisition of Portuguese by Spanish-English speakers whose first language is Spanish (L1 Spanish), English (L1 English), or both in the case of heritage speakers of Spanish (HL). Specifically, we look at the acquisition of a categorical rule in Portuguese, where some prepositions are invariably contracted with the determiner that follows them. Based on a corpus of 841 written assignments by Portuguese L3 learners, we extracted 10,047 tokens in obligatory contraction contexts. We analyzed the impact of linguistic factors (type of preposition and lexical frequency) and extralinguistic factors (course level, learner’s L1 and L2), with individual as a random factor, using Rbrul (Johnson, 2008). Results point to clear tendencies, albeit with abundant individual differences. L1 English and HL speakers acquire contractions at a higher rate than L1 Spanish speakers, revealing a non-facilitatory role of a cognate L1 in transfer patterns during L3 acquisition.
Sommer-Farias, B. (2019, August). MACAWS e o ensino da escrita: O uso de um corpus de textos de aprendizes de português para a criação de materiais didáticos. VIII Encontro Mundial sobre o Ensino de Português (EMEP), Princeton University, Princeton, NJ, United States.
This study introduces a Portuguese learner corpus, MACAWS (Multilingual Academic Corpus of Assignments Writing and Speech), and demonstrates how the platform can be used to create teaching materials to support writing classes. To date, MACAWS contains texts produced by 193 students in five Portuguese courses at the University of Arizona (UA), covering 41 different tasks and totaling 256,430 tokens, in addition to metadata about each student. Since many University of Arizona students are English-Spanish bilinguals, and some have Spanish and Portuguese as their heritage languages, this platform has the potential to contribute to the development of Portuguese teaching materials in similar contexts. The use of the corpus is exemplified through two form-focused learning activities, which include reviews and travel reports, to demonstrate how to establish links across linguistic resources, textual genres (Boulton, 2009), and levels of proficiency. Portuguese language teaching and the design of pedagogical tasks based on such analyses draw on data-driven learning principles and on the noticing of linguistic patterns (Boulton & Cobb, 2017). The results offer support for the contextualized teaching of linguistic resources, enabling the creation of tasks appropriate to the learners' prior knowledge and the improvement of writing by prompting learners to notice linguistic patterns relevant to the genres of discourse.
Sommer-Farias, B., Novikov, A., Picoral, A., & Staples, S. (2018, October). MACAWS: Multilingual Academic Corpus of Assignments Writing and Speaking. Arizona Corpus Linguistics conference (AZCL), Flagstaff, AZ, United States.
This paper describes the process of building the Multilingual Academic Corpus of Assignments - Writing and Speech (MACAWS) under the mentorship of Dr. Shelley Staples, who has extensive experience with corpus building, such as CROW (Corpus & Repository of Writing) (Kwon, Partridge & Staples, 2018). MACAWS is designed to be an online platform for researchers and instructors to access students’ assignments and pedagogical materials from language programs at the University of Arizona, starting with Portuguese and Russian. Few corpora are built directly from assignments in language classes, especially in Less Commonly Taught Languages (LCTLs) (Spina, 2017), limiting the ways in which instructors and researchers can use corpora in their classes. In addition to a corpus, MACAWS will also include a repository of pedagogical artifacts (e.g., syllabi, lesson plans) associated with the assignments in the corpus. Hence, this project offers new support to SLA and language teaching research for LCTLs through learner corpora, allowing teachers and researchers to investigate a large amount (256,430 tokens for Portuguese writing; 16,496 tokens for Russian writing) of representative learner data (Granger, 2002). The presentation will discuss: 1) the process of collecting assignments and processing files to build the corpus; 2) involvement of instructors at the initial stage of assignment submission and their participation in Fall 2018 workshops on how to use the corpus and corpus-based materials for language teaching. A preliminary set of pedagogical materials for both languages will be discussed.
Picoral, A. (2018, September). Compilation and automated annotation of a Portuguese learner corpus: Challenges and lessons learned. American Association of Corpus Linguistics, Atlanta, GA, United States.