Computational tools in Applied Linguistics

AntConc 3.5.8

AntConc 3.5.8 is a free text-analysis toolkit that can be downloaded from Laurence Anthony's website. It provides a concordance tool, a file viewer, a word list tool and a collocation identifier. Other useful text-analysis tools, such as the TagAnt tagger, the AntCorGen corpus generator and AntFileConverter, can be downloaded from the same website.
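
To illustrate what a concordance tool does, here is a minimal keyword-in-context (KWIC) sketch in Python. It is not AntConc's implementation. The function name kwic, the width parameter and the sample sentence are assumptions made only for this example.

```python
import re

def kwic(text, keyword, width=30):
    """Return one keyword-in-context line per match of `keyword` in `text`.

    Hypothetical helper for illustration; matching is case-insensitive
    and restricted to whole words.
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        # Take up to `width` characters on each side of the match.
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right:<{width}}")
    return lines

sample = "The corpus is small. A corpus can be large. Corpus tools help."
for line in kwic(sample, "corpus", width=15):
    print(line)
```

Each printed line shows one occurrence of the search word with its left and right context aligned, which is the basic display a concordancer offers.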


Research, Resources and Language Information Center (CEPRIL)

Founded in 1983 by members of the Postgraduate Program in Applied Linguistics and Language Studies (LAEL) at PUC-SP, CEPRIL offers a specialized bibliographic collection in Applied Linguistics, databases of authentic language from various contexts and computational tools for data analysis.


English-Corpora.Org is a set of online tools that includes, among other features, an automatic concordancer for finding the frequency of individual words and collocational and colligational patterns, and for searching for words that belong to the same category (e.g., clothes). It also makes it possible to explore corpora such as the Corpus of Contemporary American English (COCA), the British National Corpus (BNC), the Corpus of Historical American English (COHA), the Hansard Corpus, the TIME Magazine Corpus and the more recent Coronavirus Corpus, which aims to record the language speakers used when referring to the 2020 pandemic.


GELC Corpus Collection

This tool was created by members of the Corpus Linguistics Study Group (GELC) to facilitate the collection and creation of corpora composed of texts downloaded from websites. In addition to downloading URLs (whole websites), HTML files (pages) and PDFs, it can convert HTML and PDF files to TXT and encode TXT files.

Download the shell script
Download the GELC Corpus tutorial
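
As a rough illustration of the HTML-to-TXT conversion step described above, here is a minimal Python sketch that uses only the standard library. It is not the GELC shell script. The names TextExtractor, html_to_text and BLOCK_TAGS, and the sample page, are assumptions made for this example.

```python
from html.parser import HTMLParser

# Tags after which a line break is inserted (an illustrative, incomplete set).
BLOCK_TAGS = {"p", "div", "br", "li", "h1", "h2", "h3", "h4", "h5", "h6", "tr"}

class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)

def html_to_text(html):
    """Return the visible text of an HTML string, one block per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts)
    # Collapse runs of whitespace inside each line and drop empty lines.
    lines = (" ".join(line.split()) for line in text.splitlines())
    return "\n".join(line for line in lines if line)

page = (
    "<html><head><style>p{color:red}</style></head>"
    "<body><h1>Title</h1><p>First <b>paragraph</b>.</p></body></html>"
)
print(html_to_text(page))
```

The resulting string could then be written to a .txt file with an explicit encoding, for example open(path, "w", encoding="utf-8"), so that every text in the corpus is stored in the same encoding.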


Kaleidographic is a dynamic, interactive visualization tool that can show relations between multiple variables in your dataset. You can build your own Kaleidographics for free.


Script: Alaryngeal Speech Quality (Praat)

This material is divided into two parts: I. a manual (Portuguese and English versions) and II. the software's sequence of commands (code) in PDF format (the file extension is .praat). The free software, written as a script by Albert Rilliard (Université Paris Saclay, CNRS, LISN) in collaboration with Zuleica Camargo (PUC-SP) and Nathalia dos Reis (ICESP), is distributed under the CeCILL Free Software License Agreement, version 2.1 or later (compatible with the GNU GPL).

RILLIARD, Albert Olivier Blaise; REIS, Nathalia; CAMARGO, Zuleica. Script qualidade de voz alaringea.praat. LISN/PUC-SP/ICESP. Copyright. Version 0.3/2022 (released as free software).

Funding source: Plano de Incentivo à Pesquisa - PUC-SP; PIPEq - Auxílio à Pesquisa.



Download the script

Hate Speech Corpus

Learn about this corpus on GitHub.

Access here
Portal da Ciência Aberta