Everything a new team needs
This page summarises the network's shared instruments. Registered teams receive the full pack by email after applying on the Join page; the summaries below let you assess fit before applying.
The three core instruments
Researcher workshop pack
18 printable worksheets: project identity, RQ0–RQ6, H1–H6 with observables, the theoretical framework, the linguistic-prediction module, corpus and annotation protocols, the AI pipeline, the reader study, six weekly day-by-day plans with Friday checkpoints, deliverables checklist, starter bibliography.
COGNILANG Analyzer
A Streamlit app for AI-assisted exploration: textual-index extraction, similarity/divergence comparison, sentence alignment, TRA shift detection, DIS classification, PRD marker identification, and a cross-analysis tab linking features × shifts × prediction to cognitive hypotheses. JSON and PDF export with full Unicode support (right-to-left Arabic included).
Code-book with examples
The frozen LIN/DIS/TRA/PRD grid with one positive and one negative example per code per language, the κ procedure, and the deviation-log template every replication fills in.
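As an illustration of the agreement check behind the κ procedure, here is a minimal sketch assuming two annotators and Cohen's κ; the page does not say which κ variant the code-book prescribes, so treat that choice as an assumption. The code labels in the example are hypothetical placeholders, not entries from the frozen grid. The 0.70 threshold is the one given in the Week 3 row of the six-week cycle.

```python
from collections import Counter

KAPPA_THRESHOLD = 0.70  # freeze criterion from the Week 3 row of the six-week cycle


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who coded the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same code by both annotators.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each annotator's marginal code frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if p_e == 1.0:
        # Both annotators used one identical code throughout; returning 1.0
        # here is a convention chosen for this sketch.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical doubled annotations (placeholder codes "x" and "y").
annotator_1 = ["x", "x", "y", "y"]
annotator_2 = ["x", "x", "y", "x"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(kappa, "freeze" if kappa >= KAPPA_THRESHOLD else "revise grid and re-pilot")
```

For production use, `sklearn.metrics.cohen_kappa_score` computes the same statistic; the hand-written version is shown only to make the observed-versus-chance logic visible.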
Recommended tools per component
Model substitution across languages is expected; the validation logic is what stays constant. Versions of every tool used must be logged.
| Component | Reference tools (pilot) |
|---|---|
| EN/FR processing | spaCy (en_core_web_*, fr_core_news_*); stanza as cross-check. |
| Arabic processing | CAMeL Tools, Farasa, or stanza (ar) — normalisation settings documented. |
| Other languages | stanza covers 70+ languages; for CJK add jieba (Chinese) / fugashi (Japanese); report segmentation agreement on a sample. |
| Embeddings / alignment | sentence-transformers (LaBSE), LASER; faiss for nearest-neighbour pairing. |
| Surprisal (PRD-1) | transformers + minicons with one monolingual causal LM per language; within-language comparisons or normalised profiles only. |
| LLM assistance | Any capable LLM API for frame labelling and shift pre-classification — always benchmarked against the gold subset; prompts versioned. |
| Statistics | pandas, scikit-learn, statsmodels; κ on doubled annotations; notebooks under version control. |
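One way to meet the version-logging requirement is to record installed package versions at the start of every run. The sketch below uses only the standard library. The distribution names in `PILOT_TOOLS` are assumptions derived from the table and may differ from what a team actually installs (faiss, for instance, is commonly installed as `faiss-cpu` or `faiss-gpu`).

```python
import json
import platform
from importlib import metadata

# Assumed distribution names; adjust to the team's actual environment.
PILOT_TOOLS = [
    "spacy", "stanza", "sentence-transformers", "faiss-cpu",
    "transformers", "minicons", "pandas", "scikit-learn", "statsmodels",
]


def collect_versions(packages):
    """Return {name: version or None}, plus the Python interpreter version."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


if __name__ == "__main__":
    # Print a JSON record that can be stored next to the run's outputs.
    print(json.dumps(collect_versions(PILOT_TOOLS), indent=2))
```

Language models and LLM prompts are not covered by this record; their identifiers and versions would need to be added to the log by hand or by a similar helper.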
The six-week cycle
| Week | Focus | Key deliverables |
|---|---|---|
| Week 1 | Framing and architecture | Title, RQ0–RQ6, H1–H6 with observables, definitions (incl. linguistic prediction), outline, abstracts, ethics application submitted. |
| Week 2 | Literature review | Three-block review (cognition · translation · AI) + state of the art on translation as cognition; frozen DIS-1 label set; formatted bibliography. |
| Week 3 | Methodology and corpus | Corpus collected and documented; grid piloted and frozen (κ ≥ 0.70); AI pipeline (incl. surprisal module) validated on a sample. |
| Week 4 | Analysis | Comparative LIN/DIS tables; TRA typology; predictability profiles; shift × surprisal cross-table; results and discussion drafts. |
| Week 5 | Finalisation | Introduction and conclusion; harmonised text; verified citations and bibliography; near-final version. |
| Week 6 | Revision and delivery | Final document (DOCX + PDF); annexes; oral presentation pack; deviation log to the network. |
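The Week 4 shift × surprisal cross-table can be sketched as follows, respecting the constraint from the tools table that surprisal is compared within a language or as a normalised profile only. This is a minimal illustration, not the network's pipeline: the field names, the z-score normalisation, and the split at the language mean are all assumptions, and the values are made up.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev


def surprisal_bits(prob):
    """Surprisal in bits: -log2 of the probability the LM assigned."""
    return -math.log2(prob)


def zscore_within_language(rows):
    """Add a 'z' field: surprisal standardised against the row's own language."""
    by_lang = defaultdict(list)
    for row in rows:
        by_lang[row["lang"]].append(row["surprisal"])
    stats = {lang: (mean(vals), pstdev(vals)) for lang, vals in by_lang.items()}
    out = []
    for row in rows:
        mu, sd = stats[row["lang"]]
        z = 0.0 if sd == 0 else (row["surprisal"] - mu) / sd
        out.append({**row, "z": z})
    return out


def cross_table(rows):
    """Count segments by (shift present, surprisal above own-language mean)."""
    table = defaultdict(int)
    for row in rows:
        table[(row["shift"], row["z"] > 0)] += 1
    return dict(table)


# Made-up segment-level values, for illustration only.
segments = [
    {"lang": "en", "surprisal": 4.0, "shift": False},
    {"lang": "en", "surprisal": 8.0, "shift": True},
    {"lang": "fr", "surprisal": 10.0, "shift": False},
    {"lang": "fr", "surprisal": 14.0, "shift": True},
]
print(cross_table(zscore_within_language(segments)))
```

In the toy data the French segment at 10.0 has a higher raw value than the English segment at 8.0, yet it falls below its own language's mean. That is the reason raw surprisal from different monolingual models should not be pooled before normalisation.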
Starter bibliography, by axis
A — Cognition & comprehension
Kintsch, Comprehension: A Paradigm for Cognition · Sweller et al. on cognitive load theory · interdisciplinary work on cognition and language contact.
B — Cognitive translation studies
Pym on cognitive translation studies · the CTIS programmatic literature ("an evolving research area and a thriving community") · systematic reviews of cognitive approaches to translation · translation-meets-cognitive-science syntheses.
C — News discourse & framing
Entman on framing as a fractured paradigm · corpus-based discourse analysis of translated news (e.g. intensifiers in bilingual news translation) · Bielsa & Bassnett, Translation in Global News.
D — AI / NLP for multilingual news
Multilingual multifaceted understanding of online news · scaling up multilingual news framing analysis · tool papers for the components actually used (cite per team).
E — Linguistic prediction
Kutas & Federmeier on the N400 and anticipatory semantic processing · Pickering & Garrod's integrated theory of production and comprehension · Huettig's four central questions about prediction · Hale and Levy on surprisal and expectation-based comprehension · Chernov, Inference and Anticipation in Simultaneous Interpreting · Amos & Pickering on prediction in simultaneous interpreting · Seeber on cognitive load in SI.