PACTE - Collaborative Electronic Text Annotation Platform

Platform

PACTE is a collaborative web platform for annotating text content that integrates an array of practical tools for research groups. It offers three annotation modes: manual, semi-automatic and automatic. Manual annotation uses an interface optimized for rapid entry of the data that enriches a text. The automatic mode comprises specialized, configurable annotation services (named entity recognition, terminology disambiguation, etc.). Semi-automatic annotation uses active learning algorithms to train a prediction model from a minimal number of annotations, reducing the effort needed to annotate large text corpora. See below for more information on these tools.

The collaborative nature of the PACTE web platform allows researchers to share analyses and annotations with one another, facilitating cooperation and opening the door to large-scale multi-partner studies. PACTE improves productivity by reducing the time spent on analysis while improving consistency.

Manage large text corpora

Create, import and access your corpora.

Manual annotation

Create and modify your annotations.

Define your terminology

Consult an existing lexicon or build your own!

Create whole annotation projects

Assign annotation tasks to team members.

Define custom annotation schemas

Structure the information you need to enrich your documents.

Start annotation services

Launch one of our linguistic, lexical or semantic annotation tools.

Semi-automatic annotation

No service does what you need? Train your own customized annotator!

Bilingual platform

Work in English or in French, depending on team members’ preferences.

Search by annotation

Find relevant documents by annotation type.

Web Services

Many of the automated web annotation services based on CRIM’s advanced analytic algorithms are available in PACTE. They fall into three categories: basic linguistic analysis, lexical data and semantic analysis. As web services, they can also be used independently of PACTE by calling their API. If you are interested, please contact us.

Morphosyntactic analysis

Parse each word’s morphosyntactic function, grammatical gender, number, lemma, etc.

Document profile

Identify a document’s relevant domains through a specialized terminology lexicon.

Terminology disambiguation

Annotate texts with the terms of a lexicon that are directly related to a given field.

Named entity recognition

Isolate expressions designating named entities (people, places, organizations).

Active learning

Train a prediction model to recognize the properties specific to a type of annotation by annotating the occurrences that give the model the most information, so that a corpus can be annotated with minimal effort.
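
The selection step described above can be illustrated with a minimal sketch. It assumes uncertainty sampling, one common active learning strategy; the text does not say which strategy PACTE actually uses, and the function names, item ids and scores below are hypothetical. Given a model's predicted probabilities for unlabeled items, the sketch picks the items whose predictions have the highest entropy, that is, those the model is least sure about.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a binary prediction with positive-class probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def select_for_annotation(predictions, batch_size):
    """Return the ids of the unlabeled items the model is least sure about.

    predictions: dict mapping an item id to the predicted probability
    that the item carries the annotation (hypothetical model output).
    """
    ranked = sorted(
        predictions,
        key=lambda item_id: binary_entropy(predictions[item_id]),
        reverse=True,
    )
    return ranked[:batch_size]


# Hypothetical model scores for five unlabeled text spans.
scores = {
    "span-1": 0.98,
    "span-2": 0.52,
    "span-3": 0.10,
    "span-4": 0.35,
    "span-5": 0.71,
}

to_annotate = select_for_annotation(scores, batch_size=2)
print(to_annotate)  # ['span-2', 'span-4']
```

In a full loop, the annotator would label the selected spans, the model would be retrained on the enlarged labeled set, and the selection would be repeated until the model is good enough.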

About PACTE

PACTE was originally developed by CRIM for a set of research groups, notably the Laboratoire d’ingénierie cognitive et sémantique (LiNCS) and the Centre de recherche et d'expertise Jeunes en difficulté (CCSMTL). The platform, having been developed for general use, is adaptable and can meet a broad range of needs of research communities outside the initial groups.

2016

CRIM receives funding under the CANARIE Research Software Program to develop the PACTE platform.

Feb. 2017

Alpha version available online, with three research groups participating.

Sept. 2017

Presentation of the PACTE platform and the domain profiling annotation service at the "Interoperable Semantic Annotation" and "Language, Ontology, Terminology and Knowledge Structures" workshops of the 12th International Conference on Computational Semantics (IWCS 2017) in Montpellier, France.

Nov. 2017

Presentation of the named entity annotation service at the Text Analysis Conference 2017 in Maryland.

2018+

Be part of our story!