
Project Description

Uplug is a collection of tools for linguistic
corpus processing, word alignment, and term
extraction from parallel corpora. Several tools
have been integrated into Uplug. Pre-processing
tools include a sentence splitter, a tokenizer, and
external part-of-speech taggers and shallow
parsers. The following external tools are used:
the Grok system for English (tagging and chunking)
and the morphological analyzer ChaSen for
Japanese. Other tools, such as the TreeTagger, can
easily be added. Translated documents can be
sentence-aligned using the length-based approach
of Gale & Church. Words and phrases can be aligned
using the clue alignment approach and GIZA++, a
toolbox for training statistical alignment models.
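
To illustrate the length-based idea behind Gale & Church sentence alignment, here is a minimal Python sketch. It is not Uplug's code and does not use Uplug's interfaces: the function names (match_cost, align) are hypothetical, the constants are the values commonly quoted from Gale & Church (1993) and should be treated as assumptions, and the cost function is a simplified version of the published one. Sentence lengths are measured in characters, and a dynamic program picks the cheapest sequence of 1-1, 1-0, 0-1, 2-1, 1-2, and 2-2 groupings.

```python
import math

# Constants commonly quoted from Gale & Church (1993); assumptions in this sketch.
C = 1.0     # expected number of target characters per source character
S2 = 6.8    # variance of that ratio
PRIORS = {  # prior probability of each grouping (source count, target count)
    (1, 1): 0.89,
    (1, 0): 0.0099, (0, 1): 0.0099,
    (2, 1): 0.089, (1, 2): 0.089,
    (2, 2): 0.011,
}


def match_cost(len_src, len_tgt, prior):
    """Negative log-probability of pairing two spans with these character lengths."""
    if len_src == 0 and len_tgt == 0:
        return 0.0
    mean = (len_src + len_tgt / C) / 2.0
    delta = (len_tgt - len_src * C) / math.sqrt(mean * S2)
    # Two-sided tail probability of a standard normal deviate.
    tail = math.erfc(abs(delta) / math.sqrt(2.0))
    # Clamp to avoid log(0) when the lengths differ wildly.
    return -math.log(max(tail, 1e-300)) - math.log(prior)


def align(src, tgt):
    """Return a list of (source_sentences, target_sentences) groupings."""
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            # Try extending the best path ending at (i, j) with each grouping type.
            for (di, dj), prior in PRIORS.items():
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                len_src = sum(len(s) for s in src[i:ni])
                len_tgt = sum(len(t) for t in tgt[j:nj])
                candidate = cost[i][j] + match_cost(len_src, len_tgt, prior)
                if candidate < cost[ni][nj]:
                    cost[ni][nj] = candidate
                    back[ni][nj] = (i, j)
    # Walk the back-pointers from the end to recover the groupings in order.
    beads = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        beads.append((src[pi:i], tgt[pj:j]))
        i, j = pi, pj
    beads.reverse()
    return beads
```

Calling align(source_sentences, target_sentences) on two lists of strings returns the groupings in document order. The sketch ignores paragraph boundaries and uses no lexical information, which is where cognate or dictionary filters would come in.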

System Requirements

System requirements are not defined.
Information regarding project releases and project resources. Note that the information here is quoted from the Freecode.com page, and the downloads themselves may not be hosted on OSDN.

2006-10-13 22:14
0.2.0

Tags: Major feature enhancements
User management for the Web-based alignment interfaces (ICA & ISA), using simple password protection and user-specific data storage. Two new sentence aligners have been integrated into Uplug: hunalign and GMA. Another sentence alignment approach has been added: uplugalign (length-based sentence alignment with cognate/dictionary filters). Quickstart documentation is provided for the new features.

Project Resources