# NLP Project Resources
## Table of Contents
- [NLP Project Resources](#nlp-project-resources)
  - [Table of Contents](#table-of-contents)
  - [NLP tools](#nlp-tools)
  - [Dataset](#dataset)
  - [Python Framework](#python-framework)
## NLP tools

### CRFsuite
A fast implementation of Conditional Random Fields (CRFs).
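
CRFsuite itself is a C++ tool; a minimal sketch of driving it from Python through the `python-crfsuite` bindings follows. The toy sentence, feature strings, and tags are illustrative, not taken from CRFsuite's documentation:

```python
# Sketch of training and tagging with the python-crfsuite bindings.
import pycrfsuite

# Each token is a list of feature strings; each sentence is a sequence
# of such token feature lists plus one label per token (toy data).
xseq = [["word=John", "istitle=True"], ["word=lives", "istitle=False"],
        ["word=in", "istitle=False"], ["word=Boston", "istitle=True"]]
yseq = ["B-PER", "O", "O", "B-LOC"]

trainer = pycrfsuite.Trainer(verbose=False)
trainer.append(xseq, yseq)                # add one labeled sequence
trainer.set_params({"c1": 0.1, "c2": 0.01, "max_iterations": 50})
trainer.train("toy.crfsuite")             # writes the model to disk

tagger = pycrfsuite.Tagger()
tagger.open("toy.crfsuite")
print(tagger.tag(xseq))                   # e.g. ['B-PER', 'O', 'O', 'B-LOC']
```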
## Dataset

### GLUE
The General Language Understanding Evaluation (GLUE) benchmark is a collection of resources for training, evaluating, and analyzing natural language understanding systems. GLUE consists of:
- A benchmark of nine sentence- or sentence-pair language understanding tasks built on established existing datasets and selected to cover a diverse range of dataset sizes, text genres, and degrees of difficulty,
- A diagnostic dataset designed to evaluate and analyze model performance with respect to a wide range of linguistic phenomena found in natural language, and
- A public leaderboard for tracking performance on the benchmark and a dashboard for visualizing the performance of models on the diagnostic set.
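
GLUE is a benchmark rather than a library, so there is no single official loader; as one hedged illustration, the task data can be pulled through the third-party Hugging Face `datasets` package (a convenience assumed here, not part of GLUE itself):

```python
# Sketch of loading one GLUE task (MRPC) via Hugging Face `datasets`.
from datasets import load_dataset

mrpc = load_dataset("glue", "mrpc")       # splits: train / validation / test
print(mrpc["train"][0])                   # one paraphrase pair with a label
print(mrpc["train"].features["label"])    # label names for the task
```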
## Python Framework

### seqeval
seqeval is a Python framework for sequence labeling evaluation. It can evaluate the performance of chunking tasks such as named-entity recognition, part-of-speech tagging, and semantic role labeling.
seqeval is well tested against the Perl script conlleval, the standard tool for measuring the performance of systems on the CoNLL-2000 shared task data.
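
A minimal sketch of how seqeval's metrics are typically called on IOB2-tagged sequences; the toy tags below are illustrative:

```python
# seqeval scores at the entity level, not the per-token level.
from seqeval.metrics import classification_report, f1_score

y_true = [["B-PER", "I-PER", "O", "B-LOC"], ["O", "B-ORG"]]
y_pred = [["B-PER", "I-PER", "O", "B-LOC"], ["O", "O"]]

# 2 of 2 predicted entities correct, 2 of 3 true entities found -> F1 = 0.8
print(f1_score(y_true, y_pred))
print(classification_report(y_true, y_pred))  # per-type precision/recall/F1
```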