Workflow Examples
NERDA offers a simple, easy-to-use interface for fine-tuning transformers for Named-Entity Recognition (=NER). We call this family of models NERDA models.
NERDA can be used in two ways. You can either (1) train your own customized NERDA model or (2) download and use one of our precooked NERDA models for inference, i.e. identifying named entities in new texts.
Train Your Own NERDA model
We want to fine-tune a transformer for English.
First, we download CoNLL-2003, an English NER dataset with annotated named entities, which we will use for training and evaluation of our model.
from NERDA.datasets import get_conll_data, download_conll_data
download_conll_data()
CoNLL-2003 operates with the following types of named entities:
- PERsons
- ORGanizations
- LOCations
- MISCellaneous
- Outside (Not a named Entity)
An observation from the CoNLL-2003 data set looks like this.
# extract the first 5 rows from the training and validation data splits.
training = get_conll_data('train', 5)
validation = get_conll_data('valid', 5)
# example
sentence = training.get('sentences')[0]
tags = training.get('tags')[0]
print("\n".join(["{}/{}".format(word, tag) for word, tag in zip(sentence, tags)]))
If you provide your own dataset, it must have the same structure:
- It must be a dictionary
- The dictionary must contain
- 'sentences': a list of word-tokenized sentences with one sentence per entry
- 'tags': a list with the corresponding named-entity tags.
The data set does, however, have to follow the Inside-Outside-Beginning (IOB) tagging scheme.
The IOB tagging scheme implies that words at the beginning of named entities are tagged with 'B-' and words 'inside' (=continuations of) named entities are tagged with 'I-'. That means that 'Joe Biden' should be tagged as Joe(B-PER) Biden(I-PER).
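To get a feel for the required structure, a minimal hand-made dataset could look like the sketch below (the sentence and tags are made up purely for illustration):
# a minimal custom dataset with the required structure (illustrative only)
my_dataset = {'sentences': [['Joe', 'Biden', 'visited', 'Copenhagen', '.']],
              'tags': [['B-PER', 'I-PER', 'O', 'B-LOC', 'O']]}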
Now, instantiate a NERDA model for fine-tuning an ELECTRA transformer for NER.
from NERDA.models import NERDA
tag_scheme = ['B-PER',
              'I-PER',
              'B-ORG',
              'I-ORG',
              'B-LOC',
              'I-LOC',
              'B-MISC',
              'I-MISC']
model = NERDA(dataset_training = training,
              dataset_validation = validation,
              tag_scheme = tag_scheme,
              tag_outside = 'O',
              transformer = 'google/electra-small-discriminator',
              hyperparameters = {'epochs' : 1,
                                 'warmup_steps' : 10,
                                 'train_batch_size': 5,
                                 'learning_rate': 0.0001})
Note, this model configuration uses only 5 sentences for model training, and the hyperparameters have been chosen purely to minimize execution time. Therefore this example only serves to illustrate the functionality, i.e. the resulting model will suck.
By default the network architecture is analogous to that of the models in Hvingelby et al. 2020.
The model can be trained right away by invoking the train method.
model.train()
We can compute the performance of the model on a test set (limited to 5 sentences):
test = get_conll_data('test', 5)
model.evaluate_performance(test)
Unsurprisingly, the model sucks in this case due to the ludicrous specification.
Named entities in new texts can be predicted with the predict functions.
text = "Old MacDonald had a farm"
model.predict_text(text)
Needless to say, the predicted entities for this model are nonsensical.
To get a more reasonable model, provide more data and a more meaningful model specification.
In general, NERDA has the following handles that you can use:
- provide your own data set
- choose whatever pretrained transformer you would like to fine-tune
- provide your own set of hyperparameters, and lastly
- provide your own torch network (architecture). You can do this by instantiating a NERDA model with the parameter 'network' set to your own network (torch.nn.Module).
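As a rough sketch of how the first three handles combine (reusing the hand-made my_dataset from above; the transformer name and hyperparameter values here are arbitrary illustrations, not recommendations), a customized model could be instantiated along these lines:
# sketch: your own data, your own choice of transformer, your own hyperparameters
model = NERDA(dataset_training = my_dataset,
              dataset_validation = my_dataset,  # use a proper held-out split in practice
              tag_scheme = tag_scheme,
              tag_outside = 'O',
              transformer = 'bert-base-multilingual-uncased',
              hyperparameters = {'epochs': 4,
                                 'warmup_steps': 500,
                                 'train_batch_size': 16,
                                 'learning_rate': 0.0001})
# the fourth handle, a custom torch.nn.Module, is passed via the 'network'
# parameter; its required interface is not covered in this sketch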
Use a Precooked NERDA model
We have precooked a number of NERDA models that you can download and use right off the shelf.
Here is an example.
Instantiate a NERDA model based on the English ELECTRA transformer that has been fine-tuned for NER in English, EN_ELECTRA_EN.
from NERDA.precooked import EN_ELECTRA_EN
model = EN_ELECTRA_EN()
(Down)load network:
model.download_network()
model.load_network()
This model performs much better:
model.evaluate_performance(get_conll_data('test', 100))
Predict named entities in new texts:
text = 'Old MacDonald had a farm'
model.predict_text(text)
List of Precooked Models
The table below shows the precooked NERDA
models publicly available for download. We have trained models for Danish and English.
Model | Language | Transformer | Dataset | F1-score
---|---|---|---|---
DA_BERT_ML | Danish | Multilingual BERT | DaNE | 82.8
DA_ELECTRA_DA | Danish | Danish ELECTRA | DaNE | 79.8
EN_BERT_ML | English | Multilingual BERT | CoNLL-2003 | 90.4
EN_ELECTRA_EN | English | English ELECTRA | CoNLL-2003 | 89.1
F1-score is the micro-averaged F1-score across entity tags, evaluated on the respective test sets (which have not been used for training or validation of the models).
Note that we have not spent a lot of time on actually fine-tuning the models, so there could be room for improvement. If you are able to improve the models, we will be happy to hear from you and include your NERDA model.
Performance of Precooked Models
The table below summarizes the performance, as measured by F1-scores, of the model configurations that NERDA ships with.
Level | DA_BERT_ML | DA_ELECTRA_DA | EN_BERT_ML | EN_ELECTRA_EN
---|---|---|---|---
B-PER | 93.8 | 92.0 | 96.0 | 95.1
I-PER | 97.8 | 97.1 | 98.5 | 97.9
B-ORG | 69.5 | 66.9 | 88.4 | 86.2
I-ORG | 69.9 | 70.7 | 85.7 | 83.1
B-LOC | 82.5 | 79.0 | 92.3 | 91.1
I-LOC | 31.6 | 44.4 | 83.9 | 80.5
B-MISC | 73.4 | 68.6 | 81.8 | 80.1
I-MISC | 86.1 | 63.6 | 63.4 | 68.4
AVG_MICRO | 82.8 | 79.8 | 90.4 | 89.1
AVG_MACRO | 75.6 | 72.8 | 86.3 | 85.3
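To make the two summary rows concrete: AVG_MACRO corresponds to the unweighted mean of the eight per-level scores above, while AVG_MICRO pools the underlying counts across levels (and is the figure reported in the overview table), so it cannot be recomputed from the per-level F1-scores alone. A quick sanity check for the EN_ELECTRA_EN column:
# macro average = plain mean of the per-level F1-scores (EN_ELECTRA_EN column)
scores = [95.1, 97.9, 86.2, 83.1, 91.1, 80.5, 80.1, 68.4]
print(round(sum(scores) / len(scores), 1))  # 85.3, matching AVG_MACRO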
This concludes our walkthrough of NERDA. If you have any questions, please do not hesitate to contact us!