Table 1 Examples of orthographic, morphologic, and prefix-suffix features

From: BCC-NER: bidirectional, contextual clues named entity tagger for gene/protein mention recognition

| Feature | Example | Feature | Example |
| --- | --- | --- | --- |
| INITCAPS | Albumin | HAS_QUOTE | gstC’ mutans |
| ALLCAPS | SGPT | HAS_SLASH | P42/44 |
| ENDCAPS | IgA | END_PLUS | HexA+ |
| UPPER-LOWER | Serum ACTH | END_QUOTE | C’ |
| TWOCAPS | LH | HASDASH | Ap-2 |
| THREECAPS | HMG | INITDASH | -beta |
| MORECAPS | GGTP | ENDDASH | CD45- |
| MIXEDCAPS | EcoRI | 2PREFIX | Fi(fibrin) |
| LOWERCASE | Calcitonin | 3PREFIX | Fib(fibrin) |
| ENDDIGIT | cna1 | 4PREFIX | Fibr(fibrin) |
| ALPHANUMERIC | p53 | 2SUFFIX | in(fibrin) |
| SINGLECHAR | R | 3SUFFIX | rin(fibrin) |
| NUMBERS_LETTERS | UR2 | 4SUFFIX | brin(fibrin) |
| HASDIGIT | E6 | HASGREEK | TNF-alpha |
| GREEK | Alpha | HASROMAN | factor II |
| ROMAN | I,II,IV | PUNCTUATION | (,)., |
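
As a rough illustration of how features like those in Table 1 might be computed for a single token, the sketch below implements a subset of them in Python. The table gives only feature names and examples, so every rule here is an assumption inferred from those names and examples, not the authors' definition. The function name `token_features`, the `GREEK_NAMES` list, and the `NAME=value` encoding of prefix and suffix features are all hypothetical.

```python
import re
from typing import Set

# Illustrative subset of Greek letter names (assumption: the source does not list them).
GREEK_NAMES = ("alpha", "beta", "gamma", "delta", "kappa")


def token_features(token: str) -> Set[str]:
    """Return the names of the (assumed) Table 1-style features that fire for a token."""
    feats: Set[str] = set()
    if not token:
        return feats

    # Capitalization patterns (assumed definitions).
    if re.fullmatch(r"[A-Z][a-z]+", token):
        feats.add("INITCAPS")      # e.g. Albumin
    if re.fullmatch(r"[A-Z]{2,}", token):
        feats.add("ALLCAPS")       # e.g. SGPT
    if re.fullmatch(r".*[a-z].*[A-Z]", token):
        feats.add("ENDCAPS")       # contains lowercase, ends in uppercase, e.g. IgA

    # Digit patterns.
    has_digit = any(c.isdigit() for c in token)
    has_alpha = any(c.isalpha() for c in token)
    if has_digit:
        feats.add("HASDIGIT")      # e.g. E6
    if token[-1].isdigit():
        feats.add("ENDDIGIT")      # e.g. cna1
    if has_digit and has_alpha:
        feats.add("ALPHANUMERIC")  # e.g. p53

    # Symbol patterns.
    if len(token) == 1:
        feats.add("SINGLECHAR")    # e.g. R
    if "/" in token:
        feats.add("HAS_SLASH")     # e.g. P42/44
    if "-" in token:
        feats.add("HASDASH")       # e.g. Ap-2
    if token.startswith("-"):
        feats.add("INITDASH")      # e.g. -beta
    if token.endswith("-"):
        feats.add("ENDDASH")       # e.g. CD45-
    if token.endswith("+"):
        feats.add("END_PLUS")      # e.g. HexA+

    # Greek letter names, matched case-insensitively as substrings (a crude heuristic).
    lower = token.lower()
    if lower in GREEK_NAMES:
        feats.add("GREEK")         # e.g. Alpha
    if any(name in lower for name in GREEK_NAMES):
        feats.add("HASGREEK")      # e.g. TNF-alpha

    # Prefixes and suffixes of length 2 to 4, emitted only when shorter than the token.
    for n in (2, 3, 4):
        if len(token) > n:
            feats.add(f"{n}PREFIX={token[:n]}")
            feats.add(f"{n}SUFFIX={token[-n:]}")

    return feats
```

Calling `token_features("fibrin")` yields the affix features `2PREFIX=fi` through `4SUFFIX=brin`, mirroring the prefix and suffix rows of the table. Several features can fire for one token, which is why the sketch returns a set; features that need context beyond a single token (for example UPPER-LOWER in "Serum ACTH") are left out.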