Table 1 Examples of orthographic, morphologic, and prefix-suffix features

From: BCC-NER: bidirectional, contextual clues named entity tagger for gene/protein mention recognition

| Feature | Example | Feature | Example |
| --- | --- | --- | --- |
| INITCAPS | Albumin | HAS_QUOTE | gstC’ mutans |
| ALLCAPS | SGPT | HAS_SLASH | P42/44 |
| ENDCAPS | IgA | END_PLUS | HexA+ |
| UPPER-LOWER | Serum ACTH | END_QUOTE | C’ |
| TWOCAPS | LH | HASDASH | Ap-2 |
| THREECAPS | HMG | INITDASH | -beta |
| MORECAPS | GGTP | ENDDASH | CD45- |
| MIXEDCAPS | EcoRI | 2PREFIX | Fi (fibrin) |
| LOWERCASE | Calcitonin | 3PREFIX | Fib (fibrin) |
| ENDDIGIT | cna1 | 4PREFIX | Fibr (fibrin) |
| ALPHANUMERIC | p53 | 2SUFFIX | in (fibrin) |
| SINGLECHAR | R | 3SUFFIX | rin (fibrin) |
| NUMBERS_LETTERS | UR2 | 4SUFFIX | brin (fibrin) |
| HASDIGIT | E6 | HASGREEK | TNF-alpha |
| GREEK | Alpha | HASROMAN | factor II |
| ROMAN | I,II,IV | PUNCTUATION | (,)., |
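
As an illustration of how features like those in Table 1 might be computed for a single token, here is a minimal Python sketch. The table gives only feature names and examples, so every rule below is an assumption inferred from those examples, not the tagger's actual definition. The function name `token_features` and the short `GREEK_NAMES` list are hypothetical, and only a subset of the features is covered.

```python
# Hypothetical sketch: rules are inferred from the examples in Table 1,
# not taken from the BCC-NER implementation.

# Assumed, deliberately incomplete list of spelled-out Greek letter names.
GREEK_NAMES = ("alpha", "beta", "gamma", "delta", "kappa")


def token_features(token: str) -> dict:
    """Return a dict of orthographic and prefix-suffix features for one token."""
    lower = token.lower()
    feats = {
        "INITCAPS": token[:1].isupper(),                  # e.g. Albumin
        "ALLCAPS": token.isalpha() and token.isupper(),   # e.g. SGPT
        "ENDCAPS": token[-1:].isupper(),                  # e.g. IgA
        "SINGLECHAR": len(token) == 1 and token.isalpha(),  # e.g. R
        "HASDIGIT": any(c.isdigit() for c in token),      # e.g. E6
        "ENDDIGIT": token[-1:].isdigit(),                 # e.g. cna1
        "HASDASH": "-" in token,                          # e.g. Ap-2
        "INITDASH": token.startswith("-"),                # e.g. -beta
        "ENDDASH": token.endswith("-"),                   # e.g. CD45-
        "HAS_SLASH": "/" in token,                        # e.g. P42/44
        "END_PLUS": token.endswith("+"),                  # e.g. HexA+
        "HASGREEK": any(name in lower for name in GREEK_NAMES),  # e.g. TNF-alpha
    }
    # Prefix and suffix features of length 2 to 4, kept as string values
    # and skipped when the token is shorter than the affix length.
    for n in (2, 3, 4):
        if len(token) >= n:
            feats[f"{n}PREFIX"] = token[:n]
            feats[f"{n}SUFFIX"] = token[-n:]
    return feats


if __name__ == "__main__":
    for tok in ("fibrin", "TNF-alpha", "P42/44"):
        print(tok, token_features(tok))
```

The boolean features could be passed to a sequence tagger as binary indicators, while the prefix and suffix strings would typically be treated as categorical values. Features such as MIXEDCAPS or HASROMAN are omitted here because their exact boundaries cannot be recovered from a single example each.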