From: BCC-NER: bidirectional, contextual clues named entity tagger for gene/protein mention recognition
Feature | Example | Feature | Example |
---|---|---|---|
INITCAPS | Albumin | HAS_QUOTE | gstC’ mutans |
ALLCAPS | SGPT | HAS_SLASH | P42/44 |
ENDCAPS | IgA | END_PLUS | HexA+ |
UPPER-LOWER | Serum ACTH | END_QUOTE | C’ |
TWOCAPS | LH | HASDASH | Ap-2 |
THREECAPS | HMG | INITDASH | -beta |
MORECAPS | GGTP | ENDDASH | CD45- |
MIXEDCAPS | EcoRI | 2PREFIX | Fi(fibrin) |
LOWERCASE | calcitonin | 3PREFIX | Fib(fibrin) |
ENDDIGIT | cna1 | 4PREFIX | Fibr(fibrin) |
ALPHANUMERIC | p53 | 2SUFFIX | in(fibrin) |
SINGLECHAR | R | 3SUFFIX | rin(fibrin) |
NUMBERS_LETTERS | UR2 | 4SUFFIX | brin(fibrin) |
HASDIGIT | E6 | HASGREEK | TNF-alpha |
GREEK | Alpha | HASROMAN | factor II |
ROMAN | I,II,IV | PUNCTUATION | (,)., |
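
As a rough illustration of how token-level features like those in the table might be computed, the sketch below maps a single token to a set of feature names. It is not the authors' implementation: the function name `orthographic_features`, the abbreviated `GREEK_NAMES` list, the restriction of Roman numerals to `I`, `V`, and `X`, and the exact rule chosen for each feature (for example, treating TWOCAPS/THREECAPS/MORECAPS as all-capital tokens of length 2, 3, and 4 or more) are assumptions made for this example. Features that depend on multi-token context (UPPER-LOWER, HASROMAN) are omitted.

```python
import re

# Assumed, abbreviated list of Greek letter names; a real tagger would use a fuller lexicon.
GREEK_NAMES = ("alpha", "beta", "gamma", "delta", "epsilon", "kappa")
# Assumption: only small Roman numerals built from I, V, X are recognised.
ROMAN_RE = re.compile(r"[IVX]+")
# Straight and typographic apostrophes both count as quotes.
QUOTES = ("'", "\u2019")


def orthographic_features(token):
    """Return the set of orthographic feature names that fire for one token."""
    feats = set()
    lower = token.lower()
    has_lower = any(c.islower() for c in token)
    has_digit = any(c.isdigit() for c in token)
    has_alpha = any(c.isalpha() for c in token)

    # Capitalisation patterns
    if token[:1].isupper():
        feats.add("INITCAPS")
    if token[-1:].isupper():
        feats.add("ENDCAPS")
    if token.isalpha() and token.isupper():
        feats.add("ALLCAPS")
        if len(token) == 2:
            feats.add("TWOCAPS")
        elif len(token) == 3:
            feats.add("THREECAPS")
        elif len(token) >= 4:
            feats.add("MORECAPS")
    if has_lower and any(c.isupper() for c in token[1:]):
        feats.add("MIXEDCAPS")
    if token.isalpha() and token.islower():
        feats.add("LOWERCASE")
    if len(token) == 1:
        feats.add("SINGLECHAR")

    # Digits
    if has_digit:
        feats.add("HASDIGIT")
    if has_digit and has_alpha:
        feats.add("ALPHANUMERIC")
    if token[-1:].isdigit():
        feats.add("ENDDIGIT")

    # Dashes, slashes, plus signs, quotes
    if "-" in token:
        feats.add("HASDASH")
    if token.startswith("-"):
        feats.add("INITDASH")
    if token.endswith("-"):
        feats.add("ENDDASH")
    if "/" in token:
        feats.add("HAS_SLASH")
    if token.endswith("+"):
        feats.add("END_PLUS")
    if any(q in token for q in QUOTES):
        feats.add("HAS_QUOTE")
    if token.endswith(QUOTES):
        feats.add("END_QUOTE")

    # Greek letter names and Roman numerals
    if lower in GREEK_NAMES:
        feats.add("GREEK")
    if any(name in lower for name in GREEK_NAMES):
        feats.add("HASGREEK")
    if ROMAN_RE.fullmatch(token):
        feats.add("ROMAN")

    # Character n-gram prefixes and suffixes, kept in the token's own case
    for n in (2, 3, 4):
        if len(token) > n:
            feats.add(f"{n}PREFIX={token[:n]}")
            feats.add(f"{n}SUFFIX={token[-n:]}")
    return feats


if __name__ == "__main__":
    for tok in ("Albumin", "p53", "TNF-alpha", "fibrin", "CD45-"):
        print(tok, sorted(orthographic_features(tok)))
```

Because the rules are independent checks, one token can trigger several features at once (for instance, `IgA` fires INITCAPS, ENDCAPS, and MIXEDCAPS under these definitions), which is typical for feature sets fed to a sequence tagger.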