Linguistic Variation

Natural language provides us with freedom to express the same ideas in different ways. Humans are generally capable of understanding the meaning of a natural language expression in spite of such variation; however, the freedom that accompanies natural language makes computerized understanding of the language difficult. In patient reports, a patient's clinical state can be expressed differently due to the linguistic characteristics of derivation, inflection, and synonymy.

Derivation and inflection change the form of a word (the word's morphology) while retaining the underlying meaning of the word. The adjective "mediastinal" can be derived from the noun "mediastinum" by exchanging the suffix -um for the suffix -al. Similar rules can be used to derive the adjective "laryngeal" from the noun "larynx" or to derive the noun "transportation" from the verb "transport."
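The suffix-exchange rules described above can be sketched in a few lines of Python. This is only an illustration: the function name derive and the specific rule pairs (such as -x to -geal) are assumptions made for this example, not rules taken from any particular lexicon or system.

```python
from typing import Optional


def derive(word: str, old_suffix: str, new_suffix: str) -> Optional[str]:
    """Apply one hypothetical suffix-exchange rule.

    Returns the derived form, or None if the word does not end in old_suffix.
    An empty old_suffix means the new suffix is simply appended.
    """
    if not word.endswith(old_suffix):
        return None
    stem = word[: len(word) - len(old_suffix)]
    return stem + new_suffix


# Illustrative rules only; a real system would need many more rules
# plus a list of exceptions.
print(derive("mediastinum", "um", "al"))   # noun -> adjective
print(derive("larynx", "x", "geal"))       # noun -> adjective
print(derive("transport", "", "ation"))    # verb -> noun
```

An NLP application would typically apply such rules in the opposite direction, reducing a derived form found in a report to its base form so that both can be recognized as the same underlying concept.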

Derivation is not the only form of linguistic variation to contend with. Two others are especially important: inflection (which changes a word's form, such as by pluralizing a noun or changing the tense of a verb) and synonymy (in which different words or phrases mean the same thing).

Physicians reading reports are seldom confused by derivation, inflection, or synonymous expressions. An NLP application, by contrast, must account for this variation explicitly. To determine whether a patient has shortness of breath, for example, it must identify "dyspnea," "short of breath," or "dyspneic" as evidence of shortness of breath.
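As a minimal sketch of what accounting for this variation might look like, the Python code below maps surface variants to a single concept using a small hand-built table. The table VARIANTS and the function find_concepts are hypothetical names introduced for this example, and the variant list contains only the expressions mentioned above. A practical system would draw on a much larger lexicon.

```python
import re
from typing import Dict, List, Set

# Hypothetical toy lexicon: one concept and the surface forms that express it.
VARIANTS: Dict[str, List[str]] = {
    "shortness of breath": [
        "shortness of breath",
        "short of breath",
        "dyspnea",
        "dyspneic",
    ],
}


def find_concepts(report: str) -> Set[str]:
    """Return the concepts whose variants appear as whole words in the report."""
    lowered = report.lower()
    found: Set[str] = set()
    for concept, variants in VARIANTS.items():
        for variant in variants:
            # Word boundaries keep a variant from matching inside a longer word.
            if re.search(r"\b" + re.escape(variant) + r"\b", lowered):
                found.add(concept)
                break
    return found


print(find_concepts("Patient became dyspneic on exertion."))
print(find_concepts("She reports feeling short of breath at night."))
```

Both example sentences yield the same concept even though they share no clinical term, which is the point of normalizing variants. The sketch only matches strings; it makes no attempt to interpret the surrounding context.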

