Formal language
In
mathematics,
logic and
computer science, a
formal language is a
set of finite-length words (i.e. character strings) drawn from some finite
alphabet. Note that we can talk about
formal language in many contexts (scientific, legal and so on), meaning a mode of expression more careful and accurate than everyday speech. Use of a particular formal language in the sense intended here is an 'ultimate' version of that usage: formal enough to be used in written form for automatic computation, is a possible criterion.
A typical alphabet would be {
a,
b}, and a typical string over that alphabet would be
- ababba.
A typical language over that alphabet, containing that string, would be the set of all strings which contain the same number of symbols
a and
b.
The
empty word (that is, length-zero string) is allowed and is often denoted by
e, ε or λ. While the alphabet is a finite set and every string has finite length, a language may very well have infinitely many member strings (because the length of words in it may be unbounded).
Some examples of formal languages:
- the set of all words over {a, b};\n* the set { an : n is a prime number };\n* the set of syntactically correct programs in a given programming language; or\n* the set of inputs upon which a certain Turing machine halts.
A formal language can be specified in a great variety of ways, such as:
Several operations can be used to produce new languages from given ones. Suppose
L1 and
L2 are languages over some common alphabet.\n* The
concatenation L1L2 consists of all strings of the form
vw where
v is a string from
L1 and
w is a string from
L2.\n* The
intersection of
L1 and
L2 consists of all strings which are contained in
L1 and also in
L2.\n* The
union of
L1 and
L2 consists of all strings which are contained in
L1 or in
L2.\n* The
complement of the language
L1 consists of all strings over the alphabet which are not contained in
L1.\n* The
right quotient L1/
L2 of
L1 by
L2 consists of all strings
v for which there exists a string
w in
L2 such that
vw is in
L1.\n* The
Kleene star L1* consists of all strings which can be written in the form
w1w2...
wn with strings
wi in
L1 and
n ≥ 0. Note that this includes the empty string ε because
n = 0 is allowed.\n* The
reverse L1R contains the reversed versions of all the strings in
L1.\n* The
shuffle of
L1 and
L2 consists of all strings which can be written in the form
v1w1v2w2...
vnwn where
n ≥ 1 and
v1,...,
vn are strings such that the concatenation
v1...
vn is in
L1 and
w1,...,
wn are strings such that
w1...
wn is in
L2.
A typical question asked about a formal language is how difficult it is to decide whether a given word belongs to the language.\nThis is the domain of
computability theory and
complexity theory.
\n\n\n\n\nzh-cn:形式语言\nzh-tw:形式語言
Category:Formal languages