
Space (punctuation)

A space is a punctuation convention for providing interword separation in some scripts, including Latin, Cyrillic, and Arabic. Not all languages use spaces between words; ancient Latin and Greek did not. Spaces were not used to separate words until roughly 600–800 AD. (See interword separation for more on the history.) Traditionally, the CJK languages were written without spaces: modern Chinese and Japanese still are, but modern Korean uses spaces.

Spaces and computers

In programming language syntax, spaces are frequently used to explicitly separate tokens. Aside from this use, spaces and other whitespace are ignored by most modern programming languages; Python, where indentation is significant, is one exception. In word processors and text editors, if a line on the screen is shorter than the width of the screen or window, the empty space to the right usually does not correspond to space characters in the file: there is simply a code indicating that the following text starts on a new line. Thus the file is not made unnecessarily large. If there are trailing space characters, one usually cannot see the difference; text editors and word processors often have an option to make them visible. Also, if there is a space character, the cursor can move there; otherwise it usually cannot.
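The role of spaces as token separators can be illustrated with Python's standard `tokenize` module. This is a minimal sketch, not a description of how any particular compiler works; the helper names `significant_tokens` and `compiles` are invented for this illustration.

```python
import io
import tokenize

# Token types that only describe layout, not the program's content.
LAYOUT = {
    tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
    tokenize.DEDENT, tokenize.COMMENT, tokenize.ENDMARKER,
}

def significant_tokens(source: str) -> list[str]:
    """Return the token strings of `source`, dropping layout-only tokens."""
    readline = io.StringIO(source).readline
    return [tok.string for tok in tokenize.generate_tokens(readline)
            if tok.type not in LAYOUT]

def compiles(source: str) -> bool:
    """Report whether `source` is accepted by the Python compiler."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:  # IndentationError is a subclass of SyntaxError
        return False

# Extra spaces between tokens do not change the token stream...
same = (significant_tokens("total=price*2\n")
        == significant_tokens("total  =  price * 2\n"))

# ...but a space is what keeps two adjacent words apart as separate tokens.
split = significant_tokens("not ready\n")
merged = significant_tokens("notready\n")

# Leading whitespace (indentation) is significant in Python.
indented_ok = compiles("if True:\n    pass\n")
flat_ok = compiles("if True:\npass\n")
```

Here `same` is true, `split` holds two tokens while `merged` holds one, and only the indented variant compiles, which is the sense in which Python is an exception to the "whitespace is ignored" rule.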

Spaces and digital typography

In computer programming, the normal space corresponds to Unicode and ASCII character 32, or U+0020. In HTML and XML, multiple spaces or newline characters collapse into a single space unless they are contained in an HTML element such as <pre>, the xml:space="preserve" XML attribute is used, or CSS sets the white-space property to pre or pre-wrap (pre-line preserves only the line breaks). The special non-breaking space &nbsp; always gives a non-collapsible space character. It should not, however, be used to indent text. Other kinds of spaces exist for special uses: for example, an em dash can optionally be surrounded with a so-called hair space, Unicode character 8202, or U+200A. This space should be much thinner than a normal space and is seldom used on its own. It can be written in HTML using the numeric character reference &#x200A; or &#8202;. Unfortunately, very few user agents are able to render a hair space correctly: in most cases the result is an unwanted symbol or a question mark on the screen (depending on the font).

{| border="1"
|+ Normal space versus hair space
|-
| align="center"|Normal space|| align="center"|left right
|-
| align="center"|Normal space with em dash|| align="center"|left — right
|-
| align="center"|Hair space with em dash|| align="center"|left — right
|-
| align="center"|No space with em dash|| align="center"|left—right
|}

Unicode defines several space characters for fine typography.
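The whitespace collapsing that HTML applies to ordinary text, described earlier in this section, can be approximated in a few lines of Python. This is a simplified sketch that assumes the collapsible characters are space, tab, line feed, carriage return, and form feed; real user agents apply more detailed rules (for example around line starts and ends), and the function name `collapse_whitespace` is made up for this illustration.

```python
import re

# Characters treated as collapsible white space in this sketch:
# space, tab, line feed, carriage return, form feed.
# U+00A0 (no-break space) is deliberately NOT in this set.
_COLLAPSIBLE = re.compile(r"[ \t\n\r\f]+")

def collapse_whitespace(text: str) -> str:
    """Replace each run of collapsible white space with one space."""
    return _COLLAPSIBLE.sub(" ", text)

collapsed = collapse_whitespace("left   \n\t right")
kept = collapse_whitespace("left\u00a0\u00a0right")
```

The explicit character class is used instead of `\s` because, for Python `str` patterns, `\s` also matches the no-break space, which would hide the difference the text describes: `collapsed` becomes `"left right"`, while the two no-break spaces in `kept` survive unchanged.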
Depending on the browser and fonts used to view this table, not all spaces may display properly:

{| border="1"
|+ Space characters defined in Unicode
!Code
!HTML entity
!Name
!In Block
!Display
!Description
|-
|U+0020
|not necessary
|Space
|Basic Latin
| align="center"|] [
|Normal space, same as ASCII character 0x20
|-
|U+00A0
|&nbsp;
|No-Break Space
|Latin-1 Supplement
| align="center"|] [
|Identical to U+0020, but not a point at which a line may be broken
|-
|U+1680
|&#5760;
|Ogham Space Mark
|Ogham
| align="center"|] [
|Used for interword separation in Ogham text. Normally a vertical line in vertical text or a horizontal line in horizontal text, but may also be a blank space in "stemless" fonts. Requires an Ogham font.
|-
|U+2002
|&#8194;
|En Space, or Nut
|General Punctuation
| align="center"|] [
|Width of one en
|-
|U+2003
|&#8195;
|Em Space, or Mutton
|General Punctuation
| align="center"|] [
|Width of one em
|-
|U+2004
|&#8196;
|Three-Per-Em Space, or Thick Space
|General Punctuation
| align="center"|] [
|One third of an em wide
|-
|U+2005
|&#8197;
|Four-Per-Em Space, or Mid Space
|General Punctuation
| align="center"|] [
|One fourth of an em wide
|-
|U+2006
|&#8198;
|Six-Per-Em Space
|General Punctuation
| align="center"|] [
|One sixth of an em wide
|-
|U+2007
|&#8199;
|Figure Space
|General Punctuation
| align="center"|] [
|In fonts with monospaced digits, equal to the width of one digit
|-
|U+2008
|&#8200;
|Punctuation Space
|General Punctuation
| align="center"|] [
|As wide as the narrow punctuation in a font
|-
|U+2009
|&#8201;
|Thin Space
|General Punctuation
| align="center"|] [
|Approximately one fifth to one sixth of an em wide
|-
|U+200A
|&#8202;
|Hair Space
|General Punctuation
| align="center"|] [
|Thinner than a thin space
|-
|U+200B
|&#8203;
|Zero-Width Space
|General Punctuation
| align="center"|]​[
|Used to indicate word boundaries to text processing systems when using scripts that do not use explicit spacing; normally not a visible separation, but it may expand in passages that are fully justified
|-
|U+202F
|&#8239;
|Narrow No-Break Space
|General Punctuation
| align="center"|] [
|Similar to U+00A0 No-Break Space
|-
|U+205F
|&#8287;
|Medium Mathematical Space
|General Punctuation
| align="center"|] [
|Used in mathematical formulae
|-
|U+3000
|&#12288;
|Ideographic Space
|CJK Symbols and Punctuation
| align="center"|]　[
|As wide as a CJK character cell
|}

Unicode also provides some visible characters to stand in for space when necessary in the "Control Pictures" block: the Symbol For Space ␠ (U+2420), the Blank Symbol ␢ (U+2422), and the Open Box ␣ (U+2423).
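The code points in the table can be inspected programmatically. The following sketch uses Python's standard `unicodedata` and `html` modules to look up each character's official name and general category and to decode the numeric references from the "HTML entity" column. The names `SPACE_CODE_POINTS` and `describe` are chosen for this illustration only, and the reported properties depend on the Unicode data bundled with the Python version in use.

```python
import html
import unicodedata

# Code points taken from the table of Unicode space characters.
SPACE_CODE_POINTS = [
    0x0020, 0x00A0, 0x1680, 0x2002, 0x2003, 0x2004, 0x2005, 0x2006,
    0x2007, 0x2008, 0x2009, 0x200A, 0x200B, 0x202F, 0x205F, 0x3000,
]

def describe(code_point: int) -> tuple[str, str, str, bool]:
    """Return (U+XXXX label, Unicode name, general category, str.isspace())."""
    ch = chr(code_point)
    return (
        f"U+{code_point:04X}",
        unicodedata.name(ch),
        unicodedata.category(ch),
        ch.isspace(),
    )

# Print one tab-separated row per space character.
for row in map(describe, SPACE_CODE_POINTS):
    print(*row, sep="\t")

# Decimal and hexadecimal references decode to the same character.
hair_decimal = html.unescape("&#8202;")
hair_hex = html.unescape("&#x200A;")
nbsp = html.unescape("&nbsp;")
```

With recent Unicode data, every entry is reported in category `Zs` (space separator) except U+200B, the zero-width space, which is classified as a format character (`Cf`) and for which `str.isspace()` returns `False`. Code that strips or splits on whitespace may therefore leave zero-width spaces in place.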