KOI8-R
| Alias(es) | cp878 (code page 878) |
|---|---|
| Languages | Russian, Bulgarian |
| Classification | 8-bit KOI, extended ASCII |
| Extends | KOI8-B |
| Based on | KOI-8 |
| Other related encodings | KOI8-U, KOI8-RU |
KOI8-R (RFC 1489) is an 8-bit character encoding designed to cover Russian, which uses a subset of the Cyrillic script. It was derived from the KOI-8 encoding by the programmer Andrei Chernov in 1993. KOI-8, in turn, is an 8-bit extension of the KOI-7 encoding, which inherited a phonetic correspondence between Russian and Latin letters from the MTK-2 teletype code. As a result, the Russian Cyrillic letters in KOI8-R follow a pseudo-Latin alphabetical order rather than the normal Cyrillic order used in, for example, ISO 8859-5. Although this may seem unnatural, it has a useful effect: if the 8th bit is stripped, the text remains partially readable in any ASCII-based encoding (including KOI8-R itself) as a case-reversed transliteration. For example, "Код для обмена и обработки информации" (the Russian expansion of the "KOI" acronym) becomes kOD DLQ OBMENA I OBRABOTKI INFORMACII.
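The bit-stripping effect described above can be reproduced in a few lines. The following is a minimal sketch, assuming Python's standard `koi8_r` codec; the helper name `strip_high_bit` is hypothetical and chosen only for illustration.

```python
def strip_high_bit(text: str) -> str:
    """Encode text as KOI8-R, clear the 8th bit of every byte, read the result as ASCII."""
    koi8_bytes = text.encode("koi8_r")               # standard-library codec for KOI8-R
    stripped = bytes(b & 0x7F for b in koi8_bytes)   # keep only the low 7 bits
    return stripped.decode("ascii")


print(strip_high_bit("Код для обмена и обработки информации"))
# prints: kOD DLQ OBMENA I OBRABOTKI INFORMACII
```

The case reversal comes from the layout of the table: lowercase Cyrillic letters sit in rows Cx and Dx, which map onto the uppercase Latin rows 4x and 5x once the high bit is cleared, while uppercase Cyrillic letters (rows Ex and Fx) map onto the lowercase Latin rows 6x and 7x.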
KOI-8 stands for 8-bitnyy kod dlya obmena i obrabotki informatsii (Russian: 8-битный код для обмена и обработки информации), which means "8-Bit Code for Information Interchange".[1] Microsoft Windows assigns KOI8-R the code page number 20866; IBM assigns it code page 878.[2][3] KOI8-R also happens to cover Bulgarian.
KOI8-R lacks the proper quotation marks for Russian and Bulgarian: both «...» and the Bulgarian „...“. Windows-1251 supports these, as well as more letters, and has therefore become more popular. KOI8-R is used by less than 0.004% of websites, mostly Russian and Bulgarian.[citation needed] In modern applications, Unicode and UTF-8 are preferred to single-byte Cyrillic encodings; Unicode contains 436 Cyrillic letters, including those for Old Cyrillic.
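The gap in quotation-mark coverage can be checked directly. This is an illustrative sketch, assuming Python's standard `koi8_r` and `cp1251` codecs; `can_encode` is a hypothetical helper, not part of any library.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


sample = "«Привет» „Здравей“"   # Russian and Bulgarian quotation styles
for name in ("koi8_r", "cp1251"):
    print(name, can_encode(sample, name))
# prints:
# koi8_r False
# cp1251 True
```

The letters themselves encode in both cases; only the quotation marks make the KOI8-R attempt fail.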
Character set
The following table shows the KOI8-R encoding. Each character is shown with its equivalent Unicode code point.
| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0x | | | | | | | | | | | | | | | | |
| 1x | | | | | | | | | | | | | | | | |
| 2x | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / |
| 3x | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
| 4x | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
| 5x | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ |
| 6x | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o |
| 7x | p | q | r | s | t | u | v | w | x | y | z | { | \| | } | ~ | |
| 8x | ─ 2500 | │ 2502 | ┌ 250C | ┐ 2510 | └ 2514 | ┘ 2518 | ├ 251C | ┤ 2524 | ┬ 252C | ┴ 2534 | ┼ 253C | ▀ 2580 | ▄ 2584 | █ 2588 | ▌ 258C | ▐ 2590 |
| 9x | ░ 2591 | ▒ 2592 | ▓ 2593 | ⌠ 2320 | ■ 25A0 | ∙ 2219 | √ 221A | ≈ 2248 | ≤ 2264 | ≥ 2265 | NBSP 00A0 | ⌡ 2321 | ° 00B0 | ² 00B2 | · 00B7 | ÷ 00F7 |
| Ax | ═ 2550 | ║ 2551 | ╒ 2552 | ё 0451 | ╓ 2553 | ╔ 2554 | ╕ 2555 | ╖ 2556 | ╗ 2557 | ╘ 2558 | ╙ 2559 | ╚ 255A | ╛ 255B | ╜ 255C | ╝ 255D | ╞ 255E |
| Bx | ╟ 255F | ╠ 2560 | ╡ 2561 | Ё 0401 | ╢ 2562 | ╣ 2563 | ╤ 2564 | ╥ 2565 | ╦ 2566 | ╧ 2567 | ╨ 2568 | ╩ 2569 | ╪ 256A | ╫ 256B | ╬ 256C | © 00A9 |
| Cx | ю 044E | а 0430 | б 0431 | ц 0446 | д 0434 | е 0435 | ф 0444 | г 0433 | х 0445 | и 0438 | й 0439 | к 043A | л 043B | м 043C | н 043D | о 043E |
| Dx | п 043F | я 044F | р 0440 | с 0441 | т 0442 | у 0443 | ж 0436 | в 0432 | ь 044C | ы 044B | з 0437 | ш 0448 | э 044D | щ 0449 | ч 0447 | ъ 044A |
| Ex | Ю 042E | А 0410 | Б 0411 | Ц 0426 | Д 0414 | Е 0415 | Ф 0424 | Г 0413 | Х 0425 | И 0418 | Й 0419 | К 041A | Л 041B | М 041C | Н 041D | О 041E |
| Fx | П 041F | Я 042F | Р 0420 | С 0421 | Т 0422 | У 0423 | Ж 0416 | В 0412 | Ь 042C | Ы 042B | З 0417 | Ш 0428 | Э 042D | Щ 0429 | Ч 0427 | Ъ 042A |
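Individual entries of the table can be spot-checked programmatically. The sketch below assumes Python's standard `koi8_r` codec agrees with the table above; `koi8r_codepoint` is a hypothetical helper name.

```python
def koi8r_codepoint(byte_value: int) -> str:
    """Decode a single KOI8-R byte and format its Unicode code point as U+XXXX."""
    char = bytes([byte_value]).decode("koi8_r")
    return f"U+{ord(char):04X}"


# A few sample bytes from rows 8x, 9x, Ax, Cx and Fx of the table
for b in (0x80, 0x9A, 0xA3, 0xC0, 0xFF):
    print(f"0x{b:02X} -> {koi8r_codepoint(b)}")
# prints:
# 0x80 -> U+2500
# 0x9A -> U+00A0
# 0xA3 -> U+0451
# 0xC0 -> U+044E
# 0xFF -> U+042A
```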
See also
- KOI8-B, a derivation of KOI8-R with only the letter subset implemented
- KOI8-U, another derivative encoding which adds Ukrainian characters
- KOI character encodings
- RELCOM
- Windows-1251, another common Cyrillic character encoding
References
- ^ (in Russian) ГОСТ 19768-74 (СТ СЭВ 358-76). Машины вычислительные и система обработки данных. Коды 8-битные для обмена и обработки информации.
- ^ "SBCS code page information - CPGID: 00878 / Name: Russian internet koi8-r". IBM Software: Globalization: Coded character sets and related resources: Code pages by CPGID: Code page identifiers. IBM. C-H 3-3220-050. Archived from the original on 2017-02-18. Retrieved 2017-02-18.
- ^ "CCSID information document; CCSID 878; KOI8-R CYRILLIC". IBM. Retrieved 2017-02-18.
- ^ Richter, Helmut (2016-01-04) [1999-08-18]. "KOI8-R.TXT". 2.0. Retrieved 2016-12-09.
- ^ Code Page CPGID 00878 (PDF), IBM
- ^ Code Page CPGID 00878 (txt), IBM
- ^ International Components for Unicode (ICU), ibm-878_P100-1996.ucm, 2002-12-03
Further reading
- Flohr, Guido; Kiss, Gabor; Chernov, Andrey A. (2016) [2006]. "Locale::RecodeData::KOI8_R - Conversion routines for KOI8-R". CPAN libintl-perl. 1.0. Archived from the original on 2017-01-15. Retrieved 2017-01-15.
- Kostis, Kosta. "koi8-r (Russian U*IX encoding, also used by RELCOM)". 1.20. Archived from the original on 2017-01-16. Retrieved 2017-01-16.
- RFC 1489
- "KOI8-R (RFC 1489)". Kermit. Columbia University. Retrieved 2020-06-24.
- Kornai, Andras; Birnbaum, David J.; da Cruz, Frank; Davis, Bur; Fowler, George; Paine, Richard B.; Paperno, Slava; Simonsen, Keld J.; Thobe, Glenn E.; Vulis, Dimitri; van Wingen, Johan W. (1993-03-13). "CYRILLIC ENCODING FAQ Version 1.3". 1.3. Retrieved 2020-06-24.
External links
- Universal Cyrillic decoder, an online program that may help recovering Cyrillic texts with broken KOI8-R or other character encodings.
- "The Home of the KOI8-R since 1995". 1995. Retrieved 2016-12-05.
- Czyborra, Roman (1998-11-30) [1998-05-25]. "The Cyrillic Charset Soup". Archived from the original on 2016-12-03. Retrieved 2016-12-03.
- Hohlov, Yu. E. "Cyrillic Information Representation in Electronic Form - Character Set (Code Page) Tables". Archived from the original on 2016-12-05. Retrieved 2016-12-05.
- Nechayev, Valentin (2013) [2001]. "Review of 8-bit Cyrillic encodings universe". Archived from the original on 2016-12-05. Retrieved 2016-12-05.