
  KOI8-R - Russian Net Character Set  

[Digging...] This page is under permanent construction... Visit again.

This page won a prize.

DISCLAIMER: All material here is a result of my personal independent research and other people's contributions; any company I am with is not responsible for it.

Ache.


This page at http://www.nagual.ru/~ache/koi8.html (original site in Moscow, Russia) is mirrored; use a nearby mirror for faster access.

The following mirrors are updated daily (I hope):

If you have problems with your DNS and can't log in to ftp.relcom.ru as a result, try using the WWW Kiarchive server (a WWW interface to the FTP archive) instead.




What is KOI8-R?

KOI8-R is a living de facto standard for Internet Mail/News exchange, WWW browsing and other interactive services in Russian, spread at least throughout the whole ex-SU territory. It was designed for the Russian and English languages only and covers only Russian Cyrillic characters, so if you are seeking Ukrainian, Belorussian, etc. Cyrillic characters, try ISO-IR-111 from the ECMA registry instead; it matches KOI8-R in the common (letters) area.

Main KOI8-R standard documents:


KOI8-R Visualized (browsers checking)

Upper half of KOI8-R code table: 80h - FFh
[Charset Picture]

You can check how well your browser supports KOI8-R. The two tables below must match the table above.
Your Fixed Font: 80h - FFh
[Table: rows 8-F, columns 0-F]
Your Proportional Font: 80h - FFh
[Table: rows 8-F, columns 0-F]
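
For reference, the upper half of the code table can be regenerated programmatically. The following is only a sketch, assuming that Python's built-in koi8_r codec is an acceptable stand-in for the code table pictured above; the function names koi8r_upper_half and render are illustrative and not part of any tool mentioned on this page.

```python
# Sketch: rebuild the 80h-FFh half of the KOI8-R table.
# Assumption: Python's built-in "koi8_r" codec matches the table on this page.

def koi8r_upper_half():
    """Return {row_label: [16 characters]} for bytes 0x80-0xFF."""
    table = {}
    for high in range(0x8, 0x10):
        # One row holds the 16 bytes that share the same high nibble.
        row = bytes(range(high * 16, high * 16 + 16))
        table[format(high, "X")] = list(row.decode("koi8_r"))
    return table


def render(table):
    """Format the table as text with a 0-F column header and 8-F row labels."""
    lines = ["  " + " ".join("0123456789ABCDEF")]
    for label, chars in table.items():
        lines.append(label + " " + " ".join(chars))
    return "\n".join(lines)
```

Calling print(render(koi8r_upper_half())) on a Unicode-capable terminal shows the grid, which can then be compared by eye with what the browser displays.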

HTML Special Characters
©Copyright sign
 Non-breaking space
®Registered sign
­Soft hyphen
Trade mark sign
List bullets

Check that your browser displays HTML special characters (symbolic names) using the KOI8-R encoding and not the ISO8859-1 encoding. If you see wrong characters in this table and your font is true KOI8-R, report this bug to your browser's development team.

Form input test: button names must be in KOI8-R. If your browser offers to download a file instead of displaying the page, it can't handle charset= in the HTTP header.

Enter Russian (KOI8-R) text here and press (1st) button:

Standard Russian keyboard layout, except for the Ё/ё letters (on the ~/` key) and the special characters from the upper key row.
[Keyboard Picture]


How to create Russian HTML documents in KOI8-R

Check the variables your browser passes to HTTPD using this CGI Test Script.

If your browser offers to download a file instead of displaying the page, it can't handle charset= in the HTTP header. If your browser is properly configured for the Russian language (using standards), you'll have KOI8-R in the HTTP_ACCEPT_CHARSET field. For example:

HTTP_ACCEPT_CHARSET = KOI8-R, ISO-8859-1; q=0.1
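
A CGI script can make the same check itself. This is a minimal sketch in Python, assuming the variable arrives in the process environment under the name HTTP_ACCEPT_CHARSET as shown above; the function name accepts_koi8r is hypothetical.

```python
import os


def accepts_koi8r(environ=os.environ):
    """Return True if KOI8-R is listed in HTTP_ACCEPT_CHARSET.

    Quality parameters (q=...) are ignored here; only the charset
    names are compared, case-insensitively.
    """
    value = environ.get("HTTP_ACCEPT_CHARSET", "")
    names = [item.split(";")[0].strip().lower() for item in value.split(",")]
    return "koi8-r" in names
```

The environ parameter defaults to the real environment but can be replaced by a plain dictionary for testing.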

Don't forget to put the ACCEPT-CHARSET="KOI8-R, US-ASCII" attribute into your <FORM> tag. The general syntax of this attribute is the same as in the HTTP Accept-Charset header field (see below) but without any q= quality parameters. This attribute affects all <INPUT> elements of the <FORM>. If you want a different charset for each <INPUT> element, you must use an ENCTYPE=multipart/form-data form; check Form-based File Upload in HTML (RFC 1867) for more info.

See also:

Document only method, no server modification required

You need to insert the following statement into the <HEAD> section of your document (as early as possible):

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=KOI8-R">

This method assumes that the client understands the HTML 3.0 language and charset specifications. It also assumes that the client understands the KOI8-R charset, i.e. the documents have a fixed charset in this case and no on-the-fly document encoding conversion is possible.
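
As an illustration of the document-only method, here is a sketch in Python that builds a small page with the META statement placed first in <HEAD> and encodes it as KOI8-R bytes. The function name build_document and the page skeleton are assumptions made for this example; only the META line itself comes from the text above.

```python
# The META statement from the text above, placed as early as possible in <HEAD>.
META = '<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=KOI8-R">'


def build_document(title, body):
    """Return a complete HTML page as KOI8-R encoded bytes.

    Hypothetical helper: the surrounding page skeleton is illustrative.
    """
    html = (
        "<HTML>\n<HEAD>\n" + META + "\n"
        "<TITLE>" + title + "</TITLE>\n</HEAD>\n"
        "<BODY>\n" + body + "\n</BODY>\n</HTML>\n"
    )
    # The charset is fixed: the bytes written to disk are already KOI8-R.
    return html.encode("koi8_r")
```

The result can be written to a file in binary mode, for example open("index.html", "wb").write(build_document("Тест", "Привет")).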

Method which requires HTTP daemon actions

According to the Hypertext Transfer Protocol -- HTTP/1.1 IETF Draft 06, a client may request the document character set by using the Accept-Charset header field. The example

Accept-Charset: koi8-r, windows-1251; q=0.8

means that the client knows about the koi8-r and windows-1251 character sets besides the default iso-8859-1, which any client must understand. If no quality parameter is given, a value of 1.0 is assumed (as for the koi8-r charset in this example). Charsets with higher quality values are preferred.

If no Accept-Charset field is given, any character set is acceptable. In this case you can't tell the server that you use the KOI8-R charset, and it may feed you, say, CP1251.
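
The quality rules described above can be sketched as a small parser. This is an illustration in Python, not code from any server; the function name parse_accept_charset and the choice to treat a malformed q value as 0.0 are assumptions.

```python
def parse_accept_charset(header):
    """Parse an Accept-Charset value into [(charset, q)], best first.

    A charset without a q parameter gets quality 1.0.
    """
    prefs = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        q = 1.0
        for param in params.split(";"):
            key, _, value = param.partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(value.strip())
                except ValueError:
                    q = 0.0  # assumption: a malformed q means "not acceptable"
        prefs.append((name.strip().lower(), q))
    # list.sort is stable, so charsets with equal q keep their header order.
    prefs.sort(key=lambda pair: pair[1], reverse=True)
    return prefs
```

For the header in the example above this yields koi8-r with quality 1.0 first, followed by windows-1251 with quality 0.8.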

The server uses the Content-Type response field to inform the client about the document's settings. For example:

Content-Type: text/html; charset=koi8-r

Apache HTTPD Tuning for Several Character Sets

I run Apache HTTPD on this site; you can check its status to see it at work.

If you add

"text/html; charset=koi8-r"    html8
"text/html; charset=windows-1251"    htmlw

lines to your mime.types, the server will send a proper Content-Type field for all your KOI8-R documents ending in .html8. Likewise, in this example the server will send a proper Content-Type field for all your MS-Windows documents ending in .htmlw.

As an alternative, you can add

AddType "text/html; charset=koi8-r" .html8
AddType "text/html; charset=windows-1251" .htmlw

lines to your global srm.conf or local .htaccess with the same effect.

The server is bound to use the charset parameter; if the document's character set is not listed in Accept-Charset, the server should respond with the 406 (none acceptable) status code.
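
The 406 rule above can be expressed as a short decision function. This is a sketch in Python under simplifying assumptions: the name negotiate is hypothetical, the default iso-8859-1 is not treated specially, and only a single q parameter per charset is read.

```python
def negotiate(accept_charset, available):
    """Return (status, charset) for a request.

    accept_charset: value of the Accept-Charset field, or None if absent.
    available: charsets the document exists in, in server preference order.
    """
    if not accept_charset or not accept_charset.strip():
        # No Accept-Charset field: any character set is acceptable.
        return 200, available[0]
    accepted = {}
    for item in accept_charset.split(","):
        name, _, params = item.strip().partition(";")
        key, _, value = params.partition("=")
        q = float(value) if key.strip().lower() == "q" else 1.0
        accepted[name.strip().lower()] = q
    candidates = [c for c in available if accepted.get(c.lower(), 0.0) > 0.0]
    if not candidates:
        return 406, None  # none acceptable
    # max() keeps the first of equal candidates, i.e. the server's preference.
    return 200, max(candidates, key=lambda c: accepted[c.lower()])
```

A request listing only charsets the document does not exist in gets (406, None); otherwise the available charset with the highest quality value wins.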

I made an Apache v1.1.1 patch which implements dynamic selection of the proper document charset via the Apache .var feature. Here is an example: an a.var file (try it) which assumes the MIME types and file extensions from the examples above. If your WWW client generates a proper Accept-Charset field, this example automatically chooses the document in the correct charset. When your WWW client accepts both KOI8-R and CP1251, the KOI8-R document will be chosen with 10% comprehension.

URI: a; vary="type"

URI: a.html8
Content-Type: text/html; charset=koi8-r; qc=0.1

URI: a.htmlw
Content-Type: text/html; charset=windows-1251

It is convenient to store documents in a single charset, converting them on the fly. Sometimes it is possible to load conversion modules directly into HTTPD, but this is very implementation dependent and may require rebuilding the server, so CGI scripts look like a more general solution here. In my previous example, instead of two files in different charsets there can be one CGI script, with the charset passed as an argument, which converts a single file accordingly. For example, you can use the trans Character Encoding Converter Generator Package to convert between various Russian charsets via UNICODE.
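
The idea of converting between Russian charsets via Unicode can be sketched in a few lines of Python. This uses Python's built-in codecs rather than the trans package mentioned above; the function name convert and the choice to replace unmappable characters with "?" are assumptions of this example.

```python
def convert(data, source="koi8_r", target="cp1251"):
    """Convert bytes from one charset to another via Unicode.

    Characters with no equivalent in the target charset (for example the
    KOI8-R box-drawing characters in CP1251) are replaced with "?".
    """
    text = data.decode(source)                      # bytes -> Unicode
    return text.encode(target, errors="replace")    # Unicode -> bytes
```

A CGI script could call this with the target charset taken from its argument and send the result with the matching Content-Type field.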

This method requires correct Accept-... fields coming from clients. Most clients currently don't bother to send them. My patch has a workaround for such clients: it uses a charset guessing mechanism based on the User-Agent header field pattern. Now you can put something like

GuessCharset "Mozilla/* (X11;*" koi8-r

in your config files, which means that Netscape under X11 is treated as accepting the KOI8-R charset.
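
The guessing idea can be illustrated in Python. This sketch assumes that the pattern uses shell-style wildcards, where * matches any run of characters; the patch's actual matching rules are not described here, and the names RULES and guess_charset are hypothetical.

```python
from fnmatch import fnmatchcase

# (User-Agent pattern, charset) pairs, mirroring the GuessCharset line above.
RULES = [("Mozilla/* (X11;*", "koi8-r")]


def guess_charset(user_agent, rules=RULES):
    """Return the charset of the first matching rule, or None if none match."""
    for pattern, charset in rules:
        if fnmatchcase(user_agent, pattern):
            return charset
    return None
```

With this rule a User-Agent such as "Mozilla/3.0 (X11; I; FreeBSD 2.1.5 i386)" is guessed as koi8-r, while a Windows browser gets no guess.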

Additionally, my patch helps to maintain the correct charset in <FORM> input. The following method works in two ways: as the standard says and as a workaround for current bad practice.

In your HTML document:

  1. Use the ACCEPT-CHARSET attribute with <INPUT> and <TEXTAREA> tags as the I18N draft says; it must contain a comma-separated list of charsets acceptable to the server (in Accept-Charset header field format but without quality parameters).
  2. Use the POST method; it is impossible to determine the charset of GET method arguments.

In your CGI script:

  1. A correct client must supply a charset=name attribute in the Content-Type header field. For example:
    Content-Type: application/x-www-form-urlencoded; charset=KOI8-R


    The value of this header field is accessible in a CGI script via the CONTENT_TYPE CGI variable. You can check how your browser does it using the form input test. If a charset is present there, extract it and pass it as an argument to your external document charset converter.

    Another standard variant is to use ENCTYPE=multipart/form-data, but in this case your client must accompany each part of the multipart message with a correct charset=name in the Content-Type field. I don't know of any client which does this, so try to avoid this ENCTYPE.

  2. If the charset is not present, get it from the ACCEPT_CHARSET CGI variable (remember, it contains the charset guessed from the User-Agent header field; my patch does this) and pass it to the charset converter.
    WARNING: this works only for a single charset in the ACCEPT_CHARSET variable. I don't know (yet) what to do when multiple charsets are present there.
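
The two CGI steps above can be sketched as follows. This is an illustration in Python, not the actual script used on this site; the function name form_charset is hypothetical, and the variable names CONTENT_TYPE and ACCEPT_CHARSET are taken from the text above.

```python
import os


def form_charset(environ=os.environ):
    """Determine the charset of submitted form data, or None if unknown."""
    # Step 1: a correct client puts charset=name into Content-Type.
    content_type = environ.get("CONTENT_TYPE", "")
    for param in content_type.split(";")[1:]:
        key, _, value = param.partition("=")
        if key.strip().lower() == "charset" and value.strip():
            return value.strip().strip('"').lower()
    # Step 2: fall back to ACCEPT_CHARSET, but only when it names
    # exactly one charset (the multi-charset case is left undecided).
    accepted = [c.strip() for c in environ.get("ACCEPT_CHARSET", "").split(",")
                if c.strip()]
    if len(accepted) == 1:
        return accepted[0].split(";")[0].strip().lower()
    return None
```

The returned name can then be passed to the charset converter; a None result means the script has to ask the user or apply its own default.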

If you don't run Apache HTTPD with my patch, you need to ask directly somewhere in your page about the client's preferred charset.

Browsers which support these specifications:


[Win3.* Logo]

Microsoft Windows v3.* Stuff

How to setup Win3.11 for KOI8-R properly:

Fonts:

After downloading and unzipping the fonts, add them using the standard Windows procedure, i.e. via Control Panel|Fonts.

Keyboard Switchers:

ATTENTION: All keyboard switchers mentioned here (except WinKey) use the CP1251 character set by default, not KOI8-R! In addition to the fonts above, you need to download and install the corresponding keyboard descriptions from below to tune the switchers for KOI8-R.

Recommended: ParaWin 2.0 or CyrWin 4.0 (better), both commercial.

You can find ParaWin 2.0 on a Russian pirate CD, title: MICROSOFT WINDOWS, volume #1. You can find CyrWin 4.0 on volume #3 of the same CD line.

CyrWin 4.0 is able to switch font groups in addition to keyboards.

KOI8-R Keyboard Descriptions for Switchers:

Applicable Software:

Software Tuning:


[Win95 Logo]

Microsoft Windows 95 Stuff

For Win95 Standard Edition you need to make sure you have installed MultiLanguage Support. Go to Control Panel|Add/Remove Programs, check the Windows Setup tab and make sure MultiLanguage Support is checked. (It is not included with the diskette version of Win95, so if you installed from diskettes, download MultiLanguage Support from Microsoft.) Then choose Russian in Control Panel|Regional Settings.

For Win95 Russian Edition you don't need Multilanguage support.

It seems that Win95 has stricter requirements for fonts: it expects all font variations (i.e. Bold, Italic, Bold Italic) to exist, and fonts with only the Normal variation display blanks for the missing variations.

If you can add something valuable to this section, please, drop me a note.

How to setup Win95 for KOI8-R properly:

Fonts:

BTW, there is a useful tool that adds additional .TTF font properties, including character set and code pages, to the font properties dialog box; check the Windows 95 font properties extension.

Keyboard Setup:

NOTE: The Polish language is chosen in the examples below, but any Central European language can be chosen instead to associate this keyboard with the hacked Central European code page; see the hack description for details.

For Win95 Standard Edition:

Copy this KOI8-R keyboard description to \Windows\System directory and use this Registry addition (feed it to Regedit.exe) to add KOI8-R keyboard to valid keyboards list.

If you want to experiment, try using the KOI8-R keyboard description for Win95 Russian Edition instead; maybe it will work better (I haven't tested it personally). Please report any results to me.

Press Control Panel|Keyboard|Language|Add and add the Polish language. Choose it and press Properties, then choose Russian (KOI8-R) in the Keyboard Layout menu. Finally, you'll have the following picture in the Installed keyboard languages and layouts table:

En   English (United States)   United States
Pl   Polish                    Russian (KOI8-R)
Ru   Russian                   Russian

Check the Enable indicator on taskbar box and choose one of the Switch languages methods.

For Win95 Russian Edition:

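Copy this KOI8-R keyboard description to \Windows\System directory and use this Registry addition (feed it to Regedit.exe) to add KOI8-R keyboard to valid keyboards list. Then follow the same steps as for the Standard Edition (the Russian menu and button names are given here by their English equivalents): press Control Panel|Keyboard|Language|Add and add the Polish language. Choose it and press Properties, then choose Russian (KOI8-R) in the Keyboard Layout menu. Finally, you'll have the following picture in the Installed keyboard languages and layouts table:

En   English (United States)   United States
Pl   Polish                    Russian (KOI8-R)
Ru   Russian                   Russian

Check the Enable indicator on taskbar box and choose one of the Switch languages methods.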


KNOWN PROBLEMS: The Win95 keyboard switching method has serious problems by design. The problem list:

  1. Win95 implements a per-application keyboard switcher, not one for all applications at once.
  2. If HotKey switching does not work, you can use menu switching (from the lower right corner: En - Ru - Pl) or temporarily go to the parent (sic!) window, switch there and return.
  3. If your keyboard gets stuck after switching to Russian mode, you need to temporarily go to the parent window and return.
  4. If you can't re-enter Russian mode in the current window, you need to temporarily go to the parent window, switch to Russian mode there and return. If that does not help, open yet another window in the same program and try to switch the keyboard there, then return to the original window.
It happens not only with my KOI8-R keyboard, but with standard Win95 Russian keyboard too. Ask Microsoft to fix it.

Applicable Software:

Software Tuning:


[X Logo]

X-Windows Stuff

Fonts:

Locales:

Keyboard:

Place the XFree86 3.1.2 keyboard mapping table into /usr/X11R6/lib/X11/xinit/.Xmodmap, then switch to and from the Russian (KOI8-R) keyboard via CapsLock (after X has been (re)started). I assume that you use the default xinitrc or that your $HOME/.xinitrc picks up .Xmodmap too. If it doesn't work for you, enter xmodmap /usr/X11R6/lib/X11/xinit/.Xmodmap directly. If you can't modify system directories, just place the file into any directory and call xmodmap there. If you are under X86 OpenWindows, try the X86 OpenWindows keyboard mapping table instead.

WARNING: Control keys don't work when Russian mode is active; it is a known bug. Drop me a note if you know how to fix it.

Software Tuning:


DOS Stuff

Keyboard & Screen Drivers:

Charset Converters:

Applicable Software:

Software Tuning:


UNIX Stuff

Fonts:

Keyboard & Screen Drivers:

Charset Converters:

Locales:

Applicable Software:

Software Tuning:


Macintosh Stuff

I don't have a Mac available, so I can't comment on the following materials... Follow the MacOS and KOI8-R link for more info.

Keyboard & Screen Drivers:

Software Tuning:


Miscellaneous


Contacts


Donations

Software russification and the maintenance of this page eat my time and resources without any reward... If you find this stuff useful and would like to offer a donation, it will allow me to intensify my efforts. E-mail me in this case to discuss ways to donate.


There have been [make your visit count, load this image] visitors to this page since Nov 18, 1995.

Powered by FreeBSD. Powered by Apache. This page HTML 3.2 enhanced.

Copyright © 1995-96 by Andrey A. Chernov, Moscow, Russia.