Instructions for setting up a mirror of the library
The instructions are available here:
http://lib.ru/DOWNLOAD/mirroring.txt
Conditions for running a mirror
For now there are no conditions
beyond the obvious ones:
1. Send me the URL of the mirror.
2. Tell me the email address of the person responsible for the mirror.
3. Install ALL library updates as they arrive,
without delay.
4. Changes to the library may occasionally have to be made
quickly and on all mirrors at once.
So please guarantee that "urgent updates" will be
installed immediately, As Soon As Possible.
As of 23 January 2004 the library consists of 40 thousand files,
4.6 Gb in size, growing by 50-100 Mb per month.
Traffic on the base mirror (lib.ru) is up to 40 Gb,
500,000 requests and 35-40 thousand visitors per day.
Traffic on each of the other mirrors can be estimated at 30 - 500
visitors per day.
All library files are available on anonymous ftp
in archived form.
Everything is in cpio.gz format as a single file liYYMMDD.cpz (approx. 2 Gb)
(where YY is the year, MM the month, DD the day the archive was created),
and the same archive is also available cut into pieces liYYMMDD.r01,
liYYMMDD.r02 and so on.
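Since the archive may arrive as many separate pieces, it can be worth testing them before extraction. The following is only a sketch: verify_pieces is a hypothetical helper name, and it assumes the pieces are plain byte-wise slices of the single gzip-compressed archive, as the description above suggests.

```shell
# Sketch: test the integrity of the split archive before extracting it.
# verify_pieces is a hypothetical helper, not part of the library tools.
verify_pieces() {
    # Concatenate the pieces in the order given and let gzip test the
    # resulting stream; nothing is extracted.
    cat "$@" | gzip -t
}
```

For example, verify_pieces li??????.r?? && echo "pieces look intact" checks every two-digit piece in the current directory, in the order the shell glob sorts them.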
To get all this mess running you need a Unix machine with
any httpd server and the programs perl, gzip and glimpse
located in /usr/bin or /usr/local/bin.
The patch that adds Cyrillic support to glimpse is at
http://lib.ru/WEBMASTER/locale.c
(a glimpse patched for KOI8 is optional;
without it, search simply will not work.)
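The prerequisites above can be checked with a few lines of shell before starting. This is a minimal sketch; the names find_tool, check_prerequisites and SEARCH_DIRS are made up for illustration and are not part of the library distribution.

```shell
# Directories the instructions name; override SEARCH_DIRS to look elsewhere.
SEARCH_DIRS=${SEARCH_DIRS:-"/usr/bin /usr/local/bin"}

find_tool() {
    # Print the first executable named $1 found in SEARCH_DIRS; fail if none.
    for dir in $SEARCH_DIRS; do
        if [ -x "$dir/$1" ]; then
            echo "$dir/$1"
            return 0
        fi
    done
    return 1
}

check_prerequisites() {
    # Report each required tool; return nonzero if any is missing.
    rc=0
    for tool in perl gzip glimpse; do
        if found=$(find_tool "$tool"); then
            echo "ok: $tool -> $found"
        else
            echo "missing: $tool"
            rc=1
        fi
    done
    return $rc
}
```

Running check_prerequisites prints one line per tool. A missing glimpse only disables search, so it need not block the installation.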
1. Create the user
moshkow UID=555 HOME=/home/moshkow
2. Unpack the archive in that user's HOME
cd ~moshkow
cat li20040123.cgz | gunzip | cpio -idmv
cat li??????.r?? li??????.r??? | gunzip | cpio -idmv
all library content should land in the directory
~moshkow/public_html/
all texts are in ~moshkow/public_html/book/
the total size (as of January 2004) is about 4700 Mb
3. Go to the directory where the server keeps its cgi-bin scripts and
put the CGI script html-KOI there.
The scripts may be named
koi, win, lat, alt, iso, mac
or
html-KOI, html-windows, html-volapuk, html-alt, html-mac,
html-iso_8859_5
or koi.cgi, win.cgi, lat.cgi, alt.cgi, iso.cgi, mac.cgi,
as you prefer.
cd /usr/local/etc/httpd/cgi-bin # or wherever it is
cd /home/httpd/cgi-bin # or wherever it is
ln -s /home/moshkow/public_html/bin/html-KOI.pl koi
ln -s koi win
ln -s koi lat
ln -s koi alt
ln -s koi iso
ln -s koi mac
Note: for symlinks to be allowed in the cgi-bin directory,
the httpd server's access.conf must set the FollowSymLinks option
on the cgi directory, roughly like this:
<Directory /home/httpd/cgi-bin>
Options FollowSymLinks
</Directory>
4. Now it is ready to use:
http://your.host.name/cgi-bin/koi/ ....
or
http://your.host.name/cgi-bin/html-KOI/ ....
And if you add lines like these to conf/srm.conf:
ScriptAlias /library/koi /home/moshkow/public_html/bin/koi
ScriptAlias /library/win /home/moshkow/public_html/bin/win
ScriptAlias /library/lat /home/moshkow/public_html/bin/lat
ScriptAlias /library/alt /home/moshkow/public_html/bin/alt
ScriptAlias /library/iso /home/moshkow/public_html/bin/iso
ScriptAlias /library/mac /home/moshkow/public_html/bin/mac
then the library can be accessed like this:
http://your.host.name/library/koi/
5. Step 1 is optional, but in that case you must
edit the constants at the top of the html-KOI file
that define where the library files actually live.
6. Make sure you have /usr/bin/perl
7. All library files are in the koi8 encoding, so
using them on Windows machines without the html-KOI cgi script
may not be very convenient.
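The symlink commands of step 3 can also be written as a loop. The function below is a sketch under assumptions: make_charset_links is a hypothetical name, and both the cgi-bin directory and the script path are passed in as arguments because they differ between servers.

```shell
make_charset_links() {
    # $1: the server's cgi-bin directory
    # $2: full path to the library CGI script
    cgi_dir=$1
    script=$2
    # koi points at the real script ...
    ln -s "$script" "$cgi_dir/koi" || return 1
    # ... and every other encoding name points at koi, as in step 3.
    for name in win lat alt iso mac; do
        ln -s koi "$cgi_dir/$name" || return 1
    done
}
```

With the paths used in step 3, the call would be make_charset_links /home/httpd/cgi-bin /home/moshkow/public_html/bin/html-KOI.pl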
About the internal format of the library
The storage format of the library is described here:
http://lib.ru/WEBMASTER/libformat.txt
Glimpse - the indexer required for search
For search over the library's table of contents to work, a _patched_
glimpse indexer must be installed on the server. Compile it yourself
and put it either in /usr/local/bin/glimpse or in
~moshkow/public_html/bin/glimpse
I can send glimpse binaries for FreeBSD, Linux and Solaris on request.
The glimpse patch is here:
http://lib.ru/WEBMASTER/locale.c
http://kulichki.com/moshkow/WEBMASTER/locale.c
The glimpse sources are here:
ftp://ftp.cs.arizona.edu/glimpse/glimpse-4.1.src.tar.gz
http://kulichki.com/moshkow/SOFTWARE/glimpse-4.1.src.tar.gz
About saving disk space: compressing library files
To save disk space on the web server you can
gzip some of the library's TXT files. The library script
detects such files and gunzips them automatically. On a heavily
visited mirror I would still not recommend this, because it eats
CPU. Also, search and What's-new do not work on compressed files. Below
is an example command that compresses all text files larger than 100Kb
that were last modified more than two months ago.
su - moshkow
cd ~moshkow
find public_html/book -type f -size +200 \
-mtime +60 -name "*.txt" -exec gzip {} \; -print
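Before compressing anything, the same selection can be run as a dry run that only lists the files. In the sketch below, list_gzip_candidates is a hypothetical helper name; the find tests are the ones from the command above. Note that -size +200 counts 512-byte blocks, which is where the 100Kb threshold comes from, and -mtime +60 means last modified more than 60 days ago.

```shell
list_gzip_candidates() {
    # $1: directory to scan (the command above uses public_html/book).
    # Same tests as the real command, but with -print only: no gzip is run.
    find "$1" -type f -size +200 -mtime +60 -name "*.txt" -print
}
```

For example, list_gzip_candidates public_html/book | wc -l shows how many files the real command would compress.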
Library installation (English version)
Feel free to point out any
mistakes in this section.
Sorry for my bad English.
1. Create user
moshkow UID=555 HOME=/home/moshkow
2. Extract the archive in moshkow's HOME directory
cd ~moshkow
cat li??????.r?? li??????.r??? | gunzip | cpio -idmv
all library content goes to
~moshkow/public_html/
all e-texts go to ~moshkow/public_html/book/
full size of the library (as of Feb 1998) ~ 500 Mb
3. Link the cgi-script html-KOI into your cgi-directory
You can name the scripts
koi, win, lat, alt, iso, mac
or koi.cgi, win.cgi, lat.cgi, alt.cgi, iso.cgi, mac.cgi,
to your taste
cd /usr/local/etc/httpd/cgi-bin # (or where it is?)
cd /home/httpd/cgi-bin # (or where it is?)
ln -s /home/moshkow/public_html/bin/html-KOI koi
ln -s koi win
ln -s koi lat
ln -s koi alt
ln -s koi iso
ln -s koi mac
Note: your cgi-directory must have permission to use
symlinks, so add Options FollowSymLinks for the cgi-directory
to access.conf, something like:
<Directory /home/httpd/cgi-bin>
Options FollowSymLinks
</Directory>
4. And now you can use it:
http://your.host.name/cgi-bin/koi/ ....
You can also insert into conf/srm.conf something like:
ScriptAlias /library/koi /home/moshkow/public_html/bin/koi
ScriptAlias /library/win /home/moshkow/public_html/bin/win
ScriptAlias /library/lat /home/moshkow/public_html/bin/lat
ScriptAlias /library/alt /home/moshkow/public_html/bin/alt
ScriptAlias /library/iso /home/moshkow/public_html/bin/iso
ScriptAlias /library/mac /home/moshkow/public_html/bin/mac
And now you can use it with the URL:
http://your.host.name/library/koi/
6. You should have perl (or a symlink to the real location of perl) at
/usr/bin/perl
The library script does all the Russification itself. In Russian Apache
all recoding functions that affect the library
must be switched off.
This may help in httpd.conf:
<Directory /home/moshkow/public_html>
CharsetTurnOff on
CharsetMatchLanguage on
</Directory>
Or in httpd.conf
<IfModule mod_charset.c>
...
# CharsetDisable directive turns off all charset processing.
<Directory /home/moshkow/public_html/bin>
CharsetDisable on
</Directory>
...
</IfModule>
Or create a .htaccess file in /home/moshkow/public_html and put
the required lines in it (which ones? see rusapach-doc)
The base version of the archive was made in January 2004. Changes and
additions to the library are published once a week, on
Mondays.
Updates are on anonymous ftp in /pub/moshkow/.library/
in archived form, in cpio.gzip format:
apYYYYMMDD.cpz
(where YYYY is the year, MM the month, DD the day the update was created)
or cut into pieces of 1 Mb each:
apYYYYMMDD.r??
Generation of the ftp updates is now automated; the archives
are prepared weekly, on Mondays at 5 am. They are published under the
standard file names
apYYYYMMDD.cpz
where YYYYMMDD is the date the update was created,
and also under the fixed name
apLAST.cpz
Repeat this procedure once a week, on Mondays:
1. Fetch the latest apTODAY.cpz or apYYMMDD.cpz by ftp
2. Unpack the archive in the home directory of the user moshkow
su moshkow
cd ~moshkow
cat apTODAY.cpz | gunzip | cpio -idmv
all of these steps can be performed with the command
/home/moshkow/public_html/bin/libraryadmin -getlast
# naturally, the most sensible approach is to pull the part of the program
# that handles updates out into a separate command file and schedule it
# in crontab for Monday morning.
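One possible shape for such a command file is sketched below. It only wraps the libraryadmin -getlast call quoted above; the names run_update, LIBADMIN and LOG, the log location and the --run switch are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical cron wrapper around the update command.
# LIBADMIN and LOG are illustrative defaults, not part of the distribution.
LIBADMIN=${LIBADMIN:-/home/moshkow/public_html/bin/libraryadmin}
LOG=${LOG:-/home/moshkow/update.log}

run_update() {
    # Note the start time, run the updater, and record whether it succeeded.
    echo "update started: $(date)" >> "$LOG"
    if "$LIBADMIN" -getlast >> "$LOG" 2>&1; then
        echo "update finished OK" >> "$LOG"
    else
        echo "update FAILED" >> "$LOG"
        return 1
    fi
}

# Do the work only when invoked with --run, so the file can also be sourced.
if [ "${1:-}" = "--run" ]; then
    run_update
fi
```

Saved under a name such as /home/moshkow/bin/weekly-update.sh (a made-up path), it could be scheduled in the crontab of the user moshkow with a line like
0 7 * * 1 /home/moshkow/bin/weekly-update.sh --run
which runs it on Mondays at 7 am, after the 5 am archive build, assuming the mirror and the ftp server share a time zone.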
All library support is built on a Perl cgi script, so there
seem to be no fundamental obstacles to setting up a mirror on NT rather than
Unix. Over the years 4 administrators have planned to do this, but
I have received no reports of success from them.
One mirror, http://www.belpsb.minsk.by/moshkow/,
runs under NT; it was set up by the webmaster, webmaster@belpsb.minsk.by
Daily update by email
I send out daily updates by mail to dedicated addresses by
arrangement with the administrators.
Format: uudecode - gzip - cpio
Size: 1-7 Mb per message. So sendmail must accept messages of at
least 8 Mb at a time. The standard sendmail.cf usually sets a maximum of 1 Mb.
On the receiving side, each incoming message must be piped through a script
that runs under the UID of moshkow. (And under no circumstances as root!!!)
The script can be attached to the dedicated address via .procmailrc,
.forward, or /etc/aliases.
Sometimes sendmail.cf forbids running scripts
from forward files and aliases. Make sure this will not be a problem.
Send me the address and I will add it to the mailing list.
Example of a possible /usr/locale/sbin/script:
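#!/bin/sh
# Reads one uuencoded update message on stdin and unpacks it.
TMP=/tmp/$$
mkdir "$TMP"
cd "$TMP" || exit 1
# Replace slashes in the "begin" line so uudecode writes only into $TMP.
sed -e '/^begin [0-9]/ s%/%_%g' |\
uudecode
cd ~moshkow || exit 1
# Unpack only if both the gzip layer and the cpio layer test clean.
if zcat "$TMP"/* > /dev/null ; then
if zcat "$TMP"/* | cpio -it > /dev/null ; then
zcat "$TMP"/* | cpio -idmv "public_html/*"
rm -rf "$TMP"
else echo bad archive
fi
else echo bad gzip archive
fi | mail -s "mirror.firm.ru report" mirroradmin moshkow@ipsun.ras.ru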
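The sed line in the script above deserves a short illustration. uudecode writes its output to the file name given on the "begin MODE NAME" line of the message, so a name containing slashes could point outside the temporary directory. The sketch below wraps the same sed expression in a function (sanitize_begin_line is a hypothetical name) and feeds it a made-up header line.

```shell
sanitize_begin_line() {
    # Same sed expression as in the script above: on the uuencode
    # "begin MODE NAME" line, turn every "/" into "_".
    sed -e '/^begin [0-9]/ s%/%_%g'
}

printf 'begin 644 ../../etc/passwd\n' | sanitize_begin_line
# prints: begin 644 .._.._etc_passwd
```

Lines that do not start with "begin" followed by a digit pass through unchanged, so the encoded body of the message is not touched.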
Last-modified: Sun, 10 Apr 2005 07:00:52 GMT