

---------------------------------------------------------------
  Theses for the WebClub seminar
  Date: 17 Nov 1999
---------------------------------------------------------------



An ordinary Intel host running Linux or FreeBSD with Apache can
serve 100-150 static requests per second.

That is about 7 million requests per day, which corresponds to 200 thousand
visitors per day. The traffic generated is 1-2M per second.
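The arithmetic behind these figures can be checked with a short sketch. It assumes only the daily totals quoted above; the function names are illustrative and not part of the original notes.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def average_rate(requests_per_day):
    """Average requests per second needed to serve a daily total."""
    return requests_per_day / SECONDS_PER_DAY


def requests_per_visitor(requests_per_day, visitors_per_day):
    """How many requests one visitor generates on average."""
    return requests_per_day / visitors_per_day


daily_requests = 7_000_000   # figure from the text
daily_visitors = 200_000     # figure from the text

print(round(average_rate(daily_requests)))                   # 81
print(requests_per_visitor(daily_requests, daily_visitors))  # 35.0
```

The daily average (about 81 requests per second) sits below the 100-150 peak capacity, which leaves some headroom for the busy hours.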

The question is: do you need more than that?

MaxClients 300 corresponds to 3 million requests per day, 60 thousand people.
400M RAM

Example: Lenta.Ru on election day.
 400Mb RAM, MaxClients 512, Timeout 120, CacheTime 900
 500,000 cgi-html requests + 5 million image files
 at peak hour: 35,000 requests, 540 simultaneous httpd processes, load 8-15
 no swapping occurred


Disk: SCSI only.

That is all you can demand of the machine.

And NO swap! (Meaning: a web server should have a swap area,
but it must stay empty.)



ALWAYS SET THE Last-Modified HEADER IN CGI SCRIPT OUTPUT
  - a document without a timestamp is not stored in the local
    cache and is fetched again on every view
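A minimal sketch of what this means for a script, written in Python for illustration. The function name and the choice of timestamp are assumptions. The only point is that the header block carries a Last-Modified line in HTTP date format.

```python
from email.utils import formatdate


def cgi_headers(mtime):
    """Build CGI response headers that include Last-Modified.

    mtime is a Unix timestamp, e.g. the modification time of the
    data the page is generated from (an assumption of this sketch).
    """
    return (
        "Content-Type: text/html\r\n"
        f"Last-Modified: {formatdate(mtime, usegmt=True)}\r\n"
        "\r\n"
    )


# Example with a fixed timestamp so that the output is reproducible.
print(cgi_headers(0), end="")
```

A script would print these headers before the body, normally using the modification time of whatever the page is built from.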

Rename your CGI script directory from cgi-bin to
something else
  - Proxy servers do not cache URLs of the form
    http://host.name/cgi-bin/file/name.txt and have to go to
    your server every time.

Always set the Last-Modified field in Russian Apache with
automatic charset detection
  + Yes, if this field is not set, files in the wrong encoding
    will not get stuck on proxy servers.
  - But think how much strain that puts on all the other users (more
    than 95% of them) and on the web server itself...

CharsetDisableForcedExpires on
CacheNegotiatedDocs

Do not use automatic redirection by charset in Russian Apache

CharsetNormalizeToUrl none
CharsetAutoRedirect   koi8-r none
CharsetAutoRedirect   windows-1251 none

Store documents on the server in the windows-1251 encoding

CharsetSourceEnc koi8-r
  + Since 95% of visitors live in this encoding, the server will not
    need to recode documents for them.
  - Russian Apache _always_ recodes the document. Even win to win.

The server strips Last-Modified from files with SSI, but this can be cured

SSI creates extra load. It is better to enable it only for a separate
.shtml extension and leave purely static .htm and .html alone.

The server config has a directive that gives Last-Modified back to SSI files.
With "full", the user-execute bit turns on SSI parsing and the group-execute
bit makes the server send Last-Modified taken from the file's own timestamp.

XBitHack full


and then run

chmod 755 *.shtml

Do not use frames

They complicate programming and add extra requests:
(a two-frame page is 3 files instead of one!)

Do not make splash cover pages; put the maximum information on the main page

An extra click means lost visitors and reduced viewing depth.

NO ANIMATED GIFS
  - Because of a bug in Netscape Navigator, it keeps re-requesting an
    animated gif over the network, sending a request to the server every
    10-15 seconds. Imagine that twenty Netscapes have opened your page
    with 10 animated gifs and are just looking at it without clicking.
    The Netscapes themselves will start sending your server
    If-Modified-Since requests at a rate of up to 20 requests per second.
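The estimate is easy to reproduce. The sketch below only restates the numbers from the paragraph above; the function name is illustrative.

```python
def polling_rate(browsers, gifs_per_page, interval_seconds):
    """Requests per second produced by browsers re-polling every gif."""
    return browsers * gifs_per_page / interval_seconds


print(polling_rate(20, 10, 10))            # 20.0
print(round(polling_rate(20, 10, 15), 1))  # 13.3
```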

Extra images = lost money
  + Many hosting providers do not charge for traffic, so the size of the
    graphics need not be counted.
  - But they often meter _incoming_ foreign traffic.
    Remember that the HTTP request from a foreign visitor is itself _incoming_.
    It is only 200-300 bytes. But if every page carries 20 gif files of
    decoration, one HTML click from abroad costs 4Kb of incoming traffic.
    Multiply by 10 thousand pages a day and by 30 days: 1.2Gb of incoming
    foreign traffic. 100-200 bucks, just like that.
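The same calculation as a sketch, using the lower bound of 200 bytes per request from the text. The function name and the 30-day default are illustrative assumptions.

```python
def monthly_incoming_bytes(images_per_page, request_bytes, pages_per_day, days=30):
    """Incoming traffic caused purely by image requests."""
    return images_per_page * request_bytes * pages_per_day * days


total = monthly_incoming_bytes(images_per_page=20, request_bytes=200,
                               pages_per_day=10_000)
print(total / 1e9)  # 1.2 (gigabytes, decimal)
```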

Extra images = slow response and lost visitors
  - Many additional requests for graphics clog the input queue and
    fill up MaxClients; higher-priority requests for ordinary html
    have to wait in the common queue, delaying the response by 10-30 seconds.
  + Move all graphics to a separate port and hang a separate "thin"
    web server on it, one that can only serve static files and
    nothing else. It has a shorter TimeOut and eats less
    virtual memory.
  + khttpd for Linux: works as a kernel module, with minimal overhead.
    http://www.fenrus.demon.nl/index.html
  + thttpd: sustains up to 2000 requests/sec with no limit on the number
    of connections; under FreeBSD images.rambler.ru runs on it, under Linux
    it is buggy
    http://www.acme.com -> freeware
    Mathopd (top.list.ru runs on it)
  + Thoughts and advice on high-performance http servers:

Disable .htaccess in user directories
    Set
    AllowOverride None
    otherwise, on opening any document, the server will walk through
    all the parent directories looking for a .htaccess file in each



  - A robot gone mad collects an incredible number of 404 errors,
looping in them forever

Do not implement the 404 response as a cgi script

Do not make the 404 page "pretty", with little gifs and pointers to other sections



robots.txt

Be sure to create a robots.txt file, because it is the most requested
document on the server; without it you get a mass of 404s (see above),
especially if the 404 is a cgi script

Sensible robots obey the prohibitions in robots.txt
# Say NO to offline downloaders
User-Agent: DISCo Pump, Wget, WebZIP, Teleport Pro, WebSnake, Offline Explorer, Web-By-Mail
Disallow: /
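One caveat: the robots exclusion format expects one robot name per User-agent line, and a record may contain several such lines. A comma-separated list is likely to be read as a single name. A small sketch that writes the record in that form follows; the function name is hypothetical and the agent list is copied from the example above.

```python
def robots_record(agents, disallow="/"):
    """Build one robots.txt record: one User-agent line per robot."""
    lines = [f"User-agent: {name}" for name in agents]
    lines.append(f"Disallow: {disallow}")
    return "\n".join(lines) + "\n"


OFFLINE_DOWNLOADERS = [
    "DISCo Pump", "Wget", "WebZIP", "Teleport Pro",
    "WebSnake", "Offline Explorer", "Web-By-Mail",
]

print(robots_record(OFFLINE_DOWNLOADERS), end="")
```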

Access control via httpd.conf

The example blocks access to our .zip files when they are
linked from outside rather than from our own pages.

SetEnvIfNoCase Referer lib\.ru     internal_referer
SetEnvIfNoCase User-Agent Teleport internal_referer
SetEnvIfNoCase User-Agent Vampire  internal_referer
SetEnvIfNoCase User-Agent ReGet    internal_referer
SetEnvIfNoCase User-Agent GetRight internal_referer
SetEnvIfNoCase User-Agent Wget     internal_referer

<Files ~ "\.zip$">
ErrorDocument 403 http://lib.ru/books/index.htm
order deny,allow
deny from all
allow from env=internal_referer
</Files>
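The decision this configuration makes can be restated as a sketch in Python. It is an illustration of the logic only, not of how Apache evaluates it; the function and constant names are made up for the example.

```python
import re

# Mirrors the SetEnvIfNoCase lines: either condition sets internal_referer.
INTERNAL_REFERER = re.compile(r"lib\.ru", re.IGNORECASE)
ALLOWED_AGENTS = re.compile(r"Teleport|Vampire|ReGet|GetRight|Wget", re.IGNORECASE)


def zip_allowed(path, referer="", user_agent=""):
    """Return True if the request would pass the <Files> block."""
    if not path.endswith(".zip"):
        return True  # the block only covers files ending in .zip
    return bool(INTERNAL_REFERER.search(referer)
                or ALLOWED_AGENTS.search(user_agent))


print(zip_allowed("/books/a.zip", referer="http://lib.ru/books/"))    # True
print(zip_allowed("/books/a.zip", referer="http://other.example/"))   # False
```

Note that the Referer test is a substring match, so any referer that merely contains "lib.ru" passes.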

This can be developed in various directions: handling different User-Agents
differently, checking the client's IP and much more. The main thing is that
all of it is done not in a cgi script but at the level of the base httpd,
which means it costs the server little.

If a robot persists, it gets destroyed

route add -host 123.456.789.1 gw localhost


With mod_rewrite it looks something like this, by conditions:
    RewriteCond %{HTTP_USER_AGENT} Teleport [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} MSIECrawler [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} DISCoFinder [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} WebCrawler [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} spider [NC]
 all requests from known robots for dynamic pages are redirected
to a static stub
    RewriteRule ^/news.html?              /static_index.html  [R]

NC = No Case
R = redirect
L = Last rule

For example: redirecting all external referrers that link to the archives to the site's front page

    RewriteEngine on
    RewriteCond %{HTTP_REFERER} !^http://(www\.)?lib\.ru/ [NC]
    RewriteBase /home/lib-www/docs/
    RewriteRule ^arc/.*\.(zip|rar)$ http://www.lib.ru/ [R]
    RewriteCond %{HTTP_REFERER} !^http://(www\.)?lib\.ru/ [NC]
    RewriteBase /home/lib-www/docs/
    RewriteRule ^index2\.html$        http://www.lib.ru/ [R]
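A note on grouping in such patterns: in a regular expression of the form ^A|B the anchor belongs to the first alternative only, so the second one can match anywhere in the string. The sketch below shows the difference using Python's re module; the foreign referer is a made-up example.

```python
import re

# Alternation outside a group: "^http://www.lib.ru/" OR "lib.ru/" anywhere.
loose = re.compile(r"^http://(www\.lib\.ru/)|(lib\.ru/).*$", re.IGNORECASE)
# Optional "www." inside one anchored pattern.
strict = re.compile(r"^http://(www\.)?lib\.ru/", re.IGNORECASE)

foreign = "http://evil.example/mirror/lib.ru/page.html"  # hypothetical
own = "http://www.lib.ru/index.html"

print(bool(loose.search(foreign)))   # True: treated as an internal referer
print(bool(strict.search(foreign)))  # False
print(bool(strict.search(own)))      # True
```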

        Or like this:

RewriteEngine On
RewriteCond %{HTTP_REFERER} !^http://allowed-site1\.com(/.*)?$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www\.allowed-site1\.com(/.*)?$ [NC]
RewriteCond %{HTTP_REFERER} !^http://allowed-site2\.com(/.*)?$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www\.allowed-site2\.com(/.*)?$ [NC]
RewriteRule ^.*$ http://site.com/another_pic.gif [R,L]

        Or even like this:

RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?domen.ru/.*$ [NC]
RewriteRule \.(gif|jpg)$ http://www.domen.ru/fuck_off.gif [R,L]

RewriteEngine on
RewriteCond %{REMOTE_ADDR}  !^81.19.69.21$
RewriteRule ^(/n/.*) https://lenta.ru$1 [R,L]

RewriteEngine on
RewriteCond %{REMOTE_ADDR}  !^81.19.69.28
RewriteCond %{REMOTE_ADDR}  !^81.19.68.[6-9]
RewriteCond %{REMOTE_ADDR}  !^81.19.68.1[012].
RewriteRule ^(/N/.*) https://lenta.ru$1 [R,L]

# Allow from 81.19.68.64/255.255.255.224



Do not put banners at the very top
 - A banner at the top takes 1-2 of the browser's 4 parallel requests and
   so loads first, holding up your own site's images
 + in the banner's img src put the IP instead of the hostname: you save
   the visitor a DNS lookup, which takes 2-30 seconds.
 A delay in loading "your" content keeps extra httpd processes hanging on _your_ side

 Never make a unique URL for a banner by means of an SSI virtual cgi
include. Because ps -axf will show you:

12858  ?  S    0:00  \_ /usr/local/apache/sbin/httpd
12859  ?  S    0:00  \_ /usr/local/apache/sbin/httpd
12862  ?  S    0:00  \_ /usr/local/apache/sbin/httpd
13097  ?  Z    0:00  |   \_ (rand.cgi <zombie>)
13098  ?  Z    0:00  |   \_ (rb2 <zombie>)
13103  ?  Z    0:00  |   \_ (rb2 <zombie>)
13104  ?  Z    0:00  |   \_ (c4.pl <zombie>)
13105  ?  Z    0:00  |   \_ (random.cgi <zombie>)
12863  ?  S    0:00  \_ /usr/local/apache/sbin/httpd
12868  ?  S    0:00  \_ /usr/local/apache/sbin/httpd

Instead, use a variable: the date

<!--#config timefmt="%H%w%e%M%S"-->
<a href=http://rb2.design.ru/cgi-bin/href/nit?<!--#echo var="date_local"-->
target="_top">

<!--#config timefmt="%M%H%S%I%e"-->
<a href=http://www1.reklama.ru/cgi-bin/href/nit?<!--#echo var="date_local"-->
target=_top>
<img src=http://www1.reklama.ru/cgi-bin/banner/nit?<!--#echo var="date_local"-->
width=468 height=60 border=0 vspace=10
alt="www.reklama.ru. The Banner Network." ismap></a>
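The idea, a cheap timestamp as the unique part of the URL with no extra process started, can be sketched outside SSI as well. The Python version below is only an illustration: the function name and the host banner.example are hypothetical, and %d is used in place of %e for portability.

```python
import time


def banner_url(base, now=None):
    """Append a timestamp token so that every page view gets a unique URL.

    now is a Unix timestamp; None means the current time.
    """
    stamp = time.strftime("%H%w%d%M%S", time.localtime(now))
    return f"{base}?{stamp}"


print(banner_url("http://banner.example/cgi-bin/banner/nit"))
```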

Tx3 suggests fetching the banner on the server side: that is an extra cgi
script, and the script then calls the banner engine. This means a delay
while the html is generated, and therefore more httpd processes hanging in memory.



200 thousand per day = about 3 scripts per second

30 static per second =

suexec (running cgi scripts under the user's id) does improve security, but
it doubles the number of fork+exec calls for every cgi script launched.
Avoid it as far as possible.

Watch what is compiled into httpd. Yes, of course, code in unix is reentrant,
but modperl and php3 have huge areas of initialized data. All of this eats
virtual memory and time on every request, and merely checking the hooks that
the modules hang on takes time too. Is it worth serving 100 static httpd
requests, for which the single default module is enough, with a 5M monster
that has modperl, php3 and ssl compiled in, when only 2-5 of those 100
requests actually need them?


Of course the best language for writing cgi scripts is perl. But it is
merciless to the server.

Perl scripts are compiled on every invocation. Compilation speed varies a
lot, but it is still roughly 0.1 sec per 20Kb of perl code. The moral: even
without counting the run time of the program itself, a 60Kb script cannot
execute more than about 3 times per second!
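The estimate follows directly from the quoted compile speed. A sketch of the arithmetic, in which the constant and function names are illustrative:

```python
COMPILE_SECONDS_PER_KB = 0.1 / 20  # 0.1 sec per 20Kb, as quoted above


def max_runs_per_second(script_kb):
    """Upper bound on invocations per second from compile time alone."""
    return 1.0 / (script_kb * COMPILE_SECONDS_PER_KB)


print(round(max_runs_per_second(60), 1))  # 3.3
print(round(max_runs_per_second(20), 1))  # 10.0
```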

How to get out of this?

Split the big script into many small parts and pull each one in only when
that piece of code is actually needed for the given run. In perl this is
done with the "require" operator. (It is a well-behaved analogue of include:
require is an executable statement, it pulls in the extra code only when it
is called for, and when require is executed again it does NOT recompile
the code.)

Precompiling perl. Perl2C. modperl. FastCGI...


Caching.

You can save the script's output in a cache file and, on repeated requests,
serve that file instead of generating the page again.

Subjectively, it is better not to serve the cache file from the script itself

open(IN, $file) or die "cannot open $file: $!"; while (<IN>) { print; } close(IN);

but with an internal redirect

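# $file must be a URL path on this server starting with "/";
# the server then serves it itself, without a round trip to the browser
print "Location: $file\n\n";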
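A minimal sketch of the cache-file approach, in Python for illustration. The function name is hypothetical, and the 900-second lifetime is an assumption that merely echoes the CacheTime value from the Lenta.Ru example. The file is replaced atomically so that a reader never sees a half-written page.

```python
import os
import time


def cached_output(cache_path, generate, max_age=900):
    """Return the cached page, regenerating it if missing or stale.

    generate is a callable that returns the page as a string.
    """
    try:
        fresh = time.time() - os.path.getmtime(cache_path) < max_age
    except OSError:
        fresh = False  # no cache file yet
    if not fresh:
        tmp_path = cache_path + ".tmp"
        with open(tmp_path, "w") as out:
            out.write(generate())
        os.replace(tmp_path, cache_path)  # atomic swap
    with open(cache_path) as cached:
        return cached.read()
```

In a CGI setting the script would do only the freshness check and regeneration, then hand the file to the server with the redirect shown above.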

Caching with squid in proxy-accelerator mode

Probably the best solution if you need to speed up a cgi-script server.
The speed and machine load of a squid accelerator match those of an httpd
serving static html and image files. And it reduces the load on the cgi
engine 2-3 times.

Squid can handle If-Modified-Since and resumed downloads (REGET) for the
script's output, which, understandably, is no fun at all to implement in
the script yourself.

    Reset at 4 a.m.

The machines stood face to face, so that the slide-out "coffee cup holder" of one pressed the Reset button of the other, and vice versa.

The previous incarnation of my lib.ru lived in one case with another machine. Inside they had a home-made circuit, soldered on the knee, that let each one power-cycle its neighbour.

In general, an ordinary smart-UPS suits this kind of thing best. The UPS's COM port should be wired either to a Cisco, since those do not die, or to a modem that you dial into from home.

From: Exler

Since the security guard walked through the premises about three times a night to check for fire (entered the room, switched on the light, looked around, switched off the light and left), a button was attached to the light switch for the night, so that pressing the switch automatically reset the machine.

    Apache Config

Configuration parameters that affect speed.

Options FollowSymLinks - lets the server skip checking symlinks
AllowOverride None - lets the server skip looking for .htaccess in every subdirectory

Very important! On a heavily visited server:

1. Move the images to a dedicated server (port), or to a separate server process, and turn keep-alive off:

KeepAlive Off

Keep-alive is used only for fetching images; for html the browser opens a new connection anyway. With KeepAlive on, each server process, having served a request, hangs around in memory for another 15 seconds waiting to see whether a new image request arrives, which increases the number of processes about 4 times.

    Moving the server, changing its IP address

The old IP address sits in DNS caches for quite a long time (officially up to 8 hours, in reality up to two days and more). All that time many clients keep going to the old IP, where nobody is waiting for them any more; visitor losses while DNS "settles" reach 20 to 60%.

The way out: a two-step IP change using redirects.

Step 1. Two days before the real IP change, bring up on the new IP a stub virtual web server that answers to www.washserver.ru, and in its config put a redirect of all requests to http://washserver.ru

httpd.conf on the new IP address:

<VirtualHost New-IP:*>
ServerName www.washserver.ru
Redirect / http://washserver.ru/
</VirtualHost>

DNS zone of the washserver.ru domain:

@    IN A old-IP
www  IN A new-IP

After that we enter the new IP in DNS for www.washserver.ru and leave washserver.ru on the old one. Visitors who arrive at www.washserver.ru are redirected to washserver.ru, so we lose nobody, and we wait 2 days for the new IP of www.washserver.ru to "spread".

Step 2, two days later. The real change of the server's IP. At the same time:

On the old IP we bring up a stub virtual web server that answers to washserver.ru and redirects all requests to http://www.washserver.ru

In DNS we point washserver.ru at the new IP.

Visitors who arrive at washserver.ru on the old IP are redirected to www.washserver.ru on the new IP, so we lose nobody. After 2 days the new IP for the name washserver.ru will have spread through DNS and the redirect can be removed.

httpd.conf on the old IP address:

<VirtualHost Old-IP:*>
ServerName washserver.ru
Redirect / http://www.washserver.ru/
</VirtualHost>

DNS zone of the washserver.ru domain:

@    IN A new-IP
www  IN A new-IP

Last-modified: Tue, 12 Apr 2005 05:24:00 GMT