USENIX Association
Proceedings of the
2001 USENIX Annual
Technical Conference
Boston, Massachusetts, USA
June 25–30, 2001
THE ADVANCED COMPUTING SYSTEMS ASSOCIATION
© 2001 by The USENIX Association
All Rights Reserved
For more information about the USENIX Association:
Phone: 1 510 528 8649
FAX: 1 510 548 5738
Email: office@usenix.org
WWW: http://www.usenix.org
Rights to individual papers remain with the author or the author's employer.
Permission is granted for noncommercial reproduction of the work for educational or research purposes.
This copyright notice must be included in the reproduced paper. USENIX acknowledges all trademarks herein.
Servers
Thiemo Voigt
Swedish Institute of Computer Science
thiemo@sics.se
Renu Tewari
IBM T.J. Watson Research Center
tewarir@us.ibm.com
Douglas Freimuth
IBM T.J. Watson Research Center
dmfreim@us.ibm.com
Ashish Mehra
iScale Networks
ashish@iscale.net
Abstract
The increasing number of Internet users and innovative new services such as e-commerce are placing new demands on Web servers. It is becoming essential for Web servers to provide performance isolation, have fast recovery times, and provide continuous service during overload, at least to preferred customers. In this paper, we present the design and implementation of three kernel-based mechanisms that protect Web servers against overload by providing admission control and service differentiation based on connection and application level information. Our basic admission control mechanism, TCP SYN policing, limits the acceptance rate of new requests based on the connection attributes. The second mechanism, prioritized listen queue, supports different service classes by reordering the listen queue based on the priorities of the incoming connections. Third, we present HTTP header-based connection control that uses application-level information such as URLs and cookies to set priorities and rate control policies.

We have implemented these mechanisms in AIX 5.0. Through numerous experiments we demonstrate their effectiveness in achieving the desired degree of service differentiation during overload. We also show that the kernel mechanisms are more efficient and scalable than application-level controls implemented in the Web server.
This work was partially funded by the national Swedish Real-time Systems Research Initiative (ARTES). This work was done when the author was visiting the IBM T.J. Watson Research Center.
1 Introduction
Application service providers and Web hosting services that co-host multiple customer sites on the same server cluster or large SMP are becoming increasingly common in the current Internet infrastructure. The increasing growth of e-commerce on the web means that any server downtime that affects the clients being serviced will result in a corresponding loss of revenue. Additionally, the unpredictability of flash crowds can overwhelm a hosting server and bring down multiple customer sites simultaneously, affecting the performance of a large number of clients. It becomes essential, therefore, for hosting services to provide performance isolation and continuous operation under overload conditions.
Each of the co-hosted customer sites or applications may have different quality-of-service (QoS) goals based on the price of the service and the application requirements. Furthermore, each customer site may require different services during overload based on the client's identity (preferred gold client) and the application or content they access (e.g., a client with a buy order vs. a browsing request). A simple threshold-based request discard policy (e.g., a TCP SYN drop mode in commercial switches/routers that discards the incoming, oldest, or any random connection) makes no such distinctions; it is desirable that requests of non-preferred customer sites be discarded first. Such QoS specifications are typically negotiated in a service level agreement (SLA) between the hosting service provider and its customers. Based on this governing SLA, the hosting service providers need to support service differentiation based on client attributes (IP address, session id, port, etc.), server attributes (IP address, type), and application information (URL accessed, CGI request, cookies, etc.).
In this paper, we present the design and implementation of kernel mechanisms in the network subsystem that provide admission control and service differentiation during overload based on the customer site, the client, and the application layer information.

One of the underlying principles of our design was that it should enable "early discard", i.e., if a connection is to be discarded it should be done as early as possible, before it has consumed a lot of system resources [2]. Since a web server's workload is generated by incoming network connections, we place our control mechanisms in the network subsystem of the server OS at different stages of the protocol stack processing. To balance the need for early discard with that of an informed discard, where the decision is made with full knowledge of the content being accessed, we provide mechanisms that enable content-based admission control.
Our second principle was to introduce minimal changes to the core networking subsystem in commercial operating systems that typically implement a BSD-style stack. There have been prior research efforts that modify the architecture of the networking stack to enable stable overload behavior [3]. Other researchers have developed new operating system architectures to protect against overload and denial of service attacks [4]. Some "virtual server" implementations try to sandbox all resources (CPU, memory, network bandwidth) according to administrative policies and enable complete performance isolation [5]. Our aim in this design, however, was not to build a new networking architecture but to introduce simple controls in the existing architecture that could be just as effective.
The third principle was to implement mechanisms that can be deployed both on the server as well as outside the server in layer 4 or layer 7 switches that perform load balancing and content-based routing. Such switches typically have some form of overload protection mechanism that consists of dropping a new connection packet (or some random new connection packet) when a load threshold is exceeded. For content-based routing, the layer 7 switch functionality consists of terminating the incoming TCP connection to determine the destination server based on the content being accessed, creating a new connection to the server in the cluster, and splicing the two connections together [7]. Such a switch has access to the application headers along with the IP and TCP headers. The mechanisms we built in the network subsystem can easily be moved to the front-end switch to provide service differentiation based on the client attributes or the content being accessed.
There have been proposals to modify the process scheduling policies in the OS to enable preferred web requests to execute as higher priority processes [8]. These mechanisms, however, can only change the relative performance of higher priority requests; they do not limit the requests accepted. Since the hardware device interrupt on a packet receive and the software interrupt for packet protocol processing can preempt any of the other user processes [3], such scheduling policies cannot prevent or delay overload. Secondly, the incoming requests have already consumed numerous system resources before any scheduling policy comes into effect. Such priority scheduling schemes can co-exist with our controls in the network subsystem.
An alternate approach is to enable the applications to provide their individual admission control mechanisms. Although this achieves application-level control, it requires modifications to existing legacy applications or specialized wrappers. Application controls are useful in differentiating between different clients of an application but are less useful in preventing or delaying overload across customer sites. More importantly, various server resources have already been allocated to a request before the application control comes into effect, violating the early discard policy. However, the kernel mechanisms can easily work in conjunction with application-specific controls.
Since most web servers receive requests over HTTP/TCP connections, our controls are located at three different stages in the lifetime of a TCP connection:

The first control mechanism, TCP SYN policing, is located at the start of protocol stack processing of the first SYN packet of a new connection, and limits acceptance of a new TCP SYN packet based on compliance with a token bucket based policer.

The next control, prioritized listen queue, is located at the end of the TCP 3-way handshake, i.e., when the connection is accepted, and supports different priority levels among accepted connections.

Third, HTTP header-based connection control is located after the HTTP header is received (which could be after multiple data packets) and enables admission control and priority values to be based on application-layer information contained in the header, e.g., URLs, cookies, etc.
We have implemented these controls in the AIX 5.0 kernel as a loadable module using the framework of an existing QoS architecture [9]. The existing QoS architecture on AIX supports policy-based outbound bandwidth management [10]. These techniques are easily portable to any OS running a BSD-style network stack.

We present experimental results to demonstrate that these mechanisms effectively provide selective connection discard and service differentiation in an overloaded server. We also compare against application layer controls that we added in the Apache 1.3.12 server and show that the kernel controls are much more efficient and scalable.
The remainder of this paper is organized as follows: In Section 2 we give a brief overview of input packet processing. Our architecture and the kernel mechanisms are presented in Section 3. In Section 4 we present and discuss experimental results. We compare the performance of kernel based mechanisms and application level controls in Section 5. We describe related work in Section 6 and finally, the conclusions and future work in Section 7.
2 Input Packet Processing: Background
In this section we briefly describe the protocol processing steps executed when a new connection request is processed by a web server.

[Figure 1: Proposed kernel mechanisms. A TCP SYN is subject to SYN rate control before the partial listen queue; once connection setup is complete, URL-based connection control applies and the connection is placed in the prioritized listen queue for acceptance by the server.]

When the device
interface receives a packet, it triggers a hardware interrupt that is serviced by the corresponding device driver [11]. The device driver copies the received packet into an mbuf and de-multiplexes it to determine the queue to insert the packet. For example, an IP packet is added to the input queue, ipintrq. The device driver then triggers the IP software interrupt. The IP input routine dequeues the packet from the IP input queue and does the next-layer de-multiplexing to invoke the transport layer input routine. For example, for a TCP packet this will result in a call to a tcp_input routine for further processing. The call to the transport layer input routine happens within the realm of the IP input call, i.e., there is no queuing between the IP and TCP layers. The TCP input processing verifies the packet and locates the protocol control block (PCB). If the incoming packet is a SYN request for a listen socket, a new data socket is created and placed in the partial listen queue and an ACK is sent back to the client. When the ACK for the SYN-ACK is received, the TCP 3-way handshake is complete, the connection moves to an established state, and the data socket is moved to the listen queue. The sleeping process, e.g., the web server, waiting on the accept call is woken up. The connection is ready to receive data.
3 Architecture Design
The network subsystem architecture adds three control mechanisms that are placed at different stages of a TCP connection's lifetime. Figure 1 shows the various phases in the connection setup and the corresponding control mechanisms: (i) when a SYN packet is processed, it triggers the SYN rate control and selective drop; (ii) when the 3-way handshake is completed, the prioritized listen queue selectively changes the ordering of accepted connections in the listen queue; (iii) when the HTTP header is received, the header-based connection control applies rate and priority controls based on application-layer information. Each of these mechanisms can be activated at varying degrees of overload, where the earliest and simplest control is triggered at the highest load level.
3.1 SYN Policer
TCP SYN policing controls the rate and burst at which new connections are accepted. Arriving TCP SYN packets are policed using a token bucket profile defined by the pair <rate, burst>, where rate is the average number of new requests admitted per second and burst is the maximum number of concurrent new requests. Incoming connections are aggregated using specified filter rules that are based on the connection end points (source and destination addresses and ports, as shown in Table 2). On arrival at the server, the SYN packet is classified using the IP/TCP header information to determine the matching rule. A compliance check is performed against the token bucket profile of the rule. If compliant, a new data socket is created and inserted in the partial listen queue; otherwise the SYN packet is silently discarded.
When the SYN packet is silently dropped, the requesting client will time out waiting for a SYN-ACK and retry again with an exponentially increasing time-out value². An alternate option, which we do not consider, is to send a TCP RST to reset the connection, indicating an abort from the server. This approach, however, incurs unnecessary extra overhead. Secondly, some clients send a new SYN immediately after a TCP RST is received instead of aborting the connection. Note that we drop non-compliant SYNs even before a socket is created for the new connection, thereby investing only a small amount of overhead on requests that are dropped.
To provide service differentiation, connection requests are aggregated based on filters and each aggregate has a separate token bucket profile. Filtering based on client IP addresses is useful since a few domains account for a significant portion of a web server's requests [12]. The rate and burst values are enforced only when overload is detected and can be dynamically controlled by an adaptation agent, the details of which are beyond the scope of this paper.
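The <rate, burst> compliance check described above is a standard token bucket. The following minimal sketch shows the refill-and-test logic; the names and structure are illustrative, not the AIX implementation:

```python
class TokenBucket:
    """Token bucket policer: `rate` tokens/sec, at most `burst` saved up."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)   # bucket starts full
        self.last = 0.0              # time of the previous compliance check

    def compliant(self, now):
        # Refill proportionally to elapsed time, capped at the burst size.
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0       # admit this SYN
            return True
        return False                 # non-compliant: silently drop the SYN
```

With rate = 10 and burst = 2, two back-to-back SYNs are admitted, a third simultaneous one is dropped, and admission resumes once enough time has passed for the bucket to refill.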
² The time-out values are typically set to 6, 24, 48, up to 75 seconds.
[Figure 2: Implementation of the prioritized listen queue. A socket QoS attributes structure (so_qos) holds a priority table (ptable) of pointers; each pointer marks the last socket{} of its priority class (here priorities 1, 2, 2, 3) on the so_q chain, alongside the usual so_qlen and so_qlimit fields.]
3.2 Prioritized Listen Queue
The prioritized listen queue reorders the listen queue of a server process based on pre-defined connection priorities such that the highest priority connection is located at the head of the queue. The priorities are associated with filters (see Table 2) and connections are classified into different priority classes. When a TCP connection is established, it is moved from the partial listen queue to the listen queue. We insert the socket at the position corresponding to its priority in the listen queue. Since the server process always removes the head of the listen queue when calling accept, this approach provides better service, i.e., lower delay and higher throughput, to connections with higher priority.

Figure 2 shows the implementation of a prioritized listen queue. A special data structure used for maintaining socket QoS attributes stores an array of priority pointers. Each priority pointer points to the last socket of the corresponding priority class. This allows efficient socket insertion: a new socket is always inserted behind the one pointed to by the corresponding priority pointer.
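A minimal sketch of this insertion policy follows. For brevity, per-class counts take the place of the kernel's last-socket-of-class pointers; both yield the same insertion position (behind the last queued socket of the same or higher priority), preserving FIFO order within a class. Names are invented for illustration:

```python
class PrioritizedListenQueue:
    """Listen queue ordered by priority class; class 0 is highest."""

    def __init__(self, nclasses):
        self.q = []                        # head (index 0) is what accept() removes
        self.counts = [0] * nclasses       # sockets currently queued per class

    def insert(self, sock, prio):
        # Insert behind the last socket whose class is prio or better,
        # i.e., at the boundary the kernel's priority pointer marks.
        pos = sum(self.counts[:prio + 1])
        self.q.insert(pos, (sock, prio))
        self.counts[prio] += 1

    def accept(self):
        # The server process always removes the head of the listen queue.
        sock, prio = self.q.pop(0)
        self.counts[prio] -= 1
        return sock
```

Sockets of the highest class are accepted first, and two sockets of the same class are accepted in arrival order.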
3.3 HTTP Header-based Controls
The SYN policer and prioritized listen queue have limited knowledge about the type and nature of a connection request, since they are based on the information available in the TCP and IP headers. For web servers, where the majority of the traffic is HTTP, it has been observed that a majority of the load is caused by a few CGI requests and most of the bytes transferred belong to a small set of large files. This suggests that targeting specific URLs, types of URLs, or cookie information for service differentiation can have a wide impact during overload.

[Figure 3: The HTTP header-based connection control mechanism. An HTTP request that hits the in-kernel cache is served in the kernel; otherwise the URL is parsed and the action table consulted. Drop-marked and rate-non-compliant connections are dropped; otherwise the priority is updated (if found) and the listen queue reordered before the connection is accepted by the server. Requests not in the table receive priority treatment only.]

Table 1: URL action table

  URL           ACTION
  *noaccess*    <drop>
  /shop.html    <priority=1>
  /index.html   <rate=15 conn/sec, burst=5 conn>, <priority=1>
  /cgi-bin/*    <rate=10, burst=2>
Our third mechanism, HTTP header-based connection control, enables content-based connection control by examining application layer information in the HTTP header, such as the URL name or type (e.g., CGI requests) and other application-specific information available in cookies. The control is applied in the form of rate policing and priorities based on URL names and types and cookie attributes.

This mechanism involves parsing the HTTP header in the kernel and waking the sleeping web server process only after a decision to service the connection is made. If a connection is discarded, a TCP RST is sent to the client and the socket receive buffer contents are flushed.
Table 2: Example network-level policies

  (dst IP, dst port, src IP, src port)    (r, b)      priority
  (*, 80, *, *)                           (300, 5)    3
  (*, 80, 10.1.1.1, *)                    (100, 5)    2
  (12.1.1.1, 80, *, *)                    (10, 1)     *

For URL parsing, our implementation relies upon AFPA, an in-kernel web cache on AIX; for Linux, an in-kernel web engine called KHTTPD is available [14]. As opposed to the normal operation, where the sleeping process is woken up after a connection is established, AFPA responds to cached HTTP requests directly without waking up the server process. With AFPA, a connection is not moved out of the partial listen queue even after the 3-way handshake is over. The normal data flow of TCP continues with the data being stored in the socket receive buffer. When the HTTP header is received (that is, when the AFPA parser finds two CR control characters in the data stream), AFPA checks for the object in its cache. On a cache miss, the socket is moved to the listen queue and the web server process is woken up to service the request.
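The header-complete test, finding the blank line that terminates an HTTP/1.x header, can be sketched as a scan over the accumulated socket receive buffer as data packets arrive. This is an illustrative stand-in, not AFPA code:

```python
def header_complete(buf: bytes) -> bool:
    """Return True once the receive buffer holds a full HTTP header.

    An HTTP/1.x header ends at the first empty line (CRLF CRLF); some
    clients send bare LF LF, which lenient servers also accept.
    """
    return b"\r\n\r\n" in buf or b"\n\n" in buf
```

Until this predicate holds, the connection stays in the partial listen queue and the kernel keeps appending incoming data packets to the buffer.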
The HTTP header-based connection control mechanism comes into play at this juncture, as illustrated in Figure 3, before the socket is moved out of the partial listen queue. The URL action table (Table 1) specifies three types of actions/controls for each URL or set of URLs. A drop action implies that a TCP RST is sent before discarding the connection from the partial listen queue and flushing the socket receive buffer. If a priority value is set, it determines the location of the corresponding socket in the ordered listen queue. Finally, rate control specifies a token bucket profile of a <rate, burst> pair which drops out-of-profile connections, similar to the SYN policer.
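A plain-Python model of such an action-table lookup is shown below, assuming first-match-wins glob patterns. The table contents mirror Table 1; the function name and the dictionary encoding of actions are invented for illustration:

```python
import fnmatch

# Hypothetical in-memory form of the URL action table (cf. Table 1):
# each entry maps a URL pattern to a drop flag, a priority, and/or a
# <rate, burst> token-bucket profile.
ACTIONS = [
    ("*noaccess*",  {"drop": True}),
    ("/shop.html",  {"priority": 1}),
    ("/index.html", {"priority": 1, "rate": 15, "burst": 5}),
    ("/cgi-bin/*",  {"rate": 10, "burst": 2}),
]

def lookup(url):
    # First matching pattern wins; unmatched URLs get default treatment.
    for pattern, action in ACTIONS:
        if fnmatch.fnmatch(url, pattern):
            return action
    return {}
```

A drop entry would trigger the RST-and-flush path described above; a rate entry would feed a per-URL token bucket like the one sketched for the SYN policer.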
3.4 Filter Specification

A filter rule specifies the network-level and/or application-level attributes that define an aggregate and the parameters for the control mechanism that is associated with it. A network-level filter is a four-tuple consisting of local IP address, local port, remote IP address, and remote port; application-level filters were shown in Table 1. Table 2 lists some network-level filter examples. The first rule applies to all port 80 traffic: such connections to the server are rate-controlled at a rate of 300 conns/sec, a burst of 5, and a priority of 3 (the default lowest priority). The filter rules can contain ranges of IP addresses, wildcards, etc.

[Figure 4: Enhanced protocol stack architecture. A TCP SYN is policed in the kernel's QoS engine (non-compliant SYNs are dropped); after the ACK for the TCP SYN, the connection waits in the partial listen queue; the HTTP GET triggers URL-based admission control and priority assignment (possibly sending a RST); admitted connections are placed in the prioritized listen queue (priorities 1 to 3) and accepted by the web server, which sends the reply. An adaptation/policy agent installs connection-based policies and collects system statistics through a special API at the socket layer.]
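Four-tuple matching with "*" wildcards can be sketched as follows. The rules mirror Table 2; the first-match-wins semantics, and the convention of listing more specific rules before the catch-all, are assumptions of this sketch rather than details stated in the paper:

```python
# (dst IP, dst port, src IP, src port) -> (rate, burst, priority);
# "*" matches anything. More specific rules come first; the last rule
# is the port-80 catch-all with the default lowest priority.
RULES = [
    (("*", 80, "10.1.1.1", "*"), (100, 5, 2)),
    (("12.1.1.1", 80, "*", "*"), (10, 1, None)),
    (("*", 80, "*", "*"),        (300, 5, 3)),
]

def classify(dst_ip, dst_port, src_ip, src_port):
    """Return the (rate, burst, priority) of the first matching rule."""
    packet = (dst_ip, dst_port, src_ip, src_port)
    for rule, params in RULES:
        if all(r == "*" or r == p for r, p in zip(rule, packet)):
            return params
    return None  # no filter applies; no control enforced
```

A real classifier would also handle address ranges and would typically order rules by specificity automatically; a linear first-match scan is the simplest policy consistent with the table.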
3.5 Protocol Stack Architecture
We have developed architectural enhancements for Unix-based servers to provide these mechanisms. Figure 4 shows the basic components of the enhanced protocol stack architecture, with the new capabilities utilized either by user-space agents or applications themselves. This architecture permits control over an application's inbound network traffic via policy-based traffic management [10]; an adaptation/policy agent installs policies into the kernel via a special API. The policy agent interacts with the kernel via an enhanced socket interface by sending (receiving) messages to (from) special control sockets. The policies specify filters to select the traffic to be controlled, and actions to perform on the selected traffic. The figure shows the flow of an incoming request through the various control mechanisms.
3.6 Implementation Methodology and Testbed
We have implemented the proposed kernel mechanisms in a QoS module, as described below. As shown in Figure 4, the QoS module contains the TCP SYN policer, a priority assignment function for new connections, and the entity that performs URL-based admission control and priority assignment.
All experiments were conducted on a testbed comprising an IBM HTTP Server running on a 375 MHz RS/6000 machine with 512 MB memory, several 550 MHz Pentium III clients running Linux, and one 166 MHz Pentium Pro client running FreeBSD. The server and clients are connected via a 100BaseT Ethernet switch. For client load generators we use Webstone 2.5 [15] and a slightly modified version of sclient [16]. Both programs measure client throughput in connections per second. The experimental workload consists of static and dynamic requests. The dynamic files are minor modifications of standard Webstone CGI files that simulate memory consumption of real-world CGIs.
The IBM HTTP Server is a modified Apache [17] 1.3.12 web server that utilizes an in-kernel HTTP get engine called the Advanced Fast Path Architecture (AFPA). We use AFPA in our architecture only to perform the URL parsing and have disabled any caching when measuring throughput results. Unless stated otherwise, we configured Apache to use a maximum of 150 server processes.
4 Experimental Results
4.1 Efficacy of SYN Policing
In this section we show how TCP SYN policing protects a preferred client against flash crowds or high request rates from other clients. In our setup, one client replays a large e-tailer's trace file representing a preferred customer. For the competing load we use five machines running Webstone, each with 50 clients. All clients request an 8 KB file, which is reasonable since a typical HTTP transfer is between 5 and 13 KB [12].
Without SYN policing, the e-tailer's client receives a low throughput of about 6 KB/sec. Using policing to lower the acceptance rate of the Webstone clients, we expect the throughput for the e-tailer's client to increase. Figure 5 shows that the throughput for the e-tailer's client does increase as the accepted rate of the non-preferred clients is lowered from 300 reqs/sec to 25 reqs/sec. The experiment demonstrates that a preferred client can be successfully protected by rate-controlling the connection requests of other greedy clients.

[Figure 5: Throughput of the preferred e-tailer's client with and without TCP SYN policing. On the X-axis is the SYN policing rate of the non-preferred Webstone clients, which continuously generate requests. The Y-axis shows the corresponding throughput received by the e-tailer's client when there was no SYN control and when SYN control was enforced.]
TCP SYN policing works well when client identities and request patterns are known. In general, however, it is difficult to correctly identify a misbehaving group of clients. Moreover, as discussed below, it is hard to predict the rate control parameters that enable service differentiation for preferred clients without under-utilizing the server. For effective overload prevention the policing rate must be dynamically adapted to the resource consumption of accepted requests.
4.2 Impact of Burst Size
In the previous experiment we did not analyze the effect of the burst size on the effective throughput. The burst size is the maximum number of new connections accepted concurrently for a given aggregate. With a large burst size, greedy clients can overload the server, whereas with a small burst, clients may be rejected unnecessarily. The burst size also controls the responsiveness of rate control. There is a tradeoff, however, between responsiveness and the achieved throughput.
[Figure 6: Impact of burst size on preferred client throughput. The burst size for policing the non-preferred client is varied from 5 to 50 while the connection acceptance rate is fixed at 50 conn/sec. The plot shows the throughput achieved by the preferred and non-preferred clients along with the total throughput.]

Figure 6 shows the impact of the burst size on the throughput of a preferred client. In our experiment, the non-preferred client is a modified sclient program that makes 50 to 80 back-to-back connection requests about twice a second, in addition to the specified request rate. Both the length of the incoming request burst and its timing are randomized.
Figure 6 shows the throughput of the preferred and non-preferred clients with the SYN policing rate of the non-preferred client set to 50 conn/sec and the burst size varying from 5 to 50. The non-preferred sclient program requests a 16 KB dynamically generated CGI file. The preferred client is a Webstone program with 40 clients, requesting a static 8 KB file. As the burst size is increased from 5 to 50, the sclient's throughput increases from 36.6 conns/sec (585.6 KB/sec) to 47.7 conns/sec (752 KB/sec), while the throughput received by the preferred client decreases from about 140 conns/sec (1117 KB/sec) to 79 conns/sec.

Intuitively the overall throughput should have increased; however, the observed decrease in total throughput is due to the fact that we accept more CPU-consuming CGI requests from sclient, thereby incurring a higher overhead per byte transferred.
4.3 Prioritized Listen Queue: Simple Priority
With TCP SYN policing, one must limit the greedy non-preferred clients to a meaningful rate during overload; the prioritized listen queue instead differentiates connections by priority. We demonstrate next that the prioritized listen queue provides service differentiation, especially with a large listen queue length.
In our experiments we classify clients into three priority levels. Clients belonging to a common priority level are all created by a Webstone benchmark that requests an 8 KB file. A separate Webstone instance is used for each priority level. We measure client throughput for each priority level while varying the total number of clients in each class. Each priority class uses the same number of clients.
In the first experiment, the Apache server is configured to spawn a maximum of 50 server processes. The results in Figure 7 show that when the total number of clients is small, all priority levels achieve similar throughput. With fewer clients, server processes are always free to handle incoming requests. Thus, the listen queue remains short and almost no reordering occurs. As the number of clients increases, the listen queue builds up since there are fewer Apache processes than concurrent client requests. Consequently, with re-ordering the throughput received by the high priority client increases, while that of the two lower priority clients decreases. Figure 7 shows that with more than 30 Webstone clients per class only the high-priority clients are served, while the lower-priority clients receive almost no service.
Figure 8 illustrates the effect on response times observed by clients of the three priority classes. It can be seen that as the number of clients increases across all priority classes, the response time for the lower priority classes increases exponentially. The response time of the high priority class, on the other hand, only increases sub-linearly. When the number of high priority requests increases, the lower priority ones are shifted back in the listen queue, thereby increasing their response times. Also, as more high priority requests get serviced by the different server processes running in parallel and competing for the CPU, their response times increase.

We also observed that when the number of high priority requests was fixed and the lower priority request rate was steadily increased, the response time of the high priority requests remained unaffected.
The priority-based approach enables us to give low delay and high throughput to preferred clients independent of the requests or request patterns of other clients. However, one may need many priority classes for different levels of service. The main drawback of a simple priority ordering is that it provides no protection against starvation of low-priority requests.

[Figure 7: Throughput with the prioritized listen queue and 3 priority classes with 50 Apache processes. The number of clients in each class remains equal.]
4.4 Combining Policing and Priority
To prevent starvation, low priority requests need to have some minimum number of reserved slots in the listen queue so that they are not always preempted by a high priority request. However, reserving slots in the listen queue arbitrarily could cause a high priority request to find a full listen queue, which would in turn cause it to be aborted after its 3-way handshake is completed. To avoid starvation with fixed priorities, we combine the listen queue priorities with SYN policing to give preferred clients higher priority but limit their maximum rate and burst, thereby implicitly reserving some slots in the queue for the lower priority requests.
Table 3 shows the results for experiments with three sets of Webstone clients with different priorities and rate control of the high priority class. Each lower priority class has 30 Webstone clients while the high priority class has 150 Webstone clients spread over three different hosts. With no SYN policing of the clients in the high priority class, the two low-priority classes are completely starved. Table 3 shows that with a (300,300) rate limit on the high priority class, the medium and low priority clients achieve throughputs of 78.6 and 4.1 conn/sec respectively.

[Figure 8: Response time with the prioritized listen queue and 3 priority classes with 50 Apache processes. The number of clients in each class remains equal.]

Table 3: TCP SYN policing of a high-priority client to avoid starvation of other clients.

  Throughput (conn/sec) of each priority class

  client      (rate,burst) limit of high priority
  priority    none      (300,300)   (200,200)
  high        381       306         196
  medium      0         78.6        180
  low         0         4.1         13
4.5 HTTP Header-based Connection Control
In this section we illustrate the performance and effectiveness of admission control and service differentiation based on information in the HTTP headers, i.e., URL name and type, cookie fields, etc.
Rate control using URLs: In our experimental scenario the preferred client replaying the e-tailer's trace needs to be protected from overload due to a large number of high-overhead CGI requests from non-preferred clients. The client issuing CGI requests is an sclient program requesting a dynamic file of length 5 KB at a very high rate. Figure 9 shows that without any protection, the preferred e-tailer's customer receives a low throughput; as the accepted rate of the non-preferred dynamic requests is lowered from 40 reqs/sec to 2 reqs/sec, the throughput of the preferred e-tailer's customer improves from 1 KB/sec to 650 KB/sec. In contrast to TCP SYN policing (Figure 5), URL rate control targets a specific URL causing overload instead of a client pool.

[Figure 9: URL-based policing to protect preferred e-tailer's customers. The graph shows the resulting throughput of the preferred e-tailer's client as a specific high-overhead CGI request is limited to a given number of conn/sec.]
URL priorities: In this section we present the results of priority assignments in the listen queue based on the URL name or type being requested. The clients are Webstone benchmarks requesting two different URLs, both corresponding to files of size 8 KB. There are two priority classes in the listen queue based on the two requested URLs. Figure 10 shows that the lower priority clients (accessing the low priority URL) receive lower throughput and are almost starved when the number of clients requesting the high priority URL exceeds 40. These results correspond to the results shown earlier with priorities based on the connection attributes (see Figure 7). The average total throughput, however, is slightly lower with URL-based priorities due to the additional overhead of URL parsing.
Combined URL-based rate control and priorities:
Toavoid starvation of requests forthe low-priority
URL,weratelimittherequestsforthehigh-priority
URL.Inthisexperiment,weassignahigherpriority
torequestsforadynamic CGIrequestofsize5KB
(requestedbyansclientprogram), andlower
prior-ity to requests for a static 8KB le (requested by
theWebstoneprogram). Table4showsthat
0
50
100
150
200
250
300
350
400
0
5
10
15
20
25
30
35
40
45
throughput (conn/sec)
Number of Webstone clients
"high priority client"
"low priority client"
Figure 10: Throughput with 2 URL-based priorities and 50 Apache server processes. The number of clients in each class is equal.
Table 4: URL-based policing of a high-priority client to avoid starvation of other clients.

                        Throughput (conn/sec)
  client          (rate, burst) limit of high priority
  priority      none        (30,10)       (10,10)
  high          61.7        29.0          10.1
  low           0           10.2          117
4.5.1 Overload Protection from High-Overhead Requests

So far we have used the URL-based controls for providing service differentiation based on URL names and types. In the next experiment, we investigate if URL-based connection control can be used to protect a web server from overload by a targeted control of high-overhead requests (e.g., CGI requests that require large computation or database access).
We use the sclient load generator to request a given high-overhead URL and control the request rate, steadily increasing it and measuring the throughput. Figure 11 shows the client's throughput with varying request rates for a dynamic CGI request that generates a file of size 29 KB. The throughput increases linearly with the request rate up to a critical point of about 63 connections/sec. For any further increase in the request rate, the throughput falls steeply and later plateaus at around 40 connections/sec. To understand this behavior we used vmstat to capture the paging statistics. Since the dynamic requests are memory-intensive, the available free memory rapidly decreases.
[Plot: throughput (conn/sec) vs. request rate; curves: no control, url-based control]
Figure 11: Overload protection from high-overhead requests using URL-based connection control. The graph shows the throughput of a web server with no controls servicing CPU-intensive CGI requests and the corresponding throughput when the CGI requests are limited to 60 reqs/sec.
As the request rate and the number of active processes grow, the available free memory falls to zero. Eventually the system starts thrashing as the CPU spends most of the time waiting for pending local disk I/O. In the above experiment with 150 server processes and a request rate of 63 reqs/sec, the wait time starts increasing as indicated by the wait field of the output from vmstat.
To prevent overload we use URL-based connection control to limit the number of accepted dynamic CGI requests to a rate of 60 reqs/sec and a burst of 10. The dashed line in Figure 11 shows that with URL-based control the throughput stabilizes at 60 reqs/sec and the server never thrashes. In the above experiment, the URL-based connection control can handle request rates of up to 150 requests per second. For request rates beyond that, however, thrashing starts as the kernel overhead of setting up connections, parsing the URL, and sending the RSTs becomes substantial.
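The rate/burst limit used here (60 reqs/sec with a burst of 10) is a token bucket. A minimal sketch of such a policer follows; the names are illustrative and not taken from the actual kernel code:

```c
/* Token-bucket sketch of the rate/burst policer described above.
 * Tokens accumulate at 'rate' per second up to 'burst'; each
 * admitted request consumes one token. */
struct policer {
    double rate;      /* tokens added per second        */
    double burst;     /* bucket capacity                */
    double tokens;    /* currently available tokens     */
    double last;      /* timestamp of last update (sec) */
};

/* Returns 1 if the request is compliant (admit), 0 otherwise
 * (e.g., send a TCP RST).  'now' is the current time in seconds. */
int policer_admit(struct policer *p, double now)
{
    p->tokens += (now - p->last) * p->rate;
    if (p->tokens > p->burst)
        p->tokens = p->burst;
    p->last = now;
    if (p->tokens >= 1.0) {
        p->tokens -= 1.0;
        return 1;
    }
    return 0;
}
```

With rate 60 and burst 10, at most 10 back-to-back requests are admitted before the policer falls back to the steady 60 reqs/sec refill rate.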
To further delay the onset of thrashing we augment the URL-based control with the TCP SYN policer. For every TCP RST that is sent, we drop any subsequent SYN request from that same client for a given period of time.
Table 5: Performance of AFPA and matching a URL to a rule for an 8 KB file with different URL lengths.

                         Throughput (conn/sec)
  URL          AFPA      AFPA on (no cache),   AFPA on (no cache),
  length       off       no rule               matching rule
  11 char.     370.1     340.5                 338.3
  80 char.     361.5     321.9                 319.4
  160 char.    355.1     321.1                 303.7
Table 6: Overhead of kernel mechanisms

  Operation                                       Cost (µsec)
  TCP SYN policing               1 filter rule    7.9
                                 3 filter rules   9.6
  classification and priority    1 rule           4.4
                                 3 rules          5.0
  AFPA including URL parsing                      19
  URL-based rate control         1 rule           5.0
  including URL matching         2 rules          5.8
                                 3 rules          6.5
  URL-based priority             1 rule           3.8
  including URL matching         2 rules          4.1
                                 3 rules          4.3
4.5.2 Discussion

The HTTP header-based rate control relies on sending TCP RST to terminate non-preferred connections as and when necessary. In a more user-friendly implementation we could directly return an HTTP error message (e.g., server busy) back to the client and close the connection.
Our current implementation of URL-based control handles only HTTP/1.0 connections. We are currently exploring different mechanisms for HTTP/1.1 with keep-alive connections to limit the number and types of requests that can be serviced on the same persistent connection. The experiments in the previous section have only presented results on URL-based controls. Similar controls can be set based on the information in cookies that can capture transaction-level state.

We quantify the overhead of matching URLs in the
kernel for varying URL lengths. Table 5 shows that the overhead of matching a URL to a rule is moderate (under 6% for a 160 character URL). The throughput numbers are for 20 Webstone clients requesting an 8 KB file. Rules are matched using the standard string comparison (strcmp) with no optimizations; better matching techniques can reduce this overhead significantly. On a cache miss, the in-kernel AFPA cache introduces an overhead of about 10% for an 8 KB file. However, the AFPA cache under normal conditions increases performance significantly for cache hits. In our experiments we have the cache size set to 0 so that AFPA cannot serve any object from the cache. When caching is enabled, Webstone received a throughput of over 800 connections per second on a cache hit.
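The unoptimized rule matching mentioned above, a linear scan using strcmp, can be sketched as follows. The structure and function names are illustrative, not from the actual implementation:

```c
#include <stddef.h>
#include <string.h>

/* Sketch of the unoptimized rule lookup: the requested URL is
 * compared against each rule with strcmp until the first match. */
struct url_rule {
    const char *url;
    int action;               /* e.g., priority class or rate index */
};

/* Linear first-match scan; returns the matching rule or NULL. */
const struct url_rule *match_url(const struct url_rule *rules,
                                 size_t n, const char *url)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (strcmp(rules[i].url, url) == 0)
            return &rules[i];
    return NULL;
}
```

The cost of this scan grows with both rule count and URL length, which matches the trend in Table 5; a trie or hash lookup would make it largely independent of the number of rules.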
Table 6 summarizes the additional overhead of the implemented kernel mechanisms. The overhead of compliance check and filter matching for TCP SYN policing with 1 filter rule is 7.9 µsecs. Simply matching the filter, allocating space to store QoS state, and setting the priority adds an overhead of around 4.4 µsecs for 1 filter rule. The policing controls are more expensive as they include accessing the clock for the current time. Surprisingly, the URL matching and rate control has a low overhead of 5.0 µsecs for a URL of 11 chars. This happens to be lower than SYN policing as the strcmp matching is cheaper for one short URL compared to matching multiple IP addresses and port numbers. The overhead of URL matching and setting priorities for a single rule is around 3.8 µsecs. The most expensive operation is the call to AFPA to parse the URL. AFPA not only parses the URL, but also does logging, checks if the requested object is in the network buffer cache, and pre-computes the HTTP response header.
5 Comparison of User Space and Kernel Mechanisms

In this section we compare the effectiveness of our kernel mechanisms with overload protection and service differentiation mechanisms implemented in user space. One might argue that kernel-based mechanisms are less flexible and more difficult to
[Plot: Webstone throughput (conn/sec) vs. sclient rate (reqs/sec); curves: TCP SYN policing, URL control, Apache control]
Figure 12: Throughput of kernel-based TCP SYN policing, kernel-based URL rate control, and Apache module-based connection rate control. The throughput achieved by Webstone clients is measured against an increasing request load generated by sclient. The sclient requests are rate controlled to 10.0 req/sec with a burst of 2.
deploy, while user-space mechanisms offer richer capabilities and have easy access to application-layer information. However, kernel mechanisms are more scalable and provide much better performance. In general, placing mechanisms in the kernel is beneficial if it leads to considerable performance gains and increases the robustness of the server without relying on the application layer to prevent overload.
To enable a fair comparison we have extended the Apache 1.3.12 server with additional modules [18] that police requests based on the client IP address and requested URL. The implemented rate control schemes use exactly the same algorithms as our kernel-based mechanisms. If a request is not compliant, we send a "server temporarily unavailable" (503 response code) back to the client and close the connection.
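The user-space rejection path can be sketched as follows. By the time an application module can refuse a request, the connection has already been accepted and the request read, so the server still pays for the context switch and a full HTTP response. This is an illustrative sketch of the 503 rejection on an accepted socket, not the actual module code:

```c
#include <string.h>
#include <unistd.h>

/* Minimal "server temporarily unavailable" response, matching the
 * 503 rejection described above. */
static const char busy_503[] =
    "HTTP/1.0 503 Service Unavailable\r\n"
    "Connection: close\r\n"
    "\r\n"
    "Server temporarily unavailable\r\n";

/* Send the 503 response on an already-accepted connection and
 * close it. */
void reject_request(int connfd)
{
    write(connfd, busy_503, sizeof busy_503 - 1);
    close(connfd);
}
```

The in-kernel controls, by contrast, drop a non-compliant SYN or send a RST before any of this work is done, which is the source of the efficiency gap measured below.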
The experimental setup consists of a Webstone traffic generator with 100 clients requesting a file of size 8 KB along with an sclient program requesting a file of size 16 KB. The sclient's requests are rate controlled with a rate of 10 requests per second and a burst of 2; there are no controls set for the Webstone clients. During our experiments, we steadily increased the sclient's request rate.
Figure 12 illustrates that when the request load of the sclient program is low (20 reqs/sec), the Webstone throughput is 392 conn/sec and 387.3 conn/sec for TCP SYN policing and Apache user-level controls respectively. These controls limit the sclient
[Plot: Webstone response time (seconds) vs. sclient rate (reqs/sec); curves: TCP SYN policing, URL control, Apache control]
Figure 13: Response times using kernel-based TCP SYN policing, kernel-based URL rate control, and Apache module-based connection rate control. The response time achieved by Webstone clients is measured against an increasing request load generated by sclient. The sclient requests are rate controlled to 10 req/sec with a burst of 2.
to its policed rate. With in-kernel URL-based rate control the throughput is lower (354 conn/sec). This lower throughput is caused by the additional 10% overhead added by AFPA with no caching, as discussed in Section 4.6: with the cache size set to zero, we add more overhead than necessary for URL parsing, without the corresponding gains from AFPA caching.
As the sclient's request load increases further, TCP SYN policing is able to achieve a sustained throughput for the Webstone clients, while the Apache-based controls show a marked decline in throughput. The graph shows that for an sclient load of 650 reqs/sec, the Webstone throughput for TCP SYN policing is 374 conn/sec; for in-kernel URL-based connection control it is 260.7 conn/sec; for Apache user-level controls the throughput sinks to about 75 conn/sec. The corresponding results for response times are shown in Figure 13.
The experiment demonstrates that the kernel mechanisms are more efficient and scalable than the user space mechanisms. There are two main reasons for the higher efficiency and scalability. First, non-compliant connection requests are discarded earlier, reducing the queuing time of the compliant requests; in particular, less CPU is consumed and the context switch to user space is avoided. Second, when implementing rate control at user space, the server must still accept each connection and read the request before it can reject it, consuming resources even for non-compliant requests.
6 Related Work

Several research efforts have focused on admission control and service differentiation in web servers [19], [20], [21], [22], [8] and [23]. Almeida et al. [8] use priority-based schemes to provide differentiated levels of service to clients depending on the web pages accessed. While in their approach the application, i.e., the web server, determines request priorities, our mechanisms reside in the kernel and can be applied without context-switching to user level. WebQoS [23] is a middleware layer that provides service differentiation and admission control. Since it is deployed in user space, it is less efficient compared to kernel-based mechanisms. While WebQoS also provides URL-based classification, the authors do not present any experiments or performance considerations. Cherkasova et al. [20] present an enhanced web server that provides session-based admission control to ensure that longer sessions are completed. Crovella et al. [24] show that client response time improves when web servers serving static files serve shorter connections before handling longer connections. Our mechanisms are general and can easily realize such a policy.
Reumann et al. [25] have presented virtual services, a new operating system abstraction that provides resource partitioning and management. Virtual services can enhance our scheme by, for example, dynamically controlling the number of processes a web server is allowed to fork. In [26] Jamjoom and Reumann describe an adaptive mechanism to set up rate controls for overload protection. The receiver livelock study [2] showed that network interrupt handling could cause server livelocks and should be taken into consideration when designing process scheduling mechanisms. Banga and Druschel's [27] resource containers enable the operating system to account for and control the consumption of resources. To shield preferred clients from malicious or greedy clients, one can assign them to different containers. In the same paper they also describe a multi-listen-socket approach for priorities in which a filter splits a single listen queue into multiple queues from which connections are accepted separately and accounted to different principals. Our approach is similar; however, connections are accepted from the same single listen queue but inserted in the queue based on priority. Kanodia et al. [21] present a simulation study of queuing-based algorithms for admission control and service differentiation, controlling the admission rate per class. Aron et al. [28] describe a scalable request distribution architecture for clusters and also present resource management techniques for clusters.
Scout [29], Rialto [30] and Nemesis [31] are operating systems that track per-application resource consumption and restrict the resources granted to each application. These operating systems can thus provide isolation between applications as well as service differentiation between clients. However, there is a significant amount of work involved in porting applications to these specialized operating systems. Our focus, in contrast, was not to build a new operating system or networking architecture but to introduce simple controls in the existing architecture of commercial operating systems that could be just as effective.
7 Conclusions and Future Work

In this paper, we have presented three in-kernel mechanisms that provide service differentiation and admission control for overloaded web servers. TCP SYN policing limits the number of incoming connection requests using a token bucket policer and prevents overload by enforcing a maximum acceptance rate for non-preferred clients. The prioritized listen queue provides low delay and high throughput to clients with high priority, but can starve low priority clients. We show that starvation can be avoided by combining priorities with TCP SYN policing. Finally, URL-based connection control provides in-kernel admission control and priority based on application-level information such as URLs and cookies. This mechanism is very powerful and can, for example, prevent overload caused by dynamic requests. We compared the kernel mechanisms to similar application-layer controls added in the Apache server and demonstrated that the kernel mechanisms are much more efficient and scalable than the Apache user-level controls.
The kernel mechanisms that we presented rely on the existence of accurate policies that control the operating range of the server. In a production system it is unrealistic to assume knowledge of the optimal operating region of the server. We are currently implementing a policy adaptation agent (Figure 4) to select appropriate values for the various policies and to monitor the interaction between the various control options and their effect on overall performance during overload.

Our current implementation does not address security issues of fake IP addresses and client identities. We plan to integrate a variety of overload prevention policies with traditional firewall rules to provide an integrated solution.
References

[1] "Cisco TCP intercept," http://www.cisco.com.

[2] J. C. Mogul and K. K. Ramakrishnan, "Eliminating receive livelock in an interrupt-driven kernel," in Proc. of USENIX Annual Technical Conference, Jan. 1996.

[3] P. Druschel and G. Banga, "Lazy receiver processing (LRP): a network subsystem architecture for server systems," in Proc. of OSDI, Oct. 1996, pp. 91-105.

[4] O. Spatscheck and L. Peterson, "Defending against denial of service attacks in Scout," in Proc. of OSDI, Feb. 1999.

[5] "Ensim corporation: virtual servers," http://www.ensim.com.

[6] "Alteon web systems," http://www.alteonwebsystems.com.

[7] "Cisco arrowpoint web network services," http://www.arrowpoint.com.

[8] J. Almeida, M. Dabu, A. Manikutty, and P. Cao, "Providing differentiated levels of service in web content hosting," in Proc. of Internet Server Performance Workshop, Mar. 1999.

[9] T. Barzilai, D. Kandlur, A. Mehra, and D. Saha, "Design and implementation of an RSVP-based quality of service architecture for an integrated services internet," IEEE Journal on Selected Areas in Communications, vol. 16, no. 3, pp. 397-413, Apr. 1998.

[10] A. Mehra, R. Tewari, and D. Kandlur, "Design considerations for rate control of aggregated TCP connections," in Proc. of NOSSDAV, June 1999.

[11] G. R. Wright and W. R. Stevens, TCP/IP Illustrated, Volume 2, Addison-Wesley Publishing Company, 1995.

[12] M. F. Arlitt and C. L. Williamson, "Web server workload characterization: The search for invariants," in Proc. of ACM Sigmetrics, Apr. 1996.

[13] P. Joubert, R. King, R. Neves, M. Russinovich, and J. Tracey, "High performance memory based web caches: Kernel and user space performance," in http://www.fenrus.demon.nl/.

[15] "Webstone," http://www.mindcraft.com.

[16] G. Banga and P. Druschel, "Measuring the capacity of a web server," in Proc. of USITS, Dec. 1997.

[17] "Apache," http://www.apache.org.

[18] L. Stein and D. MacEachern, Writing Apache Modules with Perl and C, O'Reilly, 1999.

[19] T. Abdelzaher and N. Bhatti, "Web server QoS management by adaptive content delivery," in Int. Workshop on Quality of Service, June 1999.

[20] L. Cherkasova and P. Phaal, "Session based admission control: a mechanism for improving the performance of an overloaded web server," Tech. Rep., Hewlett Packard, 1999.

[21] V. Kanodia and E. Knightly, "Multi-class latency-bounded web servers," in Intl. Workshop on Quality of Service, June 2000.

[22] K. Li and S. Jamin, "A measurement-based admission controlled web server," in Proc. of INFOCOM, Mar. 2000.

[23] N. Bhatti and R. Friedrich, "Web server support for tiered services," IEEE Network, Sept. 1999.

[24] M. E. Crovella, R. Frangioso, and M. Harchol-Balter, "Connection scheduling in web servers," in Proc. of USITS, Oct. 1999.

[25] J. Reumann, A. Mehra, K. Shin, and D. Kandlur, "Virtual services: A new abstraction for server consolidation," in Proc. of USENIX Annual Technical Conference, June 2000.

[26] H. Jamjoom and J. Reumann, "QGuard: protecting internet servers from overload," Tech. Rep. CSE-TR-427-00, University of Michigan, 2000.

[27] G. Banga, P. Druschel, and J. Mogul, "Resource containers: a new facility for resource management in server systems," in Proc. of OSDI, Feb. 1999.

[28] M. Aron, D. Sanders, P. Druschel, and W. Zwaenepoel, "Scalable content-aware request distribution in cluster-based network servers," in Proc. of USENIX Annual Technical Conference, June 2000.

[29] D. Mosberger and L. L. Peterson, "Making paths explicit in the Scout operating system," in Proc. of OSDI, Oct. 1996, pp. 153-167.

[30] M. B. Jones, J. S. Barrera III, A. Forin, P. J. Leach, D. Rosu, and M. Rosu, "An overview of the Rialto real-time architecture," in ACM SIGOPS European Workshop, Sept. 1996, pp. 249-256.

[31] T. Voigt and B. Ahlgren, "Scheduling TCP in the Nemesis operating system," in IFIP WG 6.1/WG 6.4 International Workshop on