USENIX Association
Proceedings of the
2001 USENIX Annual
Technical Conference
Boston, Massachusetts, USA
June 25–30, 2001
THE ADVANCED COMPUTING SYSTEMS ASSOCIATION
© 2001 by The USENIX Association
All Rights Reserved
For more information about the USENIX Association:
Phone: 1 510 528 8649
FAX: 1 510 548 5738
Email: office@usenix.org
WWW: http://www.usenix.org
Rights to individual papers remain with the author or the author's employer.
Permission is granted for noncommercial reproduction of the work for educational or research purposes.
This copyright notice must be included in the reproduced paper. USENIX acknowledges all trademarks herein.
Servers
Thiemo Voigt
Swedish Institute of Computer Science
thiemo@sics.se
Renu Tewari
IBM T.J. Watson Research Center
tewarir@us.ibm.com
Douglas Freimuth
IBM T.J. Watson Research Center
dmfreim@us.ibm.com
Ashish Mehra
iScale Networks
ashish@iscale.net
Abstract
The increasing number of Internet users and innovative new services such as e-commerce are placing new demands on Web servers. It is becoming essential for Web servers to provide performance isolation, have fast recovery times, and provide continuous service during overload, at least to preferred customers. In this paper, we present the design and implementation of three kernel-based mechanisms that protect Web servers against overload by providing admission control and service differentiation based on connection and application level information. Our basic admission control mechanism, TCP SYN policing, limits the acceptance rate of new requests based on the connection attributes. The second mechanism, prioritized listen queue, supports different service classes by reordering the listen queue based on the priorities of the incoming connections. Third, we present HTTP header-based connection control that uses application-level information such as URLs and cookies to set priorities and rate control policies.

We have implemented these mechanisms in AIX 5.0. Through numerous experiments we demonstrate their effectiveness in achieving the desired degree of service differentiation during overload. We also show that the kernel mechanisms are more efficient and scalable than application-level controls implemented in the Web server.
This work was partially funded by the national Swedish Real-time Systems Research Initiative (ARTES). This work was done when the author was visiting the IBM T.J. Watson Research Center.
1 Introduction
Application service providers and Web hosting services that co-host multiple customer sites on the same server cluster or large SMP are becoming increasingly common in the current Internet infrastructure. The increasing growth of e-commerce on the web means that any server downtime that affects the clients being serviced will result in a corresponding loss of revenue. Additionally, the unpredictability of flash crowds can overwhelm a hosting server and bring down multiple customer sites simultaneously, affecting the performance of a large number of clients. It becomes essential, therefore, for hosting services to provide performance isolation and continuous operation under overload conditions.
Each of the co-hosted customer sites or applications may have different quality-of-service (QoS) goals based on the price of the service and the application requirements. Furthermore, each customer site may require different services during overload based on the client's identity (preferred gold client) and the application or content they access (e.g., a client with a buy order vs. a browsing request). A simple threshold-based request discard policy (e.g., a TCP SYN drop mode in commercial switches/routers that discards the incoming, oldest, or any random connection) makes no such distinctions; it is desirable that requests of non-preferred customer sites be discarded first. Such QoS specifications are typically negotiated in a service level agreement (SLA) between the hosting service provider and its customers. Based on this governing SLA, the hosting service providers need to support service differentiation based on client attributes (IP address, session id, port, etc.), server attributes (IP address, type), and application information (URL accessed, CGI request, cookies, etc.).
In this paper, we present the design and implementation of kernel mechanisms in the network subsystem that provide admission control and service differentiation during overload based on the customer site, the client, and the application layer information.

One of the underlying principles of our design was that it should enable "early discard", i.e., if a connection is to be discarded it should be done as early as possible, before it has consumed a lot of system resources [2]. Since a web server's workload is generated by incoming network connections, we place our control mechanisms in the network subsystem of the server OS at different stages of the protocol stack processing. To balance the need for early discard with that of an informed discard, where the decision is made with full knowledge of the content being accessed, we provide mechanisms that enable content-based admission control.
Our second principle was to introduce minimal changes to the core networking subsystem in commercial operating systems that typically implement a BSD-style stack. There have been prior research efforts that modify the architecture of the networking stack to enable stable overload behavior [3]. Other researchers have developed new operating system architectures to protect against overload and denial of service attacks [4]. Some "virtual server" implementations try to sandbox all resources (CPU, memory, network bandwidth) according to administrative policies and enable complete performance isolation [5]. Our aim in this design, however, was not to build a new networking architecture but to introduce simple controls in the existing architecture that could be just as effective.
The third principle was to implement mechanisms that can be deployed both on the server as well as outside the server in layer 4 or layer 7 switches that perform load balancing and content-based routing. Such switches typically have some form of overload protection mechanism that consists of dropping a new connection packet (or some random new connection packet) when a load threshold is exceeded. For content-based routing, the layer 7 switch functionality consists of terminating the incoming TCP connection to determine the destination server based on the content being accessed, creating a new connection to the server in the cluster, and splicing the two connections together [7]. Such a switch has access to the application headers along with the IP and TCP headers. The mechanisms we built in the network subsystem can easily be moved to the front-end switch to provide service differentiation based on the client attributes or the content being accessed.
There have been proposals to modify the process scheduling policies in the OS to enable preferred web requests to execute as higher priority processes [8]. These mechanisms, however, can only change the relative performance of higher priority requests; they do not limit the requests accepted. Since the hardware device interrupt on a packet receive and the software interrupt for packet protocol processing can preempt any of the other user processes [3], such scheduling policies cannot prevent or delay overload. Secondly, the incoming requests have already consumed numerous system resources before any scheduling policy comes into effect. Such priority scheduling schemes can co-exist with our controls in the network subsystem.
An alternate approach is to enable the applications to provide their individual admission control mechanisms. Although this achieves application-level control, it requires modifications to existing legacy applications or specialized wrappers. Application controls are useful in differentiating between different clients of an application but are less useful in preventing or delaying overload across customer sites. More importantly, various server resources have already been allocated to a request before the application control comes into effect, violating the early discard policy. However, the kernel mechanisms can easily work in conjunction with application-specific controls.
Since most web servers receive requests over HTTP/TCP connections, our controls are located at three different stages in the lifetime of a TCP connection:

The first control mechanism, TCP SYN policing, is located at the start of protocol stack processing of the first SYN packet of a new connection, and limits acceptance of a new TCP SYN packet based on compliance with a token bucket based policer.

The next control, prioritized listen queue, is located at the end of the TCP 3-way handshake, i.e., when the connection is accepted, and supports different priority levels among accepted connections.

Third, HTTP header-based connection control is located after the HTTP header is received (which could be after multiple data packets) and enables admission control and priority values to be based on application-layer information contained in the header, e.g., URLs, cookies, etc.
We have implemented these controls in the AIX 5.0 kernel as a loadable module using the framework of an existing QoS architecture [9]. The existing QoS architecture on AIX supports policy-based outbound bandwidth management [10]. These techniques are easily portable to any OS running a BSD-style network stack.

We present experimental results to demonstrate that these mechanisms effectively provide selective connection discard and service differentiation in an overloaded server. We also compare against application layer controls that we added in the Apache 1.3.12 server and show that the kernel controls are much more efficient and scalable.
The remainder of this paper is organized as follows: In Section 2 we give a brief overview of input packet processing. Our architecture and the kernel mechanisms are presented in Section 3. In Section 4 we present and discuss experimental results. We compare the performance of kernel based mechanisms and application level controls in Section 5. We describe related work in Section 6 and finally, the conclusions and future work in Section 7.
2 Input Packet Processing: Background
In this section we briefly describe the protocol processing steps executed when a new connection request is processed by a web server.

[Figure 1: Proposed kernel mechanisms. A TCP SYN is subject to SYN rate control before the partial listen queue; once connection setup is complete, URL-based connection control applies and the connection is placed in the prioritized listen queue for acceptance by the server.]

When the device
interface receives a packet, it triggers a hardware interrupt that is serviced by the corresponding device driver [11]. The device driver copies the received packet into an mbuf and de-multiplexes it to determine the queue to insert the packet. For example, an IP packet is added to the input queue, ipintrq. The device driver then triggers the IP software interrupt. The IP input routine dequeues the packet from the IP input queue and does the next-layer de-multiplexing to invoke the transport layer input routine. For example, for a TCP packet this will result in a call to a tcp_input routine for further processing. The call to the transport layer input routine happens within the realm of the IP input call, i.e., there is no queuing between the IP and TCP layers. The TCP input processing verifies the packet and locates the protocol control block (PCB). If the incoming packet is a SYN request for a listen socket, a new data socket is created and placed in the partial listen queue and an ACK is sent back to the client. When the ACK for the SYN-ACK is received, the TCP 3-way handshake is complete, the connection moves to an established state, and the data socket is moved to the listen queue. The sleeping process, e.g., the web server, waiting on the accept call is woken up. The connection is ready to receive data.
3 Architecture Design
The network subsystem architecture adds three control mechanisms that are placed at different stages of a TCP connection's lifetime. Figure 1 shows the various phases in the connection setup and the corresponding control mechanisms: (i) when a SYN packet is processed, it triggers the SYN rate control and selective drop; (ii) when the 3-way handshake is completed, the prioritized listen queue selectively changes the ordering of accepted connections in the listen queue; (iii) when the HTTP header is received, the header-based connection control applies rate and priority controls based on application-layer information. Each of these mechanisms can be activated at varying degrees of overload, where the earliest and simplest control is triggered at the highest load level.
3.1 SYN Policer
TCP SYN policing controls the rate and burst at which new connections are accepted. Arriving TCP SYN packets are policed using a token bucket profile defined by the pair <rate, burst>, where rate is the average number of new requests admitted per second and burst is the maximum number of concurrent new requests. Incoming connections are aggregated using specified filter rules that are based on the connection end points (source and destination addresses and ports, as shown in Table 2). On arrival at the server, the SYN packet is classified using the IP/TCP header information to determine the matching rule. A compliance check is performed against the token bucket profile of the rule. If compliant, a new data socket is created and inserted in the partial listen queue; otherwise the SYN packet is silently discarded.
When the SYN packet is silently dropped, the requesting client will time out waiting for a SYN-ACK and retry again with an exponentially increasing time-out value². An alternate option, which we do not consider, is to send a TCP RST to reset the connection, indicating an abort from the server. This approach, however, incurs unnecessary extra overhead. Secondly, some clients send a new SYN immediately after a TCP RST is received instead of aborting the connection. Note that we drop non-compliant SYNs even before a socket is created for the new connection, thereby investing only a small amount of overhead on requests that are dropped.
To provide service differentiation, connection requests are aggregated based on filters and each aggregate has a separate token bucket profile. Filtering based on client IP addresses is useful since a few domains account for a significant portion of a web server's requests [12]. The rate and burst values are enforced only when overload is detected and can be dynamically controlled by an adaptation agent, the details of which are beyond the scope of this paper.
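The <rate, burst> compliance check described above is a standard token bucket. The following minimal sketch shows the refill-and-test logic; the names and structure are illustrative, not the AIX implementation:

```python
class TokenBucket:
    """Token bucket policer: `rate` tokens/sec, at most `burst` saved up."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)   # bucket starts full
        self.last = 0.0              # time of the previous compliance check

    def compliant(self, now):
        # Refill proportionally to elapsed time, capped at the burst size.
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0       # admit this SYN
            return True
        return False                 # non-compliant: silently drop the SYN
```

With rate = 10 and burst = 2, two back-to-back SYNs are admitted, a third simultaneous one is dropped, and admission resumes once enough time has passed for the bucket to refill.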
² The time-out values are typically set to 6, 24, 48, up to 75 seconds.
[Figure 2: Implementation of the prioritized listen queue. A socket QoS attributes structure (so_qos) holds a priority table (ptable) of pointers; each pointer marks the last socket{} of its priority class (here priorities 1, 2, 2, 3) on the so_q chain, alongside the usual so_qlen and so_qlimit fields.]
3.2 Prioritized Listen Queue
The prioritized listen queue reorders the listen queue of a server process based on pre-defined connection priorities such that the highest priority connection is located at the head of the queue. The priorities are associated with filters (see Table 2) and connections are classified into different priority classes. When a TCP connection is established, it is moved from the partial listen queue to the listen queue. We insert the socket at the position corresponding to its priority in the listen queue. Since the server process always removes the head of the listen queue when calling accept, this approach provides better service, i.e., lower delay and higher throughput, to connections with higher priority.

Figure 2 shows the implementation of a prioritized listen queue. A special data structure used for maintaining socket QoS attributes stores an array of priority pointers. Each priority pointer points to the last socket of the corresponding priority class. This allows efficient socket insertion: a new socket is always inserted behind the one pointed to by the corresponding priority pointer.
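A minimal sketch of this insertion policy follows. For brevity, per-class counts take the place of the kernel's last-socket-of-class pointers; both yield the same insertion position (behind the last queued socket of the same or higher priority), preserving FIFO order within a class. Names are invented for illustration:

```python
class PrioritizedListenQueue:
    """Listen queue ordered by priority class; class 0 is highest."""

    def __init__(self, nclasses):
        self.q = []                        # head (index 0) is what accept() removes
        self.counts = [0] * nclasses       # sockets currently queued per class

    def insert(self, sock, prio):
        # Insert behind the last socket whose class is prio or better,
        # i.e., at the boundary the kernel's priority pointer marks.
        pos = sum(self.counts[:prio + 1])
        self.q.insert(pos, (sock, prio))
        self.counts[prio] += 1

    def accept(self):
        # The server process always removes the head of the listen queue.
        sock, prio = self.q.pop(0)
        self.counts[prio] -= 1
        return sock
```

Sockets of the highest class are accepted first, and two sockets of the same class are accepted in arrival order.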
3.3 HTTP Header-based Controls
The SYN policer and prioritized listen queue have limited knowledge about the type and nature of a connection request, since they are based on the information available in the TCP and IP headers. For web servers, where the majority of the traffic is HTTP, it has been observed that a majority of the load is caused by a few CGI requests and most of the bytes transferred belong to a small set of large files. This suggests that targeting specific URLs, types of URLs, or cookie information for service differentiation can have a wide impact during overload.

[Figure 3: The HTTP header-based connection control mechanism. An HTTP request that hits the in-kernel cache is served in the kernel; otherwise the URL is parsed and the action table consulted. Drop-marked and rate-non-compliant connections are dropped; otherwise the priority is updated (if found) and the listen queue reordered before the connection is accepted by the server. Requests not in the table receive priority treatment only.]

Table 1: URL action table

  URL           ACTION
  *noaccess*    <drop>
  /shop.html    <priority=1>
  /index.html   <rate=15 conn/sec, burst=5 conn>, <priority=1>
  /cgi-bin/*    <rate=10, burst=2>
Our third mechanism, HTTP header-based connection control, enables content-based connection control by examining application layer information in the HTTP header, such as the URL name or type (e.g., CGI requests) and other application-specific information available in cookies. The control is applied in the form of rate policing and priorities based on URL names and types and cookie attributes.

This mechanism involves parsing the HTTP header in the kernel and waking the sleeping web server process only after a decision to service the connection is made. If a connection is discarded, a TCP RST is sent to the client and the socket receive buffer contents are flushed.
Table 2: Example network-level policies

  (dst IP, dst port, src IP, src port)    (r, b)      priority
  (*, 80, *, *)                           (300, 5)    3
  (*, 80, 10.1.1.1, *)                    (100, 5)    2
  (12.1.1.1, 80, *, *)                    (10, 1)     *

For URL parsing, our implementation relies upon AFPA, an in-kernel web cache on AIX; for Linux, an in-kernel web engine called KHTTPD is available [14]. As opposed to the normal operation, where the sleeping process is woken up after a connection is established, AFPA responds to cached HTTP requests directly without waking up the server process. With AFPA, a connection is not moved out of the partial listen queue even after the 3-way handshake is over. The normal data flow of TCP continues with the data being stored in the socket receive buffer. When the HTTP header is received (that is, when the AFPA parser finds two CR control characters in the data stream), AFPA checks for the object in its cache. On a cache miss, the socket is moved to the listen queue and the web server process is woken up to service the request.
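The header-complete test, finding the blank line that terminates an HTTP/1.x header, can be sketched as a scan over the accumulated socket receive buffer as data packets arrive. This is an illustrative stand-in, not AFPA code:

```python
def header_complete(buf: bytes) -> bool:
    """Return True once the receive buffer holds a full HTTP header.

    An HTTP/1.x header ends at the first empty line (CRLF CRLF); some
    clients send bare LF LF, which lenient servers also accept.
    """
    return b"\r\n\r\n" in buf or b"\n\n" in buf
```

Until this predicate holds, the connection stays in the partial listen queue and the kernel keeps appending incoming data packets to the buffer.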
The HTTP header-based connection control mechanism comes into play at this juncture, as illustrated in Figure 3, before the socket is moved out of the partial listen queue. The URL action table (Table 1) specifies three types of actions/controls for each URL or set of URLs. A drop action implies that a TCP RST is sent before discarding the connection from the partial listen queue and flushing the socket receive buffer. If a priority value is set, it determines the location of the corresponding socket in the ordered listen queue. Finally, rate control specifies a token bucket profile of a <rate, burst> pair which drops out-of-profile connections, similar to the SYN policer.
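A plain-Python model of such an action-table lookup is shown below, assuming first-match-wins glob patterns. The table contents mirror Table 1; the function name and the dictionary encoding of actions are invented for illustration:

```python
import fnmatch

# Hypothetical in-memory form of the URL action table (cf. Table 1):
# each entry maps a URL pattern to a drop flag, a priority, and/or a
# <rate, burst> token-bucket profile.
ACTIONS = [
    ("*noaccess*",  {"drop": True}),
    ("/shop.html",  {"priority": 1}),
    ("/index.html", {"priority": 1, "rate": 15, "burst": 5}),
    ("/cgi-bin/*",  {"rate": 10, "burst": 2}),
]

def lookup(url):
    # First matching pattern wins; unmatched URLs get default treatment.
    for pattern, action in ACTIONS:
        if fnmatch.fnmatch(url, pattern):
            return action
    return {}
```

A drop entry would trigger the RST-and-flush path described above; a rate entry would feed a per-URL token bucket like the one sketched for the SYN policer.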
3.4 Filter Specification

A filter rule specifies the network-level and/or application-level attributes that define an aggregate and the parameters for the control mechanism that is associated with it. A network-level filter is a four-tuple consisting of local IP address, local port, remote IP address, and remote port; application-level filters were shown in Table 1. Table 2 lists some network-level filter examples. The first rule applies to all port 80 traffic: such connections to the server are rate-controlled at a rate of 300 conns/sec, a burst of 5, and a priority of 3 (the default lowest priority). The filter rules can contain ranges of IP addresses, wildcards, etc.

[Figure 4: Enhanced protocol stack architecture. A TCP SYN is policed in the kernel's QoS engine (non-compliant SYNs are dropped); after the ACK for the TCP SYN, the connection waits in the partial listen queue; the HTTP GET triggers URL-based admission control and priority assignment (possibly sending a RST); admitted connections are placed in the prioritized listen queue (priorities 1 to 3) and accepted by the web server, which sends the reply. An adaptation/policy agent installs connection-based policies and collects system statistics through a special API at the socket layer.]
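Four-tuple matching with "*" wildcards can be sketched as follows. The rules mirror Table 2; the first-match-wins semantics, and the convention of listing more specific rules before the catch-all, are assumptions of this sketch rather than details stated in the paper:

```python
# (dst IP, dst port, src IP, src port) -> (rate, burst, priority);
# "*" matches anything. More specific rules come first; the last rule
# is the port-80 catch-all with the default lowest priority.
RULES = [
    (("*", 80, "10.1.1.1", "*"), (100, 5, 2)),
    (("12.1.1.1", 80, "*", "*"), (10, 1, None)),
    (("*", 80, "*", "*"),        (300, 5, 3)),
]

def classify(dst_ip, dst_port, src_ip, src_port):
    """Return the (rate, burst, priority) of the first matching rule."""
    packet = (dst_ip, dst_port, src_ip, src_port)
    for rule, params in RULES:
        if all(r == "*" or r == p for r, p in zip(rule, packet)):
            return params
    return None  # no filter applies; no control enforced
```

A real classifier would also handle address ranges and would typically order rules by specificity automatically; a linear first-match scan is the simplest policy consistent with the table.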
3.5 Protocol Stack Architecture
We have developed architectural enhancements for Unix-based servers to provide these mechanisms. Figure 4 shows the basic components of the enhanced protocol stack architecture, with the new capabilities utilized either by user-space agents or applications themselves. This architecture permits control over an application's inbound network traffic via policy-based traffic management [10]; an adaptation/policy agent installs policies into the kernel via a special API. The policy agent interacts with the kernel via an enhanced socket interface by sending (receiving) messages to (from) special control sockets. The policies specify filters to select the traffic to be controlled, and actions to perform on the selected traffic. The figure shows the flow of an incoming request through the various control mechanisms.
3.6 Implementation Methodology and Testbed
We have implemented the proposed kernel mechanisms in a QoS module, as described below. As shown in Figure 4, the QoS module contains the TCP SYN policer, a priority assignment function for new connections, and the entity that performs URL-based admission control and priority assignment.
All experiments were conducted on a testbed comprising an IBM HTTP Server running on a 375 MHz RS/6000 machine with 512 MB memory, several 550 MHz Pentium III clients running Linux, and one 166 MHz Pentium Pro client running FreeBSD. The server and clients are connected via a 100BaseT Ethernet switch. For client load generators we use Webstone 2.5 [15] and a slightly modified version of sclient [16]. Both programs measure client throughput in connections per second. The experimental workload consists of static and dynamic requests. The dynamic files are minor modifications of standard Webstone CGI files that simulate memory consumption of real-world CGIs.
The IBM HTTP Server is a modified Apache [17] 1.3.12 web server that utilizes an in-kernel HTTP get engine called the Advanced Fast Path Architecture (AFPA). We use AFPA in our architecture only to perform the URL parsing and have disabled any caching when measuring throughput results. Unless stated otherwise, we configured Apache to use a maximum of 150 server processes.
4 Experimental Results
4.1 Efficacy of SYN Policing
In this section we show how TCP SYN policing protects a preferred client against flash crowds or high request rates from other clients. In our setup, one client replays a large e-tailer's trace file representing a preferred customer. For the competing load we use five machines running Webstone, each with 50 clients. All clients request an 8 KB file, which is reasonable since a typical HTTP transfer is between 5 and 13 KB [12].
Without SYN policing, the e-tailer's client receives a low throughput of about 6 KB/sec. Using policing to lower the acceptance rate of the Webstone clients, we expect the throughput for the e-tailer's client to increase. Figure 5 shows that the throughput for the e-tailer's client does increase as the accepted rate of the non-preferred clients is lowered from 300 reqs/sec to 25 reqs/sec. The experiment demonstrates that a preferred client can be successfully protected by rate-controlling the connection requests of other greedy clients.

[Figure 5: Throughput of the preferred e-tailer's client with and without TCP SYN policing. On the X-axis is the SYN policing rate of the non-preferred Webstone clients, which continuously generate requests. The Y-axis shows the corresponding throughput received by the e-tailer's client when there was no SYN control and when SYN control was enforced.]
TCP SYN policing works well when client identities and request patterns are known. In general, however, it is difficult to correctly identify a misbehaving group of clients. Moreover, as discussed below, it is hard to predict the rate control parameters that enable service differentiation for preferred clients without under-utilizing the server. For effective overload prevention the policing rate must be dynamically adapted to the resource consumption of accepted requests.
4.2 Impact of Burst Size
In the previous experiment we did not analyze the effect of the burst size on the effective throughput. The burst size is the maximum number of new connections accepted concurrently for a given aggregate. With a large burst size, greedy clients can overload the server, whereas with a small burst, clients may be rejected unnecessarily. The burst size also controls the responsiveness of rate control. There is a tradeoff, however, between responsiveness and the achieved throughput.
[Figure 6: Impact of burst size on preferred client throughput. The burst size for policing the non-preferred client is varied from 5 to 50 while the connection acceptance rate is fixed at 50 conn/sec. The plot shows the throughput achieved by the preferred and non-preferred clients along with the total throughput.]

Figure 6 shows the impact of the burst size on the throughput of a preferred client. In our experiment, the non-preferred client is a modified sclient program that makes 50 to 80 back-to-back connection requests about twice a second, in addition to the specified request rate. Both the length of the incoming request burst and its timing are randomized.
Figure 6 shows the throughput of the preferred and non-preferred clients with the SYN policing rate of the non-preferred client set to 50 conn/sec and the burst size varying from 5 to 50. The non-preferred sclient program requests a 16 KB dynamically generated CGI file. The preferred client is a Webstone program with 40 clients, requesting a static 8 KB file. As the burst size is increased from 5 to 50, the sclient's throughput increases from 36.6 conns/sec (585.6 KB/sec) to 47.7 conns/sec (752 KB/sec), while the throughput received by the preferred client decreases from about 140 conns/sec (1117 KB/sec) to 79 conns/sec.

Intuitively the overall throughput should have increased; however, the observed decrease in total throughput is due to the fact that we accept more CPU-consuming CGI requests from sclient, thereby incurring a higher overhead per byte transferred.
4.3 Prioritized Listen Queue: Simple Priority
With TCP SYN policing, one must limit the greedy non-preferred clients to a meaningful rate during overload; the prioritized listen queue instead differentiates connections by priority. We demonstrate next that the prioritized listen queue provides service differentiation, especially with a large listen queue length.
In our experiments we classify clients into three priority levels. Clients belonging to a common priority level are all created by a Webstone benchmark that requests an 8 KB file. A separate Webstone instance is used for each priority level. We measure client throughput for each priority level while varying the total number of clients in each class. Each priority class uses the same number of clients.
In the first experiment, the Apache server is configured to spawn a maximum of 50 server processes. The results in Figure 7 show that when the total number of clients is small, all priority levels achieve similar throughput. With fewer clients, server processes are always free to handle incoming requests. Thus, the listen queue remains short and almost no reordering occurs. As the number of clients increases, the listen queue builds up since there are fewer Apache processes than concurrent client requests. Consequently, with re-ordering the throughput received by the high priority client increases, while that of the two lower priority clients decreases. Figure 7 shows that with more than 30 Webstone clients per class only the high-priority clients are served, while the lower-priority clients receive almost no service.
Figure 8 illustrates the effect on response times observed by clients of the three priority classes. It can be seen that as the number of clients increases across all priority classes, the response time for the lower priority classes increases exponentially. The response time of the high priority class, on the other hand, only increases sub-linearly. When the number of high priority requests increases, the lower priority ones are shifted back in the listen queue, thereby increasing their response times. Also, as more high priority requests get serviced by the different server processes running in parallel and competing for the CPU, their response times increase.

We also observed that when the number of high priority requests was fixed and the lower priority request rate was steadily increased, the response time of the high priority requests remained unaffected.
The priority-based approach enables us to give low delay and high throughput to preferred clients independent of the requests or request patterns of other clients. However, one may need many priority classes for different levels of service. The main drawback of a simple priority ordering is that it provides no protection against starvation of low-priority requests.

[Figure 7: Throughput with the prioritized listen queue and 3 priority classes with 50 Apache processes. The number of clients in each class remains equal.]
4.4 Combining Policing and Priority
To prevent starvation, low priority requests need to have some minimum number of reserved slots in the listen queue so that they are not always preempted by a high priority request. However, reserving slots in the listen queue arbitrarily could cause a high priority request to find a full listen queue, which would in turn cause it to be aborted after its 3-way handshake is completed. To avoid starvation with fixed priorities, we combine the listen queue priorities with SYN policing to give preferred clients higher priority but limit their maximum rate and burst, thereby implicitly reserving some slots in the queue for the lower priority requests.
Table 3 shows the results for experiments with three sets of Webstone clients with different priorities and rate control of the high priority class. Each lower priority class has 30 Webstone clients while the high priority class has 150 Webstone clients spread over three different hosts. With no SYN policing of the clients in the high priority class, the two low-priority classes are completely starved. Table 3 shows that with a (300,300) rate limit on the high priority class, the medium and low priority clients achieve throughputs of 78.6 and 4.1 conn/sec respectively.

[Figure 8: Response time with the prioritized listen queue and 3 priority classes with 50 Apache processes. The number of clients in each class remains equal.]

Table 3: TCP SYN policing of a high-priority client to avoid starvation of other clients.

  Throughput (conn/sec) of each priority class

  client      (rate,burst) limit of high priority
  priority    none      (300,300)   (200,200)
  high        381       306         196
  medium      0         78.6        180
  low         0         4.1         13
4.5 HTTP Header-based Connection Control
In this section we illustrate the performance and effectiveness of admission control and service differentiation based on information in the HTTP headers, i.e., URL name and type, cookie fields, etc.
Rate control using URLs: In our experimental scenario the preferred client replaying the e-tailer's trace needs to be protected from overload due to a large number of high-overhead CGI requests from non-preferred clients. The client issuing CGI requests is an sclient program requesting a dynamic file of length 5 KB at a very high rate. Figure 9 shows that without any protection, the preferred e-tailer's customer receives a low throughput; as the accepted rate of the non-preferred dynamic requests is lowered from 40 reqs/sec to 2 reqs/sec, the throughput of the preferred e-tailer's customer improves from 1 KB/sec to 650 KB/sec. In contrast to TCP SYN policing (Figure 5), URL rate control targets a specific URL causing overload instead of a client pool.

[Figure 9: URL-based policing to protect preferred e-tailer's customers. The graph shows the resulting throughput of the preferred e-tailer's client as a specific high-overhead CGI request is limited to a given number of conn/sec.]
URL priorities: In this section we present the results of priority assignments in the listen queue based on the URL name or type being requested. The clients are Webstone benchmarks requesting two different URLs, both corresponding to files of size 8 KB. There are two priority classes in the listen queue based on the two requested URLs. Figure 10 shows that the lower priority clients (accessing the low priority URL) receive lower throughput and are almost starved when the number of clients requesting the high priority URL exceeds 40. These results correspond to the results shown earlier with priorities based on the connection attributes (see Figure 7). The average total throughput, however, is slightly lower with URL-based priorities due to the additional overhead of URL parsing.
Combined URL-based rate control and priorities:
Toavoid starvation of requests forthe low-priority
URL,weratelimittherequestsforthehigh-priority
URL.Inthisexperiment,weassignahigherpriority
torequestsforadynamic CGIrequestofsize5KB
(requestedbyansclientprogram), andlower
prior-ity to requests for a static 8KB le (requested by
theWebstoneprogram). Table4showsthat
0
50
100
150
200
250
300
350
400
0
5
10
15
20
25
30
35
40
45
throughput (conn/sec)
Number of Webstone clients
"high priority client"
"low priority client"
Figure 10: Throughput with 2 URL-based priorities and 50 Apache server processes. The number of clients in each class is equal.
Table 4: URL-based policing of a high-priority client to avoid starvation of other clients.

                        Throughput (conn/sec)
  client          (rate, burst) limit of high priority
  priority      none        (30,10)       (10,10)
  high          61.7        29.0          10.1
  low           0           10.2          117
4.5.1 Overload Protection from High-Overhead Requests

So far we have used the URL-based controls for providing service differentiation based on URL names and types. In the next experiment, we investigate if URL-based connection control can be used to protect a web server from overload by a targeted control of high-overhead requests (e.g., CGI requests that require large computation or database access).
We use the sclient load generator to request a given high-overhead URL and control the request rate, steadily increasing it and measuring the throughput. Figure 11 shows the client's throughput with varying request rates for a dynamic CGI request that generates a file of size 29 KB. The throughput increases linearly with the request rate up to a critical point of about 63 connections/sec. For any further increase in the request rate, the throughput falls steeply and later plateaus at around 40 connections/sec. To understand this behavior we used vmstat to capture the paging statistics. Since the dynamic requests are memory-intensive, the available free memory rapidly decreases.
[Plot: throughput (conn/sec) vs. request rate; curves: no control, url-based control]
Figure 11: Overload protection from high-overhead requests using URL-based connection control. The graph shows the throughput of a web server with no controls servicing CPU-intensive CGI requests and the corresponding throughput when the CGI requests are limited to 60 reqs/sec.
As the request rate and the number of active processes grow, the available free memory falls to zero. Eventually the system starts thrashing as the CPU spends most of the time waiting for pending local disk I/O. In the above experiment with 150 server processes and a request rate of 63 reqs/sec, the wait time starts increasing as indicated by the wait field of the output from vmstat.
To prevent overload we use URL-based connection control to limit the number of accepted dynamic CGI requests to a rate of 60 reqs/sec and a burst of 10. The dashed line in Figure 11 shows that with URL-based control the throughput stabilizes at 60 reqs/sec and the server never thrashes. In the above experiment, the URL-based connection control can handle request rates of up to 150 requests per second. For request rates beyond that, however, thrashing starts as the kernel overhead of setting up connections, parsing the URL, and sending the RSTs becomes substantial.
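The rate/burst limit used here (60 reqs/sec with a burst of 10) is a token bucket. A minimal sketch of such a policer follows; the names are illustrative and not taken from the actual kernel code:

```c
/* Token-bucket sketch of the rate/burst policer described above.
 * Tokens accumulate at 'rate' per second up to 'burst'; each
 * admitted request consumes one token. */
struct policer {
    double rate;      /* tokens added per second        */
    double burst;     /* bucket capacity                */
    double tokens;    /* currently available tokens     */
    double last;      /* timestamp of last update (sec) */
};

/* Returns 1 if the request is compliant (admit), 0 otherwise
 * (e.g., send a TCP RST).  'now' is the current time in seconds. */
int policer_admit(struct policer *p, double now)
{
    p->tokens += (now - p->last) * p->rate;
    if (p->tokens > p->burst)
        p->tokens = p->burst;
    p->last = now;
    if (p->tokens >= 1.0) {
        p->tokens -= 1.0;
        return 1;
    }
    return 0;
}
```

With rate 60 and burst 10, at most 10 back-to-back requests are admitted before the policer falls back to the steady 60 reqs/sec refill rate.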
To further delay the onset of thrashing we augment the URL-based control with the TCP SYN policer. For every TCP RST that is sent, we drop any subsequent SYN request from that same client for a given period of time.
Table 5: Performance of AFPA and matching a URL to a rule for an 8 KB file with different URL lengths.

                         Throughput (conn/sec)
  URL          AFPA      AFPA on (no cache),   AFPA on (no cache),
  length       off       no rule               matching rule
  11 char.     370.1     340.5                 338.3
  80 char.     361.5     321.9                 319.4
  160 char.    355.1     321.1                 303.7
Table 6: Overhead of kernel mechanisms

  Operation                                       Cost (µsec)
  TCP SYN policing               1 filter rule    7.9
                                 3 filter rules   9.6
  classification and priority    1 rule           4.4
                                 3 rules          5.0
  AFPA including URL parsing                      19
  URL-based rate control         1 rule           5.0
  including URL matching         2 rules          5.8
                                 3 rules          6.5
  URL-based priority             1 rule           3.8
  including URL matching         2 rules          4.1
                                 3 rules          4.3
4.5.2 Discussion

The HTTP header-based rate control relies on sending TCP RST to terminate non-preferred connections as and when necessary. In a more user-friendly implementation we could directly return an HTTP error message (e.g., server busy) back to the client and close the connection.
Our current implementation of URL-based control handles only HTTP/1.0 connections. We are currently exploring different mechanisms for HTTP/1.1 with keep-alive connections to limit the number and types of requests that can be serviced on the same persistent connection. The experiments in the previous section have only presented results on URL-based controls. Similar controls can be set based on the information in cookies that can capture transaction-level state.

We quantify the overhead of matching URLs in the
kernel for varying URL lengths. Table 5 shows that the overhead of matching a URL to a rule is moderate (under 6% for a 160 character URL). The throughput numbers are for 20 Webstone clients requesting an 8 KB file. Rules are matched using the standard string comparison (strcmp) with no optimizations; better matching techniques can reduce this overhead significantly. On a cache miss, the in-kernel AFPA cache introduces an overhead of about 10% for an 8 KB file. However, the AFPA cache under normal conditions increases performance significantly for cache hits. In our experiments we have the cache size set to 0 so that AFPA cannot serve any object from the cache. When caching is enabled, Webstone received a throughput of over 800 connections per second on a cache hit.
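The unoptimized rule matching mentioned above, a linear scan using strcmp, can be sketched as follows. The structure and function names are illustrative, not from the actual implementation:

```c
#include <stddef.h>
#include <string.h>

/* Sketch of the unoptimized rule lookup: the requested URL is
 * compared against each rule with strcmp until the first match. */
struct url_rule {
    const char *url;
    int action;               /* e.g., priority class or rate index */
};

/* Linear first-match scan; returns the matching rule or NULL. */
const struct url_rule *match_url(const struct url_rule *rules,
                                 size_t n, const char *url)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (strcmp(rules[i].url, url) == 0)
            return &rules[i];
    return NULL;
}
```

The cost of this scan grows with both rule count and URL length, which matches the trend in Table 5; a trie or hash lookup would make it largely independent of the number of rules.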
Table 6 summarizes the additional overhead of the implemented kernel mechanisms. The overhead of compliance check and filter matching for TCP SYN policing with 1 filter rule is 7.9 µsecs. Simply matching the filter, allocating space to store QoS state, and setting the priority adds an overhead of around 4.4 µsecs for 1 filter rule. The policing controls are more expensive as they include accessing the clock for the current time. Surprisingly, the URL matching and rate control has a low overhead of 5.0 µsecs for a URL of 11 chars. This happens to be lower than SYN policing as the strcmp matching is cheaper for one short URL compared to matching multiple IP addresses and port numbers. The overhead of URL matching and setting priorities for a single rule is around 3.8 µsecs. The most expensive operation is the call to AFPA to parse the URL. AFPA not only parses the URL, but also does logging, checks if the requested object is in the network buffer cache, and pre-computes the HTTP response header.
5 Comparison of User Space and Kernel Mechanisms

In this section we compare the effectiveness of our kernel mechanisms with overload protection and service differentiation mechanisms implemented in user space. One might argue that kernel-based mechanisms are less flexible and more difficult to
[Plot: Webstone throughput (conn/sec) vs. sclient rate (reqs/sec); curves: TCP SYN policing, URL control, Apache control]
Figure 12: Throughput of kernel-based TCP SYN policing, kernel-based URL rate control, and Apache module-based connection rate control. The throughput achieved by Webstone clients is measured against an increasing request load generated by sclient. The sclient requests are rate controlled to 10.0 req/sec with a burst of 2.
deploy, while user-space mechanisms offer richer capabilities and have easy access to application-layer information. However, kernel mechanisms are more scalable and provide much better performance. In general, placing mechanisms in the kernel is beneficial if it leads to considerable performance gains and increases the robustness of the server without relying on the application layer to prevent overload.
To enable a fair comparison we have extended the Apache 1.3.12 server with additional modules [18] that police requests based on the client IP address and requested URL. The implemented rate control schemes use exactly the same algorithms as our kernel-based mechanisms. If a request is not compliant, we send a "server temporarily unavailable" (503 response code) back to the client and close the connection.
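The user-space rejection path can be sketched as follows. By the time an application module can refuse a request, the connection has already been accepted and the request read, so the server still pays for the context switch and a full HTTP response. This is an illustrative sketch of the 503 rejection on an accepted socket, not the actual module code:

```c
#include <string.h>
#include <unistd.h>

/* Minimal "server temporarily unavailable" response, matching the
 * 503 rejection described above. */
static const char busy_503[] =
    "HTTP/1.0 503 Service Unavailable\r\n"
    "Connection: close\r\n"
    "\r\n"
    "Server temporarily unavailable\r\n";

/* Send the 503 response on an already-accepted connection and
 * close it. */
void reject_request(int connfd)
{
    write(connfd, busy_503, sizeof busy_503 - 1);
    close(connfd);
}
```

The in-kernel controls, by contrast, drop a non-compliant SYN or send a RST before any of this work is done, which is the source of the efficiency gap measured below.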
The experimental setup consists of a Webstone traffic generator with 100 clients requesting a file of size 8 KB along with an sclient program requesting a file of size 16 KB. The sclient's requests are rate controlled with a rate of 10 requests per second and a burst of 2; there are no controls set for the Webstone clients. During our experiments, we steadily increased the sclient's request rate.
Figure 12 illustrates that when the request load of the sclient program is low (20 reqs/sec), the Webstone throughput is 392 conn/sec and 387.3 conn/sec for TCP SYN policing and Apache user-level controls respectively. These controls limit the sclient
[Plot: Webstone response time (seconds) vs. sclient rate (reqs/sec); curves: TCP SYN policing, URL control, Apache control]
Figure 13: Response times using kernel-based TCP SYN policing, kernel-based URL rate control, and Apache module-based connection rate control. The response time achieved by Webstone clients is measured against an increasing request load generated by sclient. The sclient requests are rate controlled to 10 req/sec with a burst of 2.
to its policed rate. With in-kernel URL-based rate control the throughput is lower (354 conn/sec). This lower throughput is caused by the additional 10% overhead added by AFPA with no caching, as discussed in Section 4.6: with the cache size set to zero, we add more overhead than necessary for URL parsing, without the corresponding gains from AFPA caching.
As the sclient's request load increases further, TCP SYN policing is able to achieve a sustained throughput for the Webstone clients, while the Apache-based controls show a marked decline in throughput. The graph shows that for an sclient load of 650 reqs/sec, the Webstone throughput for TCP SYN policing is 374 conn/sec; for in-kernel URL-based connection control it is 260.7 conn/sec; for Apache user-level controls the throughput sinks to about 75 conn/sec. The corresponding results for response times are shown in Figure 13.
The experiment demonstrates that the kernel mechanisms are more efficient and scalable than the user space mechanisms. There are two main reasons for the higher efficiency and scalability. First, non-compliant connection requests are discarded earlier, reducing the queuing time of the compliant requests; in particular, less CPU is consumed and the context switch to user space is avoided. Second, when implementing rate control at user space, the server must still accept each connection and read the request before it can reject it, consuming resources even for non-compliant requests.
6 Related Work

Several research efforts have focused on admission control and service differentiation in web servers [19], [20], [21], [22], [8] and [23]. Almeida et al. [8] use priority-based schemes to provide differentiated levels of service to clients depending on the web pages accessed. While in their approach the application, i.e., the web server, determines request priorities, our mechanisms reside in the kernel and can be applied without context-switching to user level. WebQoS [23] is a middleware layer that provides service differentiation and admission control. Since it is deployed in user space, it is less efficient compared to kernel-based mechanisms. While WebQoS also provides URL-based classification, the authors do not present any experiments or performance considerations. Cherkasova et al. [20] present an enhanced web server that provides session-based admission control to ensure that longer sessions are completed. Crovella et al. [24] show that client response time improves when web servers serving static files serve shorter connections before handling longer connections. Our mechanisms are general and can easily realize such a policy.
Reumann et al. [25] have presented virtual services, a new operating system abstraction that provides resource partitioning and management. Virtual services can enhance our scheme by, for example, dynamically controlling the number of processes a web server is allowed to fork. In [26] Jamjoom and Reumann describe an adaptive mechanism to set up rate controls for overload protection. The receiver livelock study [2] showed that network interrupt handling could cause server livelocks and should be taken into consideration when designing process scheduling mechanisms. Banga and Druschel's [27] resource containers enable the operating system to account for and control the consumption of resources. To shield preferred clients from malicious or greedy clients, one can assign them to different containers. In the same paper they also describe a multi-listen-socket approach for priorities in which a filter splits a single listen queue into multiple queues from which connections are accepted separately and accounted to different principals. Our approach is similar; however, connections are accepted from the same single listen queue but inserted in the queue based on priority. Kanodia et al. [21] present a simulation study of queuing-based algorithms for admission control and service differentiation, controlling the admission rate per class. Aron et al. [28] describe a scalable request distribution architecture for clusters and also present resource management techniques for clusters.
Scout [29], Rialto [30] and Nemesis [31] are operating systems that track per-application resource consumption and restrict the resources granted to each application. These operating systems can thus provide isolation between applications as well as service differentiation between clients. However, there is a significant amount of work involved in porting applications to these specialized operating systems. Our focus, in contrast, was not to build a new operating system or networking architecture but to introduce simple controls in the existing architecture of commercial operating systems that could be just as effective.
7 Conclusions and Future Work

In this paper, we have presented three in-kernel mechanisms that provide service differentiation and admission control for overloaded web servers. TCP SYN policing limits the number of incoming connection requests using a token bucket policer and prevents overload by enforcing a maximum acceptance rate for non-preferred clients. The prioritized listen queue provides low delay and high throughput to clients with high priority, but can starve low priority clients. We show that starvation can be avoided by combining priorities with TCP SYN policing. Finally, URL-based connection control provides in-kernel admission control and priority based on application-level information such as URLs and cookies. This mechanism is very powerful and can, for example, prevent overload caused by dynamic requests. We compared the kernel mechanisms to similar application-layer controls added in the Apache server and demonstrated that the kernel mechanisms are much more efficient and scalable than the Apache user-level controls.
The kernel mechanisms that we presented rely on the existence of accurate policies that control the operating range of the server. In a production system it is unrealistic to assume knowledge of the optimal operating region of the server. We are currently implementing a policy adaptation agent (Figure 4) to select appropriate values for the various policies and to monitor the interaction between the various control options and their effect on overall performance during overload.

Our current implementation does not address security issues of fake IP addresses and client identities. We plan to integrate a variety of overload prevention policies with traditional firewall rules to provide an integrated solution.
References

[1] "Cisco TCP intercept," http://www.cisco.com.

[2] J. C. Mogul and K. K. Ramakrishnan, "Eliminating receive livelock in an interrupt-driven kernel," in Proc. of USENIX Annual Technical Conference, Jan. 1996.

[3] P. Druschel and G. Banga, "Lazy receiver processing (LRP): a network subsystem architecture for server systems," in Proc. of OSDI, Oct. 1996, pp. 91-105.

[4] O. Spatscheck and L. Peterson, "Defending against denial of service attacks in Scout," in Proc. of OSDI, Feb. 1999.

[5] "Ensim corporation: virtual servers," http://www.ensim.com.

[6] "Alteon web systems," http://www.alteonwebsystems.com.

[7] "Cisco arrowpoint web network services," http://www.arrowpoint.com.

[8] J. Almeida, M. Dabu, A. Manikutty, and P. Cao, "Providing differentiated levels of service in web content hosting," in Proc. of Internet Server Performance Workshop, Mar. 1999.

[9] T. Barzilai, D. Kandlur, A. Mehra, and D. Saha, "Design and implementation of an RSVP-based quality of service architecture for an integrated services internet," IEEE Journal on Selected Areas in Communications, vol. 16, no. 3, pp. 397-413, Apr. 1998.

[10] A. Mehra, R. Tewari, and D. Kandlur, "Design considerations for rate control of aggregated TCP connections," in Proc. of NOSSDAV, June 1999.

[11] G. R. Wright and W. R. Stevens, TCP/IP Illustrated, Volume 2, Addison-Wesley Publishing Company, 1995.

[12] M. F. Arlitt and C. L. Williamson, "Web server workload characterization: The search for invariants," in Proc. of ACM Sigmetrics, Apr. 1996.

[13] P. Joubert, R. King, R. Neves, M. Russinovich, and J. Tracey, "High performance memory based web caches: Kernel and user space performance," in http://www.fenrus.demon.nl/.

[15] "Webstone," http://www.mindcraft.com.

[16] G. Banga and P. Druschel, "Measuring the capacity of a web server," in Proc. of USITS, Dec. 1997.

[17] "Apache," http://www.apache.org.

[18] L. Stein and D. MacEachern, Writing Apache Modules with Perl and C, O'Reilly, 1999.

[19] T. Abdelzaher and N. Bhatti, "Web server QoS management by adaptive content delivery," in Int. Workshop on Quality of Service, June 1999.

[20] L. Cherkasova and P. Phaal, "Session based admission control: a mechanism for improving the performance of an overloaded web server," Tech. Rep., Hewlett Packard, 1999.

[21] V. Kanodia and E. Knightly, "Multi-class latency-bounded web servers," in Intl. Workshop on Quality of Service, June 2000.

[22] K. Li and S. Jamin, "A measurement-based admission controlled web server," in Proc. of INFOCOM, Mar. 2000.

[23] N. Bhatti and R. Friedrich, "Web server support for tiered services," IEEE Network, Sept. 1999.

[24] M. E. Crovella, R. Frangioso, and M. Harchol-Balter, "Connection scheduling in web servers," in Proc. of USITS, Oct. 1999.

[25] J. Reumann, A. Mehra, K. Shin, and D. Kandlur, "Virtual services: A new abstraction for server consolidation," in Proc. of USENIX Annual Technical Conference, June 2000.

[26] H. Jamjoom and J. Reumann, "QGuard: protecting internet servers from overload," Tech. Rep. CSE-TR-427-00, University of Michigan, 2000.

[27] G. Banga, P. Druschel, and J. Mogul, "Resource containers: a new facility for resource management in server systems," in Proc. of OSDI, Feb. 1999.

[28] M. Aron, D. Sanders, P. Druschel, and W. Zwaenepoel, "Scalable content-aware request distribution in cluster-based network servers," in Proc. of USENIX Annual Technical Conference, June 2000.

[29] D. Mosberger and L. L. Peterson, "Making paths explicit in the Scout operating system," in Proc. of OSDI, Oct. 1996, pp. 153-167.

[30] M. B. Jones, J. S. Barrera III, A. Forin, P. J. Leach, D. Rosu, and M. Rosu, "An overview of the Rialto real-time architecture," in ACM SIGOPS European Workshop, Sept. 1996, pp. 249-256.

[31] T. Voigt and B. Ahlgren, "Scheduling TCP in the Nemesis operating system," in IFIP WG 6.1/WG 6.4 International Workshop on