proxy recita-on

37
Carnegie Mellon 1 Proxy Recita-on Jeffrey Liu Recita4on 13: November 23, 2015

Upload: others

Post on 29-Mar-2022

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Proxy Recita-on

CarnegieMellon

1

ProxyRecita-on

JeffreyLiuRecita4on13:November23,2015

Page 2: Proxy Recita-on

CarnegieMellon

2

Outline¢  Ge3ngcontentontheweb:Telnet/cURLDemo

§  Howthewebreallyworks¢  NetworkingBasics¢  Proxy

§  DueTuesday,December8th§  Gracedaysallowed

¢  StringManipula-oninC

Page 3: Proxy Recita-on

CarnegieMellon

3

TheWebinaTextbook¢  Clientrequestpage,serverprovides,transac-ondone.

¢  Asequen-alservercanhandlethis.Wejustneedtoserveonepageata-me.

¢  Thisworksgreatforsimpletextpageswithembeddedstyles.

Web server

Web client

(browser)

Page 4: Proxy Recita-on

CarnegieMellon

4

Telnet/CurlDemo¢  Telnet

§  Interac4veremoteshell–likesshwithoutsecurity§  MustbuildHTTPrequestmanually

§  Thiscanbeusefulifyouwanttotestresponsetomalformedheaders

[rjaganna@makoshark~]%telnetwww.cmu.edu80Trying128.2.42.52...ConnectedtoWWW-CMU-PROD-VIP.ANDREW.cmu.edu(128.2.42.52).Escapecharacteris'^]'.GEThttp://www.cmu.edu/HTTP/1.0HTTP/1.1301MovedPermanentlyDate:Sat,11Apr201506:54:39GMTServer:Apache/1.3.42(Unix)mod_gzip/1.3.26.1amod_pubcookie/3.3.4amod_ssl/2.8.31OpenSSL/0.9.8e-fips-rhel5Location:http://www.cmu.edu/index.shtmlConnection:closeContent-Type:text/html;charset=iso-8859-1<!DOCTYPEHTMLPUBLIC"-//IETF//DTDHTML2.0//EN"><HTML><HEAD><TITLE>301MovedPermanently</TITLE></HEAD><BODY><H1>MovedPermanently</H1>Thedocumenthasmoved<AHREF="http://www.cmu.edu/index.shtml">here</A>.<P><HR><ADDRESS>Apache/1.3.42Serverat<AHREF="mailto:[email protected]">www.cmu.edu</A>Port 80</ADDRESS></BODY></HTML>Connectionclosedbyforeignhost.

Page 5: Proxy Recita-on

CarnegieMellon

5

Telnet/cURLDemo¢  cURL

§  “URLtransferlibrary”withacommandlineprogram§  BuildsvalidHTTPrequestsforyou!

§  CanalsobeusedtogenerateHTTPproxyrequests:

[rjaganna@makoshark~]%curlhttp://www.cmu.edu/<!DOCTYPEHTMLPUBLIC"-//IETF//DTDHTML2.0//EN"><HTML><HEAD><TITLE>301MovedPermanently</TITLE></HEAD><BODY><H1>MovedPermanently</H1>Thedocumenthasmoved<AHREF="http://www.cmu.edu/index.shtml">here</A>.<P><HR><ADDRESS>Apache/1.3.42Serverat<AHREF="mailto:[email protected]">www.cmu.edu</A>Port 80</ADDRESS></BODY></HTML>

[rjaganna@makoshark~]%curl--proxylemonshark.ics.cs.cmu.edu:3092http://www.cmu.edu/<!DOCTYPEHTMLPUBLIC"-//IETF//DTDHTML2.0//EN"><HTML><HEAD><TITLE>301MovedPermanently</TITLE></HEAD><BODY><H1>MovedPermanently</H1>Thedocumenthasmoved<AHREF="http://www.cmu.edu/index.shtml">here</A>.<P><HR><ADDRESS>Apache/1.3.42Serverat<AHREF="mailto:[email protected]">www.cmu.edu</A>Port80</ADDRESS></BODY></HTML>

Page 6: Proxy Recita-on

CarnegieMellon

6

HowtheWebReallyWorks¢  Inreality,asingleHTMLpagetodaymaydependon10s

or100sofsupportfiles(images,stylesheets,scripts,etc.)¢  Buildsagoodargumentforconcurrentservers

§  Justtoloadasinglemodernwebpage,theclientwouldhavetowaitfor10sofback-to-backrequest

§  I/Oislikelyslowerthanprocessing,soback¢  Cachingissimplerifdoneinpiecesratherthanwhole

page§  Ifonlypartofthepagechanges,noneedtofetcholdpartsagain§  Eachobject(image,stylesheet,script)alreadyhasauniqueURL

thatcanbeusedasakey

Page 7: Proxy Recita-on

CarnegieMellon

7

HowtheWebReallyWorks¢  Excerptfromwww.cmu.edu/index.html:

<htmllang="en"xml:lang="en"xmlns="http://www.w3.org/1999/xhtml"><head>...<linkhref="homecss/cmu.css"rel="stylesheet"type="text/css"/><linkhref="homecss/cmu-new.css"rel="stylesheet"type="text/css"/><linkhref="homecss/cmu-new-print.css"media="print"rel="stylesheet"type="text/css"/><linkhref="http://www.cmu.edu/RSS/stories.rss"rel="alternate"title="CarnegieMellonHomepageStories"type="application/rss+xml"/>...<scriptlanguage="JavaScript"src="js/dojo.js"type="text/javascript"></script><scriptlanguage="JavaScript"src="js/scripts.js"type="text/javascript"></script><scriptlanguage="javascript"src="js/jquery.js"type="text/javascript"></script><scriptlanguage="javascript"src="js/homepage.js"type="text/javascript"></script><scriptlanguage="javascript"src="js/app_ad.js"type="text/javascript"></script>...<title>CarnegieMellonUniversity|CMU</title></head><body>...

Page 8: Proxy Recita-on

CarnegieMellon

8

Sequen-alProxy

Page 9: Proxy Recita-on

CarnegieMellon

9

Sequen-alProxy¢  Notetheslopedshapeofwhenrequestsfinish

§  Althoughmanyrequestsaremadeatonce,theproxydoesnotacceptanewjobun4litfinishesthecurrentone

§  Requestsaremadeinbatches.ThisresultsfromhowHTMLisstructuredasfilesthatreferenceotherfiles.

¢  Comparedtotheconcurrentexample(next),thispagetakesalong-metoloadwithjuststa-ccontent

Page 10: Proxy Recita-on

CarnegieMellon

10

ConcurrentProxy

Page 11: Proxy Recita-on

CarnegieMellon

11

ConcurrentProxy¢  Now,weseemuchlesspurple(wai-ng),andless-me

spentoverall.¢  No-cehowmul-plegreen(receiving)blocksoverlapin

-me§  Ourproxyhasmul4pleconnec4onsopentothebrowsertohandle

severaltasksatonce

Page 12: Proxy Recita-on

CarnegieMellon

12

HowtheWebReallyWorks¢  AnoteonAJAX(andXMLHZpRequests)

§  Normally,abrowserwillmaketheini4alpagerequestthenrequestanysuppor4ngfiles

§  AndXMLHapRequestissimplyarequestfromthepageonceithasbeenloaded&thescriptsarerunning

§  Thedis4nc4ondoesnotmaaerontheserverside–everythingisanHTTPRequest

Page 13: Proxy Recita-on

CarnegieMellon

13

Outline¢  Ge3ngcontentontheweb:Telnet/cURLDemo

§  Howthewebreallyworks¢  NetworkingBasics¢  Proxy

§  DueTuesday,December8th§  Gracedaysallowed

¢  StringManipula-oninC

Page 14: Proxy Recita-on

14

Carnegie Mellon

Sockets

¢  Whatisasocket?§  Toanapplica4on,asocketisafiledescriptorthatletstheapplica4onread/

writefrom/tothenetwork§  (allUnixI/Odevices,includingnetworks,aremodeledasfiles)

¢  Clientsandserverscommunicatewitheachotherbyreadingfromandwri-ngtosocketdescriptors

¢  ThemaindifferencebetweenregularfileI/OandsocketI/Oishowtheapplica-on“opens”thesocketdescriptors

Page 15: Proxy Recita-on

15

Carnegie Mellon

OverviewoftheSocketsInterface

Client/ServerSession

Client Server

socket socket

bind

listen

rio_readlineb

rio_writen rio_readlineb

rio_writen

Connec4onrequest

rio_readlineb

close

close EOF

Awaitconnec4onrequestfromnextclient

open_listenfd open_clientfd

accept connect

getaddrinfo getaddrinfo

Page 16: Proxy Recita-on

Carnegie Mellon

16

HostandServiceConversion:getaddrinfo ¢  getaddrinfoisthemodernwaytoconvertstringrepresenta4onsof

host,ports,andservicenamestosocketaddressstructures.§  Replacesobsoletegethostbyname-unsafebecauseitreturnsa

pointertoasta4cvariable

¢  Advantages:§  Reentrant(canbesafelyusedbythreadedprograms).§  Allowsustowriteportableprotocol-independentcode(IPv4andIPv6)§  Givenhostandservice,getaddrinfo returnsresultthat

pointstoalinkedlistofaddrinfostructs,eachpoin4ngtosocketaddressstruct,whichcontainsargumentsforsocketsAPIs.

¢  getnameinfoistheinverseofgetaddrinfo,conver-ngasocketaddresstothecorrespondinghostandservice.

Page 17: Proxy Recita-on

SocketsAPI¢  intsocket(intdomain,inttype,intprotocol);

§  Createafiledescriptorfornetworkcommunica4on§  usedbybothclientsandservers§  intsock_fd=socket(PF_INET,SOCK_STREAM,IPPROTO_TCP);§  Onesocketcanbeusedfortwo-waycommunica4on

¢  intbind(intsocket,conststructsockaddr*address,socklen_taddress_len);§  AssociateasocketwithanIPaddressandportnumber§  usedbyservers§  structsockaddr_insockaddr–family,address,port

17

Carnegie Mellon

Page 18: Proxy Recita-on

SocketsAPI¢  intlisten(intsocket,intbacklog);

§  socket:sockettolistenon§  usedbyservers§  backlog:maximumnumberofwai4ngconnec4ons§  err=listen(sock_fd,MAX_WAITING_CONNECTIONS);

¢  intaccept(intsocket,structsockaddr*address,socklen_t*address_len);§  usedbyservers§  socket:sockettolistenon§  address:pointertosockaddrstructtoholdclientinforma4onamer

acceptreturns§  return:filedescriptor

18

Carnegie Mellon

Page 19: Proxy Recita-on

SocketsAPI¢  intconnect(intsocket,structsockaddr*address,socklen_t

address_len);§  aaempttoconnecttothespecifiedIPaddressandportdescribedin

address§  usedbyclients

¢  intclose(intfd);§  usedbybothclientsandservers§  (alsousedforfileI/O)§  fd:socketfdtoclose

19

Carnegie Mellon

Page 20: Proxy Recita-on

SocketsAPI¢  ssize_tread(intfd,void*buf,size_tnbyte);

§  usedbybothclientsandservers§  (alsousedforfileI/O)§  fd:(socket)fdtoreadfrom§  buf:buffertoreadinto§  nbytes:buflength

¢  ssize_twrite(intfd,void*buf,size_tnbyte);§  usedbybothclientsandservers§  (alsousedforfileI/O)§  fd:(socket)fdtowriteto§  buf:buffertowrite§  nbytes:buflength

20

Carnegie Mellon

Page 21: Proxy Recita-on

CarnegieMellon

21

Outline¢  Ge3ngcontentontheweb:Telnet/cURLDemo

§  Howthewebreallyworks¢  NetworkingBasics¢  Proxy

§  DueTuesday,December8th§  Gracedaysallowed

¢  StringManipula-oninC

Page 22: Proxy Recita-on

CarnegieMellon

22

ByteOrderingReminder¢  So,howarethebyteswithinamul--bytewordorderedin

memory?¢  Conven-ons§  BigEndian:Sun,PPCMac,Internet

§  Leastsignificantbytehashighestaddress§  LialeEndian:x86,ARMprocessorsrunningAndroid,iOS,and

Windows§  Leastsignificantbytehaslowestaddress

Page 23: Proxy Recita-on

CarnegieMellon

23

ByteOrderingReminder¢  So,howarethebyteswithinamul--bytewordorderedin

memory?¢  Conven-ons

§  BigEndian:Sun,PPCMac,Internet§  Leastsignificantbytehashighestaddress

¢  Makesuretousecorrectendianness

Page 24: Proxy Recita-on

CarnegieMellon

24

Proxy-Func-onality¢  Shouldworkonvastmajorityofsites

§  Twitch,CNN,NYTimes,etc.§  SomefeaturesofsiteswhichrequirethePOSTopera4on(sending

datatothewebsite),willnotwork-  Loggingintowebsites,sendingFacebookmessage

§  HTTPSisnotexpectedtowork§  Google,YouTube(andsomeotherpopularwebsites)nowtry

topushuserstoHTTPsbydefault;watchoutforthat¢  Cachepreviousrequests

§  UseLRUevic4onpolicy§  Mustallowforconcurrentreadswhilemaintainingconsistency

§  Detailsinwriteup

Page 25: Proxy Recita-on

CarnegieMellon

25

Proxy-Func-onality¢  Whyamul--threadedcache?

n  Sequen4alcachewouldboaleneckparallelproxy

n  Mul4plethreadscanreadcachedcontentsafely

n  Searchcachefortherightdataandreturnitn  Twothreadscanreadfromthesamecacheblock

n  Butwhataboutwri4ngcontent?

n  Overwriteblockwhileanotherthreadreading?

n  Twothreadswri4ngtosamecacheblock?

Page 26: Proxy Recita-on

CarnegieMellon

26

Proxy-How¢  Proxiesareabitspecial-theyareaserverandaclientatthesame

4me.¢  Theytakearequestfromonecomputer(ac4ngastheserver),and

makeitontheirbehalf(astheclient).¢  Ul4mately,thecontrolflowofyourprogramwilllooklikeaserver,but

willhavetoactasaclienttocompletetherequest

¢  Startsmall§  Grabyourselfacopyoftheechoserver(pg.946)andclient(pg.

947)inthebook§  Alsoreviewthe4ny.cbasicwebservercodetoseehowtodeal

withHTTPheaders§  Notethat4ny.cignoresthese;youmaynot

Page 27: Proxy Recita-on

CarnegieMellon

27

Proxy-How¢  Whatyouendupwithwillresemble:

Server(port80)Client

Clientsocketaddress128.2.194.242:51213

Serversocketaddress208.216.181.15:80

Proxy

Proxyserversocketaddress128.2.194.34:15213

Proxyclientsocketaddress128.2.194.34:52943

Page 28: Proxy Recita-on

CarnegieMellon

28

Summary¢  Step1:Sequen-alProxy

§  Worksgreatforsimpletextpageswithembeddedstyles

¢  Step2:ConcurrentProxy§  mul4-threading

¢  Step3:CacheWebObjects§  Cacheindividualobjects,notthewholepage§  UseanLRUevic-onpolicy§  Yourcachingsystemmustallowforconcurrentreadswhile

maintainingconsistency.Concurrency?SharedResource?

Page 29: Proxy Recita-on

CarnegieMellon

29

Proxy–Tes-ng&Grading¢  New:Autograder

§  ./driver.shwillrunthesametestsasautolab:§  Abilitytopullbasicwebpagesfromaserver§  Handlea(concurrent)requestwhileanotherrequestiss4llpending

§  Fetchawebpageagainfromyourcacheamertheserverhasbeenstopped

§  Thisshouldhelpanswertheques4on“isthiswhatmyproxyissupposedtodo?”

§  Pleasedon’tusethisgradertodefini4velytestyourproxy;therearemanythingsnottestedhere

Page 30: Proxy Recita-on

CarnegieMellon

30

Proxy–Tes-ng&Grading¢  Testyourproxyliberally

§  Thewebisfullofspecialcasesthatwanttobreakyourproxy§  Generateaportforyourselfwith./port-for-user.pl[andrewid]§  Generatemoreportsforwebserversandsuchwith./free-port.sh§  Considerusingyourandrewwebspace(~/www)tohosttestfiles

§  Youhavetovisithaps://www.andrew.cmu.edu/server/publish.htmltopublishyourfoldertothepublicserver

¢  Createahandinfilewithmakehandin§  Willcreateatarfileforyouwiththecontentsofyourproxylab-

handinfolder

Page 31: Proxy Recita-on

CarnegieMellon

31

Outline¢  Ge3ngcontentontheweb:Telnet/cURLDemo

§  Howthewebreallyworks¢  NetworkingBasics¢  Proxy

§  DueTuesday,December8th§  Gracedaysallowed

¢  StringManipula-oninC

Page 32: Proxy Recita-on

CarnegieMellon

32

Stringmanipula-oninC¢  sscanf:Readinputinspecificformat

intsscanf(constchar*str,constchar*format,…);Example:buf=“213isawesome”//Readintegerandstringseparatedbywhitespacefrombuffer‘buf’//intopassedvariablesret=sscanf(buf,“%d%s%s”,&course,str1,str2);Thisresultsin:course=213,str1=is,str2=awesome,ret=3

Page 33: Proxy Recita-on

CarnegieMellon

33

Stringmanipula-on(cont)¢  sprine:Writeinputintobufferinspecificformat

intsprinI(char*str,constchar*format,…);Example:buf[100];str=“213isawesome”//Buildthestringindoublequotes(“”)usingthepassedarguments//andwritetobuffer‘buf’sprinI(buf,“String(%s)isoflength%d”,str,strlen(str));Thisresultsin:buf=String(213isawesome)isoflength14

Page 34: Proxy Recita-on

CarnegieMellon

34

Stringmanipula-on(cont)Otherusefulstringmanipula-onfunc-ons:¢  strcmp,strncmp,strncasecmp¢  strstr¢  strlen¢  strcpy,strncpy

Page 35: Proxy Recita-on

CarnegieMellon

35

Aside:Se3ngupFirefoxtouseaproxy

¢  Youmayuseanybrowser,butwe’llbegradingwithFirefox

¢  Preferences>Advanced>Network>Se3ngs…(underConnec-on)

¢  Check“Usethisproxyforallprotocols”oryourproxywillappeartoworkforHTTPStraffic.

Page 36: Proxy Recita-on

CarnegieMellon

36

Acknowledgements¢  Slidesderivedfromrecita-onslidesoflast2yearsby

§  Shiva§  HartajSinghDugal§  IanHartwig§  RohithJagannathan

Page 37: Proxy Recita-on

CarnegieMellon

37

Ques-ons?