
Post on 21-Dec-2015


Proxy Servers

Outline

• Introduction to proxies

• Explanation of squid.conf.

• Research issues.

• Benchmark tools and reports

• Suggestions for Axtronics

• Related software

• Other notes

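Proxy Server

• Originally part of a firewall
– An "application gateway" designed to improve security
– Internal and external systems can each see only the proxy
– Filters that inspect data for security purposes can be added at any layer of the proxy

• A proxy has only one IP address
– Connections must be distinguished by protocol number and port number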

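How a proxy works

[Figure: Client 1 and Client 2, connected over Ethernet, send Request (1) and Request (2) to the Proxy Server. The Proxy Server asks its Sibling Proxy Servers and its Parent Proxy Server whether the object is cached ("cached?"); the Parent Proxy Server forwards the request to the destination WWW server.]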

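How a proxy operates (1/4)

• The user enters a URL and presses Enter; the Client 1 browser sends a request (Request 1) to the Proxy Server

• The Proxy Server checks whether its own disk holds the data Client 1 needs

• If not, it sends an ICP_QUERY to its Sibling Proxy Servers to see whether they hold the data Client 1 wants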

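How a proxy operates (2/4)

• If the Sibling Proxy Servers do not have the data, the Proxy Server then sends an ICP_QUERY to its Parent Proxy

• If the data is still not found, the Proxy Server passes the request (Request 1) to its Parent Proxy Server

• The Parent Proxy Server forwards the request and retrieves the data from the destination WWW server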

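How a proxy operates (3/4)

• The Parent Proxy Server passes the retrieved data to the Proxy Server one level below it and keeps a copy on its own machine

• The lower-level Proxy Server likewise caches a copy and passes the data on to the user.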
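The request path described in these slides (local cache, then siblings, then parent, then the origin server) can be sketched in a few lines of Python. This is a toy model, not Squid code: the `Proxy` class, its `store` dictionary and the `origin` callback are hypothetical names introduced here, and a real sibling hit would be fetched over HTTP rather than read from the peer's memory.

```python
from typing import Callable, Dict, List, Optional


class Proxy:
    """Toy cache node; `store` stands in for the on-disk cache."""

    def __init__(self, name: str,
                 siblings: Optional[List["Proxy"]] = None,
                 parent: Optional["Proxy"] = None,
                 origin: Optional[Callable[[str], str]] = None) -> None:
        self.name = name
        self.siblings = siblings or []
        self.parent = parent
        self.origin = origin          # hypothetical "fetch from WWW server" hook
        self.store: Dict[str, str] = {}

    def icp_query(self, url: str) -> bool:
        """Answer an ICP_QUERY: True models ICP_HIT, False models ICP_MISS."""
        return url in self.store

    def fetch(self, url: str) -> str:
        # Step 1: check the local cache.
        if url in self.store:
            return self.store[url]
        # Step 2: ask the siblings, then the parent, whether they hold the object.
        peers = self.siblings + ([self.parent] if self.parent else [])
        for peer in peers:
            if peer.icp_query(url):
                body = peer.store[url]
                break
        else:
            # Step 3: nobody has it. Hand the request to the parent, or go
            # to the origin server if this node is the top of the hierarchy.
            if self.parent is not None:
                body = self.parent.fetch(url)
            elif self.origin is not None:
                body = self.origin(url)
            else:
                raise LookupError("no parent and no origin for " + url)
        # Step 4: keep a copy before passing the data down.
        self.store[url] = body
        return body
```

After one miss, both the parent and the local proxy hold a copy, so a repeated request never reaches the origin server again.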

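How a proxy operates (4/4)

[Figure: a Client requests an object (URL) from the Local Proxy Server, which decides whether its copy is fresh or stale and then queries Sibling1, Sibling2, Sibling3 and Parent1, Parent2, Parent3. With 'query-icmp' enabled, the RTT to each peer is checked (the figure shows RTT=2 and RTT=3). One branch is labeled "If no 'ICP_MISS' replies". Legend: 1. ICP_QUERY; 2. ICP_REPLY; 3. ICP_NOFETCH; 4. Retrieving object.]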

Proxy message formats

• ICP & Squid
– ICP header format
– ICP query algorithm

ICP header format

OPCODE | VERSION | PACKET LENGTH
REQUEST NUMBER
OPTIONS
SENDER HOST ADDRESS
PAYLOAD (e.g. receiver's address, piggybacked object, etc.)

OPCODE: message type, for example ICP_HIT, ICP_MISS, ICP_NOFETCH, etc.

VERSION: version of the ICP protocol.

REQUEST NUMBER: identifier used to match queries and responses.
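To make the header layout concrete, here is a minimal Python sketch that packs and unpacks an ICP query with the standard `struct` module. It follows the 20-byte ICP version 2 header of RFC 2186, which also carries an Option Data word that the slide does not list. The function names `build_query` and `parse_header` are illustrative only.

```python
import socket
import struct

# 20-byte ICP v2 header as laid out in RFC 2186: opcode, version,
# message length, request number, options, option data, sender address.
ICP_HEADER = struct.Struct("!BBHIII4s")
ICP_OP_QUERY, ICP_OP_HIT, ICP_OP_MISS = 1, 2, 3
ICP_VERSION = 2


def build_query(request_number: int, url: str,
                sender_ip: str = "0.0.0.0") -> bytes:
    """Build an ICP_OP_QUERY packet for `url`."""
    # Query payload: 4-byte requester address, then a NUL-terminated URL.
    payload = b"\x00" * 4 + url.encode("ascii") + b"\x00"
    header = ICP_HEADER.pack(ICP_OP_QUERY, ICP_VERSION,
                             ICP_HEADER.size + len(payload),
                             request_number, 0, 0,
                             socket.inet_aton(sender_ip))
    return header + payload


def parse_header(packet: bytes) -> dict:
    """Decode the fixed header fields of an ICP packet."""
    opcode, version, length, reqnum, options, option_data, sender = \
        ICP_HEADER.unpack_from(packet)
    return {
        "opcode": opcode,
        "version": version,
        "length": length,
        "request_number": reqnum,
        "options": options,
        "option_data": option_data,
        "sender": socket.inet_ntoa(sender),
    }
```

A reply would reuse the request number of the query so that the sender can pair the two, which is the role the slide assigns to REQUEST NUMBER.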

ICP query flow chart

[Figure: flow chart of how a request is handled. Recoverable labels: Client; Redirector? (Yes / No; blank page or other URL); Extract & parse the URL; Access Control List; Hierarchy-stop list (e.g. cgi-bin); Object (URL) lookup; Neighbor selection (1. round robin, 2. RTT); ICP_QUERY (ICP_DECHO for non-ICP proxies) sent to other cache servers, a multicast group, or a remote peer. Reply labels: ICP_DENIED; ICP_DENIED (authen) / Authen passed; ICP_MISS; ICP_HIT; ICP_HIT_OBJ (piggyback; object size); ICP_NOFETCH (network failure, or the peer does not want to handle this request).]

Explanation of squid.conf

– Part 1: General options
– Part 2: Options which affect the neighbor selection algorithm
– Part 3: Options which affect the cache size
– Part 4: Logfile pathnames and cache directories
– Part 5: Options for external support programs
– Part 6: Options for tuning the cache
– Part 7: Timeout
– Part 8: Access Controls
– Part 9: Other important tags

Part 1: General options (1)

Tag_name | Use | Default | Remarks
http_port (**) | The port number on which Squid listens. | 3128 (or 8080) | For httpd-accel mode, use port 80.
icp_port (*) | Where Squid sends and receives ICP requests. | 3130 | For communication with sibling neighbors.
mcast_groups (*) | Specifies a list of multicast groups to join in order to receive multicast ICP requests. | Listens for no groups | Don't use addresses that are already used by other groups. Ask NLANR for your own multicast address space.
tcp_incoming_address (*) | Used for the HTTP socket that accepts connections from clients and other caches. | None | This tag and the three that follow give more control on "multihomed" hosts.
tcp_outgoing_address (*) | Used for connections made to remote servers and other caches. | None |

Part 1: General options (2)

Tag_name | Use | Default | Remarks
udp_incoming_address (*) | Used for the ICP socket that receives packets from other caches. | None |
udp_outgoing_address (*) | Used for ICP packets sent out to other caches. | None |

Part 2: Options which affect the neighbor selection algorithm (1)

Tag_name | Use | Default | Remarks
cache_peer (***) | Specifies other caches in a hierarchy. Format: cache_peer host type proxy_port icp_port options | None | Options: 1) proxy-only, 2) weight=n, 3) ttl=n, 4) no-query, 5) default, 6) round-robin, 7) no-digest, 8) login=user:password
cache_peer_domain (***) | Limits the domains for which a neighbor cache will be queried. Format: cache_peer_domain cache_host [!]domain | None | 1) Any number of domains may be given. 2) The first matching domain is applied. 3) The neighbor is queried for all requests if there is no domain restriction.

Example:
cache_peer proxy.nctu.edu.tw parent 3128 3130 no-digest
cache_peer_domain proxy.edu.tw .jp
cache_peer_domain proxy.nctu.edu.tw !.nctu.edu.tw
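The cache_peer_domain rules above ("first matching domain is applied", `!` for negation, no restriction means query for everything) can be illustrated with a small Python function. This is a sketch under stated assumptions, not Squid's implementation: `peer_allowed` is a hypothetical name, and the behavior when no entry matches (the opposite of the last entry's sense) is an assumption of this sketch.

```python
from typing import Dict, List


def peer_allowed(rules: Dict[str, List[str]], peer: str, host: str) -> bool:
    """Return True if `peer` may be queried for an object on `host`."""
    domains = rules.get(peer)
    if not domains:
        return True                      # no restriction: query for everything
    decision = False
    for entry in domains:
        negate = entry.startswith("!")
        bare = entry.lstrip("!").lstrip(".")
        if host == bare or host.endswith("." + bare):
            return not negate            # the first matching domain decides
        # Assumption: if nothing matches, use the opposite of the last
        # entry's sense (a "!domain" list allows all other hosts).
        decision = negate
    return decision


# Rules taken from the cache_peer_domain example lines.
RULES = {
    "proxy.edu.tw": [".jp"],
    "proxy.nctu.edu.tw": ["!.nctu.edu.tw"],
}
```

With these rules, `proxy.edu.tw` is asked only about `.jp` hosts, while `proxy.nctu.edu.tw` is asked about everything except hosts inside `.nctu.edu.tw`.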

Part 2: Options which affect the neighbor selection algorithm (2)

Tag_name | Use | Default | Remarks
neighbor_type_domain (***) | Lets you treat some domains differently from the neighbor type given on the "cache_peer" line. | None |
icp_query_timeout (*) | Normally Squid determines this value itself, based on round-trip time. | 2000 (msec) |
mcast_icp_query_timeout (*) | How long Squid should wait to count all the replies. | 2000 (msec) |

Example:
cache_peer proxy.nctu.edu.tw parent 3128 3130 [options]
neighbor_type_domain proxy.nctu.edu.tw sibling .com .net
neighbor_type_domain proxy.nctu.edu.tw sibling .au .de
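As a rough illustration of neighbor_type_domain, the following Python sketch picks the relationship to use for one peer depending on the requested host. The function name `peer_type_for` and the list-of-pairs representation are assumptions made for this example; the domains are the ones in the neighbor_type_domain example lines.

```python
from typing import List, Tuple


def peer_type_for(default_type: str,
                  overrides: List[Tuple[str, str]],
                  host: str) -> str:
    """Return the peer type to use when requesting an object on `host`."""
    for domain, peer_type in overrides:
        bare = domain.lstrip(".")
        if host == bare or host.endswith("." + bare):
            return peer_type             # a domain-specific override applies
    return default_type                  # fall back to the cache_peer type


# proxy.nctu.edu.tw is a parent by default, but only a sibling
# for .com, .net, .au and .de hosts.
OVERRIDES = [(".com", "sibling"), (".net", "sibling"),
             (".au", "sibling"), (".de", "sibling")]
```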

Part 2: Options which affect the neighbor selection algorithm (3)

Tag_name | Use | Default | Remarks
dead_peer_timeout (*) | How long Squid waits before declaring a peer cache 'dead'. | 10 (sec) |
hierarchy_stoplist (**) | Keeps Squid from querying neighbors for URLs that contain certain strings. | 'cgi-bin' or '?' | The option may be listed multiple times.
no_cache (**) | Forces certain objects never to be cached. | 'cgi-bin' or '?' | The option may be listed multiple times. The object is cached, then removed immediately.

Example:
acl QUERY urlpath_regex cgi-bin \?
no_cache deny QUERY

Part 3: Options which affect the cache size (1)

Tag_name | Use | Default | Remarks
cache_mem (***) | Maximum amount of memory used to store objects. | 8 MB | This memory is used for storing 1) in-transit, 2) hot, and 3) negative-cached objects.
cache_mem_low (*) | The low-water mark for cache memory storage. | 75 (%) |
cache_mem_high (*) | The high-water mark for cache memory storage. | 90 (%) | Objects are removed from memory when this point is reached, but they remain on disk.
cache_swap_low (*) | The low-water mark for cache LRU replacement. | 90 (%) |
cache_swap_high (*) | The high-water mark for cache LRU replacement. | 95 (%) |

The tag that sets the cache disk space (cache_dir) appears in a later table.
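The idea of low- and high-water marks can be shown with a deliberately simplified LRU model in Python: eviction starts once usage exceeds the high mark and continues until usage falls back to the low mark. This hysteresis scheme is an assumption made for illustration and is not claimed to be Squid's exact replacement policy; `WatermarkCache` is a hypothetical class.

```python
from collections import OrderedDict


class WatermarkCache:
    """LRU store that trims itself between two water marks."""

    def __init__(self, capacity: float, low_pct: float = 90,
                 high_pct: float = 95) -> None:
        self.low = capacity * low_pct / 100
        self.high = capacity * high_pct / 100
        self.used = 0.0
        self.objects = OrderedDict()     # url -> size, least recent first

    def touch(self, url: str) -> None:
        """Mark `url` as recently used."""
        self.objects.move_to_end(url)

    def add(self, url: str, size: float) -> None:
        if url in self.objects:
            self.used -= self.objects.pop(url)
        self.objects[url] = size
        self.used += size
        # Above the high-water mark: evict least recently used objects
        # until usage is back at or below the low-water mark.
        if self.used > self.high:
            while self.used > self.low and self.objects:
                _, freed = self.objects.popitem(last=False)
                self.used -= freed
```

With a capacity of 100 and the defaults from the table (90 and 95), adding ten objects of size 10 evicts only the oldest one.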

Part 3: Options which affect the cache size (2)

Tag_name | Use | Default | Remarks
maximum_object_size (*) | Objects larger than this size will not be saved on disk. | 4096 (KB) |
ipcache_size (**) | The size of the IP cache. | 1024 | 1024 IP entries in the cache.
ipcache_low (*) | The low-water mark for the IP cache. | 90 (%) | 90% of the entries.
ipcache_high (*) | The high-water mark for the IP cache. | 95 (%) | 95% of the entries.
fqdncache_size (**) | The size of the fully qualified domain name (FQDN) cache. | Not mentioned |

Part 4: Logfile pathnames and cache directories (1)

Tag_name | Use | Default | Remarks
cache_dir (**) | Directory structure for on-disk cache storage. | /usr/local/squid/cache | See the example below the table.
cache_access_log (**) | Logs client request activity. | /usr/local/squid/logs/access.log | Contains an entry for every HTTP and ICP request received.
cache_store_log (*) | Logs the activities of the storage manager. | /usr/local/squid/logs/store.log | Shows which objects are ejected or saved (and for how long they are saved).
cache_swap_log (**) | Location of the cache "swap log". | In the first cache_dir | Used to rebuild the cache during startup.
emulate_httpd_log (*) | The cache can emulate the log file format that httpd programs use. | Off |

Example:
cache_dir /usr/local/squid/cache 100 16 256
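The three numbers on the cache_dir example line are the cache size in megabytes, the number of first-level directories, and the number of second-level directories under each of them. A small Python sketch, assuming the four-argument form shown on the slide, parses such a line and reports how many leaf directories result. `CacheDir` and `parse_cache_dir` are names invented for this example.

```python
from typing import NamedTuple


class CacheDir(NamedTuple):
    path: str
    size_mb: int
    l1: int          # number of first-level directories
    l2: int          # second-level directories inside each first-level one

    @property
    def leaf_dirs(self) -> int:
        """Total number of second-level directories on disk."""
        return self.l1 * self.l2


def parse_cache_dir(line: str) -> CacheDir:
    """Parse 'cache_dir <path> <Mbytes> <L1> <L2>' (tag is case-insensitive)."""
    fields = line.split()
    if len(fields) != 5 or fields[0].lower() != "cache_dir":
        raise ValueError("not a cache_dir line: " + line)
    _, path, size_mb, l1, l2 = fields
    return CacheDir(path, int(size_mb), int(l1), int(l2))
```

For the slide's example this gives a 100 MB cache spread over 16 x 256 = 4096 leaf directories.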

Part 4: Logfile pathnames and cache directories (2)

Tag_name | Use | Default | Remarks
log_mime_hdrs (*) | The headers are encoded and appended to the end of each access log entry. | Off |
useragent_log (*) | Squid writes the User-Agent field to this log. | None | Squid must be compiled with "-DUSE_USERAGENT_LOG=1".
pid_filename (**) | A pathname to write the process ID to. | /usr/local/squid/logs/squid.pid |
debug_options (**) | Logging options are set as "section,level". | 'ALL,1' (recommended) | Lower levels result in less output.
ident_lookup (*) | Performs an RFC 931/ident lookup of the client username. | Off | Looks up each connection.
log_fqdn (*) | Logs fully qualified domain names in access.log. | Off |

Part 4: Logfile pathnames and cache directories (3)

Tag_name | Use | Default | Remarks
client_netmask (*) | Netmask applied to client addresses in logfiles and cachemgr output. | 255.255.255.255 | Used to protect the privacy of your cache clients.
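The effect of client_netmask is a bitwise AND between the client address and the mask before the address is written out. A minimal Python sketch using the standard `ipaddress` module; the helper name `mask_client` is an assumption for this example.

```python
import ipaddress


def mask_client(addr: str, netmask: str = "255.255.255.255") -> str:
    """Return `addr` with the bits outside `netmask` cleared."""
    ip = int(ipaddress.IPv4Address(addr))
    mask = int(ipaddress.IPv4Address(netmask))
    return str(ipaddress.IPv4Address(ip & mask))
```

With the default mask the address is logged unchanged; with 255.255.255.0 only the network part survives, so individual clients on the same subnet can no longer be told apart in the logs.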

Part 5: Options for external support programs (1)

Tag_name | Use | Default | Remarks
ftp_user (***) | Used for handling some picky FTP servers. | None |
ftp_list_width (**) | Sets the width of FTP listings. | 32 |
cache_dns_program (*) | Specifies the location of the executable used for DNS lookups. | /usr/local/squid/bin |
dns_children (**) | Number of processes spawned to service DNS name lookups. | 5 | The maximum is 32. Set to 10 on heavily loaded caches.

Example:
ftp_user [email protected]

Part 5: Options for external support programs (2)

Tag_name | Use | Default | Remarks
redirect_program (**) | Specifies the location of the executable for the URL redirector. | Not used | You currently must provide your own redirector program.
redirect_children (*) | The number of redirector processes to spawn. | 5 |

Part 6: Options for tuning the cache (1)

Tag_name | Use | Default | Remarks
wais_relay | Relays WAIS requests to a host (first argument) at a port (second argument). | |
request_size (*) | Maximum allowed request size in kilobytes. | 100 (KB) |
refresh_pattern(/I) (**) | Used to determine whether an object is fresh or stale. | |
reference_age (***) | Maximum LRU age in the disk cache. | 1 month |

Example:
refresh_pattern ^ftp: 1440 20% 10080
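The three numbers in the refresh_pattern example are a minimum age in minutes, a percentage of the object's age at retrieval (time since it was last modified), and a maximum age in minutes. The Python sketch below shows one simplified way such a rule can decide "fresh or stale". It ignores Expires headers and client-supplied limits, treats objects with no matching rule as stale, and should be read as an approximation rather than Squid's exact algorithm; `is_fresh` is a hypothetical name.

```python
import re
from typing import List, Optional, Tuple

# (regex, min_minutes, percent, max_minutes)
Rule = Tuple[str, float, float, float]


def is_fresh(url: str, age_min: float, lm_age_min: Optional[float],
             rules: List[Rule]) -> bool:
    """age_min: minutes since the object was fetched.
    lm_age_min: minutes between Last-Modified and the fetch, if known."""
    for pattern, min_age, percent, max_age in rules:
        if re.search(pattern, url):
            if age_min > max_age:
                return False             # older than the maximum: stale
            if age_min <= min_age:
                return True              # younger than the minimum: fresh
            if lm_age_min:
                # Fresh while the object has aged less than `percent`
                # of how old it already was when it was fetched.
                return age_min / lm_age_min < percent / 100
            return False
    return False                         # assumption: no rule means stale


RULES = [(r"^ftp:", 1440, 20, 10080)]    # the slide's example line
```

Under this model an FTP object fetched an hour ago is always fresh, one fetched more than a week (10080 minutes) ago is always stale, and in between the 20% factor decides.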

Part 6: Options for tuning the cache (2)

Tag_name | Use | Default | Remarks
quick_abort_min (**) | Used to determine whether Squid continues a download after the client aborts the request. | Not mentioned | The related tags are "quick_abort_max" and "quick_abort_pct".
negative_ttl (*) | Time-to-live for failed requests. | 5 (min) | Certain types of errors are negatively cached for a short time.
positive_dns_ttl (*) | TTL for positive caching of successful DNS lookups. | 360 (min) | Set to 1 to minimize the use of Squid's ipcache.
negative_dns_ttl (*) | TTL for negative caching of failed DNS lookups. | 5 (min) |

Example:
quick_abort_min 1KB
quick_abort_max 16KB
quick_abort_pct 95
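How the three quick_abort values interact can be sketched as a small decision function. The order of the checks below (little left, then too much left, then percentage done) is a simplified reading and is labeled as an assumption; `keep_downloading` is a hypothetical name, and sizes are in kilobytes to match the example values.

```python
def keep_downloading(received_kb: float, expected_kb: float,
                     min_kb: float = 1, max_kb: float = 16,
                     pct: float = 95) -> bool:
    """Decide whether to finish a transfer after the client has gone away."""
    remaining = expected_kb - received_kb
    if remaining < min_kb:
        return True          # almost done: finish and cache the object
    if remaining > max_kb:
        return False         # too much left: abort the transfer
    # In between: continue only if more than `pct` percent has arrived.
    return received_kb * 100 > expected_kb * pct
```

With the slide's values (1 KB, 16 KB, 95), a 1000 KB object with 990 KB received is completed, while a 100 KB object with 90 KB received is dropped.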

Part 7: Timeout

Tag_name | Use | Default | Remarks
connect_timeout (*) | Enforces a timeout on server connections. | 120 (sec) |
read_timeout (*) | An active connection is aborted after this period of no activity on it. | 15 (min) |
pconn_timeout (*) | Timeout for persistent connections. | 120 (sec) |
client_lifetime (**) | Maximum time a client is allowed to remain connected. | 200 (min) | The default is designed with low-speed connections in mind.
shutdown_lifetime (*) | Lifetime of all open sockets during shutdown mode. | 30 (sec) | Active clients receive a 'lifetime expire' message after this period.

Part 8: Access Controls (1)

Examples:

1. acl Cooking1 url_regex cooking
   acl Recipe1 url_regex recipe
   http_access deny Cooking1
   http_access deny Recipe1

   Note: all regular expressions are case-sensitive.

2. acl Cooking2 dstdomain gourmet-chef.com
   http_access deny Cooking2
   http_access allow all

Part 8: Access Controls (2)

Examples:

1. acl game dst 210.62.177.70 139.175.208.190
   http_access deny game

2. acl ncturc src 140.113.0.0
   http_access allow ncturc
   http_access deny all
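The access-control examples rely on http_access lines being checked in order, with the first matching line deciding the outcome. The Python sketch below models that for the `url_regex` and `dstdomain` ACL types only. The helper names, the exact-host matching for a domain written without a leading dot, and the "deny if nothing matches" fallback are assumptions of this sketch.

```python
import re
from typing import Callable, Dict, List, Tuple
from urllib.parse import urlsplit

Acl = Callable[[str], bool]


def url_regex(pattern: str) -> Acl:
    """Case-sensitive regular expression matched against the whole URL."""
    rx = re.compile(pattern)
    return lambda url: rx.search(url) is not None


def dstdomain(domain: str) -> Acl:
    """Match the destination host; a leading dot also matches subdomains."""
    def match(url: str) -> bool:
        host = (urlsplit(url).hostname or "").lower()
        if domain.startswith("."):
            return host == domain[1:] or host.endswith(domain)
        return host == domain
    return match


def http_access(rules: List[Tuple[str, str]], acls: Dict[str, Acl],
                url: str) -> bool:
    """Return True if the request is allowed; the first match wins."""
    for action, name in rules:
        if name == "all" or acls[name](url):
            return action == "allow"
    return False                 # assumption: deny when nothing matches


ACLS = {"Cooking1": url_regex("cooking"),
        "Cooking2": dstdomain("gourmet-chef.com")}
RULES = [("deny", "Cooking1"), ("deny", "Cooking2"), ("allow", "all")]
```

Because the patterns are case-sensitive, a URL containing "Cooking" slips past the `Cooking1` rule and is allowed by the final `allow all` line.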

Part 9: Other important tags

Tag_name | Use | Default | Remarks
miss_access (**) | Used to force your neighbors to use you as a sibling. | |
memory_pools (**) | Set to keep pools of memory for future use. | On |
memory_pools_limit (***) | Limit on memory usage. | Not set | If not set, Squid will keep all the memory it can.

Example:
acl localneighbors src 140.113.23.0
miss_access allow localneighbors

Benchmarking tools and reports (1)

• Web Polygraph
• SPA (Squid Proxy Analysis)
• Wisconsin Proxy Benchmark 1.0
• Perfect Benchmark
• NetCache Load Generator
• CacheFlow Performance Testing Tool
• Inktomi Large Scale Benchmark

Benchmarking tools and reports (2)

• On performance of Caching Proxies
• Generating Representative Web Workloads for Network and Server Performance Evaluation
• Squid Performance as a Factor of the Number of Disk Utilized
• Benchmark of Squid2.2 Stable3
• SPA (Squid Proxy Analysis)

Benchmarking tools and reports ( 3 )

• The First IRCache Web Cache Bake-off (The Official Report )

• A Survey of Proxy Cache Evaluation Techniques

Future research topics

• Prefetching mechanism

• Mechanisms for locating the best server to ask for documents

• Other possible proxy models

Related Software

• Cachemgr.cgi

• echoping: A nifty Unix utility that pings your proxy with a test HTTP request. Can be used from cron to warn you if the cache is down.

• Squirm: squid cache redirector

Other notes: The difference between ipcache and fqdncache

IP Cache Contents:

Hostname Flags lstref TTL N [IP-Number]

gorn.cc.fh-lippe.de C 0 21581 1 193.16.112.73

lagrange.uni-paderborn.de C 6 21594 1 131.234.128.245

www.altavista.digital.com C 10 21299 4 204.123.2.75 ...

2/ftp.symantec.com DL 1583 -772855 0

Flags: C --> Cached

D --> Dispatched

N --> Negative Cached

L --> Locked

lstref: Time since last use

TTL: Time-To-Live until information expires

N: Count of addresses

FQDN Cache Contents:

IP-Number Flags TTL N Hostname

130.149.17.15 C -45570 1 andele.cs.tu-berlin.de

194.77.122.18 C -58133 1 komet.teuto.de

206.155.117.51 N -73747 0

Flags: C --> Cached

D --> Dispatched

N --> Negative Cached

L --> Locked

TTL: Time-To-Live until information expires

N: Count of names

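Future research topics

• Internal / external firewalls
• Handling encrypted / decrypted data
• Host and user authentication
• One-time password authentication systems
• Scalability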