Interface between kernel and user space

Upload: susant-sahani

Post on 27-Jan-2015

Category:

Technology


TRANSCRIPT

Page 2: Interface between kernel and user space

Overview

User Space

Kernel Space

netlink socket, rtnetlink socket

include/linux/pkt_cls.h, include/linux/pkt_sched.h

net/netlink

tc

struct sockaddr_nl, struct nlmsghdr

net/core/rtnetlink.c, include/linux/rtnetlink.h

Page 3: Interface between kernel and user space

Boot Time

__initfunc

pktsched_init

net/core/dev.c

net/sched/sch_api.c

• declarations

• binding

Page 4: Interface between kernel and user space

pktsched_init

struct rtnetlink_link *link_p;

if (link_p) {
    link_p[RTM_NEWQDISC-RTM_BASE].doit = tc_ctl_qdisc;
    link_p[RTM_DELQDISC-RTM_BASE].doit = tc_ctl_qdisc;
    link_p[RTM_GETQDISC-RTM_BASE].doit = tc_ctl_qdisc;
    link_p[RTM_GETQDISC-RTM_BASE].dumpit = tc_dump_qdisc;
    link_p[RTM_NEWTCLASS-RTM_BASE].doit = tc_ctl_tclass;
    link_p[RTM_DELTCLASS-RTM_BASE].doit = tc_ctl_tclass;
    link_p[RTM_GETTCLASS-RTM_BASE].doit = tc_ctl_tclass;
    link_p[RTM_GETTCLASS-RTM_BASE].dumpit = tc_dump_tclass;
}

Page 5: Interface between kernel and user space

User level Application

Create netlink socket

sendto

netlink_sendmsg

rtnetlink_rcv_msg

call function in rtnetlink_link

net/core/rtnetlink.c

net/netlink/af_netlink.c

Page 6: Interface between kernel and user space

nl_table

nl_table : array of linked lists of sockets, one list per netlink protocol

Page 7: Interface between kernel and user space

rtnetlink_links

rtnetlink_links : array of pointers to rtnetlink_link

rtnetlink_link : one entry per command (holds the ‘doit’ and ‘dumpit’ handlers)
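The table lookup described on this slide can be sketched in user space. The snippet below is only an illustration, not kernel code: struct mini_link, mini_links, mini_bind, mini_ctl_qdisc and mini_dispatch are hypothetical names, and the single-argument handler signature is a simplification. Only the RTM_* constants come from the real header.

```c
#include <stddef.h>
#include <linux/rtnetlink.h>   /* RTM_BASE, RTM_NEWQDISC, ... */

/* Hypothetical stand-in for the kernel's struct rtnetlink_link. */
struct mini_link {
    int (*doit)(int msg_type);    /* handles a single request */
    int (*dumpit)(int msg_type);  /* handles a dump request */
};

/* Enough slots to cover message types up to RTM_GETTCLASS. */
#define MINI_NLINKS (RTM_GETTCLASS - RTM_BASE + 1)

struct mini_link mini_links[MINI_NLINKS];

/* Toy handler: just echoes the message type it was called for. */
int mini_ctl_qdisc(int msg_type)
{
    return msg_type;
}

/* Binding step, analogous to what pktsched_init does at boot. */
void mini_bind(void)
{
    mini_links[RTM_NEWQDISC - RTM_BASE].doit = mini_ctl_qdisc;
    mini_links[RTM_DELQDISC - RTM_BASE].doit = mini_ctl_qdisc;
    mini_links[RTM_GETQDISC - RTM_BASE].doit = mini_ctl_qdisc;
}

/* Dispatch step: index the table by (type - RTM_BASE), then call doit. */
int mini_dispatch(int msg_type)
{
    if (msg_type < RTM_BASE || msg_type - RTM_BASE >= MINI_NLINKS)
        return -1;

    struct mini_link *entry = &mini_links[msg_type - RTM_BASE];
    if (entry->doit == NULL)
        return -1;            /* no handler bound for this command */
    return entry->doit(msg_type);
}
```

Calling mini_bind() once and then mini_dispatch(RTM_NEWQDISC) reaches the bound handler; a type with no handler, such as RTM_NEWTCLASS in this sketch, yields -1.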

Page 8: Interface between kernel and user space

TC program

do_qdisc

do_class

do_filter

tc_qdisc_modify

tc_qdisc_list

usage

Page 9: Interface between kernel and user space

tc_qdisc_modify

allocate “req”

initialize it
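A minimal sketch of what allocating and initializing “req” could look like, assuming the request is a netlink header followed by a tcmsg and an attribute buffer. The names struct qdisc_req, qdisc_req_init and SKETCH_ATTR_BUF are made up for this example, and the buffer size is arbitrary.

```c
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Attribute buffer size: an arbitrary choice for this sketch. */
#define SKETCH_ATTR_BUF 1024

/* Hypothetical request layout: header, tc header, then attributes. */
struct qdisc_req {
    struct nlmsghdr n;                    /* netlink message header */
    struct tcmsg    t;                    /* traffic-control header */
    char            buf[SKETCH_ATTR_BUF]; /* room for rtattr data */
};

/* Zero the request and fill in the fields every tc request needs. */
void qdisc_req_init(struct qdisc_req *req, int cmd,
                    unsigned int flags, int ifindex)
{
    memset(req, 0, sizeof(*req));
    req->n.nlmsg_len   = NLMSG_LENGTH(sizeof(struct tcmsg));
    req->n.nlmsg_type  = cmd;                   /* e.g. RTM_NEWQDISC */
    req->n.nlmsg_flags = NLM_F_REQUEST | flags;
    req->t.tcm_family  = AF_UNSPEC;
    req->t.tcm_ifindex = ifindex;               /* target device */
}
```

For example, qdisc_req_init(&req, RTM_NEWQDISC, NLM_F_CREATE | NLM_F_EXCL, ifindex) prepares a request to add a qdisc to the device with that interface index.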

Page 10: Interface between kernel and user space

tc_qdisc_modify (cont’d)

rtnl_open : create ‘rtnetlink’ socket
family = AF_NETLINK
type = SOCK_RAW
protocol = NETLINK_ROUTE

set up and bind local address, sockaddr_nl local

call “rtnl_talk”
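The socket setup listed above can be sketched as follows. This is an illustration, not the tc source: sketch_rtnl_open is a hypothetical name, and reading the address back with getsockname is an assumption about how the caller learns the pid the kernel assigned.

```c
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Returns a bound rtnetlink socket and fills *local, or -1 on error. */
int sketch_rtnl_open(struct sockaddr_nl *local)
{
    socklen_t len = sizeof(*local);
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
    if (fd < 0)
        return -1;

    memset(local, 0, sizeof(*local));
    local->nl_family = AF_NETLINK;
    local->nl_groups = 0;   /* no multicast group subscriptions */
                            /* nl_pid stays 0: the kernel picks one */

    if (bind(fd, (struct sockaddr *)local, sizeof(*local)) < 0 ||
        getsockname(fd, (struct sockaddr *)local, &len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The caller owns the returned descriptor and should close it when the exchange with the kernel is finished.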

Page 11: Interface between kernel and user space

rtnl_talk

allocate “msghdr msg”

call “sendmsg”, which enters the kernel at sys_sendmsg
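As a rough sketch of the “msghdr msg” setup, the code below wraps one netlink message in an iovec and addresses it to the kernel. struct talk_ctx, talk_prepare and talk_send are hypothetical names for this example; the real rtnl_talk also waits for and parses the reply, which is left out here.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <linux/netlink.h>

/* Caller-owned storage that the msghdr points into. */
struct talk_ctx {
    struct sockaddr_nl kernel;  /* destination; nl_pid 0 means the kernel */
    struct iovec       iov;     /* single buffer: the netlink message */
    struct msghdr      msg;
};

/* Point msg at the destination address and at the request buffer. */
void talk_prepare(struct talk_ctx *ctx, struct nlmsghdr *n)
{
    memset(ctx, 0, sizeof(*ctx));
    ctx->kernel.nl_family = AF_NETLINK;

    ctx->iov.iov_base = n;
    ctx->iov.iov_len  = n->nlmsg_len;

    ctx->msg.msg_name    = &ctx->kernel;
    ctx->msg.msg_namelen = sizeof(ctx->kernel);
    ctx->msg.msg_iov     = &ctx->iov;
    ctx->msg.msg_iovlen  = 1;
}

/* Hand the prepared message to the kernel through sendmsg(2). */
ssize_t talk_send(int fd, struct talk_ctx *ctx)
{
    return sendmsg(fd, &ctx->msg, 0);
}
```

The context must stay alive until talk_send returns, because the msghdr only holds pointers into it.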

Page 12: Interface between kernel and user space

sys_sendmsg

Copy ‘req’ and ‘msg’ from user space to kernel space

• sock_sendmsg

scm_cookie scm
call ‘scm_send’
call socket’s ‘sendmsg’ (from netlink_ops) = netlink_sendmsg

Page 13: Interface between kernel and user space

netlink_sendmsg

copy msg into an skbuff with memcpy_fromiovec

• netlink_broadcast (when dst_groups is set)
• netlink_unicast

Page 14: Interface between kernel and user space

netlink_unicast

use socket’s protocol to find the ‘linked list’ in nl_table

search the list by pid for the destination socket

add_wait_queue

put the skbuff on the socket’s receive queue

call ‘data_ready’ = rtnetlink_rcv
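The lookup step in netlink_unicast (pick a list by protocol, then search it by pid) can be sketched in user space. Everything here is hypothetical: struct sketch_sock is a heavily reduced stand-in for the kernel socket structure, and SKETCH_MAX_LINKS is an assumed table size.

```c
#include <stddef.h>

#define SKETCH_MAX_LINKS 32   /* number of netlink protocols; assumed */

/* Hypothetical, reduced stand-in for the kernel's socket structure. */
struct sketch_sock {
    unsigned int        pid;   /* netlink address of this socket */
    struct sketch_sock *next;  /* next socket of the same protocol */
};

/* One linked list of sockets per netlink protocol. */
struct sketch_sock *sketch_nl_table[SKETCH_MAX_LINKS];

/* Push a socket onto the list for its protocol. */
void sketch_insert(int protocol, struct sketch_sock *sk)
{
    if (protocol < 0 || protocol >= SKETCH_MAX_LINKS)
        return;
    sk->next = sketch_nl_table[protocol];
    sketch_nl_table[protocol] = sk;
}

/* Find the socket with the given pid on the protocol's list. */
struct sketch_sock *sketch_lookup(int protocol, unsigned int pid)
{
    if (protocol < 0 || protocol >= SKETCH_MAX_LINKS)
        return NULL;
    for (struct sketch_sock *sk = sketch_nl_table[protocol]; sk; sk = sk->next)
        if (sk->pid == pid)
            return sk;
    return NULL;
}
```

A lookup on a different protocol never sees the socket, which is why the protocol is used as the first index.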

Page 15: Interface between kernel and user space

rtnetlink_rcv

take each skbuff off the socket’s receive queue

invoke ‘rtnetlink_rcv_skb’

Page 16: Interface between kernel and user space

rtnetlink_rcv_skb

nlh : netlink message header inside the skbuff

invoke ‘rtnetlink_rcv_msg’, passing ‘nlh’
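The loop that pulls each ‘nlh’ out of a buffer and hands it to a per-message handler can be approximated in user space with the NLMSG_OK and NLMSG_NEXT macros. This is a sketch of the idea rather than the kernel routine: sketch_rcv_buf, sketch_msg_fn and sketch_count_msg are hypothetical names, and the callback stands in for rtnetlink_rcv_msg.

```c
#include <linux/netlink.h>

/* Hypothetical callback type standing in for rtnetlink_rcv_msg. */
typedef int (*sketch_msg_fn)(const struct nlmsghdr *nlh, void *arg);

/* Example handler: counts messages through the int that arg points to. */
int sketch_count_msg(const struct nlmsghdr *nlh, void *arg)
{
    (void)nlh;
    *(int *)arg += 1;
    return 0;
}

/* Walk every netlink message in buf (len bytes) and call fn on each.
 * Returns the number of messages handled, or -1 if a handler failed. */
int sketch_rcv_buf(void *buf, int len, sketch_msg_fn fn, void *arg)
{
    struct nlmsghdr *nlh = buf;
    int handled = 0;

    while (NLMSG_OK(nlh, len)) {
        if (fn(nlh, arg) < 0)
            return -1;
        handled++;
        nlh = NLMSG_NEXT(nlh, len);   /* also subtracts from len */
    }
    return handled;
}
```

NLMSG_OK stops the walk as soon as the remaining bytes cannot hold a complete message, so a truncated trailer is ignored instead of being passed to the handler.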

Page 17: Interface between kernel and user space

rtnetlink_rcv_msg

invoke ‘doit’ in ‘rtnetlink_link’

In this case, doit = tc_modify_qdisc

Page 18: Interface between kernel and user space

Middle summary

User Space

Kernel Space

tc

netlink, rtnetlink

nlmsghdr, tcmsg

rtnetlink_rcv

tc_modify_qdisc

tc_ctl_tfilter

tc_get_qdisc

Page 19: Interface between kernel and user space

tc_modify_qdisc

dev_get_by_index, index = tcm->tcm_ifindex

if qdisc parent is set, call ‘qdisc_lookup’ : find parent Q

call ‘qdisc_leaf’

Page 20: Interface between kernel and user space

tc_modify_qdisc (cont’d)

if tcm->tcm_handle is not empty, call ‘qdisc_lookup’ for band Q

graft

create_n_graft

fail

Page 21: Interface between kernel and user space

tc_modify_qdisc (cont’d)

if tcm->tcm_handle is empty,

if q is empty

else

create_n_graft

create graft

Page 22: Interface between kernel and user space

tc_modify_qdisc (cont’d)

if (tcm->tcm_parent is not specified),

if (tcm->tcm_handle is not empty) then call ‘qdisc_lookup’

call qdisc_change(q, tca); ‘qdisc_change’ calls ‘prio_tune’

Page 23: Interface between kernel and user space

create_n_graft

call qdisc_create(dev, tcm->tcm_handle, tca, &err)

Page 24: Interface between kernel and user space

qdisc_create

find qdisc’s kind
using kind, get ‘Qdisc_ops’
allocate space for Q discipline
call ‘skb_queue_head_init’
set up ‘enqueue’, ‘dequeue’
call ‘ops->init’ = prio_init
insert new Q into qdisc_list
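The creation steps can be mimicked in a small user-space model. This is a sketch under several assumptions: all sketch_ and toy_ names are hypothetical, packets are reduced to plain ints, the skb queue initialization is omitted, and the band count set by the toy init function is only a placeholder value.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct sketch_qdisc;

/* Hypothetical, reduced stand-in for struct Qdisc_ops. */
struct sketch_qdisc_ops {
    const char *id;                                  /* the qdisc's kind */
    int (*enqueue)(struct sketch_qdisc *q, int pkt);
    int (*dequeue)(struct sketch_qdisc *q);
    int (*init)(struct sketch_qdisc *q);
};

struct sketch_qdisc {
    const struct sketch_qdisc_ops *ops;
    int (*enqueue)(struct sketch_qdisc *q, int pkt);
    int (*dequeue)(struct sketch_qdisc *q);
    unsigned int handle;
    int bands;                  /* private data of the toy "prio" kind */
    struct sketch_qdisc *next;  /* link in the list of all qdiscs */
};

struct sketch_qdisc *sketch_qdisc_list;

int toy_prio_enqueue(struct sketch_qdisc *q, int pkt) { (void)q; (void)pkt; return 0; }
int toy_prio_dequeue(struct sketch_qdisc *q) { (void)q; return -1; }
int toy_prio_init(struct sketch_qdisc *q) { q->bands = 3; return 0; } /* placeholder */

const struct sketch_qdisc_ops sketch_ops_table[] = {
    { "prio", toy_prio_enqueue, toy_prio_dequeue, toy_prio_init },
};

/* Using the kind string, find the matching ops entry. */
const struct sketch_qdisc_ops *sketch_lookup_ops(const char *kind)
{
    size_t n = sizeof(sketch_ops_table) / sizeof(sketch_ops_table[0]);
    for (size_t i = 0; i < n; i++)
        if (strcmp(sketch_ops_table[i].id, kind) == 0)
            return &sketch_ops_table[i];
    return NULL;
}

/* Allocate, wire up enqueue/dequeue, run init, insert into the list. */
struct sketch_qdisc *sketch_qdisc_create(const char *kind, unsigned int handle)
{
    const struct sketch_qdisc_ops *ops = sketch_lookup_ops(kind);
    if (ops == NULL)
        return NULL;

    struct sketch_qdisc *q = calloc(1, sizeof(*q));
    if (q == NULL)
        return NULL;

    q->ops     = ops;
    q->enqueue = ops->enqueue;
    q->dequeue = ops->dequeue;
    q->handle  = handle;

    if (ops->init(q) < 0) {
        free(q);
        return NULL;
    }
    q->next = sketch_qdisc_list;
    sketch_qdisc_list = q;
    return q;
}
```

Copying enqueue and dequeue out of the ops structure into the qdisc itself follows the “set up ‘enqueue’, ‘dequeue’” step, so callers need only the qdisc pointer to queue a packet.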

Page 25: Interface between kernel and user space

graft

call ‘qdisc_graft’
connect ‘new’ to parent’s class or dev
if parent Q discipline is empty, call ‘dev_graft_qdisc(dev,new)’
else call ‘get’ from class
call ‘qdisc_notify’

Page 26: Interface between kernel and user space

dev_graft_qdisc

dev_deactivate
put old ‘qdisc_sleeping’ to ‘oqdisc’
if new Q is empty, set new Q to noop_qdisc
then, set dev’s qdisc_sleeping to new Q, dev->qdisc to noop_qdisc
Reactivate device
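The swap performed by dev_graft_qdisc can be modelled with a toy device. This is a simplified sketch: the graft_ names are hypothetical, the activate and deactivate helpers are toy versions, and skipping them when the device is not active is an assumption of this model.

```c
#include <stddef.h>

/* Hypothetical, reduced stand-ins for the qdisc and device structures. */
struct graft_qdisc { const char *name; };

struct graft_dev {
    int active;                          /* 1 while the device is up */
    struct graft_qdisc *qdisc;           /* qdisc in use right now */
    struct graft_qdisc *qdisc_sleeping;  /* configured qdisc */
};

struct graft_qdisc graft_noop = { "noop" };

/* Toy deactivate: stop using the configured qdisc. */
void graft_dev_deactivate(struct graft_dev *dev)
{
    dev->qdisc = &graft_noop;
    dev->active = 0;
}

/* Toy activate: start using the configured (sleeping) qdisc. */
void graft_dev_activate(struct graft_dev *dev)
{
    dev->qdisc = dev->qdisc_sleeping;
    dev->active = 1;
}

/* Install new_q as the device's root qdisc and return the old one. */
struct graft_qdisc *graft_dev_graft(struct graft_dev *dev,
                                    struct graft_qdisc *new_q)
{
    int was_active = dev->active;
    if (was_active)
        graft_dev_deactivate(dev);

    struct graft_qdisc *old = dev->qdisc_sleeping;
    if (new_q == NULL)
        new_q = &graft_noop;       /* empty new Q becomes noop */

    dev->qdisc_sleeping = new_q;
    dev->qdisc = &graft_noop;      /* nothing is used until reactivation */

    if (was_active)
        graft_dev_activate(dev);
    return old;
}
```

Returning the old qdisc lets the caller decide what to do with it, for example report it or destroy it.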

Page 27: Interface between kernel and user space

prio_get

get minor class ID

prio_graft

use the minor class ID as the index that selects the band
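How a class ID maps to a band can be illustrated with the TC_H_MIN and TC_H_MAKE macros from linux/pkt_sched.h. The two functions below are hypothetical helpers written for this example; treating bands as 1-based in the class ID and 0-based in the array is an assumption of the sketch.

```c
#include <linux/pkt_sched.h>   /* TC_H_MIN, TC_H_MAKE */

/* Returns the 1-based band selected by classid, or 0 if out of range. */
unsigned long sketch_prio_get(unsigned int classid, unsigned int bands)
{
    unsigned long band = TC_H_MIN(classid);   /* minor part of the ID */
    if (band == 0 || band > bands)
        return 0;
    return band;
}

/* Converts the value returned by sketch_prio_get into an array index. */
unsigned int sketch_band_index(unsigned long cl)
{
    return (unsigned int)(cl - 1);
}
```

With three bands, class 1:2 (built with TC_H_MAKE(0x10000U, 2U)) selects band 2, which is index 1 in a zero-based array of child qdiscs.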

Page 28: Interface between kernel and user space

qdisc_change

directly call ‘sch->ops->change’; change = prio_tune

Page 29: Interface between kernel and user space

prio_tune

argument opt contains ‘bands’
bands outside the new range are set to ‘noop_qdisc’
update child Q using the ‘prio2band’ array
if Q == noop_qdisc, call qdisc_create_dflt

qdisc_create_dflt

set up child Q
set up operator to ‘pfifo_qdisc_ops’