Last changed: 2001 Jul.18

                        MPLS for Linux How-To

   < for comments and suggestions related to this how-to send email >
   < to the following address: mistvan@entropy.tmok.com             >

 [1].  Getting the sources
 [2].  Setting up the compilation environment
 [3].  Compile and Install kernel(patch)/tools
 [4].  Basic examples that anyone can do
 [5].  How to debug when something fails
 [6].  Descriptions of what is happening behind the scenes
 [7].  LDP: ldp_portable
 [8].  LDP: ldp_zebra
 [9].  Quick setup (reference card)
 [10]. Further documentation resources

This story is about MPLS, the "beast" that lives between L2 and L3, a
place where only a few protocols have dared to wander before.
So let's begin ...

MPLS (Multi-Protocol Label Switching) consists of 3 basic components:

   1. MPLS forwarding component
   2. MPLS signalling protocols
   3. MPLS L3 mapping plane, i.e. mapping layer 3 traffic onto MPLS LSPs

The MPLS forwarding plane uses two pieces of information to take a
forwarding decision: the first is the information contained in an MPLS
forwarding table maintained by the LSR, the second is the label carried
in the packet.

Planes 2. and 3. can be considered part of the same MPLS component,
called the control component. The control component creates label
bindings (label to FEC) and distributes these bindings to other LSRs.
For example, an LSR creates a label for a set of destinations:

   a). traffic to 1.1.1.0/24 , or
   b). 'traffic going through router B' , or
   c). 'high priority traffic',

then it distributes the newly created label and the associated meaning
to the other LSR or LSRs via the signalling protocols.
Examples of MPLS signalling protocols are: RSVP-TE, LDP and CR-LDP.
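
The label binding idea described above can be illustrated with a small
Python sketch. It is only a model and not part of mpls-linux: the table
contents, the label values and the function name label_for are
assumptions made up for this example. It binds prefix FECs to labels and
returns the label of the most specific FEC that contains a destination.

```python
import ipaddress

# Hypothetical label bindings (FEC -> label); the label values are arbitrary.
BINDINGS = {
    ipaddress.ip_network("1.1.1.0/24"): 100,
    ipaddress.ip_network("1.1.0.0/16"): 200,
}


def label_for(dst):
    """Return the label bound to the most specific FEC containing dst, or None."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in BINDINGS if addr in net]
    if not matches:
        return None
    # Longest prefix wins, as in an ordinary routing table lookup.
    best = max(matches, key=lambda net: net.prefixlen)
    return BINDINGS[best]


print(label_for("1.1.1.7"))   # both FECs match, the /24 is more specific
print(label_for("1.1.2.7"))   # only the /16 matches
print(label_for("10.0.0.1"))  # no binding exists for this destination
```

A real control component would also have to tell its neighbours about
each binding; that distribution step is what the signalling protocols
listed above are for and is not modelled here.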
The "ldp-portable" component of MPLS for Linux is an implementation of
LDP and contains more information about LDP based MPLS signalling.

Mapping of layer 3 traffic to MPLS LSPs can be accomplished in a couple
of ways:

   a). Per FEC, where a FEC is an entry in the routing table;
   b). Through a virtual interface that represents an LSP; the data flow
       is then routed out that interface
       (i.e. route add -net 1.1.1.0/24 dev mpls0 , where mpls0 is the
       virtual interface);

Sections [5]. & [6]. have a more detailed description of each of these
functional components, as well as of 'label' switching.

MPLS-Linux was written by James R. Leu (jleu@mindspring.com) and
basically consists of a Linux kernel patch that adds MPLS forwarding
capabilities to the Linux kernel, thus enabling the Linux community to
build label switched networks. The package also includes an MPLS
administration tool called "mplsadm", which is used to manage "static"
MPLS networks ("static" MPLS means that all MPLS parameters are set
manually by a human operator). In a network where dynamic configuration
of MPLS parameters is necessary, an LDP (label distribution protocol)
implementation has to be used.

A lot of useful info can be gathered by joining the 'MPLS for Linux'
mailing list or consulting its archives. If possible, please avoid
posting 'MPLS for Linux' related questions to mailing lists that do not
have this issue as their main topic. More info about the list can be
found at:

   http://lists.sourceforge.net/lists/listinfo/mpls-linux-general

The latest version of this how-to as well as some basic scripts can be
downloaded from the following locations:

   a). http://entropy.tmok.com/~mistvan
   b). http://stone.staff9.ul.ie/archives/mpls/
   c). http://e.f.g.h


[1]. Getting all the sources needed

Before starting to mess around with the kernel it is always a good idea
to save some info about your system and current kernel configuration, so
that it can be used as a reference while building the new kernel.
Save your current kernel configuration (stored in
/usr/src/linux/.config):

   root# cd /usr/src
   root# mkdir configs
   root# cp /usr/src/linux/.config ./configs/.config.x.y.z

Where x.y.z is your kernel version identifier. If you don't know it,
type:

   root# uname -a

to find it out. Then save the boot-up messages generated by the current
kernel:

   root# dmesg > /usr/src/configs/.dmesg.x.y.z

Let's get the mpls-linux sources:

 +Download method #1 (HTTP)
   The latest stable mpls-linux version.
   From the project home-page:
      http://sourceforge.net/projects/mpls-linux/
   or from the MPLS-Linux archive if you want older versions too:
      http://prdownloads.sourceforge.net/mpls-linux/

 +Download method #2 (CVS)
   The latest version, may not be stable.
   Make sure you have installed CVS on your computer by running:

      rpm -qi cvs

   If cvs is installed do the following:

      root# cd /usr/src
      root# cvs -z3 -d:pserver:anonymous@cvs.mpls-linux.sourceforge.net:/cvsroot/mpls-linux/ checkout mpls-linux

Ok, now go to http://www.kernel.org/pub/linux/kernel/v2.4/ or a mirror
of this site and grab the latest kernel version (you can find out which
one is the latest by looking at the directory listing, there will
usually be an empty file called LATEST-IS-2.4.x).
Or FTP to ftp.kernel.org/pub/linux/kernel/v2.4/

Before installing the new kernel sources please make sure to do the
following steps:

   root# rpm -e kernel
   root# rm -rf /usr/src/linux
   root# rm /usr/include/linux
   root# rm /usr/include/asm

Uncompress the kernel sources:

 +Case 1. (newer GNU tar releases use -j in place of -I)
   root# mv linux-2.4.5.tar.bz2 /usr/src
   root# cd /usr/src
   root# tar -xIvf linux-2.4.5.tar.bz2

 +Case 2.
   root# mv linux-2.4.5.tar.gz /usr/src
   root# cd /usr/src
   root# tar -xzvf linux-2.4.5.tar.gz

Then create the following symlinks:

   root# ln -s /usr/src/linux/include/linux /usr/include/linux
   root# ln -s /usr/src/linux/include/asm /usr/include/asm

"/usr/src/linux/include/asm" is in fact another symlink:

   /usr/src/linux/include/asm -> asm-ARCH

Where ARCH is the CPU architecture, examples:

   /usr/src/linux/include/asm -> asm-i386   # if you have an (i386) CPU
   /usr/src/linux/include/asm -> asm-arm    # if you have an (arm) CPU

(some other supported architectures are DEC Alpha, SUN SPARC, M68000,
MIPS and PowerPC)


[2]. Setting up the compilation environment

Check the version of your gcc compiler, you should have at least version
2.91.66 :

   root# gcc -v

To check the rpm from which it was installed:

   root# rpm -qif /usr/bin/gcc

According to the output of "rpm -qif", check that the compiler variable
in your Makefile is set correctly, i.e. if you have kgcc and the
Makefile says CC=gcc then change it. Or vice versa.


[3]. Compile and install kernel / mpls-linux tools

After uncompressing the archive containing the mpls-linux sources, it is
a good idea to make mpls-linux a symlink to the source tree of the
latest mpls-linux version (so that you don't have to change PATH every
time you compile a new mpls-linux):

   ln -s /usr/src/mpls-linux-X.Y /usr/src/mpls-linux   # where X.Y is the mpls-linux version

   mpls-linux                              mpls-linux-X.Y
   drwxrwxr-x    5 root     root           mpls-linux-X.Y/

X.Y can be for example 0.993 or whatever the latest version is.

Now it is ok to patch the kernel for mpls-linux:

   root@lsr1# cd /usr/src/linux
   root@lsr1# patch -p1 < ../mpls-linux/patches/linux-mpls.diff

Proceed to kernel configuration:

   root@lsr1# make menuconfig

 a. Under "[Code maturity level options]"
      - turn ON "[Prompt for development and/or incomplete code/drivers]"

 b. Under "Networking Options"
      - turn ON "Kernel/User netlink socket"
      - turn ON "  Routing messages"
      - turn ON "Multi-Protocol Label Switching"

 c. Do other kernel configuration
    As a suggestion, it is a good idea to turn on the mcast options and
    a few of the advanced router options (at least ip forwarding).

 d. Compile and Install kernel

      root# make dep ; make clean ; make bzImage ; make modules ; make modules_install
      root# cp arch/i386/boot/bzImage /boot/bzImage245
      root# joe /etc/lilo.conf      # or vi or emacs or whatever :)

    Add this to /etc/lilo.conf :

      -- snip --
      image=/boot/bzImage245
            root=/dev/your_root_device
            label=kmpls
      -- snip --

    Be sure to set the new kernel as default, default=kmpls (at the top
    of the lilo.conf file). Then as root, run:

      root# lilo

    In the output of lilo you should see, among other things, the
    following line:

      -- snip --
      Added kmpls *
      -- snip --

    Reboot:

      root# shutdown -r now

    After rebooting check your kernel debug messages for the keyword
    "MPLS":

      root# dmesg | grep MPLS
      MPLS version 0.993 06/05/2001 jleu@mindspring.com
      MPLS Tunnel interface
      root#

 e. Compile mplsadm

    Change directory to mpls-linux/utils/ then check the following
    variables: CC, CFLAGS , in your Makefile
    (i.e. /usr/src/mpls-linux/utils/Makefile ). Usually you should do
    fine with the default values, but it's ok to check anyway. If you're
    not satisfied :) edit and modify as necessary.
    You can modify the Makefile in your mpls-linux/utils/ directory to
    look like this:

      --snip--
      CC=gcc
      CFLAGS = -O2 -Wall   #-g -Wall
      #
      all: mplsadm
              strip ./mplsadm
      #
      #
      mplsadm: netlink.o mplsadm.o
              gcc -O2 -static -o $@ $^
      #old:   gcc -g -static -o $@ $^

      clean:
              rm -rf *.o mplsadm
      #
      --snip--

    The -g option (and to a lesser degree -Wall) is useful mostly to
    developers, and -g also significantly increases the size of the
    binary file, so generally it's a good idea to drop it at this stage.
    Also add the -O2 optimization (increases the performance of the
    generated code) and also strip in order to discard symbols.
    The result will be a smaller utility that will supposedly run
    faster ;)

    Now that you have modified the Makefile in mpls-linux/utils/ , cd to
    the mpls-linux/utils/ directory and type make all.
    Don't forget to add the full path to mpls-linux/utils/ to your PATH
    environment variable.
    Example: add the following lines to your /etc/profile :

      --snip--
      export MPLSHOME=/usr/src/mpls-linux
      export PATH=$PATH:$MPLSHOME/scripts:$MPLSHOME/utils
      --snip--

    To actually compile mplsadm do:

      make mplsadm

Now, untar mpls-linux-scripts.tgz to /usr/src/mpls-linux/scripts , the
tarball contains some shell scripts that you can play with (the scripts
can be downloaded from http://entropy.tmok.com/~mistvan).
At the moment there are only two of them:

   1. sh_mpls_conf - displays the configuration parameters of your mpls
                     LSR, somewhat friendlier than cat /proc/net/mpls_*
   2. rm_mpls_conf - deletes all label mappings and other mpls
                     configuration parameters, except the label space
                     values, since these values will remain 0 in most
                     cases.


[4]. Basic examples that anyone can do

[4.1]. Example 1.
       The Label Switched Path
       or, Close Encounters of the Third Kind with static LSPs

The example consists of configuring hosts A and B such that an LSP is
established between them; all data going from A to B or from B to A will
be encapsulated in an MPLS header and go down the LSP.

Note: throughout this document the word "static" is used in association
with MPLS terms to denote that parameters are set manually by a human
operator.

In order to build a static LSP between two PCs you will have to follow a
simple scenario like the following:

 +--------+                                                   +----------+
 |  LSR1  |---------------------------------------------------|   LSR2   |
 +--------+                                                   +----------+

Configuration parameters for this example:

 +--------+---------+--------------+---------------------+-----------------+
 |LSR No. | IF_NAME | IF_IPADDR    | IF_IPMASK           | IF_BCAST        |
 +--------+---------+--------------+---------------------+-----------------+
 |   1    | lo      | 127.0.0.1    | 255.0.0.0 (/8)      | 127.255.255.255 |
 +--------+---------+--------------+---------------------+-----------------+
 |   1    | eth0    | 192.168.5.33 | 255.255.252.0 (/22) | 192.168.7.255   |
 +--------+---------+--------------+---------------------+-----------------+
 |   2    | lo      | 127.0.0.1    | 255.0.0.0 (/8)      | 127.255.255.255 |
 +--------+---------+--------------+---------------------+-----------------+
 |   2    | eth0    | 192.168.6.60 | 255.255.252.0 (/22) | 192.168.7.255   |
 +--------+---------+--------------+---------------------+-----------------+

 The two eth0 addresses in hexadecimal:

 +----------------+--------------+
 | 192.168.5.33   | (0xC0A80521) |
 +----------------+--------------+
 | 192.168.6.60   | (0xC0A8063C) |
 +----------------+--------------+

General info about the IP addresses used in this example:

   (Class C)
   Network:   192.168.4.0/22    11000000.10101000.000001 00.00000000
   Broadcast: 192.168.7.255     11000000.10101000.000001 11.11111111
   HostMin:   192.168.4.1       11000000.10101000.000001 00.00000001
   HostMax:   192.168.7.254     11000000.10101000.000001 11.11111110

Goal of the example:

By carrying out this example you'll set up a static Label Switched Path
(LSP) between two LSRs, namely LSR1 and LSR2. All traffic from LSR1 to
LSR2 and vice versa will be encapsulated in an MPLS header and will get
a label corresponding to its destination (i.e. FEC - forwarding
equivalence class).

This is some pseudo-code that illustrates in broad lines the mpls label
switching mechanism that will take place after you set up the above
mentioned LSP:

                       [1. i = get_BOS(incoming_packet)               ]
           IN          [2. if (i != 0) then {                         ]
 [packet]-->---- LSR1  [3.   tmp = pop(top_of_label_stack_packet)     ]
                       [4.   if (lookup_FW_TABLE(tmp)) {              ]
                       [5.     swap_label(tmp) & decrease TTL         ]      OUT (case #1)
                       [6.     mpls_fwd(packet_with_new_label_stack) }-]-->---- next_hop LSR
                       [7.   else { rm_shim(packet) & check(DLV)      ]
                       [8.     pass_to_kernel(packet) }              -]-->---- kernel
                                                                             OUT (case #2)

Configuring LSR1:
-----------------
Configure/reconfigure one of your Ethernet interfaces, in this example I
used eth0. You have to tell the kernel what you plan to do, so as root
carry out the following commands:

   root@lsr1# ifconfig eth0 192.168.5.33 netmask 255.255.252.0 broadcast 192.168.7.255 up
   root@lsr1# route add -host 192.168.6.60 gw 192.168.6.60

The IPv4 kernel routing table on LSR1 (A) should look like this:

   Destination     Gateway         Genmask          Flags  Iface
   192.168.6.60    192.168.6.60    255.255.255.255  UGH    eth0
   192.168.4.0     0.0.0.0         255.255.252.0    U      eth0
   127.0.0.0       0.0.0.0         255.0.0.0        U      lo

Now step to the second PC:

Configuring LSR2:
-----------------

   root@lsr2# ifconfig eth0 192.168.6.60 netmask 255.255.252.0 broadcast 192.168.7.255 up
   root@lsr2# route add -host 192.168.5.33 gw 192.168.5.33

IPv4 Routing Table on LSR2 (B):

   Destination     Gateway         Genmask          Flags  Iface
   192.168.5.33    192.168.5.33    255.255.255.255  UGH    eth0
   192.168.4.0     0.0.0.0         255.255.252.0    U      eth0
   127.0.0.0       0.0.0.0         255.0.0.0        U      lo

Building the MPLS Forwarding Tables:

Steps:
------
 ** On LSR1 **

   root@lsr1# mplsadm -A -B -O gen:2614:eth0:ipv4:192.168.6.60 -f 192.168.6.60/32
   # i.e. establish a mapping between outgoing label 2614 and next_hop
   # while specifying label type and outgoing interface
   root@lsr1# mplsadm -A -I gen:1234:0
   # add incoming label 1234 to local label space 0
   root@lsr1# mplsadm -L eth0:0
   # enable MPLS on interface eth0 by assigning it to label space 0

 Result: MPLS FW_TABLE for LSR1 (A)

 +------------------------+-----------+---------+---------------+----------+
 | (incoming label)_index | out_label | out_int | next_hop_type | next_hop |
 +------------------------+-----------+---------+---------------+----------+
 |          1234          |   2614    |  eth0   |     ipv4      | C0A8063C |
 +------------------------+-----------+---------+---------------+----------+

 ** On LSR2 **

   root@lsr2# mplsadm -A -B -O gen:1234:eth0:ipv4:192.168.5.33 -f 192.168.5.33/32
   root@lsr2# mplsadm -A -I gen:2614:0
   root@lsr2# mplsadm -L eth0:0

 Result: MPLS FW_TABLE for LSR2 (B)

 +------------------------+-----------+---------+---------------+----------+
 | (incoming label)_index | out_label | out_int | next_hop_type | next_hop |
 +------------------------+-----------+---------+---------------+----------+
 |          2614          |   1234    |  eth0   |     ipv4      | C0A80521 |
 +------------------------+-----------+---------+---------------+----------+

After carrying out the steps for LSR1 and LSR2 you will have all traffic
from A to B and from B to A going through the LSP. You can easily check
the mpls traffic by taking a peek with ethereal.
( * ethereal * is a network protocol analyzer, it can be downloaded
from: http://www.ethereal.com )

The current MPLS settings can be checked with the "sh_mpls_conf" script;
if you've done something wrong or just want to get rid of the mpls
settings, run "rm_mpls_conf". The installation of these scripts was
explained in section [3].

To actually see that the connection between the two hosts is going
through an MPLS LSP, enable debug with the mplsadm -d command, or check
dmesg.
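
The static configuration built in this example can also be modelled in
a few lines of Python. This is a sketch for checking your understanding,
not a description of how mpls-linux is implemented: the dictionary
layout and the function names send and receive are made up for the
illustration, and the "pop and deliver" behaviour for incoming labels is
assumed from the POP DLV instructions shown in section [6].

```python
# Per-LSR state taken from the mplsadm commands of this example.
# "out" maps a FEC (host address) to the label pushed towards it,
# "in" is the set of labels accepted in label space 0.
LSRS = {
    "LSR1": {"ip": "192.168.5.33",
             "out": {"192.168.6.60": 2614},
             "in": {1234}},
    "LSR2": {"ip": "192.168.6.60",
             "out": {"192.168.5.33": 1234},
             "in": {2614}},
}


def send(src, dst_ip):
    """Push the label bound to dst_ip on src; return (label, dst_ip)."""
    label = LSRS[src]["out"].get(dst_ip)
    if label is None:
        raise LookupError("no outgoing label bound to %s on %s" % (dst_ip, src))
    return (label, dst_ip)


def receive(lsr, frame):
    """Pop the label if it is a known incoming label, then deliver to IP."""
    label, dst_ip = frame
    if label not in LSRS[lsr]["in"]:
        return "dropped: unknown label %d" % label
    if dst_ip == LSRS[lsr]["ip"]:
        return "delivered to local IP stack"
    return "handed to IP forwarding"


frame = send("LSR1", "192.168.6.60")
print(frame)                   # the labelled packet as sent by LSR1
print(receive("LSR2", frame))  # what LSR2 does with it
```

Running the same two calls with LSR2 as sender and 192.168.5.33 as
destination exercises the reverse direction with label 1234.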
If you have mplsadm -d enabled you can check the messages in
/var/log/messages as well.


[4.2]. Example 2.
       MPLS Tunnels,
       or The light at the end of the tunnel might be an oncoming packet

MPLS tunnels are an abstraction used to represent an LSP to routing
protocols or IP services in a way which they can easily understand, i.e.
as a network interface. MPLS tunnels are also one way of doing LSP
hierarchies.

Imagine an MPLS tunnel as a bidirectional tube/pipe system used in a
bank to allow offices situated on different floors to exchange money or
documents at high speed and across multiple floors, using plastic
capsules and wrapping paper of different colors.

Rules:
 - a capsule has an origin, a destination, an intermediate origin and an
   intermediate destination;
 - origin and destination might be any person working in an office or
   having access to an office;
 - intermediate origin and destination are the offices;
 - when office A sends a capsule to B it must be wrapped in red paper,
   thus hiding the origin and destination info. Only the intermediate
   origin and destination info will be available to an observer at an
   intermediate check point;
 - when office B sends a capsule to A it must be wrapped in blue paper;
   blue paper has the same role as the red one, it hides info that is
   irrelevant on the "road" between source and destination office;
 - although red & blue capsules pass through multiple floors and check
   points, nobody is allowed to check their content, i.e. origin or
   destination. Only the intermediate destination can be seen (the
   intermediate origin can be identified from the wrapping color).

Think of the colored wrappings as mpls encapsulation and of the
intermediate origin and destination as the two ends of an MPLS tunnel,
i.e. two MPLS tunnel interfaces. The origin and destination can be any
network that is routed through the respective MPLS tunnel interfaces.
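
The wrapping analogy corresponds to pushing one more label on top of a
packet's label stack. The following Python sketch shows that idea under
simplifying assumptions: the label stack is a plain list with the top of
the stack first, the label values are arbitrary, and the function names
are hypothetical and do not come from mpls-linux.

```python
def tunnel_enter(label_stack, tunnel_label):
    """Tunnel ingress: wrap the packet by pushing the tunnel label on top."""
    return [tunnel_label] + list(label_stack)


def transit_view(label_stack):
    """A check point inside the tunnel only looks at the outermost label."""
    return label_stack[0]


def tunnel_exit(label_stack):
    """Tunnel egress: unwrap by popping the tunnel label."""
    return list(label_stack[1:])


inner = [1234]                        # label of an LSP carried in the tunnel
wrapped = tunnel_enter(inner, 5000)   # 5000 plays the role of the red paper
print(wrapped)                        # stack while inside the tunnel
print(transit_view(wrapped))          # all that an intermediate node sees
print(tunnel_exit(wrapped))           # the original stack is back at the far end
```

The inner label is carried untouched through the tunnel, which is what
makes hierarchies of LSPs possible.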
Now let's step towards the practice:

 +---------------+
 | Router 1 (R1) |--------------------------------
 +---------------+


[5]. How to debug when it fails

If you have problems, always enable debugging. Enable debugging with:

   mplsadm -d

Note that you can disable debugging by repeating the same command.
After debugging is enabled, save the debug messages, read them and try
to find out whether the mistake is yours or something is wrong with the
mpls-linux distribution.


[6]. Descriptions of what is happening behind the scenes

6.0.1 What is this MPLS ?

MPLS is a label switching/swapping framework that aims to play an
important role in the future of packet forwarding through high speed
networks. The catalyst that pushed the IETF towards the development of
MPLS was the growing service requirements of network users. MPLS is
independent of Layer 2 and Layer 3.

At the "entrance" of an MPLS domain (ingress router), i.e. basically the
entrance of an LSP, the incoming IP packet is encapsulated in an MPLS
header and sent down the LSP. At the end of the LSP the egress router
restores the IP packet by stripping off the MPLS header. By default the
TTL is adjusted while the packet flows through the LSP.
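
The MPLS header that the ingress router prepends is a single 32-bit word
whose field widths are listed in section 6.0.2 (Label 20 bits, Exp 3
bits, S 1 bit, TTL 8 bits). As a minimal sketch, the Python functions
below pack and unpack such a word; the names pack_shim and unpack_shim
are made up for this illustration and are not mpls-linux functions. The
sample values are the ones used in the example of section 6.0.2.

```python
import struct


def pack_shim(label, exp, bos, ttl):
    """Build the 4-byte MPLS header in network byte order."""
    if not (0 <= label < 2 ** 20 and 0 <= exp < 8
            and bos in (0, 1) and 0 <= ttl < 256):
        raise ValueError("field out of range")
    word = (label << 12) | (exp << 9) | (bos << 8) | ttl
    return struct.pack("!I", word)


def unpack_shim(data):
    """Split the first 4 bytes of data into (label, exp, bos, ttl)."""
    (word,) = struct.unpack("!I", data[:4])
    return (word >> 12, (word >> 9) & 0x7, (word >> 8) & 0x1, word & 0xFF)


shim = pack_shim(2614, 0, 1, 64)
print(shim.hex())         # 00a36140
print(unpack_shim(shim))  # (2614, 0, 1, 64)
```

On Ethernet this word sits right after the ethertype field and before
the IP header; a label stack is simply several such words in a row, with
the S bit set only on the last one.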

6.0.2 MPLS Header Format:

    00000000000000000000 000 0 00000000
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       Label        |Exp|S|   TTL  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

       Label       : 20 bits
       Exp         :  3 bits
       BO(S)       :  1 bit
       TTL         :  8 bits
       ------------------------
       MPLS_Header : 32 bits

   (Label): the label is used to match a packet against a corresponding
            LSP ;
   (Exp)  : experimental bits, this field carries the packet queueing
            priority (CoS) ;
   (BOS)  : bottom of label stack bit ;
   (TTL)  : copied from the IP TTL field, has the same task, loop
            prevention ;

             /  [2614][0][1][64]                                  _B10
   Example: -|- [0000 0000 1010 0011 0110] [000][1] [0100 0000]   _B2
             \  0x00A36140                                        _B16


6.0.3 MPLS and the Linux Kernel, or "a packet's life" in MPLSland

As said at the beginning of this document, MPLS is an extension to the
"classic" TCP/IP layered set of protocols. As for any data packet coming
in or going out the NIC, the OS kernel needs a group of functions to
process each TCP/IP layer in order to fully "understand" the information
carried in the packet's meta-data and connect it to a socket
accordingly.

At first, kernel routines process the packet's data link-layer header,
then the packet is passed over to either the MPLS plane or the network
layer routines.

The MPLS stack registers with the Linux networking stack to receive all
packets with ether-type 0x8847 (just like the IP stack registers for
packets with ether-type 0x0800).

In: /usr/src/linux/include/linux/if_ether.h

   --snip--
   #define ETH_P_IP        0x0800          /* Internet Protocol packet     */
   --snip--
   #define ETH_P_MPLS_UC   0x8847          /* MPLS Unicast traffic         */
   #define ETH_P_MPLS_MC   0x8848          /* MPLS Multicast traffic       */
   --snip--

Ethertype values for MPLS are defined in RFC3032 - "MPLS Label Stack
Encoding". (http://www.ietf.org/rfc/rfc3032.txt)

The registration is done inside of mpls_init() in mpls_init.c. The
result is that any PPP or Ethernet frames that contain a label stack
will be sent to the function mpls_rcv() in mpls_input.c .

MPLS for Linux has a notion of 'instructions'.
These instructions modify the packet and direct it through the MPLS
stack. In mpls_rcv() the first thing that is done is a 'peek'. This
instruction looks at the label on the top of the label stack and looks
it up in the ILM (Incoming Label Map). Looking at the code, you might
find that James R. Leu sometimes refers to the ILM as 'incoming label
info', or you'll see variables that contain 'mii' (MPLS incoming info).

An ILM maps each of the incoming labels to a set of Next Hop Forwarding
Entries. The peek will yield an ILM entry. Attached to the ILM entry is
an array of instructions. This array is processed by mpls_input() in
mpls_input.c.

Ex:
---
Consider the following MPLS settings:

   mpls_fec        { 40134802 192.168.5.30/32 };
   mpls_in         { 4028d800 gen 2614 0 POP DLV };
   mpls_labelspace { lo 0  eth0 0  eth1 0 };
   mpls_out        { 40134802 PUSH(gen 1234) SET(eth0) };
   mpls_tunnel     { };

"The way" of an incoming packet can be illustrated as:

   +- mpls_rcv: enter
   |
   |  +- mpls_opcode_peek: enter
   |  |
   |  +- mpls_opcode_peek: exit
   |
   |  +- mpls_input: enter
   |  |  mpls_input: labelspace=0,label=2614,exp=0,B.O.S=1,TTL=255
   |  |  mpls_input: pop
   |  |
   |  |  mpls_opcode_pop: enter
   |  |  mpls_opcode_pop: exit
   |  |
   |  |  mpls_input: mii_proto 8
   |  |
   |  |  mpls_finish: enter
   |  |  mpls_finish: exit
   |  |
   |  |  mpls_input: setting ttl 255
   |  |  mpls_input: sending to IPv4
   |  |
   |  |  skb_dump: from eth0 with len 53 (264) headroom=32 tailroom=7
   |  |  3c343e4a756c2031322031373a34383a3330*0020af9a2a02000102da22db
   |  |  8847{#|45000035ef444000ff0600f2c0a8051ec0a8051d127d00179747777
   |  |  a84d333e580185fe4572900000101080a01faa75d03b7a4fd08b7a4fd0860}
   |  |
   |  |  mpls_input: retval from ip_rcv 0
   |  |  mpls_input: finished executing in label program
   |  +- mpls_input: exit
   |
   +- mpls_rcv: exit


[10]. Further documentation resources

 * BGP/MPLS VPNs
   (http://www.ietf.org/rfc/rfc2547.txt)
 * Requirements for Traffic Engineering Over MPLS
   (http://www.ietf.org/rfc/rfc2702.txt)
 * A Core MPLS IP VPN Architecture
   (http://www.ietf.org/rfc/rfc2917.txt)
 * Multiprotocol Label Switching Architecture
   (http://www.ietf.org/rfc/rfc3031.txt)
 * MPLS Label Stack Encoding
   (http://www.ietf.org/rfc/rfc3032.txt)
 * Use of Label Switching on Frame Relay Networks Specification
   (http://www.ietf.org/rfc/rfc3034.txt)
 * MPLS using LDP and ATM VC Switching
   (http://www.ietf.org/rfc/rfc3035.txt)
 * LDP Specification
   (http://www.ietf.org/rfc/rfc3036.txt)
 * LDP Applicability
   (http://www.ietf.org/rfc/rfc3037.txt)
 * VCID Notification over ATM link for LDP
   (http://www.ietf.org/rfc/rfc3038.txt)
 * MPLS Loop Prevention Mechanism
   (http://www.ietf.org/rfc/rfc3063.txt)