Is it a issue to achive proper QoS for IP tunneled traffic over GRE encrypted by IPsec tunnel – for Cisco router is not the case. With use of GRE/IPsec ToS byte is copied from the original IP header to the new GRE and IPsec IP headers. It’s done by default without any specific configuration, this process is called the “ToS Byte Preservation” feature.
For example if router gets IP packet with DSCP value equal to 46 then packet is encapsulated in to GRE header, ToS byte is copied to the GRE IP header ToS field. In case this GRE packet is subject to IPsec encryption the same process occurs and ToS byte from GRE headers is copied into IPsec IP header – all is done by automatically. But what if QoS policy would classify and do action based on original source IP and specific TCP or UDP port. Solution is the QoS pre-classify feature. It allows router to temporarily copy and store the layer 3 and 4 headers from the original IP packet and take action based on these values so classify, queue and schedule on the router’s egress interface accordingly.

To achive above QoS feature have to be enabled both under GRE tunnel and IPsec crypto map as follows:
Router(config)#int tun 0
Router(config-if)#qos pre-classify
Router(config-if)#exit
Router(config)#crypto map MAPA 100 ipsec-isakmp
Router(config-crypto-map)#qos pre-classify

Now you are ready to go with class-map and policy-map. Remember that in this case policy needs to be assigned to physical interface where crypto map is configured already.

, , , , , ,

Ping tool is very usful to discover or understand some network behaviours related to network protocols or services that are run on Cisco routers or MS Windows. Before we start with ping test, first of all we have to know how this tool has been deployed in both systems.

Datagram/Packet size (IP Total Length field) = IP Header + Payload
As name imply it’s datagram/packet size so IP Header + Payload.

With Cisco IOS the case is simple. In below example packet size is equal to 1000 bytes, it means that ICMP Echo ping message will be generated with standard 20 bytes for IP Header + 980 for ICMP Payload (ICMP type field: 1B, code: 1B, checksum: 2B, identifier: 2B, sequence number: 2B, data: 972B).

R1#ping 10.0.23.3 repeat 1 size 1000
Type escape sequence to abort.
Sending 1, 1000-byte ICMP Echos to 10.0.23.3, timeout is 2 seconds:
!
Success rate is 100 percent (1/1), round-trip min/avg/max = 80/80/80 ms

In case of Windows OS it’s more complicated. We can define specific data size (don’t mix up it with payload) so in this example 1000B means exactly ICMP data in payload + fields in ICMP + IP Header. In this case IP packet size will have 20 bytes for IP Header + 1008 bytes for ICMP Payload (ICMP type field: 1B, code: 1B, checksum: 2B, identifier: 2B, sequence number: 2B, data: 1000B). So in case of Ethernet network frame will have 1042B (14B Ethernet II Header included). 

C:\>ping 10.0.10.1 -l 1000 -n 1

Here you can find sniff of ICMP echo request based on the Windows ping tool.

To recap:
Cisco ping size = total IP packet length
Windows ping size = just ICMP data without ICMP 8B fields like type, code, checksum, identifier and sequence.

Fragmentation and Maximum Transmission Unit (MTU)

For IP protocol MTU defines size of IP packet (so IP Total Length field) can can be send thru network device interface. As packets are encapsulated into frame in data link layer they have to be small enough to be transmited by the physical transmission technology.  In case packet is larger then maximum size of underlying network technology there is need to divide an IP packet in to smaller IP chunks. This process is called as IP fragmentation, puttin chunks all together back on the second end of the transmission is called as reassemble, the reassemble of IP packets is done on the IP destination end.

Fragmentation issues

Fragmentation couse more overhead for the receiver then to sender. Device responsible for fragmentation needs to create new header and devide orignal pacekts into fragments. From the other side receivermust allocate memeory to properly serve all fragments and consolidate them all togther. It is not a issue for final destinatio like a host but could couse a problem for routers.

Several issues related to IP fragmentation that couse that is should be avoided:

  • Requires the support of slow path processing on the routers and additionaly use CPU of the receiver part  to reassemble the fragments (forwarding functions is done by software so process switched). 
  • In case router gets fragmented needs to do IP reassembling it reserve the largest buffer available with which to work because it has no idea what size of the original IP packet will be get.
  • If one fragment of an IP packet is dropped, then the entire IP packet must be resend and fragmented again.

 Fragmentation with GRE and IPsec tunnel

The biggest issue with IP fragmentation is related to GRE tunnel. Let’s take a look on the following example. Router A gets 1476B packet then adds GRE header into and sends to tunnel destination. Assume there is a router in the MPLS cloud with link MTU 1400B. This router will fragment GRE packet (GRE IP header, inner origanl IP and TCP headers will be only in the first fragment), in this case GRE tunnel destination router must reassemble the fragmented packets.

To avoid IP fragmentation the best way is to increase the MTU on whole packet way (so on each router on the path), in case of MPLS provider is not the issue because more and more vendors already increased this value up to 1600B to support IPsec or GRE without any problems. In case provider does not support higher MTU or transmit network is Internet we have three options: decrease IP MTU on tunnel interface, adjust MSS value or use PMTUD.  Let’s move from fragmentation theory and take a look how to take advantage of ping to discover the fragmentation issue related to MTU on the packet way and how it can be avoided.

Example – pure network

Suppose that we have pure network enviroment, two CE routers one in branch A and second one in remote branch B, we would confirm that MTU on the path especially minimum MTU in the provider MPLS cloud is equal to 1500B.

To test the MTU on the path the best option is to send the packets each time incrementing overall size. Extended ping feature with sweep option is perfect in this case. In the below example we do send ICMP echo request eith overall size equal to 1490B, we set sweep max size to 1510B where weep interval so we are going to increment each packet by 1B. Important option in this example is setting of Don’t Fragment bit.
R1#p
Protocol [ip]:
Target IP address: 10.0.23.3
Repeat count [5]:
Datagram size [100]:
Timeout in seconds [2]:
Extended commands [n]: y
Source address or interface:
Type of service [0]:
Set DF bit in IP header? [no]: y
Validate reply data? [no]:
Data pattern [0xABCD]:
Loose, Strict, Record, Timestamp, Verbose[none]:
Sweep range of sizes [n]: y
Sweep min size [36]: 1490
Sweep max size [18024]: 1510
Sweep interval [1]:
Type escape sequence to abort.
Sending 105, [1490..1510]-byte ICMP Echos to 10.0.23.3, timeout is 2 seconds:
Packet sent with the DF bit set
!!!!!!!!!!!..........!!!!!!!!!!!..........!!!!!!!!!!!..........!!!!!!!
!!!!..........!!!!!!.
Success rate is 54 percent (50/91), round-trip min/avg/max = 40/99/224 ms

We have sent out 10 ICMP echo request starting from 1490B and we have got 10 responses, but next 10 pacets has not been answered at all, it’s means that packets was not be able to reach a final destination due to MTU, in fact pacekts longer then 1500B even has not been routed and leave router. Tested router has MTU equal 1500B configured on the interfaces so in the debug you can show that packets longer then 1500B has been routed just by FIB not by RIB like others.
*Mar 1 00:16:42.179: IP: tableid=0, s=10.0.12.1 (local), d=10.0.23.3 (FastEthernet0/0), routed via FIB
*Mar 1 00:16:42.183: IP: s=10.0.12.1 (local), d=10.0.23.3 (FastEthernet0/0), len 1499, sending
*Mar 1 00:16:42.183: ICMP type=8, code=0
*Mar 1 00:16:42.219: IP: tableid=0, s=10.0.23.3 (FastEthernet0/0), d=10.0.12.1 (FastEthernet0/0), routed via RIB
*Mar 1 00:16:42.223: IP: s=10.0.23.3 (FastEthernet0/0), d=10.0.12.1 (FastEthernet0/0), len 1499, rcvd 3
*Mar 1 00:16:42.227: ICMP type=0, code=0
*Mar 1 00:16:42.231: IP: tableid=0, s=10.0.12.1 (local), d=10.0.23.3 (FastEthernet0/0), routed via FIB
*Mar 1 00:16:42.231: IP: s=10.0.12.1 (local), d=10.0.23.3 (FastEthernet0/0), len 1500, sending
*Mar 1 00:16:42.231: ICMP type=8, code=0
*Mar 1 00:16:42.283: IP: tableid=0, s=10.0.23.3 (FastEthernet0/0), d=10.0.12.1 (FastEthernet0/0), routed via RIB
*Mar 1 00:16:42.287: IP: s=10.0.23.3 (FastEthernet0/0), d=10.0.12.1 (FastEthernet0/0), len 1500, rcvd 3
*Mar 1 00:16:42.291: ICMP type=0, code=0
*Mar 1 00:16:42.299: IP: tableid=0, s=10.0.12.1 (local), d=10.0.23.3 (FastEthernet0/0), routed via FIB
*Mar 1 00:16:42.299: IP: s=10.0.12.1 (local), d=10.0.23.3 (FastEthernet0/0), len 1501, sending
*Mar 1 00:16:42.303: ICMP type=8, code=0
*Mar 1 00:16:44.295: IP: tableid=0, s=10.0.12.1 (local), d=10.0.23.3 (FastEthernet0/0), routed via FIB
*Mar 1 00:16.:44.299: IP: s=10.0.12.1 (local), d=10.0.23.3 (FastEthernet0/0), len 1502, sending

Router sends 20 packets (starting from 1490 up to 1510, later on starting process again this is why we see exclamation mark and dots and exclamation again). We have sent 10 packets to starting from 1490 + 10 it means that last one was 1500B long and the router has 1500B MTU configured and droped the packets because was not able to fragment it due to DF bit set. The propose of this example was just to show way of test.

Example 2 – GRE over IPsec

Fragmentation can be a issue in case of use GRE over IPsec where end point that terminates the IPsec tunnel has to add GRE Header and next encrypts it all together. GRs header adds 24B to the packet, IPsec with ESP 56B so we have 80B overhead. As we have mentioned the best option to use in this case is PMTUD but what if routers on the path blocks the ICMP messages then the best option is to use optimum MTU size under tunnel GRE interface. GRE source router then will responsible for IP fragmentation befrore adding GRE and IPsec headers. To calulate the the MTU for tunnel the best option is to use sweep ping as above.

Let’s assume that we have GRE/IPsec between R1 and R3. We have experience issues related to the slow response time of application and high CPU on the R3 router. We expect that IP fragmentation is a culprit. We would to figure out what minimal MTU is on the path to adjust tunnel MTU to avoid IP fragmentation.
We are not able to ping with DF bit set to simulate traffic with GRE/IPsec because router will not copy DF bit in to IPsec header ( GRE tunnel cleares DF bit unless we use tunnel path-mtu discovery under tunnel GRE), istead of this we have to ping pure IP so for example remote head end of IPsec tunnel and be sure that this traffic will not be encrypted.

Test – test of minimal MTU on the path

R1#ping
Protocol [ip]:
Target IP address: 10.0.23.3
Repeat count [5]:
Datagram size [100]:
Timeout in seconds [2]:
Extended commands [n]: y
Source address or interface:
Type of service [0]:
Set DF bit in IP header? [no]: y
Validate reply data? [no]:
Data pattern [0xABCD]:
Loose, Strict, Record, Timestamp, Verbose[none]:
Sweep range of sizes [n]: y
Sweep min size [36]: 1300
Sweep max size [18024]: 1500
Sweep interval [1]:
Type escape sequence to abort.
Sending 1005, [1300..1500]-byte ICMP Echos to 10.0.23.3, timeout is 2 seconds:
Packet sent with the DF bit set
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!M.M.M.M.M.M.M.M.M.M.M.M.M.M.M.M.M.M.M.M
.M.M.M.
Success rate is 68 percent (101/147), round-trip min/avg/max = 4/19/60 ms

After 101 successful request so we get response for ICMP with 1300B lenght until 1400B so the minimal MTU on the path is 1400B. M means could not fragment. Then recomended is to lower IP MTU under GRE to minimal MTU on the path – 80B (24B for GRE and 56B for ESP tunnel IPsec).

To read more about IP Fragmentation, MTU, MSS and PMTUD issues with GRE and IPsec I recomend to read this very good and interesting Cisco Technology White Paper.

, , , , ,