Internet-Draft draft-deng-spring-sr-loop-free May 2024
Deng, et al. Expires 25 November 2024
Workgroup:
Spring Working Group
Internet-Draft:
draft-deng-spring-sr-loop-free-02
Published:
Intended Status:
Informational
Expires:
Authors:
L. Deng
China Telecom
Y. Zhu
China Telecom
X. Geng
Huawei Technologies
Z. Hu
Huawei Technologies

SR based Loop-free implementation

Abstract

Microloops are transient packet loops that occur in the network following a topology change (link down, link up, node fault, or metric change events). Microloops are caused by the non-simultaneous convergence of different nodes in the network. If a converged node sends traffic to a neighbor node that has not converged yet (or vice versa), traffic may loop between these nodes, resulting in packet loss, jitter, and out-of-order packets. This document presents some optional implementation methods aimed at loop avoidance in different scenarios of IGP network convergence.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 25 November 2024.

1. Introduction

An IP network computes paths using distributed IGPs. If a node or link fails, a loop may occur because the LSDBs of different nodes are temporarily out of sync. Take a link-state protocol such as IS-IS or OSPF as an example: each time the network topology changes, some routers need to update their FIBs based on the new topology. Because convergence time and convergence order differ from router to router, the routers may hold inconsistent state for a short time. Depending on the capability, configuration parameters, and service volume of each device, this period may last from milliseconds to seconds. During this period, each device on the packet forwarding path may be in either the pre-convergence state or the post-convergence state. If the states are inconsistent, forwarding routes may be inconsistent and a forwarding loop may occur. Such a loop disappears once all devices on the forwarding path complete convergence. This kind of transient loop is called a "microloop". Microloops may cause packet loss, delay variation, and out-of-order packets.

Segment Routing, as defined in [RFC8042], can be used to cope with microloops in the network. When a loop may occur due to a network topology change, a network node creates a loop-free segment list to direct traffic to the destination address. After all network nodes converge, the network node returns to the normal forwarding state. This effectively eliminates loops in the network.

[I-D.bashandy-rtgwg-segment-routing-uloop] describes the basic principles of using Segment Routing to cope with microloops. This document describes some optional SR implementation methods for microloop avoidance in different scenarios.

2. Conventions used in this document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

3. Anti-Microloop Scheme for Switching Scenarios

Switching microloops are microloops that occur after a node or link fails. Along the traffic forwarding path, a loop may form if a node closer to the point of failure converges before a node farther from the point of failure. Figure 1 illustrates how a switching microloop forms. When the link between R3 and R5 fails, assume that R3 completes convergence first and R2 has not yet converged. R1 and R2 forward the packet along the previous path to R3. Since R3 has converged, it forwards the traffic back to R2 according to its post-convergence route. A switching microloop therefore forms between R2 and R3.

 +----------------------------------------------------------------+
 |                                             X  link failure    |
 |                                                                |
 |   +-------+      +-------+       +-------+                     |
 |   |   R1  |------|   R2  |-------|   R3  |                     |
 |   +-------+  10  +-------+   10  +-------+                     |
 |                       |               |                        |
 |                       | 10            X  10                    |
 |                       |               |                        |
 |                  +-------+       +-------+        +-------+    |
 |                  |   R4  |-------|   R5  |--------|   R6  |    |
 |                  +-------+ 1000  +-------+   10   +-------+    |
 |                      End.X SID 4::1                            |
 |                                                                |
 +----------------------------------------------------------------+
Figure 1: Switching illustrative scenario, failure of link R3-R5
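The loop in Figure 1 can be reproduced with a small shortest-path calculation. The following Python sketch is illustrative only and is not part of any router implementation. It assumes symmetric link metrics as drawn in Figure 1, and the names next_hops, trace, OLD, and NEW are hypothetical helpers introduced here. It computes each node's next hop toward R6 before and after the R3-R5 failure, then forwards a packet while only R3 has converged.

```python
import heapq

# Figure 1 topology: undirected links with IGP metrics.
LINKS = {("R1", "R2"): 10, ("R2", "R3"): 10, ("R2", "R4"): 10,
         ("R3", "R5"): 10, ("R4", "R5"): 1000, ("R5", "R6"): 10}

def next_hops(links, dest):
    """Return {node: next hop toward dest} via Dijkstra rooted at dest.

    Valid because the metrics are symmetric: a node's parent in the
    tree rooted at dest is its next hop toward dest.
    """
    adj = {}
    for (a, b), cost in links.items():
        adj.setdefault(a, []).append((b, cost))
        adj.setdefault(b, []).append((a, cost))
    dist, nh, heap = {dest: 0}, {}, [(0, dest)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in adj.get(u, []):
            if d + cost < dist.get(v, float("inf")):
                dist[v] = d + cost
                nh[v] = u
                heapq.heappush(heap, (d + cost, v))
    return nh

def trace(src, dest, converged, old_fib, new_fib):
    """Forward hop by hop; converged nodes use new_fib, the rest old_fib."""
    path = [src]
    while path[-1] != dest:
        node = path[-1]
        hop = (new_fib if node in converged else old_fib)[node]
        path.append(hop)
        if hop in path[:-1]:
            return path, True   # revisited a node: microloop
    return path, False

failed = {k: v for k, v in LINKS.items() if k != ("R3", "R5")}
OLD = next_hops(LINKS, "R6")    # FIB entries before the failure
NEW = next_hops(failed, "R6")   # FIB entries after the failure

# Only R3 has converged: the packet bounces between R2 and R3.
print(trace("R1", "R6", {"R3"}, OLD, NEW))
# Every node has converged: the packet follows R1->R2->R4->R5->R6.
print(trace("R1", "R6", {"R1", "R2", "R3", "R4", "R5"}, OLD, NEW))
```

The first call reports a loop on R2 and R3, matching the description above. The second call shows that the loop disappears once all nodes use the post-convergence next hops.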

TI-LFA (whose fundamentals are described in [I-D.ietf-rtgwg-segment-routing-ti-lfa]) is deployed on all nodes of the network. When the link between R3 and R5 fails, the convergence process with switching anti-microloop deployed is as follows:

Time T1 must be longer than time T2. This scheme is limited to single points of failure; in the case of a multi-point failure, the TI-LFA backup path itself may be affected.

4. Anti-Microloop Scheme for Back-switching Scenarios

Microloops may occur not only when a node or link fails, but also after the failed node or link recovers. Figure 2 is used as an example to introduce how a back-switching microloop forms. After the failed node or link recovers, a loop may form if a node farther from the point of failure converges before a node closer to the point of failure.

R1 forwards traffic to the destination node R6 along the path R1->R2->R3->R5->R6. When the link between R2 and R3 fails, R1 forwards traffic to R6 along the re-converged path R1->R2->R4->R5->R6. After the failed link between R2 and R3 recovers, assume that R4 is the first node to converge. R1 forwards the traffic to R2. Since R2 has not completed convergence, it still forwards the packet to R4 along the path used before the link recovered. R4 has already completed convergence, so it forwards the packet back to R2 along the path used after the link recovered, and a microloop forms between R2 and R4.

 +---------------------------------------------------------------+
 |                                            & Link Recovery    |
 |                         End.X SID 2::3                        |
 |   +-------+      +-------+   &   +-------+                    |
 |   |   R1  |------|   R2  |-------|   R3  |                    |
 |   +-------+  10  +-------+  10   +-------+                    |
 |                       |               |                       |
 |                       | 10            | 10                    |
 |                       |               |                       |
 |                  +-------+       +-------+        +-------+   |
 |                  |   R4  |-------|   R5  |--------|   R6  |   |
 |                  +-------+ 1000  +-------+   10   +-------+   |
 |                                                               |
 |                                                               |
 +---------------------------------------------------------------+
Figure 2: Back-switching illustrative scenario, recovery of link R2-R3

Since the network does not enter the TI-LFA forwarding process after a failed node or link recovers, delayed convergence cannot be used in the back-switching scenario to prevent microloops as it is in the switching scenario. In the back-switching scenario, it is only necessary to specify the Adj-SID of the back-switching link to achieve loop-free forwarding.

From the above description of how a back-switching microloop forms, it can be seen that the microloop occurs because R4 is unable to pre-install a loop-free path computed for the link-up event. Therefore, in order to eliminate the potential loop after the faulty link recovers, R4 needs to be able to converge to a loop-free path.

When the faulty node or link recovers, microloops can be avoided by simply specifying Adj-SIDs of the neighbor node. As shown in Figure 2, R4 detects that the faulty link R2-R3 has recovered and re-converges to the path R4->R2->R3->R5->R6 toward the destination R6. The recovery of link R2-R3 does not affect the SR path from R4 to R2 or the SR path from R3 to R6, so both of them are loop-free. Since the only part affected is the hop from R2 to R3, a loop-free path from R4 to R6 can be determined just by specifying the path from R2 to R3. It is therefore sufficient for R4 to insert into its converged path an End.X SID that instructs R2 to forward the packet to R3. For example, if R4 inserts the anti-microloop segment list <2::3> into the packet before forwarding it to R2, the path from R4 to R6 is guaranteed to be loop-free.
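The effect of the segment list <2::3> can be illustrated with a small forwarding simulation. The following Python sketch is a simplified model under stated assumptions, not an SRv6 data-plane implementation. An End.X SID is modeled as a hypothetical (node, neighbor) pair, so 2::3 becomes ("R2", "R3"). The helper names next_hops and walk, the policy table, and the hop limit of 12 are all introduced here for illustration. Metrics are taken from Figure 2 and assumed symmetric.

```python
import heapq

# Figure 2 topology after link R2-R3 has recovered.
NEW = {("R1", "R2"): 10, ("R2", "R3"): 10, ("R2", "R4"): 10,
       ("R3", "R5"): 10, ("R4", "R5"): 1000, ("R5", "R6"): 10}
# View of a node that has not yet processed the recovery.
OLD = {k: v for k, v in NEW.items() if k != ("R2", "R3")}

def next_hops(links, dest):
    """Return {node: next hop toward dest}; metrics assumed symmetric."""
    adj = {}
    for (a, b), cost in links.items():
        adj.setdefault(a, []).append((b, cost))
        adj.setdefault(b, []).append((a, cost))
    dist, nh, heap = {dest: 0}, {}, [(0, dest)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in adj.get(u, []):
            if d + cost < dist.get(v, float("inf")):
                dist[v] = d + cost
                nh[v] = u
                heapq.heappush(heap, (d + cost, v))
    return nh

def walk(src, dest, converged, policy=None, max_hops=12):
    """Forward a packet; policy maps a node to the End.X list it inserts."""
    policy = policy or {}
    node, segs, path = src, [], [src]
    while len(path) <= max_hops:
        if node == dest:
            return path, True
        if not segs and node in policy:
            segs = list(policy[node])      # insert the segment list
        if segs and segs[0][0] == node:
            _, nxt = segs.pop(0)           # End.X: cross the named adjacency
        else:
            target = segs[0][0] if segs else dest
            links = NEW if node in converged else OLD
            nxt = next_hops(links, target)[node]
        path.append(nxt)
        node = nxt
    return path, False                     # hop limit hit: packet looped

# R4 converged, R2 not, no segment list: the packet loops between R2 and R4.
print(walk("R1", "R6", {"R4"}))
# R4 inserts <2::3>: R2 is forced onto the recovered link and the packet arrives.
print(walk("R1", "R6", {"R4"}, policy={"R4": [("R2", "R3")]}))
```

In the second call the packet visits R2 twice (R1->R2->R4->R2->R3->R5->R6), but it does not loop: the second visit executes the End.X instruction instead of consulting R2's stale route.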

5. Anti-Microloop Scheme for Multi-source Scenarios

When an IPv4 or IPv6 prefix is advertised by multiple nodes in an IS-IS domain, the prefix has multiple route sources, which is called a multi-source route. This section is for the multi-source microloop avoidance scenario, which may occur when multiple nodes advertise the same route with inconsistent convergence speeds.

Multi-source microloops are prevented by adding SRv6 End.X and End SIDs to the segment list in the SRv6 scenario, or by adding Prefix-SIDs and Adj-SIDs to the label stack in the SR-MPLS scenario.

The following example describes how the microloop happens when multiple nodes advertise the same route.

1. R3 and R6 both advertise the route 2001:db8:3::. The link between R2 and R3 fails. Assume that R2 completes convergence first and R1 has not yet converged.

2. R1 forwards the packet with address prefix 2001:db8:3:: to R2 along the path before the failure.

3. Because R2 has completed convergence, R2 forwards packets to R1 according to the next hop of the route. In this way, a loop is formed between R1 and R2.

 +---------------------------------------------------+
 |                                 X  link failure   |
 | 2001:db8:1::    2001:db8:2::      2001:db8:3::    |
 |   +-------+       +-------+        +-------+      |
 |   |   R1  |-------|   R2  |----X---|   R3  |      |
 |   +-------+  10   +-------+   10   +-------+      |
 |        |                                          |
 |        | 10                                       |
 |        |                                          |
 |   +-------+       +-------+        +-------+      |
 |   |   R4  |-------|   R5  |--------|   R6  |      |
 |   +-------+  10   +-------+   10   +-------+      |
 | 2001:db8:4::     2001:db8:5::     2001:db8:3::    |
 |                                                   |
 +---------------------------------------------------+
Figure 3: Multi-source illustrative scenario, failure of link R2-R3

One possible solution is as follows. The preferred destination node for packets destined for 2001:db8:3:: changes from R3 to R6, but the converged path from R2 to R5 does not change. In this case, timer T1 can be started on R2. Until T1 expires, for packets that must reach R6, R2 adds an End.X SID for the adjacency between R5 and R6, or the End SID of R6, to the encapsulation in order to ensure that the packets are forwarded to R6. The basic principle in the case of SR-MPLS is similar to that in the case of SRv6.
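The multi-source case can be sketched in the same way. The following Python example is a simplified model, not a definitive implementation. The prefix 2001:db8:3:: is modeled as a shortest-path computation with two roots (R3 and R6), the End SID of R6 is modeled simply as the node name "R6", and timer T1 is not modeled: the policy is assumed to be active for the whole walk. The helper names next_hops and walk and the hop limit are hypothetical.

```python
import heapq

# Figure 3 topology before the failure of link R2-R3.
OLD = {("R1", "R2"): 10, ("R2", "R3"): 10, ("R1", "R4"): 10,
       ("R4", "R5"): 10, ("R5", "R6"): 10}
NEW = {k: v for k, v in OLD.items() if k != ("R2", "R3")}
ORIGINS = {"R3", "R6"}   # both nodes advertise 2001:db8:3::

def next_hops(links, roots):
    """Return {node: next hop toward the nearest root}; symmetric metrics."""
    adj = {}
    for (a, b), cost in links.items():
        adj.setdefault(a, []).append((b, cost))
        adj.setdefault(b, []).append((a, cost))
    dist = {r: 0 for r in roots}
    nh, heap = {}, [(0, r) for r in roots]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in adj.get(u, []):
            if d + cost < dist.get(v, float("inf")):
                dist[v] = d + cost
                nh[v] = u
                heapq.heappush(heap, (d + cost, v))
    return nh

def walk(src, converged, policy=None, max_hops=12):
    """Forward a packet for the shared prefix; policy maps node -> End SIDs."""
    policy = policy or {}
    node, segs, path = src, [], [src]
    while len(path) <= max_hops:
        if segs and segs[0] == node:
            segs.pop(0)                    # End: segment endpoint reached
        if not segs and node in ORIGINS:
            return path, True              # delivered to a route source
        if not segs and node in policy:
            segs = list(policy[node])      # steer toward a specific source
        links = NEW if node in converged else OLD
        roots = {segs[0]} if segs else ORIGINS
        nxt = next_hops(links, roots)[node]
        path.append(nxt)
        node = nxt
    return path, False                     # hop limit hit: packet looped

# R2 converged, R1 not: the packet loops between R1 and R2.
print(walk("R1", {"R2"}))
# R2 adds the End SID of R6 while T1 runs: R1 routes on R6's SID, not the prefix.
print(walk("R1", {"R2"}, policy={"R2": ["R6"]}))
```

The second call works because R1's route to R6 itself is not affected by the R2-R3 failure, so R1 forwards the steered packet to R4 even though its route for the shared prefix is stale.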

6. Anti-Microloop Scheme for Multi-point Scenarios

TBD

7. Conclusion

Loop prevention covers various scenarios, each with different implementation methods. The implementation methods proposed in this document, which are based on the SR microloop avoidance mechanism, can serve as a basis for subsequent research and development.

8. Security Considerations

The behavior described in this document is internal functionality of a router that results in the ability to explicitly steer traffic over the post-convergence path after a remote topology change in a manner that guarantees loop freeness. Because the behavior serves to minimize the disruption associated with a topology change, it can be seen as a modest security enhancement.

9. IANA Considerations

No requirements for IANA.

10. Acknowledgement

The authors would like to thank everyone who contributed to the draft.

11. Normative References

[I-D.bashandy-rtgwg-segment-routing-uloop]
Bashandy, A., Filsfils, C., Litkowski, S., Decraene, B., Francois, P., and P. Psenak, "Loop avoidance using Segment Routing", Work in Progress, Internet-Draft, draft-bashandy-rtgwg-segment-routing-uloop-16, <https://datatracker.ietf.org/doc/html/draft-bashandy-rtgwg-segment-routing-uloop-16>.
[I-D.ietf-rtgwg-segment-routing-ti-lfa]
Bashandy, A., Litkowski, S., Filsfils, C., Francois, P., Decraene, B., and D. Voyer, "Topology Independent Fast Reroute using Segment Routing", Work in Progress, Internet-Draft, draft-ietf-rtgwg-segment-routing-ti-lfa-14, <https://datatracker.ietf.org/doc/html/draft-ietf-rtgwg-segment-routing-ti-lfa-14>.
[I-D.ietf-spring-segment-protection-sr-te-paths]
Hegde, S., Bowers, C., Litkowski, S., Xu, X., and F. Xu, "Segment Protection for SR-TE Paths", Work in Progress, Internet-Draft, draft-ietf-spring-segment-protection-sr-te-paths-06, <https://datatracker.ietf.org/doc/html/draft-ietf-spring-segment-protection-sr-te-paths-06>.
[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://www.rfc-editor.org/info/rfc2119>.
[RFC8042]
Zhang, Z., Wang, L., and A. Lindem, "OSPF Two-Part Metric", RFC 8042, DOI 10.17487/RFC8042, <https://www.rfc-editor.org/info/rfc8042>.

Authors' Addresses

Lijie Deng
China Telecom
109, West Zhongshan Road, Tianhe District
Guangzhou
Guangdong, 510000
China
Yongqing Zhu
China Telecom
109, West Zhongshan Road, Tianhe District
Guangzhou
Guangdong, 510000
China
Xuesong Geng
Huawei Technologies
Huawei Building, No.156 Beiqing Rd
Beijing
Beijing, 100095
China
Zhibo Hu
Huawei Technologies
Huawei Building, No.156 Beiqing Rd
Beijing
Beijing, 100095
China