anima                                                       J. Zhao, Ed.
Internet-Draft                                             S. Zhang, Ed.
Intended status: Standards Track                            China Unicom
Expires: 4 September 2025                                   3 March 2025


                  Automatic Network Congestion Relief
            draft-zhao-anima-automatic-congestion-relief-00

Abstract

   This document introduces an automatic congestion relief mechanism
   based on intelligent traffic analysis and dynamic regulation.  In the
   event of congestion caused by fiber optic failures, it can respond
   intelligently and self-heal in real time, ensuring the stable
   operation of the network.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 4 September 2025.

Copyright Notice

   Copyright (c) 2025 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.




Zhao & Zhang            Expires 4 September 2025                [Page 1]

Internet-Draft     Automatic Network Congestion Relief        March 2025


Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
   2.  Automatic Network Congestion Relief . . . . . . . . . . . . .   2
     2.1.  Step 1: Traffic Modeling  . . . . . . . . . . . . . . . .   3
     2.2.  Step 2: Traffic Monitoring  . . . . . . . . . . . . . . .   3
       2.2.1.  BGP-LS Utilized Bandwidth TLV . . . . . . . . . . . .   4
     2.3.  Step 3: Intelligent Policy Generation . . . . . . . . . .   4
     2.4.  Step 4: Policy Propagation  . . . . . . . . . . . . . . .   4
     2.5.  Step 5: Policy Reversion  . . . . . . . . . . . . . . . .   5
   3.  Use Case  . . . . . . . . . . . . . . . . . . . . . . . . . .   5
     3.1.  Pre-deployment Conditions . . . . . . . . . . . . . . . .   5
     3.2.  Implementation of Automated Mechanism . . . . . . . . . .   5
   4.  Security Considerations . . . . . . . . . . . . . . . . . . .   7
   5.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .   7
   6.  Normative References  . . . . . . . . . . . . . . . . . . . .   7
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .   7

1.  Introduction

   Fiber optic failures are frequent and often lead to network
   congestion, which is a common pain point for operators.  Handling
   them today requires dedicated staff to inspect traffic daily and to
   adjust configurations manually, sometimes hourly, which
   significantly increases the difficulty of network maintenance.

   This draft introduces an automatic congestion relief mechanism based
   on intelligent traffic analysis and automatic regulation.  When a
   fiber optic failure causes congestion, the mechanism detects the
   congestion and starts a self-healing process in real time.  This
   relieves operators of the manual maintenance burden described above
   and keeps the network operating stably.

2.  Automatic Network Congestion Relief

   The congestion relief mechanism is automated by an intelligent
   module within the device and completes within seconds.  Using
   intelligent traffic analysis, the module calculates the volume of
   traffic that must be redistributed.  It then redirects this traffic
   to paired devices by means of inter-device protocol announcements
   and automatic adjustment of routing priorities.











   +-+-+-+-+-+-+-+-+-+-+       +-+-+-+-+-+-+-+-+-+-+-+     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Traffic Modeling  |------>| Traffic Monitoring  |---->| Intelligent policy generation |
   +-+-+-+-+-+-+-+-+-+-+       +-+-+-+-+-+-+-+-+-+-+-+     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                                                         |
                                                                         |
                               +-+-+-+-+-+-+-+-+-+-+-+     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                               |   Policy Reversion  |<----|     Policy Propagation      |
                               +-+-+-+-+-+-+-+-+-+-+-+     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


              Figure 1: Mechanism Framework Description

2.1.  Step 1: Traffic Modeling

   The forwarding chip of the device performs real-time traffic sorting
   using full-flow data and identifies the top N traffic flows.
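
   As an illustration only, the top N selection can be sketched as
   follows.  The function name top_n_flows, the use of a destination
   prefix as the flow key, and the sample byte counts are assumptions
   made for this example; this document does not define how the
   forwarding chip implements the sorting.

```python
import heapq
from collections import Counter


def top_n_flows(samples, n):
    """Return the n flows with the largest byte totals, largest first.

    samples is an iterable of (flow_key, byte_count) pairs. The flow key
    is hypothetical here: a destination prefix written as a string.
    """
    totals = Counter()
    for flow_key, byte_count in samples:
        totals[flow_key] += byte_count
    # nlargest avoids sorting the whole table when n is small.
    return heapq.nlargest(n, totals.items(), key=lambda item: item[1])


# Made-up samples using documentation prefixes.
samples = [
    ("203.0.113.0/24", 700),
    ("198.51.100.0/24", 300),
    ("203.0.113.0/24", 500),
    ("192.0.2.0/24", 900),
]
print(top_n_flows(samples, 2))
# [('203.0.113.0/24', 1200), ('192.0.2.0/24', 900)]
```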

   The intelligent component of the AI chip subscribes to the BGP
   RIB-out (outbound Routing Information Base) and applies flow
   recognition algorithms to build an AI-based traffic model.  The
   model takes historical traffic patterns and flow behavior into
   account and serves as the basis for subsequent traffic detection
   and regulation.

   The intelligent flow feature statistics cover multiple dimensions,
   arranged in a logical order from macroscopic to microscopic traffic
   characteristics, including flow rate, packet length, the proportion
   of TCP/UDP traffic, the proportion of fragmented packets, and the
   proportion of SYN packets.
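
   The statistics listed above could be computed per flow roughly as
   in the following sketch.  The Packet record, its field names, and
   the choice to compute proportions by packet count rather than by
   byte count are assumptions of this example, not requirements of
   this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical per-packet record used only for this sketch."""
    length: int               # packet length in bytes
    protocol: str             # "tcp", "udp", or another protocol name
    fragmented: bool = False  # True if the packet is an IP fragment
    syn: bool = False         # True if the TCP SYN flag is set


def flow_features(packets, interval_s):
    """Compute flow-level features over one observation interval."""
    count = len(packets)
    if count == 0 or interval_s <= 0:
        raise ValueError("need at least one packet and a positive interval")
    total_bytes = sum(p.length for p in packets)

    def share(predicate):
        # Proportion by packet count (an assumption of this sketch).
        return sum(1 for p in packets if predicate(p)) / count

    return {
        "rate_bps": total_bytes * 8 / interval_s,
        "mean_packet_len": total_bytes / count,
        "tcp_share": share(lambda p: p.protocol == "tcp"),
        "udp_share": share(lambda p: p.protocol == "udp"),
        "fragment_share": share(lambda p: p.fragmented),
        "syn_share": share(lambda p: p.syn),
    }
```

   The resulting dictionary is ordered from macroscopic features (rate,
   packet length) to microscopic ones (fragment and SYN proportions),
   matching the order used in the text.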

2.2.  Step 2: Traffic Monitoring

   Through an extension of the BGP-LS protocol, the device obtains the
   bandwidth of inter-domain links and their load changes in real time.

   When the utilized bandwidth of a link exceeds the configured
   congestion threshold, the device reports the condition promptly.

   Interface statistics are collected at intervals on the order of
   seconds.
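
   A minimal sketch of the threshold check is given below.  It assumes
   that utilization is derived from successive readings of an interface
   byte counter and that counter wrap-around can be ignored; the
   function names and the sample values are hypothetical.

```python
def utilization(prev_bytes, curr_bytes, interval_s, capacity_bps):
    """Return link utilization as a fraction of capacity.

    Derived from two readings of a byte counter taken interval_s
    seconds apart. Counter wrap-around is not handled in this sketch.
    """
    if interval_s <= 0 or capacity_bps <= 0:
        raise ValueError("interval and capacity must be positive")
    return (curr_bytes - prev_bytes) * 8 / interval_s / capacity_bps


def congested_intervals(readings, interval_s, capacity_bps, threshold):
    """Yield (index, utilization) for each interval above the threshold."""
    for i in range(1, len(readings)):
        u = utilization(readings[i - 1], readings[i], interval_s, capacity_bps)
        if u > threshold:
            yield i, u


# Made-up readings: a 1000 bit/s link sampled once per second.
readings = [0, 50, 150, 160]
print(list(congested_intervals(readings, 1, 1000, 0.7)))
# [(2, 0.8)]
```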












2.2.1.  BGP-LS Utilized Bandwidth TLV

   The device uses the Utilized Bandwidth to aggregate the bandwidth
   and bandwidth utilization rate of inter-domain BGP Egress Peer
   Engineering (EPE) links.  The BGP-LS Utilized Bandwidth TLV reuses
   the Maximum Link Bandwidth TLV (Type 1089) [RFC5305].  The format of
   the BGP-LS Utilized Bandwidth TLV is as follows.

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |               Type            |             Length            |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                     Utilized Bandwidth                        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure 2: BGP-LS Utilized Bandwidth TLV
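
   The layout in Figure 2 can be packed and parsed as sketched below.
   The figure shows a 16-bit Type, a 16-bit Length, and a 32-bit value.
   This sketch assumes that the value is an IEEE 754 single-precision
   number in bytes per second, as is the case for the Maximum Link
   Bandwidth encoding in [RFC5305]; this document does not state the
   unit explicitly.  The function names are hypothetical.

```python
import struct

TLV_TYPE = 1089                 # type code given in the text above
_HEADER = struct.Struct("!HH")  # Type and Length, network byte order
_VALUE = struct.Struct("!f")    # assumed: IEEE 754 single precision


def encode_utilized_bandwidth(bytes_per_second):
    """Pack the TLV: 2-byte Type, 2-byte Length, 4-byte value."""
    value = _VALUE.pack(bytes_per_second)
    return _HEADER.pack(TLV_TYPE, len(value)) + value


def decode_utilized_bandwidth(data):
    """Parse the TLV and return the value in bytes per second."""
    tlv_type, length = _HEADER.unpack_from(data, 0)
    if tlv_type != TLV_TYPE or length != _VALUE.size:
        raise ValueError("not a Utilized Bandwidth TLV")
    (value,) = _VALUE.unpack_from(data, _HEADER.size)
    return value


# 100 Gbit/s expressed in bytes per second (assumed unit).
tlv = encode_utilized_bandwidth(12.5e9)
print(len(tlv), tlv[:4].hex())
# 8 04410004
```

   Single precision cannot represent every rate exactly, so a decoded
   value may differ slightly from the value that was encoded.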

2.3.  Step 3: Intelligent Policy Generation

   When link utilization exceeds the predefined congestion threshold,
   the device launches the intelligent module.  The module identifies
   the traffic that needs to be adjusted and generates policies based
   on traffic analysis.

   Taking traffic characteristics such as flow rate and type into
   account allows the device to calculate the adjustment more
   accurately.  Because the same routing prefixes are advertised to a
   given network domain (RIB-out), the device only needs to find the
   top N routing prefixes to be adjusted and change their attributes so
   that the priority of the faulty plane is lowered.
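
   One possible way to choose the prefixes to adjust is sketched below.
   The largest-first greedy selection, the function name, and the
   sample rates are assumptions of this example; this document does not
   specify the selection algorithm used by the intelligent module.

```python
def select_prefixes(prefix_rates, excess):
    """Pick the largest prefixes until their combined rate covers excess.

    prefix_rates maps a prefix to its measured rate; excess is the
    volume of traffic to move, in the same unit. Returns the chosen
    prefixes (largest first) and the total rate they carry.
    """
    chosen, moved = [], 0
    ranked = sorted(prefix_rates.items(), key=lambda kv: kv[1], reverse=True)
    for prefix, rate in ranked:
        if moved >= excess:
            break
        chosen.append(prefix)
        moved += rate
    return chosen, moved


# Made-up per-prefix rates in Gbps, using documentation prefixes.
rates = {
    "192.0.2.0/24": 200,
    "198.51.100.0/24": 150,
    "203.0.113.0/24": 100,
    "203.0.113.128/25": 50,
}
print(select_prefixes(rates, 300))
# (['192.0.2.0/24', '198.51.100.0/24'], 350)
```

   A greedy choice keeps the number of modified prefixes small but may
   move more traffic than strictly required, as the example shows.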

2.4.  Step 4: Policy Propagation

   The device announces the adjusted routing priorities through BGP
   Routing Policy Distribution (RPD), which shifts the calculated
   volume of traffic to the lightly loaded plane.  The end-to-end
   process completes within seconds and alleviates the congestion on
   the original plane.  Neighbors and paths can be controlled precisely
   without affecting the routing policies that already exist in the
   network.










2.5.  Step 5: Policy Reversion

   After the interrupted link recovers, the optimization policies will
   be gradually withdrawn.  With the revocation of the policies, the
   network traffic will progressively return to the load-sharing state
   before the failure.
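
   Gradual withdrawal could proceed in batches, as in the following
   sketch.  The batch size, the newest-first order, and the
   link_is_healthy callback are assumptions of this example; this
   document only states that the policies are withdrawn gradually.

```python
def withdraw_gradually(policies, batch_size, link_is_healthy):
    """Withdraw policies in batches while the recovered link stays healthy.

    policies is ordered oldest first; the newest are withdrawn first.
    link_is_healthy is a hypothetical callback that is checked before
    each batch. Returns (withdrawn, remaining).
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    remaining = list(policies)
    withdrawn = []
    while remaining and link_is_healthy():
        batch = remaining[-batch_size:]
        remaining = remaining[:-batch_size]
        withdrawn.extend(reversed(batch))
    return withdrawn, remaining


print(withdraw_gradually(["p1", "p2", "p3"], 2, lambda: True))
# (['p3', 'p2', 'p1'], [])
```

   Checking link health before each batch lets the device stop the
   reversion if the recovered link degrades again.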

3.  Use Case

    ----------------------------------------
   |  --------                   --------   |
   | |   C2   |    Network2     |   C1   |  |
   |  --------                   --------   |
    ------|--------------------------|------
          |                          |
          |                          x
          | 100 Gbps x 8             x 100 Gbps x 8
          |                          |
    ------|--------------------------|------
   |  --------                   --------   |
   | |  CR2   |    Network1     |  CR1   |  |
   |  --------                   --------   |
    ----------------------------------------

               Figure 3: Intelligent Decision-Making Use Case

3.1.  Pre-deployment Conditions

   The following conditions are established before deployment:

   *  Topology mapping between CR1 and CR2 in Network 1, and between
      CR1/CR2 and C1/C2 in Network 2, so that the connection
      relationships are clear.

   *  BGP-LS peering between CR1 and CR2, to exchange topology and
      bandwidth information.

   *  BGP EPE functionality via EBGP peering between CR1 and C1, and
      between CR2 and C2, to obtain link state and bandwidth-related
      information and to generate BGP-LS LINK routes.

   *  The BGP RPD neighbor function on CR1 and CR2, to receive
      optimized routing policies.

3.2.  Implementation of Automated Mechanism

   *  Step 1: AI Traffic Modeling

      CR1 and CR2 automatically model the top N traffic on their links
      in real time using built-in algorithms, so the traffic conditions
      of each link are known accurately without human intervention.
      The system observes that the total bandwidth of the traffic
      channels between CR1-C1 and CR2-C2 is 8 x 100 Gbps, and the
      current traffic on both paths is 600 Gbps.





   *  Step 2: Traffic Monitoring

      If CR1 detects that five of the eight links between CR1 and C1
      have failed, leaving only three links (300 Gbps of capacity), the
      system automatically determines that congestion will occur on the
      CR1-C1 link.

   *  Step 3: Intelligent Policy Generation

      As the primary adjustment device, CR1 uses its intelligent module
      to generate optimization policies automatically from the
      established top N traffic model and the prefix information
      collected in real time.  No human intervention is required, which
      greatly shortens the time from failure discovery to policy
      formulation and enables a timely response to network emergencies.

   *  Step 4: Policy Propagation

      After CR1 generates the optimization policies, it automatically
      propagates them to C1.  Upon receiving the policies, C1
      automatically guides the remote routers to adjust their routing
      paths.

      The system identifies the two services with the highest priority
      and automatically diverts them from the CR1-C1 link to the CR2-C2
      link (through C2) to alleviate the congestion on the CR1-C1 link.
      After the policy adjustment, the system observes that the traffic
      on the CR1-C1 link has dropped to 300 Gbps and the traffic on the
      CR2-C2 link has risen to 800 Gbps.  Automated policy propagation
      and traffic diversion relieve the link congestion quickly and
      improve the utilization of network resources.

   *  Step 5: Policy Reversion

      When the failed links between CR1 and C1 are restored, the system
      automatically detects the link status change and gradually
      withdraws the relevant optimization policies.  As the policies
      are withdrawn, the network traffic gradually returns to the
      load-sharing state that existed before the fault.

      This automated mechanism lets the network return to normal
      operation quickly after the fault is cleared, reducing the cost of
      human intervention and improving the self-healing ability of the
      network.
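
   The arithmetic behind Steps 1 to 4 can be checked with the short
   sketch below.  It assumes that the 600 figure applies to the CR1-C1
   path, that all rates are in Gbps, and that the congestion threshold
   is 100% of the remaining capacity; the threshold value and the
   function name are assumptions, since this document does not state
   them.

```python
LINK_CAPACITY = 100   # capacity of one member link (Gbps, assumed unit)
LINKS_TOTAL = 8       # member links between CR1 and C1
LINKS_FAILED = 5      # member links lost in the failure
TRAFFIC = 600         # traffic on the CR1-C1 path before the failure


def excess_traffic(traffic, capacity, threshold=1.0):
    """Traffic above threshold * capacity; zero if there is no excess."""
    return max(0.0, traffic - capacity * threshold)


capacity_before = LINK_CAPACITY * LINKS_TOTAL
capacity_after = LINK_CAPACITY * (LINKS_TOTAL - LINKS_FAILED)

print(TRAFFIC / capacity_before)                 # 0.75
print(TRAFFIC / capacity_after)                  # 2.0
print(excess_traffic(TRAFFIC, capacity_after))   # 300.0
```

   Under these assumptions the path is at 75% utilization before the
   failure and at 200% after it, so 300 Gbps must be diverted for the
   CR1-C1 link to carry the 300 Gbps reported in Step 4.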






4.  Security Considerations

   TBD.

5.  IANA Considerations

   TBD.

6.  Normative References

   [RFC5305]  Li, T. and H. Smit, "IS-IS Extensions for Traffic
              Engineering", RFC 5305, DOI 10.17487/RFC5305, October
              2008, <https://www.rfc-editor.org/info/rfc5305>.

Authors' Addresses

   Jing Zhao (editor)
   China Unicom
   Beijing
   China
   Email: zhaoj501@chinaunicom.cn


   Shuai Zhang (editor)
   China Unicom
   Beijing
   China
   Email: zhangs633@chinaunicom.cn






















