This document specifies a scalable approach to name-based packet forwarding, a fundamental component of content semantic network architectures. Unlike IP-based forwarding, name-based forwarding must address the challenges of handling variable-length, unbounded keys and significantly larger namespaces. The proposed approach leverages two key insights to enable scalable forwarding for billions of prefixes. First, by representing names as binary strings, forwarding tables with millions of entries can be effectively compressed using Patricia tries to fit within contemporary line card memory. Second, the data structure is designed and optimized to ensure its size is dependent only on the number of rules, not the length of the rules themselves.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 31 July 2025.¶
Copyright (c) 2025 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
Name-based packet forwarding requires handling variable-length names, performing longest-prefix match (LPM) over a global namespace, and meeting high-throughput requirements within strict memory constraints.¶
Traditional LPM methods for IP forwarding have proven remarkably successful. Core innovations from the late 1990s, refined with techniques such as prefix expansion and bitmap representations, have enabled scalable, cost-effective solutions for IP LPM. With fixed-length IP addresses and relatively small rulesets (fewer than 500,000 entries), these implementations achieve high performance while requiring only a few megabytes of fast memory, such as SRAM or TCAM. For IP forwarding, LPM is widely considered a solved problem.¶
However, these methods are not directly applicable to content semantic name-based forwarding. Name-based forwarding faces unique challenges:¶
Complex Name Structures: Content semantic names are more flexible and complex than IP addresses. Both hierarchical, URL-like names and flat, self-certifying names may be used for content semantic packet forwarding. A scalable forwarding solution must accommodate diverse naming schemes without being tied to a specific format.¶
Large Forwarding Tables: The forwarding information base (FIB) for name-based forwarding is orders of magnitude larger than that of IP. For example, the DNS namespace suggests FIBs at the scale of hundreds of millions of entries, with potential growth into billions as adoption of content semantic packet forwarding increases.¶
High Throughput: With the deployment of 100 Gbps Ethernet, name-based forwarding must sustain line-rate performance. Semantic packet forwarding requires efficient handling of large-scale rulesets at high speeds.¶
Existing solutions often assume hierarchical naming structures and rely on hash tables for FIB storage. These approaches are limited by their dependency on specific name formats and the inefficiencies of redundant data storage. Furthermore, their performance degrades with longer names, posing scalability challenges for large namespaces.¶
This document specifies a scalable approach to name-based forwarding that addresses these challenges using a binary Patricia trie, first proposed in [Song2015]. The design is guided by two key insights:¶
This approach is intended to work across diverse namespaces, whether flat or hierarchical, without making assumptions about their specific characteristics.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
Name: A variable-length string that uniquely identifies a network packet and is directly used for network packet forwarding.¶
Proper Prefix: A string p is a proper (strict) prefix of a name n if n = p + s, where s is a non-null bit string (see the sketch after this list).¶
Ki/k: Ki denotes kibi, a multiplier based on a power of 1024, while k denotes kilo, based on a power of 1000.¶
Mi/M: Mi denotes mebi, a multiplier based on a power of 1024, while M denotes mega, based on a power of 1000.¶
Gi/G: Gi denotes gibi, a multiplier based on a power of 1024, while G denotes giga, based on a power of 1000.¶
Scalable packet forwarding has two fundamental requirements: (1) fast FIB lookup and (2) a small memory footprint. These two requirements are closely related.¶
The main contributor to FIB lookup time is memory access time, which depends on the type of memory used to store the FIB. TCAM and SRAM are commonly used in a line card for fast FIB lookup. However, a name-based FIB requires hundreds of MiB to store even a few million name prefixes, which cannot fit in TCAM or SRAM limited to a few MiB.¶
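As a purely illustrative calculation (the numbers are assumptions, not measurements): storing 4 million name prefixes at an average of 64 bytes per entry for the name string, nexthop, and pointer overhead already requires about 4,000,000 * 64 bytes, roughly 244 MiB, which is about two orders of magnitude more than a typical few-MiB on-chip SRAM budget.¶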
The design goal of this draft is to have a compact FIB data structure:¶
FIBs with a few million entries can still fit in SRAM for fast lookup.¶
The memory footprint of the FIB is scalable in that it depends only on the number of entries, not on the length of names.¶
In this draft, a binary Patricia trie data structure is used to minimize the redundant information stored, compressing FIBs of a few million entries to fit in tens of MiB of SRAM. Together with a corresponding forwarding behavior termed "speculative forwarding", only the differentiating bits between rules are stored, allowing the design to scale to billions of FIB entries.¶
A Patricia trie [PatriciaTrie], also known as a radix tree, is a space-optimized trie structure. It is generated from a character-based prefix trie by merging every single-child node with its child. The internal nodes in a Patricia trie branch on discrimination characters, while leaf nodes contain the match information, i.e., the prefix.¶
Binary Patricia is a commonly used variant, which treats prefixes as binary strings in building the trie. A binary Patricia is a full binary tree in which every non-leaf node has exactly two children.¶
Figure 1 shows a character-based Patricia and a binary Patricia built from the same FIB. In a character-based Patricia, each internal node may have at most 256 children, branched by octet characters. From the root to a leaf, all tokens stored in the nodes and characters along the arrows together form a complete prefix. The delimiter ('/') is treated as a character as well.¶
The ASCII code is used to represent prefixes in building a binary Patricia; other coding schemes can be used as well. Each node has at most two children, branched on a bit of 0 or 1. All bits in the nodes and along the arrows from the root to a leaf together form a complete prefix. A prefix is divided into a sequence of tokens. In this document, this binary Patricia is also called a "tokenized binary Patricia". In particular, the branching bits 0 and 1 at an internal node are part of the sequence.¶
(str)_2: a binary representation of str.¶
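As an illustration of the (str)_2 notation, the following minimal sketch converts an ASCII name prefix into the bit string used to build a tokenized binary Patricia. The helper name and sample prefixes are illustrative only; any other consistent encoding could be substituted.¶

   def name_to_bits(name: str) -> str:
       # Return (name)_2: the name's ASCII bytes as a bit string.
       return ''.join(format(b, '08b') for b in name.encode('ascii'))

   # '/a' and '/ab' map to bit strings in which the former is a proper
   # prefix of the latter, so the prefix relation is preserved.
   print(name_to_bits('/a'))   # 0010111101100001
   print(name_to_bits('/ab'))  # 001011110110000101100010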
A binary Patricia trie is constructed by sequentially inserting FIB entries into an initially empty tree. The key insertion algorithm is detailed in Figure 6. The algorithm works as follows:¶
A common prefix is determined by comparing the token stored at the root node with the key (k).¶
The process terminates in one of the following three cases:¶
If the common prefix equals the token but not (k), an additional iteration is required. In this case, the remaining sequence of (k) is used to continue the search at the lower-level nodes of the trie (Figure 5). For clarity, certain implementation details, such as labeling bit positions and creating new nodes, are omitted here but can be found in [PatriciaTrie] [ComputerProg].¶
The structure of a binary Patricia trie is independent of the order in which keys are inserted. That is, for any given set of keys, the resulting trie is unique.¶
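The following is a minimal Python sketch of the insertion procedure described above, assuming keys are bit strings (for example, produced by a conversion such as the one sketched earlier). The Node class and its field names are illustrative and not defined by this draft; branching bits are consumed along the arrows, consistent with the tokenized binary Patricia description.¶

   class Node:
       def __init__(self, token, nexthop=None):
           self.token = token      # bit string stored at this node
           self.nexthop = nexthop  # non-None marks a stored prefix
           self.children = {}      # branch bit '0' or '1' -> child Node

   def common_prefix_len(a, b):
       # Length of the longest common prefix of two bit strings.
       n = 0
       for x, y in zip(a, b):
           if x != y:
               break
           n += 1
       return n

   def insert(root, key, nexthop):
       # Insert a bit-string key into the trie rooted at 'root'.
       node = root
       while True:
           l = common_prefix_len(node.token, key)
           if l == len(node.token) == len(key):
               node.nexthop = nexthop        # key already present: update
               return
           if l < len(node.token):
               # Split: the stored token diverges from the key after l bits.
               old = Node(node.token[l + 1:], node.nexthop)
               old.children = node.children
               node.children = {node.token[l]: old}
               node.nexthop = None
               if l == len(key):
                   node.nexthop = nexthop    # the key ends at the split point
               else:
                   node.children[key[l]] = Node(key[l + 1:], nexthop)
               node.token = node.token[:l]
               return
           # l == len(node.token) < len(key): descend on the next key bit.
           bit, rest = key[l], key[l + 1:]
           if bit in node.children:
               node, key = node.children[bit], rest
           else:
               node.children[bit] = Node(rest, nexthop)
               return

Starting from an empty trie (root = Node("")), inserting the same set of keys in any order yields the same structure, matching the uniqueness property noted above.¶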
Updating a key involves:¶
In speculative forwarding, the binary Patricia trie removes tokens, leaving only the differentiating bits between prefixes. This reduces memory usage, making it independent of name length and therefore more scalable. However, this approach shifts the lookup behavior from longest-prefix match (LPM) to longest-prefix classification (LPC). From the perspective of matching, LPM consists of two steps: longest-prefix classification, which filters candidate rules, and verification, which confirms the correct one.¶
These aspects are discussed in the following sections.¶
Forwarding behaviors in name-based systems involve matching packet names against prefixes. Conventional forwarding uses longest-prefix match (LPM), in which classification identifies the longest matching prefix and a verification step then confirms its correctness. Speculative forwarding relays packets based on longest-prefix classification (LPC) alone. LPC lookup guarantees that if there is a match, the packet is forwarded to the same nexthop that LPM would use; if there is no match, the packet is still forwarded.¶
Specifically, in speculative forwarding, the verification step is removed. While speculative forwarding ensures accurate forwarding for packets with known prefixes, it may incorrectly forward packets with unknown prefixes that would be dropped under traditional LPM.¶
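To make the distinction concrete, the following is a minimal sketch contrasting the two behaviors. It assumes a hypothetical classify() routine that returns the candidate rule selected by longest-prefix classification (or None when no candidate exists) and that each rule records its full prefix and nexthop; these names are illustrative, not part of this specification.¶

   def lpm_forward(fib, name):
       # Conventional forwarding: classification followed by verification.
       rule = fib.classify(name)              # LPC: pick a candidate rule
       if rule is not None and name.startswith(rule.prefix):
           return rule.nexthop                # verified longest-prefix match
       return None                            # unknown prefix: drop

   def speculative_forward(fib, name):
       # Speculative forwarding: classification only, no verification.
       rule = fib.classify(name)              # assumed to always yield a candidate
       return rule.nexthop                    # forward even if unverified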
The speculative data plane uses speculative forwarding in the default-free zone (DFZ) and conventional forwarding in networks with default routes. DFZ routers handle all global name prefixes and require high-speed forwarding, benefiting from speculative forwarding's smaller routing tables and faster lookups. Edge routers, which manage fewer prefixes due to default routes, use conventional forwarding as they can store tokens and terminate packets misforwarded by speculative forwarding. This hybrid approach combines speculative forwarding in core networks and conventional forwarding in edge networks to create an efficient global data plane.¶
Dual Binary Patricia (DuBP) supports LPC in a speculative data plane by dividing the FIB (S) into two subsets:¶
Flat Subset (F): Contains flat name prefixes where no prefix is a proper prefix of another. Uses speculative Patricia for efficient lookups.¶
Covered Subset (P): Contains prefixes covered by those in the flat subset. Uses tokenized Patricia for verification. This structure combines speculative forwarding for scalability with conventional forwarding for accuracy, optimizing memory usage and lookup speed.¶
Thus, S=F+P. As its name implies, dual Patricia is a data structure that consists of two different styles of Patricia: a speculative binary Patricia (sBP) for subset F and a tokenized binary Patricia (tBP) for subset P.¶
Speculative Binary Patricia (sBP) is a binary Patricia variant designed for speculative forwarding. It eliminates token storage in both internal and leaf nodes. Instead, each internal node stores a discrimination bit position to determine the branching direction based on the bit value at that position in the lookup name. For example, if the root node is labeled p:15, it checks the 15th bit of the lookup name and branches left if the bit is 0 or right if it is 1. The structure only uses discrimination bits for branching and does not store or verify other prefix information.¶
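The following is a minimal sketch of an sBP lookup, assuming each internal node stores only a discrimination bit position and two child pointers, and each leaf stores a nexthop. The class and field names are illustrative, and bit positions are 0-based Python indices.¶

   class SBPNode:
       # Speculative binary Patricia node; no tokens are stored.
       def __init__(self, bit_pos=None, left=None, right=None, nexthop=None):
           self.bit_pos = bit_pos  # discrimination bit position (internal node)
           self.left = left        # child followed when the bit is 0
           self.right = right      # child followed when the bit is 1
           self.nexthop = nexthop  # forwarding result (leaf node)

   def sbp_lookup(root, name_bits):
       # Classify a bit-string name; the walk always ends at a leaf,
       # whose nexthop is returned without verifying the full prefix.
       node = root
       while node.bit_pos is not None:
           # Treat bits beyond the name's length as 0 (an assumption of
           # this sketch; the draft does not specify this case here).
           bit = name_bits[node.bit_pos] if node.bit_pos < len(name_bits) else '0'
           node = node.left if bit == '0' else node.right
       return node.nexthop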
Dual Binary Patricia (DuBP) consists of two parts: a tokenized Patricia for the proper prefix subset (P) to perform longest-prefix match (LPM) and a speculative Patricia for the flat name subset (F) to perform longest-prefix classification (LPC). For name lookup, the packet name is first processed by the tokenized Patricia. If no match is found, it is sent to the speculative Patricia, which always outputs a port. This design supports speculative forwarding while maintaining classification for flat names. If multiple ports are matched, the forwarding strategy determines the correct port.¶
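The overall lookup flow can be sketched as follows, assuming a hypothetical tbp_lookup() that performs a verified longest-prefix match over subset P (returning None on a miss) and the sbp_lookup() classification routine sketched above; the function names are illustrative.¶

   def dubp_lookup(tbp_root, sbp_root, name_bits):
       # Name lookup over a Dual Binary Patricia (DuBP).
       nexthop = tbp_lookup(tbp_root, name_bits)  # verified LPM over subset P
       if nexthop is not None:
           return nexthop
       return sbp_lookup(sbp_root, name_bits)     # LPC over subset F: always a port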
No IANA action is required.¶
As the network size grows, the scalability of name-based packet forwarding systems becomes increasingly critical. The FIB, which stores all the routing information, may grow to billions of entries, making it difficult to manage and look up packet names efficiently. Without proper solutions, the forwarding process can slow down, and the system may not be able to scale effectively to handle the growing number of packets and names.¶
To address scalability concerns, advanced data structures like Patricia tries or hash tables can be used to compress the FIB, improving both memory usage and lookup efficiency. Additionally, distributed systems can share the load of forwarding decisions across multiple nodes, allowing for parallel processing of large numbers of forwarding entries. Incremental updates to the FIB can also help, enabling the system to adjust as network conditions change without requiring complete recalculations or reconfigurations.¶
Despite these solutions, there are significant trade-offs. Patricia tries can still cause memory overhead when the FIB grows very large, as even compressed data structures require substantial resources to store billions of entries. Distributed systems introduce complexities like synchronization across nodes, which can increase latency and decrease system reliability. Furthermore, incremental updates can complicate real-time routing decisions, as they might cause delays in adapting to rapid network changes or lead to inconsistencies in the forwarding table.¶
DoS attacks represent a significant security threat to name-based packet forwarding systems. Attackers can overwhelm the system by sending a high volume of requests or injecting malformed packets, which may exhaust system resources and disrupt normal operations.¶
To mitigate DoS attacks, packet filtering and rate limiting can be implemented. These measures can prevent malicious packets from consuming excessive resources by limiting the frequency of requests and filtering out illegitimate packets. Additionally, distributed load balancing can help distribute the traffic load across multiple nodes, preventing a single node from becoming overwhelmed.¶
While packet filtering and rate limiting are effective, they may introduce additional latency, particularly under high traffic conditions. Distributed load balancing requires a more complex infrastructure and management, which can increase the cost and introduce potential points of failure.¶