Hi!

I have been selected to do a routing directorate “early” review of this draft.
https://datatracker.ietf.org/doc/draft-ietf-bess-bgp-sdwan-usage/

The routing directorate will, on request from the working group chair, perform an “early” review of a draft before it is submitted for publication to the IESG. The early review can be performed at any time during the draft’s lifetime as a working group document. The purpose of the early review depends on the stage that the document has reached.

As this document has recently been through a working group last call, my focus for the review was to determine whether it is ready for publication. Please consider my comments along with the other working group last call comments.

For more information about the Routing Directorate, please see https://wiki.ietf.org/en/group/rtg/RtgDir.

Document: draft-ietf-bess-bgp-sdwan-usage-28
Reviewer: Alvaro Retana
Review Date: December 4, 2025
Intended Status: Informational

Summary:

I have some concerns about this document that should be resolved before it is submitted to the IESG.

Comments:

The document presents an overview of how BGP can be used in a large-scale SD-WAN network.

The draft is structured so that the scenarios are initially described (Section 3), then general provisioning is covered (Section 4), how BGP-controlled SD-WAN works for the scenarios in Section 3 (Section 5), and finally, the forwarding models for those scenarios are discussed (Section 6). This structure results in some repetition and disjointness -- the readability would improve if all aspects of each scenario were explained in a single section.

Given the draft's focus on the control plane, I believe Section 6 (Forwarding Model) is out of place. The information in it can further inform the expected behavior, but it could also be moved to an appendix or eliminated. Note also that the Manageability and Security Considerations sections focus only on the control plane.

This document already went through WGLC, so I defer to the Chairs/Shepherd on what should be included and how to structure the text.

I have included in-line comments below. Among the ones I tagged as "major", I want to highlight the following:

(1) Expectation about the use and placement of the RR/Controller and the BGP sessions.

In general, the document assumes that the RR and the SD-WAN Controller are the same. However, this assumption doesn't account for multiple RRs or different planes. The assumption should be clearly and explicitly articulated. §3.1.5 also seems to mention the possibility of having BGP sessions between the SD-WAN Edges.

(2) IPsec Tunnel Encapsulation

The draft repeatedly references RFC9012, but no IPsec Tunnel Encapsulation is specified there. The correct reference should be draft-ietf-idr-sdwan-edge-discovery.

(3) Focus of the Manageability and Security Considerations sections

Both sections cover only a small part of what the draft covers. Explicitly, none of the building technologies are mentioned (even by reference). The Security Considerations section should not just cover expectations, but also the risks if those expectations are not followed. For example, BGP doesn't have a mandatory "secure communication channel"; what are the risks if the expectations are not met in a deployment?

See more details below. This review ends with "[EoR -28]".

Thank you!

Alvaro.

[Line numbers from idnits.]

...

98     1. Introduction

...

127    This document outlines SD-WAN use cases and the complexities of
128    managing large-scale SD-WAN overlay networks, as described in
129    [Net2Cloud-Problem]. It demonstrates how a BGP-based control plane
130    can efficiently manage these networks with minimal manual
131    intervention; additional operational drivers for standardized
132    protocol behavior are summarized in Section 6 of [MPLIFY-119].

[minor] "as described in [Net2Cloud-Problem]"

It is not clear to me what this sentence says about what is described in [Net2Cloud-Problem]. If [Net2Cloud-Problem] describes use-cases and/or "the complexities...", what is this draft for?

[major] There's no reference entry for [MPLIFY-119].

134    It's important to distinguish the BGP instance as the control
135    plane for SD-WAN overlay from the BGP instances governing the
136    underlay networks. The document assumes a secure communication
137    channel between the SD-WAN controller and SD-WAN edges for
138    exchanging control plane information.

[minor] "The document assumes a secure communication channel between the SD-WAN controller and SD-WAN edges for exchanging control plane information."

What is a "secure communication channel"? §3.1.5 answers this question:

   An SD-WAN edge must use a secure channel, such as TLS (RFC5246) [RFC8446] or IPsec, to its designated RR for exchanging BGP UPDATE messages.

But it calls it "secure channel". §5.1 uses "secure management channel", and §8 goes back to "secure communication channel". Please be consistent.

140    The need for an RFC documenting SD-WAN use cases lies in ensuring
141    standardization and interoperability. While BGP and IPsec are
142    well-established technologies, their application to SD-WAN
143    introduces challenges such as scalability, traffic segmentation,
144    and multi-homing. This document consolidates best practices and
145    defines guidelines to enable consistent implementations across
146    diverse networks, optimizing existing protocols for SD-WAN
147    scenarios rather than proposing new ones.

[nit] s/The need for an RFC documenting/The need for documenting

[major] "standardization and interoperability...consolidates best practices and defines guidelines to enable consistent implementations..."

This document is tagged as Informational (which seems the right intended status to me), but the statements in this paragraph point at it being much more.

IMO, this justification paragraph is not necessary. If justification is needed for publication, the Shepherd write-up is a better place to put it.

149    2. Conventions used in this document

...

154    Controller: Used interchangeably with SD-WAN controller to manage
155                SD-WAN overlay networks in this document. In the
156                context of BGP-controlled SD-WAN, the SD-WAN
157                controller functions as or is integrated with the BGP
158                Route Reflector (RR).

[nit] s/as or is/are

[major] In many places the text assumes that the RR is the controller. Is that always the case? Is it a requirement? What happens in cases where multiple RRs exist, are all of them controllers? Note that, for example, §5.1 opens the possibility of the RR and the controller not being the same: "When the BGP RR is integrated with the SD-WAN controller..."; which implies that the functionality may not be integrated. Please clarify the expectation.

...

163    Client route: A BGP-advertised route originated by an SDWAN edge
164                that represents the reachability of a client-facing
165                service (e.g., IP prefix or VLAN) and includes
166                associated path attributes used by the SDWAN-
167                Controller for policy enforcement and forwarding
168                decisions.

[nit] s/SDWAN/SD-WAN/g

[minor] s/client-facing service/client service/g

Be consistent.

...

180    MP-NLRI:    In this document, the term "MP-NLRI" serves as a
181                concise reference for "MP_REACH_NLRI".

[minor] Even if defined here, please don't make up new terminology. In this case, "MP-NLRI" shows up only one time in the text.

...

198    SD-WAN IPsec SA: IPsec Security Association between two WAN ports
199                of the SD-WAN edges or between two SD-WAN edges.

[minor] Add a reference.

...

229    3.1. SD-WAN Functional Overview and Requirements

[] More than requirements, this section is a description of the operation.

231    3.1.1. Supporting SD-WAN Segmentation

...

239    This document assumes that SD-WAN VPN configuration on PE devices
240    will, as with MPLS VPN [RFC4364] [RFC4659], make use of VRFs
241    [RFC4364] [RFC4659]. Notably, a single SD-WAN VPN can be mapped to
242    one or multiple virtual topologies governed by the SD-WAN
243    controller's policies.

[nit] s/MPLS VPN [RFC4364] [RFC4659], make use of VRFs [RFC4364] [RFC4659]./MPLS VPN, make use of VRFs [RFC4364] [RFC4659].

...

250    As SD-WAN is an overlay network arching over multiple types of
251    networks, MPLS L2VPN[RFC4761] [RFC4762]/L3VPN[RFC4364] [RFC4659]
252    or pure L2 underlay can continue using the VPN ID (Virtual Private
253    Network Identifier), VN-ID (Virtual Network Identifier), or VLAN
254    (Virtual LAN) in the data plane to differentiate packets belonging
255    to different SD-WAN VPNs. For packets transported through an IPsec
256    tunnel, additional encapsulation, such as GRE [RFC2784] or VxLAN

[nit] s/L2VPN[RFC4761] [RFC4762]/L3VPN[RFC4364]/L2VPN [RFC4761] [RFC4762]/L3VPN [RFC4364]

...

261    3.1.2. Client Service Requirement

...

270    In [MEF 70.1], the "SD-WAN client interface" is called SD-WAN UNI
271    (User Network Interface). Section 11 of [MEF 70.1] defines a
272    comprehensive set of attributes for the SD-WAN UNI, detailing the
273    expected behavior and requirements to enable seamless connectivity
274    to the client network.

[major] MEF 70.2 is used elsewhere, are the definitions in MEF 70.1 different? IOW, do you need both references?

...

279    3.1.3. SD-WAN Traffic Segmentation

[] What's the difference between the segmentation in this section and §3.1.1? Both sections talk about the same thing, only the level of the examples is different. Consider merging them.

...

293    In the figure below, traffic from the PoS system follows a tree
294    topology (denoted as "----" in the figure below), whereas other
295    traffic can follow a multipoint-to-multipoint topology (denoted as
296    "===").

[] Assuming that the topology below is conceptual, it looks like the "link" between the "payment gateway" and the "multi-point connection" is not needed.

298                       +--------+
299      Payment traffic  |Payment |
300         +------+----+-+gateway +------+----+-----+
301        /      /     | +----+---+ |    \       \
302       /      /      |      |     |     \       \
303    +-+--+  +-+--+  +-+--+  |   +-+--+  +-+--+  +-+--+
304    |Site|  |Site|  |Site|  |   |Site|  |Site|  |Site|
305    | 1  |  | 2  |  | 3  |  |   |4   |  | 5  |  | 6  |
306    +--+-+  +--+-+  +--|-+  |   +--|-+  +--|-+  +--|-+
307       |       |       |    |      |       |       |
308     ==+=======+=======+====+======+=======+=======+===
309      Figure 1 multi-point connection for non-payment traffic

[minor] Find a more descriptive name for this figure.

...

318    3.1.4. Zero Touch Provisioning

...

329    - The SD-WAN edge's customer information and unique device
330      identifier (e.g., serial number, MAC address, or factory-
331      assigned ID) are registered with the SD-WAN Central Controller.

[minor] Is the "SD-WAN Central Controller" different from an "SD-WAN Controller"? I ask because this is the only place this new term is used.

333    - Upon power-up, the SD-WAN edge can establish the transport
334      layer secure connection [BCP195] to its controller, whose URL
335      (or IP address) and credential for connection request can be
336      preconfigured on the edge device by the manufacture, external
337      USB drive or secure Email given to the installer. The external
338      USB method involves providing the installer with a pre-
339      configured USB flash drive containing the necessary
340      configuration files and settings for the SD-WAN device. The
341      secure Email approach entails sending a secure email containing
342      the configuration details for the SD-WAN device.

[minor] I'm confused about the reference to BCP195. By "transport layer secure connection", do you mean a TLS connection? BCP195 points at general TLS-related best practices and doesn't define the protocol itself. If you meant TLS, I wonder why not use RFC8446.

[nit] s/by the manufacture/by the manufacturer

344    - The SD-WAN Controller authenticates the ZTP request from the
345      remote SD-WAN edge with its configurations. Once the
346      authentication is successful, it can designate a local network
347      controller near the SD-WAN edge to pass down the initial
348      configurations via the secure channel. The local network
349      controller manages and monitors the communication policies for
350      traffic to/from the edge node.

[minor] "local network controller"

Here's another type of controller...which is not the central one mentioned above. What is the relationship with the SD-WAN Controller?

352    3.1.5. Constrained Propagation of SD-WAN Edge Properties

354    For an SD-WAN edge to establish an IPsec tunnel to another edge
355    and exchange the attached client routes, both edges need to know
356    each other's network properties, such as the IP addresses of the
357    WAN ports, the edges' loopback addresses, the attached client
358    routes, the supported encryption methods, etc.

360    In many cases, an SD-WAN edge is authorized to communicate with
361    only a subset of other edge nodes. To maintain security and
362    privacy, the property of an SD-WAN edge must not be propagated to
363    unauthorized peers. However, when a remote SD-WAN edge powers up,
364    it may lack the policies to determine which peers are authorized
365    to communicate. Therefore, SD-WAN deployment needs to have a
366    central point to distribute the properties of an SD-WAN edge to
367    its authorized peers.

369    BGP is well suited for this purpose. A Route-Reflector (RR)
370    [RFC4456], integrated into the SD-WAN controller, enforces
371    policies governing the communication among SD-WAN edges. The RR
372    ensures that BGP UPDATE messages from an SD-WAN edge are
373    propagated only to other edges within the same SD-WAN VPN.

[major] The first paragraph talks about "an SD-WAN edge to establish an IPsec tunnel to another edge and exchange the attached client routes", which sounds to me like establishing a BGP session. But this last paragraph says that the RR "ensures that BGP UPDATE messages from an SD-WAN edge are propagated only to other edges within the same SD-WAN VPN".

Are direct BGP peerings between SD-WAN Edges established, or is the communication only through the RR/controller?

375    An SD-WAN edge must use a secure channel, such as TLS (RFC5246)
376    [RFC8446] or IPsec, to its designated RR for exchanging BGP UPDATE
377    messages.

[major] RFC5246 was obsoleted by RFC8446. Do you need both references?

[major] Add a reference for IPsec.

...

394    3.2. Scenario #1: Homogeneous Encrypted SD-WAN

...

402    - A small branch office connecting to its headquarters via the
403      Internet. All traffic to and from this small branch office must be
404      encrypted, usually achieved by IPsec Tunnels [RFC6071].

[major] RFC6071 is an IPsec document roadmap, not the appropriate reference to be used here.

[minor] This paragraph uses "IPsec Tunnels", but the next couple use "IPsec SAs"...and elsewhere only "IPsec". Please be consistent.

...

518    3.4. Scenario #3: Private VPN PE based SD-WAN

...

540                    +======>|PE2|
541                 //         +---+
542              //              ^
543           //                || VPN
544        // VPN                v
545    |PE1| <====> |RR|  <=>  |PE3|
546    +-+-+        +--+       +-+-+
547      |                       |
548      +--- Public Internet -- +
549               Offload
550      Figure 5: Additional Internet paths added to the VPN

[nit] The "tops" of the routers in the figure are missing. The same happens in other figures.

...

567    4.1. Client Service Provisioning Model

569    Provisioning of client-facing services in an SD-WAN network can
570    leverage approaches similar to those used for VRFs (Virtual
571    Routing and Forwarding) in MPLS based VPNs [RFC4364][RFC4659]. A
572    client VPN can define communication policies by specifying BGP
573    Route Targets for import and export.
Alternatively, policy-based
574    filtering using ACLs (Access Control List) can be employed to
575    control which routes are allowed or denied for a given client VPN.

[nit] s/MPLS based VPNs [RFC4364][RFC4659]/MPLS based VPNs [RFC4364] [RFC4659]

...

597    4.3. IPsec Related Parameters Provisioning

...

606    In a BGP-controlled SD-WAN, BGP UPDATE messages can be extended to
607    propagate IPsec-related attributes for each SD-WAN edge. This
608    approach allows peers to receive and apply compatible
609    cryptographic parameters distributed over a secure channel between
610    the SDWAN edge and its BGP RR, thereby simplifying IPsec tunnel
611    establishment and reducing reliance on traditional IKEv2
612    negotiation [RFC7296].

[minor] "BGP UPDATE messages can be extended"

Include an Informative reference to draft-ietf-idr-sdwan-edge-discovery.

...

620    5.1. Rational for Using BGP as Control Plane for SD-WAN

[minor] s/Rational/Rationale

[] The rest of the draft discusses how BGP is used...I don't think a justification is needed anymore.

...

634    - Simplified peer authentication process:

636      With a secure management channel established between each edge
637      node and its RR, the RR can perform peer authentication on
638      behalf of the edge node. The RR has policies on peer
639      communication and the built-in capability to constrain the
640      propagation of the BGP UPDATE messages to the authorized edge
641      nodes only.

[major] "With a secure management channel established between each edge node and its RR, the RR can perform peer authentication on behalf of the edge node."

This question is related to the peering model question above (§3.1.5). I read the sentence as saying that (somehow) the RR is able to authenticate an edge node on behalf of another edge node. What does that mean? The use of "peer authentication" leads me to believe that the edge nodes will peer with each other (??). Is the "peer communication" at the control plane level or in the dataplane?

643    - Scalable IPsec tunnel management

645      In networks with multiple IPsec tunnels between SD-WAN edges,
646      BGP simplifies tunnel management by using the Tunnel
647      Encapsulation Attribute specified in [RFC9012] to carry
648      information that associates advertised client routes with
649      specific tunnels.

[major] RFC9012 doesn't specify an IPsec tunnel encapsulation.

651      Unlike traditional IPsec VPN where IPsec tunnels between two
652      edge nodes are treated as independent parallel links requiring
653      duplicated control plane messages for load sharing.

[] This sentence seems orphaned...unlike what?

655    - Simplified traffic selection configurations

657      BGP can simplify the configuration of IPsec tunnel associations
658      and related forwarding policies. By leveraging Route Targets to
659      identify SD-WAN VPN membership, administrators can apply
660      import/export policies that control the distribution of client
661      routes. These route attributes, in turn, inform the local
662      configuration of IPsec traffic selectors at each SDWAN edge.

[] This point sounds like tunnel management to me. Maybe merge with the last point...

...

678    5.2. BGP Scenario for Homogeneous Encrypted SD-WAN

...

686    For example, in the figure below, the BGP UPDATE message from C-
687    PE2 to RR can have the client routes encoded in the MP-NLRI Path
688    Attribute and the IPsec Tunnel associated parameters encoded in
689    the Tunnel Encapsulation Attribute [RFC9012].

[major] RFC9012 doesn't specify an IPsec tunnel encapsulation.

...

717    5.3. BGP Scenario for Differential Encrypted SD-WAN

...

726    - Update 1: Client Route Advertisement for advertising the
727      prefixes of client services attached to the client facing
728      interfaces. The Color (Section 8 of [RFC9012]) is used to
729      associate each client service with the corresponding WAN ports
730      for the desired underlay paths.

[] "The Color..." what?

...

820    6.1.1. Network and Service Startup Procedures

...

827    For example, in the full mesh scenario in Figure 2 of Section 3.2,
828    where client CN2 is attached to C-PE1, C-PE3, and C-PE4, six uni-
829    directional IPsec SAs must be established: C-PE1 <-> C-PE3; C-PE1
830    <-> C-PE4; C-PE3 <-> C-PE4.

[minor] s/Figure 2/Figure 3

...

887    6.2. Forwarding Model for Hybrid Underlay SD-WAN

889    In this scenario, as shown in Figure 3 of Section 3.3, traffic
890    forwarded over the trusted VPN paths can be native (i.e.,
891    unencrypted). The traffic forwarded over untrusted networks need
892    to be protected by IPsec SA.

[minor] s/Figure 3/Figure 4

894    6.2.1. Network and Service Startup Procedures

896    Infrastructure setup: The proper MPLS infrastructure must be
897    configured among the edge nodes, i.e., the C-PE1/C-PE2/C-PE3/C-PE4
898    of Figure 3. The IPsec SA between wAN ports or nodes must be set
899    up as well. IPsec SA related attributes on edge nodes can be
900    distributed by BGP UPDATE messages as described in Section 5.

[nit] s/wAN/WAN

...

906    6.2.2. Packet Walk-Through

...

921    For a c-PE with multiple WAN ports provided by different NSPs,
922    separate IPsec SAs can be established for the WAN ports. In this
923    case, the C-PE have multiple IPsec tunnels in addition to the
924    MPLS path to choose from to forward the packets from the client
925    facing interfaces.

[nit] s/c-PE/C-PE

...

957    For multicast traffic, MPLS multicast [RFC6513, RFC6514, or
958    RFC7988] can be utilized to forward multicast traffic across the
959    network.

[minor] s/[RFC6513, RFC6514, or RFC7988]/[RFC6513], [RFC6514], or [RFC7988]

...

1022   7. Manageability Considerations

1024   A BGP-controlled SD-WAN uses RR to propagate client routes and
1025   underlay tunnel properties among authorized SD-WAN edges. Since
1026   the RR is configured with policies that identify authorized peers,
1027   the peer-wise IPsec IKE (Internet Key Exchange) authentication
1028   process is significantly simplified.

[major] This section only considers a small part of what was covered in the rest of the document.

1030   8. Security Considerations

[major] Should the security considerations for all the technology mentioned in the draft be inherited? At least BGP and IPsec...

1032   In a BGP-controlled SD-WAN network, secure operation replies in
1033   part on the correct configuration and behavior of the RR, which
1034   acts as the central distribution point for BGP routing
1035   information. RR applies preconfigured routing policies to control
1036   the propagation of BGP UPDATE messages to authorized SD-WAN edges,
1037   help minimizing the risk of unintended route exposure or
1038   unauthorized communication.

[nit] s/replies in part/relies in part

1040   The security model for the SD-WAN described in this document is
1041   based on the following principles:

1043   1) Centralized Control: The RR governs all routing and policy
1044      decisions. This centralized architecture simplifies security
1045      management compared to distributed models, as it limits the
1046      potential attack surface to a smaller, more controlled set of
1047      components.

[major] True. What are the risks associated with misconfiguration?

1048   2) Secure Communication Channels: All communication between SD-WAN
1049      edges and the RR must occur over a secure channel, such as TLS
1050      or IPsec, to ensure the confidentiality and integrity of BGP
1051      UPDATE messages.

[major] What happens if the secure communication channel is not used? The propagation of BGP UPDATEs is not gated by the transport mechanism. A peering session could be configured without the required secure communication channel. What are the associated risks? Is the expectation that the RR, or the edge nodes (or both), will not proceed with the BGP session unless a secure communication channel is used?

1052   3) Policy Enforcement: The RR is responsible for enforcing policies
1053      that restrict the propagation of edge node properties and
1054      routing updates to only authorized peers. This prevents
1055      sensitive information from being exposed to unauthorized nodes.

[major] What are the risks associated with misconfiguration?

1057   4) Mitigation of Internet-Facing Risks: In scenarios where SD-WAN
1058      edges include Internet-facing WAN ports, additional measures
1059      must be taken to mitigate security risks:
1060      - Anti-DDoS mechanisms must be enabled to protect against
1061        potential attacks on Internet-facing ports.

[major] Is there an example of an "Anti-DDoS mechanism" you can point to? What are the risks of not using one?

1062      - The control plane must avoid learning routes from Internet-
1063        facing WAN ports to prevent unauthorized traffic from being
1064        injected into the SD-WAN.

[major] What are the risks associated with misconfiguration?

...

1084   10. References

[major] IMO, only the reference to MEF70.2 (where the concept of SD-WAN is introduced) should be Normative, the rest can be Informative.

[EoR -28]