mirror of https://github.com/openbsd/src.git synced 2025-01-10 06:47:55 -08:00

306 Commits

Author SHA1 Message Date
claudio
6c19f566ed When generating UPDATE handle the message size limit better.
First of all warn that a prefix was dropped. In the update generation
code handle possible overflows of attributes and NLRI and withdraw the
affected prefix. This way the peer will not have stale data.
OK tb@
2024-09-25 14:46:51 +00:00
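The overflow handling described in 6c19f566ed can be pictured with a small sketch. This is not the bgpd code: `struct msgbuf`, `nlri_add()` and the use of a fixed 4096-byte buffer (the RFC 4271 maximum message size) are assumptions made for illustration. The point is only that the size check happens before anything is written, so the caller can fall back to withdrawing the prefix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PKTSIZE	4096	/* RFC 4271 maximum BGP message size */

/* Hypothetical output buffer for one UPDATE message. */
struct msgbuf {
	uint8_t	data[MAX_PKTSIZE];
	size_t	len;		/* bytes used so far */
};

/*
 * Append one IPv4 NLRI (length byte plus the significant address bytes).
 * Returns 0 on success and -1 if the NLRI does not fit. On failure the
 * buffer is left untouched so the caller can withdraw the prefix instead.
 */
static int
nlri_add(struct msgbuf *m, const uint8_t addr[4], uint8_t plen)
{
	size_t nbytes = (plen + 7) / 8;

	assert(m != NULL && addr != NULL && plen <= 32);
	if (1 + nbytes > sizeof(m->data) - m->len)
		return -1;
	m->data[m->len++] = plen;
	memcpy(m->data + m->len, addr, nbytes);
	m->len += nbytes;
	return 0;
}
```

A caller would treat a -1 return as the signal to log the dropped prefix and send a withdraw for it.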
claudio
c4328fc634 Introduce peer_is_up() and use it instead of peer->state == PEER_UP checks.
Also enqueue update and rrfresh imsgs only if the peer is up and flush them
once this is no longer the case.
OK tb@
2024-08-28 13:21:39 +00:00
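A minimal sketch of the pattern from c4328fc634, assuming a much simplified peer structure. Only `peer_is_up()` and `PEER_UP` are named in the commit message; the other states, the `queued` counter, `peer_enqueue()` and `peer_down()` are hypothetical stand-ins for the real imsg queue handling.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state set; only PEER_UP appears in the commit message. */
enum peer_state { PEER_NONE, PEER_DOWN, PEER_UP };

struct rde_peer {
	enum peer_state	state;
	unsigned int	queued;	/* pending update/route refresh messages */
};

/* Single place that decides whether a peer counts as up. */
static int
peer_is_up(const struct rde_peer *peer)
{
	return peer->state == PEER_UP;
}

/* Queue a message only for peers that are up; returns 1 if queued. */
static int
peer_enqueue(struct rde_peer *peer)
{
	assert(peer != NULL);
	if (!peer_is_up(peer))
		return 0;
	peer->queued++;
	return 1;
}

/* Once the peer is no longer up, drop whatever is still queued. */
static void
peer_down(struct rde_peer *peer)
{
	assert(peer != NULL);
	peer->state = PEER_DOWN;
	peer->queued = 0;
}
```

Wrapping the state comparison in one helper means a later change to what "up" means only has to be made in one function.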
claudio
89ee02f7f3 Introduce 'rde rib Loc-RIB include filtered' a feature that includes
filtered prefixes in the Loc-RIB

This includes filtered prefixes into the Loc-RIB but they are marked
ineligible so nothing will select them but it is possible to show them
in bgpctl. So 'bgpctl show rib filtered' will return all prefixes filtered
out by the input filters.

OK tb@
2024-08-14 19:09:51 +00:00
claudio
1ebe6b99eb Remove nexthop_compare() prototype.
OK henning@ sthen@
2024-05-29 10:36:32 +00:00
claudio
bcd6516ba4 Convert bgpid, remote_bgpid and clusterid to host byte order.
Before the RDE used host byte order for remote_bgpid but all the other
code used network byte order. The reason for that was that bgpid was
initially an IPv4 address but since RFC 6286 in 2011 this is much more
relaxed and so it makes more sense to just treat them as numbers and
so host byte order.

OK tb@
2024-05-22 08:41:14 +00:00
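To illustrate the byte order change in bcd6516ba4: a BGP identifier arrives on the wire as four bytes in network byte order, and converting it once with ntohl() lets the rest of the code compare identifiers as plain numbers. The helper names `bgpid_from_wire()` and `bgpid_cmp()` are hypothetical and not taken from bgpd.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Read a 4-byte BGP identifier in wire format (network byte order)
 * and return it in host byte order.
 */
static uint32_t
bgpid_from_wire(const uint8_t *buf)
{
	uint32_t id;

	assert(buf != NULL);
	memcpy(&id, buf, sizeof(id));	/* buf may be unaligned */
	return ntohl(id);
}

/* Compare two host byte order identifiers: returns -1, 0 or 1. */
static int
bgpid_cmp(uint32_t a, uint32_t b)
{
	if (a < b)
		return -1;
	return a > b;
}
```

Comparing the raw network byte order values on a little-endian machine would order identifiers by their last byte first, which is why the conversion has to happen before any numeric comparison.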
jsg
85ba9220c9 remove prototypes with no matching function
2024-05-19 03:31:05 +00:00
jsg
088a2cd995 remove prototypes with no matching function; ok claudio@
2024-05-18 11:17:30 +00:00
claudio
b3b1d93975 Convert the community parsers to the new ibuf api.
This converts community_add(), community_large_add() and community_ext_add()
and as a result removes some hacks from rde_attr_add() and rde_attr_parse().
OK tb@
2024-01-24 14:51:11 +00:00
claudio
5c4d2233d5 Start converting the message parser to use the new ibuf api.
Rewrite rde_update_dispatch() to use ibufs. Because of this
rde_update_err(), rde_get_mp_nexthop(), nlri_get_prefix() and
friends are switched to use ibufs. For rde_attr_parse() a minimal
change was done for now.

OK tb@
2024-01-23 16:13:35 +00:00
claudio
cf5008fd39 Improve IPv6 link-local address handling
When a session is established determine the possible interface scope of that
session. The scope is only set when the remote address is directly connected.
This interface scope is passed to the RDE that uses this information when
link-local nexthops are received. Again checking that a link-local nexthop
is actually acceptable.

OK tb@
2023-10-16 10:25:45 +00:00
claudio
c0c9c1699a Remove per-AFI ASPA handling in bgpd internals
With draft-ietf-sidrops-aspa-profile-16 and
draft-ietf-sidrops-aspa-verification-15 the AFI dependence of ASPA
records was dropped. So remove this complication from the code.

This only removes the AFI handling internally in bgpd but still allows
the old syntax in aspa-set tables. The optional address family is just
ignored and records are merged together.

For RTR sessions draft-ietf-sidrops-8210bis has not yet been updated so
right now we still handle RTR sessions as specified there. The IPv4 and
IPv6 ASPA entries are handled in two trees and merged together into one
AFI independent tree. This is the best we can do for now until IETF
updates draft-ietf-sidrops-8210bis.

OK tb@ job@
2023-08-16 08:26:35 +00:00
claudio
66b1afa0a0 Update OpenBGPD to use new ibuf API.
This replaces the old way of using a static buffer and a len to build
UPDATEs with a pure ibuf solution. The result is much cleaner and a lot
of almost duplicate code can be removed because often a version for ibufs
and one for this static buffer was implemented (e.g. for mrt or bgpctl).
With and OK tb@
2023-07-12 14:45:42 +00:00
claudio
65117b4ced Use attr_writebuf() instead of hand rolling a more complicated version
for IMSG_CTL_SHOW_RIB_ATTR. Also drop the attr_optlen() usage in
imsg_create() since it is not strictly needed. With this attr_optlen
follows the path of the dodo.
OK tb@
2023-06-12 12:48:07 +00:00
claudio
dd2a9ed2eb Implement a way to announce flowspec rules without hitting Adj-RIB-In
and Loc-RIB. Flowspec objects are collected in a single flowrib RIB
and then directly distributed into the various Adj-RIB-Outs.
For this to work add a bypass in the filter logic (flowspec AFI/SAFI
are currently accepted without any rule). The filter language lacks
a way to allow prefixes based on AFI/SAFI which is the minimum needed.
OK tb@
2023-04-19 13:23:33 +00:00
claudio
689ec28392 Extend the pt_entry api to handle flowspec.
Introduce pt_get_flow() and pt_add_flow() to lookup and insert flowspec
objects. Add pt_getflowspec() which works somewhat similar to pt_getaddr()
to extract the flowspec NLRI from a pt_entry.
Make pt_getaddr() return the destination prefix of the flowspec rule and
handle flowspec in pt_write().
OK tb@
2023-04-19 07:09:47 +00:00
claudio
5bb860535d Pass a pt_entry pointer to rib_get() and rib_add().
Add rib_get_addr() to behave like rib_get() did before.
OK tb@
2023-04-07 13:49:03 +00:00
claudio
064aed4897 Put the size of the pt_entry object into the struct itself.
Increase the refcnt to a 32bit int and while there reorder the vpn
specific structs a bit so the IPv4 and IPv6 types are more equal.

OK tb@
2023-03-30 12:11:18 +00:00
claudio
1e6dccedb9 Switch prefix_adjout_get and new prefix_adjout_first to use a pt_entry
as argument instead of the bgpd_addr + prefixlen.

Do the same with prefix_adjout_update but leave prefix_adjout_lookup
and prefix_adjout_match since those are used by bgpctl code that does
not use pt_entry structs.

With this most of the update code no longer needs struct bgpd_addr and
pt_getaddr().
OK tb@
2023-03-29 10:46:11 +00:00
claudio
de422abdef Instead of extracting the prefix into a bgpd_addr and passing that to
prefix_write() rename prefix_write() to pt_write() and pass a pt_entry to
the function. Removes an extra conversion step.
OK tb@
2023-03-28 15:17:34 +00:00
claudio
448d73c9c4 More pt_entry cleanup, move structure definitions to rde_prefix.c and
by that make them private. Remove no longer used AID_PTSIZE define.
OK tb@
2023-03-28 13:30:31 +00:00
claudio
f337fe2fe7 Add F_CTL_LEAKED and F_CTL_INELIGIBLE flags for bgpctl to show leaked
and ineligible paths.
While there rename F_PREF_OTC_LOOP to F_PREF_OTC_LEAK since this indicates
that a route leak was detected.
OK tb@
2023-03-13 16:52:41 +00:00
claudio
b900620c33 Compile the output filter rules into per peer filter rules.
Especially on route-servers the output filters are in the hot path so
reducing the number of rules to check has a big impact. I have seen a
25% to 30% speedup in my big IXP testbench.
The output ruleset is applied and copied for each peer during config reload
and when a peer is initially added.
OK tb@
2023-03-10 07:57:15 +00:00
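The idea behind b900620c33 can be sketched as follows: instead of checking every rule of the global output ruleset for every prefix, the rules that can apply to a peer are copied once into a per-peer list. The structure layout, `PEERID_ANY` and `rules_compile()` are assumptions for this sketch; the real filter rules match on far more than a peer id.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PEERID_ANY	0	/* hypothetical: rule applies to every peer */

struct filter_rule {
	uint32_t peerid;	/* peer the rule matches, or PEERID_ANY */
	int	 action;	/* opaque action code for this sketch */
};

/*
 * Copy the rules that can match the given peer into out, keeping their
 * order. At most outcap rules are copied. Returns the number of rules
 * in the per-peer ruleset.
 */
static size_t
rules_compile(const struct filter_rule *all, size_t nall, uint32_t peerid,
    struct filter_rule *out, size_t outcap)
{
	size_t i, n = 0;

	assert(all != NULL && out != NULL);
	for (i = 0; i < nall && n < outcap; i++) {
		if (all[i].peerid == PEERID_ANY || all[i].peerid == peerid)
			out[n++] = all[i];
	}
	return n;
}
```

The compile step runs once per peer on config reload or when a peer is added, so the per-prefix hot path only walks the shorter list.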
claudio
372bb3aab5 Major rework of RFC9234 support. My initial interpretation of the RFC was
too conservative. Fixes and changes include:

- add role output to bgpctl, also adjust the capability output.
  Note, this changes the JSON output of neighbors a bit.
- adjust the config parser to enable the RFC9234 role capability when
  there is a role set. iBGP and sessions with no role will not announce
  the role capability.
- adjust the role capability announcement to be only on sessions that
  use either AFI IPv4 or IPv6 and SAFI 1 (AID_INET, AID_INET6).
- if there is an OPEN notification indicating that the role capability
  is bad only disable the capability if it is not enforced.
- Adjust capability negotiation, store remote_role on the peer since
  the neighbor's role is no longer needed by the RDE.
- inject the OTC attribute on ingress only for AID_INET and AID_INET6.
  For other AIDs clear the F_ATTR_OTC_LOOP flag.
- Adjust the role logic in the RDE and use the peer->role (local role of
  the system) for all checks. Also remove the check if the role capability
  was negotiated between peers.
- In prefix_eligible() check also if the F_ATTR_OTC_LOOP flag is set.
  The RFC requires that such prefixes be considered ineligible (and not
  handled as treat-as-withdraw, as was done before).
- When generating an UPDATE include the OTC attribute unless the AID is
  neither AID_INET nor AID_INET6.

Fixes https://github.com/openbgpd-portable/openbgpd-portable/issues/51
Reported by Pier Carlo Chiodi
OK tb@
2023-03-09 13:12:19 +00:00
claudio
988ba0ba4c Pass struct rib_entry to rde_generate_updates() instead of struct rib.
With this the newbest and oldbest arguments can go since the information
is part of the rib_entry. Especially the prefix in the rib_entry is
always valid so simplify some code in various functions below to use
this information.
OK tb@
2023-02-13 18:07:53 +00:00
claudio
82625ff8f2 Instead of relaying struct peer from the SE to the RDE to fill out 10
stat numbers, just send the peerid and have the RDE respond with the
stats. The control code will then merge these counters into the real
peer struct and send that to bgpctl. This reduces the number of bytes
sent around a fair bit.
OK tb@
2023-02-09 13:43:23 +00:00
claudio
f8fade753e Implement ASPA validation and reload logic on ASPA set changes.
For this use the validation state (vstate) in struct prefix and
struct filterstate to store both the ASPA and ROA validity.
Introduce helper functions to set and get the various states for
struct prefix and make sure struct filterstate is also setup properly.
Change the ASPA state in rde_aspath to be AFI/AID and role independent
by storing all 4 possible outcomes. Also add an ASPA generation count
which is used to update the rde_aspath ASPA state cache on reloads.
Rework the rde_aspa.c code to be AFI/AID and role independent. Doing
this for roles is trivial but AFI switch goes deep and is so unnecessary.
The reload is combined with the ROA reload logic and renamed to RPKI
softreload.

OK tb@
2023-01-24 11:28:41 +00:00
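One way to picture the combined validation state from f8fade753e is a single byte that carries both results, with small set and get helpers. The bit layout and all names below (`vstate_pack()`, `vstate_roa()`, `vstate_aspa()`, the mask and shift values) are assumptions chosen for illustration and may not match the actual encoding in bgpd.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: bits 0-1 hold the ROA state, bits 2-3 the ASPA state. */
#define VSTATE_ROA_MASK		0x03u
#define VSTATE_ASPA_SHIFT	2
#define VSTATE_ASPA_MASK	0x03u

/* Combine a ROA and an ASPA validation state into one byte. */
static uint8_t
vstate_pack(uint8_t roa, uint8_t aspa)
{
	assert(roa <= VSTATE_ROA_MASK && aspa <= VSTATE_ASPA_MASK);
	return (uint8_t)(roa | (aspa << VSTATE_ASPA_SHIFT));
}

/* Extract the ROA validation state. */
static uint8_t
vstate_roa(uint8_t vstate)
{
	return vstate & VSTATE_ROA_MASK;
}

/* Extract the ASPA validation state. */
static uint8_t
vstate_aspa(uint8_t vstate)
{
	return (vstate >> VSTATE_ASPA_SHIFT) & VSTATE_ASPA_MASK;
}
```

Keeping the accessors as the only code that knows the layout is what allows the same field to be reused in both struct prefix and struct filterstate.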
claudio
c85bce7bf6 Use the vstate of the filterstate struct instead of passing an extra copy
to the various prefix update functions.
While there fix a filterstate leak in up_generate_updates().
With and OK tb@
2023-01-18 17:40:17 +00:00
claudio
d7e935310d Add the needed logic to load the ASPA table from the rtr process into the
RDE. The actual reload logic is missing to keep the diff small.
OK tb@
2023-01-17 16:09:01 +00:00
claudio
977f29ed75 Split rde_filterstate_prep() into three functions.
- rde_filterstate_init(): initialize a filterstate to default values
- rde_filterstate_copy(): copy from a filterstate into a new state object
- rde_filterstate_prep(): set filterstate based on prefix passed as argument.

This makes the code a bit easier to read.
OK tb@
2023-01-12 17:35:51 +00:00
claudio
245e5d076e Add the validation state to the filterstate struct.
Removes vstate argument from rde_filter().
Rename prefix_vstate() to prefix_roa_vstate().
OK tb@
2023-01-11 17:10:25 +00:00
claudio
28d6604741 Add ASPA validation functions to the RDE.
This implements ASPA validation based on the current draft. Implementing
this showed various weaknesses in the current ASPA draft which I hope to
fix in the near future.

Unlike the algorithm specified in the draft our version validates the
AS_PATH attribute in a single pass doing one or two lookups depending on
the session's BGP role.

The code is not yet hooked up into the RDE (see the NOTYET blocks).
Missing are reload logic, bgpctl integration and the loading of the
merged ASPA set from the rtr process.

OK tb@
2023-01-11 13:53:17 +00:00
jmc
3a50f0a93a spelling fixes; from paul tagliamonte
any parts of his diff not taken are noted on tech
2022-12-28 21:30:15 +00:00
claudio
3fad667eb0 Move some basic accessors of aspath to rde.h and make them static inline.
OK tb@
2022-12-14 12:37:15 +00:00
claudio
910ddab463 Implement a special update generator for add-path send all.
The generic add-path code up_generate_addpath() reevaluates everything
since this is the simplest way to select the announced paths. For add-path
all this is overkill since there is no dependency between prefixes and so
individual prefixes can be handled more efficiently.

Extend rde_generate_updates() to pass the current newbest and oldbest
prefixes (for the selected best path) but now also include newpath and
oldpath (which is the prefix that is added/removed/modified).
If newpath or oldpath is set then a single prefix was altered and
up_generate_addpath_all() can just remove or add this prefix.
If newpath and oldpath are NULL then the full list based on newbest
needs to be inserted and any old path/prefix removed in the process.

This improves update generation performance on big route collectors using
add-path all substantially.

OK tb@
2022-09-23 15:49:20 +00:00
claudio
6c499f2559 Adjust pathid_assign() to be much faster in the common case.
Use a per peer path_id_tx to assign to paths received from non add-path
enabled peers. This skips two extra walks of the RIB prefix list and is
a big speed-up when there are many regular sessions. If the session uses
add-path recv then the old way of assigning random path_ids needs to be
used.

With input and OK tb@
2022-09-21 10:39:17 +00:00
claudio
76c3be877c Introduce tree walkers that only walk a subtree of the RIB.
In some cases only a "small" part of the RIB needs to be looked at. For example,
bgpctl show rib 10/8 or-longer only needs to traverse nodes under
10/8; all other RIB entries do not matter. By setting the start node to
RB_NFIND(10/8), all nodes below this point can be skipped.
Using prefix_compare() while walking the tree with RB_NEXT(), the walker
knows when it steps outside of the 10/8 subtree and stops.
With this the or-longer commands become a lot faster.

Looks good to tb@
2022-09-12 10:03:17 +00:00
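The subtree walk from 76c3be877c can be sketched without the RB tree macros by using a sorted array: a lower-bound search plays the role of RB_NFIND(), and stepping to the next index plays the role of RB_NEXT(). Everything here (`struct pfx`, `pfx_cmp()`, `pfx_within()`, `pfx_nfind()`, `pfx_walk_subtree()`) is a hypothetical stand-in, limited to canonical IPv4 prefixes in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct pfx {
	uint32_t addr;	/* IPv4 address, host byte order, host bits zero */
	uint8_t	 len;	/* prefix length, 0..32 */
};

/* Order by address first, then by prefix length (shorter first). */
static int
pfx_cmp(const struct pfx *a, const struct pfx *b)
{
	if (a->addr != b->addr)
		return a->addr < b->addr ? -1 : 1;
	if (a->len != b->len)
		return a->len < b->len ? -1 : 1;
	return 0;
}

/* Return 1 if p is equal to or more specific than root. */
static int
pfx_within(const struct pfx *p, const struct pfx *root)
{
	uint32_t mask;

	mask = root->len == 0 ? 0 : ~(uint32_t)0 << (32 - root->len);
	return p->len >= root->len && (p->addr & mask) == (root->addr & mask);
}

/* Index of the first entry >= key: the array analogue of RB_NFIND(). */
static size_t
pfx_nfind(const struct pfx *tbl, size_t n, const struct pfx *key)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (pfx_cmp(&tbl[mid], key) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/* Count the entries under root, touching only that part of the table. */
static size_t
pfx_walk_subtree(const struct pfx *tbl, size_t n, const struct pfx *root)
{
	size_t i, count = 0;

	assert(tbl != NULL && root != NULL);
	for (i = pfx_nfind(tbl, n, root); i < n; i++) {
		if (!pfx_within(&tbl[i], root))
			break;	/* stepped outside of the subtree */
		count++;
	}
	return count;
}
```

Because more specific prefixes sort directly after their covering prefix, the walk can stop at the first entry that falls outside the root instead of scanning the rest of the table.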
claudio
dcbb452232 Switch the rde_peer hashtable and peer list to a single RB tree.
Only the RDE used a hashtable for lookups while the session engine
switched from a list to RB tree some time ago.
Use peer_foreach() in the mrt code instead of passing the peer list
as an argument.
OK benno@ tb@
2022-09-01 13:23:24 +00:00
claudio
4ac780ca57 This code no longer needs siphash.h and also cleanup some leftover
prototypes and members that were not removed in the previous RB tree
conversions.
OK benno@ tb@
2022-09-01 13:19:11 +00:00
claudio
1f826a12ad Switch the generic attribute cache to an RB tree.
OK benno@ tb@
2022-08-31 14:29:36 +00:00
claudio
c03dc58fa1 Switch nexthop hash to a RB tree.
OK benno@
2022-08-30 18:50:21 +00:00
claudio
3487a0407f Instead of a global aspath cache copy the aspath attribute per rde_aspath
struct. It uses a bit more memory but improves performance a lot on really
big systems because aspath_get() becomes a very hot function.
OK tb@
2022-08-29 18:18:55 +00:00
claudio
23f5897daa Switch the DB of communities collections to a RB tree instead of an
undersized hash table.
OK tb@
2022-08-29 16:44:47 +00:00
claudio
b1dd0a100a Switch rde_aspath to a RB tree instead of a hash table.
OK tb@
2022-08-29 16:43:07 +00:00
claudio
ddbc7ef40e Handle IMSG_SESSION_* messages immediately when received and do not put
them on the per peer imsg queue. This is mainly for IMSG_SESSION_DOWN.
Delaying the session down can race against IMSG_SESSION_ADD which is
handled immediately and as a result an established connection may be
removed in the RDE because of it.
The various graceful restart imsgs need similar treatment for similar
reasons. In the end when a session is reset/closed the RDE needs to
stop all work and flush the per peer imsg queue.
With this only update and route refresh messages are handled via the
imsg queue.
OK tb@
2022-08-26 14:10:52 +00:00
claudio
9391284c7c Add comment that NEXTHOP_FLAPPED is only set on oldstate of a nexthop.
2022-08-03 08:56:23 +00:00
deraadt
57baab2a53 whitespace found during a read-thru; ok claudio
2022-07-28 13:11:48 +00:00
claudio
13d31ce9d6 Properly handle nexthop state changes in the decision process
In rev 1.90 of rde_decide.c the re->active cache of the best prefix was
replaced with a call to prefix_best(). This introduced a bug because the
nexthop state at that time may have changed already. As a result when
a nexthop became unreachable prefix_evaluate() had oldbest = NULL and
newbest = NULL and did not withdraw the prefix from FIB and Adj-RIB-Out.

To fix this store the nexthop state per prefix and introduce
prefix_evaluate_nexthop() which removes the prefix from the decision list,
updates the nexthop state of the prefix and reinserts the prefix. Doing
this ensures that prefix_best() always reports the same result once the
decision process is done. prefix_best() and prefix_eligible() only depend
on data stored on the prefix itself.

OK tb@
2022-07-25 16:37:55 +00:00
claudio
5014683f69 Implement send side of RFC7911 ADD-PATH
This allows sending out more than one path per prefix to a neighbor that
supports add-path receive. OpenBGPD supports a few different modes to
select which paths to send:
  - all:	send all valid paths (the ones with a * in bgpctl output)
  - best:	send out only the single best path
  - ecmp:	send out paths that evaluate the same up to and including
                the nexthop metric
  - as-wide-best: send out paths that evaluate the same up to but not including
		  the nexthop metric
Currently ecmp and as-wide-best are the same. On top of this best, ecmp
and as-wide-best allow to include extra paths (e.g. best plus 2) and
for the multipath modes there is also a maximum (e.g. ecmp plus 2 max 4)

OK tb@
2022-07-11 17:08:21 +00:00
claudio
4efb02ee6a Pass path_id_tx to the Adj-RIB-Out
Adjust prefix_adjout_update() to properly handle path_id_tx.
Move the lookup of the prefix out of prefix_adjout_update() and to
up_generate_updates(). While that code uses prefix_adjout_lookup() to
find the current prefix in the Adj-RIB-Out, an add-path aware function
will use prefix_adjout_get().

In up_generate_default() just use 0 for path_id_tx since for this peer
that is the only prefix installed into the Adj-RIB-Out.

OK tb@
2022-07-08 10:01:52 +00:00
claudio
ca69ff0dd9 Assign a local path_id to all prefixes
For add-path a unique path_id needs to be assigned to all prefixes.
Use a random number since the RFC explicitly mentions that the value
carries no meaning. The local path_id is inherited by all
the RIBs. Adj-RIB-Out handling is not yet done.
OK tb@
2022-07-08 08:11:25 +00:00