Age | Commit message (Collapse) | Author |
|
This queue information will be available through the kernel socket layer
once we move over to Netlink socket as transports, so we might as well get
rid of the redundancy.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
|
|
Userspace used to use the n_flows information here to decide how much
memory needed to be allocated to list flows, but that isn't necessary any
longer now that listing flows uses an iterator abstraction. The
cur_capacity and max_capacity members are just curiosities and don't
provide much information; if the implementation ever changes away from
the current hash table implementation then they could become meaningless
anyhow.
But more than anything, these aren't really the kind of statistics that
networking people usually care about.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
|
|
One of the goals for Open vSwitch is to decouple kernel and userspace
software, so that either one can be upgraded or rolled back independent of
the other. To do this in full generality, it must be possible to add new
features to the kernel vport layer without changing userspace software.
The customary way to do this in the Linux networking stack is to use
Netlink and in particular Netlink attributes. This commit adopts that
model for the vport layer. It does not yet actually start using the
Netlink socket layer, which will come later.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
|
|
I plan to make the vport type part of the standard header stuck on each
Netlink message related to a vport. As such, it is more convenient to use
an integer than a string. In addition, by being fundamentally different
from strings, using an integer may reduce the confusion we've had in the
past over the differences in userspace and kernel names for network device
and vport types.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
|
|
Until now it has only been possible to query a vport if you know what
datapath it is on. This doesn't really make sense, so this commit removes
that restriction. It is a little bigger than one might naturally expect
because locking changes are required.
This also allows us to get rid of the ETHTOOL_GDRVINFO kluge that has
bothered me for a long time. The next commit does that.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
|
|
One of the goals for Open vSwitch is to decouple kernel and userspace
software, so that either one can be upgraded or rolled back independent of
the other. To do this in full generality, it must be possible to add new
features to the kernel vport layer without changing userspace software. In
turn, that means that the odp_port structure must become variable-length.
This does not, however, fit in well with the ODP_PORT_LIST ioctl in its
current form, because that would require userspace to know how much space
to allocate for each port in advance, or to allocate as much space as
could possibly be needed. Neither choice is very attractive.
This commit prepares for a different solution, by replacing ODP_PORT_LIST
by a new ioctl ODP_VPORT_DUMP that retrieves information about a single
vport from the datapath on each call. It is much cleaner to allocate the
maximum amount of space for a single vport than to do so for possibly a
large number of vports.
It would be faster to retrieve a number of vports in batch instead of just
one at a time, but that will naturally happen later when the kernel
datapath interface is changed to use Netlink, so this patch does not bother
with it.
The Netlink version won't need to take the starting port number from
userspace, since Netlink sockets can keep track of that state as part
of their "dump" feature.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
|
|
One of the goals for Open vSwitch is to decouple kernel and userspace
software, so that either one can be upgraded or rolled back independent of
the other. To do this in full generality, it must be possible to change
the kernel's idea of the flow key separately from the userspace version.
This commit takes one step in that direction by making the kernel report
its idea of the flow that a packet belongs to whenever it passes a packet
up to userspace. This means that userspace can intelligently figure out
what to do:
- If userspace's notion of the flow for the packet matches the kernel's,
then nothing special is necessary.
- If the kernel has a more specific notion for the flow than userspace,
for example if the kernel decoded IPv6 headers but userspace stopped
at the Ethernet type (because it does not understand IPv6), then again
nothing special is necessary: userspace can still set up the flow in
the usual way.
- If userspace has a more specific notion for the flow than the kernel,
for example if userspace decoded an IPv6 header but the kernel
stopped at the Ethernet type, then userspace can forward the packet
manually, without setting up a flow in the kernel. (This case is
bad from a performance point of view, but at least it is correct.)
This commit does not actually make userspace flexible enough to handle
changes in the kernel flow key structure, although userspace does now
have enough information to do that intelligently. This will have to wait
for later commits.
This commit is bigger than it would otherwise be because it is rolled
together with changing "struct odp_msg" to a sequence of Netlink
attributes. The alternative, to do each of those changes in a separate
patch, seemed like overkill because it meant that either we would have to
introduce and then kill off Netlink attributes for in_port and tun_id, if
Netlink conversion went first, or shove yet another variable-length header
into the stuff already after odp_msg, if adding the flow key to odp_msg
went first.
This commit will slow down performance of checksumming packets sent up to
userspace. I'm not entirely pleased with how I did it. I considered a
couple of alternatives, but none of them seemed that much better.
Suggestions welcome. Not changing anything wasn't an option,
unfortunately. At any rate some slowdown will become unavoidable when OVS
actually starts using Netlink instead of just Netlink framing.
(Actually, I thought of one option where we could avoid that: make
userspace do the checksum instead, by passing csum_start and csum_offset as
part of what goes to userspace. But that's not perfect either.)
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
|
|
One of the goals for Open vSwitch is to decouple kernel and userspace
software, so that either one can be upgraded or rolled back independent of
the other. To do this in full generality, it must be possible to change
the kernel's idea of the flow key separately from the userspace version.
In turn, that means that flow keys must become variable-length. This
commit makes that change using Netlink attribute sequences.
This commit does not actually make userspace flexible enough to handle
changes in the kernel flow key structure, because userspace doesn't yet
have enough information to do that intelligently. Upcoming commits will
fix that.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
|
|
One of the goals for Open vSwitch is to decouple kernel and userspace
software, so that either one can be upgraded or rolled back independent of
the other. To do this in full generality, it must be possible to change
the kernel's idea of the flow key separately from the userspace version.
In turn, that means that flow keys must become variable-length. This does
not, however, fit in well with the ODP_FLOW_LIST ioctl in its current form,
because that would require userspace to know how much space to allocate
for each flow's key in advance, or to allocate as much space as could
possibly be needed. Neither choice is very attractive.
This commit prepares for a different solution, by replacing ODP_FLOW_LIST
by a new ioctl ODP_FLOW_DUMP that retrieves a single flow from the datapath
on each call. It is much cleaner to allocate the maximum amount of space
for a single flow key than to do so for possibly a very large number of
flow keys.
As a side effect, this patch also fixes a race condition that sometimes
made "ovs-dpctl dump-flows" print an error: previously, flows were listed
and then their actions were retrieved, which left a window in which
ovs-vswitchd could delete the flow. Now dumping a flow and its actions is
a single step, closing that window.
Dumping all of the flows in a datapath is no longer an atomic step, so now
it is possible to miss some flows or see a single flow twice during
iteration, if the flow table is modified by another process. It doesn't
look like this should be a problem for ovs-vswitchd.
It would be faster to retrieve a number of flows in batch instead of just
one at a time, but that will naturally happen later when the kernel
datapath interface is changed to use Netlink, so this patch does not bother
with it.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
|
|
NXM_NX_TUN_ID and NXM_OF_VLAN_TCI were already allowed on NXAST_REG_MOVE,
but not on NXAST_REG_LOAD. This makes them valid on both.
Requested-by: Pankaj Thakkar <thakkar@nicira.com>
|
|
|
|
Reported-by: Justin Pettit <jpettit@nicira.com>
|
|
A few generated files have snuck in that should be ignored by git.
|
|
This macro hasn't ever been used.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
|
|
Previously, a GRE-over-IPsec tunnel was created as an interface with a
"type" of "gre" and the "other_config" column with "ipsec_cert" or
"ipsec_psk" set. This could lead to a potential security problem if a user
intended to create a GRE-over-IPsec tunnel, but misconfigured the
"ipsec_*" config and created an unencrypted GRE tunnel.
This commit defines an "ipsec_gre" tunnel type, which should prevent
users from inadvertently establishing insecure tunnels.
|
|
ODP_FLOW_GET takes an odp_flowvec, not an odp_flow.
(This would merely introduce a gratuitous ABI incompatibility for the sake
of pedantic correctness, except that we're breaking the ABI regularly
anyhow.)
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
|
|
|
|
|
|
The final example for this field was wrong. This corrects it and adds
two more examples.
Reported-by: Natasha Gude <natasha@nicira.com>
|
|
On older kernels that don't have if_link.h, we use our own, limited
version. This version doesn't include the netlink header, causing
problems where we were relying on it to define the types in
datapath-protocol.h. Therefore, directly include it, since it is
better to be explicit about it anyways.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
|
|
We have a need to identify tunnels with keys longer than 32 bits. This
commit adds basic datapath and OpenFlow support for such keys. It doesn't
actually add any tunnel protocols that support 64-bit keys, so this is not
very useful yet.
The 'arg' member of struct odp_msg had to be expanded to 64-bits also,
because it sometimes contains a tunnel ID. This member also contains the
argument passed to ODPAT_CONTROLLER, so I expanded that action's argument
to 64 bits also so that it can use the full width of the expanded 'arg'.
Userspace doesn't take advantage of the new space though (it was only
using 16 bits anyhow).
This commit has been tested only to the extent that it doesn't disrupt
basic Open vSwitch operation. I have not tested it with tunnel traffic.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Feature #3976.
|
|
In the medium term, we plan to migrate the datapath to use Netlink as its
communication channel. In the short term, we need to be able to have
actions with 64-bit arguments but "struct odp_action" only has room for
48 bits. So this patch shifts to variable-length arguments using Netlink
attributes, which starts in on the Netlink transition and makes 64-bit
arguments possible at the same time.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
|
|
I don't know what this is good for.
|
|
|
|
The NXM_IS_NX_REG macro didn't check the "hasmask" bit, which meant that it
looked like it was supposed to match both exact and wildcarded NXM headers,
e.g. both NXM_NX_REG0 and NXM_NX_REG0_W. But exact and wildcarded NXM
headers differ not just in the "hasmask" bit but in the "length" value
also (the wildcarded version's length is twice the exact version's length),
so this was not what it actually did.
The only current users of NXM_IS_NX_REG actually only want to match exact
versions, so this commit makes it only match those. It also adds a new
NXM_IS_NX_REG_W macro that matches only wildcarded versions. This new
macro has no users yet, but its existence should help to make it clear that
NXM_IS_NX_REG only matches exact NXM headers.
Reported-by: Natasha Gude <natasha@nicira.com>
|
|
|
|
For some time now, Open vSwitch datapaths have internally made a
distinction between adding a vport and attaching it to a datapath. Adding
a vport just means to create it, as an entity detached from any datapath.
Attaching it gives it a port number and a datapath. Similarly, a vport
could be detached and deleted separately.
After some study, I think I understand why this distinction exists. It is
because ovs-vswitchd tries to open all the datapath ports before it tries
to create them. However, changing it to create them before it tries to
open them is not difficult, so this commit does this.
The bulk of this commit, however, changes the datapath interface to one
that always creates a vport and attaches it to a datapath in a single step,
and similarly detaches a vport and deletes it in a single step.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
|
|
|
|
__aligned_u64 is a 64-bit integer type that is guaranteed to be aligned on
a 64-bit boundary. It is used in ABI structures to allow them to be shared
between 32- and 64-bit userspace without the need for kernel compat code.
The first use in OVS is coming up in this series of patches.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
|
|
This header hasn't been included into the kernel code in ages.
|
|
The uint8_t type here has bothered me from the very beginning.
|
|
Requested-by: Jeremy Stribling <strib@nicira.com>
CC: Jeremy Stribling <strib@nicira.com>
|
|
Originally, wildcards were just the OpenFlow OFPFW_* bits. Then, when
OpenFlow added CIDR masks for IP addresses, struct flow_wildcards was born
with additional members for those masks, derived from the wildcard bits.
Then, when OVS added support for tunnels, we added another bit
NXFW_TUN_ID that coexisted with the OFPFW_*. Later we added even more bits
that do not appear in the OpenFlow 1.0 match structure at all. This had
become really confusing, and the difficulties were especially visible in
the long list of invariants in comments on struct flow_wildcards.
This commit cleanly separates the OpenFlow 1.0 wildcard bits from the
bits used inside Open vSwitch, by defining a new set of bits that are
used only internally to Open vSwitch and converting to and from those
wildcard bits at the point where data comes off or goes onto the wire.
It also moves those functions into ofp-util.[ch] since they are only for
dealing with OpenFlow wire protocol now.
|
|
Our controller group at Nicira has requested a way to annotate flows with
extra information beyond the flow cookie. The new NXAST_NOTE action
provides such a way.
This new action is somewhat controversial. Some have suggested that it
should be added another way (either as part of the Nicira Extended Match
or as a new component of the flow_mod and related messages). Others think
that it has no place in the OpenFlow protocol at all and that an equivalent
should be implemented using the already available features of OVSDB. So
it is possible that this extension will be deleted and the feature will
be reimplemented some other way (or not at all).
CC: Teemu Koponen <koponen@nicira.com>
CC: Jeremy Stribling <strib@nicira.com>
|
|
|
|
|
|
|
|
Build-tested (only) on 2.6.18 from XenServer 5.5.0, 2.6.26, 2.6.29, 2.6.34,
and 2.6.36.
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
Why can't I ever get this right the first time?
|
|
Linux 2.6.35 added struct rtnl_link_stats64, which as a set of 64-bit
network device counters is what the OVS datapath needs. We might as well
use it instead of our own.
This commit moves the if_link.h compat header from datapath/ into the
top-level include/ directory so that it is visible both to kernel and
userspace code.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
|
|
It seemed to me that the descriptions of what actions do should be just
above the action structures, where the reader can see the arguments,
instead of just above the enumeration name, so I rearranged the code
this way.
A few actions didn't have their own structures, so to do this I had to give
them some.
|
|
Upcoming commits will add more flow formats, so this needs to be
an enumerated type instead of a bool.
|
|
This adds the error numbers that the "wdp" branch added, without adding
any uses of them (because they are not needed on "master" yet).
|
|
|
|
Cross-ported from "wdp" branch.
|
|
|
|
Using these types for data in network byte order makes code clearer, and
allows the "sparse" checker to give warnings for misuse.
|
|
There's no need to have a mask in this action, because both parts of the
TCI are part of the flow structure.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
|
|
Breaking this out as a separate commit should make it easier to see what
needs to change later, if we need to reintroduce padding at some point.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
|
|
This allows eliminating padding from odp_flow_key, although actually doing
that is postponed until the next commit.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
|