Age | Commit message (Collapse) | Author |
|
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
We have a call chain like this:
iface_configure_qos() calls
netdev_dump_queues(), which calls
netdev_linux_dump_queues(), which calls back through 'cb' to
qos_unixctl_show_cb(), which calls
netdev_delete_queue(), which calls
netdev_linux_delete_queue().
Both netdev_dump_queues() and netdev_linux_delete_queue() take the same
mutex in the same netdev, which deadlocks.
This commit fixes the problem by getting rid of the callback.
netdev_linux_dump_queue_stats() would benefit from the same treatment but
it's less urgent because I don't see any callbacks from that function that
call back into a netdev function.
Bug #19319.
Reported-by: Scott Hendricks <shendricks@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
We've seen a number of deadlocks in the tree since thread safety was
introduced. So far, all of these are self-deadlocks, that is, a single
thread acquiring a lock and then attempting to re-acquire the same lock
recursively. When this has happened, the process simply hung, and it was
somewhat difficult to find the cause.
POSIX "error-checking" mutexes check for this specific problem (and
others). This commit switches from other types of mutexes to
error-checking mutexes everywhere that we can, that is, everywhere that
we're not using recursive mutexes. This ought to help find problems more
quickly in the future.
There might be performance advantages to other kinds of mutexes in some
cases. However, the existing mutex type choices were just guesses, so I'd
rather go for easy detection of errors until we know that other mutex
types actually perform better in specific cases. Also, I did a quick
microbenchmark of glibc mutex types on my host and found that the
error checking mutexes weren't any slower than the other types, at least
when the mutex is uncontended.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
|
|
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
|
|
When threading comes into the picture there arises the possibility of a
race between netdev_vport_patch_peer()'s caller using the returned string
and another caller changing the peer. It is safer to return a copy.
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
This is the same lifecycle used in the ofproto provider interface.
Compared to the previous netdev provider interface, it has the
advantage that the netdev top layer can control when any given
netdev becomes visible to the outside world.
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
I'd forgotten even to use the xpthread variants here.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
|
|
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
|
|
This removes ofproto-dpif-xlate's dependency on ofport_get_peer()
which, while cleaner in-and-of itself, will become more important
as ofproto-dpif_xlate modularizes.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
|
|
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
Returning a static data buffer makes code more brittle and definitely
not thread-safe, so this commit switches to using a caller-provided
buffer instead.
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
I don't see any reason for this to be static.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
|
|
There's no particular reason that netdev_dummy_register() has to care about
the particular OS, except that the tests like to use the special Linux-only
tunnel vport types. But that can be done better, I think, by just always
registering them from netdev_dummy_register() and making that function
idempotent, so that calling it twice under Linux has no additional effect.
This commit implements that solution.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ed Maste <emaste@freebsd.org>
|
|
This commit adds a function to lib/netdev.c to check that the interface name
is not the same as any of the registered vport providers' dpif_port name
(e.g. gre_system) or the datapath's internal port name (e.g. ovs-system).
Bug #15077.
Signed-off-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
The distinction between struct netdev_dev and struct netdev has always
been confusing. Now that previous commits have eliminated all interesting
state from struct netdev, this commit deletes it and renames struct
netdev_dev to take its place. Now the situation makes much more sense and
I won't have to continue making embarrassed explanations in the future.
Good riddance.
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
|
|
Separating packet capture from "struct netdev" means that there is no
remaining per-"struct netdev" state, which will allow us to get rid of
"struct netdev_dev" (by renaming it "struct netdev").
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
|
|
This gets rid of the only per-instance data in "struct netdev", which
will make it possible to merge "struct netdev_dev" into "struct netdev" in
a later commit.
Ed Maste wrote the netdev-bsd changes in this commit.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Co-authored-by: Ed Maste <emaste@freebsd.org>
Signed-off-by: Ed Maste <emaste@freebsd.org>
Tested-by: Ed Maste <emaste@freebsd.org>
|
|
Adds tun_src and tun_dst match and set capabilities via new NXM fields
NXM_NX_TUN_IPV4_SRC and NXM_NX_TUN_IPV4_DST. This allows management of
large number of tunnels via the flow tables, without requiring the tunnels
to be pre-configured.
Flow-based tunnels can be configured with options remote_ip=flow and
local_ip=flow. local_ip=flow requires remote_ip=flow. When set, the
tunnel remote IP address and/or local IP address is set from the flow,
instead of the tunnel configuration.
Example:
$ ovs-vsctl add-port br0 gre -- set Interface gre ofport_request=1 type=gre options:remote_ip=flow options:key=flow
$ ovs-ofctl add-flow br0 "in_port=LOCAL actions=set_tunnel:1,set_field:192.168.0.1->tun_dst,output:1"
$ ovs-ofctl add-flow br0 "in_port=1 tun_src=192.168.0.1 tun_id=1 actions=LOCAL"
Signed-off-by: Jarno Rajahalme <jarno.rajahalme@nsn.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
Signed-off-by: Jesse Gross <jesse@nicira.com>
|
|
VXLAN was recently assigned UDP port 4789 by IANA. This
comit updates the OVS VXLAN implementation to reflect the new UDP port
number.
Cc: Kenneth Duda <kduda@aristanetworks.com>
Signed-off-by: Kyle Mestery <kmestery@cisco.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
|
|
Since userspace flow based tunneling code is checked in, the kernel
port based tunneling code can be removed.
Patch removes following components:
- tunnel ports hash table and moved tunnel ports list to individual
vports.
- Cleaned per tnl-port config.
- OVS_KEY_ATTR_TUN_ID action is removed.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Bug #15078
|
|
In get_tunnel_config(), distinguish between VXLAN and LISP when deciding
whether or not to print UDP destination port. Only add the UDP
destination port for either protocol if it is not the default UDP port.
Update the LISP unit test to match the new behavior as well.
Signed-off-by: Kyle Mestery <kmestery@cisco.com>
[jesse: merge common test for VXLAN and LISP]
Signed-off-by: Jesse Gross <jesse@nicira.com>
|
|
LISP is an experimental layer 3 tunneling protocol, described in RFC
6830. This patch adds support for LISP tunneling. Since LISP
encapsulated packets do not carry an Ethernet header, it is removed
before encapsulation, and added with hardcoded source and destination
MAC addresses after decapsulation. The harcoded MAC chosen for this
purpose is the locally administered address 02:00:00:00:00:00. Flow
actions can be used to rewrite this MAC for correct reception. As such,
this patch is intended to be used for static network configurations, or
with a LISP capable controller.
Signed-off-by: Lorand Jakab <lojakab@cisco.com>
Signed-off-by: Kyle Mestery <kmestery@cisco.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
|
|
The CAPWAP implementation is just the encapsulation format and
therefore really not the full protocol. While there were some
uses of it (primarily hardware support and UDP transport). But
these are most likely better provided by VXLAN.
Following patch removes CAPWAP tunneling support.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
|
|
Modify netdev_vport_get_dpif_port() to return a name for
VXLAN ports which includes the destination UDP port number as a part of the
name.
Signed-off-by: Kyle Mestery <kmestery@cisco.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
This patch removes the final bit of linux specific code which
prevents building netdev-vport everywhere. With this, other
platforms automatically get access to patch ports, and (if their
datapath supports it), flow based tunneling.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
With this patch, ovs-vswitchd uses flow based tunneling
exclusively. I.E. each kind of tunnel shares a single tunnel
backer in the datapath. Tunnel headers are set by userspace using
the ipv4_tunnel datapath action. And, the configuration of
individual tunnels is now a userspace responsibility, so
netdev-vport no longer marshals and unmarshals Netlink attributes
for tunnel configuration, instead only storing the configuration
internally. There are still some significant pieces of work to do,
but the basic building blocks are there to begin testing.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
Co-authored-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
|
|
In future patches, a netdev's datapath port name may not
necessarily be the same as its device name. This patch prepares for
this by making the distinction in the netdev and dpif layers.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
Now that userspace implements patch ports completely internally,
it's possible to remove the kernel implementation of them.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Kyle Mestery <kmestery@cisco.com>
|
|
We want to move the GRE vport ID into the upstream range but in
order to ease the transition kept the old ID around for one release.
This removes the old value.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Kyle Mestery <kmestery@cisco.com>
|
|
Inheritance of the Don't Fragment bit in tunnels will not be
supported with flow based tunneling and has already been removed
from userspace. This removes the corresponding kernel support.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Kyle Mestery <kmestery@cisco.com>
|
|
This commit moves responsibility for implementing patch ports from
the datapath to ofproto-dpif. There are two main reasons to do
this.
The first is a matter of design: ofproto-dpif both has more
information than the datapath, and is better suited to handle the
complexity required to implement patch ports.
The second is performance. My setup is a virtual machine with two
basic learning bridges connected by patch ports. I used
ovs-benchmark to ping the virtual router IP residing outside the
VM. Over a 60 second run, "ovs-benchmark rate" improves from
14618.1 to 19311.9 transactions per second, or a 32% improvement.
Similarly, "ovs-benchmark latency" improves from 6ms to 4ms.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
This is a straight search-and-replace, except that I also removed #include
<assert.h> from each file where there were no assert calls left.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
|
|
Future patches will need to know the details of a netdev's tunnel
configuration from outside the netdev library.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
This will be required to support flow based tunneling.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
There are a lot of tunnels, and they all use the exact same
functions, so it makes sense to collapse their initialization into
a macro.
Suggested-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
Letting netdev-vport manage ethernet addresses itself instead of
relying on the datapath has several advantages. It simplifies the
code, is significantly more efficient, and will work when there is
no longer a one to one mapping from netdev-vports to datapath
vports.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
This patch removes path MTU discovery from userspace. The feature
still exists in the kernel where it will need to be removed in the
future.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
get_status() is a much more intuitive name since "status" is what
the database column is called.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
Theoretically, its possible for netdev_get_status() to be called
on a netdev-vport which hasn't had its configuration set yet. In
this case, netdev-vport would dereference a null pointer.
Reported-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
Signed-off-by: Jarno Rajahalme <jarno.rajahalme@nsn.com>
[jesse: correct return type of get_u32_or_zero()]
Signed-off-by: Jesse Gross <jesse@nicira.com>
|
|
The only user of netdev_set_stats() is bonding (for updating the
fake interface). This interface is never a vport, so it seems
quite a bit cleaner to keep the relevant code in the netdev-linux
library where it's needed, instead of in netdev-vport, where it
adds needless complexity.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
The only user of netdev_send() is dpif-netdev which doesn't support
vports.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
An ovs_be32 is a more obvious way to represent an IP address than a
pointer to one. It is also more type-safe, especially since "sparse" is
able to check that the argument is in network byte order.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Ethan Jackson <ethan@nicira.com>
|
|
Cc: Kyle Mestery <kmestery@cisco.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
|
|
Add support for VXLAN tunnels to Open vSwitch. Add support
for setting the destination UDP port on a per-port basis.
This is done by adding a "dst_port" parameter to the port
configuration. This is only applicable currently to VXLAN
tunnels.
Please note this currently does not implement any sort of multicast
learning. With this patch, VXLAN tunnels must be configured similar
to GRE tunnels (e.g. point to point). A subsequent patch will implement
a VXLAN control plane in userspace to handle multicast learning.
This patch set is based on one posted by Ben Pfaff on Oct. 12, 2011
to the ovs-dev mailing list:
http://openvswitch.org/pipermail/dev/2011-October/012051.html
The patch has been maintained, updated, and freshened by me and a
version of it is available at the following github repository:
https://github.com/mestery/ovs-vxlan/tree/vxlan
I've tested this patch with multiple VXLAN tunnels between hosts
using different UDP port numbers. Performance is on par (though
slightly faster) than comparable GRE tunnels.
See the following IETF draft for additional information about VXLAN:
http://tools.ietf.org/html/draft-mahalingam-dutt-dcops-vxlan-02
Signed-off-by: Kyle Mestery <kmestery@cisco.com>
[jesse: simplify error path in vxlan_tunnel_setup, don't print default VXLAN port,
and remove dead code]
Signed-off-by: Jesse Gross <jesse@nicira.com>
|
|
This patch fixes the following warning caused by a switch case
which was not handled.
lib/netdev-vport.c:144:5: error: enumeration value
‘OVS_VPORT_TYPE_FT_GRE’ not handled in switch
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|