Age | Commit message (Collapse) | Author |
|
Commit e72e793 (Add ability to restrict flow mods and flow stats
requests to cookies.) modified cookie handling. Some of its behavior
was unintuitive and there was at least one bug (described below).
Commit f66b87d (DESIGN: Document uses for flow cookies.) attempted to
document a clean design for cookie handling. This commit updates the
DESIGN document and brings the implementation in line with it.
In commit e72e793, the code that handled processing OpenFlow flow
modification requests set the cookie mask to exact-match. This seems
reasonable for adding flows, but is not correct for matching, since
OpenFlow 1.0 doesn't support matching based on the cookie. This commit
changes to cookie mask to fully wildcarded, which is the correct
behavior for modifications and deletions. It doesn't cause any problems
for flow additions, since the mask is ignored for that operation.
Bug #9742
Reported-by: Luca Giraudo <lgiraudo@nicira.com>
Reported-by: Paul Ingram <paul@nicira.com>
Signed-off-by: Justin Pettit <jpettit@nicira.com>
|
|
Before we submitted the kernel module upstream, we updated the flow format
by adding two fields to the description of packets with VLAN headers, but
we forgot to update ODPUTIL_FLOW_KEY_BYTES to reflect these changes. The
result was that a maximum-length flow did not fit in the given space.
This fixes a crash processing IPv6 neighbor discovery packets with VLAN
headers received in a tunnel configured with key=flow or in_key=flow.
This updates some comments to better describe the implications of
ODPUTIL_FLOW_KEY_BYTES (suggested by Justin).
This also updates test-odp.c so that it would have caught this problem, and
updates odp.at to demonstrate that a full 156 bytes are necessary. (To see
that, revert the change to ODPUTIL_FLOW_KEY_BYTES and run the test.)
Reported-by: Dan Wendlandt <dan@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
Until now, bridges with datapath_type=netdev did not destroy the datapath
when deleted. In particular, the tap device implementing the internal
interface was not close()d, and therefore the tap persists until
ovs-vswitchd exit()s.
This behaviour was caused by the missing callback for 'enumerate' in the
dpif-netdev class. Without this callback 'bridge_reconfigure' failed to
realize that there are datapaths with no bridge, and thus cannot destroy
them. Providing an 'enumerate' callback fixes this.
Signed-off-by: Giuseppe Lettieri <g.lettieri@iet.unipi.it>
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
Some OpenFlow 1.0 controllers incorrectly use OPFP_CONTROLLER as the
in_port in packet-out messages, when OFPP_NONE is their intent. Until now,
Open vSwitch has rejected such requests with an error message. This commit
makes Open vSwitch instead treat OFPP_CONTROLLER the same as OFPP_NONE for
compatibility with those controllers.
(Also, as of this writing, OpenFlow 1.0.1 appears to be changing the port
to use from OFPP_NONE to OFPP_CONTROLLER.)
Suggested-by: Rob Sherwood <rob.sherwood@bigswitch.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
Found by valgrind.
Reported-by: Ethan Jackson <ethan@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
This could cause a slow but steady memory leak in ovs-vswitchd.
Found by valgrind.
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
This is useful in a situation where one knows that an hmap_node is in some
hmap, but it's not certain which one, and one needs to know whether it is
in a particular one. This is not a very common case; I don't see any
potential users in the current tree, although an upcoming commit will add
one.
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
Otherwise the "learn" action may not correctly set the cookie in flows that
it creates.
Found by valgrind.
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
iface_configure_qos() passes a callback to netdev_dump_queues() that can
delete queues. The netdev-linux implementation of this function was
unprepared for the callback to delete queues, so this could cause a
use-after-free. This fixes the problem in netdev_linux_dump_queues() and
documents that netdev_dump_queues() implementations must support deletions
in the callback.
Found by valgrind:
==1593== Invalid read of size 8
==1593== at 0x4A8C43: netdev_linux_dump_queues (hmap.h:326)
==1593== by 0x4305F7: bridge_reconfigure (bridge.c:3084)
==1593== by 0x431384: bridge_run (bridge.c:1892)
==1593== by 0x432749: main (ovs-vswitchd.c:96)
==1593== Address 0x632e078 is 8 bytes inside a block of size 32 free'd
==1593== at 0x4C240FD: free (vg_replace_malloc.c:366)
==1593== by 0x4A4D74: hfsc_class_delete (netdev-linux.c:3250)
==1593== by 0x42AA59: iface_delete_queues (bridge.c:3055)
==1593== by 0x4A8C8C: netdev_linux_dump_queues (netdev-linux.c:1881)
==1593== by 0x4305F7: bridge_reconfigure (bridge.c:3084)
==1593== by 0x431384: bridge_run (bridge.c:1892)
Bug #10164.
Reported-by: Ram Jothikumar <ram@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
Found by lintian.
Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Thomas Goirand <zigo@debian.org>
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
Open vSwitch userspace can set up flows at a high rate, but it is somewhat
"bursty" in opportunities to set up flows, by which I mean that OVS sets up
a batch of flows, then goes off and does some other work for a while, then
sets up another batch of flows, and so on. The result is that, if a large
number of packets that need flow setups come in all at once, then some of
them can overflow the relatively small kernel-to-user buffers.
This commit increases the kernel-to-user buffers from the default of
approximately 120 kB each to 1 MB each. In one somewhat synthetic test
case that I ran based on an "hping3" that generated a load of about 20,000
new flows per second (including both requests and replies), this reduced
the packets dropped at the kernel-to-user interface from about 30% to none.
I expect that it will similarly improve packet loss in workloads where
flow arrival is not easily predictable.
(This has little effect on workloads generated by "ovs-benchmark rate"
because that benchmark is effectively "self-clocking", that is, a new flow
is triggered only by a reply to a request made earlier, which means that
the number of buffered packets at any given has a known, constant upper
limit.)
Bug #10210.
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
"recv" only works for sockets, but tap devices aren't sockets.
Makes the userspace switch work again.
Reported-by: Ravi Kerur <Ravi.Kerur@telekom.com>
Reported-by: 胡靖飞 <hujingfei914@msn.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
The bond/enable-slave and bond/disable-slave ovs-appctl commands
incorrectly reported the 501 error code upon success.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
The error handling path here failed to clean up bound sockets, by removing
them. This fixes the problem.
It was easy to observe this bug by running "ovs-vsctl" without
"ovsdb-server" running.
Bug #9811.
Bug #9769.
Reported-by: Michael <mhu@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
Found by inspection.
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
Although we try to avoid it, some unit tests are necessarily
timing-sensitive. The new "time/stop" command that this commit adds should
help with that, by preventing time from advancing from the viewpoint of
the OVS "timeval" functions except when "time/warp" explicitly advances
the current time. This should allow the unit tests that need it to become
reproducible regardless of the speed at which the tests run.
This commit adds one use of "time/stop" to the unit test suite, in the one
timing-sensitive test of which I am currently aware.
Bug #9782.
Reported-by: Tim Chen <tchen@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
The kernel will report a vport with the given name in any datapath, but
userspace only wants a vport with the given name in a specific datapath.
Receiving information on a vport in an unexpected datapath yields bizarre
and hard-to-debug problems.
Bug #9889.
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
This can be useful when testing.
Suggested-by: Reid Price <reid@nicira.com>
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
NICS-11.
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
When Cisco (and other?) routers are configured in high-availability modes,
they use two different MAC addresses. The router uses MAC 1 only for ARP
replies. The router uses MAC 2 for forwarding IP packets to end hosts.
When a MAC learning switch is attached to the router, therefore, it will
only learn the location of MAC 1 from ARP replies. If the end host's ARP
cache refresh timer is longer than the switch's MAC learning timeout, then
packets to the router will be flooded from the MAC learning timeout until
the next ARP reply.
This commit fixes the problem by increasing the MAC learning timeout from
60 seconds to 300 seconds. According to research by Sanjay Sane, this is
always sufficient, even with operating systems that use ARP timeouts
longer than 300 seconds (such as FreeBSD and Mac OS, which have 1200
seconds ARP timeouts) because the routers that cause this problem send
unsolicited ARP replies every 180 seconds.
This issue arises in any situation where traffic between two hosts flows
only in one direction. The explanation above describes only one special
case.
NICS-11.
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
These functions use sprintf() into a 1000-byte buffer. It appears to me
that the strings they format are either short, fixed-length strings or the
output of strerror(), neither of which should ordinarily overflow.
However, using snprintf() cannot hurt.
Launchpad bug #914160.
Reported-by: Matthias Klose <doko@ubuntu.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
Found by valgrind.
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
The unit tests feed a lot of flows through the ofproto-dpif "trace"
command, which means that they need to know the port numbers of the ports
that they create. Until now, they've had to actually query those port
numbers from the database, which is a bit of unnecessary overhead for unit
tests.
This commit makes dummy dpif port numbers predictable: if the name of a
port contains a number, then the dummy dpif uses that number, if it is
valid and available, as the port number.
This commit also simplifies the unit tests that previously queried port
numbers to depend on the new behavior.
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
This makes it possible to add entries for decoding OpenFlow messages with
newer versions, e.g. OpenFlow 1.1 or 1.2. However, no actual messages for
newer versions are actually implemented yet; that will come later.
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
Most structures in this file have an "nx_" prefix, so this makes naming
more consistent.
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
Rather than silently skipping ipv6 action generation, following patch
generates OVS_ACTION_ATTR_SET action for ipv6. Datapath which do not
support ipv6 action can reject this action.
Bug #8758
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
|
|
The vconn and ovsdb passive connection man page fragments used the
PN (program name) macro to describe their functionality. This was
usually correct, but in the case of ovs-vsctl, they may be used to
describe configuring ovs-vswitchd. This commit rewords the fragments to
make them correct regardless of whether they're describing a local or
remote passive connection.
Signed-off-by: Justin Pettit <jpettit@nicira.com>
|
|
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
The new PACKET_IN format implemented in this patch includes flow
metadata such as the cookie, table_id, and registers.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
In future patches, PACKET_IN messages will include meta-data which
is only available in userspace during action translation. Either,
this data needs to be stored until it's required by a userspace
datapath action, or the PACKET_IN messages must be sent at the time
the data is available. This patch implements the latter.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
Future patches will need the ability to skip over unsupported NXM
headers.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
This will ease the implementation of future patches.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
This commit pulls code used to modify L3 and L4 header fields
from dp_netdev into the packet library. An additional user will
be added in a future commit.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
This will make the memory ownership clearer in future patches which
make more extensive use of ofputil_packet_in.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
This patch removes an optimization which significantly complicates
the code in ways which would get worse in future patches if not
removed. Furthermore, future patches will have fewer cases which
can take advantage of the optimization further mitigating its
justification.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
This will improve the unit tests of future patches.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
Instead this patch uses flow_format() which gives very similar
output. This patch will improve the reliability of unit tests in
future patches which rely on the results of ofp_packet_to_string().
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
We should never push a VLAN tag with the CFI bit set. This patch
defensively enforces this invariant.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
ofp_print_packet() and ofp_packet_to_string() don't use the
'total_len' argument which they require callers to supply.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
I find this significantly easier to read.
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
Signed-off-by: Ethan Jackson <ethan@nicira.com>
|
|
Without this change, if a unixctl command name is 23 character long or
longer, no space appeared between the command name and its usage. This
commit ensures that at least one space always appears.
No command yet has a name this long. I discovered this issue when I added
one that does.
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
With this commit, it is possible to limit flow deletions and
modifications to specific cookies. It also provides the ability to
dump flows based on their cookies.
Signed-off-by: Justin Pettit <jpettit@nicira.com>
|
|
In C, the || operator yields 0 or 1, not (as in some other languages) the
value of its first nonzero operand.
Found by inspection.
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
Current userspace considers an ICMP header to be 4 bytes consisting
of the type, code, and checksum. The kernel considers it to be 8
bytes because it also counts the two data fields that contain
type-specific information (and are always present). Since flow
extract will zero out headers that are not completely present this
means that an ICMP packet that has a header of 5-7 bytes will be
interpreted differently by userspace and kernel. This fixes the
problem by adopting the kernel's version of the ICMP header in
userspace.
Signed-off-by: Jesse Gross <jesse@nicira.com>
|