Age | Commit message | Author |
|
Much of the code in the GRE implementation is not specific to the
GRE protocol but is actually common to all types of tunnels. In
order to support future types of tunnels, move this code into a
common library.
Signed-off-by: Jesse Gross <jesse@nicira.com>
|
|
The current meaning of the GRE checksum option is to include
checksums on transmit and require packets to have them on receive.
In addition, incoming packets with checksums are always validated
regardless of this option. Requiring checksums on receive creates
surprising behavior and interoperability issues. This disables the
requirement on receive. The new behavior is that the sender decides
whether to checksum packets and the receiver will validate packets
with checksums (similar to UDP).
Signed-off-by: Jesse Gross <jesse@nicira.com>
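A minimal sketch of the receive-side rule described above: packets without a checksum are accepted, and packets that carry one are validated. The helper names `csum_fold()` and `gre_rx_accept()` are hypothetical, and the packet is treated as a flat byte buffer, which is a simplification of what the datapath actually works with.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GRE_FLAG_CSUM 0x8000  /* "Checksum Present" bit, host byte order. */

/* Folds the ones'-complement sum of 'len' bytes at 'data' into 16 bits. */
static uint16_t
csum_fold(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2) {
        sum += (uint32_t) data[i] << 8 | data[i + 1];
    }
    if (len & 1) {
        sum += (uint32_t) data[len - 1] << 8;
    }
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    return (uint16_t) sum;
}

/* Returns true if the GRE packet at 'pkt' should be accepted. */
static bool
gre_rx_accept(const uint8_t *pkt, size_t len)
{
    uint16_t flags;

    if (len < 4) {
        return false;           /* Too short to hold a GRE header. */
    }
    flags = (uint16_t) (pkt[0] << 8 | pkt[1]);
    if (!(flags & GRE_FLAG_CSUM)) {
        return true;            /* Sender chose not to checksum: accept. */
    }
    /* A correct checksum makes the sum over header and payload 0xffff. */
    return len >= 8 && csum_fold(pkt, len) == 0xffff;
}
```

The receiver never rejects a packet merely for lacking a checksum, which mirrors how UDP treats an absent checksum.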
|
|
CC: Teemu Koponen <koponen@nicira.com>
|
|
Originally, the datapath didn't care about IP TOS at all. Then, to support
NetFlow, we made it keep track of the last-seen IP TOS value on a per-flow
basis. Then, to support OpenFlow 1.0, we added a nw_tos field to
odp_flow_key. We don't need both methods, so this commit drops the
NetFlow-specific tracking.
This introduces a small kernel ABI break: upgrading the kernel module
without upgrading the OVS userspace will mean that NetFlow records will
all show an IP TOS value of 0. I don't consider that to be a serious
problem.
|
|
ovs-vswitchd doesn't declare its QoS capabilities in the database yet,
so the controller has to know what they are. We can add that later.
The linux-htb QoS class has been tested to the extent that I can see that
it sets up the queues I expect when I run "tc qdisc show" and "tc class
show". I haven't tested that the effects on flows are what we expect them
to be. I am sure that there will be problems in that area that we will
have to fix.
|
|
|
|
In certain cases we require the ability to provide stats that are
added to the values collected by the kernel (currently only used
by bond fake devices). Internal devices previously implemented
this directly, but now that their stats are handled by the vport
layer, the functionality has been moved there. This removes the
userspace code to set the stats and replaces it with a mechanism
to access the equivalent functionality in the vport layer.
|
|
Adds a method to set a group of stats to be added to the values
gathered normally. This is needed for the fake bond device to
show the stats of its underlying slaves. Also enables devices
that use the generic stats layer to define a get_stats() function
to provide additional error counts.
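As a rough illustration of the idea, the reported counters are the normally gathered values plus a caller-supplied set of offsets. The structure `dev_stats` and the function `stats_add_offsets()` are hypothetical names, not the ones used in the vport layer.

```c
#include <stdint.h>

/* Hypothetical subset of per-device counters. */
struct dev_stats {
    uint64_t rx_packets;
    uint64_t tx_packets;
    uint64_t rx_errors;
};

/* Returns the counters gathered normally plus the configured offsets. */
static struct dev_stats
stats_add_offsets(const struct dev_stats *gathered,
                  const struct dev_stats *offsets)
{
    struct dev_stats sum;

    sum.rx_packets = gathered->rx_packets + offsets->rx_packets;
    sum.tx_packets = gathered->tx_packets + offsets->tx_packets;
    sum.rx_errors = gathered->rx_errors + offsets->rx_errors;
    return sum;
}
```

For a fake bond device, the offsets would carry the totals of the underlying slaves.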
|
|
Most of the timekeeping needs of OVS are simply to measure intervals,
which makes OVS sensitive to changes in the system clock. This commit
replaces the existing clocks with monotonic timers. An additional set
of wall clock timers is added and used in locations that need absolute
time.
Bug #1858
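The distinction can be sketched with the POSIX clock_gettime() call. The helper names `mono_msec()` and `wall_msec()` are illustrative assumptions and are not taken from the OVS source.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Milliseconds from a clock that never jumps; use for intervals only. */
static int64_t
mono_msec(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t) ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Milliseconds since the epoch; use where absolute time is needed. */
static int64_t
wall_msec(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_REALTIME, &ts);
    return (int64_t) ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}
```

An interval computed as the difference of two mono_msec() readings is unaffected if an administrator or NTP steps the wall clock in between.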
|
|
datapath-protocol.h is not a very clean interface. I originally intended
it to be solely a Linux-kernel specific interface. Over time it became
a general-purpose interface to dpifs. This is not a good situation,
because clearly the header is still Linux-specific.
In the long run, the correct solution is to separate the generic and
Linux-specific bits. This is not that patch. Instead, this patch modifies
datapath-protocol.h enough that it can be used on non-Linux hosts. In
particular I tested that it works OK with FreeBSD 8.0.
|
|
gre.h is Linux-specific, and it uses Linux-specific types, so it has to
#include <linux/types.h>. We probably got away with it until now because
it was always included after some other header that had already included
that one.
|
|
When a 32-bit userspace program runs on a 64-bit kernel, data structures
that contain members whose sizes or alignments change from 32- to 64-bit
must be translated when they are passed to ioctls. This commit adds such
support for openvswitch_mod.
We should really reconsider some parts of the Open vSwitch ioctl interface
to avoid needing as much translation as we do.
Lightly tested with 32-bit userspace on sparc64.
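To show why translation is needed, here is a userspace sketch with hypothetical structures. A pointer member is 4 bytes in the 32-bit layout and 8 bytes in the 64-bit one, so the two layouts differ in size and must be converted field by field. In the kernel this job belongs to a compat ioctl handler; the names below are assumptions for illustration only.

```c
#include <stdint.h>

/* Hypothetical ioctl argument as laid out by a 32-bit program. */
struct flowvec32 {
    uint32_t flows;     /* User pointer, 4 bytes wide. */
    uint32_t n_flows;
};

/* The same argument as a 64-bit kernel expects it. */
struct flowvec64 {
    uint64_t flows;     /* User pointer, 8 bytes wide. */
    uint32_t n_flows;
};

/* Converts the 32-bit layout into the 64-bit layout. */
static void
flowvec_from_compat(const struct flowvec32 *in, struct flowvec64 *out)
{
    out->flows = in->flows;     /* Zero-extend the 32-bit pointer. */
    out->n_flows = in->n_flows;
}
```

Structures built only from fixed-width, naturally aligned members need no such translation, which is one way to reduce the amount of compat code.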
|
|
'n_ports' should never be negative so it's better to use an unsigned type.
Suggested-by: Jesse Gross <jesse@nicira.com>
|
|
do_flowvec_ioctl() was checking for too-big 'n_flows' but not negative
'n_flows'. We could add that check too, but 'n_flows' should never be
negative so it's better to just use an unsigned type.
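The reasoning can be illustrated with a small sketch. `MAX_FLOWS` and both function names are hypothetical and the actual bound used by do_flowvec_ioctl() is not shown here.

```c
#include <stdbool.h>

#define MAX_FLOWS 1024  /* Hypothetical limit, for illustration only. */

/* With a signed count, a negative value slips past the upper-bound check. */
static bool
n_flows_ok_signed(int n_flows)
{
    return n_flows <= MAX_FLOWS;
}

/* With an unsigned count, the same bit pattern is a huge value that the
 * single upper-bound check rejects. */
static bool
n_flows_ok_unsigned(unsigned int n_flows)
{
    return n_flows <= MAX_FLOWS;
}
```

Switching the type therefore removes the need for a separate "less than zero" test.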
|
|
Now that Open vSwitch has support for multiple simultaneous controllers,
there is some need for a degree of coordination among them. For now, the
plan is for the controllers themselves to take the lead on this. This
commit adds a small bit of OVS infrastructure: the ability for a controller
to designate itself as a "master" or a "slave". There may be at most one
master at a time; when a controller designates itself as the master, then
any existing master is demoted to slave status. Slave controllers are not
allowed to modify the flow table or global configuration; any attempt to
do so is rejected with a "bad request" error.
Feature #2495.
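The role rules above can be sketched as follows. The enum, the structure, and the function names are hypothetical, and the extra ROLE_OTHER value for controllers that have not designated a role is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

enum role { ROLE_OTHER, ROLE_MASTER, ROLE_SLAVE };

struct controller {
    enum role role;
};

/* Sets the role of controller 'self'.  If it becomes the master, any
 * existing master among the 'n' controllers is demoted to slave. */
static void
set_role(struct controller *all, size_t n, size_t self, enum role role)
{
    size_t i;

    if (role == ROLE_MASTER) {
        for (i = 0; i < n; i++) {
            if (i != self && all[i].role == ROLE_MASTER) {
                all[i].role = ROLE_SLAVE;
            }
        }
    }
    all[self].role = role;
}

/* Slaves may not modify the flow table or global configuration. */
static bool
may_modify(const struct controller *c)
{
    return c->role != ROLE_SLAVE;
}
```

A request from a controller for which may_modify() is false would be answered with the "bad request" error mentioned above.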
|
|
|
|
Needed by XAPI to accurately report bond statistics.
Ugh.
Bug NIC-63.
|
|
The new GRE implementation provides a complete drop in replacement
for the old Linux based implementation. Therefore, remove the
old implementation and rename "grenew" to "gre".
|
|
Add a netdev that supports the new datapath GRE implementation.
It currently coexists with the old implementation so it is named
"grenew".
|
|
Add a new vport type that implements GRE support inside of the
datapath instead of relying on Linux devices. This provides
greater scalability, performance, and control.
The new GRE implementation supports nearly all features of the
Linux implementation. It does not currently support multicast,
NBMA tunnels, or non-Ethernet devices.
This implementation of GRE has several important benefits over the
existing Linux implementation. The first is simply that it is not a
Linux device. Linux devices are fairly heavy weight both in terms
of memory consumption and interactions with the rest of the system
(notifications, processes polling, etc.). There are many pieces of
code that make assumptions about the maximum reasonable number of
ports. Simply maintaining the state of several thousand devices is
enough to fully occupy the CPU.
A tighter coupling between the GRE implementation and datapath
also allows more flexibility. The key can be set and retrieved
from the flow table, which allows even greater scalability.
There will probably be additional use cases in the future.
|
|
Currently the datapath directly accesses devices through their
Linux functions. Obviously this doesn't work for virtual devices
that are not backed by an actual Linux device. This creates a
new virtual port layer which handles all interaction with devices.
The existing support for Linux devices was then implemented on top
of this layer as two device types. It splits out and renames dp_dev
to internal_dev. There were several places where datapath devices
had to be handled in a special manner and this cleans that up by putting
all the special casing in a single location.
|
|
Add a tun_id field which contains the ID of the encapsulating tunnel
on which a packet was received (0 if not received on a tunnel). Also
add an action which allows the tunnel ID to be set for outgoing
packets. At this point there aren't any tunnel implementations so
these fields don't have any effect.
The matching is exposed to OpenFlow by overloading the high 32 bits
of the cookie as the tunnel ID. ovs-ofctl is capable of turning
on this special behavior using a new "tun-cookie" command but this
command is intentionally undocumented to avoid it being used without
a full understanding of the consequences.
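A sketch of the cookie overloading, assuming the cookie has already been converted to host byte order. Both function names are hypothetical.

```c
#include <stdint.h>

/* Extracts the tunnel ID from the high 32 bits of 'cookie'. */
static uint32_t
cookie_to_tun_id(uint64_t cookie)
{
    return (uint32_t) (cookie >> 32);
}

/* Returns 'cookie' with its high 32 bits replaced by 'tun_id'. */
static uint64_t
cookie_with_tun_id(uint64_t cookie, uint32_t tun_id)
{
    return (cookie & UINT64_C(0xffffffff)) | ((uint64_t) tun_id << 32);
}
```

The consequence hinted at above is visible here: a controller that uses the full 64 bits of the cookie for its own purposes loses the upper half once this behavior is turned on.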
|
|
If NXAST_RESUBMIT adopts the replacement in_port for executing actions,
then OFPP_NORMAL will believe that traffic originated from whatever port
that is. This seems unlikely to ever be useful and in fact breaks
applications that use NXAST_RESUBMIT for two-stage ACLs.
Bug #2644.
|
|
Until now, the NXAST_RESUBMIT action has always looked up the original
flow except for the updated in_port. This commit changes the semantics to
instead look up the flow as modified by any preceding actions that affect
it, e.g. if OFPAT_SET_VLAN_VID precedes NXAST_RESUBMIT, then NXAST_RESUBMIT
now looks up the flow with the modified VLAN, not the original (as well as
the modified in_port).
Also, document how NXAST_RESUBMIT is supposed to work.
Suggested-by: Paul Ingram <paul@nicira.com>
|
|
|
|
|
|
Finalize OpenFlow 1.0 wire-compatibility:
- Set protocol version to 0x01
- Remove references to retired OFPC_MULTI_PHY_TX
- Clean extraneous spaces in header file
NOTE: This is the final commit in the OpenFlow 1.0 set. Starting with
this commit, OVS is OpenFlow 1.0 wire-compatible. Slicing is not yet
implemented.
|
|
OpenFlow 1.0 adds support for a subset of QoS that's referred to as slicing.
Open vSwitch does not support this yet, so send errors if it's used.
NOTE: OVS at this point is not wire-compatible with OpenFlow 1.0 until
the final commit in this OpenFlow 1.0 set.
|
|
OpenFlow 1.0 adds "port_no" field to the Port Stat request messages to
allow stats for individual ports to be queried. Port stats for all ports
can still be requested by specifying OFPP_NONE as the port number.
NOTE: OVS at this point is not wire-compatible with OpenFlow 1.0 until
the final commit in this OpenFlow 1.0 set.
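The selection rule can be sketched as a simple predicate. `OFPP_NONE` is 0xffff in OpenFlow 1.0; the function name `port_selected()` is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define OFPP_NONE 0xffff  /* "Not a port"; here it means "all ports". */

/* Returns true if stats for 'port_no' belong in the reply to a request
 * whose port_no field is 'requested'. */
static bool
port_selected(uint16_t requested, uint16_t port_no)
{
    return requested == OFPP_NONE || requested == port_no;
}
```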
|
|
OpenFlow 1.0 adds support for matching on IP ToS/DSCP bits.
NOTE: OVS at this point is not wire-compatible with OpenFlow 1.0 until
the final commit in this OpenFlow 1.0 set.
|
|
OpenFlow 1.0 increases the resolution of flow stats and flow removed messages
from seconds to (potentially) nanoseconds. The spec states that only
millisecond granularity is required, so that's all we provide at this
time. Increasing to nanoseconds would require more significant code
change and would not provide an appreciable improvement in real world
use.
NOTE: OVS at this point is not wire-compatible with OpenFlow 1.0 until
the final commit in this OpenFlow 1.0 set.
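OpenFlow 1.0 reports a flow's age as a seconds part plus a nanoseconds part. With millisecond granularity, the conversion might look like the sketch below. The structure and function names are hypothetical.

```c
#include <stdint.h>

/* Duration split the way OpenFlow 1.0 reports it. */
struct flow_duration {
    uint32_t sec;   /* Whole seconds. */
    uint32_t nsec;  /* Nanoseconds beyond 'sec', always a multiple of 1e6. */
};

/* Converts a duration in milliseconds into seconds and nanoseconds. */
static struct flow_duration
duration_from_msec(uint64_t msec)
{
    struct flow_duration d;

    d.sec = (uint32_t) (msec / 1000);
    d.nsec = (uint32_t) (msec % 1000) * 1000000;
    return d;
}
```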
|
|
The OpenFlow 1.0 specification supports matching the IP address and
opcode in ARP messages. The datapath already supports this, so this
commit merely exposes that through the OpenFlow module.
NOTE: OVS at this point is not wire-compatible with OpenFlow 1.0
until the final commit in this OpenFlow 1.0 set.
|
|
In OpenFlow 1.0, flows have been extended to include an opaque
identifier, referred to as a cookie. The cookie is specified by the
controller when the flow is installed; the cookie will be returned as
part of each flow stats and flow removed message.
NOTE: OVS at this point is not wire-compatible with OpenFlow 1.0 until
the final commit in this OpenFlow 1.0 set.
|
|
In OpenFlow 1.0, a "dp_desc" character array was added to the ofp_desc_stats
structure that allows a human readable description of the datapath to be
provided.
NOTE: OVS at this point is not wire-compatible with OpenFlow 1.0 until
the final commit in this OpenFlow 1.0 set.
|
|
The length of a datapath ID was changed from 48 bits to 64 bits in OpenFlow
0.9. For parity, we increased the management id size to match.
NOTE: This is the final commit in the OpenFlow 0.9 set. Starting with
this commit, OVS is OpenFlow 0.9-compliant.
|
|
OpenFlow 0.9 introduces the concept of the barrier command. When the
controller sends a Barrier Request, the switch is not allowed to respond
with a Barrier Reply until it has finished processing any other commands
that preceded it. This commit provides that support.
NOTE: OVS at this point is not wire-compatible with OpenFlow 0.9 until the
final commit in this OpenFlow 0.9 set.
|
|
This commit cleans up a few comments in openflow.h. The only one of
significance is that OpenFlow port numbers now begin enumeration at 1.
OVS already behaved in this manner, so this is just a documentation
issue for us.
NOTE: OVS at this point is not wire-compatible with OpenFlow 0.9 until the
final commit in this OpenFlow 0.9 set.
|
|
In OpenFlow 0.9, flow "expiration" messages are sent when flows are
explicitly removed by a delete action. As such, the message is renamed
from Flow Expired to Flow Removed. This commit adds that support as well
as supporting the ability to choose sending these messages on a per flow
basis.
NOTE: OVS at this point is not wire-compatible with OpenFlow 0.9 until the
final commit in this OpenFlow 0.9 set.
|
|
This commit adds (some) support for a couple new OpenFlow 0.9 features:
- The OFPFF_CHECK_OVERLAP flag in Flow Mod messages allows the
controller to prevent flows that would conflict at the same
priority.
- An emergency flow cache that contains a small flow table that is
used if the switch loses connectivity with the controller. I
believe the design has fundamental flaws and looks likely to be
retired. If a controller attempts to add a flow to the emergency
flow cache, OVS always responds that the tables are full.
The OpenFlow 0.9 error codes are also sync'd in the commit.
NOTE: OVS at this point is not wire-compatible with OpenFlow 0.9 until the
final commit in this OpenFlow 0.9 set.
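The overlap test behind OFPFF_CHECK_OVERLAP can be sketched for a deliberately simplified match consisting of one 32-bit field with a mask. Real OpenFlow matches have many fields, and the names below are hypothetical. Two rules overlap when they have the same priority and some packet could match both, that is, when they agree on every bit that both of them care about.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified rule: matches packets where (field & mask) == (value & mask). */
struct rule {
    uint32_t value;
    uint32_t mask;      /* 1-bits are significant, 0-bits are wildcarded. */
    uint16_t priority;
};

/* Returns true if 'a' and 'b' have equal priority and at least one
 * packet could match both of them. */
static bool
rules_overlap(const struct rule *a, const struct rule *b)
{
    return a->priority == b->priority
           && !((a->value ^ b->value) & a->mask & b->mask);
}
```

A switch honoring the flag would refuse a new flow for which rules_overlap() is true against any existing flow.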
|
|
Starting in OpenFlow 0.9, it is possible to match on the VLAN PCP
(priority) field and rewrite the IP ToS/DSCP bits. This check-in
provides that support and bumps the wire protocol number to 0x98.
NOTE: The wire changes come together over the set of OpenFlow 0.9 commits,
so OVS will not be OpenFlow-compatible with any official release between
this commit and the one that completes the set.
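For reference, the two bit manipulations involved can be sketched as below. The PCP occupies the top 3 bits of the 802.1Q TCI, and the DSCP occupies the top 6 bits of the IP ToS byte. The function names are hypothetical.

```c
#include <stdint.h>

/* Extracts the 3-bit priority (PCP) from a VLAN TCI in host byte order. */
static uint8_t
vlan_tci_to_pcp(uint16_t tci)
{
    return (tci >> 13) & 0x7;
}

/* Rewrites the DSCP bits of 'tos', preserving the low 2 bits. */
static uint8_t
set_dscp(uint8_t tos, uint8_t new_tos)
{
    return (tos & 0x03) | (new_tos & 0xfc);
}
```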
|
|
|
|
Some (out-of-tree) datapaths want to pass OFPP_NORMAL up to the datapath.
For now add ODPP_NORMAL. In the long run we may want to use OFPP_ port
numbers in the datapath interface.
Reported-by: Jean Tourrilhes <jt@hpl.hp.com>
|
|
Conflicts:
COPYING
datapath/datapath.h
lib/automake.mk
lib/dpif-provider.h
lib/dpif.c
lib/hmap.h
lib/netdev-provider.h
lib/netdev.c
lib/stream-ssl.h
ofproto/executer.c
ofproto/ofproto.c
ofproto/ofproto.h
tests/automake.mk
utilities/ovs-ofctl.c
utilities/ovs-vsctl.in
vswitchd/ovs-vswitchd.conf.5.in
xenserver/etc_init.d_vswitch
xenserver/etc_xensource_scripts_vif
xenserver/opt_xensource_libexec_interface-reconfigure
|
|
|
|
These Nicira-specific requests have not been implemented for some time.
In case we need them later we can always reimplement them.
|
|
Older versions of Open vSwitch supported a management protocol based on
OpenFlow message framing. The current Open vSwitch instead uses the
OVSDB protocol for the same purposes. We don't plan to support this older
protocol any longer, so this commit deletes support.
This commit also deletes the management_id column from the vswitch's
database schema. The management_id was used by the older management
protocol to match up OpenFlow switch connections to management connections,
but the current implementation instead matches up connections based on
the datapath IDs exported by the configuration database. In fact, the
OpenFlow connections had no way to actually export the management ID, so
this just deletes code that was essentially without function anyhow.
|
|
|
|
This causes the build to fail with an error message if openflow.h contains
a structure whose members are not aligned in a portable way.
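The commit message does not say how the check is implemented. One way to get a comparable build-time failure is C11's `_Static_assert`, sketched here on a hypothetical structure that is not taken from openflow.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wire-format structure. */
struct example_msg {
    uint8_t version;
    uint8_t type;
    uint16_t length;
    uint32_t xid;
    uint64_t cookie;
};

/* Each member must sit at a multiple of its own size, and the structure
 * must contain no compiler-inserted padding. */
_Static_assert(offsetof(struct example_msg, length) % sizeof(uint16_t) == 0,
               "length is misaligned");
_Static_assert(offsetof(struct example_msg, xid) % sizeof(uint32_t) == 0,
               "xid is misaligned");
_Static_assert(offsetof(struct example_msg, cookie) % sizeof(uint64_t) == 0,
               "cookie is misaligned");
_Static_assert(sizeof(struct example_msg) == 16,
               "unexpected padding in example_msg");
```

Moving `cookie` ahead of `xid`, for example, would put `cookie` at offset 4 and break the build.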
|
|
According to Neil McKee, in an email archived at
http://openvswitch.org/pipermail/dev_openvswitch.org/2010-January/000934.html:
The containment rule is that a given sflow-datasource (sampler or
poller) should be scoped within only one sflow-agent (or
sub-agent). So the issue arises when you have two
switches/datapaths defined on the same host being managed with
the same IP address: each switch is a separate sub-agent, so they
can run independently (e.g. with their own sequence numbers) but
they can't both claim to speak for the same sflow-datasource.
Specifically, they can't both represent the <ifindex>:0
data-source. This containment rule is necessary so that the
sFlow collector can scale and combine the results accurately.
One option would be to stick with the <ifindex>:0 data-source but
elevate it to be global across all bridges, with a global
sample_pool and a global sflow_agent. Not tempting. Better to
go the other way and allow each interface to have its own
sampler, just as it already has its own poller. The ifIndex
numbers are globally unique across all switches/datapaths on the
host, so the containment is now clean. Datasource <ifindex>:5
might be on one switch, while <ifindex>:7 can be on another.
Other benefits are that 1) you can support the option of
overriding the default sampling-rate on an interface-by-interface
basis, and 2) this is how most sFlow implementations are coded,
so there will be no surprises or interoperability issues with any
sFlow collectors out there.
This commit implements the approach suggested by Neil.
This commit uses an atomic_t to represent the sampling pool. This is
because we do want access to it to be atomic, but we expect that it will
"mostly" be accessed from a single CPU at a time. Perhaps this is a bad
assumption; we can always switch to another form of synchronization later.
CC: Neil McKee <neil.mckee@inmon.com>
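The kernel's atomic_t is not available in userspace, but the idea of a per-interface sampler with an atomically updated sample pool can be approximated with C11 atomics. This is only an analogy with hypothetical names, not the datapath code.

```c
#include <stdatomic.h>

/* Hypothetical per-interface sampler state. */
struct sampler {
    atomic_uint sample_pool;    /* Packets seen that could have been sampled. */
};

/* Initializes 's' with an empty sample pool. */
static void
sampler_init(struct sampler *s)
{
    atomic_init(&s->sample_pool, 0);
}

/* Accounts for one packet; safe to call from several threads at once. */
static void
sampler_see_packet(struct sampler *s)
{
    atomic_fetch_add(&s->sample_pool, 1);
}

/* Returns the current size of the sample pool. */
static unsigned int
sampler_pool(struct sampler *s)
{
    return atomic_load(&s->sample_pool);
}
```

An atomic counter is cheap when a single CPU does most of the updates, which matches the access pattern expected above.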
|
|
|