path: root/datapath/flow_netlink.c
Age | Commit message | Author
2014-11-25 | datapath: Don't validate IPv6 label masks. | Joe Stringer
When userspace doesn't provide a mask, OVS datapath generates a fully unwildcarded mask for the flow by copying the flow and setting all bits in all fields. For IPv6 label, this creates a mask that matches on the upper 12 bits, causing the following error: openvswitch: netlink: Invalid IPv6 flow label value (value=ffffffff, max=fffff) This patch ignores the label validation check for masks, avoiding this error. Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
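The rule described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: `ipv6_label_ok()` and `IPV6_LABEL_MAX` are hypothetical names, the label is taken in host byte order, and the real datapath check may be structured differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The IPv6 flow label is 20 bits wide, so 0xfffff is the largest key. */
#define IPV6_LABEL_MAX 0xFFFFFu

/*
 * Hypothetical helper (not the datapath's own function): returns true
 * if 'label', given in host byte order, is acceptable. Keys must fit in
 * 20 bits. Masks are not range-checked, so the all-ones mask generated
 * for a fully unwildcarded flow is accepted.
 */
static bool ipv6_label_ok(uint32_t label, bool is_mask)
{
	if (is_mask)
		return true;
	return label <= IPV6_LABEL_MAX;
}
```

With this shape, `ipv6_label_ok(0xffffffff, true)` passes while the same value as a key is still rejected, which matches the error quoted in the commit message.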
2014-11-09 | datapath: fix coding style. | Pravin B Shelar
Kernel datapath code has diverged from upstream code. This makes porting patches between these two code bases harder than it needs to be. Following patch fixes this by fixing coding style issues on this branch. Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2014-11-09 | datapath: Fix few mpls issues. | Pravin B Shelar
Found during MPLS upstreaming. Also sync-up MPLS header files with upstream code. Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2014-10-23 | datapath: Fix comment style. | Pravin B Shelar
Use netdev comment style. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
2014-10-03 | datapath: Add support for OVS_FLOW_ATTR_PROBE. | Jarno Rajahalme
This new flag is useful for suppressing error logging while probing for datapath features using flow commands. For backwards compatibility reasons the commands are executed normally, but error logging is suppressed. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-09-23 | datapath: Constify various function arguments | Thomas Graf
Help produce better optimized code. Signed-off-by: Thomas Graf <tgraf@noironetworks.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2014-09-08 | datapath: Remove unused dp parameter. | Pravin B Shelar
Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
2014-08-18 | Extend OVS IPFIX exporter to export tunnel headers | Wenyu Zhang
Extend IPFIX exporter to export tunnel headers when both input and output of the port. Add three other_config options in IPFIX table: enable-input-sampling, enable-output-sampling and enable-tunnel-sampling, to control whether tunnel info is sampled and in which direction (input or output). Insert sampling action before output action and the output tunnel port is sent to datapath in the sampling action. Make datapath collect output tunnel info and send it back to userspace in upcall message with a new additional optional attribute. Add a tunnel ports map to make the tunnel port lookup faster in sampling upcalls in IPFIX exporter. Make the IPFIX exporter generate IPFIX template sets with enterprise elements for the tunnel info, save the tunnel info in IPFIX cache entries, and send IPFIX DATA with tunnel info. Add flowDirection element in IPFIX templates. Signed-off-by: Wenyu Zhang <wenyuz@vmware.com> Acked-by: Romain Lenglet <rlenglet@vmware.com> Acked-by: Ben Pfaff <blp@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-08-15 | datapath: Move key_attr_size() to flow_netlink.h. | Joe Stringer
Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-08-11 | datapath/flow_netlink: Validate IPv6 flow key and mask values. | Jarno Rajahalme
Reject flow label key and mask values with invalid bits set. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
2014-08-08 | datapath: Avoid NULL mask check while building mask | Pravin B Shelar
OVS does mask validation even if it does not need to convert netlink mask attributes to mask structure. ovs_nla_get_match() caller can pass NULL mask structure pointer if the caller does not need mask. Therefore NULL check is required in SW_FLOW_KEY* macros. Following patch does not convert mask netlink attributes if mask pointer is NULL, so we do not need these checks in SW_FLOW_KEY* macro. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Daniele Di Proietto <ddiproietto@vmware.com> Acked-by: Andy Zhou <azhou@nicira.com>
2014-08-07 | datapath: Refactor action alloc and copy api. | Pravin B Shelar
There are two separate APIs to allocate and copy an actions list. Any time OVS needs to copy an action list, it has to call both functions. The following patch moves action allocation into the copy function to avoid code duplication. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
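As a rough userspace illustration of folding allocation into the copy routine, consider the sketch below. `struct actions` and `actions_copy()` are hypothetical stand-ins, and `malloc()` takes the place of whatever allocator the kernel code uses; the datapath's real types and function names are not shown in this log.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the datapath's action list. */
struct actions {
	size_t len;            /* number of valid bytes in data[] */
	unsigned char data[];  /* flexible array holding the actions */
};

/*
 * Allocate a new action list and copy 'len' bytes from 'src' into it in
 * a single call, so callers no longer pair an alloc with a copy.
 * Returns NULL on allocation failure; the caller releases the result
 * with free().
 */
static struct actions *actions_copy(const void *src, size_t len)
{
	struct actions *acts = malloc(sizeof(*acts) + len);

	if (!acts)
		return NULL;
	acts->len = len;
	if (len)
		memcpy(acts->data, src, len);
	return acts;
}
```

The design point is that every caller gets the allocation and the copy from one entry point, so the two steps cannot drift apart.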
2014-08-06 | datapath: Avoid using wrong metadata for recic action. | Pravin B Shelar
Recirc action needs to extract flow key from packet, it uses tun_info from OVS_CB for setting tunnel meta data in flow key. But tun_info can be overwritten by tunnel send action. This would result in wrong flow key for the recirculation. Following patch copies flow-key meta data from OVS_CB packet key itself thus avoids this bug. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
2014-08-06 | datapath: refactor ovs flow extract API. | Pravin B Shelar
OVS flow extract is called on packet receive or packet execute code path. Following patch defines separate API for extracting flow-key in packet execute code path. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
2014-08-01 | datapath: do not use vport type to determine presence of Geneve attributes | Ansis Atteka
This patch fixes following kernel crash that could happen, if geneve vport was not added yet, but revalidator thread attempted to dump flows.
To reproduce:
1. switch tunnel type between geneve and gre in a loop; and
2. run ping.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
IP: [<ffffffffa0385470>] ovs_nla_put_flow+0x3d0/0x7c0 [openvswitch]
PGD 3b32b067 PUD 3b2ef067 PMD 0
Oops: 0000 [#2] SMP
...
CPU: 0 PID: 6450 Comm: revalidator2 Tainted: GF D O 3.13.0-24-generic #46-Ubuntu
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2012
task: ffff88003b4aafe0 ti: ffff88003d314000 task.ti: ffff88003d314000
RIP: 0010:[<ffffffffa0385470>] [<ffffffffa0385470>] ovs_nla_put_flow+0x3d0/0x7c0 [openvswitch]
RSP: 0018:ffff88003d315a10 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88003a9a9960 RCX: 0000000000000000
RDX: 0000000000000002 RSI: ffffffffffffffc8 RDI: ffff88003babcb80
RBP: ffff88003d315a68 R08: 0000000000000000 R09: 0000000000000004
R10: ffff880039c23034 R11: 0000000000000008 R12: ffff88003a861600
R13: ffff88003a9a9960 R14: ffff88003babcb80 R15: qffff88003a861600
FS: 00007ff0f5d94700(0000) GS:ffff88003f600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000048 CR3: 000000003b55b000 CR4: 00000000000007f0
Stack:
 ffffffff81385093 0000000000000000 0000000000000000 0000000000000000
 ffff880000000000 ffff88003d315a58 ffff880039c23014 ffff88003a9a97a0
 ffff88003babcb80 ffff880039c23018 ffff88003a861600 ffff88003d315ad0
Call Trace:
 [<ffffffff81385093>] ? __nla_reserve+0x43/0x50
 [<ffffffffa037e683>] ovs_flow_cmd_fill_info+0x93/0x2b0 [openvswitch]
 [<ffffffffa0387159>] ? ovs_flow_tbl_dump_next+0x49/0xc0 [openvswitch]
 [<ffffffffa037e920>] ovs_flow_cmd_dump+0x80/0xd0 [openvswitch]
 [<ffffffff81645004>] netlink_dump+0x84/0x240
 [<ffffffff816458eb>] __netlink_dump_start+0x1ab/0x220
 [<ffffffff816498d7>] genl_family_rcv_msg+0x337/0x370
 [<ffffffffa037e8a0>] ? ovs_flow_cmd_fill_info+0x2b0/0x2b0 [openvswitch]
 [<ffffffff811a2778>] ? __kmalloc_node_track_caller+0x58/0x1e0
 [<ffffffff81649910>] ? genl_family_rcv_msg+0x370/0x370
 [<ffffffff816499a1>] genl_rcv_msg+0x91/0xd0
 [<ffffffff81647a29>] netlink_rcv_skb+0xa9/0xc0
 [<ffffffff81647f28>] genl_rcv+0x28/0x40
 [<ffffffff81647055>] netlink_unicast+0xd5/0x1b0
 [<ffffffff8164742f>] netlink_sendmsg+0x2ff/0x740
 [<ffffffff816024eb>] sock_sendmsg+0x8b/0xc0
 [<ffffffff811bbaa1>] ? __sb_end_write+0x31/0x60
 [<ffffffff811d42bf>] ? touch_atime+0x10f/0x140
 [<ffffffff811c2471>] ? pipe_read+0x371/0x400
 [<ffffffff81602691>] SYSC_sendto+0x121/0x1c0
 [<ffffffff8109dd84>] ? vtime_account_user+0x54/0x60
 [<ffffffff81020d35>] ? syscall_trace_enter+0x145/0x250
 [<ffffffff8160319e>] SyS_sendto+0xe/0x10
 [<ffffffff8172663f>] tracesys+0xe1/0xe6

Signed-Off-By: Ansis Atteka <aatteka@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
2014-07-23 | datapath/flow_netlink: Avoid wildcarding tunnel key with disabled megaflows | Daniele Di Proietto
If the userspace wants to match on a flow with some tunnel attributes set to 0, it simply omits them in the netlink attributes stream. Since our wildcarding logic (when megaflows are disabled) is based on the attributes in the netlink stream, we set our mask incorrectly. This commit adds a check to detect if the userspace wants to match on a tunnel, in which case we simply unwildcard the whole tun_key. Reported-by: Andy Zhou <azhou@nicira.com> Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com> Signed-off-by: Andy Zhou <azhou@nicira.com>
2014-07-23 | datapath: flow_netlink: Fix a bug. | Alex Wang
Commit 62974663fe (datapath/flow_netlink: Create right mask with disabled megaflows) introduced a bug which caused ovs_nla_get_match() to return immediately after parsing the flow mask for OVS_KEY_ATTR_ENCAP. Consequently, when vlan encapsulated packets are present, the corresponding datapath flows will have an incorrect mask like the one below, and the incorrect flows could affect other non-vlan packets.
~/ovs# ovs-dpctl dump-flows
in_port(3/0xffff0000),eth_type(0x8100),encap(), packets:0, bytes:0, used:never, actions:2
This commit fixes the bug by checking and handling the return value of the parsing function correctly. Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-07-11 | datapath/flow_netlink: Create right mask with disabled megaflows | Daniele Di Proietto
If megaflows are disabled, the userspace does not send the netlink attribute OVS_FLOW_ATTR_MASK, and the kernel must create an exact match mask. sw_flow_mask_set() sets every byte (in 'range') of the mask to 0xff, even the bytes that represent padding for struct sw_flow, or the bytes that represent fields that may not be set during ovs_flow_extract(). This is a problem, because when we extract a flow from a packet, we no longer memset() the struct sw_flow to 0 (since commit 9cef26ac6a71). This commit gets rid of sw_flow_mask_set() and introduces mask_set_nlattr(), which operates on the netlink attributes rather than on the mask key. Using this approach we are sure that only the bytes that the user provided in the flow are matched. Also, if the parse_flow_mask_nlattrs() for the mask ENCAP attribute fails, we now return with an error. Reported-by: Alex Wang <alexw@nicira.com> Suggested-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
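The idea of building the exact-match mask from the attribute stream, rather than from the in-memory key, can be sketched in userspace C. Everything here is an assumption made for illustration: `struct attr_hdr`, `attr_align()` and `mask_set_attrs()` are hypothetical, the layout merely imitates netlink's length/type header with 4-byte padding, and the walk is flat (nested attributes are not descended into). It is not the kernel's mask_set_nlattr().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified attribute header modeled on netlink's: 'len' counts the
 * header plus the payload, and attributes start on 4-byte boundaries. */
struct attr_hdr {
	uint16_t len;
	uint16_t type;
};

/* Round 'n' up to the next multiple of 4. */
static size_t attr_align(size_t n)
{
	return (n + 3) & ~(size_t)3;
}

/*
 * Walk the attribute stream in 'buf' and set every payload byte to
 * 'val'. Headers and padding bytes are left untouched, so only bytes
 * the sender actually supplied end up in the mask. The walk stops at
 * the first malformed attribute.
 */
static void mask_set_attrs(unsigned char *buf, size_t size, unsigned char val)
{
	size_t off = 0;

	while (off + sizeof(struct attr_hdr) <= size) {
		struct attr_hdr hdr;

		memcpy(&hdr, buf + off, sizeof(hdr));
		if (hdr.len < sizeof(hdr) || hdr.len > size - off)
			break;
		memset(buf + off + sizeof(hdr), val, hdr.len - sizeof(hdr));
		off += attr_align(hdr.len);
	}
}
```

Because padding is skipped, a 2-byte payload followed by 2 bytes of padding yields two 0xff bytes and two untouched bytes, which is the property the commit message relies on.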
2014-07-10 | datapath/flow_netlink: Fix NDP flow mask validation | Daniele Di Proietto
match_validate() enforces that a mask matching on NDP attributes also has an exact match on ICMPv6 type. The ICMPv6 type, which is 8-bit wide, is stored in the 'tp.src' field of 'struct sw_flow_key', which is 16-bit wide. Therefore, an exact match on ICMPv6 type should only check the first 8 bits. This commit fixes a bug that prevented flows with an exact match on NDP field from being installed. Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
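A minimal sketch of the 8-bit check follows. It assumes the mask has already been converted to host byte order and that the ICMPv6 type sits in the low 8 bits of that value; `icmpv6_type_exact_match()` is a hypothetical name, and the actual comparison in match_validate() may be written differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: 'tp_src_mask' is the 16-bit mask for tp.src in
 * host byte order. The ICMPv6 type only uses the low 8 bits (an
 * assumption of this sketch), so the match is exact as soon as those
 * 8 bits are all set. The upper 8 bits are ignored.
 */
static bool icmpv6_type_exact_match(uint16_t tp_src_mask)
{
	return (tp_src_mask & 0x00FFu) == 0x00FFu;
}
```

A check that demanded all 16 bits would reject the mask 0x00ff even though it matches the type exactly, which is the class of failure the commit describes.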
2014-07-02 | datapath: Additional logging for -EINVAL on flow setups. | Jesse Gross
There are many possible ways that a flow can be invalid so we've added logging for most of them. This adds logs for the remaining possible cases so there isn't any ambiguity while debugging. CC: Federico Iezzi <fiezzi@enter.it> Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Thomas Graf <tgraf@noironetworks.com>
2014-07-01 | datapath: Allow pop and push MPLS actions after pop VLAN | Simon Horman
This patch loosens the restrictions surrounding push and pop MPLS actions such that they will be allowed after a pop VLAN action if the inner ethernet type is acceptable for pop and push MPLS actions. This implies that there is only one VLAN tag present. Some analysis of the logic of this change is as follows: The purpose of tracking vlan_tci is to allow prohibition of push and pop MPLS actions in the presence of a VLAN. In this scenario the VLAN_TAG_PRESENT bit of vlan_tci is set and eth_type is that of the packet with the outermost VLAN tag removed. A pop VLAN action may clear vlan_tci as it removes the outermost VLAN tag and the push and pop MPLS logic may rely on eth_type for their prohibition logic. This will not allow push and pop MPLS on packets with multiple VLAN tags, regardless of whether they are all removed using POP VLAN, as there is no mechanism to expose the inner ethernet type beyond that of the outermost VLAN tag. Suggested-by: Jesse Gross <jgross@nicira.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Jesse Gross <jesse@nicira.com>
2014-06-30 | datapath: Fix error handling for Geneve options in ipv4_tun_to_nlattr(). | Ben Pfaff
Found by inspection. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
2014-06-27 | datapath: Remove redundant tcp_flags code. | Joe Stringer
These two cases used to be treated differently for IPv4/IPv6, but they are now identical. Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-06-24 | datapath: Add basic MPLS support to kernel | Simon Horman
Allow datapath to recognize and extract MPLS labels into flow keys and execute actions which push, pop, and set labels on packets. Based heavily on work by Leo Alterman, Ravi K, Isaku Yamahata and Joe Stringer. Cc: Ravi K <rkerur@gmail.com> Cc: Leo Alterman <lalterman@nicira.com> Cc: Isaku Yamahata <yamahata@valinux.co.jp> Cc: Joe Stringer <joe@wand.net.nz> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Jesse Gross <jesse@nicira.com>
2014-06-20 | datapath: Add support for Geneve tunneling. | Jesse Gross
This adds support for Geneve - Generic Network Virtualization Encapsulation. The protocol is documented at http://tools.ietf.org/html/draft-gross-geneve-00 The kernel implementation is completely agnostic to the options that are in use and can handle newly defined options without further work. It does this by simply matching on a byte array of options and allowing userspace to setup flows on this array. Userspace currently implements only support for basic version of Geneve. It can work with the base header (including the VNI) and is capable of parsing options but does not currently support any particular option definitions. Over time, the intention is to allow options to be matched through OpenFlow without requiring explicit support in OVS userspace. Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-06-19 | tunnel: Add support for matching on OAM packets. | Jesse Gross
Some tunnel formats have mechanisms for indicating that packets are OAM frames that should be handled specially (either as high priority or not forwarded beyond an endpoint). This provides support for allowing those types of packets to be matched. Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-06-19 | datapath: Wrap struct ovs_key_ipv4_tunnel in a new structure. | Jesse Gross
Currently, the flow information that is matched for tunnels and the tunnel data passed around with packets is the same. However, as additional information is added this is not necessarily desirable, as in the case of pointers. This adds a new structure for tunnel metadata which currently contains only the existing struct. This change is purely internal to the kernel since the current OVS_KEY_ATTR_IPV4_TUNNEL is simply a compressed version of OVS_KEY_ATTR_TUNNEL that is translated at flow setup. Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Pravin B Shelar <pshelar@nicira.com>
2014-04-21 | datapath: add recirc action | Andy Zhou
Recirculation implementation for Linux kernel data path. Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
2014-04-21 | datapath: add hash action | Andy Zhou
Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
2014-03-24 | datapath: Compact sw_flow_key. | Jarno Rajahalme
Minimize padding in sw_flow_key and move 'tp' to the main struct. These changes simplify code when accessing the transport port numbers and the tcp flags, and make the sw_flow_key 8 bytes smaller on 64-bit systems (128->120 bytes). These changes also make the keys for IPv4 packets fit in one cache line. There is a valid concern for safety of packing the struct ovs_key_ipv4_tunnel, as it would be possible to take the address of the tun_id member as a __be64 * which could result in unaligned access in some systems. However:
- sw_flow_key itself is 64-bit aligned, so the tun_id within is always 64-bit aligned.
- We never make arrays of ovs_key_ipv4_tunnel (which would force every second tun_key to be misaligned).
- We never take the address of the tun_id into a __be64 *.
- Wherever we use struct ovs_key_ipv4_tunnel outside the sw_flow_key, it is on the stack (on tunnel input functions), where the compiler has full control of the alignment.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2014-03-24 | datapath: Fix output of SCTP mask. | Jarno Rajahalme
The 'output' argument of the ovs_nla_put_flow() is the one from which the bits are written to the netlink attributes. For SCTP we accidentally used the bits from the 'swkey' instead. This caused the mask attributes to include the bits from the actual flow key instead of the mask. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2014-03-07 | datapath: Add support for Linux 3.12 | Pravin Shelar
Bump kernel support for datapath module to include 3.12. Make use of native ip-tunnel API for Kernel >= 3.12. Based on patch from James Page. Signed-off-by: James Page <james.page@ubuntu.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Reviewed-by: Kyle Mestery <mestery@noironetworks.com>
2014-02-16 | datapath: Use ether_addr_copy | Joe Perches
It's slightly smaller/faster for some architectures. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
2014-02-18 | datapath: Remove 5-tuple optimization. | Jarno Rajahalme
The 5-tuple optimization becomes unnecessary with a later per-NUMA node stats patch. Remove it first to make the changes easier to grasp. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2014-02-03 | datapath: flow_netlink: Use pr_fmt to OVS_NLERR | Joe Perches
Add "openvswitch: " prefix to OVS_NLERR output to match the other OVS_NLERR output of datapath.c Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2014-02-03 | datapath: Added (unsigned long long) cast in printf | Daniele Di Proietto
This is necessary, since u64 is not unsigned long long in all architectures: u64 could be also uint64_t. Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
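The portability point can be shown with standard C. The sketch below uses `uint64_t` and `snprintf()` in place of the kernel's `u64` and its print helpers, and `format_u64()` is a hypothetical wrapper introduced only for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 64-bit value with "%llu". The explicit cast keeps the
 * argument type in step with the format specifier even on platforms
 * where the 64-bit typedef is 'unsigned long' rather than
 * 'unsigned long long'. Returns the number of characters that
 * snprintf() would have written.
 */
static int format_u64(char *buf, size_t size, uint64_t value)
{
	return snprintf(buf, size, "%llu", (unsigned long long)value);
}
```

Without the cast, compilers that check format strings may warn on architectures where the typedef and the specifier disagree.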
2014-01-23 | datapath: use const in some local vars and casts | Daniele Di Proietto
In a few functions, const formal parameters are assigned or cast to non-const. These changes suppress warnings if compiled with -Wcast-qual. Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-12-03 | datapath: Use percpu allocator for flow-stats. | Pravin B Shelar
Use percpu allocator for stats due to objection to stats array. But the percpu allocator is not designed for high-churn allocation/deallocation, so we need to avoid allocating percpu stats for short-lived flows. One cheap way to detect such flows is to check whether the 5-tuple fields used in RSS are masked. If any one of them is masked, the flow is likely shared across CPUs, where percpu stats should be more scalable, and that flow should be relatively long-lived. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
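The heuristic can be sketched as a simple test on the mask of the five RSS fields. The struct and function below are hypothetical and use IPv4 fields only; "masked" is read here as "not an exact match", i.e. at least one bit of the field is wildcarded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mask covering only the 5-tuple that RSS hashes on. */
struct tuple_mask {
	uint32_t ipv4_src;
	uint32_t ipv4_dst;
	uint16_t tp_src;
	uint16_t tp_dst;
	uint8_t  ip_proto;
};

/*
 * Returns true if any 5-tuple field is at least partly wildcarded.
 * Such a flow can match packets that RSS spreads over several CPUs,
 * so percpu stats are likely worth their allocation cost. A fully
 * exact 5-tuple returns false.
 */
static bool flow_wants_percpu_stats(const struct tuple_mask *m)
{
	return m->ipv4_src != UINT32_MAX ||
	       m->ipv4_dst != UINT32_MAX ||
	       m->tp_src != UINT16_MAX ||
	       m->tp_dst != UINT16_MAX ||
	       m->ip_proto != UINT8_MAX;
}
```

An exact-match microflow stays on the cheap path, while a flow that wildcards, say, the source port opts into percpu stats.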
2013-10-29 | TCP flags matching support. | Jarno Rajahalme
tcp_flags=flags/mask
Bitwise match on TCP flags. The flags and mask are 16-bit numbers written in decimal or in hexadecimal prefixed by 0x. Each 1-bit in mask requires that the corresponding bit in flags must match. Each 0-bit in mask causes the corresponding bit to be ignored. TCP protocol currently defines 9 flag bits, and additional 3 bits are reserved (must be transmitted as zero), see RFCs 793, 3168, and 3540. The flag bits are, numbering from the least significant bit:
0: FIN No more data from sender.
1: SYN Synchronize sequence numbers.
2: RST Reset the connection.
3: PSH Push function.
4: ACK Acknowledgement field significant.
5: URG Urgent pointer field significant.
6: ECE ECN Echo.
7: CWR Congestion Windows Reduced.
8: NS Nonce Sum.
9-11: Reserved.
12-15: Not matchable, must be zero.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
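The flags/mask semantics above amount to a masked comparison. The sketch below is an illustration with hypothetical names (`tcp_flags_match()`, `TCP_FLAGS_MATCHABLE`); it clears bits 12-15 of the mask before comparing, which is one possible way to honor the "not matchable" rule and not necessarily how OVS enforces it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A few of the flag bits, numbered from the least significant bit. */
enum {
	TCP_FIN = 1u << 0,
	TCP_SYN = 1u << 1,
	TCP_RST = 1u << 2,
	TCP_PSH = 1u << 3,
	TCP_ACK = 1u << 4,
};

/* Bits 0-11 can be matched; bits 12-15 cannot. */
#define TCP_FLAGS_MATCHABLE 0x0FFFu

/*
 * Returns true if 'pkt_flags' agrees with 'flags' on every bit that is
 * set in 'mask'. Bits cleared in 'mask' are ignored, and bits 12-15 of
 * 'mask' are dropped before the comparison.
 */
static bool tcp_flags_match(uint16_t pkt_flags, uint16_t flags, uint16_t mask)
{
	uint16_t m = mask & TCP_FLAGS_MATCHABLE;

	return (pkt_flags & m) == (flags & m);
}
```

For example, flags `TCP_SYN` with mask `TCP_SYN | TCP_ACK` selects packets that have SYN set and ACK clear, regardless of the other bits.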
2013-10-01 | datapath: Restructure datapath.c and flow.c | Pravin B Shelar
Over time, datapath.c and flow.c have become pretty large files. The following patch restructures their functionality into three different components:
flow.c: contains flow extract.
flow_netlink.c: netlink flow api.
flow_table.c: flow table api.
Diffstat is showing wrong count. This patch mostly restructures code without changing logic. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>