Age | Commit message (Collapse) | Author |
|
Cross compiling the binaries in scripts/* is not possible
because various makefiles assume that $(obj)/whatever is
executable on the build host.
This patch introduces a new variable called KBUILD_SCRIPTROOT
that points to script/binaries to use while cross compiling.
Usage:
Build scripts for the build host:
make O=path/to/buildhost/buildscripts \
silentoldconfig prepare scripts
Then cross build script for target:
make O=path/to/target/buildscripts \
HOSTCC=$CROSS_COMPILE \
KBUILD_SCRIPTROOT=path/to/buildhost/buildscripts
silentoldconfig prepare scripts
This patch does not use KBUILD_SCRIPTROOT for all script invocations
it only redefines the following if KBUILD_SCRIPTROOT is defined.
scripts/Makefile.build
scripts/basic/fixdep --> $(KBUILD_SCRIPTROOT)/scripts/basic/fixdep
scripts/kconfig/Makefile
$(obj)/conf --> $(KBUILD_SCRIPTROOT)/scripts/kconfig/conf
scripts/mod/Makefile
$(obj)mk_elfconfig --> $(KBUILD_SCRIPTROOT)/scripts/mod/mk_elfconfig
Signed-off-by: John Rigby <john.rigby@linaro.org>
|
|
When checking permissions on an overlayfs inode we do not take into
account either device cgroup restrictions nor security permissions.
This allows a user to mount an overlayfs layer over a restricted device
directory and by pass those permissions to open otherwise restricted
files.
Switch over to the newly introduced inode_only_permissions.
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
|
|
permissions checks
We need to be able to check inode permissions (but not filesystem implied
permissions) for stackable filesystems. Now that permissions involve
checking with the security LSM, cgroups and basic inode permissions it is
easy to miss a key permission check and introduce a security vunerability.
Expose a new interface for these checks.
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
|
|
Imaging using ext4 as upperdir which has a file "hello" and lowdir is
totally empty.
1. mount -t overlayfs overlayfs -o lowerdir=/lower,upperdir=/upper /overlay
2. cd /overlay
3. ln hello bye
then the overlayfs code will call vfs_link to create a real ext4
dentry for "bye" and create
a new overlayfs dentry point to overlayfs inode (which standed for
"hello"). That means:
two overlayfs dentries and only one overlayfs inode.
and then
4. umount /overlay
5. mount -t overlayfs overlayfs -o lowerdir=/lower,upperdir=/upper
/overlay (again)
6. cd /overlay
7. ls hello bye
the overlayfs will create two inodes(one for the "hello", another
for the "bye") and two dentries (each point a inode).That means:
two dentries and two inodes.
As above, with different order of "create link" and "mount", the
result is not the same.
In order to make the behavior coherent, we need to create inode in ovl_link.
Signed-off-by: Robin Dong <sanbai@taobao.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
|
|
After allocating a new inode, if the mode of inode is incorrect, we should
release it by iput().
Signed-off-by: Robin Dong <sanbai@taobao.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
|
|
Add a simple read-only counter to super_block that indicates deep this
is in the stack of filesystems. Previously ecryptfs was the only
stackable filesystem and it explicitly disallowed multiple layers of
itself.
Overlayfs, however, can be stacked recursively and also may be stacked
on top of ecryptfs or vice versa.
To limit the kernel stack usage we must limit the depth of the
filesystem stack. Initially the limit is set to 2.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
|
|
Document the overlay filesystem.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
|
|
This is useful because of the stacking nature of overlayfs. Users like to
find out (via /proc/mounts) which lower/upper directory were used at mount
time.
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
|
|
Add support for statfs to the overlayfs filesystem. As the upper layer
is the target of all write operations assume that the space in that
filesystem is the space in the overlayfs. There will be some inaccuracy as
overwriting a file will copy it up and consume space we were not expecting,
but it is better than nothing.
Use the upper layer dentry and mount from the overlayfs root inode,
passing the statfs call to that filesystem.
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
|
|
Overlayfs allows one, usually read-write, directory tree to be
overlaid onto another, read-only directory tree. All modifications
go to the upper, writable layer.
This type of mechanism is most often used for live CDs but there's a
wide variety of other uses.
The implementation differs from other "union filesystem"
implementations in that after a file is opened all operations go
directly to the underlying, lower or upper, filesystems. This
simplifies the implementation and allows native performance in these
cases.
The dentry tree is duplicated from the underlying filesystems, this
enables fast cached lookups without adding special support into the
VFS. This uses slightly more memory than union mounts, but dentries
are relatively small.
Currently inodes are duplicated as well, but it is a possible
optimization to share inodes for non-directories.
Opening non directories results in the open forwarded to the
underlying filesystem. This makes the behavior very similar to union
mounts (with the same limitations vs. fchmod/fchown on O_RDONLY file
descriptors).
Usage:
mount -t overlay -olowerdir=/lower,upperdir=/upper overlay /mnt
Supported:
- all operations
Missing:
- Currently a crash in the middle of copy-up, rename, unlink, rmdir or create
over a whiteout may result in filesystem corruption on the overlay level.
IOW these operations need to become atomic or at least the corruption needs
to be detected.
The following cotributions have been folded into this patch:
Neil Brown <neilb@suse.de>:
- minimal remount support
- use correct seek function for directories
- initialise is_real before use
- rename ovl_fill_cache to ovl_dir_read
Felix Fietkau <nbd@openwrt.org>:
- fix a deadlock in ovl_dir_read_merged
- fix a deadlock in ovl_remove_whiteouts
Erez Zadok <ezk@fsl.cs.sunysb.edu>
- fix cleanup after WARN_ON
Sedat Dilek <sedat.dilek@googlemail.com>
- fix up permission to confirm to new API
Also thanks to the following people for testing and reporting bugs:
Jordi Pujol <jordipujolp@gmail.com>
Andy Whitcroft <apw@canonical.com>
Michal Suchanek <hramrach@centrum.cz>
Felix Fietkau <nbd@openwrt.org>
Erez Zadok <ezk@fsl.cs.sunysb.edu>
Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
|
|
Overlayfs needs a private clone of the mount, so create a function for
this and export to modules.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
|
|
Export do_splice_direct() to modules. Needed by overlay filesystem.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
|
|
Add a new inode operation i_op->open(). This is for stacked
filesystems that want to return a struct file from a different
filesystem.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
|
|
Make __dentry_open() take a struct path instead of separate vfsmount and dentry
arguments.
Change semantics as well, so that __dentry_open() acquires a reference to path
instead of transferring it to the open file.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
|
|
When we enable the zconfdump() debugging we see assertion failures
attempting to print the config. Convert this into a noop.
Signed-off-by: Andy Whitcroft <apw@canonical.com>
|
|
BugLink: http://bugs.launchpad.net/bugs/984288
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Herton Krzesinski <herton.krzesinski@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
|
BugLink: http://bugs.launchpad.net/bugs/977246
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Acked-by: Leann Ogasawara <leann.ogasawara@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
|
BugLink: http://bugs.launchpad.net/bugs/977246
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Acked-by: Leann Ogasawara <leann.ogasawara@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
|
Temporarily reverts commit 81e0e2103035c9fc806757ddfa859e66c1b23c32.
Repeated Oops/Panic on boot. Needs re-work after v3.4-rc2 rebase.
Signed-off-by: Leann Ogasawara <leann.ogasawara@canonical.com>
|
|
BugLink: http://bugs.launchpad.net/bugs/972355
We have been seeing increasing reports of scarey ioctl messages in
dmesg, such as the below often in bulk:
mdadm: sending ioctl 1261 to a partition!
mdadm: sending ioctl 800c0910 to a partition!
Looking at the upstream discussions these are all benign and can be safely
suppressed. This patch is based on some discussions at the link below,
on some work SUSE did in this area. This is not suitable for upstreaming
as we need some refactoring to fix the 32bit compat ioctl mess.
Link: http://www.spinics.net/lists/raid/msg37770.html
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
|
If the host returns error for pass through commands, deal with
appropriately. I would like to thank James for patiently helping
me with this patch.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Receieved directly from the upstream maintainer. This is the current
state of the art for this patch, though discussion continues.
Signed-off-by: Andy Whitcroft <apw@canonical.com>
|
|
BugLink: http://bugs.launchpad.net/bugs/638939
cherry picked from drm-intel-next-queue
This reverts commmit d4b74bf07873da2e94219a7b67a334fc1c3ce649 which
reverted the origin fix fb8b5a39b6310379d7b54c0c7113703a8eaf4a57.
We have at least 3 different bug reports that this fixes things and no
indication what is exactly wrong with this. So try again.
To make matters slightly more fun, the commit itself was cc: stable
whereas the revert has not been.
According to Peter Clifton he discussed this with Zhao Yakui and this
seems to be in contradiction of the GM45 PRM, but rumours have it that
this is how the BIOS does it ... let's see.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Tested-by: Peter Clifton <Peter.Clifton@clifton-electronics.com>
Cc: Zhao Yakui <yakui.zhao@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Eric Anholt <eric@anholt.net>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=16236
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=25913
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=14792
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
|
Violations of seccomp filters should always be reported, regardless
of audit context. This the minimal change version of what has been
proposed upstream: https://lkml.org/lkml/2012/3/23/332
Signed-off-by: Kees Cook <kees@ubuntu.com>
Signed-off-by: Leann Ogasawara <leann.ogasawara@canonical.com>
|
|
BugLink: http://bugs.launchpad.net/bugs/969309
OK. Then, I think we also want to fix these warnings probably introduced by
commit a6021559 "UBUNTU: SAUCE: (no-up) Modularize vesafb".
WARNING: drivers/video/vesafb.o(.exit.text+0x42): Section mismatch in reference from the function vesafb_remove() to the (unknown reference) .init.data:(unknown)
The function __exit vesafb_remove() references
a (unknown reference) __initdata (unknown).
This is often seen when error handling in the exit function
uses functionality in the init path.
The fix is often to remove the __initdata annotation of
(unknown) so it may be used outside an init section.
WARNING: drivers/video/vesafb.o(.exit.text+0x4a): Section mismatch in reference from the function vesafb_remove() to the variable .init.data:vesafb_fix
The function __exit vesafb_remove() references
a variable __initdata vesafb_fix.
This is often seen when error handling in the exit function
uses functionality in the init path.
The fix is often to remove the __initdata annotation of
vesafb_fix so it may be used outside an init section.
Reported-by: Tetsuo Honda <from-ubuntu@I-love.SAKURA.ne.jp>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
|
it is unsupported
Submitted upstream.
BugLink: http://bugs.launchpad.net/bugs/962038
Right now using pcie_aspm=force will not enable ASPM if the FADT indicates
ASPM is unsupported. However, the semantics of force should probably allow
for this, especially as they did before the ASPM disable rework with commit
3c076351c4027a56d5005a39a0b518a4ba393ce2
This patch just skips the clearing of any ASPM setup that the firmware has
carried out on this bus if pcie_aspm=force is being used.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
|
changing the brightness on AC/battery status changes.
BugLink: http://bugs.launchpad.net/bugs/949311
We currently carry a SAUCE patch which lets the OS handle the brightness
levels automatically when connecting/disconnecting AC. There are some
laptops (MSI Wind) for which this doesn't work. Provide a driver param
which allows this behaviour to be overriden.
Signed-off-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Colin King <colin.king@canonical.com>
Signed-off-by: Leann Ogasawara <leann.ogasawara@canonical.com>
|
|
Fix build failure in aufs introduced by
commit 9cd98c046b57cd1bdbd53c3669f6cdd75edffd61
which has been backported from 3.4 as part of the AppArmor 3.4 backport
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
|
Base support for network mediation.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
|
Add the dynamic profiles file to the interace, to allow load policy
introspection.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Kees Cook <kees@ubuntu.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
|
Add the ability for apparmor to do mediation of mount operations. Mount
rules require an updated apparmor_parser (2.8 series) for policy compilation.
The basic form of the rules are.
[audit] [deny] mount [conds]* [device] [ -> [conds] path],
[audit] [deny] remount [conds]* [path],
[audit] [deny] umount [conds]* [path],
[audit] [deny] pivotroot [oldroot=<value>] <path>
remount is just a short cut for mount options=remount
where [conds] can be
fstype=<expr>
options=<expr>
Example mount commands
mount, # allow all mounts, but not umount or pivotroot
mount fstype=procfs, # allow mounting procfs anywhere
mount options=(bind, ro) /foo -> /bar, # readonly bind mount
mount /dev/sda -> /mnt,
mount /dev/sd** -> /mnt/**,
mount fstype=overlayfs options=(rw,upperdir=/tmp/upper/,lowerdir=/) -> /mnt/
umount,
umount /m*,
See the apparmor userspace for full documentation
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Kees Cook <kees@ubuntu.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
|
Documents how system call filtering using Berkeley Packet
Filter programs works and how it may be used.
Includes an example for x86 and a semi-generic
example using a macro-based code generator.
v14: - rebase/nochanges
v13: - rebase on to 88ebdda6159ffc15699f204c33feb3e431bf9bdc
v12: - comment on the ptrace_event use
- update arch support comment
- note the behavior of SECCOMP_RET_DATA when there are multiple filters
(keescook@chromium.org)
- lots of samples/ clean up incl 64-bit bpf-direct support
(markus@chromium.org)
- rebase to linux-next
v11: - overhaul return value language, updates (keescook@chromium.org)
- comment on do_exit(SIGSYS)
v10: - update for SIGSYS
- update for new seccomp_data layout
- update for ptrace option use
v9: - updated bpf-direct.c for SIGILL
v8: - add PR_SET_NO_NEW_PRIVS to the samples.
v7: - updated for all the new stuff in v7: TRAP, TRACE
- only talk about PR_SET_SECCOMP now
- fixed bad JLE32 check (coreyb@linux.vnet.ibm.com)
- adds dropper.c: a simple system call disabler
v6: - tweak the language to note the requirement of
PR_SET_NO_NEW_PRIVS being called prior to use. (luto@mit.edu)
v5: - update sample to use system call arguments
- adds a "fancy" example using a macro-based generator
- cleaned up bpf in the sample
- update docs to mention arguments
- fix prctl value (eparis@redhat.com)
- language cleanup (rdunlap@xenotime.net)
v4: - update for no_new_privs use
- minor tweaks
v3: - call out BPF <-> Berkeley Packet Filter (rdunlap@xenotime.net)
- document use of tentative always-unprivileged
- guard sample compilation for i386 and x86_64
v2: - move code to samples (corbet@lwn.net)
Signed-off-by: Will Drewry <wad@chromium.org>
Signed-off-by: Kees Cook <kees@ubuntu.com>
|
|
Enable support for seccomp filter on x86:
- asm/tracehook.h exists
- syscall_get_arguments() works
- syscall_rollback() works
- ptrace_report_syscall() works
- secure_computing() return value is honored (see below)
This also adds support for honoring the return
value from secure_computing().
SECCOMP_RET_TRACE and SECCOMP_RET_TRAP may result in seccomp needing to
skip a system call without killing the process. This is done by
returning a non-zero (-1) value from secure_computing. This change
makes x86 respect that return value.
To ensure that minimal kernel code is exposed, a non-zero return value
results in an immediate return to user space (with an invalid syscall
number).
Signed-off-by: Will Drewry <wad@chromium.org>
Signed-off-by: Kees Cook <kees@ubuntu.com>
|
|
This change adds support for a new ptrace option, PTRACE_O_TRACESECCOMP,
and a new return value for seccomp BPF programs, SECCOMP_RET_TRACE.
When a tracer specifies the PTRACE_O_TRACESECCOMP ptrace option, the
tracer will be notified, via PTRACE_EVENT_SECCOMP, for any syscall that
results in a BPF program returning SECCOMP_RET_TRACE. The 16-bit
SECCOMP_RET_DATA mask of the BPF program return value will be passed as
the ptrace_message and may be retrieved using PTRACE_GETEVENTMSG.
If the subordinate process is not using seccomp filter, then no
system call notifications will occur even if the option is specified.
If there is no tracer with PTRACE_O_TRACESECCOMP when SECCOMP_RET_TRACE
is returned, the system call will not be executed and an -ENOSYS errno
will be returned to userspace.
This change adds a dependency on the system call slow path. Any future
efforts to use the system call fast path for seccomp filter will need to
address this restriction.
v16: - update PT_TRACE_MASK to 0xbf4 so that STOP isn't clear on SETOPTIONS call (indan@nul.nu)
[note PT_TRACE_MASK disappears in linux-next]
v15: - add audit support for non-zero return codes
- clean up style (indan@nul.nu)
v14: - rebase/nochanges
v13: - rebase on to 88ebdda6159ffc15699f204c33feb3e431bf9bdc
(Brings back a change to ptrace.c and the masks.)
v12: - rebase to linux-next
- use ptrace_event and update arch/Kconfig to mention slow-path dependency
- drop all tracehook changes and inclusion (oleg@redhat.com)
v11: - invert the logic to just make it a PTRACE_SYSCALL accelerator
(indan@nul.nu)
v10: - moved to PTRACE_O_SECCOMP / PT_TRACE_SECCOMP
v9: - n/a
v8: - guarded PTRACE_SECCOMP use with an ifdef
v7: - introduced
Signed-off-by: Will Drewry <wad@chromium.org>
Signed-off-by: Kees Cook <kees@ubuntu.com>
|
|
Adds a new return value to seccomp filters that triggers a SIGSYS to be
delivered with the new SYS_SECCOMP si_code.
This allows in-process system call emulation, including just specifying
an errno or cleanly dumping core, rather than just dying.
v15: - use audit_seccomp/skip
- pad out error spacing; clean up switch (indan@nul.nu)
v14: - n/a
v13: - rebase on to 88ebdda6159ffc15699f204c33feb3e431bf9bdc
v12: - rebase on to linux-next
v11: - clarify the comment (indan@nul.nu)
- s/sigtrap/sigsys
v10: - use SIGSYS, syscall_get_arch, updates arch/Kconfig
note suggested-by (though original suggestion had other behaviors)
v9: - changes to SIGILL
v8: - clean up based on changes to dependent patches
v7: - introduction
Suggested-by: Markus Gutschke <markus@chromium.org>
Suggested-by: Julien Tinnes <jln@chromium.org>
Signed-off-by: Will Drewry <wad@chromium.org>
Signed-off-by: Kees Cook <kees@ubuntu.com>
|
|
This change enables SIGSYS, defines _sigfields._sigsys, and adds
x86 (compat) arch support. _sigsys defines fields which allow
a signal handler to receive the triggering system call number,
the relevant AUDIT_ARCH_* value for that number, and the address
of the callsite.
SIGSYS is added to the SYNCHRONOUS_MASK because it is desirable for it
to have setup_frame() called for it. The goal is to ensure that
ucontext_t reflects the machine state from the time-of-syscall and not
from another signal handler.
The first consumer of SIGSYS would be seccomp filter. In particular,
a filter program could specify a new return value, SECCOMP_RET_TRAP,
which would result in the system call being denied and the calling
thread signaled. This also means that implementing arch-specific
support can be dependent upon HAVE_ARCH_SECCOMP_FILTER.
v14: - rebase/nochanges
v13: - rebase on to 88ebdda6159ffc15699f204c33feb3e431bf9bdc
v12: - reworded changelog (oleg@redhat.com)
v11: - fix dropped words in the change description
- added fallback copy_siginfo support.
- added __ARCH_SIGSYS define to allow stepped arch support.
v10: - first version based on suggestion
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Suggested-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Will Drewry <wad@chromium.org>
Signed-off-by: Kees Cook <kees@ubuntu.com>
|
|
This change adds the SECCOMP_RET_ERRNO as a valid return value from a
seccomp filter. Additionally, it makes the first use of the lower
16-bits for storing a filter-supplied errno. 16-bits is more than
enough for the errno-base.h calls.
Returning errors instead of immediately terminating processes that
violate seccomp policy allow for broader use of this functionality
for kernel attack surface reduction. For example, a linux container
could maintain a whitelist of pre-existing system calls but drop
all new ones with errnos. This would keep a logically static attack
surface while providing errnos that may allow for graceful failure
without the downside of do_exit() on a bad call.
v15: - use audit_seccomp and add a skip label. (eparis@redhat.com)
- clean up and pad out return codes (indan@nul.nu)
v14: - no change/rebase
v13: - rebase on to 88ebdda6159ffc15699f204c33feb3e431bf9bdc
v12: - move to WARN_ON if filter is NULL
(oleg@redhat.com, luto@mit.edu, keescook@chromium.org)
- return immediately for filter==NULL (keescook@chromium.org)
- change evaluation to only compare the ACTION so that layered
errnos don't result in the lowest one being returned.
(keeschook@chromium.org)
v11: - check for NULL filter (keescook@chromium.org)
v10: - change loaders to fn
v9: - n/a
v8: - update Kconfig to note new need for syscall_set_return_value.
- reordered such that TRAP behavior follows on later.
- made the for loop a little less indent-y
v7: - introduced
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Will Drewry <wad@chromium.org>
Signed-off-by: Kees Cook <kees@ubuntu.com>
|
|
This consolidates the seccomp filter error logging path and adds more
details to the audit log.
v15: added a return code to the audit_seccomp path by wad@chromium.org
(suggested by eparis@redhat.com)
v*: original by keescook@chromium.org
Signed-off-by: Will Drewry <wad@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <kees@ubuntu.com>
|
|
[This patch depends on luto@mit.edu's no_new_privs patch:
https://lkml.org/lkml/2012/1/30/264
The whole series including Andrew's patches can be found here:
https://github.com/redpig/linux/tree/seccomp
Complete diff here:
https://github.com/redpig/linux/compare/1dc65fed...seccomp
A GPG signed tag 'seccomp/v14/posted' will be pushed shortly.
]
This patch adds support for seccomp mode 2. Mode 2 introduces the
ability for unprivileged processes to install system call filtering
policy expressed in terms of a Berkeley Packet Filter (BPF) program.
This program will be evaluated in the kernel for each system call
the task makes and computes a result based on data in the format
of struct seccomp_data.
A filter program may be installed by calling:
struct sock_fprog fprog = { ... };
...
prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &fprog);
The return value of the filter program determines if the system call is
allowed to proceed or denied. If the first filter program installed
allows prctl(2) calls, then the above call may be made repeatedly
by a task to further reduce its access to the kernel. All attached
programs must be evaluated before a system call will be allowed to
proceed.
Filter programs will be inherited across fork/clone and execve.
However, if the task attaching the filter is unprivileged
(!CAP_SYS_ADMIN) the no_new_privs bit will be set on the task. This
ensures that unprivileged tasks cannot attach filters that affect
privileged tasks (e.g., setuid binary).
There are a number of benefits to this approach. A few of which are
as follows:
- BPF has been exposed to userland for a long time
- BPF optimization (and JIT'ing) are well understood
- Userland already knows its ABI: system call numbers and desired
arguments
- No time-of-check-time-of-use vulnerable data accesses are possible.
- system call arguments are loaded on access only to minimize copying
required for system call policy decisions.
Mode 2 support is restricted to architectures that enable
HAVE_ARCH_SECCOMP_FILTER. In this patch, the primary dependency is on
syscall_get_arguments(). The full desired scope of this feature will
add a few minor additional requirements expressed later in this series.
Based on discussion, SECCOMP_RET_ERRNO and SECCOMP_RET_TRACE seem to be
the desired additional functionality.
No architectures are enabled in this patch.
v15: - add a 4 instr penalty when counting a path to account for seccomp_filter
size (indan@nul.nu)
- drop the max insns to 256KB (indan@nul.nu)
- return ENOMEM if the max insns limit has been hit (indan@nul.nu)
- move IP checks after args (indan@nul.nu)
- drop !user_filter check (indan@nul.nu)
- only allow explicit bpf codes (indan@nul.nu)
- exit_code -> exit_sig
v14: - put/get_seccomp_filter takes struct task_struct
(indan@nul.nu,keescook@chromium.org)
- adds seccomp_chk_filter and drops general bpf_run/chk_filter user
- add seccomp_bpf_load for use by net/core/filter.c
- lower max per-process/per-hierarchy: 1MB
- moved nnp/capability check prior to allocation
(all of the above: indan@nul.nu)
v13: - rebase on to 88ebdda6159ffc15699f204c33feb3e431bf9bdc
v12: - added a maximum instruction count per path (indan@nul.nu,oleg@redhat.com)
- removed copy_seccomp (keescook@chromium.org,indan@nul.nu)
- reworded the prctl_set_seccomp comment (indan@nul.nu)
v11: - reorder struct seccomp_data to allow future args expansion (hpa@zytor.com)
- style clean up, @compat dropped, compat_sock_fprog32 (indan@nul.nu)
- do_exit(SIGSYS) (keescook@chromium.org, luto@mit.edu)
- pare down Kconfig doc reference.
- extra comment clean up
v10: - seccomp_data has changed again to be more aesthetically pleasing
(hpa@zytor.com)
- calling convention is noted in a new u32 field using syscall_get_arch.
This allows for cross-calling convention tasks to use seccomp filters.
(hpa@zytor.com)
- lots of clean up (thanks, Indan!)
v9: - n/a
v8: - use bpf_chk_filter, bpf_run_filter. update load_fns
- Lots of fixes courtesy of indan@nul.nu:
-- fix up load behavior, compat fixups, and merge alloc code,
-- renamed pc and dropped __packed, use bool compat.
-- Added a hidden CONFIG_SECCOMP_FILTER to synthesize non-arch
dependencies
v7: (massive overhaul thanks to Indan, others)
- added CONFIG_HAVE_ARCH_SECCOMP_FILTER
- merged into seccomp.c
- minimal seccomp_filter.h
- no config option (part of seccomp)
- no new prctl
- doesn't break seccomp on systems without asm/syscall.h
(works but arg access always fails)
- dropped seccomp_init_task, extra free functions, ...
- dropped the no-asm/syscall.h code paths
- merges with network sk_run_filter and sk_chk_filter
v6: - fix memory leak on attach compat check failure
- require no_new_privs || CAP_SYS_ADMIN prior to filter
installation. (luto@mit.edu)
- s/seccomp_struct_/seccomp_/ for macros/functions (amwang@redhat.com)
- cleaned up Kconfig (amwang@redhat.com)
- on block, note if the call was compat (so the # means something)
v5: - uses syscall_get_arguments
(indan@nul.nu,oleg@redhat.com, mcgrathr@chromium.org)
- uses union-based arg storage with hi/lo struct to
handle endianness. Compromises between the two alternate
proposals to minimize extra arg shuffling and account for
endianness assuming userspace uses offsetof().
(mcgrathr@chromium.org, indan@nul.nu)
- update Kconfig description
- add include/seccomp_filter.h and add its installation
- (naive) on-demand syscall argument loading
- drop seccomp_t (eparis@redhat.com)
v4: - adjusted prctl to make room for PR_[SG]ET_NO_NEW_PRIVS
- now uses current->no_new_privs
(luto@mit.edu,torvalds@linux-foundation.com)
- assign names to seccomp modes (rdunlap@xenotime.net)
- fix style issues (rdunlap@xenotime.net)
- reworded Kconfig entry (rdunlap@xenotime.net)
v3: - macros to inline (oleg@redhat.com)
- init_task behavior fixed (oleg@redhat.com)
- drop creator entry and extra NULL check (oleg@redhat.com)
- alloc returns -EINVAL on bad sizing (serge.hallyn@canonical.com)
- adds tentative use of "always_unprivileged" as per
torvalds@linux-foundation.org and luto@mit.edu
v2: - (patch 2 only)
Reviewed-by: Indan Zupancic <indan@nul.nu>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Will Drewry <wad@chromium.org>
Signed-off-by: Kees Cook <kees@ubuntu.com>
|
|
Adds a stub for a function that will return the AUDIT_ARCH_*
value appropriate to the supplied task based on the system
call convention.
For audit's use, the value can generally be hard-coded at the
audit-site. However, for other functionality not inlined into
syscall entry/exit, this makes that information available.
seccomp_filter is the first planned consumer and, as such,
the comment indicates a tie to HAVE_ARCH_SECCOMP_FILTER. That
is probably an unneeded detail.
v14: rebase/nochanges
v13: rebase on to 88ebdda6159ffc15699f204c33feb3e431bf9bdc
v12: rebase on to linux-next
v11: fixed improper return type
v10: introduced
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Suggested-by: Roland McGrath <mcgrathr@chromium.org>
Signed-off-by: Will Drewry <wad@chromium.org>
Signed-off-by: Kees Cook <kees@ubuntu.com>
|
|
Add syscall_get_arch() to export the current AUDIT_ARCH_* based on system call
entry path.
v14: rebase/nochanges
v13: rebase on to 88ebdda6159ffc15699f204c33feb3e431bf9bdc
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Will Drewry <wad@chromium.org>
Signed-off-by: Kees Cook <kees@ubuntu.com>
|
|
Replaces the seccomp_t typedef with struct seccomp to match modern
kernel style.
v14: rebase/nochanges
v13: rebase on to 88ebdda6159ffc15699f204c33feb3e431bf9bdc
v12: rebase on to linux-next
v8-v11: no changes
v7: struct seccomp_struct -> struct seccomp
v6: original inclusion in this series.
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Reviewed-by: James Morris <jmorris@namei.org>
Signed-off-by: Will Drewry <wad@chromium.org>
Signed-off-by: Kees Cook <kees@ubuntu.com>
|
|
Any other users of bpf_*_filter that take a struct sock_fprog from
userspace will need to be able to also accept a compat_sock_fprog
if the arch supports compat calls. This change let's the existing
compat_sock_fprog be shared.
v14: rebase/nochanges
v13: rebase on to 88ebdda6159ffc15699f204c33feb3e431bf9bdc
v12: rebase on to linux-next
v11: introduction
Signed-off-by: Will Drewry <wad@chromium.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Kees Cook <kees@ubuntu.com>
|
|
Introduces a new BPF ancillary instruction that all LD calls will be
mapped through when skb_run_filter() is being used for seccomp BPF. The
rewriting will be done using a secondary chk_filter function that is run
after skb_chk_filter.
The code change is guarded by CONFIG_SECCOMP_FILTER which is added,
along with the seccomp_bpf_load() function later in this series.
This is based on http://lkml.org/lkml/2012/3/2/141
v15: include seccomp.h explicitly for when seccomp_bpf_load exists.
v14: First cut using a single additional instruction
... v13: made bpf functions generic.
Suggested-by: Indan Zupancic <indan@nul.nu>
Signed-off-by: Will Drewry <wad@chromium.org>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Kees Cook <kees@ubuntu.com>
|
|
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Kees Cook <kees@ubuntu.com>
|
|
granting privs
With this set, a lot of dangerous operations (chroot, unshare, etc)
become a lot less dangerous because there is no possibility of
subverting privileged binaries.
This patch completely breaks apparmor. Someone who understands (and
uses) apparmor should fix it or at least give me a hint.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Kees Cook <kees@ubuntu.com>
|
|
Build failure:
ubuntu/aufs/i_op.c:701:8: error: too many arguments to function 'security_path_chmod'
Signed-off-by: Leann Ogasawara <leann.ogasawara@canonical.com>
|
|
BugLink: http://bugs.launchpad.net/bugs/943119
https://lists.ubuntu.com/archives/ubuntu-devel/2012-March/034869.html
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
|
Signed-off-by: Chase Douglas <chase.douglas@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Andy Whitcroft <andy.whitcroft@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
|
This is necessary for clickpad detection of Synaptics trackpads in Dell
Mini 10 series of laptops.
Signed-off-by: Chase Douglas <chase.douglas@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Andy Whitcroft <andy.whitcroft@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
|
|
Andy Whitcroft (1):
UBUNTU: ubuntu: AUFS -- suppress benign plink warning messages
J. R. Okajima (10):
aufs: headers 1/2, bugfix, where the pr_fmt macro definition
aufs: headers 2/2, simply refined
aufs: tiny, update the year
aufs: update the donator
aufs stdalone: include path in Makefile
aufs: tiny, update the year
aufs: tiny, remove a duplicated header by accident
aufs: tiny, restore the removed header files for 2.6.38
make aufs-version 3.2
aufs3.2 20120109
Signed-off-by: Andy Whitcroft <apw@canonical.com>
|