For Parts 2 through 5, please follow these submission criteria:

  1. Submit a ZIP file which contains all your code.
  2. The ZIP file should contain one directory for each exercise, named: part2, part3, part4, and part5.
  3. Each directory should contain the kernel module source and the Makefile required to build it. Please name the kernel modules themselves p2mod.c, p3mod.c, p4mod.c, and p5mod.c.
  4. Your kernel modules must be able to compile against the Linux 3.13 kernel (you may use a more recent kernel, but please do not use one which is older). You can check your kernel version by running uname -r in a terminal.
  5. Include a text file in the root directory of your ZIP file which includes your answers to the theoretical/written questions.

You may use the Ubuntu 14.04 LTS virtual machine image which has been provided for the course here. Ubuntu 14.04 LTS (Trusty Tahr) ships with Linux 3.13, but can optionally be upgraded to Linux 4.2 (which ships with Ubuntu 15.10, Wily Werewolf) by running the following commands:

sudo apt-get update
sudo apt-get dist-upgrade
sudo apt-get install linux-generic-lts-wily linux-image-generic-lts-wily linux-headers-generic-lts-wily

Alternatively, you may install your own operating system in a virtual machine. There are many tutorials online which can guide you through the process. The standard Ubuntu distribution may run slowly on systems with limited resources, but Lubuntu should work comfortably with 512MB of RAM allocated to the virtual machine (assuming that you will only use it for compiling and testing your kernel modules). If you prefer an even lighter-weight option and are comfortable working entirely in the shell, you may install an Ubuntu Server image, which comes without a GUI.

As you have learned from Chapter 1, the kernel is the central component of an operating system. The kernel runs in privileged mode, and is responsible for mediating access to the computer's resources such as the CPU, RAM, and I/O devices. When a user-mode program wishes to request a service from the kernel, it performs a system call. A system call is initiated by a software interrupt, which transfers execution from the user-mode program to the kernel. However, switching between user mode and kernel mode incurs an overhead cost. This is why performance-critical drivers (e.g., for your file system) tend to be written in kernel mode, as opposed to user mode - loading your code directly into the kernel eliminates the need to repeatedly switch between user and kernel modes each time a service is required from the kernel. However, this approach does come with a cost in stability: A buggy user-mode driver can cause the driver to crash (but not the system), whereas a buggy kernel-mode driver can quite easily cause the kernel to crash and thus necessitate rebooting the system.
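To make the idea of a system call concrete, here is a minimal user-mode sketch that asks the kernel for the process ID in two ways: through the generic syscall() entry point and through the usual C library wrapper. It assumes Linux with glibc. The function names pid_via_syscall and pid_via_libc are made up for this illustration.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE         /* exposes syscall() in <unistd.h> on glibc */
#endif
#include <assert.h>
#include <sys/syscall.h>        /* SYS_getpid */
#include <sys/types.h>          /* pid_t */
#include <unistd.h>             /* syscall(), getpid() */

/* Request the PID through the generic syscall() entry point. */
pid_t pid_via_syscall(void)
{
        long rc = syscall(SYS_getpid);

        assert(rc > 0);         /* getpid has no failure mode */
        return (pid_t)rc;
}

/* Request the same value through the ordinary libc wrapper. */
pid_t pid_via_libc(void)
{
        return getpid();
}
```

Both functions should return the same value. Either way, the request crosses from user mode into the kernel and back, which is the overhead discussed above.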

That being said, for this portion of the assignment we will be writing some kernel-mode code. Seemingly minor mistakes can have dramatic consequences, including crashing your computer and forcing a reboot, which may result in data loss. To avoid such scenarios, it is strongly recommended that you do these exercises in a virtual machine.

Bugs in your kernel modules may cause your system to crash, and consequences may include data loss or data corruption. It is strongly recommended that you complete these exercises in a virtual machine.

In the early days of computing, any kernel-mode code had to be compiled directly into the kernel image that shipped with the operating system. However, modern kernels (including the Linux and Windows kernels) are modular kernels, which means that they allow kernel modules (e.g., device drivers and file system drivers) to be added to the kernel at runtime. Some advantages of this approach are that (1) we do not have to recompile the entire kernel to add functionality, (2) the base kernel image can be kept small, and (3) a system will only load the kernel modules that it needs, thereby saving memory.

In the spirit of programming tradition, we will begin by writing a simple "Hello, world!" kernel module. Assuming that you are using a Debian-based Linux distribution (such as Ubuntu), you must first install some dependencies as follows:

sudo apt-get install build-essential linux-headers-$(uname -r)

This will install the tools necessary to build the module, including the headers for the Linux kernel which is currently installed on your system. If you update your kernel down the road, you will want to also install the newer headers. A kernel module is always compiled against a particular kernel version, which is why we need the header files.

Create a new directory for this exercise, and copy the following code into a new file named hello.c:

/*
 * Hello world module
 */

#include <linux/module.h> // Must be included in all kernel modules
#include <linux/init.h> // Needed for specifying the initialization and cleanup functions
#include <linux/kernel.h> // Needed for printk() and the KERN_* log levels

MODULE_LICENSE("GPL"); // If your module isn't GPL-licensed, it will "taint" the kernel
MODULE_AUTHOR("Your Name");
MODULE_DESCRIPTION("A Hello World module.");

static int __init hello_init(void)
{
        printk(KERN_INFO "Hello world!\n");
        return 0;
}

static void __exit hello_exit(void)
{
        printk(KERN_INFO "Exiting hello world module.\n");
}

module_init(hello_init);
module_exit(hello_exit);

Every kernel module needs an initialization function which runs when the module is loaded, and a cleanup function which runs before it is unloaded. The module_init() and module_exit() macros allow you to specify which functions will serve as your initialization and cleanup functions, respectively.

In the same directory, create a Makefile with the following contents (if you copy/paste the script below, make sure that the tab characters are preserved, otherwise it will not work):

obj-m := hello.o
KDIR := /lib/modules/$(shell uname -r)/build
PWD := $(shell pwd)
all:
	$(MAKE) -C $(KDIR) M=$(PWD) modules
clean:
	$(MAKE) -C $(KDIR) M=$(PWD) clean

When you are done, run make. This will create a hello.ko file, which is your kernel module. You may load the module into your kernel by running sudo insmod hello.ko, and unload it by running sudo rmmod hello.ko.

The printk() statements can be viewed in your kernel log, which can be accessed via dmesg. Running dmesg | less will allow you to scroll through the entire contents of your kernel log, and dmesg | tail will just display the last few lines.

We used the KERN_INFO log level, which is used for informational purposes. There are higher log levels like KERN_ALERT which are used to report serious malfunctions, or lower levels such as KERN_DEBUG which are used for debugging purposes.

More information on log levels:

An overview of the different log levels used by the Linux kernel can be found here. Notice that printk(KERN_INFO "Hello world!\n"); (with no comma after KERN_INFO) is equivalent to pr_info("Hello world!\n");. The table lists similar short-hand alias functions for the remaining log levels as well.
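The reason no comma separates KERN_INFO from the message is that each log-level macro expands to a short string literal, which the compiler concatenates with the format string that follows it. The sketch below mimics that layout in ordinary user-mode C. The DEMO_ macros and demo_ functions are hypothetical stand-ins, not kernel definitions. The digits follow the kernel's numbering, in which a smaller number means a more severe message (KERN_ALERT is 1, KERN_INFO is 6, KERN_DEBUG is 7).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins that mirror the layout of the kernel's KERN_* macros. */
#define DEMO_SOH   "\001"               /* start-of-header marker byte */
#define DEMO_ALERT DEMO_SOH "1"
#define DEMO_INFO  DEMO_SOH "6"
#define DEMO_DEBUG DEMO_SOH "7"

/*
 * Return the numeric level encoded at the start of msg, or -1 if the
 * message carries no level prefix.
 */
int demo_level(const char *msg)
{
        assert(msg != NULL);
        if (msg[0] == '\001' && msg[1] >= '0' && msg[1] <= '7')
                return msg[1] - '0';
        return -1;
}

/* Return the text that follows the level prefix, if there is one. */
const char *demo_text(const char *msg)
{
        return demo_level(msg) >= 0 ? msg + 2 : msg;
}
```

Writing DEMO_INFO "Hello world!\n" therefore produces a single string whose first two bytes carry the level. A comma would instead turn the level into a separate argument.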

A kernel panic is an internal error from which a kernel is unable to recover. This typically results in the OS dumping the contents of kernel memory for later debugging, and presenting the user with an error message (such as the infamous Blue Screen of Death on Windows). However, not all errors in the kernel result in a kernel panic. A Linux "kernel oops" occurs when the kernel enters an unexpected state. The kernel will then attempt to recover, but if it is unable to do so it will panic. However, even if the kernel does recover from the oops, you should never trust it - it is best to simply reboot your operating system.

For this exercise, modify the sample kernel module from Part 1 to generate a kernel oops.

Hint:

What's the easiest way to crash a standard user-mode program written in C or C++? Try that in your kernel module and see what happens.

For full marks, provide:

  1. Your code (3 marks).
  2. "Proof" that the kernel oops happened (based on what we've discussed so far, you should be able to figure out how to obtain convincing proof, and that is part of the exercise - make sure that you specify how you obtained said proof) (3 marks).

The Linux kernel maintains a process descriptor for each process currently running on the system. Each descriptor is stored as a task_struct, as defined here in linux/sched.h.

There are almost 500 lines in the structure's definition, but you will only need some of them.

For this exercise, you will extend your kernel module from Part 1 to print the following information to the kernel log:

  1. The name and PID of the current process (2 marks).
  2. The name and PID of the parent process (4 marks).

Hint #1:

You will need to start by adding the following two lines:

#include <linux/sched.h> // As explained above, this is where the task_struct structure is defined
#include <asm/current.h>

Including asm/current.h will give you access to a task_struct pointer called current, which points to the process descriptor of the current process. The current process in this context is the process which requested the kernel to perform some operation (e.g., via a system call).
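When checking what your module printed, it can help to know what the same values look like from user mode. The following sketch is a user-mode program fragment, not kernel code. It collects the calling process's name, PID, and parent PID through existing system calls. It assumes Linux with glibc, and the demo_ names are hypothetical. PR_GET_NAME returns the name that the kernel stores for the calling thread.

```c
#include <assert.h>
#include <sys/prctl.h>          /* prctl(), PR_GET_NAME */
#include <sys/types.h>          /* pid_t */
#include <unistd.h>             /* getpid(), getppid() */

#define DEMO_NAME_LEN 16        /* PR_GET_NAME needs a buffer of at least 16 bytes */

struct demo_ident {
        char  name[DEMO_NAME_LEN];
        pid_t pid;
        pid_t parent_pid;
};

/* Fill in the caller's name, PID, and parent PID. Returns 0 on success. */
int demo_identify_self(struct demo_ident *out)
{
        assert(out != NULL);
        if (prctl(PR_GET_NAME, (unsigned long)out->name) != 0)
                return -1;
        out->pid = getpid();
        out->parent_pid = getppid();
        return 0;
}
```

Each of these calls makes the calling program the current process for the duration of the call, so the kernel answers from that program's own process descriptor.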

Hint #2:

You may notice that I am providing links to the Linux Cross-Reference - you may find this tool very helpful, as it is a convenient way to browse the Linux kernel source code.

For this exercise, extend your kernel module from Part 3 to print the UID and GID of the current process (4 marks).

The task_struct structure contains a cred structure that stores information about the privileges which the current process holds. You could try to access that information directly from the current task_struct, or use a couple of function calls provided by linux/cred.h.

Hint:

The linux/cred.h source file here is relatively short, so you should be able to read it and find the functions that you need. You will need to include this header file in your program whether you decide to use the functions (since this is where the functions are defined) or to access the cred structure directly (since this is also where the structure is defined).

Warning:

The uid and gid are stored as kuid_t and kgid_t structures, which each contain an unsigned int. You will get a compiler warning if you try to print the structure instead of the unsigned integer stored within it. The relevant structures are defined here, and the code should give you a good idea about how to access the value inside the structure. Failing to do this properly will result in a mark deduction (even though the code will compile and run despite the warning).
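The general pattern behind this warning can be shown in ordinary C: wrapping an integer in a single-member structure stops it from being silently mixed with plain numbers, but it also means you must unwrap it before handing it to a %u format. The sketch below uses a hypothetical demo_id_t type and helper names. It is not the kernel's definition, only an illustration of the idea.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical wrapper type: a struct with a single unsigned int member. */
typedef struct {
        unsigned int val;
} demo_id_t;

/* Unwrap the integer so it can be used where a plain number is expected. */
unsigned int demo_id_value(demo_id_t id)
{
        return id.val;
}

/* Format "UID: <n>" into buf and return the number of characters written. */
int demo_format_id(char *buf, size_t len, demo_id_t id)
{
        assert(buf != NULL);
        /* Passing id itself to %u would be a type mismatch; pass the integer. */
        return snprintf(buf, len, "UID: %u", demo_id_value(id));
}
```

The compiler warning mentioned above is the type system pointing out exactly this mismatch between a structure and the integer that the format string expects.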

When you load your kernel module, you will notice that the owner UID and GID are 0 (i.e., root). Explain why this is the case, despite the fact that you are not logged in as the root user (2 marks), and also explain if and why you would expect this kernel module to output any UID or GID other than root under any circumstances (2 marks).

For this exercise, extend your kernel module from Part 3 (or 4) to walk the process tree, starting from the current process, until you reach the process with pid 0 (8 marks).

Your kernel module's log output should look something like this:

Current process name: x, PID: y
Parent (level 1) process name: x1, PID: y1
Parent (level 2) process name: x2, PID: y2
...
Parent (level n) process name: xn, PID: yn

What is the name of the process with pid 0? What about pid 1? In Operating Systems terminology, explain the purpose of these processes (hint: see your course slides) (2 marks).

Hint #1:

You will have to write a loop to get the current process's parent, and then the parent's parent, etc.

Hint #2:

In the C90-based dialect used by the Linux kernel, mixing declarations with code is not allowed, e.g., you may not write int i = 0; in the middle of your function. All declarations (e.g., int i;) must come at the beginning of a block, before its first statement; in practice this usually means the top of the function.
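As an illustration of this declaration style, the sketch below walks a chain of parent pointers in plain user-mode C, with every declaration placed before the first statement. The demo_node structure and demo_depth function are hypothetical and are not kernel structures. Note also that this chain ends in a NULL parent, which is an assumption of the example and not a statement about how the kernel's process tree terminates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type for illustration; not a kernel structure. */
struct demo_node {
        int id;
        struct demo_node *parent;       /* NULL at the root of this example */
};

/* Count how many ancestors lie between start and the root of its chain. */
int demo_depth(const struct demo_node *start)
{
        /* C90 style: every declaration comes before the first statement. */
        const struct demo_node *cur;
        int depth;

        assert(start != NULL);
        depth = 0;
        for (cur = start; cur->parent != NULL; cur = cur->parent)
                depth++;
        return depth;
}
```

Declaring cur and depth at the top and assigning them later keeps the function free of declaration-after-statement warnings.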

In Part 4 you ask if we would expect to see any UID or GID other than root under any circumstances. Are you asking us purely from a standpoint of working with modules, or in general with processes?

The question has been updated to clarify that it is specifically asking about the kernel module that you wrote in Part 4. The key to answering this question lies in Hint #1 from Part 3: "The current process in this context is the process which requested the kernel to perform some operation (e.g., via a system call)." Think about which process is running as root, and why.