We are back for another day of live-blogging! See yesterday afternoon’s liveblog.

You can find the live stream here.

The kernel self-protection project and how you can help – Gustavo A. R. Silva

The kernel self-protection project (KSPP) focuses on hardening the Linux kernel. It is an upstream project, not a downstream fork, with a central repository of patches to try. Its goal is not to fix individual bugs, but to eliminate entire bug classes and methods of exploitation. Gustavo felt it important to mention that the project does not care about CVEs or individual issues.

Coordination happens on the linux-hardening mailing list, created in 2020 with the goal of having a list for discussing small development tasks and details, not necessarily new ideas. There’s also a patchwork instance to keep track of patches and tags. While the project is not a subsystem, it has maintainers.

The project also has an issue tracker, used to keep track of work, document a few things, and provide some good first issues for new contributors.

For his first pull request to Linus, Gustavo sent a huge merge commit message detailing all the changes; but it was too expansive, so it had to be redone.

Gustavo A. R. Silva

The project uses Coverity to find defects in kernel code, as well as Coccinelle. The latter is used to transform buggy idioms into safer alternatives, like the migration to the new struct_size() macro for variable-sized structs that end with a zero-length array for appending data.

For some changes, a Build-tested-by tag from Intel’s Kernel Test Robot is included: the robot allows testing a wide variety of architectures, configs and toolchains.

It takes blood and sweat to harden the kernel, Gustavo jokes. It’s a project with big goals, and the work is usually not glamorous. One strategy to fix that is to make the compiler an ally.

Enabling compiler options is the way forward, in order to detect as many problems as possible at build time. But when sending patches, it takes time to convince everyone of the usefulness of the changes, which might turn political in some cases. This is tough and thankless work as well, Gustavo says, but it needs to be done.

Changes to fix compiler warnings and cleanups might feel like mechanical changes; maintainers don’t like to see mechanical changes applied to their tree from outside, so each change should be sent to the proper maintainer.

Most of the changes related to compiler warnings come with a lot of false positives, and can feel very noisy. But they also find real bugs. And they help improve compilers as well, because the project actually finds corner cases in the compilers themselves.

The work is hard and long, but is important: it helps improve the quality of the code in the long term.

Ongoing and finished changes

struct_group() is a change aimed at making explicit the operations that cross struct member boundaries into adjacent fields, usually through a single memset or memcpy. This helps enable hardening features like FORTIFY_SOURCE, which strengthen the use of these operations.

FORTIFY_SOURCE has also seen improvements in Linux, by properly using the __builtin_object_size() compiler builtin for bounds checking.

Flexible array transformations were used to fix how variable-length arrays were used at the end of structs. The zero-length and one-element arrays were replaced with C99 flexible array members and the struct_size() length-computation macro.

This led to an issue being found: compilers treat any trailing array in a struct as a flexible array, which breaks FORTIFY_SOURCE. This isn’t fixed in compilers yet, and some legacy code actually relies on this behavior.

Enabling the -Warray-bounds warning also helped find a lot of bugs. About 40000 fallthrough warnings were fixed, and the warning was enabled by default as well.

The project is also looking for help, and has many things that need to be fixed. A good place to start is the public issue tracker.

Trust and the Linux development model – Greg KH

“With enough eyes, all bugs are shallow.” That’s the old open-source motto, and it isn’t discussed that much anymore as a core advantage of open source. Code can be audited (and fixed!) by anyone, at any time, and it’s possible to go back in time and audit whatever happened, both in mailing lists and code history.

This talk is about the “University of Minnesota incident” (UMN), or how not to do research. From 2018 to 2020, a team from this university tried to submit “hypocrite commits”: intentionally wrong patches. They submitted a total of 5 of those, and only one was accepted. It turned out later that it was, in fact, correct, and not hypocrite. In 2020, a paper was published on this.

It continued in 2021, with the “researchers” still sending poor-quality patches. Greg did not like it at all, and publicly said they were doing things wrong. All past patches from @umn.edu addresses were reviewed, and new ones rejected. The Linux Foundation stepped in and sent a letter to the university, asking them to stop the “research”, document what they did, and retract the paper.

The paper was retracted in the end, hours before IEEE was about to revoke it. The Linux Foundation Technical Advisory Board (TAB) published a full report on the incident.

The TAB report documented that all the patches that were wrong were rejected. The accepted “wrong” patch 1 was in fact correct, but was reverted in the end because it was submitted under a false name, which is against the Linux kernel guidelines.

Almost all of those patches were for obscure drivers, in error-handling “cleanup paths” that are never hit in the real world. About 20% of patches submitted to fix things were in fact wrong. It falls under Hanlon’s razor, Greg says: “Never attribute to malice that which is adequately explained by stupidity.”

The biggest issue with the “hypocrite commits” is that their authors signed the legal statement, the Developer Certificate of Origin (DCO, via Signed-off-by), while using a fake name. And that’s really not ethical, Greg says, which led to lawyers being involved.

As said before, hypocrite patch 1 was correct despite the incorrect intention. Patch 2 was detected as incorrect by Greg because it tried to address a syzbot issue that was already fixed. Patches 3 and 4 were caught by maintainers, who suggested a proper way to fix the issues. Those suggestions were ignored, wasting the maintainers’ time. Bonus patch 5 was in fact a correct one, but was rejected because it was sent from a machine set up for hypocrite commits, and it had a fake name: James Bond.

Greg KH

All the fake patches were caught. Does that mean the development model works? Greg thinks that in fact the project got lucky.

In 2021, UMN responded to the TAB report, saying that Linux was the only project they had sent hypocrite patches to, and that it was over. Later in the year, in November, they asked Kees Cook (and Greg) if they could send patches again. They refused. Despite this, patches appeared on the mailing list in December. In January, developers noticed that the patches were wrong. UMN was notified, but feigned ignorance.

There is now official researcher guidelines documentation in Linux. It asks researchers for more information on their patches, and on why and how they fix things.

In April 2022, UMN brought in a kernel developer to help fix their program.

Should you trust the Linux kernel community? As a baseline, no, Greg says: the license says the kernel comes with no warranty. Some people want every contributor to be vetted, but development happens at such a scale, worldwide, that it’s impossible. Can you review everything that’s going in? There were 79662 total commits in 2021, and no single group can review all of that. Most bugs are fixed before they reach a final release anyway.

The top developers of the kernel are also the top bug fixers. And they are also the top bug introducers, which is fully expected: the most prolific developers write a ton of security bugs, and Greg includes himself among the bug introducers. So what needs to be done, and is being done, is to make it easier and faster to find bugs.

From the outside, you should “trust, but verify”; or in developer terms, “trust, but test”. But from the inside, the main model of Linux kernel development is not built around trusting that code will be correct. It’s trusting that people will stick around and fix things when bugs inevitably happen.

Developing Tilck, a tiny Linux-compatible kernel – Vladislav Valtchev

The Tilck project includes a monolithic kernel written in C, a bootloader, test suites, and buildroot-like scripts for building 3rd-party software.

It’s partially compatible with Linux, and only runs on 32-bit x86. It’s an educational project at the moment, distributed under the BSD license. It’s not an attempt to replace Linux, Vlad says.

Binary compatibility with Linux was chosen to get a measure of robustness by running 3rd-party software, and to avoid designing a new syscall interface or new toolchains. It can use binary Linux libc builds, for example.

It’s a very small and simple OS, leading to a smaller memory footprint and low latency. It aims for robustness and deterministic behavior. Developer experience is also a focus, stemming from the project’s simplicity.

Vladislav Valtchev

Vlad gave a demo of the build toolchain to show how simple it is to build Tilck with a single command, and then run it in qemu. It boots in about a hundred milliseconds, and then runs a busybox-based system. It has a sysfs implementation, so that tools that need it on Linux can work. Its tty implementation can run vi, and even vim with colored syntax highlighting. And of course, it runs Doom.

The OS has a native syscall tracer, with filtering, which was used to debug the missing tty escape sequences needed to run vim. It has multiple built-in test suites: userspace, system, and kernel. But they weren’t enough to test all code paths, so to improve coverage, Vlad wrote tests that analyze the serial and framebuffer outputs.

Vlad shared a few bug and optimization stories around Tilck, from an overcommit bug to a framebuffer font-rendering optimization war story. While it started out quite slow, character rendering in Tilck has been improved a lot, in the end reaching faster speeds than Linux.

That’s it for this morning! Continue reading with this afternoon’s live blog.