openSUSE Tumbleweed – Review of the weeks 2020/32 & 33
Dear Tumbleweed users and hackers,
The last two weeks have seen a steady stream of new snapshots, with small gaps on some days. In total, we have published 11 snapshots since the last review: 0731, 0801..0807, 0810, 0812, and 0813.
The most relevant changes in those snapshots were:
- Linux kernel 5.7.11 & 5.8.0
- git 2.28.0
- A new UEFI signing key and grub2 fixes to address the BootHole security issue (CVE-2020-10713)
- GCC 10.2.1
- Mozilla Thunderbird 68.11.0
- /tmp is now tmpfs, no longer disk backed
- KDE Frameworks 5.73.0
- LibreOffice 7.0.0 stable release
And in the next days/weeks, you can expect these changes to happen:
- GNOME 3.36.5
- KDE Applications 20.08.0
- glibc 2.32
- binutils 2.35
- gettext 0.21
- bison 3.7.1
- RPM changes: %{_libexecdir} is being changed to /usr/libexec. This exposes quite a lot of packages that abuse %{_libexecdir} and fail to build. Additionally, the payload compression is being changed to zstd
- OpenSSL 3.0
Participate in Hacktoberfest, Help Develop Contributions
The month-long, virtual-festival event that celebrates open source contributions, Hacktoberfest, is coming soon and members of the openSUSE community can make a difference.
The event, which is in its seventh year and is run by Digital Ocean and DEV, encourages people to make their first contributions to open source projects.
The event is for developers, designers who contribute artwork, people who can contribute to documentation, and more.
As the event brings more awareness to open-source projects and encourages contributions that benefit communities, having developers and community members available to help people who want to contribute can be beneficial to the project.
Community members can help by guiding new contributors, creating educational content for the project, providing a list of the resources available and creating meetups.
Natnael Getahun plans on coordinating some of the efforts for openSUSE’s presence during Hacktoberfest and has asked for help from community members who are willing to help contributors and expand the event's efforts around openSUSE-related projects.
A list of ideas for projects during Hacktoberfest is being developed on the openSUSE Etherpad.
Hacktoberfest is open to everyone, though there are rules that must be followed to receive Hacktoberfest Swag and Hacktoberfest Quality Standards that need to be met.
Noodlings 19 | BIOS Games Serving the NDI™ Plugin
"Rust does not have a stable ABI"
I've seen GNOME people (often, people who have been working for a long time on C libraries) express concerns along the following lines:
- Compiled Rust code doesn't have a stable ABI (application binary interface).
- So, we can't have shared libraries in the traditional fashion of Linux distributions.
- Also, Rust bundles its entire standard library with every binary it compiles, which makes Rust-built libraries huge.
These are extremely valid concerns to be addressed by people like myself who propose that chunks of infrastructural libraries should be done in Rust.
So, let's begin.
The first part of this article is a super-quick introduction to shared libraries and how Linux distributions use them. If you already know those things, feel free to skip to the "Rust does not have a stable ABI" section.
How do distributions use shared libraries?
If several programs run at the same time and use the same shared library
(say, libgtk-3.so), the operating system can load a single copy of
the library in memory and share the read-only parts of the code/data
through the magic of virtual memory.
In theory, if a library gets a bugfix but does not change its
interface, one can just recompile the library, stick the new .so in
/usr/lib or whatever, and be done with it. Programs that depend on
the library do not need to be recompiled.
If libraries limit their public interface to a plain C ABI (application binary interface), they are relatively easy to consume from other programming languages. Those languages don't have to deal with name mangling of C++ symbols, exception handlers, constructors, and all that complexity. Pretty much every language has some form of C FFI (foreign function interface), which roughly means "call C functions without too much trouble".
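For example, here is a minimal sketch (my own, not from the original post) of what that C FFI looks like from Rust: an extern "C" block declares the signatures of two libc functions so the compiler knows to use the platform's C calling convention when calling them.

use std::os::raw::{c_char, c_int};

// Declare functions from the C standard library; the "C" ABI tells the Rust
// compiler to use the platform's C calling convention for these symbols.
extern "C" {
    fn strlen(s: *const c_char) -> usize;
    fn abs(n: c_int) -> c_int;
}

fn main() {
    let s = std::ffi::CString::new("hello").unwrap();
    // Calling across the FFI boundary is unsafe: the compiler cannot check
    // that the pointer we pass obeys the C function's expectations.
    let len = unsafe { strlen(s.as_ptr()) };
    let a = unsafe { abs(-42) };
    println!("strlen = {}, abs = {}", len, a);
}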
For the purposes of a library, what's an ABI? Wikipedia says, "An ABI defines how data structures or computational routines are accessed in machine code [...] A common aspect of an ABI is the calling convention", which means that to call a function in machine code you need to frob the call and stack pointers, pass some function arguments in registers or push some others to the stack, etc. Really low-level stuff. Each machine architecture or operating system usually defines a C standard ABI.
For libraries, we commonly understand an ABI to mean the machine-code
implications of their programming interface. Which functions are
available as public symbols in the .so file? To which numeric
values do C enum values correspond, so that they can be passed to
those functions? What is the exact order and type of arguments that
the functions take? What are the struct sizes, and the order, types,
and padding of the fields in the structs those functions take? Does one
pass arguments in CPU registers or on the stack? Does the caller or
the callee clean up the stack after a function call?
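As a tiny illustration of the enum part of that contract (my own example, with a made-up ImageFormat type), the numeric values behind a C-compatible enum are baked into every compiled caller, so reordering or renumbering the variants is an ABI break even though the source-level API looks the same:

// Hypothetical library enum exposed over a C ABI. The explicit values are
// part of the ABI: compiled callers embed these integers in their machine code.
#[allow(dead_code)]
#[repr(C)]
pub enum ImageFormat {
    Png = 0,
    Jpeg = 1,
    Webp = 2, // appending a new variant at the end is ABI-compatible...
    // ...but inserting one before Jpeg, or reordering, would break callers.
}

fn main() {
    // The discriminant is just an integer as far as the ABI is concerned.
    assert_eq!(ImageFormat::Jpeg as i32, 1);
}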
Bug fixes and security fixes
Linux distributions generally try really hard to have a single
version of each shared library installed in the system: a single
libjpeg.so, a single libpng.so, a single libc.so, etc.
This is helpful when there needs to be an update to fix a bug,
security-related or not: users can just download the updated package
for the library, which when installed will just stick in a new .so
in the right place, and the calling software won't need to be updated.
This is possible only if the bug really only changes the internal code without changing behavior or interface. If a bug fix requires part of the public API or ABI to change, then you are screwed; all calling software needs to be recompiled. "Irresponsible" library authors either learn really fast when distros complain loudly about this sort of change, or they don't learn and get forever marked by distros as "that irresponsible library" which always requires special handling in order not to break other software.
Sidenote: sometimes it's more complicated. Poppler (the PDF rendering library) ships at least two stable APIs, one Glib-based in C, and one Qt-based in C++. However, some software like texlive uses Poppler's internals library directly, which of course does not have a stable API, and thus texlive breaks frequently as Poppler evolves. Someone should extend the public, stable API so that texlive doesn't have to use the library's internals!
Bundled libraries
Sometimes it is not irresponsible authors of libraries, but rather that people who use the libraries find out that over time the behavior of the library changes subtly, maybe without breaking the API or ABI, and they are better off bundling a specific version of the library with their software. That version is what they test their software against, and they try to learn its quirks.
Distros inevitably complain about this, and either patch the calling
software by hand to force it to use the system's shared library, or
succeed in getting patches accepted by the software so that they have
a --use-system-libjpeg option or similar.
This doesn't work very well if the bundled version of the library has extra patches that are not in a distro's usual patches. Or vice-versa; it may actually work better to use the distro's version of the library, if it has extra fixes that the bundled library doesn't. Who knows! It's a case-by-case situation.
Rust does not have a stable ABI
By default indeed it doesn't, because the compiler team wants to have the freedom to change the data layout and Rust-to-Rust calling conventions, often for performance reasons, at any time. For example, it is not guaranteed that struct fields will be laid out in memory in the same order as they are written in the code:
struct Foo {
    bar: bool,
    baz: f64,
    beep: bool,
    qux: i32,
}
The compiler is free to rearrange the struct fields in memory as it
sees fit. Maybe it decides to put the two bool fields next to each
other to save on inter-field padding due to alignment requirements;
maybe it does static analysis or profile-guided optimizations and
picks an optimal ordering.
But we can override this! Let's look at data layout first, and then calling conventions.
Data layout for C versus Rust
The following is the same struct as above, but with an extra #[repr(C)] attribute:
#[repr(C)]
struct Foo {
    bar: bool,
    baz: f64,
    beep: bool,
    qux: i32,
}
With that attribute, the struct will be laid out just as this C struct:
#include <stdbool.h>
#include <stdint.h>
struct Foo {
    bool bar;
    double baz;
    bool beep;
    int32_t qux;
};
(Aside: it is unfortunate that gboolean is not bool,
but that's because gboolean predates C99, and clearly standards from
20 years ago are too new to use. (Aside aside: since I wrote that
other post, Rust's repr(C) for bool is actually defined as C99's bool;
it's no longer undefined.))
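To make the difference concrete, here is a small sketch of mine (the FooRust/FooC names are just for this example) that prints the size of the default-layout struct next to the repr(C) one; the exact numbers depend on the target and compiler, since the default layout is deliberately unspecified:

// Default representation: the compiler may reorder fields to reduce padding.
#[allow(dead_code)]
struct FooRust {
    bar: bool,
    baz: f64,
    beep: bool,
    qux: i32,
}

// C representation: fields are laid out in declaration order, with the same
// padding rules a C compiler would use.
#[allow(dead_code)]
#[repr(C)]
struct FooC {
    bar: bool,
    baz: f64,
    beep: bool,
    qux: i32,
}

fn main() {
    // On a typical x86_64 target this prints 16 for FooRust and 24 for FooC,
    // but only the repr(C) layout is a stable contract.
    println!("default repr: {} bytes", std::mem::size_of::<FooRust>());
    println!("repr(C):      {} bytes", std::mem::size_of::<FooC>());
}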
Even Rust's data-carrying enums can be laid out in a manner friendly to C and C++:
#[repr(C, u8)]
enum MyEnum {
    A(u32),
    B(f32, bool),
}
This means, use C layout, and a u8 for the enum's discriminant. It
will be laid out like this:
#include <stdbool.h>
#include <stdint.h>
enum MyEnumTag {
    A,
    B
};

typedef uint32_t MyEnumPayloadA;

typedef struct {
    float x;
    bool y;
} MyEnumPayloadB;

typedef union {
    MyEnumPayloadA a;
    MyEnumPayloadB b;
} MyEnumPayload;

typedef struct {
    uint8_t tag;
    MyEnumPayload payload;
} MyEnum;
The gory details of data layout are in the Alternative Representations section of the Rustonomicon and the Unsafe Code Guidelines.
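A quick way to convince yourself of that layout is to read the first byte of such a value, which with repr(C, u8) is the tag; this is my own self-contained sketch, repeating the enum definition from above:

#[allow(dead_code)]
#[repr(C, u8)]
enum MyEnum {
    A(u32),
    B(f32, bool),
}

fn main() {
    let v = MyEnum::B(1.5, true);
    // With #[repr(C, u8)] the discriminant is a u8 stored first, so the
    // first byte of the value is 0 for A and 1 for B.
    let tag = unsafe { *(&v as *const MyEnum as *const u8) };
    assert_eq!(tag, 1);

    // C code receiving a pointer to this value could read the same byte and
    // then interpret the payload union accordingly.
}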
Calling conventions
An ABI's calling conventions detail things like how to call functions in machine code, and how to lay out function arguments in registers or on the stack. The Wikipedia page on x86 calling conventions has a good cheat-sheet, useful when you are looking at assembly code and registers in a low-level debugger.
I've already written about how it is possible to write Rust code to
export functions callable from C; one uses the extern "C" in the
function definition and a #[no_mangle] attribute to keep the symbol
name pristine. This is how librsvg is able to have the following:
#[no_mangle]
pub unsafe extern "C" fn rsvg_handle_new_from_file(
    filename: *const libc::c_char,
    error: *mut *mut glib_sys::GError,
) -> *const RsvgHandle {
    // ...
}
Which compiles to what a C compiler would produce for this:
RsvgHandle *rsvg_handle_new_from_file (const gchar *filename, GError **error);
(Aside: librsvg still uses an intermediate C library full of stubs that just call the Rust-exported functions, but there is now tooling to produce a .so directly from Rust which I just haven't had time to investigate. Help is appreciated!)
Summary of ABI so far
It is one's decision to export a stable C ABI from a Rust library. There is some awkwardness in how types are laid out in C, because the Rust type system is richer, but things can be made to work well with a little thought. Certainly no more thought than the burden of designing and maintaining a stable API/ABI in plain C.
I'll fold the second concern into here — "we can't have shared libraries in traditional distro fashion". Yes, we can, API/ABI-wise, but read on.
Rust bundles its entire standard library with Rust-built .so's
I.e. it statically links all the Rust dependencies. This produces a large .so:
- librsvg-2.so (version 2.40.21, C only) - 1408840 bytes
- librsvg-2.so (version 2.49.3, Rust only) - 9899120 bytes
Holy crap! What's all that?
(And I'm cheating: this is both with link-time optimization turned on,
and by running strip(1) on the .so. If you just autogen.sh && make
it will be bigger.)
This has Rust's standard library statically linked (or at least the bits of that librsvg actually uses), plus all the Rust dependencies (cssparser, selectors, nalgebra, glib-rs, cairo-rs, locale_config, rayon, xml5ever, and an assload of crates). I could explain why each one is needed:
- cssparser - librsvg needs to parse CSS.
- selectors - librsvg needs to run the CSS selector matching algorithm.
- nalgebra - the code for SVG filter effects uses vectors and matrices.
- glib-rs, cairo-rs - draw to Cairo and export GObject types.
- locale_config - so that localized SVG files can work.
- rayon - so filters can use all your CPU cores instead of processing one pixel at a time.
- Etcetera. SVG is big and requires a lot of helper code!
Is this a problem?
Or more exactly, why does this happen, and why do people perceive it as a problem?
Stable APIs/ABIs and distros
Many Linux distributions have worked really hard to ensure that
there is a single copy of "system libraries" in an installation.
There is Just One Copy of /usr/lib/libc.so, /usr/lib/libjpeg.so,
etc., and packages are compiled with special options to tell them to
really use the system libraries instead of their bundled versions, or
patched to do so if they don't provide build-time options for that.
In a way, this works well for distros:
- A bug in a library can be fixed in a single place, and all applications that use it get the fix automatically.
- A security bug can be patched in a single place, and in theory applications don't need to be audited further.
If you maintain a library that is shipped in Linux distros, and you break the ABI, you'll get complaints from distros very quickly.
This is good because it creates responsible maintainers for libraries that can be depended on. It's how Inkscape/GIMP can have a stable toolkit to be written in.
This is bad because it encourages stagnation in the long term. It's
how we get a horrible, unsafe, error-prone API in libjpeg that can
never ever be improved because it would require changes in tons of
software; it's why gboolean is still a 32-bit int after
twenty-something years, even though everything else close to C has
decided that booleans are 1 byte. It's how Inkscape/GIMP take many
years to move from GTK2 to GTK3 (okay, that's lack of paid developers
to do the grunt work, but it is enabled by having forever-stable APIs).
However, a long-term stable API/ABI has a lot of value. It is why the Windows API is the crown jewels; it is why people can rely on glib and glibc to not break their code for many years and take them for granted.
But we only have a single stable ABI anyway
And that is the C ABI. Even C++ libraries have trouble with this, and people sometimes write the internals of a library in C++ for convenience, but export a stable C API/ABI from it.
High level languages like Python have real trouble calling C++ code precisely because of ABI issues.
Actually, in GNOME we have gone further than that
In GNOME we have constructed a sweet little universe where GObject Introspection is basically a C ABI with a ton of machine-generated annotations to make it friendly to language bindings.
Still, we rely on a C ABI underneath. See this exploratory twitter thread on advancing the C ABI from Rust for lots of food for thought.
Single copies of libraries with a C ABI
Okay, let's go back to this. What price do we pay for single copies of libraries that, by necessity, must export a C ABI?
- Code that can be conveniently called from C, maybe from C++, and moderately to very inconveniently from ANYTHING ELSE. With most new application code being written definitely not in C, maybe we should reconsider our priorities here.
- No language facilities like generics or field visibility, which are not even "modern language" features. Even C++ templates get compiled and statically linked into the calling code, because there's no way to pass information like the size of TinArray<T> across a C ABI (see the sketch after this list). You wanted to make some struct fields public and some private? You are out of luck.
- No knowledge of data ownership except by careful reading of the C function's documentation. Does the function free its arguments? How - with free() or g_free() or my_thing_free()? Or does the caller just lend it a reference? Can the data be copied bit-by-bit or must a special function be called to make a copy? GObject-Introspection carries this information in its annotations, while the C ABI has no idea and just ships raw pointers around.
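As a sketch of the first point about generics (my own illustration, with made-up double_vec_* names), the usual workaround is to hide the generic type behind an opaque pointer and export monomorphized, C-callable entry points for one concrete element type, losing the genericity at the boundary:

// Hypothetical example: exposing Vec<f64> across a C ABI. The C side only
// ever sees an opaque pointer; the element type and the generics are fixed
// ("monomorphized") here, at the boundary.
#[no_mangle]
pub extern "C" fn double_vec_new() -> *mut Vec<f64> {
    Box::into_raw(Box::new(Vec::new()))
}

#[no_mangle]
pub unsafe extern "C" fn double_vec_push(v: *mut Vec<f64>, value: f64) {
    (*v).push(value);
}

#[no_mangle]
pub unsafe extern "C" fn double_vec_len(v: *const Vec<f64>) -> usize {
    (*v).len()
}

#[no_mangle]
pub unsafe extern "C" fn double_vec_free(v: *mut Vec<f64>) {
    // Reconstruct the Box so Rust frees the vector with its own allocator.
    drop(Box::from_raw(v));
}

The C side can only treat the pointer as an opaque handle and must call double_vec_free() to release it, which is exactly the kind of ownership convention the third point is about.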
More food for thought note: this twitter thread says this about the C++ ABI: "Also, the ABI matters for whether the actual level of practicality of complying with LGPL matches the level of practicality intended years ago when some project picked LGPL as its license. Of course, the standard does not talk about LGPL, either. LGPL has rather different implications for Rust and Go than it does for C and Java. It was obviously written with C in mind."
Monomorphization and template bloat
While C++ had the problem of "lots of template code in header files", Rust has the problem that monomorphization of generics creates a lot of compiled code. There are tricks to avoid this and they are all the decision of the library/crate author. Both share the root cause that templated or generic code must be recompiled for every specific use, and thus cannot live in a shared library.
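For instance, in this minimal example of mine, a single generic function gives rise to one compiled copy per concrete type it is used with, and those copies end up in the calling binary rather than in any shared library:

// One generic source function...
fn largest<T: PartialOrd + Copy>(items: &[T]) -> T {
    let mut max = items[0];
    for &item in &items[1..] {
        if item > max {
            max = item;
        }
    }
    max
}

fn main() {
    // ...but the compiler emits separate machine code for largest::<i32>
    // and largest::<f64>, specialized ("monomorphized") for each type.
    println!("{}", largest(&[3, 7, 2]));
    println!("{}", largest(&[0.5, 1.25, 0.75]));
}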
Also, see this wonderful article on how different languages implement generics, and think that a plain C ABI means we have NOTHING of the sort.
Also, see How Swift Achieved Dynamic Linking Where Rust Couldn't for more food for thought. This is extremely roughly equivalent to GObject's boxed types; callers keep values on the heap but know the type layout via annotation magic, while the library's actual implementation is free to have the values on the stack or wherever for its own use.
Should all libraries export APIs with generics and exotic types?
No!
You probably want something like a low-level array of values,
Vec<T>, to be inlined everywhere and with code that knows the
type of the vector's elements. Element accesses can be inlined to a
single machine instruction in the best case.
But not everything requires this absolute raw performance with everything inlined everywhere. It is fine to pass references or pointers to things and do dynamic dispatch from a vtable if you are not in a super-tight loop, as we love to do in the GObject world.
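Here is a hedged sketch of that trade-off in Rust terms (my own example): the first function below is monomorphized and can be fully inlined, while the second takes a trait object and dispatches through a vtable at run time, much like a GObject virtual method call:

trait Shape {
    fn area(&self) -> f64;
}

struct Circle { r: f64 }
struct Square { side: f64 }

impl Shape for Circle {
    fn area(&self) -> f64 { std::f64::consts::PI * self.r * self.r }
}

impl Shape for Square {
    fn area(&self) -> f64 { self.side * self.side }
}

// Static dispatch: one specialized copy per concrete type; calls can be inlined.
fn print_area_static<S: Shape>(shape: &S) {
    println!("{}", shape.area());
}

// Dynamic dispatch: a single compiled copy; the call to area() goes through
// the trait object's vtable, which is fine outside of hot loops.
fn print_area_dyn(shape: &dyn Shape) {
    println!("{}", shape.area());
}

fn main() {
    let c = Circle { r: 1.0 };
    let s = Square { side: 2.0 };
    print_area_static(&c);
    print_area_dyn(&s);
}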
Library sizes
I don't have a good answer to librsvg's compiled size. If gnome-shell merges my branch to rustify the CSS code, it will also grow its binary size by quite a bit.
It is my intention to have a Rust crate that both librsvg and gnome-shell share for their CSS styling needs, but right now I have no idea if this would be a shared library or just a normal Rust crate. Maybe it's possible to have a very general CSS library, and the application registers which properties it can parse and how? Is it possible to do this as a shared library without essentially reinventing libcroco? I don't know yet. We'll see.
A metaphor which I haven't fully explored
If every application or end-user package is kind of like a living organism, with its own cycles and behaviors and organs (dependent libraries) that make it possible...
Why do distros expect all the living organisms on your machine to share The World's Single Lungs Service, and The World's Single Stomach Service, and The World's Single Liver Service?
You know, instead of letting every organism have its own slightly different version of those organs, customized for it? We humans know how to do vaccination campaigns and everything; maybe we need better tools to apply bug fixes where they are needed?
I know this metaphor is extremely imperfect and not how things work in software, but it makes me wonder.
Tumbleweed Snapshots bring Kernel 5.8, Hypervisor FS Support with Xen Update
This week openSUSE Tumbleweed delivered four snapshots that brought in a new mainline kernel for the distribution, as well as an updated Xen package that removes the previous requirement of parsing log data or writing custom hypercalls to transport the data and custom code to read it.
The latest snapshot, 20200810, brought Linux kernel 5.8.0, which includes a fix for a missing check in vgacon scrollback handling, and an additional commit carried over from the previous version improves load balancing for SO_REUSEPORT, which can be used for both TCP and UDP sockets. The GNU Compiler Collection 10 update includes some Straight-Line Speculation mitigation changes. GNOME had a few package updates in the snapshot, with updates to accerciser 3.36.3, the web browser epiphany 3.36.4, and the GNOME games gnome-mines 3.36.1 and quadrapassel 3.36.04. The snapshot is trending at a rating of 84, according to the Tumbleweed snapshot reviewer.
urlscan 0.9.5 was the lone software package updated in snapshot 20200807. The version removed a workaround for a Python web browser bug, added a -R option to reverse URL/context output, and provides clipboard support for Wayland.
Several Python-related packages were updated in the stable 20200806 snapshot; these include updates to python-matplotlib 3.3.0, python-pycryptodome 3.9.8, python-python-xlib 0.27, python-pyzmq 19.0.2, python-redis 3.5.3 and python-urllib3 1.25.10. The Point-to-Point Protocol communications daemon ppp 2.4.8 added new pppd options, including a nodefaultroute6 option to prevent adding an IPv6 default route. The webkit2gtk3 2.28.4 version fixed several crashes and rendering issues and addressed a half dozen Common Vulnerabilities and Exposures (CVEs). The hypervisor package xen was updated to version 4.14.0; the package corrected its license name and had contributions from Intel, Citrix and QubesOS, and QubesOS, which uses openQA for testing, contributed to the Linux stubdomains. The updated version offers Hypervisor FS support.
The snapshot that started off this week’s Tumbleweed review was 20200805. This snapshot brought Mozilla Firefox 68.11.0; the updated version addressed seven CVEs, including a fix for a leak in WebRTC’s data channel. The system daemon package sssd 2.3.1 provided new configuration options and fixed many Group Policy Object (GPO) regressions that were introduced in the previous version. The Xfce window manager xfwm4 4.14.4 fixed some compilation warnings, and transactional-update 2.23 added a “run” command to be able to execute a single command in a new snapshot as well as a “--drop-if-no-change” option to discard snapshots if no changes were performed. The snapshot recorded a stable rating of 99, according to the Tumbleweed snapshot reviewer.
OBS NDI™ Plugin on openSUSE
AntiMicro | Map Keyboard and Mouse Controls to Gamepad on openSUSE
New Prototype Builds Bringing Leap, SLE Closer Will be Available Soon
The release manager for openSUSE Leap, Lubos Kocman, has updated openSUSE’s developer community on efforts to bring the codes of Leap and SUSE Linux Enterprise closer together.
In an email to the openSUSE-Factory mailing list, Kocman explained that the prototype project openSUSE Jump should become available for early testing soon and that contributions to the project could become available in the next five weeks.
“First, I’d like to announce that we’ll start publishing images and FTP trees for openSUSE Jump, so people can get their hands on [it],” Kocman wrote. “Please be aware that Jump is still in Alpha quality. I expect data to be available later this week as there is still pending work on pontifex by Heroes.”
The Alpha quality state of Jump is gradually progressing.
Jump is an interim name given to the experimental distribution in the Open Build Service as developers try to synchronize SLE binaries for openSUSE Leap.
Kocman explained how feature requests will work, the process for how contributions will be handled, and how the submissions will lead to greater transparency.
“Backports and openSUSE Leap code submissions will have pretty much the same experience as now,” he wrote. “With the exception that you’ll be able to use just one submit request target for more convenience and reusability as there will be one build used for both Package Hub and Leap. Submissions to SUSE Linux Enterprise will get more transparent. I’d really like to ask you to look at it as a good starting point. I see it as a big improvement from what we have now since you’ll be now able to see behind the scenes. This all is constantly happening, you were just not able to see it before.”
Kocman went on to name several steps for how to update an SLE package.
BIOS Update Dell Latitude E6440 on Linux