ownCloud and CryFS
It is a great idea to encrypt files on the client side before uploading them to an ownCloud server if the server is not running in a controlled environment, or if one simply wants to act defensively and minimize risk.
Some people think it is a great idea to include the functionality in the sync client.
I don’t agree, because it combines two very complex topics into one code base and makes the code difficult to maintain. The risk is high of ending up with a code base that nobody is able to maintain properly any more. So let’s avoid that for ownCloud and look for alternatives.
A good way is to use a so-called encrypted overlay filesystem and let ownCloud sync the encrypted files. The downside is that you cannot use the encrypted files in the web interface, because it cannot easily decrypt them. To me, that is not overly important, because I want to sync files between different clients, which is probably the most common use case.
Encrypted overlay filesystems put the encrypted data in one directory called the cipher directory. A decrypted representation of the data is mounted to a different directory, in which the user works.
That is easy to set up and use, and in principle it also works well with file sync software like ownCloud, because it does not store the files in one huge container file that needs to be synced whenever a single bit changes, as other solutions do.
To use it, the cipher directory must be configured as the local sync dir of the client. If a file is changed in the mounted dir, the overlay filesystem changes the crypto files in the cipher dir, and these are synced by the ownCloud client.
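To make this concrete, here is a minimal sketch of such a setup with CryFS (the paths and the server URL are just examples, and owncloudcmd stands in for whichever sync client you use):
# mount an encrypted overlay: ciphertext lives in ~/.cipher,
# the cleartext view appears in ~/Private (CryFS asks for a password)
$ cryfs ~/.cipher ~/Private
# work on files in the cleartext mount as usual
$ cp ~/Documents/report.odt ~/Private/
# point the sync client at the cipher dir, here with the command line client
$ owncloudcmd ~/.cipher https://server.example.com/owncloud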
One of the solutions I tried is CryFS. It works nicely in general, but is unfortunately very slow together with ownCloud sync.
The reason is that CryFS chunks all files in the cipher dir into 16 kB blocks, which are spread over a set of directories. That is very beneficial because file names and sizes cannot be reconstructed from the cipher dir, but it hits one of the weak spots of ownCloud sync: ownCloud is traditionally a bit slow with many small files spread over many directories. That shows dramatically in a test with CryFS: adding eleven new files with an overall size of around 45 MB to a CryFS filesystem directory makes the ownCloud client upload for 6:30 minutes.
Adding another four files with a total size of a bit over 1MB results in an upload of 130 files and directories, with an overall size of 1.1 MB.
A typical change use case, like editing an existing office text document locally, is not that bad. Here, CryFS splits an 8.2 kB LibreOffice text document into three 16 kB files in three directories. When one word gets inserted, CryFS creates three new dirs in the cipher dir and uploads four new 16 kB blocks.
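To reproduce this kind of observation yourself, counting the files in the cipher dir before and after a change shows roughly what the sync client will have to transfer (a sketch, with example paths):
$ find ~/.cipher -type f | wc -l   # encrypted blocks before the change
$ cp ~/Documents/new-files/* ~/Private/
$ find ~/.cipher -type f | wc -l   # the difference is what ownCloud must sync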
My personal conclusion: CryFS is an interesting project. It integrates nicely with the KDE desktop through Plasma Vault. Splitting files into equally sized blocks is good because it prevents guessing the content based on file names and sizes. However, for syncing with ownCloud, it is not the best partner.
If there is a way to improve the situation, I would be eager to learn about it. Maybe the block size can be increased, or the number of directories limited? Also, the upcoming ownCloud sync client version 2.6.0 again brings optimizations in the discovery and propagation of changes; I am sure that improves the situation.
Let’s see what other alternatives can be found.
Kernel Adventures: Enabling VPD Pages for USB Storage Devices in sysfs
Kata Containers now available in Tumbleweed

Kata Containers is an open source container runtime that is crafted to seamlessly plug into the containers ecosystem.
We are now excited to announce that the Kata Containers packages are finally available in the official openSUSE Tumbleweed repository.
It is worthwhile to spend a few words explaining why this is great news, considering the role of Kata Containers (a.k.a. Kata) in fulfilling the need for security in the containers ecosystem, and given its importance for openSUSE and Kubic.
What is Kata
As already mentioned, Kata is a container runtime focusing on security and on ease of integration with the existing containers ecosystem. If you are wondering what a container runtime is, this blog post by Sascha gives you a clear introduction to the topic.
Kata should be used when running container images whose source is not fully trusted, or when allowing other users to run their own containers on your platform.
Traditionally, containers share the same physical and operating system (OS) resources with host processes, and specific kernel features such as namespaces are used to provide an isolation layer between host and container processes. By contrast, Kata containers run inside lightweight virtual machines, adding an extra isolation and security layer that minimizes the host attack surface and mitigates the consequences of container breakouts. Despite this extra layer, Kata achieves impressive runtime performance thanks to KVM hardware virtualization, and when configured to use a minimalist virtual machine manager (VMM) like Firecracker, a high density of microVMs can be packed on a single host.
If you want to know more about Kata features and performances:
- katacontainers.io is a great starting point.
- For something more SUSE oriented, Flavio gave an interesting talk about Kata at SUSECON 2019.
- Kata folks hang out on katacontainers.slack.com, and will be happy to answer any questions.
Why is it important for Kubic and openSUSE
SUSE has been an early and relevant open source contributor to container projects, believing that this technology is the future of deploying and running software.
The most relevant example is the openSUSE Kubic project, which is a certified Kubernetes distribution and a set of container-related technologies built by the openSUSE community.
We have also been working for some time on well-known container projects, like runC, libpod and CRI-O, and for about a year we have also been collaborating on Kata.
Kata complements other, more popular ways to run containers, so it makes sense for us to work on improving it and to ensure it plugs smoothly into our products.
How to use
While Kata may be used as a standalone piece of software, its intended use is to serve as a runtime integrated into a container engine like Podman or CRI-O.
This section shows a quick and easy way to spin up a Kata container using Podman on openSUSE Tumbleweed.
First, install the Kata packages:
$ sudo zypper in katacontainers
Make sure your system provides the hardware virtualization features required by Kata:
$ sudo kata-runtime kata-check
If no errors are reported, great! Your system is now ready to run Kata Containers.
If you haven’t already, install podman with:
$ sudo zypper in podman
That’s all. Try running your first Kata container with:
$ sudo podman run -it --rm --runtime=/usr/bin/kata-runtime opensuse/leap uname -a
Linux ab511687b1ed 5.2.5-1-kvmsmall #1 SMP Wed Jul 31 10:41:36 UTC 2019 (79b6a9c) x86_64 x86_64 x86_64 GNU/Linux
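If you do not want to pass --runtime on every invocation, Podman can also be configured to use Kata by default. Where exactly this setting lives depends on your Podman version, so treat the snippet below as a sketch of a libpod.conf-style configuration rather than a recipe:
# /etc/containers/libpod.conf (newer Podman versions use containers.conf)
# use Kata instead of runC as the default OCI runtime
runtime = "kata-runtime"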
Differences with runC
Now that you have Kata up and running, let’s see some of the differences between Kata and runC, the most popular container runtime.
When starting a container with runC, container processes can be seen in the host processes tree:
...
10212 ? Ssl 0:00 /usr/lib/podman/bin/conmon -s -c <ctr-id> -u <ctr-id>
10236 ? Ss 0:00 \_ nginx: master process nginx -g daemon off;
10255 ? S 0:00 \_ nginx: worker process
10256 ? S 0:00 \_ nginx: worker process
10257 ? S 0:00 \_ nginx: worker process
10258 ? S 0:00 \_ nginx: worker process
...
With Kata, container processes instead run inside a dedicated VM, so they do not share OS resources with the host:
...
10651 ? Ssl 0:00 /home/marco/go/src/github.com/containers/conmon/bin/conmon -s -c <ctr-id> -u <ctr-id>
10703 ? Sl 0:01 \_ /usr/bin/qemu-system-x86_64 -name sandbox-<ctr-id> -uuid e54ee910-2927-456e-a180-836b92ce5e7a -machine pc,accel=kvm,kernel_ir
10709 ? Ssl 0:00 \_ /usr/lib/kata-containers/kata-proxy -listen-socket unix:///run/vc/sbs/<ctr-id>/proxy.sock -mux-socket /run/vc/vm/829d8fe0680b
10729 ? Sl 0:00 \_ /usr/lib/kata-containers/kata-shim -agent unix:///run/vc/sbs/<ctr-id>/proxy.sock -container <ctr-id>
...
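For reference, listings like the two above can be captured on the host with an ordinary process-tree listing, for example:
$ ps axf | grep -A 4 conmon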
Future plans
We are continuing to work to offer you a great user experience when using Kata on openSUSE by:
- improving packages quality and stability,
- delivering periodic releases,
- making sure that Kata integrates well with the other container projects, like Podman and CRI-O.
As a longer-term goal, we will integrate Kata into the Kubic distribution and into CaaSP, making them some of the most complete and secure solutions for managing containers.
Patch Workflow With Mutt 2019
Given that the main development workflow for most kernel maintainers is email, I spend a lot of time in my email client. For the past few decades I have used mutt, but every once in a while I look around to see if there is anything else out there that might work better.
One project that looks promising is aerc, which was started by Drew DeVault. It is a terminal-based email client written in Go, and it relies on a lot of other Go libraries to handle much of the “grungy” work of dealing with IMAP clients, email parsing, and the other fun things that come with the free-form text parsing that email requires.
Communities in the distrowatch.org top 20
| Distribution | Forum | Wiki | Community | Membership | Bug Reporting | Mailing List | Chat |
|---|---|---|---|---|---|---|---|
| MX Linux | Yes | Technical Only | No | No | Yes | No | No |
| Manjaro | Yes | Yes | No | No | Forum Only | Yes | Yes |
| Mint | Yes | No | Yes | No | Upstream or Github | No | IRC |
| elementary | Stack Exchange | No | No | No | Yes | No | Slack |
| Ubuntu | Yes | Yes | Yes | Yes | Yes | Yes | IRC |
| Debian | Yes | Yes | Yes | Yes | Yes | Yes | IRC |
| Fedora | Yes | Yes | Yes | Yes | Yes | Yes | IRC |
| Solus | Yes | No | Yes | No | Yes | No | IRC |
| openSUSE | Yes | Yes | Yes | Yes | Yes | Yes | IRC |
| Zorin | Yes | No | No | No | Forum Only | No | No |
| deepin | Yes | Yes | No | No | Yes | Yes | No |
| KDE neon | Yes | Yes | Yes | No | Yes | Yes | IRC |
| CentOS | Yes | Yes | Yes | No | Yes | Yes | IRC |
| ReactOS* | Yes | No | Yes | No | Yes | Yes | Webchat |
| Arch | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| ArcoLinux | Yes | No | No | No | No | No | Discord |
| Parrot | Yes | Debian Wiki | No | No | Forum Only | No | IRC/Telegram |
| Kali | Yes | No | Yes | No | Yes | No | IRC |
| PCLinuxOS | Yes | No | No | No | Forum Only | No | IRC |
| Lite | Yes | No | Yes | Yes | Yes | No | No |
*All are Linux distributions except ReactOS
Column descriptions:
- Distribution: Name of the distro
- Forum: Is there a support message board?
- Wiki: Is there a user-editable wiki?
- Community: Are there any links where I can directly contribute to the project?
- Membership: Can I become a voting member of the community?
- Bug Reporting: Is there a way to report bugs that I find?
- Mailing list: Is there an active mailing list for support, announcements, etc?
- Chat: Is there a way to talk to other people in the community directly?
What is this list?
These are the top 20 active distributions according to distrowatch.org over the past 12 months.
Things that I learned:
Only the well-funded, corporate-sponsored Linux distributions (Fedora, Ubuntu, openSUSE) have all categories checked. That doesn’t mean that anyone is getting paid for these activities. I believe it means that employees are probably the chief contributors, and that therefore more people are putting resources into helping.
Some distributions are “Pat’s distribution”: Pat’s group owns it, and Pat doesn’t want a steering committee or anyone else to have a say in how the distro works. Contributions in the form of bug reports may still be accepted, though.
A few distributions “outsource” resources to other projects. elementary lets Stack Exchange provide their forum. Parrot refers users to the Debian wiki. Mint suggests that you file bug reports with the upstream provider unless they concern a Mint-created application.
There are a few Linux distributions that leave me scratching my head. How is this in the top 20 distros on distrowatch? There’s nothing here and the forum, if there is one, is nearly empty. Who uses this?
What do you want from an open source project?
Do you want to donate your time, make friends, and really help a Linux distribution grow? Look at Fedora, Ubuntu, openSUSE, or Arch. These communities have ways to help you make that happen.
Do you want to just install a free OS on your machine and not worry about what goes into it until something breaks? Check out a Linux distribution with an active and friendly support community. Sometimes the more avenues the better. Sometimes you only need one really good and helpful forum.
Suggestions for distro owners:
Explicitly declare on your website what you want from the people who use your distribution and how they can help! Maybe you just need funding so you can quit your day job and do this full time. Maybe you really need well written bug reports and testers. Say so and help them help you!
Did I miss something? Did I say that you have no chat but you have a thriving community on IRC? Then let me know and I will update this blog post! Also, make sure that it is visible on your page and not hidden away.
On responsible vulnerability disclosure
Recently KDE had an unfortunate event. Someone found a vulnerability in the code that processes .desktop and .directory files, through which an attacker could create a malicious file that causes shell command execution (analysis). They went for immediate, full disclosure, and KDE didn't even get a chance to fix the bug before it was published.
There are many protocols for disclosing vulnerabilities in a coordinated, responsible fashion, but the gist of them is this:
- Someone finds a vulnerability in some software by studying its code, or through some other mechanism.
- They report the vulnerability to the software's author through some private channel. For free software in particular, researchers can use Openwall's recommended process for researchers, which includes notifying the author/maintainer as well as distros and security groups. Free software projects can follow a well-established process.
- The author and reporter agree on a deadline for releasing a public report of the vulnerability, or, in semi-automated systems like Google Project Zero, a deadline is established automatically.
- The author works on fixing the vulnerability.
- The deadline is reached; the patch has been publicly released, the appropriate people have been notified, systems have been patched. If there is no patch, the author and reporter can agree on postponing the date, or the reporter can publish the vulnerability report, thus creating public pressure for a fix.
The steps above gloss over many practicalities and issues from the real world, but the idea is basically this: the author or maintainer of the software is given a chance to fix a security bug before information on the vulnerability is released to the hostile world. The idea is to keep harm from being done by not publishing unpatched vulnerabilities until there is a fix for them (... or until the deadline expires).
What happened instead
Around the beginning of July, the reporter posts about looking for bugs in KDE.
On July 30, he posts a video with the proof of concept.
On August 3, he makes a Twitter poll about what to do with the vulnerability.
On August 4, he publishes the vulnerability.
KDE is left having to patch this in emergency mode. On August 7, KDE releases a security advisory in perfect form:
- Description of exactly what causes the vulnerability.
- Description of how it was solved.
- Instructions on what to do for users of various versions of KDE libraries.
- Links to easy-to-cherry-pick patches for distro vendors.
Now, distro vendors are, in turn, in emergency mode, as they must apply the patch, run it through QA, release their own advisories, etc.
What if this had been done with coordinated disclosure?
The bug would have been fixed, probably in the same way, but it would not be in emergency mode. KDE's advisory contains this:
Thanks to Dominik Penner for finding and documenting this issue (we wish however that he would have contacted us before making the issue public) and to David Faure for the fix.
This is an extremely gracious way of thanking the reporter.
I am not an infosec person...
... but some behaviors in the infosec sphere are deeply uncomfortable to me. I don't like it when security "research" is hard to tell from vandalism. "Excuse me, you left your car door unlocked" vs. "Hey everyone, this car is unlocked, have at it".
I don't know the details of the discourse in the infosec sphere around full disclosure against irresponsible vendors of proprietary software or services. However, KDE is free software! There is no need to be an asshole to them.
Tricks with IPFS
Since April I have been using IPFS.
Now I want to document some neat tricks and details.
When you have the hex-encoded sha256sum of a small file – for this example let’s use the GPLv3.txt on our media –
sha256sum /ipns/opensuse.zq1.de/tumbleweed/repo/oss/GPLv3.txt
8ceb4b9ee5adedde47b31e975c1d90c73ad27b6b165a1dcd80c7c545eb65b903
Then you can use the hash to address content directly by prefixing it with /ipfs/f01551220 so it becomes
/ipfs/f015512208ceb4b9ee5adedde47b31e975c1d90c73ad27b6b165a1dcd80c7c545eb65b903
In theory this also works with SHA1 and the /ipfs/f01551114 prefix, but then you risk experiencing non-unique content like
/ipfs/f0155111438762cf7f55934b34d179ae6a4c80cadccbb7f0a
And don't even think about using MD5.
For this trick to work, the file needs to be added with ipfs add --raw-leaves, and it must be a single chunk – by default 256 kB or smaller, but if you do the adding yourself, you can also use larger chunks.
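In practice, adding a file could look like this (the file names are examples, and the --chunker syntax assumes a go-ipfs style client):
# add as raw blocks, so the block hash is the plain sha256 of the content
$ ipfs add --raw-leaves GPLv3.txt
# for files above 256 kB, a larger chunk size keeps them in a single block
$ ipfs add --raw-leaves --chunker=size-1048576 bigger-file.txt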
Here is a decoding of the different parts of the prefix:
- /ipfs/ is the common path for IPFS-addressed content
- f is the multibase prefix for hex-encoded data
- 01 is for CID version 1
- 55 is for raw binary
- 12 is for sha2-256 (the default hash in IPFS)
- 20 is for the 32 byte = 256 bit length of the hash
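Putting the pieces together, the whole path can be derived from a file with standard shell tools, assuming it was added as a single raw block as described above:
$ echo "/ipfs/f01551220$(sha256sum GPLv3.txt | cut -d' ' -f1)"
/ipfs/f015512208ceb4b9ee5adedde47b31e975c1d90c73ad27b6b165a1dcd80c7c545eb65b903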
And finally, you can also access this content via the various IPFS-web-gateways:
https://ipfs.io/ipfs/f015512208ceb4b9ee5adedde47b31e975c1d90c73ad27b6b165a1dcd80c7c545eb65b903
You can also do the same trick with other multibase encodings of the same data – e.g. base2.
Base2 looks pretty geeky, but so far I have not found practical applications for it.
Kernel Adventures: Are USB Sticks Rotational Devices?
Using syslog-ng with the Elastic stack
One of the most popular destinations of syslog-ng is Elasticsearch. Any time a new language binding was introduced to syslog-ng, someone implemented an Elasticsearch destination for it. For many years, the official Elasticsearch destination for syslog-ng was implemented in Java. With the recent enhancements to the http() destination of syslog-ng, a new, native C-based implementation called the elasticsearch-http() destination is available.
Why do so many people want to send their logs to Elasticsearch? There are many reasons:
- it is an easy-to-scale and easy-to-search data store
- it is NoSQL: any number of name-value pairs can be stored (Hello, message parsing!)
- Kibana: an easy-to-use data explorer and visualization solution for Elasticsearch
And why use syslog-ng on the sending side? There are very good reasons for that, too:
- A single, high-performance and reliable log collector for all your logs, no matter if they come from network devices, the local system or applications. It can therefore greatly simplify your logging architecture.
- A high-speed data processor that parses both structured (JSON, CSV, XML) and unstructured (PatternDB) log messages. It can also anonymize log messages if required by policies or regulations, and reformat them to be easily digested by analyzers.
- Complex filtering, to make sure that only important messages get through and that they reach the right destination.
This blog post is based on the Elasticsearch-specific parts of the syslog-ng workshop I gave recently at the Pass the SALT conference in Lille, France.
Before you begin
The elasticsearch-http() destination was introduced in syslog-ng version 3.21. To be able to use it, you need HTTP and JSON support enabled in syslog-ng. If you installed syslog-ng from a package, these features might be in separate sub-packages, so that you can avoid installing extra dependencies. Depending on your distribution, the necessary package might be called syslog-ng-http (Fedora/RHEL/CentOS), syslog-ng-curl (openSUSE/SLES) or syslog-ng-mod-http (Debian/Ubuntu). Recent versions of the FreeBSD port enable these features by default.
If it is not available as part of your Linux distribution, check our 3rd party binaries page for downloads, or build it yourself from source.
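If your distribution does ship it, installation is a one-liner; on openSUSE, for example (package names as listed above):
$ sudo zypper in syslog-ng syslog-ng-curl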
Obviously, you also need Elasticsearch to be installed. The example configuration is tested to work with Elasticsearch 7.X. The minimal differences between 7.X and earlier versions from the syslog-ng configuration point of view will be noted.
Learn how to install syslog-ng and Elasticsearch 7 on RHEL/CentOS, our most popular platforms: https://www.syslog-ng.com/community/b/blog/posts/syslog-ng-and-elasticsearch-7-getting-started-on-rhel-centos
Learning syslog-ng
If you are new to syslog-ng, you can learn about the basics, its major building blocks and configuration from my blog at https://www.syslog-ng.com/community/b/blog/posts/building-blocks-of-syslog-ng. It is the generic part of the syslog-ng workshop I gave at the Pass the SALT conference in Lille.
Once you are confident with the basic concepts, it will be easier to read the documentation. It is massive (well over 300 pages, detailing all the smallest details of syslog-ng), and available at https://www.syslog-ng.com/technical-documents/list/syslog-ng-open-source-edition/
Elasticsearch
Originally, the official syslog-ng Elasticsearch driver was written in Java. It is still available, but will most likely be phased out once the new elasticsearch-http() is fine-tuned. The Java driver has several disadvantages: it cannot be included in Linux distributions, and it requires a lot more resources. The new elasticsearch-http() destination is a wrapper around the http() destination of syslog-ng, written in native C code. As it does not have any “esoteric” dependencies, it can be part of any Linux distribution, and except for some extreme load cases it is a lot less resource-hungry than the Java-based destination.
Below you can see a very basic syslog-ng configuration, which saves logs locally and sends the same logs to Elasticsearch as well. This way you can easily check whether your logs arrive in Elasticsearch.
@version:3.21
@include "scl.conf"

source s_sys { system(); internal(); };
destination d_mesg { file("/var/log/messages"); };
log { source(s_sys); destination(d_mesg); };

destination d_elasticsearch_http {
    elasticsearch-http(
        index("syslog-ng")
        type("")
        url("http://localhost:9200/_bulk")
        template("$(format-json --scope rfc5424 --scope dot-nv-pairs
            --rekey .* --shift 1 --scope nv-pairs
            --exclude DATE --key ISODATE @timestamp=${ISODATE})")
    );
};

log {
    source(s_sys);
    destination(d_elasticsearch_http);
    flags(flow-control);
};
The configuration above sends log messages to Elasticsearch using the new elasticsearch-http() destination. You need to set an index name and a URL. The type() option is also mandatory, but for Elasticsearch 7.X you should leave it empty. You can see that the Elasticsearch destination uses a complex template (namely, it uses JSON formatting and sends not only syslog data, but name-value pairs parsed from messages as well).
Name-value pairs created by the out-of-the-box parsers of syslog-ng start with a dot. When formatted into JSON, these initial dots would be turned into underscores, which is problematic with Elasticsearch. In the template above, the initial dots are simply removed. While this is OK in most cases, in your specific environment it might cause problems (namely, overwriting existing name-value pairs), so double-check the names of your name-value pairs before using this template.
Elasticsearch prefers the ISODATE date format over the traditional syslog date format, which is why timestamp is replaced on the last line of the template.
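Before restarting syslog-ng with a new configuration, it is worth checking the configuration for syntax errors first (the service name assumes a systemd-based system):
$ syslog-ng --syntax-only
$ sudo systemctl restart syslog-ng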
You can learn a lot more about configuring syslog-ng for Elasticsearch from the syslog-ng documentation. Here I would like to highlight two differences from Beats/Logstash:
- If you want to feed a cluster of Elasticsearch nodes using syslog-ng, you have to list the nodes in the url() parameter; there is no automatic node discovery (see the sketch after this list).
- By default, syslog-ng sends all data to Elasticsearch as strings, which limits how the data can be used. You can use mapping on the Elasticsearch side, and you can also configure data types on the syslog-ng side: https://www.syslog-ng.com/technical-documents/doc/syslog-ng-open-source-edition/3.16/administration-guide/9#TOPIC-956418
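As a sketch of the first point, listing several nodes could look like the following, assuming your syslog-ng version accepts space-separated URLs in url() (the hostnames are placeholders):
destination d_elasticsearch_cluster {
    elasticsearch-http(
        index("syslog-ng")
        type("")
        # all cluster nodes listed explicitly - there is no node discovery
        url("http://es-node1:9200/_bulk http://es-node2:9200/_bulk")
    );
};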
Testing
The configuration above sends all system logs to the Elasticsearch destination as well, so you will most likely have some sample logs in Elasticsearch very soon. If your test machine does not produce any logs within a reasonable time frame, you can use the logger utility to send a few test messages:
logger this is a test message
Even without extra configuration, you can see results of message parsing in Elasticsearch. Recent (3.13+) versions of syslog-ng parse sudo log messages automatically. If you run any commands through sudo, you should see name-value pairs parsed from the sudo messages.
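A quick way to check for them is to query Elasticsearch directly (the index name is the one configured above; the PROGRAM field name assumes the rfc5424 scope of the template):
$ curl 'http://localhost:9200/syslog-ng/_search?q=PROGRAM:sudo&pretty'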
What is next?
Learn what is new with Elasticsearch 7 and syslog-ng, and how to send geographical data from syslog-ng to Elasticsearch along the way: https://www.syslog-ng.com/community/b/blog/posts/syslog-ng-with-elastic-stack-7
If you have questions or comments related to syslog-ng, do not hesitate to contact us. You can reach us by email or you can even chat with us. For a long list of possibilities, check our contact page at https://syslog-ng.org/contact-us/. On Twitter I am available as @Pczanik.