Linux Weekly News (LWN) is a treasure trove of high-quality articles about the Linux kernel, C, Python, tools and Linux distributions. Collected below are my favorite articles from LWN:
2023
- Beyond microblogging with ActivityPub: Blogging, link aggregation, photo sharing, video sharing, book sharing and other services that are being built around ActivityPub, besides Mastodon.
2022
- Hybrid scheduling gets more complicated: Intel’s hybrid CPUs (such as Alder Lake) combine P-cores (for performance) and E-cores (for efficiency, aka Atom). To guide the kernel on which process to move to which core, each core has a register that shows which performance class best suits the currently running process.
- Docker and the OCI container ecosystem: Containers depend on 3 features: bind mounts & overlayfs (to construct the container FS), control groups (to partition CPU/mem/IO) and namespaces (to keep processes inside the container isolated). Most Docker features have been standardized as Open Container Initiative (OCI) specs for image, distribution and runtime. Docker is just one of many OCI-compatible container runtimes now.
- The trouble with symbolic links: Hard links can point only to files, and only within a single filesystem. Symlinks can point to dirs or files, across filesystems, but they are prone to race conditions: a symlink somewhere in a path can change while a program is operating on that path.
- Native Python support for units?: Why adding custom literals (like in C++), units and unit conversion is turning out to be difficult in Python.
- An Ubuntu kernel bug causes container crashes: Ubuntu picks a long-term maintenance kernel branch for LTS releases and any branch for non-LTS releases. For LTS, it also publishes new hardware enablement (HWE) stacks with backports from recent LTS/non-LTS releases. OverlayFS (and AUFS) overlay files in an “upper” directory over files in a “lower” directory, with upper files overriding lower ones when the paths are the same. ShiftFS remaps user/group IDs in a mounted filesystem. /proc/<PID>/map_files shows address->file mappings for that process.
- An overview of structural pattern matching for Python: Python finally gets switch-case, but it’s called match-case and can do pattern matching.
- Modern Python performance considerations: Different builds of CPython (like Ubuntu/Win/MacOS) have different performance. None of the projects (Pyston/Faster CPython/Cinder) looked like they’d do anything to reduce the 10-100x perf gap with C.
- Introducing PyScript: New in-browser Python interpreter that executes the Python code embedded within py-script tags in HTML. Created by compiling CPython’s C code to WebAssembly using the Emscripten compiler. “Python is like a Honda Civic with mounting bolts for a warp drive”. Engraving at Boston Public Library: “The commonwealth requires the education of the people as the safeguard of order and liberty.”
- Whatever happened to SHA-256 support in Git?: SHA-256 support was added in 2020 after SHA-1 was considered broken. But with no Git hosting providers supporting it, and SHA-1 and SHA-256 repos not being interoperable, there is no visibility into when most folks would use it.
- NFS: the new millennium: 2nd and final part about NFS looks at how state management was added to NFS. NFSv4.1 in 2010, updated in 2020. NFSv4.2 in 2016.
- NFS: the early years: NFSv2 was the first publicly available version, introduced by Sun in 1984 and described in RFC 1094. The 32-byte file handle, unique within an NFS server, was its central feature. There was no state to handle file open/lock/cache, which led to several problems. (Unix Ed7/BSD did not have file locking; .lock lock files were used instead.) Other protocols handled locking, status monitoring, mounting, remote quotas and ACLs. NFSv3 came in 1995 to support larger files.
- Two memory-tiering patch sets: A system can have many types of memory which can be faster (GPU HBM, CXL) or slower (persistent) than DRAM. A memory tier system is introduced with higher IDs for faster memory: MEMORY_TIER_DRAM=200, _HBM_GPU=300, _PMEM=100. Memory tier config on any system can be queried through /sys/devices/system/memtier/. Pages can be demoted from DRAM to slower memory, while another changeset allows slower tier pages to be moved to faster memory.
- A “fireside” chat: Linus Torvalds has been working remotely for 20 years. Every Linux release has had a 9-10 week development cycle in the past 15 years. Linus is more famous among CS students for Git (which he worked on for 6 months before handing off to Junio Hamano) than for Linux.
- Vetting the cargo: Mozilla introduces a cargo vetting process using audited package lists for Rust.
- Per-file OOM badness: OOM killer will now also take into account the memory attached to files of a process, when choosing the victim process to kill.
- Improved error reporting in CPython 3.10 and later: Made possible by replacing the old LL(1) parser with a PEG (parsing expression grammar) parser.
- CXL 1: Management and tiering: Compute Express Link (CXL) is a cache-coherent interconnect for connecting CPU, RAM and accelerators (like GPU) between nodes. That is, you could have a CPU-only node accessing its RAM from a RAM-only node through CXL.
- Super in Python Part 2: Looks at the super() call with 2 arguments and its complexities. A bare super() is equivalent to super(ThisClassName, self). super(X, self) takes the MRO of self’s class, skips ahead to just after X, and resolves the call from there.
- Risks of embedded bare repositories in Git: Bare repositories embedded inside a normal repo are apparently a vector for a user inadvertently running arbitrary commands. The article shows it with a good example and commands. I don’t see why this bare repository feature is essential, just kill it.
- KOReader on Kobo review: Seems like lots of features, but super slow.
- Super Python Part 1: With multiple inheritance, super() follows the MRO (Method Resolution Order) chain, a topological ordering of the classes computed by C3 linearization. This is confusing because in a diamond example, the super() calls starting from Bottom go Left->Right->Top, while you would normally expect Left->Top.
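
The diamond case from the two super() entries above can be reproduced in a few lines. This is a minimal sketch: the class names Top, Left, Right and Bottom follow the notes, while the hello method and the trail list are hypothetical names used only to record the call order.

```python
class Top:
    def hello(self, trail):
        trail.append("Top")
        return trail


class Left(Top):
    def hello(self, trail):
        trail.append("Left")
        # Next class in the MRO of type(self), not necessarily Top.
        return super().hello(trail)


class Right(Top):
    def hello(self, trail):
        trail.append("Right")
        return super().hello(trail)


class Bottom(Left, Right):
    def hello(self, trail):
        trail.append("Bottom")
        return super().hello(trail)


bottom = Bottom()

# The C3 linearization of the diamond.
mro_names = [cls.__name__ for cls in Bottom.__mro__]
# ['Bottom', 'Left', 'Right', 'Top', 'object']

# Zero-argument super() walks the whole chain: Left hands off to Right.
full_chain = bottom.hello([])
# ['Bottom', 'Left', 'Right', 'Top']

# Two-argument super() starts the lookup just after Left in that same MRO.
skipped = super(Left, bottom).hello([])
# ['Right', 'Top']
```

Left never names Right anywhere, yet its super() call lands there because the lookup uses the MRO of the instance's class (Bottom), not Left's own base classes.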
2021
2020
2019
2018
- Feedforward network to classify bugfix patches: Feedforward neural networks with three layers: input, hidden, and output. The layer organization is a way of describing the formula for making the decision; there are weights associated with each of the steps in the paths through the network. But where do those weights come from? The weights come from the training process. Data that has results with expected values can be used to “back-propagate” weights in order to tune the model. Training is a process of “moving in some direction in our weight space” to produce better results, Lawall said. It is a hill-climbing problem where each iteration tries to improve on the last. At some point, she said, you decide that the error is small enough and stop. Their network is called PatchNet.
- Make CI as addictive as a slot machine: “But the trick in making CI really effective is to make it about as addictive as a slot machine. Instant gratification and quick results is absolutely key, and developers want to watch it do its things. The more you can give them that, the more your developers will care about CI results (the green checkmark quickly starts to look like candy in your eyes), and the less maintainers have to check this themselves.”
- Terminal emulator comparison by LWN: Comparison based on latency, scroll speed and resource usage. Xterm has low latency, but slow speed. Urxvt has speed, but bad latency.
- Macro to check if input is an ICE (integer constant expression): See in LWN article. Linus Torvalds explaining how it works.
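
The training loop described in the feedforward network entry above can be sketched in plain Python. This is a toy illustration under stated assumptions, not PatchNet's actual architecture: the data set (a logical OR of two features), the hidden layer size, the learning rate and all function names are made up for the example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(w_hidden, w_out, inputs):
    # Each weight row carries a trailing bias weight, fed by a constant 1.0.
    hidden = [sigmoid(sum(w * x for w, x in zip(row, inputs + [1.0])))
              for row in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden + [1.0])))
    return hidden, output


def mean_squared_error(w_hidden, w_out, samples):
    return sum((forward(w_hidden, w_out, x)[1] - y) ** 2
               for x, y in samples) / len(samples)


def train(samples, hidden_size=3, rate=0.5, epochs=2000, seed=0):
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)]
                for _ in range(hidden_size)]
    w_out = [rng.uniform(-1, 1) for _ in range(hidden_size + 1)]
    initial_error = mean_squared_error(w_hidden, w_out, samples)
    for _ in range(epochs):
        for inputs, target in samples:
            hidden, output = forward(w_hidden, w_out, inputs)
            # Back-propagate the error signal: output unit first, then hidden units.
            delta_out = (output - target) * output * (1 - output)
            delta_hidden = [delta_out * w_out[j] * hidden[j] * (1 - hidden[j])
                            for j in range(hidden_size)]
            # Take a small step downhill in weight space.
            for j, h in enumerate(hidden + [1.0]):
                w_out[j] -= rate * delta_out * h
            for j in range(hidden_size):
                for i, x in enumerate(inputs + [1.0]):
                    w_hidden[j][i] -= rate * delta_hidden[j] * x
    return w_hidden, w_out, initial_error


# Toy labelled data: the label is 1 when either input feature is present.
samples = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
           ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
w_hidden, w_out, initial_error = train(samples)
final_error = mean_squared_error(w_hidden, w_out, samples)
```

Comparing initial_error with final_error shows the "decide the error is small enough and stop" idea; here the loop simply runs for a fixed number of epochs instead of checking a threshold.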
2017
2013
- BTRFS filesystem: B-tree filesystem. ext2 keeps pointers to individual disk blocks. This means the metadata becomes quite large as files get bigger. In contrast, (ext4 and) BTRFS keep pointers to extents: sets of contiguous blocks. COW (Copy on write): Changed blocks are written elsewhere, so older blocks are still kept on disk (until garbage collected when the disk gets full). Thanks to COW, no journal is needed (unlike ext4), so recovery from a crash is easy. COW also enables easy snapshots of files (or history). The FS can also span across disks and volumes. Checksums for data and metadata catch any disk error.
- Free drivers for ARM graphics
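
The metadata saving from extents in the BTRFS entry above can be seen in a small sketch. It only illustrates the idea of replacing per-block pointers with (start, length) pairs; the function name and the block numbers are hypothetical and have nothing to do with the on-disk format of any real filesystem.

```python
def blocks_to_extents(blocks):
    """Collapse a sorted list of block numbers into (start, length) extents."""
    extents = []
    for block in blocks:
        if extents and extents[-1][0] + extents[-1][1] == block:
            # The block directly follows the last extent, so grow that extent.
            start, length = extents[-1]
            extents[-1] = (start, length + 1)
        else:
            # A gap on disk starts a new extent.
            extents.append((block, 1))
    return extents


# Seven per-block pointers (ext2 style) for one file...
file_blocks = [100, 101, 102, 103, 500, 501, 900]

# ...become three extent records.
extents = blocks_to_extents(file_blocks)
# [(100, 4), (500, 2), (900, 1)]
```

The less fragmented a file is, the fewer extent records it needs, which is why extent-based metadata stays small even for large files.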
2004