Dusk OS

Dusk OS is a 32-bit Forth and big brother to Collapse OS. Its primary purpose is to be maximally useful during the first stage of civilizational collapse, that is, when we can no longer produce modern computers but there are still many of them around.

It does so by aggressively prioritizing simplicity at the cost of unorthodox constraints, while also aiming to make operators happy.

Dusk OS innovates by having an "almost C" compiler, which lets it piggy-back on UNIX C code through a modest porting effort. This way, it can reach its goals and stay true to its design constraints with minimal effort.

You can read on for more details, but the impatient among you might prefer looking at the asciinema and video demos of Dusk OS.

Note that the contents of this website are also served on gopher://duskos.org through a server running in Dusk as a Dusk package!

Status

List of ported codebases:

List of homegrown applications:

Getting Dusk

Dusk OS can be downloaded as a tarball (without SSL). It is also available as a Git repository at:

git://git.duskos.org/duskos.git

More information about how to build and run Dusk OS is available in the README.md file at the root of the project.

For deployments to actual machines, there's also the Dusk OS Deployments repository that can be of use.

There is also the option of building Dusk Packages on top of other OSes. You can look at Dusk Packages examples for a quick start.

Why build this OS?

Most modern operating systems can do whatever we want them to do. Why do we need another one? Simplicity.

It's difficult to predict post-collapse conditions, but we can suppose that many operators will need to use their machines in novel and creative ways. Hackability of the operating system then becomes paramount. Modern open source operating systems can all be modified to fit their users' needs, but their complexity limits the likelihood that the user is able to do so. A simpler OS increases this likelihood.

But we can't have our cake and eat it too, right? Either you have a simple toy OS or a complex one. Well, maybe not?

The authors of Dusk OS believe that Forth has been under-explored in the history of computing. Forth's approach to simplicity is, we think, revolutionary. It has significant shortcomings when system specifications become more complex (Forth hates complexity and doesn't manage it well), but we believe it possible to elegantly marry it with languages that like complexity better.

This mix, we believe, could provide the operator with computing powers rarely seen with other approaches. We've got to try it.

To be clear: this is a research project and we don't know beforehand what it will yield. We have the intuition that it might lead to a big "aha!" moment and reveal a breathtaking combination of power and simplicity.

Features making Dusk OS special

A whole OS built from source on boot

One thing that makes Dusk OS special is that it boots from a very tiny core (1000 lines of i386 assembly). From this tiny core, on boot, it builds its way up to a system that has a functional C compiler, which then allows it to bootstrap itself some more.

This peculiarity of Dusk OS has interesting properties. The nicest one, in my humble opinion, is that it allows us to sidestep the whole problem of binary compatibility and relocation and only deal with source compatibility. So, no ELF, no binutils, only code that is designed to run from where it was generated in the first place. This is so much simpler!

Object files? Global symbols? Nah. C functions are simple Forth words.

Harmonized Assembly Layer

Dusk features what we call the Harmonized Assembly Layer (HAL for short). This is a cross-CPU assembler on which the C compiler relies. It prioritizes simplicity of implementation and usage, but is also designed to generate efficient native code.

Shortest path to self-hosting for an "almost C" compiler

Dusk OS self-hosts in about 1000 lines of assembly and a few hundred lines of Forth (the exact number depends on the target machine). From there, it bootstraps to DuskCC, which is roughly 1500 lines of Forth code. To my knowledge, Dusk OS is unique in that regard.

You can pick any C compiler that requires POSIX and it will automatically require orders of magnitude more lines of code to bootstrap because you need that POSIX system in addition to the C compiler. So even if you pick a small C compiler such as tcc, you still need a POSIX system to build it, which is usually in the millions of LOCs.

To be fair, Dusk OS is not the first project to think about optimizing that path. Efforts at making our modern software world bootstrappable led to M2-Planet, an "almost C" compiler with a feature set comparable to DuskCC's in very few lines of code. M2-Planet itself is about 5K lines of code and the various stages that lead to it are generally a few hundred lines each. The project initially ran on top of regular kernels (as in "fat kernels with lots of code"), but some bare metal stages (1, 2) were created and now this little chain ends up being comparable to Dusk in terms of lines of code. Still more than Dusk, but in the same ballpark.

Although this path is short and technically leads you to an "almost C" compiler, you can hardly use it because it has no "real kernel" (those bare metal stages mentioned above are enough to compile M2-Planet, but really not much else, they're extremely limited) and no shell. You'll need those if you want to use your shiny compiler.

One of your best picks, should you try this path, would be Fiwix, a minimal POSIX i386 kernel weighing less than 50K lines of C+asm. But then, M2-Planet is not enough. You need to compile tcc (which M2-Planet can compile after a few patches are applied), which weighs 80K. Userspace is worse. Bash+coreutils are 400K, even busybox is 190K. We still end up with a pretty minimal and simple system, but it's still a lot more code than Dusk.
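To put rough numbers on that comparison, here is a small C sketch that adds up the figures quoted above. It is only back-of-the-envelope arithmetic: the names (path_loc, dusk_total, posix_total) are made up for this illustration, and the 500 lines used for the Forth bootstrap is an assumption, since the text only says "a few hundred lines".

```c
#include <assert.h>
#include <stddef.h>

/* One component of a bootstrap path, sized in lines of code (LOC). */
struct component {
    const char *name;
    long loc;
};

/* Sum the LOC of every component in a path. */
long path_loc(const struct component *path, size_t n) {
    long total = 0;
    assert(path != NULL);
    for (size_t i = 0; i < n; i++) {
        total += path[i].loc;
    }
    return total;
}

/* Dusk path. The Forth figure is an ASSUMPTION: the text only says
   "a few hundred lines", so 500 is used as a placeholder. */
static const struct component dusk_path[] = {
    { "assembly core", 1000 },
    { "Forth bootstrap (assumed)", 500 },
    { "DuskCC", 1500 },
};

/* POSIX path, using the upper bounds quoted in the text. M2-Planet and
   its earlier stages are left out, which only favors this path. */
static const struct component posix_path[] = {
    { "Fiwix kernel", 50000 },
    { "tcc", 80000 },
    { "busybox", 190000 },
};

long dusk_total(void) {
    return path_loc(dusk_path, sizeof dusk_path / sizeof dusk_path[0]);
}

long posix_total(void) {
    return path_loc(posix_path, sizeof posix_path / sizeof posix_path[0]);
}
```

With these inputs, dusk_total() gives 3000 and posix_total() gives 320000, so the two paths are roughly two orders of magnitude apart. Swapping busybox for Bash+coreutils (400K) only widens the gap.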

So, unless someone tells me about some option I don't know about, DuskCC is quite innovative on the aspect of self-hosting path length.

Dusk OS is pretty fast

The code generated by Dusk OS holds up pretty well against modern compilers with fancy optimizations. For example, let's use the famous "Byte Sieve" benchmark and modify it slightly for modern CPU capabilities:

#include <stdio.h>
#define size 6000000
char flags[size+1];
void main() {
  int i, prime, k, count, iter;
  printf("10 iterations\n");
  for (iter=1; iter<=10; iter++) {
    count = 0;
    for (i=0; i<=size; i++) flags[i] = 1;
    for (i=0; i<=size; i++) {
      if (flags[i]) {
        prime = i + i + 3;
        k = i + prime;
        while (k <= size) {
          flags[k] = 0;
          k += prime;
        }
        count++;
      }
    }
  }
  printf("\n%d primes", count);
}

On my 10-year-old desktop running NetBSD 10.0 i386, doing gcc -o sieve sieve.c (GCC version is 10.5.0) produces a 15752-byte sieve ELF. Doing time ./sieve completes in 1.22 seconds.

Doing the same with gcc -O2 yields a 15780-byte ELF that completes in 0.46 seconds.

Now let's take the same C file and compile it through DuskCC. The only modifications needed are the removal of the #include and renaming main() to sieve(). The rest stays the same.

To have a fair comparison, the code will need to run "raw" on the CPU. A good way to do so is to use Usermode Dusk. The Dusk package examples repository even has a sievec example that does exactly this. So, if we do:

git clone git://git.duskos.org/dusk-examples.git
(verify PGP signature)
cd dusk-examples
make sievec_frozen
time ./sievec_frozen

If this fails, try make sievec && time ./sievec, but this is going to be slightly slower.

This produces a 339-byte word named sieve and running it takes 1.23 seconds. Roughly as good as GCC without -O! The result of the computation is the same, of course.
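If you want to cross-check that two builds really compute the same thing, one option is to run the same inner loop at the original Byte Sieve size, where the expected answer is well known. The sketch below is plain hosted C, not DuskCC code, and sieve_count is a hypothetical helper name that does not come from the Dusk examples repository. It keeps the loop from the benchmark above but takes the size as a parameter and returns the count instead of printing it.

```c
#include <assert.h>
#include <stdlib.h>

/* One pass of the Byte Sieve over flags[0..size]; returns the number of
   primes found, or -1 if memory could not be allocated.
   Index i stands for the odd number 2*i + 3. */
long sieve_count(long size) {
    long i, k, prime, count = 0;
    char *flags;

    assert(size > 0);
    flags = malloc((size_t)size + 1);
    if (flags == NULL) return -1; /* allocation failed */

    for (i = 0; i <= size; i++) flags[i] = 1;
    for (i = 0; i <= size; i++) {
        if (flags[i]) {
            prime = i + i + 3;
            k = i + prime;
            while (k <= size) {
                flags[k] = 0; /* strike out odd multiples of prime */
                k += prime;
            }
            count++;
        }
    }
    free(flags);
    return count;
}
```

Calling sieve_count(8190) returns 1899, the figure the classic benchmark reports. Any port of the loop, whether compiled by GCC or rewritten for another compiler, should agree with it at that size before you start comparing timings.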

What about Forth code? Let's see:

6000000 const size 
create flags size 1+ allot

0 value count
0 value prime
: sievef ( -- )
  ." 10 iterations\n"
  10 for
    0 to count
    flags size 1 fill
    0 size for2 i flags + c@ if
      i << 3 + dup to prime
      i + begin ( k ) dup size <= while
        0 over flags + c! ( k )
        prime + repeat drop ( ) 
    to1+ count then next next
  count .f" \n%d primes" ;

: sieveh ( -- )
  ." 10 iterations\n"
  10 for
    0 to count
    flags size >> >> $01010101 fill
    0 r! [ ( W=0 ) 0 i) A>) @, begin
      W) flags +) 8b) A>) compare, 0 Z) branchC,
        1 i) <<, 3 i) +, S) &) !, RSP) +,
          size i) compare, 0 >) branchC, begin
          W) flags +) A>) 8b) !, S) &) +,
          size i) compare, <=) branchC, drop then
      1 to' count m) +n, then 1 RSP) +n,
      RSP) @, size i) compare, NZ) branchC, drop ] ( W )
    rdrop drop next count .f" \n%d primes" ;

The sievef version is the Forth-only version of the code. It weighs 392 bytes and completes in 3.14 seconds.

The sieveh word (also available in examples as sieve) is a HAL-optimized version of the same word, weighing 272 bytes and completing in... 0.53 seconds! Almost as good as gcc -O2!

So, Dusk OS provides you with speed not too far away from what modern compilers with their millions upon millions of lines of code provide you, but through an other-worldly simpler system.

Who is Dusk for?

Dusk OS doesn't have users, but operators. What's the difference? Control. You use a phone, you use a coffee machine, hell you even use a car these days. But you operate a bulldozer, you operate a crane, you operate a plane.

You use Linux, you use Windows. You operate Dusk OS.

Can you operate Linux? Sure, if you're some kind of god[1], in the same way that you can operate a Tesla if you're a top Tesla engineer. But you're much more likely to be able to operate a lawnmower than a Tesla.

The Dusk operator is someone who's creative, close to hardware, can read a datasheet. Dusk shines when one wants to poke around the hardware without limit.

It compares favorably to other, more complete OSes because there's no concurrent process to mess with your poking and the driver structure is more approachable and hackable, thanks to its stricter scope and savvier target audience. Also, there is no "userland". Every tool that Dusk provides or that the operator builds can be directly used on system memory. No middleman.

Let's use an example. Let's say you're on a notebook that runs on a chipset of Intel's ICHn family. You read the datasheet and see "oh, nice, there's an SPI interface in there. Maybe it's not hooked to anything on the notebook, let's play with it."

Now, that chipset is very, very central to the computer. There's a good chance, on a BSD or Linux system, that if you begin poking around its registers, you'll step on someone else's toes and crash the system because, for example, some other process needed to read from disk at the same time.

In Dusk, you could completely break the SATA controller and you'd still be golden as long as you don't access mass storage. Because Dusk doesn't have concurrency, you have tight control over what happens or doesn't happen on the machine, so all you need to do is avoid words that access mass storage. That gives you ample wiggle room for your hacking session.
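To make "poking around registers" a bit more concrete, here is a minimal C sketch of memory-mapped register access. It is an illustration of the general technique, not Dusk code and not anything taken from an ICH datasheet: the function names are hypothetical, and the base pointer is whatever register window your datasheet describes. On a hosted OS, you can only try it against a plain array standing in for the hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit register at byte offset `off` from `base`.
   `volatile` keeps the compiler from caching or dropping the access. */
uint32_t reg_read32(volatile uint32_t *base, size_t off) {
    assert(off % 4 == 0); /* 32-bit registers are 4-byte aligned */
    return base[off / 4];
}

/* Write a 32-bit value to the register at byte offset `off`. */
void reg_write32(volatile uint32_t *base, size_t off, uint32_t val) {
    assert(off % 4 == 0);
    base[off / 4] = val;
}

/* Set bits in a register without disturbing the other bits. */
void reg_set_bits32(volatile uint32_t *base, size_t off, uint32_t bits) {
    reg_write32(base, off, reg_read32(base, off) | bits);
}
```

On a system without a userland or concurrent processes, there is nothing between helpers like these and the hardware, which is the whole point of the paragraph above: a read-modify-write like reg_set_bits32 can't be interleaved with someone else's access to the same register.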

To be clear: this is also possible with a custom-made BSD or Linux, but you're going to have to strip a lot of pieces from your distro before you get there, and some of those pieces might be useful debugging tools that are difficult to retrofit because they need a wider system. You'll also need more cognitive space to fit BSD/Linux's wider abstractions in your mind.

Funding

I began a sabbatical from working on modern technology in 2023, which I hope to extend indefinitely. I believe that my work on Dusk OS and Collapse OS could be significant enough to some people that it could end up being philanthropically funded. If you're rich and inspired by this work, please consider it.

Barring that, I also offer coaching and training services on the subject of low level programming.

Resources


  1. Can you fix bugs in GCC, awk, perl and the SATA drivers? Yeah, you're some kind of god.