My Blog

This blog's still very much WIP - there's not much content, and posts don't yet have a table of contents. I'm currently working on a Lua runtime to embed in my website's compiler; once that's done, this will start looking a lot nicer.

So, TLDR, you may want to check back in a couple of weeks.

A Rhetorical Analysis of the Late Great XZ-Utils Backdoor (Sorta)

Analyzing the human-to-human interaction (ew) involved in the XZ-Utils breach. I wonder if this will format correctly on the website because it is NOT on my programming backend.

Compile Times are Run Times, Too

I never actually finished this blog post. It has some interesting points. It's pretty rambly. Who knows if I'll finish it; but it loosely describes the problem I want to solve in programming.

AI is Unsustainable

As an AI chatbot, please stop investing in this technology.

Truly A Blog Post

wowza a cool description
This one's just a test post.

AI is Unsustainable

There's been an insane spike in AI usage over the past years. It's to the point that I don't think I even need to explain that sentence - you probably know it just as well as I do.

Although, if you do want some numbers:

ChatGPT's number of weekly users doubled from 100 million to 200 million users between November of 2023 and August of 2024 (less than a year!).
ChatGPT was released in November of 2022. In the two years since then, there have been at least 3 more major company-backed LLMs released (Google's Gemini, Anthropic's Claude, and Meta's Llama).
By March of 2023, according to Pew Research, nearly 60% of US adults had heard of ChatGPT (and let's be honest - getting 60% of US adults to know about anything is impressive!).

I've even had a teacher encourage AI usage on assignments.

Note:
Technically, using "AI" here is a bit of a misnomer. Artificial intelligence is an incredibly broad field, and I'm only really talking about one specific aspect of it: Commercial LLMs, the Large Language Models (like ChatGPT and Gemini) being developed commercially by companies. I'm still going to say AI in this post so it's easy to read, but I wanted to leave this disclaimer to remain accurate.

The Problem

While it's cool to see a product spreading so rapidly, it's unfortunately a very problematic technology. There are tons of problems and questions with AI. Many of these problems aren't ones I'm going to address in this post, because they can get rather philosophical and lengthy - I'm not, for example, going to discuss if this technology is truly intelligent.

However, there are three objective problems with AI that you should be aware of:

AI is damaging to the environment, using absurd amounts of water and power. Its power usage is also causing increases in carbon emissions.
Companies are overselling the future of AI. It will not grow to meet their claims.
AI is not financially sustainable, nor profitable. It is draining immense sums of money from our economy.

I Don't Care

These problems seem large and abstract, and I can completely understand someone just not caring. It's easy to write this off as something that doesn't concern us, or something that's too big to change. I thought for a while that this tech would just breeze over and we'd move on after a bit... but, it hasn't. Instead of breezing over, AI has only become more relevant to me and to you. Let me explain how:

AI is relevant because companies are forcing it onto us. iOS, macOS, Android, and Windows are now all introducing (or planning to introduce) AI technology directly onto our smartphones and computers. This technology - which wasn't even relevant until the release of ChatGPT 2 years ago - is now being shoved into everything from web searches to app notifications.
AI is relevant because it's affecting the planet we live on and the economy we live in. As I'll show later in this post, AI is devouring water, electricity, and money at a faster rate than any technology we've seen before. This can worsen droughts in areas low on water and adds strain to the economy we all have to deal with.
AI is relevant because remaining ignorant on the topic allows companies to take advantage of us. For as long as people remain ignorant on the topic, companies will manipulate those people by pushing more AI products to them. Doing so maintains hype, attracts potential customers, and draws in investors - all of which boosts companies' precious profits. Just being more aware of AI and its downsides makes you more resilient to this manipulation.

I'm writing this post because AI is relevant to you. You deserve to know its consequences, and what you can do to address them.

Compute Power Needed for AI

I want to start off with the first fact that companies never mention - AI is super resource-intensive to run. Sure - we've all had apps that take forever to load; computers taking ages to boot; the videogame that makes your laptop's fans run so hard, it sounds like the laptop's going to take off like a hovercraft - but this is on a different level. AI requires entire datacenters of compute power to run. We have buildings, today, that do nothing except process the requests of AI users across the world.

It may sound like I'm being vague about exactly how much compute power AI needs, and you'd be right. AI companies hide figures like this; we don't get to know how many datacenters AI companies use or even where those datacenters are.

However, there are hints as to how much compute power they need. OpenAI seriously proposed building multiple 5-gigawatt datacenters across the US (5 gigawatts being roughly equivalent to the combined output of 5 nuclear reactors). They've also stated in their blog that they planned on using thousands to tens of thousands of datacenter computers owned by Microsoft for their programs. That blog post is from all the way back in 2016 - six years before ChatGPT was released - and OpenAI's computing needs have absolutely risen dramatically since then. In a much newer blogpost announcing the release of GPT-4, OpenAI even mentions building a literal supercomputer to run the model.

Side Experiment: Just How Much Compute Power is Needed for LLMs?

You can actually run Meta's Llama LLM locally on your machine with ollama. I have a pretty powerful machine, and was curious how an LLM would perform on it.

Warning:
This section ended up being super long. Like, it's long enough to warrant its own blog post. You can skip to the bottom for the conclusion.

For reference, my PC's specs are:

AMD Ryzen 9 5900X CPU (12 cores, 24 threads, overclocked to 4.9ghz)
32gb of RAM
AMD Radeon RX 6800 GPU (16gb VRAM, 60 compute units, 1815mhz)
Running Arch Linux 6.11

This is a pretty chunky build - I easily play any game, usually at high (sometimes max) graphics. I can run Cyberpunk 2077 raytraced at roughly 70fps. I can also crack passwords in the rockyou.txt list in under 30s. PCGameBenchmark rates my setup at 98%; more powerful machines can be built, but my build approaches top-of-the-line for common consumers. Just keep this in mind when I discuss the performance of the LLMs below.

Back to LLMs - an LLM's size can more-or-less be compared by the number of parameters it has. I'm not going to go over what exactly a parameter is, but they kind of make up the "brains" of an LLM. More parameters allow an LLM to generate better output (assuming it's trained well), but they will also require more compute power to run. GPT-3 has 175 billion parameters. It's not publicly known how many parameters GPT-4 has, but it's rumoured to be over 1 trillion.

The largest Llama model I found from a quick search seemed to be the Llama 3.1 405b model. The 405b means it has 405 billion parameters, which is even larger than GPT-3. Unfortunately, this model requires over 200gb of free space to install. That would take forever to download, and I'd have to actually clean my files to make room for it... which, like, ew.

So I went with the next biggest one: The Llama 3.1 70b model (again, 70b meaning 70 billion parameters). This model needed 40gb of free space. The following is my experience running it.

First off, the model took roughly an hour to download because it's 40gb. In the meantime I played around with the smaller Llama 3.2 model, which clocks in at 3 billion parameters and only needs 2gb of storage space. Much smaller.
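As a rough sanity check on those download sizes: a model's size on disk is approximately its parameter count times the number of bits stored per parameter. The sketch below assumes about 4.5 bits per parameter, which is a guess at the kind of compressed format that would produce sizes like these, not a figure taken from ollama or Meta. The function name is made up for illustration.

```python
def estimated_size_gb(parameters: float, bits_per_parameter: float = 4.5) -> float:
    """Rough model size in gigabytes (1 GB = 1e9 bytes).

    bits_per_parameter=4.5 is an assumption, not a documented value.
    """
    total_bytes = parameters * bits_per_parameter / 8
    return total_bytes / 1e9


for name, parameters in [
    ("Llama 3.2 (3 billion)", 3e9),
    ("Llama 3.1 70b", 70e9),
    ("Llama 3.1 405b", 405e9),
]:
    print(f"{name}: ~{estimated_size_gb(parameters):.1f} GB")
```

Under that assumption the estimates come out around 1.7 GB, 39.4 GB, and 227.8 GB, which is in the same ballpark as the 2gb, 40gb, and "over 200gb" figures above. At a full 16 bits per parameter, the 70b model alone would be around 140 GB.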

I actually liked Llama 3.2; it was small, generated answers quickly, and I could use it right from my terminal. If I didn't have to worry about AI hallucinations, I'd probably find myself using it since it's so fast. I asked it a few basic questions about history and got it to write a few basic programs. It knew major world events and could write some Python; though it really tripped up when I asked it to write Rust (yay for my job security?). Considering how much smaller it was than GPT-3, I wasn't really expecting much, and was honestly kind of surprised at how well it did. I assumed I'd notice a difference from ChatGPT immediately, but it seemed fine in the few minutes I used it (granted, I've not used ChatGPT very extensively either).

Note:
For non-programmers, Rust and Python are both programming languages; I just asked Llama to write simple programs in both.

Performance-wise, I was already starting to get worried, though - queries were taking up a noticeable amount of CPU:

I know there's a lot going on in this screenshot. The left half is just output from the LLM. The right half is the important part - it's measuring how much of my hardware is being used. The topmost bar graph is CPU usage, the middle (purple) one is RAM/memory usage, and the bottom one is network usage. The important ones are the CPU usage and RAM usage.

The three spikes I drew arrows to were three of the queries I ran. The spikes on their own weren't concerning; but the fact that there are already spikes doesn't bode well for trying to run the 70b model.

The first spike was from the prompt "generate a 100 word story". The second was "generate a 2000 word story", which understandably took longer and resulted in a longer spike. The third prompt was "write a hello world program in Rust using the egui library", which resulted in a shorter but noticeably higher spike.

AI is supposed to be GPU-intensive, though, not CPU-intensive, so my next step was to monitor GPU usage while running some prompts. I started a new session and ran the exact same three prompts, measuring GPU usage with nvtop:

The orange line is GPU memory usage, and the green line is GPU compute usage. The very first spike in this image is just from starting Llama, and the other three are the same prompts as before. I did switch back to my IDE in the middle of the 2000-word-story prompt, which may have affected GPU usage slightly (my code editor uses GPU-accelerated rendering); otherwise, this data should be pretty good.

The model was already using 22% of my GPU memory, which was again a little concerning since this model is small. The prompts maxed out GPU usage, but I think this is expected - it's just using its resources well.

After experimenting with 3.2 and writing this portion of the blog, the download for the 3.1 70b model was almost done. This model has over 22x as many parameters, so I was worried about whether it'd run at all after seeing 3.2's memory usage.

I tried to run it for the first time, and... it immediately errored! Turns out this model needs a whopping 26gb of memory to run. I have 32gb of memory, but also had a lot of apps open. I ended up having to close every single app that was open to have enough memory for the model - no web browser, no Discord, not even the code editor I was writing this post in (I had to take notes on paper like a plebeian).

Amazingly, things only got worse from there.

The model took one and a half minutes to even start. Performance was looking like this:

(I got smart and measured GPU usage at the same time as system usage. They're both the same programs as before - the top half is the CPU/memory/network usage, the bottom half is the GPU usage.)

After a very painful minute and a half, it finally opened. I started with a basic "who are you" to let the AI introduce itself, and was stunned by how slow it was.

The 3.2 model had generated tokens faster than ChatGPT on the web. It gave quick answers and felt responsive.

This one was generating one token every 1-2 seconds.

Note:
If you don't know what tokens are, I'll explain later in the blog. Just pretend a token is an English word for now.

I tried to get a screenshot of the resource usage while the AI was generating a response. This then caused the LLM to crash, and I had to restart the whole thing, including another minute-and-a-half startup time. ugh.

I didn't take any more screenshots until after I shut down the model, to avoid any more crashes. I ran 3 more prompts, and timed the responses for each one:

"write a 50-word story": 1 minute, 10 seconds
"write a python program that adds two numbers and prints the result": 3 minutes, 3 seconds (to give some credit to the AI, it decided to write two programs here. The second was identical to the first but showed off Python's f-strings feature).
"write a hello world in rust using egui": 6 minutes, 35 seconds (the program it wrote also didn't work in the end because the AI invented a library called "epi").

If you're wondering what the outputs for these prompts were, I got a little too eager when shutting down the AI so I could open all my apps again, and accidentally closed the AI chat window. So the output's been lost to history...
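The timings listed above can be cross-checked against the "one token every 1-2 seconds" observation. This is only a sketch: since the outputs were lost, it assumes the story came out at roughly the 50 words requested, and it treats a word as a token.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a minutes/seconds timing into total seconds."""
    return minutes * 60 + seconds


# Timings reported for the three prompts.
timings = {
    "50-word story": to_seconds(1, 10),
    "python add program": to_seconds(3, 3),
    "rust egui hello world": to_seconds(6, 35),
}

# Assumption: the story was about as long as the 50 words requested.
story_words = 50
seconds_per_word = timings["50-word story"] / story_words

for prompt, seconds in timings.items():
    print(f"{prompt}: {seconds} s")
print(f"story pace: ~{seconds_per_word:.1f} s per word")
```

That works out to about 1.4 seconds per word for the story, which lines up with the pace seen while the model was generating.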

After waiting an amazing 6 minutes for the AI to write a basic Rust program, I closed everything and grabbed one last performance screenshot:

The GPU has a lot of spikes, but overall usage was pretty low. In comparison, the 3.2 model consistently used at least 80% of the GPU when it was processing prompts. I think what happened is my GPU didn't have enough VRAM to load the entire AI at once, so it was actually having to load parts of it at a time while processing the prompt. Either that, or it was running part of the model on my CPU instead of my GPU, because my CPU was pretty consistently pushed to the max while running prompts.

Keeping this performance in mind, let's compare to OpenAI. They're running an AI with several times as many parameters as the one I just attempted. They're also doing it concurrently to generate responses for tons of users at the same time, since they have 200 million weekly users.

I think it's safe to say OpenAI is using insane amounts of compute power, and saying they run ChatGPT on supercomputers isn't an exaggeration.

Environmental Impact

Since the tech needs such an incredible amount of compute power, it's going to have an environmental impact. The companies developing LLMs don't release statistics on the environmental impact of those LLMs (of course they don't - it's bad PR!), so we can't find exact details on the environmental impact of LLMs. That being said, there is a way for us to get a rough feeling for the environmental impact of AI.

As stated before, AI programs run in massive datacenters. Although we don't have figures for the environmental impact of AI, we do have figures for the environmental impact of datacenters. Companies like Google and Microsoft publish yearly reports on their environmental impact, which we can use to analyze the impact of their datacenters.

Water Usage

Computers, obviously, get quite hot when pushed to their limits. For smaller computers like laptops, a couple of fans is generally enough to solve this - the fans can pull cold air in and push hot air out, preventing the system from overheating.

Datacenters, however, are an entirely different beast. They have so many computers and generate so much heat that fans just aren't enough to cool them. Instead, they rely on freshwater for cooling. Unfortunately, datacenters use a lot of water to do this - according to a 2023 study from the University of California, datacenters were responsible for the consumption of somewhere between 4.2 and 6.6 billion cubic meters of freshwater for cooling.

Numbers that big don't really make sense without context, so the study also gives us this comparison: Datacenters are using just as much water as half of the entire United Kingdom, every year.

Comparing Water Usage to the Pandemic

Warning:
I made this part its own section because it's somewhat speculative. It doesn't have the same degree of certainty as the rest of this post.

During lockdown for COVID-19, the usage of internet services skyrocketed by between 40% and 100% (depending on the service). Because of this increased traffic, we'd expect to see increased datacenter usage (and therefore environmental impact) between 2020 and 2022. ChatGPT was released in 2022, and LLMs have really taken off since then.

With this context, we should be able to see if AI is having a real, measurable impact on water usage by comparing the water usage of datacenters before, during, and after the pandemic. So, then - what is that water usage?

Table Explanation:
This table comes from Microsoft's 2024 environmental report fact sheet. I chose Microsoft in particular because OpenAI uses Microsoft's Azure compute services, so ChatGPT has a direct effect on Microsoft's datacenters' environmental impacts.
The columns are "fiscal years", which run from July 1st of the previous year to June 30th of the current year. So, for example, FY23 ran from July 1st, 2022 to June 30th, 2023.
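If the fiscal year boundaries are hard to keep straight, here's a small sketch that maps a calendar date onto a fiscal year using the July-to-June definition above. The function name is made up for illustration.

```python
from datetime import date


def fiscal_year(day: date) -> int:
    """Return the fiscal year (e.g. 2023 for FY23) containing `day`.

    Uses the July 1st to June 30th definition given above.
    """
    return day.year + 1 if day.month >= 7 else day.year


print(fiscal_year(date(2022, 7, 1)))   # first day of FY23
print(fiscal_year(date(2022, 11, 1)))  # a date in November 2022, the month ChatGPT was released
print(fiscal_year(date(2023, 6, 30)))  # last day of FY23
print(fiscal_year(date(2023, 7, 1)))   # first day of FY24
```

So ChatGPT's November 2022 release lands inside FY23, the first fiscal year after the pandemic-era ones.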

As expected, water usage climbed during fiscal years 21 and 22, during the pandemic. But, interestingly, water usage continued to climb after the pandemic ended - by more than 22%!

This seems to me to be the direct result of AI. Internet usage skyrocketed so much during lockdown because of how many things we transitioned to being virtual. School, meetings, and even just buying groceries all went digital, the vast majority of which went back to in-person after lockdown.

It also doesn't seem likely that there's really another explanation for such a drastic increase in water usage. ChatGPT became the fastest growing platform ever within months of its launch - Microsoft hasn't had anything else grow even nearly as explosively since lockdown ended.

Energy Usage

Let's switch gears and talk about electricity. Computers use power to run. At smaller scales like laptops and phones, power usage is pretty low, and we don't really think about it. However, at the scale of an entire datacenter, power usage becomes immense:

In 2023, datacenters made up somewhere between 1%-2% of the entire world's electricity usage.
A single datacenter can use just as much power as 50,000 homes.
According to Google's 2024 environmental report, their carbon emissions have risen by 48% since 2019. There also appears to be a noticeable spike in emissions in 2023, the year they trained and released Gemini:

Now, obviously, there's a lot of things that contribute to our environmental impact besides AI, and even besides datacenters. But any time a company brags about new AI features, or new statistics come out about the number of people using LLMs, keep these numbers in mind. Those new features and users are having a real impact on our power and water usage - an impact companies don't seem keen on talking about.

Fleeting Training Data

The next problem with AI that I'll discuss today is its disappearing training data. I'm going to give a brief overview of how LLMs work as some context for this issue.

The GPT in ChatGPT stands for "Generative Pretrained Transformer" - it's a label for the category of AI that LLMs fall under, and explains how they work.

GPTs are "generative" because they - major spoilers here - generate new content. When you give an LLM a series of words, the LLM will predict the next word that should follow it. A response is generated by predicting the next word, adding it to the response, then repeating that over and over until the response is complete. LLMs predict the next word based on a kind of word database they have, which stores related words closer together.
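The predict-append-repeat loop described above can be sketched with a toy model. This is nowhere near a real LLM: the "word database" here is just a count of which word followed which in a tiny made-up training text, whereas real LLMs use neural networks with billions of parameters. All names and the training text are invented for illustration.

```python
from collections import Counter, defaultdict

# Tiny stand-in for training data; a real LLM trains on vastly more text.
TRAINING_TEXT = "a cat sat on the mat . a dog sat on the rug . a cat sat on the mat ."


def train(text: str) -> dict:
    """Count which word follows which (a toy stand-in for the 'word database')."""
    follows = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        follows[current][following] += 1
    return follows


def generate(follows: dict, prompt: str, max_words: int = 10) -> str:
    """Predict the next word, append it, and repeat until done."""
    response = prompt.split()
    for _ in range(max_words):
        candidates = follows.get(response[-1])
        if not candidates:
            break  # nothing ever followed this word in training
        next_word = candidates.most_common(1)[0][0]
        response.append(next_word)
        if next_word == ".":
            break  # treat a full stop as the end of the response
    return " ".join(response)


model = train(TRAINING_TEXT)
print(generate(model, "a dog"))
```

Given the prompt "a dog", this prints "a dog sat on the mat ." - a sentence that never appears in the training text, produced purely by picking the most likely next word over and over.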

"Pretrained" refers to the fact that GPTs/LLMs receive training before they're used. That training basically updates the word database the LLM uses and makes it more accurate.

"Transformer" refers to the math that makes GPTs work; but I'm not going to get into that here.

Note:
Technically, LLMs work with tokens, not words. Tokens are how LLMs break down English, and may be words or pieces of words.
Here's some examples of tokens for ChatGPT, with each token separated by a "/":
"Hi, my name is Bob" is broken down into the tokens "Hi/ my/ name/ is/ Bob"
"the orange cat meows" is broken down into the tokens "the/ orange/ cat/ me/ows" (notice that "meows" gets split into 2 tokens)
You can test other phrases yourself here: https://platform.openai.com/tokenizer.
For simplicity, I'm going to keep saying "words", not "tokens", but tokens would be more accurate.
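To make the idea of "pieces of words" concrete, here's a toy tokenizer that splits each word into the longest pieces it knows. This is not how ChatGPT's tokenizer is actually built, and the vocabulary is invented purely so the example reproduces the "me/ows" split shown above. It also drops the spaces that the real tokens keep attached.

```python
# Hypothetical vocabulary, chosen only so the example reproduces the split above.
VOCABULARY = {"the", "orange", "cat", "me", "ows"}


def tokenize_word(word: str) -> list[str]:
    """Split one word into the longest vocabulary pieces, left to right."""
    tokens = []
    start = 0
    while start < len(word):
        # Try the longest remaining piece first, then shrink.
        for end in range(len(word), start, -1):
            if word[start:end] in VOCABULARY:
                tokens.append(word[start:end])
                start = end
                break
        else:
            # No known piece: fall back to a single character.
            tokens.append(word[start])
            start += 1
    return tokens


def tokenize(text: str) -> list[str]:
    """Tokenize every whitespace-separated word in the text."""
    tokens = []
    for word in text.split():
        tokens.extend(tokenize_word(word))
    return tokens


print("/".join(tokenize("the orange cat meows")))
```

Because "meows" isn't in the vocabulary but "me" and "ows" are, the word gets split in two, and the output is "the/orange/cat/me/ows".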

The important thing to realise here is that an LLM's performance depends largely on its training. The training it receives creates the database that it uses to generate content. Too little good data or lots of bad data will make a bad LLM; lots of very good data will make a great LLM. To keep improving the LLM, you'll need even more high-quality training data.

Note:
I'm oversimplifying greatly here for the sake of explaining this problem. There's a lot more factors that go into GPTs.
If you want a more in-depth explanation on how GPTs work, 3Blue1Brown has a great series on YouTube about it.

Since AI needs tons of high-quality training data, that data comes from the internet. There's no other source that can provide such vast amounts of content for training data. However, not all internet content is going to make good training data. A satirical article from The Onion or a piece of propaganda isn't going to be good for the AI, because then it might generate content with misinformation. Ironically, AI-generated content also isn't high-quality enough to train AI, because any errors it makes will get reinforced in AI that trains on it.

This essentially means that training data has to be curated and filtered down to a single set of "good data" that will improve the AI. OpenAI talks about this in their paper about GPT-3 - they say they took a massive collection of 45tb of text and filtered it down to 570gb, which was then added to the dataset they trained GPT-3 on. That's a lot of cut data - they only kept roughly 1.3% of it!
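The percentage above is quick to verify. This sketch assumes decimal units (1 tb = 1000 gb); with 1024 gb per tb the result is slightly lower but still a little over 1%.

```python
raw_tb = 45             # size of the text collection before filtering
kept_gb = 570           # size after filtering
raw_gb = raw_tb * 1000  # assuming decimal units (1 tb = 1000 gb)

kept_fraction = kept_gb / raw_gb
print(f"kept: {kept_fraction:.2%}")
print(f"cut:  {1 - kept_fraction:.2%}")
```

That comes out to about 1.27% kept and 98.73% cut.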

The problem (for AI) is that people have started adding restrictions to their content that specifically prevent using their content as training data for AI. According to a recent study, many of the sites that have historically been goldmines for training data now have as much as 45% of their content restricted this way. The same study states that these restrictions have increased more than 500% compared to last year. In short, the most-used sources for AI training data are becoming less and less available over time.

So when Sam Altman promises we'll have "artificial general intelligence" that thinks like humans in the next few years, it's nothing but marketing. It's a shiny prediction that makes OpenAI sound amazing, which gets the company more investors and allows them to stay afloat.

The reality is that long before we have AI as intelligent as humans, or build a Skynet that tries to eradicate us all, we're going to run out of training data to keep improving AI. It's simply going to stagnate.

Now, there is a potential solution to this problem: AI companies could simply pay people to write new content, and use that as training data. Unfortunately, that's unlikely to happen, because...

Financial Cost

To me, this is one of the stupid parts of this whole situation: AI, as a product, is not profitable.

This might seem counterintuitive, since most AI products are free or have an optional subscription. However, as discussed at the start of this post, AI needs a lot of compute power and energy to run, and someone has to pay for that. Companies hide these costs from end users when they make AI products free. For now, AI companies are able to stay afloat anyways, because private investors are pouring literal billions of dollars into the tech.

Companies like OpenAI and Anthropic don't publish their financial details publicly, which makes it hard to know exactly what their income and expenses are. However, they do share financial documents with investors when raising funding. As it turns out, the New York Times managed to get their hands on one of OpenAI's investor documents, and they published their analysis of it in an article. According to that article, OpenAI is set to make $3.7 billion this year - but the company will still face a net loss of $5 billion. Furthermore, the Times says some expenses are missing from that $5 billion figure, so their losses could be even larger.

This isn't the first year OpenAI has lost money, either. In 2022, while developing ChatGPT, they reportedly lost $548 million.

Let's put this in a timeline, along with some other data to give more context about OpenAI's growth and finances.

November 2022	OpenAI releases ChatGPT
December 2022	OpenAI loses a reported $548 million
January 2023	ChatGPT reaches 100 million monthly users, becoming the fastest growing platform ever
February 2023	OpenAI releases ChatGPT Plus, a subscription for ChatGPT and source of revenue
November 2023	OpenAI reaches 100 million weekly users
August 2024	OpenAI reaches 200 million weekly users, doubling its weekly user count from 2023
December 2024	OpenAI is set to lose $5 billion

After releasing a subscription plan, becoming the fastest growing platform ever, then doubling their weekly userbase, OpenAI's yearly losses have grown nearly tenfold. AI is, quite simply, not profitable.
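For anyone who wants to check the math on that growth, here's the arithmetic using the figures from the timeline. One assumption: the $3.7 billion OpenAI is "set to make" is treated as revenue, so the implied costs are revenue plus net loss.

```python
loss_2022 = 548e6               # reported 2022 loss, in dollars
projected_loss_2024 = 5e9       # projected 2024 net loss
projected_revenue_2024 = 3.7e9  # assumed to be revenue

growth = projected_loss_2024 / loss_2022
# Net loss = costs - revenue, so implied costs = revenue + net loss.
implied_costs_2024 = projected_revenue_2024 + projected_loss_2024

print(f"loss growth: {growth:.1f}x")
print(f"implied 2024 costs: ${implied_costs_2024 / 1e9:.1f} billion")
```

The losses grow by about 9.1x, close to a tenfold increase, and under the revenue assumption the company would be spending around $8.7 billion to bring in $3.7 billion.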

"Temporary Investments"

I've seen some people claim these losses in AI are just temporary - a small investment now for a tech that will blow up in the future. They compare AI companies to companies like Amazon, which lost money for years after being founded.

To be profitable, AI companies will have to start charging for access to their services. There's no other source of income for them - ads don't even make enough to fund YouTube, and there's no other product for them to sell.

This means, to be profitable, ChatGPT would have to go from a free service with an optional $20/month subscription to a product with a required $20/month subscription (likely even more expensive, considering OpenAI's losses). Would you really be willing (or even able) to pay $240/year for a product that still hallucinates and is unlikely to keep improving? Do you think the majority of current ChatGPT users would pay that price?

Amazon always had the potential to make money because they're free to use and every user results in profit for them. For AI, every new user is an increase in server costs. This isn't a crazy new product that will become the next iPhone - it's a fundamentally unsustainable technology that cannot make profits unless users pay every penny of its insane costs upfront.

What can I do?

AI is currently funded by private investments and some of the wealthiest corporations in existence. It's not some government policy that can be voted against or protested; it's being created through raw economic power.

This economic power still has a weakness, though - it feeds on public opinion, and more specifically, misinformation. Here's some examples of that:

As mentioned before, OpenAI doesn't publicly release financial statements. Because there are few public, concrete numbers for its finances, all of us in the general public really have to rely on word-of-mouth and gut instinct about how they're doing financially. This allows them to make it seem like they're making money, because they always publicly mention the billions they raise in investments without ever discussing their even larger losses.
In some cases, companies call software "AI" that is actually a completely unrelated technology. The Rabbit R1, for example, raised tens of millions in funding for a cheap, handheld AI product. It was later revealed to actually rely on simple scripts from a well-known project called Playwright, and really provided no AI services at all besides ChatGPT queries.
Finally, companies profit by overstating what AI will do. There was Sam Altman claiming his company would invent AGI, but even other companies like Apple have oversold their AI products. Apple Intelligence has seen tons of delays, and has mixed results according to MKBHD (MKBHD actually specifically calls out this exact point of overselling!).

So, then - what can you do about these issues with AI? Well, surprisingly, it's quite simple: talk about what you know. Knowledge is power, so spread what you've read here, and any additional details you learn in your own research. Since AI is running largely on misled public opinion, spreading the truth is one of the simplest ways to cut through its nonsense. This is actually the exact reason I wrote this blog post to begin with.

Another way you can help fight AI is by not paying for AI products. Every single time a person pays for a ChatGPT subscription, purchases a Rabbit R1, or pays for some other kind of AI service, it makes the tech look slightly more profitable. That promise of profits then attracts investors, who can pay to offset the losses from running AI products, which keeps this whole gross cycle continuing.

Talking about what you know will slowly change the opinion of the general public. And as public opinion shifts away from AI hype, investors will start to follow. Making AI less profitable will stack on top of this and continue to discourage investors. And once investors back out, companies will have to finally face the real costs of AI head-on; the financial incentive to constantly push AI products will be completely destroyed. Then, at long last, all of this AI nonsense can finally breeze over.

wowza it's markdown

This is a demo post I'm using to test various UI elements.

I wonder if this link works or not hm?

what about a

table	of	values
wow	how	cool

Here's	a
horizontal	table

or quotes?

And
Here's
Numbers

(or maybe you prefer
no numbers?)

Compile Times are Runtimes, Too

Disclaimer: This blog post is incomplete. I'm just publishing it because I don't want to lose it and am also too lazy to figure out some way to only have it be compiled in my website when I'm building locally.
This post is rambly and repetitive and bad. It loosely describes a problem I want to address in programming. I wouldn't consider it worth reading.

At this point in my life I've been programming for at least 6 years, and I've worked with a variety of programming languages - OOP and functional (Java/Nix), compiled and interpreted (Rust/Python), languages with manually managed memory and garbage collected languages (C/Lua), runtime-dependent languages and languages that produce freestanding binaries (JS/Zig). Many of these languages do things better than others. Web languages have amazing developer tooling, Rust has incredible compiler error messages, Java's type reflection and mixins allow for incredible game modding APIs.

But I've never seen a language with a good build system. Not once.

Most languages I listed above have some kind of compile-time code execution; most have some kind of package/dependency management; most have some kind of compilation settings. But all of them have shortcomings at compile-time.

So, in this post I'm going to describe a new perspective on compile-time code: The perspective that compile-time is, really, just another runtime that happens on a different machine, and that if we built more features around it we could produce faster, simpler, and more flexible code.

Defining "Compile-Time"

Compile time is a vague term, so let's start by discussing that. I'm going to give this term a slightly unique meaning just for this blog post, because the ideas I'm going to discuss apply to every programming language, regardless of if we typically consider them to be "compiled" or "interpreted". Though, in my opinion, this meaning is more correct anyways - regardless of if a language is interpreted or not, every piece of code goes through some kind of "compile time".

Why? Well, the simple answer is computers don't natively speak English - nor any other human language. Regardless of how many thousands of lines of code you write, your CPU is never going to be able to execute a for loop simply because you wrote the word "for" in a text file somewhere on your hard drive. The for loop and other pieces of code you write first have to be converted into a format the CPU can natively execute, and then the CPU has to be told to execute it.

That process of converting the text to a CPU-friendly format is what I consider compilation. Compilation takes all of the source code you wrote in your project - all of the libraries you import, the types and functions you defined, etc. - and converts it into tokens, a syntax tree, binary, bytecode, etc.

Again, this definition is intentionally designed to include interpreted languages. Even if you're running Python, the Python interpreter never interprets your code as-is in the text file on your computer - your code first gets compiled into a custom bytecode format. Of course, the CPU also doesn't speak Python bytecode, so there also has to be an interpreter that carries out each bytecode instruction using native code, in real time. That last step is why we call interpreted languages interpreted instead of compiled - but the fact that we make that distinction doesn't mean that these languages do not get compiled.
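You can watch both steps happen in Python itself. This is just a small sketch using the standard compile, dis, and exec functions; the snippet being compiled and the variable names are made up for illustration.

```python
import dis

source = "total = 0\nfor i in range(3):\n    total += i\n"

# Step 1, "compile time": turn the source text into a code object (bytecode).
code = compile(source, "<example>", "exec")

# The bytecode is plain bytes plus some metadata; nothing has run yet.
assert isinstance(code.co_code, bytes)

# List the names of the bytecode instructions the interpreter will step through.
opnames = [instr.opname for instr in dis.get_instructions(code)]
print(opnames)

# Step 2, "run time": the interpreter executes the bytecode.
namespace = {}
exec(code, namespace)
print(namespace["total"])  # 0 + 1 + 2 = 3
```

The exact instruction names vary between Python versions, but the split between "build the bytecode" and "run the bytecode" is always there.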

So, alright. Compile times are the time when compilation happens - when source code gets translated to a CPU-friendly format. But what makes compile time a runtime?

What Is a Runtime?

It's kind of hard to say compile times are or are not something without defining the thing first - so let's analyze runtimes. What are they, where are they used, and why are they useful?

Runtimes are essentially utilities available to a program in the time that it's running. As an example, Java relies on the JVM runtime, which allows interpreting the JVM bytecode that Java compiles to. The JVM also provides a lot of APIs to Java code (such as type reflection) that are only available while that Java code is running. As another example, web browsers provide a runtime to JavaScript that provides global ways for JavaScript to interact with webpages - think document.getElementById, or setting the innerHTML property on an element.

Runtimes don't always have to be obvious or available to code, though. The JavaScript and JVM runtimes also provide automatic memory management, which you often have little control over, since it mostly works silently in the background of your code. Even Rust has a runtime! When your Rust program starts, it actually calls code before your main function, which just sets up some operating system-specific state (such as recording the command-line arguments so you can later query them with std::env::args).

So, generally speaking, runtimes abstract away some smaller details you don't want to have to deal with, and provide some small APIs to be useful to your code; and all of this happens while your program is running. Web browsers abstract away whatever their internal data structure for a web page is, instead providing standard APIs that work regardless of those internals; Rust abstracts away operating system-specific strings, converting Windows' UTF-16-ish strings to UTF-8 for you (unless you opt to work with them manually); Python pretends memory doesn't exist so you never explicitly deal with pointers or memory allocations.

Runtimes can be useful because they handle problems for you, reducing how much you have to think about while you're programming. You never have to worry about allocation APIs, use-after-frees, and similar bugs in Python; the JVM's custom bytecode means you don't have to worry about different types of assembly on different architectures, since you only ever use Java's; and you don't have to think about shaders, GPUs, or rendering while working with HTML elements and JavaScript.

At the same time, a good runtime is extremely flexible. Lua has automatic memory management, but provides options and functions that let you change how that automatic memory management runs, to optimise your program (see e.g. the collectgarbage function and the garbage collection portion of the manual).
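Python's runtime exposes similar knobs through its standard gc module. The sketch below is only an analogue to Lua's collectgarbage, not a translation of it, and the threshold value of 10,000 is an arbitrary number picked for illustration.

```python
import gc

# Inspect the collector's current tuning: a tuple of three generation thresholds.
default_thresholds = gc.get_threshold()

# Pause automatic collection around an allocation-heavy section...
gc.disable()
try:
    nodes = [{"id": i} for i in range(10_000)]
finally:
    gc.enable()

# ...then collect at a moment of our choosing.
# collect() returns the number of unreachable objects it found.
unreachable = gc.collect()

# Make the youngest generation collect less often, then restore the defaults.
gc.set_threshold(10_000, default_thresholds[1], default_thresholds[2])
tuned = gc.get_threshold()
gc.set_threshold(*default_thresholds)
```

The memory management stays automatic the whole time; the program just gets a say in when and how often it runs.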

And, in this blog post, I intend to show that compile times - the period of time when your code gets built into executable assembly or bytecode - are also runtimes; they're just implemented very poorly.

"But, Bright", you might say: "You just said runtimes are available while your program is running. How can a compile time be a runtime if your code isn't running then?"

But, see, that's the thing:

Your Code Runs at Compile Time, Too

How exactly this is implemented depends on what language you're using, but basically every programming language in existence has some form of meta-programming, where you can execute code while your code is building. This is mostly used to have code write code for you, reducing boilerplate and making codebases easier to maintain. Here's some examples:

C's preprocessor directives: #if lets you programmatically decide if code should be compiled, #include lets you insert new code into your code on-the-fly, #define lets you do something like a primitive form of Rust macros and insert new tokens into your code on-the-fly.
Rust's macros: These are similar to functions, but operate on the tokens of your source code at compile time. They take tokens as arguments and generate other tokens as output. Procedural macros are even more powerful as they let you execute arbitrary Rust code at compile time.
Zig's comptime: This one's just in the name, Zig lets you execute (almost) arbitrary Zig code while your Zig code is compiling. It's also how generics are implemented!
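Python's decorators: These let you modify a function by passing that function to a different function (the decorator) which then returns the modified function. Strictly speaking, this happens when the def statement is executed (usually as the module is first imported), not when the Python is "compiled" to bytecode, but it still runs before any of your code calls the function.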
Turing-complete type systems: This one's less practical, but TypeScript and Rust have type systems that are so advanced you can implement basic logic entirely in types, leading to insanity like a type-level chess bot or DOOM in a WASM VM in TypeScript types.
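For the Python entry in that list, here's a quick check of when a decorator actually runs. It's a minimal sketch; the register decorator and the greet function are made-up names for illustration.

```python
events = []

def register(func):
    # Runs once, when the decorated `def` statement is executed.
    events.append(f"decorating {func.__name__}")
    return func

source = '''
@register
def greet():
    return "hello"
'''

# Compiling the source to bytecode does not run the decorator...
code = compile(source, "<example>", "exec")
events_after_compile = list(events)

# ...but executing the module body does, even though greet() is never called.
namespace = {"register": register}
exec(code, namespace)
events_after_exec = list(events)

print(events_after_compile)
print(events_after_exec)
```

So the decorator is code that runs while the module is being set up, before any of the "real" program gets going.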

If you contribute to a non-trivial programming project, chances are, part of the source code you maintain is not code that runs at runtime. It's code that runs while your code is building, to make the rest of your code easier to maintain.

The Missing Piece

Compile-time code often has to be optimised to work best for compilation speed (e.g. by compiling different code in debug vs release builds) and may have workarounds to deal with compiler quirks (e.g. reference specialization in Rust) - it interacts with the compiler in much the same way that interpreted code interacts with runtimes.

Why do I mention this? Well, as the saying goes, "if it walks like a duck and it quacks like a duck, then it must be a duck". Compile time - the time during which source code gets converted into a CPU-friendly format - acts just like run time - the time during which that code actually runs on the CPU. The compiler is, truly, a runtime for compile-time code. Compile time is a runtime, too.

However, modern code tends to dance around this point, and never quite takes it to its peak. Let's change that!

In the rest of this post, I'm going to lay out what a compiler runtime, accessible from normal code, could look like for programmers. I'm going to introduce various features in a programming language I'm kind of half-working on, and describe real-world applications of those features.

The language is called Shard. It doesn't exist today - all the features I'm going to describe are theoretical, things that I want to implement in Shard, but Shard is nowhere remotely close to even a demo right now. With all that being said, this is what I want to create:

Introducing Shard

Since we're talking about compile times, I'll introduce Shard by walking through its planned compilation process. Like any language, everything starts with a CLI invocation: shard run, shard build, etc.

The Build Script

When invoked, the Shard compiler will look for a build.shard file in the current directory (naturally, this path can be changed via CLI args) and interpret it. In most compiled programming languages, the compiler looks for a source file, parses it, generates some kind of intermediate representation (IR), then hands that IR off to LLVM (or some other similar program) to generate a native binary. Shard doesn't do this - since Shard code has full access to the compiler's runtime, there's no need for the compiler to do anything automatically. Every single compilation step is handled by Shard code, starting at the entrypoint in build.shard.

You'll notice I said build.shard gets interpreted by the compiler. One of the planned features of Shard is that it can be used as both an interpreted and compiled language. The build script for a Shard program will always be interpreted, and can produce Shard bytecode or optimised native assembly. This choice mostly comes from my experience with Rust and Zig: Compiling build scripts can be slow, which is quite frustrating for development, when I want to see the changes in my code now. Of course, interpreting a large amount of Shard bytecode isn't going to be particularly fast either. The plan is for Shard bytecode to be able to call native assembly, so the build script is interpreted by default but can still use native, precompiled code where performance is a necessity (such as codegen).

What exactly will be in build.shard is hard to say. A lot of the point of Shard is that the compilation process is incredibly flexible. Shard could output a native binary to ship to users; a bundled app format, such as msi, apk, ipa, or app, to ship to users; a static/dynamic library for developers to use; or even an ISO image for a custom operating system - all without any external tooling. Or, because build.shard is just a script, it may not output a binary at all. It could run unit tests, benchmarks, lint the codebase, or update dependencies. Let's look at an example (no syntax highlighting since Shard is extremely WIP):

// For `build.shard` specifically, the `compiler` and `std` libraries are
// already linked to the program so they're available for use. They just have
// to be imported.
use (
    std::(fetchers::fetchFromGitHub, print, getEnv, getCliArgs),
    compiler::(tokenize, parseTokens),
);

// Shard is super modular! The compiler doesn't even ship codegen. So we can
// download that ourselves using `extern`, which defines an external fragment
// we want to make available to our code.
// (A fragment is one unit of Shard code, like a module/package/crate)
extern {
    llvm-codegen = fetchFromGitHub{
        owner = "some-github-user";
        repo = "shard-llvm-codegen";
        tag = "69.420.1";
    };
};

// Functions are defined as `fn <input type> <output type> <code>`
def main = fn () () [
    // Similarly to Zig, `build.shard` handles CLI flags passed to the compiler,
    // so your code has full control over the CLI flags - but Shard will also
    // provide default flags that are good enough for most programs.
    // Shard is strongly typed, but can infer types like Rust or C++'s `auto`,
    // so listing the type is optional.
    let flags = getCliArgs();

    // Pseudocode, naturally actual CLI parsing would need to be a lot more
    // precise and would probably rely on a library.
    if flags.contains("help") [
        // show help message...
    ] else if flags.contains("test") [
        buildProject();
        // run unit tests...
    ] else if flags.contains("lint") [
        let tokens = tokenize(./src/main.shard);
        // iterate over the tokens and lint everything...
        for token in tokens [
            // lint
        ]
        // or use the compiler-provided types to analyze your code!
        let fragment = parseTokens(tokens&);
        for module in fragment.modules() [
            // An item is anything declared with `def` - a struct, enum, function,
            // etc. This is like type reflection, but for source code!
            for item in module.definitions() [
                if item.name().chars()[0].isLowercase() [
                    print("Linting error at {item.location().toString()}: Item isn't in CamelCase");
                ]
            ]
        ]
    ] else if flags.contains("build") [
        buildProject();
    ] else if flags.contains("run") [
        buildProject();
        // Can use the full path to a function instead of importing it
        std::os::process::spawn(./build/compiled-app);
    ]
]

def buildProject = fn () () [
    // Optional type signature
    // Note that generics work like Zig's
    let tokens: std::Vec(compiler::Token) = tokenize(./src/main.shard);
    // Operators in Shard always go on the right. This creates a reference to the
    // tokens variable.
    let fragment = parseTokens(tokens&);
    llvm-codegen::buildExecutable{
        fragment = fragment;
        mode = .Release;
        targetTriple = std::host::targetTriple();
    };
]

Static Analysis and Metaprogramming

As seen in the build script above, Shard exposes normal data types that contain information about a Shard fragment. The goal is for these data types to have enough information for a complete Shard LSP.

These types should help a lot with static analysis. Linters have all the same information about a codebase that the compiler does. No worries about parsing tokens or source code yourself.
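Since Shard doesn't exist yet, the closest thing I can actually run is Python's standard ast module, which hands you the parser's view of a file as ordinary objects. The sketch below is only a rough analogue of the linting idea; lint_class_names and the sample source are made up for illustration.

```python
import ast

source = '''
class goodIdea:
    pass

class BetterIdea:
    pass

def helper():
    pass
'''

def lint_class_names(code: str) -> list[str]:
    """Report classes whose names start with a lowercase letter."""
    tree = ast.parse(code)  # the same syntax tree the Python compiler builds
    problems = []
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef) and node.name[0].islower():
            problems.append(
                f"line {node.lineno}: class '{node.name}' isn't in CamelCase"
            )
    return problems

problems = lint_class_names(source)
print(problems)
```

The linter never touches raw text or tokens; it just walks the data structure the parser already produced.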

This should also be quite powerful for metaprogramming. These datatypes aren't just intended to be read-only views of a codebase - the goal is for Shard code to be able to create new instances of all these types, so you can declare new types, modules, etc. on the fly:

// Zig-style generics
def Vec = fn (ty: compiler::Type) compiler::Type [
    struct {
        basePtr: Ptr(ty),
        len: usize,
    }
]

// ...or manually
def Vec = fn (ty: compiler::Type) compiler::Type [
    compiler::Type::Struct::fromNamedData{
        fields = {
            basePtr: Ptr(ty),
            len: usize,
        };
        methods = {};
    }
]

// Build a module from a set of key/value pairs
// The keys would basically take the place of `def` statements
def moduleBuilder = fn () compiler::Module [
    compiler::Module::fromNamedData{
        someFunc = fn () () [];
    }
]

def myModule = moduleBuilder();
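For a runnable point of comparison, Python can already build types and modules from plain data, although it does so at run time rather than at compile time. This is a loose sketch of the same shape using the standard dataclasses and types modules; make_vec and module_builder are hypothetical names mirroring the Shard example.

```python
import types
from dataclasses import fields, make_dataclass

def make_vec(ty: type) -> type:
    """Build a new record type on the fly, parameterised by an element type."""
    return make_dataclass(
        f"Vec_{ty.__name__}",
        [("base", list[ty]), ("len", int)],
    )

def module_builder() -> types.ModuleType:
    """Build a module from key/value pairs instead of a source file."""
    module = types.ModuleType("my_module")
    module.some_func = lambda: None
    return module

VecInt = make_vec(int)
v = VecInt(base=[1, 2, 3], len=3)
my_module = module_builder()

print(VecInt.__name__, [f.name for f in fields(VecInt)])
print(my_module.__name__, my_module.some_func())
```

The difference in Shard would be that all of this happens while building, so the generated types cost nothing once the program is actually running.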

FFI

In theory, as Shard code can define new modules and whatnot on-the-fly, it should be possible for Shard code to parse the source code of another programming language and generate new modules for that code. This would allow for literally seamless FFI.

Is This Really Possible?

Theorised features

These are features I've loosely thought about, but I don't know if I'll actually support them because I don't fully understand their implications yet. All of the above ideas are ones that I'm confident I'll add to Shard, and directly relate to the "compile times are runtimes" topic of this blog post. The below are less related, more unhinged ideas.

Assumptions

One idea I've been toying with is the ability to directly encode assumptions. Each assumption is made up of a compile-time check and some kind of modification to the compiler runtime.

Here's an example for a theoretical bitness-64 assumption:

Check: Checks if the target being compiled for has 64-bit-wide pointers.
Modification: Enables directly casting from u64 (a 64-bit integer) to usize (a pointer-sized integer) and vice-versa.

Assumptions can then be assumed or checked. Assumed assumptions permanently enable the assumption's modification, but will cause a compilation error if their check ever fails. Checked assumptions allow for conditional compilation - a specific piece of code will be compiled if the assumption's check passes, or a different piece of code will be compiled if the check fails.

Assumptions can be assumed/checked at any level. So a project could assume something for the entire project, or a library could check something for just one code block, and so on.
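To make the assumed/checked split concrete, here's a rough model of it in Python. Everything in it (Assumption, assume, checked, AssumptionError) is a hypothetical name, the checks run at run time instead of compile time, and the "modification" half of an assumption isn't modelled at all.

```python
import struct
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Assumption:
    name: str
    check: Callable[[], bool]

class AssumptionError(Exception):
    """Stands in for the compilation error a failed assumed assumption causes."""

def assume(assumption: Assumption) -> None:
    # Assumed: the check must hold, otherwise the "build" fails.
    if not assumption.check():
        raise AssumptionError(
            f"assumption '{assumption.name}' does not hold for this target"
        )

def checked(assumption: Assumption, if_true, if_false):
    # Checked: pick one of two code paths depending on the check.
    return if_true() if assumption.check() else if_false()

# struct.calcsize("P") is the size in bytes of a native pointer.
bitness_64 = Assumption("bitness-64", lambda: struct.calcsize("P") * 8 == 64)

# Two trivial assumptions, just to show both outcomes deterministically.
always = Assumption("always", lambda: True)
never = Assumption("never", lambda: False)

path = checked(bitness_64, lambda: "64-bit path", lambda: "fallback path")
print(path)
```

Calling assume(never) raises AssumptionError, which is the stand-in for a build that refuses to continue.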

Here's some other assumptions I've been imagining:

os assumption:
- Check: Checks if the target being compiled for has an operating system present.
- Modification: Enables additional filesystem and allocation APIs.
safety assumption:
- Check: Always passes
- Modification: Allows potentially memory-unsafe operations, like dereferencing raw pointers. This basically acts like Rust's unsafe block.
unix/posix assumptions:
- Check: The target being compiled for has an operating system compliant with UNIX/POSIX standards.
- Modification: Exposes standard UNIX/POSIX functions to the program.
Assumptions for specific CPU features (e.g. AVX512 instructions).

Assumptions would allow a programmer to explicitly control how portable they want their program to be. For example, a music player app can assume(os) since the project only ever intends to support targets with operating systems and has no use for embedded devices. As another example, a library could rely on niche x86 instructions when the target supports them, therefore generating more optimised code on modern CPUs while still supporting older ones.

This also allows the Shard language itself to be more portable; features that depend on a specific hardware, binary format, operating system, etc. can be locked behind an assumption. It also pretty much gives a programmer a checklist of things they need to stop assuming if they ever want to support compiling for a specific target.

Binary Source Code Format

My friend Speykious showed me a demo of Ryan Fleury's now-abandoned project called Dion, where code was stored in a binary format, not text. The language had its own dedicated editor that displayed the binary format as text to the programmer.

This has the obvious downside that you basically have to recreate all developer tooling yourself - everything from IDEs to Git is designed to work on text content.

However, having control of everything from the editor to version control gives you a lot of potential features that most programming languages simply can't take advantage of. Compilation can be faster since it doesn't have to parse text, and the IDE can simply prevent syntax errors by refusing to let you perform certain edits that would break the binary format. Individual developers can change how their IDE displays the binary format, meaning no more debating over formatting and tabs vs spaces.
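A tiny way to get a feel for this, with no relation to how Dion actually worked: store a Python program as a pickled syntax tree and let the "editor" decide how to render it. The sketch uses only the standard ast and pickle modules, and the sample function is made up.

```python
import ast
import pickle

# Deliberately ugly formatting: it won't survive the round trip.
messy_source = "def  add( a,b ):\n        return a+b\n"

# Store the program as a serialized syntax tree rather than as text.
stored = pickle.dumps(ast.parse(messy_source))

# An "editor" loads the tree and renders it as text however it likes.
tree = pickle.loads(stored)
rendered = ast.unparse(tree)
print(rendered)

# The rendered text is still the same program.
namespace = {}
exec(rendered, namespace)
print(namespace["add"](2, 3))
```

The stored form carries no whitespace or style choices at all, which is exactly why the formatting debates go away.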

This idea needs a lot more thought. Ultimately, I don't think this kind of design belongs in Shard; Shard is primarily designed to show how useful compiler runtime APIs can be, and adding something like this simply makes Shard harder to integrate into existing projects, learn, or switch to. So I'll probably leave this idea for a future language, or perhaps as an alternate syntax for Shard that I'll develop later and integrate as an FFI compiler module.

Rhetoric and Hacking

ROUGH DRAFT!!!
THIS IS AN INCOMPLETE ROUGH DRAFT, FOR ANY ISSUES OR INCONSISTENCIES, PLEASE KEEP THEM TO YOURSELF, NERD.

Just over two years ago, on March 29, 2024, server administrators worldwide were jolted awake by the discovery of the XZ-Utils backdoor, a supply-chain attack that, after gaining the maintainer's trust over the course of two years, slipped a backdoor in through a benign-looking package update.

What's most interesting about this hack, however, is the two years leading up to it. This wasn't an attack that relied on exploiting some innate insecurity of code or computer hardware. Instead, it expertly navigated the rhetorical ecosystem of our digital world to exploit the weakest link: people and their rhetorical interactions.

Note
This article assumes conceptual understandings of Linux/Unix and computer security. For additional info, read below.

Key Terms: If you know *about* Linux, SSH, and basic cryptography, read on. Otherwise, the terms are explained below.

Technology shouldn't be inaccessible or gatekept! These are the technical terms that will be gone over in this article, and a very brief overview of what they mean.

Note:
Before you rush to call out inaccuracies, most of these terms are very oversimplified for a working knowledge in the context of this article, and may not technically be correct in every implementation. For an in-depth overview, appropriate Wikipedia articles are linked.

End-to-End Encryption (E2EE) - The process of jumbling up data in a way that only the sender and intended recipient can view its contents.
SSH - Secure Shell, the de-facto standard for securely working on remote servers, thanks to E2EE.
FOSS - Free and Open Source Software, a diverse ecosystem of software characterized by anyone being able to use, read, and modify the code behind it to best suit their needs. Often licensed under the GNU General Public License (GPL).
Package maintainer - A person in charge of reviewing and accepting submissions, referred to as Pull Requests or PRs, for a certain piece of software or "package."
Linux - A FOSS operating system that, thanks to being so customizable, powers a whole lot of things through "distributions" or versions, including servers that run websites, your car's infotainment display, and even Android phones.
Compression software - Programs designed specifically to reduce the size of a file, usually for the purpose of transmitting over the internet. .zip files are an implementation you've probably encountered before.

Research Question: In what ways was rhetoric used in the XZ-Utils backdoor attempt?

Prologue

XZ-Utils is a "set of free software command-line lossless data compressors." [1] Being a lossless (that is, perfectly reconstructable) compression library has made it the predominant choice for pushing out updates while minimizing internet traffic across distributions. Thanks to this, XZ-Utils has cemented itself as a critical part of nearly every modern Linux and Unix-derived system.

Due to the nature of free software, however, despite its importance and prevalence in nearly all Linux and Unix distributions, XZ-Utils has been supported by a volunteer maintainer, Lasse Collin, working tirelessly without compensation since its beginning. In this ecosystem, other volunteer programmers regularly submit Pull Requests ("PRs") to merge their code (bug fixes, improvements, features, etc.) into the official software repository, and each one must be manually reviewed by a maintainer such as Lasse Collin.

Note:
TODO factcheck + revise
This is pre-AI boom, and LLM roles in repo management is a potential research direction. See more at footnotes

Over time, a particular contributor going by the pseudonym "Jia Tan", with the associated email "jiat75@gmail.com", became a diligent presence in the project. Their first minor contribution dates back to 2021-10-29, 364 days before they were formally added to the "organization" on GitHub, and 847 days, or nearly 2 years and 4 months, before the first malicious backdoor upload (2022-10-28 and 2024-02-23 respectively [2]).

Partway through this period, several accounts with no additional media presence began sending increasingly hostile messages to the XZ-Utils mailing list. Notably, by this point the maintainer, Lasse Collin, had made it rather public that he was facing mental health issues and would take semi-regular breaks.

INPROGRESS: Table of chats

Message	Sender	Date
"Patches spend years on this mailing list. There is no reason to think anything is coming soon."	"Jigar Kumar"	2022-04-22
"Over 1 month and no closer to being merged. Not a surprise."	"Jigar Kumar"	2022-05-27
"Progress will not happen until there is new maintainer. XZ for C [a different programming language] has sparse commit log too."	"Jigar Kumar"	2022-06-07
"With your current rate, I very doubt to see 5.4.0 release this year. The only progress since april has been small changes to test code. You ignore the many patches bit rotting away on this mailing list. Right now you choke your repo. Why wait until 5.4.0 to change maintainer? Why delay what your repo needs?"	"Jigar Kumar"	2022-06-14
"I am sorry about your mental health issues, but its important to be aware of your own limits. I get that this is a hobby project for all contributors, but the community desires more. Why not pass on maintainership for XZ for C so you can give XZ for Java more attention? . . ."	"Dennis Ens"	2022-06-21
"Is there any progress on this? Jia I see you have recent commits. Why can't you commit this yourself?"	"Jigar Kumar"	2022-06-21

Future Research Directions

This article opens several future research directions:

Investigating the role that LLMs may have in repository maintenance, and how rhetorical attacks might adapt to it.
TESTING

References

[1] "XZ Utils," Wikipedia, Apr. 11, 2024. Accessed Apr. 27, 2026. Revision Mar. 25 2026; https://en.wikipedia.org/w/index.php?title=XZ_Utils&oldid=1345221622

[2] Russ Cox, "research!rsc: Timeline of the xz open source attack," research.swtch.com. https://research.swtch.com/xz-timeline

i STOLE THIS from brightshard and tis is all under maintenance!!! do not harrass brightshard thinking this is their site, that's over at https://brightshard.dev
