Sunday, March 30, 2025

Austere Instincts

Austerity is soon to be a thing in AI.

Chatted with an old classmate today. One of the things we discussed was how the DeepSeek team started working on architecture and operations optimizations way, way back (a couple of years ago; that's prehistory on AI timescales).

This doesn't seem like a big deal yet. But with the end of Moore's law looming on the horizon and monetary conditions gradually tightening, it seems quite obvious that the prevalent attitude of "don't worry about the money, we can just buy more hardware from Nvidia" can't be sustained for much longer.

When US firms realize capital (money) can't provide a sufficient edge, they *will* have to pivot to focus more on efficiency. And of course they will. But cultural inertia is a thing in large and proud organizations, and I'm actually not very bullish on the idea that a bunch of Python coders can somehow be repurposed to hand-writing low-level PTX or optimizing bits of the pipeline 0.x% at a time.

The effect isn't linear either. Frontier model training is more of a series of scientific experiments than an engineering process, and lowering the cost of doing experiments allows people to experiment more and learn more. The reduced time and money costs of running experiments might be the difference between a subpar model and SoTA.

Unfortunately I don't have an intuition of how much inefficiency there is in training large models today. What we do know is that US trade embargoes of advanced chips to China have forced Chinese firms to focus on efficiency. It's pretty obvious Chinese firms are leading in the efficiency department, especially after the recent DeepSeek revelations. *If* there is much room for improvement in the efficiency space, US AI firms are going to be royally fucked in the coming 1-2 years.

Long context LLMs

Around mid-2024 I tried feeding Hong Kong court cases to long-context (local) LLMs to see how they fared.

Didn't work well. Although most of them claimed to support long contexts, they kind of just failed (got repetitive, etc.).
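
For what it's worth, the "got repetitive" failure mode is easy to flag mechanically. A crude n-gram duplication check like this (a hypothetical helper sketched after the fact, not something I actually ran) catches most degenerate outputs:

```python
def repetition_ratio(text: str, n: int = 4) -> float:
    """Fraction of word n-grams that are duplicates.

    Near 1.0 means the model is looping; near 0.0 means mostly novel text.
    """
    words = text.split()
    if len(words) < n:
        return 0.0
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    return 1.0 - len(set(ngrams)) / len(ngrams)

# A looping output scores high; normal prose scores low.
looping = "the court finds that " * 50
normal = "The appellant was convicted of theft and sentenced to six months."
print(repetition_ratio(looping))   # close to 1.0
print(repetition_ratio(normal))    # 0.0
```

Nothing fancy, but it would have been enough to rank models on long-context degradation without reading every output by hand.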

Since I wasn't really using them for important stuff I just set it aside.

[ Fast forward an eternity later (i.e. a couple months) ]

I'm pleasantly surprised to find that more recent models (even ones as "ancient" as Llama-3.2 3B) actually perform pretty well on such long contexts. Llama-3.2 3B was the worst of the bunch, and apparently the recent gemma-3 models did really well.

The only issue is that gemma-3 27B is a bit slow.


Didn't really bother to check whether the difference was due to model performance or llama.cpp bugs. I suspect more of the former than the latter (Llama-3.1 8B occasionally glitches out too).

Anyway, that's kinda good news. Maybe I actually can create a comprehensive database of HK court case summaries and commentary....


Sunday, March 23, 2025

Fear

Fear isn’t just fear

It is the logical invitation of possibility

The imagining of madness requires madness itself.

Sunday, March 16, 2025

What would God do? Solving the Paradox of Tolerance and why "evil" or "bad things" happen in the world.

There's a weird trick when dealing with various spiritual or philosophical questions -- just ask: what would God do?

It's a very useful trick because it immediately allows you to throw away bad assumptions and put things into the right perspective.

Take for example the so-called "Paradox of Tolerance". Basically it's the idea that "if a tolerant society allows intolerant ideologies to grow unchecked, those intolerant forces may eventually suppress tolerance itself" (GPT-4o).

To most people, that sounds like a perfectly reasonable concern. So the popular solution in modern "tolerant" societies is to become intolerant (of intolerance) so that society can ensure it is tolerant. (Now *this* is a strange paradox!)

But if you switch perspective from being a human being victimized by evil intolerant forces, to that of divine nature, you'll see something totally different. God not only tolerates, but also loves. And thus through its creations, whether we call it "good" or "evil", all things exist and are treated the same. This is why we have "evil" in the world -- God tolerates and loves it as much as it loves "good". There is no paradox -- point to anything that exists or is conceivable to exist, and God loves it since it is its creation.

The paradox returns only when we return to the human perspective. Where is the difference? First, consider the scenario where the intolerant suppress tolerance in society. This of course does not hinder your *personal* freedom to tolerate this intolerant society -- but those who self-proclaim to be tolerant do not tolerate this. Now, which came first, the intolerant people, or the stance that intolerance cannot be tolerated? From the storytelling perspective it is the former, but in fact it is the latter that was planted first into the hearts of the self-proclaimed tolerant.

It is a logical fallacy to think that one cannot be tolerant of the intolerant. It might seem to be pointless, but it is not something illogical or logically impossible. Once this is accepted, the paradox also goes away. You have people who proclaim to be tolerant, but discreetly hold an exception to that rule. The intolerant people show up, trigger that exception, and now there's a "paradox" because the self-proclaimed tolerant can't fully explain why they are so intolerant of the intolerant.

Now, people may object that, if the intolerant gain power, bad things will happen. That may be the case, but that just also shows that tolerance is not really without exceptions. It simply implies that those who proclaim tolerance will be intolerant to things that lead to bad outcomes. (Btw, God is not intolerant of bad outcomes...)

I have put "evil" and "bad things" in quotes because there is no objective way to define these concepts that is orthogonal to subjective value systems. Those that are truly tolerant realize that these concepts are never absolute.

-------------------
Zzz
Those that are in conflict will be separated.

Friday, March 14, 2025

Objective Function of Intelligence

After a decade or so of "machine learning" in the 21st century, arguably the only important lesson is what they call the "Bitter Lesson", which tells us that, with the current level of computational power we have, just saying what you want and performing gradient descent is more than enough. As long as we have an objective function composed of differentiable functions, we can just shove FLOPS into it and make it work.
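
The "differentiable objective plus FLOPS" recipe can be sketched in a few lines. Here's a toy illustration in plain NumPy (obviously nothing like frontier-scale training): write down a differentiable loss, compute its gradient, and descend.

```python
import numpy as np

# Toy differentiable objective: mean squared error of a linear model.
# "Saying what you want" = writing down the loss; gradient descent does the rest.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w

w = np.zeros(3)   # initial guess
lr = 0.1          # learning rate
for _ in range(500):
    grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of the MSE
    w -= lr * grad                         # descend

# w should now have recovered true_w to high precision
```

Everything interesting about real training (scale, data, architecture) is absent here; the point is only that a differentiable objective plus compute is a complete recipe -- which is exactly why the objective function itself is the scarce ingredient.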

As such, the most valuable thing on Earth right now, something that would be worth trillions of USD, is an objective function of intelligence.

The fact that we can't define intelligence, but depend on it, is a really really interesting fact.

What does it mean? It means that intelligence is not a simple thing we can objectively (as opposed to subjectively) define. Instead, it depends on human judgments. In particular, value judgments.

This is not to say that value judgments are the only thing important for intelligence. But it is a dependency.

In contrast, the kind of intelligence that does not require human value judgments is simply what we call computation. We have formal frameworks to define and validate computation. The reason we can't apply those things to general intelligence is the dependency on human judgment.

This kind of means "we" (collectively) get to choose. Choice. Free Will.

Ah yes. The most valuable thing in the 21st century is our collective free will. 

Of course this also means that we don't have an "objective" function of intelligence, in the sense that the function must remain subjective: dependent on the user's choices.

To me, this is almost a fulfillment of the idea of "ask, and it will be given to you; seek, and you will find; knock, and it will be opened to you." (Matthew 7:7)

To those who have not yet experienced providence, it is difficult to accept that the limit on ourselves is not lack of substance, but that we don't yet know what we truly desire. And now the trillion dollar objective function is staring directly at us. This is not "heaven on earth" yet for sure, but divine concepts do manifest in such similar worldly forms.

Wednesday, March 12, 2025

Timescales

A reply I made on HN that is generally applicable (even devoid of context):


Apparently there's a quote attributed to Bill Gates: "people overestimate what they can do in one year and underestimate what they can do in 10 years."

People overestimate the changes that could happen within a couple years, and totally underestimate the changes that would happen in decades.

Perhaps it's a consequence of change having some kind of exponential behavior. The first couple years might not feel like anything in absolute terms, but give it 10 or 20 years and you'll see a huge change in retrospect.

IMHO, I don't think anyone needs to panic now. Changes happen fast these days, but I don't think things are going to drastically change in the next ~2-3 years. Still, the world is probably going to look very different in 20 years, and in retrospect it will be attributed to seeds planted in these couple of years.

In short I think both camps are right, except on different timescales.



Thursday, March 6, 2025

The power of prophecy (先知力)

Honestly, I really do respect 史兄's character and ability

But this really is a bit of a jinx

Jinxed it at the top, then jinxed it all the way down into the trough... perfect timing

cogito ergo sum

It is really hard not to see an alternative to the "rationalist" reading of the expression "cogito ergo sum", given that God self-introduced as "Ego sum qui sum".

The "I am" (to be / existence) symbolism is in all important places right in front of our eyes. For example, in Taoism it is "自然" (自 = self, 然 = to be).

Thought brings existence into being.

Wednesday, March 5, 2025

Huh, so it turns out Dokibird is a Hongkonger (in the broad sense)?

Only just saw a clip of her speaking Cantonese

And I thought to myself:

I ate so much popcorn over her drama last year

and never even knew she could speak Cantonese

Sunday, March 2, 2025

Hardcore WoW

Based comment on the "Classic Hardcore Moments" channel. 

About hardcore WoW, or about reincarnation?