Gonna say this one time.

I don’t post this stuff because I am looking for someone to tell me there’s not a scholarly article, or a deeper dive. I post it because of trends I have seen in the reporting of day to day events, and emerging threats.

I didn’t get here on scholarly articles on emerging threats

As this one gets closer to being truly weaponized… you need to know that SPECTRE and Meltdown cannot be patched.

threatpost.com/attacks-slaught

I mean, I hate it too… all my life has been focused on one version of x86 or another.

But SPECTRE is not a ghost. It is real. It can do damage.

@thegibson omg Intel's response, "there's no problem here, as long as every developer compiles their code to assembly to find and correct timing vulnerabilities" 😂

@datatitian yep, cause that is how the world works. 😂

@datatitian @thegibson Compiles their code to assembly? Eehh...we're talking about timing side channel attacks here. Compiling to bytecode gives the target at least the chance to chaff and winnow by randomizing the time each bytecode takes to execute. Compiling to raw assembly is probably the worst thing you can do under these stresses.

@vertigo @datatitian @thegibson how many programming languages or compilers have types of annotations that say, hey, this here needs to execute in constant time pls

@vertigo @datatitian @thegibson there's haybale-pitchfork

a tool for verifying that constant-time code is, indeed, constant-time. It can analyze code written in C/C++, Rust, or any other language which can compile to LLVM bitcode (e.g., Swift, Go, and others).

@meena @datatitian @thegibson Ooh, my apologies. I misread "types of annotations" as "type annotations", so I thought you were wondering about native syntax and semantics.

@meena @datatitian @thegibson That said, I was unaware of this package. Thank you for linking to it!

@vertigo @datatitian @thegibson i think this was meant to be "types or annotations"

so, yes, native syntax would be good
a package / library etc would be okay

a compiler directive would be extremely good, too
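To make "constant time" a bit more concrete, here is a minimal Python sketch that is not from the thread. The function names `leaky_equal` and `ct_equal` are made up for illustration. Python itself gives no real timing guarantees, so this only shows the shape of the problem; in practice the standard library's `hmac.compare_digest` is the safer choice.

```python
import hmac

def leaky_equal(a: bytes, b: bytes) -> bool:
    """Early-exit comparison: how long it runs depends on where the first mismatch is."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False  # bails out early, so timing hints at the mismatch position
    return True

def ct_equal(a: bytes, b: bytes) -> bool:
    """Comparison shaped to do the same amount of work wherever the mismatch is."""
    if len(a) != len(b):
        return False  # assumption: the length is public, only the contents are secret
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y  # accumulate differences without a data-dependent branch
    return diff == 0

# Both agree with the standard-library helper on the answer; only the timing shape differs.
print(ct_equal(b"secret", b"secret"), hmac.compare_digest(b"secret", b"secret"))
print(ct_equal(b"secret", b"secreT"), leaky_equal(b"secret", b"secreT"))
```

An annotation or compiler directive of the kind asked about here would, presumably, let the toolchain reject something like `leaky_equal` automatically instead of relying on review.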

@datatitian @thegibson Interesting; I will give this a read through over my lunch break today. Thanks for the link.

@thegibson I am so ready for the age of Intel to be over. Like, if they have a near-death experience and start shipping sane, non-broken CPUs like almost everyone else, that's fine.

If they die entirely, it's real bad for US chipmaking, but good for the world.

Worst case would be Intel survives and mitigates Spectre enough to be tomorrow's problem.

@thegibson we have the tools to solve these problems, but I’ve had little luck convincing anyone with the resources to get it done that it’s real.

These two are just the beginning, and as long as we rely on static logic we’ll have computers that can’t be fixed.

I wrote a (weirdly patriotic?) post about using FPGA to solve this and many other systemic vulnerabilities our computers have, but I’m not sure how to push it forward.

jasongullickson.com/computatio

@requiem @thegibson If I may be allowed to be pedantic here, I ask that my words be considered with some gravity.

The issue isn't static logic. The issue is divorcing instruction decoding from instruction set design to attain performance goals not originally built into the ISA.

It takes, for example, several clock cycles just to decode x86 instructions into a form that can then be readily executed. Several clocks to load the code cache. Several clocks to translate what's in the code cache into a pre-decoded form in the pre-decode cache. Several clocks to load a pre-decode line into the instruction registers (yes, plural) of the instruction fetch unit. A clock to pass that onto the first of (I think?) three instruction decode stages in the core. Three more clocks after that, you finally have a fully decoded instruction that the remainder of the pipelines (yes, plural) can potentially execute.

Of course, I say potentially because there's register renaming happening, there are delays caused by waiting for instruction execution units to become available in the first place, there's waiting for result buses to become uncontested, ...

The only reason all this abhorrent latency is obscured is that the CPU literally has hundreds of instructions in flight at any given time. Gone are the days when it was a technical achievement that the Pentium had 2 concurrently running instructions. Today, our CPUs have literally hundreds.

(Consider: a 7-pipe superscalar processor with 23 pipeline stages, assuming no other micro-architectural features to enhance performance, still offers 23*7=161 in-flight instructions, assuming you have some other means of keeping those pipes filled.)
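The arithmetic in that parenthetical can be written out as a tiny sketch. The helper name `max_in_flight` is hypothetical, and the model is deliberately naive: it assumes every stage of every pipe holds exactly one instruction and ignores reorder buffers, micro-op fusion and everything else a real core does.

```python
def max_in_flight(pipes: int, stages: int) -> int:
    """Upper bound on in-flight instructions if every stage of every pipe is occupied."""
    if pipes < 0 or stages < 0:
        raise ValueError("pipes and stages must be non-negative")
    return pipes * stages

# The 7-pipe, 23-stage example from the post above.
print(max_in_flight(7, 23))  # 161
```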

This is why CPU vendors no longer put cycle counts next to their instructions. Instructions are pre-decoded into short programs, and it's those programs (strings of "micro-ops", hence micro-op caches, et al.) which are executed by the core on a more primitive level.

Make no mistake: the x86 instruction set architecture we all love to hate today has been a shambling undead zombie for decades now. RISC definitely won, which is why every x86-compatible processor has been built on top of RISC cores since the early 00s, if not earlier. Intel just doesn't want everyone to know it because the ISA is such a cash cow these days. Kind of like how the USA is really a nation whose official measurement system is the SI system, but we continue to use imperial units because we have official definitions that map one to the other.

Oh, but don't think that RISC is immune from this either. It makes my blood boil when people say, "RISC-V|ARM|MIPS|POWER is immune."

No, it's not. Neither is MIPS, neither is ARM, neither is POWER. If your processor has any form of speculative execution and depends on caches for maintaining instruction throughputs, which is to say literally all architectures on the planet since the Pentium-Pro demonstrated its performance advantages over the PowerPC 601, you will be susceptible to SPECTRE. Full stop. That's laws of physics talking, not Intel or IBM.

Whether it's implemented as a sea-of-gates in some off-brand ASIC or if it's an FPGA, or you're using the latest nanometer-scale process node by the most expensive fab house on the planet, it won't matter -- SPECTRE is an artifact of the micro-architecture used by the processor. It has nothing whatsoever to do with the ISA. It has everything to do with performance-at-all-costs, gotta-keep-them-pipes-full mentality that drives all of today's design requirements.
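For readers who want to see why speculation plus caches is the whole problem regardless of ISA, here is a toy Python model. It is not a real attack and nothing in it comes from the thread: the "cache" is a Python set, the latencies `HIT` and `MISS` are made-up numbers, and the `mispredict` flag stands in for a mistrained branch predictor. It only sketches the general idea that squashed speculative work can still leave a measurable cache footprint.

```python
HIT, MISS = 1, 100   # hypothetical access costs, in pretend cycles
STRIDE = 64          # hypothetical cache-line size, in bytes

class ToyCache:
    """A cache reduced to 'which lines have been touched'."""
    def __init__(self):
        self.lines = set()

    def access(self, addr: int) -> int:
        """Touch an address; return the simulated latency and leave the line cached."""
        line = addr // STRIDE
        latency = HIT if line in self.lines else MISS
        self.lines.add(line)
        return latency

    def flush(self) -> None:
        self.lines.clear()

def victim(cache, memory, public_len, index, mispredict):
    """Bounds-checked read. With mispredict=True the body runs 'speculatively'
    even when the check fails: the result is discarded, the cache footprint is not."""
    in_bounds = index < public_len
    if in_bounds or mispredict:
        value = memory[index]
        cache.access(value * STRIDE)   # secret-dependent load into a probe region
        if in_bounds:
            return value               # architecturally visible result
    return None                        # squashed speculation returns nothing

def recover_byte(cache) -> int:
    """Attacker times all 256 candidate lines; the fast one reveals the byte."""
    timings = [cache.access(v * STRIDE) for v in range(256)]
    return timings.index(min(timings))

memory = bytes([10, 20, 30, 40]) + b"S"   # 4 public bytes, then one secret byte
cache = ToyCache()
cache.flush()
leaked = victim(cache, memory, public_len=4, index=4, mispredict=True)
print(leaked)                    # None: nothing leaks architecturally
print(chr(recover_byte(cache)))  # S: the footprint gives it away
```

Note that nothing in the model mentions an instruction set. The leak comes entirely from the combination of "run ahead of the check" and "remember what was touched".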

I will put the soapbox back in the closet now. Sorry.

@vertigo @requiem @TheGibson I heckin love it when you get passionately loud about cpu design, just saying

@djsundog @requiem @thegibson I distinctly remember when the first round of SPECTRE and Meltdown attacks came out and everyone and their grandmother were heralding the technical superiority of ARM cores because they didn't have a successful demonstration of these attacks.

It only took several months of effort to demo the first attack for the ARM.

Then, POWER became the patron saint of processing. And, as I recall, not long after, its fortified walls fell eventually as well.

You can absolutely get to the moon from here if you have enough bandaids. But, I'll argue that there are easier ways to do it than creating a big, gooey stack of padded rubber strips carefully balanced on each other.

@vertigo @djsundog @requiem @thegibson I'm not sure that can be done by patching up the ISA or instruction handling, there are just too many ways besides that to snoop state on a shared device: page table hierarchies have timing differences, IOMMUs have slightly different behavior, you can analyze the storage hierarchy after forcing memory pressure, ...

It might be easier to just pack two computers in the box that communicate via a simple(!) bus, with everything else strictly separate (no shared memory, no shared storage, etc). Security critical code ends up on the smaller of the two units and any insecure code can request it to do things but never measure the details because the communication granularity is from request to reply. (remember to put any power management into the secure side of things or the insecure side can glean information off from there)

Still needs some care (e.g. constant time implementations) but it's much easier to reason about.
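As a rough sketch of that request/reply boundary (an illustration only, with hypothetical names like `SecureUnit` and the `"mac"`/`"verify"` operations), the interface could look like the following. A Python object obviously provides no hardware isolation; the point is just that the insecure side only ever sees whole replies, and that the secure side still uses a constant-time comparison internally.

```python
import hashlib
import hmac
import os

class SecureUnit:
    """Stands in for the small, separate machine. In this model the key
    never appears in any reply; real isolation would need separate hardware."""

    def __init__(self):
        self._key = os.urandom(32)

    def handle(self, request: dict) -> dict:
        """Process one complete request and return one complete reply."""
        op = request.get("op")
        if op == "mac":
            tag = hmac.new(self._key, request["data"], hashlib.sha256).digest()
            return {"ok": True, "tag": tag}
        if op == "verify":
            expected = hmac.new(self._key, request["data"], hashlib.sha256).digest()
            # compare_digest is the constant-time care mentioned above
            return {"ok": hmac.compare_digest(expected, request["tag"])}
        return {"ok": False}

# The insecure side: it can ask for things, but only gets replies back.
unit = SecureUnit()
reply = unit.handle({"op": "mac", "data": b"hello"})
check = unit.handle({"op": "verify", "data": b"hello", "tag": reply["tag"]})
print(check["ok"])  # True
```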

The biggest concern would be that it won't take long before chip vendors end up putting them into the same package again "because we made it secure, pinky promise!" (which is how Arm TrustZone and the various Intel initiatives work - and fail - these days)

Which reminds me - aren't you building a computer that also features a simple communication channel? ;-)

@patrick @requiem @thegibson @djsundog Yes, I am. Although, my focus isn't security, but rather to have fun hacking on an open platform that can still evolve into something useful to me later on.

@vertigo @requiem @TheGibson Does this mean the "compiled to be pre-scheduled for your particular CPU" idea of Itanium is going to win in the end?!

@sjb @requiem @thegibson That works for some workloads. Consider GPUs for example. However, for other workloads not so much. A more general approach would be a fleet of small processors interacting with each other over communications links, each with their own private memory. This cellular approach to computing is something that was envisioned back in the days of SmallTalk, but never fully realized. The GreenArrays GA144 chip is probably the next incarnation of the idea, but its application domain appears to be limited to deep embedded applications.

I don't claim to know a general purpose solution to this extremely general purpose problem. However knowing the true reasons why it exists in the first place is critical in knowing how to mitigate it, at least for specific domains.

@vertigo @requiem @TheGibson
> RISC definitely won, which is why every x86-compatible processor has been built on top of RISC cores since the early 00s, if not earlier.

Yes _and no_ ... RISC as a microarchitecture thoroughly and definitively won. There's a real argument to be made that a CISC ISA with a RISC microarchitecture is the true performance winner. (Cf. X86, GPUs)

And of course, as you say, our RISCs aren't as RISCy as they could be these days, either.

@vertigo @requiem @TheGibson
What if it's the compiler that does the speculation, VLIW-style?

@wolf480pl @requiem @thegibson Everything is much more deterministic in that scenario, and is immune to SPECTRE.

SPECTRE depends on changing between user and kernel modes of operation. The idea is to exploit failed speculation into kernel space. Under these conditions, you're still running in user-space, but the caches now have privileged information in them. How much depends on which paths were speculated in the kernel, and flushing those cache lines in favor of new user-mode content takes time. Hence, the timing side-channel.

With a compiler for a VLIW architecture, this can't occur, because speculation never happens across a privilege boundary. The cache is always hot with the working set of the process currently running.

@vertigo @requiem @TheGibson
btw. (unrelated to long-term solutions)

I read the paper about that latest spectre variant and it looks like their whole lfence-bypassing attack relies on a secret-dependent indirect branch after the bounds/permission check (and the lfence), and to my best understanding, if that indirect branch was a retpoline, the attack would no longer work.

Am I missing something? I can't believe they haven't thought of such a simple mitigation...

@requiem DNS isn't resolving from here... or for either of the "down for everyone or just me" sites i tried

@millihertz @requiem Confirmed; that domain name no longer seems to resolve.

kc5tja@pop-os:~$ dig jasongullickson.com
^Ckc5tja@pop-os:~$ dig -n jasongullickson.com

; <<>> DiG 9.16.6-Ubuntu <<>> -n jasongullickson.com
;; global options: +cmd
;; connection timed out; no servers could be reached

kc5tja@pop-os:~$

@vertigo @millihertz fascinating, thanks for the heads-up; wonder why it works for me?

@djsundog @vertigo @millihertz I tested from a couple different hosts (two outside my LAN), but I've restarted the DNS just for good measure.

@requiem @djsundog @vertigo @millihertz

~ [13] ; dig -n jasongullickson.com

; <<>> DiG 9.10.6 <<>> -n jasongullickson.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 27306
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

~ [14] ; host !$
host jasongullickson.com
Host jasongullickson.com not found: 2(SERVFAIL)

@requiem @djsundog @vertigo @millihertz That's... interesting.

~ [17] ; nslookup
> jasongullickson.com.
;; Got SERVFAIL reply from 192.5.37.10, trying next server
Server: 192.5.37.1
Address: 192.5.37.1#53

** server can't find jasongullickson.com: SERVFAIL
> set type=ANY
> jasongullickson.com.
;; Truncated, retrying in TCP mode.
Server: 192.5.37.10
Address: 192.5.37.10#53

Non-authoritative answer:
jasongullickson.com rdata_46 = A 7 2 86400 20210423000000 20210324112516 29102 jasongullickson.com. O2UwrcqqV64uNec1cw/NuauJQ7ojml7J8FhNA4Ot9+GHu7eOG4yTQXFI gAOKXhd+ti4oLnrQDMyu/iffGpgWPywvKYAZH0f7y5s8Wzi/gs1jfwXN 5cH3/vdyZQeeK4HkHXlryllYhODLiNYocOw274MxjMKKTbKD/IZXQUXc xOk=
Name: jasongullickson.com
Address: 178.128.234.190
jasongullickson.com nameserver = ns2.box.virtualprivatenation.com.
jasongullickson.com nameserver = ns1.box.virtualprivatenation.com.
jasongullickson.com rdata_46 = DS 8 2 86400 20210511042217 20210504031217 54714 com. AilXbw5fM0XAlkRAeaA5X3v6ns+3J6Uc8raVVnSPKppCToalHnDHkftm AIq2PHAuiefCbdRsHQKoVAvAIIhpptz/AiluUXut3pnPui00VxzTBYiY 5LZC4uNJCt26+hhTxZiMC45/Wg6oIuQDPT9JAUha+2/KEZVg2+UABw1I 2BAJSHOKWGPWl9OArTbnLMdCnQEj8yVBHc3Se/VlFLX/Jg==
jasongullickson.com rdata_43 = 6744 7 2 89460E1D42B6F3E34E34FF643C23EC81EDF13548371F7382851EC152 03B024A9

Authoritative answers can be found from:
ns1.box.virtualprivatenation.com internet address = 185.203.114.173
ns2.box.virtualprivatenation.com internet address = 185.203.114.173

: || nomad@stefen ~ [18] ; host 178.128.234.190
190.234.128.178.in-addr.arpa domain name pointer preposter.us.

@nomad @djsundog @vertigo @millihertz I don’t understand what this means.

Those servers at the bottom are my DNS servers, so what is all that other stuff that doesn’t work?

@requiem @djsundog @vertigo @millihertz The important part, I think, is the server fail (SERVFAIL) messages.

However, I just tried again:
: || nomad@stefen ~ [20] ; host jasongullickson.com
jasongullickson.com has address 178.128.234.190
jasongullickson.com mail is handled by 10 box.virtualprivatenation.com.

so it looks like it's working now.
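A check like the ones traded back and forth above can also be scripted. This is a minimal sketch using Python's standard `socket.getaddrinfo`; the helper name `resolve` is made up here. It goes through the system resolver, so unlike `dig` or `nslookup` it cannot tell a SERVFAIL from any other kind of failure, it only says whether an answer came back.

```python
import socket

def resolve(name: str) -> list:
    """Return the sorted unique addresses the system resolver gives for name,
    or an empty list if resolution fails for any reason."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve("jasongullickson.com")` from a few different hosts and comparing the results would be one way to repeat the test without reading raw resolver output.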

@nomad @djsundog @vertigo @millihertz thanks for checking, I did find a zombie process on the DNS server so perhaps the reboot did the trick.

I’m going to keep an eye on it and see if it comes back, but thanks for the recon and info!

@djsundog @vertigo @millihertz if you have a chance give it another go and let me know if it’s still down, thanks!

@thegibson It's also implemented in Javascript. It's probably on a couple of CDNs by now.

@rysiek, @tomasino, MI raises ARM.*

So, you are telling me that after all this time of running away from the rotten apple, I now need to get an M1 device? Or is there a viable RISC alternative?

* M1, but with Roman numerals, read as "/me raises arm"

@walter @rysiek @tomasino There's also a RISC-V soft-SoC that runs on an Arty-A7 FPGA board. It runs Linux. Since the CPU is a soft-core hardware bugs *can* be patched.

But I believe it might not be an off the shelf solution for everybody yet.

An old link:
antmicro.com/blog/2020/05/mult

Trying to find a more up-to-date one...

@TheGibson I know we've been tooting the same horn about spectre for a while now but it feels like people just don't get the scope of it -- even people that should
