Branch prediction & speculative fetch mitigation

Why isn’t virtual address (VA) separation enough to mitigate the various Spectre & Meltdown flaws? I mean the generic ones, not including the one that attacks the Intel p-cache == v-cache hack; that was such an obviously bad idea, I can’t find any sympathy.

As a baseline:

1. My kernel address space (AS) only shares one text and data page with the user AS. Those pages contain just enough code and data to save and store registers, load a new memory context, and jump to the appropriate place. Thus, there are no interesting addresses to uncover here.

2. No process ASs from exec have any VAs in common. That is, every VA allocation is taken from a common pool, so that even shared objects like libc are at a different address in every process. Most unix-derived folks would find this odd, but it is certainly feasible; I did it once by mistake^H*10/for testing.

3. Fork()’d process images are sandboxed if they are in separate access control domains, to prevent cross leakage. Sandboxing can involve context switch cache eviction, cpusets that exclude hyper-threads, all the way up to a non-interference kernel.

I understand that [1] is the basic mitigation for Meltdown-related problems, and [2] is a broadening of [1] so that it applies to Spectre. [3] would cause performance problems, but again, limited to just those cases.

edited Nov 21 '18 at 5:02

Peter Cordes

asked Nov 20 '18 at 23:56

mevets

What are you talking about with "attacks the intel p-cache == v-cache hack"? I understand exactly what Spectre and Meltdown are and how they work, but that doesn't sound anything like either of them. It sounds like you're talking about a VIPT L1d cache that avoids aliasing problems by being associative enough that the index bits all come from the offset within a page (and thus translate for free, so the cache behaves like a PIPT but can still do the TLB translation in parallel with fetching data+tags from the indexed set). That's not the cause of Meltdown.
– Peter Cordes
Nov 21 '18 at 0:07

x86 arm x86-64 cpu-architecture branch-prediction

1 Answer
Meltdown attacks depend on (speculatively) accessing the target virtual address directly (from within the attacking process)¹.

But Spectre does not. You prime the branch predictor so that the code under attack speculatively accesses its own virtual address space, which it has permission to do. Branch-predictor aliasing means you can usually / sometimes prime the prediction for a branch at a virtual address you can't / don't have mapped. (e.g. in the kernel.)
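To make the aliasing point concrete, here is a toy model, not any real CPU's predictor: a table of 2-bit counters indexed only by the low bits of the branch address. The table size, the addresses, and the names `ToyPredictor` and `train_alias` are assumptions made up for illustration; real predictors also mix in branch history and may use partial tags.

```python
# Toy model only: a table of 2-bit saturating counters indexed by the low
# bits of the branch address. Real predictors also hash in branch history.
INDEX_BITS = 10  # assumed table size (1024 entries), purely illustrative


class ToyPredictor:
    def __init__(self):
        self.counters = [1] * (1 << INDEX_BITS)  # start "weakly not-taken"

    def _index(self, branch_va):
        # Only the low bits pick the entry; upper address bits are ignored.
        return branch_va & ((1 << INDEX_BITS) - 1)

    def predict(self, branch_va):
        return self.counters[self._index(branch_va)] >= 2  # True = taken

    def update(self, branch_va, taken):
        i = self._index(branch_va)
        step = 1 if taken else -1
        self.counters[i] = max(0, min(3, self.counters[i] + step))


def train_alias(predictor, victim_va, attacker_base, rounds=4):
    """Train from an attacker-owned address that shares the victim's index."""
    mask = (1 << INDEX_BITS) - 1
    attacker_va = (attacker_base & ~mask) | (victim_va & mask)
    for _ in range(rounds):
        predictor.update(attacker_va, taken=True)
    return attacker_va


predictor = ToyPredictor()
victim_branch = 0xFFFF_FFFF_8123_4ABC  # hypothetical kernel branch address
before = predictor.predict(victim_branch)
attacker_branch = train_alias(predictor, victim_branch, 0x0000_5555_0000_0000)
after = predictor.predict(victim_branch)
print(before, after, hex(attacker_branch))
```

In this sketch the attacker never touches the victim's address, yet the prediction for the victim's branch flips, which is why giving every process distinct VAs does not by itself stop the training step.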

The usual side-channel, a cache-read attack, is based on evicting the cache for an array in your own address space. But other side-channels are possible to get the Spectre data from the target back to the attacker, like priming the cache and then looking for which entry was evicted by a conflict-miss for an address which aliases some memory in the process under attack. (Harder because L3 cache in modern x86 CPUs uses a complex indexing function, unlike simpler caches which use a simple range of bits as the index. But possibly you could use L2 or L1d misses. L2 miss / L3 hit should still be measurably longer than an L2 hit.)
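The prime-then-look-for-evictions idea can be sketched with a toy set-associative cache that uses a simple range of address bits as the set index. Everything here is an assumption for illustration: the geometry (64-byte lines, 64 sets, 8 ways), the LRU policy, the addresses, and the helper names. A real attack would measure access latency rather than read a hit flag.

```python
from collections import OrderedDict

LINE_BYTES = 64  # assumed line size
NUM_SETS = 64    # assumed geometry: 64 sets x 8 ways x 64 B = 32 KiB
WAYS = 8


class ToyCache:
    """Set-associative cache with LRU replacement; index = simple bit range."""

    def __init__(self):
        self.sets = [OrderedDict() for _ in range(NUM_SETS)]

    def access(self, addr):
        """Return True on a hit, False on a miss; fill the line on a miss."""
        line = addr // LINE_BYTES
        ways = self.sets[line % NUM_SETS]
        hit = line in ways
        if hit:
            ways.move_to_end(line)        # mark as most recently used
        else:
            if len(ways) >= WAYS:
                ways.popitem(last=False)  # evict the least recently used line
            ways[line] = True
        return hit


def attacker_lines(set_idx):
    base = 0x10_0000  # hypothetical attacker buffer that maps to set 0
    stride = NUM_SETS * LINE_BYTES
    return [base + way * stride + set_idx * LINE_BYTES for way in range(WAYS)]


def prime(cache):
    for s in range(NUM_SETS):
        for addr in attacker_lines(s):
            cache.access(addr)


def probe(cache):
    """Return the sets in which the attacker now sees at least one miss."""
    evicted = []
    for s in range(NUM_SETS):
        misses = sum(not cache.access(a) for a in reversed(attacker_lines(s)))
        if misses:
            evicted.append(s)
    return evicted


cache = ToyCache()
secret = 42                                           # value the gadget leaks
prime(cache)
cache.access(0x7F00_0000_0000 + secret * LINE_BYTES)  # victim's dependent load
leaked = probe(cache)
print(leaked)
```

The attacker never reads the victim's memory; it only learns which of its own lines was pushed out, and the set number carries the secret.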

Or with SMT (e.g. Hyperthreading), an ALU timing attack where the Spectre gadget creates data-dependent ALU port pressure. In this case the only relevant memory access is the data under attack (which is allowed by the hardware; only mis-speculation of the branch causes a rollback, not a load fault).

When you attack the kernel, it will have the physical memory pages of the attacking process mapped somewhere. (Most kernels map all of physical memory to a contiguous range of virtual addresses, allowing easy access to any physical address.) Caching is based on physical addresses, not virtual.

A Spectre gadget that makes a cache line hot via a different mapping for the same page still works.
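A minimal sketch of why a second mapping is enough, assuming a cache that is tracked purely by physical line number. The page tables, the frame number, and both virtual addresses are hypothetical values chosen for illustration.

```python
PAGE = 4096
LINE = 64

USER_VA = 0x7F12_3456_7000          # hypothetical attacker mapping
KERNEL_ALIAS_VA = 0xFFFF_8880_1A2B_3000  # hypothetical direct-map style alias
FRAME = 0x1A2B3                     # the one physical frame behind both

# Toy page tables: virtual page number -> physical frame number.
user_pt = {USER_VA // PAGE: FRAME}
kernel_pt = {KERNEL_ALIAS_VA // PAGE: FRAME}

hot_lines = set()  # physical line numbers currently in the (toy) cache


def translate(page_table, va):
    return page_table[va // PAGE] * PAGE + va % PAGE


def touch(page_table, va):
    """Model a load: the line is recorded by physical address only."""
    hot_lines.add(translate(page_table, va) // LINE)


def is_hot(page_table, va):
    return translate(page_table, va) // LINE in hot_lines


offset = 5 * LINE
touch(kernel_pt, KERNEL_ALIAS_VA + offset)  # gadget runs with the kernel's VA
seen = is_hot(user_pt, USER_VA + offset)    # attacker checks through its own VA
other = is_hot(user_pt, USER_VA + 6 * LINE)
print(seen, other)
```

The two sides never share a virtual address, but the line touched through one mapping is observable through the other because both translate to the same physical line.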

In the context of a system call, the kernel usually keeps user-space memory mapped to the same virtual addresses it was using inside the process, so system calls like read and write can copy between user-space and the pagecache. And many system calls pass user-space pointers to filenames. So when attacking the kernel, a Spectre gadget can directly use user-space addresses in the attacking process.

The Spectre gadget itself could maybe even be in user-space memory, although with separate page tables to work around Meltdown, you might mitigate that by setting the kernel page tables to have user-space VAs mapped without exec permission.

Footnote 1: Meltdown is a bypass for the U/S bit in the page tables, allowing user-space to potentially read any memory the kernel leaves mapped. And yes, [1] is a sufficient workaround. See http://blog.stuffedcow.net/2018/05/meltdown-microarchitecture/.

answered Nov 21 '18 at 0:45

Peter Cordes

Modern kernels, at least the ones I’ve worked on, do not map all of physical memory; that is a 1970s thing. Modern kernels don’t even have the means to do so. The branch predictors use hashing? I’ve only truly seen inside one architecture, and it certainly didn’t; so to train the branch predictor, you had to train it with exact VAs. Agreed SMT is mainly garbage, but I guess that was known 15 years ago.
– mevets
Nov 21 '18 at 0:54

@mevets: Linux on x86-64 direct-maps all of physical memory, up to 64TB anyway. Note the direct mapping of all physical memory (page_offset_base) entry in kernel.org/doc/Documentation/x86/x86_64/mm.txt. With 56-bit 5-level page tables, the direct-map size goes up to 32 PB.
– Peter Cordes
Nov 21 '18 at 0:58

Thank you, btw; I was thinking it was a combined laziness between the kernel folks and CPU folks; but that BP-shambles really puts the whole train wreck at the CPU people's feet.
– mevets
Nov 21 '18 at 1:00

Linux is not a modern OS, by any stretch of anybody's imagination.
– mevets
Nov 21 '18 at 1:01

@mevets: TAGE branch predictors do have "tagged" in the name, but my understanding is that they basically allow aliasing. The most common branch dominates the prediction for that combination of branch history and address, and doesn't have that valuable state wiped out by one rare branch that aliases. Paul Clayton comments that TAGE can/does use partial tagging, in "Why did Intel change the static branch prediction mechanism over these years?". See also Bee's comments here.
– Peter Cordes
Nov 21 '18 at 1:03
