Commit 2ff8768

Add a DCE barrier builtin
In #43852 we noticed that the compiler is getting good enough to completely DCE a number of our benchmarks. We need to add some sort of mechanism to prevent the compiler from doing so. This adds just such an intrinsic. The intrinsic itself doesn't do anything, but it is considered effectful by our optimizer, preventing it from being DCE'd. At the LLVM level, it turns into a volatile store to an alloca (or an llvm.sideeffect if the values passed to the `dcebarrier` do not have any actual LLVM-level representation).

The docs for the new intrinsic are as follows:

```
    dcebarrier(args...)

This function prevents dead-code elimination (DCE) of itself and any arguments
passed to it, but is otherwise the lightest barrier possible. In particular,
it is not a GC safepoint, does not model an observable heap effect, does not expand
to any code itself and may be re-ordered with respect to other side effects
(though the total number of executions may not change).

A useful model for this function is that it hashes all memory `reachable` from
args and escapes this information through some observable side-channel that does
not otherwise impact program behavior. Of course that's just a model. The
function does nothing and returns `nothing`.

This is intended for use in benchmarks that want to guarantee that `args` are
actually computed. (Otherwise DCE may see that the result of the benchmark is
unused and delete the entire benchmark code).

**Note**: `dcebarrier` does not affect constant folding. For example, in
`dcebarrier(1+1)`, no add instruction needs to be executed at runtime and
the code is semantically equivalent to `dcebarrier(2)`.

# Examples

function loop()
    for i = 1:1000
        # The compiler must guarantee that there are 1000 program points (in the correct
        # order) at which the value of `i` is in a register, but has otherwise
        # total control over the program.
        dcebarrier(i)
    end
end
```

I believe the volatile store at the LLVM level is actually somewhat stronger than what we want here. Ideally the `dcebarrier` would not end up generating any machine code at all and would also be compatible with optimizations like SROA and vectorization. However, I think this is fine for now.
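As a rough illustration of the benchmarking use case described in the commit message, the sketch below times a small workload and passes each result to the barrier so that the optimizer cannot discard the work. It assumes a Julia build that includes this change and uses the name `Base.donotdelete` from the diff (the commit message calls it `dcebarrier`). The names `sum_squares` and `bench` are hypothetical and chosen only for this example.

```julia
# Hypothetical workload: sum of the squares 1^2 + 2^2 + ... + n^2.
function sum_squares(n)
    s = 0
    for i in 1:n
        s += i * i
    end
    return s
end

# Hypothetical timing harness. Without the barrier, the compiler would be
# free to notice that `r` is never used and remove the call entirely.
function bench(n, reps)
    t0 = time_ns()
    for _ in 1:reps
        r = sum_squares(n)
        Base.donotdelete(r)   # keep `r` (and its computation) alive
    end
    return (time_ns() - t0) / reps   # average nanoseconds per repetition
end

println("average ns per call: ", bench(1_000, 100))
```

The barrier only guarantees that the value is computed at each call. As the docstring notes, it does not stop constant folding, so a workload whose inputs are compile-time constants may still be evaluated ahead of time.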
1 parent e3b681c commit 2ff8768

8 files changed: 99 additions, 3 deletions

base/compiler/tfuncs.jl

Lines changed: 3 additions & 0 deletions
@@ -527,6 +527,7 @@ add_tfunc(atomic_pointerset, 3, 3, (a, v, order) -> (@nospecialize; a), 5)
 add_tfunc(atomic_pointerswap, 3, 3, (a, v, order) -> (@nospecialize; pointer_eltype(a)), 5)
 add_tfunc(atomic_pointermodify, 4, 4, atomic_pointermodify_tfunc, 5)
 add_tfunc(atomic_pointerreplace, 5, 5, atomic_pointerreplace_tfunc, 5)
+add_tfunc(donotdelete, 0, INT_INF, (@nospecialize args...)->Nothing, 0)

 # more accurate typeof_tfunc for vararg tuples abstract only in length
 function typeof_concrete_vararg(t::DataType)
@@ -1695,6 +1696,8 @@ function _builtin_nothrow(@nospecialize(f), argtypes::Array{Any,1}, @nospecializ
             return true
         end
         return false
+    elseif f === donotdelete
+        return true
     end
     return false
 end

base/docs/basedocs.jl

Lines changed: 35 additions & 0 deletions
@@ -2897,4 +2897,39 @@ See also [`"`](@ref \")
 """
 kw"\"\"\""

+"""
+    donotdelete(args...)
+
+This function prevents dead-code elimination (DCE) of itself and any arguments
+passed to it, but is otherwise the lightest barrier possible. In particular,
+it is not a GC safepoint, does not model an observable heap effect, does not expand
+to any code itself and may be re-ordered with respect to other side effects
+(though the total number of executions may not change).
+
+A useful model for this function is that it hashes all memory `reachable` from
+args and escapes this information through some observable side-channel that does
+not otherwise impact program behavior. Of course that's just a model. The
+function does nothing and returns `nothing`.
+
+This is intended for use in benchmarks that want to guarantee that `args` are
+actually computed. (Otherwise DCE may see that the result of the benchmark is
+unused and delete the entire benchmark code).
+
+**Note**: `donotdelete` does not affect constant folding. For example, in
+`donotdelete(1+1)`, no add instruction needs to be executed at runtime and
+the code is semantically equivalent to `donotdelete(2)`.
+
+# Examples
+
+function loop()
+    for i = 1:1000
+        # The compiler must guarantee that there are 1000 program points (in the correct
+        # order) at which the value of `i` is in a register, but has otherwise
+        # total control over the program.
+        donotdelete(i)
+    end
+end
+"""
+Base.donotdelete
+
 end

base/essentials.jl

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 # This file is a part of Julia. License is MIT: https://julialang.org/license

-using Core: CodeInfo, SimpleVector
+using Core: CodeInfo, SimpleVector, donotdelete

 const Callable = Union{Function,Type}

src/builtin_proto.h

Lines changed: 2 additions & 0 deletions
@@ -53,6 +53,7 @@ DECLARE_BUILTIN(typeassert);
 DECLARE_BUILTIN(_typebody);
 DECLARE_BUILTIN(typeof);
 DECLARE_BUILTIN(_typevar);
+DECLARE_BUILTIN(donotdelete);

 JL_CALLABLE(jl_f_invoke_kwsorter);
 #ifdef DEFINE_BUILTIN_GLOBALS
@@ -65,6 +66,7 @@ JL_CALLABLE(jl_f__abstracttype);
 JL_CALLABLE(jl_f__primitivetype);
 JL_CALLABLE(jl_f__setsuper);
 JL_CALLABLE(jl_f__equiv_typedef);
+JL_CALLABLE(jl_f_donotdelete);

 #ifdef __cplusplus
 }

src/builtins.c

Lines changed: 6 additions & 0 deletions
@@ -1472,6 +1472,11 @@ JL_CALLABLE(jl_f__setsuper)
     return jl_nothing;
 }

+JL_CALLABLE(jl_f_donotdelete)
+{
+    return jl_nothing;
+}
+
 static int equiv_field_types(jl_value_t *old, jl_value_t *ft)
 {
     size_t nf = jl_svec_len(ft);
@@ -1834,6 +1839,7 @@ void jl_init_primitives(void) JL_GC_DISABLED
     add_builtin_func("_setsuper!", jl_f__setsuper);
     jl_builtin__typebody = add_builtin_func("_typebody!", jl_f__typebody);
     add_builtin_func("_equiv_typedef", jl_f__equiv_typedef);
+    jl_builtin_donotdelete = add_builtin_func("donotdelete", jl_f_donotdelete);

     // builtin types
     add_builtin("Any", (jl_value_t*)jl_any_type);

src/codegen.cpp

Lines changed: 43 additions & 0 deletions
@@ -473,6 +473,18 @@ static AttributeList get_func_attrs(LLVMContext &C)
             None);
 }

+static AttributeList get_donotdelete_func_attrs(LLVMContext &C)
+{
+    AttributeSet FnAttrs = AttributeSet::get(C, makeArrayRef({Attribute::get(C, "thunk")}));
+    FnAttrs = FnAttrs.addAttribute(C, Attribute::InaccessibleMemOnly);
+    FnAttrs = FnAttrs.addAttribute(C, Attribute::WillReturn);
+    FnAttrs = FnAttrs.addAttribute(C, Attribute::NoUnwind);
+    return AttributeList::get(C,
+            FnAttrs,
+            Attributes(C, {Attribute::NonNull}),
+            None);
+}
+
 static AttributeList get_attrs_noreturn(LLVMContext &C)
 {
     return AttributeList::get(C,
@@ -3464,6 +3476,36 @@ static bool emit_builtin_call(jl_codectx_t &ctx, jl_cgval_t *ret, jl_value_t *f,
         return true;
     }

+    else if (f == jl_builtin_donotdelete) {
+        // For now we emit this as a vararg call to the builtin
+        // (which doesn't look at the arguments). In the future,
+        // this should be an LLVM builtin.
+        auto it = builtin_func_map.find(jl_f_donotdelete);
+        if (it == builtin_func_map.end()) {
+            return false;
+        }
+
+        *ret = mark_julia_const(ctx, jl_nothing);
+        FunctionType *Fty = FunctionType::get(getVoidTy(ctx.builder.getContext()), true);
+        Function *dnd = prepare_call(it->second);
+        SmallVector<Value*, 1> call_args;
+
+        for (size_t i = 1; i <= nargs; ++i) {
+            const jl_cgval_t &obj = argv[i];
+            if (obj.V) {
+                // TODO is this strong enough to constitute a read of any contained
+                // pointers?
+                Value *V = obj.V;
+                if (obj.isboxed) {
+                    V = emit_pointer_from_objref(ctx, V);
+                }
+                call_args.push_back(V);
+            }
+        }
+        ctx.builder.CreateCall(Fty, dnd, call_args);
+        return true;
+    }
+
     return false;
 }

@@ -8133,6 +8175,7 @@ extern "C" void jl_init_llvm(void)
         { jl_f_arrayset_addr, new JuliaFunction{XSTR(jl_f_arrayset), get_func_sig, get_func_attrs} },
         { jl_f_arraysize_addr, new JuliaFunction{XSTR(jl_f_arraysize), get_func_sig, get_func_attrs} },
         { jl_f_apply_type_addr, new JuliaFunction{XSTR(jl_f_apply_type), get_func_sig, get_func_attrs} },
+        { jl_f_donotdelete_addr, new JuliaFunction{XSTR(jl_f_donotdelete), get_func_sig, get_donotdelete_func_attrs} }
     };

     jl_default_debug_info_kind = (int) DICompileUnit::DebugEmissionKind::FullDebug;

src/staticdata.c

Lines changed: 3 additions & 2 deletions
@@ -26,7 +26,7 @@ extern "C" {
 // TODO: put WeakRefs on the weak_refs list during deserialization
 // TODO: handle finalizers

-#define NUM_TAGS 152
+#define NUM_TAGS 153

 // An array of references that need to be restored from the sysimg
 // This is a manually constructed dual of the gvars array, which would be produced by codegen for Julia code, for C.
@@ -198,6 +198,7 @@ jl_value_t **const*const get_tags(void) {
         INSERT_TAG(jl_builtin__expr);
         INSERT_TAG(jl_builtin_ifelse);
         INSERT_TAG(jl_builtin__typebody);
+        INSERT_TAG(jl_builtin_donotdelete);

         // All optional tags must be placed at the end, so that we
         // don't accidentally have a `NULL` in the middle
@@ -252,7 +253,7 @@ static const jl_fptr_args_t id_to_fptrs[] = {
     &jl_f_applicable, &jl_f_invoke, &jl_f_sizeof, &jl_f__expr, &jl_f__typevar,
     &jl_f_ifelse, &jl_f__structtype, &jl_f__abstracttype, &jl_f__primitivetype,
     &jl_f__typebody, &jl_f__setsuper, &jl_f__equiv_typedef, &jl_f_opaque_closure_call,
-    NULL };
+    &jl_f_donotdelete, NULL };

 typedef struct {
     ios_t *s;

test/compiler/codegen.jl

Lines changed: 6 additions & 0 deletions
@@ -711,3 +711,9 @@ end
 @test !cmp43123(Ref{Function}(+), Ref{Union{typeof(+), typeof(-)}}(-))
 @test cmp43123(Function[+], Union{typeof(+), typeof(-)}[+])
 @test !cmp43123(Function[+], Union{typeof(+), typeof(-)}[-])
+
+# Test that donotdelete survives through to LLVM time
+f_donotdelete_input(x) = Base.donotdelete(x+1)
+f_donotdelete_const() = Base.donotdelete(1+1)
+@test occursin("call void (...) @jl_f_donotdelete(i64", get_llvm(f_donotdelete_input, Tuple{Int64}, true, false, false))
+@test occursin("call void (...) @jl_f_donotdelete()", get_llvm(f_donotdelete_const, Tuple{}, true, false, false))
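
The tests above rely on `get_llvm`, which appears to be a helper from Julia's own test code. Outside the test suite, a similar check could be sketched with `code_llvm` from the `InteractiveUtils` standard library. The exact text of the emitted call is an assumption that may differ between Julia versions, so this sketch only searches for the substring `donotdelete`. The names `f_input` and `llvm_ir` are hypothetical.

```julia
using InteractiveUtils  # provides code_llvm

# Hypothetical function mirroring f_donotdelete_input from the test above.
f_input(x) = Base.donotdelete(x + 1)

# Capture the LLVM IR for `f` with argument types `types` as a string
# instead of printing it to stdout.
function llvm_ir(f, types)
    io = IOBuffer()
    code_llvm(io, f, types)
    return String(take!(io))
end

ir = llvm_ir(f_input, Tuple{Int64})

# If the barrier survived optimization, its name should appear in the IR.
println(occursin("donotdelete", ir))
```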
