
Commit 93fae78

[IRGen][Runtime] Add emit-into-client retain/release calls for Darwin ARM64.

This is currently disabled by default. Building the client library can be enabled with the CMake option SWIFT_BUILD_CLIENT_RETAIN_RELEASE, and using the library can be enabled with the flags -Xfrontend -enable-client-retain-release.

To improve retain/release performance, we build a static library containing optimized implementations of the fast paths of swift_retain, swift_release, and the corresponding bridgeObject functions. This avoids going through a stub to make a cross-library call.

IRGen gains awareness of these new functions and emits calls to them when the functionality is enabled and the target supports them. Two options are added to force use of them on or off: -enable-client-retain-release and -disable-client-retain-release. When enabled, the compiler auto-links the static library containing the implementations.

The new calls also use LLVM's preserve_most calling convention. Since retain/release doesn't need a large number of scratch registers, this is mostly harmless for the implementation, while allowing callers to improve code size and performance by spilling fewer registers around refcounting calls. (Experiments with an even more aggressive calling convention preserving x2 and up showed an insignificant savings in code size, so preserve_most seems to be a good middle ground.)

Since the implementations are embedded into client binaries, any change in the runtime's refcounting implementation needs to stay compatible with this new fast path implementation. This is ensured by having the implementation use a runtime-provided mask to check whether it can proceed into its fast path. The mask is provided as the address of the absolute symbol _swift_retainRelease_slowpath_mask_v1. If that mask ANDed with the object's current refcount field is non-zero, then we take the slow path. A future runtime that changes the refcounting implementation can adjust this mask to match, or set the mask to all 1s to disable the old embedded fast path entirely (as long as the new representation never uses 0 as a valid refcount field value).

As part of this work, the overall approach for bridgeObjectRetain is changed slightly. Previously, it would mask off the spare bits from the native pointer and then call through to swift_retain. This either lost the spare bits in the return value (when tail calling swift_retain) which is problematic since it's supposed to return its parameter, or it required pushing a stack frame which is inefficient. Now, swift_retain takes on the responsibility of masking off spare bits from the parameter and preserving them in the return value. This is a trivial addition to the fast path (just a quick mask and an extra register for saving the original value) and makes bridgeObjectRetain quite a bit more efficient when implemented correctly to return the exact value it was passed.

The runtime's implementations of swift_retain/release are now also marked as preserve_most so that they can be tail called from the client library. preserve_most is compatible with callers expecting the standard calling convention so this doesn't break any existing clients. Some ugly tricks were needed to prevent the compiler from creating unnecessary stack frames with the new calling convention. Avert your eyes.

To allow back deployment, the runtime now has aliases for these functions called swift_retain_preservemost and swift_release_preservemost. The client library brings weak definitions of these functions that save the extra registers and call through to swift_retain/release. This allows them to work correctly on older runtimes, with a small performance penalty, while still running at full speed on runtimes that have the new preservemost symbols.

Although this is only supported on Darwin at the moment, it shouldn't be too much work to adapt it to other ARM64 targets. We need to ensure the assembly plays nice with the other platforms' assemblers, and make sure the implementation is correct for the non-ObjC-interop case.

rdar://122595871

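The mask-based compatibility check described in the commit message can be sketched in a few lines of C++. This is only an illustration under stated assumptions: the mask bit pattern and the names `kDemoSlowPathMask`, `kDisableFastPath`, `needsSlowPath`, and `demoMaskChecks` are hypothetical, and the commit message suggests the real fast path is written in assembly and reads the mask from the address of `_swift_retainRelease_slowpath_mask_v1`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the runtime-provided mask. The bit pattern is
// made up for illustration; the real value comes from the runtime.
constexpr std::uint64_t kDemoSlowPathMask = 0x8000000000000001ULL;

// A mask of all 1s sends every non-zero refcount field to the slow path,
// which is why 0 must never be a valid field value in that scenario.
constexpr std::uint64_t kDisableFastPath = ~std::uint64_t{0};

// Returns true when the embedded fast path must defer to the runtime.
constexpr bool needsSlowPath(std::uint64_t refcountField, std::uint64_t mask) {
  return (refcountField & mask) != 0;
}

// Small usage demonstration with made-up refcount field values.
inline void demoMaskChecks() {
  assert(!needsSlowPath(0x0000000200000002ULL, kDemoSlowPathMask)); // no masked bit set
  assert(needsSlowPath(0x8000000000000000ULL, kDemoSlowPathMask));  // masked bit set
  assert(needsSlowPath(0x0000000200000002ULL, kDisableFastPath));   // fast path disabled
}
```

Because the check is a single AND and compare, a newer runtime can presumably change which bits force the slow path without recompiling clients that embed the fast path.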
1 parent cb7ddbe commit 93fae78

26 files changed: 706 additions and 44 deletions

CMakeLists.txt

Lines changed: 13 additions & 0 deletions
@@ -271,6 +271,19 @@ option(SWIFT_BUILD_STDLIB_CXX_MODULE
   "If not building stdlib, controls whether to build the Cxx module"
   TRUE)

+# The swiftClientRetainRelease library is currently only available for Darwin
+# platforms.
+if(SWIFT_HOST_VARIANT_SDK IN_LIST SWIFT_DARWIN_PLATFORMS)
+  # Off by default everywhere for now.
+  option(SWIFT_BUILD_CLIENT_RETAIN_RELEASE
+    "Build the swiftClientRetainRelease library"
+    FALSE)
+else()
+  option(SWIFT_BUILD_CLIENT_RETAIN_RELEASE
+    "Build the swiftClientRetainRelease library"
+    FALSE)
+endif()
+
 # In many cases, the CMake build system needs to determine whether to include
 # a directory, or perform other actions, based on whether the stdlib or SDK is
 # being built at all -- statically or dynamically. Please note that these

include/swift/AST/IRGenOptions.h

Lines changed: 4 additions & 0 deletions
@@ -548,6 +548,9 @@ class IRGenOptions {
   // Whether to emit mergeable or non-mergeable traps.
   unsigned MergeableTraps : 1;

+  /// Enable the use of swift_retain/releaseClient functions.
+  unsigned EnableClientRetainRelease : 1;
+
   /// The number of threads for multi-threaded code generation.
   unsigned NumThreads = 0;

@@ -675,6 +678,7 @@ class IRGenOptions {
         EmitAsyncFramePushPopMetadata(true), EmitTypeMallocForCoroFrame(true),
         AsyncFramePointerAll(false), UseProfilingMarkerThunks(false),
         UseCoroCCX8664(false), UseCoroCCArm64(false), MergeableTraps(false),
+        EnableClientRetainRelease(false),
         DebugInfoForProfiling(false), CmdArgs(),
         SanitizeCoverage(llvm::SanitizerCoverageOptions()),
         TypeInfoFilter(TypeInfoDumpFilter::All),

include/swift/Option/FrontendOptions.td

Lines changed: 7 additions & 0 deletions
@@ -1477,6 +1477,13 @@ def mergeable_traps :
   Flag<["-"], "mergeable-traps">,
   HelpText<"Emit mergeable traps even in optimized builds">;

+def enable_client_retain_release :
+  Flag<["-"], "enable-client-retain-release">,
+  HelpText<"Enable use of swift_retain/releaseClient functions">;
+def disable_client_retain_release :
+  Flag<["-"], "disable-client-retain-release">,
+  HelpText<"Disable use of swift_retain/releaseClient functions">;
+
 def enable_new_llvm_pass_manager :
   Flag<["-"], "enable-new-llvm-pass-manager">,
   HelpText<"Enable the new llvm pass manager">;

include/swift/Runtime/Config.h

Lines changed: 11 additions & 0 deletions
@@ -248,6 +248,17 @@ extern uintptr_t __COMPATIBILITY_LIBRARIES_CANNOT_CHECK_THE_IS_SWIFT_BIT_DIRECTL
 // so changing this value is not sufficient.
 #define SWIFT_DEFAULT_LLVM_CC llvm::CallingConv::C

+// Define the calling convention for refcounting functions for targets where it
+// differs from the standard calling convention. Currently this is only used for
+// swift_retain, swift_release, and some internal helper functions that they
+// call.
+#if defined(__aarch64__)
+#define SWIFT_REFCOUNT_CC SWIFT_CC_PreserveMost
+#define SWIFT_REFCOUNT_CC_PRESERVEMOST 1
+#else
+#define SWIFT_REFCOUNT_CC
+#endif
+
 /// Should we use absolute function pointers instead of relative ones?
 /// WebAssembly target uses it by default.
 #ifndef SWIFT_COMPACT_ABSOLUTE_FUNCTION_POINTER
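
The `SWIFT_REFCOUNT_CC` hunk in Config.h follows a common pattern: a macro expands to a calling-convention attribute where one applies and to nothing elsewhere, so declarations can be annotated unconditionally. A minimal sketch of that pattern follows. It assumes Clang's `preserve_most` attribute and detects it with `__has_attribute`, whereas the commit keys on `__aarch64__` and uses `SWIFT_CC_PreserveMost`, which is defined elsewhere. `DEMO_REFCOUNT_CC`, `DemoObject`, `demo_retain`, and `demoRetainUsage` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical macro with the same shape as SWIFT_REFCOUNT_CC: an attribute
// when the compiler knows preserve_most, an empty expansion otherwise.
#if defined(__has_attribute)
#  if __has_attribute(preserve_most)
#    define DEMO_REFCOUNT_CC __attribute__((preserve_most))
#    define DEMO_REFCOUNT_CC_PRESERVEMOST 1
#  endif
#endif
#ifndef DEMO_REFCOUNT_CC
#  define DEMO_REFCOUNT_CC
#endif

// Toy object with a plain, non-atomic count; real refcounts are atomic.
struct DemoObject {
  long strongCount;
};

// Annotated unconditionally; the macro decides whether anything is applied.
DEMO_REFCOUNT_CC
DemoObject *demo_retain(DemoObject *object) {
  if (object)
    ++object->strongCount;
  return object;  // returns its parameter, like swift_retain
}

// Callers look the same whichever way the macro expanded.
inline void demoRetainUsage() {
  DemoObject object{1};
  assert(demo_retain(&object) == &object);
  assert(object.strongCount == 2);
}
```

The second macro, defined only when the attribute is active, gives other code a way to test which convention is in effect, which appears to be the role of `SWIFT_REFCOUNT_CC_PRESERVEMOST` in the hunk above.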

include/swift/Runtime/CustomRRABI.h

Lines changed: 3 additions & 0 deletions
@@ -59,6 +59,9 @@ namespace swift {
 template <typename Ret, typename Param>
 Param returnTypeHelper(Ret (*)(Param)) {}

+template <typename Ret, typename Param>
+Param returnTypeHelper(SWIFT_REFCOUNT_CC Ret (*)(Param)) {}
+
 #if defined(__LP64__) || defined(_LP64)
 #define REGISTER_SUBSTITUTION_PREFIX ""
 #define REGISTER_PREFIX "x"

include/swift/Runtime/HeapObject.h

Lines changed: 2 additions & 0 deletions
@@ -137,6 +137,7 @@ HeapObject* swift_allocEmptyBox();
 /// It may also prove worthwhile to have this use a custom CC
 /// which preserves a larger set of registers.
 SWIFT_RUNTIME_EXPORT
+SWIFT_REFCOUNT_CC
 HeapObject *swift_retain(HeapObject *object);

 SWIFT_RUNTIME_EXPORT
@@ -173,6 +174,7 @@ bool swift_isDeallocating(HeapObject *object);
 /// - maybe a variant that can assume a non-null object
 /// It's unlikely that a custom CC would be beneficial here.
 SWIFT_RUNTIME_EXPORT
+SWIFT_REFCOUNT_CC
 void swift_release(HeapObject *object);

 SWIFT_RUNTIME_EXPORT
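
The commit message notes that swift_retain now masks off spare bits itself and returns its argument unchanged, which lets bridgeObjectRetain call it in tail position. The following toy sketch models that contract. It is an illustration under stated assumptions, not the runtime's code: the spare-bit mask, `DemoObject`, `demoRetain`, and `demoBridgeObjectRetain` are hypothetical, the count is non-atomic, and cases such as tagged Objective-C pointers are ignored.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout: the low 3 bits of the word are treated as spare bits.
// The real spare-bit mask is target-specific and is not shown in this diff.
constexpr std::uintptr_t kDemoSpareBitsMask = 0x7;

// 8-byte alignment guarantees the low 3 address bits are zero.
struct alignas(8) DemoObject {
  long strongCount = 1;
};

// Toy retain: strips spare bits to locate the object, bumps the count, and
// returns the argument exactly as it was passed in.
inline std::uintptr_t demoRetain(std::uintptr_t bits) {
  auto *object = reinterpret_cast<DemoObject *>(bits & ~kDemoSpareBitsMask);
  if (object)
    ++object->strongCount;
  return bits;  // spare bits survive in the return value
}

// With retain doing the masking, the bridge-object variant needs no work
// after the call returns, so the call can be a tail call.
inline std::uintptr_t demoBridgeObjectRetain(std::uintptr_t bits) {
  return demoRetain(bits);
}

// Usage: a pointer tagged with spare bits comes back bit-for-bit identical.
inline void demoBridgeUsage() {
  DemoObject object;
  std::uintptr_t tagged = reinterpret_cast<std::uintptr_t>(&object) | 0x5;
  assert(demoBridgeObjectRetain(tagged) == tagged);
  assert(object.strongCount == 2);
}
```

In the older approach described in the commit message, the masking happened in the caller, so a tail call would have returned the masked pointer instead of the original value.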

include/swift/Runtime/RuntimeFunctions.def

Lines changed: 34 additions & 0 deletions
@@ -220,6 +220,22 @@ FUNCTION(NativeStrongRelease, Swift, swift_release, C_CC, AlwaysAvailable,
          EFFECT(RuntimeEffect::RefCounting, RuntimeEffect::Deallocating),
          UNKNOWN_MEMEFFECTS)

+// void *swift_retain(void *ptr);
+FUNCTION(NativeStrongRetainClient, Swift, swift_retainClient, SwiftClientRR_CC, AlwaysAvailable,
+         RETURNS(RefCountedPtrTy),
+         ARGS(RefCountedPtrTy),
+         ATTRS(NoUnwind, FirstParamReturned, WillReturn),
+         EFFECT(RuntimeEffect::RefCounting),
+         UNKNOWN_MEMEFFECTS)
+
+// void swift_release(void *ptr);
+FUNCTION(NativeStrongReleaseClient, Swift, swift_releaseClient, SwiftClientRR_CC, AlwaysAvailable,
+         RETURNS(VoidTy),
+         ARGS(RefCountedPtrTy),
+         ATTRS(NoUnwind),
+         EFFECT(RuntimeEffect::RefCounting, RuntimeEffect::Deallocating),
+         UNKNOWN_MEMEFFECTS)
+
 // void *swift_retain_n(void *ptr, int32_t n);
 FUNCTION(NativeStrongRetainN, Swift, swift_retain_n, C_CC, AlwaysAvailable,
          RETURNS(RefCountedPtrTy),
@@ -420,6 +436,24 @@ FUNCTION(BridgeObjectStrongRelease, Swift, swift_bridgeObjectRelease,
          EFFECT(RuntimeEffect::RefCounting, RuntimeEffect::Deallocating),
          UNKNOWN_MEMEFFECTS)

+// void *swift_bridgeObjectRetainClient(void *ptr);
+FUNCTION(BridgeObjectStrongRetainClient, Swift, swift_bridgeObjectRetainClient,
+         SwiftClientRR_CC, AlwaysAvailable,
+         RETURNS(BridgeObjectPtrTy),
+         ARGS(BridgeObjectPtrTy),
+         ATTRS(NoUnwind, FirstParamReturned),
+         EFFECT(RuntimeEffect::RefCounting),
+         UNKNOWN_MEMEFFECTS)
+
+// void *swift_bridgeObjectReleaseClient(void *ptr);
+FUNCTION(BridgeObjectStrongReleaseClient, Swift, swift_bridgeObjectReleaseClient,
+         SwiftClientRR_CC, AlwaysAvailable,
+         RETURNS(VoidTy),
+         ARGS(BridgeObjectPtrTy),
+         ATTRS(NoUnwind),
+         EFFECT(RuntimeEffect::RefCounting, RuntimeEffect::Deallocating),
+         UNKNOWN_MEMEFFECTS)
+
 // void *swift_nonatomic_bridgeObjectRetain(void *ptr);
 FUNCTION(NonAtomicBridgeObjectStrongRetain, Swift, swift_nonatomic_bridgeObjectRetain,
          C_CC, AlwaysAvailable,

lib/Frontend/CompilerInvocation.cpp

Lines changed: 5 additions & 0 deletions
@@ -3983,6 +3983,11 @@ static bool ParseIRGenArgs(IRGenOptions &Opts, ArgList &Args,

   Opts.MergeableTraps = Args.hasArg(OPT_mergeable_traps);

+  Opts.EnableClientRetainRelease =
+      Args.hasFlag(OPT_enable_client_retain_release,
+                   OPT_disable_client_retain_release,
+                   Opts.EnableClientRetainRelease);
+
   Opts.EnableObjectiveCProtocolSymbolicReferences =
       Args.hasFlag(OPT_enable_objective_c_protocol_symbolic_references,
                    OPT_disable_objective_c_protocol_symbolic_references,
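
The `hasFlag(enable, disable, default)` call in this hunk resolves an on/off flag pair: the last occurrence of either spelling wins, and the existing option value is kept when neither appears. A simplified standalone model of that behaviour is sketched below. `resolveFlagPair` and `demoFlagUsage` are hypothetical helpers written for illustration and are not part of LLVM's option library or the Swift frontend.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified model of an enable/disable flag pair: scan the arguments in
// order so that the last matching spelling decides the result, and fall
// back to the default when neither spelling is present.
inline bool resolveFlagPair(const std::vector<std::string> &args,
                            const std::string &enable,
                            const std::string &disable,
                            bool defaultValue) {
  bool result = defaultValue;
  for (const std::string &arg : args) {
    if (arg == enable)
      result = true;
    else if (arg == disable)
      result = false;
  }
  return result;
}

// Usage with the two flag spellings added by this commit.
inline void demoFlagUsage() {
  const std::string on = "-enable-client-retain-release";
  const std::string off = "-disable-client-retain-release";
  assert(!resolveFlagPair({}, on, off, false));        // default kept
  assert(resolveFlagPair({on}, on, off, false));       // explicitly enabled
  assert(!resolveFlagPair({on, off}, on, off, false)); // last flag wins
}
```

Passing the current option value as the default means a later change to the default in `IRGenOptions` would take effect without touching the argument parsing.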

lib/IRGen/GenHeap.cpp

Lines changed: 40 additions & 19 deletions
@@ -997,11 +997,16 @@ void IRGenFunction::emitNativeStrongRetain(llvm::Value *value,
   value = Builder.CreateBitCast(value, IGM.RefCountedPtrTy);

   // Emit the call.
-  llvm::CallInst *call = Builder.CreateCall(
-      (atomicity == Atomicity::Atomic)
-          ? IGM.getNativeStrongRetainFunctionPointer()
-          : IGM.getNativeNonAtomicStrongRetainFunctionPointer(),
-      value);
+  FunctionPointer function;
+  if (atomicity == Atomicity::Atomic &&
+      IGM.TargetInfo.HasSwiftClientRRLibrary &&
+      getOptions().EnableClientRetainRelease)
+    function = IGM.getNativeStrongRetainClientFunctionPointer();
+  else if (atomicity == Atomicity::Atomic)
+    function = IGM.getNativeStrongRetainFunctionPointer();
+  else
+    function = IGM.getNativeNonAtomicStrongRetainFunctionPointer();
+  llvm::CallInst *call = Builder.CreateCall(function, value);
   call->setDoesNotThrow();
   call->addParamAttr(0, llvm::Attribute::Returned);
 }
@@ -1257,10 +1262,16 @@ void IRGenFunction::emitNativeStrongRelease(llvm::Value *value,
                                             Atomicity atomicity) {
   if (doesNotRequireRefCounting(value))
     return;
-  emitUnaryRefCountCall(*this, (atomicity == Atomicity::Atomic)
-                                   ? IGM.getNativeStrongReleaseFn()
-                                   : IGM.getNativeNonAtomicStrongReleaseFn(),
-                        value);
+  llvm::Constant *function;
+  if (atomicity == Atomicity::Atomic &&
+      IGM.TargetInfo.HasSwiftClientRRLibrary &&
+      getOptions().EnableClientRetainRelease)
+    function = IGM.getNativeStrongReleaseClientFn();
+  else if (atomicity == Atomicity::Atomic)
+    function = IGM.getNativeStrongReleaseFn();
+  else
+    function = IGM.getNativeNonAtomicStrongReleaseFn();
+  emitUnaryRefCountCall(*this, function, value);
 }

 void IRGenFunction::emitNativeSetDeallocating(llvm::Value *value) {
@@ -1353,20 +1364,30 @@ void IRGenFunction::emitUnknownStrongRelease(llvm::Value *value,

 void IRGenFunction::emitBridgeStrongRetain(llvm::Value *value,
                                            Atomicity atomicity) {
-  emitUnaryRefCountCall(*this,
-                        (atomicity == Atomicity::Atomic)
-                            ? IGM.getBridgeObjectStrongRetainFn()
-                            : IGM.getNonAtomicBridgeObjectStrongRetainFn(),
-                        value);
+  llvm::Constant *function;
+  if (atomicity == Atomicity::Atomic &&
+      IGM.TargetInfo.HasSwiftClientRRLibrary &&
+      getOptions().EnableClientRetainRelease)
+    function = IGM.getBridgeObjectStrongRetainClientFn();
+  else if (atomicity == Atomicity::Atomic)
+    function = IGM.getBridgeObjectStrongRetainFn();
+  else
+    function = IGM.getNonAtomicBridgeObjectStrongRetainFn();
+  emitUnaryRefCountCall(*this, function, value);
 }

 void IRGenFunction::emitBridgeStrongRelease(llvm::Value *value,
                                             Atomicity atomicity) {
-  emitUnaryRefCountCall(*this,
-                        (atomicity == Atomicity::Atomic)
-                            ? IGM.getBridgeObjectStrongReleaseFn()
-                            : IGM.getNonAtomicBridgeObjectStrongReleaseFn(),
-                        value);
+  llvm::Constant *function;
+  if (atomicity == Atomicity::Atomic &&
+      IGM.TargetInfo.HasSwiftClientRRLibrary &&
+      getOptions().EnableClientRetainRelease)
+    function = IGM.getBridgeObjectStrongReleaseClientFn();
+  else if (atomicity == Atomicity::Atomic)
+    function = IGM.getBridgeObjectStrongReleaseFn();
+  else
+    function = IGM.getNonAtomicBridgeObjectStrongReleaseFn();
+  emitUnaryRefCountCall(*this, function, value);
 }

 void IRGenFunction::emitErrorStrongRetain(llvm::Value *value) {
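
All four functions changed in GenHeap.cpp make the same three-way choice: the client entry point is used only for atomic operations, and only when the target has the client library and the option is on. The condensed sketch below models that decision for the native retain case. It is an illustration only: `selectNativeRetain` and `demoSelectionUsage` are hypothetical, the function returns symbol names as strings instead of LLVM function pointers, and `swift_nonatomic_retain` is assumed here as the name of the existing non-atomic runtime entry point.

```cpp
#include <cassert>
#include <string>

enum class Atomicity { Atomic, NonAtomic };

// Hypothetical condensed model of the selection made in
// emitNativeStrongRetain: returns the name of the entry point to call.
inline std::string selectNativeRetain(Atomicity atomicity,
                                      bool targetHasClientRRLibrary,
                                      bool enableClientRetainRelease) {
  // The client fast path exists only for atomic refcounting, and needs both
  // target support and the frontend option.
  if (atomicity == Atomicity::Atomic && targetHasClientRRLibrary &&
      enableClientRetainRelease)
    return "swift_retainClient";
  if (atomicity == Atomicity::Atomic)
    return "swift_retain";
  return "swift_nonatomic_retain";
}

// Usage: non-atomic calls never take the client path, whatever the flags.
inline void demoSelectionUsage() {
  assert(selectNativeRetain(Atomicity::Atomic, true, true) ==
         "swift_retainClient");
  assert(selectNativeRetain(Atomicity::Atomic, true, false) == "swift_retain");
  assert(selectNativeRetain(Atomicity::NonAtomic, true, true) ==
         "swift_nonatomic_retain");
}
```

Since the same condition is repeated in four places in the diff, a shared predicate along these lines could be one way to keep the call sites in sync, though the commit does not do this.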

lib/IRGen/IRGenModule.cpp

Lines changed: 7 additions & 1 deletion
@@ -564,8 +564,9 @@ IRGenModule::IRGenModule(IRGenerator &irgen,
   InvariantMetadataID = getLLVMContext().getMDKindID("invariant.load");
   InvariantNode = llvm::MDNode::get(getLLVMContext(), {});
   DereferenceableID = getLLVMContext().getMDKindID("dereferenceable");
-
+
   C_CC = getOptions().PlatformCCallingConvention;
+  SwiftClientRR_CC = llvm::CallingConv::PreserveMost;
   // TODO: use "tinycc" on platforms that support it
   DefaultCC = SWIFT_DEFAULT_LLVM_CC;

@@ -1730,6 +1731,11 @@ void IRGenModule::addLinkLibraries() {
   registerLinkLibrary(
       LinkLibrary{"objc", LibraryKind::Library, /*static=*/false});

+  if (TargetInfo.HasSwiftClientRRLibrary &&
+      getOptions().EnableClientRetainRelease)
+    registerLinkLibrary(LinkLibrary{"swiftClientRetainRelease",
+                                    LibraryKind::Library, /*static=*/true});
+
   // If C++ interop is enabled, add -lc++ on Darwin and -lstdc++ on linux.
   // Also link with C++ bridging utility module (Cxx) and C++ stdlib overlay
   // (std) if available.
