forked from llvm/llvm-project
-
Notifications
You must be signed in to change notification settings - Fork 0
[AArch64][GlobalISel] Improve codegen for G_VECREDUCE_{SMIN,SMAX,UMIN,UMAX} for odd-sized vectors #2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
dc03-work
wants to merge
1
commit into
users/spr/dc03-work/main.aarch64globalisel-improve-codegen-for-g_vecreduce_sminsmaxuminumax-for-odd-sized-vectors
Choose a base branch
from
users/spr/dc03-work/aarch64globalisel-improve-codegen-for-g_vecreduce_sminsmaxuminumax-for-odd-sized-vectors
base: users/spr/dc03-work/main.aarch64globalisel-improve-codegen-for-g_vecreduce_sminsmaxuminumax-for-odd-sized-vectors
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Created using spr 1.3.5
dc03-work
pushed a commit
that referenced
this pull request
Feb 23, 2024
…lvm#80904)" This reverts commit b1ac052. This commit breaks coroutine splitting for non-swift calling convention functions. In this example: ```ll ; ModuleID = 'repro.ll' source_filename = "stdlib/test/runtime/test_llcl.mojo" target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" @0 = internal constant { i32, i32 } { i32 trunc (i64 sub (i64 ptrtoint (ptr @craSH to i64), i64 ptrtoint (ptr getelementptr inbounds ({ i32, i32 }, ptr @0, i32 0, i32 1) to i64)) to i32), i32 64 } define dso_local void @af_suspend_fn(ptr %0, i64 %1, ptr %2) #0 { ret void } define dso_local void @craSH(ptr %0) #0 { %2 = call token @llvm.coro.id.async(i32 64, i32 8, i32 0, ptr @0) %3 = call ptr @llvm.coro.begin(token %2, ptr null) %4 = getelementptr inbounds { ptr, { ptr, ptr }, i64, { ptr, i1 }, i64, i64 }, ptr poison, i32 0, i32 0 %5 = call ptr @llvm.coro.async.resume() store ptr %5, ptr %4, align 8 %6 = call { ptr, ptr, ptr } (i32, ptr, ptr, ...) @llvm.coro.suspend.async.sl_p0p0p0s(i32 0, ptr %5, ptr @ctxt_proj_fn, ptr @af_suspend_fn, ptr poison, i64 -1, ptr poison) ret void } define dso_local ptr @ctxt_proj_fn(ptr %0) #0 { ret ptr %0 } ; Function Attrs: nomerge nounwind declare { ptr, ptr, ptr } @llvm.coro.suspend.async.sl_p0p0p0s(i32, ptr, ptr, ...) #1 ; Function Attrs: nounwind declare token @llvm.coro.id.async(i32, i32, i32, ptr) #2 ; Function Attrs: nounwind declare ptr @llvm.coro.begin(token, ptr writeonly) #2 ; Function Attrs: nomerge nounwind declare ptr @llvm.coro.async.resume() #1 attributes #0 = { "target-features"="+adx,+aes,+avx,+avx2,+bmi,+bmi2,+clflushopt,+clwb,+clzero,+crc32,+cx16,+cx8,+f16c,+fma,+fsgsbase,+fxsr,+invpcid,+lzcnt,+mmx,+movbe,+mwaitx,+pclmul,+pku,+popcnt,+prfchw,+rdpid,+rdpru,+rdrnd,+rdseed,+sahf,+sha,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+sse4a,+ssse3,+vaes,+vpclmulqdq,+wbnoinvd,+x87,+xsave,+xsavec,+xsaveopt,+xsaves" } attributes #1 = { nomerge nounwind } attributes #2 = { nounwind } ``` This verifier crashes after the `coro-split` pass with ``` cannot guarantee tail call due to mismatched parameter counts musttail call void @af_suspend_fn(ptr poison, i64 -1, ptr poison) LLVM ERROR: Broken function PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace. Stack dump: 0. Program arguments: opt ../../../reduced.ll -O0 #0 0x00007f1d89645c0e __interceptor_backtrace.part.0 /build/gcc-11-XeT9lY/gcc-11-11.4.0/build/x86_64-linux-gnu/libsanitizer/asan/../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:4193:28 #1 0x0000556d94d254f7 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:22 #2 0x0000556d94d19a2f llvm::sys::RunSignalHandlers() /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/Signals.cpp:105:20 llvm#3 0x0000556d94d1aa42 SignalHandler(int) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/Unix/Signals.inc:371:36 llvm#4 0x00007f1d88e42520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520) llvm#5 0x00007f1d88e969fc __pthread_kill_implementation ./nptl/pthread_kill.c:44:76 llvm#6 0x00007f1d88e969fc __pthread_kill_internal ./nptl/pthread_kill.c:78:10 llvm#7 0x00007f1d88e969fc pthread_kill ./nptl/pthread_kill.c:89:10 llvm#8 0x00007f1d88e42476 gsignal ./signal/../sysdeps/posix/raise.c:27:6 llvm#9 0x00007f1d88e287f3 abort ./stdlib/abort.c:81:7 llvm#10 0x0000556d8944be01 std::vector<llvm::json::Value, std::allocator<llvm::json::Value>>::size() const /usr/include/c++/11/bits/stl_vector.h:919:40 llvm#11 0x0000556d8944be01 bool std::operator==<llvm::json::Value, std::allocator<llvm::json::Value>>(std::vector<llvm::json::Value, std::allocator<llvm::json::Value>> const&, std::vector<llvm::json::Value, std::allocator<llvm::json::Value>> const&) /usr/include/c++/11/bits/stl_vector.h:1893:23 llvm#12 0x0000556d8944be01 llvm::json::operator==(llvm::json::Array const&, llvm::json::Array const&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/Support/JSON.h:572:69 llvm#13 0x0000556d8944be01 llvm::json::operator==(llvm::json::Value const&, llvm::json::Value const&) (.cold) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/JSON.cpp:204:28 llvm#14 0x0000556d949ed2bd llvm::report_fatal_error(char const*, bool) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Support/ErrorHandling.cpp:82:70 llvm#15 0x0000556d8e37e876 llvm::SmallVectorBase<unsigned int>::size() const /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallVector.h:91:32 llvm#16 0x0000556d8e37e876 llvm::SmallVectorTemplateCommon<llvm::DiagnosticInfoOptimizationBase::Argument, void>::end() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallVector.h:282:41 llvm#17 0x0000556d8e37e876 llvm::SmallVector<llvm::DiagnosticInfoOptimizationBase::Argument, 4u>::~SmallVector() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallVector.h:1215:24 llvm#18 0x0000556d8e37e876 llvm::DiagnosticInfoOptimizationBase::~DiagnosticInfoOptimizationBase() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/DiagnosticInfo.h:413:7 llvm#19 0x0000556d8e37e876 llvm::DiagnosticInfoIROptimization::~DiagnosticInfoIROptimization() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/DiagnosticInfo.h:622:7 llvm#20 0x0000556d8e37e876 llvm::OptimizationRemark::~OptimizationRemark() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/DiagnosticInfo.h:689:7 llvm#21 0x0000556d8e37e876 operator() /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Transforms/Coroutines/CoroSplit.cpp:2213:14 llvm#22 0x0000556d8e37e876 emit<llvm::CoroSplitPass::run(llvm::LazyCallGraph::SCC&, llvm::CGSCCAnalysisManager&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&)::<lambda()> > /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/Analysis/OptimizationRemarkEmitter.h:83:12 llvm#23 0x0000556d8e37e876 llvm::CoroSplitPass::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Transforms/Coroutines/CoroSplit.cpp:2212:13 llvm#24 0x0000556d8c36ecb1 llvm::detail::PassModel<llvm::LazyCallGraph::SCC, llvm::CoroSplitPass, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:3 llvm#25 0x0000556d91c1a84f llvm::PassManager<llvm::LazyCallGraph::SCC, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Analysis/CGSCCPassManager.cpp:90:12 llvm#26 0x0000556d8c3690d1 llvm::detail::PassModel<llvm::LazyCallGraph::SCC, llvm::PassManager<llvm::LazyCallGraph::SCC, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager<llvm::LazyCallGraph::SCC, llvm::LazyCallGraph&>&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:3 llvm#27 0x0000556d91c2162d llvm::ModuleToPostOrderCGSCCPassAdaptor::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Analysis/CGSCCPassManager.cpp:278:18 llvm#28 0x0000556d8c369035 llvm::detail::PassModel<llvm::Module, llvm::ModuleToPostOrderCGSCCPassAdaptor, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:3 llvm#29 0x0000556d9457abc5 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManager.h:247:20 llvm#30 0x0000556d8e30979e llvm::CoroConditionalWrapper::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/lib/Transforms/Coroutines/CoroConditionalWrapper.cpp:19:74 llvm#31 0x0000556d8c365755 llvm::detail::PassModel<llvm::Module, llvm::CoroConditionalWrapper, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:91:3 llvm#32 0x0000556d9457abc5 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/PassManager.h:247:20 llvm#33 0x0000556d89818556 llvm::SmallPtrSetImplBase::isSmall() const /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:196:33 llvm#34 0x0000556d89818556 llvm::SmallPtrSetImplBase::~SmallPtrSetImplBase() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:84:17 llvm#35 0x0000556d89818556 llvm::SmallPtrSetImpl<llvm::AnalysisKey*>::~SmallPtrSetImpl() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:321:7 llvm#36 0x0000556d89818556 llvm::SmallPtrSet<llvm::AnalysisKey*, 2u>::~SmallPtrSet() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/ADT/SmallPtrSet.h:427:7 llvm#37 0x0000556d89818556 llvm::PreservedAnalyses::~PreservedAnalyses() /home/ubuntu/modular/third-party/llvm-project/llvm/include/llvm/IR/Analysis.h:109:7 llvm#38 0x0000556d89818556 llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::PassPlugin>, llvm::ArrayRef<std::function<void (llvm::PassBuilder&)>>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool, bool, bool) /home/ubuntu/modular/third-party/llvm-project/llvm/tools/opt/NewPMDriver.cpp:532:10 llvm#39 0x0000556d897e3939 optMain /home/ubuntu/modular/third-party/llvm-project/llvm/tools/opt/optdriver.cpp:737:27 llvm#40 0x0000556d89455461 main /home/ubuntu/modular/third-party/llvm-project/llvm/tools/opt/opt.cpp:25:33 llvm#41 0x00007f1d88e29d90 __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16 llvm#42 0x00007f1d88e29e40 call_init ./csu/../csu/libc-start.c:128:20 llvm#43 0x00007f1d88e29e40 __libc_start_main ./csu/../csu/libc-start.c:379:5 llvm#44 0x0000556d897b6335 _start (/home/ubuntu/modular/.derived/third-party/llvm-project/build-relwithdebinfo-asan/bin/opt+0x150c335) Aborted (core dumped)
dc03-work
pushed a commit
that referenced
this pull request
Mar 8, 2024
…ter partial ordering when determining primary template (llvm#82417) Consider the following: ``` struct A { static constexpr bool x = true; }; template<typename T, typename U> void f(T, U) noexcept(T::y); // #1, error: no member named 'y' in 'A' template<typename T, typename U> void f(T, U*) noexcept(T::x); // #2 template<> void f(A, int*) noexcept; // explicit specialization of #2 ``` We currently instantiate the exception specification of all candidate function template specializations when deducting template arguments for an explicit specialization, which results in a error despite `#1` not being selected by partial ordering as the most specialized template. According to [except.spec] p13: > An exception specification is considered to be needed when: > - [...] > - the exception specification is compared to that of another declaration (e.g., an explicit specialization or an overriding virtual function); Assuming that "comparing declarations" means "determining whether the declarations correspond and declare the same entity" (per [basic.scope.scope] p4 and [basic.link] p11.1, respectively), the exception specification does _not_ need to be instantiated until _after_ partial ordering, at which point we determine whether the implicitly instantiated specialization and the explicit specialization declare the same entity (the determination of whether two functions/function templates correspond does not consider the exception specifications). This patch defers the instantiation of the exception specification until a single function template specialization is selected via partial ordering, matching the behavior of GCC, EDG, and MSVC: see https://godbolt.org/z/Ebb6GTcWE.
dc03-work
pushed a commit
that referenced
this pull request
Mar 11, 2024
TestCases/Misc/Linux/sigaction.cpp fails because dlsym() may call malloc on failure. And then the wrapped malloc appears to access thread local storage using global dynamic accesses, thus calling ___interceptor___tls_get_addr, before REAL(__tls_get_addr) has been set, so we get a crash inside ___interceptor___tls_get_addr. For example, this can happen when looking up __isoc23_scanf which might not exist in some libcs. Fix this by marking the thread local variable accessed inside the debug checks as "initial-exec", which does not require __tls_get_addr. This is probably a better alternative to llvm#83886. This fixes a different crash but is related to llvm#46204. Backtrace: ``` #0 0x0000000000000000 in ?? () #1 0x00007ffff6a9d89e in ___interceptor___tls_get_addr (arg=0x7ffff6b27be8) at /path/to/llvm/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:2759 #2 0x00007ffff6a46bc6 in __sanitizer::CheckedMutex::LockImpl (this=0x7ffff6b27be8, pc=140737331846066) at /path/to/llvm/compiler-rt/lib/sanitizer_common/sanitizer_mutex.cpp:218 llvm#3 0x00007ffff6a448b2 in __sanitizer::CheckedMutex::Lock (this=0x7ffff6b27be8, this@entry=0x730000000580) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_mutex.h:129 llvm#4 __sanitizer::Mutex::Lock (this=0x7ffff6b27be8, this@entry=0x730000000580) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_mutex.h:167 llvm#5 0x00007ffff6abdbb2 in __sanitizer::GenericScopedLock<__sanitizer::Mutex>::GenericScopedLock (mu=0x730000000580, this=<optimized out>) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_mutex.h:383 llvm#6 __sanitizer::SizeClassAllocator64<__tsan::AP64>::GetFromAllocator (this=0x7ffff7487dc0 <__tsan::allocator_placeholder>, stat=stat@entry=0x7ffff570db68, class_id=11, chunks=chunks@entry=0x7ffff5702cc8, n_chunks=n_chunks@entry=128) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_allocator_primary64.h:207 llvm#7 0x00007ffff6abdaa0 in __sanitizer::SizeClassAllocator64LocalCache<__sanitizer::SizeClassAllocator64<__tsan::AP64> >::Refill (this=<optimized out>, c=c@entry=0x7ffff5702cb8, allocator=<optimized out>, class_id=<optimized out>) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_allocator_local_cache.h:103 llvm#8 0x00007ffff6abd731 in __sanitizer::SizeClassAllocator64LocalCache<__sanitizer::SizeClassAllocator64<__tsan::AP64> >::Allocate (this=0x7ffff6b27be8, allocator=0x7ffff5702cc8, class_id=140737311157448) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_allocator_local_cache.h:39 llvm#9 0x00007ffff6abc397 in __sanitizer::CombinedAllocator<__sanitizer::SizeClassAllocator64<__tsan::AP64>, __sanitizer::LargeMmapAllocatorPtrArrayDynamic>::Allocate (this=0x7ffff5702cc8, cache=0x7ffff6b27be8, size=<optimized out>, size@entry=175, alignment=alignment@entry=16) at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_allocator_combined.h:69 llvm#10 0x00007ffff6abaa6a in __tsan::user_alloc_internal (thr=0x7ffff7ebd980, pc=140737331499943, sz=sz@entry=175, align=align@entry=16, signal=true) at /path/to/llvm/compiler-rt/lib/tsan/rtl/tsan_mman.cpp:198 llvm#11 0x00007ffff6abb0d1 in __tsan::user_alloc (thr=0x7ffff6b27be8, pc=140737331846066, sz=11, sz@entry=175) at /path/to/llvm/compiler-rt/lib/tsan/rtl/tsan_mman.cpp:223 llvm#12 0x00007ffff6a693b5 in ___interceptor_malloc (size=175) at /path/to/llvm/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:666 llvm#13 0x00007ffff7fce7f2 in malloc (size=175) at ../include/rtld-malloc.h:56 llvm#14 __GI__dl_exception_create_format (exception=exception@entry=0x7fffffffd0d0, objname=0x7ffff7fc3550 "/path/to/llvm/compiler-rt/cmake-build-all-sanitizers/lib/linux/libclang_rt.tsan-x86_64.so", fmt=fmt@entry=0x7ffff7ff2db9 "undefined symbol: %s%s%s") at ./elf/dl-exception.c:157 llvm#15 0x00007ffff7fd50e8 in _dl_lookup_symbol_x (undef_name=0x7ffff6af868b "__isoc23_scanf", undef_map=<optimized out>, ref=0x7fffffffd148, symbol_scope=<optimized out>, version=<optimized out>, type_class=0, flags=2, skip_map=0x7ffff7fc35e0) at ./elf/dl-lookup.c:793 --Type <RET> for more, q to quit, c to continue without paging-- llvm#16 0x00007ffff656d6ed in do_sym (handle=<optimized out>, name=0x7ffff6af868b "__isoc23_scanf", who=0x7ffff6a3bb84 <__interception::InterceptFunction(char const*, unsigned long*, unsigned long, unsigned long)+36>, vers=vers@entry=0x0, flags=flags@entry=2) at ./elf/dl-sym.c:146 llvm#17 0x00007ffff656d9dd in _dl_sym (handle=<optimized out>, name=<optimized out>, who=<optimized out>) at ./elf/dl-sym.c:195 llvm#18 0x00007ffff64a2854 in dlsym_doit (a=a@entry=0x7fffffffd3b0) at ./dlfcn/dlsym.c:40 llvm#19 0x00007ffff7fcc489 in __GI__dl_catch_exception (exception=exception@entry=0x7fffffffd310, operate=0x7ffff64a2840 <dlsym_doit>, args=0x7fffffffd3b0) at ./elf/dl-catch.c:237 llvm#20 0x00007ffff7fcc5af in _dl_catch_error (objname=0x7fffffffd368, errstring=0x7fffffffd370, mallocedp=0x7fffffffd367, operate=<optimized out>, args=<optimized out>) at ./elf/dl-catch.c:256 llvm#21 0x00007ffff64a2257 in _dlerror_run (operate=operate@entry=0x7ffff64a2840 <dlsym_doit>, args=args@entry=0x7fffffffd3b0) at ./dlfcn/dlerror.c:138 llvm#22 0x00007ffff64a28e5 in dlsym_implementation (dl_caller=<optimized out>, name=<optimized out>, handle=<optimized out>) at ./dlfcn/dlsym.c:54 llvm#23 ___dlsym (handle=<optimized out>, name=<optimized out>) at ./dlfcn/dlsym.c:68 llvm#24 0x00007ffff6a3bb84 in __interception::GetFuncAddr (name=0x7ffff6af868b "__isoc23_scanf", trampoline=140737311157448) at /path/to/llvm/compiler-rt/lib/interception/interception_linux.cpp:42 llvm#25 __interception::InterceptFunction (name=0x7ffff6af868b "__isoc23_scanf", ptr_to_real=0x7ffff74850e8 <__interception::real___isoc23_scanf>, func=11, trampoline=140737311157448) at /path/to/llvm/compiler-rt/lib/interception/interception_linux.cpp:61 llvm#26 0x00007ffff6a9f2d9 in InitializeCommonInterceptors () at /path/to/llvm/compiler-rt/lib/tsan/rtl/../../sanitizer_common/sanitizer_common_interceptors.inc:10315 ``` Reviewed By: vitalybuka, MaskRay Pull Request: llvm#83890
dc03-work
pushed a commit
that referenced
this pull request
May 14, 2024
…e exception specification of a function (llvm#90760) [temp.deduct.general] p6 states: > At certain points in the template argument deduction process it is necessary to take a function type that makes use of template parameters and replace those template parameters with the corresponding template arguments. This is done at the beginning of template argument deduction when any explicitly specified template arguments are substituted into the function type, and again at the end of template argument deduction when any template arguments that were deduced or obtained from default arguments are substituted. [temp.deduct.general] p7 goes on to say: > The _deduction substitution loci_ are > - the function type outside of the _noexcept-specifier_, > - the explicit-specifier, > - the template parameter declarations, and > - the template argument list of a partial specialization > > The substitution occurs in all types and expressions that are used in the deduction substitution loci. [...] Consider the following: ```cpp struct A { static constexpr bool x = true; }; template<typename T, typename U> void f(T, U) noexcept(T::x); // #1 template<typename T, typename U> void f(T, U*) noexcept(T::y); // #2 template<> void f<A>(A, int*) noexcept; // clang currently accepts, GCC and EDG reject ``` Currently, `Sema::SubstituteExplicitTemplateArguments` will substitute into the _noexcept-specifier_ when deducing template arguments from a function declaration or when deducing template arguments for taking the address of a function template (and the substitution is treated as a SFINAE context). In the above example, `#1` is selected as the primary template because substitution of the explicit template arguments into the _noexcept-specifier_ of `#2` failed, which resulted in the candidate being ignored. This behavior is incorrect ([temp.deduct.general] note 4 says as much), and this patch corrects it by deferring all substitution into the _noexcept-specifier_ until it is instantiated. As part of the necessary changes to make this patch work, the instantiation of the exception specification of a function template specialization when taking the address of a function template is changed to only occur for the function selected by overload resolution per [except.spec] p13.1 (as opposed to being instantiated for every candidate).
dc03-work
pushed a commit
that referenced
this pull request
May 14, 2024
…ined member functions & member function templates (llvm#88963) Consider the following snippet from the discussion of CWG2847 on the core reflector: ``` template<typename T> concept C = sizeof(T) <= sizeof(long); template<typename T> struct A { template<typename U> void f(U) requires C<U>; // #1, declares a function template void g() requires C<T>; // #2, declares a function template<> void f(char); // llvm#3, an explicit specialization of a function template that declares a function }; template<> template<typename U> void A<short>::f(U) requires C<U>; // llvm#4, an explicit specialization of a function template that declares a function template template<> template<> void A<int>::f(int); // llvm#5, an explicit specialization of a function template that declares a function template<> void A<long>::g(); // llvm#6, an explicit specialization of a function that declares a function ``` A number of problems exist: - Clang rejects `llvm#4` because the trailing _requires-clause_ has `U` substituted with the wrong template parameter depth when `Sema::AreConstraintExpressionsEqual` is called to determine whether it matches the trailing _requires-clause_ of the implicitly instantiated function template. - Clang rejects `llvm#5` because the function template specialization instantiated from `A<int>::f` has a trailing _requires-clause_, but `llvm#5` does not (nor can it have one as it isn't a templated function). - Clang rejects `llvm#6` for the same reasons it rejects `llvm#5`. This patch resolves these issues by making the following changes: - To fix `llvm#4`, `Sema::AreConstraintExpressionsEqual` is passed `FunctionTemplateDecl`s when comparing the trailing _requires-clauses_ of `llvm#4` and the function template instantiated from `#1`. - To fix `llvm#5` and `llvm#6`, the trailing _requires-clauses_ are not compared for explicit specializations that declare functions. In addition to these changes, `CheckMemberSpecialization` now considers constraint satisfaction/constraint partial ordering when determining which member function is specialized by an explicit specialization of a member function for an implicit instantiation of a class template (we previously would select the first function that has the same type as the explicit specialization). With constraints taken under consideration, we match EDG's behavior for these declarations.
dc03-work
pushed a commit
that referenced
this pull request
May 20, 2024
...which caused issues like > ==42==ERROR: AddressSanitizer failed to deallocate 0x32 (50) bytes at address 0x117e0000 (error code: 28) > ==42==Cannot dump memory map on emscriptenAddressSanitizer: CHECK failed: sanitizer_common.cpp:81 "((0 && "unable to unmmap")) != (0)" (0x0, 0x0) (tid=288045824) > #0 0x14f73b0c in __asan::CheckUnwind()+0x14f73b0c (this.program+0x14f73b0c) > #1 0x14f8a3c2 in __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long)+0x14f8a3c2 (this.program+0x14f8a3c2) > #2 0x14f7d6e1 in __sanitizer::ReportMunmapFailureAndDie(void*, unsigned long, int, bool)+0x14f7d6e1 (this.program+0x14f7d6e1) > llvm#3 0x14f81fbd in __sanitizer::UnmapOrDie(void*, unsigned long)+0x14f81fbd (this.program+0x14f81fbd) > llvm#4 0x14f875df in __sanitizer::SuppressionContext::ParseFromFile(char const*)+0x14f875df (this.program+0x14f875df) > llvm#5 0x14f74eab in __asan::InitializeSuppressions()+0x14f74eab (this.program+0x14f74eab) > llvm#6 0x14f73a1a in __asan::AsanInitInternal()+0x14f73a1a (this.program+0x14f73a1a) when trying to use an ASan suppressions file under Emscripten: Even though it would be considered OK by SUSv4, the Emscripten runtime states "We don't support partial munmapping" (see <emscripten-core/emscripten@f4115eb> "Implement MAP_ANONYMOUS on top of malloc in STANDALONE_WASM mode (llvm#16289)"). Co-authored-by: Stephan Bergmann <stephan.bergmann@allotropia.de>
dc03-work
pushed a commit
that referenced
this pull request
May 20, 2024
…ication as used during partial ordering (llvm#91534) We do not deduce template arguments from the exception specification when determining the primary template of a function template specialization or when taking the address of a function template. Therefore, this patch changes `isAtLeastAsSpecializedAs` such that we do not mark template parameters in the exception specification as 'used' during partial ordering (per [temp.deduct.partial] p12) to prevent the following from being ambiguous: ``` template<typename T, typename U> void f(U) noexcept(noexcept(T())); // #1 template<typename T> void f(T*) noexcept; // #2 template<> void f<int>(int*) noexcept; // currently ambiguous, selects #2 with this patch applied ``` Although there is no corresponding wording in the standard (see core issue filed here cplusplus/CWG#537), this seems to be the intended behavior given the definition of _deduction substitution loci_ in [temp.deduct.general] p7 (and EDG does the same thing).
dc03-work
pushed a commit
that referenced
this pull request
May 20, 2024
…erSize (llvm#67657)" This reverts commit f0b3654. This commit triggers UB by reading an uninitialized variable. `UP.PartialThreshold` is used uninitialized in `getUnrollingPreferences()` when it is called from `LoopVectorizationPlanner::executePlan()`. In this case the `UP` variable is created on the stack and its fields are not initialized. ``` ==8802==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x557c0b081b99 in llvm::BasicTTIImplBase<llvm::X86TTIImpl>::getUnrollingPreferences(llvm::Loop*, llvm::ScalarEvolution&, llvm::TargetTransformInfo::UnrollingPreferences&, llvm::OptimizationRemarkEmitter*) llvm-project/llvm/include/llvm/CodeGen/BasicTTIImpl.h #1 0x557c0b07a40c in llvm::TargetTransformInfo::Model<llvm::X86TTIImpl>::getUnrollingPreferences(llvm::Loop*, llvm::ScalarEvolution&, llvm::TargetTransformInfo::UnrollingPreferences&, llvm::OptimizationRemarkEmitter*) llvm-project/llvm/include/llvm/Analysis/TargetTransformInfo.h:2277:17 #2 0x557c0f5d69ee in llvm::TargetTransformInfo::getUnrollingPreferences(llvm::Loop*, llvm::ScalarEvolution&, llvm::TargetTransformInfo::UnrollingPreferences&, llvm::OptimizationRemarkEmitter*) const llvm-project/llvm/lib/Analysis/TargetTransformInfo.cpp:387:19 llvm#3 0x557c0e6b96a0 in llvm::LoopVectorizationPlanner::executePlan(llvm::ElementCount, unsigned int, llvm::VPlan&, llvm::InnerLoopVectorizer&, llvm::DominatorTree*, bool, llvm::DenseMap<llvm::SCEV const*, llvm::Value*, llvm::DenseMapInfo<llvm::SCEV const*, void>, llvm::detail::DenseMapPair<llvm::SCEV const*, llvm::Value*>> const*) llvm-project/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7624:7 llvm#4 0x557c0e6e4b63 in llvm::LoopVectorizePass::processLoop(llvm::Loop*) llvm-project/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:10253:13 llvm#5 0x557c0e6f2429 in llvm::LoopVectorizePass::runImpl(llvm::Function&, llvm::ScalarEvolution&, llvm::LoopInfo&, llvm::TargetTransformInfo&, llvm::DominatorTree&, llvm::BlockFrequencyInfo*, llvm::TargetLibraryInfo*, llvm::DemandedBits&, llvm::AssumptionCache&, llvm::LoopAccessInfoManager&, llvm::OptimizationRemarkEmitter&, llvm::ProfileSummaryInfo*) llvm-project/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:10344:30 llvm#6 0x557c0e6f2f97 in llvm::LoopVectorizePass::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) llvm-project/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:10383:9 [...] Uninitialized value was created by an allocation of 'UP' in the stack frame #0 0x557c0e6b961e in llvm::LoopVectorizationPlanner::executePlan(llvm::ElementCount, unsigned int, llvm::VPlan&, llvm::InnerLoopVectorizer&, llvm::DominatorTree*, bool, llvm::DenseMap<llvm::SCEV const*, llvm::Value*, llvm::DenseMapInfo<llvm::SCEV const*, void>, llvm::detail::DenseMapPair<llvm::SCEV const*, llvm::Value*>> const*) llvm-project/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7623:3 ```
dc03-work
pushed a commit
that referenced
this pull request
May 20, 2024
…vm#90820) This solves some ambuguity introduced in P0522 regarding how template template parameters are partially ordered, and should reduce the negative impact of enabling `-frelaxed-template-template-args` by default. When performing template argument deduction, a template template parameter containing no packs should be more specialized than one that does. Given the following example: ```C++ template<class T2> struct A; template<template<class ...T3s> class TT1, class T4> struct A<TT1<T4>>; // #1 template<template<class T5 > class TT2, class T6> struct A<TT2<T6>>; // #2 template<class T1> struct B; template struct A<B<char>>; ``` Prior to P0522, candidate `#2` would be more specialized. After P0522, neither is more specialized, so this becomes ambiguous. With this change, `#2` becomes more specialized again, maintaining compatibility with pre-P0522 implementations. The problem is that in P0522, candidates are at least as specialized when matching packs to fixed-size lists both ways, whereas before, a fixed-size list is more specialized. This patch keeps the original behavior when checking template arguments outside deduction, but restores this aspect of pre-P0522 matching during deduction. --- Since this changes provisional implementation of CWG2398 which has not been released yet, and already contains a changelog entry, we don't provide a changelog entry here.
dc03-work
pushed a commit
that referenced
this pull request
May 23, 2024
'reduction' has a few restrictions over normal 'var-list' clauses: 1- On parallel, a num_gangs can only have 1 argument when combined with reduction. These two aren't able to be combined on any other of the compute constructs however. 2- The vars all must be 'numerical data types' types of some sort, or a 'composite of numerical data types'. A list of types is given in the standard as a minimum, so we choose 'isScalar', which covers all of these types and keeps types that are actually numeric. Other compilers don't seem to implement the 'composite of numerical data types', though we do. 3- Because of the above restrictions, member-of-composite is not allowed, so any access via a memberexpr is disallowed. Array-element and sub-arrays (aka array sections) are both permitted, so long as they meet the requirements of #2. This patch implements all of these for compute constructs.
dc03-work
pushed a commit
that referenced
this pull request
May 23, 2024
…llvm#92855) This solves some ambuguity introduced in P0522 regarding how template template parameters are partially ordered, and should reduce the negative impact of enabling `-frelaxed-template-template-args` by default. When performing template argument deduction, we extend the provisional wording introduced in llvm#89807 so it also covers deduction of class templates. Given the following example: ```C++ template <class T1, class T2 = float> struct A; template <class T3> struct B; template <template <class T4> class TT1, class T5> struct B<TT1<T5>>; // #1 template <class T6, class T7> struct B<A<T6, T7>>; // #2 template struct B<A<int>>; ``` Prior to P0522, `#2` was picked. Afterwards, this became ambiguous. This patch restores the pre-P0522 behavior, `#2` is picked again. This has the beneficial side effect of making the following code valid: ```C++ template<class T, class U> struct A {}; A<int, float> v; template<template<class> class TT> void f(TT<int>); // OK: TT picks 'float' as the default argument for the second parameter. void g() { f(v); } ``` --- Since this changes provisional implementation of CWG2398 which has not been released yet, and already contains a changelog entry, we don't provide a changelog entry here.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
i8 vectors do not have their sizes changed as I noticed regressions in some
tests when that was done.