
valgrind#4

Open

Nahom-123 wants to merge 824 commits into feature/darwin-malloc_zone-test from master

Conversation

@Nahom-123

Valgrind

petar-jovanovic and others added 30 commits December 17, 2019 17:08
Define PLAT_mips32_linux if __mips==32 rather than if __mips!=64.

Patch by Rosen Penev <rosenp@gmail.com>.
Fix a bug found in SysRes VG_(do_sys_sigprocmask) for a specific use case.

Fix for the case when the "set" parameter is NULL.
In this case the "how" parameter should be ignored, because we are
only asking the kernel to put the current signal mask into "oldset".
But instead we determine the action based on the "how" parameter and
therefore make the system call fail when it should succeed.
This behaviour is taken from the Linux man pages (sigprocmask).

The same behaviour is specified by POSIX.

https://bugs.kde.org/show_bug.cgi?id=414565
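For illustration, a minimal sketch (not part of the patch) of the
POSIX-sanctioned query idiom that the buggy wrapper could reject:

   #include <signal.h>
   #include <stdio.h>

   int main(void)
   {
      sigset_t oldset;
      /* With set == NULL the kernel ignores "how" and only writes the
         current signal mask to "oldset"; under the buggy wrapper this
         query could fail with EINVAL. */
      if (sigprocmask(SIG_SETMASK, NULL, &oldset) != 0) {
         perror("sigprocmask");
         return 1;
      }
      printf("queried current signal mask OK\n");
      return 0;
   }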
Integrate the test case written by Nikola Milutinovic into the
testsuite. (https://bugs.kde.org/show_bug.cgi?id=414565)
Patch from Assad Hashmi <assad.hashmi@linaro.org>.

This patch adds support for AArch64 ARMv8.1 SIMD instructions:
SQRDMLAH <V><d>, <V><n>, <V><m>
SQRDMLAH <Vd>.<T>, <Vn>.<T>, <Vm>.<T>
SQRDMLAH <V><d>, <V><n>, <Vm>.<Ts>[<index>]
SQRDMLAH <Vd>.<T>, <Vn>.<T>, <Vm>.<Ts>[<index>]
SQRDMLSH <V><d>, <V><n>, <V><m>
SQRDMLSH <Vd>.<T>, <Vn>.<T>, <Vm>.<T>
SQRDMLSH <V><d>, <V><n>, <Vm>.<Ts>[<index>]
SQRDMLSH <Vd>.<T>, <Vn>.<T>, <Vm>.<Ts>[<index>]
…_GSET. n-i-bz.

For the case ETHTOOL_GSET, don't insist that the whole structure is defined.
That appears to cause false positives.  All other cases remain unchanged.
Patches from Miroslav Lichvar <mlichvar@redhat.com>.
Patch from Nick Black <dankamongmen@gmail.com>.
Patches from Simon Richter <Simon.Richter@hogyros.de>.
Necessary changes to support nanoMIPS on Linux.

Part 3/4 - Coregrind and tools changes

Patch by Aleksandar Rikalo, Dimitrije Nikolic, Tamara Vlahovic,
Nikola Milutinovic and Aleksandra Karadzic.

Related KDE issue: #400872.
Necessary changes to support nanoMIPS on Linux.

Part 4/4 - Other changes (mainly include/*)

Patch by Aleksandar Rikalo, Dimitrije Nikolic, Tamara Vlahovic,
Nikola Milutinovic and Aleksandra Karadzic.

Related KDE issue: #400872.
Add
  /none/tests/sigprocmask

to .gitignore.
Update the tests so they can be compiled for nanoMIPS.

Patch by Dimitrije Nikolic and Aleksandra Karadzic.
This branch contains code which avoids Memcheck false positives resulting from
gcc and clang creating branches on uninitialised data.  For example:

   bool isClosed;
   if (src.isRect(..., &isClosed, ...) && isClosed) {

clang9 -O2 compiles this as:

   callq  7e7cdc0 <_ZNK6SkPath6isRectEP6SkRectPbPNS_9DirectionE>

   cmpb   $0x0,-0x60(%rbp)  // "if (isClosed) { .."
   je     7ed9e08           // "je after"

   test   %al,%al           // "if (return value of call is nonzero) { .."
   je     7ed9e08           // "je after"

   ..
   after:

That is, the && has been evaluated right-to-left.  This is a correct
transformation if the compiler can prove that the call to |isRect| returns
|false| along any path on which it does not write its out-parameter
|&isClosed|.

In general, for the lazy-semantics (L->R) C-source-level && operator, we have
|A && B| == |B && A| if you can prove that |B| is |false| whenever |A| is
undefined.  I assume that clang has some kind of interprocedural analysis that
tells it that.  The compiler is further obliged to show that |B| won't trap,
since it is now being evaluated speculatively, but that's no big deal to
prove.

A similar result holds, per de Morgan, for transformations involving the C
language ||.

Memcheck correctly handles bitwise &&/|| in the presence of undefined inputs.
It has done so since the beginning.  However, it assumes that every
conditional branch in the program is important -- any branch on uninitialised
data is an error.  This idiom demonstrates otherwise.  It defeats
Memcheck's existing &&/|| handling because the &&/|| is spread across two
basic blocks, rather than being bitwise.
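For comparison, a minimal sketch (not from the source tree) of the bitwise
form that Memcheck's existing instrumentation already handles accurately:

   int both(int a, int b)
   {
      /* Both operands are evaluated and combined with a single bitwise
         operation; there is only one potential use of the combined value.
         Memcheck's definedness tracking of '&' knows that a defined 0 in
         either operand makes the result a defined 0, regardless of the
         other operand. */
      return (a != 0) & (b != 0);
   }

The idiom above, by contrast, spreads the conjunction across two branches
in two basic blocks.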

This commit contains a complete initial implementation to fix that.
The basic idea is to detect the && condition spread across two blocks, and
transform it into a single block using bitwise &&.  Then Memcheck's existing
accurate instrumentation of bitwise && will correctly handle it.  The
transformation is

   <contents of basic block A>
   C1 = ...
   if (!C1) goto after
   .. falls through to ..

   <contents of basic block B>
   C2 = ...
   if (!C2) goto after
   .. falls through to ..

   after:

 ===>

   <contents of basic block A>
   C1 = ...
   <contents of basic block B, conditional on C1>
   C2 = ...
   if (!(C1 && C2)) goto after
   .. falls through to ..

   after:

This assumes that <contents of basic block B> can be conditionalised, at the
IR level, so that the guest state is not modified if C1 is |false|.  That's
not possible for all IRStmt kinds, but it is possible for a large enough
subset to make this transformation feasible.
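As a concrete illustration, here is a hedged sketch (using constructor
names from VEX's public IR API in libvex_ir.h; the real pass handles many
more statement kinds) of how a guest-state write from block B can be
conditionalised on block A's condition C1:

   /* Rewrite "Put(offs) = newval" from block B as a conditional move:
      if C1 is false, the Put writes back the old value, so the guest
      state is unchanged, as if block B had not executed. */
   static void guard_put ( IRSB* bb, Int offs, IRType ty,
                           IRExpr* newval, IRExpr* c1 /* :: Ity_I1 */ )
   {
      IRExpr* oldval = IRExpr_Get(offs, ty);
      addStmtToIRSB(bb, IRStmt_Put(offs, IRExpr_ITE(c1, newval, oldval)));
   }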

There is no corresponding transformation that recovers an || condition
because, per de Morgan, that merely corresponds to swapping the side exits
vs fallthroughs and inverting the sense of the tests, and the
pattern-recogniser as implemented checks all possible combinations already.

The analysis and block-building is performed on the IR returned by the
architecture-specific front ends.  So they are almost unmodified: in fact
they are simplified, because all logic related to chasing through
unconditional and conditional branches has been removed from them, redone at
the IR level, and centralised.

The only file with big changes is the IRSB constructor logic,
guest_generic_bb_to_IR.c (a.k.a. the "trace builder").  This is a complete
rewrite.

There is some additional work for the IR optimiser (ir_opt.c), since that
needs to do a quick initial simplification pass of the basic blocks, in order
to reduce the number of different IR variants that the trace-builder has to
pattern match on.  An important followup task is to further reduce this cost.

There are two new IROps to support this: And1 and Or1, which both operate on
Ity_I1.  They are regarded as evaluating both arguments, consistent with AndXX
and OrXX for all other sizes.  It is possible to synthesise them at the IR
level by widening the value to Ity_I8 or above, doing a bitwise And/Or, and
re-narrowing it, but this gives inefficient code, so I chose to represent
them directly.
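To make the efficiency argument concrete, a hedged sketch of the rejected
widen/operate/narrow synthesis, against the new direct form (IROp and
constructor names per VEX's libvex_ir.h):

   /* And1 synthesised by widening to 32 bits and back: three extra ops. */
   static IRExpr* and1_widened ( IRExpr* a, IRExpr* b /* both :: Ity_I1 */ )
   {
      return IRExpr_Unop(Iop_32to1,
                IRExpr_Binop(Iop_And32,
                             IRExpr_Unop(Iop_1Uto32, a),
                             IRExpr_Unop(Iop_1Uto32, b)));
   }

   /* With the new primop it is a single IR node: */
   static IRExpr* and1_direct ( IRExpr* a, IRExpr* b )
   {
      return IRExpr_Binop(Iop_And1, a, b);
   }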

The transformation appears to work for amd64-linux.  In principle -- because
it operates entirely at the IR level -- it should work for all targets,
providing the initial pre-simplification pass can normalise the block ends
into the required form.  That will no doubt require some tuning.  And1 and Or1
will have to be implemented in all instruction selectors, but that's easy
enough.

Remaining FIXMEs in the code:

* Rename `expr_is_speculatable` et al to `expr_is_conditionalisable`.  These
  functions merely conditionalise code; the speculation has already been done
  by gcc/clang.

* `expr_is_speculatable`: properly check that Iex_Unop/Binop don't contain
  operations that might trap (Div, Rem, etc).

* `analyse_block_end`: recognise all block ends, and abort on ones that can't
  be recognised.  Needed to ensure we don't miss any cases.

* maybe: guest_amd64_toIR.c: generate better code for And1/Or1

* ir_opt.c, do_iropt_BB: remove the initial flattening pass since presimp
  will already have done it

* ir_opt.c, do_minimal_initial_iropt_BB (a.k.a. presimp).  Make this as
  cheap as possible.  In particular, calling `cprop_BB_wrk` is total overkill
  since we only need copy propagation.

* ir_opt.c: once the above is done, remove boolean parameter for `cprop_BB_wrk`.

* ir_opt.c: concatenate_irsbs: maybe de-dup w.r.t. maybe_unroll_loop_BB.

* remove option `guest_chase_cond` from VexControl (?).  It was never used.

* convert option `guest_chase_thresh` from VexControl (?) into a Bool, since
  the revised code here only cares about the 0-vs-nonzero distinction now.

* document some functions

* change naming and terminology from 'speculation' (which it isn't)
  to 'guarding' (which it is)

* add a new function |primopMightTrap| so as to avoid conditionalising
  IRExprs involving potentially trappy IROps
.. and check more carefully for unexpected control flow in the blocks being
analysed.
* remove --vex-guest-chase-cond=no|yes.  This was never used in practice.

* rename --vex-guest-chase-thresh=<0..99> to --vex-guest-chase=no|yes.  In
  other words, downgrade it from a numeric flag to a boolean one, which can
  simply disable all chasing if required, as in the example below.  (Some
  tools, notably Callgrind, force-disable block chasing, so this
  functionality at least needs to be retained.)
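For example (hypothetical invocation; the flag name is as introduced above):

   valgrind --tool=memcheck --vex-guest-chase=no ./a.out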
* Rewrite do_minimal_initial_iropt_BB so it doesn't do full constant folding;
  that is unnecessary expense at this point, and later passes will do it
  anyway

* do_iropt_BB: don't flatten the incoming block, because
  do_minimal_initial_iropt_BB will have run earlier and done so.  But at least
  for the moment, assert that it really is flat.

* VEX/priv/guest_generic_bb_to_IR.c create_self_checks_as_needed: generate
  flat IR so as not to fail the abovementioned assertion.

I believe this completes the target-independent aspects of this work, and also
the x86_64 specifics (of which there are very few).
.. when speculating into conditional-branch destinations.  A simple change
requiring a big comment explaining the rationale.
* guest_arm64_toIR.c: use |sigill_diag| to guard auxiliary diagnostic printing
  in case of decode failure

* guest_generic_bb_to_IR.c expr_is_guardable(), stmt_is_guardable(): handle a
  few more cases that didn't turn up so far on x86 or amd64

* host_arm64_defs.[ch]:

  - new instruction ARM64Instr_Set64, to copy a condition code value into a
    register (the CSET instruction)

  - use this to reimplement Iop_And1 and Iop_Or1
* priv/guest_generic_bb_to_IR.c expr_is_guardable(), stmt_is_guardable():
  add some missing cases

* do_minimal_initial_iropt_BB: add comment (no functional change)

* priv/host_arm_isel.c iselCondCode_wrk(): handle And1 and Or1, in a
  not-particularly-optimal way
julian-seward1 and others added 30 commits September 19, 2020 12:11
…_Add32.

This is necessary to avoid some false positives in code compiled by clang 10
at -O2.  Some very crude measurements suggest the increase in generated code
size is around 0.2%, viz. insignificant.
Apparently on Fedora 33 the POSIX thread functions exist in both libc and
libpthread.  Hence this patch, which intercepts the pthread functions in
libc as well.  See also https://bugs.kde.org/show_bug.cgi?id=426144 .
addi    Add Immediate
lbz     Load Byte & Zero
ld      Load Doubleword
lfd     Load Floating Double
lfs     Load Floating Single
lha     Load Halfword Algebraic
lhz     Load Halfword & Zero
lq      Load Quadword
lwa     Load Word Algebraic
lwz     Load Word & Zero
lxsd    Load VSX Scalar Doubleword
lxssp   Load VSX Scalar Single-Precision
lxv     Load VSX Vector
stb     Store Byte
std     Store Doubleword
stfd    Store Floating Double
stfs    Store Floating Single
sth     Store Halfword
stq     Store Quadword
stw     Store Word
stxsd   Store VSX Scalar Doubleword
stxssp  Store VSX Scalar Single-Precision
stxv    Store VSX Vector
header files and other common parts associated with the initial ISA v3.1
support
The code in test_isa_3_1_common.c should only be included
if ISA 3.1 support exists.
On ppc64 [old big endian] altivec.h cannot be included directly.
Move the HAS_ISA_3_1 guard around so the include is only done when
the full test (and test_list_t) are built.
Fix the file consistency check in none/tests/ppc64/Makefile.am.  Subsequent
patches for the PPC ISA 3.1 support will fully add the additional tests.
Add support for the new ISA 3.1 word instructions:

brd Byte-Reverse Doubleword
brh Byte-Reverse Halfword
brw Byte-Reverse Word
Add support for the new ISA 3.1 set boolean condition
word instructions:

setbc Set Boolean Condition
setbcr Set Boolean Condition Reverse
setnbc Set Negative Boolean Condition
setnbcr Set Negative Boolean Condition Reverse.
Add support for the new ISA 3.1 load and store
instructions:

lxvpx Load VSX Vector Paired Indexed
plxvp Prefixed Load VSX Vector Paired
pstxvp Prefixed Store VSX Vector Paired
stxvpx Store VSX Vector Paired Indexed

Update the parsing of the lxvp and stxvp instructions that
were previously added.

lxvp Load VSX Vector Paired
stxvp Store VSX Vector Paired

A couple of format changes for the arguments to the
calculate_prefix_EA function.

Add comments to the else if and case statement to
clarify which instructions meet this condition.
Add support for:

vdivesd Vector Divide Extended Signed Doubleword
vdivesw Vector Divide Extended Signed Word
vdiveud Vector Divide Extended Unsigned Doubleword
vdiveuw Vector Divide Extended Unsigned Word
vdivsd Vector Divide Signed Doubleword
vdivsw Vector Divide Signed Word
vdivud Vector Divide Unsigned Doubleword
vdivuw Vector Divide Unsigned Word
vmodsd Vector Modulo Signed Doubleword
vmodsw Vector Modulo Signed Word
vmodud Vector Modulo Unsigned Doubleword
vmoduw Vector Modulo Unsigned Word
vmulhsd Vector Multiply High Signed Doubleword
vmulhsw Vector Multiply High Signed Word
vmulhud Vector Multiply High Unsigned Doubleword
vmulhuw Vector Multiply High Unsigned Word
vmulld Vector Multiply Low Doubleword
Add support for:

lxvkq Load VSX Vector Special Value Quadword
vextddvlx Vector Extract Double Dword to VSR Left-Indexed
vextddvrx Vector Extract Double Dword to VSR Right-Indexed
vextdubvlx Vector Extract Double Unsigned Byte to VR Left-Indexed
vextdubvrx Vector Extract Double Unsigned Byte to VR Right-Indexed
vextduhvlx Vector Extract Double Unsigned Hword to VR Left-Indexed
vextduhvrx Vector Extract Double Unsigned Hword to VR Right-Indexed
vextduwvlx Vector Extract Double Unsigned Word to VR Left-Indexed
vextduwvrx Vector Extract Double Unsigned Word to VR Right-Indexed
vinsblx Vector Insert Byte from GPR Left-Indexed
vinsbrx Vector Insert Byte from GPR Right-Indexed
vinsbvlx Vector Insert Byte from VSR Left-Indexed
vinsbvrx Vector Insert Byte from VSR Right-Indexed
vinsd Vector Insert Dword from GPR
vinsdlx Vector Insert Dword from GPR Left-Indexed
vinsdrx Vector Insert Dword from GPR Right-Indexed
vinshlx Vector Insert Hword from GPR Left-Indexed
vinshrx Vector Insert Hword from GPR Right-Indexed
vinshvlx Vector Insert Hword from VSR Left-Indexed
vinshvrx Vector Insert Hword from VSR Right-Indexed
vinsw Vector Insert Word from GPR
vinswlx Vector Insert Word from GPR Left-Indexed
vinswrx Vector Insert Word from GPR Right-Indexed
vinswvlx Vector Insert Word from VSR Left-Indexed
vinswvrx Vector Insert Word from VSR Right-Indexed
vsldbi Vector Shift Left Double by Bit Immediate
vsrdbi Vector Shift Right Double by Bit Immediate
xxblendvb VSX Vector Blend Variable Byte
xxblendvd VSX Vector Blend Variable Dword
xxblendvh VSX Vector Blend Variable Hword
xxblendvw VSX Vector Blend Variable Word
xxpermx VSX Vector Permute Extended
xxsplti32dx VSX Vector Splat Immediate32 Dword Indexed
xxspltidp VSX Vector Splat Immediate DP
xxspltiw VSX Vector Splat Immediate Word
faccessat2 is a new syscall in Linux 5.8 and will be used by glibc 2.33.
faccessat2 is simply faccessat with a new flags argument.  It has
a common number across all Linux arches.

https://bugs.kde.org/427787
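For reference, a hedged sketch (not part of the patch) of invoking the new
syscall directly from a test program; 439 is the common syscall number, and
sufficiently new kernel headers define __NR_faccessat2 themselves:

   #define _GNU_SOURCE
   #include <fcntl.h>        /* AT_FDCWD, AT_EACCESS */
   #include <stdio.h>
   #include <unistd.h>
   #include <sys/syscall.h>

   #ifndef __NR_faccessat2
   #define __NR_faccessat2 439
   #endif

   int main(void)
   {
      /* Like faccessat(2), but the flags argument is passed to the
         kernel instead of being emulated in userspace by glibc. */
      long r = syscall(__NR_faccessat2, AT_FDCWD, "/etc/passwd",
                       R_OK, AT_EACCESS);
      printf("faccessat2 -> %ld\n", r);
      return 0;
   }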
See also https://bugs.kde.org/show_bug.cgi?id=428035.

Reported-by: Stacy <stacy.gaikovaia@windriver.com>
Fixes: 15330ad ("drd: Port to Fedora 33")
Mark
   LD3/ST3 (multiple 3-elem structs to/from 3 regs)
   LD4/ST4 (multiple 4-elem structs to/from 4 regs)
as "verbose", since they can generate so much IR that a long sequence
of them causes later stages of the JIT to run out of space.
Similar to Bug 417452, where the instruction selector sometimes attempted
to generate vector stores with a 20-bit displacement, the same problem has
now been reported with vector loads.

The problem is caused in s390_isel_vec_expr_wrk(), where the addressing
mode is generated with s390_isel_amode() instead of
s390_isel_amode_short().  This is fixed.