@barracuda156

Rather trivial fixes allowing the build on Darwin powerpc.

G.E. and others added 30 commits March 6, 2024 15:28
This adds three new CMake options, all defaulting to true, making it
possible to opt-out of building parts of Vectorscan that are not
essential for deployment of the matching runtime.

These new options:

- `BUILD_UNIT`: control whether the `unit` directory is included
- `BUILD_DOC`: control whether the `doc` directory is included
- `BUILD_TOOLS`: control whether the `tools` directory is included
Man pages tend to be preferred in some circles, so let's add an
option to build the vectorscan documentation that way.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
The project name in the documentation should probably
be updated to reflect that this is vectorscan. Update
the copyright too.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
The generated documentation continues to refer to Hyperscan
despite the project now being Vectorscan. Let's replace many
of the Hyperscan references with Vectorscan.

At the same time, let's resync the documentation here with the
vectorscan readme. This updates the supported platforms/compilers
and build options.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Correct the description in the pkgconfig file, but
leave the name alone as we want to remain compatible
with projects utilizing hyperscan.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
While fixing the documentation, it was noticed that the hsbench
output was still referring to the project as Hyperscan.
Let's correct it.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Yoan Picchi <yoan.picchi@arm.com>
Add CMake options for more build granularity
Add man page generation, change man section, update docs to reflect name change, and a couple of other tweaks
…sheng-implementation-on-arm

RFC Enable sheng32/64 for SVE
…add-wider-sheng-implementation-on-arm

Revert "RFC Enable sheng32/64 for SVE"
isildur-g and others added 23 commits June 26, 2024 22:35
Major refactoring of teddy and teddy_avx2, unrolling macros to C++ templated functions

---------

Co-authored-by: G.E <gregory.economou@vectorcamp.gr>
This allows the use of SIMDE library to emulate SSSE3/SSE4.2 instructions on SSE2-only (x86-64-v2) hardware.

---------

Co-authored-by: G.E <gregory.economou@vectorcamp.gr>
Co-authored-by: Konstantinos Margaritis <konstantinos@vectorcamp.gr>
…ctorCamp#306)

* maybe fix the hsbench issue (check_ssse3 again) in sse2/simde env

* fix the last failing unit test with fat

---------

Co-authored-by: G.E. <gregory.economou@vectorcamp.gr>
* rebar based unit tests

* fixing paths

---------

Co-authored-by: gtsoul-tech <gtsoulkanakis@gmail.com>
* fixed paths and utf8-lossy=true

* revert to maskz (it's the bug)

* cppcheck fix

---------

Co-authored-by: gtsoul-tech <gtsoulkanakis@gmail.com>
By using svmatch on 16-bit lanes with an 8-bit predicate, we end up
including an undefined character in the pattern checks. The inactive
lane after load contains an undefined value, usually \0. Patterns
using \0 as the last character would then match this spurious
character, returning a match beyond the buffer's end. The fix checks
for such matches and rejects them.

Signed-off-by: Yoan Picchi <yoan.picchi@arm.com>
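
The spurious match described in the commit above can be illustrated with a scalar model. This is only a sketch, not the SVE2 code: `kLanes`, `scan_pair`, and the `reject_past_end` flag are hypothetical names, and inactive lanes are assumed to hold 0 (the commit says the value is undefined but usually `\0`).

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical lane count for this scalar model; not the real SVE2 vector length.
constexpr std::size_t kLanes = 16;

// Scan one partial vector for the two-byte pattern (c0, c1).
// Bytes beyond `len` are modelled as inactive lanes holding 0 (an assumption).
// When `reject_past_end` is true, a match whose second byte lies at or
// beyond `len` is dropped, mirroring the fix described in the commit.
std::vector<std::size_t> scan_pair(const unsigned char *buf, std::size_t len,
                                   unsigned char c0, unsigned char c1,
                                   bool reject_past_end) {
    std::array<unsigned char, kLanes + 1> lanes{};  // zero-filled
    const std::size_t n = std::min(len, kLanes);
    std::copy(buf, buf + n, lanes.begin());

    std::vector<std::size_t> hits;
    for (std::size_t i = 0; i < kLanes; ++i) {
        if (lanes[i] != c0 || lanes[i + 1] != c1) continue;
        if (reject_past_end && i + 1 >= n) continue;  // second byte is not real data
        hits.push_back(i);
    }
    return hits;
}
```

With a two-byte buffer `{'x', 'a'}` and the pattern `a\0`, the unfixed model reports a match at offset 1 even though the `\0` is not part of the buffer; with `reject_past_end` set, no match is reported.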
Vectorscan used to reject such patterns because they were being compared
to "" and found to be an empty string. We now check the pattern length
instead.

Signed-off-by: Yoan Picchi <yoan.picchi@arm.com>
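
A minimal sketch of why comparing against `""` misclassifies a pattern. It assumes "such patterns" means patterns that begin with a `\0` byte, which the commit text does not state outright; both helper names are hypothetical and are not Vectorscan functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: C-string comparison stops at the first NUL byte,
// so any pattern starting with '\0' looks identical to "".
bool is_empty_by_strcmp(const char *pattern) {
    return std::strcmp(pattern, "") == 0;
}

// Hypothetical helper: an explicit length is unaffected by embedded NULs.
bool is_empty_by_length(std::size_t len) {
    return len == 0;
}
```

For the two-byte pattern `{'\0', 'a'}`, the first helper reports "empty" while the second does not, which matches the behaviour change the commit describes.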
Vectorscan requires SSE4.2 as a minimum on x86_64. For Hyperscan this
used to be SSSE3.

Applications that use the library call hs_valid_platform() to check if
the CPU fulfils this minimum requirement. However, when Vectorscan
upgraded to SSE4.2, the check was not updated. This leads to the library
trying to execute instructions that are not supported, causing the
application to crash.

This might not have been noticed as the CPUs that do not support SSE4.2
are rather old and unlikely to run any load where performance is an
issue. However, I believe that the library should not let the
application crash.

Signed-off-by: Michael Tremer <michael.tremer@ipfire.org>
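
The mismatch described in the commit above can be modelled with feature bits. This is a sketch under stated assumptions, not the library's detection code: the `CpuFeature` enum and both functions are hypothetical, and real detection goes through CPUID.

```cpp
#include <cassert>

// Hypothetical feature bits for this model; real detection uses CPUID.
enum CpuFeature : unsigned { kSSSE3 = 1u << 0, kSSE42 = 1u << 1 };

// Everything the compiled code paths are assumed to require in this model.
constexpr unsigned kRequired = kSSSE3 | kSSE42;

// Stale check: still tests only the old Hyperscan minimum (SSSE3).
bool valid_platform_stale(unsigned cpu) {
    return (cpu & kSSSE3) == kSSSE3;
}

// Updated check: tests every feature the code will actually execute.
bool valid_platform_fixed(unsigned cpu) {
    return (cpu & kRequired) == kRequired;
}
```

A CPU with SSSE3 but no SSE4.2 passes the stale check and is rejected by the fixed one. In an application, the corresponding guard is to call `hs_valid_platform()` and proceed only when it returns `HS_SUCCESS`.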
* Revert "Fix noodle SVE2 off by one bug"

This patch fixed the bug when it happened at the end of the buffer,
but it did not fix it when we do scanDoubleOnce before the main loop.

The next patch fixes this bug for both cases instead.

This reverts commit 48dd0e5.

* Fix noodle spurious match with \0 chars for SVE2

When SVE2's noodle processes a non-full vector (before the main loop or
at the end of it), a fake \0 was being parsed, triggering a match for
patterns that ended with \0. This patch fixes this.

Signed-off-by: Yoan Picchi <yoan.picchi@arm.com>

---------

Signed-off-by: Yoan Picchi <yoan.picchi@arm.com>
* suppress knownConditionTrueFalse

* cppcheck suppress redundantInitialization

* cppcheck solve stlcstrStream

* cppcheck suppress useStlAlgorithm

* cppcheck-suppress derefInvalidIteratorRedundantCheck

* cppcheck solve constParameterReference

* const parameter reference cppcheck

* removed wrong fix

* cppcheck-suppress memsetClassFloat

* cppcheck fix memsetClassFloat

* cppcheck fix unsignedLessThanZero

* suppressing all errors on simde gitmodule

* fix typo (unsignedLessThanZero)

* fix cppcheck suppress simde gitmodule

* cppcheck-suppress unsignedLessThanZero

---------

Co-authored-by: gtsoul-tech <gtsoulkanakis@gmail.com>
Revert the code that produced the regression error in VectorCamp#317 
Add the regression error to a unit test regressions.cpp along with the rebar tests

---------

Co-authored-by: gtsoul-tech <gtsoulkanakis@gmail.com>
An old commit (24ae167) had the side effect of moving cmake defines after
they were being used. This patch moves them back so they are defined before being used.
This speeds hsbench back up by ~0.8%.

Signed-off-by: Yoan Picchi <yoan.picchi@arm.com>
…ning (VectorCamp#332)

* Clang 17+ is more restrictive on rebind<T> on MacOS/Boost, remove warning

* More clang/boost warnings on MacOS, disable for now
VectorCamp#333)

Fixed out of bounds read in AVX512VBMI version of fdr_exec_fat_teddy (VectorCamp#322)

  * Replaced the 32 byte read with a properly truncated mapped read
  * Added a unit test

Co-authored-by: Rafał Dowgird <rafal.dowgird@rtbhouse.com>
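
A scalar sketch of the idea behind the out-of-bounds fix above: instead of always reading a full 32 bytes, read only what the buffer holds and zero-fill the rest. The function name and `kBlock` are hypothetical, and the actual fix uses AVX512 loads that are not shown here.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Block size of the original fixed-width read, per the commit message.
constexpr std::size_t kBlock = 32;

// Hypothetical helper: copy at most `avail` bytes from `src` and leave
// the remaining lanes zeroed, so no byte past the buffer is touched.
std::array<unsigned char, kBlock> load_truncated(const unsigned char *src,
                                                 std::size_t avail) {
    std::array<unsigned char, kBlock> out{};  // zero-filled
    const std::size_t n = std::min(avail, kBlock);
    std::copy(src, src + n, out.begin());
    return out;
}
```

Called on a 5-byte buffer, the helper returns those 5 bytes followed by 27 zero bytes, whereas an unconditional 32-byte read would access memory beyond the buffer.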
Multiple AVX512VBMI-related fixes:

src/nfa/mcsheng_compile.cpp: No need for an assert here, impl_id can be set to 0
src/nfa/nfa_api_queue.h: Make sure this compiles on both C++ and C
src/nfagraph/ng_fuzzy.cpp: Fix compilation error when DEBUG_OUTPUT=on
src/runtime.c: Fix crash when data == NULL
unit/internal/sheng.cpp: Unit test has to enable AVX512VBMI manually as autodetection does not get triggered; this caused the test to fail
src/fdr/teddy_fat.cpp: AVX512 loads need to be 64-bit aligned, caused a crash on clang-18
* added static libraries in cmake to fix unit-internal seg fault in freebsd, ppc64le, gcc13 error
* Moved gcc13 flags for freebsd-gcc13 in cmake/cflags-ppc64le.make
* Add regression test for double shufti

It tests for false positive at vector edges.

Signed-off-by: Yoan Picchi <yoan.picchi@arm.com>

* Fix double shufti reporting false positives

Double shufti used to offset one vector, resulting in losing one character
at the end of every vector. This was replaced by a magic value indicating a
match. This meant that if the first char of a pattern fell on the last char of
a vector, double shufti would assume the second character is present and
report a match.
This patch fixes it by keeping the previous vector and feeding its data to the
new one when we shift it, preventing any loss of data.

Signed-off-by: Yoan Picchi <yoan.picchi@arm.com>

* vshl() will call the correct implementation

* implement missing vshr_512_imm(), simplifies caller x86 code

* Fix x86 case, use alignr instead

* it's the reverse, the avx512 alignr is incorrect, need to fix

* Make shufti's OR reduce size agnostic

Signed-off-by: Yoan Picchi <yoan.picchi@arm.com>

* Fix test's array size

Signed-off-by: Yoan Picchi <yoan.picchi@arm.com>

* Fix AVX2/AVX512 alignr implementations and unit tests

* Fix Power VSX alignr

---------

Signed-off-by: Yoan Picchi <yoan.picchi@arm.com>
Co-authored-by: Konstantinos Margaritis <konstantinos@vectorcamp.gr>
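
The double shufti false positive described in the commit above can be reproduced with a scalar model. This is a sketch only: both function names are hypothetical, `width` stands in for the vector width and must be non-zero, and the real code works on SIMD masks instead of individual characters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Old behaviour (model): each chunk is handled in isolation, and the last
// lane assumes the second character is present (the "magic" match value).
std::vector<std::size_t> find_pairs_magic(const std::string &data, char c0,
                                          char c1, std::size_t width) {
    std::vector<std::size_t> hits;
    for (std::size_t base = 0; base < data.size(); base += width) {
        for (std::size_t i = 0; i < width && base + i < data.size(); ++i) {
            if (data[base + i] != c0) continue;
            const bool last_lane = (i + 1 == width);
            const bool second_ok =
                last_lane ? true  // next byte is not visible in this chunk
                          : (base + i + 1 < data.size() &&
                             data[base + i + 1] == c1);
            if (second_ok) hits.push_back(base + i);
        }
    }
    return hits;
}

// Fixed behaviour (model): the first-character result from the previous
// chunk's last lane is carried into the next chunk, so nothing is guessed.
std::vector<std::size_t> find_pairs_carry(const std::string &data, char c0,
                                          char c1, std::size_t width) {
    std::vector<std::size_t> hits;
    bool prev_first = false;  // carried across chunk boundaries
    for (std::size_t base = 0; base < data.size(); base += width) {
        for (std::size_t i = 0; i < width && base + i < data.size(); ++i) {
            const std::size_t pos = base + i;
            if (prev_first && data[pos] == c1) hits.push_back(pos - 1);
            prev_first = (data[pos] == c0);
        }
    }
    return hits;
}
```

With `width` 4 and the input `xxxazzzz`, searching for `ab` yields a false positive at offset 3 in the old model and no match in the carry model. A real match straddling the boundary, as in `xxxabzzz`, is reported at offset 3 by both.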
Prevents overwriting GNUCC_ARCH with an empty value when parsing output
of gcc -Q --help=target. Ensures robustness if detection fails and
returns an empty string.

Signed-off-by: Ibrahim Kashif <ibrahim.kashif@arm.com>
* Add entry for Changelog
* Add new contributors
* Bump library version
@barracuda156
Author

There is no VSX before ISA 2.06, so SIMDe is used instead.

@barracuda156
Author

@markos Looks like ppc64le CI failures are tests-related: https://buildbot-ci.vectorcamp.gr/#/builders/199/builds/248
Should be unrelated.

set(ARCH_FLAG mcpu)
elseif (ARCH_PPC)
# Avoid mcpu, it can produce broken code on 32-bit ppc.
set(ARCH_FLAG mtune)

care to give an example?

Author

set(GNUCC_ARCH power8)
set(TUNE_FLAG power8)
elseif(ARCH_PPC)
set(GNUCC_ARCH G5)

This is not correct. If we're going to add PPC, then surely PPC is 32-bit and G4, and PPC64 is going to be G5-class. If you have only tested on G5, then this needs to be set to ARCH_PPC64BE. If you've tested on G5 but running 32-bit, then tuning for G5 is going to make the work not run on older PowerPC CPUs. If we're going to add support, we're going to do it properly or not at all.

Author

> then tuning for G5 is going to make the work not run on older PowerPC CPUs

This is not true, provided -mtune is used. It optimizes instruction scheduling but does not emit unsupported instructions.
-mcpu won’t be compatible.

> If we're going to add support, we're going to do it properly or not at all.

You are right, it is better to consider both cases. I will think about how to do it in a better way.

Mind you, this is just the cmake stuff. I will not add it to the develop branch yet, as the code is not BE-friendly. I can suggest the following: we create a develop-ppc64BE branch and try to make things work there. I also have a ppc64 VM on a Power9 that I can set up to work inside our CI as a test bed. If we get things working there, then we can consider moving to 32-bit powerpc (though I doubt it's worth it, tbh). But I will not consider 32-bit powerpc first. So, a couple of action points. If you're still interested, try adding ARCH_PPC64BE only and we can work on fixing the failures in compilation and unit tests.

I will post details when I have the VM and branch ready.

Author

Sounds good, thank you. I will make this a draft for now.

@barracuda156 barracuda156 marked this pull request as draft October 10, 2025 22:32