[CALCITE-7436] Test: Add high-coverage Jazzer fuzzing for Avatica core modules #300
vishalcoc44 wants to merge 6 commits into apache:main
Is there a JIRA issue for this feature?
https://issues.apache.org/jira is the JIRA
Alright, I'll file a ticket!
CALCITE-7436
@julianhyde could you re-trigger the workflow?
@vishalcoc44, I've approved and triggered the workflows.
The checks are good. Once this PR is merged, I plan to submit a follow-up PR to the google/oss-fuzz repository to update the Avatica project configuration. This will enable the OSS-Fuzz infrastructure to build and run these new fuzzers directly from the upstream source. Is that okay? Could I coordinate with you, @F21?
I am not familiar with OSS-Fuzz or Avatica internals, so I will defer the code review to other committers who have more knowledge in this area. I am, however, happy to coordinate and assist in any way to get this contribution merged.
As a starter, can you please subscribe to the dev mailing list and start a discussion around these changes? See https://calcite.apache.org/community/#mailing-lists for instructions. It will bring more visibility to your proposed changes and allow input from community members.
Hey, thanks for the info, i have raised a
Alright, I subscribed to the mailing list. Since we are going to have all the fuzzers in this repo, we should have a clfuzz workflow here that runs the fuzzers automatically every time someone pushes changes to this repo. So the three new additions I've made to this existing commit are the two new fuzzer files and the workflow script.
@vishalcoc44, the PR needs to be reviewed and approved, and unfortunately I don't have enough knowledge of the internals to do so. Please start a discussion on the mailing list as I suggested to solicit interest and discussion from the community.
I have started a discussion thread on the dev mailing list as suggested.
Can anyone check this out?
mihaibudiu left a comment
Frankly, the code looks fine, but I haven't really studied how the fuzzer infrastructure works.
Does it do anything useful?
I think the proof would be in exhibiting at least one bug it has found.
There must be some.
Ran the fuzzers locally and found 4 bugs in under 5 minutes: errors such as assertion failures and parsing crashes.
Great, that is validation that the work is useful.
Where do I file the issues? On GitHub or JIRA?
Issues in JIRA. PRs on GitHub.
I've filed two issues with JIRA,
If I understand this right, since the fuzzer will immediately find bugs, the Avatica CI won't pass until we fix the easy-to-find ones?
Yep, but we can make it report the bugs in the logs without failing the CI, or we can run it in a separate workflow that notifies us.
If no one reads the logs, it's as if they are not there.
The default notification mechanism for this kind of fuzzing is usually handled by Google's system itself. If you check the file linked here, you'll see that some people have been configured to receive mail whenever the fuzzing reports an issue: (https://github.com/google/oss-fuzz/blob/master/projects/calcite-avatica/project.yaml). The issues are sent to their mail and are also posted on the OSS-Fuzz issues page for this project: (https://issues.oss-fuzz.com/issues?q=calcite-avatica). This is usually how bugs get reported in other projects.
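For the log-only option discussed above, one possible shape is a replay helper that runs a target over saved inputs and collects failures instead of throwing, leaving the caller to decide whether to fail the build or send a notification. This is only a sketch under assumptions: `SeedReplay` and `replay` are hypothetical names and are not part of this PR, of Jazzer, or of OSS-Fuzz.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical replay helper; illustrative only, not code from this PR. */
final class SeedReplay {

  private SeedReplay() {}

  /**
   * Runs the target once per seed and returns one description per failure.
   * Nothing is rethrown, so the caller chooses between logging and failing.
   */
  static List<String> replay(Consumer<byte[]> target, List<byte[]> seeds) {
    List<String> findings = new ArrayList<>();
    for (int i = 0; i < seeds.size(); i++) {
      try {
        target.accept(seeds.get(i));
      } catch (RuntimeException | Error e) {
        // Record the seed index and the throwable's own description.
        findings.add("seed " + i + ": " + e);
      }
    }
    return findings;
  }
}
```

A caller could print the returned list and still exit successfully, or fail only when the list is non-empty. As noted in the thread, a log-only mode helps only if someone actually reads the output.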
Added Jazzer fuzzing to hit the actually important parts that had 0% OSS-Fuzz coverage:
JsonService + Jackson (nested/garbage JSON in & out)
ProtobufTranslationImpl (corrupted/truncated protobuf → POJO)
TypedValue factory (nasty type codes, overflows, nulls, scales)
AvaticaSite.get(...) (15+ JDBC/SQL types: DECIMAL precisions, timestamps, etc.)
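To make the corrupted/truncated-input case concrete, here is a minimal sketch of the entry-point shape Jazzer looks for, a static `fuzzerTestOneInput(byte[])` method, wrapped around a hypothetical length-prefixed decoder. `TruncatedPayloadFuzzer`, `decode`, and `MAX_FIELD_BYTES` are illustration-only names; this is not code from the PR or from Avatica, whose fuzzers exercise the real classes listed above. The idea carries over: malformed input should be rejected with an expected exception, and any other throwable that escapes the method is reported by Jazzer as a finding.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

/**
 * Hypothetical stand-in for a wire decoder plus its fuzz target.
 * Kept package-private so the sketch fits in one file; a real fuzz target
 * would normally be a public class in its own source file.
 */
final class TruncatedPayloadFuzzer {

  /** Assumed upper bound on a declared field length, chosen for this sketch. */
  static final int MAX_FIELD_BYTES = 1 << 16;

  private TruncatedPayloadFuzzer() {}

  /** Decodes a 4-byte big-endian length followed by that many UTF-8 bytes. */
  static String decode(byte[] payload) throws IOException {
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(payload));
    int length = in.readInt(); // EOFException if fewer than 4 bytes remain
    if (length < 0 || length > MAX_FIELD_BYTES) {
      // Without this check a corrupted prefix could request a huge array.
      throw new IOException("implausible field length: " + length);
    }
    byte[] body = new byte[length];
    in.readFully(body); // EOFException if the body is truncated
    return new String(body, StandardCharsets.UTF_8);
  }

  /** Fuzz entry point: expected rejections are swallowed, the rest escape. */
  public static void fuzzerTestOneInput(byte[] data) {
    try {
      decode(data);
    } catch (IOException expected) {
      // Malformed input was rejected cleanly; this is not a finding.
    }
  }
}
```

The length check also shows how a corrupted prefix could otherwise turn into a very large allocation, which is one way the out-of-memory failures mentioned under the results can arise.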
Changes:
Added com.code_intelligence:jazzer-api to testImplementation (core/build.gradle.kts)
New fuzzers in core/src/test/java/org/apache/calcite/avatica/fuzz/
Results so far:
Coverage in RPC + type layers went from ~0% → thousands of lines
Catches bad payloads that could previously OOM, CPU spike, or throw ugly exceptions