[draft] change sys.exit to exception#5269

Closed
PauloVLB wants to merge 8 commits into dev from build-manager-exception

Conversation

@PauloVLB PauloVLB commented May 6, 2026

Remove calls to `log_fatal_and_exit` from `build_manager` when it fails to make space for builds, and raise an exception instead.
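
A minimal sketch of the pattern this change describes, under stated assumptions: `BuildSetupError`, `make_space_for_build`, and `setup_build` are hypothetical names used for illustration and are not taken from the ClusterFuzz source. Instead of ending the process, the low-level function raises, and the caller decides how to recover.

```python
import shutil


class BuildSetupError(Exception):
  """Hypothetical exception raised when a build cannot be set up."""


def make_space_for_build(build_dir, required_bytes,
                         disk_usage=shutil.disk_usage):
  """Raise instead of exiting when there is not enough free disk space.

  `disk_usage` is injectable so the check can be exercised without a real disk.
  """
  free_bytes = disk_usage(build_dir).free
  if free_bytes < required_bytes:
    # A fatal-exit helper would terminate the process here; raising lets the
    # caller clean up, log, and move on to the next task.
    raise BuildSetupError(
        f'Need {required_bytes} bytes in {build_dir}, only {free_bytes} free.')
  return free_bytes


def setup_build(build_dir, required_bytes, disk_usage=shutil.disk_usage):
  """Hypothetical caller that turns the exception into a recoverable failure."""
  try:
    make_space_for_build(build_dir, required_bytes, disk_usage)
  except BuildSetupError:
    return False
  return True
```

The design point is that only the outermost task loop should decide whether a failed build setup is fatal; helpers deeper in the stack just report the failure.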

IvanBM18 and others added 8 commits May 4, 2026 21:50
Currently, if we run fuzzing sessions on swarming, the metrics don't seem
to register those fuzzing hours correctly:
<img width="952" height="676" alt="image"
src="https://github.com/user-attachments/assets/024db79d-9a24-4b34-9591-c2a49904dc69"
/>
Note that the swarming fuzzing hours appear as empty: `runtime=`

- This PR adds swarming as a possible runtime
- Adds swarming to the error count metric
- Now that there's no circular dependency between the `logs` and
`environment` modules, I removed some methods that were previously
declared twice to avoid that circular dependency; they are now
imported from a single module
- Also adds unit tests
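
As a rough illustration of the first two bullets, the sketch below derives a `runtime` metric label from environment variables so that the label is never empty. The `SWARMING_BOT` variable name, its `'True'` value, and the label strings are assumptions made for this example; they are not taken from the ClusterFuzz source.

```python
import os


def get_runtime_label(env=None):
  """Return a non-empty runtime label to attach to metrics."""
  env = os.environ if env is None else env
  # Hypothetical variable name and value, used only for illustration.
  if env.get('SWARMING_BOT') == 'True':
    return 'swarming'
  # Kubernetes sets this variable inside pods.
  if env.get('KUBERNETES_SERVICE_HOST'):
    return 'kubernetes'
  # Hypothetical fallback so the label is never the empty string.
  return 'other'
```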


## Tests performed
Changes present in dev since before the last `dev` branch reset (which
was around the 22nd of April).

#### Error logs worth noting in the `dev` env:
Here's a list of logs that came to my attention:

There is a recurring PERMISSION_DENIED exception in
testcase_manager.py. It specifically points to "Missing or insufficient
permissions" when calling the Datastore API via gRPC.
```
method: "_do_run_testcase_and_return_result_in_queue"
path: "/mnt/scratch0/clusterfuzz/src/clusterfuzz/_internal/bot/testcase_manager.py"
}
message: "Exception occurred while running run_testcase_and_return_result_in_queue.
Traceback (most recent call last):
  File "/mnt/scratch0/clusterfuzz/src/third_party/google/cloud/ndb/_datastore_api.py", line 98, in rpc_call
    result = yield rpc
             ^^^^^^^^^
	status = StatusCode.PERMISSION_DENIED
	details = "Missing or insufficient permissions."
	debug_error_string = "UNKNOWN:Error received "
>
```
Not related, since we are not making any changes to auth or to the
Datastore API.

```
raise RuntimeError('Request to %s failed. Code: %d. Reason: %s' %
RuntimeError: Request to https://storage.googleapis.com....
<Error><Code>SignatureDoesNotMatch</Code><Message>Access denied.</Message><Details>The request signature we calculated does not match the signature you provided. Check your Google secret key and signing method.</Details>
```
A weird error group. It seems related to missing permissions or a
mismatch across service accounts. It may be worth looking into, or it
may come from other tests that someone is running in dev.

---------

Co-authored-by: Diego Jardon <[email protected]>
…ronment (#5260)

`libfuzzer_chrome_msan` bots have errors caused by the `cwd` being wrong
when we execute the fuzzer. We are looking to pass `BUILD_DIR` to `cwd`
so that fuzzers are executed from the archive root.

Bug: https://crbug.com/507025973, https://crbug.com/326101784#comment46

- Add an optional `cwd` argument to the `LibFuzzerRunner` class, which
passes it to the parent `ProcessRunner` class, which in turn runs
`subprocess.Popen` with the `cwd` argument.
- When a `LibFuzzerRunner` is requested with `get_runner()`, the `cwd` is
set to `BUILD_DIR` if `FUZZ_TARGET_CWD_IS_BUILD_DIR` is true.
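
The two bullets above can be sketched roughly as follows. This is a simplified stand-in, not the actual ClusterFuzz classes: the `ProcessRunner` here is a reduced illustration, `get_runner` takes an explicit `env` mapping for testability, and reading the flag as the string `'true'` is an assumption. The only library behavior relied on is the `cwd` parameter of `subprocess.Popen`.

```python
import subprocess


class ProcessRunner:
  """Simplified stand-in for a process runner that supports a `cwd`."""

  def __init__(self, executable_path, default_args=None, cwd=None):
    self.executable_path = executable_path
    self.default_args = list(default_args or [])
    self.cwd = cwd

  def run(self, extra_args=None):
    command = ([self.executable_path] + self.default_args +
               list(extra_args or []))
    # cwd=None keeps the parent's working directory (Popen's default).
    process = subprocess.Popen(
        command,
        cwd=self.cwd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT)
    output, _ = process.communicate()
    return process.returncode, output.decode()


def get_runner(fuzzer_path, env):
  """Use BUILD_DIR as `cwd` only when the opt-in flag is set."""
  # Assumption: the flag is stored as a string such as 'True'.
  use_build_dir = env.get('FUZZ_TARGET_CWD_IS_BUILD_DIR', '').lower() == 'true'
  cwd = env.get('BUILD_DIR') if use_build_dir else None
  return ProcessRunner(fuzzer_path, cwd=cwd)
```

Keeping `cwd` optional and defaulting to `None` means existing callers see no behavior change unless the flag is enabled.
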
I was running into an issue running the server in the devcontainer when
I had initialized the config on my host machine.

This PR checks for and cleans up broken symlinks correctly.

After this change I can run the server in the devcontainer, and the
symlink is fixed.
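
One way to detect and clean up a broken symlink is sketched below. It is not the code from this change; `remove_broken_symlink` and `ensure_symlink` are hypothetical helper names. It relies on the fact that `os.path.exists()` follows symlinks while `os.path.islink()` inspects the link itself.

```python
import os


def remove_broken_symlink(path):
  """Remove `path` if it is a symlink whose target no longer exists.

  Returns True if a broken link was removed.
  """
  # exists() follows the link, so it is False for a dangling link, while
  # islink() is True for any symlink regardless of its target.
  if os.path.islink(path) and not os.path.exists(path):
    os.unlink(path)
    return True
  return False


def ensure_symlink(target, link_path):
  """Create `link_path` -> `target`, replacing a dangling link if present."""
  remove_broken_symlink(link_path)
  # lexists() is True for anything at link_path, including valid symlinks.
  if not os.path.lexists(link_path):
    os.symlink(target, link_path)
```
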
Temporarily disable the K8s E2E test workflow. The workflow is retained
so that it can still be triggered manually for testing.

This is failing and we are working on a fix; until then, let's disable
it so that PRs are not marked as failing our CI.

We use this field to determine whether a fuzzer is a blackbox fuzzer or
an engine-guided fuzzer. In order to aggregate stats for blackbox
fuzzers, we need to be able to query by the `builtin` field, which
requires it to be indexed.


https://docs.cloud.google.com/datastore/docs/concepts/indexes#unindexed_properties

The migration to index the field for existing fuzzers just requires
rewriting the Fuzzers in the datastore without actually making any
changes. Then the index will be created under the hood.
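
A minimal sketch of the re-save idea, assuming only that each entity exposes a `put()` method (as ndb models do). `resave_entities` is a hypothetical helper name; a real migration would pass in the result of a query over the fuzzer entities.

```python
def resave_entities(entities):
  """Write each entity back unchanged so its index entries are regenerated.

  `entities` is any iterable of objects with a `put()` method. No field is
  modified; the write itself is what makes the datastore index the property.
  """
  count = 0
  for entity in entities:
    entity.put()
    count += 1
  return count
```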

Testing
I ran this in dev and verified I can filter by `builtin == False`
…5265)

Adds cron job to aggregate fuzzer stats into a daily bigquery table
`fuzzer_stats.daily_stats`.

## Context

We will use this to benchmark our blackbox fuzzers. Previously, we
couldn't easily join the fuzzing hours from BigQuery with the bugs filed
by ClusterFuzz in our dashboards. We need a separate aggregated table
because the `fuzzer_stats` `JobRun` tables are all in separate datasets,
one per fuzzer, and we can't simply query across all of those datasets
in BigQuery or Plx.

The cron job defaults to yesterday's stats so we can run it after the
stats are loaded into BigQuery, but it takes a date flag so we can
backfill days as necessary.
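
The default-to-yesterday behavior with an optional override could look roughly like this. The `--date` flag name, the `YYYY-MM-DD` format, and the use of the local date are assumptions for illustration; the source does not specify them.

```python
import argparse
import datetime


def parse_date(value):
  """Parse a YYYY-MM-DD string into a date (assumed flag format)."""
  return datetime.datetime.strptime(value, '%Y-%m-%d').date()


def get_target_date(argv=None, today=None):
  """Return the day to aggregate: the flag value if given, else yesterday."""
  parser = argparse.ArgumentParser(description='Aggregate daily fuzzer stats.')
  parser.add_argument(
      '--date',  # Hypothetical flag name.
      type=parse_date,
      default=None,
      help='Day to aggregate (YYYY-MM-DD). Defaults to yesterday.')
  args = parser.parse_args(argv)
  if args.date is not None:
    return args.date
  today = today or datetime.date.today()
  return today - datetime.timedelta(days=1)
```

For a backfill, the job would be invoked once per missing day with an explicit date.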

### Idempotency
Whenever a date is inserted, the job writes with `WRITE_TRUNCATE`
against a date partition, which overwrites all of the rows for that
date. So if the job runs multiple times for the same day, it will not
add additional rows but will overwrite any previous rows for that date.

This simplifies edge cases where the job fails or runs multiple times.
We can just make sure the last run of the job succeeds and the data will
be correct. It will just pull in the latest data on the JobRun tables
for the fuzzers.
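
To make the idempotency argument concrete, here is a small in-memory model of truncate-and-write on a date partition. It does not call BigQuery; `PartitionedTable` is a hypothetical stand-in that only mimics the overwrite semantics described above.

```python
class PartitionedTable:
  """In-memory stand-in for a date-partitioned table."""

  def __init__(self):
    self._partitions = {}

  def write_truncate(self, date, rows):
    """Replace every row in the partition for `date` with `rows`."""
    self._partitions[date] = list(rows)

  def rows(self):
    """Return all rows, ordered by partition date."""
    return [
        row for date in sorted(self._partitions)
        for row in self._partitions[date]
    ]


def run_job(table, date, source_rows):
  """Model one run of the aggregation job for a single day."""
  table.write_truncate(date, source_rows)
```

Running `run_job` twice for the same date leaves one copy of that day's rows, and a later run with fresher source data simply replaces the earlier result.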


#### Example query:
```
select fuzzer_name,
SUM(fuzzing_duration) as fuzzing_duration,
SUM(testcases_executed) as testcases_executed,
from `your-project.fuzzer_stats.daily_stats`
group by fuzzer_name
order by fuzzing_duration desc
limit 1000;
```

The remaining work here is to set up the cron job configuration. This PR
only adds the logic for the job.
[crbug.com/501066151](https://crbug.com/501066151)

### Related PRs:
These migrate the BigQuery and Datastore schemas to support the new
fields:
#5264
#5263


### Testing
Ran this against the dev data and verified that the fuzzer stats
bigquery table is populated.
Logs from dev: https://paste.googleplex.com/4884361662038016

After the job inserted the aggregated rows into BigQuery, I was able to
compare the aggregated testcase stats and fuzzing hours between fuzzers
for a given date range.
@PauloVLB PauloVLB closed this May 11, 2026
5 participants