I'm running ci-queue in a somewhat unusual setup. I have a Ruby on Rails app with a long test suite, and I run 8 concurrent instances of RSpec on Heroku CI, all on a single dyno that also runs Redis in-dyno.
I start the whole chain with the following script:
export CI_QUEUE_URL=$([ -z "$REDIS_URL" ] && echo "redis://127.0.0.1:6379" || echo "$REDIS_URL")
for num in $(seq 1 $PARALLEL_COUNT); do
RUBYOPT="-W0" \
CI_NODE_INDEX=$(expr $num - 1) \
DATABASE_URL=$([ "$num" -ne "1" ] && echo $DATABASE_URL$num || echo $DATABASE_URL) \
rspec-queue \
--timeout 180 \
--max-consecutive-failures 10 \
--max-requeues 5 \
--requeue-tolerance 5 $@ &
done
wait
rspec-queue --report
Every now and then when this runs on Heroku CI, 7 of the 8 workers finish at the same time, but the 8th seems to keep running until Heroku kills the build at the 2-hour mark. The build usually passes in about 10-15 minutes. It's as if that worker doesn't understand that the queue is done.
Randomized with seed 57482
.
Finished in 9 minutes 51 seconds (files took 10.22 seconds to load)
85 examples, 0 failures, 1 pending
Randomized with seed 15517
< THIS IS WHERE I EXPECT THE REPORT >
.......................................................................................................................-----> test command `bin/test` failed with signal: terminated
Although 8 instances are running, the word Finished shows up only 7 times in the log, so I assume what I see here is workers 6 and 7 finishing, with the dots that follow belonging to worker 8.
I have no idea how to debug this further, but I'd love to add more details if someone can point me in the right direction.
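Since the output of all 8 workers is interleaved in one log, one thing I could try is tagging each worker's output with its index, so it's clear which worker never prints Finished. This is only a minimal sketch: `tag_output` and `fake_worker` are hypothetical names I made up, and `fake_worker` stands in for the real rspec-queue invocation.

```shell
#!/usr/bin/env bash

# Hypothetical helper: run a command and prefix each line of its
# output (stdout and stderr) with a label.
tag_output() {
  local label="$1"
  shift
  # The `|| [ -n "$line" ]` part keeps a final line that has no trailing newline.
  "$@" 2>&1 | while IFS= read -r line || [ -n "$line" ]; do
    printf '[%s] %s\n' "$label" "$line"
  done
}

# Stand-in for a worker; the real script would call rspec-queue here.
fake_worker() {
  echo "Finished in 1 second"
}

for num in 1 2 3; do
  tag_output "worker $((num - 1))" fake_worker &
done
wait
```

One caveat: the progress dots are printed without a newline, so a tagged line of dots only shows up once that line is complete. The summary lines would still be labeled, which should be enough to see which worker is missing.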
A little more research
I've compared the output of a successful build and a failed build.
- A successful build had 8 workers finish with a total of 1216 examples, an average of 152 per worker.
- A failed build had 7 workers finish with a total of 1194 examples reported.
- This means that by the time they finished, the 8th worker had processed only 22 examples, which is remarkably few compared to the average of about 170 examples processed by each of the other 7. (This could potentially corroborate your theory.) However:
- After that, the 8th worker drew 130 dots, which I can only imagine refer to specs that had already passed in other workers.
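For reference, the arithmetic behind these figures can be reproduced with a few lines of shell. The variable names are my own, the averages use integer division, and the last figure assumes both builds ran the same total number of examples.

```shell
#!/usr/bin/env bash

# Totals taken from the two build logs compared above.
ok_total=1216
ok_workers=8
failed_total=1194
failed_workers=7

# Average per worker in the successful build (exact).
echo "successful build average: $((ok_total / ok_workers))"              # 152

# Average per finished worker in the failed build (170.57 truncated to 170).
echo "failed build average: $((failed_total / failed_workers))"          # 170

# Examples not accounted for by the 7 finished workers.
echo "left for the 8th worker: $((ok_total - failed_total))"             # 22
```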