@@ -447,9 +447,16 @@ These runtime errors are handled in the following manner:
447447* ``application_runtime_error`` - a ``tensorflow.errors.InternalError`` error
448448 is raised. The error message contains the reason why the error occurred. An
449449 IPU reset will be performed before the next execution of a Poplar program.
450- * ``recoverable_runtime_error`` with a recovery action ``poplar::RecoveryAction::IPU_RESET`` - a ``tensorflow.errors.InternalError`` error
451- is raised. The error message contains the reason why the error occurred. An
452- IPU reset will be performed before the next execution of a Poplar program.
450+ * ``recoverable_runtime_error`` - a ``tensorflow.errors.InternalError`` error
451+ is raised. The error message contains the reason why the error occurred
452+ and a `recovery_action` string attribute.
453+ This attribute can contain:
454+
455+ - `IPU_RESET`: An IPU reset will be performed before the next execution of a Poplar program.
456+ - `LINK_RESET`: Reset the IPU-Links in a non-Pod system. This retrains the IPU-Links between IPUs.
457+ - `PARTITION_RESET`: Reset the IPU partition in a Pod system. This retrains the IPU-Links between IPUs.
458+ - `FULL_RESET`: Power cycle the system.
459+
453460* Unknown runtime errors - a ``tensorflow.errors.Unknown`` error
454461 is raised. The error message might contain the reason why the error occurred.
455462 When these errors occur manual intervention is required before the system is
@@ -459,3 +466,5 @@ These runtime errors are handled in the following manner:
459466 When these errors occur manual intervention might be required before the
460467 system is operational again. The error message might contain a required
461468 recovery action.
469+
470+
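The `recovery_action` values listed in this change could be handled by calling code roughly as follows. This is a minimal sketch, not the library's API: it assumes the attribute is readable on the raised exception object, which the text does not confirm. `describe_recovery`, `RECOVERY_STEPS` and `StandInInternalError` are hypothetical names, and the stand-in class replaces `tensorflow.errors.InternalError` only so that the sketch runs without TensorFlow installed.

```python
# Map each documented recovery_action value to the step described in the docs.
RECOVERY_STEPS = {
    "IPU_RESET": "IPU reset is performed before the next Poplar program runs.",
    "LINK_RESET": "Reset the IPU-Links (non-Pod system).",
    "PARTITION_RESET": "Reset the IPU partition (Pod system).",
    "FULL_RESET": "Power cycle the system.",
}


def describe_recovery(error):
    """Return a human-readable recovery step for a raised error.

    Assumption: the error exposes a `recovery_action` string attribute.
    """
    action = getattr(error, "recovery_action", None)
    if action is None:
        return "No recovery_action attribute; check the error message."
    return RECOVERY_STEPS.get(
        action, f"Unrecognised recovery action {action!r}; check the error message."
    )


class StandInInternalError(Exception):
    """Hypothetical stand-in for tensorflow.errors.InternalError."""

    def __init__(self, message, recovery_action=None):
        super().__init__(message)
        if recovery_action is not None:
            self.recovery_action = recovery_action


if __name__ == "__main__":
    try:
        raise StandInInternalError("link training failed", "LINK_RESET")
    except StandInInternalError as err:
        print(describe_recovery(err))
```

In real code the `except` clause would catch `tensorflow.errors.InternalError` around the session or function call that executes the Poplar program. Falling back to the error message for missing or unrecognised values keeps the handler safe if further actions are added later.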