TensorFlow Debugger V2

Official documentation
https://www.tensorflow.org/tensorboard/debugger_v2

The tfdbg CLI is now available for TensorFlow 2.0.

It no longer has a TUI/curses-like interface; instead, it is a CLI whose output you load into TensorBoard, which serves as its GUI.

How to use

This shell script is an example of how to interact with the debugger.

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/debug/examples/v2/examples_v2_test.sh

The above script shows the usage of the --dump_dir flag.

You can then view the dumped data in TensorBoard. For example:

python -m tensorflow.python.debug.examples.v2.debug_mnist_v2 \
    --dump_dir /tmp/tfdbg2_logdir --dump_tensor_debug_mode FULL_HEALTH
tensorboard --logdir /tmp/tfdbg2_logdir
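
To produce the same kind of dump from your own TensorFlow 2 program rather than the bundled MNIST demo, one option is `tf.debugging.experimental.enable_dump_debug_info`. The following is a minimal sketch, not a definitive setup: the helper name `enable_tfdbg2` is hypothetical, the directory and mode simply mirror the command above, and `circular_buffer_size=-1` (keep every event) is an assumption about what you want.

```python
# Tensor debug modes documented for enable_dump_debug_info.
VALID_MODES = ("NO_TENSOR", "CURT_HEALTH", "CONCISE_HEALTH", "FULL_HEALTH", "SHAPE")


def enable_tfdbg2(dump_dir="/tmp/tfdbg2_logdir", mode="FULL_HEALTH"):
    """Hypothetical helper: turn on tfdbg V2 dumping for the current process."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown tensor debug mode: {mode!r}")
    # Imported lazily so the mode check above works without TensorFlow installed.
    import tensorflow as tf

    return tf.debugging.experimental.enable_dump_debug_info(
        dump_dir,
        tensor_debug_mode=mode,
        circular_buffer_size=-1,  # assumption: keep all events, not just the most recent
    )


# Call once, before building or running the model:
# enable_tfdbg2()
```

After the program has run, `tensorboard --logdir /tmp/tfdbg2_logdir` should show the data in the Debugger V2 dashboard.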

CLI args

python3.8 -m tensorflow.python.debug.examples.v2.debug_mnist_v2 --helpfull
Demo of the tfdbg curses CLI: Locating the source of bad numerical values with TF v2.

This demo contains a classical example of a neural network for the mnist
dataset, but modifications are made so that problematic numerical values (infs
and nans) appear in nodes of the graph during training.

flags:

absl.app:
  -?,--[no]help: show this help
    (default: 'false')
  --[no]helpfull: show full help
    (default: 'false')
  --[no]helpshort: show this help
    (default: 'false')
  --[no]helpxml: like --helpfull, but generates XML output
    (default: 'false')
  --[no]only_check_args: Set to true to validate args and exit.
    (default: 'false')
  --[no]pdb: Alias for --pdb_post_mortem.
    (default: 'false')
  --[no]pdb_post_mortem: Set to true to handle uncaught exceptions with PDB post
    mortem.
    (default: 'false')
  --profile_file: Dump profile information to a file (for python -m pstats).
    Implies --run_with_profiling.
  --[no]run_with_pdb: Set to true for PDB debug mode
    (default: 'false')
  --[no]run_with_profiling: Set to true for profiling the script. Execution will
    be slower, and the output format might change over time.
    (default: 'false')
  --[no]use_cprofile_for_profiling: Use cProfile instead of the profile module
    for profiling. This has no effect unless --run_with_profiling is set.
    (default: 'true')

absl.logging:
  --[no]alsologtostderr: also log to stderr?
    (default: 'false')
  --log_dir: directory to write logfiles into
    (default: '')
  --logger_levels: Specify log level of loggers. The format is a CSV list of
    `name:level`. Where `name` is the logger name used with
    `logging.getLogger()`, and `level` is a level name  (INFO, DEBUG, etc). e.g.
    `myapp.foo:INFO,other.logger:DEBUG`
    (default: '')
  --[no]logtostderr: Should only log to stderr?
    (default: 'false')
  --[no]showprefixforinfo: If False, do not prepend prefix to info messages when
    it's logged to stderr, --verbosity is set to INFO level, and python logging
    is used.
    (default: 'true')
  --stderrthreshold: log messages at this level, or more severe, to stderr in
    addition to the logfile.  Possible values are 'debug', 'info', 'warning',
    'error', and 'fatal'.  Obsoletes --alsologtostderr. Using --alsologtostderr
    cancels the effect of this flag. Please also note that this flag is subject
    to --verbosity and requires logfile not be stderr.
    (default: 'fatal')
  -v,--verbosity: Logging verbosity level. Messages logged at this level or
    lower will be included. Set to 1 for debug logging. If the flag was not set
    or supplied, the value will be changed from the default of -1 (warning) to 0
    (info) after flags are parsed.
    (default: '-1')
    (an integer)

absl.testing.absltest:
  --test_random_seed: Random seed for testing. Some test frameworks may change
    the default value of this flag between runs, so it is not appropriate for
    seeding probabilistic tests.
    (default: '301')
    (an integer)
  --test_randomize_ordering_seed: If positive, use this as a seed to randomize
    the execution order for test cases. If "random", pick a random seed to use.
    If 0 or not set, do not randomize test case execution order. This flag also
    overrides the TEST_RANDOMIZE_ORDERING_SEED environment variable.
    (default: '')
  --test_srcdir: Root of directory tree where source files live
    (default: '')
  --test_tmpdir: Directory for temporary testing files
    (default: '/tmp/absl_testing')
  --xml_output_file: File to store XML test results
    (default: '')

tensorflow.python.ops.parallel_for.pfor:
  --[no]op_conversion_fallback_to_while_loop: DEPRECATED: Flag is ignored.
    (default: 'true')

tensorflow.python.tpu.client.client:
  --[no]runtime_oom_exit: Exit the script when the TPU runtime is OOM.
    (default: 'true')

absl.flags:
  --flagfile: Insert flag definitions from the given file into the command line.
    (default: '')
  --undefok: comma-separated list of flag names that it is okay to specify on
    the command line even if the program does not define a flag with that name.
    IMPORTANT: flags in this list that have arguments MUST use the --flag=value
    format.
    (default: '')

TensorFlow Debugger V1

Relevant source material
TensorFlow Debugger Screencast - YouTube
DebugTFBasics
https://github.com/tensorflow/tensorflow/issues/29679

V1 of tfdbg had a TUI (terminal user interface).

https://www.w3cschool.cn/doc_tensorflow_guide/tensorflow_guide-programmers_guide-debugger.html

Wrapping TensorFlow Sessions With tfdbg

To use tfdbg, import the debug module as shown below, then wrap the Session object in a debugger wrapper.

from tensorflow.python import debug as tf_debug
  • The wrapper brings up the CLI before and after each Session.run() call, so you can take control of the execution and inspect the internal state of the graph.
  • Filters can be added to assist the diagnosis.
    The provided example already registers a filter called tf_debug.has_inf_or_nan, which detects nan or inf in any intermediate tensors, that is, tensors that are neither inputs nor outputs.
  • You may write your own custom filters.
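
The points above can be sketched roughly as follows, assuming a TensorFlow 1.x installation. `LocalCLIDebugWrapperSession`, `add_tensor_filter` and `has_inf_or_nan` come from the TF 1.x debug module; `has_large_values` and `wrap_session` are hypothetical names introduced only for illustration, and the threshold is arbitrary.

```python
def has_large_values(datum, tensor, threshold=1e6):
    """Hypothetical custom filter: True if any element exceeds `threshold` in magnitude.

    tfdbg calls a filter as filter(datum, tensor); `tensor` is normally a
    numpy.ndarray, so it is flattened to a plain Python list here.
    """
    try:
        values = tensor.ravel().tolist()
    except AttributeError:
        return False  # not array-like (e.g. an uninitialized tensor)
    return any(isinstance(v, (int, float)) and abs(v) > threshold for v in values)


def wrap_session(sess):
    """Wrap a TF 1.x Session so the tfdbg CLI appears around each run() call."""
    # Imported lazily: this module only exists in a TensorFlow installation.
    from tensorflow.python import debug as tf_debug

    sess = tf_debug.LocalCLIDebugWrapperSession(sess)
    # Built-in filter for NaNs and Infs in intermediate tensors.
    sess.add_tensor_filter("has_inf_or_nan", tf_debug.has_inf_or_nan)
    # Custom filter defined above.
    sess.add_tensor_filter("has_large_values", has_large_values)
    return sess
```

Inside the tfdbg CLI, a registered filter can then be used with a command such as `run -f has_inf_or_nan`, which runs until a tensor matches the filter.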

Debugging TensorFlow Model Training with tfdbg

This only works for TensorFlow 1 at the moment.

Typical use-cases

  • finding NaNs
  • finding Infs (infinities)

ewwlinks +/"4. Debugging TensorFlow Model Training with tfdbg" "https://data-flair.training/blogs/tensorflow-debugging/"

How to run tfdbg.

python3.7 -m tensorflow.python.debug.examples.debug_mnist --debug

As you pointed out, tfdbg is originally designed for TF v1.x and is centered around the tf.Session API.

As tf.Session is replaced by eager execution and tf.functions in 2.0, the debugger feature needs to be adapted to the architectural change.

If you have a specific use case (e.g., finding Infinities and NaNs, or other debugging workflows), please let us know.

It’ll inform the design of the new version of tfdbg.

Automating tfdbg with emacs

I had planned on building an emacs mode with syntax highlighting and keyboard macros to improve the tfdbg experience, but V2 now uses TensorBoard as its user interface, so there is no longer much need for this.