DBCollect Troubleshooting
Tips
- Debug
- If something fails, run dbcollect with the -D (debug) option. This shows the content of exception debug messages and prints the full dbcollect.log logfile when finished.
- Logfile
- dbcollect writes log messages to /tmp/dbcollect.log. This file is moved into the dbcollect ZIP file when dbcollect finishes. View the contents of /tmp/dbcollect.log, or dump it from the ZIP file:
  unzip -qc /tmp/dbcollect-hostname.zip hostname/dbcollect.log
- Sending failed results
- If something goes wrong, you can send the ZIP file anyway. This allows me to inspect the debug messages and find out what went wrong.
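If unzip is not available, the log can also be read from the ZIP file with Python's standard zipfile module. This is a minimal sketch, assuming the archive layout shown in the Logfile tip above (hostname/dbcollect.log inside the ZIP); the function name read_dbcollect_log is hypothetical and not part of dbcollect.

```python
import zipfile


def read_dbcollect_log(zip_path, hostname):
    """Return the text of <hostname>/dbcollect.log stored inside the ZIP.

    Hypothetical helper: assumes the log sits at hostname/dbcollect.log,
    as in the unzip example above.
    """
    member = f"{hostname}/dbcollect.log"
    with zipfile.ZipFile(zip_path) as archive:
        # archive.read raises KeyError if the member is missing
        return archive.read(member).decode("utf-8", errors="replace")
```

For example, print(read_dbcollect_log("/tmp/dbcollect-hostname.zip", "hostname")) would print the same content as the unzip -qc command.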
Problems with executing DBCollect
dbcollect itself will not run for some reason.
DBCollect error and warning messages
dbcollect runs but spits out error or warning messages.
Almost all error and warning messages have a message ID. For errors, this is E followed by a 3-digit number; for warnings, it is W followed by a 3-digit number. The list of messages can be found here:
List of error and warning messages
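To see which message IDs to look up, the log text can be scanned for the E/W pattern described above. This is a rough sketch: the exact layout of log lines is not documented here, so it simply assumes the IDs appear as standalone tokens (E or W followed by exactly three digits). The function name count_message_ids is made up for illustration, and the pattern may also match unrelated tokens that happen to look the same.

```python
import re
from collections import Counter

# E or W followed by exactly three digits, as a standalone token
MESSAGE_ID = re.compile(r"\b([EW])(\d{3})\b")


def count_message_ids(log_text):
    """Count how often each E###/W### message ID occurs in the log text."""
    counts = Counter()
    for kind, number in MESSAGE_ID.findall(log_text):
        counts[kind + number] += 1
    return counts
```

The resulting counter can then be checked against the list of error and warning messages.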
DBCollect runs for a very long time
This is most likely due to performance issues in Oracle with the generation of AWR reports, especially if you run dbcollect on Oracle RAC. A number of known issues make report generation very slow. I have observed up to 20 seconds per AWR report (usually about 0.5 seconds), which makes dbcollect run for more than an hour on a typical single-database cluster with 10 days of data at a 1-hour interval. Check these Support notes for fixes and workarounds: 2404906.1, 2565465.1, 2318124.1, 29932310.8, 2148489.1, 29470291.8. Or be very patient.
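The numbers above can be checked with a back-of-the-envelope calculation. This sketch assumes one AWR report per snapshot interval per instance, generated one after another; the function name and the instances parameter are illustrative only, and real runs will differ.

```python
def awr_runtime_seconds(days, interval_hours, seconds_per_report, instances=1):
    """Rough sequential runtime: one report per interval, per instance."""
    reports_per_instance = int(days * 24 / interval_hours)
    return reports_per_instance * instances * seconds_per_report


# 10 days at a 1-hour interval gives 240 reports per instance
slow = awr_runtime_seconds(10, 1, 20.0)  # 4800.0 seconds, i.e. 80 minutes
fast = awr_runtime_seconds(10, 1, 0.5)   # 120.0 seconds, i.e. 2 minutes
```

With more than one RAC instance the estimate scales accordingly, for example awr_runtime_seconds(10, 1, 20.0, instances=2) gives 9600.0 seconds.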
Update: As of version 1.11, dbcollect runs multiple AWR reports in parallel for each instance, which makes it much faster. By default it uses 50% of the CPUs, with a maximum of 8.
To use all available CPUs, use --tasks 0 (be careful: CPU load will likely go to 100%).
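The default described above can be sketched as follows. This is not dbcollect's actual code: the rounding of "50% of CPUs" (integer division here) and the floor of one task are assumptions, and default_tasks is a hypothetical name.

```python
def default_tasks(cpu_count, requested=None):
    """Illustrate how many parallel tasks would be used.

    requested=None : default, half the CPUs capped at 8 (rounding assumed)
    requested=0    : use all CPUs, as with --tasks 0
    requested=N    : use exactly N tasks
    """
    if requested == 0:
        return cpu_count
    if requested is not None:
        return requested
    # Assumption: round down, but never go below one task
    return max(1, min(8, cpu_count // 2))
```

Under these assumptions, a 4-CPU host would run 2 reports in parallel by default, and a 32-CPU host would be capped at 8 unless --tasks 0 is given.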