DBCollect/Overview: Difference between revisions

From Dirty Cache Wiki
Jump to navigation Jump to search
No edit summary
Line 12: Line 12:


[https://github.com/outrunnl/dbcollect DBcollect on Github]
[https://github.com/outrunnl/dbcollect DBcollect on Github]
== How does it work ==
Admin runs DBCollect on a database host
Can be a VM or bare metal host
Should be run on all database nodes in scope of the project
Supported are Linux (EL6 or higher, runs out of the box), AIX, HP-UX or Solaris (may require some steps to make it work)
DBCollect detects running Oracle databases (instances)
Generates capacity and performance reports
Also picks up OS config and performance data (SAR/Sysstat)
Output is created as a ZIP file in TEMP dir
/tmp/dbcollect-<hostname>.zip
Customer sends ZIP files for further processing
Secure FTP, or direct upload to LoadMaster website (via provided secure UUID url)
Additional info to be provided, such as
Licensing details
Grouping (which systems are prod, test/dev, DR etc), VM-to-Hypervisor mapping
Any current issues (performance, other)
Physical location
SLA requirements (RPO/RTO, yearly growth, platform requirements/preferences i.e. hypervisors, storage/servers, etc)


= Why DBcollect =
= Why DBcollect =

Revision as of 11:21, 10 October 2024


Overview

What is DBCollect

DBCollect is a performance and metadata collection tool for servers running Oracle databases, providing workload and config data from database hosts It is NOT (yet?) an official DELL tool - but written by Bart Sjerps as an independent Open-Source project

It collects database and host information from Oracle database servers in a standardized way, so it can easily be processed by workload analysis tools. It supports Enterprise Linux (RHEL, OEL, CentOS) as well as AIX and Solaris (SPARC).

DBcollect on Github

How does it work

Admin runs DBCollect on a database host Can be a VM or bare metal host Should be run on all database nodes in scope of the project Supported are Linux (EL6 or higher, runs out of the box), AIX, HP-UX or Solaris (may require some steps to make it work) DBCollect detects running Oracle databases (instances) Generates capacity and performance reports Also picks up OS config and performance data (SAR/Sysstat) Output is created as a ZIP file in TEMP dir /tmp/dbcollect-<hostname>.zip Customer sends ZIP files for further processing Secure FTP, or direct upload to LoadMaster website (via provided secure UUID url) Additional info to be provided, such as Licensing details Grouping (which systems are prod, test/dev, DR etc), VM-to-Hypervisor mapping Any current issues (performance, other) Physical location SLA requirements (RPO/RTO, yearly growth, platform requirements/preferences i.e. hypervisors, storage/servers, etc)

Why DBcollect

In the past, I frequently got workload data from customers in all kinds of strange formats (unusable AWR formats, reports in text format, with far too large AWR intervals, etc). Unix info was usually either not available at all or a random dump of output of some commands. By standardizing the reporting in a single tool, we get consistent, reliable database and host information.

This is needed for creating workload assessments where the data must be machine readable in a standardized format.

In most cases it automatically detects all running Oracle instances on the host.

DBcollect architecture

DBcollect is written in Python. It supports Python 2 and 3 (mainly because a lot of legacy systems only have Python 2 installed). It also contains a few SQL scripts that are executed by Oracle SQL*Plus. It is distributed as a ZipApp package. This means it is not binary compiled, but all Python and SQL files are bundled in a ZIP file in such a way that it can be executed as a single command but users can inspect the contents by simply running unzip:

unzip -v /usr/local/bin/dbcollect

Collected data

  • Host CPU, memory, network interface and disk info
  • Running processes, file systems, installed packages, os version and release etc.
  • Hardware vendor, product etc where possible
  • SAR reports
  • AWR or Statspack reports for the given period (default 10 days)
  • Additional database information (such as capacity of tablespaces and disks, settings, compression analysis, backup, ...)

For detailed info on what data is collected, most commands and files are in the config section: DBcollect config

Removing SQL code

DBcollect generates HTML formatted Oracle AWR reports. These reports usually contain snippets of SQL queries. These snippets may sometimes help us in finding the root cause of slow queries, but if it is prohibited to send SQL queries outside of the organization, the SQL sections can be completely removed from the DBcollect ZIP files:

dbcollect --strip

Note that this only works for HTML-formatted AWR reports, it does NOT work for Statspack.

Safety

DBcollect is designed to run on production systems and great effort has been spent on making it robust. The SECURITY section on the github page explains in great detail how this works. Note that dbcollect has an option to remove SQL code from AWR reports, in case this is condidered a security or confidentiality issue.

root user

DBcollect does NOT run as root (ever). If you execute it as root, it will re-launch itself as another user, usually 'oracle' or 'nobody' on systems without Oracle. You can force another user:

dbcollect --user oraprod

Example

# Install dbcollect via the quick install metohd
[root@db01 ~](-) # curl https://raw.githubusercontent.com/outrunnl/dbcollect/master/scripts/download | python
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1925  100  1925    0     0   4372      0 --:--:-- --:--:-- --:--:--  4375
Github dbcollect downloader - download the latest dbcollect release
-> Retrieving GitHub metadata from https://api.github.com/repos/outrunnl/dbcollect/releases/latest
-> Downloading binary version 1.6.0 from https://github.com/outrunnl/dbcollect/releases/download/v1.6.0/dbcollect
-> md5 hash: c943738c2f589758add20f1ef495423f
-> Saving binary as dbcollect
-> Setting permissions to mode 775
# Move dbcollect in /usr/local/bin ($PATH)
[root@db01 ~](-) # sudo mv dbcollect /usr/local/bin
# Test version
[root@db01 ~](-) # dbcollect -V
dbcollect 1.6.0 - collect Oracle AWR/Statspack, database and system info
Author:    Bart Sjerps <bart@outrun.nl>
Copyright: Copyright 2020, Bart Sjerps
License:   GPLv3+, https://www.gnu.org/licenses/gpl-3.0.html
Version:   1.6.0
Builddate: 2021-54-24 09:54

Verify dbcollect source code

# Verify dbcollect contents
# As dbcollect is actually a ZIP file, you could unzip it and verify the code
[root@db01 ~](-) # unzip -v /usr/local/bin/dbcollect
Archive:  /usr/local/bin/dbcollect
 Length   Method    Size  Cmpr    Date    Time   CRC-32   Name
--------  ------  ------- ---- ---------- ----- --------  ----
       0  Stored        0   0% 06-24-2021 09:53 00000000  modules/
       0  Stored        0   0% 06-24-2021 09:53 00000000  sql/
       0  Stored        0   0% 06-04-2021 14:38 00000000  lib/
    7890  Stored     7890   0% 06-24-2021 09:53 69c764c0  dbcollect.py
    2060  Stored     2060   0% 03-02-2021 08:38 808202b2  modules/updater.py
       0  Stored        0   0% 06-04-2021 14:38 00000000  modules/__init__.py
    4211  Stored     4211   0% 06-04-2021 14:38 94eb789d  modules/awrstrip.py
    7470  Stored     7470   0% 06-08-2021 11:19 ca6e6bd2  modules/syscollect.py
   18030  Stored    18030   0% 06-24-2021 09:53 c2123dc2  modules/oracle.py
       0  Stored        0   0% 09-29-2020 16:23 00000000  sql/__init__.py
    2883  Stored     2883   0% 12-09-2020 20:57 ff3541aa  sql/instance.sql
    5765  Stored     5765   0% 12-09-2020 20:57 11ea706c  sql/database.sql
    5726  Stored     5726   0% 12-09-2020 20:57 e7cb4bcb  sql/pdbinfo.sql
     580  Stored      580   0% 06-04-2021 14:38 d31abc0d  sql/awr_report.sql
     562  Stored      562   0% 06-04-2021 14:38 8b34f5af  sql/awrusage.sql
    1966  Stored     1966   0% 06-04-2021 14:38 ad771e51  sql/getawrs.sql
    1974  Stored     1974   0% 06-04-2021 14:38 434e6838  sql/getsps.sql
     580  Stored      580   0% 06-04-2021 14:38 95a440f6  sql/sp_report.sql
   22899  Stored    22899   0% 06-24-2021 09:53 0cce1b41  sql/dbinfo.sql
      87  Stored       87   0% 10-30-2020 20:54 967cb29d  lib/__init__.py
    1800  Stored     1800   0% 10-16-2020 16:33 086a8907  lib/user.py
    3228  Stored     3228   0% 06-04-2021 14:38 e1d507b6  lib/archive.py
    3101  Stored     3101   0% 06-04-2021 14:38 5487c955  lib/functions.py
    2471  Stored     2471   0% 06-04-2021 14:38 d351f5b2  lib/log.py
      58  Stored       58   0% 06-24-2021 09:54 8fed2ee0  __main__.py
--------          -------  ---                            -------
   93341            93341   0%                            25 files

Run dbcollect

# Run dbcollect (default 10 days of AWR, user 'oracle'
dbcollect 1.6.0 - collect Oracle AWR/Statspack, database and system info
INFO     : dbcollect 1.6.0 - database and system info collector
INFO     : Current user is oracle
INFO     : Zip file is /tmp/dbcollect-db01.lan.zip
INFO     : Collecting OS info (Linux)
INFO     : Collecting Linux SAR files
INFO     : Collecting Oracle info
INFO     : Found instance DEMO on /u01/app/oracle/product/12.1.0/dbhome_1
WARNING  : Skipped DEMO (DOWN)
INFO     : Found instance DEMO2 on /u01/app/oracle/product/12.1.0/dbhome_1
WARNING  : Skipped DEMO2 (DOWN)
INFO     : Found instance demo on /u01/app/oracle/product/12.1.0/dbhome_1
WARNING  : Skipped demo (DOWN)
INFO     : Found instance DB01 on /u01/app/oracle/product/12.1.0/dbhome_1
INFO     : DB01: AWR usage detected, generating reports 
INFO     : Generating Oracle AWR/Statspack reports, with 1 workers
INFO     : processing instance DB01
report 15 of 230 (6.5% done), elapsed: 0:00:34, remaining: 0:08:18
INFO     : Zip file /tmp/dbcollect-db01.lan.zip is created succesfully.
INFO     : Finished

Review collected data

# Inspect ZIP file contents
[root@db01 ~](-) # unzip -v /tmp/dbcollect-db01.lan.zip 
Archive:  /tmp/dbcollect-db01.lan.zip
dbcollect version=1.6.0 hostname=db01.lan
 Length   Method    Size  Cmpr    Date    Time   CRC-32   Name
--------  ------  ------- ---- ---------- ----- --------  ----
     541  Defl:N      272  50% 07-06-2021 13:07 0fea7e9c  db01.lan/meta.json
    1247  Defl:N      663  47% 07-06-2021 13:07 176503d7  db01.lan/cmd/lscpu
    1133  Defl:N      145  87% 07-06-2021 13:07 a248c133  db01.lan/cmd/lsscsi
      40  Defl:N       27  33% 07-06-2021 13:07 4cbb2c4b  db01.lan/cmd/sestatus
    2476  Defl:N      865  65% 07-06-2021 13:07 94874f95  db01.lan/cmd/lsmod
  139387  Defl:N    25763  82% 07-06-2021 13:07 7d28e51b  db01.lan/cmd/dmesg
   20691  Defl:N     3664  82% 07-06-2021 13:07 912b04bc  db01.lan/cmd/psef
   26229  Defl:N     4734  82% 07-06-2021 13:07 0bac6129  db01.lan/cmd/psfaux
    5050  Defl:N      889  82% 07-06-2021 13:07 aafb6b4a  db01.lan/cmd/lsblk
    1076  Defl:N      406  62% 07-06-2021 13:07 a73ecda5  db01.lan/cmd/dfpt
     326  Defl:N      196  40% 07-06-2021 13:07 85199d6c  db01.lan/cmd/iplink
     433  Defl:N      181  58% 07-06-2021 13:07 5e9c5f56  db01.lan/cmd/ipaddr
   45599  Defl:N    17256  62% 07-06-2021 13:07 1e259717  db01.lan/cmd/rpmqf
   28046  Defl:N     6136  78% 07-06-2021 13:07 7d437fa2  db01.lan/cmd/sysctla
    2020  Defl:N      648  68% 07-06-2021 13:07 b7ca76b3  db01.lan/proc/cpuinfo
    1310  Defl:N      521  60% 07-06-2021 13:07 128cc245  db01.lan/proc/meminfo
     378  Defl:N      151  60% 07-06-2021 13:07 83f093db  db01.lan/proc/filesystems
     979  Defl:N      296  70% 07-06-2021 13:07 562875d5  db01.lan/proc/partitions
     536  Defl:N      299  44% 07-06-2021 13:07 696196ae  db01.lan/proc/devices
    3142  Defl:N      764  76% 07-06-2021 13:07 32da3154  db01.lan/proc/mounts
      40  Defl:N       42  -5% 07-06-2021 13:07 f842b1fd  db01.lan/proc/mdstat
     215  Defl:N      157  27% 07-06-2021 13:07 2091f2f6  db01.lan/proc/misc
      22  Defl:N       24  -9% 07-06-2021 13:07 c1ca737d  db01.lan/proc/uptime
     393  Defl:N      226  43% 11-23-2020 16:08 358d6782  db01.lan/etc/os-release
      37  Defl:N       39  -5% 11-23-2020 16:08 36030dbe  db01.lan/etc/system-release
     103  Defl:N      102   1% 01-03-2021 15:34 02058559  db01.lan/etc/issue
      87  Defl:N       86   1% 01-03-2021 15:34 ad1f98c9  db01.lan/etc/motd
    1019  Defl:N      520  49% 05-31-2021 21:05 547a8978  db01.lan/etc/oratab
      13  Defl:N       15 -15% 06-20-2021 00:16 875d0c1f  db01.lan/sys/class/dmi/id/sys_vendor
      24  Defl:N       26  -8% 06-21-2021 11:59 a4480e0b  db01.lan/sys/class/dmi/id/product_name
      33  Defl:N       35  -6% 06-21-2021 11:59 fbe92f48  db01.lan/sys/class/dmi/id/board_name
      18  Defl:N       20 -11% 06-20-2021 00:16 5deaba97  db01.lan/sys/class/dmi/id/board_vendor
...
  605964  Defl:N    75815  88% 06-27-2021 23:50 fbb4552b  db01.lan/var/log/sa/sa27
  605964  Defl:N    76052  87% 06-28-2021 23:50 5704a62d  db01.lan/var/log/sa/sa28
  605964  Defl:N    76113  87% 06-29-2021 23:50 8267a285  db01.lan/var/log/sa/sa29
  605964  Defl:N    75803  88% 06-30-2021 23:50 d7478b20  db01.lan/var/log/sa/sa30
  574442  Defl:N    54492  91% 07-06-2021 13:07 1f290b08  db01.lan/oracle/DB01/DB01_awrrpt_1_24717_24718_20210627_0000.html
  627738  Defl:N    64554  90% 07-06-2021 13:07 554dcc81  db01.lan/oracle/DB01/DB01_awrrpt_1_24718_24719_20210627_0100.html
  583958  Defl:N    57024  90% 07-06-2021 13:07 3c7b8d7e  db01.lan/oracle/DB01/DB01_awrrpt_1_24719_24720_20210627_0201.html
...
   24290  Defl:N     4504  82% 07-06-2021 13:07 49383845  db01.lan/oracle/DB01/dbinfo.txt
    4011  Defl:N      636  84% 07-06-2021 13:07 c0304c39  db01.lan/oracle/DB01/pdbinfo.txt
  127795  Defl:N     3911  97% 07-06-2021 13:07 551eaccc  db01.lan/oracle/DB01/reports.sql
   41510  Defl:N     3589  91% 07-06-2021 13:15 b742ecf9  db01.lan/dbcollect.log
--------          -------  ---                            -------
157350796         16088556  90%                            403 files