DBCollect/Overview: Difference between revisions
No edit summary |
No edit summary |
||
Line 1: | Line 1: | ||
[[Category:DBCollect]] | [[Category:DBCollect]] | ||
{{PrevNext|DBC:Errors/DBC-E002|DBCollect/Considerations}} | {{PrevNext|DBC:Errors/DBC-E002|DBCollect/Considerations}}__NOTOC__ | ||
__NOTOC__ | |||
= Overview = | = Overview = | ||
Revision as of 15:33, 10 October 2024
Overview
What is DBCollect
DBCollect is a performance and metadata collection tool for servers running Oracle databases, providing workload and config data from database hosts
It is NOT (yet?) an official DELL tool - but written by Bart Sjerps as an independent Open-Source project
It collects database and host information from Oracle database servers in a standardized way, so it can easily be processed by workload analysis tools. It supports Enterprise Linux (RHEL, OEL, CentOS) as well as AIX and Solaris (SPARC).
How does it work
- System or Database Administrator runs DBCollect on a database host
- Can be a VM or bare metal host
- Should be run on all database nodes in scope of the project
- Supported are Linux (EL6 or higher, runs out of the box), AIX, HP-UX or Solaris (may require some steps to make it work)
- DBCollect detects running Oracle databases (instances)
- Generates capacity and performance reports
- Also picks up OS config and performance data (SAR/Sysstat)
- Output is created as a ZIP file in TEMP dir
- /tmp/dbcollect-<hostname>.zip
- Customer sends ZIP files for further processing
- Secure FTP, or direct upload to LoadMaster website (via provided secure UUID url)
- Additional info to be provided, such as
- Licensing details
- Grouping (which systems are prod, test/dev, DR etc), VM-to-Hypervisor mapping
- Any current issues (performance, other)
- Physical locations
- SLA requirements (RPO/RTO, yearly growth, platform requirements/preferences i.e. hypervisors, storage/servers, etc)
Why DBcollect
In the past, I frequently got workload data from customers in all kinds of strange formats (unusable AWR formats, reports in text format, with far too large AWR intervals, etc). Unix info was usually either not available at all or a random dump of output of some commands. By standardizing the reporting in a single tool, we get consistent, reliable database and host information.
This is needed for creating workload assessments where the data must be machine readable in a standardized format.
In most cases it automatically detects all running Oracle instances on the host.
Problem
- Customers find all kinds of creative ways to provide system data
- Excel sheets, Word docs, strange RAR or TAR files, AWR reports with insanely long or short intervals (6 weeks ) or weird language settings, etc.
- Not easily processed by automated tools (Try to load Excel into a database table)
- Still missing capacity and system data (how big is the database? What processor is being used? Compression? Encryption? Backups?)
- Usually only for one or few databases (customer may have hundreds)
- Often unreliable (outdated, non-deterministic metrics, and so on)
Solution
- Provide a standard way of getting OS and Database performance/config data
- Oracle AWR, Linux/UNIX SAR, Database configuration, OS configuration
- Retrieves historic data (which is already stored on the OS and in the databases)
- All (running) Oracle instances, at least 10 days (Oracle) or 1 month (OS) performance
- Be as secure and safe as possible
- Cannot change anything on the databases
- Only reads/writes files in the /tmp directory
- Does not pick up security sensitive data (passwords, etc)
- Honor customer licenses (in particular, Diagnostics Pack is required to generate AWRs)
- Make it as easy as possible for the administrator to run it
- Tool consists of one file, requires running one command only, simple installation
- Generates only one output (ZIP) file in the /tmp directory
- Open-Source (GitHub, GPLv3 licensed)
- Users can review the code (Python, SQL*Plus scripts and commands)
- Users are encouraged to help testing, writing improvements (via Git pull requests or just by email)
DBcollect architecture
DBcollect is written in Python. It supports Python 2 and 3 (mainly because a lot of legacy systems only have Python 2 installed). It also contains a few SQL scripts that are executed by Oracle SQL*Plus. It is distributed as a ZipApp package. This means it is not binary compiled, but all Python and SQL files are bundled in a ZIP file in such a way that it can be executed as a single command but users can inspect the contents by simply running unzip:
unzip -v /usr/local/bin/dbcollect
Collected data
- Host CPU, memory, network interface and disk info
- Running processes, file systems, installed packages, os version and release etc.
- Hardware vendor, product etc where possible
- SAR reports
- AWR or Statspack reports for the given period (default 10 days)
- Additional database information (such as capacity of tablespaces and disks, settings, compression analysis, backup, ...)
For detailed info on what data is collected, most commands and files are in the config section: DBcollect config
Removing SQL code
DBcollect generates HTML formatted Oracle AWR reports. These reports usually contain snippets of SQL queries. These snippets may sometimes help us in finding the root cause of slow queries, but if it is prohibited to send SQL queries outside of the organization, the SQL sections can be completely removed from the DBcollect ZIP files:
dbcollect --strip
Note that this only works for HTML-formatted AWR reports, it does NOT work for Statspack.
Safety
DBcollect is designed to run on production systems and great effort has been spent on making it robust. The SECURITY section on the github page explains in great detail how this works. Note that dbcollect has an option to remove SQL code from AWR reports, in case this is condidered a security or confidentiality issue.
root user
DBcollect does NOT run as root (ever). If you execute it as root, it will re-launch itself as another user, usually 'oracle' or 'nobody' on systems without Oracle. You can force another user:
dbcollect --user oraprod
Example
# Install dbcollect via the quick install metohd
curl https://raw.githubusercontent.com/outrunnl/dbcollect/master/scripts/download | python
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 2154 100 2154 0 0 4018 0 --:--:-- --:--:-- --:--:-- 4011
Github dbcollect downloader - download the latest dbcollect release
-> Retrieving GitHub metadata from https://api.github.com/repos/outrunnl/dbcollect/releases/latest
-> Downloading binary version 1.16.3 from https://github.com/outrunnl/dbcollect/releases/download/v1.16.3/dbcollect
-> md5 hash: 6d080140269bf9fe14039cfbfbd6806e
-> Saving binary as /usr/local/bin/dbcollect
-> Setting permissions to mode 755
# Test version
[root@db08 ~](-) # dbcollect --version
dbcollect 1.16.3 - collect Oracle AWR/Statspack, database and system info
Author: Bart Sjerps <info@dirty-cache.com>
Copyright: Copyright 2023, Bart Sjerps
License: GPLv3+, https://www.gnu.org/licenses/gpl-3.0.html
Version: 1.16.3
Builddate: 2024-07-01 15:04
Buildhash: 1e08f98a7ca3b546d166c964300be221f952c7a0
Verify dbcollect source code
# Verify dbcollect contents
# As dbcollect is actually a ZIP file, you could unzip it and verify the code
unzip -v /usr/local/bin/dbcollect
Archive: /usr/local/bin/dbcollect
Length Method Size Cmpr Date Time CRC-32 Name
-------- ------ ------- ---- ---------- ----- -------- ----
0 Stored 0 0% 07-01-2024 15:01 00000000 lib/
0 Stored 0 0% 07-01-2024 15:01 00000000 modules/
0 Stored 0 0% 07-01-2024 15:01 00000000 sql/
7816 Stored 7816 0% 06-10-2024 13:06 267803f6 dbcollect.py
0 Stored 0 0% 10-31-2023 19:14 00000000 lib/__init__.py
257 Stored 257 0% 07-01-2024 15:04 34d07d31 lib/buildinfo.py
2767 Stored 2767 0% 05-30-2024 15:33 d95147e0 lib/user.py
1984 Stored 1984 0% 05-30-2024 15:34 5272c204 lib/archive.py
2322 Stored 2322 0% 05-30-2024 15:34 fdd6efa2 lib/functions.py
--- snip ---
209 Stored 209 0% 02-05-2024 08:38 446743e2 sql/splunk/readme.txt
79 Stored 79 0% 02-05-2024 08:38 3b1af7bd sql/splunk/splunk_header.sql
58 Stored 58 0% 07-01-2024 15:04 8fed2ee0 __main__.py
-------- ------- --- -------
166163 166163 0% 81 files
Run dbcollect
# Run dbcollect (default 10 days of AWR, user 'oracle'
dbcollect 1.6.0 - collect Oracle AWR/Statspack, database and system info
INFO : dbcollect 1.6.0 - database and system info collector
INFO : Current user is oracle
INFO : Zip file is /tmp/dbcollect-db01.lan.zip
INFO : Collecting OS info (Linux)
INFO : Collecting Linux SAR files
INFO : Collecting Oracle info
INFO : Found instance DEMO on /u01/app/oracle/product/12.1.0/dbhome_1
WARNING : Skipped DEMO (DOWN)
INFO : Found instance DEMO2 on /u01/app/oracle/product/12.1.0/dbhome_1
WARNING : Skipped DEMO2 (DOWN)
INFO : Found instance demo on /u01/app/oracle/product/12.1.0/dbhome_1
WARNING : Skipped demo (DOWN)
INFO : Found instance DB01 on /u01/app/oracle/product/12.1.0/dbhome_1
INFO : DB01: AWR usage detected, generating reports
INFO : Generating Oracle AWR/Statspack reports, with 1 workers
INFO : processing instance DB01
report 15 of 230 (6.5% done), elapsed: 0:00:34, remaining: 0:08:18
INFO : Zip file /tmp/dbcollect-db01.lan.zip is created succesfully.
INFO : Finished
Review collected data
# Inspect ZIP file contents
[root@db01 ~](-) # unzip -v /tmp/dbcollect-db01.lan.zip
Archive: /tmp/dbcollect-db01.lan.zip
dbcollect version=1.6.0 hostname=db01.lan
Length Method Size Cmpr Date Time CRC-32 Name
-------- ------ ------- ---- ---------- ----- -------- ----
541 Defl:N 272 50% 07-06-2021 13:07 0fea7e9c db01.lan/meta.json
1247 Defl:N 663 47% 07-06-2021 13:07 176503d7 db01.lan/cmd/lscpu
1133 Defl:N 145 87% 07-06-2021 13:07 a248c133 db01.lan/cmd/lsscsi
40 Defl:N 27 33% 07-06-2021 13:07 4cbb2c4b db01.lan/cmd/sestatus
2476 Defl:N 865 65% 07-06-2021 13:07 94874f95 db01.lan/cmd/lsmod
139387 Defl:N 25763 82% 07-06-2021 13:07 7d28e51b db01.lan/cmd/dmesg
20691 Defl:N 3664 82% 07-06-2021 13:07 912b04bc db01.lan/cmd/psef
26229 Defl:N 4734 82% 07-06-2021 13:07 0bac6129 db01.lan/cmd/psfaux
5050 Defl:N 889 82% 07-06-2021 13:07 aafb6b4a db01.lan/cmd/lsblk
1076 Defl:N 406 62% 07-06-2021 13:07 a73ecda5 db01.lan/cmd/dfpt
326 Defl:N 196 40% 07-06-2021 13:07 85199d6c db01.lan/cmd/iplink
433 Defl:N 181 58% 07-06-2021 13:07 5e9c5f56 db01.lan/cmd/ipaddr
45599 Defl:N 17256 62% 07-06-2021 13:07 1e259717 db01.lan/cmd/rpmqf
28046 Defl:N 6136 78% 07-06-2021 13:07 7d437fa2 db01.lan/cmd/sysctla
2020 Defl:N 648 68% 07-06-2021 13:07 b7ca76b3 db01.lan/proc/cpuinfo
1310 Defl:N 521 60% 07-06-2021 13:07 128cc245 db01.lan/proc/meminfo
378 Defl:N 151 60% 07-06-2021 13:07 83f093db db01.lan/proc/filesystems
979 Defl:N 296 70% 07-06-2021 13:07 562875d5 db01.lan/proc/partitions
536 Defl:N 299 44% 07-06-2021 13:07 696196ae db01.lan/proc/devices
3142 Defl:N 764 76% 07-06-2021 13:07 32da3154 db01.lan/proc/mounts
40 Defl:N 42 -5% 07-06-2021 13:07 f842b1fd db01.lan/proc/mdstat
215 Defl:N 157 27% 07-06-2021 13:07 2091f2f6 db01.lan/proc/misc
22 Defl:N 24 -9% 07-06-2021 13:07 c1ca737d db01.lan/proc/uptime
393 Defl:N 226 43% 11-23-2020 16:08 358d6782 db01.lan/etc/os-release
37 Defl:N 39 -5% 11-23-2020 16:08 36030dbe db01.lan/etc/system-release
103 Defl:N 102 1% 01-03-2021 15:34 02058559 db01.lan/etc/issue
87 Defl:N 86 1% 01-03-2021 15:34 ad1f98c9 db01.lan/etc/motd
1019 Defl:N 520 49% 05-31-2021 21:05 547a8978 db01.lan/etc/oratab
13 Defl:N 15 -15% 06-20-2021 00:16 875d0c1f db01.lan/sys/class/dmi/id/sys_vendor
24 Defl:N 26 -8% 06-21-2021 11:59 a4480e0b db01.lan/sys/class/dmi/id/product_name
33 Defl:N 35 -6% 06-21-2021 11:59 fbe92f48 db01.lan/sys/class/dmi/id/board_name
18 Defl:N 20 -11% 06-20-2021 00:16 5deaba97 db01.lan/sys/class/dmi/id/board_vendor
...
605964 Defl:N 75815 88% 06-27-2021 23:50 fbb4552b db01.lan/var/log/sa/sa27
605964 Defl:N 76052 87% 06-28-2021 23:50 5704a62d db01.lan/var/log/sa/sa28
605964 Defl:N 76113 87% 06-29-2021 23:50 8267a285 db01.lan/var/log/sa/sa29
605964 Defl:N 75803 88% 06-30-2021 23:50 d7478b20 db01.lan/var/log/sa/sa30
574442 Defl:N 54492 91% 07-06-2021 13:07 1f290b08 db01.lan/oracle/DB01/DB01_awrrpt_1_24717_24718_20210627_0000.html
627738 Defl:N 64554 90% 07-06-2021 13:07 554dcc81 db01.lan/oracle/DB01/DB01_awrrpt_1_24718_24719_20210627_0100.html
583958 Defl:N 57024 90% 07-06-2021 13:07 3c7b8d7e db01.lan/oracle/DB01/DB01_awrrpt_1_24719_24720_20210627_0201.html
...
24290 Defl:N 4504 82% 07-06-2021 13:07 49383845 db01.lan/oracle/DB01/dbinfo.txt
4011 Defl:N 636 84% 07-06-2021 13:07 c0304c39 db01.lan/oracle/DB01/pdbinfo.txt
127795 Defl:N 3911 97% 07-06-2021 13:07 551eaccc db01.lan/oracle/DB01/reports.sql
41510 Defl:N 3589 91% 07-06-2021 13:15 b742ecf9 db01.lan/dbcollect.log
-------- ------- --- -------
157350796 16088556 90% 403 files