DB-hub Technology Oracle Lab – DGMGRL fast-start failover(FSFO)

Oracle Lab – DGMGRL fast-start failover(FSFO)

1.FSFO简介

The Fast-Start Failover Observer is an Oracle Data Guard Broker component that allows the DBA automate failover tasks and have more peaceful nights of sleep.

主数据库出现故障时,能快速与可靠的把standby切换成primary,在整个过程中不需要DBA干预。fast-start failover只能通过dgmgrl与Enterprise Managerg来配置。
当数据库以正常模式(shutdown immediate/normal/transactional)关闭时,系统不会触发fast-start failover。 使用非正常关闭方式shutdown abort关闭primary时会触发fast-start failover。

只有maximum availability mode 和 maximum performance mode才能启用fast-start failover模式。在maximum availability模式下面,在切换时可以保证无数据丢失,在maximum performance mode下面,会有数据丢失,丢失多少数据由FastStartFailoverLagLimit这个参数来配置。

当observer与指定的primary数据库失去连接的时间超过FastStartFailoverThreshold后,observer就会启动fast-start failover 到standby数据库。如果配置了FastStartFailoverPmyShutdown为true,原来的primary数据库将会自动的shutdown。如果配置FastStartFailoverAutoReinstate为true,failover完成后,启动数据库时,会自动的执行Reinstate database,把原来的主数据库变成备库,并与新主库进行同步。

2.FSFO 故障切换的条件

默认情况下,当且仅当满足以下条件时,观察器才会启动到目标备用数据库的故障切换:

  • 观察器正在运行
  • 观察器和备用数据库均与主数据库失去联系
    注:如果观察器与主数据库失去联系,但是备用数据库并未失去联系,观察器可以确定主数据库仍然通过备用数据库运行。
  • 观察器仍然保持与备用数据库的联系
  • 满足持久性约束条件
  • 故障切换阈值延时已过

用户可配置的故障切换条件(11g 及更高版本)
Oracle 数据库 11g 第 1 版引入了用户可配置的故障切换条件,这些条件可以触发观察器立即启动故障切换。

运行状况条件

Broker 可配置为在以下任一条件下启动故障切换。条件1,2,3是默认启用的。

  1. Datafile Offline(由于 IO 错误)
  2. Corrupted Controlfile
  3. Corrupted Dictionary
  4. Inaccessible Logfile(由于 IO 错误)
  5. Stuck Archiver

Oracle 错误 (ORA-NNNNN)

您可以指定一个将启动 FSFO 故障切换的 ORA 错误列表。该列表默认为空。

应用程序启动

应用程序可直接使用 DBMS_DG.INITIATE_FS_FAILOVER 过程启动 FSFO 故障切换,并包括一个可选的在观察器日志和主数据库警报日志中显示的消息文本。

3.FSFO 环境的主要组件

FSFO 构建于众多其它 Oracle 技术和特性(如 Data Guard、闪回数据库和 Data Guard Broker)之上。

3.1.Data Guard

FSFO 的基础是 Data Guard — 一个主数据库和至少一个备用数据库。备用数据库可以是物理的或逻辑的,可以有多个备用数据库,但只有一个备用数据库可作为随时进行故障切换的目标。

3.2.最高可用性模式

最高可用性模式 (Oracle 数据库 10g 第 2 版及更高版本)

在最高可用性模式中,FSFO 保证在故障切换期间不会丢失已收到提交确认的事务。该保证的代价是增加提交延迟(log file sync 等待事件)。最高可用性模式使用同步重做传输,FSFO 则增加了额外要求,即重做应记录在目标备用数据库(log_archive_dest_n 的 AFFIRM 选项)的备用重做日志 (SRL) 中。Overall commit latency is increased by the round-trip network latency. 往返网络延迟增加了总体提交延迟。增加的延迟降低了吞吐量;然而,有些时候吞吐量的差异是由增加的并行造成的。
虽然重做传输是同步的,但如果备用数据库由于某种原因(如备用数据库、主机或网络故障)不可用,最高可用性模式会保持主数据库可用。如果在用户指定的时间段(log_archive_dest_n 的 NET_TIMEOUT 选项)之后,主数据库无法联系到备用数据库,它将退出同步传输模式并开始像在最高性能模式中一样进行操作。当备用数据库再次可用时,主数据库和备用数据库重新同步并恢复同步重做传输。

最高性能模式 (Oracle 数据库 11g 第 1 版及更高版本)

Oracle 数据库 11g FSFO 增加了对最高性能模式的支持(异步重做传输),提供了用持久性换取性能的灵活性。提交延迟不受重做传输的影响,但备用数据库未收到其重做的已提交事务将在故障切换期间丢失。通过指定故障切换期间丢失事务的最大允许时间,最高性能模式中的 FSFO 配置可以限制潜在的数据丢失。例如,如果指定的限定值为 30 秒(默认),FSFO 将保证在故障切换期间保存 30 秒钟内提交的所有事务。 最小允许限定值为 10 秒钟。

3.3.Data Guard Broker

Broker 是一个 Data Guard 管理实用程序,用于维护有关主数据库及其备用数据库的状态信息。 它自动设置与 Data Guard 相关的数据库初始化参数(如实例启动和角色转换)、启动备用数据库的应用服务,并且自动执行与维护 Data Guard 配置相关的许多管理任务。FSFO 是 Broker 的一个特性,用于记录故障切换目标的相关信息,例如,故障发生后到触发故障切换之间的等待时间以及 FSFO 的其他特有属性。

3.4.闪回数据库(Flashback)

闪回数据库是一个集成在 Oracle 数据库中的持续数据保护 (CDP) 解决方案。它使用名为闪回日志的磁盘数据结构,提供一个将数据库快速恢复到之前时间点或 SCN 的方法。数据库闪回比传统的时间点或基于 SCN 的恢复速度更快、结合更完美(一条简单的 DDL 语句)。FSFO 将闪回数据库用作将故障主数据库恢复为备用数据库流程的一部分。

自动恢复的问题通常因为错误的配置,因此我们来了解一些详细信息。
闪回数据库记录经过更改的数据块的前映像。为了避免记录每个数据块的每次更改的开销,闪回数据库每 30 分钟进行一次“模糊”快照,仅记录前映像块上一次快照后的第一次更改。在同一快照期间中,不再记录对同一数据块的后续更改。

数据库的闪回分成两个阶段:

恢复 — 闪回数据库将数据文件恢复到指定 SCN 前最近的快照。这可以与执行从指定 SCN 前的备份进行数据文件 RMAN 恢复相比,但是速度更快。
介质恢复 — 恢复完成后,恢复将作为典型介质恢复继续进行,根据存档重做日志和联机重做日志应用重做并通过撤消回滚未提交的更改。这意味着为了使闪回数据库操作成功,闪回数据库需要在快照时间和恢复 SCN 之间生成的所有存档重做日志(通常为重做后的 30 分钟)。使用 V$RECOVERY_PROGRESS 视图监视恢复状态。
对于 FSFO 环境,设置 db_flashback_retention_target = 60 或更高值,可以为自动备用恢复提供足够的闪回数据库历史记录。模糊快照的元数据存储在闪回日志本身中。如果没有元数据,Oracle 将无法找到模糊快照,从而无法进行闪回。为了避免计时差异产生的问题,我们建议值设置为不少于 60 分钟,实际上,如果值设置为 30 或 30 以下,肯定会导致闪回数据库故障。

闪回数据库将日志存储在快速恢复区 (FRA) 中,所以 FRA 必须有足够大的空间来存储至少 60 分钟的闪回数据库历史记录。 总的存储需求与快照期间更改的不同数据块的数量成比例,例如,一小组数据块上的 1,000,000 次块更改生成的闪回数据库历史记录小于一大组数据块上的 1,000,000 次块更改所生成的闪回数据库历史记录。确定闪存数据库存储需求的一个好方法是,启用闪存数据库并观察其在几次峰值负载时所使用的存储量。通过启用闪回数据库来确定其存储需求也有一定的风险 — 如有必要,在主数据库处于打开状态时可以将其禁用。然而,重新启用闪回数据库将需要回弹,因为数据库必须进行安装且未打开。

3.5.FSFO 观察器

观察器是非常典型的主/备用 Data Guard 配置中的第三方。它实际上是一个内置于 DGMGRL CLI(Data Guard Broker 命令行界面)中、占用空间很小的 OCI 客户端,与其他所有客户端一样,可以运行在与数据库服务器不同的硬件平台上。其主要工作是在条件允许时执行故障切换,而不影响 DBA 设置的数据持久性约束条件。只有观察器能启动 FSFO 故障切换。它的另一个工作是在启用该特性的情况下(默认)自动将主数据库恢复为备用数据库。观察器是 Data Guard 故障切换在强健的高可用性解决方案中承担重要角色的关键因素,也是造成 Data Guard 故障切换在 FSFO 出现前后重大差异的关键因素。

注:FSFO 观察器版本必须与数据库版本匹配。 Oracle 数据库 11g 观察器与 10g 数据库不兼容,Oracle 数据库 10g 观察器与 11g 数据库也不兼容。

4.实验环境配置

DB Name:             OTTER
DB Unique Name:      OTTER_DR, OTTER_PR
Host Name:           dg1, dg2, orapg(observer)
dg1:                 192.168.1.180
dg2:                 192.168.1.181
orapg:               192.168.1.199
Data Guard Listener: LISTENER_DG
TNS Alias:           OTTER_DR, OTTER_PR

4.1.配置 Oracle Net

Data Guard 使用 Oracle Net (SQL*Net) 在主数据库和备用数据库以及 FSFO 观察器之间进行通信。正确配置 Oracle Net 是成功部署 FSFO 的一个关键因素。错误的 Oracle Net 配置是导致所报告的 FSFO 问题的主要原因。

注:Data Guard 的正确操作需要专用服务器连接。 不要针对 Data Guard 使用 Shared Server(以前称为 MTS)

4.2.配置监听器

针对应用程序连接和 Data Guard 连接使用不同的监听器是很好的做法。 这样,Data Guard 可以在应用程序监听器因进行维护而停止工作时保持运行。请确保在 local_listeners 数据库参数中包括 Data Guard 监听器。

大多数在 FSFO 环境中使用的网络服务可使用动态注册,但要在角色转换期间或故障切换后的恢复期间启用 Broker 以重新启动实例,您必须定义一个名为 db_unique_name_dgmgrl.db_domain 的静态服务。(注:11.1.0.7 添加了 StaticConnectIdentifierBroker 数据库属性,允许您指定一个不同的服务名称。) 如果您要使用 RMAN 创建备用数据库,还需要一个静态服务来重新启动所创建的数据库。为了将 Broker 和非 Broker 活动分离,建议再定义一个静态服务。

针对主机“dg1”的 listener.ora 配置:

LISTENER=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.1.180)(PORT=1521)))

SID_LIST_LISTENER =
(SID_LIST=
    (SID_DESC=
        (GLOBAL_DBNAME=OTTER)
        (SID_NAME=OTTER)
    )
    (SID_DESC=
      (SID_NAME=OTTER)
      (GLOBAL_DBNAME=OTTER_PR_DGMGRL)
    )
)


LISTENER_DG =
  (DESCRIPTION=
    (ADDRESS_LIST=
      (ADDRESS=(PROTOCOL=tcp)(HOST=dg1)(PORT=1522))
    )
  )

SID_LIST_LISTENER_DG=
  (SID_LIST=
    (SID_DESC=
      (SID_NAME=OTTER)
      (GLOBAL_DBNAME=OTTER_PR_BROKER)
    )
  )

针对主机“dg2”的 listener.ora 配置:

LISTENER=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.1.181)(PORT=1521)))

SID_LIST_LISTENER =
(SID_LIST=
    (SID_DESC=
        (GLOBAL_DBNAME=OTTER)
        (SID_NAME=OTTER)
    )
    (SID_DESC=
      (SID_NAME=OTTER)
      (GLOBAL_DBNAME=OTTER_DR_DGMGRL)
    )
)

LISTENER_DG =
  (DESCRIPTION=
    (ADDRESS_LIST=
      (ADDRESS=(PROTOCOL=tcp)(HOST=dg2)(PORT=1522))
    )
  )

SID_LIST_LISTENER_DG=
  (SID_LIST=
    (SID_DESC=
      (SID_NAME=OTTER)
      (GLOBAL_DBNAME=OTTER_DR_BROKER)
    )
  )

4.3.配置命名

为每个数据库创建一个唯一的连接别名。

OTTER_PR_BROKER:用于连接到数据库“dg1”上的静态 Data Guard 服务的别名
OTTER_DR_BROKER:用于连接到数据库“dg2”上的静态 Data Guard 服务的别名

OTTER_PR_BROKER =
  (description =
    (address_list =
      (address = (protocol = tcp)(host = dg1)(port = 1522))
    )
    (connect_data =
      (service_name = OTTER_PR_BROKER)
    )
  )

OTTER_DR_BROKER =
  (description =
    (address_list =
      (address = (protocol = tcp)(host = dg2)(port = 1522))
    )
    (connect_data =
      (service_name = OTTER_DR_BROKER)
    )
  )

4.4.启动 Data Guard 监听器

启动“dg1”和“dg2”主机上的 Data Guard 监听器。

[oracle@dg1:/oracle/app/oracle/product/12.2.0.3/network/admin]$lsnrctl start listener_dg

LSNRCTL for Linux: Version 19.0.0.0.0 - Production on 15-JAN-2021 12:48:53

Copyright (c) 1991, 2019, Oracle.  All rights reserved.

Starting /oracle/app/oracle/product/12.2.0.3/bin/tnslsnr: please wait...

TNSLSNR for Linux: Version 19.0.0.0.0 - Production
System parameter file is /oracle/app/oracle/product/12.2.0.3/network/admin/listener.ora
Log messages written to /oracle/app/oracle/diag/tnslsnr/dg1/listener_dg/alert/log.xml
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=dg1)(PORT=1522)))

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=dg1)(PORT=1522)))
STATUS of the LISTENER
------------------------
Alias                     listener_dg
Version                   TNSLSNR for Linux: Version 19.0.0.0.0 - Production
Start Date                15-JAN-2021 12:48:53
Uptime                    0 days 0 hr. 0 min. 0 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /oracle/app/oracle/product/12.2.0.3/network/admin/listener.ora
Listener Log File         /oracle/app/oracle/diag/tnslsnr/dg1/listener_dg/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=dg1)(PORT=1522)))
Services Summary...
Service "OTTER_PR_BROKER" has 1 instance(s).
  Instance "OTTER", status UNKNOWN, has 1 handler(s) for this service...
The command completed successfully

4.5.测试 Oracle Net 连接

验证来自两个主机的配置。

[oracle@dg1:/oracle/app/oracle/product/12.2.0.3/network/admin]$tnsping OTTER_DR_BROKER

TNS Ping Utility for Linux: Version 19.0.0.0.0 - Production on 15-JAN-2021 12:49:41

Copyright (c) 1997, 2019, Oracle.  All rights reserved.

Used parameter files:
/oracle/app/oracle/product/12.2.0.3/network/admin/sqlnet.ora


Used TNSNAMES adapter to resolve the alias
Attempting to contact (description = (address_list = (address = (protocol = tcp)(host = dg2)(port = 1522))) (connect_data = (service_name = OTTER_DR_BROKER)))
OK (10 msec)
[oracle@dg2:/oracle/app/oracle/product/12.2.0.3/network/admin]$tnsping OTTER_PR_BROKER

TNS Ping Utility for Linux: Version 19.0.0.0.0 - Production on 15-JAN-2021 12:50:15

Copyright (c) 1997, 2019, Oracle.  All rights reserved.

Used parameter files:
/oracle/app/oracle/product/12.2.0.3/network/admin/sqlnet.ora


Used TNSNAMES adapter to resolve the alias
Attempting to contact (description = (address_list = (address = (protocol = tcp)(host = dg1)(port = 1522))) (connect_data = (service_name = OTTER_PR_BROKER)))
OK (10 msec)

5.准备主数据库

为了使用快速启动故障切换,主数据库必须满足一系列前提条件。本部分描述将如何配置和验证每个前提条件。要查看主数据库是否已满足某个前提条件,请按照验证部分的说明进行操作。

需要主数据库处于mount状态。

5.1.local_listeners 参数

将 Data Guard 监听器添加到 local_listeners 参数。

为了使数据库在 Data Guard 监听器中注册,必须将监听器端点添加到数据库的 local_listener 参数中。如果已经使用了 local_listener,则需将 Data Guard 监听器添加到列表中。设置完 local_listener 之后,将数据库注册到监听器并验证该服务已注册成功。

注:您也可以在设置 local_listener 参数时使用 tnsnames.ora 文件中定义的 TNS 别名。 在向参数值可能超过 255 个字符限制的多个监听器注册时,该方法尤其有用。

设置别名:

tnsnames.ora
LISTENER_DG =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = dg1)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = dg1)(PORT = 1522))
    )
  )

设置LOCAL_LISTENER参数:

alter system set local_listener=LISTENER_DG;
alter system register;

验证:

show parameter local_listener
[oracle@dg1:/oracle/app/oracle/product/12.2.0.3/network/admin]$lsnrctl status listener_dg

LSNRCTL for Linux: Version 19.0.0.0.0 - Production on 15-JAN-2021 12:51:23

Copyright (c) 1991, 2019, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=dg1)(PORT=1522)))
STATUS of the LISTENER
------------------------
Alias                     listener_dg
Version                   TNSLSNR for Linux: Version 19.0.0.0.0 - Production
Start Date                15-JAN-2021 12:48:53
Uptime                    0 days 0 hr. 2 min. 30 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /oracle/app/oracle/product/12.2.0.3/network/admin/listener.ora
Listener Log File         /oracle/app/oracle/diag/tnslsnr/dg1/listener_dg/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=dg1)(PORT=1522)))
Services Summary...
Service "OTTERXDB" has 1 instance(s).
  Instance "OTTER", status READY, has 1 handler(s) for this service...
Service "OTTER_CFG" has 1 instance(s).
  Instance "OTTER", status READY, has 1 handler(s) for this service...
Service "OTTER_PR" has 1 instance(s).
  Instance "OTTER", status READY, has 1 handler(s) for this service...
Service "OTTER_PR_BROKER" has 1 instance(s).
  Instance "OTTER", status UNKNOWN, has 1 handler(s) for this service...
The command completed successfully

测试 Oracle Net 连接
验证来自两个主机的配置。

[oracle@dg1:/oracle/app/oracle/product/12.2.0.3/network/admin]$tnsping OTTER_DR_BROKER

TNS Ping Utility for Linux: Version 19.0.0.0.0 - Production on 15-JAN-2021 12:52:36

Copyright (c) 1997, 2019, Oracle.  All rights reserved.

Used parameter files:
/oracle/app/oracle/product/12.2.0.3/network/admin/sqlnet.ora


Used TNSNAMES adapter to resolve the alias
Attempting to contact (description = (address_list = (address = (protocol = tcp)(host = dg2)(port = 1522))) (connect_data = (service_name = OTTER_DR_BROKER)))
OK (10 msec)
[oracle@dg2:/oracle/app/oracle/product/12.2.0.3/network/admin]$tnsping OTTER_PR_BROKER

TNS Ping Utility for Linux: Version 19.0.0.0.0 - Production on 15-JAN-2021 12:53:29

Copyright (c) 1997, 2019, Oracle.  All rights reserved.

Used parameter files:
/oracle/app/oracle/product/12.2.0.3/network/admin/sqlnet.ora


Used TNSNAMES adapter to resolve the alias
Attempting to contact (description = (address_list = (address = (protocol = tcp)(host = dg1)(port = 1522))) (connect_data = (service_name = OTTER_PR_BROKER)))
OK (0 msec)

5.2.启用强制日志记录

所有 Data Guard 环境都应在数据库级别启用强制日志记录,以避免添加时产生无日志记录的表空间。

启用

alter database force logging;

验证

select force_logging from v$database;
FORCE_LOGGING
---------------------------------------
YES

5.3.创建 spfile

Broker 将在启动和角色转换期间通过 ALTER SYSTEM 命令时更改数据库参数。 保存这些更改将需要一个 spfile。

创建

create spfile='?/dbs/spfile{ORACLE_SID}.ora' from pfile='?/dbs/init{ORACLE_SID}.ora';
alter system set spfile='/oracle/app/oracle/product/12.2.0.3/dbs/spfileOTTER.ora';

验证

show parameter spfile;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
spfile                               string      /oracle/app/oracle/product/12.
                                                 2.0.3/dbs/spfileOTTER.ora

5.4.创建口令文件

所有 Data Guard 环境都需要使用口令文件以使数据库可以互相连接。

创建

orapwd file=ORACLE_HOME/dbs/orapwORACLE_SID

验证

SQL> select USERNAME, SYSDBA, SYSOPER, SYSASM, ACCOUNT_STATUS, PASSWORD_PROFILE from v$pwfile_users;

USERNAME                  SYSDB SYSOP SYSAS ACCOUNT_STATUS                 PASSWORD_PROFILE
------------------------- ----- ----- ----- ------------------------------ --------------------
SYS                       TRUE  TRUE  FALSE OPEN

5.5.启用远程登录

Data Guard 配置中的数据库要进行互相连接,需要带有口令文件的远程登录。

启用

alter system set remote_login_passwordfile=exclusive scope=spfile;

验证

show parameter remote_login_passwordfile
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
remote_login_passwordfile            string      EXCLUSIVE

5.6.设置 db_unique_name

Data Guard 配置中的每个数据库必须有唯一的名称。 本指南使用在 db_name 加一个下划线后接一个字母的命名惯例来创建 db_unique_name。

设置

alter system set db_unique_name = OTTER_PR scope=spfile;

验证

show parameter db_unique_name

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_unique_name                       string      OTTER_PR

5.7.配置快速恢复区

如果您还没有快速恢复区 (FRA),您将需要为闪回数据库创建一个。 如果您已经拥有一个 FRA,可能需要增加其大小以容纳闪回数据库文件。

配置

alter system set db_recovery_file_dest_size = 4g scope=both;
alter system set db_recovery_file_dest = '/oradata/fra' scope=both;

验证

SQL> show parameter db_recovery

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_recovery_file_dest                string      /oradata/fra
db_recovery_file_dest_size           big integer 4G

5.8.启用自动备用文件管理

非硬性要求,但建议如此。
如果为AUTO则表示primary库的数据文件发生修改(新增,重命名),则在standby库会自动做相应的更改。如果是MANUAL,则表示手动。

启用

alter system set standby_file_management=auto;

验证

show parameter standby_file_management

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
standby_file_management              string      AUTO

5.9.设置 log_archive_config

必须先设置该参数,然后才能在最高可用性模式下打开主数据库。其余 Data Guard 相关参数稍后将本指南中通过 Broker 进行设置。

设置

alter system set log_archive_config='DG_CONFIG=(OTTER_PR,OTTER_DR)' scope=both;

验证

show parameter log_archive_config

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
log_archive_config                   string      dg_config=(OTTER_PR,OTTER_DR)

5.10.创建备用重做日志

为适应所有加载条件,Oracle 建议 SRL 组至少比相同大小的 ORL 组多一个。本指南假定主数据库和备用数据库上的所有 ORL 和 SRL 大小相同。

创建
确定联机重做日志文件 (ORL) 的数量和大小

select bytes, count(group#) from v$log group by bytes;

     BYTES COUNT(GROUP#)
---------- -------------
  52428800             3

找到最大 group#

select max(group#) from v$log;

MAX(GROUP#)
-----------
          3

添加 SRL。与 ORL 不同,创建的 SRL 每组中仅有一个成员。无需像 ORL 中那样创建多个 SRL 来保护重做(主数据库中的 ORL 中已对重做进行保护)。多个 SRL 只会增加不必要的 IO 并将增加提交延时。对于具有多个 RAID 控制器的系统,考虑创建 SRL,以便在在控制器之间平衡分布其 IO。

在本示例中,有 3 个最大 ORL,最大 group# 为 3。 我们将从 group# 11 开始递增,创建 4 个 SRL。从 11 开始是完全形式化的,这样使稍后添加的新 ORL 组可以保持其 group# 与现有 ORL 拥有相同的顺序。

alter database add standby logfile group 4 '/oradata/data/OTTER_PR/standby_redo04.dbf' size 52428800;
alter database add standby logfile group 5 '/oradata/data/OTTER_PR/standby_redo05.dbf' size 52428800;
alter database add standby logfile group 6 '/oradata/data/OTTER_PR/standby_redo06.dbf' size 52428800;
alter database add standby logfile group 7 '/oradata/data/OTTER_PR/standby_redo07.dbf' size 52428800;

验证

col MEMBER for a50
select group#, type, member from v$logfile where type = 'STANDBY';

    GROUP# TYPE    MEMBER
---------- ------- --------------------------------------------------
         4 STANDBY /oradata/data/OTTER_PR/standby_redo04.dbf
         5 STANDBY /oradata/data/OTTER_PR/standby_redo05.dbf
         6 STANDBY /oradata/data/OTTER_PR/standby_redo06.dbf
         7 STANDBY /oradata/data/OTTER_PR/standby_redo07.dbf

5.11.需要主数据库重启的设置

以下设置要求数据库处于mount状态。 可以在一次重启中将它们全部同时完成。

5.11.1.启用Archive Log模式

启用

alter database archivelog;

验证

select log_mode from v$database;
LOG_MODE
------------
ARCHIVELOG

5.11.2.启用Flashback

启用

alter database flashback on;
alter system set db_flashback_retention_target = 1440 scope=both;

验证

select flashback_on from v$database;
FLASHBACK_ON
------------------
YES
show parameter db_flashback_retention_target
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_flashback_retention_target        integer     1440

5.11.3.启用最高可用性模式

最高可用性模式对于 Oracle 数据库 10g 是必选的,对于 Oracle 数据库 11g 则是可选的。
本实验使用最大性能模式。

启用

alter database set standby database to maximize availability;

验证

select protection_mode from v$database;

PROTECTION_MODE
--------------------
MAXIMUM AVAILABILITY

6.创建备用数据库

如果您还没有备用数据库,使用您最喜欢的方法创建一个备用数据库。以下示例将利用 11g RMAN 活动数据库复制特性。通过该特性,RMAN 可以通过网络复制现有数据库而无需磁盘或磁带备份。以下部分假定备用主机已根据 Oracle 的建议进行设置,并且操作系统、帐户、安全性、资源限制、目录结构等配置正确。

6.1.复制口令文件到备用数据库

在 Oracle 数据库 11g 中,由于 Oracle 数据库 11g 中安全性的增强,备用数据库中的口令文件必须是主数据库中口令文件的物理副本。Oracle 数据库 10g 中允许使用不同的口令文件,只要主数据库和备用数据库中的 SYS 口令相同即可。

6.2.创建 oratab 条目

向备用数据库的 oratab 文件添加一个条目

vi /etc/oratab
OTTER:/oracle/app/oracle/product/12.2.0.3:N

6.3.创建 init.ora 文件

对于 RMAN 复制活动数据库方法而言,init.ora 文件仅需要一个参数:db_name(甚至不必是数据库的真实名称 — 可使用任意名称)。RMAN 将从主数据库复制 spfile,因此仅在复制的第一阶段需要该 init.ora 文件。

db_name = OTTER

6.4.设置环境

确保备用数据库上的操作系统环境已设置。

6.5.以nomount方式启动备用数据库

startup nomount

6.6.使用 RMAN 创建备用数据库

运行 RMAN 程序并连接到目标(主数据库)和辅助对象(新备用数据库)。

rman target sys/password@OTTER_PR auxiliary sys/password@OTTER_DR_static

connected to target database:  OTTER (DBID=480536289)
connected to auxiliary database: OTTER (not mounted)

将默认设备类型设置为磁盘

CONFIGURE DEFAULT DEVICE TYPE TO DISK;

从活动主数据库复制数据库
Oracle Data Guard 概念和管理文档(10g 第 2 版和 11g 第 1 版)的附录 F 中详细介绍了使用 RMAN 创建备用服务器的过程。本示例使用 11g 中的 FROM ACTIVE DATABASE 子句,该子句允许 RMAN 通过跨网络复制主数据库来创建备用数据库,而无需在磁盘或磁带上存储备份文件。RMAN 还复制 spfile 和口令文件,您可以更改各个参数的值。至少必须设置 db_unique_name。本示例假定备用数据库使用和主数据库相同的的目录结构。

注:如果您刚启用了Archive Log模式,则会强行创建一个Archive Log (alter system archive log current) 以确保至少存在一个存档日志。否则,DUPLICATE TARGET DATABASE 命令将失败,错误消息为“RMAN-20208: UNTIL CHANGE is before RESETLOGS change”。

DUPLICATE TARGET DATABASE
 FOR STANDBY
 FROM ACTIVE DATABASE
 DORECOVER
 SPFILE
 SET db_unique_name='OTTER_DR'
 SET log_archive_config=''
 SET log_file_name_convert= ' ',' '
 SET local_listener='LISTENER_DG'
 NOFILENAMECHECK;

6.7.在备用数据库启用闪回

启用

alter database flashback on;
alter system set db_flashback_retention_target = 1440 scope=both;

验证

select flashback_on from v$database;

FLASHBACK_ON
------------------
YES

show parameter db_flashback_retention_target

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_flashback_retention_target        integer     1440

7.Data Guard Broker

7.1.设置 Data Guard Broker 配置文件的位置

Broker 将其配置信息存储在数据库外的一组文件镜像中。默认情况下,两个文件都存储在 \$ORACLE_HOME/dbs 中。要保护这两个文件,比较好的做法是将它们存储在不同的文件系统中。

主数据库

mkdir -p /oracle/app/oracle/admin/OTTER_PR/dgbroker

alter system set dg_broker_config_file1='/oracle/app/oracle/admin/OTTER_PR/dgbroker/otter_pr1.dat';
alter system set dg_broker_config_file2='/oracle/app/oracle/admin/OTTER_PR/dgbroker/otter_pr2.dat';

验证

show parameter dg_broker_config_file

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
dg_broker_config_file1               string      /oracle/app/oracle/admin/OTTER
                                                 _PR/dgbroker/otter_pr1.dat
dg_broker_config_file2               string      /oracle/app/oracle/admin/OTTER
                                                 _PR/dgbroker/otter_pr2.dat

备用数据库

mkdir -p /oracle/app/oracle/admin/OTTER_DR/dgbroker

alter system set dg_broker_config_file1='/oracle/app/oracle/admin/OTTER_DR/dgbroker/otter_dr1.dat';
alter system set dg_broker_config_file2='/oracle/app/oracle/admin/OTTER_DR/dgbroker/otter_dr2.dat';

验证

show parameter dg_broker_config_file

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
dg_broker_config_file1               string      /oracle/app/oracle/admin/OTTER
                                                 _DR/dgbroker/otter_dr1.dat
dg_broker_config_file2               string      /oracle/app/oracle/admin/OTTER
                                                 _DR/dgbroker/otter_dr2.dat

7.2.启用 Data Guard Broker

启用(主数据库和备用数据库)

alter system set dg_broker_start=true;

验证

show parameter dg_broker_start

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
dg_broker_start                      boolean     TRUE

7.3.Unset log_archive_dest_2 parameters

SQL> alter system set log_archive_dest_2='' scope=both;

When using the Data Guard Broker, you don’t need to set any LOG_ARCHIVE_* parameter for the databases that are part of your Data Guard configuration. The broker is doing that for you. Forget about what you may have heard about VALID_FOR – you don’t need that with the broker. Actually, setting any of the LOG_ARCHIVE_* parameters with an enabled broker configuration might even confuse the broker and lead to error messages.

Error: ORA-16698: member has a LOG_ARCHIVE_DEST_n parameter with SERVICE attribute set

This issue was not happen on older versions Oracle. There are few changes from Oracle 12c. LOG_ARCHIVE_DEST_2 parameter have to be unset as DGMGRL will take care of updating this parameter.

Followed below steps if your see that error:

  1. Disable and Remove Data Guard configuration
DGMGRL> disable configuration;
DGMGRL> remove configuration;
  1. Disable/Enable Data Guard broker on both Primary/Standby
SQL> alter system set dg_broker_start=false scope=both;
SQL> alter system set dg_broker_start=true scope=both;

These will stop and start dmon process.

  1. Disable log_archive_dest_2 on both Primary/Standby
SQL> alter system set log_archive_dest_2='' scope=both;
  1. Configure Data Guard Broker again
create configuration 'FSFO' as primary database is 'OTTER_PR' connect identifier is OTTER_PR_BROKER;
Configuration "FSFO" created with primary database "OTTER_PR"

add database 'OTTER_DR' as connect identifier is OTTER_DR_BROKER maintained as physical;
Database "OTTER_DR" added

A typical example about the redo log transport mode. Refer to:Let the Data Guard Broker control LOG_ARCHIVE parameters

7.4.启动 dgmgrl

在主数据库上启动 dgmgrl 实用程序并以 SYS 身份进行连接。When connecting to the broker, you should always connect to a DMON running on the primary site.

dgmgrl sys/password@OTTER_PR_BROKER
DGMGRL for Linux: Release 19.0.0.0.0 - Production on Tue Jan 12 23:24:13 2021
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

Welcome to DGMGRL, type "help" for information.
Connected to "OTTER_PR"
Connected as SYSDBA.

7.5.添加数据库

添加主数据库

create configuration 'FSFO' as primary database is 'OTTER_PR' connect identifier is OTTER_PR_BROKER;

Configuration "FSF" created with primary database "otter_pr"

添加备用数据库

add database 'OTTER_DR' as connect identifier is OTTER_DR_BROKER maintained as physical;

验证配置

show configuration

Configuration - FSFO

  Protection Mode: MaxPerformance
  Members:
  OTTER_PR - Primary database
    OTTER_DR - Physical standby database

Fast-Start Failover:  Disabled

Configuration Status:
DISABLED

7.6.编辑数据库属性

编辑属性的命令:

DGMGRL> help edit

Edits a configuration or a member

Syntax:

  EDIT CONFIGURATION SET PROTECTION MODE [AS]
    { MaxProtection | MaxAvailability | MaxPerformance };

例如:

edit database 'OTTER_DR' set property FastStartFailoverTarget=OTTER_PR;
edit database 'OTTER_PR' set property FastStartFailoverTarget=OTTER_DR;
  • LogXptMode
    日志传递方式, 默认情况下,Broker 将主数据库设置为使用异步日志传输。针对最高可用性环境时,需要将此设置更改为同步。
  • NetTimeout
    NetTimeout 属性指定在考虑连接丢失前 LGWR 将阻塞对同步模式中来自备用数据库的确认的等待秒数(对应于 log_archive_dest_n 的 NET_TIMEOUT 选项)。默认值为 30 秒。使用最高可用性模式时,考虑降低该值以减少备用数据库不可用时的提交阻塞时间。选择一个足够高的值,避免由间歇性网络问题引起的假性断开。本示例使用 10 秒钟。
  • ObserverConnectIdentifier(11g 及更高版本)
    Oracle 数据库 11g 将 ObserverConnectIdentifier 数据库属性添加到 Broker 配置,使您可以为观察器指定一个连接标识符,用于监视主数据库和故障切换目标。默认情况下,观察器和 Data Guard 使用相同的连接标识符在主数据库和备用数据库间进行重做传输和信息交换(Oracle 数据库 11g 中为 DGConnectIdentifier,Oracle 数据库 10g 中为 InitialConnectIdentifier)。ObserverConnectIdentifier 使您可以指定观察器使用不同的连接标识符。例如,您可以用此参数使观察器使用与客户端应用程序相同的连接标识符监视数据库。
  • 在本指南中,我们将在保留其他属性的默认值,但您应熟悉所有 Broker 配置和数据库属性。Data Guard Broker 文档(10g 和 11g)第 9 章中包含了每个属性的描述。其中一些属性已经在这两个版本中有所改动。

注: Broker 的许多数据库属性与数据库 spfile 参数相对应。Broker 在角色转换、数据库启动/关闭以及其他事件期间,通过执行相应的 ALTER SYSTEM 命令来维护这些参数。如果这些参数在 Broker 外部进行了修改,将出现警告。要查看特定参数,使用“show database … StatusReport”命令。

本实验需要设置LogXptMode, NetTimeout:

edit database 'OTTER_DR' set property LogXptMode='ASYNC';
edit database 'OTTER_PR' set property LogXptMode='ASYNC';
edit database 'OTTER_PR' set property NetTimeout=10;
edit database 'OTTER_DR' set property NetTimeout=10;

7.7.编辑最高可用模式属性

本实验使用最大性能模式, Data Guard默认设置,以下命令仅供参考,不需要运行。

edit configuration set protection mode as maxavilability;

ORA-16693错误说明:

ORA-16693: requirements not met for enabling fast-start failover
Cause: The attempt to enable fast-start failover could not be completed because one or more requirements were not met:

  • The effective redo transport mode for both the primary database and the fast-start failover target standby database must be set to either SYNC or FASTSYNC if the configuration protection mode is set to MaxProtection or MaxAvailability mode.
  • The effective redo transport mode for both the primary database and the fast-start failover target standby database must be set to ASYNC if the configuration protection mode is set to MaxPerformance mode.
  • No valid target standby database was specified in the primary database FastStartFailoverTarget property prior to the attempt to enable fast-start failover, and more than one standby database exists in the Oracle Data Guard configuration.
  • fast-start failover target standby database did not receive redo data directly from the primary database when the protection mode was set to MaxPerformance mode. – The primary database has multiple routes to the fast-start failover target standby database.
  • Logical standby databases could not be the fast-start failover targets because the configuration protection mode was set to MaxProtection mode.

7.8.启用配置

Broker 将验证配置、设置两个数据库上的参数并启动托管恢复。 这可能需要几分钟的时间。监视主数据库和备用数据库上的警报日志是在监视 Broker 运行和熟悉其如何执行各种任务的好方法。

enable configuration;

验证配置
在继续前确保一切正常运行。

show configuration

Configuration - FSFO

  Protection Mode: MaxPerformance
  Members:
  OTTER_PR - Primary database
    OTTER_DR - Physical standby database

Fast-Start Failover:  Disabled

Configuration Status:
SUCCESS   (status updated 29 seconds ago)

7.9.启用FSFO

现在,启用 FSFO 以确保已满足所有前提。Broker 将在启用 FSFO 前验证配置已满足所有前提条件并将报告发现的任何问题。最常见的问题是不匹配 Data Guard 保护模式和 LogXptMode 属性,以及忘记在主数据库或备用数据库上启用闪回数据库。

注意,启用 FSFO 并不能使其完成自动故障切换的配置 — 这需要我们接下来将介绍的观察器。

DGMGRL> enable fast_start failover;
DGMGRL> show configuration;

Configuration - FSFO

  Protection Mode: MaxPerformance
  Members:
  OTTER_PR - Primary database
    Warning: ORA-16819: fast-start failover observer not started

    OTTER_DR - (*) Physical standby database
      Warning: ORA-16819: fast-start failover observer not started

Fast-Start Failover: Enabled in Potential Data Loss Mode

Configuration Status:
WARNING   (status updated 56 seconds ago)

在主数据库上运行 StatusReport 应验证导致错误的原因是缺少观察器。

show database 'OTTER_PR' statusreport;
STATUS REPORT
       INSTANCE_NAME   SEVERITY ERROR_TEXT
                   *    WARNING ORA-16819: fast-start failover observer not started

8.配置观察器

为了最大程度地体现 FSFO 的优势,观察器应与主数据库和备用数据库运行在不同主机上。理想情况下,主数据库、备用数据库和观察器将位于不同的地理区域。观察器非常轻型且需要较少的系统资源。观察器与主数据库/备用数据库互连(带宽和延迟将决定性能因素)不同,仅需很少的网络带宽并且对延迟不太敏感,这使其可以用于任何有可靠连接的地点。

由于观察器是 dgmgrl 会话的特定实例,观察器主机应安装 Oracle Client Administrator 软件或完整的 Oracle 数据库软件系列。

8.1.验证到主数据库和备用数据库的连接

[oracle@orapg:/oracle/app/oracle/12.2.0.1/network/admin]$scp oracle@dg1:/oracle/app/oracle/product/12.2.0.3/network/admin/sqlnet.ora .

[oracle@orapg:/oracle/app/oracle/12.2.0.1/network/admin]scp oracle@dg1:/oracle/app/oracle/product/12.2.0.3/network/admin/tnsnames.ora .
[oracle@orapg:/oracle/app/oracle/12.2.0.1/network/admin]$tnsping OTTER_PR_BROKER

TNS Ping Utility for Linux: Version 12.2.0.1.0 - Production on 15-JAN-2021 12:56:21

Copyright (c) 1997, 2016, Oracle.  All rights reserved.

Used parameter files:
/oracle/app/oracle/12.2.0.1/network/admin/sqlnet.ora


Used TNSNAMES adapter to resolve the alias
Attempting to contact (description= (address_list= (address=(protocol=tcp)(host=dg1)(port=1522))) (connect_data= (service_name=OTTER_PR_BROKER)))
OK (10 msec)
[oracle@orapg:/oracle/app/oracle/12.2.0.1/network/admin]$tnsping OTTER_DR_BROKER

TNS Ping Utility for Linux: Version 12.2.0.1.0 - Production on 15-JAN-2021 12:56:45

Copyright (c) 1997, 2016, Oracle.  All rights reserved.

Used parameter files:
/oracle/app/oracle/12.2.0.1/network/admin/sqlnet.ora


Used TNSNAMES adapter to resolve the alias
Attempting to contact (description= (address_list= (address=(protocol=tcp)(host=dg2)(port=1522))) (connect_data= (service_name=OTTER_DR_BROKER)))
OK (0 msec)

8.2.启动观察器

通过运行 dgmgrl 启动观察器并使用 SYS 凭证登录。 我们现在将以交互方式启动它以验证一切运行正常。注意,启动观察器后,终端会话将呈现挂起状态。这是正常现象。附录提供有关如何创建简单的包装脚本以将观察器作为后台进程启动的信息。

dgmgrl sys/password@OTTER_PR_BROKER
DGMGRL for Linux: Release 12.2.0.1.0 - Production on Wed Jan 13 10:01:01 2021

Copyright (c) 1982, 2017, Oracle and/or its affiliates.  All rights reserved.

Welcome to DGMGRL, type "help" for information.
Connected to "OTTER_PR"
Connected as SYSDBA.
DGMGRL> start observer;
[W000 01/13 10:01:15.60] FSFO target standby is OTTER_DR
[W000 01/13 10:01:17.96] Observer trace level is set to USER
[W000 01/13 10:01:17.96] Try to connect to the primary.
[W000 01/13 10:01:17.96] Try to connect to the primary otter_pr.
[W000 01/13 10:01:17.99] The standby OTTER_DR is ready to be a FSFO target
[W000 01/13 10:01:19.99] Connection to the primary restored!
[W000 01/13 10:01:22.00] Disconnecting from database otter_pr.

此时,终端会话将呈现挂起状态。

8.3.验证配置

在单独的终端会话中,验证配置。 本示例介绍了用于提供 FSFO 特定信息的“show configuration”命令的详细模式。如果状态为 SUCCESS,您可以开始测试角色转换。

dgmgrl sys/password@OTTER_PR_BROKER

DGMGRL> show configuration verbose

Configuration - FSFO

  Protection Mode: MaxPerformance
  Members:
  OTTER_PR - Primary database
    OTTER_DR - (*) Physical standby database

  (*) Fast-Start Failover target

  Properties:
    FastStartFailoverThreshold      = '30'
    OperationTimeout                = '30'
    TraceLevel                      = 'USER'
    FastStartFailoverLagLimit       = '30'
    CommunicationTimeout            = '180'
    ObserverReconnect               = '0'
    FastStartFailoverAutoReinstate  = 'TRUE'
    FastStartFailoverPmyShutdown    = 'TRUE'
    BystandersFollowRoleChange      = 'ALL'
    ObserverOverride                = 'FALSE'
    ExternalDestination1            = ''
    ExternalDestination2            = ''
    PrimaryLostWriteAction          = 'CONTINUE'
    ConfigurationWideServiceName    = 'OTTER_CFG'

Fast-Start Failover: ENABLED

  Threshold:          30 seconds
  Target:             OTTER_DR
  Observer:           orapg
  Lag Limit:          30 seconds
  Shutdown Primary:   TRUE
  Auto-reinstate:     TRUE
  Observer Reconnect: (none)
  Observer Override:  FALSE

Configuration Status:
SUCCESS

查询OTTER_PR的配置:
The Observer will try to connect the databases using the ObserverConnectIdentifier.

If this is null, it will use the DGConnectIdentifier. Make sure to include these two entries in the tnsnames.ora and in the Oracle Wallet of Observer’s host.

DGMGRL> show database verbose 'OTTER_PR';

Database - OTTER_PR

  Role:               PRIMARY
  Intended State:     TRANSPORT-ON
  Instance(s):
    OTTER

  Properties:
    DGConnectIdentifier             = 'otter_pr_broker'
    ObserverConnectIdentifier       = ''
    FastStartFailoverTarget         = 'OTTER_DR'
    PreferredObserverHosts          = ''
    LogShipping                     = 'ON'
    RedoRoutes                      = ''
    LogXptMode                      = 'ASYNC'
    DelayMins                       = '0'
    Binding                         = 'OPTIONAL'
    MaxFailure                      = '0'
    ReopenSecs                      = '300'
    NetTimeout                      = '10'
    RedoCompression                 = 'DISABLE'
    PreferredApplyInstance          = ''
    ApplyInstanceTimeout            = '0'
    ApplyLagThreshold               = '30'
    TransportLagThreshold           = '30'
    TransportDisconnectedThreshold  = '30'
    ApplyParallel                   = 'AUTO'
    ApplyInstances                  = '0'
    StandbyFileManagement           = ''
    ArchiveLagTarget                = '0'
    LogArchiveMaxProcesses          = '0'
    LogArchiveMinSucceedDest        = '0'
    DataGuardSyncLatency            = '0'
    LogArchiveTrace                 = '0'
    LogArchiveFormat                = ''
    DbFileNameConvert               = ''
    LogFileNameConvert              = ''
    ArchiveLocation                 = ''
    AlternateLocation               = ''
    StandbyArchiveLocation          = ''
    StandbyAlternateLocation        = ''
    InconsistentProperties          = '(monitor)'
    InconsistentLogXptProps         = '(monitor)'
    LogXptStatus                    = '(monitor)'
    SendQEntries                    = '(monitor)'
    RecvQEntries                    = '(monitor)'
    HostName                        = 'dg1'
    StaticConnectIdentifier         = '(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=dg1)(PORT=1521))(ADDRESS=(PROTOCOL=TCP)(HOST=dg1)(PORT=1522)))(CONNECT_DATA=(SERVICE_NAME=OTTER_PR_DGMGRL)(INSTANCE_NAME=OTTER)(SERVER=DEDICATED)))'
    TopWaitEvents                   = '(monitor)'
    SidName                         = '(monitor)'

  Log file locations:
    Alert log               : /oracle/app/oracle/diag/rdbms/otter_pr/OTTER/trace/alert_OTTER.log
    Data Guard Broker log   : /oracle/app/oracle/diag/rdbms/otter_pr/OTTER/trace/drcOTTER.log

Database Status:
SUCCESS

查询OTTER_DR的配置:

DGMGRL> show database verbose 'OTTER_DR';

Database - OTTER_DR

  Role:               PHYSICAL STANDBY
  Intended State:     APPLY-ON
  Transport Lag:      0 seconds (computed 1 second ago)
  Apply Lag:          0 seconds (computed 1 second ago)
  Average Apply Rate: 1.00 KByte/s
  Active Apply Rate:  281.00 KByte/s
  Maximum Apply Rate: 281.00 KByte/s
  Real Time Query:    ON
  Instance(s):
    OTTER

  Properties:
    DGConnectIdentifier             = 'otter_dr_broker'
    ObserverConnectIdentifier       = ''
    FastStartFailoverTarget         = 'OTTER_PR'
    PreferredObserverHosts          = ''
    LogShipping                     = 'ON'
    RedoRoutes                      = ''
    LogXptMode                      = 'ASYNC'
    DelayMins                       = '0'
    Binding                         = 'optional'
    MaxFailure                      = '0'
    ReopenSecs                      = '300'
    NetTimeout                      = '10'
    RedoCompression                 = 'DISABLE'
    PreferredApplyInstance          = ''
    ApplyInstanceTimeout            = '0'
    ApplyLagThreshold               = '30'
    TransportLagThreshold           = '30'
    TransportDisconnectedThreshold  = '30'
    ApplyParallel                   = 'AUTO'
    ApplyInstances                  = '0'
    StandbyFileManagement           = ''
    ArchiveLagTarget                = '0'
    LogArchiveMaxProcesses          = '0'
    LogArchiveMinSucceedDest        = '0'
    DataGuardSyncLatency            = '0'
    LogArchiveTrace                 = '0'
    LogArchiveFormat                = ''
    DbFileNameConvert               = ''
    LogFileNameConvert              = ''
    ArchiveLocation                 = ''
    AlternateLocation               = ''
    StandbyArchiveLocation          = ''
    StandbyAlternateLocation        = ''
    InconsistentProperties          = '(monitor)'
    InconsistentLogXptProps         = '(monitor)'
    LogXptStatus                    = '(monitor)'
    SendQEntries                    = '(monitor)'
    RecvQEntries                    = '(monitor)'
    HostName                        = 'dg2'
    StaticConnectIdentifier         = '(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=dg2)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=OTTER_DR_DGMGRL)(INSTANCE_NAME=OTTER)(SERVER=DEDICATED)))'
    TopWaitEvents                   = '(monitor)'
    SidName                         = '(monitor)'

  Log file locations:
    Alert log               : /oracle/app/oracle/diag/rdbms/otter_dr/OTTER/trace/alert_OTTER.log
    Data Guard Broker log   : /oracle/app/oracle/diag/rdbms/otter_dr/OTTER/trace/drcOTTER.log

Database Status:
SUCCESS

使用“show fast_start failover”命令查看FSFO故障切换是否有效。

DGMGRL> show fast_start failover;

Fast-Start Failover: ENABLED

  Threshold:          30 seconds
  Target:             OTTER_DR
  Observer:           orapg
  Lag Limit:          30 seconds
  Shutdown Primary:   TRUE
  Auto-reinstate:     TRUE
  Observer Reconnect: (none)
  Observer Override:  FALSE

Configurable Failover Conditions
  Health Conditions:
    Corrupted Controlfile          YES
    Corrupted Dictionary           YES
    Inaccessible Logfile            NO
    Stuck Archiver                  NO
    Datafile Write Errors          YES

  Oracle Error Conditions:
    (none)

9.测试

配置的真实测试是成功的进行双向角色转换,包括转换和 FSFO 故障切换。

9.1.测试双向转换(Switch Over)

为了实现完全自动转换,Broker 需要 SYSDBA 凭证以重新启动两个数据库或其中一个数据库。 如果没有该凭证,Broker 仍将完成角色转换,但需要手动重新启动数据库。
转换之前,请确保备库处于实时应用状态,没有GAP。

DGMGRL> switchover to 'OTTER_DR';
Performing switchover NOW, please wait...
Operation requires a connection to database "OTTER_DR"
Connecting ...
Connected to "OTTER_DR"
Connected as SYSDBA.
New primary database "OTTER_DR" is opening...
Operation requires start up of instance "OTTER" on database "OTTER_PR"
Starting instance "OTTER"...
ORACLE instance started.
Database mounted.
Connected to "OTTER_PR"
Switchover succeeded, new primary is "OTTER_DR"

dg1:

SQL> select database_role, open_mode, switchover_status from v$database;

DATABASE_ROLE    OPEN_MODE            SWITCHOVER_STATUS
---------------- -------------------- --------------------
PHYSICAL STANDBY MOUNTED              NOT ALLOWED

dg2:

SQL> select database_role, open_mode, switchover_status from v$database;

DATABASE_ROLE    OPEN_MODE            SWITCHOVER_STATUS
---------------- -------------------- --------------------
PRIMARY          READ WRITE           TO STANDBY

现在,让我们来测试另一个方向的转换。

DGMGRL> switchover to 'OTTER_PR';
Performing switchover NOW, please wait...
Operation requires a connection to database "OTTER_PR"
Connecting ...
Connected to "OTTER_PR"
Connected as SYSDBA.
New primary database "OTTER_PR" is opening...
Operation requires start up of instance "OTTER" on database "OTTER_DR"
Starting instance "OTTER"...
ORACLE instance started.
Database mounted.
Connected to "OTTER_DR"
Switchover succeeded, new primary is "OTTER_PR"

dg1:

SQL> select database_role, open_mode, switchover_status from v$database;

DATABASE_ROLE    OPEN_MODE            SWITCHOVER_STATUS
---------------- -------------------- --------------------
PRIMARY          READ WRITE           TO STANDBY

dg2:

SQL> select database_role, open_mode, switchover_status from v$database;

DATABASE_ROLE    OPEN_MODE            SWITCHOVER_STATUS
---------------- -------------------- --------------------
PHYSICAL STANDBY MOUNTED              NOT ALLOWED

9.2.双向测试 FSFO 故障切换

现在我们知道Switch Over已成功,该测试故障切换了。测试 FSFO 故障切换需要模拟缺少主数据库。最简单的方法是中止主数据库。这将使观察器在达到 FSFO 阈值延时(默认值为 30 秒)后,启动故障切换。在中止主数据库后,查看两个数据库上的警报日志和观察器日志对深入了解 FSFO 故障切换期间所发生的情况有很大帮助。

我们还可以使用 dgmgrl 故障切换命令来启动故障切换。这将演练配置,但触发故障切换的方式与失去与主数据库的连接不同。

检查闪回数据库记忆
我们希望故障切换测试完成后,观察器能够自动将之前的主数据库恢复为备用数据库,因此在每次测试之前,确保闪回数据库至少有 30 分钟的历史记录。每次故障切换测试前执行此操作。如果闪回数据库历史记录不足,观察器将不能进行恢复,而您将需要手动从备份或主数据库副本进行恢复。

在即将中止的主数据库上(主库):

SQL> select (sysdate - oldest_flashback_time)*24*60 as history from v$flashback_database_log;

   HISTORY
----------
19748.2833

如果没有 30 分钟内的可用历史记录,不要启动故障切换。

创建一些测试数据
如果我们不须验证是否满足持久性约束条件,顶多是个测试,因此我们在主数据库上进行一些更改并查看更改是否在故障切换后仍然存在。本指南使用最高可用性模式实现“零数据丢失”。

作为测试用户登录,并进行一些不会影响系统其他部分的更改。

-- Note that DDL statements automatically commit
create table x as select * from all_objects;

Table created.
select count(*) from x;

  COUNT(*)
----------
     71412

启动 FSFO 故障切换
中止主数据库。

shutdown abort

观察器日志:

[W000 01/15 14:07:00.47] Try to connect to the primary.
[W000 01/15 14:07:02.95] Primary database cannot be reached.
[W000 01/15 14:07:03.95] Try to connect to the primary.
[W000 01/15 14:07:27.56] Primary database cannot be reached.
[W000 01/15 14:07:27.57] Fast-Start Failover threshold has not exceeded. Retry for the next 2 seconds
[W000 01/15 14:07:28.57] Try to connect to the primary.
[W000 01/15 14:07:31.23] Primary database cannot be reached.
[W000 01/15 14:07:31.23] Fast-Start Failover threshold has expired.
[W000 01/15 14:07:31.23] Try to connect to the standby.
[W000 01/15 14:07:31.23] Making a last connection attempt to primary database before proceeding with Fast-Start Failover.
[W000 01/15 14:07:31.23] Check if the standby is ready for failover.
[S005 01/15 14:07:31.24] Fast-Start Failover started...

14:07:31.24  Friday, January 15, 2021
Initiating Fast-Start Failover to database "OTTER_DR"...
[S005 01/15 14:07:31.24] Initiating Fast-start Failover.
Performing failover NOW, please wait...
Failover succeeded, new primary is "OTTER_DR"
14:07:48.44  Friday, January 15, 2021
[S005 01/15 14:07:48.44] Fast-Start Failover finished...
[W000 01/15 14:07:48.44] Failover succeeded. Restart pinging.
[W000 01/15 14:07:48.45] Primary database has changed to OTTER_DR.
[W000 01/15 14:07:48.45] Try to connect to the primary.
[W000 01/15 14:07:48.45] Try to connect to the primary otter_dr_broker.
[W000 01/15 14:07:49.76] Connection to the primary restored!
[W000 01/15 14:07:49.77] The standby OTTER_PR needs to be reinstated
[W000 01/15 14:07:49.77] Try to connect to the new standby OTTER_PR.
[W000 01/15 14:07:50.77] Disconnecting from database otter_dr_broker.
[W000 01/15 14:07:52.78] Connection to the new standby restored!
[W000 01/15 14:07:52.79] Failed to ping the new standby.
[W000 01/15 14:07:53.79] Try to connect to the new standby OTTER_PR.
[W000 01/15 14:07:56.81] Connection to the new standby restored!
[W000 01/15 14:07:58.81] Failed to ping the new standby.
[W000 01/15 14:07:58.81] Try to connect to the new standby OTTER_PR.
[W000 01/15 14:08:01.83] Connection to the new standby restored!
[W000 01/15 14:08:01.84] Failed to ping the new standby.
[W000 01/15 14:08:02.84] Try to connect to the new standby OTTER_PR.
[W000 01/15 14:08:05.85] Connection to the new standby restored!
[W000 01/15 14:08:07.86] Failed to ping the new standby.
[W000 01/15 14:08:07.86] Try to connect to the new standby OTTER_PR.
[W000 01/15 14:08:10.88] Connection to the new standby restored!
[W000 01/15 14:08:11.89] Failed to ping the new standby.
[W000 01/15 14:08:12.89] Try to connect to the new standby OTTER_PR.
[W000 01/15 14:08:15.90] Connection to the new standby restored!
[W000 01/15 14:08:17.90] Failed to ping the new standby.
[W000 01/15 14:08:18.91] Try to connect to the new standby OTTER_PR.
[W000 01/15 14:08:21.91] Connection to the new standby restored!
[W000 01/15 14:08:22.93] Failed to ping the new standby.
[W000 01/15 14:08:23.93] Try to connect to the new standby OTTER_PR.
[W000 01/15 14:08:26.94] Connection to the new standby restored!
[W000 01/15 14:08:28.95] Failed to ping the new standby.
[W000 01/15 14:08:28.95] Try to connect to the new standby OTTER_PR.
[W000 01/15 14:08:31.96] Connection to the new standby restored!
[W000 01/15 14:08:31.97] Failed to ping the new standby.
[W000 01/15 14:08:32.97] Try to connect to the new standby OTTER_PR.
[W000 01/15 14:08:35.99] Connection to the new standby restored!
[W000 01/15 14:08:37.99] Failed to ping the new standby.
[W000 01/15 14:08:37.99] Try to connect to the new standby OTTER_PR.
[W000 01/15 14:08:41.02] Connection to the new standby restored!
[W000 01/15 14:08:43.02] Failed to ping the new standby.
[W000 01/15 14:08:43.02] Try to connect to the new standby OTTER_PR.
[W000 01/15 14:08:46.03] Connection to the new standby restored!
[W000 01/15 14:08:46.04] Failed to ping the new standby.
[W000 01/15 14:08:47.04] Try to connect to the new standby OTTER_PR.
[W000 01/15 14:08:50.05] Connection to the new standby restored!
[W000 01/15 14:08:52.05] Failed to ping the new standby.
[W000 01/15 14:08:52.06] Try to connect to the new standby OTTER_PR.
[W000 01/15 14:08:55.07] Connection to the new standby restored!
[W000 01/15 14:08:56.08] Failed to ping the new standby.
[W000 01/15 14:08:57.09] Try to connect to the new standby OTTER_PR.
[W000 01/15 14:09:00.10] Connection to the new standby restored!
[W000 01/15 14:09:02.11] Failed to ping the new standby.
[W000 01/15 14:09:03.11] Try to connect to the new standby OTTER_PR.
[W000 01/15 14:09:06.13] Connection to the new standby restored!
[W000 01/15 14:09:07.14] Failed to ping the new standby.
[W000 01/15 14:09:08.14] Try to connect to the new standby OTTER_PR.
[W000 01/15 14:09:11.16] Connection to the new standby restored!
[W000 01/15 14:09:13.16] Failed to ping the new standby.
[W000 01/15 14:09:13.16] Try to connect to the new standby OTTER_PR.
[W000 01/15 14:09:16.17] Connection to the new standby restored!
[W000 01/15 14:09:17.17] Failed to ping the new standby.
[W000 01/15 14:09:18.18] Try to connect to the new standby OTTER_PR.
[W000 01/15 14:09:21.19] Connection to the new standby restored!
[W000 01/15 14:09:22.20] Failed to ping the new standby.
[W000 01/15 14:09:23.20] Try to connect to the new standby OTTER_PR.
[W000 01/15 14:09:26.21] Connection to the new standby restored!
[W000 01/15 14:09:28.22] Failed to ping the new standby.
[W000 01/15 14:09:28.22] Try to connect to the new standby OTTER_PR.
[W000 01/15 14:09:31.23] Connection to the new standby restored!
[W000 01/15 14:09:32.24] Failed to ping the new standby.
[W000 01/15 14:09:33.24] Try to connect to the new standby OTTER_PR.
[W000 01/15 14:09:36.26] Connection to the new standby restored!
[W000 01/15 14:09:37.27] Failed to ping the new standby.
[W000 01/15 14:09:38.27] Try to connect to the new standby OTTER_PR.
[W000 01/15 14:09:41.28] Connection to the new standby restored!
[W000 01/15 14:09:43.28] Failed to ping the new standby.
[W000 01/15 14:09:43.28] Try to connect to the new standby OTTER_PR.
[W000 01/15 14:09:46.30] Connection to the new standby restored!
[W000 01/15 14:09:46.31] Failed to ping the new standby.
[W000 01/15 14:09:47.32] Try to connect to the new standby OTTER_PR.
[W000 01/15 14:09:50.33] Connection to the new standby restored!
[W000 01/15 14:09:52.33] Failed to ping the new standby.
[W000 01/15 14:09:52.33] Try to connect to the new standby OTTER_PR.
[W000 01/15 14:09:55.35] Connection to the new standby restored!
[W000 01/15 14:09:55.36] Failed to ping the new standby.
[W000 01/15 14:09:56.36] Try to connect to the new standby OTTER_PR.
[W000 01/15 14:09:59.37] Connection to the new standby restored!
[W000 01/15 14:10:01.37] Failed to ping the new standby.
[W000 01/15 14:10:01.37] Try to connect to the new standby OTTER_PR.
Unable to connect to database using otter_pr_broker
ORA-12537: TNS:connection closed

通过登录到新主数据库上的 dgmgrl 查看 Broker 配置。 您会看到之前的主数据库现在已禁用。

DGMGRL> show configuration;

Configuration - FSFO

  Protection Mode: MaxPerformance
  Members:
  OTTER_DR - Primary database
    Warning: ORA-16824: multiple warnings, including fast-start failover-related warnings, detected for the database

    OTTER_PR - (*) Physical standby database (disabled)
      ORA-16661: the standby database needs to be reinstated

Fast-Start Failover: ENABLED

Configuration Status:
WARNING   (status updated 49 seconds ago)

查看测试数据
登录到新的主数据库并验证更改与之前主数据库一致。

SQL> select count(*) from x;

  COUNT(*)
----------
     71267

将之前中止的主数据库恢复为备用数据库

通过mount数据库启动恢复。注意,数据库此时不会打开。仅当观察器验证主数据库仍为主数据库后,主数据库才会打开。如果观察器发现该数据库不再是主数据库,会尝试将其恢复为故障切换的目标备用数据库。

恢复的第一步是将数据库闪回到备用数据库变为主数据库的 SCN 处(新主数据库上的 v$database.standby_became_primary_scn)。如闪回数据库部分中所述,闪回数据库将分成两个阶段进行:恢复阶段和介质恢复阶段。在恢复阶段,闪回数据库使用闪回数据库日志中的前映像块将数据库恢复到 standby_became_primary_scn 之前的一点。在介质恢复阶段中,闪回数据库应用重做以将数据库带到 standby_became_primary_scn。为使闪回数据库成功,闪回数据库日志中必须包括足够的可用历史记录,并且恢复点和 standby_became_primary_scn 之间生成的所有重做必须可用。如果闪回数据库失败,自动恢复将停止,您将需要手动执行基于 SCN 的恢复以恢复到 standby_became_primary_scn,直到完成该恢复。

一旦闪回数据库成功,观察器会将该数据库转换为备用数据库,重启并开始应用服务。

原主库:

startup mount

观察器:

[oracle@orapg:/home/oracle]$dgmgrl sys/oracle@OTTER_DR_BROKER
DGMGRL> show configuration;

Configuration - FSFO

  Protection Mode: MaxPerformance
  Members:
  OTTER_DR - Primary database
    Warning: ORA-16829: fast-start failover configuration is lagging

    OTTER_PR - (*) Physical standby database

Fast-Start Failover: ENABLED

Configuration Status:
WARNING   (status updated 28 seconds ago)

观察器日志:

[W000 01/15 14:10:31.47] Try to connect to the new standby OTTER_PR.
[W000 01/15 14:10:33.48] Connection to the new standby restored!
[W000 01/15 14:10:36.50] Try to connect to the primary otter_dr_broker.
[W000 01/15 14:10:38.51] Connection to the primary restored!
[W000 01/15 14:10:39.51] Wait for new primary to be ready to reinstate.
[W000 01/15 14:10:40.51] New primary is now ready to reinstate.
[W000 01/15 14:10:40.52] Issuing REINSTATE command.

14:10:40.52  Friday, January 15, 2021
Initiating reinstatement for database "OTTER_PR"...
Reinstating database "OTTER_PR", please wait...
[W000 01/15 14:10:58.59] The standby OTTER_PR is ready to be a FSFO target
Reinstatement of database "OTTER_PR" succeeded
14:11:15.83  Friday, January 15, 2021
[W000 01/15 14:11:16.64] Successfully reinstated database OTTER_PR.
[W000 01/15 14:11:16.65] The reinstatement of standby OTTER_PR was just done

等待几分钟再次查看:

DGMGRL> show configuration;

Configuration - FSFO

  Protection Mode: MaxPerformance
  Members:
  OTTER_DR - Primary database
    OTTER_PR - (*) Physical standby database

Fast-Start Failover: ENABLED

Configuration Status:
SUCCESS   (status updated 8 seconds ago)

在另一个方向重复操作
现在,对 FSFO 故障切换回到原始主数据库的操作进行测试。进行一些新的更改并验证故障切换执行后这些更改仍然存在。记住,在中止主数据库前查看闪回数据库历史记录。

9.3.主数据库停滞

还有一个很好的测试是模拟网络故障时主数据库仍然运行,但与故障切换目标备用数据库和观察器失去联系。当主数据库同时失去与故障切换目标和观察器的联系时,将进入“停滞”状态 (v$database.fs_failover_status = ‘STALLED’),所有仍与主数据库连接的会话将在提交时阻塞。查询和 DML 将继续运行 — 只有提交的会话会阻塞。这将防止在故障切换时出现“裂脑”情况,因为对隔离主数据库进行的更改并不是永久性的。运行 10.2.0.4 之前版本的 Oracle 数据库 10g 数据库将处于停滞状态,直至您中止或由观察器设定其在连接恢复后仍然作为主数据库。从 10.2.0.4 开始(包括 11g 以及之后的所有版本),Oracle 将提供 FastStartFailoverPmyShutdown Broker 属性,如果超过 FSFO 阈值超时后主数据库仍然处于停滞状态,您可以通过该属性指定主数据库应该进行何种操作。如果该属性设置为“TRUE”(默认值),主数据库将自己终止。如果该属性设置为“FALSE”,数据库仍然处于打开的停滞状态,直至终止或者通知其在未发生故障切换的时间内继续运行(例如,停滞开始后但未到达故障切换超时前,观察器被终止)。

本指南中介绍的简单测试有利于确保基础部分正常工作,但是您可能希望开发一套更全面、更适合您的环境和需求的测试。

10.监视 FSFO 准备情况

启用 FSFO 的系统和可以运行 FSFO 的系统有很大的区别。 可以运行 FSFO 意味着满足成功进行故障切换的所有条件,包括正在运行的观察器和为满足持久性需求传输到故障切换目标的重做。本部分将介绍如何保持顶级 FSFO 环境。

10.1.查询Broker

如果 Broker 报错,使用 Broker 的“show configuration”命令确定 FSFO 状态,使用“show database statusreport”命令查看详细信息。

10.2.查询主数据库

v$database 视图有专门用于监视 FSFO 状态的栏。 以下所列的值表示 FSFO 可以进行故障切换。有关详细信息,请参阅所用版本的 Oracle 参考资料和 Data Guard 管理员指南。

fs_failover_status =
'SYNCHRONIZED' for MaxAvailability
'TARGET UNDER LAG LIMIT' for MaxPerformance

fs_failover_observer_present = 'YES'

SQL> select fs_failover_status, fs_failover_observer_present from v$database;

FS_FAILOVER_STATUS     FS_FAIL
---------------------- -------
TARGET UNDER LAG LIMIT YES

10.3.备用数据库应用

没有什么比发现观察器刚刚进行故障切换的备用数据库在应用重做中落后 12 个小时更糟糕的了。 如果故障切换目标收到了所有用于满足您持久性需求的重做,这将完美地满足观察器的所有条件。观察器不会考虑已经应用了多少重做。所以,请准备一个在备用数据库应用落后过多时通知相关人员的办法。

10.4.闪回数据库历史记录

确保故障主数据库可自动恢复的重要性仅次于确保您拥有一个良好的主数据库。监视闪回数据库历史记录并在其降至 30 分钟以下时作出反应,这将节省时间并提高可用性。这还能在数据库在 FSFO 启动后的某一点禁用闪回数据库时向您发出警报。启用 FSFO 后,Broker 将检查主数据库和故障切换目标上的闪回数据库是否已启用。启用 FSFO 后,Broker 将继续在运行状况检查期间查看闪回数据库是否已启用。如果检测到闪回数据库因自身发现问题而遭禁用(无论手动或自动),Broker 将提示“ORA-16827: Flashback Database is disabled”。

10.5.客户端通知

如果应用程序和其他数据库客户端不知道主数据库的去向,故障切换数据库也无法发挥很好的作用。如果客户端已配置为收不到数据库响应时自动超时并重新连接,使用网络别名(例如,DNS CNAME)解决主数据库的问题将是一个简单有效的方法。进行角色更改后,命名服务将更新为新的主数据库地址。可以使用 DB_ROLE_CHANGE 系统上的触发器更新命名服务,并且客户端可以借助相应的客户端缓存 TTL 设置非常快速地连接到新的主数据库。

Oracle 还为 OCI 客户端提供快速应用通知 (FAN),为 JDBC 客户端提供快速连接故障切换。这些功能使利用它们的应用程序可以异步接收数据库事件的通知(包括角色转换)。

11.创建钱夹

虽然并非严格要求,但创建钱夹将为存储启动观察器时自动连接到主数据库所需的凭证提供一个安全的方法。Oracle 数据库安全指南中的“配置身份认证”一章提供了有关如何创建钱夹的详细说明。

Oracle 数据库 10g FSFO 观察器限于使用钱夹中定义的默认用户名和口令。在 10g 中,一个钱夹可以用于多个观察器,但必须使用相同的 SYS 口令。Oracle 数据库 11g 观察器可使用特定的凭证,允许同一钱夹在多个观察器中使用不同的 SYS 口令。

Task for setup the wallet connection:

  • Create a wallet directory
  • Adjust the sqlnet.ora on the observer
  • Create the wallet and the credentials
  • Test the connections via wallets
  • Adjust connection string in your shell scripts

只需要在观察器服务器中设置。

11.1.创建存储钱夹的目录

mkdir -p /oracle/app/oracle/admin/wallet

11.2.sqlnet.ora

cd $TNS_ADMI
vi sqlnet.ora

WALLET_LOCATION =
  (SOURCE =
    (METHOD = FILE)
    (METHOD_DATA =
      (DIRECTORY = /oracle/app/oracle/admin/wallet)
    )
  )
SQLNET.WALLET_OVERRIDE = TRUE

11.3.创建钱夹和密码

1)Create a wallet
Create a wallet on the client by using the following syntax at the command line:

mkstore -wrl <wallet_location> -create

This command creates an Oracle Wallet with the autologin feature enabled at the location specified. Autologin enables the client to access the Wallet contents without supplying a password and prevents exposing a clear text password on the client.

The mkstore command prompts for a password that is used for subsequent commands. Passwords must have a minimum length of eight characters and contain alphabetic characters combined with numbers or special characters.

mkstore -wrl /oracle/app/oracle/admin/wallet -create
Enter password:
PKI-01002: Invalid password. Passwords must have a minimum length of eight characters and contain alphabetic characters combined with numbers or special characters.
Enter password:oracle123
Enter password again:

Note:
Using Oracle Wallet moves the security vulnerability from a clear text password in the datasource configuration file to an encrypted password in the Wallet file. Make sure the Wallet file is stored in a secure location.

2)Add a login credentials to wallet

You can store multiple credentials for multiple databases in one client wallet. You cannot store multiple credentials (for logging in to multiple schemas) for the same database in the same wallet. If you have multiple login credentials for the same database, then they must be stored in separate wallets.

To add database login credentials to an existing client wallet, enter the following command at the command line:

mkstore -wrl <wallet_location> -createCredential <db_connect_string> <username> <password>

The db_connect_string must be identical to the connection string that you specify in the URL used in the datasource definition (the part of the string that follows the @). It can be either the short form or the long form of the URL. For example:

myhost:1521/myservice or

(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=myhost-scan)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=myservice)))

You should enclose this value in quotation marks to escape any special characters from the shell. Since this name is generally a long and complex value, an alternative is to use TNS aliases.

The username and password are the database login credentials.

添加SYS用户密码:

mkstore -wrl /oracle/app/oracle/admin/wallet -createCredential OTTER_PR_BROKER SYS oracle

Oracle Secret Store Tool Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
Copyright (c) 2004, 2019, Oracle and/or its affiliates. All rights reserved.

Enter wallet password:oracle123


mkstore -wrl /oracle/app/oracle/admin/wallet -createCredential OTTER_DR_BROKER SYS oracle

Oracle Secret Store Tool Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
Copyright (c) 2004, 2019, Oracle and/or its affiliates. All rights reserved.

Enter wallet password:oracle123

查询

mkstore -wrl /oracle/app/oracle/admin/wallet -listCredential

Oracle Secret Store Tool Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
Copyright (c) 2004, 2019, Oracle and/or its affiliates. All rights reserved.

Enter wallet password:oracle123
List credential (index: connect_string username)
2: OTTER_DR_BROKER SYS
1: OTTER_PR_BROKER SYS

[oracle@dg1:/oracle/app/oracle/admin/wallet]$ls -l
total 8
-rw------- 1 oracle oinstall 957 Jan 15 19:56 cwallet.sso
-rw------- 1 oracle oinstall   0 Jan 15 19:56 cwallet.sso.lck
-rw------- 1 oracle oinstall 912 Jan 15 19:56 ewallet.p12
-rw------- 1 oracle oinstall   0 Jan 15 19:56 ewallet.p12.lck

11.4.Test the connections via wallets

[oracle@orapg:/oracle/app/oracle/admin/wallet]sqlplus /@OTTER_DR_BROKER as sysdba

SQL*Plus: Release 12.2.0.1.0 Production on Fri Jan 15 20:16:01 2021

Copyright (c) 1982, 2016, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production

SQL> exit
Disconnected from Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
[oracle@orapg:/oracle/app/oracle/admin/wallet]sqlplus /@OTTER_PR_BROKER as sysdba

SQL*Plus: Release 12.2.0.1.0 Production on Fri Jan 15 20:16:16 2021

Copyright (c) 1982, 2016, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
[oracle@orapg:/home/oracle]$dgmgrl
DGMGRL for Linux: Release 12.2.0.1.0 - Production on Fri Jan 15 20:17:40 2021

Copyright (c) 1982, 2017, Oracle and/or its affiliates.  All rights reserved.

Welcome to DGMGRL, type "help" for information.
DGMGRL> connect /@OTTER_DR_BROKER
Connected to "OTTER_DR"
Connected as SYSDBA.
[oracle@orapg:/home/oracle]$dgmgrl
DGMGRL for Linux: Release 12.2.0.1.0 - Production on Fri Jan 15 20:18:26 2021

Copyright (c) 1982, 2017, Oracle and/or its affiliates.  All rights reserved.

Welcome to DGMGRL, type "help" for information.
DGMGRL> connect /@OTTER_PR_BROKER
Connected to "OTTER_PR"
Connected as SYSDBA.

11.5.Adjust connection string in your shell scripts

# Connection string to the primary
ConnectStringPR="/@OTTER_PR_BROKER"

# Connection string to the Standby
ConnectStringDR="/@OTTER_DR_BROKER"

12.创建观察器包装脚本

本部分将帮助您开始创建用于自动启动和重新启动 FSFO 观察器的包装脚本。 使用包装脚本在观察器主机启动时启动观察器进程,或者在观察器主机终止重新启动。您将决定如何编写包装和决定何时执行该包装的方法。

12.1.fsfo.dat文件

确定观察器状态文件(fsfo.dat 文件)的存储位置。
观察器在一个文件中维护状态信息。 默认情况下,该文件名为 fsfo.dat 并创建于启动观察器的工作目录。该状态文件在观察器运行时锁定,以防止多个观察器使用同一文件。为避免一台主机上运行的多个观察器时的锁定问题,通常来说,比较好的做法是将状态文件存储在与数据库相关联的目录中。

mkdir -p /oracle/app/oracle/admin/observer

12.2.观察器启动命令

Syntax:

START OBSERVER [ FILE=observer_configuration_filename ];
  • observer_configuration_filename: Specifies an explicit directory path and file name on the observer computer.
  • specify the -logfile optional parameter on the command line so that output generated while acting as the observer is not lost.
  • If a directory path is not specified with the FILE parameter, the observer searches the current working directory for the fsfo.dat file. If an fsfo.dat file is not found and this is the first time the START OBSERVER command is issued, the observer creates a fsfo.dat file.
  • The primary and target standby database DB_UNIQUE_NAME initialization parameter and connect identifiers are stored in the fsfo.dat configuration file. Oracle recommends you ensure this file is protected from unauthorized access.
  • Fast-start failover does not need to be enabled before you issue this command.
    If fast-start failover is enabled, the observer will retrieve primary and target standby connect identifiers from the broker configuration and begin monitoring the configuration.
    If fast-start failover is not enabled, the observer continually monitors for when fast-start failover is enabled.
  • Only the primary database needs to be running when you issue this command; the standby database that will be the target of a fast-start failover does not need to be running.
  • If the observer is stopped:
    Because the STOP OBSERVER command was issued, you can issue the START OBSERVER command on any computer to restart the observer.
    For any reason other than because the STOP OBSERVER command was issued, you must issue the START OBSERVER command on the same observer computer where it was started originally, with the observer configuration file used when the observer had first been started.
  • If an observer is already running, the START OBSERVER command fails and returns the following errors:ORA-16647: could not start more than one observer DGM-16954: unable to open and lock the Observer configuration file
  • If the primary and target standby databases stay connected but they lose the connection to the observer, then the primary database goes into an unobserved state. This state is reported by the broker’s health check capability.
  • Use the SHOW FAST_START FAILOVER command, the SHOW CONFIGURATION VERBOSE command, or query the FS_FAILOVER_* columns in the V$DATABASE view on the primary database to see the status of the observer and its host computer.

以下是观察器启动命令。 注意,使用“/@”登录将用到钱夹。

输入file参数出错:

nohup dgmgrl /@OTTER_PR_BROKER "start observer FSFO\
file='/oracle/app/oracle/admin/observer/fsfo.dat'" \
 >>  /oracle/app/oracle/admin/observer/dgmgrl.log &

不输入file参数运行成功:

[oracle@orapg:/home/oracle]nohup dgmgrl -logfile /oracle/app/oracle/admin/observer/dgmgrl.log /@OTTER_PR_BROKER "start observer FSFO" &
[1] 6423
[oracle@orapg:/home/oracle]nohup: ignoring input and appending output to ‘nohup.out’
[oracle@orapg:/oracle/app/oracle/admin/observer]cat dgmgrl.log
DGMGRL for Linux: Release 12.2.0.1.0 - Production on Fri Jan 15 21:13:18 2021

Copyright (c) 1982, 2017, Oracle and/or its affiliates.  All rights reserved.

Welcome to DGMGRL, type "help" for information.
Connected to "OTTER_PR"
Connected as SYSDBA.
Error: ORA-16814: duplicate name specified for observer

[W000 01/15 21:13:20.24] Failed to start the Observer.
DGMGRL for Linux: Release 12.2.0.1.0 - Production on Fri Jan 15 21:51:31 2021

Copyright (c) 1982, 2017, Oracle and/or its affiliates.  All rights reserved.

Welcome to DGMGRL, type "help" for information.
Connected to "OTTER_PR"
Connected as SYSDBA.
[W000 01/15 21:51:35.95] Observer 'fsfo' started
[W000 01/15 21:51:35.95] Observer trace level is set to USER
[W000 01/15 21:51:35.95] Try to connect to the primary.
[oracle@orapg:/oracle/app/oracle/admin/observer]$ps -ef|grep dgmgrl | grep -v grep
oracle     6423   6302  0 21:51 pts/0    00:00:01 dgmgrl -logfile /oracle/app/oracle/admin/observer/dgmgrl.log                   start observer FSFO

12.3.观察器的连接

所有与观察器相关联的连接(包括初始连接)必须使用专用服务器连接。 使用 Shared Server (MTS) 或连接池可能导致无法预料的行为。
为实现可靠的启动,初始连接应始终连接到主数据库。运行中的观察器在角色转换后将自动跟随主数据库,但如果初始连接连接到故障数据库或者包含过期或损坏的 Broker 配置文件的数据库,新(重新)启动的观察器将不会启动。
启动可能会失败,原因为“ORA-16647: could not start more than one observer”。如果之前的观察器进程未注销即终止并且新观察器未使用之前的 fsfo.dat 文件,那么即使没有观察器在实际运行也会发生该错误。如果出现此情形,运行“stop observer”并重试。
至少配置两台主机来运行观察器是一个比较好的办法,这样,其中一台主机可以在另一台发生故障时进行替换。

12.4.故障切换脚本

创建一个自动启动 FSFO 故障切换的脚本,并将其作为备用数据库翻转的标准方法。 这不仅节省了时间,还将使由自动化反向手动进程所引起的问题最小化。每一次翻转都将演练您的故障切换和 DR 流程,从中您将了解 FSFO 配置是多么出色,并且在真实的紧急情况下,每一个人都知道如何应对。故障切换将趋于常规。实际上,故障切换非常可靠、快捷和简便,转换将成为异常而不再是惯例。只需确保脚本中包括对闪回数据库历史记录的检查,从而在故障切换需要手动恢复时提供一个中止的选项。

12.5.DB_ROLE_CHANGE事件

当数据库在角色转换后第一次打开时,将触发 DB_ROLE_CHANGE 事件。 转换或故障切换后,在该事件上创建一个触发器来执行特定于您环境的操作,如将名称解析服务更新为指向新主数据库。尽量保持该触发器保持简单可靠,将其仅限定为角色转换时绝对必要,因为此时出现任何故障都将影响可用性。如果需要触发很多操作,请将它们放入一个单独的脚本中并使用触发器在一个孤立的进程或独立于数据库的线程中运行该脚本。

13.包括多个备用数据库的 FSFO 配置

13.1.备用节点

除故障切换目标之外的所有备用数据库被称为备用节点 (v$database.fs_failover_status = ‘BYSTANDER’)。备用节点是 Data Guard 配置的一部分,但不是 FSFO 配置的一部分。FSFO 配置在任何给定时间只包括两个数据库(主数据库和故障切换目标数据库)。

在故障切换期间,备用节点默认“跟随”主数据库,根据需要从新主数据库进行闪回和重应用重做。(是的,备用节点同样需要闪回数据库)。

在故障切换后,备用节点将不会自动成为新的故障切换目标。只有通过操作将故障切换目标更改为一个备用节点,新的主数据库只有在之前主数据库恢复为备用数据库后才能拥有一个故障切换目标。如果 Broker 配置更改为使备用节点成为新的故障切换目标(适用于故障数据库将停机一段时间的情况),观察器将不会自动恢复之前的主数据库,因为它将不再是 FSFO 配置的一部分。恢复需通过其他方式完成(手动或脚本 Broker 脚本命令)。

13.2.转换限制

启用 FSFO 的配置包含多个备用数据库,不能转换到非故障切换目标的备用数据库。 要转换到非当前故障切换目标的备用数据库:

禁用 FSFO
将故障切换目标更改为转换到的备用数据库
启用 FSFO
转换

或者

禁用 FSFO
转换
将故障切换目标更改为所希望的备用数据库
启用 FSFO