Starting with Oracle 10g, Oracle provides the more efficient Data Pump (expdp/impdp) for exporting and importing data. The old exp/imp utilities still work but are no longer recommended. Note that expdp/impdp and exp/imp are not compatible with each other: a file exported with exp can only be imported with imp, and a file exported with expdp can only be imported with impdp.
Data Pump components
- Client tools: expdp/impdp
- Data Pump API (DBMS_DATAPUMP)
- Metadata API (DBMS_METADATA)
Data Pump roles
- DATAPUMP_EXP_FULL_DATABASE
- DATAPUMP_IMP_FULL_DATABASE
Data Pump data loading methods
- Data file copying: the fastest method; the dump file contains only metadata and the data files are copied at the operating-system level. Related parameters: TRANSPORT_TABLESPACES, TRANSPORTABLE=ALWAYS
- Direct path load: the fastest method apart from file copying; it is used whenever possible, unless something prevents it (for example BFILE columns)
- External tables: used only when the first two methods cannot be used
- Conventional path load: used only when none of the methods above is available; its performance is poor
Data Pump jobs
- Master process: controls the entire job and acts as its coordinator.
- Master table: records metadata about the database objects in the dump file; expdp writes it into the dump file at the end of the export, and impdp reads it at the start of the import to learn what the dump file contains.
- Worker processes: perform the actual export/import work; worker processes are created automatically as needed to run in parallel, but never more than the number defined by the PARALLEL parameter.
Monitoring job status
Job status can be monitored through the views DBA_DATAPUMP_JOBS, USER_DATAPUMP_JOBS, and DBA_DATAPUMP_SESSIONS, as in the query sketch below.
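A minimal monitoring query against DBA_DATAPUMP_JOBS (the column list follows the standard view definition):
- SQL> SELECT owner_name, job_name, operation, job_mode, state, degree FROM dba_datapump_jobs;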
Creating a directory object
- SQL> CREATE DIRECTORY dpump_dir1 AS '/usr/apps/datafiles';
After creating the directory object, grant read and write privileges on it to the user who will run Data Pump, as shown below:
- SQL> GRANT READ, WRITE ON DIRECTORY dpump_dir1 TO hr;
Export modes
Full mode
Exports the entire database (FULL=YES). System schemas such as SYS, ORDSYS, and MDSYS are not included in a full export.
Schema mode
Exports one or more schemas (SCHEMAS parameter); by default the current user's schema is exported. Only users with the DATAPUMP_EXP_FULL_DATABASE role can export other schemas. Example:
- > expdp hr DIRECTORY=dpump_dir1 DUMPFILE=expdat.dmp SCHEMAS=hr,sh,oe
Table mode
- TABLES=[schema_name.]table_name[:partition_name] [, ...]
If schema_name is omitted, tables in the current user's schema are exported.
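A hedged example (the table names are illustrative): export a few tables owned by hr:
- > expdp hr DIRECTORY=dpump_dir1 DUMPFILE=tables.dmp TABLES=employees,jobs,departments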
Tablespace mode
- > expdp hr DIRECTORY=dpump_dir1 DUMPFILE=tbs.dmp
- TABLESPACES=tbs_4, tbs_5, tbs_6
Transportable Tablespace mode
- > expdp hr DIRECTORY=dpump_dir1 DUMPFILE=tts.dmp
- TRANSPORT_TABLESPACES=tbs_1 TRANSPORT_FULL_CHECK=YES LOGFILE=tts.log
Filtering during export
Data filtering
- QUERY = [schema.][table_name:] query_clause
- QUERY=employees:"WHERE department_id > 10 AND salary > 10000"
- NOLOGFILE=YES
- DIRECTORY=dpump_dir1
- DUMPFILE=exp1.dmp
The SAMPLE parameter specifies the percentage of rows to export; its syntax is:
- SAMPLE=[[schema_name.]table_name:]sample_percent
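A hedged example (the percentage and table name are illustrative): export roughly 70% of the rows of the employees table:
- > expdp hr DIRECTORY=dpump_dir1 DUMPFILE=sample.dmp TABLES=employees SAMPLE=70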
Metadata filtering
Metadata filtering uses the EXCLUDE and INCLUDE parameters. EXCLUDE examples:
- expdp FULL=YES DUMPFILE=expfull.dmp EXCLUDE=SCHEMA:"='HR'"
- > expdp hr DIRECTORY=dpump_dir1 DUMPFILE=hr_exclude.dmp EXCLUDE=VIEW,PACKAGE,FUNCTION
INCLUDE examples (these lines would normally go into a parameter file; see the note after the list):
- SCHEMAS=HR
- DUMPFILE=expinclude.dmp
- DIRECTORY=dpump_dir1
- LOGFILE=expinclude.log
- INCLUDE=TABLE:"IN ('EMPLOYEES', 'DEPARTMENTS')"
- INCLUDE=PROCEDURE
- INCLUDE=INDEX:"LIKE 'EMP%'"
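Because the double quotes in INCLUDE/EXCLUDE filters are awkward to escape on the command line, the lines above are usually placed in a parameter file. A minimal sketch, assuming they are saved as hr_include.par (the file name is illustrative):
- > expdp hr PARFILE=hr_include.par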
Key export parameters
PARALLEL: sets the degree of parallelism; together with the %U wildcard in the dump file name, each worker can write to its own file. Example:
- > expdp hr SCHEMAS=hr DIRECTORY=dpump_dir1 DUMPFILE=dpump_dir2:exp1.dmp, exp2%U.dmp PARALLEL=3
ESTIMATE_ONLY: if you only want an up-front estimate of how much space the dump files would need, specify ESTIMATE_ONLY=YES, as in the example below.
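A hedged example: estimate the size of an export of the hr schema without writing a dump file:
- > expdp hr DIRECTORY=dpump_dir1 SCHEMAS=hr ESTIMATE_ONLY=YES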
Import modes
Full mode
Imports the entire content of the dump file (FULL=YES).
Schema mode
SCHEMAS=schema_name [,...]
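A hedged example (the dump file name is illustrative): import only the hr schema from a full export dump file:
> impdp system DIRECTORY=dpump_dir1 DUMPFILE=expfull.dmp SCHEMAS=hr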
Table mode
TABLES=[schema_name.]table_name[:partition_name]
> impdp hr DIRECTORY=dpump_dir1 DUMPFILE=expfull.dmp TABLES=employees,jobs
> impdp hr DIRECTORY=dpump_dir1 DUMPFILE=expdat.dmp TABLES=sh.sales:sales_Q1_2012,sh.sales:sales_Q2_2012
Tablespace mode
TABLESPACES=tablespace_name [, ...]
> impdp hr DIRECTORY=dpump_dir1 DUMPFILE=expfull.dmp TABLESPACES=tbs_1,tbs_2,tbs_3,tbs_4
Transportable Tablespace mode
TRANSPORT_TABLESPACES=tablespace_name [, ...]
TRANSPORT_DATAFILES=datafile_name
DIRECTORY=dpump_dir1
NETWORK_LINK=source_database_link
TRANSPORT_TABLESPACES=tbs_6
TRANSPORT_FULL_CHECK=NO
TRANSPORT_DATAFILES='user01/data/tbs6.dbf'
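With transportable tablespaces the datafiles themselves must be copied to the target host at the operating-system level before the import (the path must match TRANSPORT_DATAFILES). A minimal sketch, assuming the parameter lines above are saved as tts_imp.par and the source path is illustrative:
$ scp source_host:/u01/oradata/tbs6.dbf user01/data/tbs6.dbf
$ impdp system PARFILE=tts_imp.par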
Filtering during import
The same filtering parameters used for export (QUERY, EXCLUDE, INCLUDE, CONTENT) are also available on import.
Key import parameters
ACCESS_METHOD=[AUTOMATIC | DIRECT_PATH | EXTERNAL_TABLE | CONVENTIONAL]
Chooses the method used to load the data; it is strongly recommended to leave this at the default AUTOMATIC.
CONTENT=[ALL | DATA_ONLY | METADATA_ONLY]
Controls whether only data, only metadata, or both are imported.
DIRECTORY=directory_object
Specifies the directory object containing the dump files to be imported.
DUMPFILE=[directory_object:]file_name [, ...]
Specifies the dump file names to import; the %U wildcard can be used to match multiple dump files.
HELP=YES
impdp help=y displays the help text.
JOB_NAME=jobname_string
Specifies the job name; the default is usually fine.
LOGFILE=[directory_object:]file_name
Specifies the log file name.
MASTER_ONLY=[YES | NO]
Imports only the master table; because the master table describes the contents of the dump file, this lets you find out what the dump file contains.
PARALLEL=integer
Specifies the degree of parallelism for the import.
PARFILE=[directory_path]file_name
Specifies a parameter file.
REMAP_DATA=[schema.]tablename.column_name:[schema.]pkg.function
Transforms column values during the import, for example regenerating primary keys so they do not conflict with existing ones; a sketch follows.
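A minimal sketch, assuming a hypothetical package hr.remap_pkg whose function shifts employee_id values above the range already present in the target table (all object names are illustrative):
SQL> CREATE OR REPLACE PACKAGE hr.remap_pkg AS
       FUNCTION new_id (p_id NUMBER) RETURN NUMBER;
     END remap_pkg;
     /
SQL> CREATE OR REPLACE PACKAGE BODY hr.remap_pkg AS
       FUNCTION new_id (p_id NUMBER) RETURN NUMBER IS
       BEGIN
         RETURN p_id + 1000000;  -- shift imported IDs above the existing range
       END new_id;
     END remap_pkg;
     /
> impdp hr DIRECTORY=dpump_dir1 DUMPFILE=expschema.dmp TABLES=hr.employees REMAP_DATA=hr.employees.employee_id:hr.remap_pkg.new_id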
REMAP_DATAFILE=source_datafile:target_datafile
Resolves differences in datafile naming conventions between platforms.
REMAP_SCHEMA=source_schema:target_schema
A very commonly used parameter: it imports the objects into a different schema. If target_schema does not exist, it is created automatically during the import. Example:
> expdp system SCHEMAS=hr DIRECTORY=dpump_dir1 DUMPFILE=hr.dmp
> impdp system DIRECTORY=dpump_dir1 DUMPFILE=hr.dmp REMAP_SCHEMA=hr:scott
REMAP_TABLE=[schema.]old_tablename[.partition]:new_tablename
Renames a table or a partition during the import. Example:
> impdp hr DIRECTORY=dpump_dir1 DUMPFILE=expschema.dmp TABLES=hr.employees REMAP_TABLE=hr.employees:emps
REMAP_TABLESPACE=source_tablespace:target_tablespace
Changes the tablespace name during the import. Example:
> impdp hr REMAP_TABLESPACE=tbs_1:tbs_6 DIRECTORY=dpump_dir1 DUMPFILE=employees.dmp
REUSE_DATAFILES=[YES | NO]
Controls whether existing datafiles are reused; the default is NO. Be very careful: once set to YES, existing datafiles with the same names will be overwritten.
SQLFILE=[directory_object:]file_name
If this parameter is specified, the import is not actually performed; instead, the DDL statements the import would execute are written to the specified SQL file, as in the example below.
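A hedged example (file names are illustrative): extract the DDL contained in expfull.dmp into expfull.sql without performing the import:
> impdp hr DIRECTORY=dpump_dir1 DUMPFILE=expfull.dmp SQLFILE=expfull.sql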
Using NETWORK_LINK with expdp
1. On the target server, add a TNS entry for the source database to tnsnames.ora:
- source_db =
- (DESCRIPTION =
- (ADDRESS_LIST =
- (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.15)(PORT = 1521))
- )
- (CONNECT_DATA =
- (sid = orcl)
- )
- )
2. Create a database link on the target database:
- SQL>create public database link test15 connect to system identified by *** using 'source_db';
- Database link created.
- SQL>select instance_name from v$instance@test15;
- INSTANCE_NAME
- ----------------
- orcl
3. Run expdp on the target server:
- $ expdp system directory=dump_dir network_link=test15 tables=test.test dumpfile=test.dmp logfile=expdp_test.log
Here network_link is the database link created in step 2; the dump file and log file are written to dump_dir on the target database server. impdp can likewise use NETWORK_LINK to pull the data directly across the link:
- $ impdp system directory=dump_dir network_link=test15 tables=test.test logfile=impdp_test.log
The command above imports the table test.test directly from the source database into the target database; no dump file is produced in between, though a log file is still written (to dump_dir on the target database server).
Once a Data Pump job has been started, its progress can be tracked through v$session_longops.
- USERNAME - job owner
- OPNAME - job name
- TARGET_DESC - job operation
- SOFAR - megabytes transferred thus far during the job
- TOTALWORK - estimated number of megabytes in the job
- UNITS - megabytes (MB)
- MESSAGE - a formatted status message of the form:
- 'job_name: operation_name : nnn out of mmm MB done'
- SYS@TEST16>select username,opname,sofar,TOTALWORK,UNITS,message from v$session_longops where opname='SYS_EXPORT_FULL_03';
- USERNAME OPNAME SOFAR TOTALWORK UNITS MESSAGE
- --------------- -------------------- ---------- ---------- ----- ------------------------------------------------------------
- SYSTEM SYS_EXPORT_FULL_03 4737 35368 MB SYS_EXPORT_FULL_03: EXPORT : 4737 out of 35368 MB done
Sometimes monitoring alone is not enough and we need to modify the job itself; for that, Data Pump provides an interactive command mode.
There are two ways to enter interactive command mode:
1. Press Ctrl+C while in logging mode
2. Run expdp or impdp with attach=SYSTEM.SYS_EXPORT_FULL_03
The expdp interactive-mode commands are:
Activity | Command Used |
---|---|
Add additional dump files. | ADD_FILE |
Exit interactive mode and enter logging mode. | CONTINUE_CLIENT |
Stop the export client session, but leave the job running. | EXIT_CLIENT |
Redefine the default size to be used for any subsequent dump files. | FILESIZE |
Display a summary of available commands. | HELP |
Detach all currently attached client sessions and terminate the current job. | KILL_JOB |
Increase or decrease the number of active worker processes for the current job. This command is valid only in the Enterprise Edition of Oracle Database 11g. | PARALLEL |
Restart a stopped job to which you are attached. | START_JOB |
Display detailed status for the current job and/or set status interval. | STATUS |
Stop the current job for later restart. | STOP_JOB |
The impdp interactive-mode commands are:
Activity | Command Used |
---|---|
Exit interactive-command mode. | CONTINUE_CLIENT |
Stop the import client session, but leave the current job running. | EXIT_CLIENT |
Display a summary of available commands. | HELP |
Detach all currently attached client sessions and terminate the current job. | KILL_JOB |
Increase or decrease the number of active worker processes for the current job. This command is valid only in Oracle Database Enterprise Edition. | PARALLEL |
Restart a stopped job to which you are attached. | START_JOB |
Display detailed status for the current job. | STATUS |
Stop the current job. | STOP_JOB |
Below, taking expdp as the example, are a few commonly used interactive commands (if you forget one, just type help).
1. status: shows the current state of the job, such as the percentage completed and the degree of parallelism; each worker represents one parallel process.
- Export> status
- Job: SYS_EXPORT_FULL_03
- Operation: EXPORT
- Mode: FULL
- State: EXECUTING
- Bytes Processed: 8,357,285,928
- Percent Done: 23
- Current Parallelism: 2
- Job Error Count: 0
- Dump File: /home/oracle/dump/full_%u.dmp
- Dump File: /home/oracle/dump/full_01.dmp
- bytes written: 8,357,294,080
- Dump File: /home/oracle/dump/full_02.dmp
- bytes written: 4,096
- Worker 1 Status:
- Process Name: DW00
- State: EXECUTING
- Object Type: DATABASE_EXPORT/SCHEMA/TABLE/COMMENT
- Completed Objects: 5,120
- Worker Parallelism: 1
- Worker 2 Status:
- Process Name: DW01
- State: EXECUTING
- Object Schema: P95169
- Object Name: GRADE_RCCASE
- Object Type: DATABASE_EXPORT/SCHEMA/TABLE/TABLE_DATA
- Completed Objects: 3
- Total Objects: 1,866
- Completed Rows: 23,505,613
- Worker Parallelism: 1
2. parallel: dynamically adjusts the degree of parallelism
- Export> parallel=4
3. add_file: adds dump files
4. stop_job, kill_job, start_job
stop_job only pauses the job, which can later be resumed with start_job; kill_job kills the job outright and it cannot be recovered. A usage sketch follows.
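A minimal sketch of pausing and later resuming a job (the job name is illustrative):
- Export> stop_job=immediate
Then re-attach to the job and restart it:
- $ expdp system attach=SYS_EXPORT_FULL_03
- Export> start_job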
5. continue_client: exits interactive mode and returns to logging mode;
exit_client: exits the client session while leaving the job running on the server.