MySQL LOAD DATA and SELECT ... INTO OUTFILE: importing and exporting CSV


1. secure_file_priv

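When exporting and importing CSV with MySQL, pay special attention to how NULL and empty strings are handled: the data must be identical before the export and after the import.

secure_file_priv restricts where SELECT ... INTO OUTFILE may write files (the same setting limits where LOAD DATA INFILE may read them).

If it is NULL, SELECT ... INTO OUTFILE cannot be used at all.

If it is '' (empty), SELECT ... INTO OUTFILE can save files to any directory.

The variable cannot be changed at runtime: set it in the [mysqld] section of my.cnf and restart mysqld.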
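
The three cases for secure_file_priv can be summarized in a short sketch. This only illustrates the rule in Python; it is not how the server implements the check. The function name `outfile_allowed` is made up for this example, `None` stands in for a NULL setting, and treating every path below the configured directory as allowed is an assumption of the sketch.

```python
import posixpath
from typing import Optional


def outfile_allowed(secure_file_priv: Optional[str], target: str) -> bool:
    """Return True if SELECT ... INTO OUTFILE could write to `target`.

    None -> import/export through files is disabled
    ""   -> any directory is allowed
    dir  -> `target` must lie under that directory (assumption of this sketch)
    """
    if secure_file_priv is None:
        return False
    if secure_file_priv == "":
        return True
    if not posixpath.isabs(target):
        raise ValueError("this sketch only handles absolute target paths")
    base = posixpath.normpath(secure_file_priv)
    path = posixpath.normpath(target)
    # commonpath compares whole path components, so /tmp does not match /tmpfoo
    return posixpath.commonpath([base, path]) == base


print(outfile_allowed(None, "/tmp/xxx.csv"))
print(outfile_allowed("", "/tmp/xxx.csv"))
print(outfile_allowed("/var/lib/mysql-files", "/tmp/xxx.csv"))
```

On a live server the current value can be read with SHOW VARIABLES LIKE 'secure_file_priv'.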

2. SELECT ... INTO OUTFILE

SELECT * FROM xxxx WHERE ...
INTO OUTFILE "/tmp/xxx.csv" 
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY "\r\n";

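When SELECT ... INTO OUTFILE writes CSV, keep in mind that the default escape character is the backslash (ESCAPED BY '\\'). A NULL value, for example, is written to the CSV file as \N.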
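
To make the \N convention concrete, here is a small Python sketch that renders one row roughly the way SELECT ... INTO OUTFILE does with FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' and the default escape character. It is a simplification: only the backslash and the double quote are escaped, and other cases the server handles (such as ASCII NUL) are left out. `format_row` is a hypothetical helper name, not part of MySQL.

```python
from typing import Sequence, Union

Value = Union[str, int, float, None]


def format_row(values: Sequence[Value]) -> str:
    # Simplified model of one exported line (without the line terminator).
    fields = []
    for value in values:
        if value is None:
            fields.append("\\N")  # NULL is written as \N, never quoted
        elif isinstance(value, str):
            # escape the escape character first, then the enclosing quote
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            fields.append('"' + escaped + '"')  # string values are enclosed
        else:
            fields.append(str(value))  # numeric values are written bare
    return ",".join(fields)


print(format_row([1, "Alice", None, ""]))  # prints 1,"Alice",\N,""
```

The point to notice is that NULL (\N) and the empty string ("") stay distinguishable in the file.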

To import the file, use a statement like this:

LOAD DATA INFILE '/tmp/xxx.csv' 
INTO TABLE xxxxxx 
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY "\r\n";

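Because the import statement uses the same FIELDS and LINES options as the export statement, NULL survives the round trip. It must not get lost along the way, for example by turning into the empty string '' in a string column or into 0 in an integer column.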
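
The reading side can be sketched in Python as well. The parser below splits one exported line using the same conventions (comma separator, optional double-quote enclosure, backslash escape) and maps an unquoted \N to `None` while keeping "" as an empty string. It is an illustration under simplifying assumptions: `parse_line` and `unescape` are hypothetical names, only a few escape sequences are mapped, and every field comes back as text (the server converts values to the column types itself).

```python
from typing import List, Optional

# Escape sequences that stand for control characters; anything else
# following a backslash stands for itself.
ESCAPES = {"n": "\n", "t": "\t", "r": "\r", "0": "\0"}


def unescape(raw: str) -> str:
    out = []
    i = 0
    while i < len(raw):
        if raw[i] == "\\" and i + 1 < len(raw):
            out.append(ESCAPES.get(raw[i + 1], raw[i + 1]))
            i += 2
        else:
            out.append(raw[i])
            i += 1
    return "".join(out)


def parse_line(line: str) -> List[Optional[str]]:
    fields: List[Optional[str]] = []
    i = 0
    while True:
        quoted = line[i:i + 1] == '"'
        start = i + 1 if quoted else i
        stop = '"' if quoted else ","
        end = start
        while end < len(line) and line[end] != stop:
            end += 2 if line[end] == "\\" else 1  # skip an escaped character
        raw = line[start:end]
        if not quoted and raw == "\\N":
            fields.append(None)  # unquoted \N means NULL
        else:
            fields.append(unescape(raw))
        i = end + 1 if quoted else end  # step past the closing quote
        if i >= len(line):
            return fields
        i += 1  # step past the comma


print(parse_line('1,"Alice",\\N,""'))  # ['1', 'Alice', None, '']
```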

Sometimes the fields cannot be enclosed in double quotes. In that case, drop OPTIONALLY ENCLOSED BY '"':

LOAD DATA INFILE '/tmp/xxx.csv' 
INTO TABLE xxxxxx 
FIELDS TERMINATED BY ','  LINES TERMINATED BY "\r\n";

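Sometimes the CSV needs a header row. A UNION can supply it:

select * 
from (
SELECT '姓名','身份證號碼','盟市','旗縣'
UNION ALL
SELECT * from xxxx 
) t  INTO OUTFILE "/tmp/xxx.csv" FIELDS TERMINATED BY ','  LINES TERMINATED BY "\r\n";

The first branch, SELECT '姓名','身份證號碼','盟市','旗縣', produces the header row. UNION ALL is used rather than plain UNION so that duplicate data rows are not dropped, and the second SELECT must return the same number of columns as the header (four here).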

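Sometimes the exported CSV shows garbled characters when opened in Excel. In that case, specify the character set when exporting:

select * from t into outfile '/tmp/xxx.csv' character set gbk 
FIELDS TERMINATED BY ','  LINES TERMINATED BY "\r\n";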

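If you would rather not re-export the CSV with a different character set, there are other ways to fix it.

On a Simplified Chinese system, Excel assumes that a CSV file is ANSI-encoded, so a file saved as UTF-8, Unicode, or a similar encoding may show up garbled. With the cause known, the fixes are:

1) Open the file in a text editor such as Notepad, choose Save As, and select ANSI as the encoding.

2) Create a new Excel workbook, switch to the "Data" menu, choose "From Text" as the data source, select the CSV file, and pick the file's actual encoding in the import wizard.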
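
The Notepad step can also be scripted. The sketch below re-encodes an exported file from UTF-8 to GBK (the ANSI code page on Simplified Chinese Windows). It assumes the source file really is UTF-8; `reencode_csv` is a hypothetical helper name. Passing "utf-8-sig" as the target instead keeps UTF-8 and adds a byte order mark, which Excel generally uses to detect the encoding.

```python
def reencode_csv(src: str, dst: str, target: str = "gbk") -> None:
    """Read a UTF-8 CSV file and write it back in `target` encoding."""
    # newline="" keeps the \r\n line endings exactly as they were exported
    with open(src, "r", encoding="utf-8", newline="") as fin:
        text = fin.read()
    # Raises UnicodeEncodeError if a character has no mapping in `target`
    with open(dst, "w", encoding=target, newline="") as fout:
        fout.write(text)
```

A call such as reencode_csv("/tmp/xxx.csv", "/tmp/xxx_gbk.csv") leaves the original file untouched and writes the converted copy next to it.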

 

 

Viewing the built-in help

mysql> ? load data
Name: 'LOAD DATA'
Description:
Syntax:
LOAD DATA [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE 'file_name'
    [REPLACE | IGNORE]
    INTO TABLE tbl_name
    [PARTITION (partition_name [, partition_name] ...)]
    [CHARACTER SET charset_name]
    [{FIELDS | COLUMNS}
        [TERMINATED BY 'string']
        [[OPTIONALLY] ENCLOSED BY 'char']
        [ESCAPED BY 'char']
    ]
    [LINES
        [STARTING BY 'string']
        [TERMINATED BY 'string']
    ]
    [IGNORE number {LINES | ROWS}]
    [(col_name_or_user_var
        [, col_name_or_user_var] ...)]
    [SET col_name={expr | DEFAULT},
        [, col_name={expr | DEFAULT}] ...]

The LOAD DATA INFILE statement reads rows from a text file into a table
at a very high speed. LOAD DATA INFILE is the complement of SELECT ...
INTO OUTFILE. (See
http://dev.mysql.com/doc/refman/5.6/en/select-into.html.) To write data
from a table to a file, use SELECT ... INTO OUTFILE. To read the file
back into a table, use LOAD DATA INFILE. The syntax of the FIELDS and
LINES clauses is the same for both statements. Both clauses are
optional, but FIELDS must precede LINES if both are specified.

You can also load data files by using the mysqlimport utility; it
operates by sending a LOAD DATA INFILE statement to the server. The
--local option causes mysqlimport to read data files from the client
host. You can specify the --compress option to get better performance
over slow networks if the client and server support the compressed
protocol. See http://dev.mysql.com/doc/refman/5.6/en/mysqlimport.html.

For more information about the efficiency of INSERT versus LOAD DATA
INFILE and speeding up LOAD DATA INFILE, see
http://dev.mysql.com/doc/refman/5.6/en/insert-optimization.html.

The file name must be given as a literal string. On Windows, specify
backslashes in path names as forward slashes or doubled backslashes.
The character_set_filesystem system variable controls the
interpretation of the file name.

LOAD DATA supports explicit partition selection using the PARTITION
option with a list of one or more comma-separated names of partitions,
subpartitions, or both. When this option is used, if any rows from the
file cannot be inserted into any of the partitions or subpartitions
named in the list, the statement fails with the error Found a row not
matching the given partition set. For more information and examples,
see http://dev.mysql.com/doc/refman/5.6/en/partitioning-selection.html.

For partitioned tables using storage engines that employ table locks,
such as MyISAM, LOAD DATA cannot prune any partition locks. This does
not apply to tables using storage engines which employ row-level
locking, such as InnoDB. For more information, see
http://dev.mysql.com/doc/refman/5.6/en/partitioning-limitations-locking.html.

The server uses the character set indicated by the
character_set_database system variable to interpret the information in
the file. SET NAMES and the setting of character_set_client do not
affect interpretation of input. If the contents of the input file use a
character set that differs from the default, it is usually preferable
to specify the character set of the file by using the CHARACTER SET
clause. A character set of binary specifies "no conversion."

LOAD DATA INFILE interprets all fields in the file as having the same
character set, regardless of the data types of the columns into which
field values are loaded. For proper interpretation of file contents,
you must ensure that it was written with the correct character set. For
example, if you write a data file with mysqldump -T or by issuing a
SELECT ... INTO OUTFILE statement in mysql, be sure to use a
--default-character-set option so that output is written in the
character set to be used when the file is loaded with LOAD DATA INFILE.

*Note*:

It is not possible to load data files that use the ucs2, utf16,
utf16le, or utf32 character set.

If you use LOW_PRIORITY, execution of the LOAD DATA statement is
delayed until no other clients are reading from the table. This affects
only storage engines that use only table-level locking (such as MyISAM,
MEMORY, and MERGE).

If you specify CONCURRENT with a MyISAM table that satisfies the
condition for concurrent inserts (that is, it contains no free blocks
in the middle), other threads can retrieve data from the table while
LOAD DATA is executing. This option affects the performance of LOAD
DATA a bit, even if no other thread is using the table at the same
time.

With row-based replication, CONCURRENT is replicated regardless of
MySQL version. With statement-based replication CONCURRENT is not
replicated prior to MySQL 5.5.1 (see Bug #34628). For more information,
see
http://dev.mysql.com/doc/refman/5.6/en/replication-features-load-data.html.

The LOCAL keyword affects expected location of the file and error
handling, as described later. LOCAL works only if your server and your
client both have been configured to permit it. For example, if mysqld
was started with the local_infile system variable disabled, LOCAL does
not work. See
http://dev.mysql.com/doc/refman/5.6/en/load-data-local.html.

The LOCAL keyword affects where the file is expected to be found:

o If LOCAL is specified, the file is read by the client program on the
  client host and sent to the server. The file can be given as a full
  path name to specify its exact location. If given as a relative path
  name, the name is interpreted relative to the directory in which the
  client program was started.

  When using LOCAL with LOAD DATA, a copy of the file is created in the
  server's temporary directory. This is not the directory determined by
  the value of tmpdir or slave_load_tmpdir, but rather the operating
  system's temporary directory, and is not configurable in the MySQL
  Server. (Typically the system temporary directory is /tmp on Linux
  systems and C:\WINDOWS\TEMP on Windows.) Lack of sufficient space for
  the copy in this directory can cause the LOAD DATA LOCAL statement to
  fail.

o If LOCAL is not specified, the file must be located on the server
  host and is read directly by the server. The server uses the
  following rules to locate the file:

  o If the file name is an absolute path name, the server uses it as
    given.

  o If the file name is a relative path name with one or more leading
    components, the server searches for the file relative to the
    server's data directory.

  o If a file name with no leading components is given, the server
    looks for the file in the database directory of the default
    database.

In the non-LOCAL case, these rules mean that a file named as
./myfile.txt is read from the server's data directory, whereas the file
named as myfile.txt is read from the database directory of the default
database. For example, if db1 is the default database, the following
LOAD DATA statement reads the file data.txt from the database directory
for db1, even though the statement explicitly loads the file into a
table in the db2 database:

LOAD DATA INFILE 'data.txt' INTO TABLE db2.my_table;

Non-LOCAL load operations read text files located on the server. For
security reasons, such operations require that you have the FILE
privilege. See
http://dev.mysql.com/doc/refman/5.6/en/privileges-provided.html. Also,
non-LOCAL load operations are subject to the secure_file_priv system
variable setting. If the variable value is a nonempty directory name,
the file to be loaded must be located in that directory. If the
variable value is empty (which is insecure), the file need only be
readable by the server.

Using LOCAL is a bit slower than letting the server access the files
directly, because the contents of the file must be sent over the
connection by the client to the server. On the other hand, you do not
need the FILE privilege to load local files.

LOCAL also affects error handling:

o With LOAD DATA INFILE, data-interpretation and duplicate-key errors
  terminate the operation.

o With LOAD DATA LOCAL INFILE, data-interpretation and duplicate-key
  errors become warnings and the operation continues because the server
  has no way to stop transmission of the file in the middle of the
  operation. For duplicate-key errors, this is the same as if IGNORE is
  specified. IGNORE is explained further later in this section.

The REPLACE and IGNORE keywords control handling of input rows that
duplicate existing rows on unique key values:

o If you specify REPLACE, input rows replace existing rows. In other
  words, rows that have the same value for a primary key or unique
  index as an existing row. See [HELP REPLACE].

o If you specify IGNORE, rows that duplicate an existing row on a
  unique key value are discarded.

o If you do not specify either option, the behavior depends on whether
  the LOCAL keyword is specified. Without LOCAL, an error occurs when a
  duplicate key value is found, and the rest of the text file is
  ignored. With LOCAL, the default behavior is the same as if IGNORE is
  specified; this is because the server has no way to stop transmission
  of the file in the middle of the operation.

URL: http://dev.mysql.com/doc/refman/5.6/en/load-data.html


mysql>

--------------------------------------------------------------------------------------------------------------------------------- 

Related: https://stackoverflow.com/questions/2675323/mysql-load-null-values-from-csv-data

Reference: https://www.cnblogs.com/kumufengchun/p/10365911.html

Basic syntax:

load data [low_priority] [local] infile 'file_name.txt' [replace | ignore]
into table tbl_name
[fields
[terminated by '\t']
[[optionally] enclosed by '']
[escaped by '\\']]
[lines terminated by '\n']
[ignore number lines]
[(col_name, ...)]

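The LOAD DATA INFILE statement reads rows from a text file into a table at very high speed. The mysqld process (service) must already be running before you use it. For security reasons, a text file that is read from the server host must either be located in the database directory or be readable by everyone. In addition, to use LOAD DATA INFILE on a file on the server, you must have the FILE privilege on the server host.
1  If you specify the LOW_PRIORITY keyword, MySQL waits until no other client is reading the table before it inserts the data. For example: 
load data  low_priority infile "/home/mark/data.sql" into table Orders;
 
2  If you specify the LOCAL keyword, the file is read from the client host. Without LOCAL, the file must be on the server.
 
3  The REPLACE and IGNORE keywords control how input rows that duplicate an existing unique key value are handled. With REPLACE, the new row replaces the existing row that has the same unique key value. With IGNORE, input rows that duplicate an existing row on a unique key are skipped. If you specify neither option (and do not use LOCAL), an error occurs when a duplicate key is found and the rest of the text file is ignored. For example:
load data  low_priority infile "/home/mark/data.sql" replace into table Orders;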
 
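4  Delimiters
(1) The FIELDS keyword specifies how the fields in the file are delimited. If you use it, the MySQL parser expects at least one of the following options: 
terminated by: the field separator; the default is the tab character (\t)
enclosed by: the character that encloses each field; the default is none
escaped by: the escape character; the default is the backslash (\)
For example: load data infile "/home/mark/Orders.txt" replace into table Orders fields terminated by ',' enclosed by '"';
(2) The LINES keyword specifies the record separator; the default is '\n', the newline character.
If both clauses are given, FIELDS must come before LINES. If you omit the FIELDS clause, the defaults are the same as if you had written: fields terminated by '\t' enclosed by '' escaped by '\\'
If you omit the LINES clause, the default is the same as if you had written: lines terminated by '\n'
For example: load data infile "/jiaoben/load.txt" replace into table test fields terminated by ',' lines terminated by '\n';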
 
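5  LOAD DATA INFILE can load a file into specific columns. This helps when the file covers only part of the table, for instance when extra columns (fields) have been added to the MySQL table for other needs, as can happen when migrating from an Access database to MySQL.
The following example shows how to load data into named columns: 
load data infile "/home/Order.txt" into table Orders(Order_Number, Order_Date, Customer_ID);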
 
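6  When looking for the file on the server host, the server uses the following rules: 
(1) If an absolute path name is given, the server uses it as given. 
(2) If a relative path name with one or more leading components is given, the server searches for the file relative to the server's data directory.  
(3) If a file name with no leading components is given, the server looks for the file in the database directory of the current database. 
For example, a file given as "./myfile.txt" is read from the server's data directory, whereas a file given as "myfile.txt" is read from the database directory of the current database.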
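
The three lookup rules can be written down as a short Python sketch. It is illustrative only: `resolve_infile` is a hypothetical name, the data directory and database name are passed in by the caller, and it assumes the database directory is a subdirectory of the data directory named after the database.

```python
import posixpath


def resolve_infile(file_name: str, datadir: str, default_db: str) -> str:
    """Return the path a non-LOCAL LOAD DATA INFILE would read from."""
    if posixpath.isabs(file_name):
        # rule 1: an absolute path is used as given
        return file_name
    if "/" in file_name:
        # rule 2: leading components -> relative to the data directory
        return posixpath.normpath(posixpath.join(datadir, file_name))
    # rule 3: bare file name -> database directory of the default database
    return posixpath.join(datadir, default_db, file_name)


print(resolve_infile("./myfile.txt", "/var/lib/mysql", "db1"))
print(resolve_infile("myfile.txt", "/var/lib/mysql", "db1"))
```

With a data directory of /var/lib/mysql (an example value), the first call resolves inside the data directory itself and the second inside its db1 subdirectory.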
 

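Exporting data with the mysql command

With the mysql client's -e option you can export data and, more usefully, post-process the exported rows with regular expressions.

The command below exports data to a CSV file and makes NULL values from the table show up as blanks in Excel.

[root@test2 ~]# mysql -e "set names gbk;select * from newsdb.t_hk_stock_news where news_time > '2019-03-31 23:59:59' limit 5" |sed -e  "s/\t/,/g" -e "s/NULL/  /g" -e "s/$/\r/" > /db/test.csv

# The -e option runs two statements here: one sets the character set and the other is the SELECT. The output is piped through sed (GNU sed is assumed for \t and \r), which processes it one line at a time.
# The sed expressions replace the TAB between fields with ",", replace NULL with blanks (this also hits any field text that contains NULL), and append \r to each line so that lines end in \r\n. A pattern such as s/\n/\r\n/g would never match, because sed strips the trailing newline from each line before applying the expressions.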
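
The same post-processing can be done without regular expressions. The Python sketch below converts tab-separated text, as printed by mysql -e, into CSV with \r\n line endings. It blanks only fields that are exactly NULL and lets the csv module quote fields that contain commas. `tsv_to_csv` is a hypothetical name, and the sketch does not undo the backslash escapes that the client may print for tabs or newlines inside values. A real string value of 'NULL' looks the same as a NULL in this output, so it is blanked too.

```python
import csv
import io


def tsv_to_csv(tsv_text: str) -> str:
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\r\n")
    for line in io.StringIO(tsv_text):  # iterates line by line, split on \n
        fields = line.rstrip("\n").split("\t")
        # blank out fields that are exactly NULL, leave NULLABLE etc. alone
        writer.writerow(["" if field == "NULL" else field for field in fields])
    return out.getvalue()


sample = "id\tname\tnote\n1\tAlice\tNULL\n2\tBob, Jr.\tNULLABLE\n"
print(tsv_to_csv(sample), end="")
```

For the sample above, the second data row comes out as 2,"Bob, Jr.",NULLABLE: the embedded comma is quoted and NULLABLE is left untouched.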

A complete LOAD DATA demo:

LOAD DATA LOCAL INFILE '/tmp/2982/20200424/user.csv' 
INTO TABLE t_user CHARACTER SET utf8mb4 FIELDS TERMINATED BY ',' 
LINES TERMINATED BY '\r\n' 
IGNORE 1 LINES 
(userName, userNo, age, homeAddr)  
SET province = '浙江省', city='杭州市', creatorId=2982, createTime='2020-04-24 13:24:24'

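What the LOCAL keyword does: when the CSV file and the MySQL server are on different hosts, LOCAL lets you load a CSV file that sits on host A into the MySQL server running on host B.

Without the LOCAL keyword, the CSV file and the MySQL server must be on the same host.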

 

 

