MySQL Basics: Creating a MySQL Database and Tables


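This article builds the database and data for a small fictional online bookstore. It serves as the source for the MySQL scripts in later articles and makes it easier to practice MySQL and SQL.

Building it mainly involves managing MySQL database objects from the command line: databases, tables, indexes, foreign keys, and so on. The other, more important part is mocking the data for each table.

The dump script for the fictional bookstore database is on GitHub.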

Database

The database for the fictional bookstore is named mysql_practice.

Syntax for creating a database:

CREATE DATABASE [IF NOT EXISTS] database_name
[CHARACTER SET charset_name]
[COLLATE collation_name]
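  1. IF NOT EXISTS: optional; avoids an error when the database already exists.
  2. CHARACTER SET: optional; a default is used when it is omitted.
    • List the character sets supported by the current MySQL server:
      show character set; -- option 1
      show charset; -- option 2
      show char set; -- option 3
      
  3. COLLATE: the set of rules for comparing strings in a particular character set; optional; a default is used when it is omitted.
    • List the collations of a character set:
      show collation like 'utf8%';
      
    • Collation naming rule: character_set_name_ci, character_set_name_cs, or character_set_name_bin. _ci means case-insensitive, _cs means case-sensitive, and _bin means comparison by the binary code values.
  4. Example: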
    CREATE DATABASE my_test_tb CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;
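
The naming rule above can be illustrated with a short Python sketch. The function name describe_collation is hypothetical and not part of MySQL; it only inspects the suffixes of a collation name, and it assumes the name follows the conventions listed above, plus the _ai/_as (accent-insensitive/accent-sensitive) suffixes that appear in names such as utf8mb4_0900_ai_ci.

```python
def describe_collation(name):
    """Guess comparison behaviour from a collation name's suffixes (illustrative only)."""
    parts = name.lower().split('_')
    info = {'charset': parts[0], 'case': None, 'accent': None, 'binary': False}
    for part in parts[1:]:
        if part == 'ci':
            info['case'] = 'insensitive'
        elif part == 'cs':
            info['case'] = 'sensitive'
        elif part == 'ai':
            info['accent'] = 'insensitive'
        elif part == 'as':
            info['accent'] = 'sensitive'
        elif part == 'bin':
            info['binary'] = True
    return info


print(describe_collation('utf8mb4_0900_ai_ci'))
print(describe_collation('latin1_general_cs'))
print(describe_collation('utf8mb4_bin'))
```

The authoritative list is still the output of show collation; the sketch is only a reading aid for the names.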
    

TODO: character sets and collations are a fairly large topic; a separate article will cover them later.

Select a database at login

mysql -uroot -D database_name -p

Select a database after login

use database_name;

Show the currently selected database

select database();

Create the new database

create database if not exists mysql_practice;

The following statement shows how the database was created:

show create database mysql_practice;

As the output shows, if no character set and collation are specified when the database is created, a default pair is assigned.

List all databases visible to the current account

show databases;

Drop a database

drop database if exists mysql_practice;

In MySQL, schema is a synonym for database, so the following statement also drops the database:

drop schema if exists mysql_practice;

Table

Syntax for creating a table in MySQL

CREATE TABLE [IF NOT EXISTS] table_name(
   column_1_definition,
   column_2_definition,
   ...,
   table_constraints
) ENGINE=storage_engine;

Column definition syntax:

column_name data_type(length) [NOT NULL] [DEFAULT value] [AUTO_INCREMENT] column_constraint;

Table constraints: UNIQUE, CHECK, PRIMARY KEY, and FOREIGN KEY.

Show a table's definition

desc table_name;

Create the mysql_practice tables

USE mysql_practice;

DROP TABLE IF EXISTS customer_order;
DROP TABLE IF EXISTS book;
DROP TABLE IF EXISTS book_category;
DROP TABLE IF EXISTS customer_address;
DROP TABLE IF EXISTS customer;
DROP TABLE IF EXISTS region;


-- region; data source: https://github.com/xiangyuecn/AreaCity-JsSpider-StatsGov
CREATE TABLE IF NOT EXISTS region(
	id INT AUTO_INCREMENT,
    pid INT NOT NULL,
    deep INT NOT NULL,
    name VARCHAR(200) NOT NULL,
    pinyin_prefix VARCHAR(10) NOT NULL,
    pinyin VARCHAR(200) NOT NULL,
    ext_id VARCHAR(100) NOT NULL,
    ext_name VARCHAR(200) NOT NULL,
    PRIMARY KEY(id)
);


-- customer
CREATE TABLE IF NOT EXISTS customer(
    id INT AUTO_INCREMENT,
    no VARCHAR(50) NOT NULL,
    first_name VARCHAR(255) NOT NULL,
	last_name VARCHAR(255) NOT NULL,
    status VARCHAR(20) NOT NULL,
    phone_number VARCHAR(20)  NULL,
    updated_at DATETIME NOT NULL,
    created_at DATETIME NOT NULL,
    PRIMARY KEY(id),
    unique(no)
) ENGINE=INNODB;


-- customer address
CREATE TABLE IF NOT EXISTS customer_address(
	id INT AUTO_INCREMENT,
    customer_id INT NOT NULL,
    area_id INT NULL,
    address_detail VARCHAR(200) NULL,
	is_default bit NOT NULL,
	updated_at DATETIME NOT NULL,
    created_at DATETIME NOT NULL,
    PRIMARY KEY(id),
    FOREIGN KEY(customer_id) REFERENCES customer (id) ON UPDATE RESTRICT ON DELETE CASCADE
) ENGINE=INNODB;


-- book category
CREATE TABLE IF NOT EXISTS book_category(
	id INT AUTO_INCREMENT,
    code VARCHAR(200) NOT NULL,
	name VARCHAR(200) NOT NULL,
    parent_id INT NULL,
    deep INT NULL,
	updated_at DATETIME NOT NULL,
    created_at DATETIME NOT NULL,
    PRIMARY KEY(id)
);

-- book
CREATE TABLE IF NOT EXISTS book(
	id INT AUTO_INCREMENT,
    category_id INT NOT NULL,
    no VARCHAR(50) NOT NULL,
    name VARCHAR(200) NOT NULL,
    status VARCHAR(50) NOT NULL,
    unit_price DOUBLE NOT NULL,
    author VARCHAR(50)  NULL,
    publish_date DATETIME NULL,
    publisher VARCHAR(200) NOT NULL,
	updated_at DATETIME NOT NULL,
    created_at DATETIME NOT NULL,
    PRIMARY KEY(id),
    FOREIGN KEY (category_id) REFERENCES book_category (id) ON UPDATE RESTRICT ON DELETE CASCADE
);

-- orders
CREATE TABLE IF NOT EXISTS customer_order(
	id INT AUTO_INCREMENT,
    no VARCHAR(50) NOT NULL,
    customer_id INT NOT NULL,
    book_id INT NOT NULL,
    quantity INT NOT NULL,
    total_price DOUBLE NOT NULL,
    discount DOUBLE NULL,
    order_date DATETIME NOT NULL,
	updated_at DATETIME NOT NULL,
    created_at DATETIME NOT NULL,
    PRIMARY KEY(id),
    FOREIGN KEY (customer_id) REFERENCES customer(id) ON UPDATE RESTRICT ON DELETE CASCADE,
    FOREIGN KEY (book_id) references book (id) on update restrict on delete cascade
) ENGINE=INNODB;
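
The DROP TABLE statements at the top of the script run in an order that respects the foreign keys: a table that references another one is dropped before the table it references, and the CREATE TABLE statements run the other way round. A small Python sketch of that reasoning follows, using the standard-library graphlib module (Python 3.9+). The dictionary only restates the FOREIGN KEY clauses from the script, and the function name creation_order is hypothetical.

```python
from graphlib import TopologicalSorter

# child table -> tables it references through FOREIGN KEY clauses
REFERENCES = {
    'customer_order': {'customer', 'book'},
    'book': {'book_category'},
    'customer_address': {'customer'},
    'customer': set(),
    'book_category': set(),
    'region': set(),
}


def creation_order(references):
    """Return an order in which every referenced table comes before its children."""
    return list(TopologicalSorter(references).static_order())


create_order = creation_order(REFERENCES)
drop_order = list(reversed(create_order))  # drop children before parents
print('create:', create_order)
print('drop:  ', drop_order)
```

Several valid orders exist; the one in the script is simply one of them.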


Import the region data

Download the region CSV data: [3-level] province/city/district data download.

Import statement:

LOAD DATA INFILE '/tmp/ok_data_level3.csv' 
INTO TABLE region 
FIELDS TERMINATED BY ',' 
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;

If the import fails with:

ERROR 1290 (HY000): The MySQL server is running with the --secure-file-priv option so it cannot execute this statement
  • Use the command mdfind -name my.cnf (macOS) to locate the MySQL configuration file my.cnf.
  • Solution (not actually tested; most people use LOAD DATA LOCAL INFILE instead).

Alternatively, use LOAD DATA LOCAL INFILE instead of LOAD DATA INFILE:

LOAD DATA LOCAL INFILE '/tmp/ok_data_level3.csv' 
INTO TABLE region 
FIELDS TERMINATED BY ',' 
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;

If this fails with:

Error Code: 3948. Loading local data is disabled; this must be enabled on both the client and server sides

or with:

ERROR 1148 (42000): The used command is not allowed with this MySQL version
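  • Check the setting: show variables like "local_infile";
  • Change the setting on the server: set global local_infile = 1;
  • The client has to allow it as well, for example by starting the mysql client with --local-infile=1.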
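
Before loading, it can help to preview the CSV with the same parsing rules the LOAD DATA statements above describe (comma-separated fields, optional double-quote enclosure, first line skipped). The Python sketch below does that with the standard csv module. The sample rows are made up for illustration, and the column order is an assumption that mirrors the region table definition; preview_rows is a hypothetical helper.

```python
import csv
import io

# Made-up sample in the assumed column order of the region table.
SAMPLE = (
    'id,pid,deep,name,pinyin_prefix,pinyin,ext_id,ext_name\n'
    '1,0,0,"Example Province","e","example","100000","Example Province"\n'
    '2,1,1,"Example City","e","example","100100","Example City"\n'
)


def preview_rows(text, limit=5):
    """Parse CSV text the way the LOAD DATA options describe and return data rows."""
    reader = csv.reader(io.StringIO(text), delimiter=',', quotechar='"')
    next(reader)  # IGNORE 1 ROWS: skip the header line
    return [row for _, row in zip(range(limit), reader)]


for row in preview_rows(SAMPLE):
    print(row)
```

For the real file, the text would come from reading /tmp/ok_data_level3.csv instead of the SAMPLE string.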

Generate customer data

Create a stored procedure (SP):

USE mysql_practice;

DROP PROCEDURE IF EXISTS sp_generate_customers;

DELIMITER $$

CREATE PROCEDURE sp_generate_customers()
BEGIN


-- Generate 10000 customer and customer_address

set @fNameIndex = 1;
set @lNameIndex = 1;

loop_label_f: LOOP

	IF @fNameIndex > 100 THEN
		LEAVE loop_label_f;
	END IF;
    
    set @fName = ELT(@fNameIndex, "James","Mary","John","Patricia","Robert","Linda","Michael","Barbara","William","Elizabeth","David","Jennifer","Richard","Maria","Charles","Susan","Joseph","Margaret","Thomas","Dorothy","Christopher","Lisa","Daniel","Nancy","Paul","Karen","Mark","Betty","Donald","Helen","George","Sandra","Kenneth","Donna","Steven","Carol","Edward","Ruth","Brian","Sharon","Ronald","Michelle","Anthony","Laura","Kevin","Sarah","Jason","Kimberly","Matthew","Deborah","Gary","Jessica","Timothy","Shirley","Jose","Cynthia","Larry","Angela","Jeffrey","Melissa","Frank","Brenda","Scott","Amy","Eric","Anna","Stephen","Rebecca","Andrew","Virginia","Raymond","Kathleen","Gregory","Pamela","Joshua","Martha","Jerry","Debra","Dennis","Amanda","Walter","Stephanie","Patrick","Carolyn","Peter","Christine","Harold","Marie","Douglas","Janet","Henry","Catherine","Carl","Frances","Arthur","Ann","Ryan","Joyce","Roger","Diane");
    
    
		loop_label_last: LOOP
		
		IF @lNameIndex > 100 THEN
			LEAVE loop_label_last;
		END IF;
		
			SET @lName =  ELT(@lNameIndex, "Smith","Johnson","Williams","Jones","Brown","Davis","Miller","Wilson","Moore","Taylor","Anderson","Thomas","Jackson","White","Harris","Martin","Thompson","Garcia","Martinez","Robinson","Clark","Rodriguez","Lewis","Lee","Walker","Hall","Allen","Young","Hernandez","King","Wright","Lopez","Hill","Scott","Green","Adams","Baker","Gonzalez","Nelson","Carter","Mitchell","Perez","Roberts","Turner","Phillips","Campbell","Parker","Evans","Edwards","Collins","Stewart","Sanchez","Morris","Rogers","Reed","Cook","Morgan","Bell","Murphy","Bailey","Rivera","Cooper","Richardson","Cox","Howard","Ward","Torres","Peterson","Gray","Ramirez","James","Watson","Brooks","Kelly","Sanders","Price","Bennett","Wood","Barnes","Ross","Henderson","Coleman","Jenkins","Perry","Powell","Long","Patterson","Hughes","Flores","Washington","Butler","Simmons","Foster","Gonzales","Bryant","Alexander","Russell","Griffin","Diaz","Hayes");
		
			-- insert into customer
			INSERT INTO customer(no, first_name, last_name, status, phone_number, updated_at, created_at) 
            values(
				REPLACE(LEFT(uuid(), 16), '-', ''),
				@fName, 
				@lName, 
				'ACTIVE',
				null, 
				curdate(), 
				curdate()
            );
            
            -- insert into customer_address
            set @randomArea = 0;
            SELECT id into @randomArea FROM region where deep = 2 ORDER BY RAND() LIMIT 1;
            
            INSERT INTO customer_address(customer_id, area_id, address_detail, is_default, updated_at, created_at)
            VALUES(
				@@Identity,
                @randomArea,
                '',
                1,
                curdate(),
                curdate()
            );
			
			set @lNameIndex = @lNameIndex + 1;
		
		END LOOP loop_label_last;
        
    
	SET @lNameIndex = 1; -- Note: assign 1 to last name index, for next loop.
	SET @fnameIndex = @fnameIndex + 1;
    
END LOOP loop_label_f;


-- update address_detail in customer_address
UPDATE customer_address ca
JOIN region r on ca.area_id = r.id and r.deep = 2
join region r2 on r.pid = r2.id and r2.deep = 1
join region r3 on r2.pid = r3.id and r3.deep = 0
SET ca.address_detail = concat(r3.ext_name, r2.ext_name, r.ext_name);


END $$

DELIMITER ;

Call the SP:

call sp_generate_customers();
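
The procedure's two nested loops pair every first name with every last name, which is why 100 first names and 100 last names give 10,000 customers. The Python sketch below mirrors that logic without a database. The shortened name lists and the helper names are assumptions for illustration; uuid.uuid1() stands in for MySQL's uuid().

```python
import itertools
import uuid

FIRST_NAMES = ['James', 'Mary', 'John']      # the procedure uses 100 first names
LAST_NAMES = ['Smith', 'Johnson']            # and 100 last names


def make_customer_no():
    """Python equivalent of REPLACE(LEFT(uuid(), 16), '-', '')."""
    return str(uuid.uuid1())[:16].replace('-', '')


def generate_customers(first_names, last_names):
    """Yield one customer per (first name, last name) pair, like the nested loops."""
    for first, last in itertools.product(first_names, last_names):
        yield {'no': make_customer_no(), 'first_name': first,
               'last_name': last, 'status': 'ACTIVE'}


customers = list(generate_customers(FIRST_NAMES, LAST_NAMES))
print(len(customers))            # 3 * 2 = 6 here; 100 * 100 = 10000 in the procedure
print(customers[0])
```

The first 16 characters of a UUID string contain two hyphens, so the resulting customer no is 14 hexadecimal characters long.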

Generate book categories and book data

Step 0: manually insert the book categories into the book_category table

INSERT INTO book_category(code, name, parent_id, deep, updated_at, created_at)
VALUES
('BOOK', 'Book', 0, 0, curdate(), curdate()),
('BOOK_CODE', 'Code Book', 1, 1, curdate(), curdate()),
('BOOK_CHIDREN', 'Children Book', 1, 1, curdate(), curdate()),
('BOOK_SCIENCE', 'Science Book', 1, 1, curdate(), curdate());

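Step 1: write a small crawler in Python to scrape book information from an online bookstore.

The script below scrapes the list of books returned by searching Dangdang for the keyword "科學" (science).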

import csv

import requests
from bs4 import BeautifulSoup


def crawl(url):
    res = requests.get(url)
    res.encoding = 'gb18030'  # decode the response as gb18030
    soup = BeautifulSoup(res.text, 'html.parser')
    # The search results are expected in <ul id="component_59">.
    section = soup.find('ul', id='component_59')
    all_lis = section.find_all('li')
    # newline='' lets the csv module control line endings itself.
    with open('output_science.csv', 'w', encoding='utf8', newline='') as f:
        # The content contains ',', so '#' is used as the CSV delimiter.
        csv_writer = csv.writer(f, delimiter='#')
        csv_writer.writerow(['No.', 'Title', 'Price', 'Author', 'Publish date', 'Publisher'])

        for n, book in enumerate(all_lis, start=1):
            title = book.select('.name')[0].text.strip().split(' ', 1)[0].strip()
            price = book.select('.search_pre_price')[0].text.strip('¥')
            # Expected form: author / publish date / publisher.
            author_info = book.select('.search_book_author')[0].text.strip().split('/')
            author = author_info[0]
            publish_date = author_info[1]
            publisher = author_info[2]
            csv_writer.writerow([n, title, price, author, publish_date, publisher])


url = 'http://search.dangdang.com/?key=%BF%C6%D1%A7&act=input'
crawl(url)
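
The crawler indexes the author line at positions 0, 1, and 2, so an entry with fewer than three '/'-separated parts raises IndexError. One possible way to make that step more forgiving is sketched below. parse_author_info is a hypothetical helper, not part of the original script, and the sample strings are made up.

```python
def parse_author_info(text):
    """Split 'author /date /publisher' into three fields, padding missing ones with ''."""
    parts = [part.strip() for part in text.strip().split('/')]
    parts += [''] * (3 - len(parts))   # pad when fewer than three fields are present
    author, publish_date, publisher = parts[:3]
    return author, publish_date, publisher


print(parse_author_info('Jane Doe /2020-01-01 /Example Press'))
print(parse_author_info('Jane Doe'))
```

Whether empty strings are the right fallback depends on how the rows are used later; the book table declares publisher as NOT NULL, so an empty value still inserts.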

Step 2: import the CSV data into the MySQL table mock_science.

CREATE TABLE `mock_science` (
  `id` int(11) NOT NULL,
  `name` varchar(200) DEFAULT NULL,
  `price` double DEFAULT NULL,
  `author` varchar(100) DEFAULT NULL,
  `publish_date` varchar(100) DEFAULT NULL,
  `publisher` varchar(100) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

Step 3: insert the science books into the book table

INSERT book(category_id, no, name, status,unit_price, author,publish_date,publisher, updated_at, created_at)
SELECT 
4,
REPLACE(LEFT(uuid(), 16), '-', ''),
name,
'ACTIVE',
price,
author,
publish_date,
publisher,
curdate(),
curdate()
FROM 
mock_science;

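Repeating steps 1 to 3 inserts more books. The practice database ended up with the first page of results for three search keywords: JAVA, 兒童 (children), and 科學 (science).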

Generate order data

A stored procedure that generates random order data (note: the data it produces still needs further processing):

USE mysql_practice;

DROP PROCEDURE IF EXISTS sp_generate_orders;

DELIMITER $$


-- Reference: https://www.mysqltutorial.org/select-random-records-database-table.aspx
-- Generate orders for every day from 2020-03-01 through today.
-- Each day gets (100 to 149 random customers) x (1 to 5 random books) = 100 to 745 order rows.
CREATE PROCEDURE sp_generate_orders()
BEGIN

SET @startDate = '2020-03-01';
SET @endDate = curdate();


loop_label_p: LOOP

	IF @startDate > @endDate THEN
		LEAVE loop_label_p;
    END IF;


	SET @randCustomerTotal = FLOOR(RAND()*50) + 100;
    SET @randBookTotal = FLOOR(RAND()*5) + 1;
    
    
	SET @randQty = FLOOR(RAND()*3) + 1;
    
    
	SET @query1 = CONCAT('INSERT INTO customer_order(no, customer_id, book_id, quantity, total_price,discount, order_date, updated_at, created_at)');
    SET @query1 = CONCAT(@query1, ' select ', "'", uuid(), "'",', c.id, p.id,', @randQty, ', 0, 0, ', "'",@startDate,"'", ',', "'",curdate(),"'" ,',', "'",curdate(),"'");
    SET @query1 = CONCAT(@query1, ' FROM (select id from customer ORDER BY RAND() LIMIT ', @randCustomerTotal,') c  join ');
    SET @query1 = CONCAT(@query1, '  (select id from book order by rand() limit ', @randBookTotal,') p ');
    SET @query1 = CONCAT(@query1, 'where c.id is not null');
    

	PREPARE increased FROM @query1;
	EXECUTE increased;
    
    
    SET @startDate = DATE_ADD(@startDate,  INTERVAL 1 DAY);

END LOOP loop_label_p;


END $$

DELIMITER ;
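
The daily volume follows directly from the three RAND() expressions in the procedure: the dynamic INSERT joins a random set of customers with a random set of books, producing one row per pair. Because uuid() and @randQty are evaluated once while the SQL string is built, all rows of a given day share the same no and quantity, which is why the order no is post-processed afterwards. The Python sketch below only reproduces the arithmetic; the function name and the fixed seed are assumptions.

```python
import math
import random


def orders_for_day(rng):
    """Mirror the procedure's arithmetic for one day (no database involved)."""
    customers = math.floor(rng.random() * 50) + 100   # FLOOR(RAND()*50) + 100 -> 100..149
    books = math.floor(rng.random() * 5) + 1           # FLOOR(RAND()*5) + 1    -> 1..5
    quantity = math.floor(rng.random() * 3) + 1        # FLOOR(RAND()*3) + 1    -> 1..3
    # The join has no ON condition, so it yields one row per (customer, book) pair.
    return {'rows': customers * books, 'quantity': quantity}


rng = random.Random(42)   # fixed seed only to make the sketch repeatable
days = [orders_for_day(rng) for _ in range(1000)]
print(min(d['rows'] for d in days), max(d['rows'] for d in days))
```

Every simulated day stays within 100 to 745 rows, the bounds implied by the expressions.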

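In total this generates several hundred thousand to over a million order rows. It is best to add a few simple indexes first, otherwise the queries are too slow; they can be added right after the tables are created.

Add indexes: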

ALTER TABLE book ADD INDEX idx_unit_price(unit_price);

ALTER TABLE customer_order ADD INDEX idx_order_no(no);
ALTER TABLE customer_order ADD INDEX idx_order_date(order_date);
ALTER TABLE customer_order ADD INDEX idx_quantity(quantity);

Update the order no:

-- Update order no.
-- Note: it is better to add the indexes first, otherwise this will be slow.
update customer_order
set no = concat(REPLACE(LEFT(no, 16), '-', ''), customer_id, book_id)
where no is not null;

If you do not want duplicate order nos, the SQL below reassigns them:

-- Handle duplicate order nos
update customer_order co
join
(select no from customer_order co2 group by co2.no having count(*) > 1) as cdo
on co.no = cdo.no
set co.no = concat(REPLACE(LEFT(uuid(), 16), '-', ''), customer_id, book_id);

If duplicate order nos remain, keep running the SQL above until none are left.
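
Duplicates can appear because the order no is built by plain concatenation: within one day the 14-character prefix is shared, and different (customer_id, book_id) pairs can produce the same digits, for example 12 and 3 versus 1 and 23. The Python sketch below shows the collision and the "repeat until no duplicates remain" loop. The helper names and sample ids are assumptions, and uuid.uuid1() stands in for MySQL's uuid().

```python
import uuid
from collections import Counter


def order_no(prefix, customer_id, book_id):
    """Same shape as concat(<14 hex chars>, customer_id, book_id)."""
    return f'{prefix}{customer_id}{book_id}'


def regenerate_duplicates(orders):
    """One pass: give every order whose no is shared a fresh uuid-based no."""
    counts = Counter(o['no'] for o in orders)
    changed = 0
    for o in orders:
        if counts[o['no']] > 1:
            prefix = str(uuid.uuid1())[:16].replace('-', '')
            o['no'] = order_no(prefix, o['customer_id'], o['book_id'])
            changed += 1
    return changed


# Two different (customer, book) pairs that concatenate to the same digits.
orders = [
    {'customer_id': 12, 'book_id': 3},
    {'customer_id': 1, 'book_id': 23},
]
for o in orders:
    o['no'] = order_no('0123456789abcd', o['customer_id'], o['book_id'])

print(orders[0]['no'] == orders[1]['no'])   # True: both end in "123"
while regenerate_duplicates(orders):
    pass
print(len({o['no'] for o in orders}) == len(orders))
```

A separator between the parts, or a unique index on no, would be alternative ways to rule the collision out; neither is part of the original script.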

Update total_price in the order table:

-- update total price
update customer_order co
join book b
on co.book_id = b.id
SET co.total_price = co.quantity * b.unit_price;

At this point the tables and their mock data are basically complete. Back them up with mysqldump:

mysqldump -u [username] -p[password] [database_name] > [dump_file.sql]
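
If the backup is to be scripted, the same command can be assembled as an argument list. The sketch below only builds and prints the command; the helper name, user name, and file name are assumptions. It uses mysqldump's --user and --result-file options in place of -u and shell redirection, and passes -p without a value so that mysqldump prompts for the password instead of exposing it on the command line.

```python
import shlex


def mysqldump_command(username, database, dump_file):
    """Build the argument list for mysqldump; '-p' alone makes it prompt for the password."""
    return ['mysqldump', f'--user={username}', '-p',
            f'--result-file={dump_file}', database]


cmd = mysqldump_command('root', 'mysql_practice', 'mysql_practice_dump.sql')
print(shlex.join(cmd))
# To actually run it (requires mysqldump on PATH and a reachable server):
# import subprocess; subprocess.run(cmd, check=True)
```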

Next steps

  • Views
  • Stored procedures
  • Functions
  • Triggers
  • Scheduled tasks (Job)

References

  1. MySQL Character Set
  2. MySQL Collation
  3. Generating random names in MySQL
  4. MySQL LOOP
  5. MySQL Select Random Records

Original article: MySQL Basics: Creating a MySQL Database and Tables

