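There is a method circulating online for generating short 8-character UUIDs. Source of the code:
Generating a short 8-character UUID in Java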
// 62-character alphabet: a-z, 0-9, A-Z
public static String[] chars = new String[] {
        "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m",
        "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z",
        "0", "1", "2", "3", "4", "5", "6", "7", "8", "9",
        "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M",
        "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z" };

public static String generateShortUuid() {
    StringBuffer shortBuffer = new StringBuffer();
    // 32 hex digits once the dashes are removed
    String uuid = UUID.randomUUID().toString().replace("-", "");
    for (int i = 0; i < 8; i++) {
        // Take 4 hex digits (a 16-bit value) at a time ...
        String str = uuid.substring(i * 4, i * 4 + 4);
        int x = Integer.parseInt(str, 16);
        // ... and reduce it modulo 62 (0x3E) to pick one character
        shortBuffer.append(chars[x % 0x3E]);
    }
    return shortBuffer.toString();
}
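To make the mapping concrete, here is a minimal sketch that applies the same "4 hex digits modulo 62" rule to a fixed input, so the result can be checked by hand. The class name ShortUuidMapping and the helper encode are illustrative names, not part of the original code.

```java
import java.util.UUID;

class ShortUuidMapping {
    // Same 62-character alphabet as the original: a-z, 0-9, A-Z.
    static final String ALPHABET =
            "abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Hypothetical helper: maps a 32-digit hex string to 8 characters,
    // one character per group of 4 hex digits (value modulo 62).
    static String encode(String hex32) {
        if (hex32.length() != 32) {
            throw new IllegalArgumentException("expected 32 hex digits");
        }
        StringBuilder sb = new StringBuilder(8);
        for (int i = 0; i < 8; i++) {
            int x = Integer.parseInt(hex32.substring(i * 4, i * 4 + 4), 16);
            sb.append(ALPHABET.charAt(x % ALPHABET.length()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hand-picked groups: 0x0000 -> index 0, 0x003d -> 61, 0x003e -> 0,
        // 0xffff -> 1, 0x001a -> 26, 0x0024 -> 36, 0x0001 -> 1, 0x003f -> 1
        String fixed = "0000" + "003d" + "003e" + "ffff"
                     + "001a" + "0024" + "0001" + "003f";
        System.out.println(encode(fixed));

        // The same rule applied to a random UUID, as in the original method.
        String random = UUID.randomUUID().toString().replace("-", "");
        System.out.println(encode(random));
    }
}
```

Note that each 16-bit group is collapsed to one of 62 characters, so the 8-character result carries at most about 8 × log2(62) ≈ 47.6 bits, far fewer than the original UUID. That is why duplicates are worth testing for.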
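To find out how many IDs can be generated before a duplicate appears, I wrote a fairly simple test:
It uses a thread pool and a database connection pool. Each thread handles one million records and inserts them into the database in batches of 70,000. The ID is the table's primary key, so an insert error means a duplicate ID has been generated.
If you need to insert more rows in one statement, or the insert fails with the following error: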
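Before involving a database, the same idea can be tried in memory. The following is only a sketch under assumptions: the class name InMemoryDuplicateCheck and the method firstDuplicate are made up for illustration, and a HashSet stands in for the primary-key constraint. It is practical only for sample sizes that fit in memory, so it does not replace the database test described above.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

class InMemoryDuplicateCheck {
    // Same alphabet as the original generator: a-z, 0-9, A-Z.
    static final String ALPHABET =
            "abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Same rule as generateShortUuid(): 4 hex digits -> value % 62 -> one char.
    static String shortUuid() {
        String uuid = UUID.randomUUID().toString().replace("-", "");
        StringBuilder sb = new StringBuilder(8);
        for (int i = 0; i < 8; i++) {
            int x = Integer.parseInt(uuid.substring(i * 4, i * 4 + 4), 16);
            sb.append(ALPHABET.charAt(x % ALPHABET.length()));
        }
        return sb.toString();
    }

    // Generates up to 'limit' IDs and returns how many had been generated
    // when the first duplicate appeared, or -1 if none appeared.
    static long firstDuplicate(int limit) {
        Set<String> seen = new HashSet<>();
        for (int i = 1; i <= limit; i++) {
            // Set.add returns false when the value is already present,
            // which plays the role of the primary-key violation.
            if (!seen.add(shortUuid())) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int limit = 200_000; // kept small so the set fits comfortably in memory
        long hit = firstDuplicate(limit);
        System.out.println(hit < 0
                ? "no duplicate within " + limit + " IDs"
                : "first duplicate after " + hit + " IDs");
    }
}
```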
Packet for query is too large (4,800,048 > 4,194,304)
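Edit my.ini and add:
max_allowed_packet = 67108864
67108864 bytes is 64M; the default here is 4194304, i.e. 4M. After editing the file, the MySQL service must be restarted. If you change the value from the command line instead, no restart is needed, but the new global value only applies to connections opened after the change.
Changing it from the command line (here set to 500M):
mysql> set global max_allowed_packet = 500*1024*1024;
To check the current max_allowed_packet value, run:
show VARIABLES like '%max_allowed_packet%';
Below is the insertion code used for the test: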
private static ThreadPoolExecutor threadPoolExecutor = new ThreadPoolExecutor(
        50, 100, 0L, TimeUnit.MILLISECONDS,
        new LinkedBlockingQueue<Runnable>(1024),
        new DefaultManagedAwareThreadFactory(),
        new ThreadPoolExecutor.AbortPolicy());

@Override
public void saveTest() {
    threadPoolExecutor.execute(new saveUUID());
}

/**
 * Worker thread that generates the IDs and saves them.
 */
class saveUUID implements Runnable {
    public saveUUID() {
    }

    @Override
    public void run() {
        System.out.println("===================>>>>>" + Thread.currentThread().getName());
        List<Map<String, Object>> userList = new ArrayList<>();
        int index = 0;
        for (int i = 0; i < 1000000; i++) {
            // Reset on every iteration, so the printed time is essentially
            // the duration of the batch insert alone.
            long startTime = System.currentTimeMillis();
            Map<String, Object> map = new HashMap<>();
            map.put("id", getUUID8());
            userList.add(map);
            index++;
            // Flush a batch every 70,000 records
            if (index == 70000) {
                int count = testMapper.saveTest(userList);
                if (count > 0) {
                    userList.clear();
                    index = 0;
                    long time = System.currentTimeMillis() - startTime;
                    System.out.println(Thread.currentThread().getName() + "======>>>>" + time);
                }
            }
        }
        // Insert whatever is left over after the last full batch
        if (!userList.isEmpty()) {
            testMapper.saveTest(userList);
        }
    }
}
The XML mapper file:
<insert id="saveTest" parameterType="java.util.List">
    insert into tb_user(id,user_name) values
    <foreach item="item" index="index" collection="list" separator=",">
        (#{item.id},#{item.id})
    </foreach>
</insert>
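In the test, the first duplicate appeared when the row count reached 2,129,580, i.e. a little over two million. Querying the database showed that the cause was that MySQL was not distinguishing upper and lower case.
On MySQL primary keys (and other queries) being case-insensitive: Duplicate entry 'AOVbrXXF' for key 'PRIMARY'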
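A rough birthday-problem estimate suggests why a duplicate at around two million rows is plausible under case-insensitive comparison. The sketch below is an approximation under stated assumptions, not a measurement: it treats every character as independent, ignores the slight modulo bias, and uses the standard approximation that the expected number of draws before the first collision is about sqrt(π / (2p)), where p is the probability that two random IDs compare equal. The class and method names are illustrative.

```java
class BirthdayEstimate {
    // Approximate expected number of IDs drawn before the first collision,
    // given the probability that two independent IDs compare equal.
    static double expectedDrawsToFirstCollision(double pairProbability) {
        return Math.sqrt(Math.PI / (2.0 * pairProbability));
    }

    // Case-sensitive comparison: 62 equally likely characters per position
    // (assumption: the small bias from "% 62" is ignored).
    static double caseSensitivePairProbability() {
        return Math.pow(1.0 / 62, 8);
    }

    // Case-insensitive comparison: 'a' and 'A' compare equal, so each of the
    // 26 letters has probability 2/62, while each of the 10 digits keeps 1/62.
    static double caseInsensitivePairProbability() {
        double perChar = 26 * Math.pow(2.0 / 62, 2) + 10 * Math.pow(1.0 / 62, 2);
        return Math.pow(perChar, 8);
    }

    public static void main(String[] args) {
        System.out.printf("case-insensitive: ~%.0f IDs%n",
                expectedDrawsToFirstCollision(caseInsensitivePairProbability()));
        System.out.printf("case-sensitive:   ~%.0f IDs%n",
                expectedDrawsToFirstCollision(caseSensitivePairProbability()));
    }
}
```

Under these assumptions the estimate comes out on the order of 1.6 million IDs for case-insensitive comparison and about 18.5 million for case-sensitive comparison. These are expectations only; any single run can hit its first duplicate noticeably earlier or later.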
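Switch MySQL to case-sensitive comparison (for example, by giving the id column a binary collation such as utf8_bin), then run the test again.
At 4 million rows there were still no duplicates, but inserts had become slow.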
Starting from an empty table, inserting 1 million rows took:

26.318 seconds
Starting from an empty table, inserting 2 million rows took:

64.097 seconds
Starting from an empty table, the time taken by each batch insert (in milliseconds, as printed by the test code):
DefaultManagedAwareThreadFactory-1======>>>>3209
DefaultManagedAwareThreadFactory-1======>>>>1033
DefaultManagedAwareThreadFactory-1======>>>>1089
DefaultManagedAwareThreadFactory-1======>>>>952
DefaultManagedAwareThreadFactory-1======>>>>921
DefaultManagedAwareThreadFactory-1======>>>>1119
DefaultManagedAwareThreadFactory-1======>>>>825
DefaultManagedAwareThreadFactory-1======>>>>911
DefaultManagedAwareThreadFactory-1======>>>>919
DefaultManagedAwareThreadFactory-1======>>>>850
DefaultManagedAwareThreadFactory-1======>>>>949
DefaultManagedAwareThreadFactory-1======>>>>983
DefaultManagedAwareThreadFactory-1======>>>>720
DefaultManagedAwareThreadFactory-1======>>>>1097
DefaultManagedAwareThreadFactory-1======>>>>862
DefaultManagedAwareThreadFactory-1======>>>>1173
DefaultManagedAwareThreadFactory-1======>>>>942
DefaultManagedAwareThreadFactory-1======>>>>808
DefaultManagedAwareThreadFactory-1======>>>>1159
DefaultManagedAwareThreadFactory-1======>>>>845
DefaultManagedAwareThreadFactory-1======>>>>998
DefaultManagedAwareThreadFactory-1======>>>>1478
DefaultManagedAwareThreadFactory-1======>>>>1420
DefaultManagedAwareThreadFactory-1======>>>>974
DefaultManagedAwareThreadFactory-1======>>>>1398
DefaultManagedAwareThreadFactory-1======>>>>4005
DefaultManagedAwareThreadFactory-1======>>>>2922
DefaultManagedAwareThreadFactory-1======>>>>2126
DefaultManagedAwareThreadFactory-1======>>>>1198
DefaultManagedAwareThreadFactory-1======>>>>931
DefaultManagedAwareThreadFactory-1======>>>>2197
DefaultManagedAwareThreadFactory-1======>>>>1014
DefaultManagedAwareThreadFactory-1======>>>>1231
DefaultManagedAwareThreadFactory-1======>>>>3416
DefaultManagedAwareThreadFactory-1======>>>>4471
DefaultManagedAwareThreadFactory-1======>>>>960
DefaultManagedAwareThreadFactory-1======>>>>1199
DefaultManagedAwareThreadFactory-1======>>>>1142
DefaultManagedAwareThreadFactory-1======>>>>940
DefaultManagedAwareThreadFactory-1======>>>>2211
DefaultManagedAwareThreadFactory-1======>>>>23994
DefaultManagedAwareThreadFactory-1======>>>>37751
DefaultManagedAwareThreadFactory-1======>>>>51163
DefaultManagedAwareThreadFactory-1======>>>>66993
DefaultManagedAwareThreadFactory-1======>>>>79591
DefaultManagedAwareThreadFactory-1======>>>>91846
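As the log shows, once the data volume passes a certain point the insert time rises sharply: from the 41st logged batch onward, the time per batch jumps from roughly 1 to 4 seconds to about 24 seconds, and then grows by a further 12 to 16 seconds with every batch.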


Getting past this slowdown point will probably require further optimization.
Generated UUIDs hold up at ten million rows

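So far, with 11.47 million rows inserted, no duplicates have appeared, which is consistent with the original author's claim that ten million IDs can be generated without a duplicate.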

