A Directly Runnable Distributed ID Generator Built on Spring Boot, with a Detailed Walkthrough of the SnowFlake Algorithm

Reposted article. Originally published 2019-12-23 21:18. Tags: SnowFlake / java / Spring Boot / distributed systems

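Background

I have recently become interested in snowflake, so I read the source code of a few open-source distributed unique ID generators (ID issuers), such as Baidu's uid-generator and Meituan's leaf. After a first pass, uid-generator struck me as the better written of the two: concise and compact.

I happened to have a task that required an ID generator, and Baidu's source cannot be run directly as a service. As an exercise, I ported uid-generator to Spring Boot and rewrote it.

The code is essentially identical to the original. I only added some engineering around it so that uid-generator can run as a service and hand out IDs over an HTTP interface.

The runnable code is linked from the original post.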

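SnowFlake Data Structure

Here is the layout, borrowed from the uid-generator diagram:

sign (1 bit) | delta seconds (28 bits) | worker id (22 bits) | sequence (13 bits)

This structure is a single ID in the snowflake algorithm: 64 bits in total, which is exactly one long.

sign is always 0, which guarantees that the computed ID is never negative.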

delta seconds (28 bits)

The current time, expressed as an offset in seconds from the epoch "2016-05-20". The epoch is configurable. 28 bits can count up to 2^28 seconds, which works out to roughly 8.5 years.

worker id (22 bits)

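The machine ID. It supports about 4.2 million machine startups (2^22 ≈ 4.2 million). The built-in implementation has the database assign it at startup.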

sequence (13 bits)

The sequence number within one second. 13 bits allow 2^13 = 8192 IDs per second.

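All of these widths can be changed. For many companies, the roughly 8.5 years offered by 28 bits of delta seconds is foreseeably too short, while a 22-bit worker id and a 13-bit sequence far exceed any foreseeable workload. You are free to adjust the three parameters to fit your own needs.

For example, {"workerBits":20,"timeBits":31,"seqBits":12} supports about 68 years, roughly 1 million restarts, and 4096 IDs per second on a single machine, which I find a reasonable fit.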
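The capacity figures above follow directly from the bit widths. A minimal sketch of that arithmetic is shown below; the class BitBudget and its method names are made up for illustration and are not part of uid-generator.

```java
// Hypothetical helper (not from uid-generator): capacity implied by a bit layout.
final class BitBudget {
    static final long SECONDS_PER_YEAR = 365L * 24 * 60 * 60;

    /** Whole years covered by a seconds counter that is timeBits wide. */
    static long yearsCovered(int timeBits) {
        return (1L << timeBits) / SECONDS_PER_YEAR;
    }

    /** Number of distinct values a field of the given width can hold (2^bits). */
    static long capacity(int bits) {
        return 1L << bits;
    }

    /** The sign bit plus the three fields must fill a 64-bit long exactly. */
    static boolean fillsLong(int timeBits, int workerBits, int seqBits) {
        return 1 + timeBits + workerBits + seqBits == 64;
    }

    public static void main(String[] args) {
        int timeBits = 31, workerBits = 20, seqBits = 12;
        System.out.println("valid layout:   " + fillsLong(timeBits, workerBits, seqBits));
        System.out.println("years covered:  " + yearsCovered(timeBits));
        System.out.println("worker ids:     " + capacity(workerBits));
        System.out.println("ids per second: " + capacity(seqBits));
    }
}
```

Running the same numbers for the default 28/22/13 layout gives 8 whole years, which is why the time field is usually the first one to widen.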

There are many ways to implement snowflake, but the underlying idea is always the same.

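SnowFlake ID Generation

With the SnowFlake data structure understood, we can look at how an ID is actually generated.

The process amounts to filling in three fields: delta seconds, sequence, and worker id.

The overall class structure is as follows:

SnowFlakeGenerator is the UidGenerator implementation based on the SnowFlake algorithm; the SnowFlake logic lives in this class.

BitsAllocator is a utility class for the bit operations on a SnowFlake ID.

DatabaseWorkerIdAssigner is a worker id assigner backed by a database auto-increment column.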

BitsAllocator

This utility class performs the bit operations. Writing delta seconds, sequence, and worker id into each ID goes through this class. It has the following member variables:

    /**
     * Total 64 bits
     */
    public static final int TOTAL_BITS = 1 << 6;

    /**
     * Bits for [sign-> second-> workId-> sequence]
     */
    private int signBits = 1;
    private final int timestampBits;
    private final int workerIdBits;
    private final int sequenceBits;

    /**
     * Max value for workId & sequence
     */
    private final long maxDeltaSeconds;
    private final long maxWorkerId;
    private final long maxSequence;

    // Shift for timestamp & workerId

    /**
     * Number of bits the timestamp is shifted left
     */
    private final int timestampShift;
    /**
     * Number of bits the workerId is shifted left
     */
    private final int workerIdShift;

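The other fields are self-explanatory from their names and comments. The two shift fields at the bottom may look puzzling for now, but the assignment code further down shows what "shift" means.

Constructor: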

    /**
     * Constructor with timestampBits, workerIdBits, sequenceBits<br>
     * The highest bit used for sign, so <code>63</code> bits for timestampBits, workerIdBits, sequenceBits
     */
    public BitsAllocator(int timestampBits, int workerIdBits, int sequenceBits) {
        // make sure allocated 64 bits
        int allocateTotalBits = signBits + timestampBits + workerIdBits + sequenceBits;
        Assert.isTrue(allocateTotalBits == TOTAL_BITS, "allocate not enough 64 bits");

        // initialize bits
        this.timestampBits = timestampBits;
        this.workerIdBits = workerIdBits;
        this.sequenceBits = sequenceBits;

        // initialize max value
        // -1L is 64 one-bits in binary
        // shifting -1L left by timestampBits gives 111...1000...0 (timestampBits trailing zeros)
        // inverting that gives 000...0111...1 (timestampBits trailing ones)
        // which equals 2^timestampBits - 1
        this.maxDeltaSeconds = ~(-1L << timestampBits);
        this.maxWorkerId = ~(-1L << workerIdBits);
        this.maxSequence = ~(-1L << sequenceBits);

        // initialize shift
        this.timestampShift = workerIdBits + sequenceBits;
        this.workerIdShift = sequenceBits;
    }

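This is also straightforward. The key is the expression "~(-1L << timestampBits)", which may take a moment to understand. It is a chain of bit operations, broken down here:

- Shift -1 left by timestampBits. In binary the result is 1111...1000...0 (the leading 1 is the sign bit, marking a negative number, and there are timestampBits trailing zeros). The value is -(2^timestampBits).
- Invert -(2^timestampBits) bitwise, which gives 2^timestampBits - 1. In binary that is 111...1 (timestampBits ones).

The whole operation is equivalent to 2^timestampBits - 1, the largest number that timestampBits binary digits can represent, only computed with bit operations. If shifts and bitwise negation are unfamiliar, search for "bit operations" to fill in the basics; I will not expand on them here.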
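The two steps can be checked directly. The following is a small sketch with a made-up class name (MaskDemo) that prints the intermediate bit patterns for a 13-bit field and compares the result with the arithmetic form.

```java
// Illustrative only: shows each step of ~(-1L << bits) for a small field width.
final class MaskDemo {
    /** Largest value that fits in the given number of bits, i.e. 2^bits - 1. */
    static long maxValue(int bits) {
        return ~(-1L << bits);
    }

    public static void main(String[] args) {
        int bits = 13;
        long shifted = -1L << bits; // ones on the left, 13 zeros on the right
        System.out.println(Long.toBinaryString(shifted));
        System.out.println(Long.toBinaryString(~shifted)); // 13 ones
        System.out.println(maxValue(bits) == (1L << bits) - 1); // true
    }
}
```

Long.toBinaryString drops leading zeros, so the second line shows only the thirteen 1 bits.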

Allocation:

    /**
     * Allocate bits for UID according to delta seconds & workerId & sequence<br>
     * <b>Note that: </b>The highest bit will always be 0 for sign
     *
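     * Each field is placed at its own bit position here.
     * Overall layout of the id:
     * sign (fixed 1bit) -> deltaSecond -> workerId -> sequence(within the same second)
     * deltaSeconds is shifted left by (workerIdBits + sequenceBits) bits and workerId by sequenceBits bits, which completes the bit allocation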
     * @param deltaSeconds
     * @param workerId
     * @param sequence
     * @return
     */
    public long allocate(long deltaSeconds, long workerId, long sequence) {
        return (deltaSeconds << timestampShift) | (workerId << workerIdShift) | sequence;
    }

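This is where delta seconds, sequence, and worker id are written into the ID, one of the core pieces of code. Looking back at the layout at the top, sequence sits at the far right (the lowest bits), so it needs no shift and is already in the right place.

workerId must be shifted left by workerIdShift to reach its position. As the constructor shows, workerIdShift equals sequenceBits, the width of sequence.

deltaSeconds is shifted left by timestampShift, which is workerIdBits + sequenceBits.

OR-ing the three shifted values together puts each value into its own bit range.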
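One way to convince yourself that the shifts line up is to pack an ID and then pull the fields back out. The sketch below does that; the class UidLayout and its decode methods are illustrative names, not the BitsAllocator API.

```java
// Illustrative sketch: packs the three fields and decodes them again.
final class UidLayout {
    private final int workerBits;
    private final int seqBits;

    UidLayout(int workerBits, int seqBits) {
        this.workerBits = workerBits;
        this.seqBits = seqBits;
    }

    /** Same expression as allocate(): shift each field into place and OR them. */
    long allocate(long deltaSeconds, long workerId, long sequence) {
        return (deltaSeconds << (workerBits + seqBits)) | (workerId << seqBits) | sequence;
    }

    /** Everything above the worker id and sequence bits is the time delta. */
    long deltaSeconds(long uid) {
        return uid >>> (workerBits + seqBits);
    }

    /** Drop the sequence bits, then mask off everything above the worker id. */
    long workerId(long uid) {
        return (uid >>> seqBits) & ~(-1L << workerBits);
    }

    /** The lowest seqBits bits are the sequence. */
    long sequence(long uid) {
        return uid & ~(-1L << seqBits);
    }

    public static void main(String[] args) {
        UidLayout layout = new UidLayout(22, 13);
        long uid = layout.allocate(100, 7, 5);
        System.out.println(uid);
        System.out.println(layout.deltaSeconds(uid) + " " + layout.workerId(uid) + " " + layout.sequence(uid));
    }
}
```

Because the fields never overlap, decoding returns exactly the values that went in, as long as each value fits within its width.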

DatabaseWorkerIdAssigner

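In SnowFlake, deltaSeconds depends on the timestamp and can be read from the system, and sequence can be managed by incrementing a counter. The project can supply both of these by itself, but WorkerId needs a separate strategy to provide it.

That strategy must guarantee that the WorkerId obtained at each service startup is never repeated. Otherwise different machines in the cluster could receive the same workerId and issue duplicate IDs.

Service startup is also a relatively infrequent event and has no effect on ID generation performance, so a database auto-increment ID is a good fit.

DatabaseWorkerIdAssigner is the workerId assigner built on a database auto-increment ID.

I will not paste the code; it is a simple save followed by reading back the auto-increment ID from the DB.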
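To make the contract concrete without a database, here is a rough stand-in. The interface name WorkerIdAssigner and the method assignWorkerId appear in the generator code in this post; InMemoryWorkerIdAssigner is an assumption for illustration only, with an AtomicLong playing the role of the auto-increment column. It is not the real DatabaseWorkerIdAssigner.

```java
import java.util.concurrent.atomic.AtomicLong;

interface WorkerIdAssigner {
    long assignWorkerId();
}

// Hypothetical stand-in: the counter imitates a table's auto-increment primary key.
final class InMemoryWorkerIdAssigner implements WorkerIdAssigner {
    private final AtomicLong autoIncrement = new AtomicLong(0);

    @Override
    public long assignWorkerId() {
        // A DB-backed version would insert a row for this startup
        // and return the generated key instead.
        return autoIncrement.incrementAndGet();
    }
}
```

Every call returns a new value, which mirrors the property the post relies on: each startup gets a workerId that no other startup has received. An in-memory counter only holds within one process, so a shared database is what makes this work across a cluster.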

SnowFlakeGenerator

This is where the ID generation logic is controlled.

First, the member variables and initialization:

    @Value("${snowflake.timeBits}")
    protected int timeBits = 28;

    @Value("${snowflake.workerBits}")
    protected int workerBits = 22;

    @Value("${snowflake.seqBits}")
    protected int seqBits = 13;

    /** Customer epoch, unit as second. For example 2016-05-20 (ms: 1463673600000)*/
    @Value("${snowflake.epochStr}")
    protected String epochStr = "2016-05-20";
    protected long epochSeconds = TimeUnit.MILLISECONDS.toSeconds(1463673600000L);

    @Autowired
    @Qualifier(value = "dbWorkerIdAssigner")
    protected WorkerIdAssigner workerIdAssigner;

    /** Stable fields after spring bean initializing */
    protected BitsAllocator bitsAllocator;


    protected long workerId;


    /** Volatile fields caused by nextId() */
    protected long sequence = 0L;
    protected long lastSecond = -1L;


    @PostConstruct
    public void afterPropertiesSet() throws Exception {
        bitsAllocator = new BitsAllocator(timeBits,workerBits,seqBits);
        // initialize worker id
        workerId = workerIdAssigner.assignWorkerId();

        if(workerId > bitsAllocator.getMaxWorkerId()){
            throw new RuntimeException("Worker id " + workerId + " exceeds the max " + bitsAllocator.getMaxWorkerId());
        }

        if (StringUtils.isNotBlank(epochStr)) {
            this.epochSeconds = TimeUnit.MILLISECONDS.toSeconds(DateUtils.parseByDayPattern(epochStr).getTime());
        }

        log.info("Initialized bits(1, {}, {}, {}) for workerID:{}", timeBits, workerBits, seqBits, workerId);
    }

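The fields annotated with @Value are injected from the configuration file.

afterPropertiesSet passes the configured values to BitsAllocator and constructs a matching instance.

It then obtains a workerId (by inserting one DB record), which completes initialization.

Now the core ID generation logic: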

    /**
     * Get UID
     *
     * @return UID
     * @throws UidGenerateException in the case: Clock moved backwards; Exceeds the max timestamp
     */
    protected synchronized long nextId() {
        long currentSecond = getCurrentSecond();

        // Clock moved backwards, refuse to generate uid
        // TODO: clock rollback is not handled yet
        if (currentSecond < lastSecond) {
            long refusedSeconds = lastSecond - currentSecond;
            throw new UidGenerateException("Clock moved backwards. Refusing for %d seconds", refusedSeconds);
        }

        // At the same second, increase sequence
        if (currentSecond == lastSecond) {
            // Increment sequence; the mask wraps it to 0 once it passes maxSequence.
            // maxSequence is the largest value sequence can hold: seqBits one-bits in binary.
            sequence = (sequence + 1) & bitsAllocator.getMaxSequence();
            // Exceed the max sequence, we wait the next second to generate uid
            if (sequence == 0) {
                currentSecond = getNextSecond(lastSecond);
            }

            // At the different second, sequence restart from zero
        } else {
            sequence = 0L;
        }
        lastSecond = currentSecond;
        // Allocate bits for UID
        return bitsAllocator.allocate(currentSecond - epochSeconds, workerId, sequence);
    }

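Note that this is a synchronized method, which is essential: it keeps concurrent callers from corrupting sequence and lastSecond.

getCurrentSecond returns the current timestamp in seconds.

sequence logic

If currentSecond equals lastSecond, this request is not the first one in the current second, so sequence is simply incremented by 1. If the incremented value exceeds MaxSequence (the & bitsAllocator.getMaxSequence() mask resets it to 0), the sequence for this second is used up and requests have exceeded the per-second capacity. In that case getNextSecond (see the GitHub repository) is called to wait for the next second.

If currentSecond differs from lastSecond, the request falls in a new second and sequence is reset to 0.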
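The post leaves getNextSecond to the repository. One plausible shape for it is a spin loop that re-reads the clock until the second changes. The sketch below is written under that assumption and may differ from the actual code; SecondClock is a made-up holder class.

```java
import java.util.concurrent.TimeUnit;

// Sketch under assumptions: a busy-wait version of getNextSecond.
final class SecondClock {
    /** Current wall-clock time, truncated to whole seconds. */
    static long getCurrentSecond() {
        return TimeUnit.MILLISECONDS.toSeconds(System.currentTimeMillis());
    }

    /** Polls the clock until it reports a second strictly after lastSecond. */
    static long getNextSecond(long lastSecond) {
        long second = getCurrentSecond();
        while (second <= lastSecond) {
            second = getCurrentSecond();
        }
        return second;
    }
}
```

The wait lasts less than one second when lastSecond is the current second, and it happens inside the synchronized nextId, so every caller is held back until the new second begins.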

deltaSecond logic

It is currentSecond - epochSeconds: the number of seconds between the epoch and the current time.
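As a worked example of that subtraction, the sketch below derives epochSeconds from the date string and computes a delta. The 1463673600000 ms constant in the generator's field comment corresponds to midnight of 2016-05-20 at UTC+8, so the sketch assumes that offset; EpochMath and its methods are illustrative names and not the DateUtils helper used in the post.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;

// Illustrative helper: epoch and delta arithmetic with java.time.
final class EpochMath {
    /** Midnight of the given ISO date at a fixed UTC offset, in epoch seconds. */
    static long epochSeconds(String isoDate, int utcOffsetHours) {
        return LocalDate.parse(isoDate)
                .atStartOfDay(ZoneOffset.ofHours(utcOffsetHours))
                .toEpochSecond();
    }

    /** Seconds elapsed between the configured epoch and the given second. */
    static long deltaSeconds(long currentSecond, long epochSeconds) {
        return currentSecond - epochSeconds;
    }

    public static void main(String[] args) {
        long epoch = epochSeconds("2016-05-20", 8); // assumption: UTC+8
        long now = System.currentTimeMillis() / 1000;
        System.out.println("epochSeconds = " + epoch);
        System.out.println("deltaSeconds = " + deltaSeconds(now, epoch));
    }
}
```

Once the printed delta approaches 2^timeBits, the time field is about to overflow, which is the practical meaning of the "years supported" figure.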

At this point workerId, deltaSecond, and sequence all have concrete values. Calling bitsAllocator.allocate then produces a new ID, and generation is complete.
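To watch the whole flow, including the sequence wrapping to 0, it helps to replace the wall clock with a controllable one. The following is a condensed sketch, not the SnowFlakeGenerator from the post: MiniSnowflake, the LongSupplier clock parameter, and the tiny 2-bit fields are all assumptions chosen to keep the numbers readable.

```java
import java.util.function.LongSupplier;

// Illustrative sketch only: same control flow as nextId(), with an injectable clock.
final class MiniSnowflake {
    private final int workerBits;
    private final int seqBits;
    private final long maxSequence;
    private final long epochSeconds;
    private final long workerId;
    private final LongSupplier clock; // supplies the current time in seconds

    private long sequence = 0L;
    private long lastSecond = -1L;

    MiniSnowflake(int workerBits, int seqBits, long epochSeconds, long workerId, LongSupplier clock) {
        this.workerBits = workerBits;
        this.seqBits = seqBits;
        this.maxSequence = ~(-1L << seqBits);
        this.epochSeconds = epochSeconds;
        this.workerId = workerId;
        this.clock = clock;
    }

    synchronized long nextId() {
        long currentSecond = clock.getAsLong();
        if (currentSecond < lastSecond) {
            throw new IllegalStateException("Clock moved backwards");
        }
        if (currentSecond == lastSecond) {
            sequence = (sequence + 1) & maxSequence;
            if (sequence == 0) {
                // Sequence exhausted: poll the clock until the next second starts.
                while (currentSecond <= lastSecond) {
                    currentSecond = clock.getAsLong();
                }
            }
        } else {
            sequence = 0L;
        }
        lastSecond = currentSecond;
        return ((currentSecond - epochSeconds) << (workerBits + seqBits))
                | (workerId << seqBits)
                | sequence;
    }

    public static void main(String[] args) {
        // Fake clock: every 5 reads advance time by one second, starting at second 1000.
        long[] reads = {0};
        MiniSnowflake gen = new MiniSnowflake(2, 2, 1000L, 1L, () -> 1000 + (reads[0]++) / 5);
        for (int i = 0; i < 6; i++) {
            System.out.println(gen.nextId());
        }
    }
}
```

With a 2-bit sequence, only four IDs fit into one second, so the main method prints 4, 5, 6, 7 for the first second and then 20, 21 after the generator has waited for the fake clock to tick over.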
