遙感影像深度學習標注軟件的開發要點

本文轉載自查看原文 2019-03-05 13:01 1701

　　最近一個月開發了一款遙感影像深度學習標注軟件。經過一個月的艱苦編碼，基本已經穩定，講開發過程做一個簡單記錄，以備后用。

一、遙感影像的標注與圖片影像有何不同

1、遙感影像文件尺寸大，單幅影像動輒達到幾百M甚至上T的數據量。用labelme之類的軟件無法打開。

2、遙感影像很多是16Bit 32BiT 的數據，不經過數據類型轉換和拉伸處理無法正常顯示。

3、數據成果的分幅問題，普通圖像可以直接導出標注結果，但遙感圖像的標注結果想要被深度學習算法利用，必須要經過分幅處理。

4、多波段數據需要選擇波段。

針對這些遙感影像本身的特點，我們基於labelme，做了針對性的改進。

首先，大幅的影像，我們需要給影像做金字塔文件，在GDAL提供了很方便的金字塔函數:

CPLErr CPL_DLL CPL_STDCALL
GDALBuildOverviews( GDALDatasetH, const char *, int, int *,
                    int, int *, GDALProgressFunc, void * );

經過這樣的處理，我們就擁有了多層次的圖像數據，單這僅僅是第一步，為了平滑顯示遙感影像，需要對各層次影像進行切塊，也就是LOD技術(Levels of Detail的簡稱，意為多細節層次)。

實現LOD技術，不但在讓標注軟件在瀏覽影像時候更平滑順暢，而且方便用戶在不同尺度上對物體進行標注。

其次，要解決16BIT 和 32 bit 影像的顯示問題。我們都知道，計算機顯示圖像時，只能用RGB三原色模型，每個波段只能用8BIT顯示，那么在顯示圖像的時候必然會涉及到scalar type的轉換，

雖然GDAL 在 rasterIO 的時候，也可自動實現顏色scalar type轉換，但是這往往還不夠，為了使得顯示的顏色更加飽滿，通常需要去掉直方圖中前后各0.2% 的像素值。

/**/
void getHistogramMinMax(GDALDatasetH dataset, std::vector<double>& mins, 
    std::vector<double>& maxs, GDALProgressFunc callback = 0L) {
    GDALDataType datatype = GDALGetRasterDataType(GDALGetRasterBand(dataset, 1));
    ScalarType scalarType;
    if (datatype == GDT_UInt16) scalarType = UINT16;
    if (datatype == GDT_Int16)  scalarType =SINT16;
    if (datatype == GDT_UInt32) scalarType = UINT32;
    if (datatype == GDT_Int32)  scalarType = SINT32;
    int bands = GDALGetRasterCount(dataset);
    for (int i = 0; i < bands; i++)
    {
        // scalar min
        const float64 S_MIN = defaultMin(scalarType);
        // scalar max
        const float64 S_MAX = defaultMax(scalarType);
        int bins = S_MAX - S_MIN + 1;
        int *panHistogram = new int[bins];
        double *accuHistogram = new double[bins];
        GDALGetRasterHistogram(GDALGetRasterBand(dataset, 1),
            S_MIN - 0.5, S_MAX + 0.5,
            bins, panHistogram, false, false, 0, 0);
        /*acc  histo*/
        for (int j = 0; j < bins; j++) {
            accuHistogram[j] = panHistogram[j];
        }
        for (int j = 1; j < bins; j++) {
            accuHistogram[j] += accuHistogram[j - 1];
        }
        size_t samples = GDALGetRasterXSize(dataset);
        size_t lines = GDALGetRasterYSize(dataset);
        
        for (int j = 0; j < bins; j++) {
            accuHistogram[j] /= accuHistogram[bins - 1];
        }
        int min = 0, max = 0;
        for (int u = 0; u < bins; u++) {
            if (accuHistogram[u] > 0.02) {
                min = u;
                break;
            }
        }
        for (int u = 0; u < bins; u++) {
            if (accuHistogram[u] > 0.98) {
                max = u;
                break;
            }
        }
        std::cout << "= getHistogramMinMax:" << "S_MIN=" << S_MIN
            << " min=" << min << " max=" << max << std::endl;
        mins.push_back(S_MIN + min);
        maxs.push_back(S_MIN + max);
        delete [] panHistogram;
        delete[] accuHistogram;
    }    
    return;
}

對於大幅的遙感影像而言，如何將標注成果導出為CNN深度模型的輸入，也是一個必要要解決的問題，在GDAL中，有一個很好用的python腳本工具, 但是這個分塊工具的用途是給WMS服務用的，涉及許多的坐標系變換，用起來相當復雜，

我們采用自己分幅的方式，將成果輸出為固定大小的TILE. 這個時候要注意，分出來的每個小塊還是需要保留該區域的地理坐標信息，即保存它的 GeoTrans[6] 這個屬性。在這個過程中有一個地方要注意， GeoTrans[5]的值一般是一個負數。

bool Utils::gdal2Tile(QString image, int szInPixel, 
    QString outDir,
    QList<QPoint> validBlocks,           /*only valid blocks will be saved*/
    std::function<void(int&)> cb)
{
    //open file
    char *strImage = image.toLocal8Bit().data();
    GDALDatasetH dataset = GDALOpen(strImage, GA_ReadOnly);
    if (!dataset) {
        return false;
    }
    double geoTrans[6];
    GDALGetGeoTransform(dataset, geoTrans);
    /*get wkt*/
    std::string wkt = (char*)GDALGetProjectionRef(dataset);
    //get image info
    unsigned int height = GDALGetRasterYSize(dataset);
    unsigned int width = GDALGetRasterXSize(dataset);
    int bsNum = GDALGetRasterCount(dataset);
    GDALDataType dataType = GDALGetRasterDataType(GDALGetRasterBand(dataset, 1));
    int dataTypeSizeInBytes = GDALGetDataTypeSize(dataType) / 8;
    char* buf = new char[szInPixel * szInPixel * bsNum * dataTypeSizeInBytes];

    std::cout << "= gdal2Tile: outDir " 
        << outDir.toStdString() 
        <<" data type :" << dataType
        <<" dataType size:" << dataTypeSizeInBytes
        << std::endl;
    ossimString driverName;
    char **option = 0;
    if (image.endsWith("img", Qt::CaseInsensitive))    {
        driverName = "HFA";
        option = CSLSetNameValue(option, "BLOCKSIZE", "256");
    }
    else if (image.endsWith("ecw", Qt::CaseInsensitive)){
        driverName = "ECW";
    }
    else if (image.endsWith("gta", Qt::CaseInsensitive)){
        driverName = "GTA";
    }
    else if (image.endsWith("pix", Qt::CaseInsensitive)){
        driverName = "PCIDSK";
    }
    else if (image.endsWith("hdr", Qt::CaseInsensitive)){
        driverName = "ENVI";
        option = CSLSetNameValue(option, "INTERLEAVE", "bip");
        option = CSLSetNameValue(option, "SUFFIX", "REPLACE");
    }
    else{
        driverName = "GTIFF";
        option = CSLSetNameValue(option, "INTERLEAVE", "PIXEL");
        option = CSLSetNameValue(option, "COMPRESS", "NONE");
        option = CSLSetNameValue(option, "TILED", "YES");
        option = CSLSetNameValue(option, "BLOCKXSIZE", "256");
        option = CSLSetNameValue(option, "BLOCKYSIZE", "256");
        option = CSLSetNameValue(option, "BIGTIFF", "IF_NEEDED");
    }

    GDALDriverH driver = GDALGetDriverByName(driverName.c_str());
    int tile_x_num = std::ceil(1.0 * width / szInPixel);
    int tile_y_num = std::ceil(1.0 * height / szInPixel);
    for (int u = 0; u < validBlocks.size(); u++) {
        int i = validBlocks[u].y();
        int j = validBlocks[u].x();
        int xsize = (j != tile_x_num - 1) ? szInPixel : width - j*szInPixel;
        int ysize = (i != tile_y_num - 1) ? szInPixel : height - i*szInPixel;
        /*read all bands*/
        CPLErr err = GDALDatasetRasterIO(dataset, GF_Read, j*szInPixel,
            i*szInPixel, xsize, ysize, buf, xsize, ysize, dataType,
            bsNum, 0,
            bsNum * dataTypeSizeInBytes, bsNum * dataTypeSizeInBytes * szInPixel, dataTypeSizeInBytes);
        /*create new dataset, first get the output filename*/
        image = image.replace('\\', '/');
        int extPos = image.lastIndexOf('.');
        int lastSlashPos = image.lastIndexOf('/');
        QString baseName = image.mid(lastSlashPos, extPos - lastSlashPos);
        QString ext = image.mid(extPos, -1);
        if (outDir.endsWith('/') || outDir.endsWith('\\')) {
            outDir = outDir.replace('\\', '/');
        }
        else {
            outDir += '/';
            outDir = outDir.replace('\\', '/');
        }
        QString outImage = outDir + baseName
            + QString("_")
            + QString::number(i)
            + QString("_") + QString::number(j)            
            + ext;
        GDALDatasetH out = GDALCreate(driver,
            outImage.toLocal8Bit().data(),
            xsize,
            ysize,
            bsNum,
            dataType,
            option);

        /*write to new dataset*/
        GDALDatasetRasterIO(out, GF_Write, 0,
            0, xsize, ysize, buf, xsize, ysize, dataType,
            bsNum, 0,
            bsNum * dataTypeSizeInBytes, bsNum * dataTypeSizeInBytes * szInPixel, dataTypeSizeInBytes);

        if (!wkt.empty()) {
            GDALSetProjection(out, wkt.c_str());
            double newGeoTrans[6];
            memcpy(newGeoTrans, geoTrans, sizeof(double) * 6);
            newGeoTrans[0] = geoTrans[0] + geoTrans[1] * j * szInPixel;
            newGeoTrans[3] = geoTrans[3] + geoTrans[5] * i * szInPixel;
            GDALSetGeoTransform(out, newGeoTrans);
        }
        else {
            GDALSetProjection(out, wkt.c_str());
            double newGeoTrans[6];
            newGeoTrans[0] = 0;
            newGeoTrans[1] = 1;
            newGeoTrans[2] = 0;
            newGeoTrans[3] = ysize;
            newGeoTrans[4] = 0;
            newGeoTrans[5] = -1;
        }
        GDALClose(out);
        int percent = (u)*100.0 / validBlocks.size();
        cb(percent);
    }
    GDALClose(dataset);
    CSLDestroy(option);
    delete[] buf;
    int percent = 100;
    cb(percent);
    return true;
}

5、傾斜地物的標注

許多遙感影像中的地物，如球場，房屋等，是長方形的幾何體，但是由於拍攝的問題，我們得到的是傾斜的長方體，這種情況，如果用矩形框進行標注，會框選大部分的無效區域，如果用

多邊形標注，又會顯得麻煩，這時，如果用傾斜多邊形標注會非常簡潔。

這款遙感影像標注軟件是基於labelme的改進，目前在開發中，不過已經很好地解決了上述問題.

6. 標記軟件與數字地球的結合

數字地球提供了一個全局的環境，能幫助標注者輔助判讀圖像，CESIUM是一款基於webgl技術的，BS架構的開源數字地球軟件，相比worldwind java 和OSGEARTH ，這個社區更為

活躍，有豐富的案例和代碼可供參考。在RSLABEL標注軟件，提供了插件機制，將CESIUM作為一個模塊嵌入到工作環境中，用戶可將正在編輯的影像嵌入到數字地球上，在三維地形

的配合下進行判讀。網頁嵌入到應用程序中，需要用到QWebView , 該控件可以讓桌面應用程序裝載web頁面，通過evaluejavascript 函數讓網頁執行應用程序的指令。

https://github.com/enigma19971/RSLabel

免責聲明！

本站轉載的文章為個人學習借鑒使用，本站對版權不負任何法律責任。如果侵犯了您的隱私權益，請聯系本站郵箱yoyou2525@163.com刪除。

猜您在找 遙感影像深度學習樣本制作工具深度學習遙感影像(哨兵2A/B)超分辨率遙感影像重投影 arcgis對谷歌遙感影像拼接 Cartopy繪圖—疊加遙感影像遙感大數據深度學習變化檢測 Windows下用Caffe跑自己的數據（遙感影像） Cesium 遙感衛星影像推送效果繪制如何選擇合適分辨率的遙感影像 MATLAB對TIFF遙感影像批量輻射定標