Naive Bayes Text Classification in C#


Several write-ups on this topic have already been published on cnblogs.

Of these, the most practical is the Pymining version: Pymining can build a reusable model, and its write-up is fairly clear, so the original article is worth reading if you are interested.

Pymining is written in Python. As a C# devotee, I decided to write a C# classifier based on it; so far the naive Bayes classification part has been ported.

Here is a usage example:

var loadModel = ClassiferSetting.LoadExistModel;
//loadModel = true;
Text2Matrix text2Matrix = new Text2Matrix(loadModel);
ChiSquareFilter chiSquareFilter = new ChiSquareFilter(loadModel);
NaiveBayes bayes = new NaiveBayes(loadModel);

if (!loadModel)
{
    Console.WriteLine("Starting model training...");

    //var matrix = text2Matrix.CreateTrainMatrix(new SogouRawTextSource(@"E:\語料下載程序\新聞下載\BaiduCrawl\Code\HtmlTest\Jade.Util\Classifier\SogouC.reduced.20061127\SogouC.reduced\Reduced"));
    var matrix = text2Matrix.CreateTrainMatrix(new TuangouTextSource());

    Console.WriteLine("Running chi-square feature selection...");
    chiSquareFilter.TrainFilter(matrix);

    Console.WriteLine("Training the model...");
    bayes.Train(matrix);
}

var totalCount = 0;
var correctCount = 0;

var tuangouTest = new TuangouTextSource(@"E:\語料下載程序\新聞下載\BaiduCrawl\Code\HtmlTest\Jade.Util\Classifier\test.txt");

while (!tuangouTest.IsEnd)
{
    totalCount++;
    var raw = tuangouTest.GetNextRawText();
    Console.WriteLine("Text: " + raw.Text);
    Console.WriteLine("Labeled category: " + raw.Category);
    var category = GetCategory(raw.Text, bayes, chiSquareFilter, text2Matrix);
    Console.WriteLine("Predicted category: " + category);
    if (raw.Category == category)
    {
        correctCount++;
    }
}

Console.WriteLine("Accuracy: " + correctCount * 100 / totalCount + "%");

Console.ReadLine();

Result: (screenshot of the console output omitted)

 

To make things easier to follow, the main modules and the overall flow are introduced below.

Flowchart (image omitted)

 

Text classification generally proceeds by extracting features from the training set; for text that means word segmentation. Segmentation usually produces far too many terms to use them all as features, so the feature set must first be reduced. A classification algorithm (such as naive Bayes) is then used to build a model, and the model is used to predict the category of new text.
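The steps above can be sketched end to end. The Python snippet below is purely illustrative (none of these names exist in the C# project) and skips feature reduction; it just shows the tokenize → train → predict shape of the pipeline:

```python
# Toy sketch of the flow described above; every name here is
# illustrative, not the actual C# API.
from collections import Counter

def tokenize(text):
    return text.split()

def train(corpus):
    # corpus: list of (text, label); count word frequencies per label
    model = {}
    for text, label in corpus:
        model.setdefault(label, Counter()).update(tokenize(text))
    return model

def predict(model, text):
    # score each label by its word-count overlap with the input
    words = tokenize(text)
    return max(model, key=lambda lab: sum(model[lab][w] for w in words))
```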

Program Structure

The classifier consists of a configuration module, a segmentation module, a feature-selection module, and a classification module. Each is introduced below:

Configuration Module

The Python version stores its configuration in an XML file; the C# version reuses the same format.

<?xml version="1.0" encoding="utf-8" ?>
<config>
<__global__>
<term_to_id>model/term_to_id</term_to_id>
<id_to_term>model/id_to_term</id_to_term>
<id_to_doc_count>model/id_to_doc_count</id_to_doc_count>
<class_to_doc_count>model/class_to_doc_count</class_to_doc_count>
<id_to_idf>model/id_to_idf</id_to_idf>
<newid_to_id>model/newid_to_id</newid_to_id>
<class_to_id>model/class_to_id</class_to_id>
<id_to_class>model/id_to_class</id_to_class>
</__global__>

<__filter__>
<rate>0.3</rate>
<method>max</method>
<log_path>model/filter.log</log_path>
<model_path>model/filter.model</model_path>
</__filter__>

<naive_bayes>
<model_path>model/naive_bayes.model</model_path>
<log_path>model/naive_bayes.log</log_path>
</naive_bayes>

<twc_naive_bayes>
<model_path>model/naive_bayes.model</model_path>
<log_path>model/naive_bayes.log</log_path>
</twc_naive_bayes>

</config>

The configuration mainly stores the file paths of the model files.

Reading the XML is straightforward; for convenience, we define a few classes:

/// <summary>
/// Global configuration
/// </summary>
public class GlobalSetting
{
    public string TermToId { get; set; }
    public string IdToTerm { get; set; }
    public string IdToDocCount { get; set; }
    public string ClassToDocCount { get; set; }
    public string IdToIdf { get; set; }
    public string NewidToId { get; set; }
    public string ClassToId { get; set; }
    public string IdToClass { get; set; }
}

/// <summary>
/// Chi-square filter settings
/// </summary>
public class FilterSetting : TrainModelSetting
{
    /// <summary>
    /// Fraction of features to keep
    /// </summary>
    public double Rate { get; set; }

    /// <summary>
    /// "avg" or "max"
    /// </summary>
    public string Method { get; set; }
}

public class TrainModelSetting
{
    /// <summary>
    /// Log file path
    /// </summary>
    public string LogPath { get; set; }

    /// <summary>
    /// Model file path
    /// </summary>
    public string ModelPath { get; set; }
}

/// <summary>
/// Naive Bayes settings
/// </summary>
public class NaiveBayesSetting : TrainModelSetting
{
}

In addition, a utility class, ClassiferSetting, gives the rest of the program access to the configuration.


Word Segmentation

Feature extraction starts with word segmentation. In C#, the Pangu segmenter can be used directly, with a thin wrapper around it:

public class PanguSegment : ISegment
{
    static PanguSegment()
    {
        PanGu.Segment.Init();
    }

    public List<string> DoSegment(string text)
    {
        PanGu.Segment segment = new PanGu.Segment();
        ICollection<WordInfo> words = segment.DoSegment(text);
        return words.Where(w => w.OriginalWordType != WordType.Numeric)
                    .Select(w => w.Word)
                    .ToList();
    }
}

 

A stop-word filter, StopWordsHandler, can also be added:

public class StopWordsHandler
{
    // Note: most entries of the original stop-word list (single-character
    // words) were lost when the post was extracted; only these survive.
    private static string[] stopWordsList = { " ", "我們", "自己" };

    public static bool IsStopWord(string word)
    {
        // Exact match: the original used IndexOf, which is a substring test
        // and would also reject any word merely containing a stop word.
        return stopWordsList.Contains(word);
    }

    public static void RemoveStopWord(List<string> words)
    {
        words.RemoveAll(word => word.Trim() == string.Empty || stopWordsList.Contains(word));
    }
}

 

Reading the Training Set

Classification is not done out of thin air; it relies on prior knowledge, i.e. probabilities computed from a training set.

To keep things generic, we define a RawText class to represent a raw corpus entry:

public class RawText
{
    public string Text { get; set; }
    public string Category { get; set; }
}


Then the IRawTextSource interface represents a training set; the IsEnd property should make its usage obvious:

public interface IRawTextSource
{
    bool IsEnd { get; }
    RawText GetNextRawText();
}

For the Sogou corpus (download link in the original post), a SogouRawTextSource reader can be used; its code is omitted here.

 

Similarly, a reader class for the training-set format used by the Python version can be written; its code is omitted here.

Building the Matrix

Before introducing the matrix itself, one more object is needed: GlobalInfo, which stores data recorded while building the matrix, such as the term-to-id mapping.

Unlike the Python version, the C# GlobalInfo uses the singleton pattern for easier access (its code is omitted here).

 

From here on we enter the core part.

This part builds an m × n matrix representing the data: each row is a document, each column a feature (term).

Categories inside the matrix is an m × 1 matrix holding each document's category id.

Unlike the Python version, to save myself some effort the matrix object also carries the per-document categories (mea culpa); a FeatureWords property was also added to make inspecting the feature terms easier.

public class Matrix
{
    /// <summary>
    /// Row count, i.e. the number of samples
    /// </summary>
    public int RowsCount { get; private set; }

    /// <summary>
    /// Column count, i.e. the number of terms (features)
    /// </summary>
    public int ColsCount { get; private set; }

    /// <summary>
    /// Prefix sums of per-document term counts:
    /// [0] = 0, [1] = [0] + count(1), [2] = [1] + count(2), ...
    /// </summary>
    public List<int> Rows;

    /// <summary>
    /// Term ids; together with Rows this separates the documents
    /// </summary>
    public List<int> Cols;

    /// <summary>
    /// One-to-one with Cols: the term's frequency within that document
    /// </summary>
    public List<int> Vals;

    /// <summary>
    /// Each document's category, aligned with Rows
    /// </summary>
    public List<int> Categories;

    public Matrix(List<int> rows, List<int> cols, List<int> vals, List<int> categories)
    {
        this.Rows = rows;
        this.Cols = cols;
        this.Vals = vals;
        this.Categories = categories;
        if (rows != null && rows.Count > 0)
            this.RowsCount = rows.Count - 1;
        if (cols != null && cols.Count > 0)
            this.ColsCount = cols.Max() + 1;
    }

    private List<string> featureWords;
    public List<string> FeatureWords
    {
        get
        {
            if (Cols != null)
            {
                featureWords = new List<string>();
                Cols.ForEach(col => featureWords.Add(GlobalInfo.Instance.IdToTerm[col]));
            }
            return featureWords;
        }
    }
}
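The Rows/Cols/Vals triple is essentially compressed sparse row (CSR) storage. A small Python illustration (toy data, not the C# code) of how the three lists encode per-document term frequencies:

```python
# Two documents: doc 0 contains term 0 twice and term 3 once;
# doc 1 contains term 1 once and term 3 four times.
rows = [0, 2, 4]     # prefix sums: doc i owns entries rows[i] .. rows[i+1]-1
cols = [0, 3, 1, 3]  # term ids, sorted within each document
vals = [2, 1, 1, 4]  # term frequencies, aligned with cols

def doc_terms(i):
    """Return {termId: frequency} for document i."""
    return dict(zip(cols[rows[i]:rows[i + 1]], vals[rows[i]:rows[i + 1]]))
```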

 

Make sure you understand what Rows and Cols represent in Matrix. Now let's see how the matrix is generated; the code is fairly long:

public Matrix CreateTrainMatrix(IRawTextSource textSource)
{
    var rows = new List<int>();
    rows.Add(0);
    var cols = new List<int>();
    var vals = new List<int>();
    var categories = new List<int>();
    // Pangu segmenter
    var segment = new PanguSegment();

    while (!textSource.IsEnd)
    {
        var rawText = textSource.GetNextRawText();

        if (rawText != null)
        {
            int classId;

            // handle the category
            if (GlobalInfo.Instance.ClassToId.ContainsKey(rawText.Category))
            {
                classId = GlobalInfo.Instance.ClassToId[rawText.Category];
                GlobalInfo.Instance.ClassToDocCount[classId] += 1;
            }
            else
            {
                classId = GlobalInfo.Instance.ClassToId.Count;
                GlobalInfo.Instance.ClassToId.Add(rawText.Category, classId);
                GlobalInfo.Instance.IdToClass.Add(classId, rawText.Category);
                GlobalInfo.Instance.ClassToDocCount.Add(classId, 1);
            }

            categories.Add(classId);

            var text = rawText.Text;

            // segment
            var wordList = segment.DoSegment(text);

            // remove stop words
            StopWordsHandler.RemoveStopWord(wordList);
            var partCols = new List<int>();
            var termFres = new Dictionary<int, int>();
            wordList.ForEach(word =>
            {
                int termId;
                if (!GlobalInfo.Instance.TermToId.ContainsKey(word))
                {
                    termId = GlobalInfo.Instance.IdToTerm.Count;
                    GlobalInfo.Instance.TermToId.Add(word, termId);
                    GlobalInfo.Instance.IdToTerm.Add(termId, word);
                }
                else
                {
                    termId = GlobalInfo.Instance.TermToId[word];
                }

                // partCols records the term ids
                if (!partCols.Contains(termId))
                {
                    partCols.Add(termId);
                }

                // termFres records how often each term id occurs
                if (!termFres.ContainsKey(termId))
                {
                    termFres[termId] = 1;
                }
                else
                {
                    termFres[termId] += 1;
                }
            });

            partCols.Sort();
            partCols.ForEach(col =>
            {
                cols.Add(col);
                vals.Add(termFres[col]);
                if (!GlobalInfo.Instance.IdToDocCount.ContainsKey(col))
                {
                    GlobalInfo.Instance.IdToDocCount.Add(col, 1);
                }
                else
                {
                    GlobalInfo.Instance.IdToDocCount[col] += 1;
                }
            });
            // fill rows: rows records the cumulative term count of the first n documents
            rows.Add(rows[rows.Count - 1] + partCols.Count);
        }
    }

    // fill GlobalInfo's IdToIdf. A term's IDF is the logarithm of the total
    // document count divided by the number of documents containing the term.
    // (Cast to double first; the original divided two ints, truncating the quotient.)
    foreach (var termId in GlobalInfo.Instance.TermToId.Values)
    {
        GlobalInfo.Instance.IdToIdf[termId] =
            Math.Log((double)(rows.Count - 1) / (GlobalInfo.Instance.IdToDocCount[termId] + 1));
    }

    this.Save();

    this.IsTrain = true;

    return new Matrix(rows, cols, vals, categories);
}
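The IDF filled in at the end of training follows the usual log(N / (df + 1)) form, where N is the total document count and df the number of documents containing the term. A minimal sketch of just that formula (illustrative only):

```python
import math

# IDF of a term appearing in doc_freq of n_docs documents, matching the
# formula used at the end of training (note the +1 in the denominator).
def idf(n_docs, doc_freq):
    return math.log(n_docs / (doc_freq + 1))
```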

 

Feature Dimensionality Reduction

Choosing suitable features matters a great deal for classification accuracy; the C# version uses chi-square feature selection.

Chi-square formula:
t: term
c: category

                  N * (AD - CB)^2
X^2(t, c) = --------------------------
             (A+C)(B+D)(A+B)(C+D)

A, B, C, D are document counts:
A: belongs to c, includes t
B: does not belong to c, includes t
C: belongs to c, does not include t
D: does not belong to c, does not include t

B = t's doc-count - A
C = c's doc-count - A
D = N - A - B - C

The score of t can then be computed in one of two ways:
X^2(t) = sigma_i p(ci) X^2(t, ci)   (avg)
X^2(t) = max_c { X^2(t, c) }        (max)
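As a sanity check, the statistic can be computed directly from the four counts. This small Python helper mirrors the formula (counts are made up for illustration):

```python
# Chi-square of a (term, category) pair from the four document counts
# defined above; N = A + B + C + D.
def chi_square(a, b, c, d):
    n = a + b + c + d
    num = n * (a * d - c * b) ** 2
    den = (a + c) * (b + d) * (a + b) * (c + d)
    return num / den
```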

Here is the corresponding code. Once it finishes, the selected feature words are written to the log file:

/// <summary>
/// Training.
/// Chi-square formula:
/// t: term
/// c: category
///
///                   N * (AD - CB)^2
/// X^2(t, c) = --------------------------
///              (A+C)(B+D)(A+B)(C+D)
///
/// A, B, C, D are document counts:
/// A: belongs to c, includes t
/// B: does not belong to c, includes t
/// C: belongs to c, does not include t
/// D: does not belong to c, does not include t
///
/// B = t's doc-count - A
/// C = c's doc-count - A
/// D = N - A - B - C
/// The score of t can be calculated by either of:
/// X^2(t) = sigma_i p(ci) X^2(t, ci)   (avg)
/// X^2(t) = max_c { X^2(t, c) }        (max)
/// </summary>
/// <param name="matrix"></param>
public void TrainFilter(Matrix matrix)
{
    if (matrix.RowsCount != matrix.Categories.Count)
    {
        throw new Exception("ERROR! matrix.RowsCount should equal matrix.Categories.Count");
    }

    var distinctCategories = matrix.Categories.Distinct().ToList();
    distinctCategories.Sort();

    // create a table storing X^2(t, c) and a table storing
    // A (belongs to c and includes t): two 2-D arrays
    ChiTable = new List<List<double>>();
    for (var i = 0; i < distinctCategories.Count; i++)
    {
        var data = new List<double>();
        for (var j = 0; j < matrix.ColsCount; j++)
        {
            data.Add(0);
        }
        ChiTable.Add(data);
    }

    // ATable[category][term] = count
    // (Build fresh inner lists: the original shared ChiTable's inner lists
    // by reference, so filling ATable silently overwrote ChiTable too.)
    ATable = ChiTable.Select(row => new List<double>(row)).ToList();

    for (var row = 0; row < matrix.RowsCount; row++)
    {
        for (var col = matrix.Rows[row]; col < matrix.Rows[row + 1]; col++)
        {
            var categoryId = matrix.Categories[row];
            var termId = matrix.Cols[col];
            ATable[categoryId][termId] += 1;
        }
    }

    // total document count
    var n = matrix.RowsCount;

    // compute chi-square
    for (var t = 0; t < matrix.ColsCount; t++)
    {
        for (var cc = 0; cc < distinctCategories.Count; cc++)
        {
            // docs in category cc that contain term t. Index by t directly:
            // the original used matrix.Cols[t], confusing sparse-array
            // positions with term ids.
            var a = ATable[distinctCategories[cc]][t];
            var b = GlobalInfo.Instance.IdToDocCount[t] - a;                         // contain t, not in cc
            var c = GlobalInfo.Instance.ClassToDocCount[distinctCategories[cc]] - a; // in cc, without t
            var d = n - a - b - c;                                                   // neither in cc nor contain t
            // get X^2(t, c); the +1 terms avoid division by zero
            var numerator = n * (a * d - c * b) * (a * d - c * b) + 1;
            var denominator = (a + c) * (b + d) * (a + b) * (c + d) + 1;
            ChiTable[distinctCategories[cc]][t] = numerator / denominator;
        }
    }

    // chiScore[t][0] = score, chiScore[t][1] = column index
    var chiScore = new List<List<double>>();
    for (var i = 0; i < matrix.ColsCount; i++)
    {
        var c = new List<double>();
        for (var j = 0; j < 2; j++)
        {
            c.Add(0);
        }
        chiScore.Add(c);
    }

    // with the "avg" method, the final score is X^2(t) = sigma p(ci) X^2(t, ci),
    // where p(ci) is the class prior
    if (this.Method == "avg")
    {
        // build the class priors: priorC[category] = categoryCount / n
        var priorC = new double[distinctCategories.Count + 1];
        for (var i = 0; i < distinctCategories.Count; i++)
        {
            priorC[distinctCategories[i]] = (double)GlobalInfo.Instance.ClassToDocCount[distinctCategories[i]] / n;
        }

        // compute the scores
        for (var t = 0; t < matrix.ColsCount; t++)
        {
            chiScore[t][1] = t;
            for (var c = 0; c < distinctCategories.Count; c++)
            {
                chiScore[t][0] += priorC[distinctCategories[c]] * ChiTable[distinctCategories[c]][t];
            }
        }
    }
    else
    {
        // method == "max": the score of t is its maximum over all categories
        for (var t = 0; t < matrix.ColsCount; t++)
        {
            chiScore[t][1] = t;
            for (var c = 0; c < distinctCategories.Count; c++)
            {
                if (chiScore[t][0] < ChiTable[distinctCategories[c]][t])
                    chiScore[t][0] = ChiTable[distinctCategories[c]][t];
            }
        }
    }

    // sort by score, highest first
    chiScore.Sort(new ScoreCompare());
    chiScore.Reverse();

    #region build the id map
    var idMap = new int[matrix.ColsCount];

    // mark un-selected feature ids
    for (var i = (int)(ClassiferSetting.FilterSetting.Rate * chiScore.Count); i < chiScore.Count; i++)
    {
        // unselected terms are marked with -1
        var termId = chiScore[i][1];
        idMap[(int)termId] = -1;
    }
    var offset = 0;
    for (var t = 0; t < matrix.ColsCount; t++)
    {
        if (idMap[t] < 0)
        {
            offset += 1;
        }
        else
        {
            idMap[t] = t - offset;
            GlobalInfo.Instance.NewIdToId[t - offset] = t;
        }
    }

    this.IdMap = new List<int>(idMap);
    #endregion

    StringBuilder stringBuilder = new StringBuilder();
    stringBuilder.AppendLine("chiSquare info:");
    stringBuilder.AppendLine("=======selected========");
    for (var i = 0; i < chiScore.Count; i++)
    {
        if (i == (int)(ClassiferSetting.FilterSetting.Rate * chiScore.Count))
        {
            stringBuilder.AppendLine("========unselected=======");
        }
        var term = GlobalInfo.Instance.IdToTerm[(int)chiScore[i][1]];
        var score = chiScore[i][0];
        stringBuilder.AppendLine(string.Format("{0} {1}", term, score));
    }
    File.WriteAllText(ClassiferSetting.FilterSetting.LogPath, stringBuilder.ToString());

    GlobalInfo.Instance.Save();

    this.Save();

    this.IsTrain = true;
}

 

The Bayes Algorithm

See the articles recommended at the beginning for the details; for our purposes, knowing P(C|X) = P(X|C)P(C)/P(X) is enough.
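A quick numeric illustration of that rule (toy numbers, not part of the port): the posterior of each class is the product of its likelihood and prior, normalized by the evidence P(X).

```python
# Bayes' rule: P(C_i|X) = P(X|C_i) P(C_i) / P(X),
# where P(X) = sum_i P(X|C_i) P(C_i).
def posterior(likelihoods, priors):
    """likelihoods[i] = P(X|C_i), priors[i] = P(C_i)."""
    joint = [l * p for l, p in zip(likelihoods, priors)]
    evidence = sum(joint)  # P(X)
    return [j / evidence for j in joint]
```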

Here is the implementation:

public List<List<double>> vTable { get; set; }

public List<double> Prior { get; set; }

public void Train(Matrix matrix)
{
    if (matrix.RowsCount != matrix.Categories.Count)
    {
        throw new Exception("ERROR! matrix.RowsCount should equal matrix.Categories.Count");
    }

    // calculate the prior of each class
    // 1. init cPrior
    var distinctCategories = matrix.Categories.Distinct().ToList();
    distinctCategories.Sort();
    var cPrior = new double[distinctCategories.Count + 1];

    // 2. fill cPrior (raw document counts for now)
    matrix.Categories.ForEach(classid => cPrior[classid] += 1);

    // calculate the likelihood of each term
    // 1. init vTable: vTable[termId][category]
    vTable = new List<List<double>>();
    for (var i = 0; i < matrix.ColsCount; i++)
    {
        var data = cPrior.Select(t => 0d).ToList();
        vTable.Add(data);
    }

    // 2. fill vTable
    for (var i = 0; i < matrix.RowsCount; i++)
    {
        for (var j = matrix.Rows[i]; j < matrix.Rows[i + 1]; j++)
        {
            vTable[matrix.Cols[j]][matrix.Categories[i]] += 1;
        }
    }

    // normalize vTable: P(x|c) = term count / class document count
    for (var i = 0; i < matrix.ColsCount; i++)
    {
        for (var j = 0; j < cPrior.Length; j++)
        {
            if (cPrior[j] > 1e-10)
                vTable[i][j] /= cPrior[j];
        }
    }

    // normalize cPrior: P(C) = class count / total count
    for (var i = 0; i < cPrior.Length; i++)
    {
        cPrior[i] /= matrix.Categories.Count;
    }

    this.Prior = new List<double>(cPrior);

    this.IsTrain = true;

    this.Save();
}

 

Prediction

Quoting the original author:

PyMining's training and testing can run independently: you can train a model first and test later whenever needed, so some data produced during training (for example the chi-square filter's blacklist) is saved to files. To run the test program on its own, refer to the code below. After calling NaiveBayes.Test, the returned resultY is an m × 1 matrix (m being the number of test documents) giving the label (0, 1, 2, 3, ...) the model assigns to each test document, and precision is the test accuracy.

 

Prediction first builds a matrix, much as during training:

public Matrix CreatePredictSample(string text)
{
    if (!this.IsTrain)
    {
        throw new Exception("Please train the model first");
    }

    // Pangu segmenter
    var segment = new PanguSegment();
    // segment
    var wordList = segment.DoSegment(text);

    // remove stop words
    StopWordsHandler.RemoveStopWord(wordList);
    var cols = new List<int>();
    var vals = new List<int>();
    var partCols = new List<int>();
    var termFres = new Dictionary<int, int>();
    wordList.ForEach(word =>
    {
        int termId;
        if (GlobalInfo.Instance.TermToId.ContainsKey(word))
        {
            termId = GlobalInfo.Instance.TermToId[word];

            if (!partCols.Contains(termId))
                partCols.Add(termId);

            // termFres records how often each term id occurs
            if (!termFres.ContainsKey(termId))
            {
                termFres[termId] = 1;
            }
            else
            {
                termFres[termId] += 1;
            }
        }
    });

    partCols.Sort();
    partCols.ForEach(col =>
    {
        cols.Add(col);
        vals.Add(termFres[col]);
    });

    return new Matrix(null, cols, vals, null);
}



The matrix is then reduced, keeping only the terms selected by the chi-square filter as features:

public void SampleFilter(Matrix matrix)
{
    if (!this.IsTrain)
    {
        throw new Exception("Please train the model first");
    }

    // filter the sample
    var newCols = new List<int>();
    var newVals = new List<int>();
    for (var c = 0; c < matrix.Cols.Count; c++)
    {
        if (IdMap[matrix.Cols[c]] >= 0)
        {
            newCols.Add(matrix.Cols[c]);
            newVals.Add(matrix.Vals[c]);
        }
    }
    matrix.Vals = newVals;
    matrix.Cols = newCols;
}


Finally, the selected features are handed to the Bayes algorithm, and the highest-scoring class is taken as the result:

/// <summary>
/// Test a sample
/// </summary>
/// <param name="matrix"></param>
/// <returns></returns>
public string TestSample(Matrix matrix)
{
    var targetP = new List<double>();
    var maxP = -1000000000d;
    var best = -1;
    // find the class maximizing P(C) * P(X|C)
    for (var target = 0; target < this.Prior.Count; target++)
    {
        var curP = 100D; // scale up by 100 to delay underflow
        curP *= this.Prior[target];

        for (var c = 0; c < matrix.Cols.Count; c++)
        {
            if (this.vTable[matrix.Cols[c]][target] == 0)
            {
                curP *= 1e-7; // smooth terms unseen in this class
            }
            else
            {
                curP *= vTable[matrix.Cols[c]][target];
            }
        }
        targetP.Add(curP);
        if (curP > maxP)
        {
            best = target;
            maxP = curP;
        }
    }

    return GlobalInfo.Instance.IdToClass[best];
}
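One caveat: multiplying many small probabilities risks floating-point underflow even with the ×100 scaling and the 1e-7 smoothing. A common alternative, not used in this port, is to sum log-probabilities instead, which cannot underflow and preserves the argmax:

```python
import math

# Log-space scoring sketch: log P(C) + sum of log P(t|C) over the
# observed terms, with a floor for terms unseen in class C.
def log_score(prior, term_probs, floor=1e-7):
    """prior = P(C); term_probs = P(t|C) for each observed term (0 allowed)."""
    s = math.log(prior)
    for p in term_probs:
        s += math.log(p if p > 0 else floor)
    return s
```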


Saving the Model

Computing the model can take quite a while, especially on a large training set, so it pays to save the trained model.

Here is the code that saves the Bayes model:

/// <summary>
/// Naive Bayes model
/// </summary>
[Serializable]
public class NaiveBayesModel
{
    public List<List<double>> vTable { get; set; }
    public List<double> Prior { get; set; }
}

public override void Save()
{
    try
    {
        var model = new NaiveBayesModel { vTable = this.vTable, Prior = this.Prior };
        SerializeHelper helper = new SerializeHelper();
        helper.ToBinaryFile(model, ClassiferSetting.NaiveBayesSetting.ModelPath);
    }
    catch
    {
        Console.WriteLine("Failed to save the Naive Bayes model");
    }
}

public override void Load()
{
    try
    {
        Console.WriteLine("Loading the Naive Bayes model...");
        SerializeHelper helper = new SerializeHelper();
        var model = (NaiveBayesModel)helper.FromBinaryFile<NaiveBayesModel>(ClassiferSetting.NaiveBayesSetting.ModelPath);
        this.vTable = model.vTable;
        this.Prior = model.Prior;
    }
    catch
    {
        Console.WriteLine("Failed to load the Naive Bayes model");
    }
}

 

Source code download (link in the original post); please supply your own data.

 

Comments and questions are welcome.

