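Today we will implement part of the word-memorization feature. Before getting to the main topic, here is the download link for the app again: http://apk.91.com/Soft/Android/com.carlos.yueci-4.html
My lab has recently taken on new work and my time will be tighter, so I need to speed up progress on this project.
First, here is the feature analysis copied over from Part 2 of this series: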
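Feature 2: memorizing words.
Implementation: this needs a second database, the word bank for memorization. We need a TXT file holding the words; parsing it extracts the words to memorize and stores them in the database, after which words are popped up according to certain rules.
Techniques used:
1) A database, similar to the database work in the earlier parts;
2) Parsing the words in the TXT file, using string-parsing functions;
3) A word state machine: an algorithm that pops up words according to certain rules and handles the memorization actions. (This one is genuinely tricky.)
4) File browsing: a simple file browser for locating the txt word source file on the SD card and importing it into the word bank. This is a fairly self-contained feature.
Today's main work is to implement the database and parse the txt word list; then, space permitting, I will analyze the memorization algorithm of the 拓詞 app.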
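I. The database.
To keep it from conflicting with the database used by the dictionary feature, we create a separate DatabaseHelper instead of adding more tables to a single database. The two then operate without interfering with each other, which makes mistakes less likely.
The database created here records the words to memorize, their definitions, how many times each was answered correctly and incorrectly, the mastery level, and so on. I borrowed the "mastery level" idea from 拓詞: the mastery level determines how often a word comes up.
First, create DataBaseHelper, a subclass of SQLiteOpenHelper: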
package com.carlos.database;

import android.content.ContentValues;
import android.content.Context;
import android.database.Cursor;
import android.database.sqlite.SQLiteDatabase;
import android.database.sqlite.SQLiteDatabase.CursorFactory;
import android.database.sqlite.SQLiteOpenHelper;

public class DataBaseHelper extends SQLiteOpenHelper {

    public Context mContext = null;
    // The database name doubles as the table name in the queries below,
    // so it has to match the table created in onCreate ("glossary").
    public String tableName = null;
    public static int VERSION = 1;

    public SQLiteDatabase dbR = null;
    public SQLiteDatabase dbW = null;

    public DataBaseHelper(Context context, String name, CursorFactory factory, int version) {
        super(context, name, factory, version);
        mContext = context;
        tableName = name;
        dbR = this.getReadableDatabase();
        dbW = this.getWritableDatabase();
    }

    public DataBaseHelper(Context context, String name, CursorFactory factory) {
        this(context, name, factory, VERSION);
    }

    public DataBaseHelper(Context context, String name) {
        this(context, name, null);
    }

    @Override
    public void onCreate(SQLiteDatabase db) {
        db.execSQL("create table glossary(word text,interpret text,"
                + "right int,wrong int,grasp int,learned int)");
    }

    @Override
    public void onUpgrade(SQLiteDatabase arg0, int arg1, int arg2) {
        // No schema migration yet.
    }

    /**
     * Inserts a word, or resets an existing one.
     *
     * @param word      the word
     * @param interpret its translation
     * @param overWrite whether to overwrite an existing entry for this word
     * @return true if a row was inserted or overwritten, false if an
     *         existing row was left untouched
     */
    public boolean insertWordInfoToDataBase(String word, String interpret, boolean overWrite) {
        Cursor cursor = dbR.query(tableName, new String[]{"word"}, "word=?",
                new String[]{word}, null, null, "word");
        boolean exists = cursor.moveToNext();
        cursor.close();

        // A fresh or overwritten entry always starts with zeroed statistics.
        ContentValues values = new ContentValues();
        values.put("interpret", interpret);
        values.put("right", 0);
        values.put("wrong", 0);
        values.put("grasp", 0);
        values.put("learned", 0);

        if (exists) {
            if (!overWrite) {
                return false;
            }
            dbW.update(tableName, values, "word=?", new String[]{word});
        } else {
            values.put("word", word);
            dbW.insert(tableName, null, values);
        }
        return true;
    }
}
db.execSQL("create table glossary(word text,interpret text,right int,wrong int,grasp int,learned int)");
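word: the word;
interpret: its translation;
right: number of correct answers;
wrong: number of wrong answers;
grasp: mastery level;
learned: flags whether the word has already been studied.
The class also adds a method, insertWordInfoToDataBase, that external code can call to import a word and its translation into the database; the overWrite parameter controls whether an existing entry for that word is overwritten.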
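The return contract of insertWordInfoToDataBase can be illustrated without Android. The sketch below is only a model: GlossaryModel and its methods are hypothetical names, and a plain Map stands in for the glossary table. It ignores the counter columns, which the real method also resets to 0 on insert or overwrite.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for the glossary table, keyed by word. */
final class GlossaryModel {
    private final Map<String, String> interpretByWord = new HashMap<>();

    /** Returns true if the entry was inserted or overwritten, false if it was kept. */
    boolean insert(String word, String interpret, boolean overWrite) {
        if (interpretByWord.containsKey(word) && !overWrite) {
            return false; // existing entry left untouched
        }
        interpretByWord.put(word, interpret); // new entry, or overwrite
        return true;
    }

    /** Returns the stored translation, or null if the word is unknown. */
    String interpretOf(String word) {
        return interpretByWord.get(word);
    }

    public static void main(String[] args) {
        GlossaryModel glossary = new GlossaryModel();
        System.out.println(glossary.insert("wind", "n. first", false));
        System.out.println(glossary.insert("wind", "n. second", false));
        System.out.println(glossary.insert("wind", "n. third", true));
        System.out.println(glossary.interpretOf("wind"));
    }
}
```

Passing overWrite=false is the safer choice when re-importing a word list, since it keeps the statistics of words that were already studied.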
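Using the database then works much as it did for the dictionary feature: instantiate DataBaseHelper, call getReadableDatabase() and getWritableDatabase(), and run insert, delete, update and query operations on the result.
II. Parsing the word TXT file.
You may have been wondering in what form the words actually get imported into the word bank. Importing from txt was really a matter of circumstance: the word list I was studying from happened to be a PDF, so I converted it to txt and then worked out a way to parse it. The txt content looks like this: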
hello int.你好,哈嘍 beat v.打敗,戰勝happy a.歡樂幸福
number n.數字 alphabetical a.按字母表順序的 wind n.風
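Each line may contain several words, and there must be one or more spaces between a word and its definition.
How do we parse it? The basic idea is to separate the words from their definitions while keeping each word paired with its own definition. My first thought was regular-expression matching, along these lines:
1. First find all the words in a line and store them in an ArrayList<String>, using the regular expression "[a-zA-Z]+[ ]+": [a-zA-Z]+ matches one or more English letters, and [ ]+ matches the one or more spaces that immediately follow. Take this line: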
hello int.你好,哈嘍 beat v.打敗,戰勝happy a.歡樂幸福
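This extracts three words: hello, beat, happy.
2. Then replace every word in the line with a marker, <S%%E>. You can design a different marker, but it should be something unusual that will not occur in the text itself. The string becomes: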
<S%%E>int.你好,哈嘍 <S%%E>v.打敗,戰勝<S%%E>a.歡樂幸福
3. Then append one more marker, <S%%E>, to the end of the string:
<S%%E>int.你好,哈嘍 <S%%E>v.打敗,戰勝<S%%E>a.歡樂幸福<S%%E>
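4. Now everything between a %E> and the following <S% must be a definition, so we match it with the regular expression %E>[^<S%%E>]+<S%. Here [^<S%%E>]+ is a negated character class: the text between %E> and <S% may not contain any of the characters <, S, %, E or >. That stops a single match from running across a marker and swallowing several definitions at once. (It also means that a definition which itself contains one of these characters, such as a capital S or E, will not be matched.) The matches are: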
%E>int.你好,哈嘍 <S%
%E>v.打敗,戰勝<S%
%E>a.歡樂幸福<S%
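Then strip the markers by keeping the substring from the 4th character through the 4th character from the end, which yields the final definitions: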
int.你好,哈嘍
v.打敗,戰勝
a.歡樂幸福
We now have each word together with its definition. Inserting them into the database created earlier imports the word information; this uses the insertWordInfoToDataBase method described above.
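The four steps can be tried outside Android. The following is a minimal sketch in plain Java that reuses the two regular expressions and the <S%%E> marker from the text; the class name LineParseSketch and the method parseLine are hypothetical, and the result is returned as a Map instead of being written to the database. Unlike the description above, it also trims the trailing space left on a definition.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: splits one word-list line into word/definition pairs. */
final class LineParseSketch {
    private static final String MARK = "<S%%E>";
    private static final Pattern WORD = Pattern.compile("[a-zA-Z]+[ ]+");
    private static final Pattern INTERPRET = Pattern.compile("%E>[^<S%%E>]+<S%");

    static Map<String, String> parseLine(String line) {
        // Step 1: collect every run of letters that is followed by spaces.
        List<String> words = new ArrayList<>();
        Matcher wordMatcher = WORD.matcher(line);
        while (wordMatcher.find()) {
            words.add(wordMatcher.group().trim());
        }

        // Steps 2 and 3: replace each word with the marker, then append one more.
        String marked = WORD.matcher(line).replaceAll(MARK) + MARK;

        // Step 4: whatever sits between "%E>" and "<S%" is a definition.
        List<String> interprets = new ArrayList<>();
        Matcher interpretMatcher = INTERPRET.matcher(marked);
        while (interpretMatcher.find()) {
            String hit = interpretMatcher.group();
            // Drop the 3-character prefix "%E>" and the 3-character suffix "<S%".
            interprets.add(hit.substring(3, hit.length() - 3).trim());
        }

        // Pair words and definitions by position, in line order.
        Map<String, String> pairs = new LinkedHashMap<>();
        int count = Math.min(words.size(), interprets.size());
        for (int i = 0; i < count; i++) {
            pairs.put(words.get(i), interprets.get(i));
        }
        return pairs;
    }
}
```

Calling LineParseSketch.parseLine on the first sample line should yield the three pairs shown above, keyed by hello, beat and happy. Note that the approach depends on definitions not containing an English word followed by a space, since such a word would be picked up in step 1.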
I wrapped this logic in a class:
package com.carlos.text_parser;

import java.util.ArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import android.content.Context;

import com.carlos.database.DataBaseHelper;

public class WordListParser {

    public DataBaseHelper dbHelper = null;
    public Context context = null;
    public String tableName = null;

    public WordListParser() {
    }

    public WordListParser(Context context, String tableName) {
        this.context = context;
        this.tableName = tableName;
        dbHelper = new DataBaseHelper(context, tableName);
    }

    /** Parses one line of the word list and stores every word/definition pair. */
    public void parse(String lineStr) {
        Pattern patternWord = Pattern.compile("[a-zA-Z]+[ ]+");
        Pattern patternInterpret = Pattern.compile("%E>[^<S%%E>]+<S%");
        Matcher matcherWord = patternWord.matcher(lineStr);
        ArrayList<String> wordList = new ArrayList<String>();
        ArrayList<String> interpretList = new ArrayList<String>();

        // Step 1: collect the words.
        while (matcherWord.find()) {
            String str = matcherWord.group();
            char[] charArray = str.toCharArray();
            if (charArray.length > 0 && charArray[0] >= 'A' && charArray[0] <= 'Z') {
                charArray[0] += ('a' - 'A');   // lower-case a capitalized first letter
                str = new String(charArray, 0, charArray.length);
            }
            wordList.add(str.trim());
        }
        if (wordList.size() <= 0) {
            return;
        }

        // Steps 2 and 3: replace every word with the marker and append one more.
        // replaceAll() resets the matcher before it starts.
        String strInterpret = matcherWord.replaceAll("<S%%E>") + "<S%%E>";

        // Step 4: extract the definitions and strip "%E>" and "<S%".
        Matcher matcherInterpret = patternInterpret.matcher(strInterpret);
        while (matcherInterpret.find()) {
            String str = matcherInterpret.group();
            interpretList.add(str.substring(3, str.length() - 3));
        }

        // Pair words and definitions by position and import them.
        int count = Math.min(wordList.size(), interpretList.size());
        for (int i = 0; i < count; i++) {
            dbHelper.insertWordInfoToDataBase(wordList.get(i), interpretList.get(i), true);
        }
    }

    // Earlier character-by-character approach, left disabled:
    //
    // public boolean isOfAnWord(int index, char[] str) {
    //     int i = index;
    //     for (; i < str.length; i++) {
    //         if (isAlpha(str[i]) == false)
    //             break;
    //     }
    //     if (i == index)
    //         return false;
    //     if (i >= str.length)
    //         return true;
    //     if (str[i] == '.')
    //         return false;
    //     return true;
    // }
    //
    // public boolean isAlpha(char ch) {
    //     return (ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z');
    // }
    //
    // public boolean isChinese(char ch) {
    //     return ch > 129;
    // }
}
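Since parse takes a single line, something has to read the txt file and hand over its lines one at a time. The following is a minimal sketch of such a driver in plain Java, under stated assumptions: WordFileFeeder and feed are hypothetical names, the Consumer<String> stands in for a call such as WordListParser.parse, and blank lines are skipped. Opening the file on the SD card is left to the caller.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.function.Consumer;

/** Hypothetical driver: feeds the lines of a word list to a line handler. */
final class WordFileFeeder {

    /**
     * Reads the source line by line and passes every non-blank line to the
     * handler. Returns the number of lines handed over. Closes the source.
     */
    static int feed(Reader source, Consumer<String> lineHandler) throws IOException {
        int handled = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // nothing to parse on a blank line
                }
                lineHandler.accept(line);
                handled++;
            }
        }
        return handled;
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for the txt file here.
        Reader source = new StringReader("first line\n\nsecond line\n");
        int handled = feed(source, line -> System.out.println("parse: " + line));
        System.out.println(handled + " lines handled");
    }
}
```

With a real parser the handler would be something like parser::parse, where parser is a WordListParser instance.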
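III. Analysis of the 拓詞 algorithm
Admittedly, 悅詞 follows 拓詞 in its approach to memorizing words, but since I built this app to make my own memorization easier, I kept only the core of the algorithm. The basic idea is as follows.
It uses the mastery-level idea and is organized as a state machine.
The state variables are:
1. process: the overall learning progress for the day. Its possible values are: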
public final static int GRASP_89=8; // mastery levels 8-9: reviewing words already studied (same for the bands below)
public final static int GRASP_67=6;
public final static int GRASP_45=4;
public final static int GRASP_23=2;
public final static int GRASP_01=0;
public final static int LEARN_NEW_WORD=10; // learning the day's new words
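2. processLearnNewWord: the sub-progress within the new-word phase. Its possible values are:
public final static int STEP_1_NEWWORD1=0; // step 1: learn 10 new words
public final static int STEP_2_REVIEW_20=1; // step 2: review 20 old words
public final static int STEP_3_NEWWORD2=2; // step 3: learn about 10 new words
public final static int STEP_4_REVIEW_6=3; // step 4: review about 6 old words
3. wordCount: the number of words already covered in the current phase.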
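When a new day starts, the app automatically resets process to GRASP_89 (how this is done will be covered later) and begins reviewing words already studied, starting with the highest mastery levels and working down. Roughly 20% of the words at levels 8-9 are reviewed, 45% of those at levels 6-7, 60% at levels 4-5, 70% at levels 2-3, and 85% at levels 0-1. These proportions can be adjusted to taste; the popWord() method in the WordBox listing, for instance, passes 0.1, 0.3, 0.4, 0.5 and 0.5. Once all the old words have been reviewed, the new-word phase begins.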
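To make the proportions concrete, here is a small sketch that turns them into per-band review counts. ReviewQuota and quota are hypothetical names, the percentages are the ones quoted in the paragraph above, and the word counts in main are made up. Rounding up is an assumption, chosen because a band is left only once the count of reviewed words reaches the band size times the proportion.

```java
/** Hypothetical helper: how many words to review from each mastery band. */
final class ReviewQuota {
    /** First grasp level of each band, in review order (highest first). */
    static final int[] BAND_START = {8, 6, 4, 2, 0};
    /** Share of each band to review, in percent, as quoted in the text. */
    static final int[] BAND_PERCENT = {20, 45, 60, 70, 85};

    /** Words to review from a band of the given size, rounded up. */
    static int quota(int wordsInBand, int percent) {
        return (wordsInBand * percent + 99) / 100;
    }

    public static void main(String[] args) {
        int[] wordsInBand = {50, 40, 30, 20, 10}; // made-up band sizes
        for (int i = 0; i < BAND_START.length; i++) {
            System.out.println("grasp " + BAND_START[i] + "-" + (BAND_START[i] + 1)
                    + ": review " + quota(wordsInBand[i], BAND_PERCENT[i])
                    + " of " + wordsInBand[i]);
        }
    }
}
```

Integer percentages are used to avoid floating-point rounding surprises at the band boundaries.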
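In the new-word phase, you first learn ten new words, then review about 20 words at mastery levels 0-1, then learn another 10 new words, then review about 6 words at levels 0-1, and then the cycle starts over. The new-word phase keeps looping through these four steps until the day's task is complete.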
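The four-step loop is a small cyclic state machine. The sketch below shows just that cycle in plain Java; NewWordCycle, Step and onWordShown are hypothetical names, and the targets are fixed at 10, 20, 10 and 6, whereas the text speaks of approximate counts.

```java
/** Hypothetical sketch of the four-step new-word cycle. */
final class NewWordCycle {

    enum Step {
        NEW_WORDS_1(10), REVIEW_20(20), NEW_WORDS_2(10), REVIEW_6(6);

        final int target; // words to show before moving on

        Step(int target) {
            this.target = target;
        }

        /** The step that follows this one; the last step wraps to the first. */
        Step next() {
            return values()[(ordinal() + 1) % values().length];
        }
    }

    private Step step = Step.NEW_WORDS_1;
    private int wordCount = 0; // words shown in the current step

    /**
     * Records that one word was shown. Returns the step that word belonged
     * to, and advances to the next step once the target is reached.
     */
    Step onWordShown() {
        Step current = step;
        wordCount++;
        if (wordCount >= current.target) {
            step = current.next();
            wordCount = 0;
        }
        return current;
    }

    Step currentStep() {
        return step;
    }
}
```

An enum keeps the step order and the per-step target in one place; the int constants used in this series achieve the same thing with a switch statement.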
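To handle words answered incorrectly, I added a queue, LinkedList<WordInfo> wrongWordList, which holds the missed words. Once the number of missed words reaches a certain value, the state machine automatically enters the wrong-word state: the program pops the recently missed words in order until the queue is empty, and then the state machine returns to its previous state. The wrong-word state is controlled by the state variable processWrong, which takes priority over process: while processWrong is true the state machine serves missed words first, and once processWrong goes back to false it resumes the original process.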
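The queue behaviour described above can be sketched on its own. This is an illustration under assumptions, not the app's code: WrongWordReview and its methods are hypothetical names, plain strings stand in for WordInfo, and the switch into the wrong-word state is triggered by a simple size threshold.

```java
import java.util.LinkedList;

/** Hypothetical sketch of the missed-word queue and its processWrong flag. */
final class WrongWordReview {
    private final LinkedList<String> wrongWords = new LinkedList<>();
    private final int threshold;          // assumed trigger: queue size
    private boolean processWrong = false; // true while missed words are served

    WrongWordReview(int threshold) {
        this.threshold = threshold;
    }

    /** Queues a missed word and enters the wrong-word state at the threshold. */
    void recordWrong(String word) {
        wrongWords.offer(word);
        if (wrongWords.size() >= threshold) {
            processWrong = true;
        }
    }

    boolean isReviewingWrongWords() {
        return processWrong;
    }

    /**
     * Pops the oldest missed word (first in, first out), or null if none is
     * queued. Leaves the wrong-word state once the queue is drained.
     */
    String popWrongWord() {
        String word = wrongWords.poll();
        if (wrongWords.isEmpty()) {
            processWrong = false;
        }
        return word;
    }
}
```

A caller would check isReviewingWrongWords() first and only fall back to the normal process when it returns false, which is what gives the flag its priority.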
That is the basic idea. Below it is wrapped in a class, WordBox:
package com.carlos.wordcontainer;

import java.util.ArrayList;
import java.util.LinkedList;
import java.util.Random;

import android.content.ContentValues;
import android.content.Context;
import android.database.Cursor;
import android.database.sqlite.SQLiteDatabase;

import com.carlos.database.DataBaseHelper;

public class WordBox {

    public Context context = null;
    public String tableName = null;
    private DataBaseHelper dbHelper = null;
    private SQLiteDatabase dbR = null, dbW = null;

    public final static int GRASP_89 = 8;
    public final static int GRASP_67 = 6;
    public final static int GRASP_45 = 4;
    public final static int GRASP_23 = 2;
    public final static int GRASP_01 = 0;
    public final static int LEARN_NEW_WORD = 10;

    public final static int LEARNED = 1;
    public final static int UNLEARNED = 0;

    public static int process = GRASP_89;        // overall learning progress
    public static int wordCount = 0;             // words covered in the current phase
    public static boolean processWrong = false;  // whether to start serving missed words

    public final static int STEP_1_NEWWORD1 = 0;
    public final static int STEP_2_REVIEW_20 = 1;
    public final static int STEP_3_NEWWORD2 = 2;
    public final static int STEP_4_REVIEW_6 = 3;
    public static int processLearnNewWord = STEP_1_NEWWORD1;

    public LinkedList<WordInfo> wrongWordList = null;
    public Random rand = null;

    public WordBox(Context context, String tableName) {
        this.context = context;
        this.tableName = tableName;
        dbHelper = new DataBaseHelper(context, tableName);
        dbR = dbHelper.getReadableDatabase();
        dbW = dbHelper.getWritableDatabase();
        wrongWordList = new LinkedList<WordInfo>();
        rand = new Random();
    }

    @Override
    protected void finalize() throws Throwable {
        dbR.close();
        dbW.close();
        dbHelper.close();
        super.finalize();
    }

    public void removeWordFromDatabase(String word) {
        dbW.delete(tableName, "word=?", new String[]{word});
    }

    /**
     * Returns the number of words at a given mastery level.
     * Multiple conditions in the WHERE clause must be joined with "and" or "or".
     */
    public int getWordCountByGrasp(int grasp, int learned) {
        Cursor cursor = dbR.query(tableName, new String[]{"word"}, "grasp=? and learned=?",
                new String[]{grasp + "", learned + ""}, null, null, null);
        int count = cursor.getCount();
        cursor.close();
        return count;
    }

    /** Returns the percentage of words whose mastery level is between 3 and 10. */
    public int getTotalLearnProgress() {
        int learnCount = 0;
        int totalCount = 0;
        Cursor cursor = dbR.query(tableName, new String[]{"word"},
                "grasp=? or grasp=? or grasp=? or grasp=? or grasp=? or grasp=? or grasp=? or grasp=?",
                new String[]{"3", "4", "5", "6", "7", "8", "9", "10"}, null, null, null);
        learnCount = cursor.getCount();
        Cursor cursorTotal = dbR.query(tableName, new String[]{"word"}, "word like?",
                new String[]{"%"}, null, null, null);
        totalCount = cursorTotal.getCount();
        cursor.close();
        cursorTotal.close();
        if (totalCount == 0) {
            return 0;
        }
        return (int) (((float) learnCount / (float) totalCount) * 100);
    }

    /** Returns the number of words whose mastery level is still below 3. */
    public int getWordCountOfUnlearned() {
        Cursor cursorTotal = dbR.query(tableName, new String[]{"word"}, "word like?",
                new String[]{"%"}, null, null, null);
        int totalCount = cursorTotal.getCount();
        Cursor cursor = dbR.query(tableName, new String[]{"word"},
                "grasp=? or grasp=? or grasp=? or grasp=? or grasp=? or grasp=? or grasp=? or grasp=?",
                new String[]{"3", "4", "5", "6", "7", "8", "9", "10"}, null, null, null);
        int learnCount = cursor.getCount();
        cursor.close();
        cursorTotal.close();
        return totalCount - learnCount;
    }

    /**
     * Picks a random word whose mastery level lies in [fromGrasp, toGrasp].
     * The learned flag separates level-0 words that were studied from those
     * that were never shown.
     */
    public WordInfo getWordByGraspByRandom(int fromGrasp, int toGrasp, int learned) {
        int totalCount = 0, temp = 0;
        ArrayList<Integer> graspsNotEmpty = new ArrayList<Integer>();
        for (int i = fromGrasp; i <= toGrasp; i++) {
            temp = getWordCountByGrasp(i, learned);
            totalCount += temp;
            if (temp > 0) {
                graspsNotEmpty.add(i);   // remember the levels that have words
            }
        }
        if (totalCount <= 0) {
            // No word in the given range; callers should ideally rule out an empty table.
            return null;
        }
        int length = graspsNotEmpty.size();
        if (length <= 0) {
            return null;
        }
        // Pick a random non-empty level, then a random word within it.
        int graspInt = graspsNotEmpty.get(rand.nextInt(length));
        int count = getWordCountByGrasp(graspInt, learned);
        // The cursor starts before the first row, so move(index) with
        // index in [1, count] lands on a valid row.
        int index = rand.nextInt(count) + 1;
        Cursor cursor = dbR.query(tableName,
                new String[]{"word", "interpret", "right", "wrong", "grasp"},
                "grasp=? and learned=?", new String[]{graspInt + "", learned + ""},
                null, null, null);
        cursor.move(index);
        String word = cursor.getString(cursor.getColumnIndex("word"));
        String interpret = cursor.getString(cursor.getColumnIndex("interpret"));
        int wrong = cursor.getInt(cursor.getColumnIndex("wrong"));
        int right = cursor.getInt(cursor.getColumnIndex("right"));
        int grasp = cursor.getInt(cursor.getColumnIndex("grasp"));
        cursor.close();
        return new WordInfo(word, interpret, wrong, right, grasp);
    }

    public static int lastGetIndex = 0;

    /** Picks a random word from the whole word bank. */
    public WordInfo getWordByRandom() {
        int count = 0;
        Cursor cursor = dbR.query(tableName,
                new String[]{"word", "interpret", "right", "wrong", "grasp"},
                "word like?", new String[]{"%"}, null, null, null);
        if ((count = cursor.getCount()) <= 0) {
            cursor.close();
            return null;
        }
        // Try up to 6 times to avoid returning the same row twice in a row.
        int i = 0;
        int index = 0;
        while (i < 6) {
            index = rand.nextInt(count) + 1;
            if (index != lastGetIndex) {
                break;
            }
            i++;
        }
        lastGetIndex = index;
        cursor.move(index);
        String word = cursor.getString(cursor.getColumnIndex("word"));
        String interpret = cursor.getString(cursor.getColumnIndex("interpret"));
        int wrong = cursor.getInt(cursor.getColumnIndex("wrong"));
        int right = cursor.getInt(cursor.getColumnIndex("right"));
        int grasp = cursor.getInt(cursor.getColumnIndex("grasp"));
        cursor.close();
        return new WordInfo(word, interpret, wrong, right, grasp);
    }

    // Labels for logging the state variables.
    String[] logProcess = new String[]{"G01", "", "G23", "", "G45", "", "G67", "", "G89", "", "NEW WORD"};
    String[] logLearn = new String[]{"NEW1", "REVIEW20", "NEW2", "REVIEW6"};

    /** External entry point: called after a click to get the next word. */
    public WordInfo popWord() {
        WordInfo wordInfo = null;
        if (processWrong) {
            return getWrongWord();
        }
        // The cases fall through on purpose: when a band is exhausted,
        // getWordByAccurateGrasp returns null and the next band is tried.
        switch (process) {
            case GRASP_89: {
                if ((wordInfo = getWordByAccurateGrasp(GRASP_89, GRASP_67, 0.1)) != null)
                    return wordInfo;
            }
            case GRASP_67: {
                if ((wordInfo = getWordByAccurateGrasp(GRASP_67, GRASP_45, 0.3)) != null)
                    return wordInfo;
            }
            case GRASP_45: {
                if ((wordInfo = getWordByAccurateGrasp(GRASP_45, GRASP_23, 0.4)) != null)
                    return wordInfo;
            }
            case GRASP_23: {
                if ((wordInfo = getWordByAccurateGrasp(GRASP_23, GRASP_01, 0.5)) != null)
                    return wordInfo;
            }
            case GRASP_01: {
                if ((wordInfo = getWordByAccurateGrasp(GRASP_01, LEARN_NEW_WORD, 0.5)) != null)
                    return wordInfo;
            }
            case LEARN_NEW_WORD: {
                return learnNewWord();
            }
            default:
                break;
        }
        return null;
    }

    /** External entry point: reports whether the user got the word right. */
    public void feedBack(WordInfo wordInfo, boolean isRight) {
        if (wordInfo == null) {
            return;   // guard against a possible null pointer
        }
        String word = wordInfo.getWord();
        int right = wordInfo.getRight();
        int wrong = wordInfo.getWrong();
        int graspInt = 0;
        // Update the right/wrong counters.
        if (isRight) {
            right++;
        } else {
            wrong++;
        }
        // Mastery level is right - 2 * wrong, clamped to [0, 10].
        if (right - 2 * wrong < 0) {
            graspInt = 0;
        } else if (right - 2 * wrong > 10) {
            graspInt = 10;
        } else {
            graspInt = right - 2 * wrong;
        }
        // Update the database. update() should only touch the columns put
        // into values; handled this way for now.
        ContentValues values = new ContentValues();
        values.put("right", right);
        values.put("wrong", wrong);
        values.put("grasp", graspInt);
        values.put("learned", LEARNED);
        dbW.update(tableName, values, "word=?", new String[]{word});
        // A missed word also goes into the wrong-word queue.
        if (isRight == false) {
            wordInfo.setRight(right);
            wordInfo.setWrong(wrong);
            wordInfo.setGrasp(graspInt);
            wrongWordList.offer(wordInfo);
        }
    }

    /** Called during the new-word phase. The cases fall through on purpose. */
    public WordInfo learnNewWord() {
        WordInfo wordInfo = null;
        switch (processLearnNewWord) {
            case STEP_1_NEWWORD1: {
                if ((wordInfo = getWordByGraspByRandom(GRASP_01, GRASP_01, UNLEARNED)) == null
                        || wordCount > rand.nextInt(3) + 9) {
                    processLearnNewWord = STEP_2_REVIEW_20;
                    wordCount = 0;
                    // No unlearned word left: every word has been studied.
                    if (getWordCountByGrasp(GRASP_01, UNLEARNED) <= 0) {
                        process = GRASP_89;
                    }
                } else {
                    wordCount++;
                    return wordInfo;
                }
            }
            case STEP_2_REVIEW_20: {
                if ((wordInfo = getWordByGraspByRandom(0, 2, LEARNED)) == null) {
                    processLearnNewWord = STEP_3_NEWWORD2;
                    wordCount = 0;
                } else {
                    wordCount++;
                    if (wordCount > rand.nextInt(3) + 19) {
                        processLearnNewWord = STEP_3_NEWWORD2;
                        wordCount = 0;
                        if (wrongWordList.size() > 0)
                            processWrong = true;
                    }
                    return wordInfo;
                }
            }
            case STEP_3_NEWWORD2: {
                if ((wordInfo = getWordByGraspByRandom(GRASP_01, GRASP_01, UNLEARNED)) == null
                        || wordCount > rand.nextInt(3) + 9) {
                    processLearnNewWord = STEP_4_REVIEW_6;
                    wordCount = 0;
                } else {
                    wordCount++;
                    return wordInfo;
                }
            }
            case STEP_4_REVIEW_6: {
                if ((wordInfo = getWordByGraspByRandom(0, 2, LEARNED)) == null) {
                    processLearnNewWord = STEP_1_NEWWORD1;
                    wordCount = 0;
                    // A non-null value has to be returned here, otherwise
                    // execution reaches default and the caller gets null.
                    // The workaround is to fill the gap with a random word.
                    return getWordByRandom();
                } else {
                    wordCount++;
                    if (wordCount > rand.nextInt(3) + 5) {
                        processLearnNewWord = STEP_1_NEWWORD1;
                        wordCount = 0;
                        if (wrongWordList.size() > 0)
                            processWrong = true;
                    }
                    return wordInfo;
                }
            }
            default:
                return null;
        }
    }

    /** Called during the review phase to fetch a word from one mastery band. */
    public WordInfo getWordByAccurateGrasp(int curentGrasp, int nextGrasp, double percent) {
        int count = 0;
        if ((count = getWordCountByGrasp(curentGrasp, LEARNED)
                + getWordCountByGrasp(curentGrasp + 1, LEARNED)) <= 0
                || wordCount >= count * percent) {
            // Band empty or quota reached: move on to the next band.
            process = nextGrasp;
            wordCount = 0;
            return null;
        } else {
            wordCount++;
            // Periodically switch to missed words, provided the queue is not empty.
            if (wordCount % (rand.nextInt(2) + 19) == 0 && wrongWordList.size() > 0) {
                processWrong = true;
            }
            // Picking a single random level with
            // getWordByGraspByRandom(rand.nextInt(2) + curentGrasp, ...) could
            // return null, so both levels of the band are passed instead.
            return getWordByGraspByRandom(curentGrasp, curentGrasp + 1, LEARNED);
        }
    }

    /** Serves missed words; when called, the queue is expected to be non-empty. */
    public WordInfo getWrongWord() {
        WordInfo word = wrongWordList.poll();
        if (wrongWordList.size() <= 0) {
            processWrong = false;   // stop serving missed words
        }
        return word;
    }
}
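Some details are explained in the code comments. Callers use popWord() to get a WordInfo object, and feedBack() to report the result (right or wrong) back to the database, which updates the word's mastery level.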
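The mastery-level update inside feedBack boils down to one formula: the level is the number of right answers minus twice the number of wrong answers, clamped to the range 0 to 10. A minimal standalone sketch of that rule follows; GraspRule and grasp are hypothetical names used only for illustration.

```java
/** Hypothetical helper isolating the mastery-level rule used by feedBack. */
final class GraspRule {

    /** Mastery level: right - 2 * wrong, clamped to the range [0, 10]. */
    static int grasp(int right, int wrong) {
        return Math.max(0, Math.min(10, right - 2 * wrong));
    }

    public static void main(String[] args) {
        System.out.println(grasp(5, 1));  // 5 right, 1 wrong
        System.out.println(grasp(0, 3));  // only wrong answers
        System.out.println(grasp(20, 2)); // many right answers
    }
}
```

Because a wrong answer counts double, a word needs two further correct answers to recover from each mistake.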
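There are also methods for fetching words at random and so on; these will come up when we build the memorization UI. If anything is unclear, feel free to leave a comment.
Here is the WordInfo class that holds a word's information: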
package com.carlos.wordcontainer;

public class WordInfo {

    public String word;
    public String interpret;
    public int wrong;
    public int right;
    public int grasp;

    public WordInfo(String word, String interpret, int wrong, int right, int grasp) {
        super();
        this.word = word;
        this.interpret = interpret;
        this.wrong = wrong;
        this.right = right;
        this.grasp = grasp;
    }

    public String getWord() {
        return word;
    }

    public void setWord(String word) {
        this.word = word;
    }

    public String getInterpret() {
        return interpret;
    }

    public void setInterpret(String interpret) {
        this.interpret = interpret;
    }

    public int getWrong() {
        return wrong;
    }

    public void setWrong(int wrong) {
        this.wrong = wrong;
    }

    public int getRight() {
        return right;
    }

    public void setRight(int right) {
        this.right = right;
    }

    public int getGrasp() {
        return grasp;
    }

    public void setGrasp(int grasp) {
        this.grasp = grasp;
    }
}
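OK, that is all for this part. The algorithm side of the memorization feature is done; in the next part we will build the memorization UI. Stay tuned.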