Shell commands for working with files: keeping the last column, deleting the first line, diff, and more

Reposted article. 2013-06-28 17:40. Tags: shell awk sed diff comm grep

Delete the first line of a file: sed '1d' filename
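
As a quick check, the sketch below builds a throwaway three-line file (the file name and contents are made up for illustration) and removes its first line. Note that sed '1d' writes the result to standard output and leaves the file itself untouched; tail -n +2 is an equivalent way to print from line 2 onward.

```shell
# Throwaway sample file; name and contents are illustrative only
tmp=$(mktemp)
printf 'header\nrow1\nrow2\n' > "$tmp"

# Delete line 1; the result goes to stdout, the file itself is unchanged
without_first=$(sed '1d' "$tmp")

# Equivalent: print everything starting at line 2
with_tail=$(tail -n +2 "$tmp")

echo "$without_first"
rm -f "$tmp"
```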

Keep only the last column of a file (all other columns are dropped): awk '{print $NF}' filename
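
Keeping the last column and removing it are easy to confuse, so here is a small sketch on made-up data that shows both. $NF is the last field, so print $NF outputs only that field. The second command, which rebuilds each line from fields 1 to NF-1, is one possible way to drop the last column and is not taken from the original post.

```shell
# Made-up sample resembling a file listing: the last field is a path
sample='rw- 3 user 120 /data/a.log
rw- 3 user 455 /data/b.log'

# $NF is the last field, so this keeps only the last column
last_col=$(printf '%s\n' "$sample" | awk '{print $NF}')

# To drop the last column instead, rebuild the line from fields 1..NF-1
# (note: this joins fields with single spaces, collapsing original spacing)
no_last_col=$(printf '%s\n' "$sample" | awk '{
  out = ""
  for (i = 1; i < NF; i++) out = out (i > 1 ? " " : "") $i
  print out
}')

echo "$last_col"
echo "$no_last_col"
```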

Remove duplicate lines with awk: awk '{if (!seen[$0]++) {print $0;}}' filename
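
The expression seen[$0]++ evaluates to 0 the first time a line appears and to a positive count afterwards, so !seen[$0]++ is true only for first occurrences. Unlike sort -u, this keeps the original line order. A small sketch on made-up input, using the shorter pattern-only form of the same command:

```shell
# Made-up input with repeated lines
input='b
a
b
c
a'

# Print a line only the first time it is seen; input order is preserved
deduped=$(printf '%s\n' "$input" | awk '!seen[$0]++')

echo "$deduped"
```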

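Two ways to compare files:

1) comm -3 --nocheck-order file1 file2: prints the lines that appear in only one of the two files

2) grep -v -f file1 file2: prints the lines of file2 that match no line of file1 (each line of file1 is used as a pattern)

And of course there is diff file1 file2.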
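
The three tools answer slightly different questions, which the following sketch illustrates on two made-up sorted files. Two details are worth keeping in mind: comm expects sorted input (--nocheck-order is a GNU option that only disables the check), and grep -f treats each line of file1 as a regular expression that may match anywhere in a line, so -F -x is added here as an assumption that whole lines should be compared literally.

```shell
# Two small sorted files; names and contents are illustrative only
dir=$(mktemp -d)
printf 'a\nb\nc\n' > "$dir/file1"
printf 'b\nc\nd\n' > "$dir/file2"

# comm -3 hides lines common to both files; lines found only in file2
# are indented with a tab
comm_out=$(comm -3 "$dir/file1" "$dir/file2")

# Lines of file2 that are not in file1 (-F: literal strings, -x: whole line)
only_in_2=$(grep -v -F -x -f "$dir/file1" "$dir/file2")

# diff exits with status 1 when the files differ, so capture it explicitly
diff_status=0
diff "$dir/file1" "$dir/file2" > "$dir/diff_out" || diff_status=$?

echo "$comm_out"
echo "$only_in_2"
rm -rf "$dir"
```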

Here is part of a shell script I wrote yesterday:

#!/bin/bash
# Working locations (the same paths as before, factored into variables)
base=/home/spamdetect/changxiaojia/workspace/finance/same_similar_news_mining
mid=$base/mid_files
hadoop=/opt/hadoop/program/bin/hadoop

date_time=$(date +'%H_%M_%S')
yesterday=$(date -d "-1 day" +'%Y_%m_%d')
today=$(date +'%Y_%m_%d')
date_day_time=$(date +'%Y_%m_%d_%H_%M_%S')

# -p: do not fail if the directory already exists
mkdir -p "$base/same_similiar_log/$today"

# Collect the input files that have not been processed yet
today_input=/home/crawler/petabyte/crawllog/news_data/$today
yesterday_input=/home/crawler/petabyte/crawllog/news_data/$yesterday

"$hadoop" fs -ls "$yesterday_input/" > "$mid/all_get"
"$hadoop" fs -ls "$today_input/" >> "$mid/all_get"

# Drop the first line of the combined listing
sed '1d' "$mid/all_get" > "$mid/all_get_without_first_line"

# Keep only the last column of each listing line
awk '{print $NF}' "$mid/all_get_without_first_line" > "$mid/all_input"

# Alternative comparison with comm, kept for reference:
#comm -3 --nocheck-order "$mid/all_input" "$mid/input_done" > "$mid/today_diff"

# Lines of all_input that match no pattern in input_done, i.e. the new inputs
grep -v -f "$mid/input_done" "$mid/all_input" > "$mid/today_diff"

awk '{print $NF}' "$mid/today_diff" > "$mid/today_new_input"

mv "$mid/all_input" "$mid/input_done"


# Build the comma-separated input list for the same/similar news computation
# (each entry is prefixed with a comma, so the result starts with one)
inputfile1=""
while read -r line
do
  inputfile1=$inputfile1,${line}
done < "$mid/input_done"
echo "$inputfile1"
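
The loop at the end of the script produces a string that starts with a comma, because the separator is prepended on every iteration. If a list without the leading comma is wanted, paste -s -d, joins all lines of a file in one step. A minimal sketch with a made-up list of paths (the paths are placeholders, not the real crawl log layout):

```shell
# Made-up list of input paths, one per line
list=$(mktemp)
printf '/data/2013_06_27/part-0\n/data/2013_06_28/part-0\n' > "$list"

# Same loop as in the script: the result starts with a comma
joined=""
while read -r line
do
  joined=$joined,${line}
done < "$list"

# paste -s joins all lines into one; -d, uses a comma as the separator
joined_clean=$(paste -s -d, "$list")

echo "$joined"
echo "$joined_clean"
rm -f "$list"
```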
