gmq version
@@ -1,15 +0,0 @@
Some dict/zh data is from [github.com/fxsjy/jieba](https://github.com/fxsjy/jieba)

Update at 2023-11-16:

Added two new dictionary files, taken from [github.com/GuocaiL/nlp_corpus](https://github.com/GuocaiL/nlp_corpus)

Generated from `nlp_corpus/open_ner_data/boson/boson.txt`, `open_ner_data/people_daily/people_daily_ner.txt`, `open_ner_data/tianchi_yiyao/train.txt`, and `open_ner_data/ResumeNER/dev.txt`

1. tf_idf.txt

The first column of this file is the term, the second column is the word frequency of that term, and the third column is the inverse document frequency of that term.

2. tf_idf_origin.txt

The original corpus text.
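The three-column layout described for `tf_idf.txt` can be read with a few lines of Python. This is only a sketch: it assumes the columns are separated by whitespace, which the README does not state, and the names `TermStats`, `parse_tf_idf_line`, and `load_tf_idf` as well as the sample rows are hypothetical.

```python
from typing import Iterable, NamedTuple


class TermStats(NamedTuple):
    term: str
    frequency: float  # second column: word frequency
    idf: float        # third column: inverse document frequency


def parse_tf_idf_line(line: str) -> TermStats:
    """Parse one 'term frequency idf' row.

    Assumption: columns are whitespace-separated. Splitting from the
    right keeps a term intact even if it contains a space.
    """
    term, freq, idf = line.strip().rsplit(None, 2)
    return TermStats(term, float(freq), float(idf))


def load_tf_idf(lines: Iterable[str]) -> dict:
    """Build a term -> TermStats mapping, skipping blank lines."""
    table = {}
    for line in lines:
        if line.strip():
            row = parse_tf_idf_line(line)
            table[row.term] = row
    return table


# Hypothetical sample rows; the real values in tf_idf.txt are not shown here.
sample = ["北京 120 3.5\n", "\n", "分词 8 7.25\n"]
table = load_tf_idf(sample)
print(table["分词"].idf)  # 7.25
```

For the real file, `load_tf_idf(open(path, encoding="utf-8"))` would work the same way, provided the separator assumption holds.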
@@ -1 +0,0 @@
dict.txt is generated by an internal tool. Copyright 2017 ego authors. For commercial use and copying, please credit the source and copyright.
885298
gse/dict/jp/dict.txt
File diff suppressed because it is too large
270132
gse/dict/zh/idf.txt
File diff suppressed because it is too large
352279
gse/dict/zh/s_1.txt
File diff suppressed because it is too large
File diff suppressed because it is too large
@@ -1,88 +0,0 @@
,
.
?
!
"
@
,
。
、
?
!
:
“
”
;

(
)
《
》
~
*
<
>
/
\
|
-
_
+
=
&
^
%
#
`
;
$
¥
‘
’
〉
〈
…
>
<
@
#
$
%
︿
&
*
+
~
|
[
]
{
}
啊
阿
哎
哎呀
哎哟
唉
俺
俺们
按
按照
吧
吧哒
把
罢了
被
本
本着
比
比方
比如
鄙人
彼
彼此
边
别
别的
别说
并
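A list like the one above, with one stop word per line, is typically loaded into a set and used to filter segmented tokens. The sketch below illustrates that idea only; it is not how gse itself loads the file, and the helper names `load_stop_words` and `remove_stop_words` and the sample tokens are hypothetical.

```python
def load_stop_words(lines):
    """Collect stop words, one entry per line.

    Only the line ending is removed, so an entry that consists of
    whitespace survives as a stop word.
    """
    return {line.rstrip("\r\n") for line in lines}


def remove_stop_words(tokens, stop_words):
    """Return the tokens that are not in the stop-word set, in order."""
    return [t for t in tokens if t not in stop_words]


# A few entries copied from the list above, written as file lines.
stop_words = load_stop_words([",\n", "啊\n", "按照\n", "比如\n"])

# Hypothetical output of a segmenter.
tokens = ["比如", "分词", ",", "词典", "啊"]
print(remove_stop_words(tokens, stop_words))  # ['分词', '词典']
```

A set gives constant-time membership checks, which matters once the token stream is large.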
236754
gse/dict/zh/t_1.txt
File diff suppressed because it is too large
107536
gse/dict/zh/tf_idf.txt
File diff suppressed because it is too large
File diff suppressed because one or more lines are too long