The Relationship between news and stocks 11
2022. 7. 28. 00:34 · Project/Predicting Stock Price Movements from News Articles
Code
Tokenized the articles with KoNLPy's Kkma -> it took far too long
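from konlpy.tag import Kkma
from tqdm import tqdm

# Create the tagger once; building a new Kkma() for every row adds avoidable overhead.
kkma = Kkma()

def token(text):
    # Split one article's text into morphemes.
    return kkma.morphs(text)

tqdm.pandas()  # registers progress_apply on pandas objects
news['token'] = news.text.progress_apply(token)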
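Because tokenization was the slow step, one possible mitigation is to run the tokenizer only once per distinct text and reuse the result for repeated articles. The sketch below is an assumption, not something measured in this post: `tokenize_unique` is a hypothetical helper, and it only helps if the data actually contains duplicate texts. It accepts any callable, so `Kkma().morphs` can be passed in.

```python
from typing import Callable, Dict, Iterable, List


def tokenize_unique(texts: Iterable[str],
                    morphs: Callable[[str], List[str]]) -> List[List[str]]:
    """Tokenize each distinct text once and reuse the result for duplicates.

    `morphs` is any tokenizer callable, for example kkma.morphs.
    """
    cache: Dict[str, List[str]] = {}
    result: List[List[str]] = []
    for text in texts:
        if text not in cache:
            # The expensive call happens only for texts not seen before.
            cache[text] = morphs(text)
        # Copy so that later edits to one row do not affect its duplicates.
        result.append(list(cache[text]))
    return result


# Hypothetical usage with KoNLPy (requires a Java runtime):
# from konlpy.tag import Kkma
# kkma = Kkma()
# news['token'] = tokenize_unique(news.text, kkma.morphs)
```

The returned list has one entry per input row, in the original order, so it can be assigned directly to a DataFrame column.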
Removing stopwords
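import re

# Entries are regular expressions, so '상승.*' also covers tokens such as '상승세'.
STOPWORDS = [r'상승.*', r'하락.*', r'급등.*', r'급락.*', '폭등', '폭락', '오름세', '약세', '강세', '의', '가', '이', '은', '들', '는', '좀', '잘', '걍', '과', '도', '를', '으로', '자', '에', '와', '한', '하다', '하']
STOPWORD_RE = re.compile('|'.join(STOPWORDS))
def stopword(x):
    # Keep only tokens that do not fully match any stopword pattern.
    return [i for i in x if not STOPWORD_RE.fullmatch(i)]
tqdm.pandas()
news["token"] = news.token.progress_apply(stopword)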
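Pattern-like entries such as `r'상승.*'` only act as patterns when they are matched as regular expressions; a plain `in` test compares strings literally. The following minimal sketch contrasts the two on a small token list. The sample tokens and the helper names `filter_literal` and `filter_regex` are made up for illustration and are not output from Kkma.

```python
import re
from typing import List

# A shortened, illustrative stopword list (a subset of the full one).
PATTERNS = [r'상승.*', r'급락.*', '의']


def filter_literal(tokens: List[str]) -> List[str]:
    # Literal membership: '상승.*' is treated as an ordinary string.
    return [t for t in tokens if t not in PATTERNS]


def filter_regex(tokens: List[str]) -> List[str]:
    # Regex matching: a token is dropped if it fully matches any pattern.
    combined = re.compile('|'.join('(?:%s)' % p for p in PATTERNS))
    return [t for t in tokens if not combined.fullmatch(t)]


sample = ['주가', '상승했다', '급락', '의', '코스피']
print(filter_literal(sample))  # ['주가', '상승했다', '급락', '코스피']
print(filter_regex(sample))    # ['주가', '코스피']
```

With literal membership only the exact token '의' is removed, while the regex version also removes every token that starts with '상승' or '급락'.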