This function takes a string as input, removes punctuation, lowercases the text, counts the frequency of each word, and returns up to 10 of the most frequent words, most frequent first.
Technology Stack : string, re, collections, heapq
Code Type : Function
Code Difficulty : Intermediate
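def sorted_words(text):
    from string import punctuation
    import re
    from collections import Counter
    from heapq import nlargest

    # Remove punctuation; re.escape keeps characters such as ], \ and ^ literal inside the class
    text = re.sub(rf'[{re.escape(punctuation)}]', '', text)
    # Convert the text to lowercase and split it into words
    words = text.lower().split()
    # Count how often each word occurs
    word_counts = Counter(words)
    # Select the 10 most frequent words; the key ranks by count, not alphabetically by word
    most_common_words = nlargest(10, word_counts.items(), key=lambda item: item[1])
    # Return the words ordered from most to least frequent
    return [word for word, count in most_common_words]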
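
The same steps (strip punctuation, lowercase, count, take the top entries) can also be sketched with only `Counter.most_common` and `str.translate`, which may be easier to read than the regex and heap combination. This is an illustrative variant, not the original function: the name `top_words` and the parameter `n` are assumptions introduced here.

```python
from collections import Counter
from string import punctuation


def top_words(text, n=10):
    """Return up to n of the most frequent words in text, most frequent first.

    Hypothetical helper for illustration; not part of the original snippet.
    """
    # Strip ASCII punctuation with a translation table instead of a regex
    cleaned = text.translate(str.maketrans('', '', punctuation))
    # Lowercase so that "The" and "the" count as the same word
    counts = Counter(cleaned.lower().split())
    # most_common(n) ranks entries by count, highest first
    return [word for word, _ in counts.most_common(n)]


sample = "The cat and the hat. The cat sat!"
print(top_words(sample, 2))  # ['the', 'cat']
```

Ranking must use the count, not the `(word, count)` pair itself: comparing the pairs directly would order the results alphabetically by word rather than by frequency.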