How to capture popular comments of Netease Cloud Music with Python
Preface
Recently I have been studying text mining. As the saying goes, even the cleverest cook cannot make a meal without rice: to analyze text, you first have to get the text. There are many ways to obtain it, such as downloading ready-made documents from the Internet or pulling data through APIs provided by third parties. Sometimes, though, the data we want has no direct download channel and no API. What then? A good option is a web crawler: a program that pretends to be a user and fetches the pages for us. Because computers are fast, a crawler lets us collect data conveniently and quickly.
About crawlers
So how do you write a crawler? Many languages can do the job, such as Java, PHP, and Python. Personally, I prefer Python. It not only has a capable built-in network library but also many excellent third-party libraries; others have already built the wheels, so we can simply use them, which makes writing crawlers much more convenient. It is no exaggeration to say that a small crawler takes fewer than 10 lines of Python, while other languages may need considerably more code. Conciseness is one of Python's great advantages.
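To give a feel for the "fewer than 10 lines" claim, here is a minimal sketch using only Python 3's standard library. The function name fetch_title is made up for illustration and is not part of the original script; it downloads one page and pulls out its title with a regular expression, which is enough for a demonstration but not a robust HTML parser.

```python
import re
from urllib.request import urlopen


def fetch_title(url):
    """Download a page and return the text of its <title> tag, or None."""
    with urlopen(url, timeout=10) as response:
        html = response.read().decode("utf-8", errors="replace")
    match = re.search(r"<title>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None
```

Calling fetch_title with the address of any public page returns that page's title as a string, or None if the page has no title tag.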
Without further ado, let's get down to business. NetEase Cloud Music has become very popular in recent years. I have been a user for several years myself; before that I used QQ Music and Kugou. From my own experience, its biggest strengths are accurate song recommendations and distinctive user comments. (To be clear: this is not a sponsored post or an advertisement. It is only my personal view, so please don't flame me if you disagree.) Under a song there are often a few brilliant comments that have collected a huge number of likes. On top of that, NetEase Cloud Music recently put selected user comments on display in the subway, which brought its comments into the spotlight again. So I wanted to analyze NetEase Cloud Music's comments, look for patterns, and in particular study what the hot comments have in common. With this goal in mind, I set out to crawl the comments.
Network libraries
Python 2 has two built-in network libraries, urllib and urllib2, but neither is particularly convenient to use, so here we use a well-regarded third-party library, requests. With requests, a few lines of code are enough for otherwise complex crawler tasks such as setting a proxy or simulating a login. If pip is already installed, you can install it directly with pip install requests.
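As a rough illustration of how little code requests needs, the sketch below creates a session with a custom User-Agent and an optional proxy, then posts a form. The helper names new_session and post_form are hypothetical, and any proxy address you pass in is your own; none of this is taken from the original script.

```python
def new_session(user_agent, proxies=None):
    """Create a requests session with a User-Agent header and optional proxies."""
    # Imported lazily so post_form below can be exercised without the library.
    import requests  # third-party: pip install requests

    session = requests.Session()
    session.headers.update({"User-Agent": user_agent})
    if proxies:
        # Example shape: {"http": "http://127.0.0.1:8080"} (placeholder address)
        session.proxies.update(proxies)
    return session


def post_form(session, url, form, headers=None, proxies=None, timeout=10):
    """POST form fields through the session and return the response body as text."""
    response = session.post(url, data=form, headers=headers,
                            proxies=proxies, timeout=timeout)
    response.raise_for_status()  # turn HTTP error statuses into exceptions
    return response.text
```

A session keeps cookies and default headers between calls, which is convenient when a site expects the same browser-like headers on every request.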
requests also has documentation in Chinese.
headers = {
    ...
    'Cookie': "...(organic)|utmcmd=organic; playerid=81568911; __utmb=94650624.23.10.1490672820",
    'Connection': 'keep-alive',
    'Referer': ":/"}
# Set proxy server
proxies = {
    ...
}
import json

# Fetch the hot comments of a song and return them as a list of lines
def get_hot_comments(url):
    hot_comments_list = []
    # Header row
    hot_comments_list.append(u"userID nickname avatarUrl commentTime likedCount content\n")
    params = get_params(1)  # page 1
    encSecKey = get_encSecKey()
    json_text = get_json(url, params, encSecKey)
    json_dict = json.loads(json_text)
    hot_comments = json_dict['hotComments']  # hot comments
    print("There are %d hot comments in total!" % len(hot_comments))
    for item in hot_comments:
        comment = item['content']  # comment text
        likedCount = item['likedCount']  # total number of likes
        comment_time = item['time']  # comment time (timestamp)
        userID = item['user']['userId']  # commenter ID
        nickname = item['user']['nickname']  # nickname
        avatarUrl = item['user']['avatarUrl']  # avatar address
        # Use string formatting: userID, comment_time and likedCount are numbers
        comment_info = u"%s %s %s %s %s %s\n" % (
            userID, nickname, avatarUrl, comment_time, likedCount, comment)
        hot_comments_list.append(comment_info)
    return hot_comments_list
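To make the shape of the data easier to picture, the sketch below runs the same field extraction on a hand-written sample. The sample values are invented, a real response carries more keys, and treating time as a Unix timestamp in milliseconds is an assumption rather than something the listing above states. The helpers format_time and to_row are hypothetical names.

```python
import json
from datetime import datetime, timezone

# Hypothetical sample with the same fields that get_hot_comments reads.
SAMPLE_JSON = """
{
  "total": 1,
  "hotComments": [
    {"content": "Great song", "likedCount": 12, "time": 1490659200000,
     "user": {"userId": 1, "nickname": "alice",
              "avatarUrl": "http://example.com/a.jpg"}}
  ]
}
"""


def format_time(ms):
    """Convert a millisecond timestamp (assumed unit) to a UTC date string."""
    moment = datetime.fromtimestamp(ms / 1000, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


def to_row(item):
    """Flatten one comment dict into the column order of the header row."""
    user = item["user"]
    return [str(user["userId"]), user["nickname"], user["avatarUrl"],
            format_time(item["time"]), str(item["likedCount"]), item["content"]]


rows = [to_row(item) for item in json.loads(SAMPLE_JSON)["hotComments"]]
for row in rows:
    # Tabs instead of spaces, since comment text itself may contain spaces.
    print("\t".join(row))
```

Joining with a tab (or writing the rows with the csv module) keeps the columns recoverable later, which a single space between fields does not once a comment contains spaces.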
# Capture all comments on a song
def get_all_comments(url):
    all_comments_list = []  # stores all comments
    all_comments_list.append(u"userID nickname avatarUrl commentTime likedCount content\n")  # header row
    params = get_params(1)
    encSecKey = get_encSecKey()
    json_text = get_json(url, params, encSecKey)
    json_dict = json.loads(json_text)
    comments_num = int(json_dict['total'])
    if comments_num % 20 == 0:
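The check above tests whether the comment total divides evenly by 20, which suggests the comments are fetched in pages of 20. A minimal sketch of how the page count could be derived under that assumption follows; page_count and page_numbers are hypothetical helpers, and the page size is inferred from the check rather than confirmed.

```python
PAGE_SIZE = 20  # assumed from the "% 20" check; not confirmed elsewhere


def page_count(comments_num, page_size=PAGE_SIZE):
    """Pages needed to hold comments_num comments (ceiling division)."""
    return (comments_num + page_size - 1) // page_size


def page_numbers(comments_num, page_size=PAGE_SIZE):
    """1-based page numbers, matching the get_params(1) call for the first page."""
    return range(1, page_count(comments_num, page_size) + 1)


for total in (0, 20, 21, 45):
    print(total, page_count(total), list(page_numbers(total)))
```

Ceiling division covers both branches of the even/uneven check in one expression, so no separate case is needed for totals that are exact multiples of the page size.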