Python & Neopets

python web pages

4 replies to this topic

#1 Irradium

Irradium
  • Pyro (699) Maniac

  • 892 posts



Posted 08 June 2014 - 11:45 AM

Right, summer's coming around again, and I feel like I want to start writing one or two programs for Neopets.

 

However, I have no idea how to do web scraping in Python 3. It used to be that I would use either requests, urllib3, or mechanize with Python 2.7 (urllib2 + urllib was broken as shit), but I wasn't exactly great at them anyway. :p

 

Can someone recommend a library and/or a good general method for doing web scraping, please? ^_^

 

Edit: Oh, also sending requests to a website, I have no clue about that either, hehe.


Edited by Irradium, 08 June 2014 - 11:50 AM.


#2 Dan

Dan
  • Resident Know-It-All

  • 6382 posts



Posted 08 June 2014 - 12:39 PM

I don't think there's one particular library or method you'd want to use for "web scraping". It's well known that real-world HTML is generally a mess, and Neopets is a particularly bad case, which makes any kind of standardised parsing difficult. There are, however, a number of libraries which do give it a good go -- but you'll be lucky if you find one in Python that plays nice with Neopets.

 

You're best off spending a bit of time thinking about what the logic flow of your program should be, then focussing on what HTML parsing is necessary to achieve that.

 

 

 

(I think an answer that pushes you in the right direction -- and the right thought processes -- is far more beneficial than just giving you the code)
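
To make the "only parse what you need" idea a bit more concrete, here is a rough sketch that uses nothing but Python 3's standard library. It is an illustration rather than a recommendation: the `LinkCollector` class, the `collect_links` helper, and the sloppy sample markup (including its paths) are all made up for this example and are not taken from Neopets.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags and ignore everything else."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Deliberately sloppy, made-up markup: unclosed <b> and <td> tags,
# an unquoted attribute, and an <a> that is never closed.
MESSY_HTML = """
<table><tr><td><b>Shops
<td><a href=/example/shop?id=1>Some Shop</a>
<a href="/example/inventory">Inventory
</table>
"""


def collect_links(html):
    """Feed the HTML through the tolerant parser and return the hrefs found."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


print(collect_links(MESSY_HTML))
```

The point is that the parser never has to understand the whole page. It only reacts to the one tag the program's logic cares about, so the broken table around it does no harm.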



#3 Irradium

Irradium
  • Pyro (699) Maniac

  • 892 posts



Posted 09 June 2014 - 09:11 AM


Yeah, I would agree with the last statement. I reckon BeautifulSoup + requests + spending several hours reading the docs should do just fine. :)
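
For anyone landing here later, a minimal sketch of what the BeautifulSoup + requests combination can look like in Python 3. Treat it as a starting point only: the helper names (`fetch_html`, `build_form_post`, `extract_links`), the example.com URL, and the form field names are placeholders invented for this example, not anything Neopets actually uses.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(session, url):
    """GET a page and return its HTML. This does a real network call."""
    response = session.get(url, timeout=10)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.text


def build_form_post(url, fields):
    """Build (but do not send) a form POST so it can be inspected first."""
    return requests.Request("POST", url, data=fields).prepare()


def extract_links(html):
    """Return (text, href) pairs for every <a> tag that has an href."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        (a.get_text(strip=True), a["href"])
        for a in soup.find_all("a", href=True)
    ]


# Offline demo on an inline snippet, so nothing here touches the network.
SAMPLE_PAGE = '<p>Go to <a href="/example/inventory">Inventory</a> or <a>nowhere</a></p>'
print(extract_links(SAMPLE_PAGE))

# Hypothetical URL and field names, purely for illustration.
prepared = build_form_post(
    "https://example.com/login",
    {"username": "me", "password": "secret"},
)
print(prepared.method, prepared.url, prepared.body)
```

To actually send things, create one `requests.Session()` and reuse it, e.g. `session.send(prepared)` followed by `fetch_html(session, some_url)`. The session keeps cookies between calls, which is what a logged-in site generally needs.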



#4 Josh

Josh
  • 318 posts

Posted 02 August 2014 - 08:55 PM

Or you could just use the library I wrote in a week or two located here: https://github.com/jmgilman/Neolib

 

It even has a fork that someone created a while ago to try to bring it up to date.



#5 theorange

theorange
  • 7 posts

Posted 04 August 2014 - 05:27 AM

Easy question: you're pretty much going to be using the same stuff as before, BeautifulSoup plus requests or mechanize. There are lots of other options to choose from too.

 

Beautiful soup - http://www.crummy.co.../BeautifulSoup/

Requests - http://docs.python-r...test/index.html

 

Read up on their documentation and you'll be golden. 

 

 

Here is a site with tutorials on using them:

Beautiful soup - http://www.pythonfor.../beautifulsoup/

Requests - http://www.pythonfor...s.com/requests/

Web general - http://www.pythonfor...hon-on-the-web/

 

Read the documentation, read the tutorials, then start with small projects like scraping your own inventory or a shop's inventory, and expand from there.
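
As an idea of how small such a first project can be, here is a hedged sketch of pulling item names out of an inventory-style table with BeautifulSoup. The markup in `SAMPLE_INVENTORY` is invented for the example and the real page will almost certainly be laid out differently, so the `class_="inventory"` lookup and the `item_names` helper are assumptions to adapt, not something to copy as-is.

```python
from bs4 import BeautifulSoup

# Made-up markup standing in for a saved copy of an inventory page.
SAMPLE_INVENTORY = """
<table class="inventory">
  <tr>
    <td><img src="a.gif"><br>Red Apple</td>
    <td><img src="b.gif"><br>Blue Paint Brush<br><span>(rare)</span></td>
    <td></td>
  </tr>
</table>
"""


def item_names(html):
    """Return the first piece of visible text from each inventory cell."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", class_="inventory")
    if table is None:
        return []  # layout changed or wrong page: nothing to report
    names = []
    for cell in table.find_all("td"):
        # stripped_strings yields the cell's text nodes with whitespace trimmed;
        # the first one is treated as the item name, empty cells are skipped.
        first_text = next(cell.stripped_strings, None)
        if first_text:
            names.append(first_text)
    return names


print(item_names(SAMPLE_INVENTORY))
```

Working from a saved copy of the page first means the parsing can be tested offline, and the fetching part can be bolted on once the extraction is right.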




