mscerts.net
 

html2data 0.3

A simple way to transform an HTML file or URL into structured data

html2data maps named fields to XPath expressions and returns the matched values from an HTML file or URL as a dictionary. For example:

>>> ## start the console
>>> from html2data import html2data
>>> html = """<!DOCTYPE html><html lang="en"><head></head>
 <body>
 <h1><b>Title</b></h1>
 <div class="description">This is not a valid HTML
 </body>
 </html>"""

>>> config = {
 'map': [
 ['body_title', u'//h1/b/text()'],
 ['description', u'//div[@class="description"]/text()'],
 ]
 }

>>> handler = html2data()
>>> received_obj = handler.load(html=html, config=config)
>>> print received_obj
{'body_title': 'Title', 'description': 'This is not a valid HTML'}
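
The mapping in the example pairs a field name with an XPath expression. As a rough illustration of that idea, the sketch below applies the same kind of map directly with lxml (one of the listed requirements). It is not html2data's own implementation: the `extract` function and its whitespace handling are assumptions made for this example, and it is written for Python 3.

```python
from lxml import html as lxml_html


def extract(html_text, xpath_map):
    """Hypothetical helper: evaluate each (name, xpath) pair against the page."""
    # lxml's HTML parser is lenient, so the unclosed <div> below still parses.
    tree = lxml_html.fromstring(html_text)
    result = {}
    for name, xpath in xpath_map:
        matches = tree.xpath(xpath)
        # Strip each matched text node, drop empty ones, and join the rest.
        pieces = [str(m).strip() for m in matches]
        result[name] = " ".join(p for p in pieces if p)
    return result


page = """<!DOCTYPE html><html lang="en"><head></head>
<body>
<h1><b>Title</b></h1>
<div class="description">This is not a valid HTML
</body>
</html>"""

data = extract(page, [
    ("body_title", "//h1/b/text()"),
    ("description", '//div[@class="description"]/text()'),
])
print(data)
```

Joining the matched text nodes into a single string is a design choice for this sketch only; a different tool might return a list of matches per field instead.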

Requirements:

· Python
· lxml
· httplib2
