Importing Wikipedia Data into Google Docs

The Google spreadsheet function =importHTML("URL","table",N) scrapes a table from an HTML web page into a Google spreadsheet. The URL of the target web page and the target element type ("table") both need to be in double quotes. The number N identifies the Nth table on the page (counting starts at 1) as the target for scraping.

Google Spreadsheet has an extremely useful function that lets you import various kinds of data into a spreadsheet file. Suppose you find a table of useful data on a web page and want to have it as an Excel file. That's possible with Google Spreadsheet's import function.

The import function makes it possible to grab all kinds of online data and turn it into spreadsheet files for analysis, graphs, and so on. The =ImportHTML function has the following format:

=ImportHTML("URL","query",index)

URL is the address of the target web page, query is either "list" or "table", and index is the position of that element (table or list) on the page. If a page contains multiple tables (or lists) and you want to import only the third one, then the index value will be 3. Quotation marks around URL and query are required.
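To make the meaning of query and index concrete, here is a minimal Python sketch that picks the Nth table out of a piece of HTML. It is only an illustration of the idea, not how Google implements ImportHTML. The names TableCollector, nth_table, and SAMPLE_PAGE are made up for this example, the index is assumed to start at 1 (matching the "third table has index 3" description above), and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableCollector(HTMLParser):
    """Collects every <table> in a page as a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one entry per table, in document order
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)


def nth_table(html, index):
    """Return the index-th table (1-based, an assumption) as a list of rows."""
    parser = TableCollector()
    parser.feed(html)
    parser.close()
    if not 1 <= index <= len(parser.tables):
        raise IndexError(f"page has {len(parser.tables)} table(s), asked for {index}")
    return parser.tables[index - 1]


# A tiny made-up page with two tables, standing in for a real web page.
SAMPLE_PAGE = """
<table><tr><th>City</th><th>Rank</th></tr><tr><td>A</td><td>1</td></tr></table>
<table><tr><th>State</th></tr><tr><td>X</td></tr></table>
"""

second = nth_table(SAMPLE_PAGE, 2)
print(second)
```

Asking for index 2 returns the rows of the second table only, which is the same selection the index argument makes in the spreadsheet formula.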

Let's take the URL "http://en.wikipedia.org/wiki/List_of_most_populous_cities_in_India" as an example.

Then the formula will be:

=ImportHTML("http://en.wikipedia.org/wiki/List_of_most_populous_cities_in_India","table",2)
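Formulas copied from web pages often carry typographic (curly) quotes, which a spreadsheet generally will not accept as string delimiters. The following Python sketch shows one way to build the formula text with plain double quotes and to clean up a pasted one. Both helpers, import_html_formula and straighten_quotes, are hypothetical names for this example and not part of any Google API. The check that index is at least 1 is an assumption about how counting works.

```python
# Typographic double quotes that often sneak in when copying from a web page.
CURLY_DOUBLE_QUOTES = "\u201c\u201d"


def straighten_quotes(formula):
    """Replace curly double quotes with plain ASCII double quotes."""
    for ch in CURLY_DOUBLE_QUOTES:
        formula = formula.replace(ch, '"')
    return formula


def import_html_formula(url, query, index):
    """Build the text of an ImportHTML formula (illustrative helper only)."""
    if query not in ("table", "list"):
        raise ValueError('query must be "table" or "list"')
    if index < 1:
        raise ValueError("index is assumed to start at 1")
    if '"' in url:
        raise ValueError("url must not contain a double quote")
    return f'=ImportHTML("{url}","{query}",{index})'


URL = "http://en.wikipedia.org/wiki/List_of_most_populous_cities_in_India"
formula = import_html_formula(URL, "table", 2)
print(formula)
```

The printed text can be pasted into a cell as is.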

[Screenshot: importHTML]
