Sunday, January 24, 2016

Easy data scraping with Google Apps Script in 5 minutes

I'm using Google Apps Script for a lot of things - from automate tasks to data analysis. I have discovered, that there was repetitive use-case: scrape data from web  and parse exact value from HTML source code. If you are novice in programming, you probably know, that's difficult to write and use regular expresion. For me too :) I have written Google Apps Script library, which helps you to parse data in 5 minutes.

Let's create a small example. In our company we have created a Google Apps application SignatureSatori to create and setup email signatures for all users in domain. Like a good growth hacker I benchmark the competitors how quickly get new users. There is a Google Apps Marketplace, which estimate number of users. I need save that numbers each day.


1) Create a new Google Apps Script and insert a new library (Resources -> Library)
M1lugvAXKKtUxn_vdAG9JZleS6DrsjUUV

Parser library takes three parameters - input text and pattern which bounds desired text.
Parser // name of library
    .data(content) // input text
    .from(fromText) // from text pattern
    .to(toText) // to text pattern
    .build(); // run parser and return value

2) Now open desired web-page (e.g Chrome Webstore in our case). Click on specify HTML element by right mouse button and select Inspect element.


3) Find the right part of HTML and copy fromText and toText

4) Now we have all required information to complete script
function getData() {
    var url = "https://chrome.google.com/webstore/detail/signaturesatori-central-s/fejomcfhljndadjlojamaklegghjnjfn?hl=en";
    var fromText = '<span class="e-f-ih" title="';
    var toText = '">';
  
    var content = UrlFetchApp.fetch(url).getContentText();
    var scraped = Parser
                    .data(content)
                    .from(fromText)
                    .to(toText)
                    .build();
    Logger.log(scraped);
    return scraped;
}

5) The last and the easiest step is copy parsed data into Spreadsheet
function SAVE_DATA() {
 var sheet = SpreadsheetApp.openById(spreadsheetId).getSheetByName(sheetName); // insert Spreadsheet Id and Sheet name
 sheet.appendRow([ new Date(), getData() ]);
}

6) If you want to log during scraping (e.g. if you want to debug wrong value), call .setLog() function before final .build() function:
 
Parser
    .data(content)
    .setLog()
    .from(fromText)
    .to(toText)
    .build();
Completed code of Parser library Enjoy!

Are you interested in this topic? Follow me on Twitter, Google+ or subscribe RSS

22 comments:

  1. Thanks for sharing such a great post.

    ReplyDelete
  2. Google has made so many technical things so much easy to handle and understand.I really appreciate this effort.Your should write something for www.summarizing.biz/auto-summarize-online/ because students really follow this link.Thanks for this post.

    ReplyDelete
  3. Google changed CSS, now from/to must be:

    var fromText = "div class="LCopac qlfxzd">";
    var toText = "

    But anyway I'm getting error:
    ReferenceError: "Parser" is not defined.

    ReplyDelete
  4. It is true that day by day the demand for the MBA course is really rising and why not? Poeple love to make better career through this. homepage will give you some amazing idea and very effective for the papers writing.

    ReplyDelete
  5. You can be sure that you will only get the best online copy watch. Remember that buying fake watches is very similar to buying any other typeReplica watchesReplica Watches Uk

    ReplyDelete
  6. Amazing info for IT students. They will gain information regarding data scrapping with google apps. Well, here i'm going to introduced our recent work just check it once https://www.bestghostwriters.net/academic-papers-ghostwriting

    ReplyDelete
  7. How can I scrap the headings ?

    ReplyDelete
  8. The new google app makes things so easy for the data scrapping factors and you can really do your data scrapping task with those.

    ReplyDelete
  9. to get a new job resume and cover writing is very important . a good resume and cover writing help to get a job quickly . your ideas about cover latter writing is very unique . i found something new from your writing . keep writing like this useful link for the students that is very helpful for the writing services.

    ReplyDelete
  10. Why Google Drive ask permission when adding to the library?

    ReplyDelete
    Replies
    1. Hi, there is another function inside for better data caching, which is using Google Drive file as cache. Check code above (link is provided)

      Delete
  11. The best and easy option provided by your article. Thank you for sharing this article.

    ReplyDelete
  12. I am receiving the follow error: ReferenceError: "Parser" is not defined.

    Any thoughts?

    ReplyDelete
  13. if you are getting the parser is not defined error you need to go to ressources --> libraries --> add a library and enter M1lugvAXKKtUxn_vdAG9JZleS6DrsjUUV

    ReplyDelete
  14. Your website is really cool and this is a great inspiring article. best backlinks

    ReplyDelete
  15. Hi there, while looking at a few solution at scraping Facebook page in the new world order I ended up here. First up great library I'm adding it to my toolkit. I was just curious if by any chance you've found a way to retrieve data from Facebook that doesn't imply using Graph API?

    ReplyDelete
  16. It is interesting to read your blog post and I am going to share it with my friends.aybabg

    ReplyDelete
  17. I can't stop myself to read this post! Thanks. Kanhasoft is the mobile and web solution developer in India and USA. We are developing a complete enterprise solution to boost the business. Visit our site to know more.

    ReplyDelete
  18. exact replica watches, combining elegant style and cutting-edge technology, a variety of styles of replica omega speedmaster watches, the pointer walks between your exclusive taste style.

    ReplyDelete
  19. replica gucci shoes, combining elegant style and cutting-edge technology, a variety of styles of replica gucci mens shoes, the pointer walks between your exclusive taste style.

    ReplyDelete
  20. This is a really helpful blog to me and my team.(https://counfreedise.in/)

    ReplyDelete