Monday, December 7, 2020

16 secret tips, tricks and features for the new Google Apps Script Editor (v2020)

Google Apps Script has a new editor, which is better, nicer, and completely ready for future features.
Today, I would like to show you several dirty & secret tricks you can do with it.
(All shortcuts have been tested on macOS.)

#1 Change the order of files in your Apps Script project


#2 When you click Save, all files in the project are saved


#3 Fold (show / hide) the code inside function(s) with the small arrow


#4 Swap two lines with [ALT] + [UP] or [ALT] + [DOWN]

#5 Copy a line below or above with [ALT] + [SHIFT] + [UP] or [ALT] + [SHIFT] + [DOWN]



#6 Expand or shrink block selection with [CTRL] + [SHIFT] + [CMD] + [LEFT] or [CTRL] + [SHIFT] + [CMD] + [RIGHT]


#7 Multi-cursor for editing several lines at once: [ALT] + click at each position


#8 Better code refactoring with Rename symbol, which replaces all occurrences


#9 Peek definition gives the best quick insight into a function definition


#10 Code suggestions for native JavaScript objects (e.g. Date) are available


#11 Always visible panel with execution logs



#12 Different colors for console.warn() and console.error()
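A minimal sketch for trying the colored log levels yourself. The function name logAtEveryLevel and the message prefixes are made up for illustration; only console.log(), console.warn() and console.error() come from the platform.

```javascript
// Hypothetical helper: writes the same message at three log levels
// so the different colors can be compared in the execution log.
function logAtEveryLevel(message) {
  console.log('info: ' + message);    // plain log entry
  console.warn('warning: ' + message); // warning entry
  console.error('error: ' + message);  // error entry
}

logAtEveryLevel('hello from the new editor');
```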




#13 Hyperlinks in the logging console lead to the code line

#14 Better debugging with deeper inspection of the function call stack and variables (local and global scope)


#15 Switch the editor to dark mode with a command

 

#16 Inline documentation with examples in code suggestions



If you have your own favourite tip for the new Apps Script editor, ping me on Twitter @ivankutil




Thursday, May 28, 2020

Machine learning in Google Sheet with Tensorflow.js and Google Apps Script


This article will show you how to set up and train a model on spreadsheet data, and predict values from it, with the deep-learning framework Tensorflow.js. You don't need to call REST APIs or use any third-party storage or algorithms. All your data stays in your secure Google Sheet.

A Google Spreadsheet with a demo dataset and the full Apps Script inside is linked at the end of the article.



Intro

Google has recently introduced a new JavaScript runtime (the V8 engine) into Google Apps Script. It opens the G Suite platform to new automation use cases. It replaces Mozilla's old Rhino JavaScript interpreter and allows you to include modern JavaScript libraries.

TensorFlow was originally built for Python, but Google later added support for more programming languages (Node.js, JavaScript, Swift, ...). Keras is a high-level neural network API that sits on top of TensorFlow. It is suitable for beginners and helps you build neural networks. Tensorflow.js is a JavaScript-based framework for building neural networks, and its syntax is similar to Keras.

Machine learning as a whole is a very complex topic. It covers a lot of use cases, architecture designs, setups, and fine-tuning. My aim is not to give you a step-by-step tutorial that covers machine learning, but to inspire you and show you another point of view on the power of Google Sheets with Google Apps Script.

Disclaimer: I used a small hack to include the Tensorflow.js library. I cannot guarantee that you will get 100% accurate results.

Use case

I guess that you have plenty of data in your Google Sheets. Imagine a scenario where you want to predict the value in the last column based on several other columns (with numbers). This is useful if you want to forecast future values from past values, or if some values are missing and you want to fill the gaps. The scenario is called Multivariate Regression.


Deploy Tensorflow.js in Google Apps Script

I copied the whole Tensorflow.js library as a single file into the Google Apps Script project, as the file tf-js.gs.


I had to prepare the Tensorflow.js library before training and predicting. First, the library uses the name global for the global variable. This was the easier part, because I only had to define a new variable with one extra line of code.
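The original one-liner is not reproduced here, so the following is only a sketch of what such a definition could look like. It assumes the library just needs a variable called global that points to the global object, and that the standard globalThis is available in the V8 runtime (a top-level `this` is a common alternative in Apps Script).

```javascript
// Assumption: Tensorflow.js only needs `global` to reference the global object.
// `globalThis` is the standard JavaScript name for that object.
var global = globalThis;
```

The line has to be evaluated before the library code runs, for example at the top of the same file.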



Second, the Tensorflow.js library uses native APIs for measuring time, specifically performance.now() or process.hrtime().

performance.now() is available only in the browser API (Chrome), and process.hrtime() is available only in the backend API (Node.js). In Google Apps Script I got the error "Cannot measure time in this environment. You should run tf.js in the browser or in Node.js", because neither method is available there.

I did not fully reverse-engineer the library, but I think time measurement is used for yielding the main thread to other tasks. For this reason I set yieldEvery to "never" when training the model with .fit(). (https://js.tensorflow.org/api/latest/)

If you have a more elegant solution, ping me on Twitter or by email.

Data

The Boston Housing Prices dataset is a "hello world" task in the machine learning world. It is a collection of 506 simple real-estate records collected in Boston (Massachusetts) in the late 1970s. Each row includes numeric measurements of a Boston neighborhood (e.g. the crime rate, the typical size of homes, how far the region is from the closest highway, whether the area has waterfront property, ...).

These columns are called features (= inputs into the machine learning model).


We want to predict the price of a home from this dataset. There is only one such column, and it is called the target (= the output of the machine learning model).

This prepared function downloads the dataset from Google Cloud Storage directly into the Google Sheet.

Data preparation

We have to divide the data into a training and a testing dataset. The variable rowSplit defines the row number where the split happens. In our case, rows 2 to 336 are used as the training dataset and the remaining rows (337 to 507) as the testing dataset. The variables FEATURE_COLUMN_FROM and FEATURE_COLUMN_TO define the feature columns for training, testing and predicting. In our case, the feature data are loaded from columns 1 to 12.
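The split described above can be sketched in plain JavaScript. The names rowSplit, FEATURE_COLUMN_FROM and FEATURE_COLUMN_TO come from the text; the function splitDataset, its return shape, and the assumption that the target sits in the column right after the features are mine.

```javascript
// Values taken from the text. Sheet row 1 is assumed to be the header row.
const ROW_SPLIT = 336;
const FEATURE_COLUMN_FROM = 1;
const FEATURE_COLUMN_TO = 12;

/**
 * Splits sheet values (header row already removed, so values[0] is sheet row 2)
 * into training and testing features and targets.
 * Assumption: the target is the column right after the last feature column.
 */
function splitDataset(values, rowSplit, featureFrom, featureTo) {
  const trainCount = rowSplit - 1; // sheet rows 2..rowSplit
  const toFeatures = row => row.slice(featureFrom - 1, featureTo);
  const toTarget = row => [row[featureTo]];
  const train = values.slice(0, trainCount);
  const test = values.slice(trainCount);
  return {
    trainFeatures: train.map(toFeatures),
    trainTargets: train.map(toTarget),
    testFeatures: test.map(toFeatures),
    testTargets: test.map(toTarget),
  };
}
```

In Apps Script the values argument would typically come from a call such as getValues() on the data range; here it is just a 2D array.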


As a last step, select a range in the Google Sheet. We want to estimate the values in column M from the values in columns A-L in the selected rows 7-9.

Tensorflow does not work with the Array data structure, but with Tensors. Tensors are multi-dimensional data structures. The function createTensor() creates a 2D tensor for us.

Several features (columns) contain values on a different scale (e.g. tax values 187 - 711) than others (e.g. crime rate 0.01 - 88.98). We have to normalize the values, which improves the performance and training stability of the model.
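The text does not say which normalization is used, so the sketch below assumes z-score standardization (subtract the column mean, divide by the column standard deviation) on plain arrays. The function names columnStats and normalize are hypothetical.

```javascript
/**
 * Computes the mean and (population) standard deviation of every column.
 * `rows` is a non-empty 2D array of numbers.
 */
function columnStats(rows) {
  const n = rows.length;
  const cols = rows[0].length;
  const mean = new Array(cols).fill(0);
  const variance = new Array(cols).fill(0);
  for (const row of rows) {
    row.forEach((value, j) => { mean[j] += value / n; });
  }
  for (const row of rows) {
    row.forEach((value, j) => { variance[j] += (value - mean[j]) ** 2 / n; });
  }
  return { mean, std: variance.map(v => Math.sqrt(v)) };
}

/**
 * Rescales every column to zero mean and unit standard deviation.
 * Constant columns (std === 0) are mapped to 0 to avoid division by zero.
 */
function normalize(rows, stats) {
  return rows.map(row =>
    row.map((value, j) =>
      stats.std[j] === 0 ? 0 : (value - stats.mean[j]) / stats.std[j]));
}
```

A common design choice is to compute the statistics on the training rows only and reuse the same mean and std for the testing rows and for the rows you want to predict.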


Building the model and training

As you may already know, deep-learning networks contain several layers of neurons. We need to define the architecture of the neural network layer by layer. The syntax is similar to the Keras syntax mentioned above. Our architecture has two hidden layers, each with 50 neurons.


Every hidden layer uses an activation function (sigmoid).


The last layer contains only one neuron with the default (linear) activation function. It is linear because our example is a regression use case.
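The architecture described above could look roughly like this with the Tensorflow.js Layers API. The function name buildModel is hypothetical, and tf is passed in as a parameter so the sketch makes no assumption about how the library was loaded into the project.

```javascript
/**
 * Builds a regression network: two hidden layers with 50 sigmoid neurons each
 * and one output neuron with the default (linear) activation.
 * `tf` is the Tensorflow.js namespace, `numFeatures` the number of input columns.
 */
function buildModel(tf, numFeatures) {
  const model = tf.sequential();
  // First hidden layer; it also declares the shape of one input row.
  model.add(tf.layers.dense({
    inputShape: [numFeatures],
    units: 50,
    activation: 'sigmoid',
  }));
  // Second hidden layer.
  model.add(tf.layers.dense({ units: 50, activation: 'sigmoid' }));
  // Output layer: a single neuron, no activation given, so it stays linear.
  model.add(tf.layers.dense({ units: 1 }));
  return model;
}
```

With the dataset from this article, numFeatures would be 12.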

The next step is compilation. In this step, we need to set up
  • the optimizer (stochastic gradient descent), which defines how the best solution and the neurons' weights are found
  • the loss function (meanSquaredError), which measures how far we are from the optimal solution

Now it is time for training. The .fit() method trains the model on the data over several iterations (= EPOCHS).
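Compilation and training could be sketched as follows. The function name compileAndTrain and the values of EPOCHS and LEARNING_RATE are assumptions for illustration, not the settings used in the demo sheet. As far as I can tell from the Tensorflow.js API, yieldEvery is an option of .fit(), so that is where the "never" workaround for the missing timer goes in this sketch.

```javascript
const EPOCHS = 100;         // assumed value, tune for your data
const LEARNING_RATE = 0.01; // assumed value, tune for your data

/**
 * Compiles the model with SGD and mean squared error, then trains it.
 * `tf` is the Tensorflow.js namespace; the features and targets are 2D tensors.
 * Returns the promise produced by model.fit().
 */
async function compileAndTrain(tf, model, trainFeatures, trainTargets) {
  model.compile({
    optimizer: tf.train.sgd(LEARNING_RATE),
    loss: 'meanSquaredError',
  });
  return model.fit(trainFeatures, trainTargets, {
    epochs: EPOCHS,
    // Avoids the time measurement that is not available in Apps Script.
    yieldEvery: 'never',
  });
}
```

The resolved value of .fit() is a history object whose loss values per epoch can be logged to follow the training.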

Values such as the number of epochs, the number of layers, the number of neurons, and the type of activation function are hyperparameters. Data scientists around the world tune these values and compare the results with previous settings. More info about the settings is in the Tensorflow.js API: https://js.tensorflow.org/api/latest/


Evaluation allows you to check the accuracy of your model. A lower loss value is better. You should compare the train loss with the test loss. A test loss that is much bigger than the train loss means overfitting, which is not ideal.
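To make the loss value less abstract, here is a small plain-JavaScript sketch of the mean squared error that the loss function reports. The function name meanSquaredError is my own and works on plain arrays rather than Tensors.

```javascript
/**
 * Mean squared error between two equally long arrays of numbers:
 * the average of the squared differences.
 */
function meanSquaredError(actual, predicted) {
  if (actual.length === 0 || actual.length !== predicted.length) {
    throw new Error('Both arrays must be non-empty and have the same length');
  }
  let sum = 0;
  for (let i = 0; i < actual.length; i++) {
    sum += (actual[i] - predicted[i]) ** 2;
  }
  return sum / actual.length;
}
```

Computing this once for the training rows and once for the testing rows gives the two numbers to compare.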

Prediction

When we are satisfied with the quality of our model and the loss value is low enough, we can predict future values. We also need to convert the Array of input values into Tensors and normalize them as well.

In our code snippet, the predicted values are saved into cell notes, so you can compare them with the original values.
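Saving predictions as notes could be sketched like this. The function name savePredictionsAsNotes and the note text are hypothetical; targetRange is assumed to be a one-column Range with exactly one row per prediction, and predictions a plain array of numbers.

```javascript
/**
 * Writes each prediction as a note on the matching target cell.
 * Range.setNotes() expects a 2D array with the same dimensions as the range.
 */
function savePredictionsAsNotes(targetRange, predictions) {
  const notes = predictions.map(value => ['Predicted: ' + value.toFixed(2)]);
  targetRange.setNotes(notes);
  return notes;
}
```

Because the notes do not overwrite the cell values, the original target stays visible next to the prediction.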


Here is the main function, which loads and prepares the data and trains the model.



Try it yourself!

If you want to test and play with it, the full dataset and the Google Apps Script code are available in this Google Sheet.

1. Create a copy of the spreadsheet
2. Select any rows (the last value will be skipped during training)
3. Open the menu Tools --> Script editor
4. Select the menu Run --> Run function and choose Main
5. The predicted values will be saved as notes


If you like Google Apps Script, you will find folks like you in this Google Groups community
https://developers.google.com/apps-script/community


Monday, February 24, 2020

Extract archived files directly from Google Drive with serverless tool Google Colab

Today I will show you how to extract files from zip archives directly in Google Drive without any external third-party tools. Everything is completely serverless and in da cloud :-)

Google Colab is a pretty neat tool for data scientists who operate and manage Jupyter notebooks in the cloud. I prefer Google Colab because it supports a Python 3 runtime with CPU, GPU or TPU processors. You can also mount your Google Drive as an external drive.



Colab prepares a virtual machine with a Jupyter notebook for you. A notebook contains several cells, where you can write Python code and run it with the "Play" button.

There is also real magic behind the "magic" syntax sugar, which allows you to use more advanced commands. One of them is %%bash, which runs the rest of the cell as bash commands.



When we combine the knowledge above, we get a tool as powerful as an army knife:

1. Open Google Colab https://colab.research.google.com/notebooks/intro.ipynb#recent=true and create a new notebook.

2. Rename the notebook. In the top right corner, click the Connect button, which starts your virtual machine.

3. Now it is time to mount your Google Drive. Click the Mount Drive button. You have to authorize the Colab application to access your account. The path to your files is drive/My Drive/





Copy this snippet of code into your Colab cell


There are two commands: one changes the directory and the other unzips the desired files.

Click the "Play" button to run the selected cell (= snippet of code). The files will be extracted into the same location.

In my example, I exported a Google Takeout to my Google Drive, and the backup zip files were saved into the Takeout folder in my root folder.



Note:
Bash runs the cell in a subprocess, so it can take some time to unzip a lot of files.