Cache for Data Studio connectors


In a previous article, Zoho Desk + Data Studio, we explained how to get started in the world of Data Studio connectors.

Now it is time to optimize. One of the optimization methods provided by the Data Studio technology stack is the cache system.

The cache has two goals:

  1. Reduce API usage.
  2. Reduce the response time to the user.
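Both goals follow from the same read-through pattern: check the cache before calling the API, and only fetch (and re-cache) on a miss. The sketch below shows that pattern in plain JavaScript; the `makeCache` stub and `fetchFromApi` callback are stand-ins of our own, since in a real connector the cache would be Apps Script's `CacheService.getScriptCache()` and the fetch would go through `UrlFetchApp`.

```javascript
// Read-through cache sketch. In an Apps Script connector the cache would be
// CacheService.getScriptCache() and fetchFromApi() would call the REST API;
// both are stubbed here so the pattern is self-contained.
function makeCache() {
  const store = new Map();
  return {
    get: (key) => (store.has(key) ? store.get(key) : null),
    // The real CacheService.put() honors ttlSeconds; this stub ignores it.
    put: (key, value, ttlSeconds) => store.set(key, value),
  };
}

function getData(cache, key, fetchFromApi) {
  const cached = cache.get(key);
  if (cached !== null) return JSON.parse(cached); // cache hit: no API call
  const fresh = fetchFromApi(key);                // cache miss: call the API
  cache.put(key, JSON.stringify(fresh), 3600);    // keep it for 1 hour
  return fresh;
}
```

The first call for a given key pays the full API latency; every later call within the TTL is served from the cache, which is what drives both the time and the API-consumption savings measured below.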

We are going to analyze the impact that the use of the cache has on our connectors.

Before presenting the results to you, let’s put some context on this experiment:

  • The Google CacheService offers three cache scopes; we have always used the script cache.
  • CacheService limits each cached value to 100 KB, so we have introduced an intermediate layer to be able to cache larger payloads.
  • We are going to run the experiment against two different services:
    1. A REST API.
    2. A public CSV file.
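The intermediate layer works around the 100 KB-per-value limit by splitting the serialized payload into chunks and storing each chunk under a derived key. Below is a minimal sketch of that splitting logic under our own assumptions (the `_0`, `_1`, `_count` key-suffix scheme is illustrative, not the connector's actual naming); in Apps Script the resulting map would be handed to `cache.putAll()`.

```javascript
// Each CacheService entry is capped at ~100 KB, so large payloads are split
// into chunks stored under derived keys ("myKey_0", "myKey_1", ...), plus a
// "_count" entry recording how many chunks to reassemble.
const MAX_CHUNK_SIZE = 100 * 1024; // ~100 KB per cache entry

function putLarge(key, value) {
  const entries = {};
  const count = Math.ceil(value.length / MAX_CHUNK_SIZE);
  for (let i = 0; i < count; i++) {
    entries[key + "_" + i] = value.slice(i * MAX_CHUNK_SIZE, (i + 1) * MAX_CHUNK_SIZE);
  }
  entries[key + "_count"] = String(count);
  return entries; // in Apps Script: cache.putAll(entries, ttlSeconds)
}

function getLarge(key, entries) {
  const count = Number(entries[key + "_count"]);
  if (!count) return null; // cache miss: the index entry is gone
  let value = "";
  for (let i = 0; i < count; i++) value += entries[key + "_" + i];
  return value;
}
```

One caveat of this scheme: CacheService can evict individual entries independently, so a robust implementation should treat a missing chunk as a full cache miss and re-fetch.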

REST API

Our connector retrieves all the historical data available from the queried endpoint.

Below we show the distribution of the executions used for the analysis:

API response time evolution

We can observe fluctuations in the service's response time; since it is a third-party service, there is nothing we can do about that.

The average response time was 60,458 ms (about 60.5 seconds).

Cache response time evolution

In this case, we see that the cache's response time is much more stable, with an average of 945 ms.

Summary: REST API

The use of the cache has produced a strong reduction in response time: more than 98%.

Regarding API consumption, we must take into account that each connector execution (in this case) involves an average of 203 calls to the API, so we can calculate the calls saved:

  • API calls made: 17,458
  • API calls avoided thanks to the cache: 195,286

This allows us to save almost 94% of API calls.

Public CSV file

In this case, the connector reads a CSV file of about 40 MB on the fly and loads it into Data Studio.
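In Apps Script, this kind of connector typically fetches the file with `UrlFetchApp.fetch()` and parses it with `Utilities.parseCsv()`. The sketch below stubs that parsing step with a minimal parser of our own (it handles plain comma-separated values, not quoted fields) so the shape of the pipeline is visible; the column names are illustrative, not those of the actual file.

```javascript
// CSV path sketch. In Apps Script the two stubbed steps would be:
//   const text = UrlFetchApp.fetch(csvUrl).getContentText();
//   const rows = Utilities.parseCsv(text);
// This minimal parser stands in so the sketch runs anywhere
// (no quoted-field handling).
function parseCsv(text) {
  return text
    .trim()
    .split("\n")
    .map((line) => line.split(","));
}

// Turn the header row plus data rows into one object per record,
// which is roughly the shape a getData() response is built from.
function csvToObjects(text) {
  const [header, ...rows] = parseCsv(text);
  return rows.map((row) =>
    Object.fromEntries(header.map((name, i) => [name, row[i]]))
  );
}
```

With a 40 MB file, both the fetch and the parse run on every execution unless the result is cached, which is why the chunked cache layer matters here as well.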

For the analysis, the distribution of the executions was as follows:

In this case, the improvement is significant, but not of the same order of magnitude as the previous one. The most likely cause is the size of the cached payload: at such a large size, the advantage of the cache is diminished.

Response time evolution reading file

The average response time is 21,326 ms (about 21.3 seconds).

Evolution of cache response time

The average response time is 8,098 ms (about 8.1 seconds).

Summary: CSV file

The use of the cache in this case has produced a strong reduction in response time: more than 62%.

In this case, the API call factor does not apply.

Conclusions

Below we collect all the key information in a table so that we can see the overall picture.

Endpoint type   Query time   Cache time   Time saving   API consumption reduction
REST API        60,458 ms    945 ms       98%           94%
CSV file        21,326 ms    8,098 ms     62%           100%

As you can see, not only does the cache bring significant improvements; in some cases it can make a connector viable without intermediate storage.

This makes it possible to query the operational systems directly, at the price of a longer initial load. It allows certain reports and dashboards to be built directly against the transactional sources, without the need to consolidate the data first.
