Achieving massively parallel computation with Google Cloud Platform™️
avgidea, Inc. (Teppei Yagihashi, Founder and CEO) has developed a high-speed cross-tabulation system based on Google Cloud™️ for Paasons Advisory, a strategic boutique at Hakuhodo Inc. Data analysts have been using the system on a trial basis since March 2020.
avgidea puts business ideas into practice and delivers them to users in the form of services or applications built with IT. We offer system development and maintenance using cloud technologies such as G Suite™️ and Google Cloud Platform™️ (GCP™️).
Paasons Advisory is a boutique within Hakuhodo dedicated to strategic planning. By making full use of cloud platforms, Paasons Advisory offers a highly agile planning service while building an always-connected community with clients at the strategic layer.
Paasons Advisory uses G Suite for day-to-day work and therefore initially built its cross-tabulation application in Google Sheets™️. However, the team reached the limits of this approach as the data volume and the number of analysis axes grew: performance degraded and CPU usage on client PCs spiked noticeably, which began to interfere with the analysts' core analysis work.
To address these issues, avgidea designed and built a system that offloads the heavy workloads to the cloud. Previously, Google Sheets alone bore the “Store”, “Process”, and “View” workloads. We moved “Store” and “Process” to GCP and let Google Sheets focus on “View”, which reduced the load on client PCs. In addition, we introduced parallel execution, a technique common in high-performance computing, which shortened the total execution time.
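The split-and-merge idea behind a parallel cross-tabulation can be illustrated with a minimal Python sketch. This is not avgidea's implementation: the function names, the sample records, and the use of a local thread pool as a stand-in for cloud workers are all assumptions made for illustration. In CPython, threads show the structure of the computation rather than a real CPU speedup.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def crosstab_chunk(rows, row_axis, col_axis):
    """Count (row value, column value) pairs for one chunk of records."""
    return Counter((r[row_axis], r[col_axis]) for r in rows)


def parallel_crosstab(records, row_axis, col_axis, workers=4):
    """Split records into chunks, tabulate each chunk concurrently, then merge."""
    size = max(1, -(-len(records) // workers))  # ceiling division
    chunks = [records[i:i + size] for i in range(0, len(records), size)]
    total = Counter()
    # The thread pool is a hypothetical stand-in for remote workers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(
            lambda chunk: crosstab_chunk(chunk, row_axis, col_axis), chunks
        )
        for partial in partials:
            total.update(partial)  # merging partial counts is a plain sum
    return total


# Hypothetical survey responses, invented for this example.
responses = [
    {"age": "20s", "answer": "yes"},
    {"age": "20s", "answer": "no"},
    {"age": "30s", "answer": "yes"},
    {"age": "20s", "answer": "yes"},
    {"age": "30s", "answer": "yes"},
]
table = parallel_crosstab(responses, "age", "answer", workers=2)
```

Because each cell of a cross-tabulation is a count, partial tables computed on separate chunks can be summed in any order, which is what makes this kind of workload straightforward to distribute.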
The application avgidea built for Paasons Advisory can serve as a general-purpose parallel execution platform for cross-tabulation and other workloads, so we plan to develop it into a service and offer it to other companies.
Authors: Teruaki Hayashi, Yukio Ohsawa (Ohsawa Research Lab. at Tokyo University)
Abstract: Recently, data exchange platforms have emerged in the digital economy to enable better resource allocation in a data-driven society, which requires cross-organizational data collaborations. Understanding the characteristics of the data on these platforms is important for their application; however, the structures of such platforms have not been extensively investigated. In this study, we apply a network approach with a novel variable-based structural analysis to the metadata of datasets on two data platform services. It was noted that the structures of the data networks are locally dense and highly assortative, similar to human-related networks. Even though the data on these platforms are designed and collected differently, depending on the use objectives, the variables of heterogeneous data exhibit a power distribution, and the data networks exhibit multi-scaling behavior. Furthermore, we found that the data collection strategies of the platforms are related to the variety of variables, density of the networks, and their robustness from the viewpoint of sustainability and social acceptability of the data platforms.
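As a rough illustration of what a variable-based data network might look like, the following Python sketch links two datasets whenever their metadata share a variable name, then computes network density and degree assortativity. The linking rule, the function names, and the sample metadata are assumptions made for this example; the abstract does not specify the authors' exact construction.

```python
from itertools import combinations


def build_data_network(metadata):
    """Link two datasets when they share at least one variable name (assumed rule)."""
    edges = set()
    for a, b in combinations(sorted(metadata), 2):
        if set(metadata[a]) & set(metadata[b]):
            edges.add((a, b))
    return edges


def density(n_nodes, edges):
    """Fraction of possible undirected edges that are present."""
    possible = n_nodes * (n_nodes - 1) / 2
    return len(edges) / possible if possible else 0.0


def degree_assortativity(edges):
    """Pearson correlation of the degrees at the two ends of each edge."""
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    xs, ys = [], []
    for a, b in edges:  # count each undirected edge in both directions
        xs += [degree[a], degree[b]]
        ys += [degree[b], degree[a]]
    if not xs:
        return None
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs)
    if var == 0:
        return None  # undefined when every endpoint has the same degree
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys))
    return cov / var


# Hypothetical dataset metadata, invented for this example.
metadata = {
    "weather": ["date", "temperature", "location"],
    "sales": ["date", "store_id", "amount"],
    "stores": ["store_id", "location"],
    "census": ["population", "prefecture"],
}
edges = build_data_network(metadata)
net_density = density(len(metadata), edges)
```

A positive assortativity value means that well-connected datasets tend to link to other well-connected datasets, which is the property the abstract describes as "highly assortative".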