Avgidea has implemented new Data Exchange and Function capabilities in the Avgidea Data Platform (ADP) as an environment for business-to-business data exchange and execution of AI predictive models. Data engineers and data scientists can reduce their workload in exchanging/processing data and deploying AI predictive models with customers and group companies.
In addition to Avgidea's SaaS offering, ADP can also be deployed as an OEM product in a company's Google Cloud™️ project and offered as a service to that company's own users. For OEM deployments, Avgidea adds new features, applies updates, and handles monitoring, operation, and maintenance.
Two new components have been added to the Avgidea Data Platform.
Avgidea Data Exchange (ADX)
ADX integrates with data services offered by different public cloud vendors, allowing users to move data between different storage services and databases without writing code. In addition, data owners can limit the scope of data sharing by explicitly selecting which data is exchanged and with whom.
As on a social networking service, users communicate with other users in a closed environment, because data is shared only among users on the platform.
Avgidea Function (AFX)
With AFX, preprocessing operations that take place before data is imported into a database, such as data cleansing and character code conversion, can be performed as functions written in Python on files and directories stored in ADP storage.
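As an illustration of the kind of preprocessing described above, the following is a minimal sketch of a character code conversion step in plain Python. The function name `preprocess_csv_bytes`, its parameters, and the choice of Shift_JIS as the source encoding are assumptions made for this example; the actual interface that AFX expects for registered functions is not described here.

```python
def preprocess_csv_bytes(raw: bytes,
                         source_encoding: str = "shift_jis",
                         target_encoding: str = "utf-8") -> bytes:
    """Hypothetical preprocessing step: re-encode a text file and tidy its lines.

    This is not AFX's API; it only shows the kind of logic such a function
    might contain.
    """
    # Character code conversion: decode from the source encoding.
    text = raw.decode(source_encoding)

    # Simple cleansing: normalize line endings, strip trailing whitespace,
    # and drop lines that are empty after stripping.
    lines = [line.rstrip() for line in text.splitlines()]
    cleaned = "\n".join(line for line in lines if line)

    # Encode the cleaned text in the target encoding.
    return cleaned.encode(target_encoding)
```

A function like this could be applied to each file's contents before the data is loaded into a database.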
You can also attach AI predictive models built with custom Python packages, TensorFlow™️, PyTorch™️, scikit-learn™️, etc. as libraries and run them against files in ADP's storage. Functions registered in AFX can be explicitly shared through the GUI, allowing end users to directly execute AI predictive models created by data scientists.
Running scikit-learn clustering: https://youtu.be/DBF0_DAPMPY
Avgidea Data Platform Usage Scenarios
Avgidea, Inc., creator of an end-to-end data management platform, announces the launch of Avgidea Data Platform (ADP) to improve open data usability with Google Sheets and G Suite.
Open data is published in markets all over the world, and users have to search the internet for what they need every time. When data analysts or data scientists do find appropriate data, they download it to a local environment, typically as text or CSV files, and then use various tools to modify and aggregate it before analysis. It’s time-consuming and often blurs what they are trying to achieve in the first place.
Avgidea Data Search and Avgidea Query Editor keep users from switching back and forth between multiple tools, and minimize the tasks involved in working with open data, by providing data search and query features as an add-on for Google Sheets, a very popular spreadsheet service.
Avgidea Data Search (ADS)
With Avgidea Data Search, users can search for open data scattered across the internet and import it directly into Google Sheets with a single click.
ADS supports part of the CKAN and Dataverse APIs, so users can easily import CSV data from various sites on the internet. ADS maintains a list of the latest CKAN and Dataverse endpoints available globally, so users can simply select an instance from the list and immediately start searching its data.
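For readers unfamiliar with CKAN, the sketch below shows roughly what a dataset search against a CKAN instance looks like, using CKAN's `package_search` action and the JSON shape it returns. It is an assumption-laden illustration, not ADS's implementation: the helper names and the example base URL are hypothetical, and no network call is made.

```python
from typing import Dict, List
from urllib.parse import urlencode


def build_search_url(base_url: str, query: str, rows: int = 10) -> str:
    """Build a CKAN Action API URL for the package_search action."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{params}"


def csv_resource_urls(response: Dict) -> List[str]:
    """Collect the URLs of CSV resources from a parsed package_search response."""
    if not response.get("success"):
        return []
    urls = []
    for dataset in response["result"]["results"]:
        for resource in dataset.get("resources", []):
            # The format field is free text on CKAN, so compare case-insensitively.
            if (resource.get("format") or "").upper() == "CSV":
                urls.append(resource["url"])
    return urls
```

The URL would be fetched with any HTTP client, the JSON body parsed, and the resulting CSV links offered for import.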
Avgidea Query Editor (AQE)
Avgidea Query Editor lets users submit a query directly to instances that comply with the SPARQL specification and import the query results into Google Sheets.
Several SPARQL-compliant data platforms exist, such as Amazon Neptune and Virtuoso Universal Server. As long as their endpoints are reachable over the internet, users can import query results into Google Sheets via AQE.
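To make the query flow concrete, here is a minimal sketch of how a SPARQL SELECT query can be sent over the standard SPARQL protocol and how the standard JSON results format can be flattened into spreadsheet-style rows. The helper names are hypothetical and this is not AQE's code; it assumes the endpoint accepts GET requests with a `query` parameter and returns `application/sparql-results+json`.

```python
import json
from typing import List
from urllib.parse import urlencode
from urllib.request import Request


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Prepare (but do not send) a SPARQL protocol GET request."""
    url = f"{endpoint}?{urlencode({'query': query})}"
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def bindings_to_rows(results_json: str) -> List[List[str]]:
    """Turn a SPARQL JSON result document into a header row plus value rows."""
    doc = json.loads(results_json)
    variables = doc["head"]["vars"]
    rows = [list(variables)]
    for binding in doc["results"]["bindings"]:
        # Variables left unbound in a solution are simply absent; use "" for them.
        rows.append([binding.get(v, {}).get("value", "") for v in variables])
    return rows
```

The request can be sent with `urllib.request.urlopen`, and the returned rows map directly onto the cells of a sheet.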
Avgidea Data Search and Avgidea Query Editor are offered as an add-on for Google Sheets, so users can install them from G Suite Marketplace with an existing Google account (G Suite or Gmail) and immediately start using both components under the free plan.
Users who want to use the products without restrictions can purchase a subscription plan from Avgidea's website. Subscribed users are eligible to receive new features and upgrades.
Program for healthcare agencies / research institutions
For healthcare agencies and research institutions that want to access open data for COVID-19 related activities, Avgidea offers a free one-year subscription license for ADS and AQE. Please contact Avgidea directly if your institution would like to use the products for such purposes.
Also, if a healthcare agency or research institution already makes research data publicly available through CKAN or Dataverse instances, Avgidea can register those instances as endpoints in ADS.
About Avgidea Inc.
Avgidea puts business ideas into practice using information technology and offers them to the world as services and applications. We provide a data platform service as well as development and consulting services using G Suite and Google Cloud Platform.
URL : https://www.avgidea.io
Inquiry : https://www.avgidea.io/contact1.html
Achieving massively parallel computation with Google Cloud Platform™️
Avgidea, Inc. (Teppei Yagihashi, Founder and CEO) has developed a new high-speed cross-tabulation system based on Google Cloud™️ for Paasons Advisory, a strategic boutique at Hakuhodo Inc. Data analysts have been using the system on a trial basis since March 2020.
Avgidea puts business ideas into practice using information technology and offers them to users as services and applications. We offer system development and maintenance using cloud technologies such as G Suite™️ and Google Cloud Platform™️ (GCP™️).
Paasons Advisory is a boutique within Hakuhodo dedicated to strategic planning. By making full use of cloud platforms, Paasons Advisory offers a highly agile planning service while building an always-connected community with clients at the strategic layer.
Paasons Advisory uses G Suite for its day-to-day work, so it immediately began building its cross-tabulation application with Google Sheets™️. However, the team ran into the limits of this approach as the data volume and the number of analysis axes grew: performance degradation and CPU usage spikes on client PCs became noticeable and started to affect the data analysts' core analysis work.
To address these issues, Avgidea designed and built a new system that distributes the heavy workloads to the cloud. Previously, the “Store”, “Process” and “View” workloads were all borne by Google Sheets. We moved “Store” and “Process” to GCP and let Google Sheets focus on “View”, which reduced the load on client PCs. In addition, we implemented parallel execution, a common technique in high-performance computing, which shortened the total execution time.
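The general idea of parallel cross tabulation can be sketched as follows: split the records into chunks, count each (row, column) combination per chunk in parallel, then merge the partial counts. This is only an illustrative sketch under stated assumptions, not the system built for Paasons Advisory; the function names are hypothetical, and a local thread pool stands in for workers that would run in the cloud.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Sequence, Tuple


def count_chunk(chunk: Sequence[Dict], row_key: str, col_key: str) -> Counter:
    """Count (row value, column value) pairs within one chunk of records."""
    return Counter((rec[row_key], rec[col_key]) for rec in chunk)


def parallel_crosstab(records: List[Dict], row_key: str, col_key: str,
                      workers: int = 4) -> Dict[Tuple, int]:
    """Cross-tabulate records by counting chunks concurrently and merging."""
    # Ceiling division so that at most `workers` chunks are created.
    size = max(1, -(-len(records) // workers))
    chunks = [records[i:i + size] for i in range(0, len(records), size)]

    total: Counter = Counter()
    # The thread pool is a stand-in for remote workers in this sketch.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda c: count_chunk(c, row_key, col_key), chunks)
        for partial in partials:
            total.update(partial)  # merging partial counts is order-independent
    return dict(total)
```

Because the partial counts can be merged in any order, the counting step can be spread over as many workers as the data volume requires, leaving only the final table to be displayed in the sheet.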
The application Avgidea built for Paasons Advisory can serve as a general-purpose parallel execution platform, not only for cross tabulation, so we plan to develop it into a service and offer it to other companies.
Please let us know if you are interested in development or consultation on G Suite or Google Cloud Platform! We are happy to assist you!
A new preprint research paper analyzing the public metadata of D-Ocean has been published.
The paper analyzes the public metadata of Data Jacket and D-Ocean and evaluates their structural characteristics as data platforms. Please read it through if you are interested!
Variable-Based Network Analysis of Datasets on Data Exchange Platforms
Teruaki Hayashi, Yukio Ohsawa (Ohsawa Research Lab. at Tokyo University)
Recently, data exchange platforms have emerged in the digital economy to enable better resource allocation in a data-driven society, which requires cross-organizational data collaborations. Understanding the characteristics of the data on these platforms is important for their application; however, the structures of such platforms have not been extensively investigated. In this study, we apply a network approach with a novel variable-based structural analysis to the metadata of datasets on two data platform services. It was noted that the structures of the data networks are locally dense and highly assortative, similar to human-related networks. Even though the data on these platforms are designed and collected differently, depending on the use objectives, the variables of heterogeneous data exhibit a power distribution, and the data networks exhibit multi-scaling behavior. Furthermore, we found that the data collection strategies of the platforms are related to the variety of variables, density of the networks, and their robustness from the viewpoint of sustainability and social acceptability of the data platforms.
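To give a rough feel for what a variable-based network of datasets might look like, the sketch below links two datasets whenever their metadata share at least one variable name, then computes how often each variable appears and how dense the resulting network is. The linking rule, function names, and toy data are assumptions made for illustration and may differ from the construction used in the paper.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Set, Tuple


def variable_frequencies(datasets: Dict[str, Set[str]]) -> Counter:
    """Count in how many datasets each variable name appears."""
    return Counter(v for variables in datasets.values() for v in variables)


def shared_variable_edges(datasets: Dict[str, Set[str]]) -> Set[Tuple[str, str]]:
    """Link two datasets when they share at least one variable (assumed rule)."""
    return {
        (a, b)
        for a, b in combinations(sorted(datasets), 2)
        if datasets[a] & datasets[b]
    }


def density(n_nodes: int, n_edges: int) -> float:
    """Density of an undirected simple graph: 2m / (n (n - 1))."""
    if n_nodes < 2:
        return 0.0
    return 2 * n_edges / (n_nodes * (n_nodes - 1))
```

For example, with four toy datasets where A and B share `date` and B and C share `sales`, the network has two edges out of six possible, giving a density of one third.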